14 - Data hunt resources
This week was supposed to be an individual mini-project that focused on hunting for data and documents for a story, both records you can readily find and records you might need to FOIA. But I couldn’t figure out how to do that remotely, so we’re going to have two classes:
- Tuesday: A demo of how you might go about backgrounding a topic to find data and documents.
- Thursday: A lab in groups of 3 or 4 that does the same thing for a story idea.
Tuesday
Here are the resources from Tuesday that will be of use to you:
- The public records seminar from the Howard Center, which walks through how and why you might want to find public records.
- A filled-in Google Docs template of an outside-in research approach, starting with news and other secondary sources, and narrowing in on official government records and agencies. This has a lot of the search tips that I showed in class.
- One tip sheet on finding readily available data. This is the one with the News21 Hate in America example.
- Another tip sheet on using the outside-in method of reporting to find public records for your story.
Here are some of the things we talked about in class:
- Two versions of Google search parameters. The ones I use most are intitle:, intext:, after: or before:, and site:. These come and go regularly, and I never know exactly what is true any given week. For example, one of these claims that Google ignores parentheses to group AND and OR statements, but they seem to work and the other tipsheet says they should. Google isn’t the most transparent company in the world when it comes to its search algorithms!
- How to create a custom Google search engine for just the sources you care about. If you want everything on a site, you’d enter, for example, *.azcentral.com/*. If you just want a subsection, enter the starting page, like https://pbs.org/frontline/
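If you find yourself retyping the same operators, here is a minimal Python sketch of how they fit together into one query. The function names (build_google_query, google_search_url) and the example search are made up for illustration, and since Google's operators come and go, treat the result as something to test in the search box rather than a guarantee of how Google will read it.

```python
from urllib.parse import urlencode


def build_google_query(terms, intitle=None, intext=None, site=None,
                       after=None, before=None):
    """Join plain search terms with the operators listed above.

    Hypothetical helper: it only assembles the text you would
    otherwise type into the search box.
    """
    parts = [terms]
    operators = [
        ("intitle", intitle),
        ("intext", intext),
        ("site", site),
        ("after", after),    # dates written as YYYY-MM-DD
        ("before", before),
    ]
    for name, value in operators:
        if value:
            # Quote multi-word values so the operator covers the whole phrase.
            if " " in value:
                value = f'"{value}"'
            parts.append(f"{name}:{value}")
    return " ".join(parts)


def google_search_url(query):
    """Turn the query text into a link you can paste into a browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Example: eviction or foreclosure stories on azcentral.com with
# "moratorium" in the title, published after March 1, 2020.
query = build_google_query(
    "(eviction OR foreclosure)",
    intitle="moratorium",
    site="azcentral.com",
    after="2020-03-01",
)
print(query)
print(google_search_url(query))
```

The parentheses around the OR terms follow the tip sheet that says grouping works. If results look off, try the same search without them.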
Our version of Nexis has very few publications in it, but it does have ways to target your search more specifically that are missing in Google.
- hlead(word or phrase) is anything that’s in the headline or lead paragraphs
- atleast10(word or phrase) means the term has to be mentioned at least 10 times
- length(>1000) means more than 1,000 words.
- “stem” words, such as evict* to mean any word that begins with “evict”. The asterisk doesn’t work the same in Google, though it often finds variations of a word anyway.
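The Nexis pieces above can be stacked into a single search. The sketch below is one way to assemble them in Python. The helper names are hypothetical, and joining the pieces with "and" is my assumption about how our version of Nexis combines terms, so check the result against the search form before relying on it.

```python
def hlead(phrase):
    """Term must appear in the headline or lead paragraphs."""
    return f"hlead({phrase})"


def atleast(count, phrase):
    """Term must be mentioned at least `count` times."""
    return f"atleast{count}({phrase})"


def longer_than(words):
    """Story must run longer than `words` words."""
    return f"length(>{words})"


def nexis_query(*pieces):
    # Assumption: Nexis accepts "and" as the connector between pieces.
    return " and ".join(pieces)


# Long stories that lead with any "evict" word and mention
# "landlord" at least 10 times.
query = nexis_query(
    hlead("evict*"),
    atleast(10, "landlord"),
    longer_than(1000),
)
print(query)
```

Starting narrow like this and then loosening one piece at a time (dropping the length filter, lowering the count) tends to show which restriction is cutting out the stories you want.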
Thursday
We’ll break into groups and do an in-class lab replicating the example from class: the outside-in reporting strategy for finding documents and data.
I’m in the market for good story ideas that make a good match for this assignment. I have had a few suggestions, but please Slack me with anything you think you’d like to try – I’ll let you know if it’s just too hard, or too easy, for this assignment. I need about 5-7 ideas.