Story memo guidelines

This project makes up 15 percent of your final grade. It involves analyzing a dataset that contains records that would be of interest to a story or feature.

Key deadlines:

  • March 6: A one-paragraph proposal with your dataset choice and a very brief description of what you hope to do with it.
  • April 5: An optional draft of your memo, with your documentation.
  • April 12: Final story memo with documentation.

Choosing a dataset

I’ve tried having people find their own data in the past, but it’s harder than it sounds to find a dataset that’s right for this project. I’m happy to add options to this list if you have ideas. Let me know by the end of January if you have a good candidate.

Here’s a list of datasets:

  • H-1B visa applications, from the US Labor Department. (I have data through FY 2019, which is about 6 months old, but the link on the Labor Department’s site appears to be dead. USCIS also has data files, which don’t have quite as much information as the Labor Department’s. We can see which one is better, or whether joining them works best!)
  • Contributions to federal candidates and political action committees from Arizona, for the 2020 election cycle through December 2019. We can reverse this and get a dataset of contributions TO Arizona candidates if you prefer.
  • A collection of tweets posted by Joe Arpaio, the candidate and former Maricopa County sheriff, collected in a Workbench recipe over the past year. If you have another user you’d like to analyze, I can show you how to get that user’s most recent 2,000 tweets easily.
  • Facebook political ads. I want to include this, but I’m still figuring out how to get the data and may have to wait until NICAR for someone to help me.
  • A Spotify playlist that contains about 100-150 songs that you’d like to analyze. You need to make the playlist public and send me the link.
  • Education Department’s Civil Rights dataset, including discipline, crimes and arrests in schools, for News 21.
  • For the Howard Center: Prosecution decisions made under Presidents Obama and Trump in federal courts, excluding immigration cases. This will be good for seeing, for example, whether prosecutions of white-collar crimes have fallen, or whether the government has restarted prosecutions for low-level drug offenses. Note that these records do not contain names or case numbers. You’ll have to reverse engineer them in PACER or another court database to find the individual case details. This is a VERY complex database and isn’t for the faint of heart, but it will be very useful for stories.
  • Small Business Administration loans to Arizona businesses, including defaults.
  • All Olympics participants from 1896 in Athens to 2016 in Rio, from Kaggle.
  • The Lahman Baseball statistics database, using the R package, “Lahman”, for statistics on players and teams from 1871 to 2019.
  • The Youth Risk Behavior Survey from the CDC, for those of you in the suicide project. This is a biennial (every two years) survey of high school students which includes, among other questions, those regarding mental health, thoughts of suicide, and drug use. It’s a small sample, so getting state-level data will be tough, but you’ll get good long-term trends nationally.
  • …. or anything else we agree on after discussion.

The memo

The final memo can take several forms, but most will be traditional story pitches of about 1,000 words. They should include:

  • A nut graf or lede that clearly relates the theme of the story and the results of your analysis so far.
  • Examples of places, people, or other anecdotes derived from the data that illustrate the story
  • A description of the analysis and research you’ve done so far, and
  • A detailed description of the reporting that is left to do before you can finish the story (including document requests and street reporting).

Your story memo doesn’t have to result in a pitch for a typical “numbers” story. It can be a story that was derived from the data but has no numbers, such as a narrative about a person or place you found in the data. It can be an infographic or a visual story of some kind. Or it can be a more traditional investigative piece.

Final deliverables

  • The memo itself
  • All documentation needed to get from the dataset provided to you to your end product. In R, this can be written into your R Markdown document, which is why I strongly prefer that you do your analysis that way.
  • If necessary, a copy of any additional data you used in your project, such as Census files.

Grading

Your work will be graded based on:

  • Your story idea and the quality of your written memo.
  • The skill and imagination you showed in teasing out crucial details and facts from your data.
  • The research and reporting you did to understand the promise and pitfalls of your data as it relates to your idea.
  • Accuracy in your data analysis and your characterization of it.
  • How well you took direction during the project.
  • Your methodology and documentation.