The data walk
Due date: Friday, April 27. You’ll do a lot of the work for this in teams during class time. NOTE: With the move to remote classes, we may have to adjust this a little for those of you who are not physically in Phoenix. I think the rest of it should still work. The Friday due date reflects the amount of work I expect – this should not spill over into the weekend.
This mini-project is inspired by the website, Data Walking: A research project exploring data gathering and data visualization by David Hunter.
There are lots of examples and instructions on that site, but we’ll be adapting it to Phoenix, the downtown campus and the kinds of data that you might collect for stories. I’ll post more about this later in the semester, but the basics are laid out below.
Documenting everyday life in data
One of the most famous examples of documenting your everyday life in data comes from Stefanie Posavec and Giorgia Lupi. They’re both designers who met and decided to create postcards every week visualizing something they counted. The result is a book of stunning everyday databases and a website, both called “Dear Data.”
Here’s some background on that project that might help inspire you as you think about it.
Stories from homemade databases
Minneapolis Star Tribune:
- “How We Built a Database from Thousands of Police Reports”, Mary Jo Webster, Source, Aug. 23, 2018.
- “Denied Justice”, Minneapolis Star Tribune: Part 1: When Rape is Reported and Nothing Happens; and Analysis: Five Factors that can determine the fate of a sexual assault case.
Washington Post:
- “Inside the Washington Post’s Police Shooting Database: An oral history”, WP on Medium, December 2015
- “A Year of Reckoning: Police fatally shoot nearly 1,000”, Washington Post staff, Dec. 26, 2015
- Washington Post github readme.md of police shootings. Consider downloading the data and using your newfound filter, sort and pivot table skills to practice.
OPTIONAL IRE Tip Sheet: “Lessons Learned from Building a Database with Colleagues”, Mary Jo Webster, Todd Wallack and Dana Amihere. Note that you have to sign in with your IRE account to see the download. The pdf is at the bottom of the description.
Designing your own database
I’ll approve or help you revise your idea for the data walk. It should be challenging enough that you have to make difficult decisions and spend some time collecting the data, but not so difficult that you’ll never find any instances of what you want to measure.
Now that you’ve seen all the ways that data can be messy and difficult to use, designing a database that fits our tidy scenario is easier to imagine. That doesn’t make it easier to do.
You might find it easier for one of you to sign up for a free Airtable account. It has a mobile app that’s good for data collection, and it expects you to need some consistency in your dataset. It will also accept attachments and photos, for example, as a field.
Record layout
Once you have your idea, be sure to write out a record layout / data dictionary. This will describe exactly what every column is called, what kind of data resides in it, and what it means. You may have to have some fairly elaborate rules to categorize items. You might need more than one data frame, sheet or table to hold the details vs. the summary information.
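If it helps to see what a record layout can look like once it is written down, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the topic (counting parked bikes), the column names, the categories and the `check_row` helper are hypothetical and are not part of the assignment. The point is only that each column gets a name, a type, a meaning and, where needed, a fixed list of allowed categories.

```python
# Hypothetical record layout for a data walk that counts parked bikes.
# Every column name, type and category here is an illustrative assumption.
RECORD_LAYOUT = {
    "observed_at": {
        "type": str,
        "meaning": "Date and time of the observation, written as YYYY-MM-DD HH:MM",
    },
    "block": {
        "type": str,
        "meaning": "Street block where the rack was seen",
    },
    "lock_type": {
        "type": str,
        "meaning": "How the bike was secured",
        "allowed": {"u-lock", "cable", "chain", "none", "unknown"},
    },
    "bike_count": {
        "type": int,
        "meaning": "Number of bikes at this rack",
    },
}


def check_row(row, layout=RECORD_LAYOUT):
    """Return a list of problems in one row; an empty list means it fits the layout."""
    problems = []
    for column, rule in layout.items():
        if column not in row:
            problems.append(f"missing column: {column}")
            continue
        value = row[column]
        if not isinstance(value, rule["type"]):
            problems.append(f"{column}: expected {rule['type'].__name__}")
        elif "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{column}: '{value}' is not an allowed category")
    # Flag columns that someone added in the field without updating the layout.
    for column in row:
        if column not in layout:
            problems.append(f"unexpected column: {column}")
    return problems


# Example use with one made-up row.
sample = {"observed_at": "2020-04-20 14:05", "block": "Central Ave", "lock_type": "cable", "bike_count": 3}
print(check_row(sample))
```

Writing the categories down as a fixed list is one way to force the "fairly elaborate rules" into the open: if someone in the field wants a category that is not in the list, the team has to decide together whether to add it.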
Try collecting 5 or so rows of data and come back together to re-jigger it.
Once you think you have a good dataset, go back out and collect 50 rows’ worth of data among you. You might want to go in pairs, so one person can log the information while the other makes sure that you’ve observed it accurately.
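To keep track of progress toward the 50 rows, a team could tally what has been collected so far. The sketch below assumes each row is a Python dictionary with a hypothetical `collector` field naming who logged it; that field and the `summarize_collection` helper are illustrative and not something the assignment requires.

```python
from collections import Counter

TARGET_ROWS = 50  # the team goal stated in the assignment


def summarize_collection(rows, target=TARGET_ROWS):
    """Count rows collected so far, rows still needed, and rows per collector."""
    # "collector" is a hypothetical column naming the person who logged the row.
    per_collector = Counter(row["collector"] for row in rows)
    total = len(rows)
    return {
        "total": total,
        "remaining": max(target - total, 0),
        "per_collector": dict(per_collector),
    }


# Example use with three made-up rows.
rows = [
    {"collector": "Ana", "bike_count": 2},
    {"collector": "Ben", "bike_count": 0},
    {"collector": "Ana", "bike_count": 5},
]
print(summarize_collection(rows))
```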