This week, we’ll be switching to a more robust platform that you can use for your database projects.
A walkthrough of sqlite and db browser for sqlite: https://sarahcnyt.github.com/data-reporting/db-browser/
An introduction to merging (or “joining”, or “matching”) tables in SQL: https://sarahcnyt.github.com/data-reporting/sql-join/
We’ll walk through joining databases that were never intended to be matched, such as convicted felons and teachers; White House visitors and lobbyists; and even demographics of zipcodes with contributors.
We’re going to practice on a really old dataset that is only unrealistic because it contains details about government employees that are rare to get now. We’ll look at how you might combine two database, one with child protective orders and the other with salaries of government employees.
The dataset can be found here
We’re also going to start looking through the documentation and some examples of the files you will be able to use for your projects. The choices are:
Fatal Accident Reporting System, a simplified version but still challenging. We’ll get familiar with it by looking at the online query tool, which doesn’t help you analyze it well, but gives you a sense of what is included: https://www-fars.nhtsa.dot.gov//QueryTool/QuerySection/SelectYear.aspx
Hazardous material spills in Arizona, 1990-present. We’ll look at one year’s spill so you can see what tables are included in the dataset: http://www.nrc.uscg.mil/
And we’ll look at campaign finance in Arizona, which is probably the easiest dataset to use, but has the least amount of news in it. You already have the 2016 individual contributions.