R Study Guide
Introduction
This is probably your first introduction to coding. Don’t be worried. With effort, much of what reporters do in coding can be learned in a few weeks.
Like most reporters, I learned the coding that I know (which isn’t a lot) because I wanted to get a story done. In our class, we are not trying to become programmers or social scientists. We’re working on stories.
You saw during the pivot table lesson that spreadsheets have limits. We couldn’t easily get the city with the most police shootings because we would have had to put both city and state into the pivot table. A median is missing from pivot tables entirely. It’s easy to lose track of where you are and what you did. That’s the reason to learn some coding – there is something we want to know that isn’t very easy to get in other ways.
All programming languages have one thing in common: You write instructions, called algorithms, and the program executes your statements in order. It means you can do more complicated work in computer programming than in point-and-click parts of Excel. It can also scale – you can repeat your instructions millions of times, tweak it a little, and re-run the program without messing anything else up. The computer won’t mind. Really.
Writing code can also be self-documenting. You can largely eliminate those painstaking Excel data diaries and replace them with documents that explain your work as you go. You’ll still need to record information about your interviews and decisions, but you’ll no longer have to write down every mouse click.
If you’re nervous about getting started with programming, take look at the Appendix: A gentle introduction to programming and Jesse Lecy’s “Learning how to Learn”, where he says:
If this is your first programming language, you will get frustrated at times. Take a step back and remember that after a semester of Spanish you can only operate at the level of a three-year old. You only know one verb tense, a few dozen verbs, and several hundred words. You have so many emotions that you can’t express in your new language!
The good news for journalists is that you can accomplish most of what we need with the vocabulary of a three-year-old.
All I ask is that if you get very frustrated, walk away from the computer for a little while. Get help if, after a break, you don’t know what you might do next to make some progress. #dj-sos
on Slack is one option. Use it. If you’re stuck, it’s quite likely others are as well. But don’t let it get to you. As Lecy says, your morale is a limited commodity.
R or Python?
If you ask a data scientist or technologist which language you should learn first, you’ll start a heated debate between advocates of R, Python, Javascript , SQL, Julia and others. Ask the same question of a data journalist and the answer will be: “Choose one that is free and that your colleagues use so you can get help.” For our purposes, it really doesn’t matter – any of the standard languages will do.
My only rule is that you stick to your first language for a little while before trying a new one. It would be like trying to learn Portuguese and Spanish at the same time, when you know neither one to begin with. They’re related, but very different.
Employers who hire data reporters usually don’t care which programming language you know because it’s relatively easy to learn another once you’re comfortable with the concepts and good data journalism habits. In a few cases, such as the Associated Press, R is preferred. In others, like the Los Angeles Times, it’s a little easier to work with the team if you work in Python. Visualization teams generally work in Javascript. But most employers will just be happy that you are reasonably self-sufficient in any language.
I chose R because I find it a little easier to use when trying to puzzle something out step by step, and it is particularly good at working with the weird and varied forms of data thrown at us, but it’s really just a matter of taste and comfort.
The following chapters lay out the fundamentals of data journalism with R. I’ll be adding chapters as we want them during class. The format of these chapters varies a little from the rest of the book. Each section will begin with the key concepts and skills that are included, which I hope will make them easier to find when you need them.
Structure of the chapters
Each chapter includes a summary of what is included, intended to help you navigate the book when you are doing your own coding. At the end is a “recipe” chapter that puts together the specific funcitons and commands that were covered in earlier chapters, and actually goes a little further to help you polish your document.
Pay attention to these :
These are warnings that something isn’t quite what it seems and might go wrong.
These are cautions – something bad could happen if you ignore this.
Pro tips!
A dark background contains specific instructions you should follow for class or if you want to follow along
Credits
The first few chapters in this section rely on work done by Andrew Heiss and Christian McDonald.