The skills required to be a successful data journalist are many, ranging from numeracy and spreadsheet fluency to being able to create visualizations and interpret and perform statistical analyses. In most moderate to large newsrooms, some data tasks are divided among desks and departments, with reporters, editors, designers and coders working in teams. Still, it is important for all team members to have some familiarity with what the others are doing. And the core skills of working with numbers and telling stories in the public interest are fundamental to all newsroom work.
This syllabus covers these core skills while also giving students some familiarity with relevant software, statistical and visualization techniques and programming. Overall, issues of data ethics and valid interpretation are front and center here. This syllabus is informed by the idea that data journalism is practiced in its highest form not when it merely produces dazzling graphics, but when its methods are used to investigate wrongdoing, hold the powerful accountable and spotlight public policy failings.
Computer-assisted reporting, or CAR, has been around for decades. While this area of journalism has long been considered an important subfield in newsrooms and journalism schools, societal and industry changes now demand that the basic skills needed to work with data become, in effect, ubiquitous and mainstream among reporters, editors and instructors. A structural shift in how information is being produced, used, and, at times, misused, dictates a shift in how journalists prepare for the profession. Nearly all of the powerful institutions in society – from government agencies to businesses, sports franchises to insurance firms – are heavily invested in collecting and leveraging data. There is simply no way journalists can perform their watchdog functions if they do not have baseline skills and knowledge to interrogate the activities of these agencies.
Background resources and context:
For journalism schools and faculty looking for wider recommendations and ideas about how to incorporate data-related skills and knowledge into the curriculum, the 2016 Knight Foundation-Columbia Journalism School report “Teaching Data and Computational Journalism,” by Charles Berret and Cheryl Phillips, offers a comprehensive overview of the field, based on a survey of 113 schools. The American Press Institute has also published a series of articles describing how data journalism could be integrated into curricula.
A 2017 study based on data collected through Harvard’s Journalist’s Resource project, “Knowing the Numbers,” offers an overview of the debates in the media industry and academia over journalists’ proverbial “math phobia” as well as its consequences and what can be done about it. For those interested in the evolution of this field, see Mark Coddington’s study “Clarifying Journalism’s Quantitative Turn,” Digital Journalism, 2015.
Course objectives
Students should be able to:
- Think critically and deeply about the limitations of datasets and evaluate the strengths and weaknesses of data.
- Assess how institutions may be collecting and using data and the implications of these processes for the public.
- Use and manipulate datasets with ease and comfort, being able to ask interesting questions and explore various angles.
- Deploy basic software and applications of various kinds to analyze and visualize data in creative ways.
- Demonstrate a solid grasp of data storytelling techniques that can help broad audiences understand data.
Course sequence design
This course will acquaint students with the basics of cleaning, analyzing and interpreting information in tabular form – rows and columns. It will challenge them to improve their understanding of numbers and quantification, as well as offer tools and frameworks for presenting data to audiences. The syllabus also covers special topics such as interpreting academic research, advanced visualization techniques and emerging fields such as artificial intelligence.
Supplemental texts:
Various articles are suggested as readings for each unit. While no single text is required for this sequence of lessons, the following list may be useful for instructors and students:
- Jonathan Stray, The Curious Journalist’s Guide to Data, 2016.
- Brant Houston, Computer-Assisted Reporting: A Practical Guide, 2014.
- David Herzog, Data Literacy: A User’s Guide, 2016.
- The Data Journalism Handbook, eds. Gray, Bounegru, Chambers, 2012.
- Alberto Cairo, The Functional Art: An Introduction to Information Graphics and Visualization, 2013.
- John W. Foreman, Data Smart: Using Data Science to Transform Information Into Insight, 2014.
- Tamara Munzner, Visualization Analysis and Design, 2014.
- Philip Meyer, The New Precision Journalism, 1991.
Regularly review these publications:
ProPublica
The Upshot (The New York Times)
FiveThirtyEight
Vox
Data resources:
- National Institute for Computer-Assisted Reporting (NICAR/IRE)
- Stanford Computational Journalism Lab
- Flowing Data
- Data Is Plural, a newsletter of datasets. Sign up at http://tinyletter.com/data-is-plural. All datasets can be found in an updated master spreadsheet.
- Northeastern University Library’s visualization tip sheets
- Data Stories podcast
- Storybench.org “how-to’s”
- Data Is Beautiful, a community on Reddit
Week 1: Drilling Down on Numbers
While many students have taken advanced math at some point in their academic lives, most need a refresher on basic concepts. Working with raw data in tabular form can seem like a novel task, even though the analytical tasks of arithmetic, ratios, rates and the like are not particularly complex. This week focuses on familiarizing students with basic strategies for doing data analysis and introducing some frameworks for critical thinking.
Class 1: Specifics of counting and quantification
Readings:
- The Curious Journalist’s Guide to Data, Stray, pp. 1-47
- “Become Data Literate in Three Steps,” The Data Journalism Handbook
- “Math Basics for Journalists: Working with Averages and Percentages,” Journalist’s Resource
- “Tips for Journalists Working with Math, Statistics: A List of Key Resources,” Journalist’s Resource
Activity:
Use the DataBasic.io tutorial on data in tabular form and CSV files to explore data on passengers of the Titanic. Look at the visualization of the data for each column in the dataset and discuss the nature of the data offered, the inferences that could be made and the limits of the data.
Class 2: Numeracy and the importance of critical thinking
Readings:
- Data Literacy, Herzog, Sections I-II, pp. 1-64
- Philip Meyer, “Mathematics Competency Test for Journalists”
- “Guide to Critical Thinking, Research, Data and Theory: Overview for Journalists,” Steven Van Evera/Journalist’s Resource
Activity:
Students should explore the website CensusReporter and identify towns or cities they might have an interest in covering. They should review the demographic profiles of these municipalities, note interesting patterns and compile a list of ideas for stories they might pursue using this data.
Dataset & data story of the week:
“The Deadliest Jobs in America,” Bloomberg News, May 2015
Census of Fatal Occupational Injuries, U.S. Department of Labor
Week 2: Data in Tabular Form: The Fundamentals
This week focuses on the core skills of data manipulation. To facilitate foundational knowledge in how to manipulate and analyze data in tabular form, instructors should assign the NICAR Coursepack, or a similar sequence of Excel, Google Sheets or other spreadsheet-oriented exercises.
To frame this essential but often painstaking work, students should read the interviews that Journalist’s Resource has done with two prominent data journalists, Sarah Cohen of the New York Times and Steve Doig of Arizona State University, as well as Scott Klein’s 2016 article published in Nieman Journalism Lab, “Want to Start a Small Data Journalism Team in Your Newsroom? Here are 8 Steps.” With these expert views in mind, students should reflect on the skills they are building, the areas in which they want to build further knowledge and what they believe are keys for success in the field.
Class 1: Sorting, Summing and Percentage Change
Readings/Materials:
- Exercises 1-5 in NICAR Coursepack
- “Some Elements of Data Analysis,” The New Precision Journalism
Activity:
Students should explore ProPublica’s “Debt by Degrees” database, which provides information on student debt issues and schools. Afterward, students should identify patterns and potential stories they think would interest news audiences in their state and region.
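The three operations named in this class's title can also be expressed in a few lines of code. The sketch below is illustrative only: the department names and dollar figures are made up, and the coursepack exercises themselves are done in a spreadsheet, not in Python.

```python
# Hypothetical spending by department, in millions of dollars.
# These figures are invented for illustration and are not real data.
payroll = [
    {"dept": "Police", "y2013": 300.0, "y2014": 345.0},
    {"dept": "Fire", "y2013": 200.0, "y2014": 190.0},
    {"dept": "Schools", "y2013": 500.0, "y2014": 510.0},
]


def pct_change(old, new):
    """Percentage change from old to new: (new - old) / old * 100."""
    return (new - old) / old * 100


# Summing: total spending in each year.
total_2013 = sum(row["y2013"] for row in payroll)
total_2014 = sum(row["y2014"] for row in payroll)

# Percentage change for each row.
for row in payroll:
    row["change"] = pct_change(row["y2013"], row["y2014"])

# Sorting: largest increase first, largest decrease last.
ranked = sorted(payroll, key=lambda row: row["change"], reverse=True)

for row in ranked:
    print(f'{row["dept"]}: {row["change"]:+.1f}%')
print(f"Total: {pct_change(total_2013, total_2014):+.1f}%")
```

Note that the change in the total is not the average of the row-level changes, because the departments differ in size. That distinction is a common source of error in stories built on percentage change.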
Class 2:
Readings/Materials:
- Exercises 6-9 in NICAR Coursepack
- Various selections from “Case Studies,” The Data Journalism Handbook
- Video: “The Newest Muckrakers: Investigative Reporting in the Age of Data Science,” Sarah Cohen, C+J Symposium Stanford, 2016
Activity:
Using sorting and filtering techniques, students should use data collected through 311 telephone calls to practice mapping civic complaints in a city. To locate and map the data, follow the Storybench.org tutorial and import the data into Carto.com.
Dataset & data story of the week:
Andrew Ryan, et al., “City Payroll Soars after Police and Fire Deals,” The Boston Globe, 2015
City of Boston employee payroll data, 2014
Week 3: Challenges with Data: Finding and Cleaning
Getting clean data is rarely easy, and it should come as a relief for data journalists to know that even the most accomplished data scientists spend a substantial amount of time cleaning and transforming datasets for use. It is slow and patient work, requiring rigorous systems and work sequences to ensure data integrity at all steps of the process. Still, there are large, professionally-curated administrative datasets that are increasingly easy to use and can be accessed from statistical collection agencies at the federal and state levels of government. (See Journalist’s Resource to find all of the federal government’s administrative datasets in one place.) This week, students will look at some of the challenges associated with data requests, cleaning and analysis.
Class 1:
Readings:
- Steve Lohr, “For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights,” The New York Times, 2014
- Video: “Let Lookup Save You from Boring, Repetitive Work You’ve Forgotten You’re Even Doing,” NICAR 2016, Christopher Groskopf
- “Key Data- and Research-Oriented Government Agencies that Media Members Should Know About,” Journalist’s Resource
- Video: “Finding Story Ideas in Large Datasets,” C+J Symposium Stanford, 2016
Activity:
Public records requests are a key piece of data journalism. Students should review the activities of a government information-requesting project called MuckRock as well as another project called FOIA Mapper. Students should make a request for data through MuckRock or directly through a government website. Careful consideration should be given to the scope of the request and the language used. Use the search tools at FOIA Mapper to review similar requests.
Class 2:
Readings:
- Data Literacy, Herzog, Section III, pp. 65-112
- “Getting Data from the Web,” The Data Journalism Handbook
- “Using Python to Scrape a Website and Gather Data: Practicing on a Criminal Justice Dataset,” Journalist’s Resource, 2014
- Select articles from The Upshot and FiveThirtyEight
Activity:
Students should download OpenRefine (Mac and PC versions), and try cleaning some data with it. They should also try Tabula to extract data tables from PDF files. Specifically, students might use these tools to review documents and data from local charities or nonprofit organizations. See “Investigating Nonprofits and Charities: Where to Find Internal Data, Public Records,” from Journalist’s Resource.
Dataset & data story of the week:
Ben Casselman, “Where Police Have Killed Americans In 2015,” FiveThirtyEight
Police Killings, FiveThirtyEight/data, GitHub
Week 4: Statistics: Basics of Inference, Correlation, Probability
The ability to manipulate numbers in a sophisticated way is increasingly important in data journalism. This week presents a variety of perspectives and empowers students with knowledge and tools to both interpret and perform some of this work.
Class 1:
Readings:
- The Curious Journalist’s Guide to Data, Stray, pp. 48-95
- Video: “Solve Every Statistics Problem with One Weird Trick,” NICAR 2016, Jonathan Stray
- “Chapter 3 – Harnessing the Power of Statistics,” Philip Meyer, The New Precision Journalism
- Tara Parker-Pope, “The Dark Arts of Statistical Deception,” The New York Times, 2010
- Classic F.J. Anscombe 1973 paper on statistical misrepresentation, famous “quartet”
- Video: “I Improved My Math Fluency, And So Can You,” NICAR 2016, Ryann Jones
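The point of the Anscombe quartet listed above can be checked directly. The sketch below uses the first two of the four datasets, with values as they are commonly reproduced from the 1973 paper (instructors may want to verify them against the original). The helper functions `mean` and `pearson_r` are written out here for illustration and are not part of any assigned reading.

```python
from math import sqrt

# x values shared by the first two datasets in the quartet.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
# y values for dataset I (roughly linear) and dataset II (a smooth curve).
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


for label, ys in (("I", y1), ("II", y2)):
    print(f"Dataset {label}: mean y = {mean(ys):.2f}, r = {pearson_r(x, ys):.3f}")
```

The summary statistics come out nearly identical even though a scatter plot of each dataset looks entirely different, which is why the paper argues for plotting data before trusting a summary.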
Activity:
To prepare students to be critical consumers and producers of representations of data, they should review the practices displayed in “How to Spot Visualization Lies,” by Nathan Yau (Flowing Data, 2017). In teams, students should find 5 to 10 visualizations they find online that are flawed in some way and then describe how these visualizations could be improved.
Class 2: Polling and surveys
Readings:
- “Polling Fundamentals and Concepts: An Overview for Journalists,” Journalist’s Resource
- “Chapter 5 – Surveys,” Philip Meyer, The New Precision Journalism
- Nate Silver’s FiveThirtyEight mission statement/manifesto
- “Reporting with Web and Social Media Data: Some Helpful Tools,” Journalist’s Resource
- Video: Jennifer LaFleur, “Cats and Stats,” NICAR 2016
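One concept that recurs in these polling readings is the margin of sampling error. The sketch below shows the standard textbook formula for a proportion at roughly 95 percent confidence. It assumes a simple random sample, which real polls rarely are; weighting and design effects typically make the true margin larger. The function name `margin_of_error` is chosen here for illustration.

```python
from math import sqrt


def margin_of_error(p, n, z=1.96):
    """Margin of sampling error for a proportion p from a simple random
    sample of size n. z = 1.96 corresponds to roughly 95% confidence."""
    return z * sqrt(p * (1 - p) / n)


# A candidate polling at 50 percent in samples of different sizes.
for n in (250, 1000, 4000):
    moe = margin_of_error(0.5, n)
    print(f"n = {n}: plus or minus {moe * 100:.1f} percentage points")
```

Because the sample size sits under a square root, quadrupling the number of respondents only halves the margin of error. The formula also covers sampling error alone, not errors from question wording, nonresponse or likely-voter screens.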
Activity:
Read: Harry Enten, “13 Tips For Reading General Election Polls Like A Pro,” from FiveThirtyEight. Discuss the national election results of 2016 and problems with polling. Students should blog about important lessons learned.
Dataset & data story of the week:
Gabriel Dance, Tom Meagher, “Crime in Context,” The Marshall Project, 2016
FBI’s “Crime in the United States, 2015” report
Week 5: Visualization Foundations
The art of data visualization has many forms and degrees of sophistication, from basic web applications to programming tools such as the JavaScript library D3 and the language R. Simplicity and clarity are the chief virtues of data graphics. But interactive functions, which can be complicated to create, often help audiences explore data in layers and home in on the specific facts and information most relevant to their own lives. This week looks at basic concepts and explores compelling recent examples in journalism.
Class 1: Visualization basics
Readings:
- The Functional Art, Introduction and Chapter 1, xv-xxi; pp. 6-23
- Steven Braun, “Elements of Information Design,” 2016
- Michael Friendly, “A Brief History of Data Visualization,” 2006
Activity: Break into teams and work on ways to show the relationships between two quantities through visual encoding. Refer to “45 Ways to Communicate Two Quantities,” Santiago Ortiz, 2013.
Class 2: Data visualization in journalism practice
Readings:
- David Sleight & Scott Klein, “The Year in Visual and Interactive Storytelling,” ProPublica, 2016
- Ariel Zambelich, “(Some Of) Our Favorite Visual Stories of 2016,” NPR, 2016
- Kayla Darling, “15 Cool Data Visualization Examples from 2016,” VisMe, 2016
- Steven Braun, “Want to Make a Digital Map?” and “Where Will Maps Take You Today?”, 2016
Activity: Make a simple line graph, charting a single variable over time. You might use this Journalist’s Resource post for guidance: “Dataset Digest: From Data.gov to Chartbuilder, A Lesson with Organic Farm Data.”
Dataset & data story of the week:
Jennifer Oldham, “Exhaustion Is Her Copilot: 6 Days with a Michigan Trucker,” Bloomberg News, 2014; (graphic) “Trucker’s Odyssey,” Bloomberg News, 2014
Large Truck and Bus Crash Facts, U.S. Department of Transportation
Week 6: Advanced Visualization Techniques
This week looks at some of the research and deeper thinking related to data visualization. Some foundational studies in the field are introduced, and some noteworthy applications are explored.
Class 1:
Readings:
- “Building and Shifting Visual Narrative,” Steven Braun, Northeastern, 2016
- John Wihbey/Michelle Borkin, “Understanding What Makes a Visualization Memorable,” Storybench, 2015
- “Get It Right in Black and White,” Maureen Stone, blog, 2010
- Edward Segel, Jeffrey Heer, “Narrative Visualization: Telling Stories with Data,” IEEE Transactions on Visualization and Computer Graphics, 2010
- Famous critique of pie charts: “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods,” William S. Cleveland and Robert McGill, Journal of the American Statistical Association, 1984
Activity:
Tableau is a visualization software tool that classes can use for free; Tableau Public is a web application that also is free. See data journalism examples at Tableau and a gallery of recent hits. Students should familiarize themselves with Tableau’s user interface and then produce a visualization of moderate complexity.
Class 2:
Readings:
- Bahareh Heravi, “What Makes a Winning Data Story?” Data Driven Journalism, 2017
- Jill Larkin, Herbert Simon, “Why a Diagram is (Sometimes) Worth Ten Thousand Words,” Cognitive Science, 1987
- Ben Shneiderman, “The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations,” Proceedings of the IEEE Symposium on Visual Languages, 1996
- Video: “Visualization and Journalism,” Tamara Munzner, C+J Symposium Stanford, 2016
Activity:
Students should review the following: “10 Things You Can Learn from the New York Times’ Data Visualizations,” Andy Kirk, 2012; and selections from Flowing Data’s library of infographics. Students should write a blog post discussing two or three data visualizations and explain what specific techniques make these examples stand out.
Data story of the week:
Students should review selections from the New York Times’s Graphics department.
Week 7: Interpreting Academic Research: Part 1
The world of academic research is part of data journalism. Several leading news sites do work in this area, including FiveThirtyEight, Vox and The Upshot. They focus heavily on new research findings. Students should familiarize themselves with academic search engines and databases such as Google Scholar, PubMed, Microsoft Academic Search and the National Bureau of Economic Research.
Class 1:
Readings:
- “Academic Research and Studies: How They Work and Why Journalists Should Care,” Journalist’s Resource
- “How to Tell Good Research from Bad: 13 Questions to Ask” and “Eight Questions to Ask When Interpreting Academic Studies: A Primer for Media,” Journalist’s Resource
- “Statistical Terms Used in Research Studies: A Primer for Media,” Journalist’s Resource
Activity:
Students should use the Journalist’s Resource database to identify several studies for potential reporting projects. They should draw up a list of questions to ask the researchers who authored the studies selected.
Class 2:
Readings:
- Christie Aschwanden, “Science Isn’t Broken: It’s Just a Hell of a Lot Harder than We Give It Credit For,” FiveThirtyEight, 2016
- “10 Things We Wish We’d Known Earlier About Research: Tips from Journalist’s Resource Staff,” Journalist’s Resource
- Mark Johnson, “Murray’s Problem,” Milwaukee Journal Sentinel, 2015
Activity:
Students should locate studies on the Journalist’s Resource website that have a strong geographic dimension and then relate them to public policy issues. Review the studies to generate ideas about the kinds of data that can spark good stories. Using the mapping application Carto.com, students should map government data in a way that clearly informs the public about an important policy issue (for example, polluting factories in a state or schools that underperform).
Data story of the week:
“Poisoned Places,” NPR and Center for Public Integrity, 2014
Toxics Release Inventory (TRI) Program, U.S. Environmental Protection Agency
Week 8: Interpreting Academic Research: Part 2
This week deepens students’ understanding of academic research and data analysis. The second class proposes a case study around climate change, prompting students to use research to inform their reporting on relevant local data.
Class 1:
Readings:
- “Regression Analysis: A Quick Primer for Media on a Fundamental Form of Data Crunching,” Journalist’s Resource
- “The Journalistic Method: Five Principles for Blending Analysis and Narrative,” Journalist’s Resource
- “Writing about a Research Study: Good Examples of Using Scholarship in Reporting,” Journalist’s Resource
- “Research-Based Ideas for College Campus Reporting: Potential Stories,” Journalist’s Resource
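To make the regression primer in this reading list concrete, the sketch below fits a least-squares line to a tiny dataset. The data are entirely hypothetical (invented income and test-score figures for five imaginary districts), and the function name `fit_line` is chosen here for illustration; in practice students would more likely use a spreadsheet function or a statistics package.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.
    Returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the co-variation of x and y divided by the variation in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical figures for five imaginary school districts.
income = [30, 40, 50, 60, 70]  # median household income, thousands of dollars
score = [62, 68, 71, 79, 80]   # average test score

slope, intercept = fit_line(income, score)
print(f"score = {intercept:.1f} + {slope:.2f} * income")
```

The slope describes an association in the sample: each additional $1,000 of income goes with a given change in the average score. It does not show that income causes the difference, and a study's authors are the right people to ask about omitted variables.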
Activity:
Students should use the application Timeline JS to create a sequence that tells the story of the evolution of knowledge in a field of study. For example, the timeline might cover landmark studies published in cancer research, or major works that look at poverty in U.S. cities. The timeline need not be comprehensive, but it should help an audience understand how knowledge about an issue has grown and changed through academic research. Using Google Scholar, students can find the most highly cited studies on any given topic and examine chronology and citations.
Class 2: Scientific literature and data: Climate change case study
Readings:
- “Global Warming, Rising Seas and Coastal Cities: Trends, Impacts and Adaptation Strategies,” Journalist’s Resource
- Maxwell Boykoff, Jules Boykoff, “Balance as Bias: Global Warming and the U.S. Prestige Press,” Global Environmental Change, 2004
- “Localizing the Climate Change Mitigation Story in Your State and Region: Some Data Tools,” Journalist’s Resource
- John Wihbey, “Did Media Flub It or Ace It in News of Landmark Ice Sheet Study?” Yale Climate Connections, 2012
Activity:
Students should review “Writing about Think Tanks and Using Their Research: A Cautionary Tip Sheet,” from Journalist’s Resource. They should find studies from think tanks that may be biased for partisan reasons or potentially compromised by industry funding. In a blog post, they should discuss pieces of research that may be problematic for journalists and how that research might be properly cited and contextualized, if used at all.
Dataset & data story of the week:
Students should review selections from ProPublica’s data section.
Week 9: Special Topic 1: Health, Well-being and Medical Data
Health and medicine are tricky terrain for journalists, as new studies and data can be of utmost public importance but also promoted with hype and spin. In addition, health and medical topics are fraught with statistical perils.
To get students thinking, they might watch the well-known TED Talk by Hans Rosling, “The Best Stats You’ve Ever Seen,” about global health statistics and data problems, and review “Statistics for Journalists,” by Connie St. Louis. Students should then find an article or blog where they believe health, medical or epidemiological statistics might be used in a misleading way. They should write a short critique raising questions about the news item.
Class 1:
Readings:
- Jeff Leek, “Finally, a Formula For Decoding Health News,” FiveThirtyEight, 2014
- “Mining Census Data to Better Cover the Health-Gap Story: A Tip Sheet from AHCJ,” Journalist’s Resource
- “How to Cover Hospitals” and “Tools for Covering Hospitals,” Association of Health Care Journalists
Activity:
Students should use the federal Centers for Medicare & Medicaid Services’ Open Payments database to examine patterns of payments among doctors in their community. They should compile the data and create visualizations that could inform a news audience.
Class 2:
Readings:
- “Reporting on Health Risk in Medical Studies, Pharmaceutical Trials: Tips from The Poynter Institute,” Journalist’s Resource
- L. Schwartz, S. Woloshin, “The Media Matter: A Call for Straightforward Medical Reporting,” Annals of Internal Medicine, February 2004
- Michael Berens, “Seniors for Sale: Exploiting the Aged and Frail in Washington’s Adult Family Homes,” Seattle Times, 2011
- Students should familiarize themselves with U.S. county health rankings, census data and databases from The Commonwealth Fund, Robert Wood Johnson Foundation, and the University of Wisconsin’s Population Health Institute.
Activity:
Explore the ProPublica database “Treatment Tracker” and the associated story. Students should look at the “Local Stories” column on the ProPublica site and examine how other news outlets brought subsets of data to their audiences. With that in mind, students should produce a data graphic that can tell a story relevant to a local audience.
Dataset of the week:
Andrea Ball, Eric Dexheimer, “Missed Signs. Fatal Consequences,” Austin American-Statesman, 2015
Child abuse and neglect fatality database, Texas Child Protective Services, Austin American-Statesman
Week 10: Special Topic 2: Economic and Business Data
Perhaps the first subfield of journalism to embrace data, economic and business reporting is full of numbers and figures. But it is also a field filled with confusing and highly specialized subject areas, where numbers require a lot of context for interpretation. This week looks at select topics reporters might encounter, from accounting and small business concerns to housing and trade.
Class 1:
Readings:
- “Finding and Reading a Balance Sheet: Accounting Basics for Journalists,” Journalist’s Resource
- “Property Taxes 101: A Primer for Journalists,” Journalist’s Resource
- Students should look up reports for a local company on the federal government’s Occupational Safety and Health Administration website
- Allison Schrager, “The Problem with Data Journalism,” Quartz, 2014
Activity:
Review the Journalist’s Resource tip sheet “Free tools for Visualizing Economic Data” and create a chart or graph using FRED and World Bank applications.
Class 2:
Readings:
- Rebecca Greenfield, Kim Bhasin, “The Future of Shopping: Trapping You In A Club You Didn’t Know You Joined,” Bloomberg News, 2016
- “Housing Prices and Affordability: Where to Find Data,” Journalist’s Resource
- “Finding Trade and Tariff Data: Tips for Journalists,” Journalist’s Resource
Activity:
Investigate and chart local housing price trends over time. Use the application Plot.ly to make a chart or graph.
Dataset & data story of the week:
Find and review data relating to the intersection of business and politics at the Sunlight Foundation, OpenSecrets.org and FollowtheMoney.org.
Week 11: Special Topic 3: Crime and Public Safety Data
Crime and criminal justice reporting, among the most controversial areas of journalism, has attracted criticism from academic researchers for decades. News outlets’ tendency to hype violent crime and focus on episodic events can fuel public demands for all sorts of ill-advised policies. At the same time, journalists have also been accused of overlooking and ignoring important trends. This is difficult territory, and this week dives into some of these important issues.
Class 1:
Readings:
- Selections from The Marshall Project’s “Murder Rates” news story curation project
- Josh Keller, Adam Pearce, “This Small Indiana County Sends More People to Prison than San Francisco and Durham, N.C., combined. Why?” The New York Times, 2016
- “Data Journalism Lesson with Crime Stats: Parsing Close-Call Numbers,” Journalist’s Resource
Activity:
Examine the ProPublica project “Documenting Hate,” which attempts to collect data on hate crimes across the U.S. in a deeper, more thorough way than the government does. Ask students to sketch out and prototype a project of their own that would use crowdsourcing and networking techniques to collect hard-to-get data of some other kind in the field of criminal justice.
Class 2:
Readings:
- Philip Stinson, “Crime Stats Should Inform the Public. Trump is Misusing Them to Scare Us Instead.” The Washington Post, 2017
- Carl Bialik, “Scare Headlines Exaggerated the U.S. Crime Wave,” FiveThirtyEight, 2015
- “Exploring the Worldviews of Young Black Men in America: Research Chat with Sociologist Alford A. Young, Jr.,” Journalist’s Resource
Activity:
Use the following custom tutorial for Tableau visualizations of homicide and exoneration data. Students should follow the steps and produce both the bar chart and the tree map explained in the tutorial. Discuss how using a tree map can help reporters explore data.
Dataset & data story of the week:
“Fatal Force” dataset and series, The Washington Post
Week 12: Frontiers: Algorithms, Data Science, Artificial Intelligence
New trends in the fields of data science, machine learning and artificial intelligence may radically change the way journalists approach quantitative information. It is perhaps too soon to tell. But this week provides a solid overview of emerging fields and their possible implications.
Class 1:
Readings:
- Julia Angwin, Terry Parris Jr., Surya Mattu. “Breaking the Black Box: What Facebook Knows About You,” ProPublica, 2016
- Nicholas Diakopoulos, “Algorithmic Accountability,” Digital Journalism, 2014
- Keith Kirkpatrick, “Putting the Data Science Into Journalism,” Communications of the ACM, 2015
- “The Rise of Artificial Intelligence and the End of Code,” Jason Tanz, Wired, 2016
- Aleszu Bajak, “What Journalists Need to Know about Code,” Storybench, 2015
- “Top 10 Data Science Skills, and How to Learn Them,” Eileen McNulty, Dataconomy, 2014
Activity:
Students should familiarize themselves with the programming language R and how it is used in research and data journalism. Watch the following video: “FiveThirtyEight’s Data Journalism Workflow with R,” useR! 2016 Conference. Also, listen to this podcast: “Amanda Cox on Working With R, NYT Projects, Favorite Data,” Data Stories, 2016. Then walk through the tutorial “Getting Started with R in RStudio Notebooks” (Martin Frigaard, Storybench, 2016).
Class 2:
Readings:
- Peter Aldhous, Charles Seife, “Spies in the Skies,” BuzzFeed, 2016
- Video: “Stories By and About Algorithms,” C+J Symposium Stanford, 2016
- Charles Ornstein and Annie Waldman, “How We Analyzed Privacy Violation Data,” ProPublica, Dec. 2015
- Review the application DataWrangler from Stanford Viz Group
- Video: “Introduction to Data Science (Using Spreadsheets) (4 parts),” John Foreman, 2014
Activity:
Continuing with their work in R, students should complete the tutorial “How to Create a Simple Line Chart in R” (Aleszu Bajak, Storybench, 2017).
Dataset & data story of the week:
Julia Angwin, Jeff Larson, Surya Mattu, Lauren Kirchner, “Machine Bias,” ProPublica, 2016
COMPAS Recidivism Algorithm dataset, ProPublica
Week 13: Ethical Issues in Data Journalism
Data journalism is an exciting field, but it carries with it substantial responsibilities for reporters and editors, as they are often making original interpretations of datasets for the public. This final week looks at some of the pitfalls and problems associated with the field and shares cautionary lessons.
Class 1:
Readings:
- Samantha Sunne, “The Challenges and Possible Pitfalls of Data Journalism, and How You Can Avoid Them,” American Press Institute, 2016
- Helen Lewis, “When Is It Ethical to Publish Stolen Data?” Nieman Reports, 2015
- Sophie Chou, “To Scrape or Not to Scrape,” Storybench, 2016
- Seth C. Lewis, et al. “Big Data and Journalism: Epistemology, Expertise, Economics and Ethics,” Digital Journalism, 2015
Activity:
Read “Connecting the Dots” by Jacob Harris (2015) and discuss how people should or should not be represented through news visualizations. Students should find visualizations produced by news organizations, some that are exemplary and some that are possibly problematic.
Class 2:
Readings:
- “Data Journalism Ethics: Tricky Questions Buried in the Numbers,” Markkula Center for Applied Ethics, Santa Clara University
- Margaret Sullivan, “Times Magazine Editor on ‘Creative Apocalypse’ Article,” New York Times, Sept. 2015
Activity:
Read the following article from Alberto Cairo: “Data Journalism Needs to Up Its Own Standards” (Nieman Journalism Lab, July 2014). In a blog post, students should respond to the critiques presented in the article and suggest ways data journalists can overcome the challenges articulated.
Dataset & data story of the week:
Review the data collection and analysis efforts of the Texas Tribune.
_______
A special thanks to John Wihbey, assistant professor at Northeastern University and a consultant to Journalist’s Resource, for his help preparing this syllabus.