This latest in the frequently asked questions series is an answer to an aspiring data journalism student who asks “Would you be able to direct me to any resources or text books that might help [prepare]?” Here are some recommendations I give to students on my MA in Data Journalism…
Books on data journalism as a profession
Data journalism isn’t just the application of a practical skill, but a profession with a culture, a history, and non-technical practices.
For that reason, probably the first thing to recommend is not a book but simply consuming as much data journalism, and journalism generally, as possible: reading, listening and watching. These mailing lists (and these) are a good start, and following data journalists on Twitter, along with the hashtag #ddj, will expose you to the debates taking place in the industry.
In terms of books, The Data Journalism Handbooks (one and two) are well worth browsing for case studies and advice on some of the softer elements of data projects (planning, ethics, law, collaboration, etc.).
For a deeper dive into newsrooms and the varied roles that data journalism takes on within them, Nikki Usher’s book Interactive Journalism is well worth reading.
Books for data analysis
I hope you’ll excuse me recommending my own books here first, as I wrote them to address the lack of existing books on data journalism techniques. Data Journalism Heist introduces basic spreadsheet skills (pivot tables), while Finding Stories With Spreadsheets goes into much more depth on a range of journalistic applications of Excel & Google Sheets functions.
Moving swiftly on… if you want to do data analysis with R you could go to Hadley Wickham’s R For Data Science, or if you prefer Python, data journalist Ben Welsh’s First Python Notebook ebook is a really good introduction to the pandas library.
With all of these books it helps if you have your own practical project to get stuck into — keep it relatively simple at first (like “what’s the most common crime in my area” using open data), then add complexity as you grow in confidence.
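To give a sense of how small that first project can be, here is a minimal sketch of the "most common crime in my area" question in Python. The sample data, the column names (month, area, crime_type) and the area name are all made up for illustration; a real open data download will have its own structure, and a library like pandas would do the same job in fewer lines.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for an open crime data download.
# The column names and values are assumptions, not a real dataset.
SAMPLE_CSV = """month,area,crime_type
2023-01,Northfield,Burglary
2023-01,Northfield,Vehicle crime
2023-01,Northfield,Burglary
2023-02,Northfield,Shoplifting
2023-02,Southside,Burglary
2023-02,Northfield,Burglary
2023-02,Southside,Vehicle crime
"""


def most_common_crime(rows, area):
    """Return (crime_type, count) for the most frequent crime in one area."""
    counts = Counter(row["crime_type"] for row in rows if row["area"] == area)
    # most_common(1) returns a list holding the single top (value, count) pair
    return counts.most_common(1)[0]


# Read the CSV text into a list of dictionaries, one per record
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

top_crime, top_count = most_common_crime(rows, "Northfield")
print(f"Most common crime in Northfield: {top_crime} ({top_count} records)")
```

Once that works, adding complexity might mean comparing areas, or looking at how the counts change month by month.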
Books for data visualisation and interactivity
In terms of data visualisation, Dona Wong’s Wall Street Journal Guide to Information Graphics is both succinct and news-focused, and Alberto Cairo’s books The Functional Art and The Truthful Art are recommended next steps.
Ian Bogost et al.’s Newsgames is worth reading to give you ideas around interactivity and play in journalism.
It’s tricky to recommend practical data visualisation books as it depends what you’re trying to achieve (maps or charts? Interactivity?) and what language you prefer – but both Welsh’s and Wickham’s books mentioned above go into data visualisation in Python and R respectively.
Again, I would recommend having your own projects here, and keeping them simple to start with. Master the basic principles of visualisation (using the right chart, choosing colour schemes, etc.) using free tools such as Datawrapper, Infogram and Flourish first, before creating your own custom visualisations.
Books on data validity and statistics
Some reading on statistics and data validity is important. Jonathan Stray’s The Curious Journalist’s Guide to Data is a good one to start with.
Darrell Huff’s How to Lie With Statistics is a classic that also has the benefit of being a quick read, while The Tiger That Isn’t by Dilnot and Blastland looks at numbers from a journalistic perspective (it’s worth subscribing to Blastland’s BBC podcast More Or Less too).
Books on data cleaning
Data cleaning is enormously important but there aren’t many books on the skill. You will find it covered in some of the chapters in Finding Stories With Spreadsheets, as well as Hadley Wickham’s R For Data Science, but the Packt ebook Using OpenRefine has the advantage of covering cleaning with OpenRefine specifically, an essential addition to any data journalist’s toolkit.
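If you are wondering what cleaning looks like in practice, a typical problem is the same organisation being typed several different ways. The sketch below is a rough illustration in Python, not a description of how any of the tools above work internally: the council names are invented examples and clean_name is a hypothetical helper that only trims whitespace and standardises capitalisation.

```python
import re
from collections import Counter

# Invented examples of one field entered inconsistently by different people
RAW_NAMES = [
    "Birmingham City Council",
    " birmingham city council",
    "BIRMINGHAM  CITY COUNCIL ",
    "Leeds City Council",
    "leeds city council",
]


def clean_name(value):
    """Trim the ends, collapse repeated spaces and standardise capitalisation."""
    collapsed = re.sub(r"\s+", " ", value.strip())
    return collapsed.title()


cleaned = [clean_name(name) for name in RAW_NAMES]

# Before cleaning there are five 'different' names; afterwards there are two
print(len(set(RAW_NAMES)), "distinct names before cleaning")
print(Counter(cleaned))
```

Without that step, any count or pivot table would treat each spelling as a separate organisation, which is exactly the kind of error that ends up in a story.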
Don’t forget the non-data skills
So far all the recommendations have focused on the data side of data journalism, but data journalists also need to be able to find case studies and experts, interview people, understand complex systems, source key documents, and make a legal and ethical case for what they are doing.
On that front I particularly like David Randall’s focus on the softer skills of newsgathering in The Universal Journalist, while Tony Harcup’s Journalism: Principles and Practice takes in both the practical side and critical issues in the field. You should also have a book on media law and regulation in your country (in the UK that would be McNae’s).
A large part of the skill of data journalism is the ability to look at things from a different perspective, so I really like The Information by James Gleick: it offers a broader, conceptual understanding of data, which can help you see opportunities for data journalism that others might miss.