Category Archives: data journalism

A reading list for studying online journalism

As a new semester nears, I thought I would anticipate the ‘What should I read?’ enquiries by sharing an aggregated reading list from the classes I teach at both Birmingham City University and City University London. Here are 10 key topics with varying numbers of books for each – I’d very much welcome other suggestions:

  1. Working in networks: Yochai Benkler, The Wealth of Networks; Richard Millington, The Proven Path (PDF)
  2. Content strategy: John Battelle, The Search; Bill Tancer, Click; David Kirkpatrick, The Facebook Effect
  3. Platforms: Mark Luckie, The Digital Journalist’s Handbook
  4. Live and mobile journalism: Mark Briggs, Journalism Next; Dan Gillmor, Mediactive
  5. Multimedia: Janet Kolodzy, Convergence Journalism and Practicing Convergence Journalism; Atton & Hamilton, Alternative Journalism; Wilma de Jong, Creative Documentary
  6. UGC, social media and community management: Axel Bruns, Gatewatching; Andrew Lih, Wikipedia Revolution; Jeff Jarvis, What Would Google Do?
  7. Data journalism: Bradshaw and Rohumaa, The Online Journalism Handbook; Andrew Dilnot, The Tiger That Isn’t; Darrell Huff, How to Lie With Statistics; Dona Wong, The Wall Street Journal Guide to Information Graphics; Nathan Yau, Visualize This; Paul Bradshaw, Scraping for Journalists
  8. Law, ethics and online journalism: Friend and Singer, Online Journalism Ethics; Lawrence Lessig, Code; O’Hara and Shadbolt, Spy in the Coffee Machine; Curran, Fenton & Freedman, Misunderstanding the Internet
  9. Experimentation: Clay Shirky, Here Comes Everybody (ch10: Failure for Free); Michalko, Thinkertoys (ch9); Ian Bogost, Newsgames; Matt Mason, The Pirate’s Dilemma (ch5: Boundaries)
  10. Enterprise: Ken Doctor, Newsonomics; Simon Waldman, Creative Disruption; David Weinberger, Everything is Miscellaneous

You might also find previous posts useful:

Data journalism training – places available

If you want to learn some basic or intermediate data journalism skills, I’m running two single-day courses next week, with places still available.

The first is Introduction to data journalism: taming the numbers on Tuesday September 11.

The second is Intermediate data journalism: take data to the next level on Thursday September 13.

They’re being run with Journalism.co.uk and you can book places on either course through their site. If you book on both days you save £50.

How to teach a journalist programming

Cross-posted from Data Driven Journalism.

Earlier this year I set out to tackle a problem that was bothering me: journalists who had started to learn programming were giving up.

They were hitting a wall. In trying to learn the more advanced programming techniques – particularly those involved in scraping – they seemed to fall into one of two camps:

  • People who learned programming, but were taking far too long to apply it, and so losing momentum – the generalists
  • People who learned how to write one scraper, but could not extend it to others, and so became frustrated – the specialists

Has the increase in data changed your newsroom?

I’m currently researching whether newsrooms have been changed by the increase in availability of data – from FOI and data.gov sites to open data and APIs. Specifically I’m interested in the watchdog role of journalism, but any other uses are relevant too.

If you work in this area I’d really appreciate it if you can complete the survey below – and share it with others you know can contribute. Here it is.

A case study in online journalism part 3: ebooks (investigating the Olympic torch relay)

8000 Holes - book cover

In part 1 I outlined some of the data journalism processes involved in the Olympic torch relay investigation; in part 2 I explained how verification, SEO and ‘passive aggressive newsgathering’ played a role. This final part looks at how ebooks offered a new opportunity to tell the story in depth – and publish while the story was still topical.

Ebooks – publishing before the event has even finished

After a number of stories from a variety of angles I reached a fork in the road. It felt like we had been looking at this story from every angle. More than one editor, when presented with an update, said that they’d already ‘done the torch story’. I would have done the same.

But I thought of a quote on persistence from Ian Hislop that I’d published on the Help Me Investigate blog previously. “It is saying the same true thing again and again and again and again until the penny drops.”

Although it sometimes felt like we might be boring people with our insistence on continuing to dig, we needed, I felt, to say the same thing again. Not the story of ‘Executive carries the torch’ but how that executive and so many others came to carry it, why that mattered, and what the impact was. A longform report.

A case study in online journalism part 2: verification, SEO and collaboration (investigating the Olympic torch relay)

Having outlined some of the data journalism processes involved in the Olympic torch relay investigation, in part 2 I want to touch on how verification and ‘passive aggressive newsgathering’ played a role.

Verification: who’s who

Data in this story not only provided leads which needed verifying, but also helped verify leads from outside the data.

A case study in online journalism: investigating the Olympic torch relay

Infographic: Where did the Olympic torch relay places go? What we know so far

For the last two months I’ve been involved in an investigation which has used almost every technique in the online journalism toolbox. From its beginnings in data journalism, through collaboration, community management and SEO to ‘passive-aggressive’ newsgathering, verification and ebook publishing, it’s been a fascinating case study in such a range of ways I’m going to struggle to get them all down.

But I’m going to try.

My first ebook: Scraping For Journalists (and programming too)

Next week I will start publishing my first ebook: Scraping for Journalists.

Although I’ve written about scraping before on the blog, this book is designed to take the reader step by step through a series of tasks (a chapter each) which build a gradual understanding of the principles and techniques for tackling scraping problems. Everything has a direct application for journalism, and each principle is related to its application in scraping for newsgathering.

For example: the first scraper requires no programming knowledge, and is working within 5 minutes of reading.

I’m using Leanpub for this ebook, because it allows you to publish in installments and update the book for users – which suits a book like this perfectly, as I’ll be publishing chapters week by week, Codecademy-style.

If you want to be alerted when the book is ready, register on the book’s Leanpub page.

Let’s explode the myth that data journalism is ‘resource intensive’

"Data Journalism is very time consuming, needs experts, is hard to do with shrinking news rooms" Eva Linsinger, Profil

Is data journalism ‘time consuming’ or ‘resource intensive’? The excuse – and I think it is an excuse – seems to come up at an increasing number of events whenever data journalism is discussed. “It’s OK for the New York Times/Guardian/BBC,” goes the argument. “But how can our small team justify the resources – especially in a time of cutbacks?”

The idea that data journalism inherently requires extra resources is flawed – but understandable. Spectacular interactives, large scale datasets and investigative projects are the headliners of data journalism’s recent history. We have oohed and aahed over what has been achieved by programmer-journalists and data sleuths…

But that’s not all there is.

Two guest posts on using data journalism techniques to investigate the Olympics

Investigating corporate Olympic torchbearers - analysing the data and working collaboratively led to this photo of a 'torch kiss' between two retail bosses

If I’ve been a little quiet on the blog recently, it’s because I’ve been spending a lot of time involved in an investigation into the Olympic torch relay over on Help Me Investigate the Olympics.

I’ve written two guest posts – for The Guardian’s Data Blog and The Telegraph’s new Olympics infographics and data blog – talking about some of the processes involved in that investigation. Here are the key points: