
FAQ: Data journalism, laziness, information overload & localism

I seem to have lost the habit of publishing interview responses here under the FAQ category for the past year, but the following questions from a journalist, and my answers, were worth publishing in case anyone has the same questions:

Simon Rogers, Editor of the Datablog, said that he thinks in the future simply publishing the raw data will become acceptable journalism. Do you not think that an approach like this to raw data is lazy journalism? And equally, do you think that would be a type of journalism that the public will really be able to engage with?

It’s not lazy at all, and to think otherwise is pure journalistic egoism. We have a tendency to undervalue things because we haven’t invested our own effort in them, but their value lies in their usefulness, not in the effort. Increasingly I think being a journalist will be as much about making journalism possible for other people as it will be about creating that journalism yourself. You have to ask yourself: do I just want to write pretty stories, or allow people to hold power to account?

In a world where we can access information directly I think it’s a central function of journalists to make important information findable. The first level of that is to publish raw data.

It’s interesting to see that this seems to be a key principle for hyperlocal bloggers – making civic information findable.

The second level – if you have the time and resources – is then to analyse that raw data and pull stories out of it. But ultimately there will always be other ‘stories’ in the information that people want to find for themselves, which may be too specific to be of interest to the journalist or publisher.

The third level – which really requires a lot of investment – is to create tools that make it easier for the user to find what they want, to make it easier to understand (e.g. through visualisation), and to share it with others.

Do you think that a lot of the information can be quite overwhelming and sometimes not go anywhere?

Of course, but that isn’t a reason for not publishing the information. It’s natural that when the information is released some of it will attract more attention than other parts – but also, if other questions come up in future there is a dataset that people can go back and interrogate even if they didn’t at the time.

At the moment we have a lot of data but very few tools to interrogate it. That’s going to change – just in the last six months we’ve seen some fantastic new tools for filtering data, and the momentum is building in this area. It’s notable how many of the bids for the Knight News Challenge were data-related.

Additionally, do you think The Guardian will continue to pursue stories from the masses of data as consistently as they have done in previous years?

Yes, I think the Guardian has now built a reputation in this field and will want to maintain that, not to mention the fact that its reputation means it will attract more and more data-related stories, and benefit from the work of people outside the organisation who are interrogating data. They’ll also get better and better as they learn from experience.

And why do you think that smaller news resources struggle to use this sort of information as a source for news?

Partly because data has historically been more national than local. Even now I get frustrated when I find a dataset but then discover it’s only broken down into England, Wales, Scotland and Northern Ireland. But we are now finally getting more and more local data.

Also, at a local level journalists tend to be less specialised. On a national newspaper you might have a health or environment or financial reporter who is more used to dealing with figures and data. On a local newspaper that’s less likely – and there’s a high turnover of staff because of the low wages.