Every news organisation should have a Datastore

You may know about The Guardian’s Datastore: a compilation of “publicly-available data for you to use free” that’s been around for a few months now. You know the sort of thing: university tables; MPs’ expenses; tax paid by the FTSE 100.

It has already produced some great work from what I once described as the “Technician” variant of distributed journalism.

But a recent column by Charles Arthur was the first example I’ve seen of Datastore being used for, well, more ordinary data – the sort of information journalists deal with every week. Here’s how it appeared in print:

“I dug up the figures from the UK music industry: the British record industry’s trade association (the BPI), and the UK games industry (via its trade body, Elspa) as well as the DVD industry (through the UK Film Council and the British Video Association). The results are over on the Guardian Data Store (http://bit.ly/data01), because they are the sort of numbers that should be available to everyone to chew over.

“What did I find? Total spending has grown – but music spending is being squeezed. The games industry – hardware and software – has grown from £1.4bn in 1999 (the year Napster started, and the music business stood rabbit-transfixed) to £4.04bn in 2008. That’s 12% annual compound growth. You’d kill for an endowment like that. Even DVD sales and rental take a £2.5bn bite out of consumers’ available funds, double that of 1999.

“So the music industry’s deadliest enemy isn’t filesharing – it’s the likes of Nintendo, Microsoft and Sony, and a zillion games publishers.”

That link (which frustratingly isn’t active in the online article) takes you to a Datablog post by Arthur which in turn links to a rather simple spreadsheet.

And it’s the simplicity that I think is important.

It’s one thing to link to huge datasets that benefit from lots of eyeballs looking for stories, or permuting the data in different ways.

But it’s something else to link to the more everyday figures journalists deal with; to show your sums, in short.
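To make “show your sums” concrete, here is a minimal sketch of the working behind the compound growth figure in the column quoted above. It uses only the two games-industry numbers given there (£1.4bn in 1999, £4.04bn in 2008). The function name and the choice of nine compounding periods are my own assumptions, not anything Arthur published, and his spreadsheet may well calculate it differently.

```python
# A sketch of the "show your sums" idea, using the games-industry
# figures quoted in the column. The function name and the choice of
# nine compounding periods (1999 to 2008) are assumptions.

def compound_annual_growth_rate(start_value, end_value, periods):
    """Return the constant yearly growth rate that turns start_value
    into end_value over the given number of periods."""
    return (end_value / start_value) ** (1 / periods) - 1


start = 1.4          # games industry, GBP billions, 1999 (from the column)
end = 4.04           # games industry, GBP billions, 2008 (from the column)
years = 2008 - 1999  # nine year-on-year steps (assumption)

rate = compound_annual_growth_rate(start, end, years)
print(f"Compound annual growth: {rate:.1%}")
```

Run on those two figures, this gives a rate between 12% and 13%, in line with the column’s “12%”. The useful part is that the assumption about the number of periods sits in plain view, where a reader can check or dispute it.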

Is this a natural extension of the blogging culture of linking to your sources? I think it is. And the more journalists get used to publishing their work on the likes of Google Spreadsheets, the better journalism we will get.

So why aren’t more journalists doing it? And why aren’t more news organisations providing a place for them to do it? Or are they? I’d love to know of any other individual or organisational examples.

8 thoughts on “Every news organisation should have a Datastore”

Tony Hirst June 15, 2009 at 11:07 am

A recent post on “Data Is A Dish Best Served Raw” [ http://eagereyes.org/data/dish-best-served-raw.html ] makes the point that summary data tables that describe the results from processing a raw data set are not that useful for doing further analysis.

However, if those summary tables are published with a link back to the original data set and an explanation (somewhere) of how the summary table was created, we have a case study that reveals some of the assumptions made in creating the summary table, as well as showing how raw data can be engaged with.

See also: Rory Cellan-Jones publishing the ‘raw audio’ of his interview with Tim Berners-Lee that provided the basis for the BBC dot.life blog post, “Sir Tim’s cry – ‘raw data now’” [ http://www.bbc.co.uk/blogs/technology/2009/06/sir_tims_cry_raw_data_now.html ]

And also the before and after of the Telegraph expenses letters: Playing Fair? MPs’ Expenses and a Tale of Three Media [ http://ouseful.wordpress.com/2009/06/04/playing-fair-mps-expenses-and-a-tale-of-three-media/ ]

As you say, it’s just the equivalent of “show your working” :-)


Charles June 15, 2009 at 3:32 pm

Thanks for the kind words. (And the link is now fixed, not foxed.)

As for showing your working – it’s a nice idea, but it’s also one that takes quite a lot of time if the working isn’t the core of the story. If it’s incidental, then assembling it into a narrative can take just as long as writing the story.

This was a data-driven story; many aren’t.

