Wonderful post by Tony Hirst in which he sort-of-coins* a lovely neologism in explaining how data can be “laundered”: “The Deloitte report was used as evidence by Facebook to demonstrate a particular economic benefit made possible by Facebook’s activities. The consultancy firm’s caveats were ignored (including the fact that the data may, in part at least, have come from Facebook itself) in reporting this
Read more…
There are two Cabinet Office consultations on open data taking place at the moment: one on data policy for the new Public Data Corporation (PDC), and another on the government’s transparency and open data strategy. This should be of enormous interest to any media organisation – a key opportunity to influence the availability of information of public interest.
Read more…
Various commentators over the past year have made the observation that “Data is the new oil”. If that’s the case, journalists should be following the money. But they’re not. Instead it’s falling to the likes of Tony Hirst (an Open University academic), Dan Herbert (an Oxford Brookes academic) and Chris Taggart (a developer who used to be a magazine publisher)
Read more…
It’s an often-encountered situation, but one that can be a pain to address – merging data from two sources around a common column. Here’s a way of doing it in Google Refine… Here are a couple of example datasets to import into separate Google Refine projects if you want to play along, both courtesy [...]
Last summer, at the European Centre for Journalism round table on data driven journalism, I remember saying something along the lines of “your eyes can often do the stats for you”, the implication being that our perceptual apparatus is good at pattern detection, and can often see things in the data that most of us [...]
I’m working on a new pattern using Google Refine as the hub for a data fusion experiment pulling together data from different sources. I’m not sure how it’ll play out in the end, but here are some fragments… Grab Data into Google Refine as CSV from a URL (Proxied Google Spreadsheet Query via Yahoo Pipes) [...]
Following the post earlier this week on XML and RSS for journalists I wanted to look at another important format for journalists working with data: JSON. JSON is a data format which has been rising in popularity over the past few years. Quite often it is offered alongside – or instead of – XML by various information services, such as
Read more…
Reading through the Online Journalism blog post on Getting full addresses for data from an FOI response (using APIs), the following phrase – relating to the composition of some Google Refine code to parse a JSON string from the Google geocoding API – jumped out at me: “This took a bit of trial and error…” [...]
A post on the Guardian Datablog earlier today took a dataset collected by the Tweetminster folk and graphed the sorts of thing that journalists tweet about (Journalists on Twitter: how do Britain’s news organisations tweet?). Tweetminster maintains separate lists of tweeting journalists for several different media groups, so it was easy to grab the [...]
Regular readers will know how I do quite like to dabble with visual analysis, so here are a couple of doodles with some of the university fees data that is starting to appear. The data set I’m using is a partial one, taken from the Guardian Datastore: Tuition fees 2012: what are the universities charging? [...]