Tag Archives: open data

Interview: Ton Zijlstra on open data in the EU (audio)

A couple of weeks ago I spoke at the PICNIC festival in Amsterdam. While I was there I grabbed an interview with Ton Zijlstra, who has been following open data developments across EU governments very closely. You can find the interview embedded below:

[audio:http://audioboo.fm/boos/186944-ton-zijlstra-on-open-data-in-the-eu.mp3]

Charities data opened up – journalists: say thanks.

Having made significant inroads in opening up council and local election data, Chris Taggart has now opened up charities data from the less-than-open Charity Commission website. The result: a new website – Open Charities.

The man deserves a round of applause. Charity data is enormously important in all sorts of ways – and is likely to become more so as the government leans on the third sector to take on a bigger role in providing public services. Making it easier to join the dots between charitable organisations, the private and public sector, contracts and individuals – which is what Open Charities does – will help journalists and bloggers enormously.

A blog post by Chris explains the site and its background in more depth. In it he explains that:

“For now, it’s just a the simplest of things, a web application with a unique URL for every charity based on its charity number, and with the basic information for each charity available as data (XML, JSON and RDF). It’s also searchable, and sortable by most recent income and spending, and for linked data people there are dereferenceable Resource URIs.

“The entire database is available to download and reuse (under an open, share-alike attribution licence). It’s a compressed CSV file, weighing in at just under 20MB for the compressed version, and should probably only attempted by those familiar with manipulating large datasets (don’t try opening it up in your spreadsheet, for example). I’m also in the process of importing it into Google Fusion Tables (it’s still churning away in the background) and will post a link when it’s done.”
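If you do want to dig into a file like that without a spreadsheet, one option is to read it a row at a time. Here is a minimal sketch, assuming the download is gzip-compressed and has a header row; the function name and the column names `title` and `income` are hypothetical and may not match the real file.

```python
import csv
import gzip
import heapq


def largest_by_income(path, n=10, name_col="title", income_col="income"):
    """Return the n rows with the highest income as (income, name) tuples.

    Assumes a gzip-compressed CSV with a header row. The column names
    are illustrative assumptions, not taken from the real download.
    """
    def rows():
        # Text mode lets csv read decoded strings; rows are streamed one
        # at a time, so the whole file never has to fit in memory.
        with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                try:
                    income = float(row[income_col])
                except (KeyError, TypeError, ValueError):
                    continue  # skip rows with a missing or non-numeric income
                yield income, row.get(name_col, "")

    # nlargest keeps only n candidates in memory while it scans the stream.
    return heapq.nlargest(n, rows())
```

Calling `largest_by_income("charities.csv.gz", n=20)` (with whatever the file is really called) would give the twenty biggest charities by income, largest first.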

Chris promises to add more features “if there’s any interest”.

Well, go on…

Don't stop us digging into public spending data

A disturbing discovery by Chris Taggart last week: a number of councils in the UK are handing over their ‘open’ data to a company which only allows it to be downloaded for “personal” use.

As Chris himself points out, this runs completely against the spirit of the push to release public data in a number of ways:

  • Data cannot be used for “commercial gain”. This includes publishers wanting to present the information in ways that make most sense to the reader, and startups wanting to find innovative ways to involve people in their local area. Oh, and that whole ‘Big Society’ stuff.
  • The way the sites are built means you couldn’t scrape this information with a computer anyway.
  • It’s only a part of the data. “Download the data from SpotlightOnSpend and it’s rather different from the published data [on the Windsor & Maidenhead site]. Different in that it is missing core data that is in W&M published data (e.g. categories), and that includes data that isn’t in the published data (e.g. data from 2008).”

It’s a worrying path. As Chris sums it up: “Councils hand over all their valuable financial data to a company which aggregates for its own purposes, and, er, doesn’t open up the data, shooting down all those goals of mashing up the data, using the community to analyse and undermining much of the good work that’s been done.”

The Transparency Board quickly issued a statement saying that “urgent” measures are being taken to rectify the problem.

And Spikes Cavell, who make the software, responded in Information Age, pointing out that “it is first and foremost a spend analysis software and consultancy supplier, and that it publishes data through SpotlightOnSpend as a free, optional and supplementary service for its local government customers. The hope is that this might help the company to win business, he explains, but it is not a money-spinner in itself.”

They are now promising to make the data available for download in its “raw form”, although it’s not clear what that will be. Adrian Short’s comment to the piece is worth reading.

Nevertheless, this is an issue that anyone interested in holding power to account should keep a close eye on. To that end, Chris has started an investigation on Help Me Investigate to find out how and why councils are giving access to their spending data. Please join it and help here.

(Comment or email me on paul at helpmeinvestigate.com if you want an invitation.)

Get used to reading this…

“We have a team of developers going through the data now – and we’ll let you know here what we learn as and when we learn it.”

If you had any doubt over the concept of ‘programmer as journalist’, that quote above from The Guardian’s liveblog of the opening of the COINS database gives you a preview of things to come. While you’re at it, you might as well add in ‘statistician as journalist’ and ‘information designer as journalist’ – or look at my post from 2008 on New Journalists for New Information Flows. Are we there yet?

The Great Government Data Rush – what does it mean for journalists?

Earlier this week I posted briefly on what I consider to be the most significant move for journalism by the UK government since the Freedom of Information Act. But I wanted to look more systematically at what is likely to be a huge change in the information landscape that journalists deal with…

So. In the spirit of data journalism, here is an embedded spreadsheet of the timetable of data to be released by national government, local government, and other bodies. I’ve added notes on how I feel each piece of data could be important, and any useful links – but I’d like you to add any thoughts on other possibilities. Here it is:

Meanwhile, over at Data.gov.uk, the Local Data Panel has published a post inviting comment on the format that data might be supplied in, and the fields it might contain. Its initial advice:

  • As a first stage, publish the raw data and any lookup table needed to interpret it in a spreadsheet as a CSV or XML file as soon as possible. This should be put on the council’s website as a document for anyone to download. Or even published in a service such as Google Docs
  • There is not yet a national approach for publishing local authority expenditure data. This should not stop publication of data in its raw, machine-readable form. Observing such raw data being used is the only route to a national approach, should one be required
  • Publishing raw data will allow the panel and others to assess how that data could/should be presented to users. Sight of the data is worth a hundred meetings. Members of the panel will study the data, take part in the discussion and revise this advice.
  • As a second stage, informed by the discussion, the panel and users can then give feedback about publishing data (RDF, CSV, etc) in a way that can be consistent across all local authorities involving structured, regularly updated data published on the Web using open standards.

Help Me Investigate contributor and all-round good guy Neil Houston has already responded with some very interesting points.

“You’d be surprised how many times there are some systems where it’s not totally easily to identify the payment, back to the relevant invoice (apart from a manual reconciliation), you need to know the invoice side of the transactions – as that is where the cost will be booked to (as the payment details will just be crediting cash, debiting Accounts Payable).”
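To make Neil’s point concrete, here is a rough sketch of what that reconciliation looks like once you have both sides of the ledger. It assumes each payment carries an invoice reference that can be looked up; the function and the field names `invoice_ref` and `cost_category` are hypothetical, and real finance systems will differ.

```python
def reconcile(payments, invoices):
    """Attach each invoice's cost category to the payment that settled it.

    Payments whose reference is missing or unknown are returned
    separately, since those are the ones needing manual reconciliation.
    All field names here are illustrative assumptions.
    """
    # Index invoices by reference so each payment is a single lookup.
    by_ref = {inv["invoice_ref"]: inv for inv in invoices}
    matched, unmatched = [], []
    for pay in payments:
        inv = by_ref.get(pay.get("invoice_ref"))
        if inv is None:
            unmatched.append(pay)
        else:
            # The cost is booked on the invoice side, so copy it across.
            matched.append({**pay, "cost_category": inv["cost_category"]})
    return matched, unmatched
```

The size of the `unmatched` list is a quick measure of how much of a published payments file cannot be tied back to what the money was actually spent on.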

Open Data in Spain: AbreDatos

I come from Argentina, where the government isn’t obliged by law to release public information to citizens or NGOs that request it. There are, though, some access-to-information bills due to be discussed in Congress in the next few days. That is why I’m always amazed by all the open data initiatives in the USA and UK.

But now I can show you an open data project from Spain called Desafío AbreDatos, organized by the ProBonoPúblico association and supported by the Basque Government.

AbreDatos 2010 consists of two days of programming by groups of four developers, who build websites, apps, widgets or mashups using at least one data source from a public organization in digital format (APIs, XML, CSV, SPARQL/RDF, HTML, PDF, scanned images). Many of those sources are listed at datospublicos.jottit.com.

Of course the initiative wants to encourage the opening up of public data and transparency of administrations, and some of the projects are very interesting (my favorite is a website that shows whether Congress staff really earn their salaries).

One to keep an eye on.