Tag Archives: open data

When information is power, these are the questions we should be asking

Various commentators over the past year have made the observation that “Data is the new oil”. If that’s the case, journalists should be following the money. But they’re not.

Instead it’s falling to the likes of Tony Hirst (an Open University academic), Dan Herbert (an Oxford Brookes academic) and Chris Taggart (a developer who used to be a magazine publisher) to fill the scrutiny gap. Recently all three have shone a light on the move towards transparency and open data, in pieces that anyone with an interest in information would be well advised to read.

Hirst wrote a particularly detailed post breaking down the results of a consultation about higher education data.

Herbert wrote about the publication of the first Whole of Government Accounts for the UK.

And Taggart made one of the best presentations I’ve seen on the relationship between information and democracy.

What all three highlight is how control of information still represents the exercise of power, and how shifts in that control as a result of the transparency/open data/linked data agenda are open to abuse, gaming, or spin.

Is Ice Cream Strawberry? Part 4: Human Capital

This is the fourth part of my inaugural lecture at City University London, ‘Is Ice Cream Strawberry?’. You can find part one here, part two here, and part three here.

Human capital

So here’s person number 4: Gary Becker, a Nobel prize-winning economist.

Fifty years ago he used the phrase ‘human capital’ to refer to the economic value that companies should ascribe to their employees.

These days, of course, it is common sense to invest time in recruiting, training and retaining good employees. But at the time employees were seen as a cost.

We need a similar change in the way we see our readers – not as a cost on our time but as a valuable part of our operations that we should invest in recruiting, developing and retaining.

Tell the government what you want from the Public Data Corporation

Public Data Corporation consultation

If you are excited about the prospect of open data but frustrated by its execution (or are just one of those people who complain that data doesn’t change anything), the government are inviting comments on what shape the Public Data Corporation should take.

It’s a refreshingly simple execution: a WordPress blog with each question as a separate blog post – presumably it cost a lot less than £300,000. But of course the questions are theirs, and they are:

1. Which public sector datasets do you currently make use of?

2. How easy is it to find out what datasets are held by public sector organisations?

3. How do you, or would you, decide whether a dataset has value for you or for your organisation? What affects how valuable they are, for example timeliness, granularity, format?

4. Which datasets are of most value to you or your organisation? Why?

5. What methods of access to datasets would most benefit you or your organisation?

6. What gets in the way of you or your organisation accessing datasets or data products?

7. What are the most exciting applications of datasets or data products you are aware of – here or internationally? We are, again, particularly interested in the following areas: registration activities, environmental science, critical infrastructure and the built environment.

8. Are there any datasets or products you’d like to see generated? How would you or your organisation use them, and what social or economic benefits do you think they would deliver?

9. From your perspective, what would success look like for the Public Data Corporation?

10. Have we got the name for this organisation right? Do you have any suggestions on naming that might better convey our aims?

It’s a shame that there isn’t any space for more open discussion – and that so many of the questions resemble market research. But still, the more journalists who pile in, the more justifiably we can moan later. So go ahead.

Post your responses here.

3 new resources for data journalists

There has been a raft of new sites for data launched in the past couple of months which I haven’t had time to blog about, so here’s a quick round-up:

  • Tim Davies’ Open Data Cookbook aims to collect “step by step recipes for practical ways to use open data” – a useful complement to GetTheData. The recipes are currently aimed at the more technically minded but you know what to do to address that…
  • Is It Open Data? aims to “make it easy for people to make enquires of data holders, about the openness of the data they hold — and to record publicly the results of those efforts.”
  • And for those wishing to publish open data, The Open Data Manual provides information on what open data is, why you should publish open data, and how to do it. If you come up against an organisation that does not know how to publish their data in an open format, or needs convincing of why they should do so, this is a good place to point them to (or learn the arguments from).

If you’ve seen any other useful resources of late, please post a link in the comments.

Why journalists should be lobbying over police.uk’s crime data

UK police crime maps

Conrad Quilty-Harper writes about the new crime data from the UK police force – and in the process adds another straw to the groaning camel’s back of the government’s so-called transparency agenda:

“It’s useless to residents wanting to find out what was going on at the house around the corner at 3am last night, and it’s useless to individuals who want to build mobile phone applications on top of the data (perhaps to get a chunk of that £6 billion industry open data is supposed to create).

“The site’s limitations are as follows:

  • No IDs for crimes: what if I want to check whether real life crimes have made it onto the map? Sorry.
  • Six crime categories: including “other crimes”, everything from drug dealing to bank robberies in one handy, impossible to understand category.
  • No live data: you mean I have to wait until the end of the next month to see this month’s criminality?!
  • No dates or times: funny how without dates and times I can’t tell which police manager was in charge.
  • Case status: the police know how many crimes go solved or unsolved, why not tell us this?”

This is why people are so concerned about the Public Data Corporation. This is why we need to be monitoring exactly what spending data councils release, and in what format. And this is why we need to continue to press for the expansion of FOI laws. This is what we should be doing. Are we?

UPDATE: Will Perrin has FOI’d all correspondence relating to ICO advice on the crime maps. Jonathan Raper has a list of further flaws including:

  • Some data such as sexual offences and murder is removed – even though it would be easy to discover and locate from other police reports.
  • Data covers reported crimes rather than convictions, so some of it may turn out not to be crime.
  • The levels of policing are not provided, so that two areas with the “same” crime levels may in fact have “radically different” experiences of crime and policing.

Charles Arthur notes that: “Police forces have indicated that whenever a new set of data is uploaded – probably each month – the previous set will be removed from public view, making comparisons impossible unless outside developers actively store it.”

Louise Kidney says:

“What we’ve actually got with http://www.police.uk is neither one nor the other. Ruth looks like a crime overlord cos of all the crimes happening in her garden and we haven’t got exact point data, but we haven’t got first part of postcode data either e.g. BB5 crimes or NW1 crimes. Instead, we’ve got this weird halfway house thing where it’s not accurate, but its inaccuracy almost renders it useless because we don’t have any idea if every force uses the same parameters when picking these points, we don’t know how they pick their points, we don’t know what we don’t know in terms of whether one house in particular is causing a considerable issue with anti-social behaviour for example, allowing me to go to my local Council and demand they do something about it.”

Adrian Short argues that “What we’re looking at here isn’t a value-neutral scientific exercise in helping people to live their daily lives a little more easily, it’s an explicitly political attempt to shape the terms of a debate around the most fundamental changes in British policing in our lifetimes.”

He adds:

“It’s derived data that’s already been classified, rounded and lumped together in various ways, with a bit of location anonymising thrown in for good measure. I haven’t had a detailed look at it yet but I would caution against trying to use it for anything serious. A whole set of decisions have already transformed the raw source data (individual crime reports) into this derived dataset and you can’t undo them. You’ll just have to work within those decisions and stay extremely conscious that everything you produce with it will be prefixed, “as far as we can tell”.

“£300K for this? There ought to be a law against it.”

UPDATE 2: One frustrated developer has launched CrimeSearch.co.uk to provide “helpful information about crime and policing in your area, without costing 300k of tax payers’ money”
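
Charles Arthur’s point above, that each month’s data may be removed when the next is uploaded, is easy to act on. What follows is only a minimal sketch in Python, not anything police.uk provides: the file naming, the `archive_snapshot` and `rows_per_month` helpers and the sample rows are all made up for illustration, and the real downloads may be structured quite differently. The idea is simply to save every download under a dated name and never overwrite it, so that month-on-month comparisons remain possible.

```python
import csv
import tempfile
from pathlib import Path


def archive_snapshot(csv_text: str, month: str, archive_dir: Path) -> Path:
    """Save one month's download under a dated name, never overwriting."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"crime-{month}.csv"
    if not target.exists():
        target.write_text(csv_text, encoding="utf-8")
    return target


def rows_per_month(archive_dir: Path) -> dict:
    """Count the data rows in each archived snapshot, keyed by month."""
    counts = {}
    for path in sorted(archive_dir.glob("crime-*.csv")):
        month = path.stem[len("crime-"):]
        with path.open(newline="", encoding="utf-8") as handle:
            counts[month] = sum(1 for _ in csv.DictReader(handle))
    return counts


# Made-up sample data standing in for two monthly downloads.
december = "category,location\nburglary,High Street\nother crime,Mill Lane\n"
january = "category,location\nburglary,High Street\n"

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp)
    archive_snapshot(december, "2010-12", archive)
    archive_snapshot(january, "2011-01", archive)
    monthly_counts = rows_per_month(archive)

print(monthly_counts)
```

In practice you would point `archive_dir` at a permanent folder rather than a temporary one; the temporary directory is used here only so the sketch cleans up after itself.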

A portal for European government data: PublicData.eu plans

The Open Knowledge Foundation have published a blog post with notes on a site they’re developing to gather together data from across Europe. The post notes the growth of data catalogues at a national level (mentioning the Digitalisér.dk data portal run by the Danish National IT and Telecom Agency) and “countless city level initiatives across Europe as well – from Helsinki to Munich, Paris to Zaragoza”, with many more initiatives “in the pipeline with plans to launch in the next 6 to 12 months.”

PublicData.eu will, it says:

“Provide a single point of access to open, freely reusable datasets from numerous national, regional and local public bodies throughout Europe.

“[It] will harvest and federate this information to enable users to search, query, process, cache and perform other automated tasks on the data from a single place. This helps to solve the “discoverability problem” of finding interesting data across many different government websites, at many different levels of government, and across the many governments in Europe.”

What is perhaps even more interesting for journalists is that the site plans to:

“Capture (proposed) edits, annotations, comments and uploads from the broader community of public data users.”

That might include anything from cleaner versions of data, to instances where developers match datasets together, or where users add annotations that add context to a particular piece of information.

Finally there’s a general indication that the site hopes to further lower the bar for data and collaborative journalism by:

“Providing basic data analysis and visualisation tools together with more in-depth resources for those looking to dig deeper into the data. Users will be able to personalise their data browsing experience by being able to save links and create notes and comments on datasets.”

More in the post itself. Worth keeping an eye on.

Now corporations get the open data treatment

OpenCorporates: The Open Database Of The Corporate World

In September I blogged about Chris Taggart’s website Open Charities, which opened up data from the Charity Commission website.

Today Taggart – along with Rob McKinnon – launches Open Corporates, which opens up company information. This is a huge undertaking, but a vital one. As the site’s About page explains:

“Few parts of the corporate world are limited to a single country, and so the world needs a way of bringing the information together in a single place, and more than that, a place that’s accessible to anyone, not just those who subscribe to proprietary datasets.”

Taggart and McKinnon are well placed to do this. In addition to charities data, Taggart has created websites that make it easier to interrogate council spending data and hyperlocal websites; McKinnon has done the same for the New Zealand parliament and UK lobbying.

Below is a video explaining how you can interrogate data from the site using Google Refine. The site promises an API soon.

New UK site launches to tackle lobbying data

Who's Lobbying treemap

I’ve been waiting for the launch of Who’s Lobbying ever since they stuck up that little Post-It note on a holding page in the run-up to the general election. Well now the site is live – publishing and visualising lobbying data, beginning with information about “ministerial meetings with outside interests, based on the reports released by UK government departments in October.”

This information is presented on the homepage very simply: with 3 leaderboards and a lovely search interface.

Who's Lobbying homepage

There are also a couple of treemaps to explore, for a more visual (and clickable) kick.

These allow you to see more quickly any points of interest in particular areas. The Who’s Lobbying blog notes, for instance, that “the treemap shows about a quarter of the Department of Energy and Climate Change meetings are with power companies. Only a small fraction are with environmental or climate change organisations.”

It also critically notes in another post that

“The Number 10 flickr stream calls [its index to transparency] a “searchable online database of government transparency information”. However it is really just a page of links to department reports. Each report containing slightly different data. The reports are in a mix of PDF, CSV, and DOC formats.

“Unfortunately Number 10 and the Cabinet Office have not mandated a consistent format for publishing ministerial meeting information.

“The Ministry of Defence published data in a copy-protected PDF format, preventing copy and paste from the document.

“DEFRA failed to publish the name of each minister in its CSV formatted report.

“The Department for Transport is the only department transparent enough to publish the date of each meeting.

“All other departments only provided the month of each meeting – was that an instruction given centrally to departments? Because of this it isn’t possible to determine if two ministers were at the same meeting. Our analysis is likely to be double counting meetings with two ministers in attendance.

“Under the previous Labour government, departments had published dates for individual meetings. In this regard, are we seeing less transparency under the Conservative/Lib Dem coalition?”

When journalists start raising these questions then something will really have been achieved by the open data movement. In the meantime, we can look at Who’s Lobbying as a very welcome addition to a list of sites that feels quite weighty now: MySociety’s family of tools as the granddaddy, and ElectionLeaflets.org (formerly The Straight Choice), OpenlyLocal, Scraperwiki, Where Does My Money Go? and OpenCharities as the new breed (not to mention all the data-driven sites that sprang up around this year’s election). When they find their legs, they could potentially be quite powerful.
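
The double-counting problem that Who’s Lobbying describes is worth seeing in miniature. The sketch below is my own illustration in Python, not their method: the records, the field order and the two helper functions are all hypothetical. With only the month to go on, two ministers meeting the same organisation in the same month could have been in one meeting or two, so the honest answer is a range rather than a single figure.

```python
# Hypothetical records: (department, minister, organisation, month).
# The values are illustrative and not taken from any real departmental report.
records = [
    ("DECC", "Minister A", "Power Co", "2010-06"),
    ("DECC", "Minister B", "Power Co", "2010-06"),
    ("DECC", "Minister A", "Green Group", "2010-07"),
]


def count_meetings_upper(rows):
    """Treat every row as a separate meeting (this may double count)."""
    return len(rows)


def count_meetings_lower(rows):
    """Treat rows sharing department, organisation and month as one meeting."""
    return len({(dept, org, month) for dept, _minister, org, month in rows})


upper = count_meetings_upper(records)
lower = count_meetings_lower(records)
print(f"Between {lower} and {upper} meetings")
```

For these three made-up rows the range is between 2 and 3 meetings. Had the departments published the exact date of each meeting, the two figures would be far more likely to agree.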

Open data from the inside: Lichfield Council’s Stuart Harrison

I’m trying to get a feel for what some of the most innovative government departments and local authorities are doing around releasing data. I spoke to Stuart Harrison of Lichfield Council, which is leading the way at a local level.

What has been your involvement with open data so far?

I’ve been interested in open data for a few years now. It all started when I was building a site for food safety inspections in Staffordshire (http://www.ratemyplace.org.uk/), and after seeing the open APIs offered by sites such as Fixmystreet, Theyworkforyou etc, was inspired to add an API (http://www.ratemyplace.org.uk/api). This then got me thinking about all the data we publish on our website, and whether we could publish this in an open format. A trickle quickly turned into a flood and we now have over 50 individual items of open data at http://www.lichfielddc.gov.uk/data.

I think the main thing I’ve learnt is that APIs are great, but they’re not always necessary. My early work was on APIs that link directly into databases, but, as I’ve moved forward, I’ve found that this isn’t always necessary. While an API is nice to have, it’s sometimes much better to just get the data out there in a raw format.

What have people done with the data so far?

As we’re quite a small council, we haven’t had a lot of people doing work (that I know of) with much of our data. The biggest user of our data is probably Chris Taggart at Openly Local – I actually built an API (and extended the functionality of our existing councillor and committees system) to make it easier to republish. To be honest, unless I know the person and they actually told me, I doubt I’d actually know what was going on!

What do you plan to do next – and why?

Because of the problems stated before, we’ve got together with ScraperWiki to organise a Hacks and Hackers day on the 11th November, which will hopefully encourage developers and journalists to do something with our data, and also put the wheels in motion for organising a data-based community, which means that once someone does something with our data, we’re more likely to know about it!

Open data in Spain – guest post by Ricard Espelt

Ahead of speaking this week in Barcelona, I spoke to a few people in Spain about the situation regarding open data in the country. One of those people is Ricard Espelt, a member of Nuestracausa, “a group of people who wanted to work on projects like MySociety [in Spain]”. The group broke up and Ricard now runs Redall Comunicacao. Among Ricard’s projects is Copons 2.0: an “approach to consensus decision making”.

This is what Ricard had to say about the problems around open data, e-democracy and bottom-up projects in Spain:

I think there are three points to bear in mind when we try to analyse how the tools are changing politics & public administration:

  • The process by which governments review their data so that it is easier for all citizens to use. Open data.
  • The process by which governments involve citizens in decisions. E-democracy.
  • The action of citizens (individuals or groups) to engage other citizens in working for the community. This is a good way to lobby and to influence the decisions of governments.

Spain, like other countries, has been developing all these points with different levels of success.