Last month I spoke to some health reporters from a national broadcaster about my favourite sources of health data. As part of that I wrote a post on Help Me Investigate. I’m cross-posting it here this month as I talk about sources of data on the Data Journalism MOOC:
I come upon examples of bad practice in publishing government data on a regular basis, but the Universal Jobmatch tool is an example so bad I just had to write about it. In fact, it’s worse than the old-fashioned data service that preceded it.
That older service was the Office for National Statistics’ labour market service NOMIS, which published data on Jobcentre vacancies and claimants until late 2012, when Jobcentre Plus was given responsibility for publishing the data using their Universal Jobmatch tool.
Despite a number of concerns, more than a year on, Universal Jobmatch‘s reports section has ignored at least half of the public data principles first drafted by the Government’s Public Sector Transparency Board in 2010, and published in 2012. Continue reading
When you start publishing online you move from the well-thumbed areas of defamation and libel, contempt of court and privilege and privacy to a whole new world of laws and licences.
This is a place where laws you never knew existed can be applied to your work – while other ones can come in surprisingly useful. Here are the key ones:
If people aren’t using data it isn’t just a problem for web developers – it’s a problem for journalists too. If not enough people are looking at information on crime, politics, health, education, or welfare then it makes our work harder.
On that subject, Tim Davies writes about the challenges of ‘getting data used’ and the inclination to focus on data-centric solutions. “Data quality, poor meta-data, inaccessible language, and the difficulty of finding wheat amongst the chaff of data were all diagnosed [at one hack day] as part of the problem,” he reports. “Yet these diagnosis and solutions are still based on linear thinking: when a dataset is truly accessible, then it will be used, and economic benefits will flow. Continue reading
Guest post by Duarte Romero
Since the start of the year the Argentinian newspaper ‘La Nación’ has been publishing ‘Nación Data’, a blog dedicated to data visualization, interactive projects and especially, all the news related with data journalism.
In this guest post, Damian Radcliffe highlights some topline developments in the hyper-local space during 2011. He also asks for your suggestions of great hyper-local content from 2011. His more detailed slides looking at the previous year are cross-posted at the bottom of this article.
2011 was a busy year across the hyper-local sphere, with a flurry of activity online as well as more traditional platforms such as TV, Radio and newspapers.
The Government’s plans for Local TV have been considerably developed, following the Shott Review just over a year ago. We now have a clearer indication of the areas which will be first on the list for these new services and how Ofcom might award these licences. What we don’t know is who will apply for these licences, or what their business models will be. But, this should become clear in the second half of the year.
Whilst the Leveson Inquiry hasn’t directly been looking at local media, it has been a part of the debate. Claire Enders outlined some of the challenges facing the regional and local press in a presentation showing declining revenue, jobs and advertising over the past five years. Her research suggests that the impact of “the move to digital” has been greater at a local level than at the nationals.
And on the content front, we saw Jeremy Hunt cite a number of hyper-local examples at the Oxford Media Convention, as well as record coverage for regional press and many hyper-local outlets as a result of the summer riots.
I’ve included more on all of these stories in my personal retrospective for the past year.
One area where I’d really welcome feedback is examples of hyper-local content you produced – or read – in 2011. I’m conscious that a lot of great material may not necessarily reach a wider audience, so do post your suggestions below and hopefully we can begin to redress that.
Tim Davies has done a wonderful job of combing through the fine print of the UK government’s Autumn statement open data measures (PDF), highlighting the dynamics that appear to be driving it, and the data conspicuous by its absence.
Here are the passages most relevant for journalists. Firstly, following the money and accountability:
“The [Data Strategy Board] body seeking public data will be reliant upon the profitability of the PDG [Public Data Group] in order to have the funding it needs to secure the release of data that, if properly released in free forms, would likely undermine the current trading revenue model of the PDG. That doesn’t look like the foundation for very independent and effective governance or regulation to open up core reference data!
“Furthermore, whilst the proposed terms for the DSB [Data Strategy Board] terms state that “Data users from outside the public sector, including representatives of commercial re-users and the Open Data community, will represent at least 30% of the members of DSB”, there are also challenges ahead to ensure data users from civil society interests are represented on the board”
Secondly, the emphasis on clinical data and issues surrounding privacy and the sale of personal data:
“The first measures in the Cabinet Office’s paper are explicitly not about open data as public data, but are about the restricted sharing of personal medical records with life-science research firms – with the intent of developing this sector of the economy. With a small nod to “identifying specified datasets for open publication and linkage”, the proposals are more centrally concerned with supporting the development of a Clinical Practice Research Datalink (CPRD) which will contain interlinked ‘unidentifiable, individual level’ health records, by which I interpret the ability to identify a particular individual with some set of data points recorded on them in primary and secondary care data, without the identity of the person being revealed.
“The place of this in open data measures raises a number of questions, such as whether the right constituencies have been consulted on these measures and why such a significant shift in how the NHS may be handing citizens personal data is included in proposals unlikely to be heavily scrutinised by patient groups? In the past, open data policies have been very clear that ‘personal data’ is out of scope – and the confusion here raises risks to public confidence in the open data agenda. Leaving this issue aside for the moment, we also need to critically explore the evidence that the release of detailed health data will “reinforce the UK’s position as a global centre for research and analytics and boost UK life sciences”. In theory, if life science data is released digitally and online, then the firms that can exploit it are not only UK firms – but the return on the release of UK citizens personal data could be gained anywhere in the world where the research skills to work with it exist.”
UPDATE: More on that in The Guardian.
Thirdly, it looks like this data will allow journalists to scrutinise welfare and credit (so plenty of material for the tabloids and mid-market press), but not data that scrutinises corporations or governments:
“When we look at the other administrative datasets proposed for release in the Measures the politicisation of open data release is evident: Fit Note Data; Universal Credit Data; and Welfare Data (again discussed for ‘linking’ implying we’re not just talking about aggregate statistics) are all proposed for increased release, with specific proposals to “increase their value to industry”. By contrast, no mention of releasing more details on the tax share paid by corporations, where the UK issues arms export licenses, or which organisations are responsible for the most employment law violations. Although the stated aims of the Measures include increasing “transparency and accountability” it would not be unreasonable to read the detail of the measures as very one-sided on this point: and emphasising industry exploitation of data far more than good governance and citizen rights with respect to data.
“The blurring of the line between ‘personal data’ and ‘open data’, and the state’s assumption of the right to share personal data for industrial gain should give cause for concern, and highlights the need for build a stronger constituency scrutinising government open data action.”
It’s nice to see a data initiative being greeted with a critical eye rather than Three Cheers for the Numbers.
UPDATE: On a similar note, Access Info Europe highlights problems with the Open Government Partnership, which “must significantly improve its internal access to information policy to meet the standards it is advancing”. Specifically:
“The policy should be reformed to incorporate basic open data principles such as that information will be made available in a machine-readable, electronic format free of restrictions on reuse.”
“A key problem is the lack of detail in the policy, which has the result of leaving important matters to the discretion of the OGP. Other key problems include:
» The failure of the policy to recognise the fundamental human right to information;
» The significantly overbroad and discretionary regime of exceptions;
» The failure of the draft Policy to put in place a system of protections and sanctions.”
In a guest post for the Online Journalism Blog, Christiane Schulzki-Haddouti explains how participants at an open data event helped crack open data on German aid spending. This post was originally published in German at Hyperland.
How does the foreign aid of Germany support other countries? The Federal Ministry of Economic Cooperation and Development (BMZ) releases no details, although about 6 billion euros is made available for aid every year. Now the Open Knowledge Foundation in Germany has broken down the data – with the unintended help of the OECD.
Until now it was a mystery to the German public which countries benefit, and to what extent, from their government’s spending on foreign aid: the BMZ publishes only a list of the countries that receive aid (PDF). It was also not known which particular sectors in these countries were being supported.
For the political scientist Christian Kreutz, member of the Open Knowledge Foundation Germany, the BMZ database for development aid was just disappointing:
“The relevant information is scattered, little data is available in open formats and a breakdown of financial data such as project expenses is not yet published.”
An institutional workaround
In the course of the Open Aid Data Conference in Berlin, participants decided to tackle the situation. They noted that there has long been a public database at international level which explains the expenditures at a larger scale: the BMZ regularly reports its data as part of “Official Development Assistance” (ODA) to the Organisation for Economic Co-operation and Development, better known as the OECD.
Now the data is also available on the website Aid Data.
For two days Christian Kreutz wrangled with the data sets, then he presented his first results on a new open-data map. More than half the ODA payments come from the BMZ, the rest come from other ministries. Kreutz concludes: “Hardly any country receives nothing.”
Interestingly, not only classic developing countries are supported. The lion’s share goes to BRIC countries, namely Brazil, Russia, India and China which have profited from high economic growth for years.
Russia received around 12 billion euros in the years 1995 to 2009, China and India around 6 and 4 billion euros respectively.
Current sites of conflict receive quite a lot of money: Iraq received 7 billion euros, with the majority coming from debt cancellation. A similar situation is found in Nigeria and Cameroon.
In comparison Afghanistan and Pakistan receive only about 1.2 billion euros.
Even authoritarian regimes benefit from German development aid: Syria received around 1 billion euros. A large proportion of the money is spent on debt relief as well as water and education projects.
Interestingly, however, some European states received more money: Poland got 2.8 billion, mainly going into the education sector.
EU aspirants Serbia and Turkey received 2 billion euros each.
Payment information was also combined with data from the Economist on democratic development. Here a kind of rule of thumb can be recognised: countries which are less democratic are encouraged.
Egypt, for example, not only received support for water projects and its textile industry, but also for its border police – by an unspecified federal ministry.
BMZ is opening up
The new aid data map does not break down numbers by donors yet. But it could do so, as the detailed OECD data supports it.
Christian Kreutz has filed a Freedom of Information Act request with the BMZ to get further data. But the ministry is already showing signs of movement: a spokesperson said that project funding data will be published soon on the ministry’s website.
The interesting question is how open and accessible the BMZ data will be. Recipients of ODA funds can not be inferred directly from the OECD database. Open data activists hope that the BMZ will not hide the data behind a restrictive search interface to prevent further analysis, à la Farmsubsidy.
There are two Cabinet Office consultations taking place at the moment around open data: one around data policy for the new Public Data Corporation (PDC), and another around the government’s policy around transparency and open data strategy.
This should be of enormous interest to any media organisation – a key opportunity to influence the availability of information of public interest.
For example, among the issues under consideration are (summed up by Tony Hirst): charging for PDC information, licensing and regulation.
These will all be vital elements in the future of journalism – news organisations and journalists should be vocal in shaping them.
The deadline for both consultations is October 27.
Instead it’s falling to the likes of Tony Hirst (an Open University academic), Dan Herbert (an Oxford Brookes academic) and Chris Taggart (a developer who used to be a magazine publisher) to fill the scrutiny gap. Recently all three have shone a light into the move towards transparency and open data which anyone with an interest in information would be advised to read.
Herbert wrote about the publication of the first Whole of Government Accounts for the UK.
And Taggart made one of the best presentations I’ve seen on the relationship between information and democracy.
What all three highlight is how control of information still represents the exercise of power, and how shifts in that control as a result of the transparency/open data/linked data agenda are open to abuse, gaming, or spin. Continue reading