Last month I spoke to some health reporters from a national broadcaster about my favourite sources of health data. As part of that I wrote a post on Help Me Investigate. I’m cross-posting it here this month as I talk about sources of data on the Data Journalism MOOC:
I come upon examples of bad practice in publishing government data on a regular basis, but the Universal Jobmatch tool is an example so bad I just had to write about it. In fact, it’s worse than the old-fashioned data service that preceded it.
That older service was the Office for National Statistics’ labour market service NOMIS, which published data on Jobcentre vacancies and claimants until late 2012, when Jobcentre Plus was given responsibility for publishing the data using their Universal Jobmatch tool.
Despite a number of concerns, more than a year on, Universal Jobmatch‘s reports section has ignored at least half of the public data principles first drafted by the Government’s Public Sector Transparency Board in 2010, and published in 2012. Continue reading
When you start publishing online you move from the well-thumbed areas of defamation and libel, contempt of court and privilege and privacy to a whole new world of laws and licences.
This is a place where laws you never knew existed can be applied to your work – while other ones can come in surprisingly useful. Here are the key ones:
If people aren’t using data it isn’t just a problem for web developers – it’s a problem for journalists too. If not enough people are looking at information on crime, politics, health, education, or welfare then it makes our work harder.
On that subject, Tim Davies writes about the challenges of ‘getting data used’ and the inclination to focus on data-centric solutions. “Data quality, poor meta-data, inaccessible language, and the difficulty of finding wheat amongst the chaff of data were all diagnosed [at one hack day] as part of the problem,” he reports. “Yet these diagnosis and solutions are still based on linear thinking: when a dataset is truly accessible, then it will be used, and economic benefits will flow. Continue reading
Guest post by Duarte Romero
Since the start of the year the Argentinian newspaper ‘La Nación’ has been publishing ‘Nación Data’, a blog dedicated to data visualization, interactive projects and especially, all the news related with data journalism.
In this guest post, Damian Radcliffe highlights some topline developments in the hyper-local space during 2011. He also asks for your suggestions of great hyper-local content from 2011. His more detailed slides looking at the previous year are cross-posted at the bottom of this article.
2011 was a busy year across the hyper-local sphere, with a flurry of activity online as well as more traditional platforms such as TV, Radio and newspapers.
The Government’s plans for Local TV have been considerably developed, following the Shott Review just over a year ago. We now have a clearer indication of the areas which will be first on the list for these new services and how Ofcom might award these licences. What we don’t know is who will apply for these licences, or what their business models will be. But, this should become clear in the second half of the year.
Whilst the Leveson Inquiry hasn’t directly been looking at local media, it has been a part of the debate. Claire Enders outlined some of the challenges facing the regional and local press in a presentation showing declining revenue, jobs and advertising over the past five years. Her research suggests that the impact of “the move to digital” has been greater at a local level than at the nationals.
And on the content front, we saw Jeremy Hunt cite a number of hyper-local examples at the Oxford Media Convention, as well as record coverage for regional press and many hyper-local outlets as a result of the summer riots.
I’ve included more on all of these stories in my personal retrospective for the past year.
One area where I’d really welcome feedback is examples of hyper-local content you produced – or read – in 2011. I’m conscious that a lot of great material may not necessarily reach a wider audience, so do post your suggestions below and hopefully we can begin to redress that.
Tim Davies has done a wonderful job of combing through the fine print of the UK government’s Autumn statement open data measures (PDF), highlighting the dynamics that appear to be driving it, and the data conspicuous by its absence.
Here are the passages most relevant for journalists. Firstly, following the money and accountability:
“The [Data Strategy Board] body seeking public data will be reliant upon the profitability of the PDG [Public Data Group] in order to have the funding it needs to secure the release of data that, if properly released in free forms, would likely undermine the current trading revenue model of the PDG. That doesn’t look like the foundation for very independent and effective governance or regulation to open up core reference data!
“Furthermore, whilst the proposed terms for the DSB [Data Strategy Board] terms state that “Data users from outside the public sector, including representatives of commercial re-users and the Open Data community, will represent at least 30% of the members of DSB”, there are also challenges ahead to ensure data users from civil society interests are represented on the board”
Secondly, the emphasis on clinical data and issues surrounding privacy and the sale of personal data:
“The first measures in the Cabinet Office’s paper are explicitly not about open data as public data, but are about the restricted sharing of personal medical records with life-science research firms – with the intent of developing this sector of the economy. With a small nod to “identifying specified datasets for open publication and linkage”, the proposals are more centrally concerned with supporting the development of a Clinical Practice Research Datalink (CPRD) which will contain interlinked ‘unidentifiable, individual level’ health records, by which I interpret the ability to identify a particular individual with some set of data points recorded on them in primary and secondary care data, without the identity of the person being revealed.
“The place of this in open data measures raises a number of questions, such as whether the right constituencies have been consulted on these measures and why such a significant shift in how the NHS may be handing citizens personal data is included in proposals unlikely to be heavily scrutinised by patient groups? In the past, open data policies have been very clear that ‘personal data’ is out of scope – and the confusion here raises risks to public confidence in the open data agenda. Leaving this issue aside for the moment, we also need to critically explore the evidence that the release of detailed health data will “reinforce the UK’s position as a global centre for research and analytics and boost UK life sciences”. In theory, if life science data is released digitally and online, then the firms that can exploit it are not only UK firms – but the return on the release of UK citizens personal data could be gained anywhere in the world where the research skills to work with it exist.”
UPDATE: More on that in The Guardian.
Thirdly, it looks like this data will allow journalists to scrutinise welfare and credit (so plenty of material for the tabloids and mid-market press), but not data that scrutinises corporations or governments:
“When we look at the other administrative datasets proposed for release in the Measures the politicisation of open data release is evident: Fit Note Data; Universal Credit Data; and Welfare Data (again discussed for ‘linking’ implying we’re not just talking about aggregate statistics) are all proposed for increased release, with specific proposals to “increase their value to industry”. By contrast, no mention of releasing more details on the tax share paid by corporations, where the UK issues arms export licenses, or which organisations are responsible for the most employment law violations. Although the stated aims of the Measures include increasing “transparency and accountability” it would not be unreasonable to read the detail of the measures as very one-sided on this point: and emphasising industry exploitation of data far more than good governance and citizen rights with respect to data.
“The blurring of the line between ‘personal data’ and ‘open data’, and the state’s assumption of the right to share personal data for industrial gain should give cause for concern, and highlights the need for build a stronger constituency scrutinising government open data action.”
It’s nice to see a data initiative being greeted with a critical eye rather than Three Cheers for the Numbers.
UPDATE: On a similar note, Access Info Europe highlights problems with the Open Government Partnership, which “must significantly improve its internal access to information policy to meet the standards it is advancing”. Specifically:
“The policy should be reformed to incorporate basic open data principles such as that information will be made available in a machine-readable, electronic format free of restrictions on reuse.”
“A key problem is the lack of detail in the policy, which has the result of leaving important matters to the discretion of the OGP. Other key problems include:
» The failure of the policy to recognise the fundamental human right to information;
» The significantly overbroad and discretionary regime of exceptions;
» The failure of the draft Policy to put in place a system of protections and sanctions.”