Category Archives: data journalism

The test of data journalism: checking the claims of lobbyists via government

Day 341 - Pull The Wool Over My Eyes - image by Simon James

While the public image of data journalism tends to revolve around big data dumps and headline-grabbing leaks, there is a more important day-to-day application of data skills: scrutinising the claims regularly made in support of spending public money.

I’m blogging about this now because I recently came across a particularly good illustration of politicians being dazzled by numbers from lobbyists (that journalists should be checking) in this article by Simon Jenkins, from which I’ll quote at length:

“This government, so draconian towards spending in public, is proving as casual towards dodgy money in private as were Tony Blair and Gordon Brown. Earlier this month the Olympics boss, Lord Coe, moseyed into Downing Street and said that his opening and closing ceremonies were looking a bit mean at £40m. Could he double it to £81m for more tinsel? Rather than scream and kick him downstairs, David Cameron said: my dear chap, but of course. I wonder what the prime minister would have said if his lordship had been asking for a care home, a library or a clinic.

“Much of the trouble comes down to the inexperience of ingenue ministers, and their susceptibility to the pestilence of lobbying now infecting Westminster. On this occasion the hapless Olympics minister, Hugh Robertson, claimed that the extra £41m was “worth £2-5bn in advertising revenue alone”, a rate of return so fanciful as to suggest a lobbyist’s lunch beyond all imagining. Robertson also claimed to need another £271m for games security (not to mention 10,000 troops, warships and surface-to-air missiles), despite it being “not in response to any specific security threat”. It was just money.

“This was merely the climax of naivety. In their first month in office, ministers were told – and believed – that it would be “more expensive” to cancel two new aircraft carriers than to build them. Ministers were told it would cost £2bn to cancel Labour’s crazy NHS computer rather than dump it in the nearest skip. Chris Huhne, darling of the renewables industry, wants to give it £8bn a year to rescue the planet, one of the quickest ways of transferring money from poor consumer to rich landowner yet found. The chancellor, George Osborne, was told by lobbyists he could save £3bn a year by giving away commercial planning permissions. All this was statistical rubbish.

“If local government behaved as credulously as Whitehall it would be summoned before the audit commission and subject to surcharge.”

And if you want to keep an eye on such claims, try a Google News search like this one.
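As a rough illustration of the kind of check a journalist could run on the Olympics claim quoted above, here is a minimal sketch in Python. It uses only the figures in the Jenkins article (an extra £41m said to be "worth £2-5bn"); the function and variable names are my own, and the "return multiple" is simply claimed benefit divided by cost, not any official measure.

```python
# Figures as quoted in the Jenkins article, in pounds.
extra_spend = 41_000_000          # the extra ceremony budget
claimed_low = 2_000_000_000       # lower end of the claimed advertising value
claimed_high = 5_000_000_000      # upper end of the claimed advertising value


def return_multiple(claimed_benefit, cost):
    """Pounds of claimed benefit for every pound spent."""
    return claimed_benefit / cost


low = return_multiple(claimed_low, extra_spend)
high = return_multiple(claimed_high, extra_spend)

print(f"Claimed return: {low:.0f}x to {high:.0f}x the outlay")
```

A claimed return of roughly 49 to 122 times the outlay is the sort of number that should prompt a request for the underlying calculation.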

20 free ebooks on journalism (for your Xmas Kindle)

For some reason there are two versions of this post on the site – please check the more up to date version here.

2011: the UK hyper-local year in review

In this guest post, Damian Radcliffe highlights some topline developments in the hyper-local space during 2011. He also asks for your suggestions of great hyper-local content from 2011. His more detailed slides looking at the previous year are cross-posted at the bottom of this article.

2011 was a busy year across the hyper-local sphere, with a flurry of activity online as well as on more traditional platforms such as TV, radio and newspapers.

The Government’s plans for Local TV have been considerably developed, following the Shott Review just over a year ago. We now have a clearer indication of the areas which will be first on the list for these new services and how Ofcom might award these licences. What we don’t know is who will apply for these licences, or what their business models will be. But this should become clear in the second half of the year.

Whilst the Leveson Inquiry hasn’t directly been looking at local media, it has been a part of the debate. Claire Enders outlined some of the challenges facing the regional and local press in a presentation showing declining revenue, jobs and advertising over the past five years. Her research suggests that the impact of “the move to digital” has been greater at a local level than at the nationals.

Across the board, funding remains a challenge for many. But new models are emerging, with Daily Deals starting to form part of the revenue mix alongside money from foundations and franchising.

And on the content front, we saw Jeremy Hunt cite a number of hyper-local examples at the Oxford Media Convention, as well as record coverage for regional press and many hyper-local outlets as a result of the summer riots.

I’ve included more on all of these stories in my personal retrospective for the past year.

One area where I’d really welcome feedback is examples of hyper-local content you produced – or read – in 2011. I’m conscious that a lot of great material may not necessarily reach a wider audience, so do post your suggestions below and hopefully we can begin to redress that.

2 guest posts: 2012 predictions and “Social media and the evolution of the fourth estate”

I’ve written a couple of guest posts for Nieman Journalism Lab and the tech news site Memeburn. The Nieman post is part of a series looking forward to 2012. I’m never a fan of futurology so I’ve cheated a little and talked about developments already in progress: new interface conventions in news websites; the rise of collaboration; and the skilling up of journalists in data.

Memeburn asked me a few months ago to write about social media’s impact on journalism’s role as the Fourth Estate, and it took me until this month to find the time to do so. Here’s the salient passage:

“But the power of the former audience is a power that needs to be held to account too, and the rise of liveblogging is teaching reporters how to do that: reacting not just to events on the ground, but the reporting of those events by the people taking part: demonstrators and police, parents and politicians all publishing their own version of events — leaving journalists to go beyond documenting what is happening, and instead confirming or debunking the rumours surrounding that.

“So the role of journalist is moving away from that of gatekeeper and — as Axel Bruns argues — towards that of gatewatcher: amplifying the voices that need to be heard, factchecking the MPs whose blogs are 70% fiction or the Facebook users scaremongering about paedophiles.

“But while we are still adapting to this power shift, we should also recognise that that power is still being fiercely fought-over. Old laws are being used in new ways; new laws are being proposed to reaffirm previous relationships. Some of these may benefit journalists — but ultimately not journalism, nor its fourth estate role. The journalists most keenly aware of this — Heather Brooke in her pursuit of freedom of information; Charles Arthur in his campaign to ‘Free Our Data’ — recognise that journalists’ biggest role as part of the fourth estate may well be to ensure that everyone has access to information that is of public interest, that we are free to discuss it and what it means, and that — in the words of Eric S. Raymond — “Given enough eyeballs, all bugs are shallow”.”

Comments, as always, very welcome.

Tools or Tales?

Christmas gifts image by Michael Wyszomierski

This month’s Carnival of Journalism asks what journalists want for Christmas from programmers, and vice versa. Here’s my take.

Programmers and developers have already given journalists enough presents to last a century of Christmases. Programmers created content management systems and blogging platforms; they wrapped up networks of contacts in social networks, and parcelled up fast-moving updates on Twitter and SMS. They tied media in ribbons of metadata, making it easier to verify. They digitised content, making it possible to mix it with other content.

But I think it’s time for journalists to start giving back.

All of these gifts have made it easier for journalists to report stories. But that’s only part of publishing.

Technology’s place in journalism

Traditionally, journalism’s technology came after the story: sub-editors or designers laid the story out in the way they judged to be the most effective; printers gave it physical form; and distributors made sure it reached people.

Each stage in that process considers the next person. The inverted pyramid, for example, helps subs trim copy to fit available space. Subs talk to printers. Printers work with distributors. Processes are designed to reduce friction. The journalist’s work – whether they realise it or not – is a compromise reached over decades between different parties. An exchange of gifts, if you like.

But when it comes to publishing online, there’s been very little Christmas spirit.

Stories as a vehicle

Stories help us connect with current issues; they act as a vehicle for information that allows us to participate in society, whether that’s politically, socially, or economically.

The job of a journalist is to find stories in current events.

But those stories do not have to be told in one particular way. And if we were to try to tell them in some different ways (adding important metadata; publishing raw data; linking to supporting material; flagging false information), we could be giving a gift much desired by developers.

Here are some things that they could do with that gift – it is, if you like, my own fantasy Christmas list:

They’re just ideas – and will remain so as long as journalists assume they’re only writing for newspapers, and newspaper readers.

The newspaper is a tool: a way for groups of people to exchange information. In the 19th century those groups might have been political activists, or merchants who needed to know the latest trading conditions.

The web is a tool too – a different tool. We can use it to ask information to come to us, or to seek out supplementary information; we can use it to draw connections; and we can act on what we find in the same space. Stories need to adapt to the possibilities of the new tool they sit in.

This year, put a developer on your Christmas list. It’s the gift that keeps on giving.

New UK open data moves: following the money and other curiosities

Tim Davies has done a wonderful job of combing through the fine print of the UK government’s Autumn statement open data measures (PDF), highlighting the dynamics that appear to be driving it, and the data conspicuous by its absence.

Here are the passages most relevant for journalists. Firstly, following the money and accountability:

“The [Data Strategy Board] body seeking public data will be reliant upon the profitability of the PDG [Public Data Group] in order to have the funding it needs to secure the release of data that, if properly released in free forms, would likely undermine the current trading revenue model of the PDG. That doesn’t look like the foundation for very independent and effective governance or regulation to open up core reference data!

“Furthermore, whilst the proposed terms for the DSB [Data Strategy Board] terms state that “Data users from outside the public sector, including representatives of commercial re-users and the Open Data community, will represent at least 30% of the members of DSB”, there are also challenges ahead to ensure data users from civil society interests are represented on the board”

Secondly, the emphasis on clinical data and issues surrounding privacy and the sale of personal data:

“The first measures in the Cabinet Office’s paper are explicitly not about open data as public data, but are about the restricted sharing of personal medical records with life-science research firms – with the intent of developing this sector of the economy. With a small nod to “identifying specified datasets for open publication and linkage”, the proposals are more centrally concerned with supporting the development of a Clinical Practice Research Datalink (CPRD) which will contain interlinked ‘unidentifiable, individual level’ health records, by which I interpret the ability to identify a particular individual with some set of data points recorded on them in primary and secondary care data, without the identity of the person being revealed.

“The place of this in open data measures raises a number of questions, such as whether the right constituencies have been consulted on these measures and why such a significant shift in how the NHS may be handing citizens personal data is included in proposals unlikely to be heavily scrutinised by patient groups? In the past, open data policies have been very clear that ‘personal data’ is out of scope – and the confusion here raises risks to public confidence in the open data agenda. Leaving this issue aside for the moment, we also need to critically explore the evidence that the release of detailed health data will “reinforce the UK’s position as a global centre for research and analytics and boost UK life sciences”. In theory, if life science data is released digitally and online, then the firms that can exploit it are not only UK firms – but the return on the release of UK citizens personal data could be gained anywhere in the world where the research skills to work with it exist.”

UPDATE: More on that in The Guardian.

Thirdly, it looks like this data will allow journalists to scrutinise welfare and credit (so plenty of material for the tabloids and mid-market press), but not data that scrutinises corporations or governments:

“When we look at the other administrative datasets proposed for release in the Measures the politicisation of open data release is evident: Fit Note Data; Universal Credit Data; and Welfare Data (again discussed for ‘linking’ implying we’re not just talking about aggregate statistics) are all proposed for increased release, with specific proposals to “increase their value to industry”. By contrast, no mention of releasing more details on the tax share paid by corporations, where the UK issues arms export licenses, or which organisations are responsible for the most employment law violations. Although the stated aims of the Measures include increasing “transparency and accountability” it would not be unreasonable to read the detail of the measures as very one-sided on this point: and emphasising industry exploitation of data far more than good governance and citizen rights with respect to data.

“The blurring of the line between ‘personal data’ and ‘open data’, and the state’s assumption of the right to share personal data for industrial gain should give cause for concern, and highlights the need for build a stronger constituency scrutinising government open data action.”

It’s nice to see a data initiative being greeted with a critical eye rather than Three Cheers for the Numbers.

UPDATE: On a similar note, Access Info Europe highlights problems with the Open Government Partnership, which “must significantly improve its internal access to information policy to meet the standards it is advancing”. Specifically:

“The policy should be reformed to incorporate basic open data principles such as that information will be made available in a machine-readable, electronic format free of restrictions on reuse.”

“A key problem is the lack of detail in the policy, which has the result of leaving important matters to the discretion of the OGP. Other key problems include:
» The failure of the policy to recognise the fundamental human right to information;
» The significantly overbroad and discretionary regime of exceptions;
» The failure of the draft Policy to put in place a system of protections and sanctions.”

Maps “in the public interest” now exempt from Google Maps API charge

If you thought you couldn’t use the Google Maps API any more as a journalist, this update to the Google Geo Developers Blog should make you reconsider. From Nieman Journalism Lab:

“Certain web apps will be given blanket exemptions from charging. Here’s Google: “Maps API applications developed by non-profit organisations, applications deemed by Google to be in the public interest, and applications based in countries where we do not support Google Checkout transactions or offer Maps API Premier are exempt from these usage limits.” So nonprofit news orgs look to be in the clear, and Google could declare other news org maps apps to be “in the public interest” and free to run. (It also notes that nonprofits could be eligible for a free Maps API Premier license, which comes with extra goodies around advertising and more.)”

Sentencing data update: Manchester Evening News make another splash

Since I wrote about the need for more data journalism around sentencing in August, the Manchester Evening News have been beavering away keeping track of riot sentencing data on their own patch with stories on the first 60 looters to be sentenced and the role of poverty. Last week the newspaper finally made a splash on the figures.

The collected data led to this front page story: Looters jailed straight after Manchester riots given terms 30 per cent longer than those punished later.
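For readers wondering how a "30 per cent longer" figure can be derived from a spreadsheet of cases, here is a minimal sketch. It assumes the comparison is between the average sentence in two groups, which may not be how the MEN calculated it, and the rows are invented purely for illustration (chosen so the gap comes out at 30 per cent); they are not the newspaper's data.

```python
from statistics import mean

# Hypothetical rows: (sentenced_straight_after_riots, sentence_in_months).
# These values are invented for illustration and are NOT the MEN's figures.
cases = [
    (True, 12), (True, 13), (True, 14),
    (False, 9), (False, 10), (False, 11),
]


def percent_longer(rows):
    """Percentage by which the early group's mean sentence exceeds the later group's."""
    early = mean(months for is_early, months in rows if is_early)
    later = mean(months for is_early, months in rows if not is_early)
    return (early - later) * 100 / later


print(f"Early sentences were {percent_longer(cases):.0f}% longer on average")
```

A median, or a comparison restricted to the same offence type, would be equally defensible choices and could give a different answer, which is one reason linking to the raw data matters.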

Another article builds up a detailed profile of the rioters, with plenty of visualisation and links to the raw data.

The MEN’s Paul Gallagher had previously told me in an email correspondence that they were expecting at least 250-300 cases to be going through the courts in total, making “enough to make a very interesting and useful dataset but not so many as to make it too big a job.

“This spreadsheet is being completed using information provided by our journalists in court. The MEN is committed to staffing every court hearing so we should be able to fill this over time. This is a trial project limited only to the riots, and I don’t know if we will do anything with other court data in future.”

At the time Paul was trying to set up a system that would see court reporters add information when they covered a case, a system that could be used to publish court data in future.

“One of the biggest problems I have found is that we can produce graphics quite easily for online using Google Fusion Tables and other tools but it is difficult to turn these into graphics that will work in print without getting a graphic designer to recreate the image.”

A couple of months on, Paul remarks that the project has required significant editorial resources:

“Around ten MEN journalists have either sat in court to take down details of one or more riot cases in the last three months, or have been involved in the data analysis.”

He also says the exercise has raised some questions about the use, and sharing, of court data.

“Although the names and home addresses of adult defendants are published in court reports in the media, it does not seem appropriate to include them in shared spreadsheets, or to plot them on street level maps.

“For that reason, I decided to remove the names and personal details when we plotted home addresses of defendants on a map of Greater Manchester to visualise the correlation between rioters and high levels of poverty and deprivation.”
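The step Paul describes, removing personal details before a spreadsheet is shared or mapped, could be sketched along these lines. The column names and the sample row are hypothetical, since the layout of the MEN's spreadsheet is not described, and a real newsroom would also need to decide how coarse any remaining location field should be.

```python
import csv
import io

# Hypothetical column names; the MEN's actual spreadsheet layout is not known.
PERSONAL_FIELDS = {"name", "home_address"}


def strip_personal_details(rows):
    """Return a copy of each row with the personal fields left out."""
    return [
        {key: value for key, value in row.items() if key not in PERSONAL_FIELDS}
        for row in rows
    ]


# An invented example row standing in for a reporter's court notes.
sample = io.StringIO(
    "name,home_address,area,offence,sentence_months\n"
    "Example Person,1 Example Street,Example Ward,burglary,12\n"
)

shareable = strip_personal_details(csv.DictReader(sample))
print(shareable)
```

Dropping the columns before the file leaves the newsroom, rather than hiding them in the published view, means the shared version never contains the details in the first place.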

The Manchester Evening News have not decided if they will continue their data work on other non-riot-related court data, which Paul feels “begs the question why court data is not publicly available from official sources.”

“At the moment there is no other way of getting this information than to have a person sat in court at every hearing, jotting down the details in their notebook and then copying them into a spreadsheet.”

The data and visualisation were also used in last night’s Panorama: Inside The Riots. Disappointingly, the Panorama website and solitary blog post include no links to the MEN coverage or data, and the official Twitter account not only failed to link – it has failed to tweet at all in almost two weeks.