Tag Archives: datablog

Notes on setting up a regional newspaper datablog

Behind the Numbers - Birmingham's regional datablog

I’ve been working recently with the Birmingham Mail to launch Behind The Numbers, a new datablog project with Birmingham City University supported by Help Me Investigate. I’m told that it is probably the UK’s first regional newspaper datablog, although whether that’s a meaningful claim is debatable*.

The first story generated by the project – what is the worst time to be seen at A&E – was published in the newspaper a week ago. But it’s what happens next that’s going to be interesting.

The straw man of data journalism’s “scientific” claim

Guardian cover March 10 2012: Half UK's young black men out of work

Over the weekend Fleet Street Blues has had a bee in its bonnet about the “pretence” of data journalism and Saturday’s Guardian front page: “Half UK’s young black men out of work”.

This, says FSB, is a lie that demonstrates the “pretence” that “‘crunching the numbers’ is somehow an abstract, scientific, mathematical task”.

A new Scottish datablog (and a treemap in Liverpool)

The Scotsman has a newish data blog, set up (I’m rather proud to say) by one of my former PA/Telegraph trainees: Jennifer O’Mahony. This is particularly important as so much data covered in the ‘national’ press tends to be English-only due to devolution.

The Department for Education, for example, only publishes English education data. If you want Scottish education data you need to go to the Scottish Government website or Education Scotland. Ofsted inspects schools in England; for Scottish school reports you need to visit HM Inspectorate of Education. (Meanwhile, the National Statistics site publishes data from England, Scotland, Wales and Northern Ireland.)

So if there’s any Scottish data – or that of Wales or Northern Ireland – that you want me to help with, let me or Jennifer know. By way of illustrating the process, here’s a post over on Help Me Investigate: Education on how I helped Jennifer collect data on free school meals in Scotland.

A treemap in Liverpool

On the same note of non-national data journalism, here’s a particularly nice bit of data visualisation at the Liverpool Post. It’s not often you see treemaps on a local newspaper website – this one was designed by Ilan Sheady based on data gathered by City Editor David Bartlett after a day’s data journalism training.

Infographic showing the huge scale of the £5.5bn Liverpool Waters scheme

 

FAQ: How can broadcasters benefit from online communities?

Here’s another set of questions I’m answering in public in case anyone wants to ask the same:

How can broadcasters benefit from online communities?

Online communities contain many individuals who will be able to contribute different kinds of value to news production: most obviously, expertise, opinion, and eyewitness testimony. In addition, they will be able to more effectively distribute parts of a story to ensure that it reaches the right experts, opinion-formers and eyewitnesses. The difference from an audience is that a community tends to be specialised, and its members are connected to each other.

If you rephrase the question as ‘How can broadcasters benefit from people?’ it may be clearer.

How does a broadcaster begin to develop an engaged online community, any tips?

Over time. Rather than asking about how you develop an online community ask yourself instead: how do you begin to develop relationships? Waiting until a major news event happens is a bad strategy: it’s like waiting until someone has won the lottery to decide that you’re suddenly their friend.

Journalists who do this well do a little bit every so often – following people in their field, replying to questions on social networks, contributing to forums and commenting on blogs, and publishing blog posts which are helpful to members of that community rather than simply being about ‘the story’ (for instance, ‘Why’ and ‘How’ questions behind the news).

If you are aware of networks in the Middle East, do you think they are tapping into online communities and social media adequately?

I don’t know the networks well enough to comment – but I do think it’s hard for corporations to tap into communities; it works much better at an individual reporter level.

Can you mention any models, whether news channels or entertainment television, which have developed successful online communities? Why do they work?

The most successful examples tend to be newspapers: I think Paul Lewis at The Guardian has done this extremely successfully, and I think Simon Rogers’ Data Blog has also developed a healthy community around data and visualisation. Both of these are probably due in part to the work of Meg Pickard there around community in general.

The BBC’s UGC unit is a good example from broadcasting – although that is less about developing a community than about providing platforms for others to contribute, and a way for journalists to quickly find expertise in those communities. More specifically, Robert Peston and Rory Cellan-Jones use their blogs and Twitter accounts well to connect with people in their fields.

Then of course there’s Andy Carvin at NPR, who is an exemplar of how to do it in radio. There’s so much written about what he does that I won’t repeat it here.

What are the reasons that certain broadcasters cannot connect successfully with online communities?

I expect a significant factor is regulation which requires objectivity from broadcasters but not from newspapers. If you can’t express an opinion then it is difficult to build relationships, and if you are more firmly regulated (which broadcasting is) then you take fewer risks.

Also, there are more intermediaries in broadcasting and fewer reporters who are public-facing, which for some journalists in broadcasting makes the prospect of speaking directly to the former audience that much more intimidating.

Something I wrote for the Guardian Datablog (and caveats)

I’ve written a piece on ‘How to be a data journalist’ for The Guardian’s Datablog. It seems to have proven very popular, but I thought I should blog briefly about it in case you haven’t seen one of those tweets.

The post is necessarily superficial – it was difficult enough to cover the subject area for a 12,000-word book chapter, so summarising further into a 1,000-word article was almost impossible.

In the process I had to leave a huge amount out, compensating slightly by linking to webpages which expanded further.

Visualising and mashing, as the more advanced parts of data journalism, suffered most, because it seemed to me that locating and understanding data necessarily took precedence.

Heather Billings, for example, blogged about my “very British footnote [which was the] only nod to visual presentation”. If you do want to know more about visualisation tips, I wrote 1,000 words on that alone here. There’s also this great post by Kaiser Fung – and the diagram below, of which Fung says: “All outstanding charts have all three elements in harmony. Typically, a problematic chart gets only two of the three pieces right.”:

Trifecta checkup

On Monday I blogged the advice on where aspiring data journalists should start in full. There’s also the selection of passages from the book chapter linked above. And my Delicious bookmarks on data journalism, visualisation and mashups. Each has an RSS feed.

I hope that helps. If you do some data journalism as a result, it would be great if you could let me know about it – and what else you picked up.

The BBC and missed data journalism opportunities

Bar chart: UN progress on eradication of world hunger

I’ve tweeted a couple of times recently about frustrations with BBC stories that are based on data but treat it poorly. As any journalist knows, two occurrences of anything in close proximity warrant an overreaction about a “worrying trend”. So here it is.

“One in four council homes fails ‘Decent Homes Standard’”

This is a good piece of newsgathering, but a frustrating piece of online journalism. “Almost 100,000 local authority dwellings have not reached the government’s Decent Homes Standard,” it explained. But according to what? Who? “Government figures seen by BBC London”. Ah, right. Any chance of us seeing those too? No.

The article is scattered with statistics from these figures: “In Havering, east London, 56% of properties do not reach Decent Homes Standard – the highest figure for any local authority in the UK … In Tower Hamlets the figure is 55%.”

It’s a great story – if you live in those two local authorities. But it’s a classic example of narrowing a story to fit the space available. This story-centric approach serves readers in those locations, and readers who may be titillated by the fact that someone must always finish bottom in a chart – but the majority of readers will not live in those areas, and will want to know what the figures are for their own area. The article does nothing to help them do this. There are only 3 links, and none of them are deep links: they go to the homepages for Havering Council, Tower Hamlets Council, and the Department for Communities and Local Government.

In the world of print and broadcast, narrowing a story to fit space was a regrettable limitation of the medium; in the online world, linking to your sources is a fundamental quality of the medium. Not doing so looks either ignorant or arrogant.

“Uneven progress of UN Millennium Development Goals”

An impressive piece of data journalism that deserves credit, this looks at the UN’s goals and how close they are to being achieved, based on a raft of stats, which are presented in bar chart after bar chart (see image above). Each chart gives the source of the data, which is good to see. However, that source is simply given as “UN”: there is no link either on the charts or in the article (there are 2 links at the end of the piece – one to the UN Development Programme and the other to the official UN Millennium Development Goals website).

This lack of a link to the specific source of the data raises a number of questions: did the journalist or journalists (in both of these stories there is no byline) find the data themselves, or was it simply presented to them? What is it based on? What was the methodology?

The real missed opportunity here, however, is around visualisation. The relentless onslaught of bar charts makes this feel like a UN report itself, and leaves a dry subject still looking dry. This needed more thought.

Off the top of my head, one option might have been an overarching visualisation of how funding shortfalls overall differ between different parts of the world (allowing you to see that, for example, South America is coming off worst). This ‘big picture’ would then draw in people to look at the detail behind it (with an opportunity for interactivity).

Had they published a link to the data someone else might have done this – and other visualisations – for them. I would have liked to try it myself, in fact.

UPDATE: After reading this post, a link has now been posted to the report (PDF).

Compare this article, for example, with the Guardian Datablog’s treatment of the coalition agreement: a harder set of goals to measure, and they’ve had to compile the data themselves. But they’re transparent about the methodology (it’s subjective) and the data is there in full for others to play with.

It’s another dry subject, but The Guardian have made it a social object.

No excuses

The BBC is not a print outlet, so it does not have the excuse of these stories being written for print (although I will assume they were researched with broadcast as the primary outlet in mind).

It should also, in theory, be well resourced for data journalism. Martin Rosenbaum, for example, is a pioneer in the field, and the team behind the BBC website’s Special Reports section does some world class work. The corporation was one of the first in the world to experiment with open innovation with Backstage, and runs a DataArt blog too. But the core newsgathering operation is missing some basic opportunities for good data journalism practice.

In fact, it’s missing just one basic opportunity: link to your data. It’s as simple as that.

On a related note, the BBC Trust wants your opinions on science reporting. On this subject, David Colquhoun raises many of the same issues: absence of links to sources, and anonymity of reporters. This is clearly more a cultural issue than a technical one.

Of all the UK’s news organisations, the BBC should be at the forefront of transparency and openness in journalism online. Thinking politically, allowing users to access the data they have spent public money to acquire also strengthens their ideological hand in the Big Society bunfight.

UPDATE: Credit where it’s due: the website for tonight’s Panorama on public pay includes a link to the full data.