
Ethics in data journalism: accuracy

The following is the first in a series of extracts from a draft book chapter on ethics in data journalism. This is a work in progress, so if you have examples of ethical dilemmas, best practice, or guidance, I’d be happy to include it with an acknowledgement.

Data journalism ethics: accuracy

Probably the most basic ethical consideration in data journalism is the need to be accurate, and to provide proper context to the stories that we tell. That can influence how we analyse data, how we report data stories, and whether we publish the data itself.

In late 2012, for example, data journalist Nils Mulvad finally got his hands on veterinary prescriptions data that he had been fighting to obtain for seven years. But he decided not to publish the data when he realised that it was full of errors.

A quick exercise for aspiring data journalists

A funnel plot of bowel cancer mortality rates in different areas of the UK

The latest Ben Goldacre Bad Science column provides a particularly useful exercise for anyone interested in avoiding an easy mistake in data journalism: mistaking random variation for a story (in this case about some health services being worse than others for treating a particular condition):

“The Public Health Observatories provide several neat tools for analysing data, and one will draw a funnel plot for you, from exactly this kind of mortality data. The bowel cancer numbers are in the table below. You can paste them into the Observatories’ tool, click “calculate”, and experience the thrill of touching real data.

“In fact, if you’re a journalist, and you find yourself wanting to claim one region is worse than another, for any similar set of death rate figures, then do feel free to use this tool on those figures yourself. It might take five minutes.”
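Under the hood, a funnel plot is straightforward: plot each area's death rate against its population, with control limits that narrow as the population grows, so that only points outside the funnel suggest anything beyond random variation. Here is a minimal sketch in Python (my own illustration of the principle, not the Observatories' tool; the figures in it are invented):

```python
import math

def funnel_limits(overall_rate, n, z=1.96):
    """Approximate control limits for a funnel plot.

    overall_rate: pooled rate across all areas (e.g. deaths / population)
    n: denominator (population) for one area
    z: z-score (1.96 for ~95% limits, 3.09 for ~99.8%)
    """
    se = math.sqrt(overall_rate * (1 - overall_rate) / n)
    return overall_rate - z * se, overall_rate + z * se

def is_outlier(rate, overall_rate, n, z=1.96):
    """Is an area's rate outside the funnel for its population size?"""
    lo, hi = funnel_limits(overall_rate, n, z)
    return rate < lo or rate > hi

# Invented figures: a pooled rate of 0.001 deaths per person, and one
# area of 200,000 people with a slightly higher rate than average.
print(is_outlier(0.0011, 0.001, 200_000))  # inside the funnel: False
```

An area can have a rate 10% above the national average and still sit comfortably inside the funnel, which is exactly the mistake Goldacre is warning against.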

By the way, if you want an easy way to get that data into a spreadsheet (or any other table on a webpage), try out the =importHTML formula, as explained on my spreadsheet blog (and there’s an example for this data here).
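For the unfamiliar, the formula takes a page URL, the type of element to grab ("table" or "list"), and which one on the page you want, counting from 1 (the URL here is a made-up placeholder):

```
=IMPORTHTML("http://example.com/mortality-stats", "table", 1)
```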

Ben Goldacre wants a "Repository of news ingredients"

Here’s a nice idea from Bad Science blogger Ben Goldacre: a repository of news ingredients:

  • A website that gives each news story a unique ID
  • Any involved party can add / upload a full press release or quote to that story’s page
  • Anyone can add a link to a primary source
  • Anyone can vote these up or down like on digg/reddit
  • You can register as a “trusted source” and not need to be modded up or down
  • Anyone can add a link to media coverage of that story

You could have a browser plugin that pinged you to the frontpage.org (whatever) site whenever you were reading a piece that was covered there.
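The data model behind Goldacre's idea is simple enough to sketch. A rough Python outline (all the names here are my own invention, purely for illustration) might pair a story record with a list of submissions, ranking trusted sources above community-voted ones:

```python
from dataclasses import dataclass, field

@dataclass
class Submission:
    """One 'ingredient': a press release, primary source, or coverage link."""
    url: str
    kind: str              # e.g. "press_release", "primary_source", "coverage"
    submitted_by: str
    trusted: bool = False  # registered "trusted sources" skip moderation
    votes: int = 0         # digg/reddit-style up/down total

@dataclass
class Story:
    story_id: int          # the unique ID each news story gets
    title: str
    submissions: list = field(default_factory=list)

    def add(self, sub: Submission):
        self.submissions.append(sub)

    def ranked(self):
        # Trusted submissions first, then by community votes
        return sorted(self.submissions,
                      key=lambda s: (s.trusted, s.votes), reverse=True)
```

A browser plugin would then only need the story ID to look up everything attached to a page you were reading.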

So:

  • Journalists could use it to source info in one place
  • Readers could use it to get unmediated / unedited access to full comments from interested parties
  • Involved parties would have a platform for unmediated access too
  • It would make comparing different outlets’ coverage of stories easy and fun (something a lot of people, including me, occasionally enjoy doing with a Google News search)

It’s a good idea.

I’m not sure how workable using the ‘story’ as the unique unit would be (even with all its processing power, Google News performs patchily on clustering along these lines) – and you could use the unit of the ‘issue’ and build on Wikipedia’s engine, but there are problems with this approach too (although it would be fantastic for SEO).

Another way might be to start from the ‘source’, given that so many stories are now single-source – press releases, reports, research, etc. That would make it easier to relate stories to a source and build a patchwork of related sources, as Goldacre suggests. Indeed, you could use semantic technology to pick out other sources from relevant stories and automatically add them to the page. And if each source has its own page, you start to build a web of cross-references and context.

Anyway, it’s out there for discussion and improvement. Ideas?

The BBC and linking part 3 – the BBC respond

As promised in a comment on the first post on this topic (part 2 here), the BBC’s Steve Herrmann today responded to the debate surrounding the BBC’s linking policy (or policies).

In it Steve not only invites comments on how their linking policy should develop, but also gives a valuable insight into the guidance distributed within the corporation, which includes the following:

  • Related links matter: They are part of the value you add to your story – take them seriously and do them well; always provide the link to the source of your story when you can; if you mention or quote other publications, newspapers, websites – link to them; you can, where appropriate, deep-link; that is, link to the specific, relevant page of a website.
  • Where we have previously copied PDFs (for full versions of official reports and documents, for example) and put them on our own servers, we should now consider in each case whether to simply link to PDFs in their native location – with the proviso that if it’s likely to be a popular story, we may need to let the site know of possible increased demand.

“On linking to science papers in particular,” Steve continues,

“we don’t currently have a specific policy, but the simplest principle would seem to be that we should find and provide the most relevant and useful links at time of writing, wherever they are – whether it’s an abstract of a scientific paper, the paper itself, or a journal.

“There is some devil in the detail as far as this goes, though. First and foremost, we’re often reporting a story before the full paper has been published, so there may not yet be a full document to link to; some journals are subscription-only; some have web addresses which might expire.”

The post ends with a series of specific questions about how the BBC should link, from what types of links are most valuable, to where they should be placed, to what they should do about linking to scientific papers and information behind paywalls.

The comments so far are worth reading too, raising as they do recurring issues around ethics (do you link to a far-right political party?) and, in one case, seeing linking as part of “this internal destruction of the BBC, linking out shouldn’t be featured at all”.

It’s a debate worth having, and Steve and the BBC deserve credit for engaging in it.

The BBC and linking part 2: a call to become curators of context

A highlight of my recent visit with MA Online Journalism students to the BBC’s user generated content hub was the opportunity to ask this question posed by Andy Mabbett via Twitter: ‘Why don’t you link back to people if they send a picture in?’ (audio embedded above and here).

The UGC Hub’s head, Matthew Eltringham, gave this response:

“We credit their picture … we absolutely embrace the principle of linking on and through. I think the question would be – if Andy sends in a picture because he happened to witness a particular event, how relevant is the rest of his content to the audience. I think we’d have to take a view on that.”

It was a highlight because something clicked in my head at this point. You see, we’d spent some of the previous conversation talking about how the UGC hub verifies the reliability of user generated content, and it struck me that this view of the link as content could risk missing a key aspect of linking: context.

In an online environment one of the biggest signals in how we build a picture of the trustworthiness of someone or something is the links surrounding it. Who is that person friends with? What does this website link to? Who gathers here? What do they say? What else does this person do? What is their background, their interests, their beliefs?

All of this is invaluable context to us as users, not just the BBC.

While we increasingly talk about the role of publishers as curators of content [caveat], we should perhaps start thinking about how publishers are also curators of context.

Curators of context

And on this front, the corporation appears to have an enormous culture shift on its hands – a shift that it has been pushing in public for years, with varying degrees of success in different parts of the organisation.

BBC Radio, and many BBC TV programmes, for example, use users’ pictures and tweets and link and credit as a matter of course, while some parts of BBC News do link directly to research papers.

Yesterday I blogged about the frustration of Ben Goldacre at the refusal of parts of the BBC News website to deep link to scientific journal articles. In the comments to Ben’s post, ‘Gimpy’ says that the journalist quoted by Goldacre told him in “early 2008” that linking was “something which must be reviewed”.

In May 2008 the BBC Trust said linking needed major improvements, and in October 2008 the Head of Multimedia said linking to external websites was a vital part of its future.

And this month, the corporation’s latest strategic review pledges:

“to turn the site into a window on the web” by providing at least one external link on every page and doubling monthly ‘click-throughs’ to external sites: “making the best of what is available elsewhere online an integral part of the BBC’s offer to audiences”.

Most recently, this week the BBC’s announcement of 25% cuts to its online spend motivated Erik Huggers to make this statement at a DTG conference:

“Why can’t we find a way to take all that traffic and help share it with other public service broadcasters and with other public bodies so that if our boat rises on the tide, everyone’s boat rises on the tide?

“Rather than trying to keep all that traffic inside the BBC’s domain we’re going to link out very aggressively and help other organisations pull their way up on the back of the investments that the BBC has made in this area.”

To be fair, unlike other media organisations, at least the BBC is talking about doing something about linking (and if you want to nag them, here’s their latest consultation).

But please, enough talk already. Auntie, give us the context.

UPDATE: More on the content vs context debate from Kevin Anderson.

UPDATE 2: The BBC have started a debate on the issue on their Editors’ Blog

The BBC and linking part 1: users are not an audience

UPDATE: The BBC have started a debate on the issue on their Editors’ Blog

Ben Goldacre is experiencing understandable frustration with the BBC’s policy on linking to science papers:

Jane Ashley of the website’s health team says that when they write an article based on scientific research:

“It is our policy to link to the journal rather than the article itself. This is because sometimes links to articles don’t work or change, and sometimes the journals need people to register or pay.”

In email correspondence defending their policy, Richard Warry, Assistant editor, Specialist journalism, adds:

“Many papers are available on the web via subscription only, while others give only an Abstract summary. In these instances, the vast majority of our readers would not be able to read the full papers, without paying for access, even if we provided the relevant link.”

This just doesn’t stand up. Here’s why:

  • An abstract alone is actually very useful, providing more context than a journal homepage does
  • It also provides text that can be used to find another version of the paper (for example on an author’s or a conference website)
  • It provides extra details on the authors, giving you more insight into the research’s reliability, and an avenue for approaching them to get hold of the paper
  • Even the ‘vast majority’ who cannot pay for access to the paper will still be taken to the journal homepage anyway
  • Believing that the time spent pasting one link rather than another is better spent on providing “authoritative, accurate and attractive reportage” is a false economy. Authoritative, accurate and attractive coverage relies at least in part on allowing users to point out issues with scientific research or its reporting
  • Catering for a ‘vast majority’ betrays a broadcast media mindset that treats users as passive consumers. The minority of users who can access those papers can actually be key contributors to a collaborative journalism process. If you let them.

If it helps, here’s a broadcast analogy: imagine producing a TV package which captions a source as ‘Someone from the Bank of England’. That’s not saving time for good journalism – it’s just bad journalism.

Linking – and deep linking in particular – are basic elements of online journalism. Why can news organisations still not get this right? More on this here

…Meanwhile, bloggers investigate scientific claims

Ben Goldacre writes about the British Chiropractic Association’s libel suit against Simon Singh (you’ll see a badge on this blog about the issue), and how bloggers have helped investigate the BCA’s claims.

“Fifteen months after the case began, the BCA finally released the academic evidence it was using to support specific claims. Within 24 hours this was taken apart meticulously by bloggers, referencing primary research papers, and looking in every corner.

“Professor David Colquhoun of UCL pointed out, on infant colic, that the BCA cited weak evidence in its favour, while ignoring strong evidence contradicting its claims. He posted the evidence and explained it. LayScience flagged up the BCA selectively quoting a Cochrane review. Every stone was turned by Quackometer, APGaylard, Gimpyblog, EvidenceMatters, Dr Petra Boynton, MinistryofTruth, Holfordwatch, legal blogger Jack of Kent, and many more. At every turn they have taken the opportunity to explain a different principle of evidence based medicine – the sin of cherry-picking results, the ways a clinical trial can be unfair by design – to an engaged lay audience, with clarity as well as swagger.”

Here’s the payoff:

“a ragged band of bloggers from all walks of life has, to my mind, done a better job of subjecting an entire industry’s claims to meaningful, public, scientific scrutiny than the media, the industry itself, and even its own regulator. It’s strange this task has fallen to them, but I’m glad someone is doing it, and they do it very, very well indeed.”