Category Archives: online journalism

Ethics in data journalism: mass data gathering – scraping, FOI and deception


Automated mapping of data – ChicagoCrime.org – image from Source

This is the third in a series of extracts from a draft book chapter on ethics in data journalism. The first looked at how ethics of accuracy play out in data journalism projects, and the second at culture clashes, privacy, user data and collaboration. This is a work in progress, so if you have examples of ethical dilemmas, best practice, or guidance, I’d be happy to include it with an acknowledgement.

Mass data gathering – scraping, FOI, deception and harm

The data journalism practice of ‘scraping’ – getting a computer to capture information from online sources – raises some ethical issues around deception and minimisation of harm. Some scrapers, for example, ‘pretend’ to be a particular web browser, or pace their scraping activity more slowly to avoid detection. But the deception is practised on another computer, not a human – so is it deception at all? And if the ‘victim’ is a computer, is there harm? Continue reading
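To make the two behaviours mentioned above concrete, here is a minimal sketch using only Python's standard library. The user-agent strings, the contact address and the function names are assumptions made up for illustration, not taken from any real scraper. It shows the difference between a scraper that presents itself as a browser and one that identifies itself openly, and how pacing is simply a pause between requests. No request is actually sent.

```python
import time
import urllib.request

# Hypothetical values, for illustration only.
BROWSER_LIKE_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
TRANSPARENT_UA = "ExampleNewsroomScraper/0.1 (contact: datadesk@example.org)"


def build_request(url, user_agent):
    """Return a request object carrying the chosen User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def paced(urls, delay_seconds, sleep=time.sleep):
    """Yield URLs one at a time, pausing between consecutive URLs."""
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # no pause before the first URL
        yield url
```

Whether the header is `BROWSER_LIKE_UA` or `TRANSPARENT_UA` is a one-line choice in the code, which is part of why the ethical question is easy to overlook. The same pause can also be used for opposite reasons: to avoid detection, or to avoid overloading the site being scraped.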

Ethics in data journalism: privacy, user data, collaboration and the clash of codes

This is the second in a series of extracts from a draft book chapter on ethics in data journalism. The first looked at how ethics of accuracy play out in data journalism projects. This is a work in progress, so if you have examples of ethical dilemmas, best practice, or guidance, I’d be happy to include it with an acknowledgement.


Gun permit holders map – image from Sherrie Questioning All

Hacks/Hackers: collaboration and the clash of codes

Journalism’s increasingly collaborative and global nature in a networked environment has raised a number of ethical issues as contributors from different countries and from professions outside of journalism – with different codes of ethics – come together.

This collaborative spirit is most visible in the ‘Hacks/Hackers’ movement, where journalists meet with web developers to exchange tips and ideas, and work on joint projects. Data journalists also often take part in – and organise – ‘hack days’ or ‘hackathons’ aimed at opening up and linking data and creating apps, or work with external agencies to analyse data gathered by either party. Continue reading

Ethics in data journalism: accuracy

The following is the first in a series of extracts from a draft book chapter on ethics in data journalism. This is a work in progress, so if you have examples of ethical dilemmas, best practice, or guidance, I’d be happy to include it with an acknowledgement.

Data journalism ethics: accuracy

Probably the most basic ethical consideration in data journalism is the need to be accurate, and provide proper context to the stories that we tell. That can influence how we analyse the data, report on data stories, or our publication of the data itself.

In late 2012, for example, data journalist Nils Mulvad finally got his hands on veterinary prescriptions data that he had spent seven years fighting for. But he decided not to publish the data when he realised that it was full of errors. Continue reading

Can you help map local data blogs?

This week Matt Burgess launched the Northampton Data Blog, “exploring the data behind the headlines in Northamptonshire”. The site is at least the fourth local data blog to be launched this year after the Coventry Data Blog in May, and Behind The Numbers sections on Wales Online and the Birmingham Mail.

But are we missing any? Please let me know in the comments.

UPDATE: Philip Nye points out his data posts are largely about Hackney.

Disclosure: Matt Burgess was previously the editor of Help Me Investigate Education and Coventry Data Blog creator Ian Silvera has contributed to Help Me Investigate. I am regularly involved in the Birmingham Data Blog.

How to think like a computer: 5 tips for a data journalism workflow part 3

This is the final part of a series of blog posts. The first explains how using feeds and social bookmarking can make for a quicker data journalism workflow. The second looks at how to anticipate and prevent problems; and how collaboration can improve data work.

Workflow tip 5. Think like a computer

The final workflow tip is all about efficiency. Computers deal with processes in a logical way, and good programming is often about completing processes in the simplest way possible.

If you have any tasks that are repetitive, break them down and work out what patterns might allow you to do them more quickly – or for a computer to do them. Continue reading
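As a small illustration of spotting a pattern in a repetitive task, the sketch below assumes the chore is tidying inconsistently typed organisation names by hand. The names in the list are made-up sample data, and `clean_name` is a hypothetical helper, not part of any library.

```python
import re


def clean_name(raw):
    """Collapse repeated whitespace, trim the ends, and use title case."""
    return re.sub(r"\s+", " ", raw).strip().title()


# Made-up sample data showing typical inconsistencies.
names = ["  birmingham city council", "COVENTRY  city council ", "Hackney Council"]
cleaned = [clean_name(name) for name in names]
```

The pattern here is "same three fixes on every row". Once it is written down as a function, the computer can apply it to three rows or three thousand.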

5 tips for a data journalism workflow: part 2 – anticipating problems and collaboration

In my last post I wrote about how using feeds and social bookmarking can make for a quicker data journalism workflow. In this second part I look at how to anticipate and prevent problems; and how collaboration can improve data work.

Workflow tip 3. Anticipate problems

A particularly useful habit of successful data journalists is thinking ahead about the data they request. For example, you might want to request basic datasets now that you think you’ll need in future, such as demographic details for local patches.

You might also want to request the ‘data dictionary’ for key datasets. This lists all the fields used in a particular database. For example, did you know that the police have a database for storing descriptions of suspects? And that one of the fields is shoe size? That could make for quite a quirky story. Continue reading
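For readers who have never seen one, a data dictionary is essentially a list of field names and types. The sketch below builds a tiny in-memory SQLite table and lists its fields. The table name and every column except shoe size are invented for illustration and do not describe any real police system.

```python
import sqlite3


def list_fields(connection, table):
    """Return (field name, declared type) pairs for one table.

    The table name is interpolated directly into the query, so only
    pass names you control yourself.
    """
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, default value, pk).
    return [(row[1], row[2]) for row in rows]


# Hypothetical table, created only so there is something to inspect.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE suspect_descriptions "
    "(height_cm INTEGER, hair_colour TEXT, shoe_size REAL)"
)
fields = list_fields(conn, "suspect_descriptions")
```

Reading a list like `fields` before filing a request is what lets you ask for a specific column rather than "everything you hold".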

5 tips for a data journalism workflow: part 1 – data newswires and archiving

Earlier this year I spoke at the BBC’s Data Fusion Day (you can find a liveblog of the event on Help Me Investigate) about data journalism workflows. The presentation slides are embedded below (the title is firmly tongue-in-cheek), but I thought I’d explain a bit more in a series of posts – beginning here.

Data journalism workflow 1: Set up data newswires

Most newsrooms take a newswire of some sort – national and international news from organisations like the Press Association, Reuters, and Associated Press.

Data journalism is no exception. If you want to find stories in data, it helps to know what data is coming out, when it comes out.

Continue reading

The first, second and third duties: why The Guardian had to destroy Snowden files

The Guardian's destroyed files - Photograph: Roger Tooth for the Guardian


Should The Guardian have destroyed its copies of Edward Snowden’s leaked files rather than go to court? That’s a question raised by Index on Censorship and put to editor Alan Rusbridger by Channel 4 News (from 3.40 in).

Publishing is a practical business, and there are three key duties which a publisher has to consider.

Firstly, a news organisation must try to protect its sources. Continue reading

Daily Mail users think it’s less unbiased than Twitter/Facebook

Daily Mail impartiality compared against BBC, Twitter, Facebook and others

Is the Daily Mail less impartial than social media? That’s the takeaway from one of the charts (shown above) in Ofcom’s latest Communications Market Report.

The report asked website and app users to rate 7 news websites against 5 criteria. The Daily Mail comes out with the lowest proportion of respondents rating it highly for ‘impartiality and unbiased’, ‘Offers range of opinions’, and ‘Importance’.

This is particularly surprising given that two of the other websites are social networks. 28% rated Facebook and Twitter highly on impartiality, compared to 26% for the Daily Mail. Continue reading