Tag Archives: data journalism

That massive open online course on data journalism now has a start date

In case you haven’t seen the tweets and blog posts, that MOOC on data journalism I’m involved in has a start date: May 19.

The launch was delayed a little due to the number of people who signed up – which I think was a sensible decision.

You can watch the introduction video above, or ‘meet the instructors’ below. Looking forward to this…

FAQ: Big data and journalism

The latest in the series of Frequently Asked Questions comes from a UK student, who has questions about big data.

How can data journalists make sense of such quantities of data and filter out what’s meaningful?

In the same way they always have. Journalists’ role has always been to make choices about which information to prioritise, what extra information they need, and what information to include in the story they communicate.

Data journalism ebook now on Amazon’s Kindle Store

Data journalism book Data Journalism Heist

My short ebook Data Journalism Heist is now available on Amazon for Kindle (US link here - also available on other countries’ Amazon sites).

The book is an introduction to data journalism and two simple techniques in particular: finding story leads using pivot tables and advanced filters.

The book also covers useful sources of data, how to follow leads up, and how to tell the resulting story.

You can also buy it from Leanpub, where it’s been live for a couple of months now and is available in PDF, mobi and ePub formats. Comments welcome as always.

How to: clean up spreadsheet headings that run across multiple rows using Open Refine

Something that infuriates me often with government datasets is the promiscuous heading. This is when a spreadsheet doesn’t just have its headings across one row, but instead splits them across two, three or more rows.

To make matters worse, there are often also extra rows before the headings explaining the spreadsheet more generally. Here’s just one offender from the ONS:

A spreadsheet with promiscuous headings

To clean this up in Excel takes several steps – but Open Refine (formerly Google Refine) does this much more quickly. In this post I’m going to walk through the five-minute process there that can save you unnecessary effort in Excel.
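As a rough illustration of what collapsing promiscuous headings involves, here is a minimal Python sketch. It is not the Open Refine process the post walks through. The sample data, the function name `merge_heading_rows` and its `skip` and `heading_rows` parameters are all made up for this example, and it assumes you already know how many explanatory rows and heading rows the spreadsheet has.

```python
import csv
import io


def merge_heading_rows(rows, skip=0, heading_rows=2):
    """Collapse headings split across several rows into one heading row.

    rows         -- list of rows, each a list of cell strings
    skip         -- number of explanatory rows above the headings to drop
    heading_rows -- number of rows the headings are spread across
    """
    headings = rows[skip:skip + heading_rows]
    width = max(len(row) for row in headings)
    merged = []
    for col in range(width):
        # Gather the non-empty pieces of this column's heading, top to bottom.
        parts = []
        for row in headings:
            cell = row[col].strip() if col < len(row) else ""
            if cell:
                parts.append(cell)
        merged.append(" ".join(parts))
    # One merged heading row, followed by the untouched data rows.
    return [merged] + rows[skip + heading_rows:]


# Made-up sample: two explanatory rows, then headings split over two rows.
sample = """Table 1: example figures,,
,,
Region,Population,Population
,2011,2012
North,100,110
South,200,205
"""

rows = list(csv.reader(io.StringIO(sample)))
cleaned = merge_heading_rows(rows, skip=2, heading_rows=2)
for row in cleaned:
    print(row)
```

The sketch assumes every heading cell is filled in. Spreadsheets that use merged cells, where a heading appears only above the first column it spans, would need an extra step to copy that heading across before joining.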

That free online data journalism course I’m involved in

I’m happy to announce that I’ll be part of the delivery team for a free data journalism course online early next year that is being hosted by The European Journalism Centre.

New ebook now ready! Learn basic spreadsheet skills with Data Journalism Heist

Data journalism book Data Journalism Heist

I’ve written a short ebook for people who are looking to get started with data journalism but need some help.

Data Journalism Heist covers two simple techniques for finding story leads in spreadsheets: pivot tables and advanced filters.

Neither technique requires any formulae, and there are dozens of local datasets (and one international one) to use them on.

In addition, the book covers how to follow leads from data and tell the resulting story, with tips on visualisation and plenty of recommendations for next steps.

You can buy it from Leanpub here. Comments welcome as always.

Ethics in data journalism: automation, feeds, and a world without gatekeepers

This is the last in a series of extracts from a draft book chapter on ethics in data journalism. Others have looked at how ethics of accuracy play out in data journalism projects; culture clashes, privacy, user data and collaboration; mass data gathering; and protection of sources. This is a work in progress, so if you have examples of ethical dilemmas, best practice, or guidance, I’d be happy to include it with an acknowledgement.

Budget Forecasts, Compared With Reality

The ethics of automation and feeds

Since Adrian Holovaty built ChicagoCrime.org in 2005 to automatically update a map with police crime statistics, automation has been an important element of data journalism. Few news organisations have guidelines on automation, but the BBC’s guidelines (2013) on video feeds do provide a framework.

Ethics in data journalism: mass data gathering – scraping, FOI and deception

This is the third in a series of extracts from a draft book chapter on ethics in data journalism. The first looked at how ethics of accuracy play out in data journalism projects, and the second at culture clashes, privacy, user data and collaboration. This is a work in progress, so if you have examples of ethical dilemmas, best practice, or guidance, I’d be happy to include it with an acknowledgement.

Automated mapping of data - ChicagoCrime.org - image from Source

Mass data gathering – scraping, FOI, deception and harm

The data journalism practice of ‘scraping’ – getting a computer to capture information from online sources – raises some ethical issues around deception and minimisation of harm. Some scrapers, for example, ‘pretend’ to be a particular web browser, or pace their scraping activity more slowly to avoid detection. But the deception is practised on another computer, not a human – so is it deception at all? And if the ‘victim’ is a computer, is there harm?
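To make the two scraper behaviours mentioned above concrete, here is a minimal Python sketch using only the standard library. It is an illustration rather than any particular journalist’s scraper. The user-agent string, the function names `build_request` and `fetch_all`, and the five-second default delay are all assumptions invented for this example.

```python
import time
import urllib.request

# Made-up user-agent string: this is the header a scraper sets when it
# 'pretends' to be a web browser rather than announcing itself as a script.
BROWSER_USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Build a request that identifies itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_all(urls, delay_seconds=5.0,
              opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch each URL in turn, pausing between requests to pace the scrape."""
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        with opener(build_request(url)) as response:
            pages.append(response.read())
    return pages
```

Calling `fetch_all` with a list of page addresses would return the raw bytes of each page. The `opener` and `sleep` parameters exist only so the sketch can be tried out without touching a real website.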

Ethics in data journalism: privacy, user data, collaboration and the clash of codes

This is the second in a series of extracts from a draft book chapter on ethics in data journalism. The first looked at how ethics of accuracy play out in data journalism projects. This is a work in progress, so if you have examples of ethical dilemmas, best practice, or guidance, I’d be happy to include it with an acknowledgement.

Gun permit holders map - image from Sherrie Questioning All

Hacks/Hackers: collaboration and the clash of codes

Journalism’s increasingly collaborative and global nature in a networked environment has raised a number of ethical issues as contributors from different countries and from professions outside of journalism – with different codes of ethics – come together.

This collaborative spirit is most visible in the ‘Hacks/Hackers’ movement, where journalists meet with web developers to exchange tips and ideas, and work on joint projects. Data journalists also often take part in – and organise – ‘hack days’ or ‘hackathons’ aimed at opening up and linking data and creating apps, or work with external agencies to analyse data gathered by either party.

Ethics in data journalism: accuracy

The following is the first in a series of extracts from a draft book chapter on ethics in data journalism. This is a work in progress, so if you have examples of ethical dilemmas, best practice, or guidance, I’d be happy to include it with an acknowledgement.

Data journalism ethics: accuracy

Probably the most basic ethical consideration in data journalism is the need to be accurate, and to provide proper context to the stories that we tell. That can influence how we analyse the data, how we report on data stories, and whether we publish the data itself.

In late 2012, for example, data journalist Nils Mulvad finally got his hands on veterinary prescriptions data that he had spent seven years fighting for. But he decided not to publish the data when he realised that it was full of errors.