Defending an investigation — and planning one: lessons from ProPublica’s Black Snow

Sugar Companies Said Our Investigation Is Flawed and Biased. Let’s Dive Into Why That’s Not the Case.

In the summer of last year ProPublica published a major investigation into air pollution in Florida, and its connection to the sugar industry. The story itself, Black Snow, is an inspiring example of scrollytelling — but equally instructive is the methodology article which accompanies it, responding to criticisms from the sugar industry.

Not only does it demonstrate how to respond when large organisations attack a piece of journalism — it also provides a great lesson on the tactics that are adopted by organisations when attacking data-driven stories.

In this post I want to break down the three most common attack tactics, how ProPublica deal with two of those, and how to use the same tactics during planning to ensure your project design isn’t flawed.

Continue reading

What Data Journalists Need to Know About Application Programming Interfaces (APIs)

A list of APIs on the Parliament website
The UK Parliament publishes a series of APIs for political data

I’ve written a post for the Global Investigative Journalism Network about how APIs can be useful sources of data for journalists. The article is based on an earlier video post.

The article explains what APIs are and how they differ from other data sources; the basic principles of how they work and how they can be used for stories; some of the jargon to expect — and where to find them. Read the article here.
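As a rough illustration of those basic principles, here is a minimal sketch of the two steps most API work involves: building a request URL from parameters, then pulling fields out of the JSON that comes back. The endpoint, parameter names and sample response are all invented for illustration; they are not the Parliament API or any other real service.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameters (assumptions, not a real API)
BASE_URL = "https://api.example.org/members"
params = {"party": "Green", "format": "json"}

# Step 1: an API request is usually just a URL with a query string
request_url = f"{BASE_URL}?{urlencode(params)}"
print(request_url)

# Step 2: the response is typically JSON. This sample text is made up
# and stands in for what a real request would return.
sample_response = """
{"items": [
    {"name": "A. Example", "constituency": "Northtown"},
    {"name": "B. Sample", "constituency": "Southville"}
]}
"""
data = json.loads(sample_response)

# Drill into the nested structure to extract the field we want
names = [item["name"] for item in data["items"]]
print(names)
```

With a real API you would fetch `request_url` with a library such as `urllib.request` and pass the response text to `json.loads` in place of the sample.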

Making video and audio interviews searchable: how Pinpoint helped with one investigation

Pinpoint creates a ranking of people, organisations, and locations with the number of times they are mentioned in your uploaded documents.

MA Data Journalism student Tony Jarne spent eight months investigating exempt accommodation, collecting hundreds of documents, audio and video recordings along the way. To manage all this information, he turned to Google’s free tool Pinpoint. In a special guest post for OJB, he explains why it should be an essential part of any journalist’s toolkit.

The use of exempt accommodation — a type of housing for vulnerable people — has rocketed in recent years.

At the end of December, a select committee was set up in Parliament to look into the issue. The committee set a deadline, and anyone who wished to could submit written evidence before it.

Organisations, local authorities and citizens submitted more than 125 pieces of written evidence to be taken into account by the committee. Some are only one page — others are 25 pages long.

In addition to the written evidence, I had various reports, news articles, Land Registry titles and company accounts downloaded from Companies House.

I needed a tool to organise all the documentation. I needed Pinpoint.

Continue reading

Here’s a framework to help fill the ‘human gap’ in your story

One of the most common challenges for student journalists is identifying the right human sources to turn a lead into a fleshed-out story. And one of the most common mistakes is not spending enough time on this vital step in the reporting process.

To help with this, here’s a framework for brainstorming potential sources.

Different types of source and potential roles in stories: matrix of 5 source categories (power; expert; representative; witness; case study) and 4 roles (action; context/colour; reaction; reply).

The five categories of source

There are five categories of source in the framework:

Continue reading

VIDEO: An introduction to SQL for data journalists

The database query language SQL pops up in all sorts of places when you’re working with data — especially big data — and can be a very useful way to query data in spreadsheets, APIs and coding. This video, made for students on the MA in Data Journalism at Birmingham City University, explains what SQL is, the different places you will come across it, and how to get started with SQL queries.
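To give a flavour of what a first SQL query looks like, here is a minimal sketch that runs one from Python using the built-in `sqlite3` module. The table name, column names and figures are made up for illustration and are not taken from the video.

```python
import sqlite3

# Made-up rows purely for illustration: (council, year, total complaints)
rows = [
    ("Northtown", 2021, 120),
    ("Northtown", 2022, 150),
    ("Southville", 2021, 80),
    ("Southville", 2022, 60),
]

# An in-memory database, so nothing is written to disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE complaints (council TEXT, year INTEGER, total INTEGER)")
conn.executemany("INSERT INTO complaints VALUES (?, ?, ?)", rows)

# SELECT chooses columns, GROUP BY aggregates per council,
# ORDER BY ranks the results from highest to lowest
query = """
    SELECT council, SUM(total) AS all_years
    FROM complaints
    GROUP BY council
    ORDER BY all_years DESC
"""
results = conn.execute(query).fetchall()
for council, all_years in results:
    print(council, all_years)

conn.close()
```

The same `SELECT ... GROUP BY ... ORDER BY` pattern carries over, with small syntax differences, to the other places SQL turns up.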

You’ll find related resources and tutorials in the repo here.

UPDATE: Thanks to Tony Hirst in the comments for pointing me to his post about browser-based SQL tools.

This video is shared as part of a series of video posts.

VIDEO: Big data, open data, linked data and other big ideas that data journalists need to know about

Three key terms you might hear used in data journalism circles are “open data”, “linked data” and “big data”. This video, made for students on the MA in Data Journalism at Birmingham City University, explores definitions of the three terms, explains some of the jargon used in relation to them, and the critical and ethical issues to consider in relation to open and big data in particular.

Three other video clips are mentioned in the video, and these are embedded below. First of all, Tim Berners-Lee’s 2009 call for “raw data now”, where he outlined the potential of open and linked data…

Continue reading

Here are some great examples of how to use AI and satellite imagery in journalism

False colour image of the Paraná River near its mouth at the Rio de La Plata, Argentina. Image: Copernicus Sentinel data [2022] processed by Sentinel Hub.

In a guest post for OJB, first published on ML Satellites, MA Data Journalism student Federico Acosta Rainis explains what can be learned from some examples of the format.

Satellite imagery is increasingly a key asset for journalists. Looking from above often allows us to put a story into context, take a more interesting perspective or show what some power prefers to keep hidden.

But with hundreds of satellites taking thousands of images of the Earth every day, it is difficult to separate the wheat from the chaff. How can we find relevant stories in this ocean of data?

Continue reading

What stories can you tell using AI and satellite imagery? Here are some ideas

In the second of two guest posts for OJB, first published on the ML Satellites blog, MA Data Journalism student Federico Acosta Rainis uses the “8 angles used by data journalists” framework to explore satellite image-driven journalism.

Satellite-driven stories don’t have to use artificial intelligence (AI): many can be told using satellite data alone. The main advantages of AI include quantifying phenomena, identifying patterns, showing changes or finding a “needle in a haystack” across large territories or different time periods.

AI algorithms can also be used to automate a process: since satellites produce recurring data, you can build, for example, a platform that automatically detects changes in the size of forests.
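The automation idea can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the approach described in the post: instead of a trained AI model it labels a pixel as forest when its NDVI (a standard vegetation index) passes a threshold, then compares the forested area between two dates. The threshold, pixel size and grid values are all assumptions made up for illustration.

```python
# Assumed cut-off for "forest"; a real project would tune this
# or replace the thresholding step with a trained classifier
FOREST_NDVI_THRESHOLD = 0.6


def forest_pixels(ndvi_grid, threshold=FOREST_NDVI_THRESHOLD):
    """Count pixels whose NDVI value is at or above the threshold."""
    return sum(1 for row in ndvi_grid for value in row if value >= threshold)


def forest_change_km2(before, after, pixel_area_km2):
    """Return the change in forested area (negative means loss)."""
    pixel_difference = forest_pixels(after) - forest_pixels(before)
    return pixel_difference * pixel_area_km2


# Tiny made-up NDVI grids for the same place on two dates
ndvi_first_date = [
    [0.80, 0.70, 0.20],
    [0.90, 0.65, 0.10],
]
ndvi_second_date = [
    [0.80, 0.30, 0.20],
    [0.90, 0.20, 0.10],
]

# Hypothetical pixel size: each pixel covers 0.01 square km
change = forest_change_km2(ndvi_first_date, ndvi_second_date, pixel_area_km2=0.01)
print(f"Change in forest area: {change:.2f} square km")
```

Because satellites revisit the same area regularly, a function like this could be run each time a new image arrives, which is what makes the process automatable.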

Paul Bradshaw’s framework for data journalism angles recognises eight types of stories: scale, change, ranking, variation, exploration, relationships, stories about data and stories through data. The same framework can be adopted to generate ideas for satellite journalism, too.

Continue reading

Journalism, AI and satellite imagery: how to get started

Satellite image of the Amazon. Tocantins, Brazil. Source: Copernicus Sentinel data [2022] processed by Sentinel Hub, using Highlight Optimized Natural Color.

In the first of two guest posts for OJB, first published on ML Satellites, MA Data Journalism student Federico Acosta Rainis explains how to get started with satellite journalism — and avoid common pitfalls.

Working with satellite imagery and AI models takes time and patience. There is no general rule: you have to find the right model for each case, in a process of trial and error, while crunching large amounts of data.

That is why the advice of Anatoly Bondarenko, data editor of Texty, is crucial:

Continue reading