Category Archives: data journalism

How to search for information in data black holes: Barbara Maseda and the Inventario project

Barbara Maseda

Image from Knight Center

Ahead of a Global Investigative Journalism Conference panel on reporting in countries without press freedom, Liana Bravo spoke to Cuban data journalist Barbara Maseda about launching a data portal for journalists in an environment where no official data is published.

Bárbara Maseda has dedicated the last four years to publishing data where none exists. “In Cuba we use investigative journalism tools to search for information that elsewhere in the world would be in a press release,” she says. Other journalists’ data problems, such as receiving data in formats that are difficult to analyse, “are my highest aspirations”.

In 2018 she created Inventario, an open data project for Cuba. “In Cuba data is treated as confidential information. The government does not give data — or when it does, they are not disaggregated,” she says.

FAQ: Books to read in preparation for doing a data journalism course

This is what you’ll look like after reading all of these books… (“Study of a Man Reading” by Alphonse Legros)

This latest in the frequently asked questions series is an answer to an aspiring data journalism student who asks “Would you be able to direct me to any resources or text books that might help [prepare]?” Here are some recommendations I give to students on my MA in Data Journalism.

Books on data journalism as a profession

Data journalism isn’t just the application of a practical skill, but a profession with a culture, a history, and non-technical practices.

For that reason, probably the first thing to recommend is not a book but general reading: read (and listen to, and watch) as much data journalism, and journalism generally, as possible. These mailing lists (and these) are a good start, and following data journalists on Twitter, and the hashtag #ddj, will expose you to the debates taking place in the industry.

When you get data in sentences: how to use a spreadsheet to extract numbers from phrases

Unduly lenient sentences review scheme inadequate

This BBC story involved converting phrases into numbers that could be used in calculations

Earlier this month the BBC Data Unit published a story on unduly lenient sentences which involved working with data that was trapped in phrases.

We needed to be able to take a collection of words such as “11 years and 5 months’ imprisonment” and convert that into something that could be used in spreadsheet calculations (specifically, comparing the lengths of time represented by two different phrases).

It’s a problem you come across every so often as a journalist, especially with FOI requests, so in this post (taken from the book Finding Stories in Spreadsheets) I’ll explain how to do that.
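
As a rough illustration of the conversion described above, here is a minimal Python sketch rather than the spreadsheet method the post itself covers. The function name `phrase_to_months` and the choice of months as the common unit are assumptions made for this example. It only handles digits followed by "year(s)" or "month(s)", so phrases using weeks, days, number words or "life" would need extra handling.

```python
import re

# Assumption for this sketch: every sentence length is reduced to months
# so that two phrases can be compared as plain numbers.
UNIT_MONTHS = {"year": 12, "month": 1}

# Matches a number followed by "year(s)" or "month(s)", e.g. "11 years".
PATTERN = re.compile(r"(\d+)\s+(year|month)s?", re.IGNORECASE)


def phrase_to_months(phrase: str) -> int:
    """Return the total number of months mentioned in a phrase (hypothetical helper)."""
    total = 0
    for number, unit in PATTERN.findall(phrase):
        total += int(number) * UNIT_MONTHS[unit.lower()]
    return total


original = phrase_to_months("11 years and 5 months’ imprisonment")
print(original)  # 137
```

Once both phrases are numbers, comparing two sentence lengths is a simple subtraction, for example `phrase_to_months("13 years") - original`.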

FAQ: What are the essential computational skills that a journalist should develop?

Blue skyscrapers

Recognising patterns is a key skill in computational journalism (image by Stanley Zimny)

This latest group of frequently asked questions comes from an interview with Source, published here in full just in case it’s — you know — useful or something…

1. What are the essential computational skills that a journalist should develop?

Firstly, an ability to recognise patterns, or structured information. Spreadsheets are explicitly ‘data’ but some of the most interesting applications of computational journalism are where someone has seen data where others don’t.

Here are 2 videos and slides from my MA/PGCert Data Journalism taster day

Earlier this month I held a special open taster class at Birmingham City University for anyone interested in my full-time MA and part-time PGCert courses in Data Journalism. As some people couldn’t get to the UK to attend the event, I put together two video screencasts recapping some of the material covered in the session.

I’ve embedded the two videos — and slides from the day — below.

And if you want to try out some of the hands-on activities from the class, you can find them here.

If we are using AI in journalism we need better guidelines on reporting uncertainty

Chart: women speak 27% of the time in Game of Thrones

The BBC’s chart mentions a margin of error

There’s a story out this week on the BBC website about dialogue and gender in Game of Thrones. It uses data generated by artificial intelligence (AI) — specifically, machine learning — and it’s a good example of some of the challenges that journalists are increasingly going to face as they come to deal with more and more algorithmically-generated data.

Information and decisions generated by AI are qualitatively different from the sort of data you might find in an official report, but journalists may fall back on treating data as inherently factual.

Here, then, are some of the ways the article dealt with that — and what else we can do as journalists to adapt.

Margins of error: journalism doesn’t like vagueness

The story draws on data from an external organisation, Ceretai, which “uses machine learning to analyse diversity in popular culture.” The organisation claims to have created an algorithm which “has learned to identify the difference between male and female voices in video and provides the speaking time lengths in seconds and percentages per gender.”

Crucially, the piece notes that:

“Like most automatic systems, it doesn’t make the right decision every time. The accuracy of this algorithm is about 85%, so figures could be slightly higher or lower than reported.”

And this is the first problem.
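
To get a feel for how much an accuracy figure might move a reported percentage, here is a toy Python sketch. It rests on strong assumptions that are not from the story or from Ceretai: that "accuracy" means each second of speech is labelled correctly with the same probability whichever gender is speaking, and that errors are independent. The function name `implied_true_share` is hypothetical. It is a way of exploring sensitivity, not a description of how the algorithm or the BBC's figures actually work.

```python
def implied_true_share(observed_share: float, accuracy: float) -> float:
    """Estimate the underlying share under a simple symmetric-error model.

    Assumed model (not from the source):
        observed = true * accuracy + (1 - true) * (1 - accuracy)
    Rearranged to solve for the true share.
    """
    if not 0.5 < accuracy <= 1.0:
        raise ValueError("this toy model needs an accuracy above 0.5 and at most 1")
    return (observed_share - (1 - accuracy)) / (2 * accuracy - 1)


# The chart's 27% figure and the quoted accuracy of "about 85%".
estimate = implied_true_share(0.27, 0.85)
print(round(estimate, 3))  # 0.171
```

Under these assumptions the implied figure is about 17% rather than 27%. A different error model would give a different answer, which is why it matters exactly what an accuracy figure measures.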

Data Journalism Awards 2019 open for entries

Data Journalism Awards 2019 logo

The Data Journalism Awards is now accepting entries for its 2019 awards.

It’s the 8th year of the awards. This year the “Best data journalism team” category has been divided into two categories: small and large teams, with the “Small newsrooms (one or more winners)” category making way for the change.

The awards website has also been revamped to include a range of resources for data journalists, a “Community” section (in addition to the existing Slack group) and news on data journalism developments.

The deadline to enter is 7 April 2019. Winners get an all-expenses-covered trip to June’s Global Editors Network (GEN) Summit and Data Journalism Awards ceremony.