Category Archives: data journalism

VIDEO: Mapping for data journalists

If you’re using maps as a data journalist it’s important to be aware of the editorial choices you are making — and how they can skew your reporting.

In this video — first made for students on the MA in Data Journalism at Birmingham City University and shared as part of a series of video posts — I introduce critical cartography, the different types of maps you might choose to use to tell a story, and the different types of stories that they can tell.

I also give some examples of geography-based stories that might be better told with other charts, and list some tools and tips that can be used to tell geographical stories.

Links mentioned in the video include Theo Kindynis’s research on critical cartography and Stories Behind A Line. I also refer to two related videos, which are embedded below. First, a West Wing clip on mapping (more on that here):


VIDEO: The 3 chords of data journalism

With just a few basic techniques you can tell a lot of data journalism stories. I call these the “three chords of data journalism”, a nod to Simon Rogers’s talk on data journalists as the new punks. Those chords are sorting, filtering and calculating percentages.

In this third video, first made for students on the MA in Data Journalism at Birmingham City University and shared as part of a series of video posts, I walk through those techniques in practice, using gender pay gap data to show how they can be used to find outliers and potential interviewees, to drill down to a particular category or area in a dataset, and to put figures into context.
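As a rough illustration of the three chords, here is a minimal Python sketch. The video itself works with gender pay gap data in a spreadsheet; the rows, field names (`sector`, `median_gap`) and helper functions below are hypothetical and only show the shape of each technique.

```python
# Hypothetical rows for illustration only; not real gender pay gap figures.
employers = [
    {"name": "Employer A", "sector": "Retail", "median_gap": 12.5},
    {"name": "Employer B", "sector": "Education", "median_gap": 30.0},
    {"name": "Employer C", "sector": "Retail", "median_gap": -4.0},
    {"name": "Employer D", "sector": "Education", "median_gap": 8.0},
]


def sort_by_gap(rows):
    """Chord 1, sorting: largest gap first, so outliers rise to the top."""
    return sorted(rows, key=lambda row: row["median_gap"], reverse=True)


def filter_by_sector(rows, sector):
    """Chord 2, filtering: keep only the rows for one category."""
    return [row for row in rows if row["sector"] == sector]


def percentage(part, whole):
    """Chord 3, percentages: express a count as a share of the total."""
    return round(100 * part / whole, 1)


ranked = sort_by_gap(employers)
retail = filter_by_sector(employers, "Retail")

# Share of employers whose median gap is above zero (an assumed definition
# of "gap in favour of men" for this sketch).
positive_gap = [row for row in employers if row["median_gap"] > 0]
share = percentage(len(positive_gap), len(employers))

print("Largest gap:", ranked[0]["name"])
print("Retail employers:", [row["name"] for row in retail])
print("Share with a positive gap:", share)
```

Sorting surfaces the extremes worth a phone call, filtering narrows the data to a patch or sector, and the percentage puts a raw count into context.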

VIDEO: Where data journalists get data from

Journalists get hold of data using four broad approaches: it might be newly published or issued; it might be leaked; they might request it; or they might seek it out based on an idea or in reaction to a news event.

In this second short video first made for students on the MA in Data Journalism at Birmingham City University and shared as part of a series of video posts, I go through the different ways that journalists obtain data and the different types of story that those sources can lead to.

VIDEO: What is data journalism — and why is it growing so much?

Data journalism isn’t just about spreadsheets and interactives: in this video from my MA Data Journalism classes at Birmingham City University I look at why the news industry has expanded its focus on data journalism over the past decade, and how thinking about definitions of data journalism can help reporters think more broadly about potential stories and subjects beyond official statistics.

I also look at related terms such as computational journalism, robot journalism and augmented journalism — and what we can learn from those definitions as practitioners.

This is part of a series of videos recorded during the coronavirus pandemic.

Investigating the World Cup: tips on making FOIA requests to create a data-driven news story

Image by Ambernectar 13

Beatriz Farrugia used Brazil’s freedom of information laws to investigate the country’s hosting of the World Cup. In a special guest post for OJB, the Brazilian journalist and former MA Data Journalism student passes on some of her tips for using FOIA.

I am from Brazil, a country well-known for football and FIFA World Cup titles — and the host of the World Cup in 2014. Being a sceptical journalist, in 2019 I tried to discover the real impacts of that 2014 World Cup on the 213 million residents of Brazil: tracking the 121 infrastructure projects that the Brazilian government carried out for the competition and which were considered the “major social legacy” of the tournament.  

In 2018 the Brazilian government had taken the website and official database on the 2014 FIFA World Cup infrastructure projects offline — so I had to make Freedom of Information (FOIA) requests to get data.

The investigation took 3 months and more than 230 FOIA requests to 33 different public bodies in Brazil. On August 23, my story was published.

Here is everything that I have learned from making those hundreds of FOIA requests:


Here’s a story about a celebrity fashion charity which provides some useful tips and tricks for journalists using company accounts

If you follow me on Twitter you’ll know every so often I highlight a story which uses company accounts. This latest one has celebrity and fashion, and involves a charity that’s raising money through star-studded events — what more can you ask for?

It’s a great excuse to find out about a range of techniques for finding stories and background in company accounts. Follow the thread from the tweet embedded below, or read it on Threadreader here.

What are regular expressions — and how to use them in Google Sheets to get data from text

In an extract from a new chapter in the ebook Finding Stories in Spreadsheets, I explain what regular expressions are — and how they can be used to extract information from spreadsheets. The ebook version of this tutorial includes a dataset and exercise to employ these techniques.

The story was an unusual one: the BBC Data Unit had been given access to a dataset on more than 200,000 works of art in galleries across the UK. What patterns could we find in the data that would allow us to tell a story about the nature of the nation’s paintings?

Some of the data was straightforward to work with: the ‘artist’ column was relatively clean, and allowed us to identify the most common male and female artist. It turned out that the latter – the Victorian botanist Marianne North – was relatively unknown. So, that was one story we could tell.


But other parts of the data were more problematic. The date column, for example, contained inconsistently formatted data: in the majority of cases a specific year had been entered, but in many others the data contained text such as “18th century” or “1900-1920” or “1800s”.

We also noticed that monarchs featured heavily in the art – but understandably there was no column that was specifically dedicated to classifying those. If we wanted to identify the most-painted monarchs we would have to create new data that somehow extracted those names from the paintings’ titles.

These problems – extracting data from existing data, particularly text data – are what regular expressions are designed for. In this chapter I will explain what regular expressions are, and how to use them in spreadsheets.
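To make the two problems concrete, here is a minimal sketch using Python's built-in `re` module rather than Google Sheets (where the comparable function is `REGEXEXTRACT`). The patterns, function names and sample strings are assumptions for illustration, not the ones used in the chapter or by the BBC Data Unit.

```python
import re

# Assumed pattern: the first run of four digits that starts at a word boundary.
YEAR_PATTERN = re.compile(r"\b(\d{4})")

# Assumed pattern: "King" or "Queen", a capitalised name, and an optional
# Roman numeral such as "II" or "VIII".
MONARCH_PATTERN = re.compile(r"\b(King|Queen)\s+([A-Z][a-z]+(?:\s+[IVX]+\b)?)")


def extract_year(date_text):
    """Return the first four-digit year in the text, or None if there is none."""
    match = YEAR_PATTERN.search(date_text)
    return match.group(1) if match else None


def extract_monarch(title):
    """Return e.g. 'Queen Victoria' if the title names a monarch, else None."""
    match = MONARCH_PATTERN.search(title)
    return f"{match.group(1)} {match.group(2)}" if match else None


# Date formats quoted in the text, plus hypothetical painting titles.
for date_text in ["1885", "1900-1920", "1800s", "18th century"]:
    print(repr(date_text), "->", extract_year(date_text))

for title in ["Portrait of Queen Victoria", "King Charles II on Horseback"]:
    print(repr(title), "->", extract_monarch(title))
```

Note that a value like “18th century” contains no four-digit year, so this sketch returns nothing for it; entries like that would need a separate rule or a manual check.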


The angles journalists use most often to tell stories with data

In my data journalism classes and training sessions, I often talk about the most common types of stories that can be found in datasets. So I selected 100 data-driven stories, analysed them and checked how often each of those angles is used.

I concluded that there are, in fact, seven main angles for data-driven reports and stories. Many stories incorporate other angles as secondary dimensions of the narrative (a story about change might go on to talk about the scale of something, for example), but every data journalism story I examined took one of these angles as its main thread.

In this post, I look at how the seven most common angles can help you come up with ideas for stories and reports, as well as the variety of ways they can be executed and the main considerations to keep in mind.


“Don’t give me more data — give me a story.” AJ Labs’ Mohammed Haddad on spotlighting human driven data journalism

The Arab Spring: Retweeted

Al Jazeera’s interactive team AJ Labs have a mantra: “human driven data journalism”. In a guest post for OJB Hanna Duggal speaks to the team’s lead Mohammed Haddad on what this means and how he tackles big data, including a recent story commemorating the Arab Spring. 

Mohammed Haddad joined Al Jazeera just as the Egyptian revolution began to unfold in 2011. Since then he has been behind some of Al Jazeera’s most prolific data stories, covering everything from UN General Assembly voting to mapping India and China’s disputed borders.

And, while many of the issues Al Jazeera covers are deeply complex, AJ Labs often help to explain such narratives using data journalism.

“Systems would go offline for days just to delay the release of data” – Rodrigo Menegat on Covid-19 data journalism in Brazil

In a guest post for OJB, Rodrigo George Willoughby spoke to data journalist Rodrigo Menegat about reporting on Covid-19 in Brazil, managing uncertainty and how data journalism could help debunk misinformation.

At the height of the first wave of the coronavirus pandemic in March, data on the disease was in high demand. It required collaboration — something made more difficult with data lacking in quality.

Having spent most of his career covering politics, last year Rodrigo Menegat realised that science data — particularly Covid-19 data — was fast becoming a staple in the newsroom. 

“The first challenge was learning how to cover data which is very different to sport or politics,” he says.

The difficulty was understanding something that, as a country, Brazil was not ready to face.