Category Archives: data journalism

I tested AI tools on data analysis — here’s how they did (and what to look out for)

Mug with 'Data or it didn't happen' on it
Photo: Jakub T. Jankiewicz | CC BY-SA 2.0

TL;DR: If you understand code, or would like to, genAI tools can be useful for data analysis, but results depend heavily on the context you provide, and the likelihood of flawed calculations means the code needs checking. If you don’t understand code (and don’t want to), don’t do data analysis with AI.

ChatGPT used to be notoriously bad at maths. Then it got worse at maths. And the recent launch of its newest model, GPT-5, showed that it’s still bad at maths. So when it comes to using AI for data analysis, it’s going to mess up, right?

Well, it turns out that the answer isn’t that simple. And the reason why it’s not simple is important to explain up front.

Generative AI tools like ChatGPT are not calculators. They use language models to predict a sequence of words based on examples from their training data.

But over the last two years AI platforms have added the ability to generate and run code (mainly Python) in response to a question. This means that, for some questions, they will try to predict the code that a human would probably write to answer your question, and then run that code.

When it comes to data analysis, this has two major implications:

  1. Responses to data analysis questions are often (but not always) the result of calculations, rather than a predicted sequence of words. The algorithm generates code, runs that code to calculate a result, then incorporates that result into a sentence.
  2. Because we can see the code that performed the calculations, it is possible to check how those results were arrived at.
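
To make that concrete, here is a minimal sketch of the kind of Python a chatbot might generate and run for a question like "what is the average spend?". The data, the column names and the `mean_spend` function are all invented for illustration; they are assumptions, not output from any of the tools tested.

```python
import csv
import io
import statistics

# Made-up illustrative data, not from any real dataset.
SAMPLE_CSV = """region,spend
North,120
South,80
East,
West,100
"""


def mean_spend(csv_text):
    """Return (mean, rows_used, rows_skipped) for the 'spend' column."""
    values = []
    skipped = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        cell = row["spend"].strip()
        if cell == "":
            # A silent choice worth checking: blank cells are dropped,
            # not treated as zero.
            skipped += 1
            continue
        values.append(float(cell))
    return statistics.mean(values), len(values), skipped


mean, used, skipped = mean_spend(SAMPLE_CSV)
# The calculated result is then slotted into a sentence.
print(f"Mean spend: {mean:.1f} (based on {used} rows, {skipped} skipped)")
```

Reading the code reveals a decision that the sentence alone would hide: the blank row is dropped rather than counted as zero. Counting it as zero would give an average of 75 rather than 100, which is exactly the kind of thing that needs checking.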

Three more angles most often used to tell data stories: explorers, relationships and metadata stories

In an earlier post I wrote about four of the angles most often used to tell stories about data. In this second part I look at the three remaining angles: stories that focus on relationships; ‘metadata’ angles that focus on the data’s absence, poor quality or collection; and exploratory articles that mix several angles or offer an opportunity to get to know the data itself.

7 common angles for data stories

Scale: ‘This is how big the problem is’
Change/stasis: ‘This is rising/falling/not getting better’
Ranking: ‘The best/worst/where we rank’
Variation: ‘Geographic lottery’
Explore: Features, interactivity and art
Relationships/debunking: ‘Things are connected’, or not; networks and flows of power and money
Metadata: ‘Concerns about data’; ‘Missing data’, ‘Getting hold of the data’

The inverted pyramid of data journalism: from dataset to story

The inverted pyramid of data journalism
Develop ideas
Gather data
Clean
Contextualise
Combine
Question
Communicate

Data journalism projects can be broken down into individual steps, and each step brings its own challenges. To help you, I developed the “Inverted Pyramid of Data Journalism”. It shows how to turn an idea into a focused data story. I explain step by step what you should look out for, and give you tips on how to avoid typical stumbling blocks.

(Also available in English, Spanish, Finnish, Russian and Ukrainian.)


9 takeaways from the Data Journalism UK conference

Attendees in a lecture theatre with 'data and investigative journalism conference 2025 BBC Shared Data Unit' on the screen.

Last month the BBC’s Shared Data Unit held its annual Data and Investigative Journalism UK conference at the home of my MA in Data Journalism, Birmingham City University. Here are some of the highlights…


How do I get data if my country doesn’t publish any?

Spotlight photo by Paul Green on Unsplash

In many countries public data is limited: access is restricted, or the information provided by the authorities is not credible. So how do you obtain data for a story? Here are some techniques used by reporters around the world.


The most common angles journalists use when telling stories with data

7 common angles for data stories
Scale, Change, Ranking, Variation, Explore, Relationships, Bad/open, + stories

Data-driven storytelling can be divided into seven main categories, according to an analysis of 200 articles. In the first of two posts I will demonstrate the four most used angles in news stories, how they can give you more options as a reporter, and how they can help you work more efficiently with data.

Most datasets can tell many stories, so many that for some people it can seem overwhelming or distracting. Identifying which stories are possible, and choosing the best story within the time and skills you have available, is an important editorial skill.

Many beginners in data journalism first look for stories about relationships (cause and effect), but these stories are difficult and time-consuming. You may want to tell a story about things getting worse or better, but lack the data to tell it. If you have very little time and want to get started with data journalism, the quickest and easiest stories you can tell with data are stories about scale.


Niamh McIntyre: tips from a career in data journalism

Niamh McIntyre

The Bureau of Investigative Journalism’s Big Tech Reporter Niamh McIntyre has been working with data for eight years — but it all stemmed from an “arbitrary choice” at university. She spoke to MA Data Journalism student Leyla Reynolds about how she got started in the field, why you don’t need to be a maths whizz to excel, and navigating the choppy waters of the newsroom. 

Starting out on any new path can be daunting, but in the minutes before my phone call with Niamh McIntyre, I’m acutely aware that upping sticks to Birmingham and training in data journalism at the grand old age of 29 is nothing less than a tremendous luxury.

A younger me might have — would have — quaked at such a scenario, so I’m keen to know more about Niamh’s work, which ranges from investigating the gig work industry to private children’s homes.


VIDEO: Developing ideas for factual storytelling

Strong factual storytelling relies on good idea development. In this video, part of a series of video posts made for students on the MA in Data Journalism at Birmingham City University, I explain how to generate good ideas by avoiding common mistakes, applying professional techniques and considering your audience.

The links mentioned in the video include:

Related post: Here’s how we teach creativity in journalism (and why it’s the 5th habit of successful journalists)

Identifying bias in your writing — with generative AI

Applications of genAI in the journalism process
Pyramid with the third 'Production' tier highlighted: Identify jargon and bias; improve spelling, grammar, structure and brevity

In the latest in a series of posts on using generative AI, I look at how tools such as ChatGPT and Claude.ai can help identify potential bias and check story drafts against relevant guidelines.

We are all biased — it’s human nature. It’s the reason stories are edited; it’s the reason that guidelines require journalists to stick to the facts, to be objective, and to seek a right of reply. But as the Columbia Journalism Review noted two decades ago: “Ask ten journalists what objectivity means and you’ll get ten different answers.”

Generative AI is notoriously biased itself — but it has also been trained on more material on bias than any human likely has. So, unlike a biased human, when you explicitly ask it to identify bias in your own reporting, it can perform surprisingly well.

It can also be very effective in helping us consider how relevant guidelines might be applied to our reporting — a checkpoint in our reporting that should be just as baked-in as the right of reply.

In this post I’ll go through some template prompts and tips on each. First, a recap of the rules of thumb I introduced in the previous post.
