Category Archives: online journalism

Telling stories with data: more on the difference between ‘variation’ stories and ‘ranking’ angles

7 common angles for data stories: scale, change, ranking, variation, explore, relationships, bad data, leads
The 7 angles. Also available in Norwegian and Finnish.

One of the most common challenges I encounter when teaching people the 7 most common story angles in data journalism is confusion between variation and ranking stories. It all comes down to the difference between process and product.

That’s because both types of story involve ranking as a piece of data analysis.

We might rank the number of specialist teachers in the country’s schools, for example, in order to tell either of the following stories:

  • “There are more specialist science teachers than those in any other subject, new data reveals”
  • “New data reveals stark differences in the number of specialists teaching each subject in secondary schools”

The first story reveals which subject has the most teachers — it is a ranking angle because it ranks teachers by subject.

The second story reveals the simple fact that variation exists, without focusing on any particular subject.
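The process/product distinction can be sketched in a few lines of Python. This is only an illustration: the teacher counts are invented, and the function names (`rank_subjects`, `ranking_angle`, `variation_angle`) are hypothetical. The same ranking step feeds both headlines; only the final step differs.

```python
# Hypothetical counts of specialist teachers per subject (illustrative only).
teachers_by_subject = {
    "Science": 4200,
    "Maths": 3900,
    "English": 3700,
    "Languages": 1500,
    "Music": 600,
}

def rank_subjects(counts):
    """Process: sort subjects from most to fewest teachers."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)

def ranking_angle(ranked):
    """Product 1: focus on which subject sits at the top of the ranking."""
    top_subject, top_count = ranked[0]
    return f"{top_subject} has the most specialist teachers ({top_count})"

def variation_angle(ranked):
    """Product 2: focus on the size of the gap, not on any one subject."""
    highest = ranked[0][1]
    lowest = ranked[-1][1]
    ratio = highest / lowest
    return f"Specialist teacher numbers vary {ratio:.0f}-fold between subjects"

ranked = rank_subjects(teachers_by_subject)
print(ranking_angle(ranked))
print(variation_angle(ranked))
```

Both functions take the same ranked list as input, which is why the two angles are easy to confuse: the analysis is identical, and the story depends on which part of the result you report.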

Continue reading

“Journey prompts” and “destination prompts”: how to avoid becoming deskilled when using AI

A road
Photo: Tiana

How do you use AI without becoming less creative, more stupid, or deskilled? One strategy is to check whether your prompts are focused on an endpoint that you’re trying to get to, or on building the skills that will get you there — what I call “destination prompts” and “journey prompts”.

In creative work, for example, you might be looking for an idea, or aiming to produce a story or image. In journalism or learning, a ‘destination’ might be key facts, or an article or report.

But prompts that focus only on those destinations are less likely to help us learn, more likely to deskill us — and more likely to add errors to our work.

To avoid those pitfalls, it is better to focus on how we get to those destinations. What, in other words, are the journeys?

Continue reading

AI and “editorial independence”: a risk — or a distraction?

Tools
When you have a hammer does everything look like a nail? Photo by Hunter Haley on Unsplash

TL;DR: By treating AI as a biased actor rather than a tool shaped by human choices, we risk ignoring more fundamental sources of bias within journalism itself. Editorial independence lies in how we manage tools, not which ones we use.

Might AI challenge editorial independence? It’s a suggestion made in some guidance on AI — and I think a flawed one.

Why? Let me count the ways. The first problem is that it contributes to a misunderstanding of how AI works. The second is that it reinforces a potentially superficial understanding of editorial independence and objectivity. But the main danger is it distracts from the broader problems of bias and independence in our own newsrooms.

Continue reading

6 ways to communicate data journalism (The inverted pyramid of data journalism part 2)

Data journalism: communicating data. Visualise, narrate, break down, personalise, sonify/materialise, offer utility

The inverted pyramid of data journalism maps the process of using data in reporting, from developing ideas through cleaning, contextualising and combining to communicating. In this final stage, communication, we should take a step back and consider our options: from visualisation and narration to personalisation and tools.

(Also available in English, Spanish and Portuguese.)

1. Visualise

Visualisation can be a quick way to convey the results of data journalism: free tools such as Datawrapper and Flourish often require only that you upload your data and choose from a range of visualisation options.

Continue reading

How to ask AI to perform data analysis

Consider the model: Some models are better for analysis — check it has run code

Name specific columns and functions: Be explicit to avoid ‘guesses’ based on your most probable meaning

Design answers that include context: Ask for a top/bottom 10 instead of just one answer

'Ground' the analysis with other docs: Methodologies, data dictionaries, and other context

Map out a method using CoT: Outline the steps that need to be taken, to reduce risk

Use prompt design techniques to avoid gullibility and other risks: N-shot prompting (examples), role prompting, negative prompting and meta prompting can all reduce risk

Anticipate conversation limits: Regularly ask for summaries you can carry into a new conversation

Export data to check: Download analysed data to check against the original

Ask to be challenged: Use adversarial prompting to identify potential blind spots or assumptions

In a previous post I explored how AI performed on data analysis tasks — and the importance of understanding the code that it used to do so. If you do understand code, here are some tips for using large language models (LLMs) for analysis — and addressing the risks of doing so.
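As a rough illustration of the “export data to check” tip, the sketch below compares a file exported from an AI tool against the original. Everything here is an assumption made for the example: the CSV contents are invented, and the helper names (`load_rows`, `column_total`, `compare`) are hypothetical rather than part of any tool.

```python
import csv
import io

# Hypothetical data standing in for the original file and the AI tool's export.
ORIGINAL_CSV = """school,subject,teachers
A,Science,12
A,Music,2
B,Science,9
B,Music,
"""

EXPORTED_CSV = """school,subject,teachers
A,Science,12
A,Music,2
B,Science,9
"""

def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))

def column_total(rows, column):
    """Sum a numeric column, skipping blank cells."""
    return sum(float(row[column]) for row in rows if row[column].strip())

def compare(original, exported, column):
    """Report differences in row count and in the total of one column."""
    return {
        "rows_dropped": len(original) - len(exported),
        "total_difference": column_total(original, column)
        - column_total(exported, column),
    }

report = compare(load_rows(ORIGINAL_CSV), load_rows(EXPORTED_CSV), "teachers")
print(report)
```

In this invented case the column totals match but one row has gone missing, because a row with a blank value was dropped during analysis. Checking row counts as well as totals is one way to catch that kind of silent change.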

Continue reading

I tested AI tools on data analysis — here’s how they did (and what to look out for)

Mug with 'Data or it didn't happen' on it
Photo: Jakub T. Jankiewicz | CC BY-SA 2.0

TL;DR: If you understand code, or would like to understand code, genAI tools can be useful for data analysis, but results depend heavily on the context you provide, and the likelihood of flawed calculations means code needs checking. If you don’t understand code (and don’t want to), don’t do data analysis with AI.

ChatGPT used to be notoriously bad at maths. Then it got worse at maths. And the recent launch of its newest model, GPT-5, showed that it’s still bad at maths. So when it comes to using AI for data analysis, it’s going to mess up, right?

Well, it turns out that the answer isn’t that simple. And the reason why it’s not simple is important to explain up front.

Generative AI tools like ChatGPT are not calculators. They use language models to predict a sequence of words based on examples from their training data.

But over the last two years AI platforms have added the ability to generate and run code (mainly Python) in response to a question. This means that, for some questions, they will try to predict the code that a human would probably write to solve your question — and then run that code.

When it comes to data analysis, this has two major implications:

  1. Responses to data analysis questions are often (but not always) the result of calculations, rather than a predicted sequence of words. The algorithm generates code, runs that code to calculate a result, then incorporates that result into a sentence.
  2. Because we can see the code that performed the calculations, it is possible to check how those results were arrived at.
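To illustrate why being able to read the code matters, here is a minimal sketch of the kind of check it makes possible. The figures and function names are invented for the example and are not taken from any real AI output. It contrasts a plausible shortcut (averaging each school's pass rate) with the pupil-weighted calculation that a question about “the overall pass rate” would normally call for.

```python
# Hypothetical figures: (school, pupils, pupils passing).
schools = [
    ("Small school", 50, 45),
    ("Large school", 950, 475),
]

def mean_of_rates(rows):
    """A shortcut a generated script might take: average each school's pass rate."""
    rates = [passed / pupils for _, pupils, passed in rows]
    return 100 * sum(rates) / len(rates)

def overall_rate(rows):
    """The pupil-weighted figure: total passes divided by total pupils."""
    total_pupils = sum(pupils for _, pupils, _ in rows)
    total_passed = sum(passed for _, _, passed in rows)
    return 100 * total_passed / total_pupils

print(f"Mean of school pass rates: {mean_of_rates(schools):.1f}%")
print(f"Overall pupil pass rate: {overall_rate(schools):.1f}%")
```

Both calculations run without error, so the output alone gives no hint that anything is wrong. Only by reading the code can you tell which of the two questions was actually answered.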
Continue reading

Three more angles most often used to tell data stories: explorers, relationships and metadata stories

In a previous post I wrote about four of the angles most often used to tell stories about data. In this second part I look at the three remaining angles: stories that focus on relationships; ‘metadata’ angles that focus on the absence, poor quality or collection of data; and exploratory pieces that mix several angles or offer a chance to get to know the data itself.

7 common angles for data stories

Scale: ‘This is how big the problem is’
Change/stasis: ‘This is rising/falling/not getting better’
Ranking: ‘The best/worst/where we rank’
Variation: ‘Geographical lottery’
Explore: Features, interactivity and art
Relationships/debunking: ‘Things are connected’, or not; networks and flows of power and money
Metadata: ‘Concerns about data’; ‘Missing data’, ‘Getting hold of the data’
Continue reading

This is what happened when I asked journalism students to keep an ‘AI diary’

Last month I wrote about my decision to use an AI diary as part of assessment for a module I teach on the journalism degrees at Birmingham City University. The results are in — and they are revealing.

AI diary screenshots, including AI diary template which says:
Use this document to paste and annotate all your interactions with genAI tools. 

Interactions should include your initial prompt and response, as well as follow up prompts (“iterations”) and the responses to those. Include explanatory and reflective notes in the right hand column. Reflective notes might include observations about potential issues such as bias, accuracy, hallucinations, etc. You can also explain what you did outside of the genAI tool, in terms of other work. 

At least some of the notes should include links to literature (e.g. articles, videos, research) that you have used in creating the prompt or in reflecting on it. You do not need to use Harvard referencing - but the link must go directly to the material. See the examples on Moodle for guidance.

To add extra rows place your cursor in the last box and press the Tab key on your keyboard, or right-click in any row and select ‘add new row’.
Excerpts from AI diaries

What if we just asked students to keep a record of all their interactions with AI? That was the thinking behind the AI diary, a form of assessment that I introduced this year for two key reasons: to increase transparency about the use of AI, and to increase critical thinking.

Continue reading

How to reduce the environmental impact of using AI

Generative AI: reducing environmental impact
Disable AI or switch tool
Compare AI vs non-AI
Compare models
Prompt planning
Prompt design and templating
Measuring and reviewing
Run locally

One of the biggest concerns over the use of generative AI tools like ChatGPT is their environmental impact. But what is that impact — and what strategies are there for reducing it? Here is what we know so far — and some suggestions for good practice.

What exactly is the environmental impact of using generative AI? It’s not an easy question to answer, as the MIT Technology Review’s James O’Donnell and Casey Crownhart found when they set out to find some answers.

“The common understanding of AI’s energy consumption,” they write, “is full of holes.”

Continue reading

The inverted pyramid of data journalism: from dataset to story

The inverted pyramid of data journalism
Develop ideas
Gather data
Clean
Contextualise
Combine
Question
Communicate

Data journalism projects can be broken down into individual steps, and each step brings its own challenges. To help you, I developed the “inverted pyramid of data journalism”. It shows how to turn an idea into a focused data story. I explain step by step what to look out for, and give you tips on how to avoid typical stumbling blocks.

(Also available in English, Spanish, Portuguese, Finnish, Russian and Ukrainian.)

Continue reading