Author Archives: Paul Bradshaw

About Paul Bradshaw

Paul teaches data journalism at Birmingham City University and is the author of a number of books and book chapters about online journalism and the internet, including the Online Journalism Handbook, Mobile-First Journalism, Finding Stories in Spreadsheets, Data Journalism Heist and Scraping for Journalists. From 2010-2015 he was a Visiting Professor in Online Journalism at City University London and from 2009-2014 he ran Help Me Investigate, an award-winning platform for collaborative investigative journalism. Since 2015 he has worked with the BBC England and BBC Shared Data Units based in Birmingham, UK. He also advises and delivers training to a number of media organisations.

This one story can be used to discuss seven different types of bias

ITV News headline: John Torode’s wife Lisa Faulkner reveals breast cancer diagnosis

The latest “wife of” headline — ITV News’s report on the actor Lisa Faulkner revealing that she has undergone surgery after a cancer diagnosis — is an opportunity to get journalism students exploring how different forms of bias might shape news reporting — and not just the obvious ones.

Managing a mass FOI project? Here’s an AI-assisted methodology for that

Sending FOIs to multiple bodies across the country to get the big picture on an issue sounds like a great idea — until the responses start to trickle in. Differences between responses often make mass FOI projects extremely time-consuming as you try to get everything into a format that allows you to ask journalistic questions and compare different authorities. Can AI help?

On one recent project I decided to put together a methodology that made the process less stressful, faster and more accurate. Here’s how it works.

Audit & prioritise
Audit responses to identify the level of detail in each response and identify edge cases. Include a caveats column.
Augment the manual audit with a NotebookLM audit.
Identify a priority order for data, e.g. totals by outcome, hospital, category or year where these are provided separately.

Data structure
Design a data structure that can accommodate all responses.
The structure should follow ‘tidy’ data principles, i.e. one row per combination of features (force, category, hospital, outcome, year).
The structure should include source details, e.g. filename, sheet name, name of person entering data.

Extract & reshape
PDFs: use Tabula or vibe coding (design a prompt template to generate code to attempt to extract data).
Multi-sheet XLS files: use OpenRefine to import and combine sheets.
Design a prompt template for generating code to reshape CSV responses.

Check & verify
Manual checks (e.g. compare entries, check page-ending rows)
Analysis-based checks (e.g. pivots, totals)
AI-based checks using a prompt template (e.g. compare files)

Combine
Use OpenRefine, or design a prompt template for generating code to combine the resulting CSV files.
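
To make the reshape, check and combine steps more concrete, here is a minimal Python sketch of the kind of code a prompt template might generate. It is an illustration under stated assumptions, not the code used on the project: the column names in `TIDY_FIELDS`, the function names, the file names and all the figures are hypothetical, and it assumes each response arrives as a 'wide' CSV with one column per year.

```python
import csv
import io

# Hypothetical tidy structure: one row per combination of features,
# plus source details and a caveats column.
TIDY_FIELDS = ["force", "category", "year", "value", "source_file", "caveats"]


def reshape_wide_response(csv_text, force, source_file, caveats=""):
    """Turn one 'wide' response (a column per year) into tidy rows."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        category = record.pop("category")
        # Every remaining column is assumed to be a year.
        for year, value in record.items():
            rows.append({
                "force": force,
                "category": category,
                "year": year,
                "value": int(value),
                "source_file": source_file,
                "caveats": caveats,
            })
    return rows


def combine(*responses):
    """Stack tidy rows from several responses into one list."""
    combined = []
    for rows in responses:
        combined.extend(rows)
    return combined


def total_by(rows, field):
    """Analysis-based check: sum values by one field, like a pivot table."""
    totals = {}
    for row in rows:
        totals[row[field]] = totals.get(row[field], 0) + row["value"]
    return totals


def to_csv(rows):
    """Write the combined rows out as CSV text in the agreed column order."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=TIDY_FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Made-up figures for two imaginary authorities, for illustration only.
response_a = "category,2022,2023\nburglary,10,12\nrobbery,3,4\n"
response_b = "category,2022,2023\nburglary,7,5\n"

tidy = combine(
    reshape_wide_response(response_a, "Force A", "force_a.csv"),
    reshape_wide_response(response_b, "Force B", "force_b.csv", "robbery not supplied"),
)

print(len(tidy))
print(total_by(tidy, "force"))
```

Comparing the output of `total_by` against any totals row in the original response is one way to catch rows lost or duplicated during extraction.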

Showing charts on video? Here are two essential techniques to make them effective

Using visualisation on TV and video is very different to using charts and maps online. In video, the audience has very little time to absorb the information contained in the chart — so you need to get them to that information as quickly as possible.

Every bad example of charts in videos forgets this. And every good example uses two essential techniques: keeping things simple, and adding motion.

“Many newsrooms are not optimised for what humans do best” (but we have an opportunity to change that)

Amazon worker with horse head — Image by Cory Doctorow, from Revenge of the Chickenized Reverse-Centaurs

Some essential reading by Agnes Stenbom Swedling explores how news organisations integrate AI into their workflows and the idea of the “human in the loop”. Many newsrooms, she points out, “are not optimised for what humans do best”, and so far the introduction of AI hasn’t involved a critical consideration of whether we want to embed those features in new systems, or rethink them:

“What is being built – incrementally, often unintentionally – is a form of machine-centric hybridisation. Workflows are optimised for what machines do well: speed, scale, pattern recognition, cost efficiency. Humans are then positioned around those systems, adapting their tasks, roles, and decision-making to fit the logics of machines.

“The consequence is a subtle but significant inversion: rather than engaging in uniquely human activities, work is reorganised to fit machine-driven processes. And once that inversion is embedded at the infrastructural level, it becomes increasingly difficult to reverse.”

Caught in a trap: what journalists can learn from systems thinking

One of the most powerful ways to generate original journalism is to look at the systems behind stories — particularly the points where those systems fail.

For investigative work, those points are central. Surface-level scandals often stem from deeper systemic problems. So what tools do we have for recognising those patterns?

Donella Meadows’s classic book Thinking in Systems offers one: “system traps” — patterns that explain how systems get stuck, break down, or behave in ways nobody intends. They are “traps” because attempts to escape them often backfire.

System trap: Journalism examples
Policy resistance: The war on drugs; reforms that fail; missed targets
Tragedy of the commons: Overuse leading to shortages; climate change impacts; AI
Drift to low performance: Normalisation of poor performance or low productivity
Escalation: Arms races; races to the bottom
Success to the successful: Increasing concentration of wealth or resources
Shifting the burden to the intervenor: Subsidies, price fixes and delaying the impact/cost of a policy
Rule beating: Tax avoidance, loopholes
Seeking the wrong goal: Schools focusing on targets over pupil welfare

In this post I’ll explain each trap, what it looks like in the wild, and how to use it as a lens for story ideas.

How to: generate hundreds of maps by combining QGIS with Python (code included!)

At this year’s Dataharvest I delivered a workshop on using Python in QGIS to automate the process of exporting maps for multiple locations. Here’s how to do it (you can find a GitHub repository with materials and links here).

Making a map for a story is cool — but what if you could make a map for every reader? Or if you’re working on a project involving teams in different regions or countries, what if you could give each one of those teams a map centred on their own patch?

Normally you would have to manually move the map to centre it on a key city, and then export an image. Then do it again and again and again for every area.

Luckily, QGIS has the ability to run code. And this is a great excuse to start using it.
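
As a rough idea of what that automation involves, the sketch below loops through a set of locations and works out a map extent and an output filename for each one. It is not the workshop code: the location names, coordinates, box size and the helper functions `extent_around` and `export_jobs` are all hypothetical, and it is written in plain Python so that it runs outside QGIS. The comment in the loop describes how each extent might then be applied in the QGIS Python console; that part assumes the canvas has finished rendering before the image is saved.

```python
# Hypothetical locations: name -> (x, y) centre, in the project's map units.
LOCATIONS = {
    "birmingham": (406000, 286000),
    "manchester": (384000, 398000),
}


def extent_around(x, y, width, height):
    """Return (xmin, ymin, xmax, ymax) for a box centred on x, y."""
    return (x - width / 2, y - height / 2, x + width / 2, y + height / 2)


def export_jobs(locations, width=20000, height=15000, folder="maps"):
    """Build one (extent, filename) job per location, in name order."""
    jobs = []
    for name, (x, y) in sorted(locations.items()):
        jobs.append((extent_around(x, y, width, height), f"{folder}/{name}.png"))
    return jobs


for extent, filename in export_jobs(LOCATIONS):
    # Inside the QGIS Python console, this is where each extent would be
    # applied to the map canvas and the image saved, for example with
    # iface.mapCanvas().setExtent(QgsRectangle(*extent)) followed by
    # iface.mapCanvas().saveAsImage(filename). Printing keeps this sketch
    # runnable outside QGIS.
    print(filename, extent)
```

Separating the list of jobs from the export step means the same loop can be reused for any number of areas by changing only the `LOCATIONS` dictionary.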

By organising the layers on the left you can put shapes such as flood defences over a base OpenStreetMap layer. You can also change the scale in the box underneath the map.

FAQ: AI, misinformation and journalism

In this latest post in the FAQ series, I am sharing some responses to a radio interview about AI’s impact on journalism.

Q: Is the continuous growth of AI-generated content online a danger for journalism?

It is certainly a problem, yes, in three ways: it makes reporting harder, it makes it harder to support journalism financially, and it makes it harder for audiences to trust your reporting.

Words as data: how data journalists tell stories about documents and text

Documents and other collections of text can be goldmines for data journalism — if you know how to approach them as data. Here are some techniques and inspiration for your next data project.

From stories about political speech and song lyrics, to street names and social media chatter, data journalists now have a wide range of examples of text-as-data to draw inspiration and guidance from, while tools such as Pinpoint and NotebookLM are making text analysis easier than ever.

I compiled a list of over 200 pieces of data journalism where text or documents were used as sources. Quantification techniques ranged from counting the frequency of a single word and using Google’s ngram viewer, to machine learning and topic modelling.
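
The simplest of those techniques, counting how often a word appears, can be sketched in a few lines of Python. This is an illustrative example only: the two "speeches" are made-up snippets, the `word_counts` function is hypothetical, and real projects would need more careful handling of punctuation, hyphenation and word variants.

```python
import re
from collections import Counter


def word_counts(text):
    """Count lower-cased words, ignoring punctuation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


# Made-up snippets standing in for two documents (e.g. speeches).
documents = {
    "speech_2023": "Growth, growth and more growth. Jobs will follow growth.",
    "speech_2024": "Security first. Jobs and security, then growth.",
}

counts = {name: word_counts(text) for name, text in documents.items()}

# Scale and change: how often is one word used, and how does that differ?
for name in documents:
    print(name, counts[name]["growth"])

# Ranking: the most common words across all documents combined.
overall = sum(counts.values(), Counter())
print(overall.most_common(3))
```

Once text has been reduced to counts like these, the same questions asked of any other dataset (how much, how has it changed, what ranks highest) can be asked of the documents.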

Looking at those articles it’s clear that, once quantified, journalists tell the same stories about text as any other piece of data: using the seven most common angles.

But how those angles are used — and how often — is where it gets interesting…

7 common angles for data stories: text and documents
Scale: how often words/phrases are used
Change: how language has changed
Ranking: the most/least common words/phrases
Variation: e.g. in relation to gender, ethnicity, ideology etc.
Exploration: journeys through multiple angles; interactives
Relationships: correlations, similarities and connections
Meta: ‘how we quantified text’
Leads: clusters, patterns or themes for further digging

PEER: a technique for brainstorming interviewees and story sources

One way to ensure you generate a wide range of potential sources for a story — or for potential story leads — is to use a checklist. The PEER framework is just that: four categories to help journalists generate more names on any given story — and think more creatively about whose voices might add something to that story.

4 icons: Power, expertise, experience, representative

PEER is a mnemonic (based on a previous post) for remembering the following four types of source:

💪 Power
🧠 Expertise
👁️‍🗨️ Experience
🗣️ Representative

Each type of source brings something different to the story: voices of power primarily (but not solely) answer questions about action: what was or is being done, what should or would be done about a particular issue. These are easily the most commonly quoted sources in news reporting.

People with expertise can answer the “why” and “how” questions — and are often more likely to speak to journalists — while those with experience can verify or validate (put a human face to) events. Representatives can speak to the wider impact or significance of an issue, or represent community sentiment about it.

Making each type of source explicit allows us to think about what those roles really mean — and identify less obvious ideas for sources with power, expertise, experience or representative qualities.

How to use FOI to develop good journalism habits

Freedom of Information (FOI) requests are not only one of the best ways to get original and exclusive stories that set your reporting apart — they’re also a good way to develop core journalism habits like curiosity, scepticism, and creativity. Here are some tips on how to get started with FOI while developing those qualities.

Being curious: how often is this happening? How much has it increased?

Headlines:
Rising numbers of hospital patients so fed up they discharge themselves
Figures reveal how many lives firefighters have saved
Welsh parents owe thousands in school dinner debts
All these stories involve asking the question “how much” or “how many” about an issue or event.

Headlines:
How the cost of paying up is sending bailiffs' diaries wild
Council use of bailiffs to chase debts jumps 16% in two years
Acid attack hospital admissions have almost doubled
Student Loans Company overcharges 78,000 graduates
Schools converting to academies cost councils £30m
All these stories involve asking the question “how much” or “how many” about an issue or event.

Curiosity is the first quality I identified in my series on the 7 habits of successful journalists — and FOI is a great way to hone that.

One good way to get started with FOI is to identify an event or problem that you’ve read about, and get curious about it: how many times is that event happening? How much is that problem costing? These are perfect questions for FOI.

Online Journalism Blog

Comment, analysis and links covering online journalism and online news, citizen journalism, blogging, vlogging, photoblogging, podcasts, vodcasts, interactive storytelling, publishing, Computer Assisted Reporting, User Generated Content, searching and all things internet.
