Why I’m no longer saying AI is “biased”

TL;DR: Saying “AI has biases” or “biased training data” is preferable to “AI is biased” because it reduces the risk of anthropomorphism and focuses on potential solutions, not problems.

Searches for "AI bias" peaked in 2025. In March 2025 twice as many searches were made for "AI bias" compared to 12 months before.

For the last two years I have been standing in front of classes and conferences saying the words “AI is biased” — but a couple months ago, I stopped.

As journalists, we are trained to be careful with language — and “AI is biased” is a sloppy piece of writing. It is a thoughtless cliché, often used without really considering what it means, or how it might mislead.

Because yes, AI is “biased” — but it’s not biased in the way most people might understand that word.

When we talk about people being biased, we mean that they have particular beliefs, experiences, and vested interests. The business owner is biased because they want to sell products; the campaigner is biased because they want to change things; a racist person is biased because of their beliefs. Put simply, they have motives.

To say that “AI is biased”, then, risks anthropomorphising the technology.

Worse than that, it risks letting us off the hook.

The biases of generative AI

A large language model like the one behind ChatGPT doesn’t have motives — it is a statistical model that predicts what humans would say, or what image they would create, in response to a particular prompt.

So when we say AI is biased, what we mean is that the predictions of AI reflect the biases of text and images in its training data.

Those biases reflect two interconnected factors: power (certain people are more likely to create online content); and selection bias (certain sources are more likely to be used for training than others).

These prompts from Hacks/Hackers can be used to show how AI’s training data leads to biased predictions

Similar biases exist in surveys and clinical research, and a number of techniques are used to compensate for them.

But we don’t say “the survey is biased”, because we don’t anthropomorphise surveys. Instead we might mention the survey’s margin of error or low participation rates from certain groups.

A more accurate way to talk about AI, then, might be to talk about AI’s plural biases, or its biased training.

Doing so also means we are not giving AI an agency that it doesn’t have.

AI biases don’t behave like human biases

Talking about biases or biased training allows us to address a potential blind spot of “AI is biased” — the fact that AI’s biases are not human-like (that is, they are not anthropomorphic).

Human biases are difficult to tackle — they are often either deeply held and defended, or unconscious and denied, or both.

But AI chatbots can adjust their biases on request. You can ask a ‘racist’ AI model to apply principles of diversity, for example, and it will not put up an argument (more about this below).

Generative AI also has some very non-human biases. A good example is temporal bias: all language models are trained on data up to a certain point — their knowledge cut-off dates — and are ignorant of events after that point.

Ask Claude about Pope Francis, for example, and it won’t mention that he is no longer alive, because its temporal bias means it doesn’t know anything about the Pope after October 2024.

tell me about pope francis

Pope Francis, born Jorge Mario Bergoglio on December 17, 1936, in Buenos Aires, Argentina, has been serving as the head of the Catholic Church and sovereign of the Vatican City State since March 13, 2013. He is the first Jesuit pope, the first from the Americas, the first from the Southern Hemisphere, and the first pope from outside Europe since Gregory III in the 8th century.

Before becoming pope, Bergoglio served as Archbishop of Buenos Aires from 1998 to 2013 and was made a cardinal in 2001 by Pope John Paul II. He chose the papal name Francis in honor of Saint Francis of Assisi.
Temporal bias in training means that Claude doesn’t know about, and so doesn’t mention, Pope Francis’s recent death

This means that ChatGPT and Claude are unable to generate news stories — because by definition large language models will not have been trained on any information that has not been published online.

Or will they?

The training isn’t just in the training data

Here’s an analogy: supermarket trolleys all veer one way or another. So when you use a trolley, you identify its bias and push a little harder in the opposite direction to stop it from veering off course. You also identify how quickly or slowly it moves, and push more or less in response. If you’re especially practical, you might pull out material that has become stuck in a wheel, or even tighten loose bolts. The trolley’s behaviour changes as you add more and more items, and you change how you push in response.

Generative AI is that supermarket trolley: it can help you to do a job faster and at an enhanced scale, but it comes with pre-existing biases that we need to correct for.

Put another way, the important thing is not whether an algorithm is ‘biased’ or not. All methods are imperfect, including relying solely on humans (who cannot walk in a straight line without correction either) — what is important is what steps we are taking to reduce the impact of those inevitable biases.

In fact, biases when using AI come from three key forces:

  • The training data, yes, but also:
  • The algorithm design itself, such as how different inputs are weighted and any ‘guardrails’, and
  • The inputs of the user
Biases are largely the result of three factors: training material, algorithm design, and user behaviour and inputs

Ask Google Gemini to write a racist story, for example, and it will refuse, saying that it “goes against my core principles”. That’s not a bias in the training data — it’s a corrective bias in the design of the algorithm.

The existence of such biases reminds us that bias is not inherently bad. A bias towards fairness or a bias towards factual accuracy are positive biases that most journalists apply to their work, in an attempt to correct less helpful human biases such as confirmation bias and a tendency to believe people in authority.

Pushing the shopping trolley in the right direction

Similarly, your inputs into a conversation with generative AI are a vital force in correcting the biases that exist in that mass of training data. An example I use with my students is to type in the prompt “Give me five story ideas”. Without any correction, ChatGPT will predict that you mean fictional stories, reflecting one of the many biases in its training data.

But if you use role prompting and instead ask: “You are a journalist. Give me five story ideas” the response will change accordingly, effectively weighting any training data that relates to journalistic material. Recursive prompting that provides feedback on what is relevant or irrelevant to your needs has a similar result.
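To make that comparison concrete, here is a minimal sketch in Python. It assumes the OpenAI Python client and a chat model such as gpt-4o-mini — both choices of mine for illustration, not anything prescribed above — and sends the same request with and without a role-setting system message, so you can compare how the responses shift.

# A minimal sketch of role prompting, assuming the OpenAI Python client
# (pip install openai) and an OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

def five_story_ideas(system_role=None):
    """Ask for five story ideas, optionally with a role-setting system message."""
    messages = []
    if system_role:
        # The system message weights the response towards that role's material
        messages.append({"role": "system", "content": system_role})
    messages.append({"role": "user", "content": "Give me five story ideas"})

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat model would do
        messages=messages,
    )
    return response.choices[0].message.content

# Without correction: likely fictional stories, reflecting biases in the training data
print(five_story_ideas())

# With role prompting: the response shifts towards journalistic story ideas
print(five_story_ideas("You are a journalist."))

The same comparison works in any chatbot interface, of course — the code simply makes the two conditions easy to run side by side.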

Another common technique to counterweight bias is to upload a document or dataset. This doesn’t retrain the model, but it supplements the training data with new context at the point of response (the technique is called Retrieval Augmented Generation), and it significantly weights the response towards that new material. Any responses are now likely to relate mainly or entirely to that document. Providing documents, or allowing the model to search the wider web, are also common ways to address the temporal bias in large language models.
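The sketch below (again assuming the OpenAI client; the file name is hypothetical) illustrates the document-grounding idea. It skips the retrieval and chunking steps of a full RAG pipeline and simply places the document in the prompt context, which is enough to weight the response heavily towards that material.

# A minimal sketch of grounding a response in a supplied document.
# Not a full RAG pipeline: there is no retrieval or chunking step —
# the whole document is placed directly in the prompt context.
from openai import OpenAI

client = OpenAI()

def ideas_from_document(path):
    with open(path, encoding="utf-8") as f:
        document = f.read()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat model would do
        messages=[
            {"role": "system", "content": "You are a journalist."},
            {
                "role": "user",
                "content": "Using only the report below, give me five story ideas.\n\n"
                           "REPORT:\n" + document,
            },
        ],
    )
    return response.choices[0].message.content

# 'council_report.txt' is a hypothetical example file
print(ideas_from_document("council_report.txt"))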

The relationship works both ways: weighting responses (with guidelines, for example) means AI can be used as a corrective force to check our own biases in ideas, sources or drafts.

Focus on the solutions, not the problem

Describing AI as having biases, or biased training, allows us to move past the problem to the strategies that we need to address it.

It allows us to see AI as a tool instead of a character, and take responsibility for how it is used.

And it allows us to see the agency that we have in shaping those biases, instead of passing on that agency to AI.

But most importantly, saying that AI “has biases” or “has biased training” shifts the focus grammatically: away from the subject of AI towards the object of biases or the training itself. And that’s where our focus should be.


About Paul Bradshaw

Paul teaches data journalism at Birmingham City University and is the author of a number of books and book chapters about online journalism and the internet, including the Online Journalism Handbook, Mobile-First Journalism, Finding Stories in Spreadsheets, Data Journalism Heist and Scraping for Journalists. From 2010-2015 he was a Visiting Professor in Online Journalism at City University London and from 2009-2014 he ran Help Me Investigate, an award-winning platform for collaborative investigative journalism. Since 2015 he has worked with the BBC England and BBC Shared Data Units based in Birmingham, UK. He also advises and delivers training to a number of media organisations.

2 thoughts on “Why I’m no longer saying AI is ‘biased’”

  1. graham344

    An absolutely brilliant and spot on post Paul. Thank you!

    Graham Lovelace, Charting Gen AI

  2. Gavin Allen

    On a similar note, I have a colleague who suggests that AI should stand for ‘Artificial Information’ rather than ‘Intelligence’, as the models are not intelligent, and so the term perpetuates the anthropomorphism.


