TL;DR: Saying “AI has biases” or “biased training data” is preferable to “AI is biased” because it reduces the risk of anthropomorphism and shifts the focus to potential solutions, not problems.

For the last two years I have been standing in front of classes and conferences saying the words “AI is biased” — but a couple of months ago, I stopped.
As journalists, we are trained to be careful with language — and “AI is biased” is a sloppy piece of writing. It is a cliché, often used without really considering what it means, or how it might mislead.
Because yes, AI is “biased” — but it’s not biased in the way most people might understand that word.
When we talk about people being biased, we mean that they have particular beliefs, experiences, and vested interests. The business owner is biased because they want to sell products; the campaigner is biased because they want to change things; a racist person is biased because of their beliefs. Put simply, they have motives.
To say that “AI is biased”, then, risks anthropomorphising the technology.
Worse than that, it risks letting us off the hook.
The biases of generative AI
A large language model like the one behind ChatGPT doesn’t have motives: it is a statistical model that predicts what humans would say (or, in the case of image generators, what image they would create) in response to a particular prompt.
So when we say AI is biased, what we mean is that the predictions of AI reflect the biases of text and images in its training data.
Those biases reflect two interconnected factors: power (certain people are more likely to create online content); and selection bias (certain sources are more likely to be used for training than others).

Similar biases exist in surveys and clinical research, and a number of techniques are used to compensate for them.
But we don’t say “the survey is biased”, because we don’t anthropomorphise surveys. Instead we might mention the survey’s margin of error or low participation rates from certain groups.
A more accurate way to talk about AI, then, might be to talk about AI’s plural biases, or its biased training.
Doing so also means we are not giving AI an agency that it doesn’t have.
AI biases don’t behave like human biases
Talking about biases or biased training allows us to address a potential blind spot of “AI is biased” — the fact that AI’s biases are not human-like (aka anthropomorphic).
Human biases are difficult to tackle — they are often either deeply held and defended, or unconscious and denied, or both.
But AI chatbots can adjust their biases on request. You can ask a ‘racist’ AI model to apply principles of diversity, for example, and it will not put up an argument (more about this below).
Generative AI also has some very non-human biases. A good example is temporal bias: all language models are trained on data up to a certain point — their knowledge cut-off dates — and are ignorant of events after that point.
Ask Claude about Pope Francis, for example, and it won’t mention that he is no longer alive, because its temporal bias means it doesn’t know anything about the Pope after October 2024.

This means that ChatGPT and Claude are unable to generate news stories — because by definition large language models will not have been trained on any information that has not been published online.
Or will they?
The training isn’t just in the training data
Here’s an analogy: supermarket trolleys all veer one way or another. So when you use a trolley, you identify its bias and push a little harder in the opposite direction to keep it on course. You also identify how quickly or slowly it moves, and push more or less in response. If you’re especially practical, you might pull out material that has become stuck in a wheel, or even tighten loose bolts. The trolley’s behaviour changes as you add more and more items to it, and you change how you behave in response.
Generative AI is that supermarket trolley: it can help you to do a job faster and at an enhanced scale, but it comes with pre-existing biases that we need to correct for.
Put another way, the important thing is not whether an algorithm is ‘biased’ or not. All methods are imperfect, including relying solely on humans (who cannot walk in a straight line without correction either) — what is important is what steps we are taking to reduce the impact of those inevitable biases.
In fact, biases when using AI come from three key forces:
- The training data, yes, but also:
- The algorithm design itself, such as how different inputs are weighted and any ‘guardrails’, and
- The inputs of the user

Ask Google Gemini to write a racist story, for example, and it will refuse, saying that it “goes against my core principles”. That’s not a bias in the training data — it’s a corrective bias in the design of the algorithm.
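To make that concrete, a guardrail of this kind is often implemented as an explicit check in the application layer rather than in the training data. Below is a minimal sketch in Python using the OpenAI SDK and its moderation endpoint; it illustrates the general idea only (not how Gemini actually works), and the model name and refusal wording are my own assumptions.

```python
# Minimal sketch of a guardrail built into the application rather than
# learned from training data: screen the user's request before generating.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set;
# the model name and refusal wording are illustrative.
from openai import OpenAI

client = OpenAI()

def guarded_generate(prompt: str) -> str:
    # A deliberate, corrective bias: refuse requests flagged by a moderation check.
    moderation = client.moderations.create(input=prompt)
    if moderation.results[0].flagged:
        return "I can't help with that request."

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```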
The existence of such biases reminds us that bias is not inherently bad. A bias towards fairness, or towards factual accuracy, is a positive bias that most journalists apply to their work to help correct less helpful human tendencies, such as deference to authority and confirmation bias.
Pushing the shopping trolley in the right direction
Similarly, your inputs into a conversation with generative AI are a vital force in correcting the biases that exist in that mass of training data. An example I use with my students is to type in the prompt “Give me five story ideas”. Without any correction, ChatGPT will predict that you mean fictional stories, reflecting one of the many biases in its training data.
But if you use role prompting and instead ask: “You are a journalist. Give me five story ideas”, the response will change accordingly, effectively weighting any training data that relates to journalistic material. Recursive prompting that provides feedback on what is relevant or irrelevant to your needs has a similar result.
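For readers who use these models through code rather than a chat interface, here is a minimal sketch of role prompting using the OpenAI Python SDK. It assumes the openai package is installed and an API key is configured, and the model name is illustrative.

```python
# Minimal sketch of role prompting with the OpenAI Python SDK.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set;
# the model name "gpt-4o-mini" is illustrative.
from openai import OpenAI

client = OpenAI()

def story_ideas(role: str | None = None) -> str:
    """Ask for five story ideas, optionally steering the model with a system-level role."""
    messages = []
    if role:
        # The system message weights the response towards material linked to that role.
        messages.append({"role": "system", "content": role})
    messages.append({"role": "user", "content": "Give me five story ideas"})
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response.choices[0].message.content

print(story_ideas())                         # likely fictional story ideas
print(story_ideas("You are a journalist."))  # likely journalistic story ideas
```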
Another common technique to counterweight bias is to upload a document or dataset. This does not change the model’s training data, but it gives the model new material to draw on at the point of generation (the technique is called Retrieval Augmented Generation, or RAG), and it significantly weights the response towards that material. Any responses are now likely to relate mainly or entirely to that document. Providing documents or allowing the model to link to the wider web are also common ways to address the temporal bias in large language models.
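Here is a minimal sketch of the same idea in code: retrieve the chunks of an uploaded document that are most relevant to a question, and weight the response towards them. It assumes the openai and numpy packages, and the model names are illustrative; production RAG systems typically use a vector database, but the principle is the same.

```python
# Minimal sketch of retrieval augmented generation (RAG): retrieve the most
# relevant chunks of a supplied document and prepend them to the prompt.
# Assumes the `openai` and `numpy` packages; model names are illustrative.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

def answer_from_document(document: str, question: str, top_k: int = 3) -> str:
    # Split the document into rough paragraph-sized chunks.
    chunks = [c.strip() for c in document.split("\n\n") if c.strip()]
    chunk_vectors = embed(chunks)
    question_vector = embed([question])[0]

    # Rank chunks by cosine similarity to the question.
    scores = chunk_vectors @ question_vector / (
        np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(question_vector)
    )
    context = "\n\n".join(chunks[i] for i in np.argsort(scores)[::-1][:top_k])

    # The retrieved context heavily weights the model's response towards
    # the supplied material rather than its general training data.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```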
The relationship works both ways: weighting responses (with guidelines, for example) means AI can be used as a corrective force to check our own biases in ideas, sources or drafts.
Focus on the solutions, not the problem
Describing AI as having biases, or biased training, allows us to move past the problem to the strategies that we need to address it.
It allows us to see AI as a tool instead of a character, and take responsibility for how it is used.
And it allows us to see the agency that we have in shaping those biases, instead of passing on that agency to AI.
But most importantly, saying that AI “has biases” or “has biased training” shifts the focus grammatically: away from the subject of AI towards the object of biases or the training itself. And that’s where our focus should be.
