There’s a story out this week on the BBC website about dialogue and gender in Game of Thrones. It uses data generated by artificial intelligence (AI) — specifically, machine learning — and it’s a good example of some of the challenges that journalists are increasingly going to face as they deal with more and more algorithmically generated data.
Information and decisions generated by AI are qualitatively different from the sort of data you might find in an official report, but journalists risk falling back on old habits and treating that data as inherently factual.
Here, then, are some of the ways the article dealt with that — and what else we can do as journalists to adapt.
Margins of error: journalism doesn’t like vagueness
The story draws on data from an external organisation, Ceretai, which “uses machine learning to analyse diversity in popular culture.” The organisation claims to have created an algorithm which “has learned to identify the difference between male and female voices in video and provides the speaking time lengths in seconds and percentages per gender.”
Crucially, the piece notes that:
“Like most automatic systems, it doesn’t make the right decision every time. The accuracy of this algorithm is about 85%, so figures could be slightly higher or lower than reported.”
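To see why an 85% accuracy figure matters, it helps to work through what misclassification does to a reported percentage. The sketch below is purely illustrative — it is not Ceretai’s method, and the 25% figure is a made-up input — but it applies a standard statistical correction (the Rogan–Gladen estimator) that shows how far an observed proportion can drift from the true one when a classifier is wrong 15% of the time:

```python
# Hypothetical illustration only -- not Ceretai's actual correction method.
def corrected_proportion(observed, accuracy):
    """Estimate the true proportion of a class given the proportion the
    classifier reported, assuming errors are symmetric across both classes
    (the Rogan-Gladen estimator).

    observed = true * accuracy + (1 - true) * (1 - accuracy)
    => true = (observed + accuracy - 1) / (2 * accuracy - 1)
    """
    return (observed + accuracy - 1) / (2 * accuracy - 1)

# A perfectly balanced result is unaffected by symmetric errors:
print(corrected_proportion(0.50, 0.85))  # stays at 0.5

# But a hypothetical reported figure of 25% female speaking time,
# measured at 85% accuracy, implies a rather different true figure:
print(round(corrected_proportion(0.25, 0.85), 3))
```

The point for journalists is that “figures could be slightly higher or lower” can understate the problem: for proportions far from 50%, classifier error does not cancel out, and the gap between reported and true figures can be substantial.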
This week’s GEN Summit marked a breakthrough moment for artificial intelligence (AI) in the media industry. The topic dominated the agenda of the first two days of the conference, from the opening keynote by Facebook’s Antoine Bordes to voice AI, bots, monetisation and verification – and it dominated my timeline too.
At times it felt like being at a conference in the 1980s discussing how ‘computers’ could be used in the newsroom, or listening to people talking about the use of mobile phones for journalism in the noughties — in other words, it felt very much like early days. But important days nonetheless.
Ludovic Blecher’s slide on the AI-related projects that received Google Digital News Initiative funding illustrated the problem best, with proposals counted in categories as specific as ‘personalisation’ and as vague as ‘hyperlocal’.
Because he sends me an email every December, Nic Newman has a tag all of his own on this blog. So as this year’s email lands in my inbox, here’s my annual reply on what I’ve noticed in the last 12 months — along with some inevitably doomed predictions of what might happen in the next year…
Surprising in 2017: horizontal storytelling and Facebook disappointments
This week I’m rounding off the first semester of classes on the new MA in Data Journalism with a session on artificial intelligence (AI) and machine learning. Machine learning is a subset of AI — and an area which holds enormous potential for journalism, both as a tool and as a subject for journalistic scrutiny.
So I thought I would share part of the class here, showing some examples of how the three types of machine learning — supervised, unsupervised, and reinforcement — have already been used for journalistic purposes, and using those examples to explain each type along the way.
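The first of those three types, supervised learning, can be boiled down to a toy example: an algorithm learns from examples that a human has already labelled, then labels new data by comparing it to what it has seen. The sketch below is a deliberately minimal 1-nearest-neighbour classifier — invented data, not from any of the journalistic projects discussed in the class:

```python
# A toy illustration of supervised learning: 1-nearest-neighbour
# classification from hand-labelled training examples.
def predict(train, point):
    """Return the label of the labelled training example whose features
    are closest (by squared Euclidean distance) to `point`."""
    nearest = min(
        train,
        key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], point)),
    )
    return nearest[1]

# Hypothetical labelled data: (features, human-supplied label)
train = [
    ((1.0, 1.0), "short"), ((1.2, 0.9), "short"),
    ((8.0, 7.5), "long"), ((7.8, 8.1), "long"),
]

print(predict(train, (1.1, 1.0)))  # falls near the "short" cluster
print(predict(train, (8.2, 7.9)))  # falls near the "long" cluster
```

Unsupervised learning drops the human labels and lets the algorithm find clusters on its own; reinforcement learning instead learns from trial, error, and a reward signal.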
McLaren is talking about malicious threats, and the way that machine learning can be used to identify suspicious patterns of behaviour. But the example given above is equally useful in illustrating the way that similar techniques might be used to identify an employee intending to blow the whistle on illegal, unethical or dangerous behaviour by their organisation.
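“Suspicious patterns of behaviour” in this context usually means anomaly detection: flagging activity that deviates sharply from an established baseline. The sketch below is a minimal, stdlib-only illustration of that idea — not McLaren’s actual system, and the login counts are invented — using a simple z-score threshold:

```python
# A minimal anomaly-detection sketch (illustrative only): flag a day's
# activity if it sits far above the historical mean in standard deviations.
from statistics import mean, stdev

def suspicious(history, today, threshold=3.0):
    """Return True if today's count is more than `threshold` standard
    deviations above the mean of the historical counts."""
    mu, sigma = mean(history), stdev(history)
    return (today - mu) / sigma > threshold

# Hypothetical daily file-access counts for one employee:
accesses = [12, 15, 11, 14, 13, 12, 16, 14]

print(suspicious(accesses, 13))  # an ordinary day
print(suspicious(accesses, 60))  # a sudden spike in activity
```

The journalistic point stands either way: exactly the same statistical machinery that flags an intruder would flag an employee quietly gathering evidence of wrongdoing, which is why such systems deserve scrutiny as well as admiration.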