There’s a story out this week on the BBC website about dialogue and gender in Game of Thrones. It uses data generated by artificial intelligence (AI), specifically machine learning, and it’s a good example of some of the challenges that journalists will increasingly face as they deal with more and more algorithmically generated data.
Information and decisions generated by AI are qualitatively different from the sort of data you might find in an official report, but journalists may fall back on treating data as inherently factual.
Here, then, are some of the ways the article dealt with that — and what else we can do as journalists to adapt.
Margins of error: journalism doesn’t like vagueness
The story draws on data from an external organisation, Ceretai, which “uses machine learning to analyse diversity in popular culture.” The organisation claims to have created an algorithm which “has learned to identify the difference between male and female voices in video and provides the speaking time lengths in seconds and percentages per gender.”
Crucially, the piece notes that:
“Like most automatic systems, it doesn’t make the right decision every time. The accuracy of this algorithm is about 85%, so figures could be slightly higher or lower than reported.”
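To make that caveat concrete, here is a rough sketch of how an 85% accuracy figure could shift a reported share of speaking time. It rests on assumptions the article does not state: that the 85% applies to each second of speech, and that the algorithm is equally likely to mislabel a male voice as female as the reverse. The function names and the sample percentages are made up for illustration and are not Ceretai’s method or figures.

```python
def observed_share(true_share, accuracy=0.85):
    """Share of speaking time labelled female, given the true female share.

    Assumes (hypothetically) the same error rate for male and female voices.
    """
    # Female seconds labelled correctly, plus male seconds mislabelled as female.
    return true_share * accuracy + (1 - true_share) * (1 - accuracy)


def implied_true_share(reported_share, accuracy=0.85):
    """Invert observed_share to estimate the true share behind a reported one."""
    if accuracy == 0.5:
        raise ValueError("a 50% accurate classifier carries no information")
    estimate = (reported_share - (1 - accuracy)) / (2 * accuracy - 1)
    # A share cannot fall outside 0-100%, so clamp the estimate.
    return min(1.0, max(0.0, estimate))


if __name__ == "__main__":
    # Illustrative reported shares only, not figures from the BBC story.
    for reported in (0.20, 0.30, 0.40, 0.50):
        true = implied_true_share(reported)
        print(f"reported {reported:.0%} -> implied true share {true:.0%}")
```

Under these assumptions, symmetric errors pull every reported share towards 50%: a reported 30% would correspond to a true share of about 21%. Whether the real errors are symmetric is exactly the kind of question worth putting to the organisation that supplied the data.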
I’ll be holding a special ‘Taster Day’ on June 11 for anyone interested in studying journalism at postgraduate level, specifically data journalism (which includes a part-time PGCert option for those already working in the industry) and multiplatform journalism (full-time only).
In the afternoon (2pm-4.30pm), I’ll be hosting a taster session of the MA in Multiplatform and Mobile Journalism. This will cover reporting news for multiplatform audiences, and how to use mobile journalism to report stories.
I’ll be making time in both sessions for questions and discussion about postgraduate study and developments in journalism.
Last week saw the third Data Journalism UK conference, an opportunity for the country’s data journalists to gather, take stock of the state of the industry and look at what’s ahead.
The BBC Shared Data Unit’s Pete Sherlock kicked off the event, looking back at the first 18 months of the unit’s existence. In that period the unit has trained 15 secondees and helped generate over 600 stories across more than 250 titles in the regional press.
Both stories resulted in strong pushback – from the Ministry of Justice and the electric car industry respectively – but their new data journalism skills gave them the confidence to persist with the story.