Tag Archives: data journalism

If we are using AI in journalism we need better guidelines on reporting uncertainty

Chart: women speak 27% of the time in Game of Thrones

The BBC’s chart mentions a margin of error

There’s a story out this week on the BBC website about dialogue and gender in Game of Thrones. It uses data generated by artificial intelligence (AI), specifically machine learning, and it’s a good example of some of the challenges that journalists will increasingly face as they deal with more and more algorithmically generated data.

Information and decisions generated by AI are qualitatively different from the sort of data you might find in an official report, but journalists may fall back on treating data as inherently factual.

Here, then, are some of the ways the article dealt with that — and what else we can do as journalists to adapt.

Margins of error: journalism doesn’t like vagueness

The story draws on data from an external organisation, Ceretai, which “uses machine learning to analyse diversity in popular culture.” The organisation claims to have created an algorithm which “has learned to identify the difference between male and female voices in video and provides the speaking time lengths in seconds and percentages per gender.”

Crucially, the piece notes that:

“Like most automatic systems, it doesn’t make the right decision every time. The accuracy of this algorithm is about 85%, so figures could be slightly higher or lower than reported.”

And this is the first problem.
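
One way to see why an accuracy figure is not the same thing as a margin of error is to work through the arithmetic. The sketch below is illustrative only and is not Ceretai's or the BBC's method. It assumes, hypothetically, that "85% accuracy" means each unit of speech is labelled correctly 85% of the time, equally for male and female voices, and asks what true share of female speech would then produce an observed 27%. The function name `implied_true_share` is made up for this example.

```python
def implied_true_share(observed_share: float, accuracy: float) -> float:
    """Estimate the true share of one class from the observed share.

    Assumption (not stated in the source): the classifier is equally
    accurate for both classes, so its error rate is 1 - accuracy for each.
    """
    if not 0.5 < accuracy <= 1.0:
        raise ValueError("accuracy must be above 0.5 and at most 1.0")
    error = 1.0 - accuracy
    # observed = true * accuracy + (1 - true) * error, solved for true
    estimate = (observed_share - error) / (accuracy - error)
    # Keep the estimate inside the valid range for a proportion
    return min(1.0, max(0.0, estimate))


observed = 0.27  # the share of speaking time reported in the chart
for acc in (1.0, 0.95, 0.85):
    share = implied_true_share(observed, acc)
    print(f"accuracy {acc:.0%}: implied true share {share:.1%}")
```

Under those assumptions, an observed 27% at 85% accuracy would imply a true share of roughly 17%. Different assumptions (for example, an algorithm that recognises one gender better than the other) would give different answers, which is why it matters what the accuracy figure actually measures.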

From making data physical to giving journalists confidence (and a few other things too): Data Journalism UK 2019

Marie Segger at Data Journalism UK 2019

Last week saw the third Data Journalism UK conference, an opportunity for the country’s data journalists to gather, take stock of the state of the industry and look at what’s ahead.

The BBC Shared Data Unit’s Pete Sherlock kicked off the event, looking back at the first 18 months of the unit’s existence. In that period the unit has trained 15 secondees and helped generate over 600 stories across more than 250 titles in the regional press.

Sherlock highlighted two stories in particular to demonstrate how the data unit had helped equip regional reporters to hold power to account: a story on legal aid deserts by the Eastern Daily Press’s Dominic Gilbert, and a report on electric car charging points by JPI Media’s Aimee Stanton.

Both stories met strong pushback (from the Ministry of Justice and the electric car industry respectively), but the reporters’ new data journalism skills gave them the confidence to persist with their stories.

FAQ: Do you think that an increase in algorithms is leading to a decline in human judgement?

This algorithm has been quality tested. Image by Phillip Stewart

The latest in my series of FAQ posts follows on from the last one. An MA student at City University asked: “Do you think that an increase in algorithmic input is leading to a decline in human judgement?” Here’s my response.

Does an increase in computation lead to a decline in human input?

Firstly, it’s important to emphasise that the vast majority of data journalism involves no algorithms or automation at all: it’s journalists making calculations, which historically they would have done manually.

You mention the possibility that “an increase in computation leads to a decline in human input”. An analogy would be to ask whether an increase in pencils leads to a decline in human input in art.

Teaching data journalism — fast and slow

Lecture theatre image by judy dean

I’ve now been teaching data journalism for over a decade, from one-off guest classes at universities with no internal data journalism expertise to entire courses dedicated to the field. In the first of two extracts from a commentary I was asked to write for Asia Pacific Media Educator, I reflect on the lessons I’ve learned, and the differences between what I describe (after Daniel Kahneman) as “teaching data journalism fast” and “teaching data journalism slow”. First up, ‘teaching data journalism fast’: techniques for one-off data journalism classes aimed at general journalism students.

Like a gas, data journalism teaching will expand to fill whatever space is allocated to it. Educators can choose to focus on data journalism as a set of practices, a form of journalistic output, a collection of infrastructure or inputs, or a culture (see also Karlsen and Stavelin 2014; Lewis and Usher 2014; Boyles and Meyer 2016). Or, they might choose to spend all their time arguing over what we mean by ‘data journalism’ in the first place.

We can choose to look to the past of Computer Assisted Reporting and Precision Journalism, emerging developments around computational and augmented journalism, and everything that has happened in between.

In this commentary, I outline the different pedagogical approaches I have adopted in teaching data journalism within different contexts over the last decade. In each case, there was more than enough data journalism to fill the space; the question was how to decide which bits to leave out, and how to engage students in the process.

2018 has been a good year for UK local data journalism — here’s the story so far

Local data journalism in the UK has been undergoing a quiet revolution in the last 12 months, but 2018 in particular has already seen a number of landmarks in its first few months. Here are some of the highlights from just its first 12 and a half weeks…

January: BBC Shared Data Unit publishes its first secondee-led investigation

The BBC Shared Data Unit had already been producing stories before it took on its first three-month secondees from the news industry in late 2017. Over the next 12 weeks they received training in data journalism and worked on a joint investigation.

On International Women’s Day here are 7 data journalism projects about women’s issues

Photo: Pixabay

Women represent 49.5% of the world’s population, but they do not have a corresponding public, political and social influence. In recent years, more and more women have raised their voices, making society aware of their challenges — data journalists included. To commemorate International Women’s Day, Carla Pedret presents a list of data journalism projects that detail the sacrifices, injustices and prejudices that women still have to face in the 21st century.

Text-as-data journalism? Highlights from a decade of SOTU speech coverage

January 2012: The National Post’s graphics team analyzes keywords used in State of the Union addresses by presidents Bush and Obama / Image: © Richard Johnson/The National Post

In a guest post for OJB, Barbara Maseda looks at how the media has used text-as-data to cover State of the Union addresses over the last decade.

State of the Union (SOTU) addresses are amply covered by the media, from traditional news reports and full transcripts to summaries and highlights. But like other events involving speeches, SOTU addresses can also be analyzed using natural language processing (NLP) techniques to identify and extract newsworthy patterns.

Every year, a new speech is added to this small collection of texts, which some newsrooms process to add a fresh angle to the avalanche of coverage.
