FAQ: Data journalism and computer science


Where I started, with BASIC code. Image by Terry Freedman

I have a habit of posting replies to questions on OJB: this one is in response to a series of questions from a student at the University of the West of England about data journalism.

How do you feel about the intertwining of computer science with journalism?

Not surprisingly, I’m quite positive about it. I think most industries benefit from being exposed to different practices and ideas, as they make you reevaluate your own habits and assumptions.

That has very much been the case with the influence of computer science on journalism: in many ways data journalism is more open and more collaborative than other parts of journalism, and that has led to some of its best work.

For example, when organisations like Quartz, Vox or NY Public Radio open-source their code, it becomes easier for other news organisations to build on that code and improve it.

Likewise, when teams like FiveThirtyEight or BuzzFeed publish their code, it means others can check how they arrived at their conclusions.

Notably, a similar thing happened in the 1960s and 70s in the US when journalism started to look to social science methods, and identify techniques it could borrow to improve the quality of reporting. This became known as Precision Journalism and was a major influence on Computer Assisted Reporting (CAR).

Put into your own words whether you think data-driven journalism does or does not contribute to journalistic practice

If you mean what people also call ‘data journalism’ — journalism which uses structured information in some way — then I think it clearly has the potential to contribute in all sorts of ways. The most obvious are:

  • A stronger factual basis to our reporting;
  • Different ways of finding stories;
  • Different ways of telling stories;
  • A more transparent reporting process (as outlined in the last question).

On the first: it is relatively easy in journalism to merely report what other people have said without actually taking any action to identify whether they are telling the truth.

Politics and sport are both good examples of this: the leader of one party says that they are spending more money on the health service than ever, but the opposition says that the health service is on the brink of collapse.

Or one pundit says that this midfielder is lacking fitness; but another says they should be picked for England.

Data allows us to do more than just report what people say: it allows us to provide the context around that. It might be a simple debunking (“no, the data says that we’re not spending more than ever”), or it might be unpicking a misleading statement (“we are spending more, but this is misleading because we are spending less per person, and inflation needs to be taken into account”).
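To make the second kind of check concrete, here is a minimal Python sketch of how a cash increase can turn into a real-terms, per-person fall. Every figure in it is invented purely for illustration, and the function name and price index values are assumptions rather than anything from a real dataset.

```python
def real_terms_per_person(nominal_spend, population, price_index, base_index=100.0):
    """Convert a cash figure into inflation-adjusted spending per person.

    price_index is the price level in the year of the spending, expressed
    relative to base_index (so 112.0 means prices are 12% above the base year).
    """
    real_spend = nominal_spend * base_index / price_index
    return real_spend / population


# Hypothetical figures, invented for illustration only
earlier = {"spend": 100_000_000_000, "population": 60_000_000, "price_index": 100.0}
later = {"spend": 110_000_000_000, "population": 64_000_000, "price_index": 112.0}

# The headline claim: cash spending has gone up
cash_change = later["spend"] / earlier["spend"] - 1

# The context: spending per person once inflation is taken into account
pp_earlier = real_terms_per_person(
    earlier["spend"], earlier["population"], earlier["price_index"]
)
pp_later = real_terms_per_person(
    later["spend"], later["population"], later["price_index"]
)
real_change = pp_later / pp_earlier - 1

print(f"Cash change: {cash_change:+.1%}")
print(f"Real-terms change per person: {real_change:+.1%}")
```

With these made-up numbers, spending rises in cash terms but falls per person in real terms, which is exactly the situation where "more than ever" is true and still misleading.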

This context is particularly important when politicians and other public figures can communicate directly with audiences: in that context why should we do what they can do themselves anyway, especially if we’re not going to be the first to do it (because they will always be first)? We need to add something more.

As to different ways of finding stories: there are many stories I’ve worked on which 20 years ago would have simply taken too much time to do.

Automation using online tools and programming means we can say ‘Read these 300 PDF reports and look for these figures/that phrase’ or ‘Go through these 10,000 webpages and grab a number from each’ or ‘Tell me which outliers I should be focusing on in this bunch of people/organisations’.
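As a rough sketch of what two of those instructions can look like in practice, the Python below searches a set of documents for a phrase and pulls out the numbers mentioned alongside it, then flags outliers in a set of figures. It assumes the text has already been extracted from the PDFs or webpages; the file names, the sample text, the spending figures and both function names are hypothetical. The outlier rule (a modified z-score based on the median, with a threshold of 3.5) is one common rule of thumb, not the only reasonable choice.

```python
import re
from statistics import median


def find_phrase_figures(documents, phrase):
    """Return {document name: [numbers in sentences that mention the phrase]}."""
    number = re.compile(r"\d[\d,]*(?:\.\d+)?")
    hits = {}
    for name, text in documents.items():
        figures = []
        # Naive sentence split: ., ! or ? followed by whitespace
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            if phrase.lower() in sentence.lower():
                figures += [
                    float(match.replace(",", ""))
                    for match in number.findall(sentence)
                ]
        if figures:
            hits[name] = figures
    return hits


def flag_outliers(values, threshold=3.5):
    """Return the names whose value sits unusually far from the median."""
    med = median(values.values())
    # Median absolute deviation: the typical distance from the median
    mad = median(abs(v - med) for v in values.values())
    if mad == 0:
        return []  # no spread to measure against
    return [
        name
        for name, v in values.items()
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical report text, standing in for text extracted from PDFs
documents = {
    "report_a.txt": "Agency staff cost the trust 1,200 pounds. Beds were full all winter.",
    "report_b.txt": "The trust spent 1,350.50 pounds on agency staff. It employs 900 nurses.",
    "report_c.txt": "No temporary workers were used. The trust employs 700 nurses.",
}
hits = find_phrase_figures(documents, "agency staff")

# Hypothetical spending figures for five organisations
spend = {"A": 100, "B": 110, "C": 95, "D": 105, "E": 400}
outliers = flag_outliers(spend)

print(hits)
print(outliers)
```

Anything a script like this flags is a lead rather than a finding: the sentence splitting and number matching are deliberately crude, so each hit still needs checking against the original document.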

Thirdly, we can now make stories personal and interactive using data: type in your postcode to find out how this policy affects schools in your area; search the database to find things that interest you; explore our interactive map.

There are other contributions, but those are the main ones that spring to mind.

Do you think that there is a necessity for data journalism education?

Given that I chose to learn and teach it, and then write about it, my answer is obviously going to be yes.

That doesn’t mean I think everyone needs to be a data journalist, or do data journalism, but two things make it central to modern journalism:

  1. We live in an increasingly data-driven world; and
  2. We live in a world where people don’t need journalists to find out what a public figure is saying or doing.

Shifts like this happen every so often in the history of journalism: in the early 20th century the invention of the press release meant journalists began to learn how to ‘use’ press releases.

And even earlier in the history of journalism, reporters started to move from drawing on official reports to learning how to interview people directly.

There is so much data around now that journalists need to know how to understand it and ask the right questions of it, just like other sources of information. Otherwise they will be misled by people with an agenda to push.

Also, increasingly our publishing is data-driven, and journalists will need to be able to use data to connect their reporting with the users in a particular area or demographic that the reporting relates to.

In what ways do you think data journalism impacts journalistic labour?

I would say it’s part of a broader change in the ways that journalists encounter and process information. Where journalists might previously have spent time arranging to meet or call someone to get information, a lot of time is now saved because the same information is published online.

So we might now instead spend the same time analysing or verifying that information, whether it’s data analysis or follow-up phone calls, or searching social media for witnesses, or finding experts, or something else.

Put another way, the labour of journalism is being reorganised as news organisations try to work out where their value now lies. Data journalism is one of the areas where news organisations can add more value.

To what extent do you think journalism curricula have changed in order to involve data-driven journalism?

My experience is only anecdotal (you can find some research on teaching in US colleges here and more qualitative work on ‘why’ here) but broadly I don’t think it’s changed as much as many people would like it to have changed. Why? Largely because it’s difficult to find people with the skills to teach data journalism (the industry struggles to hire people to do it, for a start!).

But certainly more journalism students are now introduced to the concept of data journalism and high profile examples like the MPs’ expenses and Wikileaks stories, and of course it’s in an increasing number of journalism books like my own Online Journalism Handbook.

And the general direction is towards more data journalism teaching within journalism courses, as more teaching staff learn the skills required, or find visiting tutors who can cover it, and more people from the industry with those skills find themselves in academia.

Do you think that everybody should be given the opportunity to learn how to use computational methods in the field of journalism?

Yes. I think at the broadest level students should be able to experience all the main forms of journalism: not just text-based journalism, but reporting using audio and video, and using computational methods in the broadest sense, whether that’s interactivity or analysis or newsgathering.

Journalism is now a multiplatform discipline (which is why I changed my MA title!) so understanding the possibilities on all platforms, including online, is pretty crucial now. ‘Computational methods’ takes in a lot of different types of work, though, so the degree of depth depends on the course and the student.

What are your thoughts on how new technologies in journalistic practice have altered the way in which journalist educators teach future journalists?

Journalism teaching has transformed enormously since the start of the century. The majority of journalism lecturers a decade ago would have come from one discipline: newspapers, magazines, radio or TV. They would know how to work in one medium, but not necessarily others.

The technical demands on students have increased enormously since then. It’s not enough any more just to be able to work in one medium, and broadly speaking we expect journalism students to be able to work in all media.

That means we have to teach a lot more, technically, than was the case before.

But editorial demands have increased too: being able to turn a press release into decent copy, for example, or write a good review — these things are much less valuable than used to be the case. So I think there’s a lot more focus in teaching on techniques such as data analysis, verification and fact-checking, and advanced sourcing.

As a journalist educator, could you give examples of why you might feel forced to adapt to technological developments?

One concrete way this has altered my own teaching is in the way I teach students narrative before I teach them tools (as I explained here). That way, if a new tool or feature comes along next year, they’ll be able to think critically about how to use it without being ‘taught’.

You can’t just keep adding new tools to the list of things that you teach on a course – there are just too many. So the emphasis shifts to the skills to help students themselves adapt to ongoing technological development:

  • How do they access communities of practice who support each other with tips and advice? (Which I wrote about here)
  • How do they draw on lessons from the history of media technologies? The early days of cinema or radio are in many ways very similar to these early days of online storytelling
  • How do they apply general principles to specific technologies? For example, narrative; or the importance of sound when recording video; or rules of composition and lighting; or ethics
  • How do they use critical analysis and research techniques to test the effectiveness of new approaches? (Here, for example, I wrote about getting students to use analytics in their work)

What do you think technological advancements in the field of journalism mean for journalists? Is it fight or flight?

I’m not sure what you’re talking about fighting against or flying from! But given the choice between abandoning journalism and fighting for better reporting, I’d have to go for “fight”!

I think that, like any change, technological advancements mean many opportunities and many threats. The greatest opportunities, for me, are around raising the bar for journalism: students are now expected to be better technically, editorially and ethically than they have ever been, in my opinion.

We are thinking harder than ever before about how to connect with audiences: lazy reliance on formats that we used for decades is being challenged. We now know when people watch or read our reporting, and for how long. We know how much people engage, or when they don’t. We have dozens of new formats to experiment with, and that’s providing an enormous boom in creativity. And we have new ways to fund journalism, which is leading to stories that would never have been reported before.

But that measurement can mean we chase the wrong metrics, or lack a proper understanding of what they mean, or rely on formats that are safer than experimentation. We can sacrifice technical depth for breadth. A lack of commercial security can lead to a lack of risk taking too. Access to much wider ranges of information means we can fall victim to confirmation bias — and our audiences can too.

So we need to fight to make sure that we take on those challenges, and come out with a better journalism than we had before. I’m pretty optimistic that we will get better, actually.

