Tag Archives: charlie beckett

Crowdsourcing investigative journalism: a case study (part 1)

As I begin on a new Help Me Investigate project, I thought it was a good time to share some research I conducted into the first year of the site, and the key factors in how that project tried to crowdsource investigative and watchdog journalism.

The findings of this research have been key to the development of this new project. They also form the basis of a chapter in the book Face The Future, and another due to be published in the Handbook of Online Journalism next year (not to be confused with my own Online Journalism Handbook). Here’s the report:

In both academic and mainstream literature about the world wide web, one theme consistently recurs: the lowering of barriers, allowing individuals to collaborate in pursuit of a common goal. Whether it is creating the world’s biggest encyclopedia (Lih, 2009), spreading news about a protest (Morozov, 2011) or tracking down a stolen phone (Shirky, 2008), the rise of the network has seen a decline in the role of the formal organisation, including news organisations.

Two examples of this phenomenon were identified while researching a book chapter on investigative journalism and blogs (De Burgh, 2008). The first was an experiment by Florida’s News-Press: when it started receiving calls from readers complaining about high water and sewage connection charges for newly constructed homes, the newspaper, short on in-house resources to investigate the leads, decided to ask its readers to help. The result is by now familiar as a textbook example of “crowdsourcing” – outsourcing a project to ‘the crowd’, or what Brogan & Smith (2009, p136) describe as “the ability to have access to many people at a time and to have them perform one small task each”:

“Readers spontaneously organized their own investigations: Retired engineers analyzed blueprints, accountants pored over balance sheets, and an inside whistle-blower leaked documents showing evidence of bid-rigging.” (Howe, 2006a)

The second example concerned contaminated pet food in the US, and did not involve a mainstream news organisation. In fact, it was frustration with poor mainstream ‘churnalism’ (see Davies, 2009) that motivated bloggers and internet users to start digging into the story. The resulting output from dozens of blogs ranged from useful information for pet owners and the latest news to the compilation of a database suggesting that the official number of pet deaths recorded by the US Food and Drug Administration was short by several thousand. One site, Itchmo.com, became so popular that it was banned in China, the source of the pet food in question.

What was striking about both examples was not simply that people could organise to produce investigative journalism, but that this practice of ‘crowdsourcing’ had two key qualities particularly relevant to journalism’s role in a democracy. The first was engagement: in the case of the News-Press, the story generated more traffic to its website over six weeks than “ever before, excepting hurricanes” (Weise, 2007). Given that investigative journalism often concerns very ‘dry’ subject matter that has to be made appealing to a wider audience, these figures were surprising – and encouraging for publishers.

The second quality was subject: the contaminated pet food story was, in terms of mainstream news values, unfashionable and unjustifiable as an investment of resources. It appeared that the crowdsourcing model of investigation might provide a way to pursue stories which were in the public interest but which commercial and public service news organisations would not consider worth their time. More broadly, research suggested that crowdsourcing worked “best in areas that are not core to your product or central to your business model” (Tapscott and Williams, 2006, p82).

Investigative journalism: its history and discourses

De Burgh (2008, p10) defines investigative journalism as “distinct from apparently similar work [of discovering truth and identifying lapses from it] done by police, lawyers and auditors and regulatory bodies in that it is not limited as to target, not legally founded and usually earns money for media publishers.” The term is notoriously problematic and contested: some argue that all journalism is investigative, or that the recent popularity of the term indicates the failure of ‘normal’ journalism to maintain investigative standards. This contestation is a symptom of the various factors underlying the growth of the genre, which range from journalists’ own sense of a democratic role, to professional ambition and publishers’ commercial and marketing objectives.

More recently investigative journalism has been used to defend traditional print journalism against online publishing, with publishers arguing that true investigative journalism cannot be maintained without the resources of a print operation. This position has become harder to defend as online-only operations and journalists have won increasing numbers of awards for their investigative work – Clare Sambrook in the UK, and VoiceOfSanDiego.com and Talking Points Memo in the US, are three examples – while new organisations have been established to pursue investigations without any associated print operation, including Canada’s OpenFile, the UK’s Bureau of Investigative Journalism, and a number of bodies in the US such as ProPublica, The Florida Center for Investigative Reporting, and the Huffington Post’s investigative unit.

In addition, computer technology has started to play an increasingly important role in print investigative journalism: Stephen Grey’s investigation into the CIA’s ‘extraordinary rendition’ programme (Grey, 2006) was facilitated by the use of software such as Analyst’s Notebook, which allowed him to analyse large amounts of flight data and identify leads. The Telegraph’s investigation into MPs’ expenses was made possible by digitisation of data and the ability to store large amounts on a small memory stick. And newspapers around the world collaborated with the Wikileaks website to analyse ‘warlogs’ from Iraq and Afghanistan, and hundreds of thousands of diplomatic cables. More broadly the success of Wikipedia inspired a raft of examples of ‘Wiki journalism’ where users were invited to contribute to editorial coverage of a particular issue or field, with varying degrees of success.

Meanwhile, investigative journalists such as The Guardian’s Paul Lewis have been exploring a more informal form of crowdsourcing, working with online communities to break stories including the role of police in the death of newspaper vendor Ian Tomlinson; the existence of undercover agents in the environmental protest movement; and the death of a man being deported to Angola (Belam, 2011b).

This is part of a broader move to networked journalism explored by Charlie Beckett (2008):

“In a world of ever-increasing media manipulation by government and business, it is even more important for investigative journalists to use technology and connectivity to reveal hidden truths. Networked journalists are open, interactive and share the process. Instead of gatekeepers they are facilitators: the public become co-producers. Networked journalists “are ‘medium agnostic’ and ‘story-centric’”. The process is faster and the information sticks around longer.” (2008, p147)

As one of its best-known practitioners, Paul Lewis talks particularly of the role of technology in his investigations – specifically Twitter – but also of the importance of the crowd itself and of journalistic method:

“A crucial factor that makes crowd-sourcing a success [was that] there was a reason for people to help, in this case a perceived sense of injustice and that the official version of events did not tally with the truth. Six days after Tomlinson’s death, Paul had twenty reliable witnesses who could be placed on a map at the time of the incident – and only one of them had come from the traditional journalistic tool of a contact number in his notebook.” (Belam, 2011b)

A further key skill identified by Lewis is listening to the crowd – although he sounds a note of caution about its vulnerability to deliberately placed misinformation, and stresses the need for verification:

“Crowd-sourcing doesn’t always work […] The most common thing is that you try, and you don’t find the information you want […] The pattern of movement of information on the internet is something journalists need to get their heads around. Individuals on the web in a crowd seem to behave like a flock of starlings – and you can’t control their direction.” (Belam, 2011b)

Conceptualising Help Me Investigate

The first plans for Help Me Investigate were made in 2008 and were further developed over the next 18 months. They built on research into crowdsourced investigative journalism, as well as other research into online journalism and community management. In particular the project sought to explore concepts of “P2P journalism” which enables “more engaged interaction between and amongst users” (Bruns, 2005, p120, emphasis in original) and of “produsage”, whose affordances included probabilistic problem solving, granular tasks, equipotentiality, and shared content (Bruns, 2008, p19).

A key feature in this was the ownership of the news agenda by users themselves (who could be either members of the public or journalists). This was partly for reasons identified above in research into the crowdsourced investigation into contaminated pet food. It would allow the site to identify questions that would not be considered viable for investigation within a traditional newsroom; but the feature was also implemented because ‘ownership’ was a key area of contestation identified within crowdsourcing research (Lih, 2009; Benkler, 2006; Surowiecki, 2005) – ‘outsourcing’ a project to a group of people raises obvious issues regarding claims of authorship, direction and benefits (Bruns, 2005).

These issues were considered carefully by the founders. The site adopted a user interface with three main modes of navigation for investigations: most-recent-top; most popular (those investigations with the most members); and two ‘featured’ investigations selected by site staff on the basis that they were the most interesting editorially, or because they were attracting particular interest and activity from users at that moment. There was therefore an editorial role, but it was limited to only two of the 18 investigations listed on the ‘Investigations’ page, and was at least partly guided by user activity.

In addition there were further pages where users could explore investigations through different criteria such as those investigations that had been completed, or those investigations with particular tags (e.g. ‘environment’, ‘Bristol’, ‘FOI’, etc.).

A second feature of the site was that ‘journalism’ was intended to be a by-product: the primary objective was the investigation process itself, which would inform users. Research suggested that if users were to be attracted to the site, it had to perform the function they needed it to perform (Porter, 2008) – which, as became apparent, was project management. The ‘problem’ that the site was attempting to ‘solve’ needed to be user-centric rather than publisher-centric: ‘telling stories’ would clearly be lower down the priority list for users than it was for journalists and publishers. Of higher priority were the needs to break a question down into manageable pieces, to find others to investigate with, and to get answers. This was eventually summarised in the strapline to the site: “Connect, mobilise, uncover”.

Thirdly, there was a decision to use ‘game mechanics’ that would make the process of investigation inherently rewarding. As the site and its users grew, the interface was changed so that challenges started on the left-hand side of the screen, coloured red, then moved to the middle when accepted (the colour changing to amber), and finally to the right column when complete (now with a green border and tick icon). This made it easier to see at a glance what needed doing and what had been achieved, and also introduced a level of innate satisfaction in the task. Users, the idea went, might grow to like the feeling of moving those little blocks across the screen, and the positive feedback (see Graham, 2010 and Dondlinger, 2007) provided by the interface.
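The mechanic amounts to a three-state kanban board. Help Me Investigate’s actual code isn’t reproduced here, so the following is only a minimal sketch of the idea in Python, with invented challenge names throughout:

```python
from enum import Enum

class Status(Enum):
    OPEN = "red"        # new challenge: left-hand column
    ACCEPTED = "amber"  # being worked on: middle column
    COMPLETE = "green"  # done: right-hand column, with tick icon

class Challenge:
    def __init__(self, title):
        self.title = title
        self.status = Status.OPEN

    def accept(self):
        self.status = Status.ACCEPTED

    def complete(self):
        self.status = Status.COMPLETE

def columns(challenges):
    """Group challenges into the three on-screen columns."""
    return {status: [c.title for c in challenges if c.status is status]
            for status in Status}

# Example: one challenge accepted, one completed, one still open.
tasks = [Challenge("File the FOI request"),
         Challenge("Transcribe the council minutes"),
         Challenge("Map the bus routes")]
tasks[0].accept()
tasks[1].complete()
print(columns(tasks))
```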

Similar techniques were coincidentally explored at the same time by The Guardian’s MPs’ expenses app (Bradshaw, 2009). This provided an interface for users to investigate MP expense claim forms that used many conventions of game design, including a ‘progress bar’, leaderboards, and button-based interfaces. A second iteration of the app – created when a second batch of claim forms was released – saw a redesigned interface with a stronger emphasis on positive feedback. As developer Martin Belam (2011a) explains:

“When a second batch of documents were released, the team working on the app broke them down into much smaller assignments. That meant it was easier for a small contribution to push the totals along, and we didn’t get bogged down with the inertia of visibly seeing that there was a lot of documents still to process.

“By breaking it down into those smaller tasks, and staggering their start time, you concentrated all of the people taking part on one goal at a time. They could therefore see the progress dial for that individual goal move much faster than if you only showed the progress across the whole set of documents.”
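The arithmetic behind that design choice is worth making explicit. A small sketch with invented figures (not the app’s real document counts):

```python
def progress(done, total):
    """Fraction of a document set that has been processed."""
    return done / total

# One undifferentiated pile: 1,000 contributions barely move the dial.
print(f"{progress(1_000, 450_000):.2%}")  # -> 0.22%

# The same 1,000 contributions against a single 5,000-document
# assignment move that assignment's dial visibly.
print(f"{progress(1_000, 5_000):.2%}")    # -> 20.00%
```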

These game mechanics are not limited to games: many social networking sites have borrowed the conventions to provide similar positive feedback to users. Jon Hickman (2010) describes how Help Me Investigate uses these genre codes and conventions:

“In the same way that Twitter records numbers of “followers”, “tweets”, “following” and “listed”, Help Me Investigate records the number of “things” which the user is currently involved in investigating, plus the number of “challenges”, “updates” and “completed investigations” they have to their credit. In both Twitter and Help Me Investigate these labels have a mechanistic function: they act as hyperlinks to more information related to the user’s profile. They can also be considered culturally as symbolic references to the user’s social value to the network – they give a number and weight to the level of activity the user has achieved, and so can be used in informal ranking of the user’s worth, importance and usefulness within the network.” (2010, p8)

This was indeed the aim of the site design, and it related to a further aim: allowing users to build ‘social capital’ within and through the site. Users could add links to web presences and Twitter accounts, as well as add biographies and ‘tag’ themselves. They were also ranked in a ‘Most active’ table, and each investigation had its own graph of user activity. This meant that users might use the site not simply for information-gathering reasons but also for reputation-building ones, a characteristic of open source communities identified by Bruns (2005) and Leadbeater (2008), among others.

There were plans to take these ideas much further, but they were shelved during the proof-of-concept phase as the team concentrated on core functionality. For example, it was clear that users needed to be able to give other users praise for positive contributions, and they used the ‘update’ feature to do so. A more intuitive function allowing users to give a ‘thumbs up’ to a contribution would have made this easier, and would also have provided a way to establish the reputation of individual users and encourage further use.

Another feature of the site’s construction was a networked rather than centralised design. The bid document to 4iP proposed to aggregate users’ material:

“via RSS and providing support to get users to use web-based services. While the technology will facilitate community creation around investigations, the core strategy will be community-driven, ‘recruiting’ and supporting alpha users who can drive the site and community forward.”

Again, this aggregation functionality was dropped in order to keep the initial version of the site focused. However, the basic principle of working within a network was retained, with many investigations including a challenge to blog about progress on other sites, or to use external social networks to find possible contributors. The site included guidance on using tools elsewhere on the web, and many investigations linked to users’ blog posts.
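The bid document doesn’t specify an implementation, but the kind of aggregation it describes is easy to sketch. A minimal example in Python, assuming the third-party feedparser library and hypothetical contributor feed URLs:

```python
import feedparser  # third-party: pip install feedparser

# Hypothetical feeds of contributors blogging about an investigation.
FEEDS = [
    "https://contributor-one.example.com/feed",
    "https://contributor-two.example.com/feed",
]

def aggregate(feed_urls):
    """Pull the latest posts from each contributor's RSS feed."""
    items = []
    for url in feed_urls:
        for entry in feedparser.parse(url).entries:
            items.append({
                "title": entry.get("title", ""),
                "link": entry.get("link", ""),
                "published": entry.get("published", ""),
            })
    return items

for item in aggregate(FEEDS):
    print(item["published"], "-", item["title"], "-", item["link"])
```

In practice such items would be stored and displayed alongside the relevant investigation rather than printed, but the principle – the site as a hub in a network of contributors’ own publishing – is the same.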

In the second part I discuss the building of the site and reflections on the site’s initial few months.

One ambassador’s embarrassment is a tragedy, 15,000 civilian deaths is a statistic

Few things illustrate the challenges facing journalism in the age of ‘Big Data’ better than Cable Gate – and specifically, how you engage people with stories that involve large sets of data.

The Cable Gate leaks have been of a different order to the Afghanistan and Iraq war logs. Not in number (there were 90,000 documents in the Afghanistan war logs and over 390,000 in the Iraq logs; the Cable Gate documents number around 250,000) – but in subject matter.

Why is it that the 15,000 extra civilian deaths estimated to have been revealed by the Iraq war logs did not move the US authorities to shut down Wikileaks’ hosting and PayPal accounts? Why did it not dominate the news agenda in quite the same way?

Tragedy or statistic?

I once heard a journalist trying to put the number ‘£13 billion’ into context by saying: “imagine 13 million people paying £1,000 more per year” – as if imagining 13 million people was somehow easier than imagining £13bn. Comparing numbers to the size of Wales or the prime minister’s salary is hardly any better.

Generally misattributed to Stalin, the quote “The death of one man is a tragedy, the death of millions is a statistic” illustrates the problem particularly well: when you move beyond scales we can deal with on a human level, you struggle to engage people in the issue you are covering.

Research suggests this is a problem that not only affects journalism, but justice as well. In October Ben Goldacre wrote about a study that suggested “People who harm larger numbers of people get significantly lower punitive damages than people who harm a smaller number. Courts punish people less harshly when they harm more people.”

“Out of a maximum sentence of 10 years, people who read the three-victim story recommended an average prison term one year longer than the 30-victim readers. Another study, in which a food processing company knowingly poisoned customers to avoid bankruptcy, gave similar results.”

In the US “scoreboard reporting” on gun crime – represented by numbing headlines like “82 shot, 14 fatally” – has been criticised for similar reasons:

“”As long as we have reporting that gives the impression to everyone that poor, black folks in these communities don’t value life, it just adds to their sense of isolation,” says Stephen Franklin, the community media project director at the McCormick Foundation-funded Community Media Workshop, where he led the “We Are Not Alone” campaign to promote stories about solution-based anti-violence efforts.

“Natalie Moore, the South Side Bureau reporter for the Chicago Public Radio, asks: “What do we want people to know? Are we just trying to tell them to avoid the neighborhoods with many homicides?” Moore asks. “I’m personally struggling with it. I don’t know what the purpose is.””

Salience

This is where journalists play a particularly important role. Kevin Marsh, writing about Wikileaks on Sunday, argues that

“Whistleblowing that lacks salience does nothing to serve the public interest – if we mean capturing the public’s attention to nurture its discourse in a way that has the potential to change something material.”

He is right. But Charlie Beckett, in the comments to that post, points out that Wikileaks is not operating in isolation:

“Wikileaks is now part of a networked journalism where they are in effect, a kind of news-wire for traditional newsrooms like the New York Times, Guardian and El Pais. I think that delivers a high degree of what you call salience.”

This is because last year Wikileaks realised that they would have much more impact working in partnership with news organisations than releasing leaked documents to the world en masse. It was a massive move for Wikileaks, because it meant re-assessing a core principle of openness to all, and taking on a more editorial role. But it was an intelligent move – and undoubtedly effective. The Guardian, Der Spiegel, New York Times and now El Pais and Le Monde have all added salience to the leaks. But could they have done more?

Visualisation through personalisation and humanisation

In my series of posts on data journalism I identified visualisation as one of four interrelated stages in its production. I think this concept needs to be broadened to include visualisation through case studies – or humanisation, to put it more succinctly.

There are dangers here, of course. Firstly, that humanising a story makes it appear to be an exception (one person’s tragedy) rather than the rule (thousands suffering) – or simply emotive rather than also informative; and secondly, that your selection of case studies does not reflect the more complex reality.

Ben Goldacre – again – explores this issue particularly well:

“Avastin extends survival from 19.9 months to 21.3 months, which is about 6 weeks. Some people might benefit more, some less. For some, Avastin might even shorten their life, and they would have been better off without it (and without its additional side effects, on top of their other chemotherapy). But overall, on average, when added to all the other treatments, Avastin extends survival from 19.9 months to 21.3 months.

“The Daily Mail, the Express, Sky News, the Press Association and the Guardian all described these figures, and then illustrated their stories about Avastin with an anecdote: the case of Barbara Moss. She was diagnosed with bowel cancer in 2006, had all the normal treatment, but also paid out of her own pocket to have Avastin on top of that. She is alive today, four years later.

“Barbara Moss is very lucky indeed, but her anecdote is in no sense whatsoever representative of what happens when you take Avastin, nor is it informative. She is useful journalistically, in the sense that people help to tell stories, but her anecdotal experience is actively misleading, because it doesn’t tell the story of what happens to people on Avastin: instead, it tells a completely different story, and arguably a more memorable one – now embedded in the minds of millions of people – that Roche’s £21,000 product Avastin makes you survive for half a decade.”

Broadcast journalism – with its regulatory requirement for impartiality, often interpreted in practical terms as ‘balance’ – is particularly vulnerable to this, the homeopathy debate being one example of a complex issue given over to one person’s experience for the sake of balance.

Journalism on an industrial scale

The Wikileaks stories are journalism on an industrial scale. The closest equivalent I can think of is the MPs’ expenses story, which dominated the news agenda for six weeks. Cable Gate is already on day nine, and the wealth of stories has even justified a live blog.

With this scale comes a further problem: cynicism and passivity – Cable Gate fatigue. In this context online journalism has a unique role to play that was barely possible previously: empowerment.

Three years ago I wrote about the 5 Ws and a H that should come after every news story. The ‘How’ and ‘Why’ of that are possibilities that many news organisations have still barely explored. ‘Why should I care?’ is about a further dimension of visualisation: personalisation – relating information directly to me. The Guardian moves closer to this with its searchable database, but I wonder at what point processing power, tools, and user data will allow us to do this sort of thing more effectively.
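At its simplest, personalisation is a filter over a shared dataset. A minimal sketch in Python, assuming a hypothetical CSV of expense claims whose column names are invented for illustration (not the Guardian’s actual schema):

```python
import csv

def personalise(csv_path, constituency):
    """Return only the rows relevant to the reader's own constituency.

    Assumes a hypothetical CSV with 'constituency', 'mp' and 'amount'
    columns - not the schema of any real published database.
    """
    with open(csv_path, newline="") as f:
        return [row for row in csv.DictReader(f)
                if row["constituency"] == constituency]

# A reader enters their constituency and sees only their own MP's claims.
for claim in personalise("expenses.csv", "Birmingham Edgbaston"):
    print(claim["mp"], claim["amount"])
```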

‘How can I make a difference?’ is about pointing users to tools – or creating them ourselves – where they can move the story on by communicating with others, campaigning, voting, and so on. This is a role many journalists may be uncomfortable with because it raises advocacy issues, but then choosing to report on these stories, and how to report them, raises the same issues; linking to a range of online tools need not be any different. These are issues we should be exploring, ethically.

All the above in one sentence

Somehow I’ve ended up writing over a thousand words on this issue, so it’s worth summing it all up in a sentence.

Industrial scale journalism using ‘big data’ in a networked age raises new problems and new opportunities: we need to humanise and personalise big datasets in a way that does not detract from the complexity or scale of the issues being addressed; and we need to think about what happens after someone reads a story online and whether online publishers have a role in that.

Data journalism pt3: visualising data – charts and graphs (comments wanted)

This is a draft from a book chapter on data journalism (the first section, on gathering data, is here; the section on interrogating data is here). I’d really appreciate any additions or comments you can make – particularly around considerations in visualisation. A further section, on visualisation tools, can be found here.

UPDATE: It has now been published in The Online Journalism Handbook.

“At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers – even a very large set – is to look at pictures of those numbers.” (Edward Tufte, The Visual Display of Quantitative Information, 2001)

Visualisation is the process of giving graphic form to information that is often otherwise dry or impenetrable. Classic examples of visualisation include turning a table into a bar chart, or a series of percentage values into a pie chart – but the growing power of both computer analysis and graphic design software has seen the craft of visualisation develop in sophistication. In larger organisations the data journalist may work with a graphic artist to produce an infographic that visualises their story – but in smaller teams, in the initial stages of a story, or when speed is of the essence, they are likely to need to use visualisation tools to give form to their data.

Broadly speaking, there are two typical reasons for visualising data: to find a story, or to tell one. Quite often, it is both.
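As an illustration of the simplest case – turning a small table into a bar chart – here is a minimal Python sketch using matplotlib, with invented figures:

```python
import matplotlib.pyplot as plt

# A small table of invented figures, of the kind a journalist
# might want to turn into a bar chart.
categories = ["Housing", "Travel", "Office", "Staffing"]
amounts = [12400, 8150, 5600, 21300]

fig, ax = plt.subplots()
ax.bar(categories, amounts)
ax.set_ylabel("Claimed (£)")
ax.set_title("Expense claims by category (invented data)")
plt.tight_layout()
plt.show()
```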

Speaking at the Perugia International Journalism Festival 2009

The lineup for the Perugia International Journalism Festival 2009 has been announced. I’ll be speaking on the first of a series of panels devoted to ‘New Media – The Future of Journalism’. The topic is “Blogs and online communities: Where now for interactive journalism?”. The other members of the panel are Luca Conti, Ben Hammersley, Antonio Sofi and Juan Varela.

The following day Paolo Ligouri, Marco Pratellesi, Charlie Beckett, Erik Ulken and Giuseppe Smorto will discuss “Networked journalism – permeable, interactive, 24/7, multi-platform, multi-dimensional – is here. The media is saved!” (if they have any time left after they finish reading out the title).

Carnival of journalism: How do you financially support journalism online?

Gather round, gather round for this month’s Carnival of Journalism, which addresses the timely question of ‘How do you financially support journalism online?’. I’ll be updating this post as the carnival performers put on their outsized business heads and add their peacock-like contributions.

Interview: Charlie Beckett on SuperMedia

“This book is my manifesto for the media as a journalist but also as a citizen of the world. As a journalist you are constantly being told that the news media have enormous power to shape society and events, to change lives and history. So why are we so careless as a society about the future of journalism itself?” [1]

This is how Charlie Beckett presents his book “SuperMedia: Saving Journalism So It Can Save The World” (Wiley-Blackwell, 2008), in which he tackles the main challenges facing journalistic practice today, and the influence of journalism in maintaining free and democratic societies.

Charlie Beckett is a journalist with a 20-year career at the BBC and ITN, and the founding Director of POLIS, a think tank on journalism and society at the London School of Economics. “SuperMedia” gathers and structures several streams of thought about the future of journalism as an essential service to contemporary societies, and about how the changes in the news industry are not only inevitable but necessary.

Alex Gamela posed a few questions to Charlie Beckett about his book (Portuguese version available here).