Tag Archives: torchbearers

A case study in online journalism part 2: verification, SEO and collaboration (investigating the Olympic torch relay)

corporate Olympic torchbearers image

Having outlined some of the data journalism processes involved in the Olympic torch relay investigation, in part 2 I want to touch on how verification and ‘passive aggressive newsgathering’ played a role.

Verification: who’s who

Data in this story not only provided leads which needed verifying, but also helped verify leads from outside the data.

In one example, an anonymous tip-off suggested that both children of one particular executive were carrying the Olympic torch on different legs of the relay. A quick check against his name in the data suggested this was so: two girls with the same unusual surname were indeed carrying the torch. Neither mentioned the company or their father. But how could we confirm it?

The answer involved checking planning applications, Google Streetview, and a number of other sources, including newsletters from the private school that they both attended which identified the father.

In another example, I noticed that one torchbearer had mentioned running alongside two employees of Aggreko, who were paying for their torches. I searched for other employees, and found a cake shop which had created a celebratory cake for three of them. Having seen how some corporate sponsors used their places, I went on a hunch and looked up the board of directors, searching in the data first for the CEO Rupert Soames. His name turned up – with no nomination story. A search for other directors found that more than half the executive board were carrying torches – which turned out to be our story. The final step: a call to the company to get a reaction and confirmation.
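The board-of-directors check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`name`, `story`), the helper names and all of the names in the sample data are hypothetical, and a name match is a lead to verify rather than a confirmation.

```python
import re

def normalise(name):
    """Lower-case a name and collapse anything that is not a-z into single spaces."""
    return re.sub(r"[^a-z]+", " ", name.lower()).strip()

def find_directors(torchbearers, directors):
    """Return the torchbearer records whose normalised name matches a director's."""
    wanted = {normalise(d) for d in directors}
    return [t for t in torchbearers if normalise(t["name"]) in wanted]

# Hypothetical records (not real data); an empty story means no nomination story.
torchbearers = [
    {"name": "Alex Example", "story": ""},
    {"name": "Sam Sample", "story": "Nominated for 20 years of volunteering."},
    {"name": "Jo  Placeholder", "story": ""},
]
# Hypothetical list of directors, e.g. copied from a company's website.
directors = ["Alex Example", "Jo Placeholder", "Pat Nobody"]

matches = find_directors(torchbearers, directors)
no_story = [t["name"] for t in matches if not t["story"].strip()]
print(f"{len(matches)} of {len(directors)} directors found; without a story: {no_story}")
```

The loose matching here drops accented characters and ignores titles and middle names, so it will both miss some people and throw up false positives. Each hit still needs the kind of checking described in this section.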

The more that we knew about how torch relay places had been used, the easier it was to verify other torchbearers. As a pattern emerged of many coming from the telecoms industry, that helped focus the search – but we had to be aware that having suspicions ‘confirmed’ didn’t mean that the name itself was confirmed – it was simply that you were more likely to hit a match that you could verify.

Scepticism was important: at various times names seemed to match with individuals but you had to ask ‘Would that person not use his title? Why would he be nominated? Would he be that age now?’

Images helped – sometimes people used the same image that had been used elsewhere (you could match this with Google Images’ ‘match image’ feature, then refine the search). At other times you could match with public photos of the person as they carried the torch.

This post on identifying mystery torchbearers gives more detail.

Passive aggressive newsgathering

Alerts proved key to the investigation. Early on I signed up for daily alerts on any mention of the Olympic torch. 95% of stories were formulaic ‘local town/school/hero excited about torch’ reports, but occasionally key details would emerge in other pieces – particularly those from news organisations overseas.

Google Alerts for Olympic torch

It was from these that I learned how many places exactly Dow, Omega, Visa and others had, and how many were nominated. It was how I learned about torchbearers who were not even listed on the official site, about the ‘criteria’ that were supposed to be adhered to by some organisations, about public announcements of places which suggested a change from previous numbers, and more besides.

As I came across anything that looked interesting, I bookmarked and tagged it. Some would come in useful immediately, but most would only come in useful later when I came to write up the full story. Essentially, they were pieces of a jigsaw I was yet to put together. (For example, this report mentioned that 2,500 employees were nominated within Dow for just 10 places. How must those employees feel when they find the company’s VP of Olympic operations took up one of the few places? Likewise, he fit a broader pattern of sponsorship managers carrying the torch.)

I also subscribed to any mention of the torch relay in Parliament, and any mention in FOI requests.

SEO – making yourself findable

One of the things I always emphasise to my students is the importance of publishing early and often on a subject to maximise the opportunities for others in the field to find out – and get in touch. This story was no exception. From the earliest stages through to the last week of the relay, users stumbled across the site as they looked for information on the relay – and passed on their concerns and leads.

It was particularly important with a big public event like the Olympic torch relay, which generated a lot of interest among local people. In the first week of the investigation one photographer stumbled across the site because he was searching for the name of one of the torchbearers we had identified as coming from adidas. He passed on his photographs – but more importantly, made me aware that there may be photographs of other executives who had already carried the torch.

That led to the strongest image of the investigation – two executives exchanging a ‘torch kiss’ (shown at the top of this post) – which was in turn picked up by The Daily Mail.

Other leads kept coming: the tip-off about the executive’s daughters mentioned above; someone mentioning two more Aggreko directors, one of whom had never been published on the official site, while the other had been listed and then removed; questions about a Polish torchbearer who was not listed on the official site or, indeed, anywhere on the web other than the BBC’s torch relay liveblog; and challenges to one story we linkblogged, which led to further background that helped flesh out the processes behind the nominations given to universities.

When we published the ‘mystery torchbearers’ with The Guardian some got in touch to tell us who they were. In one case, that contact led to an interview which closed the book: Geoff Holt, the first quadriplegic to sail single-handed across the Atlantic Ocean.

Collaboration

I could have done this story the old-fashioned way: kept it to myself, done all the digging alone, and published one big story at the end.

It wouldn’t have been half as good. It wouldn’t have had the impact, it wouldn’t have had the range, and it would have missed key ingredients.

Collaboration was at the heart of this process. As soon as I started to unearth the adidas torchbearers I got in touch with The Guardian’s James Ball. His report the week after added reactions from some of the companies involved, and other torchbearers we’d simultaneously spotted. But James also noticed that one of Coca-Cola’s torchbearers was a woman “who among other roles sits on a committee of the US’s Food and Drug Administration”.

It was collaborating with contacts in Staffordshire which helped point me to the ‘torch kiss’ image. They in turn followed up the story behind it (a credit for Help Me Investigate was taken out of the piece – it seems old habits die hard), and The Daily Mail followed up on that to get some further reaction and response (and no, they didn’t credit the Stoke Sentinel either). In Bournemouth and Sussex local journalists took up the baton (sorry), and the Times Higher did their angle.

We passed on leads to Ventnor Blog, whose users helped dig into a curious torchbearer running through the area. And we published a list of torchbearers missing stories in The Guardian, where users helped identify them.

Collaborating with an international mailing list for investigative journalists, I generated datasets of local torchbearers in Hungary, Italy, India, the Middle East, Germany, and Romania. German daily newspaper Der Tagesspiegel got in touch and helped trace some of the Germans.

And of course, within the Help Me Investigate network people were identifying mystery torchbearers, getting responses from sponsors, visualising data, and chasing interviews. One contributor in particular – Carol Miers – came on board halfway through and contributed some of the key elements of the final longform report – in particular the interview that opens the book, which I talk about in the final part of this series.

A case study in online journalism: investigating the Olympic torch relay

Infographic: Where did the Olympic torch relay places go? What we know so far

image by @CarolineBeavon

For the last two months I’ve been involved in an investigation which has used almost every technique in the online journalism toolbox. From its beginnings in data journalism, through collaboration, community management and SEO to ‘passive-aggressive’ newsgathering, verification and ebook publishing, it’s been a fascinating case study in such a range of ways I’m going to struggle to get them all down.

But I’m going to try.

Data journalism: scraping the Olympic torch relay

The investigation began with the scraping of the official torchbearer website. It’s important to emphasise that this piece of data journalism didn’t take place in isolation – in fact, it was while working with Help Me Investigate the Olympics’s Jennifer Jones (coordinator for #media2012, the first citizen media network for the Olympic Games) and others that I stumbled across the torchbearer data. So networks and community are important here (more later).

Indeed, it turned out that the site couldn’t be scraped through a ‘normal’ scraper, and it was the community of the Scraperwiki site – specifically Zarino Zappia – who helped solve the problem and get a scraper working. Without both of those sets of relationships – with the citizen media network and with the developer community on Scraperwiki – this might never have got off the ground.

But it was also important to see the potential newsworthiness in that particular part of the site. Human stories were at the heart of the torch relay – not numbers. Local pride and curiosity were here – a key ingredient of any local newspaper. There were the promises made by its organisers – had they been kept?

The hunch proved correct – this dataset would just keep on giving stories.

The scraper grabbed details on around 6,000 torchbearers. I was curious why more weren’t listed – yes, there were supposed to be around 800 invitations to high profile torchbearers including celebrities, who might reasonably be expected to be omitted at least until they carried the torch – but that still left over 1,000.

I’ve written a bit more about the scraping and data analysis process for The Guardian and the Telegraph data blog. In a nutshell, here are some of the processes used:

  • Overview (pivot table): where do most come from? What’s the age distribution?
  • Focus on details in the overview: what’s the most surprising hometown in the top 5 or 10? Who’s oldest and youngest? What about the biggest source outside the UK?
  • Start asking questions of the data based on what we know it should look like – and hunches
  • Don’t get distracted – pick a focus and build around it.
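The first two steps in that list (an overview, then a closer look at the extremes) do not need a spreadsheet pivot table. A minimal Python sketch follows, assuming each scraped row has `hometown` and `age` fields. The field names and the sample rows are hypothetical, not the real dataset.

```python
from collections import Counter

# Hypothetical rows standing in for the scraped data.
rows = [
    {"name": "A", "hometown": "London", "age": 17},
    {"name": "B", "hometown": "London", "age": 45},
    {"name": "C", "hometown": "Exampletown", "age": 52},
    {"name": "D", "hometown": "Leeds", "age": 12},
    {"name": "E", "hometown": "Exampletown", "age": 38},
    {"name": "F", "hometown": "London", "age": 81},
]

def age_band(age, width=10):
    """Return the lower bound of the age band, e.g. 17 -> 10, 45 -> 40."""
    return (age // width) * width

# Overview: where do most come from, and what is the age distribution?
by_hometown = Counter(r["hometown"] for r in rows)
by_age_band = Counter(age_band(r["age"]) for r in rows)

# Details: who is oldest and youngest?
oldest = max(rows, key=lambda r: r["age"])
youngest = min(rows, key=lambda r: r["age"])

print(by_hometown.most_common(5))
for lower, count in sorted(by_age_band.items()):
    print(f"{lower}-{lower + 9}: {count}")
print("oldest:", oldest["name"], "youngest:", youngest["name"])
```

The counts only tell you where to look. The surprising hometown in the top five still has to be chased up by hand.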

This last point is notable. As I looked for mentions of Olympic sponsors in nomination stories, I started to build up subsets of the data: a dozen people who mentioned BP, two who mentioned ArcelorMittal (the CEO and his son), and so on. Each was interesting in its own way – but where should you invest your efforts?

One story had already caught my eye: it was written in the first person and talked about having been “engaged in the business of sport”. It was hardly inspirational. As it mentioned adidas, I focused on the adidas subset, and found that the same story was used by a further six people – a third of all of those who mentioned the company.

Clearly, all seven people hadn’t written the same story individually, so something was odd here. And that made this more than a ‘rotten apple’ story, but something potentially systemic.
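Spotting a story shared by several people is easy to automate once the data is scraped. The sketch below groups torchbearers by their nomination story after normalising case and whitespace. The field names, the sample text and the `shared_stories` helper are assumptions made for illustration only.

```python
from collections import defaultdict

def normalise_story(text):
    """Lower-case the story and collapse runs of whitespace."""
    return " ".join(text.lower().split())

def shared_stories(records, min_people=2):
    """Map each story used by at least `min_people` torchbearers to their names."""
    groups = defaultdict(list)
    for r in records:
        key = normalise_story(r["story"])
        if key:  # skip torchbearers with no story at all
            groups[key].append(r["name"])
    return {story: names for story, names in groups.items() if len(names) >= min_people}

# Hypothetical records (not real data).
records = [
    {"name": "A", "story": "I have been engaged in  the business of sport."},
    {"name": "B", "story": "I have been engaged in the business of sport."},
    {"name": "C", "story": "Raised money for the local hospice."},
    {"name": "D", "story": ""},
]

for story, names in shared_stories(records).items():
    print(len(names), "people share:", story)
```

Exact matching after normalisation will miss stories that were lightly reworded, so this only catches the most blatant copies.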

Signals

While the data was interesting in itself, it was important to treat it as a set of signals to potentially more interesting exploration. Seven torchbearers having the same story was one of those signals. Mentions of corporate sponsors were another.

But there were many others too.

That initial scouring of the data had identified a number of people carrying the torch who held executive positions at sponsors and their commercial partners. The Guardian, The Independent and The Daily Mail were among the first to report on the story.

I wondered if the details of any of those corporate torchbearers might have been taken off the site afterwards. And indeed they had: seven disappeared entirely (many still had a profile if you typed in the URL directly – but could not be found through search or browsing), and a further two had had their stories removed.

Now, every time I scraped details from the site I looked for those who had disappeared since the last scrape, and those that had been added late.
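Comparing one scrape with the next comes down to set arithmetic. A minimal sketch, assuming each scrape is stored as a dictionary keyed by a stable profile identifier (the identifiers, field names and `diff_scrapes` helper are hypothetical):

```python
def diff_scrapes(previous, current):
    """Compare two scrapes keyed by profile id.

    Returns (removed, added, story_removed): ids that vanished, ids that are new,
    and ids still listed whose nomination story has since been blanked.
    """
    prev_ids, curr_ids = set(previous), set(current)
    removed = sorted(prev_ids - curr_ids)
    added = sorted(curr_ids - prev_ids)
    story_removed = sorted(
        i for i in prev_ids & curr_ids
        if previous[i]["story"].strip() and not current[i]["story"].strip()
    )
    return removed, added, story_removed

# Hypothetical scrapes (not real data).
scrape_1 = {
    "p1": {"name": "A", "story": "Charity runner."},
    "p2": {"name": "B", "story": "Engaged in the business of sport."},
    "p3": {"name": "C", "story": "Youth coach."},
}
scrape_2 = {
    "p1": {"name": "A", "story": "Charity runner."},
    "p3": {"name": "C", "story": ""},
    "p4": {"name": "D", "story": ""},
}

removed, added, story_removed = diff_scrapes(scrape_1, scrape_2)
print("removed:", removed, "added:", added, "story removed:", story_removed)
```

Keeping every scrape rather than overwriting the last one is what makes this possible. A profile that appears once and then vanishes only shows up if both snapshots still exist.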

One, for example – who shared a name with a very senior figure at one of the sponsors – appeared just once before disappearing four days later. I wouldn’t have spotted them if they – or someone else – hadn’t been so keen on removing their name.

Another time, I noticed that a new torchbearer had been added to the list with the same story as the 7 adidas torchbearers. He turned out to be the Group Chief Executive of the country’s largest catalogue retailer, providing “continuing evidence that adidas ignored LOCOG guidance not to nominate executives.”

Meanwhile, the proportion of torchbearers running without any nomination story rose from just 2.7% of the 6,056 torchbearers in the first scrape to 7.2% of the 6,891 listed in the last week. Counting everyone who had appeared at any point between those two dates, including those who had appeared and then disappeared, the figure was 8.1%.
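To get a feel for what those percentages mean in people, you can back out approximate head counts from the published figures. The counts below are derived by rounding and are estimates, not numbers taken from the data itself. The `approx_count` helper is only for illustration.

```python
def approx_count(total, percent):
    """Estimate how many records a rounded percentage of `total` represents."""
    return round(total * percent / 100)

first_scrape = approx_count(6056, 2.7)  # first scrape
last_week = approx_count(6891, 7.2)     # scrape in the last week

print(first_scrape, last_week)  # 164 496
```

Because the percentages are themselves rounded to one decimal place, each estimate could be out by a few people either way.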

Many were celebrities or sportspeople where perhaps someone had taken the decision that they ‘needed no introduction’. But many also turned out to be corporate torchbearers.

By early July the numbers of these ‘mystery torchbearers’ had reached 500 and, having only identified a fifth, we published them through The Guardian datablog.

There were other signals, too, where knowing the way the torch relay operated helped.

For example, logistics meant that overseas torchbearers often carried the torch in the same location. This led to a cluster of Chinese torchbearers in Stansted, Hungarians in Dorset, Germans in Brighton, Americans in Oxford and Russians in North Wales.

As many corporate torchbearers were also based overseas, this helped narrow the search, with Germany’s corporate torchbearers in particular leading to an article in Der Tagesspiegel.

I also had the idea (thanks to Adrian Short) to total up how many torchbearers appeared each day, to identify days when details on unusually high numbers of torchbearers were missing. But it became apparent that variation due to other factors, such as weekends and the Jubilee, made this worthless.

However, the percentage per day missing stories did help (visualised below by Caroline Beavon), as this also helped identify days when large numbers of overseas torchbearers were carrying the torch. I cross-referenced this with the ‘mystery torchbearer’ spreadsheet to see how many had already been checked, and which days still needed attention.
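The difference between the two measures is that a percentage is not thrown off by days with more or fewer runners overall. A minimal sketch of the per-day calculation, assuming each record carries a `date` and a `story` field (hypothetical names and sample data):

```python
from collections import defaultdict

def missing_share_by_day(records):
    """Return {date: percentage of that day's torchbearers with no story}."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for r in records:
        totals[r["date"]] += 1
        if not r["story"].strip():
            missing[r["date"]] += 1
    return {day: round(100 * missing[day] / totals[day], 1) for day in sorted(totals)}

# Hypothetical records (not real data); dates are ISO strings so they sort correctly.
records = [
    {"date": "2012-06-01", "story": "Scout leader."},
    {"date": "2012-06-01", "story": ""},
    {"date": "2012-06-01", "story": "Fundraiser."},
    {"date": "2012-06-01", "story": "Lifeboat volunteer."},
    {"date": "2012-06-02", "story": ""},
    {"date": "2012-06-02", "story": "Teacher."},
]

shares = missing_share_by_day(records)
for day, pct in shares.items():
    print(day, f"{pct}%")
```

Days that stand out can then be cross-referenced against whatever list of already-checked names you keep, as described above.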

But the data was just the beginning. In the second part of this case study, I talk about the verification process, SEO and collaboration.

Two guest posts on using data journalism techniques to investigate the Olympics

Corporate Olympic torchbearers exchange a 'torch kiss'

Investigating corporate Olympic torchbearers - analysing the data and working collaboratively led to this photo of a 'torch kiss' between two retail bosses

If I’ve been a little quiet on the blog recently, it’s because I’ve been spending a lot of time involved in an investigation into the Olympic torch relay over on Help Me Investigate the Olympics.

I’ve written two guest posts – for The Guardian’s Data Blog and The Telegraph’s new Olympics infographics and data blog – talking about some of the processes involved in that investigation. Here are the key points: