Category Archives: online journalism

Practical steps for improving visualisation

Here’s a useful resource for anyone involved in data journalism and visualising the results. ‘Dataviz’ – a site for “improving data visualisation in the public sector” – features a step-by-step guide to good visualisation, as well as case studies and articles.

Although it’s aimed at public sector workers, the themes in the former provide a good starting point for journalists: “What do we need to do?”; “How do we do it?” and “How did we do?” Each provides a potential story angle. Clicking through those themes takes you through some of the questions to ask of the data, taking you to a gallery of visualisation possibilities. Even if you never get that far, it’s a good way to narrow the question you’re asking – or find other questions that might result in interesting stories and insights.

Lessons in community from community managers #12: Lorna Mitchell

It’s been a while since the last in the community management series. In this latest post Lorna Mitchell gives her three tips. Lorna is co-project lead for http://joind.in – an open source development project for gathering event feedback. She says: “The other project lead is Chris Cornutt, a guy I’ve met three times over three years, who lives in a timezone 6 hours out from mine.”

Lorna worked as a telecommuter for a number of years and did community relations in that role, and was involved in running PHPWomen, a global user group bringing together women programming PHP, “with all the cultural and linguistic variations that brings.”

Lorna’s tips are:

Keep communicating

A running commentary of what you are doing and thinking is essential when you are working with people who can’t see you and may not have met you.

Communicate appropriately

Don’t hold a discussion over Twitter that would be better in longhand over email. Make a phone call rather than having days of comment and response on a bug tracker.

Be inclusive

Nothing turns newcomers off faster than lots of in-jokes or references to people they don’t know or places they didn’t go.

Hyperlocal voices: Bart Brouwers, Telegraaf hyperlocal project, Netherlands

Bart Brouwers has been overseeing the establishment of a whole group of hyperlocal sites in the Netherlands with the Telegraaf Media Group. As part of the Hyperlocal Voices series, he explains the background to the project and what they’ve learned so far. Two presentations on the project can be seen above.

Who were the people behind the blog, and what were their backgrounds?

About a year ago, I came up with the plan for a hyperlocal, hyperpersonal news and data network covering all of the Netherlands. My dream was to give every single Dutchman (we have 16 million & counting…) his own platform for local relevance.

I wanted to roll it out myself and in order to get it financed I made contact with the board of directors of the Telegraaf Media Groep. I already worked for them (as the editor-in-chief of national free newspaper Sp!ts and before that as the editor-in-chief of regional newspaper Dagblad De Limburger), so it felt kind of natural to tell and ask them before I would pitch my idea somewhere else.

What I didn’t know is that TMG was already working on a hyperlocal platform, so after a few talks we decided to combine both plans. So instead of quitting TMG and starting my own company, I’m still an employee.

What made you decide to set up the blogs?

I was convinced local relevance would/will be a strong force in media. The combination of local business and local information (news, data) could easily become the trigger for a fine enterprise.

Manchester police tweets – live data visualisation by the MEN

Greater Manchester Police (GMP) have been experimenting today with tweeting every incident they deal with. The novelty value of the initiative has been widely reported – but local newspaper the Manchester Evening News has taken the opportunity to ask some deeper questions of the data generated by experimenting with data visualisation.

A series of bar charts, generated from Google spreadsheets and updated throughout the day, provide a valuable and instant insight into the sort of work that police are having to deal with.

In particular, the newspaper is testing the police’s claim that they spend a great deal of time dealing with “social work” as well as crime. At the time of writing, it certainly does take up a significant proportion – although not the “two-thirds” mentioned by GMP chief Peter Fahy. (Statistical disclaimer: the data does not yet even represent 24 hours, so is not yet going to be a useful guide. Fahy’s statistics may be more reliable).

Also visualised are the areas responsible for the most calls, the social-crime breakdown of incidents by area, and breakdowns of social incidents and serious crime incidents by type.

I’m not sure how much time they had to prepare for this, but it’s a good quick hack.

That said, the visualisation could be improved: 3D bars are never a good idea, for instance, and the divisional breakdown showing serious crime versus “social work” is difficult to interpret visually (percentages of the whole would be easier to compare directly). The breakdowns of serious crimes and “social work”, meanwhile, should be ranked from most common down, with labelling used rather than colour.
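As a rough illustration of those two suggestions, here is a minimal Python sketch that converts raw counts into percentages of the whole and ranks categories from most to least common. The division names and counts are invented for the example and are not the GMP figures; the function names are hypothetical too.

```python
def percent_shares(counts):
    """Convert raw counts into percentages of the whole."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100 * value / total for name, value in counts.items()}


def ranked(counts):
    """Return (category, count) pairs, most common first."""
    return sorted(counts.items(), key=lambda pair: pair[1], reverse=True)


# Invented counts, for illustration only (not real police data).
divisions = {
    "Division A": {"serious crime": 30, "social work": 70},
    "Division B": {"serious crime": 12, "social work": 18},
}

for division, counts in divisions.items():
    shares = percent_shares(counts)
    for category, count in ranked(counts):
        print(f"{division}: {category} {shares[category]:.0f}%")
```

Because each division is expressed as a share of its own total, a busy division and a quiet one can be compared on the same scale.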

Head of Online Content Paul Gallagher says that it’s currently a manual exercise that requires a page refresh to see updated visuals. But he thinks “the real benefit of this will come afterwards when we can also plot the data over time”. Impressively, the newspaper plans to publish the raw data and will be bringing it to tomorrow’s Hacks and Hackers Hackday in Manchester.

More broadly, the MEN is to be commended for spotting this more substantial angle to what could easily be dismissed as a gimmick by the GMP. Although that doesn’t stop me enjoying the headlines in coverage elsewhere (shown below).

UPDATE: The data is also visualised as a word cloud and line chart at Data Driven.

Manchester police twitter headlines

Statistical analysis as journalism – Benford’s law

drug-related murder map

I’m always on the lookout for practical applications of statistical analysis for doing journalism, so this piece of work by Diego Valle-Jones, on drug-related murders, made me very happy.

I’ve heard of the first-digit law (also known as Benford’s law) before – it’s a way of spotting dodgy data.
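For anyone who wants to try the idea, here is a minimal Python sketch. Benford’s law says that in many naturally occurring sets of numbers the leading digit d appears with probability log10(1 + 1/d), so about 30% of values start with a 1. The sample counts below are invented purely for illustration, the function names are my own, and this is not a reconstruction of the analysis discussed in this post.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(n):
    """First digit of a non-zero integer."""
    return int(str(abs(n))[0])


def first_digit_shares(values):
    """Observed share of each leading digit among the non-zero values."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    total = len(digits)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Invented counts, for illustration only (not real data).
sample = [1, 12, 104, 19, 23, 2, 31, 45, 5, 67, 8, 110, 13, 17, 290]

observed = first_digit_shares(sample)
expected = benford_expected()
for d in range(1, 10):
    print(d, round(observed[d], 3), round(expected[d], 3))
```

A large gap between the observed and expected columns is a prompt to ask questions rather than proof of anything: the law works best on data spanning several orders of magnitude, and a sample this small is only there to show the mechanics.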

What Diego Valle-Jones has done is use the method to highlight discrepancies in information on drug-related murders in Mexico. Or, as Pete Warden explains:

“With the help of just Benford’s law and data sets to compare he’s able to demonstrate how the police are systematically hiding over a thousand murders a year in a single state, and that’s just in one small part of the article.”

Diego takes up the story:

“The police records and the vital statistics records are collected using different methodologies: vital statistics from the INEGI [the statistical agency of the Mexican government] are collected from death certificates and the police records from the SNSP are the number of police reports (“averiguaciones previas”) for the crime of murder—not the number of victims. For example, if there happened to occur a particular heinous crime in which 15 teens were massacred, but only one police report were filed, all the murders would be recorded in the database as one. But even taking this into account, the difference is too high.

“You could also argue that the data are provisional—at least for 2008—but missing over a thousand murders in Chihuahua makes the data useless at the state level. I could understand it if it was an undercount by 10%–15%, or if they had added a disclaimer saying the data for Chihuahua was from July, but none of that happened and it just looks like a clumsy way to lie. It’s a pity several media outlets and the UN homicide statistics used this data to report the homicide rate in Mexico is lower than it really is.”

But what brings the data alive is Diego’s knowledge of the issue. In one passage he checks against large massacres since 1994 to see if they were recorded in the database. One of them – the Acteal Massacre (“45 dead, December 22, 1997”) – is not there. This, he says, was “committed by paramilitary units with government backing against 45 Tzotzil Indians … According to the INEGI there were only 2 deaths during December 1997 in the municipality of Chenalho, where the massacre occurred. What a silly way to avoid recording homicides! Now it is just a question of which data is less corrupt.”

The post as a whole is well worth reading in full, both as a fascinating piece of journalism, and a fascinating use of a range of statistical methods. As Pete says, it is a wonder this guy doesn’t get more publicity for his work.

Andrew Marr fails to learn from his own history

“It is frightful that someone who is no one… can set any error into circulation with no thought of responsibility & with the aid of this dreadful disproportioned means of communication”

That’s not a quote from Andrew Marr, but Soren Kierkegaard writing about newspapers in the 19th century. Here’s another:

“I do not mean to be the slightest bit critical of TV newspeople, who do a superb job, considering that they operate under severe time constraints and have the intellectual depth of hamsters. But TV news can only present the “bare bones” of a story; it takes a newspaper, with its capability to present vast amounts of information, to render the story truly boring”

Strange that the author of one of the best histories of British journalism can fail to remember how each new platform for journalism has been greeted, and how fuzzy the concept of journalism is.

“Journalism includes drunks and dyslexics and some of the least trustworthy, wickedest people in the land … The reader doesn’t know who pretends to make the necessary phone calls, but never bothers; or that this one hates Tories and always writes them down.”

That’s a quote from Andrew Marr’s book. Here’s another:

“In a complicated, developed society, much of the most important finding out can only be done by people with narrower, sharper skills – microbiologists, meteorologists, opinion pollsters and market analysts, whose discoveries journalism simply passes on in a more popular (and generally distorted) form.”

Sounds like bloggers to me.

Marr doesn’t even need to look very far back. This fake debate was laid to rest years ago (is anyone really claiming that citizen journalism will entirely replace professional journalism? Or still trying to compare blogging – a technical process – with journalism – a cultural construct?). As I tweeted yesterday: the year 2005 called, Andrew. They want their prejudices back.

Meanwhile, Channel 4 journalist Krishnan Guru-Murthy has written eloquently in defence of bloggers and the need to engage through social media.

Revisiting Rodolfo Walsh, father of Argentinian non fiction

For Argentinians like me, it was Rodolfo Walsh – and not Truman Capote, who published In Cold Blood almost a decade later – who invented non-fiction journalism with his famous 1957 book Operación Masacre, a masterpiece of investigative journalism.

Twenty years later, on the first anniversary of Jorge Rafael Videla’s dictatorship, he was intercepted by soldiers and murdered, and his remains vanished. He became a “desaparecido” just after delivering his Open Letter from a Writer to the Military Junta (Carta Abierta de un Escritor a la Junta Militar) to Argentine newspapers and correspondents at foreign media organizations.

To commemorate his work, Alvaro Liuzzi is starting a “journalistic experiment” called Proyecto Walsh searching for an answer to an interesting question: “What would have happened if, for the research of Operacion Masacre, Rodolfo Walsh had had access to the digital tools we have today?”.

The Twitter user @rodolfowalsh is the first step of Proyecto Walsh, which will try to create a digital ecosystem in order to gather all of the research that Rodolfo accomplished 54 years ago, and remix it using the journalistic tools of today.

Local newspaper data journalism – school admissions in Birmingham

data journalism at the Birmingham Mail - school admissions data

The Birmingham Mail has been trying its hand at data journalism with school admissions data. It’s a good place to start – the topic attracts a lot of interest (and so justifies the investment of time) while people tend to be interested in more than just who finishes top and bottom of the tables (justifying the choice of medium).

The results are impressive. Applications data is plotted on a Google map on the main page, while an “interactive chart” page allows you to compare schools across various criteria, and also narrow the sample by selecting from two drop down menus (town and school).

The charts were made in Tableau, and include a download link at the bottom. However, you need Tableau itself (free, but PC only) to open the downloaded file.

A further page features links to tables for each area. Sadly, the pages containing tables do not contain any link to the raw data. This presents an extra hurdle to users – although you can scrape the table into a Google spreadsheet using the =import formula. If you want to see how, here’s a spreadsheet I created from the data by doing just that. Click on the first cell to see the formula that generates it.
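If you would rather do the same thing outside a spreadsheet, here is a minimal Python sketch of the idea, using only the standard library. It assumes the page holds a simple HTML table; the markup, school names and numbers below are invented for illustration, and the TableParser class is my own hypothetical helper, not something taken from the Mail’s pages. (In Google Sheets, the IMPORTHTML function does a similar job.)

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Invented markup standing in for a downloaded page.
html = """<table>
<tr><th>School</th><th>Places</th><th>First choices</th></tr>
<tr><td>Example School A</td><td>120</td><td>300</td></tr>
<tr><td>Example School B</td><td>90</td><td>60</td></tr>
</table>"""

parser = TableParser()
parser.feed(html)
rows = parser.rows

for row in rows:
    print(", ".join(row))
```

The comma-separated lines it prints can be pasted into a spreadsheet; for a real page you would first download the HTML and pass that string to feed().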

I asked David Higgerson, Trinity Mirror’s Head of Multimedia and the man whose name appears on the Tableau data, to explain the process behind the project. It seems the information was a combination of freely available data and that acquired via FOI.

“The Mail took the data available – number of places available, number of first choice applicants and number of total applicants – and worked out a ratio of first choice applicants per place. This is relevant to parents because councils try to allocate places to children based on preference once they’ve decided which schools a child is eligible for. Eligibility varies depending on type of school.

“The figures showed how popular faith schools were, and also how fierce competition was for places at grammar schools. That’s the story which generated most interest.

“As you’ve said on your blog, the hardest part was making the data uniform, and then making it relevant to readers.

“In print, it ran across three days. Day one was grammar schools, day two was all schools and day three revealed the catchment areas for oversubscribed schools which use distance from school to fill their last few places.

“Online, Google Fusion was used to create maps, Tableau for the interactive chart which lets people choose based on town or school, and Tableizer for the quick tables which appear in the section too. We also had a play with Scribble Maps, which we think has real potential for print/online newsrooms.”
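The ratio David describes, first choice applicants per place, is simple to reproduce. Here is a minimal Python sketch under stated assumptions: the school names and figures are invented, and the function name is hypothetical rather than anything the Mail used.

```python
def first_choice_ratio(places, first_choices):
    """First choice applicants per available place."""
    if places <= 0:
        raise ValueError("places must be positive")
    return first_choices / places


# Invented figures, for illustration only: (school, places, first choices).
schools = [
    ("Example Grammar", 120, 300),
    ("Example Faith School", 60, 90),
    ("Example Community School", 200, 150),
]

# Rank schools from most to least oversubscribed on first choices.
ranked = sorted(
    ((name, first_choice_ratio(places, firsts)) for name, places, firsts in schools),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, ratio in ranked:
    print(f"{name}: {ratio:.2f} first choices per place")
```

A ratio above 1 means more first choice applicants than places, which is the kind of competition the grammar school story picked up on.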

It seems education reporter Kat Keogh deserves the credit for spotting the stories in the data, “with the usual support you’d expect in the newsroom – newsdesk etc.”

David and Anna Jeys experimented with the online presentation and others laid out the data for print.

BBC new linking guidelines issued – science journals mentioned

The BBC have just emailed new linking guidelines to their staff. They stipulate that linking is “essential” to online journalism and in one slide (it’s a PowerPoint document) titled ‘If you remember nothing else’ highlight how linking will change:

What we used to do…

  • Lists of archive news stories
  • Homepages only on external websites
  • No inline linking in news stories

What we do now – think adding value…

  • Avoid news stories and link to useful stuff – analysis, explainers, Q&As, pic galleries etc
  • On external websites look beyond homepage to pages of specific relevance
  • Inline linking in news stories is OK when it’s to a primary source

Other points of note in the document include the repeated emphasis on useful deep linking, and the importance of the newstracker module (which links to coverage on other news sites). Curiously, when referring to inline links it does say that “different rules can apply” to BBC blogs – “speak to blogs team if in doubt”.

Something I did look for – and find – was a reference to linking to scientific journals. And here it is: “In news stories inline links must go to primary sources only – eg scientific journal article or policy report (1 or 2 per story; avoid intro)”

This is significant given the previous campaigning on this issue.

On the whole it’s a good set of guidance – I’ll refrain from publishing it in hope that the BBC will…

UPDATE: It seems The Guardian followed up the story and embedded the document, so here it is:

BBC guidelines for linking – Sept 2010