When you’ve converted data from a PDF to a spreadsheet, it’s not uncommon for text to end up split across multiple rows. In this post I’ll explain how you can use Open Refine to quickly clean the data up so that the text is put back together and you have a single row for each entry. Continue reading
Monthly Archives: May 2014
This simple piece of visualisation will have you rethinking what you know about impact and mobile
It’s not often I encounter a piece of data journalism which solves a common problem in the field – and it’s even rarer to find a piece of work which tackles two.
But that’s just what the lean data journalism site Ampp3d did last week when it published a piece of visualisation on the deaths of construction workers in Qatar.
The two problems? Creating impact on mobile – and making big numbers meaningful. Continue reading
Useful sources of health data – on Help Me Investigate
Last month I spoke to some health reporters from a national broadcaster about my favourite sources of health data. As part of that I wrote a post on Help Me Investigate. I’m cross-posting it here this month as I talk about sources of data on the Data Journalism MOOC:
Continue reading
After reverse publishing, it’s time to consider the ‘reverse subscription model’
What is a reverse subscription model? Cedric Motte looks at what happens when you see digital as the heart of your distribution marketing and the paper as a commodity. This post was originally published in French on NewsResources.
Digital as the heart of the subscription marketing plan?
“Reverse publishing” first hit newsroom organisations some years ago (although many media organisations didn’t switch and instead “just” added another digital newsroom downstairs).
The idea is simple: because of mobile penetration and generous data plans, readers use their mobile (and their tablet at home) to access news throughout the day. So a newsroom has to think about releasing news digitally first to keep up with the tempo of news. The paper comes later. Continue reading
Finding Stories in Spreadsheets – ebook now live!
My latest ebook – Finding Stories in Spreadsheets – is now live on Leanpub.
As with Scraping for Journalists, I’m publishing the book week-by-week so the book can be updated based on reader feedback, user suggestions and topical developments.
Each week you can download a new chapter covering a different technique for finding stories, from calculating proportions and changes, to combining data, cleaning it up, testing it, and extracting specific details.
There’s also a downloadable spreadsheet at the end of each chapter with a series of exercises to practise that chapter’s technique and find particular stories.
Along the way I tackle some other considerations in telling the story, such as context and background, and the importance of being specific in the language that you use.
If there’s anything you’d like covered in the book let me know. You can also buy the book in a ‘bundle’ with its sister title Data Journalism Heist, which covers quick-turnaround techniques for finding stories in spreadsheets using pivot tables and advanced filters.
Hyperlocal media and engagement with political parties: what’s been your experience?
@BrownhillsBob @paulbradshaw we only got 4 / 7 local candidates respond to questionnaires ( and @wolvesonwheels 17 /78 across the city)
— WV11.co.uk (@WV11) May 21, 2014
One of my abiding memories of the 1997 General Election involves bumping into a candidate from one of the major parties in a beer cellar. The candidate was supposed to have been on air at the time, participating in a live hustings for the small local radio station I was working for.
During a short conversation with them it quickly became clear that they felt an informal meet and greet with a bunch of bemused students was a better use of their time.
That was until I gently nudged them in the direction of the nearest cab…
A decade and a half later I hoped this sort of incident was a thing of the past. But is that the case? Continue reading
Is there a ‘canon’ of data journalism? Comment call!
Looking across the comments in the first discussion of the EJC’s data journalism MOOC it struck me that some pieces of work in the field come up again and again. I thought I’d pull those together quickly here and ask: is this the beginnings of a ‘canon’ in data journalism? And what should such a canon include? Stick with me past the first obvious examples…
Early data vis
These examples of early data visualisation are so well-known now that one book proposal I recently saw specified that it would not talk about them. I’m talking of course about… Continue reading
Journalisme et code : 10 grands principes de programmation expliqués
Cedric Motte asked if he could translate Coding for journalists: 10 programming concepts it helps to understand into French. Here’s the result – first published on NewsResources.
If you’re thinking of getting into programming, chances are you’ll stumble over a series of technical terms, a jargon that can be particularly off-putting, especially in tutorials whose authors tend to forget that you are new to programming.
The sections that follow describe and point to ten concepts that you are likely to – no, that you will – encounter. Continue reading
Coding for journalists: 10 programming concepts it helps to understand
If you’re looking to get into coding, chances are you’ll stumble across a raft of jargon which can be off-putting, especially in tutorials which are oblivious to your lack of previous programming experience. Here, then, are 10 concepts you’re likely to come across – and what they mean.
1. Variables

Variables are like boxes which can hold different things at different times. Image by Wolfgang Lonien.
A variable is one of the most basic elements of programming. It is, in a nutshell, a way of referring to something so that you can use it in a line of code. To give some examples:
- You might create a variable to store a person’s age and call it ‘age’
- You might create a variable to store the user’s name and call it ‘username’
- You might create a variable to count how many times something has happened and call it ‘counter’
- You might create a variable to store something’s position and call it ‘index’
Variables can be changed, which is their real power. A user’s name will likely be different every time one piece of code runs. An age can be added to at a particular time of year. A counter can increase by one every time something happens. A list of items can have other items added to it, or removed. Continue reading
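To make the idea concrete, here is a minimal sketch in Python (the post itself doesn’t specify a language, so that choice is an assumption). The variable names come from the examples above; the values and the ‘items’ list are made up purely for illustration.

```python
# All values below are hypothetical, chosen only to illustrate the idea.
age = 34                      # a person's age
username = "alice"            # the user's name
counter = 0                   # how many times something has happened
index = 2                     # something's position
items = ["budget", "crime"]   # a list of items

# Variables can be changed:
age = age + 1                 # a birthday: the age is added to
counter += 1                  # something happened once...
counter += 1                  # ...and then again
items.append("health")        # add an item to the list
items.remove("crime")         # remove an item from the list
username = "bob"              # a different user the next time the code runs

print(age, username, counter, index, items)
```

Each name keeps pointing at ‘the current value’, so the rest of the code can use age or counter without needing to know what is stored there at that moment.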
Hyperlocal Voices: Jamie Summerfield, A Little Bit of Stone
It’s been a little while since we had a new entry in our Hyperlocal Voices series (where we interview hyperlocal practitioners about their experiences). To kick off our efforts for 2014, Damian Radcliffe touches base with Jamie Summerfield, to talk about A Little Bit of Stone, a community news website for Stone in Staffordshire.
Who were the people behind the blog?
I set up A Little Bit of Stone in August 2010 and was joined a month later by Jon Cook.
We quickly set up a partnership, me doing editorial and Jon looking after web and technical matters. Continue reading