Monthly Archives: May 2014

How to: combine multiple rows in a dataset where text is split across them (Open Refine)

When you’ve converted data from a PDF to a spreadsheet, it’s not uncommon for text to end up split across multiple rows. In this post I’ll explain how you can use Open Refine to quickly clean the data up so that the text is put back together and you have a single row for each entry.

This simple piece of visualisation will have you rethinking what you know about impact and mobile

It’s not often I encounter a piece of data journalism which solves a common problem in the field – and it’s even rarer to find a piece of work which tackles two.

But that’s just what the lean data journalism site Ampp3d did last week when it published a piece of visualisation on the deaths of construction workers in Qatar.

The two problems? Creating impact on mobile – and making big numbers meaningful.

Useful sources of health data – on Help Me Investigate

Last month I spoke to some health reporters from a national broadcaster about my favourite sources of health data. As part of that I wrote a post on Help Me Investigate. I’m cross-posting it here this month as I talk about sources of data on the Data Journalism MOOC.

After reverse publishing, it’s time to consider the ‘reverse subscription model’

What is a reverse subscription model? Cedric Motte looks at what happens when you see digital as the heart of your distribution marketing and the paper as a commodity. This post was originally published in French on NewsResources.

Digital as the heart of the subscription marketing plan?

“Reverse publishing” first hit newsroom organisations some years ago (although many media outlets didn’t switch and instead “just” added another digital newsroom downstairs).

The idea is simple: because of mobile penetration and generous data plans, readers use their mobiles (and tablets at home) to access news throughout the day. So a newsroom has to think about releasing news digitally to keep up with the tempo of news. The paper comes later.

Finding Stories in Spreadsheets – ebook now live!

Cover design by Matt Buck/Drawnalism

My latest ebook – Finding Stories in Spreadsheets – is now live on Leanpub.

As with Scraping for Journalists, I’m publishing the book week-by-week so that it can be updated based on reader feedback, user suggestions and topical developments.

Each week you can download a new chapter covering a different technique for finding stories, from calculating proportions and changes, to combining data, cleaning it up, testing it, and extracting specific details.

There’s also a downloadable spreadsheet at the end of each chapter with a series of exercises to practise that chapter’s technique and find particular stories.

Along the way I tackle some other considerations in telling the story, such as context and background, and the importance of being specific in the language that you use.

If there’s anything you’d like covered in the book let me know. You can also buy the book in a ‘bundle’ with its sister title Data Journalism Heist, which covers quick-turnaround techniques for finding stories in spreadsheets using pivot tables and advanced filters.

Hyperlocal media and engagement with political parties: what’s been your experience?

One of my abiding memories of the 1997 General Election involves bumping into a candidate from one of the major parties in a beer cellar. The candidate was supposed to have been on air at the time, participating in a live hustings for the small local radio station I was working for.

During a short conversation with them it quickly became clear that they felt an informal meet and greet with a bunch of bemused students was a better use of their time.

That was until I gently nudged him in the direction of the nearest cab…

A decade and a half later I hoped this sort of incident was a thing of the past. But is that the case?

Is there a ‘canon’ of data journalism? Comment call!

Looking across the comments in the first discussion of the EJC’s data journalism MOOC it struck me that some pieces of work in the field come up again and again. I thought I’d pull those together quickly here and ask: is this the beginnings of a ‘canon’ in data journalism? And what should such a canon include? Stick with me past the first obvious examples…

Early data vis

These examples of early data visualisation are so well-known now that one book proposal I recently saw specified that it would not talk about them. I’m talking of course about…