Earlier this year I spoke at the BBC’s Data Fusion Day (you can find a liveblog of the event on Help Me Investigate) about data journalism workflows. The presentation slides are embedded below (the title is firmly tongue-in-cheek), but I thought I’d explain a bit more in a series of posts – beginning here.
Data journalism workflow 1: Set up data newswires
Most newsrooms take a newswire of some sort – national and international news from organisations like the Press Association, Reuters, and Associated Press.
Data journalism is no exception. If you want to find stories in data, it helps to know what data is coming out, and when.
You can find data newswires in all sorts of places. The most obvious is a national statistics agency like the ONS (Office for National Statistics) in the UK, which announces new data releases in advance on its media centre page, and also has a release calendar.
There are also regional statistics bodies, often called observatories (you can find health observatories here); and international statistics bodies like the EU’s Eurostat or the World Health Organisation’s Global Health Observatory.
Local, regional and national government bodies and departments are another useful source. Statistics for central government are currently being centralised: use the filters and search facilities on the Publications page on Gov.uk, but also try departments’ own sites, as not all data is in the same place.
Other sources of data leads
Regulators and auditing bodies, charities and nonprofits, professional bodies and unions, political parties and thinktanks all regularly collect data. If they won’t let you look at the original data, or provide details on the methods used to gather it, then you should be sceptical and report that. Commercial research companies collect data for a range of clients too.
Academic institutions and journals are another useful source of possible data alerts. Even if the data isn’t published, or you don’t have access to the full journal, the alert will still give you an author name that you can chase up for more details.
The final two areas to look for data news are corporations and open data initiatives. Corporations gather data on their customers, market and performance. Some of this will be published in annual reports (Duedil provides alerts, but also check the company’s own site), but they are likely to be talking about other information in press releases and specialist trade press coverage.
In the field of open data, OpenSpending opens up spending data from dozens of countries around the world. OpenCorporates has data on millions of companies in jurisdictions from Abu Dhabi to Tanzania, and new projects are springing up all the time.
Data journalism workflow 2: Be your own librarian
Once you have your data newswires, you will start seeing datasets which aren’t necessarily newsworthy but which you think might come in useful later. While pursuing other stories you might also stumble across datasets that you expect to use again and again.
For example, you might find a dataset with the population of every local authority – this will be useful for any future story where you want to turn absolute numbers into the rate per person.
Or you might find a dataset which gives the addresses of every GP surgery, or hospital, or school in a particular region or country. That’s going to enable you to map those in a future story.
Unless you want to have to search for those things all over again – against a deadline – a good habit to develop is bookmarking that data effectively.
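To make the population example above concrete, here is a minimal Python sketch of turning absolute numbers into a rate. The area names and figures are invented purely for illustration, and expressing the rate per 100,000 people (rather than per single person) is my assumption, chosen because it gives more readable numbers.

```python
def rate_per_100k(count, population):
    """Return the number of incidents per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Hypothetical figures, made up for illustration only
areas = {
    "Area A": {"incidents": 450, "population": 300_000},
    "Area B": {"incidents": 200, "population": 80_000},
}

# Combine the story dataset (incidents) with the bookmarked one (population)
rates = {
    name: rate_per_100k(d["incidents"], d["population"])
    for name, d in areas.items()
}

# List areas from the highest rate to the lowest
for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {rate:.1f} per 100,000 people")
```

In this made-up case Area A has more incidents in absolute terms, but Area B has the higher rate once population is taken into account, which is exactly why a saved population dataset earns its place in your bookmarks.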
Social bookmarking tools like Delicious and Pinboard are ideal for this. Unlike bookmarks saved in your browser, you can access them from any computer, tablet or mobile phone. And, also unlike traditional bookmarks, you can file them in multiple ‘folders’, using tags.
If you’re willing to pay a small fee, you can even use Pinboard to save a copy of bookmarked webpages which might later disappear.
How to bookmark
When you use a social bookmarking tool you ‘bookmark’ (store a link to) webpages or online documents as you come across them, adding tags to help describe them.
For example, you might tag one dataset with the words
- ‘LA’ (for local authorities), and
You might tag another:
- ‘gp’ and
This tagging makes it much easier to find the data later when you need it. If you need data on housing you might use your social bookmarking account to search for bookmarks tagged with both ‘housing’ and ‘data’.
You can often create URLs that link to these too – on Delicious for example your bookmarks for housing data might be at:
On Pinboard they would be at:
(If that sounds too technical for you, just revert to the search facility.)
That also makes it easy to share links or background research with other people – although you can also make individual bookmarks private if you prefer.
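If it helps to see why tags behave like multiple ‘folders’, the search for bookmarks tagged with both ‘housing’ and ‘data’ can be sketched in a few lines of Python. This is only an illustration of the idea, not how Delicious or Pinboard actually work: the bookmark titles, URLs and the `find_by_tags` helper are all hypothetical.

```python
# Hypothetical bookmarks: titles, URLs and tags are invented for illustration
bookmarks = [
    {
        "title": "Population by local authority",
        "url": "https://example.org/population",
        "tags": {"la", "population", "data"},
    },
    {
        "title": "Housing completions",
        "url": "https://example.org/housing-completions",
        "tags": {"housing", "data", "la"},
    },
    {
        "title": "Housing policy explainer",
        "url": "https://example.org/housing-explainer",
        "tags": {"housing", "background"},
    },
]


def find_by_tags(items, *wanted):
    """Return the bookmarks that carry every one of the wanted tags."""
    wanted_set = {tag.lower() for tag in wanted}
    # A bookmark matches when the wanted tags are a subset of its own tags
    return [item for item in items if wanted_set <= item["tags"]]


matches = find_by_tags(bookmarks, "housing", "data")
for item in matches:
    print(item["title"], "-", item["url"])
```

Because each bookmark holds a set of tags rather than living in one folder, the same dataset can turn up under ‘housing’, ‘data’ and ‘la’ without being saved three times.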
Those are just my tips – if you have any to add please let me know!