Following my previous posts on the network journalist and community manager roles as part of an investigation team, this post expands on the first steps a student journalist can take in filling the data journalist role.
1. Brainstorm data that might be relevant to your investigation or field
Before you begin digging for data, it’s worth mapping out the territory you’re working in. Some key questions to ask include:
- Who measures or monitors your field? For example:
- regulators and inspectors
- charities (try searching by keyword on the Charity Commission or OpenCharities)
- campaigning groups
- central government (the department and/or agency responsible, e.g. Ministry of Justice, BIS, etc. – there may also be specific ministers)
- local government (local authorities or primary care trusts, police force, etc.)
- select committees (browse parliamentary research indexed here, or try a specific search)
- general statistical/audit bodies such as ONS or the Audit Commission.
- Where is spending recorded? This might be at both a local and national level.
- What are the key things that might be measured in your field? For example, in prisons the measures might include reoffending, overcrowding or staffing.
- Can you find historical data?
- What data do you need to provide basic context? e.g.
- Where – addresses for all institutions in your field (e.g. schools, prisons, etc.)
- Codes – often these are used instead of institution or area names
- Who – names of those responsible for particular aspects of your field
- Demographics – the distribution of age, gender, ethnicity, industries, wealth, property or other elements may be important to your work
- Politics – who is in charge in each area (local authority and local MP)
- How could you collate data that doesn’t exist? E.g. public awareness of something; or how the policies of different bodies compare, etc.
Sometimes the simplest and quickest way to find out these things is to pick up the phone and speak to someone in a relevant organisation and ask them: what information is collected about your field, and by whom?
You can also make content from this process of research: post a guide to how your field is regulated and measured (and what information isn’t collected), or a who’s who in your field: the regulators, monitors, politicians and bodies that all have a hand in keeping it on track.
2. Learn advanced techniques to obtain that data
Once you’ve mapped it all out you can start to prioritise the datasets that are most relevant to your particular investigation. You may need to use different techniques to get hold of these, including:
- Advanced search techniques (limit by filetype:, site:, etc.)
- Simply picking up the phone to call the relevant department (try to get as much detailed data as possible rather than aggregate, i.e. very general, figures)
- Using FOI requests
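The advanced search operators mentioned above can be combined in a small helper. This is only a minimal sketch: the keywords, the gov.uk domain and the xls file type are illustrative, the function names are hypothetical, and the Google search URL is an assumption (any search engine that supports site: and filetype: would do).

```python
from urllib.parse import quote_plus


def build_query(keywords, site=None, filetype=None):
    """Join keywords with optional site: and filetype: operators."""
    parts = [keywords]
    if site:
        parts.append(f"site:{site}")          # limit results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # limit results to one file type
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search?q="):
    """Turn a query into a URL (the base URL is an assumption)."""
    return base + quote_plus(query)


# Illustrative example: spreadsheets about prison overcrowding on gov.uk
query = build_query("prison overcrowding", site="gov.uk", filetype="xls")
print(query)
print(search_url(query))
```

Searching for spreadsheet file types on an official domain is often a quick way to check whether a dataset has already been published before you phone the department or send an FOI request.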
Again, you can make content from this process, for example: “How we found…” or “Why we’re asking the MoJ for…” (with a link to the FOI request) or “Get the data” (here’s how to publish data online)
The flow chart below (from this previous post) helps guide you to the relevant techniques for your data:
3. Pull out the parts of data relevant to your field/investigation
For example:
- If the data covers every region, pull out the parts that apply to your locality, and look at how they compare with other areas (space) or with previous data (time)
- Look at the particular issue(s) that interests you in the data, e.g. a particular crime out of many, or a particular indicator. How does that compare across space (regions) or time?
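As a rough sketch of what pulling out the relevant parts can look like in practice, the Python below filters a small table down to one locality, then compares it across time and space. The figures, column names and function names are invented purely for illustration; a real dataset would be loaded from the file you obtained in step 2.

```python
import csv
import io

# Invented figures for illustration only; these are not real statistics.
SAMPLE_CSV = """region,year,burglaries
Birmingham,2010,120
Birmingham,2011,150
Leeds,2010,100
Leeds,2011,90
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    row["year"] = int(row["year"])
    row["burglaries"] = int(row["burglaries"])


def for_region(rows, region):
    """Pull out only the rows that apply to one locality."""
    return [r for r in rows if r["region"] == region]


def percent_change(rows, region, start, end):
    """Compare one locality across time."""
    by_year = {r["year"]: r["burglaries"] for r in for_region(rows, region)}
    return (by_year[end] - by_year[start]) / by_year[start] * 100


def compare_regions(rows, year):
    """Compare all localities across space for a single year."""
    return {r["region"]: r["burglaries"] for r in rows if r["year"] == year}


local = for_region(rows, "Birmingham")
change = percent_change(rows, "Birmingham", 2010, 2011)
latest = compare_regions(rows, 2011)
print(local, change, latest)
```

The same three questions (what is local, how has it changed, how does it compare elsewhere) can of course be answered with spreadsheet filters and formulas instead of code.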
4. Add value to the data
Here are just some suggestions. You can use one or many:
- Combine datasets – e.g. one may have school ratings; another may have the addresses of all schools, or their local authority
- Convert data – this amounts to much the same thing as combining datasets: for example, postcodes are more useful when converted into lat/long coordinates (likewise eastings and northings)
- Find out how the data was collected and/or measured (put simply: pick up the phone and ask)
- Get an independent expert perspective on the data
- Compare the data with official claims or spin – does it really back those claims up?
- Compare the data with reports from elsewhere – is anything missing?
- Unpick jargon and definitions (here’s an example of James Ball unpicking different work experience schemes)
- Add a search and filter interface
Any of these provide useful opportunities for posting new content with the new contextual information (e.g. “How the data on X was gathered”) or new combined data (“Now with QOF data”) or the issues that they raise (“Why schools data may be worthless”).
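To make the "combine datasets" and "search and filter" suggestions above more concrete, here is a minimal sketch in Python. It assumes both datasets share an institution code (the school codes, ratings and local authority names are all invented, and the function names are hypothetical), matches rows on that code, and then filters the combined result.

```python
# Invented example data: one dataset has ratings, the other has locations.
ratings = [
    {"school_code": "A001", "rating": "Good"},
    {"school_code": "A002", "rating": "Outstanding"},
    {"school_code": "A003", "rating": "Inadequate"},
]
addresses = [
    {"school_code": "A001", "local_authority": "Northtown"},
    {"school_code": "A002", "local_authority": "Southtown"},
]


def combine(left, right, key):
    """Merge rows that share the same code; report codes with no match."""
    lookup = {row[key]: row for row in right}
    combined, unmatched = [], []
    for row in left:
        match = lookup.get(row[key])
        if match is None:
            unmatched.append(row[key])  # worth chasing: why is it missing?
        else:
            combined.append({**row, **match})
    return combined, unmatched


def filter_rows(rows, **criteria):
    """A very basic filter: keep rows where every given field matches."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]


combined, unmatched = combine(ratings, addresses, "school_code")
print(combined)
print(unmatched)
print(filter_rows(combined, local_authority="Southtown"))
```

Keeping a list of unmatched codes is a deliberate choice here: institutions that appear in one dataset but not the other are often a question worth putting to whoever collected the data.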
5. Communicate the story in the data
I’ve written separately about the different ways of communicating data stories, so you can read that here. In short, human case studies are helpful, and visualisation is often useful.
And it’s at this point that you can also link to the further detail provided in all the content you’ve written in the previous four steps: how you got the data, the wider context, the specific data that’s of interest, the more detailed expert analysis or background, and so on.