5 principles of data management – for both analytics and data journalism

Whether you’re working with analytics data on your site or data for a story, it strikes me that certain principles apply to both.

At the PPA’s Digital Publishing Conference recently I talked about 5 of those. Here’s the rundown:

1. Data is only as good as the person asking the questions

A scraped dataset tells you little on its own. Likewise, the default dashboard of an analytics package rarely provides any particularly deep insights. What matters are the questions you ask of them.

An analytics dashboard rarely provides any genuine insights into your site – you need to ask the right questions

In data journalism, those questions might be ‘Is the data reliable?’ or ‘Is a passage of text appearing more than once?’ or ‘Does this match up to promises made or the story being spun?’

In analytics, the questions might be ‘How many people never complete the registration process?’ or ‘What types of users are likely to spend longest on an article?’ or ‘What types of content lead users to click through to further material?’
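To make the first of those analytics questions concrete, here is a minimal sketch of how you might answer it from an exported event log. The event names (`start_registration`, `complete_registration`), the user IDs and the `registration_dropoff` helper are all hypothetical; a real analytics export will have its own field names and format.

```python
from collections import defaultdict

# Hypothetical event log: (user_id, step) pairs exported from an analytics tool.
events = [
    ("u1", "start_registration"),
    ("u1", "complete_registration"),
    ("u2", "start_registration"),
    ("u3", "start_registration"),
    ("u3", "complete_registration"),
    ("u4", "start_registration"),
]


def registration_dropoff(events):
    """Return (started, abandoned, abandonment_rate) for the registration funnel."""
    steps_by_user = defaultdict(set)
    for user_id, step in events:
        steps_by_user[user_id].add(step)

    # Users who began registering, and the subset who finished.
    started = {u for u, steps in steps_by_user.items() if "start_registration" in steps}
    completed = {u for u in started if "complete_registration" in steps_by_user[u]}
    abandoned = started - completed

    rate = len(abandoned) / len(started) if started else 0.0
    return len(started), len(abandoned), rate


started, abandoned, rate = registration_dropoff(events)
print(f"{abandoned} of {started} users ({rate:.0%}) never completed registration")
```

The point is less the code than the question: the dashboard will not volunteer this number unless you go looking for it.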

2. Data can save time and money

I’ve written before about the myth that data journalism is resource-intensive. Done well, using data can save time on tasks that have previously been done manually, and it does not have to cost money or rely on having a team of developers.

It can also be used to turn around reporting more quickly: you might prepare for a big event by setting up spreadsheet formulae, feeds or triggers in advance, for example.

Once again, the same applies to analytics. You might use automation techniques to scrape external data (or data from your own site which still has to be scraped) neatly into a spreadsheet, or create alerts for particular activity on the site. You might pull data from an analytics package into a spreadsheet using the Google Analytics API. You might visualise that dynamically with a Google Gadget which anyone in the company can embed on their own dashboard. All of this saves time.
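As an illustration of the alerts idea, here is a rough sketch that flags unusual spikes in hourly page views. It assumes you already have the counts as a plain list (for example, exported from your analytics package); the `traffic_alerts` function, the window size and the threshold are illustrative choices of mine, not a feature of any particular tool.

```python
from statistics import mean, stdev


def traffic_alerts(hourly_views, window=6, threshold=3.0):
    """Return indices of hours whose views exceed the trailing mean
    by more than `threshold` standard deviations."""
    alerts = []
    for i in range(window, len(hourly_views)):
        history = hourly_views[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        # A perfectly flat history has zero spread, so it never triggers an alert.
        if spread > 0 and hourly_views[i] > baseline + threshold * spread:
            alerts.append(i)
    return alerts


# Made-up hourly page views with one obvious spike.
views = [100, 104, 98, 101, 97, 103, 99, 250, 102]
print(traffic_alerts(views))
```

In practice you would tune the window and threshold to your own traffic patterns, and send the result somewhere people will actually see it.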

3. Data is about people

Stories – whether that’s a story about defence spending or website traffic – are told about people and to people. The data is just a means to an end.

Find the people that the data is telling a story about. In a piece of journalism, that might be a case study to illustrate the real-world impact of that data. In an analytics report you might use the data to create profiles of typical users, which helps others to better shape their journalism for that audience.

4. Good data is social, sticky and useful

Journalism is pointless if it doesn’t reach anyone, and both data journalism and analytics have particular strengths on that front, when done well.

Is your data social? (For example, because you’ve made the raw data available for others to visualise, or created a tool that people can interact with together, or because the visualisation itself can be shared)

Is your data sticky? (For example, because people can explore it in detail, or because the report is rich and varied)

Is your data useful? (For example, because it helps someone make a decision, or understand their place in the world, or because they can act on it with a particular tool)

5. You can be driven by the data or driven by the story

The final principle is really about how proactive you want to be in your involvement with data. You might prefer to lie back and let the data come to you – from a dashboard, analytics reports, or statistical releases – but the really interesting data comes when you seek it out, because you’re driven by a question.

A/B testing of headlines, for example, allows you to create your own data around which headlines work best (there’s even a WordPress plugin for that). Or you can try an experiment where you combine ‘traffic whoring’ with deeper reporting, and see how they work together, or follow and compare the impact of a new site feature.
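If you do run a headline test, it helps to check whether the difference you see is bigger than chance alone would produce. Below is a minimal sketch using a standard two-proportion z-test; the `compare_headlines` function and the click and view counts are made up for illustration, and a plugin or analytics tool may well do this calculation for you in its own way.

```python
import math


def compare_headlines(clicks_a, views_a, clicks_b, views_b):
    """Two-proportion z-test on click-through rates.

    Returns (rate_a, rate_b, z, p_value), where p_value is two-sided.
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b

    # Pooled rate under the assumption that both headlines perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))

    z = (rate_b - rate_a) / se if se > 0 else 0.0
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical results: headline A vs headline B, 2,000 views each.
rate_a, rate_b, z, p = compare_headlines(120, 2000, 156, 2000)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}  p = {p:.3f}")
```

A small p-value suggests the gap between the two headlines is unlikely to be noise, but it says nothing about why one worked better. That still takes a journalist's question.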

And in data journalism the most interesting stories often come not from the flow of official government data, but from leaks you get through developing relationships with contacts, asking for something under freedom of information, EIR or data protection laws, campaigning to get a decision on releasing something specific, crowdsourcing or collecting your own data, or scraping datasets.

Nothing new under the sun

All of these principles are merely journalistic principles renamed. So:

  • Journalism is only as good as the person asking the questions
  • Journalism can save time and money
  • Journalism is about people
  • Good journalism is social, sticky and useful
  • You can be driven by the source or the story

Any other examples you can think of?
