3 concepts from archive studies that every data journalist should know

Until last month I hadn’t heard of diplomatic studies. It’s the discipline of studying historical documents, and comes from the word ‘diploma’, as in ‘verifying that someone hasn’t faked their records’ (I’m paraphrasing here). But this discipline of verification has some useful lessons for journalists — particularly data journalists — because it provides a very handy framework for picking apart what makes a record (data) credible, and what we should be looking out for when establishing that.

Particularly useful are three terms that are used to distinguish different aspects of a record’s credibility: authenticity; reliability; and accuracy.

Luciana Duranti’s paper on electronic records (PDF) defines each of the three concepts in depth and, although she notes that the terms are given different meanings in different sectors, each is worth exploring in detail…

GEN Summit: AI’s breakthrough year in publishing

This week’s GEN Summit marked a breakthrough moment for artificial intelligence (AI) in the media industry. The topic dominated the agenda of the first two days of the conference, from the opening keynote by Facebook’s Antoine Bordes to sessions on voice AI, bots, monetisation and verification, and it dominated my timeline too.

At times it felt like being at a conference in the 1980s discussing how ‘computers’ could be used in the newsroom, or listening to people talking about the use of mobile phones for journalism in the noughties — in other words, it feels very much like early days. But important days nonetheless.

Ludovic Blecher’s slide on the AI-related projects that received Google Digital News Initiative funding illustrated the problem best, with proposals counted in categories as specific as ‘personalisation’ and as vague as ‘hyperlocal’.

Digging deeper, then, here are some of the most concrete points I took away from Lisbon — and what journalists and publishers can take from those.

This is what I learned after teaching chatbots to journalists: 3 takeaways for newsrooms

In a guest post for OJB, Maria Crosas points out three main takeaways that newsrooms should consider when aiming for a complete chatbot experience.

Over the past year I’ve been frequently invited to share ideas on how bots can help newsrooms to deliver news, and to give advice on how to build engaging chatbot experiences. Throughout these classes I’ve also been asked challenging questions about how these technologies are pushing the boundaries of ethics, artificial intelligence and storytelling.

I’ve boiled down these experiences into 3 takeaways for newsrooms that want to begin the chatbot journey. Here they are…

This Twitter hack can help journalists to check what a group of people was tweeting about on a particular day

You may have seen a cute little Twitter hack — popularised by Andy Baio — which allows you to roll back the years and recreate a decade-old Twitter timeline. The twist is that you’ll be seeing updates from people who you may not have been following at the time but discovered later.

Nostalgia aside, the same technique could be used by journalists to track what was being said by any particular group of interest at a particular point in time. Here’s how.

How to: uncover Excel data only revealed by a drop-down menu

Sometimes an organisation will publish a spreadsheet where only a part of the full data is shown when you select from a drop-down menu. In order to get all the data, you’d have to manually select each option, and then copy the results into a new spreadsheet.

It’s not great.

In this post, I’ll explain some tricks for finding out exactly where the full data is hidden, and how to extract it without getting Repetitive Strain Injury. Here goes…

The example

To get the data from this spreadsheet you have to select 51 different options from a dropdown menu

The spreadsheet I’m using here is pretty straightforward: it’s a list of the populations for each fire and rescue authority in the UK (XLS). These figures are essential for putting any story about fires into context (giving us a per capita figure rather than just whole numbers), and yet the authority behind the spreadsheet has made it very difficult to extract those numbers.
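
To show why the population figures matter, here is a minimal Python sketch of the per capita calculation. The authority names and numbers are invented purely for illustration (they are not taken from the spreadsheet), and the function name rate_per_100k is my own.

```python
def rate_per_100k(incidents: int, population: int) -> float:
    """Return the number of incidents per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return incidents * 100_000 / population


# Hypothetical figures, made up for this example only
authorities = {
    "Authority A": {"fires": 1200, "population": 2_400_000},
    "Authority B": {"fires": 900, "population": 600_000},
}

# Work out a per capita rate for each authority
rates = {
    name: rate_per_100k(figures["fires"], figures["population"])
    for name, figures in authorities.items()
}

# List authorities from highest rate to lowest
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.1f} fires per 100,000 people")
```

In this made-up case Authority A has more fires in whole numbers, but Authority B has the higher rate once population is taken into account, which is the kind of context the per capita figure provides.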

Wanted: MA Data Journalism applicants to work with Haymarket Automotive

Image: Autocar and What Car? (credit: Haymarket)

One of the industry partners for the MA in Data Journalism is Haymarket Automotive (What Car?, PistonHeads and Autocar) — we’re now inviting applications from people who are particularly interested in studying data journalism in relation to the automotive sector. In other words, data motoring journalism!

You should have a passion for journalism and for retail journeys, cars or the car industry; be interested in helping to find new sources of data for stories and in working on stories based on data collected by third parties; and have lots of ideas that tap into the power of data-driven journalism.

Editorial director Jim Holder explains:

“The automotive industry is awash with historic data, from car specs to buyer behaviour, and populated by experts who believe they know how to produce and read it. But our brands – and buyer’s guide What Car? in particular – have unique access to live data from in-market car buyers. Harnessed properly, the data has the potential to surprise and delight the car industry, and car buyers – and shake-up outmoded suppositions and attitudes.”

Successful applicants approved by Haymarket will work with a Haymarket Automotive brand during part or all of their MA studies.

If you are interested, please apply through the course webpage specifying in your supporting statement that you are specifically interested in working with Haymarket Automotive.

Opportunities are also available to work with FourFourTwo, The Telegraph and a number of other news organisations.

Britain does a great job of opening its data, except for what journalists really want

Fighting with inflatable hammers? Image by Joe Shlabotnik. Licence: CC BY-NC-SA 2.0

Journalist SA Mathieson has used open data in Britain to put together an impressive new ebook. In a guest post for OJB he looks at the country’s strengths when it comes to open data — and the problems still facing journalists who want to see how the public’s money is spent.

It is tough for a British journalist to admit that their government does something well, but here goes: when it comes to openly releasing data, Great Britain (in other words England, Scotland and Wales) is second only to Taiwan according to the Global Open Data Index.

Westminster gets maximum marks for releasing data on the government’s budget, national statistics, administrative boundaries, national maps, air quality and company registers.