How to: calculate or find rankings in spreadsheets using RANK, LARGE and SMALL

The ebook version of this tutorial includes a dataset and exercise to employ these techniques.

Right at the start of my book on Excel for journalists I talk about sorting data to find out which values come top or bottom. However, there is a family of functions which will give you a lot more control in finding out not just who is top or bottom, but the rank of any value in any series of values.

This is particularly useful if you want to compare ranks.

Many stories are based on finding out where your own country or region ranks in the latest data

Ranking isn’t just about statistics – it can be used in consumer stories too

For example, say you had a table showing school performance across the last two years.

Each table shows the percentage of pupils achieving the top grades in that year. You can use RANK to find out what rank each percentage would have placed the school in for each year. Continue reading

HOW TO: Find out the ages of people using Excel

This post is taken from the ebook Finding Stories With Spreadsheets

“How do I calculate an age in Excel?” Marion Urban, a French journalist and student on the MA in Online Journalism in Birmingham, was preparing data for the forthcoming UK General Election.

“It was a very young intake. But it wasn’t easy to calculate their ages.”

Indeed. You would think that calculating ages in Excel would be easy. But there is no off-the-shelf function to help you do so. Or at least, no easy-to-find function.

Instead there are a range of different approaches: some of them particularly, and unnecessarily complicated.

In this extract from Finding Stories in Spreadsheets I will outline one approach to calculating ages, which also illustrates a useful technique in using spreadsheets in stories: the ability to break down a problem into different parts. Continue reading

Finding Stories in Spreadsheets – ebook now live!

Cover design by Matt Buck/Drawnalism

My latest ebook – Finding Stories in Spreadsheets – is now live on Leanpub.

As with Scraping for Journalists, I’m publishing the book week-by-week so the book can be updated based on reader feedback, user suggestions and topical developments.

Each week you can download a new chapter covering a different technique for finding stories, from calculating proportions and changes, to combining data, cleaning it up, testing it, and extracting specific details.

There’s also a downloadable spreadsheet at the end of each chapter with a series of exercises to practise that chapter’s technique and find particular stories.

Along the way I tackle some other considerations in telling the story, such as context and background, and the importance of being specific in the language that you use.

If there’s anything you’d like covered in the book let me know. You can also buy the book in a ‘bundle’ with its sister title Data Journalism Heist, which covers quick-turnaround techniques for finding stories in spreadsheets using pivot tables and advanced filters.

My next ebook: Finding Stories in Spreadsheets

In a few weeks I will begin publishing my new ebook: Finding Stories In Spreadsheets.

The book has been written in response to requests from journalists who need a book on Excel aimed at storytellers, not accountants.

Finding Stories In Spreadsheets will outline a range of techniques, including ways to find the ‘needle in the haystack’ in text data, number calculations to make stories clearer, and methods of cleaning and combining data to tell new stories, including getting data ready for maps and charts.

The book will be available for a discounted rate for the first couple of weeks. To be informed when it’s available, register your interest on the Leanpub page.

A sample dirty dataset for trying out Google Refine

I’ve created this spreadsheet of ‘dirty data‘ to demonstrate some typical problems that data cleaning tools and techniques can be used for:

• Subheadings that are only used once (and you need them in each row where they apply)
• Odd characters that stand for something else (e.g. a space or ampersand)
• Different entries that mean the same thing, either because they are lacking pieces of information, or have been mistyped, or inconsistently formatted

It’s best used alongside this post introducing basic features of Google Refine. But you can also use it to explore more simple techniques in spreadsheets like Find and replace; the TRIM function (and alternative solutions); and the functions UPPER, LOWER, and PROPER (which convert text into all upper case, lower case, and titlecase respectively).

Thanks to Eva Constantaras for suggesting the idea.

Is this an Excel killer? QueryTree app lowers the bar on data journalism

Sometimes the most impressive tools solve a problem you never knew you had. In the case of QueryTree, a new data analysis tool, that problem is something most people never question: spreadsheets.

For all the shiny-shiny copy-and-paste-click-and-drag-ness in new journalism tools, most data digging comes back to at least some simple spreadsheet work, and that represents a significant hurdle for many journalists used to working with simpler tools.

While interface design has undergone generations of improvement on the web, spreadsheet software interfaces have remained largely unchanged for decades.

So why did no one think to do this before?