Tag Archives: visualisation

“It’s black and white to colour”: Ben Fry on data visualisation’s past and future


Ben Fry published his book Visualizing Data in 2007, before the term ‘data journalism’ had entered the professional vocabulary. Since then, Fry has been developing Processing, an open source “language for learning how to code within the context of the visual arts”, and he is a principal at Fathom, a Boston design and software consultancy which has created visualisation projects for National Geographic, the Bill, Hillary & Chelsea Clinton Foundation and the Bill & Melinda Gates Foundation.

Catalina George asked him a few questions about his current work and his advice to aspiring data journalists.

Visualisation, a reinvented tool


For a clearer view of the world’s calorie consumption, the user can see how much it differs between China and the UK @Fathom

One of your Fathom projects was a data visualisation for National Geographic’s “What the World Eats”. Graphics can play a great role in enriching our perception and understanding of reality. But what does the development of visualisation mean for journalism?

I think what’s called “visualisation” has been around a long time for journalism. Otto Neurath was doing this in the 1920s. I think it’s been receiving more attention in recent years because we have the means to more easily distribute interactive works, which is a boon for more sophisticated takes on data. Continue reading

When to use shape maps in data visualisation: part 2 of a great big guide

xkcd’s take on mapping, via Duarte Romero

In a previous post I explained some of the considerations in deciding to use a map in data visualisation, and went into detail about mapping points and heatmaps. In this second part, taken from the MA in Online Journalism at Birmingham City University, I’m going to look at other types of maps: shape-based maps and image maps.

Mapping shapes

A more ambitious alternative to mapping points is to map shapes: in other words, instead of each data point being placed on a specific point on a map, different areas on that map are drawn and coloured/labelled according to the relevant data. Continue reading

When to use maps in data visualisation: a great big guide

Matt Bierbaum’s zombie map allows you to simulate outbreaks

When it comes to data visualisation, everyone loves a map. More exciting than a chart, easier than an infographic, it’s generally the first thing that journalists and journalism students alike ask: “How can we create a map?”

But just because you have some geographical data doesn’t mean you should map it.

Here’s why: maps, like all methods of visualisation, are designed for a purpose. They tell particular types of stories well – but not all of them.

There is also more than one type of map. You can map points, shapes, or routes. You can create heat maps and choropleth maps.

I’ll tackle those different types of maps first – and then the sorts of stories you might tell with each. But the key rule running throughout is this: make sure you are clear what story you are trying to tell, or the story that users will try to find. The test is whether a map does that job best. Continue reading

Leveraging music to help people understand data

In a guest post for OJB, Ion Mates interviews Tom Levine and Roman Heindorff about the role of audio in data journalism.

Audiolisation (sometimes called ‘auralization’ or ‘sonification’) is the process of turning complex data into sound.

Instead of using graphics and bar charts, one can represent the contents of a spreadsheet by assigning sounds to different kinds of data.

In the above example, the activity of newsrooms is represented by verses, phrases and different rhythms. The author is Thomas Levine.

Beginning to represent data as audio

Tom started playing with computers from an early age. His main interest was in designing things so that they were easier to use.

Continue reading

Guest post: 10 lessons from data journalism training

Following my post on data journalism teaching, fellow trainer Peter Verweij got in touch to share a post which first appeared on his blog earlier this month. I’m reproducing it here with permission. A Dutch version is also available here. Continue reading

Guest post: How I did it – visualising carpooling patterns in Germany

In a guest post for OJB, Natalia Karbasova explains how, with no coding experience, she used German carpool data as the basis of a data visualisation project.

Some time ago I was working on a new blog on the sharing economy, lets-share.de. It was high time to add some data-driven stories visualising important issues of the sharing economy, which change our lives.

Mitfahrgelegenheit.de is the popular German version of Carpooling.com. I decided to create a visualization which would show carpooling patterns between cities in Germany and, possibly, reveal hidden connections. Continue reading

Olympics Swimming Lap Charts from the New York Times

Part of the promise of sports data journalism is the ability to use data from an event to enrich the reporting of that event. One of the most widely used graphical devices in motor racing is the lap chart, which shows the relative positions of each car at the end of each lap:

Another, more complex chart, and one that can be quite hard to read when you first come across it, is the race history chart, which shows the laptime of each car relative to the average laptime (calculated over the whole of the race) of the race winner:
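As a rough sketch of the calculation behind a chart like this, here is one common way of building it in Python. It is an assumption rather than the exact method used for the chart above: for each lap, it takes the time an imaginary car lapping at the winner's average pace would have taken so far, and subtracts the driver's actual elapsed time. The function name and the lap times are invented for illustration.

```python
from itertools import accumulate


def race_history(lap_times, winner):
    """Gap, at the end of each lap, between a car running at the
    winner's average pace and each driver's actual elapsed time.

    lap_times: driver -> list of lap times in seconds (illustrative format).
    Negative values mean the driver is behind the winner's average pace.
    """
    winner_laps = lap_times[winner]
    winner_avg = sum(winner_laps) / len(winner_laps)
    history = {}
    for driver, laps in lap_times.items():
        elapsed = accumulate(laps)  # running total of race time
        history[driver] = [
            round(winner_avg * lap - total, 3)
            for lap, total in enumerate(elapsed, start=1)
        ]
    return history


# Invented lap times, purely for illustration.
lap_times = {
    "A": [90.0, 88.0, 89.0],
    "B": [91.0, 90.0, 92.0],
}
history = race_history(lap_times, winner="A")
```

Plotting each driver's list against lap number gives the familiar shape: the winner's line ends at zero, and everyone else falls away below it.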

(Great examples of how to read race history charts can be found on the IntelligentF1 blog. For the general case, see The IntelligentF1 model.)

Both of these charts can be used to illustrate the progression of a race, and even in some cases to identify stories that might otherwise have been missed (particularly races amongst back markers, for example). For Olympics events particularly, where reporting is often at a local level (national and local press reporting on the progression of their athletes, as well as the winning athletes), timing data may be one of the few sources available for finding out what actually happened to a particular competitor who didn’t feature in coverage that typically focusses on the head of the race.

I’ve also experimented with some other views, including a race summary chart that captures the start position, end of first lap position, final position and range of positions held at the end of each lap by each driver:
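The values behind a summary chart like this are simple to derive once you have each driver's position at the end of every lap. A minimal Python sketch follows, assuming that data is already available as plain dictionaries; the function name, data layout and positions are all made up for the example.

```python
def race_summary(grid, lap_positions):
    """Summarise each driver's race from end-of-lap positions.

    grid: driver -> starting position
    lap_positions: driver -> list of positions at the end of each lap
    (hypothetical layout; the last entry is treated as the final
    position, which ignores retirements and post-race penalties).
    """
    summary = {}
    for driver, positions in lap_positions.items():
        summary[driver] = {
            "start": grid[driver],
            "first_lap": positions[0],
            "final": positions[-1],
            "best": min(positions),   # highest position held
            "worst": max(positions),  # lowest position held
        }
    return summary


# Invented data for three drivers over three laps.
grid = {"A": 1, "B": 2, "C": 3}
lap_positions = {"A": [2, 1, 1], "B": [1, 2, 3], "C": [3, 3, 2]}
summary = race_summary(grid, lap_positions)
```

The "best" and "worst" values give the range of positions to draw for each driver, with the start, first-lap and final positions marked on top.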

One of the ways of using this chart is as a quick summary of the race position chart, as well as a tool for highlighting possible “driver of the day” candidates.

A rich lap chart might also be used to convey information about the distance between cars as well as their relative positions. Here’s one experiment I tried (using Gephi to visualise the data) in which node size is proportional to time to car in front and colour is related to time to car behind (red is hot – car behind is close):
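The two values used for node size and colour can be worked out from the elapsed race time of each car at the end of a lap. The sketch below shows one way of doing that in Python. It assumes every car is on the same lap (no lapped cars), and the function name and times are invented; it is not the code used to prepare the Gephi data.

```python
def gaps_on_lap(elapsed):
    """Time gaps to the cars immediately in front and behind.

    elapsed: driver -> total race time in seconds at the end of one lap
    (assumes all cars have completed the same number of laps).
    Returns driver -> (gap to car in front, gap to car behind),
    with None where there is no such car.
    """
    order = sorted(elapsed, key=elapsed.get)  # leader first
    gaps = {}
    for i, driver in enumerate(order):
        ahead = elapsed[driver] - elapsed[order[i - 1]] if i > 0 else None
        behind = (
            elapsed[order[i + 1]] - elapsed[driver]
            if i < len(order) - 1
            else None
        )
        gaps[driver] = (ahead, behind)
    return gaps


# Invented elapsed times at the end of a single lap.
gaps = gaps_on_lap({"A": 180.0, "B": 181.5, "C": 185.0})
```

Running this for every lap gives one (in front, behind) pair per driver per lap, which can then be mapped to size and colour in whatever tool draws the chart.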

(You might also be able to imagine a variant of this chart where we fix the y-value so each row shows data relating to one particular driver. Looking along a row then allows us to see how exciting a race they had.)

All of these charts can be calculated from lap time data. Some of them can be calculated from data describing the position held by each competitor at the end of each lap. But whatever the case, the data is what drives the visualisation.
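To make that concrete, here is a minimal Python sketch of how lap chart positions might be derived from lap times alone. It assumes all competitors start at the same moment and complete every lap, and it makes no attempt to handle ties; the function name and the times are invented for illustration.

```python
from itertools import accumulate


def lap_chart(lap_times):
    """Position of each competitor at the end of each lap.

    lap_times: competitor -> list of lap times in seconds.
    Positions are found by ranking elapsed time at the end of each lap,
    so only laps completed by every competitor are charted.
    """
    elapsed = {name: list(accumulate(laps)) for name, laps in lap_times.items()}
    n_laps = min(len(times) for times in elapsed.values())
    chart = {name: [] for name in lap_times}
    for lap in range(n_laps):
        order = sorted(elapsed, key=lambda name: elapsed[name][lap])
        for position, name in enumerate(order, start=1):
            chart[name].append(position)
    return chart


# Invented lap times for three competitors.
chart = lap_chart({
    "A": [90.0, 88.0, 89.0],
    "B": [89.0, 90.0, 92.0],
    "C": [92.0, 89.0, 85.0],
})
```

Each list in the result is one line on the lap chart: plot position against lap number, with first place at the top.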

A little bit of me had been hoping that laptime data for Olympics track, swimming and cycling events might be available somewhere, but if it is, I haven’t found a reliable source yet. What I did find encouraging, though, was that the New York Times (in many ways one of the organisations that is seeing the value of using visualised data-driven storytelling in its daily activities) did make some split time data available – and was putting it to work – in the swimming events:

Here, the NYT have given split data showing the times achieved in each leg by the relay team members, along with a lap chart that has a higher level of detail, showing the position of each team at the end of each 50m length (I think?!). The progression of each of the medal winners is highlighted using an appropriate colour theme.
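If split data of this kind arrives as cumulative times at the end of each length, the time for each relay leg can be recovered by differencing. The Python sketch below assumes a relay in which each swimmer covers two 50m lengths; the function name, the `lengths_per_leg` parameter and the split times are all invented for the example and are not taken from the NYT data.

```python
def leg_times(cumulative_splits, lengths_per_leg=2):
    """Time taken for each relay leg.

    cumulative_splits: elapsed team time in seconds at the end of each length.
    lengths_per_leg: how many lengths each swimmer covers (assumed to be 2).
    """
    # Elapsed time at the end of each swimmer's leg.
    leg_ends = cumulative_splits[lengths_per_leg - 1::lengths_per_leg]
    leg_starts = [0.0] + leg_ends[:-1]
    return [round(end - start, 2) for start, end in zip(leg_starts, leg_ends)]


# Invented splits for one team: eight lengths, four swimmers.
splits = [23.5, 48.0, 70.5, 95.0, 118.0, 143.5, 165.0, 190.0]
legs = leg_times(splits)
```

Ranking the teams on their cumulative split at each length, rather than differencing, would give the positions plotted in the lap chart part of the graphic.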

[Here’s an insight from @kevinQ about how the New York Times dataviz team put this graphic together: Shifts in rankings. Apparently, they’d done similar views in previous years using a Flash component, but the current iteration uses d3.js]

The chart provides an illustration that can be used to help a reporter identify different stories about how the race progressed, whether or not it is included in the final piece. The graphic can also be used as a sidebar illustration of a race report.

Lap charts also lend themselves to interactive views, or highlighted customisations that can be used to illustrate competition between selected individuals – here’s another F1 example, this time from the f1fanatic blog:

(I have to admit, I prefer this sort of chart with greyed options for the unhighlighted drivers because it gives a better sense of the position churn that is happening elsewhere in the race.)

Of course, without the data, it can be difficult trying to generate these charts…

…which is to say: if you know where lap data can be found for any of the Olympics events, please post a link to the source in the comments below:-)

PS for an example of the lapcharting style used to track the hole by hole scoring across a multi-round golf tournament, see Andy Cotgreave’s Golf Analytics.