Got a new laptop? Here’s how to maintain your privacy from the start

When you get a new laptop – with no cookies on it! – it’s a great opportunity to start afresh and protect your privacy online by default. As I recently got a new laptop here’s what I did as I set it up…

Start from scratch – no importing of settings/applications

Many laptop setup wizards offer the option to import applications, documents or other elements from your existing laptop. I didn’t do this, partly because I didn’t want to bloat my new laptop with anything that wasn’t necessary (and if you use cloud storage then you can download from there anyway), but largely because I wanted to check the settings of each application as I went – this is much easier to do if you’re installing them.

Browsers – install them all

I use at least four different browsers: Safari, Chrome, Firefox and Opera. (You might also want to install Tor for particular use cases, although I’m not going to cover it here).

It’s useful to have different browsers partly because they offer different functionality, but also because it allows you to separate different activities. For example: Continue reading

10 principles for data journalism in its second decade

10 principles Data journalism

In 2007 Bill Kovach and Tom Rosenstiel published The Elements of Journalism. With the concept of ‘journalism’ increasingly challenged by the fact that anyone could now publish to mass audiences, their principles represented a welcome platform-neutral attempt to articulate exactly how journalism could be untangled from the vehicles that carried it and the audiences it commanded.

In this extract from a forthcoming book chapter* I attempt to use Kovach and Rosenstiel’s principles (outlined in part 1 here) as the basis for a set that might form a basis for (modern) data journalism as it enters its second and third decades.

Principle 1: Data journalists should strive to interrogate data as a power in its own right

When data journalist Jean-Marc Manach set out to find out how many people had died while migrating to Europe he discovered that no EU member state held any data on migrants’ deaths. As one public official put it, dead migrants “aren’t migrating anymore, so why care?

Similarly, when the BBC sent Freedom of Information requests to mental health trusts about their use of face-down restraint, six replied saying they could not say how often any form of restraint was used — despite being statutorily obliged to “document and review every episode of physical restraint which should include a detailed account of the restraint” under the Mental Health Act 1983.

The collection of data, the definitions used, and the ways that data informs decision making, are all exercises of power in their own right. The availability, accuracy and employment should all be particular focuses for data journalism as we see the expansion of smart cities and wearable technology. Continue reading

Computational thinking and the next wave of data journalism

In this second extract from a forthcoming book chapter I look at the role that computational thinking is likely to play in the next wave of data journalism — and the need to problematise that. You can read the first part of this series here.

Computational thinking is the process of logical problem solving that allows us to break down challenges into manageable chunks. It is ‘computational’ not only because it is logical in the same way that a computer is, but also because this allows us to turn to computer power to solve it.

As Jeannette M. Wing puts it:

“To reading, writing, and arithmetic, we should add computational thinking to every child’s analytical ability. Just as the printing press facilitated the spread of the three Rs, what is appropriately incestuous about this vision is that computing and computers facilitate the spread of computational thinking.”

This process is at the heart of a data journalist’s work: it is what allows the data journalist to solve the problems that make up so much of modern journalism, and to be able to do so with the speed and accuracy that news processes demand. Continue reading

The next wave of data journalism?

In the first of three expanded extracts from a forthcoming book chapter on ‘The next wave of data journalism’ I outline some of the ways that data journalism is reinventing itself, and adapting for a world which is rapidly changing again. Where networked communications and processing power were key in the 2000s, automation and AI are becoming key in the decade to come. And just as data journalism raised the bar for journalism as a whole, the bar is about to be raised for data journalism itself.

Data journalism isn’t doing enough. Now into its second decade, the noughties-era technologies that it was built on – networked access to information and vastly improving visualisation capabilities – are now taken for granted, just as the ‘computer assisted’ part of its antecedent Computer Assisted Reporting was.

In just ten years data journalism has settled down into familiar practices and genres, from the interactive map and giant infographics to the quick turnaround “Who comes bottom in the latest dataset” write-up. It’s a sure sign of maturity when press officers are sending you data journalism-based media releases.

Now we need to move forward. And the good news is: there are plenty of places to go. Continue reading

How to: create a treemap in Tableau

tableau treemap buzzfeed

Treemaps are a great alternative to pie charts when you want to tell a story about the composition of something: whereas pie charts can be limiting, treemaps allow users to drill down into ‘parts of parts’, and group elements within particular categories.

In this post I’m going to show you how to create a treemap in the free visualisation software Tableau Public to show how the majority of BuzzFeed’s content views in 2015 took place away from its website.

This data suits a treemap particularly well: although the traffic is broadly split between ‘website’, ‘Facebook’, ‘Snapchat’, ‘YouTube’ and ‘other’, there are also subdivisions within some of those platforms – for example, Facebook traffic is split between video and images, and website traffic is split between direct visits, those via Google, and those via Facebook. A treemap allows users to explore those subtleties in a way that pie charts do not.

Step 1: Get the data in the right format

To create a treemap it is vital that you get the data in the correct format to begin with. In particular, you will need to make sure that as well as a primary ‘category’ column, you also have a second ‘sub category’ column. They don’t have to have these names, but that general concept is important. A third column should contain the values to be visualised.

treemap data format

The key feature to look for in this data structure is that you should expect to see categories in your main ‘category’ column appear more than once. In our BuzzFeed data, for example, the platform category ‘website’ appears 3 times – once for each ‘Source’ sub category of ‘Direct traffic’, ‘Traffic from Google’ and ‘Traffic from Facebook’.

Also, make sure that your values only use numbers – don’t add percentage symbols, commas or other characters that might lead to it being interpreted as text and mean you have to reclassify the data later.

If you need a dataset to work with, I’ve uploaded the Buzzfeed figures here (you’ll need to save it to your computer).

Begin creating your chart

In Tableau Public, connect to the data you’ve just created, and go to your empty worksheet. On the left you should see your category (in this case, ‘Platform’) and sub category (in this case ‘Source’) columns in the Dimensions area; and underneath that in the Measures area, your values (in this case, ‘Traffic %’).

Click and drag your main category dimension (‘Platform’) into the Rows area at the top. Then do the same with your measure (‘Traffic %’) so that it looks like this:

Platform and traffic % in rows

Tableau will automatically draw a chart for you – but ignore that. Instead, look to the right hand side where the ‘Show me’ menu should now be showing which chart types you can use with this combination of data. If it doesn’t show, click ‘Show me’ in the upper right corner.

One of the options available should be the treemap – it’s normally the first option four rows down. Click on this to change the chart to a treemap.

Show me menu

Now we have a treemap – but it’s only showing the top-level category (in this case, ‘Platform’). We need to customise it a bit to get a treemap which allows users to see the sub-categories too.

Customising the colours and slices

To the left of the chart itself you should see a box titled Marks containing buttons for Color, Size, Label, Detail and Tooltip. And underneath those buttons, icons indicating three settings:

  • The ‘Size’ icon is set by ‘SUM(Traffic %)’
  • The ‘Color’ icon is also set by ‘SUM(Traffic %)’
  • The ‘Label’ icon is set by ‘Platform’


First we need to change it so that the ‘Color’ is determined by the main category (‘Platform’). To do that, click and drag the ‘Platform’ dimension onto the Color button.

Now, instead of one colour in different shades representing an amount, you should have five colours – one for each category:

Treemap coloured by category

We can customise the colour further by clicking on the ‘Color’ button and clicking Edit colors…. This will open up a new window with a list of categories on the left, and a palette on the right. In our case it makes sense to assign relevant colours to each platform: yellow for Snapchat, red for YouTube, and blue for Facebook. I’ve also chosen grey for ‘Other’. If you prefer other shades you can access other palettes using the dropdown menu in the upper right corner. Click ‘Apply’ to see the results, and ‘OK’ to apply and leave this window.

Edit colors menu

Next, we need to bring in that sub-category (‘Category’) in. A good place to do this is on the ‘Label’: because we are already using colour to indicate the platform, we need the label to add that extra information about the source of traffic to that platform.

To do this, click and drag the ‘Category’ dimension onto the Label button.

Now the area of colour for each platform should now split into further parts based on this new category. In addition, the text labels should reflect that information too.


Other customisation

The chart is now pretty much ready. That just leaves the title to customise: at the moment it is automatically taking the name of the sheet, so you can change the title by double-clicking on the sheet tab at the bottom and renaming it. Alternatively you can double-click on the sheet title and replacing it in the window that opens.

You can also customise the Tooltip – for example a % sign needs adding after the percentage figure on each slice.

Podcast: Data journalism: More important than ever?

data journalism podcast

I took part in a BBC Academy podcast about data journalism last week, along with The Guardian’s Helena Bengtsson and the BBC’s Daniel Wainwright and John Walton, in the wake of the BBC’s annual plan and three-year strategy which included a focus on the “interrogation of data”.

Among other things we talked about why data journalism is increasingly important, what skills are needed (including the role of code), why I’m launching an MA in Data Journalism, and what sorts of stories can be done with those skills.

If you want to listen, it’s now live on the Academy website.

Data journalism on radio, audio and podcasts

In a previous post I talked about how data journalism stories are told in different ways on TV and in online video. I promised I’d do the same for audio and radio — so here it is: examples from my MA in Data Journalism to give you ideas for telling data stories using audio.

this american life

As with any audio post, This American Life features heavily: not only is the programme one of the best examples of audio journalism around — it also often involves data too. Continue reading