When you’ve converted data from a PDF to a spreadsheet, it’s not uncommon for text to end up split across multiple rows. In this post I’ll explain how you can use Open Refine to quickly clean the data up so that the text is put back together and you have a single row for each entry.
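The post itself uses Open Refine; as a rough illustration of the same idea in code, here is a minimal Python sketch. It assumes that a continuation row can be recognised by a blank first cell and that the text to rejoin sits in the second column. The function name, column positions and sample rows are all hypothetical.

```python
def merge_split_rows(rows, key_index=0, text_index=1):
    """Join continuation rows (blank key cell) onto the entry above them."""
    merged = []
    for row in rows:
        is_continuation = not row[key_index].strip()
        if is_continuation and merged:
            # Append this fragment to the text of the previous entry.
            previous = merged[-1]
            previous[text_index] = (
                previous[text_index] + " " + row[text_index].strip()
            ).strip()
        else:
            # A row with a key starts a new entry.
            merged.append([cell.strip() for cell in row])
    return merged


# Invented sample data: entry 1 has been split across two rows.
rows = [
    ["1", "Text that was split"],
    ["", "across two rows"],
    ["2", "A single-row entry"],
]
cleaned = merge_split_rows(rows)
```

Real PDF conversions are rarely this tidy, so the rule for spotting a continuation row would need adjusting to the spreadsheet in question.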
It’s not often I encounter a piece of data journalism which solves a common problem in the field – and it’s even rarer to find a piece of work which tackles two.
But that’s just what lean data journalism site Ampp3d did last week when it published a visualisation on the deaths of construction workers in Qatar.
The two problems? Creating impact on mobile – and making big numbers meaningful.
My latest ebook – Finding Stories in Spreadsheets – is now live on Leanpub.
As with Scraping for Journalists, I’m publishing the book week-by-week so that it can be updated based on reader feedback, user suggestions and topical developments.
Each week you can download a new chapter covering a different technique for finding stories, from calculating proportions and changes, to combining data, cleaning it up, testing it, and extracting specific details.
There’s also a downloadable spreadsheet at the end of each chapter with a series of exercises to practise that chapter’s technique and find particular stories.
Along the way I tackle some other considerations in telling the story, such as context and background, and the importance of being specific in the language that you use.
If there’s anything you’d like covered in the book, let me know. You can also buy the book in a ‘bundle’ with its sister title Data Journalism Heist, which covers quick-turnaround techniques for finding stories in spreadsheets using pivot tables and advanced filters.
Recently it has felt like data journalism might finally be taking a step forward after years spent treading water. I’ve long said that the term ‘data journalism’ was too generic for work that includes practices as diverse as scraping, data visualisation, web interactives, and FOI. But now, in 2014, it feels like different practitioners are starting to find their own identity.
It starts with the unicorn.
Costing up your work as a freelance in multimedia, liveblogging, data journalism, community management or SEO isn’t straightforward. There’s no simple answer to ‘How much should I charge for this particular work?’ because the field isn’t standardised enough to have reliable rates.
But there are three questions you can ask yourself which can help you set a price or feel comfortable with the decision you make about taking or turning down work.
Question 1: What are your costs?
The basic cost is your time. How much time do you realistically think the work will take? And how much do you value that time, e.g. per hour? Journalism often takes more time than anticipated: contacts take more chasing, case studies fall through, and editors ask for new versions.
Travel is another cost, and accommodation and food for some work too. It may be worth negotiating on these costs separately, rather than including them in the fee, so the two can be distinguished.
Then there are equipment costs if you are working in multimedia. These are generally not specific to the project so shouldn’t have a big impact – but they do have an impact on question 3 below.
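To make the arithmetic concrete, here is a minimal Python sketch of costing a job from time and expenses. The hours, hourly rate, 20% contingency and expenses figure are invented for illustration, and the function name is hypothetical; none of them are recommended rates.

```python
def estimate_fee(hours, hourly_rate, contingency=0.2, expenses=0.0):
    """Return the time-based fee and the expenses as separate figures."""
    # Pad the time estimate, because the work often takes longer than planned.
    padded_hours = hours * (1 + contingency)
    fee = round(padded_hours * hourly_rate, 2)
    # Expenses are kept apart from the fee so they can be negotiated separately.
    return {
        "fee": fee,
        "expenses": round(expenses, 2),
        "total": round(fee + expenses, 2),
    }


# Hypothetical job: 10 hours at £25 an hour, plus £40 of travel.
quote = estimate_fee(hours=10, hourly_rate=25, contingency=0.2, expenses=40)
print(quote)  # {'fee': 300.0, 'expenses': 40.0, 'total': 340.0}
```

Returning the fee and expenses as separate figures mirrors the suggestion above that the two can be distinguished when negotiating.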
Question 2: What value does the work add to you?
How much value do you get from the work, for example:
- Will it add to your CV in areas that it lacks (rather than areas you already have)?
- Will it build your reputation?
- Does it give you an opportunity to learn new skills that you couldn’t learn otherwise?
- Does it give you an opportunity to meet people or gain access to places you wouldn’t otherwise – and will that add value to you in some way?
Question 3: What supply and demand exists for your skills?
Like any market, prices in journalism are significantly shaped by supply and demand.
For example, if there are very few people offering the skills that you have, and – crucially – a lot of clients demanding your skills, then you can ask for more.
Multimedia is one area where an investment in equipment can narrow the pool of those able to offer similar services – but not always.
Your assessment of demand should be a judgement based on experience, not ego. You may think your skills are valuable, but if you’re not getting a lot of approaches for work then the demand is not there at this time. But if you’re getting more offers than you have time to do, then try increasing your price.
If the market is flooded by people with particular skills, prices drop. This is why freelance print journalism is so poorly paid: work that might take you two weeks to produce might only command £100, or £50, or zero, because there are enough people competing to do it for that price.
But it’s worth remembering that they might be competing to do it for that price because they a) think it will take them less time than you (rightly or wrongly); and/or b) get more value from the work than you. In other words, they have different answers to questions 1 and 2 above.
Upping the price: gathering an evidence base
One of the reasons why you may be offered less money than you expect is that publishers often don’t know the value of content themselves. Liveblogging, multimedia, and other new formats are still establishing their worth, while journalism as a whole continues to depreciate.
So collect evidence on effectiveness and make a case for the value your work has.
For example, liveblogging is known to drive traffic; there is evidence that data journalism tends to have much higher dwell times than other journalism. Multimedia generates higher engagement metrics. Good community management can increase conversion rates.
Simple tools like bit.ly allow you to measure things like clickthrough on links. Asking for analytics from employers – or even negotiating extra payments based on performance – can encourage clients to look at metrics they might otherwise be ignorant of.
Do you have any other factors you consider when pricing up work?
The launch was delayed a little due to the number of people who signed up – which I think was a sensible decision.
You can watch the introduction video above, or ‘meet the instructors’ below. Looking forward to this…
The latest in the series of Frequently Asked Questions comes from a UK student, who has questions about big data.
How can data journalists make sense of such quantities of data and filter out what’s meaningful?
In the same way they always have. Journalists’ role has always been to make choices about which information to prioritise, what extra information they need, and what information to include in the story they communicate.
The book is an introduction to data journalism and two simple techniques in particular: finding story leads using pivot tables and advanced filters.
The book also covers useful sources of data, how to follow leads up, and how to tell the resulting story.
You can also buy it from Leanpub, where it’s been live for a couple of months now and is available in PDF, mobi and ePub formats. Comments welcome as always.
Something that infuriates me often with government datasets is the promiscuous heading. This is when a spreadsheet doesn’t just have its headings across one row, but instead splits them across two, three or more rows.
To make matters worse, there are often also extra rows before the headings explaining the spreadsheet more generally. Here’s just one offender from the ONS:
To clean this up in Excel takes several steps – but Open Refine (formerly Google Refine) does this much more quickly. In this post I’m going to walk through the five-minute process that can save you unnecessary effort in Excel.
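The walkthrough uses Open Refine. For readers who prefer code, the same clean-up can be sketched in Python as an alternative, not as the method from the post. The sketch assumes the number of explanatory rows and heading rows is known in advance, and that a heading spanning several columns exports as one label followed by blank cells. The function name and the sample table are invented.

```python
def flatten_headers(rows, skip=0, header_rows=2, sep=" "):
    """Drop leading notes and merge multi-row headings into one row."""
    header_block = rows[skip:skip + header_rows]
    data = rows[skip + header_rows:]
    width = max(len(row) for row in header_block)
    filled = []
    for i, row in enumerate(header_block):
        cells = [cell.strip() for cell in row] + [""] * (width - len(row))
        if i < len(header_block) - 1:
            # In every heading row except the last, carry a label
            # rightwards over the blanks left by a merged cell.
            for j in range(1, width):
                if not cells[j]:
                    cells[j] = cells[j - 1]
        filled.append(cells)
    # Join the non-empty parts of each column, top to bottom.
    headings = [sep.join(part for part in column if part) for column in zip(*filled)]
    return headings, data


# Invented example: one explanatory row, then headings over two rows.
rows = [
    ["Table 1: made-up figures", "", "", "", ""],
    ["Area", "2012", "", "2013", ""],
    ["", "Males", "Females", "Males", "Females"],
    ["Birmingham", "10", "12", "11", "13"],
]
headings, data = flatten_headers(rows, skip=1, header_rows=2)
```

Here `headings` becomes a single row such as "2012 Males" and "2012 Females", which is the shape most analysis tools expect.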