Tag Archives: Eva Constantaras

Teaching data journalism in developing countries: lessons from ODECA

Eva Constantaras

Eva Constantaras is a data journalist and trainer who recently wrote the Data Journalism Manual for the UN Development Program. In a special guest post she talks about the background to the manual, her experiences of working with journalists and professors who want to introduce data journalism techniques in developing nations, and why the biggest challenges are not technological, but cultural.

Over the last few years, there has been a significant shift in global experiments in data journalism education, away from short-term activities like boot camps and hackathons and towards more sustained and sustainable interventions, including fellowships and institutes.

There is a growing awareness that in many countries where neither data science nor public interest journalism is particularly common, the challenge of teaching data journalism is split straight down the middle between teaching data and teaching journalism. Open data can be a boon to democracy, but only if there are professionals able and motivated to transform that data into information for the public.

A sample dirty dataset for trying out Google Refine

I’ve created this spreadsheet of ‘dirty data’ to demonstrate some typical problems that data cleaning tools and techniques can fix:

  • Subheadings that are only used once (and you need them in each row where they apply)
  • Odd characters that stand for something else (e.g. a space or ampersand)
  • Different entries that mean the same thing, either because they are lacking pieces of information, or have been mistyped, or inconsistently formatted

It’s best used alongside this post introducing basic features of Google Refine. But you can also use it to explore simpler techniques in spreadsheets, like Find and replace; the TRIM function (and alternative solutions); and the functions UPPER, LOWER, and PROPER (which convert text into all upper case, lower case, and title case respectively).
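If you prefer to script the same clean-up, here is a minimal Python sketch of the three problems listed above. It is not taken from the sample spreadsheet: the rows, the column layout, the `&amp;` and non-breaking-space examples, and the function names `clean_cell`, `fill_down` and `normalise` are all made up for illustration, and it assumes a subheading row is one with a value in the first column only.

```python
import re


def clean_cell(text):
    """Rough equivalent of TRIM, plus replacing two 'odd characters'."""
    # Assumption: a non-breaking space stands for a space, "&amp;" for an ampersand.
    text = text.replace("\xa0", " ").replace("&amp;", "&")
    # Collapse runs of whitespace to one space and strip both ends.
    return re.sub(r"\s+", " ", text).strip()


def fill_down(rows):
    """Copy each one-off subheading into every data row beneath it.

    Assumes a subheading row has text in the first column and nothing else.
    """
    current = None
    filled = []
    for first, *rest in rows:
        if first and not any(rest):
            current = first  # remember the subheading, drop the row itself
        else:
            filled.append([current, first, *rest])
    return filled


def normalise(name):
    """Make inconsistently typed entries comparable (roughly TRIM then PROPER)."""
    return clean_cell(name).title()


# Hypothetical dirty rows: subheadings used once, stray characters, mixed case.
rows = [
    ["North", ""],
    ["  acme\xa0 ltd ", "120"],
    ["ACME LTD", "80"],
    ["South", ""],
    ["Smith &amp; sons", "45"],
]

cleaned = [
    [region, normalise(name), value]
    for region, name, value in fill_down(rows)
]

for row in cleaned:
    print(row)
```

Python's `upper()`, `lower()` and `title()` string methods behave much like UPPER, LOWER and PROPER. Entries that are mistyped or missing pieces of information still need a human eye or a tool like Google Refine to match up; this sketch only handles differences in spacing and case.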

Thanks to Eva Constantaras for suggesting the idea.

UPDATE: Peter Verweij has put together an introduction to some other cleaning techniques here.