Data cleaning tool relaunches: Freebase Gridworks becomes Google Refine

When I first saw Freebase Gridworks I was a very happy man. Here was a tool that tackled one of the biggest problems in data journalism: cleaning dirty data (and data is invariably dirty). The tool made it easy to identify variations of a single term and clean them up, to link one set of data to another, and much more besides.
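
To give a flavour of what "identifying variations of a single term" involves, here is a minimal Python sketch of one common approach: reduce each value to a normalised "fingerprint" and group the values that share one. This is my own illustration of the general idea, not Gridworks' actual code; the function names and the sample list are made up for the example.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Reduce a string to a normalised key (hypothetical helper)."""
    # Strip accents by decomposing characters and dropping non-ASCII marks.
    ascii_text = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Trim, lowercase and remove punctuation.
    cleaned = re.sub(r"[^\w\s]", "", ascii_text.strip().lower())
    # Sort and de-duplicate the words so word order does not matter.
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values that share a fingerprint; keep only groups with variants."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


# Made-up sample data: the same organisations typed in different ways.
names = [
    "BBC News", "bbc news", "News, BBC", "B.B.C. News",
    "The Guardian", "Guardian, The", "Reuters",
]
clusters = cluster(names)
for group in clusters:
    print(group)
```

Each printed group is a set of spellings that probably refer to the same thing, which you could then merge into a single agreed value. A real tool offers several matching methods and lets you review each group before merging.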

Then Google bought the company that made Gridworks, and it has now released a new version of the tool under a new name: Google Refine.

It’s notable that Google are explicitly positioning Refine in their video (above) as a “data journalism” tool.

You can download Google Refine here.

Further videos below. The first explains how to take a list on a webpage and convert it into a cleaned-up dataset – a useful alternative to scraping:

The second video explains how to link your data to data from elsewhere, aka “reconciliation” – e.g. extracting latitude and longitude or language.
