
AUDIO: Text mining tips from Andy Lehren and Sarah Cohen


Searches made of the Sarah Palin emails - from a presentation by the New York Times's Andy Lehren

One of the highlights of last week’s Global Investigative Journalism Conference was the session on text mining, where the New York Times’s Andy Lehren talked about his experiences of working with data from Wikileaks and elsewhere, and former Washington Post database editor Sarah Cohen gave her insights into various tools and techniques in text mining.

Andy Lehren’s audio is embedded below. The story on North Korean missile deals that he mentions can be found here. Other relevant links: Infomine and NICAR Net Tour.

And here’s Sarah’s talk, which covers extracting information from large sets of documents. Many of the tools she mentions are bookmarked under the tag ‘textmining’ on my Delicious account.