Dealing with live data and sentiment analysis: Q&A with The Guardian's Martyn Inglis

As part of the research for my book on online journalism, I interviewed Martyn Inglis about The Guardian’s Blairometer, which measured a live stream of data from Twitter as Tony Blair appeared before the Chilcot inquiry. I’m reproducing it in full here, with permission:

How did you prepare for dealing with live data and sentiment analysis?

I think it was important to be aware of our limitations. We can process only a limited amount of data – due to Twitter quotas and so on. This is not a definitive sample. Once we accept that (a) we are not going to rank every tweet and (b) this is therefore going to be a limited exercise, it frees us to make concessions that provide an easier technology solution.

Sentiment analysis is hard to do programmatically, but given the short time span of the event we could do it manually. We had an interface view onto incoming tweets, which we had pulled from a Twitter search. This allows us to be really accurate in our assessment. This does not work over a long period of time – the Chilcot inquiry is one thing, but you couldn’t do it for an event lasting a week or so.
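
As a rough illustration of the manual approach described above, and not The Guardian's actual code, here is a minimal Python sketch of how ratings assigned by hand to a sample of tweets might be turned into a single running reading. The `Rating` and `SentimentMeter` names, the -1/0/+1 scale and the window size are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Rating:
    tweet_id: str  # hypothetical identifier for a sampled tweet
    score: int     # assumed scale: -1 negative, 0 neutral, +1 positive


class SentimentMeter:
    """Rolling average over the most recent manually assigned ratings."""

    def __init__(self, window: int = 50):
        # Only the last `window` ratings count towards the reading.
        self.recent = deque(maxlen=window)

    def add(self, rating: Rating) -> None:
        if rating.score not in (-1, 0, 1):
            raise ValueError("score must be -1, 0 or +1")
        self.recent.append(rating.score)

    def reading(self) -> float:
        # Report 0.0 (neutral) until at least one tweet has been rated.
        if not self.recent:
            return 0.0
        return sum(self.recent) / len(self.recent)


# Example: a human rater scores four sampled tweets; the window keeps three.
meter = SentimentMeter(window=3)
for tweet_id, score in [("a", 1), ("b", -1), ("c", -1), ("d", 0)]:
    meter.add(Rating(tweet_id, score))
print(meter.reading())
```

Using a fixed-size window means the reading reflects only recent reaction, which suits a short live event; it says nothing about tweets that were never sampled.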