There have been quite a few scraping-related stories that I’ve been meaning to blog about – so many that I’ve decided to write a round-up instead. Together they demonstrate the increasing role that scraping is playing in journalism – and the possibilities it opens up for those who haven’t yet tried it:
Scraping company information
Chris Taggart explains how he built a database of corporations which will be particularly useful to journalists and anyone looking at public spending:
“Let’s have a look at one we did earlier: the Isle of Man (there’s also one for Gibraltar, Ireland, and in the US, the District of Columbia) … In the space of a couple of hours not only have we liberated the data, but both the code and the data are there for anyone else to use too, as well as being imported in OpenCorporates.”
OpenCorporates are also offering a bounty for programmers who can scrape company information from other jurisdictions.
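For anyone wondering what a scraper like this boils down to, here is a minimal sketch in Python. It is not Chris Taggart’s code or anything OpenCorporates uses: the sample HTML, the column names and the function names (CompanyTableParser, scrape_companies) are all made up for illustration, and a real register would also need fetching, pagination and error handling. It simply assumes the register publishes companies as rows in an HTML table and turns each row into a record.

```python
from html.parser import HTMLParser


class CompanyTableParser(HTMLParser):
    """Collect the text of each table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def scrape_companies(html):
    """Turn a table with a header row into a list of dicts, one per company."""
    parser = CompanyTableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    keys = [h.lower().replace(" ", "_") for h in header]
    return [dict(zip(keys, row)) for row in body]


# Hypothetical page content; a real register's markup will differ.
SAMPLE_HTML = """
<table>
  <tr><th>Company Number</th><th>Name</th><th>Status</th></tr>
  <tr><td>000001C</td><td>Example Holdings Limited</td><td>Live</td></tr>
  <tr><td>000002C</td><td>Sample Trading Limited</td><td>Dissolved</td></tr>
</table>
"""

companies = scrape_companies(SAMPLE_HTML)
for company in companies:
    print(company["company_number"], company["name"], company["status"])
```

Once the rows are dictionaries like this, saving them to a database or a CSV file for others to reuse is the easy part.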
Scraperwiki on the front page of The Guardian…
The Scraperwiki blog gives the story behind a front page investigation by James Ball on lobbyist influence in the UK Parliament.
Ben Goldacre writes about the British Chiropractic Association suing Simon Singh (you’ll see a badge on this blog on the issue), and how bloggers have helped investigate the association’s claims.
“Fifteen months after the case began, the BCA finally released the academic evidence it was using to support specific claims. Within 24 hours this was taken apart meticulously by bloggers, referencing primary research papers, and looking in every corner.
“Professor David Colquhoun of UCL pointed out, on infant colic, that the BCA cited weak evidence in its favour, while ignoring strong evidence contradicting its claims. He posted the evidence and explained it. LayScience flagged up the BCA selectively quoting a Cochrane review. Every stone was turned by Quackometer, APGaylard, Gimpyblog, EvidenceMatters, Dr Petra Boynton, MinistryofTruth, Holfordwatch, legal blogger Jack of Kent, and many more. At every turn they have taken the opportunity to explain a different principle of evidence based medicine – the sin of cherry-picking results, the ways a clinical trial can be unfair by design – to an engaged lay audience, with clarity as well as swagger.”
Here’s the payoff:
“a ragged band of bloggers from all walks of life has, to my mind, done a better job of subjecting an entire industry’s claims to meaningful, public, scientific scrutiny than the media, the industry itself, and even its own regulator. It’s strange this task has fallen to them, but I’m glad someone is doing it, and they do it very, very well indeed.”