Monthly Archives: September 2009

The 100-25-10 Rule

A curious piece of data emerging from a conference at the American Press Institute. It seems that in “nearly all markets, newspaper websites receive 2.5 visits and 10 pageviews for each unique visitor.” Is this a 90-9-1 rule for the newspaper industry?

If you want to make it snappier, multiply by 10, so it becomes: 100 pageviews and 25 visits for every 10 visitors. The 100-25-10 rule.
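
The scaling is easy to check. Here is a minimal sketch in Python: only the 2.5 and 10 ratios come from the quoted figure, and the function name is made up for illustration.

```python
from fractions import Fraction

# Ratios from the quoted figure, expressed per unique visitor.
VISITS_PER_VISITOR = Fraction(5, 2)    # 2.5 visits
PAGEVIEWS_PER_VISITOR = Fraction(10)   # 10 pageviews


def scale_rule(visitors):
    """Return (pageviews, visits, visitors) for a number of unique visitors."""
    return (
        int(PAGEVIEWS_PER_VISITOR * visitors),
        int(VISITS_PER_VISITOR * visitors),
        visitors,
    )


print(scale_rule(10))  # (100, 25, 10)
```

The same ratios also imply four pageviews per visit (10 divided by 2.5).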


Google’s Fast Flip – a cruel joke on the news industry

So Google launched Fast Flip yesterday, a Labs experiment that allows you to ‘experience’ news websites in a similar way to their analogue equivalents. Yes, you can ‘flick’ through pages of news.


Superficially this appears little more than a repeat of many similar experiments in the past decade from publishers who thought readers wanted an analogue experience online and commissioned disproportionately expensive technologies that allowed you to ‘turn the page’ on-screen (I turned down one such technology myself as a magazine editor as long as 10 years ago). Things have moved on so much that anyone can have this flashy technology for themselves for free by going to Issuu.

So why are the web-native minds of Google wasting time on such an analogue-mindset concept?

Here’s the laughable quote that I think is key:

“To make money, Fast Flip also serves up contextual adverts around the screenshots.

“Publishers who have signed up to provide content to the service will share in that revenue; that was proof, said Ms Mayer, that Google was keen to help the industry at a time when it was clearly struggling.”

Oh yes, that’s concrete proof alright.

Allow me to call bullshit. If this is concrete proof of anything, it is proof that Google are prepared to cash in on the blind panic of the news industry in the midst of a crisis. Add in their recently mooted micropayments system and it’s almost as if Google are having a bit of fun tormenting ants with a magnifying glass.

Until now Google has walked a fine line in claiming that it is not the parasite that the news industry says it is. It did not sell adverts on Google News, it is generally the major source of traffic to news websites, and publishers are free to remove themselves from Google’s listings through a simple piece of script.

Fast Flip and the micropayments system are moves to take them over that line – despite the claims to be ‘helping’ the news industry, any relationship is likely to be skewed in the other direction, as anyone who has tried to make a living from AdSense will tell you. Note that, like AdSense:

“Google is running banner ads alongside the article thumbnails, the proceeds of which will be split with publishers (though Google won’t disclose the terms of the revenue split).”

Of course, by hosting screenshots Google are eating into one of the key metrics that publishers use to sell advertising: the time a user spends on your site. And given that many readers don’t read beyond the first few pars, there’s a good chance it will eat into the numbers clicking through to the actual page at all. So unless Google’s ad rates are significantly higher, what reason at all would a commercial publisher have to sign up to a scheme that devalues their own ad inventory in exchange for some pennies from Google? Blind panic in the midst of a crisis, that’s all.

In defence of paywalls redux: what he said

Back in June I posted ‘In defence of paywalls (a thought experiment)‘ where I said: “When you’re driving a tanker and you see a big rock ahead – do you ask everyone on the ship to rebuild it as an aeroplane? Or do you start steering away in the hope that your part of the tanker will somehow avoid the worst?”

I’ve only just come across a piece written in the same month by Michael Nielsen which expresses the same points in a much more rigorous way during a piece on disruption in general (h/t Jo Geary). It’s well worth reading in full, but here’s how he puts it so much better than I:


Data and the future of journalism panel discussion: Linked Data London

Tonight I had the pleasure of chairing an extremely informative panel discussion on data and the future of journalism at the first London Linked Data Meetup. On the panel were:

What follows is a series of notes from the discussion, which I hope are of some use.

For a primer on Linked Data there is A Skim-Read Introduction to Linked Data; Linked Data: The Story So Far (PDF) by Tom Heath, Christian Bizer and Tim Berners-Lee; and this TED video by Sir Tim Berners-Lee (who was on the panel before this one).

To set some brief context, I talked about how 2009 was, for me, a key year in data and journalism – largely because it has been a year of crisis in both publishing and government. The seminal point in all of this has been the MPs’ expenses story, which both demonstrated the power of data in journalism, and the need for transparency from government – for example, the government appointment of Sir Tim Berners-Lee, seeking developers to suggest things to do with public data, and the imminent launch of around the same issue.

Even before then the New York Times and Guardian both launched APIs at the beginning of the year, MSN Local and the BBC have both been working with Wikipedia and we’ve seen the launch of a number of startups and mashups around data including Timetric, Verifiable, BeVocal, OpenlyLocal, MashTheState, the open source release of Everyblock, and Mapumental.

Q: What are the implications of paywalls for Linked Data?

The general view was that Linked Data – specifically standards like RDF – would allow users and organisations to access information about content even if they couldn’t access the content itself. To give a concrete example, rather than linking to a ‘wall’ that simply requires payment, it would be clearer what the content beyond that wall related to (e.g. key people, organisations, author, etc.)
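
To make that idea concrete, here is a rough sketch in Python of what 'information about content' might look like when published separately from a paywalled article. It is an illustration under stated assumptions, not anything the panel proposed: the article URL, headline, author and subject are all hypothetical, and the only real vocabulary used is the Dublin Core element set.

```python
# Dublin Core elements namespace (a real, widely used vocabulary).
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical paywalled article; example.com and all values are placeholders.
ARTICLE = "http://example.com/articles/123"

# Each statement is a (subject, predicate, object) triple, the basic unit of RDF.
triples = [
    (ARTICLE, DC + "title", "Council approves budget"),
    (ARTICLE, DC + "creator", "A. Reporter"),
    (ARTICLE, DC + "subject", "Example City Council"),
]


def to_ntriples(statements):
    """Serialise triples with plain-string objects as N-Triples lines."""
    lines = []
    for subject, predicate, obj in statements:
        # Escape backslashes and double quotes inside the literal.
        literal = obj.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject}> <{predicate}> "{literal}" .')
    return "\n".join(lines)


print(to_ntriples(triples))
```

Anyone hitting the paywall could still read these three statements and learn who wrote the piece and what it is about, without seeing the article body.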

Leigh Dodds felt that using standards like RDF would allow organisations to more effectively package content in commercially attractive ways, e.g. ‘everything about this organisation’.

Q: What can bloggers do to tap into the potential of Linked Data?

This drew some blank responses, but Leigh Dodds was most forthright, arguing that the onus lay with developers to do things that would make it easier for bloggers to, for example, visualise data. He also pointed out that currently if someone does something with data it is not possible to track that back to the source and that better tools would allow, effectively, an equivalent of pingback for data included in charts (e.g. the person who created the data would know that it had been used, as could others).

Q: Given that the problem for publishing lies in advertising rather than content, how can Linked Data help solve that?

Dan Brickley suggested that OAuth technologies (where you use a single login identity for multiple sites that contains information about your social connections, rather than creating a new ‘identity’ for each) would allow users to define more precisely how they experience content, for instance: ‘I only want to see article comments by users who are also my Facebook and Twitter friends.’

The same technology would allow for more personalised, and therefore more lucrative, advertising.

John O’Donovan felt the same could be said about content itself – more accurate data about content would allow for more specific selling of advertising.

Martin Belam quoted James Cridland on radio: “[The different operators] agree on technology but compete on content”. The same was true of advertising but the advertising and news industries needed to be more active in defining common standards.

Leigh Dodds pointed out that semantic data was already being used by companies serving advertising.

Other notes

I asked members of the audience who they felt were the heroes and villains of Linked Data in the news industry. The Guardian and BBC came out well – The Daily Mail were named as repeat offenders who would simply refer to “a study” and not say which, nor link to it.

Martin Belam pointed out that The Guardian is increasingly asking itself ‘How will that look through an API’ when producing content, representing a key shift in editorial thinking. If users of the platform are swallowing up significant bandwidth or driving significant traffic then that would probably warrant talking to them about more formal relationships (either customer-provider or partners).

A number of references were made to the problem of provenance – being able to identify where a statement came from. Dan Brickley specifically spoke of the problem with identifying the source of Twitter retweets.

Dan also felt that the problem of journalists not linking would be solved by technology. In conversation previously, he also talked of “subject-based linking” and the impact of SKOS and linked data style identifiers. He saw a problem in that, while new articles might link to older reports on the same issue, older reports were not updated with links to the new updates. Tagging individual articles was problematic in that you then had the equivalent of an overflowing inbox.

(I’ve invited all 4 participants to correct any errors and add anything I’ve missed)

Finally, here’s a bit of video from the very last question addressed in the discussion (filmed with thanks by @countculture):

Linked Data London 090909 from Paul Bradshaw on Vimeo.

Online Journalism lesson #6: Interactivity

I’ve been rather tardy about getting all of these online, so here’s the 6th of my presentations from the Online Journalism class of Spring 2009, looking at Interactivity. Much of what I talk about here is also in my lengthy post on the topic:

Is the Mirror selling links to

The Mirror wants to watch out – as it looks like it’s selling links, even if it isn’t (as I first posted here and which later went hot on Sphinn). Several stories on the site share all these characteristics, and must look extremely suspicious to Google:

  • All the stories contain three links to the same MoneyExtra page.
  • All the links use different anchor text.
  • The text happens to be competitive search terms.
  • MoneyExtra isn’t mentioned in the article itself.
  • They were all published in August.
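
Three of those characteristics can be checked mechanically. The following is a rough sketch in Python and makes no claim to be how Google detects paid links: the function name, the threshold, the example.com URL and the sample article text are all hypothetical. It does not attempt the 'competitive search terms' test, which would need keyword data.

```python
from collections import defaultdict


def suspicious_targets(links, body_text, brand, min_links=3):
    """Flag link targets that match the pattern described above.

    links     -- list of (anchor_text, url) pairs found in one article
    body_text -- the article text
    brand     -- name of the site being linked to
    """
    anchors_by_url = defaultdict(set)
    counts = defaultdict(int)
    for anchor, url in links:
        anchors_by_url[url].add(anchor.lower())
        counts[url] += 1

    brand_mentioned = brand.lower() in body_text.lower()

    flagged = []
    for url, n in counts.items():
        # Every link to this target uses a different anchor text.
        all_different = len(anchors_by_url[url]) == n
        if n >= min_links and all_different and not brand_mentioned:
            flagged.append(url)
    return flagged


# Anchor texts taken from one of the stories; URL and body are placeholders.
links = [
    ("best credit card rate in the UK", "http://example.com/credit-cards"),
    ("best credit card", "http://example.com/credit-cards"),
    ("credit cards", "http://example.com/credit-cards"),
]
body = "Shopping around for a card can save you money."
print(suspicious_targets(links, body, brand="MoneyExtra"))
```

A match shows only that a page fits the pattern; it is not evidence that any links were sold.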

There’s nothing wrong or illegal about selling links if that is what they’re doing. But it’s likely to get you penalized by Google if they spot it, because sold links are a way of manipulating their search results (Google counts the number of links to a page as a measure of its importance).

Pages on from August

Now let’s look at several pages from

Headline: Sorting out the best credit card rate

This page from 20th August contains three links to the MoneyExtra credit cards page, using the link text “best credit card rate in the UK”, “best credit card” and “credit cards”. There is no mention of MoneyExtra in the article.

Headline: Why do credit card providers offer credit cards with 0% interest?

This page from 20th August contains three links to the MoneyExtra credit cards page, using the link text “credit card providers”, “0% credit card interest rates”, and “0% credit card deal”. No mention of MoneyExtra in the article.

Headline: Best credit card transfer: Does one size fit all?

This page from 5th August for once contains, er, three links to the MoneyExtra credit cards page, using the link text “best credit card”, “0% balance transfer rate” and “best credit card balance transfer rate”. Again, no mention of MoneyExtra in the article.

Headline: Is it too late for debt management in England?

This page from 20th August contains, er, three links to the MoneyExtra debt page, using the link text “debt management”, “debt” and “debt advice”. There is no mention of MoneyExtra in the article.

Headline: What is ‘government debt management’?

This page from 20th August contains, guess what, three links to the MoneyExtra debt page, using the link text “Government debt solution”, “debt management plans” and “debt”. There’s no mention of MoneyExtra in the article.

Something a bit different!

This page is a bit different. It’s from the 20th August, naturally. But it contains FOUR links to the MoneyExtra car insurance quotes page – and mentions MoneyExtra in the article!

Some other pages

Other pages from August (not the 20th this time) which contain three links to a specific MoneyExtra page but which don’t mention MoneyExtra in the article include: this one and this one and this one (OK, that one’s only got two links) and this one (as has that one) and this one.


As I say, there’s nothing wrong with selling links, and there’s no actual evidence that that’s what the Mirror is doing. However, this looks like the sort of pattern you’d see with sold links – so the Mirror wants to watch out it doesn’t get hit by a penalty by Google.

Data and the future of journalism: what questions should I ask?

Tomorrow I’m chairing a discussion panel on the Future of Journalism at the first London Linked Data Meetup. On the panel are:

What questions would you like me to ask them about data and the future of journalism?