Tag Archives: Guardian

Data and the future of journalism panel discussion: Linked Data London

Tonight I had the pleasure of chairing an extremely informative panel discussion on data and the future of journalism at the first London Linked Data Meetup. On the panel were:

What follows is a series of notes from the discussion, which I hope are of some use.

For a primer on Linked Data there is A Skim-Read Introduction to Linked Data; Linked Data: The Story So Far (PDF) by Tom Heath, Christian Bizer and Tim Berners-Lee; and this TED video by Sir Tim Berners-Lee (who was on the panel before this one).

To set some brief context, I talked about how 2009 was, for me, a key year in data and journalism – largely because it has been a year of crisis in both publishing and government. The seminal point in all of this has been the MPs’ expenses story, which demonstrated both the power of data in journalism and the need for transparency from government. The government’s response has included the appointment of Sir Tim Berners-Lee, a call for developers to suggest things to do with public data, and the imminent launch of Data.gov.uk around the same issue.

Even before then, the New York Times and Guardian both launched APIs at the beginning of the year, MSN Local and the BBC have both been working with Wikipedia, and we’ve seen the launch of a number of startups and mashups around data, including Timetric, Verifiable, BeVocal, OpenlyLocal, MashTheState, the open source release of Everyblock, and Mapumental.

Q: What are the implications of paywalls for Linked Data?

The general view was that Linked Data – specifically standards like RDF – would allow users and organisations to access information about content even if they couldn’t access the content itself. To give a concrete example, rather than linking to a ‘wall’ that simply requires payment, the link would make clear what the content beyond that wall relates to (e.g. key people, organisations, author).

Leigh Dodds felt that using standards like RDF would allow organisations to more effectively package content in commercially attractive ways, e.g. ‘everything about this organisation’.
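As a rough illustration of the paywall point (not something shown at the meetup), here is a minimal Python sketch of how a publisher might describe an article in RDF-style triples without exposing the article text. The `example.com` URIs, the article details and the helper names are all hypothetical; the only real vocabulary used is Dublin Core Terms (`title`, `creator`, `subject`).

```python
# Hypothetical sketch: every example.com URI and the article details are invented.
DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core Terms namespace


def describe_article(article_uri, title, creator_uri, subject_uris):
    """Return (subject, predicate, object) triples describing an article."""
    triples = [
        (article_uri, DCTERMS + "title", title),
        (article_uri, DCTERMS + "creator", creator_uri),
    ]
    for subject_uri in subject_uris:
        triples.append((article_uri, DCTERMS + "subject", subject_uri))
    return triples


def to_ntriples(triples):
    """Serialise triples as N-Triples lines: URIs in <>, plain text in quotes."""
    lines = []
    for s, p, o in triples:
        if o.startswith("http"):
            obj = f"<{o}>"
        else:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


article_triples = describe_article(
    "http://example.com/articles/123",
    "Paywalled report",
    "http://example.com/people/jane-doe",
    ["http://example.com/organisations/acme"],
)
print(to_ntriples(article_triples))
```

Anyone hitting the paywall could still read these statements, and a publisher could group articles by the same `subject` URI to sell ‘everything about this organisation’.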

Q: What can bloggers do to tap into the potential of Linked Data?

This drew some blank responses, but Leigh Dodds was most forthright, arguing that the onus lay with developers to do things that would make it easier for bloggers to, for example, visualise data. He also pointed out that, currently, if someone does something with data it is not possible to track that back to the source. Better tools would allow, effectively, an equivalent of pingback for data included in charts (e.g. the person who created the data would know that it had been used, as could others).

Q: Given that the problem for publishing lies in advertising rather than content, how can Linked Data help solve that?

Dan Brickley suggested that OAuth technologies (where you use a single login identity for multiple sites that contains information about your social connections, rather than creating a new ‘identity’ for each) would allow users to specify more precisely how they experience content, for instance: ‘I only want to see article comments by users who are also my Facebook and Twitter friends.’

The same technology would allow for more personalised, and therefore more lucrative, advertising.

John O’Donovan felt the same could be said about content itself – more accurate data about content would allow for more specific selling of advertising.

Martin Belam quoted James Cridland on radio: “[The different operators] agree on technology but compete on content”. The same was true of advertising but the advertising and news industries needed to be more active in defining common standards.

Leigh Dodds pointed out that semantic data was already being used by companies serving advertising.

Other notes

I asked members of the audience who they felt were the heroes and villains of Linked Data in the news industry. The Guardian and BBC came out well – The Daily Mail were named as repeat offenders who would simply refer to “a study” and not say which, nor link to it.

Martin Belam pointed out that The Guardian is increasingly asking itself ‘How will that look through an API’ when producing content, representing a key shift in editorial thinking. If users of the platform are swallowing up significant bandwidth or driving significant traffic then that would probably warrant talking to them about more formal relationships (either customer-provider or partners).

A number of references were made to the problem of provenance – being able to identify where a statement came from. Dan Brickley specifically spoke of the problem with identifying the source of Twitter retweets.

Dan also felt that the problem of journalists not linking would be solved by technology. In conversation previously, he also talked of “subject-based linking” and the impact of SKOS and linked data style identifiers. He saw a problem in that, while new articles might link to older reports on the same issue, older reports were not updated with links to the new updates. Tagging individual articles was problematic in that you then had the equivalent of an overflowing inbox.

(I’ve invited all 4 participants to correct any errors and add anything I’ve missed)

Finally, here’s a bit of video from the very last question addressed in the discussion (filmed with thanks by @countculture):

Linked Data London 090909 from Paul Bradshaw on Vimeo.

Data and the future of journalism: what questions should I ask?

Tomorrow I’m chairing a discussion panel on the Future of Journalism at the first London Linked Data Meetup. On the panel are:

What questions would you like me to ask them about data and the future of journalism?

UK newspapers add 213,892 Twitter followers in a month

National UK newspapers had 1,471,936 Twitter followers at the start of September – up 213,892 or 17% on August 1 (when they had 1,258,044 followers).
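For anyone who wants to check the arithmetic, here is a small Python sketch using only the two totals quoted above (the variable names are mine):

```python
# Follower totals quoted in the post.
august_followers = 1_258_044
september_followers = 1_471_936

# Absolute and percentage growth over the month.
added = september_followers - august_followers
growth_pct = 100 * added / august_followers

print(f"{added:,} new followers, up {growth_pct:.0f}%")
```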

You can see the September figures (originally posted here) below or here.

I have more Twitter statistics here.

Guardian the most bookmarked newspaper on delicious

The Guardian has more URLs bookmarked on Delicious than any other UK newspaper, as I first revealed here (with the original video here)

There are 10,914 Guardian URLs bookmarked, with the Times coming 2nd (3,944) and the Independent in 3rd place (3,196).

| Newspaper website | Bookmarks on Delicious |
|---|---|
| Guardian | 10,914 |
| Times Online | 3,944 |
| The Independent | 3,196 |
| Telegraph | 2,258 |
| The Sun | 1,409 |
| FT | 1,303 |
| Daily Mail | 785 |
| Mirror | 624 |
| Express | 197 |

Quarkbase must be using the Delicious API but it doesn’t say where it gets the number. Click the papers’ names to see the Quarkbase figures (and more).

Guardian joins NYT in mulling over members’ club

It seems The Guardian is considering launching a members’ club of some sort as part of moves to increase revenue, an idea that was also mooted by the New York Times a few months ago.

Members’ clubs are not a particularly new idea – they’ve been used successfully in the magazine industry for a long time – and they have a lot of potential, although probably not as a massive revenue generator, and less so in a recession (talk to anyone in the events industry to understand why). I’m trying to get hold of some concrete figures and experiences of these – if you have any, I’d be grateful if you could add them.

The biggest problem for newspapers in putting together a members’ club is the diversity of their ‘members’.

When the New York Times’ Bill Keller described their possible members’ club it apparently included “a baseball cap or a T-shirt, an invite to a Times event, or perhaps, like The Economist, access to specialized content on the Web.”

The Guardian appear to have a little more imagination: “benefits might include, for example, a welcome pack, exclusive content, live events, special offers from our partners and the opportunity to communicate with our journalists.”*

Still, from the very vague initial impressions I think both are making the mistake of seeing readers as an amorphous mass of ‘news consumers’ rather than a collection of niche markets.

The Guardian, for example, has particular strengths in covering the media, education, and ‘society’ (the supplements it prints on the first 3 days of the week). If I was launching a members’ club I would start with one of those (not media) and branch outwards. The offering then becomes much clearer (both to readers and commercial partners), the learning curve quicker and less damaging – and it also becomes easier for users to charge it to an institution.

*By the way, I love the fact that “the opportunity to communicate with our journalists” is part of the deal. So much for being ‘part of the conversation’.

The stickiness of UK newspaper sites compared

Visitors to UK newspaper sites look at an average of 2.5 pages a day, according to data from Alexa. But 62.8% of visits consist of just one page (figures originally posted here).

In terms of daily page views per user, the Sun (4 pages), Guardian (3.1) and Telegraph (2.9) are above average. Visitors to the Mail site look at just 2.4 pages a day – so while the Mail may have come top in the July ABCe figures, maybe its large number of overseas visitors aren’t staying to look round the site.

Stickiness of UK newspaper sites

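| Newspaper | Daily page views per user | Bounce rate (%) |
|---|---|---|
| The Sun | **4** | **48.5** |
| Guardian | **3.1** | **59.2** |
| Telegraph | **2.9** | 65.2 |
| Daily Mail | 2.4 | **60.7** |
| Times Online | 2.4 | **59.7** |
| Independent | 2.2 | 70.4 |
| FT.com | 1.9 | 66.8 |
| Mirror | 1.7 | 67.5 |
| Express | 1.7 | 66.7 |
| Average | 2.5 | 62.8 |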
  • Better than average figures are in bold.
  • The bounce rate is the percentage of visits that consisted of just one page (so a low number is good).
  • These figures are 3-month averages. These change on a daily basis at Alexa – so they may have altered slightly by the time you check. Click the papers’ names to see the current data.
  • The overall average at the bottom is a simple average – it has not been weighted by traffic.
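To make the ‘simple average’ note concrete, here is a short Python sketch that recomputes the page-view average from the table and picks out the above-average sites. It uses the rounded figures as published, so it is only an approximation of whatever unrounded Alexa data sat behind them; the variable names are mine.

```python
# Daily page views per user, as published in the table (rounded figures).
page_views = {
    "The Sun": 4.0,
    "Guardian": 3.1,
    "Telegraph": 2.9,
    "Daily Mail": 2.4,
    "Times Online": 2.4,
    "Independent": 2.2,
    "FT.com": 1.9,
    "Mirror": 1.7,
    "Express": 1.7,
}

# Simple (unweighted) average: every site counts equally, whatever its traffic.
simple_average = sum(page_views.values()) / len(page_views)

# Sites whose page views per user beat that average.
above_average = [paper for paper, views in page_views.items() if views > simple_average]

print(round(simple_average, 1), above_average)
```

A traffic-weighted average would need each site’s visitor numbers as weights, which this table does not include.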

Page views vs bounce rate

The table is ranked by daily page views per user. The bounce rate is another measure of stickiness. It doesn’t exactly correlate with page views, as papers may have differing proportions of loyal, engaged users who visit lots of pages. The more pages that these users visit, the better the page view figure – but they won’t affect the bounce rate.

The Telegraph has a worse bounce rate than the sites near it in the table, perhaps because the great success with its Digg tool doesn’t always lead to multi-page visits?

Using Alexa data

There are issues with using Alexa data like this as it underrepresents UK users, who may have differing usage patterns to other visitors. However, as it seems to underrepresent them more or less equally, the rankings should be OK even if the absolute figures are all out by the same margin.

How US traffic is vital for UK newspaper sites

The latest figures for UK users from the audited ABCes together with Compete’s figures for American site usage show how USA traffic is vital for UK newspaper sites (figures originally posted here).

On average, US traffic is 36.8% of the UK traffic (ie there is just over one US visitor for every 3 UK visitors). The figure for the Telegraph is slightly higher (44.5%) and for the Mail it’s a massive 62.5%.

| Newspaper site | USA visitors (Compete) | UK visitors (ABCe) | US users as % of UK |
|---|---|---|---|
| Daily Mail | 5,199,078 | 8,316,083 | 62.5 |
| Telegraph | 4,087,769 | 9,184,082 | 44.5 |
| Times Online | 2,805,815 | 7,668,637 | 36.6 |
| Guardian | 3,676,498 | 10,211,385 | 36.0 |
| Independent | 1,317,298 | 3,781,320 | 34.8 |
| The Sun | 2,419,319 | 8,704,036 | 27.8 |
| Mirror | 748,098 | 4,907,540 | 15.2 |
| FT.com | 5,960,589 | n/a | n/a |
| Express | 63,216 | n/a | n/a |
| Average | 2,919,742 | 7,539,012 | 36.8 |

These figures are all for June 2009. The FT wasn’t audited in June’s ABCes. The Express isn’t in the ABCes. I had planned to use Alexa data but Compete seems a bit more robust.
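The percentage column can be reproduced from the two visitor columns. The Python sketch below does that for the seven sites with both figures, and assumes the 36.8% average is a simple mean of those seven percentages (which is what the numbers suggest). The variable names are mine.

```python
# (US visitors from Compete, UK visitors from ABCe), June 2009, as published.
visitors = {
    "Daily Mail": (5_199_078, 8_316_083),
    "Telegraph": (4_087_769, 9_184_082),
    "Times Online": (2_805_815, 7_668_637),
    "Guardian": (3_676_498, 10_211_385),
    "Independent": (1_317_298, 3_781_320),
    "The Sun": (2_419_319, 8_704_036),
    "Mirror": (748_098, 4_907_540),
}

# US visitors expressed as a percentage of UK visitors, per site.
us_share = {paper: 100 * us / uk for paper, (us, uk) in visitors.items()}

# Simple mean of the per-site percentages (not weighted by traffic).
average_share = sum(us_share.values()) / len(us_share)

for paper, share in us_share.items():
    print(f"{paper}: {share:.1f}%")
print(f"Average: {average_share:.1f}%")
```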

The figures are further proof that the Mail’s success in the June ABCes was driven by American searches for Michael Jackson’s kids.

Did Michael Jackson’s kids make the Daily Mail the most visited UK newspaper site in June?

The Daily Mail surprisingly overtook the Telegraph and Guardian in the June ABCes – with more unique visitors than any other UK newspaper (this is a cross-post of my original June ABCe analysis on my blog).

However it was only 4th in terms of UK visitors. Figures from Compete.com, which tracks Americans’ internet use, show that, of the 4.7 million unique users the Mail added from May to June, 1.2 million were from the USA. American and other foreign visitors searching for Michael Jackson’s kids – the Mail tops google.com for a search on this – drove this overseas growth.

US traffic to UK newspaper sites

Of the big three UK newspaper sites this is what happened to their US traffic from May to June:

This dramatic increase in traffic, compared to its rivals, from May to June helps explain how the Mail leapfrogged the Guardian and Telegraph.


Google.com was the main referrer to the Mail – responsible for 22.7% of its traffic. More on this below. Next up was drudgereport.com (a large US news aggregation site), followed by Yahoo.com and Facebook.com.

What was behind this rise in US traffic?

So what led to this sudden increase for the Mail? Compete also shows you the main search terms that lead US visitors to sites.

Who links to the report they’re reporting on?

This week the UK government released a report into social mobility. While mainstream reporting focused mainly on the broad picture, I wanted to read the original government report itself. Which publishers linked to it?

I’ve written and spoken extensively on the importance of linking, but it comes down to 2 core reasons:

Firstly, Google will rank a page more highly if it includes more outgoing links.

Secondly, people will return to your site more often if they know they can expect useful links.

So get your act together, please: what are news organisations doing to address this?