Category Archives: online journalism

A Storify of what Android phones people recommended on Twitter

Yesterday I asked – on this blog, on my Facebook page, and on Twitter – what Android phones were best for a journalism student who didn’t want to buy an iPhone or BlackBerry. The blog post comments are particularly informative on the key features to look out for, while the tweets provide a good overview of who recommends what, and why. I’ve used Storify to organise those below:

View “What Android phone would you recommend to a student journalist?” on Storify

A network infrastructure for journalists online

For some years now, I have started every online journalism course I teach with an introduction to three key tools: RSS readers, social networks, and social bookmarking.

These are, I believe, the basis of a network infrastructure which few modern journalists – whatever their platform – can do without.

The word ‘network’ is key here – because I believe one of the fundamental changes that journalists have to adapt to in the 21st century is the move to networked modes of working.

Firstly, because the newsroom itself is becoming more networked with contributors situated outside of it (the increasingly collaborative nature of journalism).

Secondly, because sources are becoming more networked (formal organisations are increasingly complemented by ad hoc ones formed across Facebook, Twitter, blogs, and so on).

And finally, because distribution of news – which has both commercial and editorial implications – is reliant on networks outside the journalist’s or their employer’s control.

The network infrastructure outlined below operates on two levels: the tools themselves, and how they connect to each other. In an attempt to clarify that, I’ve created a diagram.

The icons in the diagram attempt to show clearly the purpose of each tool:

  • The exclamation mark representing RSS readers indicates that the tool is focused on monitoring what’s new;
  • The question mark representing social bookmarking indicates that the tool largely serves to answer questions, providing context and background;
  • The facial expressions representing social networks indicate that this tool helps provide access to sources who may have stories to tell (positive; negative) or who are asking important questions (confused).

Here is a further breakdown of each element, and how they connect to each other.

RSS Reader

As outlined above, this part of the structure is all about ‘What’s new?’ and is quite often the first thing a journalist checks at the start of the working day (indeed, it’s ideal for checking on a phone on the way to work). It is the modern equivalent of picking up the day’s newspapers and tuning into the first radio and TV broadcasts of the day.

The RSS Reader gathers news feeds from a range of sources. Here are just a few:

  • Formal news organisations
  • Journalistic blogs
  • Organisational blogs
  • Personal blogs of individuals in your field

In addition, an RSS reader allows you to follow customised feeds reporting any mention of key terms, organisations and individuals across a variety of platforms:

  • Google News
  • The blogosphere as a whole
  • Social bookmarking services such as Delicious
  • Forums
  • Microblogging services such as Twitter
  • Video sharing services such as YouTube
  • Photo sharing services such as Flickr
  • Audio sharing services such as Audioboo
  • Social networks such as Facebook Pages

This is how the RSS reader connects to the two other elements of the infrastructure: most social networks have RSS feeds of some kind, as do social bookmarking services. (One of the reasons I prefer Delicious over other platforms is that it has an RSS feed for every user, for every item bookmarked with a particular ‘tag’ (explained below), for tags by particular users, and for any combination of tags.)

These are explained in a bit more detail in my post on ‘Passive-Aggressive Newsgathering’.

But if you can follow these feeds in an RSS reader, why use a social network at all?

Social networks

Why use a social network? To follow people, not just content, and because your own contributions to those networks are a key factor in gaining access to sources.

With many social networking platforms (Twitter, for example) you can of course subscribe to individual users’ RSS feeds, or to a feed of updates from everyone you are ‘following’. But there’s little point: your RSS reader would soon become flooded with updates. Instead, use the RSS reader to follow subjects, and add the individuals talking about those subjects to your social networks.

The social network provides an added level of serendipity to your newsgathering: increased opportunities to encounter leads, tips and stories that you would not otherwise encounter.

It is also a three-way medium: a platform for you to ask questions or invite experiences relevant to the story you are pursuing, or to follow the public conversations of others asking questions or sharing experiences.

Because of this focus on social networks as a serendipity engine, I see Twitter as a ‘stream, not a pool’: I worry not about following too many people but about following too few – and I have my cake and eat it by using Lists as a filter for those I least want to miss.

The final use for social networks is often the first use that journalists think of: distribution. And it is here that social networking also connects to the other two parts of the network infrastructure.

If you read something interesting in your RSS reader and wish to share it across social networks, you can often do so with a single click – with a bit of preparation. Twitterfeed is a tool which will automatically tweet updates on your Twitter account – all you need to know is the RSS feed for the updates you want to share. If you’re using Google Reader, for example, that feed is on your Shared Items page.

To tweet something interesting you’ve seen in your RSS Reader all you have to do then is (in the case of Google Reader) click on the ‘Share’ button below that item.

Social bookmarking

The first two parts of the network infrastructure – an RSS reader and social networks – are about the initial stages of newsgathering; the first things you check at the start of a working day.

Social bookmarking, however, is about what you do with information from your RSS reader and social networks – and information you deal with throughout your day.

Today’s news is tomorrow’s context. And social bookmarking allows you to keep a record of that context to make it quickly accessible when needed.

That’s the bookmarking part. The social part allows you to publish information at the same time as you store it; to discover what information other people with similar interests are bookmarking; and to discover which people are bookmarking similar things to you.

Because social bookmarking is the least immediate element of this network infrastructure, it is also the aspect which the fewest students get their heads around and actually use.

Yet it is, for me, perhaps the most useful element. It takes an upfront investment of time and the development of a habit which initially doesn’t have any obvious reward.

But when you’re up against a deadline and are able to retrieve a dozen useful reports, documents and people within minutes – then you’ll get it.

Here’s the process:

  1. You come across something of interest. It may be a useful article, blog post or official report in your RSS reader – or a document linked to by someone in your social network. You might encounter the thing of interest while working on a story. You may read it – you may not have time.
  2. You bookmark the specific webpage containing it using a service like Delicious. You add ‘tags’ to help you find it later: these might include:
    • the subjects of the webpage (e.g. ‘environment’, ‘health’),
    • its author or publisher (e.g. ‘paulbradshaw’, ‘OJB’),
    • specific organisations or individuals (‘nhs’, ‘davidcameron’),
    • the type of document (‘report’, ‘research’, ‘video’)
    • or information (‘statistics’, ‘contacts’),
    • and even tags you have made up which refer to a specific story or event (‘croatia11’)
  3. You can if you wish add ‘Notes’. Many people copy a key passage from the webpage here, such as a quote (if a passage is selected on the page it will be automatically entered, depending on how you are bookmarking it) to help them remember more about the page and why it was important.
  4. You can also mark your bookmark as ‘private’. This means that no one else can see it – it becomes ‘non-social’.
  5. Once you save it, it becomes available for you to retrieve at a future date: a personal search engine of items you once encountered.

The key thing here is to think about how you might look for this in future, and make sure you use those tags. For example, the publisher might not seem important now, but if in future you need to re-read a certain report and can recall that it appeared in the FT, that will help you access it quickly.

UPDATE: I’ve written a post explaining how this works with a particular case study.

Remember also that tags can be combined, so if I want to narrow down my search to items that I bookmarked with both ‘UGC’ and ‘BBC’, I can find those at delicious.com/paulb/UGC+BBC.
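That URL pattern is regular enough to build programmatically. Here’s a minimal Python sketch of the pattern (the username and tags are just the examples above):

```python
def delicious_tag_url(user, *tags):
    """Build a Delicious URL for bookmarks matching ALL the given tags.

    Tags are combined with '+' in the path, e.g.
    delicious.com/paulb/UGC+BBC for items tagged both 'UGC' and 'BBC'.
    """
    if not tags:
        return f"http://delicious.com/{user}"
    return f"http://delicious.com/{user}/{'+'.join(tags)}"

print(delicious_tag_url("paulb", "UGC", "BBC"))
```

Adding a third tag simply appends another `+tag` segment, which is what makes narrowing a search so quick.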

This is one of the reasons why a social bookmarking service is more effective than an RSS reader. You can, for example, search your shared or starred items in Google Reader – and you can tag them also – but as you tend to get more results it is harder to find what you are looking for. The use and combination of tags in Delicious narrows things down very effectively – but equally importantly, it allows you to bookmark pages that do not appear in your RSS reader.

That said, if you cannot find what you are looking for in Delicious, Google Reader is another option. It is also worth using a backup service which provides another way to search your bookmarks. Trunk.ly is one that does just that.

Of course, the bookmark only points to the live webpage – and it may be that in future the page is moved, changed, or deleted. If you are dealing with that type of information it is worth copying it to another webspace (I use the quote option on Tumblr) or using a (generally paid-for) social bookmarking service that saves copies of the pages you bookmark (Diigo and Pinboard are just two).

Social bookmarking: networks and cross-publishing

One of the features of social bookmarking services is that you can follow the bookmarks of other users. In Delicious this is called your network – and it’s where social bookmarking not only connects to RSS readers but also becomes a form of social network. Here’s how you build your network:

  1. Look at your bookmarks. Next to each one will be a number indicating how many users have bookmarked this. If you click on this you will see a list of who bookmarked it, and when. (Alternatively, you could also look at all users using a particular tag – if you’re a health correspondent, for example, you might want to look at people who are tagging items with ‘NHS’). Click on any name to see all their public bookmarks.
  2. If you would like to follow that person’s future bookmarks (because they are bookmarking items which relate to your interests), click on ‘Add to my network’
  3. You will now be able to see their bookmarks – and those of anyone else you have added – on your ‘Network’ page. It is, essentially, a mini RSS reader.

Which is why I use Google Reader to follow my network’s bookmarks instead. Because at the bottom of your Delicious Network page is, of course, a link to an RSS feed. Right-click on this and copy the link, then paste it into your RSS reader and you don’t need to keep checking your Delicious Network separately to all your other RSS feeds.

Of course, if you find someone interesting on Delicious, you might find them interesting on Twitter or a blog. If they’ve edited their Delicious public profile (the one you found in step 1 above) it might include a link. Alternatively, there’s a good chance they’ve used the same username on other social networks – so search for them using that.

This is another example of how social bookmarking can connect to social networking.

Here’s another: you can use a service like Twitterfeed (explained above) to auto-publish every item you bookmark – or just those with a particular tag, or a combination of tags. Because Delicious provides RSS feeds for your bookmarks as a whole, those with a particular tag, and any combination of tags.

For example, anything I tag ‘t’ is automatically tweeted by Twitterfeed on my @paulbradshaw Twitter account. Anything I tag ‘hmitwt’ is tweeted the same way – but to my @helpmeinvestig8 account. Editor Marc Reeves uses the same service to tweet all of his bookmarks with “I’m reading…”.
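The routing being described is just a lookup from tag to account. Here’s a toy Python sketch of the idea, using the handles mentioned above (this is an illustration of the logic, not Twitterfeed’s actual code):

```python
# Tag-based routing, Twitterfeed-style: each watched tag
# feeds a different Twitter account.
ROUTES = {
    "t": "@paulbradshaw",
    "hmitwt": "@helpmeinvestig8",
}

def accounts_for_bookmark(tags):
    """Return the accounts a bookmark should be tweeted to, given its tags."""
    return [ROUTES[t] for t in tags if t in ROUTES]

print(accounts_for_bookmark(["t", "datajournalism"]))
```

The point is that one bookmarking action can feed several distribution channels, depending entirely on which tags you attach.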

You can use a Facebook app like RSS Graffiti to do the same thing on a Facebook page.

One process across your network infrastructure then starts to look like this:

  1. Read interesting blog post on Google Reader
  2. Bookmark using Delicious – use a tag which is automatically tweeted
  3. Link auto-tweeted on Twitter

Conversely, if you want to automatically bookmark links that you share on Twitter, you can do so by signing up to Packrati.us. Tweeted links will be given the tag ‘packrati.us’ as well as any hashtags that you include in the same tweet (so a link tweeted with the hashtag ‘#crime’ will also be tagged ‘crime’).
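That hashtag-to-tag mapping is easy to picture in code. Here’s a simplified Python sketch of the idea (not Packrati.us’s actual implementation):

```python
import re

def tags_for_tweet(tweet, service_tag="packrati.us"):
    """Derive bookmark tags from a tweet, Packrati.us-style:
    every #hashtag becomes a plain tag, plus the service's own tag."""
    hashtags = [h.lower() for h in re.findall(r"#(\w+)", tweet)]
    return [service_tag] + hashtags

print(tags_for_tweet("Knife crime stats http://example.com #crime #statistics"))
```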

Another process across your network infrastructure then starts to look like this:

  1. Read interesting link tweeted on Twitter
  2. Retweet it, adding relevant hashtags
  3. Link is auto-bookmarked on Delicious

Listen, connect, publish

This has turned out to be a long post – which is why I think the diagram is needed. The initial set up is simple: sign up to social networks and a social bookmarking service, and set up an RSS reader. Subscribe to feeds, and add people to your networks.

But once you’ve done the technical part, you need to develop the habit of listening and continuing to add to those networks: check your RSS feeds and networks every day (but know when to switch off), and look for new sources. Bookmark useful resources – articles, documents, reports, research and profile pages – and tag them effectively.

Finally, contribute to those networks and connect the different parts together so it is as easy as possible to gather, store, publish and distribute useful information.

As you start to understand what RSS feeds open up, you also start to see all sorts of possibilities beyond this. A site like If This Then That (IFTTT) not only showcases those possibilities particularly effectively, it also makes them as easy as they’ve ever been.

It is a small – and regular – investment of time. But it will keep you in touch with your field, lead you to new sources and new stories, and help you work faster and deeper in reporting what’s happening.


Data Journalists Engaging in Co-Innovation…

You may or may not have noticed that the Boundary Commission released their take on proposed parliamentary constituency boundaries today.

They could have released the data – as data – in the form of shape files that can be rendered at the click of a button in things like Google Maps… but they didn’t… [The one thing the Boundary Commission quango forgot to produce: a map] (There are issues with publishing the actual shapefiles, of course. For one thing, the boundaries may yet change – and if the original shapefiles are left hanging around, people may start to draw on these now incorrect sources of data once the boundaries are fixed. But that’s a minor issue…)

Instead, you have to download a series of hefty PDFs, one per region, to get a flavour of the boundary changes. Drawing a direct comparison with the current boundaries is not possible.

The make-up of the actual constituencies appears to be based on their member wards, data which is provided in a series of spreadsheets, one per region, each containing several sheets describing the ward makeup of each new constituency for the counties in the corresponding region.

It didn’t take long for the data junkies to get on the case though. From my perspective, the first map I saw was on the Guardian Datastore, reusing work by University of Sheffield academic Alasdair Rae, apparently created using Google Fusion Tables (though I haven’t seen a recipe published anywhere? Or a link to the KML file that I saw Guardian Datablog editor Simon Rogers/@smfrogers tweet about?)

[I knew I should have grabbed a screen shot of the original map…:-(]

It appears that Conrad Quilty-Harper (@coneee) over at the Telegraph then got on the case, and came up with a comparative map drawing on Rae’s work as published on the Datablog, showing the current boundaries compared to the proposed changes, and which ties the maps together so the zoom level and focus are matched across the maps (MPs’ constituencies: boundary changes mapped):

Telegraph side by side map comparison

Interestingly, I was alerted to this map by Simon tweeting that he liked the Telegraph map so much, they’d reused the idea (and maybe even the code?) on the Guardian site. Here’s a snapshot of the conversation between these two data journalists over the course of the day (reverse chronological order):

Datajournalists in co-operative bootstrapping mode

Here’s the handshake…

Collaborative co-evolution

I absolutely love this… and what’s more, it happened over the course of four or five hours, with a couple of technology/knowledge transfers along the way, as well as evolution in the way both news agencies communicated the information compared to the way the Boundary Commission released it. (If I was evil, I’d try to FOI the Boundary Commission to see how much time, effort and expense went into their communication effort around the proposed changes, and would then try to guesstimate how much the Guardian and Telegraph teams put into it as a comparison…)

At the time of writing (15.30), the BBC have no data driven take on this story…

And out of interest, I also wondered whether Sheffield U had a take…

Sheffield University media site

Maybe not…

PS By the by, the DataDrivenJournalism.net website relaunched today. I’m honoured to be on the editorial board, along with @paulbradshaw @nicolaskb @mirkolorenz @smfrogers and @stiles, and looking forward to seeing how we can start to drive interest, engagement and skills development in, as well as analysis and (re)use of, and commentary on, public open data through the data journalism route…

PPS if you’re into data journalism, you may also be interested in GetTheData.org, a question and answer site in the model of Stack Overflow, with an emphasis on Q&A around how to find, access, and make use of open and public datasets.

Creating Thematic Maps Based on UK Constituency Boundaries in Google Fusion Tables

I don’t have time to chase this just now, but it could be handy… Over the last few months, several of Alasdair Rae’s (University of Sheffield) Google Fusion Tables-generated maps have been appearing on the Guardian Datablog, including one today showing the UK’s new Parliamentary constituency boundaries.

Looking at Alasdair’s fusion table for English Indices of Deprivation 2010, we can see how it contains various output area codes as well as KML geometry shape files that can be used to draw the boundaries on a map.

Google fusion table - UK boundaries

On the to-do list, then, is to create a set of fusion tables that we can use to generate maps from data tables containing particular sorts of output area code. Because it’s easy to join two fusion tables by a common column, we’d then have a simple Google Fusion Tables recipe for thematic maps:

1) get data containing output area or constituency codes;
2) join with the appropriate mapping fusion table to annotate original data with appropriate shape files;
3) generate map…
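The join in step 2 is just an ordinary table join on a shared code column. Here’s a toy Python illustration of the idea (the column names, codes and values are invented for the example):

```python
# Step 2 of the recipe as a plain table join: annotate a data table
# that has area codes with the geometry from a boundaries lookup table.
data = [
    {"code": "E001", "deprivation": 8.2},
    {"code": "E002", "deprivation": 5.1},
]
boundaries = {  # the 'mapping fusion table': area code -> KML shape
    "E001": "<Polygon>...</Polygon>",
    "E002": "<Polygon>...</Polygon>",
}
# Each annotated row now carries both the value to theme by and the shape to draw.
joined = [dict(row, kml=boundaries[row["code"]]) for row in data]
print(joined[0])
```

Fusion Tables does this join (and the map rendering) for you; the sketch just shows what the join contributes.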

I wonder – have Alasdair or anyone from the Guardian Datablog/Datastore team already published such a tutorial?

PS Ah, here’s one example tutorial: Peter Aldhous: Thematic Maps with Google Fusion Tables [PDF]

PPS for constituency boundary shapefiles as KML see http://www.google.com/fusiontables/DataSource?dsrcid=1574396 or the Guardian Datastore’s http://www.google.com/fusiontables/exporttable?query=select+col0%3E%3E1+from+1474106+&o=kmllink&g=col0%3E%3E1

Gathering data: a flow chart for data journalists


Gathering data - a flow chart

Above is a flow chart that I sketched out during a long car journey to the Balkan Investigative Reporters Network Summer School in Croatia (don’t worry: I wasn’t driving).

It aims to help those doing data journalism identify how best to get hold of and deal with data by asking a series of questions about the information you want to compile and making suggestions on ways both to get hold of it and tools to then get it into a state which makes it easier to ask questions.

It also illustrates at a glance how the process of ‘getting hold of the data’ can vary widely, and how different projects can often involve completely different tools and skillsets from previous ones.

I will have missed obvious things, so please help me improve this. And if you find it useful, let me know.

Click on the image for other sizes.

Using Google Spreadsheets as a Database Source for R

I couldn’t contain myself (other more pressing things to do, but…), so I just took a quick time out and a coffee to put together a quick and dirty R function that will let me run queries over Google spreadsheet data sources and essentially treat them as database tables (e.g. Using Google Spreadsheets as a Database with the Google Visualisation API Query Language).

Here’s the function:

library(RCurl)

# Run a Google Visualisation API query against a Google spreadsheet and
# return the result as a data frame. gid selects the sheet within the spreadsheet.
gsqAPI = function(key, query, gid=0){
  url = paste(sep="", 'http://spreadsheets.google.com/tq?', 'tqx=out:csv',
              '&tq=', curlEscape(query), '&key=', key, '&gid=', gid)
  read.csv(url)
}

It requires the spreadsheet key value and a query; you can optionally provide a sheet number within the spreadsheet if the sheet you want to query is not the first one.

We can call the function as follows:

gsqAPI('tPfI0kerLllVLcQw7-P1FcQ','select * limit 3')

In that example, and by default, we run the query against the first sheet in the spreadsheet.

Alternatively, we can make a call like this, and run a query against sheet 3, for example:
tmpData=gsqAPI('0AmbQbL4Lrd61dDBfNEFqX1BGVDk0Mm1MNXFRUnBLNXc','select A,C where C <= 10',3)
tmpData

My first R function

The real question is, of course, could it be useful.. (or even OUseful?!)?

Here’s another example: a way of querying the Guardian Datastore list of spreadsheets:

gsqAPI('0AonYZs4MzlZbdFdJWGRKYnhvWlB4S25OVmZhN0Y3WHc','select * where A contains "crime" and B contains "href" order by C desc limit 10')

What that call does is run a query against the Guardian Datastore spreadsheet that lists all the other Guardian Datastore spreadsheets, and pulls out references to spreadsheets relating to “crime”.

The returned data is a bit messy and requires parsing to be properly useful… but I haven’t started looking at string manipulation in R yet… (So my question is: given a dataframe with a column containing things like <a href="http://example.com/whatever">Some Page</a>, how would I extract columns containing http://example.com/whatever or Some Page fields?)
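For what it’s worth, the extraction that question asks about is a regular-expression job. Here’s the idea sketched in Python (the same pattern translates to R’s gsub/regmatches); it’s a naive sketch for simple cells, not a general HTML parser:

```python
import re

def split_link(html):
    """Split '<a href="URL">Text</a>' into (URL, Text).

    A deliberately naive regex: assumes one well-formed link per cell,
    with the href in double quotes."""
    m = re.search(r'<a\s+href="([^"]*)"\s*>([^<]*)</a>', html)
    return m.groups() if m else (None, None)

url, text = split_link('<a href="http://example.com/whatever">Some Page</a>')
```

Applied column-wise, this would give you one column of URLs and one of link texts.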

[UPDATE: as well as indexing a sheet by sheet number, you can index it by sheet name, but you’ll probably need to tweak the function so the URL ends with '&gid=', curlEscape(gid), so that things like spaces in the sheet name get handled properly. I’m not sure about this now… calling a sheet by name works when accessing the “normal” Google spreadsheets application, but I’m not sure it does for the chart query language call???]

[If you haven’t yet discovered R, it’s an environment that was developed for doing stats… I use the RStudio environment to play with it. The more I use it (and I’ve only just started exploring what it can do), the more I think it provides a very powerful environment for working with data in quite a tangible way, not least for reshaping it and visualising it, let alone doing stats with it. (In fact, don’t use the stats bit if you don’t want to; it provides more than enough data mechanic tools to be going on with;-)]

PS By the by, I’m syndicating my Rstats tagged posts through the R-Bloggers site. If you’re at all interested in seeing what’s possible with R, I recommend you subscribe to R-Bloggers, or at least have a quick skim through some of the posts on there…

PPS The RSpatialTips post Accessing Google Spreadsheets from R has a couple of really handy tips for tidying up data pulled in from Google Spreadsheets, assuming the spreadsheet data has been loaded into ssdata: a) tidy up column names using colnames(ssdata) <- c("my.Col.Name1","my.Col.Name2",...,"my.Col.NameN"); b) if a column returns numbers as non-numeric data (e.g. as a string "1,000") in cols 3 to 5, convert it to numeric using something like: for (i in 3:5) ssdata[,i] <- as.numeric(gsub(",","",ssdata[,i])). (The last column can be identified as ncol(ssdata).) You can do a more aggressive conversion to numbers (assuming no decimal points) using gsub("[^0-9]","",ssdata[,i]).

PPPS via Revolutions blog, how to read the https file into R (unchecked):

require(RCurl)
myCsv = getURL(httpsCSVurl)
read.csv(textConnection(myCsv))

Has investigative journalism found its feet online? (part 3)

Previously this serialised chapter for the forthcoming book Investigative Journalism: Dead or Alive? looked at new business models surrounding investigative journalism and online investigative journalism as a genre. This third and final part looks at how changing supplies of information change the context within which investigative journalism operates.

What next for investigative journalism in a world of information overload?

But this identity crisis does highlight a final, important, question to be asked: in a world where users have direct access to a wealth of information themselves, what is investigative journalism for? I would argue that it comes down to the concept of “uncovering the hidden”, and in exploring this it is useful to draw an analogy with the general journalistic idea of “reporting the new”.

Trainee journalists sometimes see “new” in limited terms – as simply what is happening today. But what is “new” is not limited to that. It can also be what is happening tomorrow, or what happened 30 years ago. It can be something that someone has said about an “old story” days later, or an emerging anger about something that was never seen as “newsworthy” to begin with. The talent of the journalist is to be able to spot that “newness”, and communicate it effectively.

Journalism typically becomes investigative when that newness involves uncovering the hidden – and that can be anything that our audience couldn’t see before: a victim’s story, a buried report, 250,000 cables accessible to 2.5 million people, or even information that is publicly available but has not been connected before. (“The hidden” – like “the new” – is, of course, a subjective quality, dependent on the talent of a particular journalist for finding something in it – or a way of seeing it – that is newsworthy.) Continue reading

Has investigative journalism found its feet online? (part 2)

The first part of this serialised chapter for the forthcoming book Investigative Journalism: Dead or Alive? looked at new business models surrounding investigative journalism. This second part looks at how new ways of gathering, producing and distributing investigative journalism are emerging online.

Online investigative journalism as a genre

Over many decades print and broadcast investigative journalism have developed their own languages: the spectacular scoop; the damning document; the reporter-goes-undercover; the doorstep confrontation, and so on. Does online investigative journalism have such a language? Not quite. Like online journalism as a whole, it is still finding its own voice. But this does not mean that it lacks its own voice.

For some the internet appears too fleeting for serious journalism. How can you do justice to a complex issue in 140 characters? How can you penetrate the fog of comment thread flame wars, or the “echo chambers” of users talking to themselves? For others, the internet offers something new: unlimited space for expansion beyond the 1,000 word article or 30-minute broadcast; a place where you might take some knowledge, at least, for granted, instead of having to start from a base of zero. A more cooperative and engaged medium where you can answer questions directly, where your former audience is now also your distributor, your sub-editor, your source.

The difference in perception is largely a result of people mistaking parts for the whole. The internet is not Twitter, or comment threads, or blogs. It is a collection of linked objects and people – in other words: all of the above, operating together, each used, ideally, to their strengths, and also, often in relationship to offline media. Continue reading

When will we stop saying “Pictures from Twitter” and “Video from YouTube”?

Image from YouTube


Over the weekend the BBC had to deal with the embarrassing ignorance of someone in their complaints department who appeared to believe that images shared on Twitter were “public domain” and “therefore … not subject to the same copyright laws” as material outside social networks.

A blog post, from online communities adviser Andy Mabbett, gathered thousands of pageviews in a matter of hours before the BBC’s Social Media Editor Chris Hamilton quickly responded:

“We make every effort to contact people, as copyright holders, who’ve taken photos we want to use in our coverage.

“In exceptional situations, ie a major news story, where there is a strong public interest in making a photo available to a wide audience, we may seek clearance after we’ve first used it.”

(Chris also published a blog post yesterday expanding on some of the issues, the comments on which are also worth reading)

The copyright issue – and the existence of a member of BBC staff who hadn’t read the Corporation’s own guidelines on the matter – was a distraction. What really rumbled through the 170+ comments – and indeed Andy’s original complaint – was the issue of attribution.

Continue reading