Category Archives: online journalism

Moving away from ‘the story’: 5 roles of an online investigations team

The online investigation team: curation editor, multimedia editor, data journalist, community manager, editor

In almost a decade of teaching online journalism I repeatedly come up against the same two problems:

  • people who are so wedded to the idea of the self-contained ‘story’ that they struggle to create journalism outside of that (e.g. the journalism of linking, liveblogging, updating, explaining, or saying what they don’t know);
  • and people stuck in the habit of churning out easy-win articles rather than investing a longer-term effort in something of depth.

Until now I’ve addressed these problems largely through teaching and individual feedback. But for the next 3 months I’ll be trying a new way of organising students that I hope will address those two problems. As always, I thought I’d share it here to see what you think.

Roles in a team: moving from churnalism to depth

Here’s what I’m trying (for context: this is on an undergraduate module at Birmingham City University):

Students are allocated one of 5 roles within a group, investigating a particular public interest question. They investigate that for 6 weeks, at which point they are rotated to a different role and a new investigation (I’m weighing up whether to have some sort of job interview at that point).

The group format allows – I hope – for something interesting to happen: students are not under pressure to deliver ‘stories’, but instead blog about their investigation, as explained below. They are still learning newsgathering techniques, and production techniques, but the team structure makes these explicitly different to those that they would learn elsewhere.

The hope is that it will be much more difficult for them to just transfer print-style stories online, or to reach for he-said/she-said sources to fill the space between ads. With only one story to focus on, students should be forced to engage more, to dig deeper and deeper into an issue, and to be more creative in how they communicate what they find out.

(It’s interesting to note that at least one news organisation is attempting something similar with a restructuring late last year)

Only one member of the team is primarily concerned with the story, and that is the editor:

The Editor (ED)

It is the editor’s role to identify what exactly the story is that the team is pursuing, and plan how the resources of the team should be best employed in pursuing that. It will help if they form the story as a hypothesis to be tested by the team gathering evidence – following Mark Lee Hunter’s story based inquiry method (PDF).

Qualities needed and developed by the editor include:

  • A nose for a story
  • Project management skills
  • Newswriting – the ability to communicate a story effectively

This post on Poynter is a good introduction to the personal skills needed for the role.

The Community Manager (CM)

The community manager’s focus is on the communities affected by the story being pursued. They should be engaging regularly with those communities – contributing to forums; having conversations with members on Twitter; following updates on Facebook; attending real-world events; commenting on blogs or photo/video sharing sites; and so on.

They are the two-way channel between that community and the news team: feeding leads from the community to the editor, and taking a lead from the editor in finding contacts from the community (experts, case studies, witnesses).

Qualities needed and developed by the community manager include:

  • Interpersonal skills – the ability to listen to and communicate with different people
  • A nose for a story
  • Contacts in the community
  • Social network research skills – the ability to find sources and communities online

6 steps to get started in community management can be found in this follow-up post.

The Data Journalist (DJ)

While the community manager is focused on people, the data journalist is focused on documentation: datasets, reports, documents, regulations, and anything that frames the story being pursued.

It is their role to find that documentation – and to make sense of it. This is a key role because stories often come from signs being ignored (data) or regulations being ignored (documents).

Qualities needed and developed by the data journalist include:

  • Research skills – advanced online search and use of libraries
  • Analysis skills – such as using spreadsheets
  • Ability to decipher jargon – often by accessing experts (the CM can help)

Here’s a step by step on how to get started as a data journalist.

The Multimedia Journalist (MMJ)

The multimedia journalist is focused on the sights, sounds and people that bring a story to life. In an investigation, these will typically be the ‘victims’ and the ‘targets’.

They will film interviews with case studies; organise podcasts where various parties play the story out; collect galleries of images to illustrate the reality behind the words.

They will work closely with the CM as their roles can overlap, especially when accessing sources. The difference is that the CM is concerned with a larger quantity of interactions and information; the MMJ is concerned with quality: far fewer interactions, but richer detail.

Qualities needed and developed by the MMJ include:

  • Ability to find sources: experts, witnesses, case studies
  • Technical skills: composition; filming or recording; editing
  • Planning: pre-interviewing, research, booking kit

The Curation Journalist (CJ)

(This was called Network Aggregator in an earlier version of this post.) The CJ is the person who keeps the site ticking over while the rest of the team is working on the bigger story.

They publish regular links to related stories around the country. They also provide the wider context of that story: what else is happening in that field or around that issue? Are similar issues arising in other places? Typical content includes backgrounders, explainers, and updates from around the world.

This is the least demanding of the roles, so they should also be available to support other members of the team when required, following up minor leads on related stories. They should not be ‘just linking’, but getting original stories too, particularly by ‘joining the dots’ on information coming in.

Qualities needed and developed by the CJ include:

  • Information management – following as many feeds, newsletters and other relevant sources of information as possible
  • Wide range of contacts – speaking to the usual suspects regularly to get a feel for the pulse of the issue/sector
  • Ability to turn around copy quickly

There’s a post on 7 ways to follow a field as a network aggregator (or any other journalist) on Help Me Investigate.

And here’s a post on ‘How to be a curation editor’.

Examples of network aggregation in action:

  • Blogs like Created In Birmingham regularly round up the latest links to events and other reports in their field. See also The Guardian’s PDA Newsbucket.
  • John Grayson’s post on G4S uses a topical issue as the angle into a detailed backgrounder on the company with copious links to charity reports, politicians’ statements, articles in the media, research projects, and more.
  • This post by Diary of a Benefit Scrounger is the most creative and powerful example I’ve yet seen. It combines dozens of links to stories of treatment of benefit claimants and protestors, and to detail on various welfare schemes, to compile a first-person ‘story’.

Publish regular pieces that come together in a larger story

If this works, I’m hoping students will produce different types of content on their way to that ‘big story’, as follows:

  • Linkblogging – simple posts that link to related articles elsewhere with a key quote (rather than wasting resources rewriting them)
  • Profiles of key community members
  • Backgrounders and explainers on key issues
  • Interviews with experts, case studies and witnesses, published individually first, then edited together later
  • Aggregation and curation – pulling together a gallery of images, for example; or key tweets on an issue; or key facts on a particular area (who, what, where, when, how); or rounding up an event or discussion
  • Datablogging – finding and publishing key datasets and documents and translating them/pulling out key points for a wider audience.
  • The story so far – taking users on a journey of what facts have been discovered, and what remains to be done.

You can read more on the expectations of each role in this document. And there’s a diagram indicating how group members might interact at the top of this article.

What will make the difference is how disciplined the editor is in ensuring that their team keeps moving towards the ultimate aim, and that they can combine the different parts into a significant whole.

UPDATE: A commenter has asked about the end result. Here’s how it’s explained to students:

“At an identified point, the Editor will need to organise his or her team to bring those ingredients into that bigger story – and it may be told in different ways, for example:

  • A longform text narrative with links to the source material and embedded multimedia
  • An edited multimedia package with links to source material in the accompanying description
  • A map made with Google Maps, Fusion Tables or another tool, where pins include images or video, and links to each story”

If you’ve any suggestions or experiences on how this might work better, I’d very much welcome them.

Leveson: the Internet Pops In

The following post was originally published by Gary Herman on the NUJ New Media blog. It’s reproduced here with permission.

Here at Newmedia Towers we are being swamped by events which at long last are demonstrating that the internet is really rather relevant to the whole debate about media ethics and privacy. So this is by way of a short and somewhat belated survey of the news tsunami – Google, Leveson, Twitter, ACTA, the EU and more.

When Camilla Wright, founder of celebrity gossip site Popbitch (which some years ago broke the news of Victoria Beckham’s pregnancy possibly before she even knew about it), testified before Leveson last week (26 January 2012) [Guardian liveblog; Wright’s official written statement (PDF)] the world found out (if it could be bothered) how Popbitch is used by newspaper hacks to plant stories so that they can then be said to have appeared on the internet. Anyone remember the Drudge report, over a decade ago?

Wright, of course, made a somewhat lame excuse that Popbitch is a counterweight to gossip magazines which are full of stories placed by the PR industry.

But most interesting is the fact that Wright claimed that Popbitch is self-regulated and that it works.

Leveson pronounced that he is not sure there is ‘so much of a difference’ between what Popbitch does and what newspapers do – which is somehow off the point. Popbitch – like other websites – has a global reach by definition, and Wright told the Inquiry that Popbitch tries to comply with local laws wherever it is available – claims also made more publicly by Google and Yahoo! when they have in the past given in to Chinese pressure to release data that actually or potentially incriminated users and, more recently, by Twitter when it announced its intention to regulate tweets on a country-by-country basis.

Trivia – like the stuff Popbitch trades – aside, the problem is real. A global medium will cross many jurisdictions and be accessible within many different cultures. What one country welcomes, another may ban. And who should judge the merits of each?

Confusing the internet with its applications

The Arab Spring showed us that social media – like mobile phones, CB radios, fly-posted silkscreen prints, cheap offset litho leaflets and political ballads before them – have the power to mobilise and focus dissent. Twitter’s announcement should have been expected – after all, tweeting was never intended to be part of the revolutionaries’ tool-kit.

There are already alternatives to Twitter – Vibe, Futubra, Plurk, Easy Chirp and Blackberry Messenger, of course – and the technology itself will not be restrained by the need to expand into new markets. People confuse the internet with its applications – a mistake often made by those authorities who seek to impose a duty to police content on those who convey it.

Missing the point again, Leveson asked whether it would be useful to have an external ombudsman to advise Popbitch on stories and observed that a common set of standards across newspapers and websites might also help.

While not dismissing the idea, Wright made the point that the internet made it easy for publications to bypass UK regulators.

This takes us right into the territory of Google, Facebook and the various attempts by US and international authorities to introduce regulation and impose duties on websites themselves to police them.

ACTA, SOPA and PIPA

The latest example is the Anti-Counterfeiting Trade Agreement (ACTA) – a shadowy international treaty which, according to Google’s legal director, Daphne Keller, speaking over a year ago, has ‘metastasized’ from a proposal on border security and counterfeit goods to an international legal framework covering copyright and the internet.

According to a draft of ACTA, released for public scrutiny after pressure from the European Union, internet providers who disable access to pirated material and adopt a policy to counter unauthorized ‘transmission of materials protected by copyright’ will be protected against legal action.

Fair use rights would not be guaranteed under the terms of the agreement.

Many civil liberty groups have protested the process by which ACTA has been drafted as anti-democratic and ACTA’s provisions as draconian.

Google’s Keller described ACTA as looking ‘a lot like cultural imperialism’.

Google later became active in the successful fight against the US Stop Online Piracy Act (SOPA) and the related Protect Intellectual Property Act (PIPA), which contained similar provisions to ACTA.

Google has been remarkably quiet on the Megaupload case, however. This saw the US take extraterritorial action against a Hong Kong-based company operating a number of websites accused of copyright infringement.

The arrest of all Megaupload’s executives and the closure of its sites may have the effect of erasing perfectly legitimate and legal data held on the company’s servers – something which would on the face of it be an infringement of the rights of Megaupload users who own the data.

Privacy

Meanwhile, Google – in its growing battle with Facebook – has announced its intention to introduce a single privacy regime for 60 or so of its websites and services which will allow the company to aggregate all the data on individual users the better to serve ads.

Facebook already does something similar, although the scope of its services is much, much narrower than Google’s.

Privacy is at the heart of the current action against Google by Max Mosley, who wants the company to take down all links to external websites from its search results if those sites cover the events at the heart of his successful libel suit against News International.

Mosley is suing Google in the UK, France and Germany, and Daphne Keller popped up at the Leveson Inquiry, together with David-John Collins, head of corporate communications and public affairs for Google UK, to answer questions about the company’s policies on regulation and privacy.

Once again, the argument regarding different jurisdictions and the difficulty of implementing a global policy was raised by Keller and Collins.

Asked about an on-the-record comment by former Google chief executive, Eric Schmidt, that ‘only miscreants worry about net privacy’, Collins responded that the comment was not representative of Google’s policy on privacy, which it takes ‘extremely seriously’.

There is, of course, an interesting disjuncture between Google’s theoretical view of privacy and its treatment of its users. When it comes to examples like Max Mosley, Google pointed out – quite properly – that it can’t police the internet, that it does operate across jurisdictions and that it does ensure that there are comprehensive if somewhat esoteric mechanisms for removing private data and links from the Google listings and caches.

Yet it argues that, if individuals choose to use Google, whatever data they volunteer to the company is fair game for Google – even where that data involves third persons who may not have assented to their details being known or when, as happened during the process of building Google’s StreetView application, the company collected private data from domestic wi-fi routers without the consent or knowledge of the householders.

Keller and Collins brought their double-act to the UK parliament a few days later when they appeared before the joint committee on privacy and injunctions, chaired by John Whittingdale MP.

When asked why Google did not simply ‘find and destroy’ all instances of the images and video that Max Mosley objected to, they repeated their common mantras – Google is not the internet, and neither can nor should control the websites its search results list.

Accused by committee member Lord MacWhinney of ‘ducking and diving’, and by former culture minister Ben Bradshaw of being ‘totally unconvincing’, Keller noted that Google could in theory police the sites it indexed, but that ‘doing so is a bad idea’.

No apparatus disinterested and qualified enough

That seems indisputable – regulating the internet should not be the job of providers like Google, Facebook or Twitter. On the contrary, the providers are the ones to be regulated, and this should be the job of legislatures equipped (unlike the Whittingdale committee) with the appropriate level of understanding and coordinated at a global level.

The internet requires global oversight – but we have no apparatus that is disinterested and qualified enough to do the job.

A new front has been opened in this battle by the latest draft rules on data protection issued by Viviane Reding’s Justice Directorate at the European Commission on 25 January.

Reding is no friend of Google or the big social networks and is keen to draw them into a framework of legislation that will – should the rules pass into national legislation – be coordinated at EU level.

Reding’s big ideas include a ‘right to be forgotten’, which will apply to online data only, and an extension of the scope of personal data to cover a user’s IP address. Under the new rules, confidentiality should be built in to online systems – an idea called ‘privacy by design’.

These ideas are already drawing flak from corporates like Google who point out that the ‘right to be forgotten’ is something that the company already upholds as far as the data it holds is concerned.

Reding’s draft rules include an obligation on so-called ‘data controllers’ such as Google to notify third parties when someone wishes their data to be removed, so that links and copies can also be removed.

Not surprisingly, Google objects to this requirement which, if not exactly a demand to police the internet, is at least a demand to ‘help the police with their enquiries’.

The problem will not go away: how do you make sure that a global medium protects privacy, removes defamation and respects copyright while preserving its potential to empower the oppressed and support freedom of speech everywhere?

Answers on a postcard, please.

Location, Location, Location

In this guest post, Damian Radcliffe highlights some recent developments at the intersection of hyper-local and SoLoMo (social, location, mobile). His more detailed slides looking at 20 developments across the sector during the last two months of 2011 are cross-posted at the bottom of this article.

Facebook’s recent purchase of location-based service Gowalla (Slide 19 below) suggests that the social network still thinks there is a future for this type of “check in” service. Touted as “the next big thing” ever since Foursquare launched at SXSW in 2009, to date Location Based Services (LBS) haven’t quite lived up to the hype.

Certainly there’s plenty of data to suggest that the public don’t quite share the enthusiasm of many Silicon Valley investors. Yet.

Part of their challenge is that not only is awareness of these services relatively low – just 30% of respondents in a 37,000-person Forrester survey had heard of them (Slide 27) – but their benefits are also not necessarily clearly understood.

In 2011, a study by youth marketing agency Dubit found about half of UK teenagers are not aware of location-based social networking services such as Foursquare and Facebook Places, with 58% of those who had heard of them saying they “do not see the point” of sharing geographic information.

Safety may not be the primary concern of Dubit’s respondents, but as the “Please Rob Me” website says: “….on one end we’re leaving lights on when we’re going on a holiday, and on the other we’re telling everybody on the internet we’re not home… The danger is publicly telling people where you are. This is because it leaves one place you’re definitely not… home.”

Reinforcing this concern are several stories from both the UK and the US of insurers refusing to pay out after a domestic burglary, where victims have announced via social networks that they were away on holiday – or having a beer downtown.

For LBS to go truly mass market – and Forrester (see Slide 27) found that only 5% of mobile users were monthly LBS users – smartphone growth will be a key part of the puzzle. Recent Ofcom data reported that:

  • Ownership nearly doubled in the UK between February 2010 and August 2011 (from 24% to 46%).
  • 46% of UK internet users also used their phones to go online in October 2011.

For now at least, most of our location based activity would seem to be based on previous online behaviours. So, search continues to dominate.

Google in a recent blog post described local search ads as “so hot right now” (Slide 22, Sept-Oct 2011 update). The search giant launched hyper-local search ads a year ago, along with a “News Near You” feature in May 2011. (See: April-May 2011 update, Slide 27.)

Meanwhile, BIA/Kelsey forecast that local search advertising revenues in the US will increase from $5.1 billion in 2010 to $8.2 billion in 2015. Their figures suggest by 2015, 30% of search will be local.

The other notable growth area, location based mobile advertising, also offers a different slant on the typical “check in” service which Gowalla et al tend to specialise in. Borrell forecasts this space will increase 66% in the US during 2012 (Slide 22).

The most high profile example of this service in the UK is O2 More, which triggers advertising or deals when a user passes through certain locations – offering a clear financial incentive for sharing your location.

Perhaps this – along with tailored news and information manifest in services such as News Near You, Postcode Gazette and India’s Taazza – is the way forward.

Jiepang, China’s leading Location-Based Social Mobile App, offered a recent example of how to do this. Late last year they partnered with Starbucks, offering users a virtual Starbucks badge if they “checked in” at a Starbucks store in Shanghai, Jiangsu or Zhejiang. When the number of badges issued hit 20,000, all badge holders got a free festive upgrade to a larger cup size. When coupled with the ease of NFC technology deployed to allow users to “check in”, it’s easy to understand the consumer benefit of such a service.

Mine’s a venti gingerbread latte. No cream. Xièxiè.

Twitter’s ‘censorship’ is nothing new – but it is different

Over the weekend thousands of Twitter users boycotted the service in protest at the announcement that it will begin withholding tweets based on the demands of local governments and law enforcement.

Protesting against censorship is laudable, but it is worth pointing out that most online services already do the same, whether it’s Google’s Orkut; Apple removing apps from its store; or Facebook disabling protest groups.

Evgeny Morozov’s book The Net Delusion provides a good indicative list of examples:

“In the run-up to the Olympic torch relay passing through Hong Kong in 2008, [Facebook] shut down several groups, while many pro-Tibetan activists had their accounts deactivated for “persistent misuse of the site … Twitter has been accused of silencing online tribute to the 2008 Gaza War. Apple has been bashed for blocking Dalai Lama–related iPhone apps from its App Store for China … Google, which owns Orkut, a social network that is surprisingly popular in India, has been accused of being too zealous in removing potentially controversial content that may be interpreted as calling for religious and ethnic violence against both Hindus and Muslims.”

What’s notable about the Twitter announcement is that it suggests that censorship will be local rather than global, and transparent rather than secret. Techdirt have noted this, and Mireille Raad explains the distinction particularly well:

  • “Censorship is not silent and will not go un-noticed like most other censoring systems
  • The official twitter help center article includes the way to bypass it – simply – all you have to do is change your location to another country and overwrite the IP detection.
    Yes, that is all, and it is included in the help center
  • Quantity – can you imagine a govt trying to censor on a tweet by tweet basis a trending topic like Occupy or Egypt or Revolution – the amount of tweets can bring up the fail whale despite the genius twitter architecture , so imagine what is gonna happen to a paper work based system.
  • Speed – twitter, probably one of the fastest updating systems online –  and legislative bodies move at glaringly different speeds – It is impossible for a govt to be able to issue enough approval for a trending topic or anything with enough tweets/interest on.
  • Curiosity kills the cat  and with such an one-click-bypass process, most people will become interested in checking out that “blocked” content. People are willing to sit through endless hours of tech training and use shady services to access blocked content – so this is like doing them a service.”

I’m also reminded of Ethan Zuckerman’s ‘Cute Cats Theory’ of censorship and revolution, as explained by Cory Doctorow:

“When YouTube is taken off your nation’s internet, everyone notices, not just dissidents. So if a state shuts down a site dedicated to exposing official brutality, only the people who care about that sort of thing already are likely to notice.

“But when YouTube goes dark, all the people who want to look at cute cats discover that their favourite site is gone, and they start to ask their neighbours why, and they come to learn that there exists video evidence of official brutality so heinous and awful that the government has shut out all of YouTube in case the people see it.”

What Twitter have announced (and since clarified) perhaps makes this all-or-nothing censorship less likely, but it also adds to the ‘Don’t look at that!’ effect. The very act of censorship, online, can create a signal that is counter-productive. As journalists we should be more attuned to spotting those signals.

Comment call: Objectivity and impartiality – a newsroom policy for student projects

I’ve been updating a newsroom policy guide for a project some of my students will be working on, with a particular section on objectivity and impartiality. As this has coincided with the debate on fact-checking stirred by the New York Times public editor Arthur Brisbane, I thought I would reproduce the guidelines here, and invite comments on whether you think it hits the right note:

Objectivity and impartiality: newsroom policy

Objectivity is a method, not an element of style. In other words:

  • Do not write stories that give equal weight to each ‘side’ of an argument if the evidence behind each side of the argument is not equal. Doing so misrepresents the balance of opinions or facts. Your obligation is to those facts, not to the different camps whose claims may be false.
  • Do not simply report the assertions of different camps. As a journalist your responsibility is to check those assertions. If someone misrepresents the facts, do not simply say someone else disagrees; make a statement along the lines of “However, the actual wording of the report…” or “The official statistics do not support her argument” or “Research into X contradicts this.” And of course, link to that evidence and keep a copy for yourself (which is where transparency comes in).

Lazy reporting of assertions without evidence is called the ‘View From Nowhere’ – you can read Jay Rosen’s Q&A or the Wikipedia entry, which includes this useful explanation:

“A journalist who strives for objectivity may fail to exclude popular and/or widespread untrue claims and beliefs from the set of true facts. A journalist who has done this has taken The View From Nowhere. This harms the audience by allowing them to draw conclusions from a set of data that includes untrue possibilities. It can create confusion where none would otherwise exist.”

Impartiality is dependent on objectivity. It is not (as subjects of your stories may argue) giving equal coverage to all sides, but rather promising to tell the story based on objective evidence rather than based on your own bias or prejudice. All journalists will have opinions and preconceived ideas of what a story might be, but an impartial journalist is prepared to change those opinions, and change the angle of the story. In the process they might challenge strongly-held biases of the society they report on – but that’s your job.

The concept of objectivity comes from the sciences, and this provides a useful guideline: scientists don’t sit between two camps and repeat assertions without evaluating them. They identify a claim (hypothesis) and gather the evidence behind it – both primary and secondary.

Claims may, however, already be in the public domain and attracting a lot of attention and support. In those situations reporting should be open about the information the journalist does not have. For example:

  • “His office, however, were unable to direct us to the evidence quoted”, or
  • “As the report is yet to be published, it is not possible to evaluate the accuracy of these claims”, or
  • “When pushed, X could not provide any documentation to back up her claims”.

Thoughts?

Different Speeches? Digital Skills Aren’t just About Coding…

Secretary of State for Education, Michael Gove, gave a speech yesterday on rethinking the ICT curriculum in UK schools. You can read a copy of the speech variously on the Department for Education website, or, err, on the Guardian website.

Seeing these two copies of what is apparently the same speech, I started wondering:

a) which is the “best” source to reference?
b) how come the Guardian doesn’t add a disclaimer about the provenance of, and a link to, the DfE version? [Note the disclaimer in the DfE version – “Please note that the text below may not always reflect the exact words used by the speaker.”]
c) is the Guardian version an actual transcript, maybe? That is, does the Guardian reprint the “exact words” used by the speaker?

And that made me think I should do a diff… About which, more below…

Before that, however, here’s a quick piece of reflection on how these two things – the reinvention of the IT curriculum, and the provenance of, and value added to, content published on news and tech industry blog sites – collide in my mind…

So for example, I’ve been pondering what the role of journalism is, lately, in part because I’m trying to clarify in my own mind what I think the practice and role of data journalism are (maybe I should apply for a Nieman-Berkman Fellowship in Journalism Innovation to work on this properly?!). It seems to me that “communication” is one important part (raising awareness of particular issues, events, or decisions), and holding governments and companies to account is another. (Actually, I think Paul Bradshaw has called me out on that before, suggesting it was more to do with providing an evidence base through verification and triangulation, as well as comment, against which governments and companies could be held to account (err, I think? As an unjournalist, I don’t have notes or a verbatim quote against which to check that statement, and I’m too lazy to email/DM/phone Paul to clarify what he may or may not have said… (The extent of my checking is typically limited to what I can find on the web or in personal archives… which appear to be lacking on this point.)))

Another thing I’ve been mulling over recently in a couple of contexts relates to the notion of what are variously referred to as digital or information skills.

The first context is “data journalism”, and the extent to which data journalists need to be able to do programming (in the sense of identifying the steps in a process that can be automated and how they should be sequenced or organised) versus writing code. (I can’t write code for toffee, but I can read it well enough to copy, paste and change bits that other people have written. That is, I can appropriate and reuse other people’s code, but can’t write it from scratch very well… Partly because I can’t ever remember the syntax and low level function names. I can also use tools such as Yahoo Pipes and Google Refine to do coding like things…) Then there’s the question of what to call things like URL hacking or (search engine) query building?

The second context is geeky computer techie stuff in schools, the sort of thing covered by Michael Gove’s speech at the BETT show on the national ICT curriculum (or lack thereof), and about which the educational digerati were all over on Twitter yesterday. Over the weekend, houseclearing my way through various “archives”, I came across all manner of press clippings from 2000-2005 or so about the activities of the OU Robotics Outreach Group, of which I was a co-founder (the web presence has only recently been shut down, in part because of the retirement of the sys admin on whose server the websites resided.) This group ran an annual open meeting every November for several years hosting talks from the educational robotics community in the UK (from primary school to HE level). The group also co-ordinated the RoboCup Junior competition in the UK, ran outreach events, developed various support materials and activities for use with Lego Mindstorms, and led the EPSRC/AHRC Creative Robotics Research Network.

At every robotics event, we’d try to involve kids and/or adults in elements of problem solving, mechanical design, programming (not really coding…) based around some sort of themed challenge: a robot fashion show, for example, or a treasure hunt (both variants on edge following/line following;-) Or a robot rescue mission, as used in a day long activity in the “Engineering: An Active Introduction” (TXR120) OU residential school, or the 3 hour “Robot Theme Park” team building activity in the Masters level “Team Engineering” (T885) weekend school. [If you’re interested, we may be able to take bookings to run these events at your institution. We can make them work at a variety of difficulty levels from KS3-4 and up;-)]

Given that working at the bits-atoms interface is where a lot of the not-purely-theoretical-or-hardcore-engineering innovation and application development is likely to take place over the next few years, any mandate to drop the “boring” Windows training ICT stuff in favour of programming (which I suspect can be taught in not only a really tedious way, but a really confusing and badly delivered way too) is probably Not the Best Plan.

Slightly better, and something that I know is currently being mooted for reigniting interest in computing, is the Raspberry Pi, a cheap, self-contained, programmable computer on a board (good for British industry, just like the BBC Micro was…;-) that allows you to work at the interface between the real world of atoms and the virtual world of bits that exists inside the computer. (See also things like the OU Senseboard, as used on the OU course “My Digital Life” (TU100).)

If schools were actually being encouraged to make a financial investment on a par with the level of investment around the introduction of the BBC Micro, back in the day, I’d suggest a 3D printer would have more of the wow factor… (I’ll doodle more on the rationale behind this in another post…) The financial climate may not allow for that (but I bet budget will manage to get spent anyway…), but whatever the case, I think Gove needs to be wary about consigning kids to lessons of coding hell. And maybe take a look at programming in a wider creative context, such as robotics (the word “robotics” is one of the reasons why I think it’s seen as a very specialised, niche subject; we need a better phrase, such as “Creative Technologies”, which could combine elements of robotics, games programming, Photoshop, and, yes, Powerpoint too…). Hmm… thinks… the OU has a couple of courses that have just come to the end of their life that between them provide a couple of hundred hours of content and activity on robotics (T184) and games programming (T151), and that we delivered, in part, to 6th formers under the OU’s Young Applicants in Schools Scheme.

Anyway, that’s all as maybe… Because there are plenty of digital skills that let you do coding like things without having to write code. Such as finding out whether there are any differences between the text in the DfE copy of Gove’s BETT speech, and the Guardian copy.

Copy the text from each page into a separate text file, and save it. (You’ll need a text editor for that…) Then, if you haven’t already got one, find yourself a good text editor with a file-comparison (diff) feature. I use Text Wrangler on a Mac. (Actually, I think MS Word may have a diff function?)

Finding diffs between txt docs in Text Wrangler
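If you would rather script the check than eyeball it in a text editor, Python’s standard difflib module will do the same job. Here’s a minimal sketch (not part of the original recipe – the filenames are just examples for wherever you saved the two copies):

import difflib

# Load the two saved copies of the speech (example filenames)
with open('gove_dfe.txt') as f:
    dfe = f.readlines()
with open('gove_guardian.txt') as f:
    guardian = f.readlines()

# unified_diff yields only the lines that differ, with a little context either side
for line in difflib.unified_diff(dfe, guardian, fromfile='DfE', tofile='Guardian', lineterm=''):
    print(line)

If nothing is printed at all, the two texts are identical.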

The differences all tend to be in the characters used for quotation marks (character encodings are one of the things that can make all sorts of programmes fall over, or misbehave; just being aware that they may cause a problem, as well as how and why, would be a great step in improving the baseline level of understanding of folk IT). Some of the line breaks don’t quite match up either, but other than that, the text is the same.

Now, this may be because Gove was a good little minister and read out the words exactly as they had been prepared. Or it may be the case that the Guardian just reprinted the speech without mentioning provenance, or the disclaimer that he may not actually have read the words of that speech (I have vague memories of an episode of Yes, Minister, here…;-)

Whatever the case, if you know: a) that it’s even possible to compare two documents to see if they are different (a handy piece of folk IT knowledge); and b) a tool that does it (or how to find a tool that does it, or a person who may have a tool that can do it), then you can compare the texts for yourself. And along the way, maybe learn that churnalism, in a variety of forms, is endemic in the media. Or maybe just demonstrate to yourself when the media is acting in a purely comms, rather than journalistic, role?

PS other phrases in the area: “computational thinking”. Hear, for example: A conversation with Jeannette Wing about computational thinking

PPS I just remembered – there’s a data journalism hook around this story too… from a tweet exchange last night that I was reminded of by an RT:

josiefraser: RT @grmcall: Of the 28,000 new teachers last year in the UK, 3 had a computer-related degree. Not 3000, just 3.
dlivingstone: @josiefraser Source??? Not found it yet. RT @grmcall: 28000 new UK teachers last year, 3 had a computer-related degree. Not 3000, just 3
josiefraser: That ICT qualification teacher stat RT @grmcall: Source is the Guardian http://www.guardian.co.uk/education/2012/jan/09/computer-studies-in-schools

I did a little digging and found the following document on the General Teaching Council of England website – Annual digest of statistics 2010–11 – Profiles of registered teachers in England [PDF] – that contains demographic stats, amongst others, for UK teachers. But no stats relating to subject areas of degree level qualifications held, which is presumably the data referred to in the tweet. So I’m thinking: this is partly where the role of data journalist comes in… They may not be able to verify the numbers by checking independent sources, but they may be able to shed some light on where the numbers came from and how they were arrived at, and maybe even secure their release (albeit as a single point source?)

20 free ebooks on journalism (for your Xmas Kindle)

For some reason there are two versions of this post on the site – please check the more up to date version here.

20 free ebooks on journalism (for your Xmas Kindle) {updated to 65}

Journalism 2.0 cover

As many readers of this blog will have received a Kindle for Christmas I thought I should share my list of the free ebooks that I recommend stocking up on.

Online journalism and multimedia ebooks

Starting with more general books, Mark Briggs‘s book Journalism 2.0 (PDF*) is a few years old but still provides a good overview of online journalism to have by your side. Mindy McAdams‘s 42-page Reporter’s Guide to Multimedia Proficiency (PDF) adds some more on that front, and Adam Westbrook‘s Ideas on Digital Storytelling and Publishing (PDF) provides a larger focus on narrative, editing and other elements.

After the first version of this post, MA Online Journalism student Franzi Baehrle suggested this free book on DSLR Cinematography, as well as Adam Westbrook on multimedia production (PDF). And Guy Degen recommends the free ebook on news and documentary filmmaking from ImageJunkies.com.

The Participatory Documentary Cookbook [PDF] is another free resource on using social media in documentaries.

A free ebook on blogging can be downloaded from Guardian Students when you register with the site, and Swedish Radio have produced this guide to Social Media for Journalists (in English).

The Traffic Factories is an ebook that explores how a number of prominent US news organisations use metrics, and Chartbeat’s role in that. You can download it in mobi, PDF or epub format here.


Social Interest Positioning – Visualising Facebook Friends’ Likes With Data Grabbed Using Google Refine

What do my Facebook friends have in common in terms of the things they have Liked, or in terms of their music or movie preferences? (And does this say anything about me?!) Here’s a recipe for visualising that data…

After discovering via Martin Hawksey that the recent (December, 2011) 2.5 release of Google Refine allows you to import JSON and XML feeds to bootstrap a new project, I wondered whether it would be able to pull in data from the Facebook API if I was logged in to Facebook (Google Refine does run in the browser after all…)

Looking through the Facebook API documentation whilst logged in to Facebook, it’s easy enough to find exemplar links to things like your friends list (https://graph.facebook.com/me/friends?access_token=A_LONG_JUMBLE_OF_LETTERS) or the list of likes someone has made (https://graph.facebook.com/me/likes?access_token=A_LONG_JUMBLE_OF_LETTERS); replacing me with the Facebook ID of one of your friends should pull down a list of their friends, or likes, etc.

(Note that validity of the access token is time limited, so you can’t grab a copy of the access token and hope to use the same one day after day.)
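As an aside, the same friends-list call can be made from a few lines of Python rather than through the browser or Refine. This is only a sketch: it assumes the third-party requests library is installed and that a valid (time-limited) access token is pasted over the placeholder, and it relies on the 2012-era Graph API endpoints quoted above, so treat the response format as illustrative:

import requests  # third-party library, assumed installed

ACCESS_TOKEN = 'A_LONG_JUMBLE_OF_LETTERS'  # replace with your own, time-limited token

# The same friends-list URL as above, fetched outside the browser
resp = requests.get('https://graph.facebook.com/me/friends',
                    params={'access_token': ACCESS_TOKEN})
friends = resp.json()

# Each record in friends['data'] should include a Facebook ID and a name
for friend in friends.get('data', []):
    print('%s %s' % (friend.get('id'), friend.get('name')))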

Grabbing the link to your friends on Facebook is simply a case of opening a new project, choosing to get the data from a Web Address, and then pasting in the friends list URL:

Google Refine - import Facebook friends list

Click on next, and Google Refine will download the data, which you can then parse as a JSON file, and from which you can identify individual record types:

Google Refine - import Facebook friends

If you click the highlighted selection, you should see the data that will be used to create your project:

Google Refine - click to view the data

You can now click on Create Project to start working on the data – the first thing I do is tidy up the column names:

Google Refine - rename columns

We can now work some magic – such as pulling in the Likes our friends have made. To do this, we need to create the URL for each friend’s Likes using their Facebook ID, and then pull the data down. We can use Google Refine to harvest this data for us by creating a new column containing the data pulled in from a URL built around the value of each cell in another column:

Google Refine - new column from URL

The Likes URL has the form https://graph.facebook.com/me/likes?access_token=A_LONG_JUMBLE_OF_LETTERS which we’ll tinker with as follows:

Google Refine - crafting URLs for new column creation

The throttle control tells Refine how often to make each call. I set this to 500ms (that is, half a second), so it takes a few minutes to pull in my couple of hundred or so friends (I don’t use Facebook a lot;-). I’m not sure what limit the Facebook API is happy with – if you hit it too fast (i.e. set the throttle time too low), you may find the Facebook API stops returning data to you for a cooling-down period…
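For comparison, here’s roughly what that throttled fetch looks like as a standalone Python sketch rather than a Refine operation – again assuming the requests library, a valid access token, and a list of friend IDs taken from the friends-list column (the IDs shown are placeholders):

import time
import requests  # third-party library, assumed installed

ACCESS_TOKEN = 'A_LONG_JUMBLE_OF_LETTERS'  # your own time-limited token

def likes_for(user_id):
    # Build the per-friend Likes URL, much as the Refine column expression does
    url = 'https://graph.facebook.com/%s/likes' % user_id
    return requests.get(url, params={'access_token': ACCESS_TOKEN}).json()

friend_ids = ['FRIEND_ID_1', 'FRIEND_ID_2']  # e.g. the id column from the friends list

all_likes = {}
for fid in friend_ids:
    all_likes[fid] = likes_for(fid)
    time.sleep(0.5)  # the same 500ms pause, to avoid hammering the API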

Having imported the data, you should find a new column:

Google Refine - new data imported

At this point, it is possible to generate a new column from each of the records/Likes in the imported data… in theory (or maybe not…). I found this caused Refine to hang though, so instead I exported the data using the default Templating… export format, which produces some sort of JSON output…

I then used this Python 2 script to generate a two-column data file where each row contained a (new) unique identifier for each friend and the name of one of their likes:

import simplejson,csv

# Write a two-column CSV: a per-friend identifier and the name of each Like
writer=csv.writer(open('fbliketest.csv','wb+'),quoting=csv.QUOTE_ALL)

# The file exported from Google Refine using the default Templating... export
fn='my-fb-friends-likes.txt'

data = simplejson.load(open(fn,'r'))
id=0
for d in data['rows']:
    id=id+1
    #'interests' is the column name containing the Likes data
    interests=simplejson.loads(d['interests'])
    for i in interests['data']:
        # Print each Like to the console, then write (identifier, Like name) to the CSV
        print str(id),i['name'],i['category']
        writer.writerow([str(id),i['name'].encode('ascii','ignore')])

[I think this R script, in answer to a related @mhawksey Stack Overflow question, also does the trick: R: Building a list from matching values in a data.frame]

I could then import this data into Gephi and use it to generate a network diagram of what they commonly liked:

Sketching common likes amongst my facebook friends

Rather than returning Likes, I could equally have pulled back lists of the movies, music or books they like, their own friends lists (permissions settings allowing), etc etc, and then generated friends’ interest maps on that basis.

[See also: Getting Started With The Gephi Network Visualisation App – My Facebook Network, Part I and how to visualise Google+ networks]

PS dropping out of Google Refine and into a Python script is a bit clunky, I have to admit. What would be nice would be to be able to do something like a “create new rows with new column from column” pattern that would let you set up an iterator through the contents of each of the cells in the column you want to generate the new column from, and for each pass of the iterator: 1) duplicate the original data row to create a new row; 2) add a new column; 3) populate the cell with the contents of the current iteration state. Or something like that…

PPS Related to the PS request, there is a sort of related feature in the 2.5 release of Google Refine that lets you merge data from across rows with a common key into a newly shaped data set: Key/value Columnize. Seeing this, it got me wondering what a fusion of Google Refine and RStudio might be like (or even just R support within Google Refine?)

PPPS this could be interesting – looks like you can test to see if a friendship exists given two Facebook user IDs.

2011: the UK hyper-local year in review

In this guest post, Damian Radcliffe highlights some topline developments in the hyper-local space during 2011. He also asks for your suggestions of great hyper-local content from 2011. His more detailed slides looking at the previous year are cross-posted at the bottom of this article.

2011 was a busy year across the hyper-local sphere, with a flurry of activity online as well as on more traditional platforms such as TV, radio and newspapers.

The Government’s plans for Local TV have developed considerably, following the Shott Review just over a year ago. We now have a clearer indication of the areas which will be first on the list for these new services, and how Ofcom might award these licences. What we don’t know is who will apply for these licences, or what their business models will be. But this should become clear in the second half of the year.

Whilst the Leveson Inquiry hasn’t directly been looking at local media, it has been a part of the debate. Claire Enders outlined some of the challenges facing the regional and local press in a presentation showing declining revenue, jobs and advertising over the past five years. Her research suggests that the impact of “the move to digital” has been greater at a local level than at the nationals.

Across the board, funding remains a challenge for many. But new models are emerging, with Daily Deals starting to form part of the revenue mix alongside money from foundations and franchising.

And on the content front, we saw Jeremy Hunt cite a number of hyper-local examples at the Oxford Media Convention, as well as record coverage for regional press and many hyper-local outlets as a result of the summer riots.

I’ve included more on all of these stories in my personal retrospective for the past year.

One area where I’d really welcome feedback is examples of hyper-local content you produced – or read – in 2011. I’m conscious that a lot of great material may not necessarily reach a wider audience, so do post your suggestions below and hopefully we can begin to redress that.