Category Archives: twitter

What you need to know about the laws on harassment, data protection and hate speech {UPDATED: Stalking added}

The following is taken from the law chapter of The Online Journalism Handbook. The book blog and Facebook page contain updates and additions – those specifically on law can be found here.

Harassment

The Protection From Harassment Act 1997 is occasionally used to prevent journalists from reporting on particular individuals. Specifically, any conduct which amounts to harassment of someone can be considered a criminal act, for which the victim can seek an injunction (followed by arrest if it is broken) or damages.

One example of a blogger’s experience is illustrative of the way the act can be used with regard to online journalism, even if no case reaches court.

Generation AudioBoo: how journalism students are interacting online

This post is by Judith Townend (@jtownend).

The journalism class of 2012 has a pretty enviable opportunity to get their stuff out there; the development of online platforms like Twitter, Google+, Storify, Tumblr, Posterous, AudioBoo, Pinterest, Facebook, Instagram, CoverItLive and Vimeo allows piecemeal dissemination of content to relevant and engaged audiences, without necessarily needing to set up a specific site.

Free technology allows them to find and do journalism outside journalism, in productive and creative ways. To adapt David Carr’s description of Brian Stelter, his browser tab-flicking colleague at the New York Times, we’re seeing the rise of the ‘robots in the basement‘.

The New Online Journalists #12: Michael Greenfield

As part of an ongoing series on recent graduates who have gone into online journalism, Michael Greenfield talks about how he won a job as a Sky News Graduate Trainee, the different roles he’s experiencing across the organisation, and how he sees his career developing as the industry changes.

I’m on a 2-year rotational contract, meaning that every 10 weeks or so I move onto a different position and am trained up in that role. By the end of the scheme I should have a thorough overview of what Sky News does across all platforms, in both input and output.

Much of what I do is ‘on the job’ training, so I am fully immersed in that particular role and quickly pick up the skills along the way. For me it’s by far the best way of learning and getting the job done.

So far I’ve worked as a Researcher on the Planning Desk, a role which takes instructions and ideas from editorial meetings and sets about practically making them happen in advance so that we effectively cover a story.

This involves finding the right experts, case studies and locations to film, arranging interviews and logistically making sure that we will have reporters and crews in the right places.

Currently I’m training as a Field Producer, so I am out on the road either getting pre-recorded material or at live news events making sure, above all, that we get the shot. I am in constant communication with the reporter, crew and news desk so that all sides know what is needed and what is happening on the ground. Tweeting is now a big part of the role; for instance, I have been providing live updates from the Leveson Inquiry.

What factors helped you land the job?

I was offered an interview after I was recommended to Sky News by someone I was doing freelance work for.

The main factors that helped me get to that point were:

  • having a Broadcast Journalism MA from City University London;
  • having a substantial amount of work experience in the industry;
  • going straight into work wherever I could get it straight off the back of my MA;
  • and applying myself as best I could when given the chance of bits of freelance work.

The whole process proved to me that you really don’t know how things will fall so you just have to get yourself out there.

Where do you see your career developing?

Well the scheme finishes at the end of August 2013 and I’m hoping that I will continue to work at Sky News. They are the pioneers in news coverage – they were the first UK news broadcaster to go HD, their iPad app has been awarded for its innovation and they are constantly looking to embrace new ideas and different approaches to how we see news.

I see my career and its relative success revolving around my ability to be a multi-platform journalist. The notion of TV, radio and online journalism being mutually exclusive is becoming increasingly outdated, and so I must strive to be a good journalist across all multi-media platforms.

Audiences expect news in many different formats now, so the more skilled I am at delivering the story through pictures, audio, online copy and social media outlets, the better I will be able to serve a public hungry for information.

I am keen to stress, however, that despite all the technological change, I will stick to the core principles of journalism that I have been taught and now exercise every day.

Are Sky and BBC leaving the field open to Twitter competitors?

At first glance, Sky’s decision that its journalists should not retweet information that has “not been through the Sky News editorial process” and the BBC’s policy to prioritise filing “written copy into our newsroom as quickly as possible” seem logical.

For Sky it is about maintaining editorial control over all content produced by its staff. For the BBC, it seems to be about making sure that the newsroom, and by extension the wider organisation, takes priority over the individual.

But there are also blind spots in these strategies that they may come to regret.

Our content?

The Sky policy articulates an assumption about ‘content’ that’s worth picking apart.

We accept as journalists that what we produce is our responsibility. When it comes to retweeting, however, it’s not entirely clear what we are doing. Is that news production, in the same way that quoting a source is? Is it newsgathering, in the same way that you might repeat a lead to someone to find out their reaction? Or is it merely distribution?

The answer, as I’ve written before, is that retweeting can be, and often is, all three.

Writing about a similar policy at the Oregonian late last year, Steve Buttry made the point that retweets are not endorsements. Jeff Jarvis argued that they were “quotes”.

I don’t think it’s as simple as that (as I explain below), but I do think it’s illustrative: if Sky News were to prevent journalists from using any quote on air or online where they could not verify its factual basis, then nothing would get broadcast. Live interviews would be impossible.

The Sky policy, then, seems to treat retweets as pure distribution, and – crucially – to treat the tweet in isolation. Not as a quote, but as a story, consisting entirely of someone else’s content, which has not been through Sky editorial processes but which is branded or endorsed as Sky journalism.

There’s a lot to admire in the pride in their journalism that this shows – indeed, I would like to see the same rigour applied to the countless quotes that are printed and broadcast by all media without being compared with any evidence.

But do users really see retweets in the same way? And if they do, will they always do so?

Curation vs creation

There’s a second issue here which is more about hard commercial success. Research suggests that successful users of Twitter tend to combine curation with creation. Preventing journalists from retweeting  leaves them – and their employers – without a vital tool in their storytelling and distribution.

The tension surrounding retweeting can be illustrated in the difference between two broadcast journalists who use Twitter particularly effectively: Sky’s own Neal Mann, and NPR’s Andy Carvin. Andy retweets habitually as a way of seeking further information. Neal, as he explained in this Q&A with one of my classes, feels that he has a responsibility not to retweet information he cannot verify (from 2 mins in).

Both approaches have their advantages and disadvantages. But both combine curation with creation.

Network effects

A third issue that strikes me is how these policies fit uncomfortably alongside the networked ways that news is experienced now.

The BBC policy, for example, appears at first glance to prevent journalists from diving right into the story as it develops online. Social media editor Chris Hamilton does note, importantly, that they have “a technology that allows our journalists to transmit text simultaneously to our newsroom systems and to their own Twitter accounts”. However, this is coupled with the position that:

“Our first priority remains ensuring that important information reaches BBC colleagues, and thus all our audiences, as quickly as possible – and certainly not after it reaches Twitter.”

This is an interesting line of argument, and there are a number of competing priorities underlying it that I want to understand more clearly.

Firstly, it implies a separation of newsroom systems and Twitter. If newsroom staff are not following their own journalists on Twitter as part of their systems, why not? Sky pioneered the use of Twitter as an internal newswire, and the man responsible, Julian March, is now doing something similar at ITV. The connection between internal systems and Twitter is notable.

Then there’s that focus on “all our audiences” in opposition to those early adopter Twitter types. If news is “breaking news, an exclusive or any kind of urgent update”, being first on Twitter can give you strategic advantages that waiting for the six o’clock – or even typing a report that’s over 140 characters – won’t. For example:

  • Building a buzz (driving people to watch, listen to or search for the fuller story)
  • Establishing authority on Google (which ranks first reports over later ones)
  • Establishing the traditional authority in being known as the first to break the story
  • Making it easier for people on the scene to get in touch (if someone’s just experienced a newsworthy event or heard about it from someone who was, how likely is it that they search Twitter to see who else was there? You want to be the journalist they find and contact)

“When the technology [to inform the newsroom and generate a tweet at the same time] isn’t available, for whatever reason, we’re asking them to prioritise telling the newsroom before sending a tweet.

“We’re talking a difference of a few seconds. In some situations.

“And we’re talking current guidance, not tablets of stone. This is a landscape that’s moving incredibly quickly, inside and outside newsrooms, and the guidance will evolve as quickly.”

Everything at the same time

There’s another side to this, which is evidence of news organisations taking a strategic decision that, in a world of information overload, they should stop trying to be the first (an increasingly hard task), and instead seek to be more authoritative. To be able to say, confidently, “Every atom we distribute is confirmed”, or “We held back to do this spectacularly as a team”.

There’s value in that, and a lot to be admired. I’m not saying that these policies are inherently wrong. I don’t know the full thinking that went into them, or the subtleties of their implementation (as Rory Cellan-Jones illustrates in his example, which contrasts with what can actually happen). I don’t think there is a right and a wrong way to ‘do Twitter’. Every decision is a trade off, because so many factors are in play. I just wanted to explore some of those factors here.

As soon as you digitise information you remove the physical limitations that necessitated the traditional distinctions between the editorial processes of newsgathering, production, editing and distribution.

A single tweet can be doing all at the same time. Social media policies need to recognise this, and journalists need to be trained to understand the subtleties too.

Leveson: the Internet Pops In

The following post was originally published by Gary Herman on the NUJ New Media blog. It’s reproduced here with permission.

Here at Newmedia Towers we are being swamped by events which at long last are demonstrating that the internet is really rather relevant to the whole debate about media ethics and privacy. So this is by way of a short and somewhat belated survey of the news tsunami – Google, Leveson, Twitter, ACTA, the EU and more.

When Camilla Wright, founder of celebrity gossip site Popbitch (which some years ago broke the news of Victoria Beckham’s pregnancy possibly before she even knew about it), testified before Leveson last week (26 January 2012) [Guardian liveblog; Wright’s official written statement (PDF)] the world found out (if it could be bothered) how Popbitch is used by newspaper hacks to plant stories so that they can then be said to have appeared on the internet. Anyone remember the Drudge report, over a decade ago?

Wright, of course, made a somewhat lame excuse that Popbitch is a counterweight to gossip magazines which are full of stories placed by the PR industry.

But most interesting is the fact that Wright claimed that Popbitch is self-regulated and that it works.

Leveson pronounced that he is not sure there is ‘so much of a difference’ between what Popbitch does and what newspapers do – which is somehow off the point. Popbitch – like other websites – has a global reach by definition and Wright told the Inquiry that Popbitch tries to comply with local laws wherever it was available – claims also made more publicly by Google and Yahoo! when they have in the past given in to Chinese pressure to release data that actually or potentially incriminated users and, more recently, by Twitter when it announced its intention to regulate tweets on a country-by-country basis.

Trivia – like the stuff Popbitch trades – aside, the problem is real. A global medium will cross many jurisdictions and be accessible within many different cultures. What one country welcomes, another may ban. And who should judge the merits of each?

Confusing the internet with its applications

The Arab Spring showed us that social media – like mobile phones, CB radios, fly-posted silkscreen prints, cheap offset litho leaflets and political ballads before them – have the power to mobilise and focus dissent. Twitter’s announcement should have been expected – after all, tweeting was never intended to be part of the revolutionaries’ tool-kit.

There are already alternatives to Twitter – Vibe, Futubra, Plurk, Easy Chirp and Blackberry Messenger, of course – and the technology itself will not be restrained by the need to expand into new markets. People confuse the internet with its applications – a mistake often made by those authorities who seek to impose a duty to police content on those who convey it.

Missing the point again, Leveson asked whether it would be useful to have an external ombudsman to advise Popbitch on stories and observed that a common set of standards across newspapers and websites might also help.

While not dismissing the idea, Wright made the point that the internet made it easy for publications to bypass UK regulators.

This takes us right into the territory of Google, Facebook and the various attempts by US and international authorities to introduce regulation and impose duties on websites themselves to police them.

ACTA, SOPA and PIPA

The latest example is the Anti-Counterfeit Trade Agreement (ACTA) – a shadowy international treaty which, according to Google’s legal director, Daphne Keller, speaking over a year ago, has ‘metastasized’ from a proposal on border security and counterfeit goods to an international legal framework covering copyright and the internet.

According to a draft of ACTA, released for public scrutiny after pressure from the European Union, internet providers who disable access to pirated material and adopt a policy to counter unauthorized ‘transmission of materials protected by copyright’ will be protected against legal action.

Fair use rights would not be guaranteed under the terms of the agreement.

Many civil liberty groups have protested the process by which ACTA has been drafted as anti-democratic and ACTA’s provisions as draconian.

Google’s Keller described ACTA as looking ‘a lot like cultural imperialism’.

Google later became active in the successful fight against the US Stop Online Piracy Act (SOPA) and the related Protect Intellectual Property Act (PIPA), which contained similar provisions to ACTA.

Google has been remarkably quiet on the Megaupload case, however. This saw the US take extraterritorial action against a Hong Kong-based company operating a number of websites accused of copyright infringement.

The arrest of all Megaupload’s executives and the closure of its sites may have the effect of erasing perfectly legitimate and legal data held on the company’s servers – something which would on the face of it be an infringement of the rights of Megaupload users who own the data.

Privacy

Meanwhile, Google – in its growing battle with Facebook – has announced its intention to introduce a single privacy regime for 60 or so of its websites and services which will allow the company to aggregate all the data on individual users the better to serve ads.

Facebook already does something similar, although the scope of its services is much, much narrower than Google’s.

Privacy is at the heart of the current action against Google by Max Mosley, who wants the company to take down all links to external websites from its search results if those sites cover the events at the heart of his successful libel suit against News International.

Mosley is suing Google in the UK, France and Germany, and Daphne Keller popped up at the Leveson Inquiry, together with David-John Collins, head of corporate communications and public affairs for Google UK, to answer questions about the company’s policies on regulation and privacy.

Once again, the argument regarding different jurisdictions and the difficulty of implementing a global policy was raised by Keller and Collins.

Asked about an on-the-record comment by former Google chief executive, Eric Schmidt, that ‘only miscreants worry about net privacy’, Collins responded that the comment was not representative of Google’s policy on privacy, which it takes ‘extremely seriously’.

There is, of course, an interesting disjuncture between Google’s theoretical view of privacy and its treatment of its users. When it comes to examples like Max Mosley, Google pointed out – quite properly – that it can’t police the internet, that it does operate across jurisdictions and that it does ensure that there are comprehensive if somewhat esoteric mechanisms for removing private data and links from the Google listings and caches.

Yet it argues that, if individuals choose to use Google, whatever data they volunteer to the company is fair game for Google – even where that data involves third persons who may not have assented to their details being known or when, as happened during the process of building Google’s StreetView application, the company collected private data from domestic wi-fi routers without the consent or knowledge of the householders.

Keller and Collins brought their double-act to the UK parliament a few days later when they appeared before the joint committee on privacy and injunctions, chaired by John Whittingdale MP.

When asked why Google did not simply ‘find and destroy’ all instances of the images and video that Max Mosley objected to, they repeated their common mantras – Google is not the internet, and neither can nor should control the websites its search results list.

Accused by committee member Lord MacWhinney of ‘ducking and diving’ and by former culture minister Ben Bradshaw of being ‘totally unconvincing’, Keller noted that Google could in theory police the sites it indexed, but that ‘doing so is a bad idea’.

No apparatus disinterested and qualified enough

That seems indisputable – regulating the internet should not be the job of providers like Google, Facebook or Twitter. On the contrary, the providers are the ones to be regulated, and this should be the job of legislatures equipped (unlike the Whittingdale committee) with the appropriate level of understanding and coordinated at a global level.

The internet requires global oversight – but we have no apparatus that is disinterested and qualified enough to do the job.

A new front has been opened in this battle by the latest draft rules on data protection issued by Viviane Reding’s Justice Directorate at the European Commission on 25 January.

Reding is no friend of Google or the big social networks and is keen to draw them into a framework of legislation that will – should the rules pass into national legislation – be coordinated at EU level.

Reding’s big ideas include a ‘right to be forgotten’ which will apply to online data only and an extension of the scope of personal data to cover a user’s IP address. Confidentiality should be built into online systems according to the new rules – an idea called ‘privacy by design’.

These ideas are already drawing flak from corporates like Google who point out that the ‘right to be forgotten’ is something that the company already upholds as far as the data it holds is concerned.

Reding’s draft rules include an obligation on so-called ‘data controllers’ such as Google to notify third parties when someone wishes their data to be removed, so that links and copies can also be removed.

Not surprisingly, Google objects to this requirement which, if not exactly a demand to police the internet, is at least a demand to ‘help the police with their enquiries’.

The problem will not go away: how do you make sure that a global medium protects privacy, removes defamation and respects copyright while preserving its potential to empower the oppressed and support freedom of speech everywhere?

Answers on a postcard, please.

Twitter’s ‘censorship’ is nothing new – but it is different

Over the weekend thousands of Twitter users boycotted the service in protest at the announcement that the service will begin withholding tweets based on the demands of local governments and law enforcement.

Protesting against censorship is laudable, but it is worth pointing out that most online services already do the same, whether it’s Google’s Orkut; Apple removing apps from its store; or Facebook disabling protest groups.

Evgeny Morozov’s book The Net Delusion provides a good indicative list of examples:

“In the run-up to the Olympic torch relay passing through Hong Kong in 2008, [Facebook] shut down several groups, while many pro-Tibetan activists had their accounts deactivated for “persistent misuse of the site … Twitter has been accused of silencing online tribute to the 2008 Gaza War. Apple has been bashed for blocking Dalai Lama–related iPhone apps from its App Store for China … Google, which owns Orkut, a social network that is surprisingly popular in India, has been accused of being too zealous in removing potentially controversial content that may be interpreted as calling for religious and ethnic violence against both Hindus and Muslims.”

What’s notable about the Twitter announcement is that it suggests that censorship will be local rather than global, and transparent rather than secret. Techdirt have noted this, and Mireille Raad explains the distinction particularly well:

  • “Censorship is not silent and will not go un-noticed like most other censoring systems
  • The official twitter help center article includes the way to bypass it – simply – all you have to do is change your location to another country and overwrite the IP detection.
    Yes, that is all, and it is included in the help center
  • Quantity – can you imagine a govt trying to censor on a tweet by tweet basis a trending topic like Occupy or Egypt or Revolution – the amount of tweets can bring up the fail whale despite the genius twitter architecture , so imagine what is gonna happen to a paper work based system.
  • Speed – twitter, probably one of the fastest updating systems online –  and legislative bodies move at glaringly different speeds – It is impossible for a govt to be able to issue enough approval for a trending topic or anything with enough tweets/interest on.
  • Curiosity kills the cat  and with such an one-click-bypass process, most people will become interested in checking out that “blocked” content. People are willing to sit through endless hours of tech training and use shady services to access blocked content – so this is like doing them a service.”

I’m also reminded of Ethan Zuckerman’s ‘Cute Cats Theory’ of censorship and revolution, as explained by Cory Doctorow:

“When YouTube is taken off your nation’s internet, everyone notices, not just dissidents. So if a state shuts down a site dedicated to exposing official brutality, only the people who care about that sort of thing already are likely to notice.

“But when YouTube goes dark, all the people who want to look at cute cats discover that their favourite site is gone, and they start to ask their neighbours why, and they come to learn that there exists video evidence of official brutality so heinous and awful that the government has shut out all of YouTube in case the people see it.”

What Twitter have announced (and since clarified) perhaps makes this all-or-nothing censorship less likely, but it also adds to the ‘Don’t look at that!’ effect. The very act of censorship, online, can create a signal that is counter-productive. As journalists we should be more attuned to spotting those signals.

FAQ: Niche blogs vs mainstream media outlets

Here’s another collection of questions answered here to avoid duplication. This time from a final year student at UCLAN:

Blogs are often based on niche subject areas and created by individuals from a community. Do you think mainstream media outlets are limited by resources to compete? Or are there signs they are adapting?

I think they are more limited by passion, and by commercial imperatives. Niche blogs tend to be driven by passion initially, and sometimes by the commercial imperative to target those niches, whereas mainstream outlets are built on scale and mass audiences – or affluent audiences who still don’t really qualify as a niche.

They are adapting as the commercial drive changes and advertisers look for measurements of engagement, but it’s hard, as your next question fleshes out…

Communities by nature need conversation, and this is often visible online in forums, blog comments etc. Can it be argued that niche blogs are better at engaging communities and providing a platform for conversation?

…yes, but more because they often build those communities from the ground up, whereas established media platforms are having to start with a mass audience and carve niches out of those. It’s like trying to hold a community meeting in the middle of a busy high street, compared to doing it in a community centre.

… If so, do you think the success of blogs is a result of people wanting conversation instead of a ‘lecture’ from journalists?

Not necessarily – I think blogs succeed (and fail) for all sorts of reasons. One of those is that blogs have made it easier to connect with likeminded people across the platform (in comments, for example, without having to fight through hundreds of comments from idiots), another is the ability for users to input into the journalistic process rather than merely consuming a story, and another is the ability to focus on elements of an issue which may not be accessible enough to justify coverage by a mass audience publication – and I’m sure there are as many other reasons as there are blogs.

Finally, with the emergence of Twitter, along with other methods of contact, are journalists now becoming more involved in conversation with communities of interest or is there still a reluctance from journalists to be involved?

Some recent research in the US suggested that Twitter is still being used overwhelmingly as a broadcast platform by journalists and news brands. But there are also an increasing number of journalists who are using it particularly effectively as a way to talk with users. My own research into blogging suggested a similar effect. So yes, there is reluctance (talking to sources is hard work, after all, whether it’s on Twitter, the phone, or face to face – and for many journalists it’s easier to avoid it) but the culture is changing slowly.

The strikes and the rise of the liveblog

Liveblogging the strikes: Twitter's #n30 stream

Today sees the UK’s biggest strike in decades as public sector workers protest against pension reforms. Most news organisations are covering the day’s events through liveblogs: that web-native format which has so quickly become the automatic choice for covering rolling news.

To illustrate just how dominant the liveblog has become, take a look at the BBC, Channel 4 News, The Guardian’s ‘Strikesblog’ or The Telegraph. The Independent’s coverage is hosted on their own live.independent.co.uk subdomain while Sky have embedded their liveblog in other articles. There’s even a separate Storify liveblog for The Guardian’s Local Government section, and on Radio 5 Live you can find an example of radio reporters liveblogging.

Regional newspapers such as the Chronicle in the north east and the Essex County Standard are liveblogging the local angle; while the Huffington Post liveblog the political face-off at Prime Minister’s Question Time and the PoliticsHome blog liveblogs both. Leeds Student are liveblogging too. And it’s not just news organisations: campaigning organisation UK Uncut have their own liveblog, as do the public sector workers union UNISON and Pensions Justice (on Tumblr).

So dominant so quickly

The format has become so dominant so quickly because it satisfies both editorial and commercial demands: liveblogs are sticky – people stick around on them much longer than on traditional articles, in the same way that they tend to leave the streams of information from Twitter or Facebook on in the background of their phone, tablet or PC – or indeed, the way that they leave on 24 hour television when there are big events.

It also allows print outlets to compete in the 24-hour environment of rolling news. The updates of the liveblog are equivalent to the ‘time-filling’ of 24-hour television, with this key difference: that updates no longer come from a handful of strategically-placed reporters, but rather (when done well) hundreds of eyewitnesses, stakeholders, experts, campaigners, reporters from other news outlets, and other participants.

The results (when done badly) can be more noise than signal – incoherent, disconnected, fragmented. When done well, however, a good liveblog can draw clarity out of confusion, chase rumours down to facts, and draw multiple threads into something resembling a canvas.

At this early stage liveblogging is still a form finding its feet. More static than broadcast, it does not require the same cycle of repetition; more dynamic than print, it does, however, demand regular summarising.

Most importantly, it takes place within a network. The audience are not sat on their couches watching a single piece of coverage; they may be clicking between a dozen different sources; they may be present at the event itself; they may have friends or family there, sending them updates from their phone. If they are hearing about something important that you’re not addressing, you have a problem.

The list of liveblogs above demonstrates this particularly well, and it doesn’t include the biggest liveblog of all: the #n30 thread on Twitter (and as Facebook users we might also be consuming a liveblog of sorts of our friends’ updates).

More than documenting

In this situation the journalist is needed less to document what is taking place, and more to build on the documentation that is already being done: by witnesses, and by other journalists. That might mean aggregating the most important updates, or providing analysis of what they mean. It might mean enriching content by adding audio, video, maps or photography. Most importantly, it may mean verifying accounts that hold particular significance.

Liveblogging: adding value to the network

These were the lessons that I sought to teach my class last week when I reconstructed an event in the class and asked them to liveblog it (more in a future blog post). Without any briefing, they made predictable (and planned) mistakes: they thought they were there purely to document the event.

But now, more than ever, journalists are not there solely to document.

On a day like today you do not need to be a journalist to take part in the ‘liveblog’ of #n30. If you are passionate about current events, if you are curious about news, you can be out there getting experience in dealing with those events – not just reporting them, but speaking to the people involved, recording images and audio to enrich what is in front of you, creating maps and galleries and Storify threads to aggregate the most illuminating accounts, and seeking reaction and verification to the most challenging ones.

The story is already being told by hundreds of people, some better than others. It’s a chance to create good journalism, and be better at it. I hope every aspiring journalist takes it, and the next chance, and the next one.

Getting Started With Twitter Analysis in R

Earlier today, I saw via the aggregating R-Bloggers service a post on Using Text Mining to Find Out What @RDataMining Tweets are About. The post provides a walkthrough of how to grab tweets into an R session using the twitteR library, and then do some text mining on them.

I’ve been meaning to have a look at pulling Twitter bits into R for some time, so I couldn’t resist having a quick play…

Starting from @RDataMiner’s lead, here’s what I did… (Notes: I use R in an R-Studio context. If you follow through the example and a library appears to be missing, from the Packages tab search for the missing library and import it, then try to reload the library in the script. The # denotes a commented out line.)

require(twitteR)
#The original example used the twitteR library to pull in a user stream
#rdmTweets <- userTimeline("psychemedia", n=100)
#Instead, I'm going to pull in a search around a hashtag.
rdmTweets <- searchTwitter('#mozfest', n=500)
# Note that the Twitter search API only goes back 1500 tweets (I think?)

#Create a dataframe based around the results
df <- do.call("rbind", lapply(rdmTweets, as.data.frame))
#Here are the columns
names(df)
#And some example content
head(df,3)

So what can we do out of the can? One thing is to look at who was tweeting most in the sample we collected:

counts=table(df$screenName)
barplot(counts)

# Let's do something hacky:
# Limit the data set to show only folk who tweeted twice or more in the sample
cc=subset(counts,counts>1)
barplot(cc,las=2,cex.names =0.3)
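As an optional extra step (my tweak, not part of the original walkthrough), you could also sort those filtered counts before plotting so the busiest tweeters appear first – something like this should do it:

# Not in the original post: sort the filtered counts so the most
# prolific tweeters appear at the left of the chart
ccSorted=sort(cc,decreasing=TRUE)
barplot(ccSorted,las=2,cex.names=0.3)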

Now let’s have a go at parsing some tweets, pulling out the names of folk who have been retweeted or who have had a tweet sent to them:

#Whilst tinkering, I came across some errors that seemed
# to be caused by unusual character sets
#Here's a hacky defence that seemed to work...
df$text=sapply(df$text,function(row) iconv(row,to='UTF-8'))

#A helper function to remove @ symbols from user names...
trim <- function (x) sub('@','',x)

#A couple of tweet parsing functions that add columns to the dataframe
#We'll be needing this, I think?
library(stringr)
#Pull out who a message is to
df$to=sapply(df$text,function(tweet) str_extract(tweet,"^(@[[:alnum:]_]*)"))
df$to=sapply(df$to,function(name) trim(name))

#And here's a way of grabbing who's been RT'd
df$rt=sapply(df$text,function(tweet) trim(str_match(tweet,"^RT (@[[:alnum:]_]*)")[2]))
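At this point a quick sanity check doesn’t hurt – something along these lines (reusing the columns of the dataframe we built earlier) should show a few tweets where an RT’d user was picked out:

#Optional sanity check: peek at a few rows where an RT'd user was found
head(df[!is.na(df$rt),c("screenName","rt","text")],3)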

So for example, now we can plot a chart showing how often a particular person was RT’d in our sample. Let’s use ggplot2 this time…

require(ggplot2)
ggplot()+geom_bar(aes(x=na.omit(df$rt)))+opts(axis.text.x=theme_text(angle=-90,size=6))+xlab(NULL)
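(If that line throws an error, note that later versions of ggplot2 replaced opts() and theme_text() with theme() and element_text(), so it may need tweaking depending on the version you have installed.) And if you just want the numbers rather than a chart, something along these lines should also work, reusing the df$rt column we created above:

#Tally how many times each user was RT'd in the sample...
rtCounts=table(na.omit(df$rt))
#...and show the ten most retweeted
head(sort(rtCounts,decreasing=TRUE),10)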

Okay – enough for now… if you’re tempted to have a play yourself, please post any other avenues you explored in a comment, or in your own post with a link in my comments ;-)

Crowdsourcing investigative journalism: a case study (part 1)

As I begin on a new Help Me Investigate project, I thought it was a good time to share some research I conducted into the first year of the site, and the key factors in how that project tried to crowdsource investigative and watchdog journalism.

The findings of this research have been key to the development of this new project. They also form the basis of a chapter in the book Face The Future, and another due to be published in the Handbook of Online Journalism next year (not to be confused with my own Online Journalism Handbook). Here’s the report:

In both academic and mainstream literature about the world wide web, one theme consistently recurs: the lowering of the barrier allowing individuals to collaborate in pursuit of a common goal. Whether it is creating the world’s biggest encyclopedia (Lih, 2009), spreading news about a protest (Morozov, 2011) or tracking down a stolen phone (Shirky, 2008), the rise of the network has seen a decline in the role of the formal organisation, including news organisations.

Two examples of this phenomenon were identified while researching a book chapter on investigative journalism and blogs (De Burgh, 2008). The first was an experiment by The Florida News Press: when it started receiving calls from readers complaining about high water and sewage connection charges for newly constructed homes the newspaper, short on in-house resources to investigate the leads, decided to ask their readers to help. The result is by now familiar as a textbook example of “crowdsourcing” – outsourcing a project to ‘the crowd’ or what Brogan & Smith (2009, p136) describe as “the ability to have access to many people at a time and to have them perform one small task each”:

“Readers spontaneously organized their own investigations: Retired engineers analyzed blueprints, accountants pored over balance sheets, and an inside whistle-blower leaked documents showing evidence of bid-rigging.” (Howe, 2006a)

The second example concerned contaminated pet food in the US, and did not involve a mainstream news organisation. In fact, it was frustration with poor mainstream ‘churnalism’ (see Davies, 2009) that motivated bloggers and internet users to start digging into the story. The resulting output from dozens of blogs ranged from useful information for pet owners and the latest news to the compilation of a database that suggested the official numbers of pet deaths recorded by the US Food and Drug Administration was short by several thousand. One site, Itchmo.com, became so popular that it was banned in China, the source of the pet food in question.

What was striking about both examples was not simply that people could organise to produce investigative journalism, but that this practice of ‘crowdsourcing’ had two key qualities that were particularly relevant to journalism’s role in a democracy. The first was engagement: in the case of the News-Press for six weeks the story generated more traffic to its website than “ever before, excepting hurricanes” (Weise, 2007). Given that investigative journalism often concerns very ‘dry’ subject matter that has to be made appealing to a wider audience, these figures were surprising – and encouraging for publishers.

The second quality was subject: the contaminated pet food story was, in terms of mainstream news values, unfashionable and unjustifiable in terms of investment of resources. It appeared that the crowdsourcing model of investigation might provide a way to investigate stories which were in the public interest but which commercial and public service news organisations would not consider worth their time. More broadly, research on crowdsourcing suggested that it worked “best in areas that are not core to your product or central to your business model” (Tapscott and Williams, 2006, p82).

Investigative journalism: its history and discourses

DeBurgh (2008, p10) defines investigative journalism as “distinct from apparently similar work [of discovering truth and identifying lapses from it] done by police, lawyers and auditors and regulatory bodies in that it is not limited as to target, not legally founded and usually earns money for media publishers.” The term is notoriously problematic and contested: some argue that all journalism is investigative, or that the recent popularity of the term indicates the failure of ‘normal’ journalism to maintain investigative standards. This contestation is a symptom of the various factors underlying the growth of the genre, which range from journalists’ own sense of a democratic role, to professional ambition and publishers’ commercial and marketing objectives.

More recently investigative journalism has been used to defend traditional print journalism against online publishing, with publishers arguing that true investigative journalism cannot be maintained without the resources of a print operation. This position has become harder to defend as online-only operations and journalists have won increasing numbers of awards for their investigative work – Clare Sambrook in the UK and VoiceOfSanDiego.com and Talking Points Memo in the US are three examples – while new organisations have been established to pursue investigations without any associated print operation including Canada’s OpenFile; the UK’s Bureau of Investigative Journalism and a number of bodies in the US such as ProPublica, The Florida Center for Investigative Reporting, and the Huffington Post’s investigative unit.

In addition, computer technology has started to play an increasingly important role in print investigative journalism: Stephen Grey’s investigation into the CIA’s ‘extraordinary rendition’ programme (Grey, 2006) was facilitated by the use of software such as Analyst’s Notebook, which allowed him to analyse large amounts of flight data and identify leads. The Telegraph’s investigation into MPs’ expenses was made possible by digitisation of data and the ability to store large amounts on a small memory stick. And newspapers around the world collaborated with the Wikileaks website to analyse ‘warlogs’ from Iraq and Afghanistan, and hundreds of thousands of diplomatic cables. More broadly the success of Wikipedia inspired a raft of examples of ‘Wiki journalism’ where users were invited to contribute to editorial coverage of a particular issue or field, with varying degrees of success.

Meanwhile, investigative journalists such as The Guardian’s Paul Lewis have been exploring a more informal form of crowdsourcing, working with online communities to break stories including the role of police in the death of newspaper vendor Ian Tomlinson; the existence of undercover agents in the environmental protest movement; and the death of a man being deported to Angola (Belam, 2011b).

This is part of a broader move to networked journalism explored by Charlie Beckett (2008):

“In a world of ever-increasing media manipulation by government and business, it is even more important for investigative journalists to use technology and connectivity to reveal hidden truths. Networked journalists are open, interactive and share the process. Instead of gatekeepers they are facilitators: the public become co-producers. Networked journalists “are ‘medium agnostic’ and ‘story-centric’”. The process is faster and the information sticks around longer.” (2008, p147)

As one of its best-known practitioners Paul Lewis talks particularly of the role of technology in his investigations – specifically Twitter – but also the importance of the crowd itself and journalistic method:

“A crucial factor that makes crowd-sourcing a success [was that] there was a reason for people to help, in this case a perceived sense of injustice and that the official version of events did not tally with the truth. Six days after Tomlinson’s death, Paul had twenty reliable witnesses who could be placed on a map at the time of the incident – and only one of them had come from the traditional journalistic tool of a contact number in his notebook.” (Belam, 2011b)

A further key skill identified by Lewis is listening to the crowd – although he sounds a note of caution about its vulnerability to deliberately placed misinformation, and the need for verification.

“Crowd-sourcing doesn’t always work […] The most common thing is that you try, and you don’t find the information you want […] The pattern of movement of information on the internet is something journalists need to get their heads around. Individuals on the web in a crowd seem to behave like a flock of starlings – and you can’t control their direction.” (Belam, 2011b)

Conceptualising Help Me Investigate

The first plans for Help Me Investigate were made in 2008 and were further developed over the next 18 months. They built on research into crowdsourced investigative journalism, as well as other research into online journalism and community management. In particular the project sought to explore concepts of “P2P journalism” which enables “more engaged interaction between and amongst users” (Bruns, 2005, p120, emphasis in original) and of “produsage”, whose affordances included probabilistic problem solving, granular tasks, equipotentiality, and shared content (Bruns, 2008, p19).

A key feature in this was the ownership of the news agenda by users themselves (who could be either members of the public or journalists). This was partly for reasons identified above in research into the crowdsourced investigation into contaminated pet food. It would allow the site to identify questions that would not be considered viable for investigation within a traditional newsroom; but the feature was also implemented because ‘ownership’ was a key area of contestation identified within crowdsourcing research (Lih, 2009; Benkler, 2006; Surowiecki, 2005) – ‘outsourcing’ a project to a group of people raises obvious issues regarding claims of authorship, direction and benefits (Bruns, 2005).

These issues were considered carefully by the founders. The site adopted a user interface with three main modes of navigation for investigations: most-recent-top; most popular (those investigations with the most members); and two ‘featured’ investigations chosen by site staff: these were chosen on the basis that they were the most interesting editorially, or because they were attracting particular interest and activity from users at that moment. There was therefore an editorial role, but this was limited to only two of the 18 investigations listed on the ‘Investigations’ page, and was at least partly guided by user activity.

In addition there were further pages where users could explore investigations through different criteria such as those investigations that had been completed, or those investigations with particular tags (e.g. ‘environment’, ‘Bristol’, ‘FOI’, etc.).

A second feature of the site was that ‘journalism’ was intended to be a by-product: the primary objective was the investigation process itself, which would inform users. Research suggested that if users were to be attracted to the site, it must perform the function that they needed it to (Porter, 2008) – which was, as became apparent, one of project management. The ‘problem’ that the site was attempting to ‘solve’ needed to be user-centric rather than publisher-centric: ‘telling stories’ would clearly be lower down the priority list for users than it was for journalists and publishers. Of higher priority were the need to break down a question into manageable pieces; find others to investigate those with; and get answers. This was eventually summarised in the strapline to the site: “Connect, mobilise, uncover”.

Thirdly, there was a decision to use ‘game mechanics’ that would make the process of investigation inherently rewarding. As the site and its users grew, the interface was changed so that challenges started on the left hand side of the screen, coloured red, then moved to the middle when accepted (the colour changing to amber), and finally to the right column when complete (now with green border and tick icon). This made it easier to see at a glance what needed doing and what had been achieved, and also introduced a level of innate satisfaction in the task. Users, the idea went, might grow to like to feeling of moving those little blocks across the screen, and the positive feedback (see Graham, 2010 and Dondlinger, 2007) provided by the interface.

Similar techniques were coincidentally explored at the same time by The Guardian’s MPs’ expenses app (Bradshaw, 2009). This provided an interface for users to investigate MP expense claim forms that used many conventions of game design, including a ‘progress bar’, leaderboards, and button-based interfaces. A second iteration of the app – created when a second batch of claim forms were released – saw a redesigned interface based on a stronger emphasis on positive feedback. As developer Martin Belam explains (2011a):

“When a second batch of documents were released, the team working on the app broke them down into much smaller assignments. That meant it was easier for a small contribution to push the totals along, and we didn’t get bogged down with the inertia of visibly seeing that there was a lot of documents still to process.

“By breaking it down into those smaller tasks, and staggering their start time, you concentrated all of the people taking part on one goal at a time. They could therefore see the progress dial for that individual goal move much faster than if you only showed the progress across the whole set of documents.”

These game mechanics are not limited to games: many social networking sites have borrowed the conventions to provide similar positive feedback to users. Jon Hickman (2010, p2) describes how Help Me Investigate uses these genre codes and conventions:

“In the same way that Twitter records numbers of “followers”, “tweets”, “following” and “listed”, Help Me Investigate records the number of “things” which the user is currently involved in investigating, plus the number of “challenges”, “updates” and “completed investigations” they have to their credit. In both Twitter and Help Me Investigate these labels have a mechanistic function: they act as hyperlinks to more information related to the user’s profile. They can also be considered culturally as symbolic references to the user’s social value to the network – they give a number and weight to the level of activity the user has achieved, and so can be used in informal ranking of the user’s worth, importance and usefulness within the network.” (2010, p8)

This was indeed the aim of the site design, and was related to a further aim of the site: to allow users to build ‘social capital’ within and through the site: users could add links to web presences and Twitter accounts, as well as add biographies and ‘tag’ themselves. They were also ranked in a ‘Most active’ table; and each investigation had its own graph of user activity. This meant that users might use the site not simply for information-gathering reasons, but also for reputation building ones, a characteristic of open source communities identified by Bruns (2005) and Leadbeater (2008) among others.

There were plans to take these ideas much further which were shelved during the proof of concept phase as the team concentrated on core functionality. For example, it was clear that users needed to be able to give other users praise for positive contributions, and they used the ‘update feature’ to do so. A more intuitive function allowing users to give a ‘thumbs up’ to a contribution would have made this easier, and also provided a way to establish the reputation of individual users, and encourage further use.

Another feature of the site’s construction was a networked rather than centralised design. The bid document to 4iP proposed to aggregate users’ material:

“via RSS and providing support to get users onto use web-based services. While the technology will facilitate community creation around investigations, the core strategy will be community-driven, ‘recruiting’ and supporting alpha users who can drive the site and community forward.”

Again, this aggregation functionality was dropped as part of focusing the initial version of the site. However, the basic principle of working within a network was retained, with many investigations including a challenge to blog about progress on other sites, or use external social networks to find possible contributors. The site included guidance on using tools elsewhere on the web, and many investigations linked to users’ blog posts.

In the second part I discuss the building of the site and reflections on the site’s initial few months.