Tag Archives: New York Times

Word cloud or bar chart?

Bar charts preferred over word clouds

One of the easiest ways to get someone started on data visualisation is to introduce them to word clouds (it also demonstrates neatly how not all data is numerical).

Using tools like Wordle and Tagxedo, you can paste in a major speech and see it visualised within a minute or so.

But is a word cloud the best way of visualising speeches? The New York Times appear to think otherwise. Their visualisation (above) comparing President Obama’s State of the Union address and speeches by Republican presidential candidates chooses to use something far less fashionable: the bar chart.

Why did they choose a bar chart? The key is the purpose of the chart: comparison. If your objective is to capture the spirit of a speech, or its key themes, then a word cloud can still work well, if you clean the data (see this interactive example that appeared on the New York Times in 2009).

But if you want to compare it to speeches of others – and particularly if you want to compare on specific issues such as employment or tax – then bar charts are a better choice. Look, for example, at ReadWriteWeb’s word cloud comparison of inaugural speeches, and consider how effective it is next to the bar charts.
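To make the comparison point concrete, here is a minimal sketch of how you might count specific terms across two speeches and print them as side-by-side text bars. It is not how the New York Times built its chart: the function names (`term_rates`, `text_bars`), the `scale` parameter and the speech snippets are all hypothetical placeholders, and counts are normalised per 1,000 words so that speeches of different lengths can be compared.

```python
import re
from collections import Counter


def term_rates(text, terms):
    """Return mentions per 1,000 words for each term of interest."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return {term: 1000 * counts[term] / total for term in terms}


def text_bars(speeches, terms, scale=1.0):
    """Build one group of text bars per term, one bar per speaker."""
    rates = {name: term_rates(text, terms) for name, text in speeches.items()}
    lines = []
    for term in terms:
        lines.append(term)
        for name in speeches:
            rate = rates[name][term]
            bar = "#" * round(rate * scale)
            lines.append(f"  {name:<10} {bar} {rate:.1f}")
    return lines


# Placeholder snippets for illustration only, not real speech text.
speeches = {
    "Speaker A": "jobs jobs tax energy jobs growth tax",
    "Speaker B": "tax tax tax freedom jobs spending",
}

for line in text_bars(speeches, ["jobs", "tax"], scale=0.01):
    print(line)
```

Grouping the bars by term rather than by speaker is the design choice that matters here: it puts the two speakers next to each other on the same issue, which is exactly the comparison a word cloud struggles to show.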

In short, don’t always reach for the obvious chart type – and be clear what you’re trying to communicate.

UPDATE: More criticism of word clouds by a New York Times software architect here (via Harriet Bailey)

Obama inaugural speech word cloud by ReadWriteWeb

via Flowing Data

AUDIO: Text mining tips from Andy Lehren and Sarah Cohen

Searches made of the Sarah Palin emails - from a presentation by the New York Times's Andy Lehren

One of the highlights of last week’s Global Investigative Journalism Conference was the session on text mining, where the New York Times’s Andy Lehren talked about his experiences of working with data from Wikileaks and elsewhere, and former Washington Post database editor Sarah Cohen gave her insights into various tools and techniques in text mining.

Andy Lehren’s audio is embedded below. The story mentioned on North Korean missile deals can be found here. Other relevant links: Infomine and NICAR Net Tour.

And here’s Sarah’s talk which covers extracting information from large sets of documents. Many of the tools mentioned are bookmarked ‘textmining’ on my Delicious account.

 

New York Times paywall: sense prevails over ideology (almost)

So, the plans for the New York Times paywall are out. I said when they were first mooted that they looked to be thinking along the right lines in allowing people to view content for free if they came via social media – but I feared that that innovation would be lost along the way.

It’s enormously encouraging to see that it hasn’t.

Why is it encouraging? For two main reasons: firstly, it recognises the importance of distribution in online publishing. If you erect an arbitrary paywall, many people will not bother to link to you because they don’t want to frustrate their friends. That not only hurts your social media traffic, it hurts your search engine ranking.

Variety magazine suffered from this so much recently, it seems, that they launched a blog outside of their paywall with an email begging other sites to link to it.

Secondly, it recognises that they need to balance quality with quantity. Online advertising has yet to settle into any sort of pattern, but metrics of engagement are rising in importance, and one of those metrics is how much traffic comes from recommendations, i.e. social media.

Another metric is, of course, how loyal a user is, how many articles they read, and how much you know about them. The subscription options will allow the NYT to gather that information too – without sacrificing the huge numbers that most advertisers will be looking for.

Curiously, the chairman of The New York Times Company is quoted as saying “A few years ago it was almost an article of faith that people would not pay for the content they accessed via the Web.”

But I don’t think they are paying just for the content. I think this system recognises that they are paying for convenience (you pay more to get the content across web, mobile and iPad than you do to get the same content on fewer platforms – and you could get all the content for free if you can be bothered to go through Bing), and reliability (not hitting a wall when you want to read the 21st article of the month).

In many ways it is no different to traditional subscriptions: it is the difference between paying for regular deliveries of the whole paper package, and picking up a newspaper that someone has left on the bus or in the staff canteen, or borrowing one from a friend, for free.

In the past we accounted for those ‘freeloaders’ and ‘parasites’ – as we call them online – by adjusting our readership figures to reflect that every copy bought was read by 4 people. We didn’t lock down the newspapers and tell subscribers what they could do with them.

And so here we are, with the most mature, intelligent, and commercially sensible paywall model yet.

But we still have no idea if it will work…

UPDATE: Aside from the technical implementation I think Dave Winer has a point about the content proposition

"The mass market was a hack": Data and the future of journalism

The following is an unedited version of an article written for the International Press Institute report ‘Brave News Worlds’ (PDF).

For the past two centuries journalists have dealt in the currency of information: we transmuted base metals into narrative gold. But information is changing.

At first, the base metals were eye witness accounts, and interviews. Later we learned to melt down official reports, research papers, and balance sheets. And most recently our alloys have been diluted by statements and press releases.

But now journalists are having to get to grips with a new type of information: data. And this is a very rich seam indeed.

Data: what, how and why

Data is a broad term so I should define it here: I am not talking here about statistics or numbers in general, because those are nothing new to journalists. When I talk about data I mean information that can be processed by computers.

This is a crucial distinction: it is one thing for a journalist to look at a balance sheet on paper; it is quite another to be able to dig through those figures on a spreadsheet, or to write a programming script to analyse that data, and match it to other sources of information. We can also more easily analyse new types of data, such as live data, large amounts of text, user behaviour patterns, and network connections.

And that, for me, is hugely important. Indeed, it is potentially transformational. Adding computer processing power to our journalistic arsenal allows us to do more, faster, more accurately, and with others. All of which opens up new opportunities – and new dangers. Things are going to change.
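The distinction drawn above, between reading figures on paper and writing a script that analyses them and matches them to other sources, can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the two CSV snippets, the supplier and donor names, and the helper functions are invented, and matching on lower-cased names is a deliberately crude stand-in for the record-matching a real investigation would need.

```python
import csv
import io

# Hypothetical data: payments to suppliers, and a separate register of donors.
SPENDING_CSV = """supplier,amount
Acme Ltd,12000
Bolt & Co,4500
Acme Ltd,8000
"""

DONORS_CSV = """name,donation
acme ltd,500
Zed plc,250
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def total_by_supplier(rows):
    """Sum payments per supplier, normalising names to lower case."""
    totals = {}
    for row in rows:
        key = row["supplier"].strip().lower()
        totals[key] = totals.get(key, 0) + int(row["amount"])
    return totals


def match_donors(totals, donor_rows):
    """Return (name, total paid, donation) for suppliers who are also donors."""
    matches = []
    for row in donor_rows:
        key = row["name"].strip().lower()
        if key in totals:
            matches.append((key, totals[key], int(row["donation"])))
    return matches


totals = total_by_supplier(load_rows(SPENDING_CSV))
for name, paid, donated in match_donors(totals, load_rows(DONORS_CSV)):
    print(f"{name}: paid {paid}, donated {donated}")
```

A match produced this way is a lead to check, not a finding: names that look identical may belong to different organisations, and ones that differ slightly may be the same.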

Visualisation through sound – the New York Times 'audiolises' the Winter Olympics

http://www.nytimes.com/interactive/2010/02/26/sports/olympics/20100226-olysymphony.html

The New York Times has combined visualisation with audio to produce a fascinating piece of work on the differences between gold winning times and runners-up across a number of Winter Olympics events. It’s a particularly creative approach to the challenge of communicating a relatively abstract story: what separates gold and silver. Well worth a look.

h/t Pete Ashton

Living Stories: NYT and Google produce jaw-dropping online journalism form

How good is this? While Murdoch and Sly complain about Google, The New York Times and Washington Post have been working with the search engine behemoth on a new form of online journalism. I’m still getting my head around the results, because the format is brimming with clever ideas. Here’s the obligatory cheesy video before I get my teeth into it:

So what’s so special about this? Firstly, it is built around the way people consume content online, as opposed to how they consumed it in print or broadcast. In other words, the unit of entry is the ‘topic’, not the ‘article’, ‘broadcast’ or ‘publication’. If you look at search behaviour, that’s often what people search for (and why Wikipedia is so popular).

Arriving at an ideal social-media policy for journalism, Part 1: Perspectives from journalists and news organizations

Much has been said about the Washington Post’s now-infamous decision to issue restrictive social-media guidelines after Managing Editor Raju Narisetti expressed his not-so-subtle views on war spending and public-official term limits on his Twitter page. Narisetti’s own first reaction to the policy was another tweet: “For flagbearers of free speech, some newsroom execs have the weirdest double standards when it comes to censoring personal views.” He has since retracted it and shut down his Twitter page on account of “perception problems.”

The Post’s own media reporter Howard Kurtz poked fun at the incident with this tweet: “I will now hold forth only on the weather and dessert recipes.” He then gave a half-hearted, almost contrived endorsement to his organization’s policy, calling the furor surrounding the incident “much ado about nothing” while emphasizing that social media are important channels for communication with readers. The newspaper’s technology writer Rob Pegoraro was also quick to insist that journalistic interactions through social media are indispensable.

It is hard to deny that opinion is neatly divided between journalists and news organizations; in other words, between those that use social media and those that want to regulate it.

The very essence of social media is that it offers readers a glimpse of the “person” behind the journalist. Citizen journalism pioneer Dan Gillmor looks at social networks as an opportunity for news organizations “to show readers that news is not a commodity produced by a faceless institution but a rich, collaborative process.”

For instance, Post political reporter Chris Cillizza, whose Twitter account, “The Fix,” is named after his blog at the paper, entertains readers not only with snarky political comments but also by finding humor in life’s little trials, and his Twitter page has been surprisingly, and comfortingly, unhindered by all the drama. If his tweets were to trickle down to news article URLs in keeping with the Post’s new regulations, I wouldn’t follow him. It’s safe to say neither would 14,540 others.

Despite these differences, even old-school news organizations agree that social media are important. But can managers, editors, reporters and readers agree on a social media policy? To that end, it would, perhaps, be helpful to analyze guidelines that have so far been proposed by different news organizations, and more importantly, how they have been received.

The policies

The Wall Street Journal laid down its own set of social-media regulations over the summer to much opposition. “Sharing your opinions,” the Journal said in an e-mail to staff members, “could open us to criticism that we have biases and could make a reporter ineligible to cover topics in the future for Dow Jones.” A tad more ridiculously, it continued, “Openly ‘friending’ sources is akin to publicly publishing your Rolodex.”

Apart from confidential sources that any journalist would be expected to protect through sheer common sense, social media interactions with reporting contacts can only serve to enrich the exercise of newsgathering, and allow a more transparent process while at it.

Continuing in the same vein of going against the grain of journalistic transparency, the WSJ guidelines also insist that reporters not “detail how an article was reported, written or edited.” Social media guru Jeff Jarvis rightly points out that these rules challenge the very idea of the collaborative nature of journalism that is promoted by online media.

The ability of a journalist to interact with his audience, be it by seeking story ideas, soliciting sources or sharing the newsgathering process, is one of the main advantages of social media. Time’s James Poniewozik astutely calls blogs and social networks the “DVD director’s cut with commentary.”

Perhaps one of the most ridiculous guidelines comes from the AP, which over the summer issued a set of rules asking employees to control not only what they said on social networks but also what their friends and acquaintances said: “It’s a good idea to monitor your profile page to make sure material posted by others doesn’t violate AP standards; any such material should be deleted.”

The AP’s rules came in the aftermath of one of its reporters posting a critical comment about the McClatchy newspaper chain on his Facebook profile. Mashable’s Ben Parr expressed rightful outrage at this, pointing to the ridiculousness of holding an employee accountable for another individual’s words.

Some guidelines, of course, are acceptable, though none seem to require much more than common sense and ethical awareness on the part of the reporter. Take, for instance, the following WSJ rules:

  • “Don’t recruit friends or family to promote or defend your work,” or
  • “Don’t disparage the work of colleagues or competitors or aggressively promote your coverage.”

Also reasonable are rules curbing the sharing of confidential company information. “Posting material about the AP’s internal operations is prohibited on employees’ personal pages” is acceptable as a standard for all staff members at an organization, not exclusively for journalists.

This was one of the reasons why the NYT found itself in a tight corner earlier this summer, when its reporters tweeted about internal discussions at the paper. The Times’ social-media rules are actually more reasonable than most, merely asking reporters to avoid conflicts of interest, maintain political impartiality, and exercise good judgment.

But when a group of journalists decided to broadcast proceedings from an internal staff meeting, the Times decided to throw down the gauntlet. Craig Whitney, the standards editor, made a valid point: “When you’re in an internal meeting that is not public where you’re discussing policy, you would no more Twitter it than pick up the cell phone or call up one of your friends and say, ‘Hey you’ll never believe what (Executive Editor) Bill Keller just said!’”

And while that is perfectly reasonable, Jennifer Lee, one of the tweeters from the meeting, insisted that there is often something to be said for sharing internal information about your news organization with your audiences. For instance, her tweet about the Times’ Pulitzer winners was not only acceptable, but also good for the paper, she said.

Are readers excited to learn these nuggets of information directly from journalists they follow? Sure, it’s certainly more personal than reading a press release. And when the news is about the organization itself, it is especially helpful to hear employees’ unfiltered opinions. If not for Twitter, I probably would have had no way of knowing what Howard Kurtz thought about the Post’s regulations.

Distinction between individual tweeters and institutional ones

Where the Times went a bit far in its regulation was Bill Keller’s insistence that tweeting policies should follow what was already being implemented with regard to what reporters say on television or in speeches: anything said was representative of the entire institution. This seems reasonable till you consider that Twitter is a “personal-social” page. It is not like appearing on television to talk about your thoughts and viewpoints on an issue, as a reporter from the NYT might be expected to on Meet the Press.

This sentence from the Post’s guidelines strikes a similar tone: “Post journalists must recognize that any content associated with them in an online social network is, for practical purposes, the equivalent of what appears beneath their bylines in the newspaper or on our website.”

Along the same lines, Rob King, Editor in Chief of ESPN.com, called Twitter a “live microphone.” The site’s guidelines state that “editorial decision makers (such as reporters and writers) essentially represent ESPN in all social networks, and hence, should exercise appropriate judgment” (this is as opposed to policies for the rest of ESPN’s staff, who may extricate themselves from ESPN affiliation in personal blogs).

ESPN sparked its own controversy when it recently banned reporters from using Twitter for content not sanctioned by ESPN.com, and Mediaite actually questioned the use of the “live microphone” metaphor in an interview with ESPN spokesman Paul Melvin: “Does ESPN recognize the difference between a Twitter feed and a live microphone on television (which requires incredibly exclusive access as well as millions of dollars of broadcast infrastructure)?”

Melvin’s response: “The point here is that all of these media are public. Whether it is TV or radio or a blog, a column a tweet or any other publishing format, these are all public media. The words we use have impact, and we should be mindful of that.”

This is significant. What a journalist says in a tweet cannot be similar to what would appear under a byline or on live television or on radio. Social media don’t operate strictly within the sphere of the workplace. Social media are part of what journalists carry home with them; it is where they ought to be able to express views wholly unrestrained by the rigid rules of traditional journalism. It is also where they delight their readers with a goofy tale about their dog and the latest controversy unfolding on Capitol Hill with equal aplomb.

A distinction should be made (as is done in the business world) between “individual” tweeters, and tweeters who tweet “under the umbrella of an organization.” Corporate policies on social media separate the personal from the professional, and hence are less restrictive on an employee’s right to tweet or blog. By these standards, @washingtonpost would clearly cross the line by tweeting about enforcing a term limit on senators such as Mr. Byrd, but @rajunarisetti was entitled to his opinion. As individual tweeters, journalists should not “relinquish some of the personal privileges of private citizens,” as the Post guidelines require them to.

The BBC perhaps comes closest to adopting this sort of hands-off approach to the use of “personal” social media by its reporters: “Many bloggers, particularly in technical areas, use their personal blogs to discuss their BBC work in ways that benefit the BBC, and add to the “industry conversation”. This editorial guidance note is not intended to restrict this, as long as confidential information is not revealed.” In addition, it excludes “personal” blogs from the guidelines, as long as no affiliation to the BBC is mentioned, and even encourages employees to include a disclaimer.

Is unadulterated objectivity possible?

It does, however, specify that editorial staff “should not be seen to support any political party or cause.” It also warns employees to discuss “any potential conflicts of interest” with managers and editors. This is a common theme among regulations cited by all news organizations. Perhaps, if a reporter did not share on his social network opinions and viewpoints on subjects he was reporting on, that would be acceptable.

But then again, restricting specific types of content is a slippery slope. As Editor & Publisher editor Jennifer Saba asks, “Somebody could say, ‘Oh I really enjoy Mad Men,’ and if they cover TV, does that mean they are biased?”

Post ombudsman Andrew Alexander raises this very question in his piece: “Can a reporter who doesn’t cover sports tweet that a team’s owner is a tyrant? Should an editor in the Business section post a comment on her Facebook page that gun owners are paranoid?” I’m not sure if his question is rhetorical, but unfortunately for Saba, he fails to answer it. The New York Times, ever our reliable source for information, jumps in, however: “A City Hall reporter or a politics editor might be “friends” with several different City Council members as well as the Mayor, but not just with one of them. But a reporter or editor whose work has nothing to do with City Hall could be “friends” with people who work there with no conflict of interest.”

But then again, is unadulterated objectivity on a subject a journalist has studied closely even possible? As James Poniewozik writes, “any person who immersed him or herself in a vital, contentious subject all day and formed no opinion about it whatsoever would be an idiot, and you do not want to get your news from idiots.” And if he does have an opinion, is it in keeping with journalism’s goals to shield it?

Not surprisingly, the organizations that appear to be least restrictive of journalists’ use of social media are also the ones that have embraced social networks to disseminate information effectively, engage with the audience, and promote content: the BBC, the New York Times, and NPR, which is touted by many, most notably Mashable, as the most effective user of social media.

Alan Rusbridger, Editor-in-chief of the Guardian, another organization known for its use of social media tools for citizen journalism and crowdsourcing, has perhaps been most convincing in his ringing endorsement of journalists’ use of such networks to interact, engage and impart information. He has clearly stated on the site’s editorial pages that one of the advantages of Twitter is that it allows reporters to publish, unhindered by the confines of the newspaper and its Web site. This is reinforced in the site’s social media statement, which promotes the idea of an open forum that encourages all forms of social networking interactions with readers.

Any reasonable set of rules for social media, then, amounts to common-sense parameters more than anything else. And one would hope that journalists would be smart enough to not broadcast something on Twitter that would jeopardize their own credibility, alienate audiences, or embarrass their organizations.

As the NYT’s David Carr writes, “if you can’t trust the women and men who put out your newspaper to use their keyboards wisely regardless of platform, what are they doing working for you?”

[Part 2 will look at perspectives from history, such as the role of objectivity and the influence of technology on the changing rules of journalism]

Guardian joins NYT in mulling over members’ club

It seems The Guardian is considering launching a members’ club of some sort as part of moves to increase revenue, an idea that was also mooted by the New York Times a few months ago.

Members’ clubs are not a particularly new idea – they’ve been used successfully in the magazine industry for a long time – and they have a lot of potential, although probably not as a massive revenue generator, and less so in a recession (talk to anyone in the events industry to understand why). I’m trying to get hold of some concrete figures and experiences of these – if you have any, I’d be grateful if you could add them.

The biggest problem for newspapers in putting together a members’ club is the diversity of their ‘members’.

When the New York Times’ Bill Keller described their possible members’ club it apparently included “a baseball cap or a T-shirt, an invite to a Times event, or perhaps, like The Economist, access to specialized content on the Web.”

The Guardian appear to have a little more imagination: “benefits might include, for example, a welcome pack, exclusive content, live events, special offers from our partners and the opportunity to communicate with our journalists.”*

Still, from the very vague initial impressions I think both are making the mistake of seeing readers as an amorphous mass of ‘news consumers’ rather than a collection of niche markets.

The Guardian, for example, has particular strengths in covering the media, education, and ‘society’ (the supplements it prints on the first 3 days of the week). If I was launching a members’ club I would start with one of those (not media) and branch outwards. The offering then becomes much clearer (both to readers and commercial partners), the learning curve quicker and less damaging – and it also becomes easier for users to charge it to an institution.

*By the way, I love the fact that “the opportunity to communicate with our journalists” is part of the deal. So much for being ‘part of the conversation’

The future of journalism: Will journalists be paying out of their own pockets?

While talking to an editor at a newspaper that had made a splash with a crowdsourced investigative story a couple years ago, I remember the subject of payment coming up, to which she made an interesting point. The citizens who contribute their time and effort have a personal interest in the story and do it because they want to help the paper – this is a citizenry interacting with its hometown newspaper for the betterment of the community and for the good of democracy. It was a valid point. After all, if they paid their citizens, they wouldn’t just be citizens anymore, they’d be employees.

News organizations have long been excused from the charge of digital sharecropping, a label that has been attached to crowdsourced businesses that exploit free labor from the public without offering compensation. Perhaps media entities benefit from the altruistic and democratic nature of information sharing. The millions of Internet users that voluntarily put content out for free are more than a testament to that.

But where should the line be drawn? When should news organizations and media conglomerates have to start paying for utilizing the time and resources of their volunteer contributors while holding complete ownership of the product – or at the very least, making revenue off of an individual’s product?

More crowdsourcing from the Guardian and NYT – this time on Iran

They’re at it again. Following the very domestic issue of MPs’ expenses, The Guardian’s latest experiment with crowdsourcing goes international: Iran.