Monthly Archives: November 2012

Yet Another Leveson Pundit conversation

7 laws journalists now need to know – from database rights to hate speech

When you start publishing online you move from the well-thumbed areas of defamation and libel, contempt of court and privilege and privacy to a whole new world of laws and licences.

This is a place where laws you never knew existed can be applied to your work – while other ones can come in surprisingly useful. Here are the key ones:

Live Blogs outperform other online news formats by up to 300%

Comparison of time spent on a selection of Live Blogs, articles, and picture galleries at Guardian.co.uk, March to May 2011

In a guest post for OJB, Neil Thurman highlights a new research report that suggests that Live Blogs outperform other online news formats by up to 300% and are seen by readers as more transparent, trusted, and ‘factual’ than conventional online news stories.

Cross-post: Why I started self-publishing

The following was written for three:d, the newsletter of MeCCSA, the Media Communications and Cultural Studies Association (PDF, page 9).

Something has happened to self-publishing over the past few years. No longer the last resort for local historians and wannabe poets, it is now a sign of entrepreneurial spirit, an alternative to the limitations of attention-starved journalism, and a way of kicking against the pricks of mainstream publishing. Self-published books have almost tripled in number over the last five years, with a number of authors making the bestseller lists. More than one in ten ebooks bought by UK readers is now self-published.

This year I finally joined that group, as I made a long-planned move away from writing for traditional publishers towards publishing my own ebooks. In fact, I published three. So what’s the appeal?

Community management tips for journalists

Schofield’s list, the mob and a very modern moral panic

Someone, somewhere right now will be writing a thesis, dissertation or journal paper about the very modern moral panic playing out across the UK media.

What began as a story about allegations of sexual abuse by TV and radio celebrity Jimmy Savile turned into a story about that story being covered up, then into one about how the abuse could take place (at the BBC in the 1970s, but also in hospitals and schools), and then into wider allegations of a paedophile ring involving politicians.

Scraping using regular expressions in OutWit Hub – part 2: special characters, negative matches and more

In the second part of this extract from Chapter 10 of Scraping for Journalists I recap the basics before discussing techniques to use in looking for patterns in data, and how regex can deal with non-textual characters such as spaces and carriage returns, special characters such as backslashes, and ‘negative matches’. You can find the first part here.
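As a rough illustration of the three ideas named above, here is a minimal sketch using Python's `re` module as a stand-in. OutWit Hub's own regex syntax may differ in detail, and the sample text is invented for the example rather than taken from the book.

```python
import re

# Invented sample: a path containing backslashes, then a carriage return
# and newline, then a line with a run of several spaces.
sample = "Dinner at C:\\events\\2012\r\nMeeting   with donors"

# Non-textual characters: \s matches spaces, carriage returns and newlines.
whitespace_runs = re.findall(r"\s+", sample)

# Special characters: a literal backslash has to be escaped as \\ in the pattern.
backslashes = re.findall(r"\\", sample)

# Negative match: [^\s\\]+ matches runs of anything that is NOT
# whitespace and NOT a backslash.
tokens = re.findall(r"[^\s\\]+", sample)

print(whitespace_runs)
print(backslashes)
print(tokens)
```

The negated character class (`[^...]`) is only one way to express a negative match; it is used here because it is the simplest to read.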

The US election was a wake up call for data illiterate journalists

So Nate Silver won in 50 states; big data was the winner; and Nate Silver and data won the election. And somewhere along the line some guy called Obama won something, too.

Elections set the pace for much of journalism’s development: predictable enough to allow for advance planning; big enough to justify the budgets to match, they are the stage on which news organisations do their growing up in public.

For most of the past decade, those elections have been about social media: the YouTube election; the Facebook election; the Twitter election. This time, it wasn’t about the campaigning (yet) so much as it was about the reporting. And how stupid some reporters ended up looking.

How-to: Scraping ugly HTML using ‘regular expressions’ in an OutWit Hub scraper

Regular Expressions cartoon from xkcd

The following is the first part of an extract from Chapter 10 of Scraping for Journalists. It introduces a particularly useful tool in scraping – regex – which uses ‘regular expressions’ to look for patterns such as specific words, prefixes or particular types of code. I hope you find it useful.

This tutorial will show you how to scrape a particularly badly formatted piece of data. In this case, the UK Labour Party’s publication of meetings and dinners with donors and trade union general secretaries.

To do this, you’ll need to install the free scraping tool OutWit Hub. Regex can be used in other tools and programming as well, but this tool is a good way to learn it without knowing any other programming.
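For readers who do want to see regex outside OutWit Hub, the sketch below shows the same idea in Python. The HTML fragment is invented for illustration and is not the Labour Party's actual markup; the pattern is an assumption about what such badly formatted data might look like.

```python
import re

# Invented, deliberately untidy HTML: inconsistent tag case and spacing.
messy_html = "<p>Dinner:   Mr A Donor</p><P>Meeting: Ms B Secretary</p>"

# Look for a specific word (Dinner or Meeting), skip any spaces after the
# colon, then capture everything up to the next opening angle bracket.
event_pattern = re.compile(r"(Dinner|Meeting):\s*([^<]+)")
events = event_pattern.findall(messy_html)

# Look for a prefix (Mr or Ms) followed by the rest of the name.
names = re.findall(r"\bM[rs]\s+[^<]+", messy_html)

print(events)
print(names)
```

Because the patterns describe the text rather than the tags, they still work when the surrounding HTML is inconsistent.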

Data alone isn’t enough – Tim Davies on “complexity and complementarity”

Online Journalism Blog

Comment, analysis and links covering online journalism and online news, citizen journalism, blogging, vlogging, photoblogging, podcasts, vodcasts, interactive storytelling, publishing, Computer Assisted Reporting, User Generated Content, searching and all things internet.
