Tag Archives: Scraping for Journalists

FAQ: Books to read in preparation for doing a data journalism course

This is what you’ll look like after reading all of these books… (“Study of a Man Reading” by Alphonse Legros)

This latest in the frequently asked questions series is an answer to an aspiring data journalism student who asks “Would you be able to direct me to any resources or text books that might help [prepare]?” Here are some recommendations I give to students on my MA in Data Journalism.

Books on data journalism as a profession

Data journalism isn’t just the application of a practical skill, but a profession with a culture, a history, and non-technical practices.

For that reason, probably the first thing to recommend is not a book but general reading (and listening and watching): take in as much data journalism, and journalism generally, as possible. These mailing lists (and these) are a good start, and following data journalists on Twitter, along with the hashtag #ddj, will expose you to the debates taking place in the industry.

All my data journalism ebooks are $5 or less this Christmas

data journalism books

The prices of my 3 data journalism ebooks — Data Journalism Heist, Finding Stories in Spreadsheets and Scraping for Journalists — have been cut to $5 on Leanpub in the lead up to Christmas. And if you want to get all 3, you can also get the data journalism books bundle on Leanpub for more than half price over the same period, at $13. Get them while it lasts!

The 2nd edition of Scraping for Journalists is now live

Scraping for Journalists

When I began publishing Scraping for Journalists in 2012, one of the reasons for choosing to publish online was the ability to publish chapters as I wrote them, and update the book in response to readers’ feedback. The book was finally ‘finished’ in 2013 — but earlier this year I decided to go through it from cover to cover and update everything.

The result — a ‘second edition’ of Scraping for Journalists — is now live. Those who bought the first edition on Leanpub will already have access to this version.

The second edition includes new scrapers for different websites, and a new chapter on scraping APIs and handling JSON.

As always, I’ll be continuing to update the book, including any examples from readers (if you’ve used the techniques in the book for a story, I’d love to know about it).

Over 1000 journalists are now exploring scraping techniques. Incredible.

Scraping for Journalists book cover

Last week the number of people who have bought my ebook Scraping for Journalists passed the 1,000 mark. That is, to me, incredible. A thousand journalists interested enough in scraping to buy a book? What happened?

When I first began writing the book I imagined there might be perhaps 100 people in the world who would be interested in buying it. It was such a niche subject I didn’t even consider pitching it to my normal publishers.

Now it’s so mainstream that the 1000th ‘book’ was actually 12: purchased by a university which wanted multiple copies for its students to borrow – one of a number of such institutions to approach me to do so.

How to think like a computer: 5 tips for a data journalism workflow part 3

This is the final part of a series of blog posts. The first explains how using feeds and social bookmarking can make for a quicker data journalism workflow. The second looks at how to anticipate and prevent problems; and how collaboration can improve data work.

Workflow tip 5. Think like a computer

The final workflow tip is all about efficiency. Computers deal with processes in a logical way, and good programming is often about completing processes in the simplest way possible.

If you have any tasks that are repetitive, break them down and work out what patterns might allow you to do them more quickly – or for a computer to do them.

My next ebook: the Data Journalism Heist

Data Journalism Heist data journalism ebook

In the next couple of months I will begin publishing my next ebook: Data Journalism Heist.

Data Journalism Heist is designed to be a relatively short introduction to data journalism skills, demonstrating basic techniques for finding data, spotting possible stories and turning them around to a deadline.

Based on a workshop, the emphasis is on building confidence through speed and brevity, rather than headline-grabbing spectacular investigations or difficult datasets (I’m hoping to write a separate ebook on the latter at some point).

If you’re interested in finding out about the book, please sign up on the book’s Leanpub page.

Meanwhile, I’m looking for translators for Scraping for Journalists – get in touch if you’re interested.

 

Why I stopped working with print publishers (for a while)

Scraping for Journalists book

This was first published on the BBC College of Journalism website:

I have just spent 10 months publishing an ebook. Not ‘writing’, or ‘producing’, but 10 months publishing. Just as the internet helped flatten the news industry – making reporters into publishers and distributors – it has done the same to the book industry. The question I wanted to ask was: how does that change the book?

Having written books for traditional publishers before, my plunge into self-publishing was prompted when I decided I wanted to write a book for journalists about scraping: the technique of grabbing and combining information from online documents.

It’s finished! Scraping for Journalists now complete (for now)

Scraping for Journalists book

Last night I published the final chapter of my first ebook: Scraping for Journalists. Since I started publishing it in July, over 40 ‘versions’ of the book have been uploaded to Leanpub, a platform that allows users to receive updates as a book develops – but more importantly, to input into its development.

I’ve been amazed at the consistent interest in the book – last week it passed 500 readers: 400 more than I ever expected to download it. Their comments have directly shaped, and in some cases been reproduced in, the book – something I expect to continue (I plan to continue to update it).

As a result I’ve become a huge fan of this form of ebook publishing, and plan to do a lot more with it (some hints here and here). The format combines the best qualities of traditional book publishing with those of blogging and social media (there’s a Facebook page too).

Meanwhile, there’s still more to do with Scraping for Journalists: publishing to other platforms and in other languages for starters… If you’re interested in translating the book into another language, please get in touch.

 
