VIDEO PLAYLIST: An introduction to Python for data journalism and scraping

Python is an extremely powerful language for journalists who want to scrape information from online sources. This series of videos, made for students on the MA in Data Journalism at Birmingham City University, explains some core concepts for getting started in Python, shows how to use Colab notebooks within Google Drive, and introduces some code to get started with scraping.

Video 1: Python lists

The first video introduces lists in Python, why they’re so important to scraping, and the different ways that lists are used, from generating URLs to scrape to storing the data that’s in the webpages you want to scrape.
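As a rough illustration of those two uses of lists, here is a minimal sketch. The site address, the page range and the variable names are all made up for the example and do not come from the video.

```python
# Hypothetical site and page range, used only for illustration.
base_url = "https://example.com/results?page="

# Use 1: a list of URLs to scrape, generated from page numbers 1 to 5.
urls_to_scrape = [base_url + str(page) for page in range(1, 6)]

# Use 2: an empty list that will store what is collected from each page.
scraped_rows = []

for url in urls_to_scrape:
    # A real scraper would fetch and parse the page here.
    # This sketch only records the URL it would have visited.
    scraped_rows.append({"url": url, "status": "not fetched"})

print(len(urls_to_scrape))
print(urls_to_scrape[0])
```

Nothing is downloaded here: the point is only that one list drives the loop and another list collects the results.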

It also explains the difference between one-stage scraping (where the URLs can be generated) and two-stage scraping (where a list of URLs must be scraped first, before a second scraper is written to scrape the pages at those URLs).
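The two-stage approach might look something like the sketch below. It uses only Python's standard library and an invented index page held in a string, so it runs without touching the web. The URL, the HTML and the `LinkCollector` class are assumptions made for illustration, not code from the video.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Invented index page standing in for a real download.
INDEX_URL = "https://example.com/reports/"
INDEX_HTML = """
<ul>
  <li><a href="/reports/north.html">North</a></li>
  <li><a href="/reports/south.html">South</a></li>
</ul>
"""


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Stage one: scrape the index page to build the list of URLs.
collector = LinkCollector()
collector.feed(INDEX_HTML)
report_urls = [urljoin(INDEX_URL, link) for link in collector.links]

# Stage two: loop through that list. A second scraper would fetch each page here.
for url in report_urls:
    print("Would scrape:", url)
```

In one-stage scraping the first stage is unnecessary, because the list of URLs can be generated directly, for example from a range of page numbers.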

Video 2: Using Google Colab to write Python in Google Drive

The second video introduces Google Colab as a way to write Python within Google Drive, along with some basic programming concepts such as variables, comments, printing, lists and looping. It also explains how lists play a key role in scraping within data journalism.
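The concepts listed above can be tried out in a single Colab cell. This is a minimal sketch with invented values, not the code used in the video.

```python
# A comment: Python ignores anything after the # symbol.

# A variable stores a value under a name.
city = "Birmingham"

# A list stores several values in order, inside square brackets.
years = [2019, 2020, 2021]

# Printing shows a value in the notebook's output.
print(city)

# A loop repeats the indented code once for each item in the list.
labels = []
for year in years:
    label = city + " " + str(year)
    labels.append(label)
    print(label)
```

The same pattern, looping through a list and adding to another list, is what a scraper does with URLs and scraped data.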

Video 3: Learning from an example scraper

The third video walks through a Python scraper, explaining how to read the code, identifying concepts from the first video and learning further programming concepts such as libraries, functions and dictionaries.

The scraper demonstrates how lists and loops are used within a scraper, how CSS selectors are used to ‘target’ particular pieces of information on a webpage, and some useful Python libraries for scraping.
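To give a flavour of how those pieces fit together, here is a small sketch. It assumes the BeautifulSoup library (`bs4`), which is a common choice for scraping but may not be the library used in the video. The HTML, the class names in the CSS selectors and the `scrape_stories` function are invented for the example.

```python
from bs4 import BeautifulSoup  # assumed library; the video may use another

# Invented page content standing in for a downloaded webpage.
PAGE_HTML = """
<div class="story">
  <h2 class="headline">Council budget cut</h2>
  <span class="date">2 March</span>
</div>
<div class="story">
  <h2 class="headline">New hospital opens</h2>
  <span class="date">5 March</span>
</div>
"""


def scrape_stories(html):
    """Return a list of dictionaries, one for each story found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    stories = []
    # The CSS selector "div.story" targets every <div> with class "story".
    for story in soup.select("div.story"):
        stories.append({
            "headline": story.select_one("h2.headline").get_text(strip=True),
            "date": story.select_one("span.date").get_text(strip=True),
        })
    return stories


results = scrape_stories(PAGE_HTML)
for row in results:
    print(row)
```

Each dictionary holds one row of data under named keys, and the list of dictionaries is a convenient shape to turn into a spreadsheet later.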

All three videos can be watched in this playlist.

You’ll find related resources and tutorials in the repo here.

This is part of a series of video posts.
