So Google scans email for dodgy images – should we be worried about scanning for sensitive documents?

You could be forgiven for not having heard of John Henry Skillern. The 41-year-old is facing charges of possession and promotion of child pornography after Google detected images of child abuse on his Gmail account.

Because of his case we now know that Google “proactively scours hundreds of millions of email accounts” for certain images. The technology has raised some privacy concerns which have been largely brushed aside because, well, it’s child pornography.

Sky’s technology correspondent Tom Cheshire, for example, doesn’t think it is an invasion of our privacy for “technical and moral reasons”. But should journalists be worried about the wider applications of the technology, and the precedent being set?

Automated matching

Part of Cheshire’s technical argument against the software representing an invasion of privacy is that it is almost entirely automated. As The Telegraph reported:

“It is understood that the software works by comparing images held in users’ accounts against a vast database of child abuse images which have been collated by child protection agencies around the world.

“Each one of the images is given a unique fingerprint, known as a hash, which is then used to compare with those held in the database.”

When a match is found, humans come into the process: “Trained specialists at organisations examine the image and decide whether to alert the police.”
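To make the matching step concrete, here is a minimal sketch in Python of the approach as The Telegraph describes it. It is an illustration under stated assumptions, not Google's implementation: the function names, file names and sample data are all hypothetical, and SHA-256 is used as a stand-in fingerprint. A plain cryptographic hash like this only matches byte-for-byte identical files, so a real system would presumably need a fingerprint that survives resizing or re-encoding.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a hex digest used as the file's fingerprint (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def find_matches(attachments: dict[str, bytes], known_hashes: set[str]) -> list[str]:
    """Return the names of attachments whose fingerprint appears in the database."""
    return [
        name
        for name, data in attachments.items()
        if fingerprint(data) in known_hashes
    ]


# Hypothetical database: the scanner holds only fingerprints, never the files.
known_hashes = {fingerprint(b"contents of a listed file")}

# Hypothetical attachments found in one account.
attachments = {
    "holiday.jpg": b"some unrelated bytes",
    "listed.jpg": b"contents of a listed file",
}

# Only the names of matching attachments are passed on for human review.
flagged = find_matches(attachments, known_hashes)
```

Nothing in the sketch knows what the fingerprints represent. The matching code is the same whatever files the database was built from, which is why the question of who compiles that database matters.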

But it’s not too big a leap of the imagination to see the same technology being used to match documents held in users’ accounts against a database of documents the authorities don’t want made public (on the basis of ‘national security’). Or even images the police don’t want distributed.

And if that technology were employed, it is much less likely that its use would be made public in a court case in the same way as Skillern’s.

This ‘feature creep’ has been seen before in both technologies and laws. The Regulation of Investigatory Powers Act (RIPA), for example, was intended to allow surveillance related to terrorism or serious crime, but authorities used it for purposes including “spying on garden centres for selling pot plants; snooping on staff for using work showers or monitoring shops for unlicensed parrots.”

Who controls the database controls what gets flagged

In the description given above, Google is entirely reliant on whoever compiles the database, and on whoever they pass the images on to.

However noble the stated purpose, this is state surveillance, with the notable quirk that those conducting the surveillance are ‘blind’.

As Cheshire reports: “No humans are looking at images, which would be illegal. Nor does Google store child abuse images itself, which would also be illegal.”

So if a government whistleblower were trying to share documents, their employers could be notified without anyone else knowing.

If a journalist passed on sensitive documents to a colleague, a ‘red flag’ would be raised in a government office.

Where protestors shared images of police brutality, those images could be used to identify all of the recipients, including any reporters.

Google says it is not looking for other crimes at the moment, but it’s safe to say any extension of the technology, if introduced, would be operating without users knowing for some time.

On that basis journalists should assume that documents and images cannot be safely shared using Gmail, whether through our own accounts or any source’s.

Encryption, suggested by Cheshire, is not going to be a practical option for most sources. At the very least we should switch to a different email service ourselves and recommend that documents are shared using old-fashioned post.

In the meantime, we need to talk about the oversight for systems of mass warrantless surveillance and the implications that such systems have for freedom of speech.

Google may be a commercial organisation, but in these situations it is acting as an agent of the state, and should be subject to the same checks and balances.

