How one Mexican data team uncovered the story of 4,000 missing women


by Maria Crosas Batista

Mexican newspaper El Universal has put a face to the 4,534 women who have gone missing in Mexico City and the State of Mexico over the last decade. Its project, Ausencias Ignoradas (Ignored Absences), aims to put pressure on the government and bring the disappearances to an end.

Daniela Guazo, from the data journalism team, explains how they gathered the data and presented the information not as numbers but as people:

Breaking down the numbers

The Mexican government reported 4,281 missing women from 2005 to 2014, of whom 2,000 are still being sought. The number was there, but nobody had broken it down.

“The Mexican government declares reports and statistics without uploading the data. Therefore, when you want to check the information, there isn’t any document to follow or refer to.”

Scraping the data

El Universal Data worked with Morlan, a company specialising in data analysis and programming, to gather the information from Odisea and Capea. Both are official websites that hold information on missing people but don’t present it in a downloadable format.

They were able to scrape 1,480 records (pictures and text) from Odisea in JSON format before the website was closed down in November last year.

However, they could not scrape the data on Capea: the site was so poorly structured that the journalists had to transcribe the information by hand into Excel.

By February 2016 the website had 6,787 records, of which 3,054 could be systematised:

“We started reading record by record and filtered them by gender. Once we got all the missing women, we followed the structure from Odisea and started building the dataset for Mexico City.”
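The filtering step Guazo describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's actual code: the field names (`nombre`, `sexo`, `edad`), the target layout and the sample rows are all hypothetical, because the real Capea and Odisea structures are not described here.

```python
# Hypothetical rows as they might look after hand transcription.
# Field names and values are invented for illustration.
capea_rows = [
    {"nombre": "Persona A", "sexo": "F", "edad": "17"},
    {"nombre": "Persona B", "sexo": "M", "edad": "34"},
    {"nombre": "Persona C", "sexo": "f ", "edad": ""},
]


def to_common_structure(row):
    """Map one transcribed row onto an assumed Odisea-style layout."""
    age_text = row.get("edad", "").strip()
    return {
        "name": row.get("nombre", "").strip(),
        "gender": row.get("sexo", "").strip().upper(),
        # Leave the age empty rather than guess when it was not recorded.
        "age": int(age_text) if age_text.isdigit() else None,
    }


def missing_women(rows):
    """Normalise every row, then keep only the records marked as female."""
    normalised = (to_common_structure(row) for row in rows)
    return [record for record in normalised if record["gender"] == "F"]


women = missing_women(capea_rows)
print(len(women))  # 2
```

Normalising the gender field before filtering matters with hand-typed data, where stray spaces and mixed case are common.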

Once this process was completed, they matched and cleaned both datasets. This left 4,534 faces and some shared patterns (such as age, build, height or eye colour), which they brought to the Mexican authorities.
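Matching two hand-built datasets usually means agreeing on a comparison key and dropping records that appear twice. The sketch below shows one possible approach, assuming a record counts as a duplicate when the accent-free, lower-case name and the age both agree. The article does not say which rule the team used, and the sample records are invented.

```python
import unicodedata


def match_key(record):
    """Build a comparison key: accent-free, lower-case name plus age."""
    decomposed = unicodedata.normalize("NFKD", record["name"])
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (" ".join(plain.lower().split()), record["age"])


def merge_datasets(first, second):
    """Combine two lists of records, keeping the first record per match key."""
    merged = {}
    for record in first + second:
        merged.setdefault(match_key(record), record)
    return list(merged.values())


# Invented sample records, for illustration only.
odisea = [{"name": "María  Pérez", "age": 19}, {"name": "Persona A", "age": 17}]
capea = [{"name": "maria perez", "age": 19}, {"name": "Persona D", "age": 25}]

combined = merge_datasets(odisea, capea)
print(len(combined))  # 3
```

A key this simple would wrongly merge two different people who share a name and an age, so in practice any automatic match would still need checking by hand.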

“When it comes to missing people, there isn’t open data. Authorities don’t want to upload databases with all these details and all you have online is messy data in non readable formats such as JPGs that have to be scraped or copied by hand.”


Families waiting for their daughters

Although they presented the story using one case as the backbone, they spoke to at least ten families in Mexico City. All complained about the same things:

  • Unhelpful authorities
  • Their daughters would have called to say goodbye and to say that they were safe
  • They did not pack their suitcases
  • Their mobile phones were disconnected on the same day
  • Families are the ones who look for the missing people because the government mainly categorises these cases as “not located”, “lost” or “absent”, meaning that they are not treated as crimes.

Data journalism in Latin America

Daniela has worked as a data journalist for the last six years. She says that several countries, such as Peru and Argentina, are expanding open data and improving their data journalism skills.

However, Mexico is not yet among them:

“They are now understanding that data journalism is not only about graphics, numbers or statistics. It has a very strong journalism component. But resources are very scarce.”

The El Universal Data team, comprising Lilia Saúl and Daniela Guazo, was able to create Ausencias Ignoradas thanks to the Mike O’Connor Scholarship from the International Center for Journalists (ICFJ).

The story took six months, including planning, gathering and analysing data, taking pictures, talking to families, writing the article, programming and designing. It received a strong response from the audience and major organisations.

The next step is to update the information with data from 2016 and create another database for missing men in the city.