Tag Archives: database

Building the first central database of victims of the Spanish Civil War and the Franco regime

Bombings in Barcelona in 1938 (Image by Italian Airforce under CC)

In a guest post for OJB, Carla Pedret looks at a new data journalism project to catalogue what happened during the Spanish Civil War.

125,000 people died, disappeared or were repressed in the Spanish Civil War (1936-1939) and during the Franco dictatorship, according to historians. Many of their families still do not know, 40 years later, what exactly happened to them.

Now the Innovation and Human Rights (IHR) association has created the first central database of casualties, missing persons and reprisals during the Spanish Civil War and under Francoism.

Telegraph plans to expand MPs database site in build up to election (Q&A)

I asked Tim Rowell, Digital Publisher at Telegraph.co.uk, three questions about how they dealt with the MPs' expenses story online. The main headline is that the new domain hosting the expenses database, parliament.telegraph.co.uk, will expand in the run-up to the next election, along with the MP expenses database itself.

There are also curious “legal reasons” given for disabling the embed/email option on the PDFs. I’m pushing on that because I don’t see how publication on your site is different from allowing someone to embed it on their own, or email it. If you have any insight on that, let me know. [See response below]

Here are the responses in full:

When the team was going through the expenses and reporting, how was this longer term online strategy incorporated?

From day one, it was agreed that we would work towards the publication of an online database that contained not only the files themselves but also an aggregation of publicly available data (Parliament Parser, They Work for You, Register of Members Interests etc.) with our own unique data analysis.

The publication by Parliament last week of the redacted files has provided a glimpse into the scale of operation required to analyse such a volume of documentation but one has to realise that the full files contain many, many more pages.

The launch yesterday of the database is the first phase. We will, in due course, publish the full uncensored files for all 646 MPs. Crucially, the expenses investigative team of reporters spent a week aggregating and processing the data (the unique 2007/8 analysis of the Additional Costs Allowance) themselves. Integration in action again! The end result of that work is the first accurate breakdown of those ACA figures. We soon realised that this data provided a great basis upon which to build the Complete Expenses Files supplement in last Saturday’s newspaper.

Why Issuu? And why is the ‘email/embed’ option disabled for “secret documents”?

“Secret documents” is not our term, it is Issuu’s. We think Issuu is a great product and that it provides a fantastic user experience and have plans to use it more extensively. But for legal reasons we need to be sure that the document cannot be downloaded. By disabling the download function, Issuu automatically restricts email/embed.

[further to that:]  How is publication on your site different from allowing someone to embed it on their own, or emailing it?

It is a precautionary measure. In the unlikely event that one of the source documents puts at risk the identity of a supplier or the full postcode of an MP we need to be confident that a) we can amend that file immediately and b) that the file has not been distributed more widely. For that reason, we do not want the files to be downloadable. We’d be very happy for others to embed the files in their pages but if you restrict the download option in Issuu you restrict the ability to embed.

Am I right in thinking the pages on each MP are static and so indexable by search engines, even though they’re generated from a database?

Yes. You may also notice that it is on a new domain parliament.telegraph.co.uk. We will be enhancing our political resources over the coming months as we build up to the General Election. This application is not just for the Expenses files, we have plans to develop this area into a full service that enables our users to engage more closely with the democratic process.

Elections08: Storytelling with public databases

Written by Wilbert Baan

Today is the day of the US elections. I don’t think we have ever had an event on the web that will get so much live coverage. This means incredible amounts of information will be published across all kinds of services and social networks: websites like Facebook, Twitter, Flickr, WordPress, Blogger and many more.

Most popular web services have programmable interfaces. These interfaces allow developers to extract information out of the system. This creates a whole new genre of storytelling: storytelling with public databases. You can aggregate the information you need and sort it the way you want.
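As a rough illustration of "aggregate and sort", the sketch below merges items from two services into one timeline, newest first, with an optional keyword filter. The JSON payloads, their field names (`user`, `text`, `created`) and the helper names are all hypothetical stand-ins; every real service has its own response format, and no actual API call is made here.

```python
import json
from datetime import datetime

# Hypothetical payloads standing in for what two different services might return.
SERVICE_A = json.dumps([
    {"user": "ann", "text": "Polls just opened", "created": "2008-11-04T07:00:00"},
    {"user": "bob", "text": "I voted this morning", "created": "2008-11-04T09:30:00"},
])
SERVICE_B = json.dumps([
    {"user": "cat", "text": "Long line to vote here", "created": "2008-11-04T08:15:00"},
])


def load_items(payload, source):
    """Parse one service's JSON payload into a common item format."""
    return [
        {
            "source": source,
            "user": item["user"],
            "text": item["text"],
            "created": datetime.fromisoformat(item["created"]),
        }
        for item in json.loads(payload)
    ]


def aggregate(*batches, keyword=None):
    """Merge batches from several services, optionally filter, sort newest first."""
    merged = [item for batch in batches for item in batch]
    if keyword:
        merged = [i for i in merged if keyword.lower() in i["text"].lower()]
    return sorted(merged, key=lambda i: i["created"], reverse=True)
```

Calling `aggregate(load_items(SERVICE_A, "a"), load_items(SERVICE_B, "b"), keyword="vote")` would return only the items mentioning "vote", regardless of which service they came from.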

To prove the concept I made three small mock-ups. They all use search.twitter.com to see how people voted.

When I made the first animation, Erik Borra replied by developing the idea into something that stores the data retrieved from Twitter in a database. I then made a new interface that shows a graph based on what people say they voted on Twitter. The result is a Twitter Poll.
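The store-then-count idea behind such a poll can be sketched in a few lines. This is not the code behind the Twitter Poll: the message list, the table layout and the matching pattern are assumptions made for illustration, and the step that fetches messages from search.twitter.com is left out entirely.

```python
import re
import sqlite3

# Assumed phrasing for a vote statement; real messages vary far more than this.
VOTE_PATTERN = re.compile(r"\bI voted (?:for )?(obama|mccain)\b", re.IGNORECASE)

# Hypothetical (message_id, text) pairs standing in for search results.
SAMPLE_MESSAGES = [
    (1, "I voted for Obama!"),
    (2, "Just back from the polls, I voted McCain"),
    (3, "Still waiting in line"),
    (4, "i voted obama this morning"),
    (1, "I voted for Obama!"),  # same message seen again on a later refresh
]


def extract_vote(text):
    """Return 'obama' or 'mccain' if the text states a vote, else None."""
    match = VOTE_PATTERN.search(text)
    return match.group(1).lower() if match else None


def store_votes(conn, messages):
    """Save each stated vote once, keyed by message id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS votes "
        "(message_id INTEGER PRIMARY KEY, candidate TEXT NOT NULL)"
    )
    for message_id, text in messages:
        candidate = extract_vote(text)
        if candidate:
            # OR IGNORE skips messages that were already stored.
            conn.execute(
                "INSERT OR IGNORE INTO votes VALUES (?, ?)",
                (message_id, candidate),
            )
    conn.commit()


def tally(conn):
    """Count stored votes per candidate, ready to feed a graph."""
    rows = conn.execute(
        "SELECT candidate, COUNT(*) FROM votes GROUP BY candidate"
    )
    return dict(rows.fetchall())
```

Keeping the votes in a database rather than in memory means repeated searches can add to the same tally without counting a message twice.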

These three examples are not based on representative data; the data is simply extracted from Twitter. But they show how much personal and valuable information is in the public database. All you have to do is ask yourself what you want to tell your readers and whether this information is available.

I voted

This animation gets the latest Twitter message in which someone says they voted for McCain or Obama. It refreshes automatically.