Category Archives: databases

Teaching data journalism in developing countries: lessons from ODECA

Eva Constantaras

Eva Constantaras is a data journalist and trainer who recently wrote the Data Journalism Manual for the UN Development Program. In a special guest post she talks about the background to the manual, her experiences in working with journalists and professors who want to introduce data journalism techniques in developing nations, and why the biggest challenges are not technological, but cultural.

Over the last few years, there has been a significant shift in global experiments in data journalism education away from short term activities like boot camps and hackathons to more sustained and sustainable interventions including fellowships and institutes.

There is a growing awareness that the challenge of teaching data journalism in many countries, where neither data science nor public interest journalism is particularly common, is split straight down the middle between teaching data and teaching journalism. Open data can be a boon to democracy, but only if there are professionals capable and motivated to transform that data into information for the public.

Civio and transparency in Spain: “We fight for public access to data”

Javier de la Vega

Spanish citizens are now a step closer to understanding how power operates in the country, and how decisions affect them, thanks to the work of organisations like Civio fighting for transparency and access to public data. In October their work was recognised with the Gabriel García Márquez award in innovative journalism for their investigation Medicamentalia. In a guest post for OJB, Nuria Riquelme Palazón spoke with Javier de la Vega, one of the members of Civio.

Access to public information, accountability and participatory democracy may have been a reality in many countries for some time — but in Spain they sounded like a utopia. Entrepreneur Jacobo Elosua and computer technician David Cabo decided that this had to change.

The pair used their savings to build an organisation with the intention of serving those active citizens who, like them, believed in transparency: Civio Foundation.

Taking inspiration from organisations like mySociety in the UK, Ciudadano Inteligente in Chile and the Sunlight Foundation in the USA, they began the long process in 2011.

Civio and the power of transparency: “We fight for free access to information in Spain”

Javier de la Vega

Spanish citizens are one step closer to knowing what their government and politicians do, when, why and, most importantly, how all of this affects them. This achievement has been made possible by the tireless work of organisations like Civio, which fights for real transparency and free access to information. The team’s efforts were rewarded last October with the Gabriel García Márquez award for Innovation for one of its latest investigations: Medicamentalia. Nuria Riquelme Palazón spoke with Javier de la Vega, one of Civio’s members.

Access to public information, accountability, participatory democracy… terms that have long been a reality in countries like the United Kingdom sounded like a far-fetched utopia in others, such as Spain, and this had to change.

And the change began when Jacobo Elosua (an entrepreneur) and David Cabo (a computer specialist) pooled their savings to build an organisation at the service of those citizens who, like them, believe in transparency: the Civio Foundation.

A new data journalism tool – and a new way of reporting uncertainty

guesstimate: how long it takes to get ready for preschool

On the last day of last year, web developer Ozzie Gooen launched his new project Guesstimate, a spreadsheet ‘for things that aren’t certain’.

It is an inspired idea: software plays a key role in shaping what we do, and we take spreadsheets’ certainty about numbers for granted. Why should we?

Throw in journalism’s default dislike of ambiguity and a political tendency to play to that… well, it can all make for some flawed reporting.

I was so impressed with Guesstimate and the opportunities it presents for a new style of data reporting that I sought out Gooen to find out more about the project and how he came to launch it.

Giving a voice to the (literally) voiceless: data journalism and the dead

Red and blue person icons indicating the dead

In the Bureau’s Naming the Dead visualisation, blue indicates civilian victims and red alleged militants

Giving a voice to the voiceless is one of the core principles of journalism. Traditionally this means those without the power or money to amplify their own voices, but in recent years a strand of work has developed in data journalism that deserves particular attention: projects which give a voice to people who literally don’t have one, because they are dead.

Dashboards and journalism: why we need to do better

airplane dashboard

Confused? Knobs and dials image by anataman

Last month I watched the founder of OpenOil, Johnny West, talk via video link about a dashboard he had designed to help people more effectively report on government announcements related to Chad’s hugely important oil industry.

The dashboard struck me in all sorts of ways: firstly in automating certain processes it lowered the barrier to more effective reporting; secondly it reduced the time needed to do so; and thirdly it turned a numerical topic into something more visual, and in the process made stories easier to spot.

More from Johnny later.

First, however, it’s worth taking stock of just how big a part dashboards play in our lives, and how little a role journalists play in their creation:

  • Publishers create content management systems to allow reporters and other staff to navigate between stories, media, metrics and other tools and information
  • Social media services create dashboards as a way of navigating our networks
  • Analytics companies create dashboards to help users monitor the performance of their content

detroit dashboard

This dashboard uses Chartbeat to give a real time view of how reporters are performing

Metrics dashboards are a big part of all three, including HuffPo’s analytics and Bleacher Report’s gamification of writer performance. But what about finding stories?

Story sourcing dashboards: social and RSS

Tweetdeck and Netvibes are good examples of dashboards that save us time as journalists: specifically search time.

RSS readers like Netvibes mean that we do not need to check multiple websites or perform multiple searches to see if new information has been published or shared: instead we only need to check the Netvibes dashboard.

netvibes-dashboard

This Netvibes account has multiple tabs for different dashboards

In fact, we can set up more than one dashboard depending on when or where we might be using them: one for when we are covering health, for example; or another one for a specific event.

Social media management dashboards like Tweetdeck and Hootsuite perform a similar function, but more narrowly focused on social media and with the ability to publish through the dashboard too, and in some cases access analytics.

So on top of the time saved performing searches across multiple social networks and monitoring multiple lists or hashtags, we can add the time saved in responding.

And then there are trending dashboards like Spike that aim to help newsrooms spot breaking stories.

Sometimes organisations develop event-specific dashboards. Here, for example, you can read The Times team on the process behind designing their own election dashboard:

times red box dashboard

The Times’ audience-facing dashboard for the 2015 election was also useful for journalists

If you’re not using dashboards like these then you are probably wasting time unnecessarily. But these all rely on existing infrastructures, whether those are RSS feeds or social network APIs.

We can do better than that.

Dashboards that help more people hold power to account

This is where Johnny West comes in. Johnny was one of the speakers at the Centre for Investigative Journalism’s recent Illicit Finance course. Here’s that dashboard he designed to make it easier to interrogate new figures from Chad’s government and oil industry:

dashboard openoil

A dashboard created by Johnny West of Open Oil for Chad. Image: Joel Benjamin.

Chad’s public finances, he explained, are over 70% dependent on oil revenues and under “severe pressure” from falls in prices. The dashboard made it easier to frame questions:

“…Of whether a budget holds up with likely revenues etc [or compare] Chad’s annual EITI reports – which state revenues actually received – with what you would expect the government to receive.”

He argues that creating a visual interface to the information the journalist needs (in this case oil prices and contract agreements) is essential:

“You cannot achieve any real understanding of the many interlocking parts of the contract and revenue flows without a model of their relationships with each other. I would not trust any financial comment or analysis of oil economics done blind to a model.

“It is a bit of a challenge to get journalists to accept this – since many of them are not familiar with or comfortable with financial models.”

One result of financial illiteracy, he argues, is an inclination towards simple but meaningless comparisons: one royalty rate being higher than another; or how much an income tax rate was raised by.

“Nine out of ten such stories are simply not accurate enough to provide any service to the reader. What if the royalty rate is lower because the income tax rate was raised? Or this one has a higher royalty because it is the second discovery in an area (with less exploration risk to the company therefore putting government in a stronger negotiating position)? And so on…”

The dashboard supports the journalist in reporting something richer despite the pressure to deliver something on deadline. And it’s not just for journalists:

“We know there are many governments which do not have models like this one for contracts they themselves negotiated and signed. It may be a question of institutional knowledge: perhaps one individual once had one, or a consultancy or visiting IMF delegation. But these get hoarded and not passed on. The need for public domain versions of these kinds of applications is critical in helping states build their capacity, not just the media.”

At a broader level there are also dashboards designed by journalists to help make their colleagues’ work easier. The Investigative Dashboard was designed a few years ago to help journalists and civil society investigate organised crime and corruption. It has subsequently had an injection of cash and a relaunch.

And there are dashboards from hackdays which show how we can make better use of the data we already have:

archive_dashboard

Broken Promises dashboard by Journalism++

Those are isolated examples, but they shouldn’t be. I once created a dashboard for journalists at a Scottish newspaper to pick stories out of some data I had scraped.

It meant that journalists with very few spreadsheet skills could call up data on any one of hundreds of measures by using a drop-down menu and instantly be shown where to focus their follow-up calls.

Some great original stories and big splashes came out of that, yet all it took was a little initial effort, after which dozens of stories were easy to report.

Online spreadsheet tools like Google Sheets allow us to pull in live information, using built-in functions such as GOOGLEFINANCE to fetch stock prices, or IMPORTHTML and IMPORTFEED to scrape web tables or feeds (which themselves might be generated by scrapers).

Once we have that live information it can be connected to historical information, and those relationships can be displayed visually.

Imagine a dashboard that pulls in the latest crime reports and tells us whether they’re going up or down – and where.

Imagine the BBC’s A&E tracker redesigned for journalists as well as readers.

Imagine sports performance shown dynamically, so you can pick up on the most improved performers and not just the top performers.
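
To make the first of those ideas concrete, here is a minimal Python sketch of the logic behind a dashboard that says whether crime reports are going up or down, and where. Everything in it is an assumption for illustration: the area names, months and counts are made up, and the function names `trend_by_area` and `biggest_movers` are hypothetical. A real version would read from a live feed or scraper rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical sample data: (area, month, number of reported crimes).
REPORTS = [
    ("North", "2016-01", 120),
    ("North", "2016-02", 138),
    ("South", "2016-01", 95),
    ("South", "2016-02", 80),
    ("East", "2016-01", 60),
    ("East", "2016-02", 60),
]


def trend_by_area(reports):
    """Compare each area's latest month with the month before it."""
    by_area = defaultdict(dict)
    for area, month, count in reports:
        by_area[area][month] = count

    trends = {}
    for area, months in by_area.items():
        ordered = sorted(months)  # "YYYY-MM" strings sort chronologically
        if len(ordered) < 2:
            continue  # not enough history to show a trend
        previous, latest = months[ordered[-2]], months[ordered[-1]]
        change = latest - previous
        if change > 0:
            direction = "up"
        elif change < 0:
            direction = "down"
        else:
            direction = "flat"
        trends[area] = {"change": change, "direction": direction}
    return trends


def biggest_movers(trends):
    """Rank areas by the size of the change, largest first."""
    return sorted(trends, key=lambda area: abs(trends[area]["change"]), reverse=True)


if __name__ == "__main__":
    trends = trend_by_area(REPORTS)
    for area in biggest_movers(trends):
        info = trends[area]
        print(f"{area}: {info['direction']} ({info['change']:+d})")
```

Ranking by the size of the change rather than by the raw total is the same move that would surface the most improved sports performers rather than just the top ones.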

Much of this is already happening – but it’s not being done by journalists or news organisations.

FixMyStreet has long pioneered the ability to report – and see – local problems. And Birmingham’s Civic Dashboard showed all sorts of information on things like which parts of the council were getting the most contacts and when.

birmingham dashboard

The Birmingham Civic Dashboard

Expect to see more of these dashboards as the growth of ‘smart cities’ drives the connection of transport systems, policing, education, business and health.

But they’re not being done by news organisations. And that’s the point.

We need to change that.

Where data is already published we need to be setting up dashboards that bring it to the journalists. Where data is not, we need to be pushing for access to it.

It may be that news organisations can no longer “afford to be a paper of record and dutifully report everything that happened on our patch”. But we can do a better job of bringing as much as possible that happens to journalists’ desktops – and not just the stuff that is shared on social media.

Data journalism isn’t just a technical skill – it’s a cultural one too

kids playing game

Data journalists will always need help. Image by Widhi Rachmanto

When people talk about data journalism the emphasis is almost always on the technicalities of the role: visualisation tools and spreadsheet formulae; scraping and cleaning; coding and mashing.

But data journalism isn’t just a technical skill – it is a cultural skill too.

Let me explain what I mean. If you were to list the technologies involved in data journalism you might start with Excel or a similar spreadsheet tool. Then add Open Refine for cleaning. Some scraping tools. Mapping tools. Some tools for charts, and infographics. Some understanding of HTML and CSS will help. Also XPath, SQL, regular expressions. JavaScript, Python or Ruby or PHP. R probably too… I could go on.

If those technologies sound like too much for one person to master all at once, you’d be right. They are.

So how do data journalists get the job done? They collaborate.

They use sites like CodePen, Stack Exchange and GitHub, where others can build on your work – and you can build on the work of others. They contribute to mailing lists; they share resources; and they work with a range of other individuals and groups.

It is an open approach to reporting that borrows more from the culture of programming than journalism’s own culture of guarding information jealously.

And understanding that culture is, for me, one of the first steps to becoming a successful data journalist.

No longer the gatekeepers

For example, notice my choice of words in the sentence two lines earlier: “contribute to”; “share”; “work with”. Sometimes journalists can make demands of communities of web developers that betray an exaggerated sense of their own importance, and an ignorance of the environments that developers often work in.

Those journalists are often given short shrift as a result of their clumsiness and lack of empathy.

If journalists were the gatekeepers of the 20th century, programmers are the gatekeepers of the 21st.

We no longer need journalists to get information to an audience; but we do need programmers to connect different parts of the networks we operate in.

Recognising this is so important that I’ve codified the requirement for understanding in my data journalism teaching at Birmingham City University and City University London.

Students at BCU on the MA in Online Journalism, for example, are required to engage with – and contribute to – wider communities of practice.

That means sharing what they learn, curating useful discussions in the community, interviewing key individuals and researching problems and questions that are important to that community.

The intention is twofold: firstly to embed good habits as a member of that community. And secondly to position them so that they are able to continue to learn not just while they are on the course, but after they graduate, as technologies and practices continue to develop.

A different culture of learning

A final difference is also important to highlight: journalists and programmers have different learning cultures.

One of the questions I am asked most often by aspiring data journalists is “What should I learn first?” My response is: “What you need to for the story you’re doing right now. And if that’s too much, then pick a simpler story then work up from there.”

If you think you learn to be a data journalist by doing Codecademy or reading a book on Python, you are likely to end up frustrated. It can be helpful – but it’s neither effective nor efficient.

The learning culture of the programmer is much more piecemeal, strategic, and reliant on others.

So I would never advise a journalist to learn a particular programming language for the sake of it. Instead learn some basic concepts in programming, such as variables, data types, loops and if/else tests, and then search the web for code that solves the problem you’re trying to solve, whether that’s “making a chart in JavaScript” or “scraping a spreadsheet in Python” or “Excel function to extract a year from a date”.
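
As a rough illustration of how little you need, here is a short Python sketch that uses only those basic concepts (a variable, a couple of data types, a loop and an if/else test) to tackle the “extract a year from a date” problem. The sample dates and the `extract_year` helper are made up for this example, and it assumes dates written as DD/MM/YYYY; it is one of many ways to do it, not the definitive one.

```python
from datetime import datetime

# A variable holding a list of strings: hypothetical dates from a spreadsheet column.
raw_dates = ["12/03/2014", "01/11/2015", "not recorded", "23/07/2015"]


def extract_year(text):
    """Return the year as an integer, or None if the text is not a DD/MM/YYYY date."""
    try:
        return datetime.strptime(text, "%d/%m/%Y").year
    except ValueError:
        return None


counts = {}  # a dictionary mapping each year to how many dates fall in it
for value in raw_dates:  # loop over every date in the list
    year = extract_year(value)
    if year is None:  # if/else test: skip anything that is not a readable date
        print(f"Skipping unreadable date: {value!r}")
    else:
        counts[year] = counts.get(year, 0) + 1

print(counts)
```

Note that the code expects things to go wrong: the “not recorded” entry is there precisely because real data is messy.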

Often the next step will be a case of copying and pasting someone else’s code, and changing it slightly to see what works.

That might feel like plagiarism to a journalist, but to a programmer it is simply standing on the shoulders of giants.

Equally, if you’re trying things out in programming they often don’t work first time.

Again, that can feel like failure if you come from a humanities background. But look at it more like science: experimentation, trial and error are part of the process. In fact, programming is essentially a process of working with failure: diagnosing it, looking for solutions, and trying them with a vague expectation that they might not work.

I realised that I had learned this culture when some code of mine worked first time – and I was not only surprised – I was also vaguely disappointed. “Oh. It works. What do I do now?”

Straddling two cultures

Journalists often straddle two cultures: the sports reporter has to connect with fans, players and management; the health reporter with both doctor and patient.

In data journalism we have to draw on the same skill: only it’s not just our audience we’re connecting with, it’s the people who make those connections work.

This was first published on the BBC News Labs Radar blog.