Tag Archives: network analysis

FAQ: Big data and journalism

The latest in the series of Frequently Asked Questions comes from a UK student, who has questions about big data.

How can data journalists make sense of such quantities of data and filter out what’s meaningful?

In the same way they always have. Journalists’ role has always been to make choices about which information to prioritise, what extra information they need, and what information to include in the story they communicate.

Following the money: making networks visible with HTML5

Network analysis – mapping the connections between people and organisations – is a branch of data journalism with enormous potential. But it has not yet been particularly well explored, partly because of the lack of simple tools for doing it.

One recent example – AngelsOfTheRight.net – is particularly interesting, because of the way that it is experimenting with HTML5.

The site is attempting to map “relationships among institutions due to the exchange of large quantities of money between them as reported to IRS in a decade of Form 990 tax filings.”

But it’s also attempting to “push the limits” of using HTML5 to create network maps. As this blog post explains:

“This project was built using the NodeViz project [...] which wraps up a bunch of the functionality needed to squeeze network ties out of a database, through Graphviz, and into a browser with features like zooming, panning, and full DOM and JavaScript interaction with the rest of the page content. This means that we can do fun things like have a tour to move a viewer through the map, and have list views of related data alongside the map that will open and focus on related nodes when clicked. It is also supposed to degrade gracefully to just display a clickable image on non-SVG browsers like Internet Explorer 7 and 8.”
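To give a rough sense of the first half of that pipeline (database rows in, Graphviz input out), here is a minimal Python sketch. It is not NodeViz's code and makes no claim about how the site works internally: the function names `grants_to_dot` and `escape_dot`, the `(donor, recipient, amount)` row layout, and the placeholder organisation names are all assumptions made for illustration. It simply adds up the money flowing between each pair of organisations and writes the result in Graphviz's DOT language.

```python
from collections import defaultdict


def escape_dot(name):
    """Escape backslashes and double quotes so a name is safe inside a DOT string."""
    return name.replace("\\", "\\\\").replace('"', '\\"')


def grants_to_dot(grants):
    """Turn (donor, recipient, amount) rows into a Graphviz DOT digraph.

    Multiple grants between the same pair are summed into one edge,
    labelled with the total. The row layout is a hypothetical one.
    """
    totals = defaultdict(float)
    for donor, recipient, amount in grants:
        totals[(donor, recipient)] += amount

    lines = ["digraph funding {"]
    for (donor, recipient), total in sorted(totals.items()):
        src = escape_dot(donor)
        dst = escape_dot(recipient)
        lines.append(f'  "{src}" -> "{dst}" [label="${total:,.0f}"];')
    lines.append("}")
    return "\n".join(lines)


# Placeholder rows, not real filings.
example_grants = [
    ("Foundation A", "Think Tank B", 100000),
    ("Foundation A", "Think Tank B", 50000),
    ("Foundation C", "Think Tank B", 25000),
]
dot_source = grants_to_dot(example_grants)
print(dot_source)
```

The resulting text can be saved to a file and rendered with Graphviz, for example `dot -Tsvg funding.dot -o funding.svg`. SVG output is what makes the in-browser zooming, panning and DOM interaction described in the quote possible.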

HTML5 offers some other interesting possibilities, such as improved search engine optimisation compared to a static image or Flash interactive, although I have no idea how much this project explores that (comments invited).

Also interesting is the discussion section of AngelsOfTheRight.net, which outlines some of the holes in the data, methodological flaws, and ways that the project could be improved:

“In this sort of survey, it is always hard to tell if organizations are missing because they really didn’t make contributions, or just because nobody had time to record the data from their financial statements into the database. Several sources mention the Adolph Coors Foundation as an important funder of the conservative agenda, yet they do not appear in this database. Why not?”
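One simple way to surface gaps like this is to compare a list of funders you expect to see (from other reporting, say) against the donors that actually appear in the data. The Python sketch below does only that. The function name `missing_funders`, the row layout and the placeholder rows are assumptions for illustration, and the check cannot say why an organisation is absent, only that it is.

```python
def missing_funders(expected, grants):
    """Return the expected donor names that never appear as a donor in the grants.

    Names are compared case-insensitively with surrounding whitespace removed;
    spelling variants of the same organisation would still need manual matching.
    """
    donors = {donor.strip().lower() for donor, _recipient, _amount in grants}
    return [name for name in expected if name.strip().lower() not in donors]


# Placeholder rows, not real filings.
recorded_grants = [
    ("foundation a ", "Think Tank B", 1000),
    ("Foundation C", "Think Tank B", 2500),
]
expected_names = ["Adolph Coors Foundation", "Foundation A"]
gaps = missing_funders(expected_names, recorded_grants)
print(gaps)
```

Each name that comes back is a question to take to the original filings rather than an answer in itself.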

via Pete Warden