Since the start of the year the Argentinian newspaper ‘La Nación’ has been publishing ‘Nación Data’, a blog dedicated to data visualization, interactive projects and, especially, news related to data journalism.
During this time they have been posting interviews with experts from the community, reporting on popular events such as NICAR and sharing the most innovative pieces made by other newspapers.
The multimedia development manager of ‘La Nación’, Momi Peralta, pointed out that their main goal so far is to release as much data as they can.
In this guest post, Damian Radcliffe highlights some topline developments in the hyper-local space during 2011. He also asks for your suggestions of great hyper-local content from 2011. His more detailed slides looking at the previous year are cross-posted at the bottom of this article.
2011 was a busy year across the hyper-local sphere, with a flurry of activity online as well as on more traditional platforms such as TV, radio and newspapers.
The Government’s plans for Local TV have been considerably developed, following the Shott Review just over a year ago. We now have a clearer indication of the areas which will be first on the list for these new services and how Ofcom might award these licences. What we don’t know is who will apply for these licences, or what their business models will be. But, this should become clear in the second half of the year.
Whilst the Leveson Inquiry hasn’t directly been looking at local media, it has been a part of the debate. Claire Enders outlined some of the challenges facing the regional and local press in a presentation showing declining revenue, jobs and advertising over the past five years. Her research suggests that the impact of “the move to digital” has been greater at a local level than at the nationals.
Across the board, funding remains a challenge for many. But new models are emerging, with Daily Deals starting to form part of the revenue mix alongside money from foundations and franchising.
And on the content front, we saw Jeremy Hunt cite a number of hyper-local examples at the Oxford Media Convention, as well as record coverage for regional press and many hyper-local outlets as a result of the summer riots.
I’ve included more on all of these stories in my personal retrospective for the past year.
One area where I’d really welcome feedback is examples of hyper-local content you produced – or read – in 2011. I’m conscious that a lot of great material may not necessarily reach a wider audience, so do post your suggestions below and hopefully we can begin to redress that.
Here are the passages most relevant for journalists. Firstly, following the money and accountability:
“The [Data Strategy Board] body seeking public data will be reliant upon the profitability of the PDG [Public Data Group] in order to have the funding it needs to secure the release of data that, if properly released in free forms, would likely undermine the current trading revenue model of the PDG. That doesn’t look like the foundation for very independent and effective governance or regulation to open up core reference data!
“Furthermore, whilst the proposed terms for the DSB [Data Strategy Board] terms state that “Data users from outside the public sector, including representatives of commercial re-users and the Open Data community, will represent at least 30% of the members of DSB”, there are also challenges ahead to ensure data users from civil society interests are represented on the board”
Secondly, the emphasis on clinical data and issues surrounding privacy and the sale of personal data:
“The first measures in the Cabinet Office’s paper are explicitly not about open data as public data, but are about the restricted sharing of personal medical records with life-science research firms – with the intent of developing this sector of the economy. With a small nod to “identifying specified datasets for open publication and linkage”, the proposals are more centrally concerned with supporting the development of a Clinical Practice Research Datalink (CPRD) which will contain interlinked ‘unidentifiable, individual level’ health records, by which I interpret the ability to identify a particular individual with some set of data points recorded on them in primary and secondary care data, without the identity of the person being revealed.
“The place of this in open data measures raises a number of questions, such as whether the right constituencies have been consulted on these measures and why such a significant shift in how the NHS may be handing citizens personal data is included in proposals unlikely to be heavily scrutinised by patient groups? In the past, open data policies have been very clear that ‘personal data’ is out of scope – and the confusion here raises risks to public confidence in the open data agenda. Leaving this issue aside for the moment, we also need to critically explore the evidence that the release of detailed health data will “reinforce the UK’s position as a global centre for research and analytics and boost UK life sciences”. In theory, if life science data is released digitally and online, then the firms that can exploit it are not only UK firms – but the return on the release of UK citizens personal data could be gained anywhere in the world where the research skills to work with it exist.”
Thirdly, it looks like this data will allow journalists to scrutinise welfare and credit (so plenty of material for the tabloids and mid-market press), but not data that scrutinises corporations or governments:
“When we look at the other administrative datasets proposed for release in the Measures the politicisation of open data release is evident: Fit Note Data; Universal Credit Data; and Welfare Data (again discussed for ‘linking’ implying we’re not just talking about aggregate statistics) are all proposed for increased release, with specific proposals to “increase their value to industry”. By contrast, no mention of releasing more details on the tax share paid by corporations, where the UK issues arms export licenses, or which organisations are responsible for the most employment law violations. Although the stated aims of the Measures include increasing “transparency and accountability” it would not be unreasonable to read the detail of the measures as very one-sided on this point: and emphasising industry exploitation of data far more than good governance and citizen rights with respect to data.
“The blurring of the line between ‘personal data’ and ‘open data’, and the state’s assumption of the right to share personal data for industrial gain should give cause for concern, and highlights the need for build a stronger constituency scrutinising government open data action.”
It’s nice to see a data initiative being greeted with a critical eye rather than Three Cheers for the Numbers.
UPDATE: On a similar note, Access Info Europe highlights problems with the Open Government Partnership, which “must significantly improve its internal access to information policy to meet the standards it is advancing”. Specifically:
“The policy should be reformed to incorporate basic open data principles such as that information will be made available in a machine-readable, electronic format free of restrictions on reuse.”
“A key problem is the lack of detail in the policy, which has the result of leaving important matters to the discretion of the OGP. Other key problems include:
» The failure of the policy to recognise the fundamental human right to information;
» The significantly overbroad and discretionary regime of exceptions;
» The failure of the draft Policy to put in place a system of protections and sanctions.”
How does Germany’s foreign aid support other countries? The Federal Ministry for Economic Cooperation and Development (BMZ) releases no details, although about 6 billion euros is made available for aid every year. Now the Open Knowledge Foundation in Germany has broken down the data – with the unintended help of the OECD.
Until now it was a mystery to the German public which countries benefit, and to what extent, from their government’s spending on foreign aid: the BMZ publishes only a list of the countries that receive aid (PDF). It was also not known which particular sectors in these countries were being supported.
For the political scientist Christian Kreutz, member of the Open Knowledge Foundation Germany, the BMZ database for development aid was just disappointing:
“The relevant information is scattered, little data is available in open formats and a breakdown of financial data such as project expenses is not yet published.”
For two days Christian Kreutz wrangled with the data sets, and then presented his first results on a new open-data map. More than half the ODA payments come from the BMZ; the rest come from other ministries. Kreutz concludes: “Hardly any country receives nothing.”
Surprising findings
Interestingly, it is not only classic developing countries that are supported. The lion’s share goes to the BRIC countries, namely Brazil, Russia, India and China, which have profited from high economic growth for years.
Russia received around 12 billion euros in the years 1995 to 2009, China and India around 6 and 4 billion euros respectively.
Current sites of conflict receive quite a lot of money: Iraq received 7 billion euros, with the majority coming from debt cancellation. A similar situation is found in Nigeria and Cameroon.
In comparison, Afghanistan and Pakistan received only about 1.2 billion euros.
Even authoritarian regimes benefit from German development aid: Syria received around 1 billion euros. A large proportion of the money is spent on debt relief as well as water and education projects.
Interestingly, however, some European states received more money: Poland got 2.8 billion, mainly going into the education sector.
EU aspirants Serbia and Turkey received 2 billion euros each.
Payment information was also combined with data from the Economist on democratic development. Here a kind of rule of thumb can be recognised: it is the less democratic countries that are supported.
Egypt, for example, not only received support for water projects and its textile industry, but also for its border police – by an unspecified federal ministry.
BMZ is opening up
The new aid data map does not break down numbers by donors yet. But it could do so, as the detailed OECD data supports it.
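For anyone wanting to try the donor breakdown themselves, the grouping step might look something like the sketch below. It is only an illustration: the row layout (donor, recipient, amount), the function names and the sample values are all assumptions made up for this example, not the real OECD schema or real figures.

```python
from collections import defaultdict

# Hypothetical rows: (donor, recipient, amount in million euros).
# Placeholder values for illustration only, not real OECD data.
SAMPLE_ROWS = [
    ("Ministry A", "Country X", 10.0),
    ("Ministry B", "Country X", 5.0),
    ("Ministry A", "Country Y", 2.5),
]


def totals_by_donor(rows):
    """Sum payments per donor and, within each donor, per recipient."""
    totals = defaultdict(lambda: defaultdict(float))
    for donor, recipient, amount in rows:
        totals[donor][recipient] += amount
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {donor: dict(recipients) for donor, recipients in totals.items()}


def donor_share(rows, donor):
    """Return the fraction of all payments that comes from one donor."""
    total = sum(amount for _, _, amount in rows)
    if total == 0:
        return 0.0
    donor_total = sum(amount for d, _, amount in rows if d == donor)
    return donor_total / total


if __name__ == "__main__":
    print(totals_by_donor(SAMPLE_ROWS))
    print(donor_share(SAMPLE_ROWS, "Ministry A"))
```

With real data the rows would come from whatever export the source offers; the grouping logic stays the same.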
Christian Kreutz has filed a Freedom of Information Act request with the BMZ to get further data. But the ministry is already showing signs of movement: a spokesperson said that project funding data will be published soon on the ministry’s website.
The interesting question is how open and accessible the BMZ data will be. Recipients of ODA funds cannot be inferred directly from the OECD database. Open data activists hope that the BMZ will not hide the data behind a restrictive search interface to prevent further analysis, à la Farmsubsidy.
Instead it’s falling to the likes of Tony Hirst (an Open University academic), Dan Herbert (an Oxford Brookes academic) and Chris Taggart (a developer who used to be a magazine publisher) to fill the scrutiny gap. Recently all three have shone a light on the move towards transparency and open data, in posts which anyone with an interest in information would be well advised to read.
What all three highlight is how control of information still represents the exercise of power, and how shifts in that control as a result of the transparency/open data/linked data agenda are open to abuse, gaming, or spin.
This is the fourth part of my inaugural lecture at City University London, ‘Is Ice Cream Strawberry?’. You can find part one here, part two here, and part three here.
Human capital
So here’s person number 4: Gary Becker, a Nobel prize-winning economist.
Fifty years ago he used the phrase ‘human capital’ to refer to the economic value that companies should ascribe to their employees.
These days, of course, it is common sense to invest time in recruiting, training and retaining good employees. But at the time employees were seen as a cost.
We need a similar change in the way we see our readers – not as a cost on our time but as a valuable part of our operations that we should invest in recruiting, developing and retaining.
It’s a refreshingly simple execution: a WordPress blog with each question as a separate blog post – presumably it cost a lot less than £300,000. But of course the questions are theirs, and they are:
It’s a shame that there isn’t any space for more open discussion – and that so many of the questions resemble market research. But still, the more journalists who pile in – the more justifiably we can moan later. So go ahead.
There have been a raft of new sites for data launched in the past couple of months which I haven’t had time to blog about, so here’s a quick round-up:
Tim Davies’ Open Data Cookbook aims to collect “step by step recipes for practical ways to use open data” – a useful complement to GetTheData. The recipes are currently aimed at the more technically minded but you know what to do to address that…
Is It Open Data? aims to “make it easy for people to make enquires of data holders, about the openness of the data they hold — and to record publicly the results of those efforts.”
And for those wishing to publish open data, The Open Data Manual provides information on what open data is, why you should publish open data, and how to do it. If you come up against an organisation that does not know how to publish their data in an open format, or needs convincing of why they should do so, this is a good place to point them to (or learn the arguments from).
If you’ve seen any other useful resources of late, please post a link in the comments.
Conrad Quilty-Harper writes about the new crime data from the UK police force – and in the process adds another straw to the groaning camel’s back of the government’s so-called transparency agenda:
“It’s useless to residents wanting to find out what was going on at the house around the corner at 3am last night, and it’s useless to individuals who want to build mobile phone applications on top of the data (perhaps to get a chunk of that £6 billion industry open data is supposed to create).
“The site’s limitations are as follows:
No IDs for crimes: what if I want to check whether real life crimes have made it onto the map? Sorry.
Six crime categories: including “other crimes”, everything from drug dealing to bank robberies in one handy, impossible to understand category.
No live data: you mean I have to wait until the end of the next month to see this month’s criminality?!
No dates or times: funny how without dates and times I can’t tell which police manager was in charge.
Case status: the police know how many crimes go solved or unsolved, why not tell us this?”
Some data such as sexual offences and murder is removed – even though it would be easy to discover and locate from other police reports.
Data covers reported crimes rather than convictions, so some of it may turn out not to be crime.
The levels of policing are not provided, so that two areas with the “same” crime levels may in fact have “radically different” experiences of crime and policing.
Charles Arthur notes that: “Police forces have indicated that whenever a new set of data is uploaded – probably each month – the previous set will be removed from public view, making comparisons impossible unless outside developers actively store it.”
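If each month’s release really does replace the last, the storing that outside developers would have to do could be as simple as the sketch below. It assumes you have already obtained a month’s data as bytes (fetching it is left out on purpose), and the file naming scheme, the CSV extension and the function names are hypothetical choices for this example.

```python
import hashlib
from pathlib import Path


def archive_snapshot(data: bytes, month: str, archive_dir: Path) -> Path:
    """Save one month's release under a month-stamped name, never overwriting."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    # A short content hash keeps a silently revised release for the same
    # month from replacing the version that was archived earlier.
    digest = hashlib.sha256(data).hexdigest()[:12]
    target = archive_dir / f"crime-{month}-{digest}.csv"
    if not target.exists():
        target.write_bytes(data)
    return target


def list_snapshots(archive_dir: Path) -> list[str]:
    """Return archived file names in sorted (and so chronological) order."""
    return sorted(p.name for p in archive_dir.glob("crime-*.csv"))
```

Because the month uses a year-first format such as `2011-01`, sorting the file names also sorts the snapshots by date, which is what makes month-on-month comparison straightforward later.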
“What we’ve actually got with www.police.uk is neither one nor the other. Ruth looks like a crime overlord cos of all the crimes happening in her garden and we haven’t got exact point data, but we haven’t got first part of postcode data either e.g. BB5 crimes or NW1 crimes. Instead, we’ve got this weird halfway house thing where it’s not accurate, but its inaccuracy almost renders it useless because we don’t have any idea if every force uses the same parameters when picking these points, we don’t know how they pick their points, we don’t know what we don’t know in terms of whether one house in particular is causing a considerable issue with anti-social behaviour for example, allowing me to go to my local Council and demand they do something about it.”
Adrian Short argues that “What we’re looking at here isn’t a value-neutral scientific exercise in helping people to live their daily lives a little more easily, it’s an explicitly political attempt to shape the terms of a debate around the most fundamental changes in British policing in our lifetimes.”
He adds:
“It’s derived data that’s already been classified, rounded and lumped together in various ways, with a bit of location anonymising thrown in for good measure. I haven’t had a detailed look at it yet but I would caution against trying to use it for anything serious. A whole set of decisions have already transformed the raw source data (individual crime reports) into this derived dataset and you can’t undo them. You’ll just have to work within those decisions and stay extremely conscious that everything you produce with it will be prefixed, “as far as we can tell”.
“£300K for this? There ought to be a law against it.”
UPDATE 2: One frustrated developer has launched CrimeSearch.co.uk to provide “helpful information about crime and policing in your area, without costing 300k of tax payers’ money”