“Future employed a Community Editor to engage with the online craft audience and build a buzz in the months leading up to the launch of Mollie Makes.”
A post on the Guardian Datablog yesterday (Higher education funding: which institutions will be affected?) alerted me to the release of HEFCE’s “provisional allocations of recurrent funding for teaching and research, and the setting of student number control limits for institutions, for academic year 2012-13” (funding data).
Here are the OU figures for teaching:
HEFCE preliminary teaching funding allocations to the Open University, 2012-13, are broken down under the following headings:

- Funding for old-regime students (mainstream)
- Funding for old-regime students (co-funding)
- High cost funding for new-regime students
- Widening participation
- Teaching enhancement and student success
- Other targeted allocations
- Other recurrent teaching grants
- Total teaching funding
Of the research funding for 2012-13, mainstream funding was £8,030,807, the RDP supervision fund came in at £1,282,371, along with £604,103 “other”, making up the full £9,917,281 research allocation.
Adding Higher Education Innovation Funding of £950,000, the OU’s total allocation was £139,714,060.
So what other funding comes into the universities from public funds?
OpenSpending publishes data relating to spend by government departments to named organisations, so we can search it for payments made by government departments to the universities (for example, here is a search on OpenSpending.org for “open university”).
Given the amounts spent by public bodies on consultancy (try searching OpenCorporates for mentions of PricewaterhouseCoopers, or any of EDS, Capita, Accenture, Deloitte, McKinsey, BT’s consulting arm, IBM, Booz Allen, PA, KPMG (h/t @loveitloveit)), university-based consultancy may come in reasonably cheaply by comparison?
The universities also receive funding for research via the UK research councils (EPSRC, ESRC, AHRC, MRC, BBSRC, NERC, STFC) along with innovation funding from JISC. Unpicking the research council funding awards to universities can be a bit of a chore, but scrapers are appearing on Scraperwiki that make for easier access to individual grant awards data:
- AHRC funding scraper; [grab data using queries of the form select * from `swdata` where organisation like "%open university%" on scraper arts-humanities-research-council-grants]
- EPSRC funding scraper; [grab data using queries of the form select * from `grants` where department_id in (select distinct id as department_id from `departments` where organisation_id in (select id from `organisations` where name like "%open university%")) on scraper epsrc_grants_1]
- ESRC funding scraper; [grab data using queries of the form select * from `grantdata` where institution like "%open university%" on scraper esrc_research_grants]
- BBSRC funding [broken?] scraper;
- NERC funding [broken?] scraper;
- STFC funding scraper; [grab data using queries of the form select * from `swdata` where institution like "%open university%" on scraper stfc-institution-data]
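The queries above can be run against a scraper’s datastore over HTTP. Here’s a minimal sketch that builds such a request for the STFC scraper mentioned above; the endpoint URL reflects ScraperWiki’s historical SQLite API and may no longer be live, so treat it as an assumption:

```python
# Sketch: query a ScraperWiki scraper's SQLite datastore over HTTP.
# The API endpoint is the historical ScraperWiki one and is an assumption;
# the scraper name and SQL are taken from the post.
import json
import urllib.parse
import urllib.request

API = "https://api.scraperwiki.com/api/1.0/datastore/sqlite"

def datastore_url(scraper, sql):
    """Build the query URL for a ScraperWiki datastore SQL query."""
    params = {"format": "json", "name": scraper, "query": sql}
    return API + "?" + urllib.parse.urlencode(params)

sql = 'select * from `swdata` where institution like "%open university%"'
url = datastore_url("stfc-institution-data", sql)
print(url)

# To actually fetch the rows (requires the service to be live):
# rows = json.loads(urllib.request.urlopen(url).read())
```

The same pattern works for the other scrapers listed, swapping in the scraper name and query from each bullet point.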
In order to get a unified view over the detailed funding of the institutions from these different sources, the data needs to be reconciled. There are several ID schemes for identifying universities (eg UCAS or HESA codes; see for example GetTheData: Universities by Mission Group) but even official data releases tend not to make use of these, preferring instead to rely solely on institution names, as for example in the case of the recent HEFCE provisional funding data release [Doh! This is not the case – identifiers are there, apparently (I have to admit, I didn’t check and was being a little hasty…). See the contribution/correction from David Kernohan in the comments to this post…]:
For some time, I’ve been trying to put my finger on why data releases like this are so hard to work with, and I think I’ve twigged it… even when released in a spreadsheet form, the data often still isn’t immediately “database-ready” data. Getting data from a spreadsheet into a database often requires an element of hands-on crafting – coping with rows that contain irregular comment data, as well as handling columns or rows with multicolumn and multirow labels. So here are a couple of things that would make life easier in the short term, though they maybe don’t represent best practice in the longer term…:
1) release data as simple CSV files (odd as it may seem), because these can be easily loaded into applications that can actually work on the data as data. (I haven’t started to think too much yet about pragmatic ways of dealing with spreadsheets where cell values are generated by formulae, because they provide an audit trail from one data set to derived views generated from that data.)
2) have a column containing regular identifiers using a known identification scheme, for example, HESA or UCAS codes for HEIs. If the data set is a bit messy, and you can only partially fill the ID column, then only partially fill it; it’ll make life easier joining those rows at least to other related datasets…
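Point (2) is worth spelling out: even a partially filled ID column lets you join cleanly where the IDs exist and flag the rest for manual reconciliation. A minimal sketch, using made-up institution codes and figures purely for illustration:

```python
# Sketch of joining two datasets on a partially filled institution-ID column.
# The HESA-style codes and funding figures below are invented for illustration.
import csv
import io

funding_csv = """institution,hesa_id,total_funding
The Open University,0021,139714060
Somewhere University,,12345678
"""

lookup_csv = """hesa_id,ucas_code,region
0021,O11,South East
"""

funding = list(csv.DictReader(io.StringIO(funding_csv)))
lookup = {row["hesa_id"]: row for row in csv.DictReader(io.StringIO(lookup_csv))}

for row in funding:
    # Join on the ID where it is present; rows with a blank ID get None
    # and can be queued for name-based reconciliation later.
    match = lookup.get(row["hesa_id"]) if row["hesa_id"] else None
    row["region"] = match["region"] if match else None

print([(r["institution"], r["region"]) for r in funding])
```

Rows carrying an ID join without any string matching at all; only the gaps need hand-crafting.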
As far as UK HE goes, the JISC monitoring unit/JISCMU has an API over various administrative data elements relating to UK HEIs (eg GetTheData: Postcode data for HE and FE institutes), but I don’t think it offers a Google Refine reconciliation service (ideally with some sort of optional string similarity service)…? Yet?! 😉 Maybe that’d make for a good rapid innovation project???
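The string-similarity part of such a service is easy to sketch. Here’s a toy matcher using Python’s `difflib`; the canonical names and codes are invented for illustration, and a real service would wrap something like this behind a Refine reconciliation API:

```python
# Toy name-reconciliation sketch using stdlib string similarity.
# The canonical list and "HESA" codes are made up for illustration only.
import difflib

canonical = {
    "the open university": "HESA-0021",
    "university of oxford": "HESA-0123",
}

def reconcile(name, cutoff=0.8):
    """Return (canonical name, code) for the closest match above cutoff, else None."""
    hits = difflib.get_close_matches(name.lower(), canonical, n=1, cutoff=cutoff)
    return (hits[0], canonical[hits[0]]) if hits else None

print(reconcile("The Open Universty"))  # tolerates the typo
print(reconcile("nothing like it"))     # no confident match
```

In practice you would tune the cutoff and add normalisation (dropping “The”, “University of”, punctuation) before scoring, since raw edit similarity is fragile on reordered names.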
PS I’m reminded of a couple of related things: 1) Test Your RESTful API With YQL, a corollary to the idea that you can check your data at least works by trying to use it (eg generating a simple chart from it), mapped to the world of APIs: if you can’t easily generate a YQL table/wrapper for it, it’s maybe not that easy to use? 2) the ScraperWiki/OKF post from @frabcus and @rufuspollock on the need for data management systems, not content management systems.
PPS Looking at the actual Guardian figures reveals all sorts of market levers appearing… Via @dkernohan, FT: A quiet Big Bang in universities
A new community for hyperlocal bloggers has been launched: Hyperlocal Alliance is “intended for grass-roots hyperlocal site owners, [and] is invite only (at the moment)”.
The Journalism Foundation has published a resource aimed at hyperlocal publishers – How To Build a Local Site (PDF) – including a chapter taken from the Online Journalism Blog (a rather curious choice, but there you go) and a link to Help Me Investigate in the Further Reading section.
It’s also offering up to £50,000 in funding for hyperlocal projects.
OpenlyLocal are trying to scrape planning application data from across the country. They want volunteers to help write the scrapers using Scraperwiki – and are paying £75 for each one.
This is a great opportunity for journalists or journalism students looking for an excuse to write their first scraper: there are three sample scrapers to help you find your feet, with many more likely to appear as they are written. Hopefully, some guidance will appear too (if not, I may try to write some myself).
Add your names in the comments on Andrew’s blog post, and happy scraping!
A must-read for any data journalist, aspiring or otherwise, is Simon Rogers’ post on The Guardian Datablog where he compares public and private sector pay.
This is a classic apples-and-oranges situation where politicians and government bodies are comparing two things that, really, are very different. Is a private school teacher really comparable to someone teaching in an unpopular school? What is the private sector equivalent of a director of public health or a social worker?
But if these issues are being discussed, journalists must try to shed some light, and Simon Rogers does a great job in unpicking the comparisons. From pay and hours worked, to qualifications and age (big differences in both), and gender and pay inequality (more women in the public sector, more lower- and higher-paid workers in the private sector), Rogers crunches all the numbers.
People who live in areas branded as ‘problem communities’ by the media feel disengaged from the news – but hyperlocal citizen journalism offers an opportunity to re-engage citizens. These are the findings of a piece of research from the Netherlands called ‘When News Hurts’, which measured mainstream coverage of ‘problem communities’ then followed a hyperlocal project which involved local people.
The findings won’t be a big surprise to those running hyperlocal blogs, which often focus on practical steps to improving their area and building civic participation rather than merely telling the stories of failure. But they do offer some lessons for traditional publishers, not just on what they could do better, but on what they’re doing badly in their current coverage – especially the regional publishers who would be expected to provide more ground-level reporting on local issues:
“Remarkably, in spite of being located close to these areas, the regional press hardly differed in their coverage from their national (quality) counterparts […] National newspapers quoted residents in 23 per cent of their larger reports on Kanaleneiland and 35 per cent of their reports on Overvecht. The regional newspaper quoted residents in only 26 per cent of its larger reports on Kanaleneiland and in 24 per cent of its reports on Overvecht. Unexpectedly, 55 per cent of all news items about a nearby elite neighbourhood (Wittevrouwen) used a resident as source.”
In a guest post for OJB, Neil Thurman highlights a new research report that suggests the increased availability of news on mobile platforms, and its harnessing of social networks—like Facebook—to power recommendations, comes at a price: stories that are less relevant to readers’ interests than those recommended by editors and found on news providers’ traditional websites.