Tag Archives: rufus pollock

From CMS to DMS

Francis Irving and Rufus Pollock make a persuasive argument in a joint blog post about the growth of data management systems (the ‘DMS’ to the content management system’s ‘CMS’):

“Just as then we wrote HTML in text files by hand and uploaded it by FTP, now we analyse data on our laptops using Excel, and share it with friends by emailing CSV files.

“But it reaches the point where using the filesystem and Outlook as your DMS stretches to breaking point. You’ll need a proper one.

“Nobody really knows what a proper one will look like yet. We’re all working on it.”

Their post lists what a DMS needs to do and the companies already trying to solve the ‘DMS problem’ from different directions. The list includes Google Docs (“coming from the web spreadsheet direction”), the data social network BuzzData, visualisation tool Tableau, data marketplaces, operating systems, Scraperwiki, and PANDA (“making a DMS for newsrooms”).

It’s a well-drawn picture from an angle which I haven’t seen before. Certainly, a number of news organisations are trying to reduce the friction of producing content for different platforms by ‘atomising’ it in data-driven production processes (where a piece of content might be assembled and presented differently depending on the platform it is accessed through, for example), and their internal systems can probably be added to the list above.

What do you think? Is this a problem that’s being addressed in your own organisation?

Where do I get that data? New Q&A site launched


Well here’s another gap in the data journalism process ever-so-slightly plugged: Tony Hirst blogs about a new Q&A site that Rufus Pollock has built. Get the Data allows you to “ask your data related questions, including, but not limited to, the following:

  • “where to find data relating to a particular issue;
  • “how to query Linked Data sources to get just the data set you require;
  • “what tools to use to explore a data set in a visual way;
  • “how to cleanse data or get it into a format you can work with using third party visualisation or analysis tools.”

As Tony explains (the site came out of a conversation between him and Rufus):

“In some cases the data will exist in a queryable and machine readable form somewhere, if only you knew where to look. In other cases, you might have found a data source but lack the query writing expertise to get hold of just the data you want in a format you can make use of.”
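To give a flavour of what that ‘query writing expertise’ involves, here is a minimal sketch (mine, not Tony’s or Rufus’s) of preparing a query for a Linked Data source in Python. The endpoint address is a made-up placeholder and the query is a generic illustration; the sketch only builds the request and never sends it, so treat it as a starting point rather than a recipe.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, used purely for illustration; swap in a real one.
ENDPOINT = "https://example.org/sparql"

# A generic SPARQL query: ten things and their human-readable labels.
QUERY = """
SELECT ?item ?label
WHERE { ?item <http://www.w3.org/2000/01/rdf-schema#label> ?label }
LIMIT 10
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Prepare (but do not send) a GET request carrying a SPARQL query."""
    # The SPARQL Protocol passes the query text in a 'query' URL parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for results as JSON, which most analysis tools can read.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would actually fetch the results, assuming the endpoint exists and supports JSON output.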

He also invites people to help populate the site:

“If you publish data via some sort of API or queryable interface, why not consider posting self-answered questions using examples from your FAQ?

“If you’re running a hackday, why not use GetTheData.org to post questions arising in the scoping of the hacks, tweet a link to the question to your event backchannel and give the remote participants a chance to contribute back, at the same time adding to the online legacy of your event.”

Off you go then.