During some training in open data I was doing recently, I ended up explaining (it’s a long story) how to pull a feed from Delicious into a Google Docs spreadsheet. I promised I would put it down online, so: here it is.
In a Google Docs spreadsheet the formula =importfeed will pull information from an RSS feed and put it into that spreadsheet. Titles, links, datestamps and other parts of the feed will each be separated into their own columns.
When combined with Delicious, this can be a useful way to collect together pages that have been bookmarked by a group of people, or any other feed that you want to analyse.
Here’s how you do it:
1. Decide on your tag, network or user
The spreadsheet will pull data from an RSS feed. Delicious provides so many of these that you are spoilt for choice. Here are the main three:
Used by various people.
Advantages: quick startup – all you need to do is tell people the tag (make sure this is unique, such as ‘unguessable2012’).
Disadvantages: others can hijack the tag – although this can be cleaned from the resulting data.
Consisting of the group of people who are bookmarking:
Advantages: group cannot be infiltrated.
Disadvantages: setup time – may need to create a new account to build the network around.
Created for this purpose:
Advantages: if users are not confident in using Delicious, this can be a useful workaround.
Disadvantages: longer set up time – you’ll need to create a new account, and work out an easy way for it to automatically capture bookmarks from the group. One way is to pull an RSS feed of any mentions on Twitter and use Twitterfeed to auto-tweet them with a hashtag, and then Packrati.us to auto-bookmark all tweeted links (a similar process is detailed here).
The RSS feed for each will be found at the bottom of pages, and is consistently formatted like so:
2. Create your spreadsheet
In Google Docs, create a new spreadsheet and in the first cell type the following formula:
…adding your RSS feed after the quotation mark, and then this at the end:
So it looks something like this:
Now press enter and after a moment the spreadsheet should populate with data from that feed.
You’ll note, however, that at most you will have only 15 rows of data here. That’s because the RSS feed you’ve copied includes that limitation.
If you look at the RSS feed you’ll see an easy clue on how to change this…
So, try editing it so that the count=15 part of that URL reads count=20 instead. You can put a higher number – but Google Docs will limit results to 20 at a time.
3. Collecting contributions
Technically, you’re now all set up. The bigger challenge is, of course, in getting people to contribute. It helps if they can see the results – so think about publishing your spreadsheet.
You’ll also need to make sure that you check it regularly and copy into a backup spreadsheet so you don’t miss results after that top 20.
But if you find it doesn’t work it may be worth thinking of other ways of doing this – for example, with a Google Form, or using =importfeed with the RSS feed for a search on results for a Twitter hashtag containing links (Twitter’s advanced search allows you to limit results accordingly – and all search results come with an RSS feed link like this one)
Of course there are far more powerful ways of doing this which are worth exploring once you’ve understood the basic possibilities.
Doing more with =importfeed
The =importfeed formula has some other elements that we haven’t used.
Another way to do this, for example, is to paste your RSS feed URL into cell A1 and type the following anywhere else:
=importfeed(A1, “Items Title”, FALSE, 20)
This has 4 parts in the parentheses:
- A1 – this points at the URL you just pasted in cell A1, and means that you only have to change what’s in A1 to change the feed being grabbed, rather than having to edit the formula itself
- “Items Title” – this is the part of the feed that is being grabbed. If you look in the feed you will see a part that says <item> and within that, an element called <title> – that’s it. You could change this to “Items URL” to get the <URL> part of <title> instead, for example. Or you could just put “Items” and get all 5 parts of each item (title, author, URL, date created, and summary). You can also use “feed” to get information about the feed itself, or “feed URL” or “feed title” or “feed description” to get that single piece of information.
- FALSE – this just says whether you want a header row or not. Setting to TRUE will add an extra row saying ‘Title’, for example.
- 20 – the number of results you want.