Monthly Archives: April 2011

The law, ethics & effectiveness of PR firms offering bloggers prizes-to-post

A PR firm recently invited me to review their client’s product, saying that if I did review it I would be entered into a prize draw with other ‘qualifying’ bloggers to win an iPad 2.

It was a product I might ordinarily have covered, but this approach made me reluctant.

Here’s reason number 1: I asked myself whether the PR firm would have made the same approach to print journalists. I doubt it. Why? Because it would have raised obvious ethical issues, and called the journalists’ professionalism into question.

So were they assuming that bloggers had different ethics? I doubt they thought that hard – more likely was that some bright spark thought that eager, amateur bloggers would jump at the chance to get anything for their hard work.

Here’s reason number 2: other bloggers will have been approached with the same offer. If they saw me review the product they would assume that I had done so in exchange for this prize draw ticket. They would see me as unprofessional, unethical, or both.

In PR terms, then, the approach was counter-productive: it actually made me less likely to give their client coverage.

Data for journalists: JSON for beginners

Following the post earlier this week on XML and RSS for journalists I wanted to look at another important format for journalists working with data: JSON.

JSON is a data format which has been rising in popularity over the past few years. Quite often it is offered alongside – or instead of – XML by various information services, such as Google Maps, the UK Postcodes API and the Facebook Graph API.

Because of this, in practice JSON is more likely to be provided in response to a specific query (“Give me geographical and political data about this location”) than a general file that you access (“Give me all geographical data about everywhere”).

I’ll describe how you supply that query below.

Which blog platform should I use? A blog audit

When people start out blogging they often ask what blogging platform they should use – WordPress or Blogger? Tumblr or Posterous? It’s impossible to give an answer, because the first questions should be: who is going to use it, how, for what, and for whom?

To illustrate how the answers to those questions can help in choosing the best platform, I decided to go through the 35 or so blogs I have created, and explain why I chose the platforms that they use. As more and more publishing platforms have launched, and new features have been added, some blogs have changed platforms, while new ones have made different choices to older ones.

Guest post: visualising mobile phone data – the data retention app

In a guest post Lorenz Matzat, editor of ZEIT Online’s Open Data Blog, writes about the background to their online app exploring the issues around data retention by mobile phone companies.

It’s not very often that one can follow the direct impact of an article, let alone a piece of data journalism. But the visualization of the cellphone data of Malte Spitz from the Green party in Germany led to visible repercussions in the US.

Following a piece in the New York Times about Spitz and the data app, two senators wrote to the four main US carriers a few days ago asking for information about their data retention policies.

After we published the app in German one month ago (and the English version 20 days later), the feedback was overwhelming. We didn’t think that so many people would be so interested in it. But Twitter and Facebook in Germany went wild with it for some days – along with coverage on many major tech websites.

This is probably why data journalism works: it makes visible an abstract notion that everybody knows about, namely that your every position, and every connection your mobile phone makes, is – or could be – logged. Every call, text message and data connection.

The background

Around February 1st, ZEIT Online asked me if I had an idea what to do with the dataset of Malte Spitz (read the background story about the legal action of Spitz to get the data here).

Tech Tips: Making Sense of JSON Strings – Follow the Structure

Making Sense of the Notation

At its simplest, the structure has the form: {"attribute":"value"}

If we parse this object into a variable called jsonObject, we can access the value of the attribute as jsonObject.attribute or jsonObject["attribute"]. The first style of notation is called dot notation.
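To make the parsing step concrete, here is a minimal sketch. It assumes the JSON arrives as a piece of text (the variable name jsonText and its contents are made up for illustration) and uses JavaScript's built-in JSON.parse to turn it into an object:

```javascript
// A JSON string, as it might arrive from a web service (made-up example data).
const jsonText = '{"attribute":"value"}';

// JSON.parse turns the text into a JavaScript object.
const jsonObject = JSON.parse(jsonText);

// Dot notation and square-bracket notation reach the same value.
const viaDot = jsonObject.attribute;
const viaBrackets = jsonObject["attribute"];

console.log(viaDot);      // value
console.log(viaBrackets); // value
```

Either style works here; which one you use is mostly a matter of taste until the attribute names get awkward.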

We can add more attribute:value pairs into the object by separating them with commas: a={"attr":"val","attr2":"val2"} and address them (that is, refer to them) uniquely: a.attr, for example, or a["attr2"].

Try it out for yourself… Copy and paste the following into your browser address bar (where the URL goes) and hit return (i.e. “go to” that “location”):

javascript:a={"attr":"val","attr2":"val2"}; alert(a.attr);alert(a["attr2"])

(As an aside, what might you learn from this? Firstly, you can “run” javascript in the browser via the location bar. Secondly, the javascript command alert() pops up an alert box:-)

Note that the value of an attribute might be another object.

obj={ "attrWithObjectValue": { "childObjAttr":"foo" } }
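As a rough sketch of how you might reach into that nested object, the snippet below parses the same structure from a string and chains the attribute names together. The variable names child, value and sameValue are just for illustration:

```javascript
// The nested object from the example above, written as strict JSON text.
const obj = JSON.parse('{"attrWithObjectValue":{"childObjAttr":"foo"}}');

// The value of the outer attribute is itself an object...
const child = obj.attrWithObjectValue;

// ...so we can keep "dotting" (or bracketing) our way down to the inner value.
const value = obj.attrWithObjectValue.childObjAttr;
const sameValue = obj["attrWithObjectValue"]["childObjAttr"];

console.log(typeof child); // object
console.log(value);        // foo
```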

Another thing we can see in the Google geocoder JSON code is the use of square brackets. These define an array (one might also think of it as an ordered list). Items in the list are addressed numerically. So for example, given:

arr=[ "item1", "item2", "item3" ]

we can locate "item1" as arr[0] and "item3" as arr[2]. (Note: the index count in the square brackets starts at 0.) Try it in the browser… (for example, javascript:list=["apples","bananas","pears"]; alert( list[1] );).
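If you would rather try this outside the location bar, a small sketch along the following lines should behave the same way. The variable names first, last and beyond are illustrative:

```javascript
// The array from the example above, parsed from JSON text.
const arr = JSON.parse('["item1", "item2", "item3"]');

// Counting starts at 0, so the first item is arr[0].
const first = arr[0];

// arr.length is the number of items, so the last item sits at length - 1.
const last = arr[arr.length - 1];

// Asking for an index past the end does not fail; it gives undefined.
const beyond = arr[3];

console.log(first);  // item1
console.log(last);   // item3
console.log(beyond); // undefined
```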

Arrays can contain objects too:

list=[ "item1", {"innerObjectAttr":"innerObjVal" } ]

Can you guess how to get to the innerObjVal? Try this in the browser location bar:

javascript: list=[ "item1", { "innerObjectAttr":"innerObjVal" } ]; alert( list[1].innerObjectAttr )
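Real responses tend to mix objects and arrays several levels deep. The sketch below uses an invented response (the attribute names status, results, name and location are hypothetical, not the actual Google geocoder format) to show how the same object and array rules combine into one path:

```javascript
// A made-up response: an object containing an array of objects,
// each of which contains another object. Not a real API's output.
const responseText = `{
  "status": "OK",
  "results": [
    { "name": "Example Place", "location": { "lat": 52.5, "lng": 13.4 } },
    { "name": "Another Place", "location": { "lat": 48.1, "lng": 11.6 } }
  ]
}`;

const response = JSON.parse(responseText);

// Object attribute, then array index, then two more object attributes.
const firstLat = response.results[0].location.lat;

// Pull the same attribute out of every item in the array.
const names = response.results.map((result) => result.name);

console.log(firstLat);         // 52.5
console.log(names.join(", ")); // Example Place, Another Place
```

Reading the path from left to right mirrors walking down the structure: into results, to the first item, into location, and finally to lat.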

Making Life Easier

Hopefully, you’ll now have a sense that there’s structure in a JSON object, and that this structure is what we rely on if we want to cut down on the “trial and error” when parsing such things. To make life easier, we can also use “tree widgets” to display the hierarchical JSON object in a way that makes it far easier to see how to construct the dotted path that leads to the data value we want.
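If you don't have a tree widget to hand, a few lines of JavaScript can give a rough text equivalent. The listPaths function below is a hypothetical helper written for this post, not part of any library; it walks an object and prints the path to every value it finds. It skips empty objects and arrays, which is a simplification:

```javascript
// Hypothetical helper: returns one "path = value" line per leaf value.
function listPaths(value, prefix = "obj") {
  // Strings, numbers, booleans and null are leaves: report them and stop.
  if (value === null || typeof value !== "object") {
    return [`${prefix} = ${JSON.stringify(value)}`];
  }
  const plainName = /^[A-Za-z_$][A-Za-z0-9_$]*$/;
  const paths = [];
  for (const key of Object.keys(value)) {
    let step;
    if (Array.isArray(value)) {
      step = `[${key}]`;                 // array items: numeric index
    } else if (plainName.test(key)) {
      step = `.${key}`;                  // simple names: dot notation
    } else {
      step = `[${JSON.stringify(key)}]`; // awkward names: bracket notation
    }
    paths.push(...listPaths(value[key], prefix + step));
  }
  return paths;
}

const list = JSON.parse('["item1", {"innerObjectAttr":"innerObjVal"}]');
const paths = listPaths(list, "list");
console.log(paths.join("\n"));
// list[0] = "item1"
// list[1].innerObjectAttr = "innerObjVal"
```

Each printed line is a path you could paste straight into your own code.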

A tool I have appropriated for previewing JSON objects is Yahoo Pipes. Rather than necessarily using Pipes to build anything, I simply make use of it as a JSON viewer, loading JSON into the pipe from a URL via the Fetch Data block, and then previewing the result.

Another tool (and one I’ve just discovered) is an AIR application called JSON-Pad. You can paste in JSON code, or pull it in from a URL, and then preview it again via a tree widget.

Clicking on one of the results in the tree widget provides a crib to the path…

Summary

Writing addresses into JSON objects is much easier if you have some idea of the structure of the object. Tree viewers make the structure of an object explicit. By walking down the tree to the part of it you want, and “dotting” together* the nodes/attributes you select as you do so, you can quickly and easily construct the path you need.

* If the JSON attributes have spaces or non-alphanumeric characters in them, use the obj["attr"] notation rather than the dotted obj.attr notation…
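A quick sketch of why the footnote matters, using made-up attribute names ("home town" and "data-type" are invented for illustration):

```javascript
// Made-up data with one simple attribute name and two awkward ones.
const person = JSON.parse(
  '{"name":"Ada","home town":"London","data-type":"example"}'
);

// A simple name works with either notation.
const viaDot = person.name;

// A space or a hyphen in the name means dot notation won't work,
// so we fall back on square brackets.
const town = person["home town"];
const dataType = person["data-type"];

// Note: person.data-type would be read as (person.data) minus (type),
// which is not what we want.

console.log(viaDot);   // Ada
console.log(town);     // London
console.log(dataType); // example
```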

PS Via my feeds today, though something I had bookmarked already, this Data Converter tool may be helpful in going the other way… (Disclaimer: I haven’t tried using it…)

If you know of any other related tools, please feel free to post a link to them in the comments:-)

Data for journalists: understanding XML and RSS

If you are working with data, chances are that sooner or later you will come across XML – or if you don’t, then, well, you should do. Really.

There are some very useful resources in XML format – and in RSS, which is based on XML – from ongoing feeds and static reference files to XML that is provided in response to a question that you ask. All of that is for future posts – this post attempts to explain how XML is relevant to journalism, and how it is made up.

What is XML?

XML is a language which is used for describing information, which makes it particularly relevant to journalists – especially when it comes to interrogating large sets of data.

If you wanted to know how many doctors were privately educated, or what the most common score was in the Premiership last season, or which documents were authored by a particular civil servant, then XML may be useful to you.

UK Journalists on Twitter

A First Quick Viz of UK University Fees

Twitter & DataSift launch live social data services for under £1 (useful)

Journalists with an interest in realtime data should keep an eye on a forthcoming service from DataSift which promises to allow users to access a feed of tweets filtered along any combination of over 40 qualities.

In addition – and perhaps more interestingly – the service will also offer extra context:

“from services including Klout (influence metrics), PeerIndex (influence), Qwerly (linked social media accounts) and Lexalytics (text and sentiment analysis). Storage, post-processing and historical snapshots will also be available.”

The pricing puts this well within the reach of not only professional journalists but student ones too: for less than 20p per hour (30 cents) you will be able to apply as many as 10,000 keyword filters.

ReadWriteWeb describe a good example of how this may work out journalistically:

“Want a feed of negative Tweets written by C-level execs about any of 10,000 keywords? Trivial! Basic level service, Halstead says! Want just the Tweets that fit those criteria and are from the North Eastern United States? That you’ll have to pay a little extra for.”

The Charlie Sheen Twitter intern hoax – how it could be avoided