Monthly Archives: September 2010

A Quick Play with Google Static Maps: Dallas Crime

A couple of days ago I got an email from Jennifer Okamato of the Dallas News, who had picked up on one of my mashup posts describing how to scrape tabular data from a web page and get it onto an embeddable map (Data Scraping Wikipedia with Google Spreadsheets). She'd been looking at live crime incident data from the Dallas Police, and had managed to replicate my recipe in order to get the data into a map embedded on the Dallas News website:

Active Dallas Police calls

But there was a problem: the data being displayed on the map wasn't being updated reliably. I've always known there were caching delays inherent in the approach I'd described – which involves Google Spreadsheets, Yahoo Pipes and Google Maps, as well as local browsers, all calling on each other and all potentially caching the data – but I'd never really worried about them. For this example, though, where the data was changing on a minute-by-minute basis, the delays were making the map display feel too out of date to be useful. What was needed was a more real-time solution.

I haven't had a chance to work on a real-time chain yet, but I have started dabbling around the edges. The first thing was to get the data from the Dallas Police website.

Dallas police - live incidents

(You’ll notice the data includes elements relating to the time of incident, a brief description of it, its location as an address, the unit handling the call and their current status, and so on.)

A tweet resulted in a solution from @alexbilbie that uses a call to YQL (which may introduce a caching delay of its own?) to scrape the table and generate a JSON feed from it, and a PHP handler script to display the data (code).
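For reference, the YQL query wrapped up in the encoded URL in Alex's script (and in the PHP code at the end of this post) boils down to something like this, run against the public YQL endpoint at http://query.yahooapis.com/v1/public/yql with format=json:

select * from html
where url="http://www.dallaspolice.net/mediaaccess/Default.aspx"
  and xpath='//*[@id="grdData_ctl01"]/tbody'

That is, YQL fetches the Dallas Police page, picks out the incidents table body via the XPath expression, and hands the rows back as JSON.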

I tried the code on the OU server that ouseful.open.ac.uk runs on, but as it runs PHP4, rather than the PHP5 Alex had coded for, it fell over at the JSON parsing stage. A quick Google turned up a fix in the form of a PEAR library for handling JSON, and a stub of code to invoke it in the absence of native JSON handling routines:

//JSON.php library from http://abeautifulsite.net/blog/2008/05/using-json-encode-and-json-decode-in-php4/
include("JSON.php");

// Future-friendly json_encode
if( !function_exists('json_encode') ) {
    function json_encode($data) {
        $json = new Services_JSON();
        return( $json->encode($data) );
    }
}

// Future-friendly json_decode
if( !function_exists('json_decode') ) {
    function json_decode($data) {
        $json = new Services_JSON();
        return( $json->decode($data) );
    }
}

I then started to explore ways of getting the data onto a Google Map…(I keep meaning to switch to OpenStreetMap, and I keep meaning to start using the Mapstraction library as a proxy that could in principle cope with OpenStreetMap, Google Maps, or various other mapping solutions, but I was feeling lazy, as ever, and defaulted to the Goog…). Two approaches came to mind:

– use the Google Static Maps API to just get the markers onto a static map. This has the advantage of being able to take a list of addresses in the image URL, which can then be automatically geocoded; but it has the disadvantage of requiring a separate key area detailing the incidents associated with each marker:

Dallas crime static map demo

– use the interactive Google Maps API to create a map and add markers to it. In order to place the markers, we need to call the Google geocoding API once for each address. Unfortunately, in a quick test, I couldn't get the version 3 geocoding API to work, so I left this for another day (and maybe a reversion to the version 2 geocoder, which requires a user key and which I think I've used successfully before… err, maybe?!;-). A rough, untested sketch of the general shape of that route follows, for when I do revisit it.
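(Purely as a note-to-self, then, the v3 route would presumably look something along these lines – an untested sketch that assumes the standard v3 google.maps.Geocoder, a div with id="map" to draw into, and the block/location/nature_of_call fields used in the scraper output at the end of this post:)

<script src="http://maps.google.com/maps/api/js?sensor=false"></script>
<script type="text/javascript">
// Untested sketch: geocode each incident address and drop a marker on an interactive map
function showIncidents(reports){
    var map = new google.maps.Map(document.getElementById("map"), {
        zoom: 10,
        center: new google.maps.LatLng(32.78, -96.80), // roughly central Dallas
        mapTypeId: google.maps.MapTypeId.ROADMAP
    });
    var geocoder = new google.maps.Geocoder();
    for (var i = 0; i < reports.length; i++){
        (function(item){
            geocoder.geocode(
                { address: item.block + " " + item.location + ", Dallas, Texas" },
                function(results, status){
                    if (status == google.maps.GeocoderStatus.OK){
                        new google.maps.Marker({
                            map: map,
                            position: results[0].geometry.location,
                            title: item.nature_of_call
                        });
                    }
                }
            );
        })(reports[i]);
    }
}
</script>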

So – the static maps route it is… how does it work, then? I tried a couple of routes: firstly, generating the page via a PHP script; secondly, on the client side, using a version of the JSON feed from Alex's scraper code.

I’ll post the code at the end, but for now will concentrate on how the static image file is generated. As with the Google Charts API, it’s all in the URL.

For example, here’s a static map showing a marker on Walton Hall, Milton Keynes:

OU static map

Here’s the URL:

http://maps.google.com/maps/api/staticmap?
center=Milton%20Keynes
&zoom=12&size=512x512&maptype=roadmap
&markers=color:blue|label:X|Walton%20Hall,%20Milton%20Keynes
&sensor=false

You’ll notice I haven’t had to provide latitude/longitude data – the static map API is handling the geocoding automatically from the address (though if you do have lat/long data, you can pass that instead). The URL can also carry more addresses/more markers – simply add another &markers= argument for each address. (I’m not sure what the limit is? It may be bound by the length of the URL?)
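For example (using a couple of made-up Dallas addresses purely for illustration), a URL carrying two markers might look something like this:

http://maps.google.com/maps/api/staticmap?
center=Dallas,Texas
&zoom=10&size=512x512&maptype=roadmap
&markers=color:blue|label:A|1400%20Main%20St,%20Dallas,%20Texas
&markers=color:blue|label:B|2500%20Victory%20Ave,%20Dallas,%20Texas
&sensor=false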

So – remember the original motivation for all this? Finding a way of getting recent crime incident data onto a map on the Dallas News website? Jennifer managed to get the original Google map onto the Dallas News page, so it seems that if she has the URL for a web page containing (just) the map, she can get it embedded in an iframe on the Dallas News website. But I think it's unlikely that she'd be able to get Javascript embedded in the parent Dallas News page, and probably unlikely that she could get PHP scripts hosted on the site. The interactive map is obviously the preferred option, but a static map may be okay in the short term.

Looking at the crude map above, I think it would be nice to be able to use different markers (either different icons, or different colours – maybe both?) to identify the type of offence, its priority and its status. Using the static maps approach – with legend – it would be possible to colour-code different incidents too, or colour or resize them if several units were in attendance. One thing I don't do is cluster duplicate entries (where maybe more than one unit is attending?).
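(As far as I know, each &markers= argument in the static map URL can carry its own color: setting – red, yellow, green and so on – so colour-coding by, say, priority might be as simple as varying that value per incident. Again, with made-up addresses:)

&markers=color:red|label:A|1400%20Main%20St,%20Dallas,%20Texas
&markers=color:yellow|label:B|2500%20Victory%20Ave,%20Dallas,%20Texas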

It would be nice if the service was a live one, with the map refreshing every couple of minutes or so, for example by pulling a refreshed JSON feed into the page, updating the map with new markers, and letting old markers fade over time. This would place a load on the original screenscraping script, so it'd be worth revisiting that and maybe implementing some sort of cache so that it plays nicely with the Dallas Police website (e.g. An Introduction to Compassionate Screen Scraping could well be my starter for 10). If the service was running as a production one, API rate limiting might be an issue too, particularly if the map was capable of being updated (I'm not sure what rate limiting applies to the Static Maps API, the Google Maps API, or the Google geocoding API?). In the short term (less coding) it might make sense to offload this to the client (i.e. let the browser call Google to geocode the markers), but a more efficient solution might be for a script on the server to geocode each location and then pass the lat/long data as part of the JSON feed.
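(By way of a sketch of the client side of that – assuming a getData()-style function like the one in the client-side code at the end of this post, reworked so that it rebuilds the map and incident list from scratch each time it's called – the polling could be as simple as:

<script type="text/javascript">
// Sketch only: re-fetch the JSON feed and redraw the map every couple of minutes.
// Assumes getData() (as in the code below) rebuilds #map and #details from scratch.
function refreshMap(){
    $("#map").empty();      // clear the old static map image
    $("#details").empty();  // clear the old incident list
    getData();              // fetch the feed again and redraw
}
setInterval(refreshMap, 2*60*1000);
</script>

…with the fading of older markers needing a bit more thought.)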

Jennifer also mentioned getting a map together for live fire department data, which could also provide another overlay (and might be useful for identifying major fire incidents?). In that case, it might be necessary to dither the markers, so that e.g. police and fire department markers didn't sit on top of and mask each other. (I'm not sure how to do this in static maps, where we're geocoding by address? We'd maybe have to handle things logically, and use a different marker type for events attended by just police units, just fire units, or both types.) If we're going for real time, it might also be interesting to overlay recent geotagged tweets from Twitter?

Anything else I’m missing? What would YOU do next?

PS if you want to see the code, here it is:

Firstly, the PHP solution [code]:

<html>
<head><title>Static Map Demo</title>
</head><body>

<?php

error_reporting(-1);
ini_set('display_errors', 'on');

include("json.php");

// Future-friendly json_encode
if( !function_exists('json_encode') ) {
    function json_encode($data) {
        $json = new Services_JSON();
        return( $json->encode($data) );
    }
}

// Future-friendly json_decode
if( !function_exists('json_decode') ) {
    function json_decode($data) {
        $json = new Services_JSON();
        return( $json->decode($data) );
    }
}

$response = file_get_contents("http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%3D%22http%3A%2F%2Fwww.dallaspolice.net%2Fmediaaccess%2FDefault.aspx%22%20and%0A%20%20%20%20%20%20xpath%3D'%2F%2F*%5B%40id%3D%22grdData_ctl01%22%5D%2Ftbody'&format=json");

$json = json_decode($response);

$reports = array();

if(isset($json->query->results))
{
    $str= "<img src='http://maps.google.com/maps/api/staticmap?center=Dallas,Texas";
    $str.="&zoom=10&size=512x512&maptype=roadmap";

    $ul="<ul>";

	$results = $json->query->results;

	$i = 0;
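	// Each scraped table row is one incident: pull out the fields, add a marker to the
	// static map URL and a corresponding entry to the key list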

	foreach($results->tbody->tr as $tr)
	{

		$reports[$i]['incident_num'] = $tr->td[1]->p;
		$reports[$i]['division'] = $tr->td[2]->p;
		$reports[$i]['nature_of_call'] = $tr->td[3]->p;
		$reports[$i]['priority'] = $tr->td[4]->p;
		$reports[$i]['date_time'] = $tr->td[5]->p;
		$reports[$i]['unit_num'] = $tr->td[6]->p;
		$reports[$i]['block'] = $tr->td[7]->p;
		$reports[$i]['location'] = $tr->td[8]->p;
		$reports[$i]['beat'] = $tr->td[9]->p;
		$reports[$i]['reporting_area'] = $tr->td[10]->p;
		$reports[$i]['status'] = $tr->td[11]->p;

	    $addr=$reports[$i]['block']." ".$reports[$i]['location'];
	    $label=chr(65+$i);
	    $str.="&markers=color:blue|label:".$label."|".urlencode($addr);
	    $str.=urlencode(",Dallas,Texas");

	    $ul.="<li>".$label." - ";
	    $ul.=$reports[$i]['date_time'].": ".$reports[$i]['nature_of_call'];
	    $ul.=", incident #".$reports[$i]['incident_num'];
	    $ul.=", unit ".$reports[$i]['unit_num']." ".$reports[$i]['status'];
	    $ul.=" (priority ".$reports[$i]['priority'].") - ".$reports[$i]['block']." ".$reports[$i]['location'];
	    $ul.="</li>";

		$i++;

	}

	$str.="&sensor=false";
    $str.="'/>";
    echo $str;

    $ul.="</ul>";
    echo $ul;
}
?>
</body></html>

And here are a couple of JSON solutions. One that works using vanilla JSON [code], and as such needs to respect the browser's same-origin policy – that is, the JSON feed needs to be served from the same domain as the web page that's consuming it:

<html>
<head><title>Static Map Demo - client side</title>

<script src="http://code.jquery.com/jquery-1.4.2.min.js"></script>

<script type="text/javascript">

function getData(){
    var str; var msg;
    str= "http://maps.google.com/maps/api/staticmap?center=Dallas,Texas";
    str+="&zoom=10&size=512x512&maptype=roadmap";

    $.getJSON('dallas2.php', function(data) {
      $.each(data, function(i,item){
        addr=item.block+" "+item.location;
	    label=String.fromCharCode(65+i);
        str+="&markers=color:blue|label:"+label+"|"+encodeURIComponent(addr);
	    str+=encodeURIComponent(",Dallas,Texas");

	    msg=label+" - ";
        msg+=item.date_time+": "+item.nature_of_call;
	    msg+=", incident #"+item.incident_num;
	    msg+=", unit "+item.unit_num+" "+item.status;
	    msg+=" (priority "+item.priority+") - "+item.block+" "+item.location;
        $("<li>").html(msg).appendTo("#details");

      })
      str+="&sensor=false";
      $("<img/>").attr("src", str).appendTo("#map");

    });

}
</script>
</head><body onload="getData()">

<div id="map"></div>
<ul id="details"></ul>
</body></html>

And a second approach that uses JSONP [code], so the web page and the data feed can live on separate servers. What this really means is that you can grab the html page, put it on your own server (or desktop), hack around with the HTML/Javascript, and it should still work…

<html>
<head><title>Static Map Demo - client side</title>

<script src="http://code.jquery.com/jquery-1.4.2.min.js"></script>

<script type="text/javascript">

function dallasdata(json){
    var str; var msg;
    str= "http://maps.google.com/maps/api/staticmap?center=Dallas,Texas";
    str+="&zoom=10&size=512x512&maptype=roadmap";

      $.each(json, function(i,item){
        addr=item.block+" "+item.location;
	    label=String.fromCharCode(65+i);
        str+="&markers=color:blue|label:"+label+"|"+encodeURIComponent(addr);
	    str+=encodeURIComponent(",Dallas,Texas");

	    msg=label+" - ";
        msg+=item.date_time+": "+item.nature_of_call;
	    msg+=", incident #"+item.incident_num;
	    msg+=", unit "+item.unit_num+" "+item.status;
	    msg+=" (priority "+item.priority+") - "+item.block+" "+item.location;
        $("<li>").html(msg).appendTo("#details");

      })
      str+="&sensor=false";
      $("<img/>").attr("src", str).appendTo("#map");

}

function cross_domain_JSON_call(){
 var url="http://ouseful.open.ac.uk/maps/dallas2.php?c=true";
 $.getJSON(
   url,
   function(data) { dallasdata(data); }
 );
}

$(document).ready(function(){ cross_domain_JSON_call(); });

</script>
</head><body>

<div id="map"></div>
<ul id="details"></ul>
</body></html>

Quiz: Are you a socially networked journalist?

Are you a social media journalist?
Photo by mulmatsherm

I wrote this some time ago (the plan was to do it properly in Javascript or Flash) and rediscovered it while clearing out my office. It’s just a bit of Friday fun:

Quiz: Are you a networked journalist?

Are you powering down the Information Superhighway, fueled by Google Juice bought with Social Capital? Or are you stuck in the News Cycle Lane pedalling the Penny Farthing of journalism?

Are you among the widows of journalism past – or the orphans of journalism future?*

Do you know your tweets from your twats? Your friends from your Friendster? In just 7 questions this quiz will determine – once and for all time, eternally – your value as a professional journalist in the networked economy**. Go ahead.

Question 1: You witness a car crash involving a Premiership footballer. Do you:

a) Whip out your iPhone and take photos that go straight onto Flickr and Twitpic. Then create a new venue on Foursquare: ‘scene of car crash’ – of which you are now mayor.

b) Phone into the office to ask them to send a photographer, then whip out your notebook and try to get a quote

c) Phone an ambulance, then rush over to help him

Question 2: The Prime Minister calls a press conference. As you rush off to attend do you:

a) Ask people on your blog to suggest what questions you should put to the PM

b) Ask people in your office what big issues you should raise

c) Ask your partner if your flies are undone

Question 3: When you arrive at the press conference do you:

a) Look for a wifi signal

b) Look for someone to interview

c) Look for the toilets

Question 4: A major international story breaks while you’re in the office. Do you:

a) Start scouring Twitter, Tweepsearch and Twitterfall to see if you can track down someone tweeting from the scene

b) Pick up the phone and call a relevant international agency for their 30th official quote of the hour

c) Turn on the TV

Question 5: You’re about to go home when the editor asks you for an 800 word background feature on an ongoing issue in your field. Do you:

a) Open up your Delicious account and look through all your bookmarks under the relevant tags – and those of your network. Then check LinkedIn for contacts.

b) Flick through your contacts book. Then search Google.

c) Say no – you have to pick up your kids from school

Question 6: The newsroom post contains a vaguely interesting press release. Do you:

a) Spend 10 seconds googling to see if it’s online, then bookmarking it on Delicious with a key passage, which is then automatically republished with a link to the source on your Twitter stream, blog, and 24 different social networks.

b) Spend 10 minutes rewriting it for a potential filler for the next day’s paper

c) Read something else

Question 7: A notorious local dies, suddenly. Do you:

a) Shamelessly lift a picture from their Facebook profile, and aggregate everything under the #deadlocal hashtag

b) Go through the cuttings files to pull together an obituary

c) Send a card

Are you a social journalist? Check your results:

Mostly a)

Congratulations: you’re a social journalist. You are permanently connected to the online world of your readers and contacts. Permanently.

Mostly b)

You’re an old school journalist. Your equipment doesn’t need a battery and a wifi signal. But occasionally a pen will leak all over your jacket’s inside pocket.

Mostly c)

You’re a human being. Expect a P45 any day now.

A mix of the above

What do you think this is? A Mensa test? OK, so you’re complicated. Do us all a favour and find a pigeonhole to sit in for once.

*Sub editing joke.

**Because you need external validation from someone you’ve never met before, obviously.

PS: You may want to add your own questions – this would be welcome.

Hyperlocal voices: Kate Feld (Manchizzle)

Manchester hyperlocal blog Manchizzle

Kate Feld is a US citizen who launched the Manchester blog Manchizzle in 2005 and founded the Manchester Blog Awards shortly after. Her perspective on blogging is informed by her background as a journalist, she says, but with a few key differences. The full interview – part of the hyperlocal voices series – follows:

Who were the people behind the blog, and what were their backgrounds before setting it up?

Me, Kate Feld. My background is in newspapers. I worked as a reporter on local and regional papers in my native USA (local beat, city hall, some investigative) then eventually worked for the AP on the national desk in New York.

I moved to the Manchester area in Dec 2003 to live with my boyfriend, who I eventually married. I intended to continue to try to do local/investigative reporting but very quickly realised there was no way for me to continue in news here. So I switched to writing about culture.

In 2004 I was the editor of a startup culture and listings magazine in the city, and when that went bust I had time on my hands and a lot about Manchester I wanted to write. So I started the blog. It was my second blog, having experimented with blogging when I was in journalism school at Columbia in NYC in 2002-03.

From a 15-year-old’s blog to MSM: Bleachgate and Miracle Mineral Solutions

Rhys Morgan's video blog on Bleachgate

Journalists wanting evidence of the value of blogs should take a look at the ‘Bleachgate’ story which has taken a month to filter up from 15-year-old Rhys Morgan’s blog post through other skeptic and science bloggers into The Guardian.

Rhys has Crohn's Disease and was sceptical of the Miracle Mineral Solutions 'treatment' being plugged on a support forum – a 'treatment' that the FDA has described as industrial bleach. The forum didn't like his scepticism, and banned him. He blogged about his concerns, and it went from there.

I can only hope that enough people link to The Guardian’s piece with the words Miracle Mineral Solutions to help raise awareness of the concerns. *Cough*.

AFTERTHOUGHT: What deserves particular attention is how the Guardian reporter Martin Robbins is responding to critical comments – providing further details of how the forum dealt with his approaches, and addressing conspiracy theorists. This is journalism that gets out there and engages with the issue rather than simply broadcasting. Wonderful.

Martin Robbins' replies to comments

Hyperlocal voices: Jon Bounds (Birmingham: It’s Not Shit)

Hyperlocal blog Birmingham: it's not shit

Jon Bounds surely has the claim to the most memorable title of a hyperlocal blog. Birmingham: It’s Not Shit (“Mildly sarcastic since 2002”) is a legend of the local and national blogging scene in which Jon has been a pioneer. In the latest of my ‘Hyperlocal Voices’ series, he describes the history of the site:

Who were the people behind BiNS, and what were their backgrounds before setting it up?

There was, and to a large extent still is, just me, Jon Bounds. Although I've now got a couple of 'columnists' and feel that there are people around that I can call on to let me have a break.

I've an odd background of a degree in Computer Science and a postgrad (City & Guilds) qualification in Journalism (and a brief, not entirely successful time as a freelancer on very poor music publications), but it was really working on internet design books in the late 90s that made me think about "the web" as a method of sharing.

As a kid I’d run fanzines (computer games and later football), but there were real creatives getting to grips with the web at that time and that was exciting.

What made you decide to set up the blog?

The blog part of the site came a couple of years after the site itself — which was originally a much flatter website with funny articles/video and a forum. The idea behind the site came as a direct reaction to the terribly drab view of the city that Marketing Birmingham/the Council put forward for the European City of Culture bid in 2002 — and the fact that all of the local media went unquestioningly with it.

Birmingham wasn't – and still isn't – a city of loft living and canalside bars, yet "organisations" only seemed comfortable with that little bit of it. To cover the bits of Brum that real people recognise and care about is still the main thrust of the site.

Hyperlocal voices: Alderleyedge.com’s Lisa Reeves

Hyperlocal site alderleyedge.com

Following on from last week’s blog post on the founder of Parwich.org, I interviewed Lisa Reeves, the co-founder of alderleyedge.com, launched in 2009 and already selling out advertising.

Who were the people behind the blog, and what were their backgrounds before setting it up?

I run alderleyedge.com with my husband, Martin; we live in the village. Martin built the site, so we own the technology, and I do the rest.

Martin set up his first internet company 13 years ago, and has always worked on internet-based businesses of his own. I worked in publishing for 8 years, then spent several years running the commercial side of internet businesses before giving up my career to be a stay-at-home mum.

What made you decide to set up the blog?

I wouldn’t describe alderleyedge.com as a blog, more a community news and information platform. A primary motivation for setting up the site was so that I could have a flexible job that I enjoyed as the children started to spend more time in school. The concept of alderleyedge.com seemed perfect as it allowed us to combine our experience in the Internet sector with our passion for the village in which we live.

We also felt that Alderley Edge was poorly served by its local newspaper, The Wilmslow Express, which seemed very much focused on the adjoining town of Wilmslow, and paid very little attention to Alderley Edge, which it also purported to cover – although they seem to be providing a lot more coverage of Alderley Edge since we have become more established.

The first Birmingham Hacks/Hackers meetup – Monday Sept 20

Those helpful people at Hacks/Hackers have let me set up a Hacks/Hackers group for Birmingham. This is basically a group of people interested in the journalistic (and, by extension, the civic) possibilities of data. If you’re at all interested in this and think you might want to meet up in the Midlands sometime, please join up.

I’ve also organised the first Hacks/Hackers meetup for Birmingham on Monday September 20, in Coffee Lounge from 1pm into the evening.

Our speaker will be Caroline Beavon, an experienced journalist who caught the data bug on my MA in Online Journalism (and whose experiences I felt would be accessible to most). In addition, NHS Local’s Carl Plant will be talking briefly about health data and Walsall Council’s Dan Slee about council data.

All are welcome and no technical or journalistic knowledge is required. I’m hoping we can pair techies with non-techies for some ad hoc learning.

If you want to come, RSVP at the link.

PS: There’s also a Hacks/Hackers in London, and one being planned for Manchester, I’m told.

Podcasting: the experiences of Bagel Tech News

Bagel Tech News podcast

As part of the research into a forthcoming book on online journalism (UPDATE: now published), I interviewed Ewen Rankin of independent podcast Bagel Tech News. Here are his responses in full:

The background

My background is as a commercial photographer. I started life in graphic design and quickly moved to shooting photographs for the agency at which I worked. It was kind of a lucky transition as I wasn’t much cop as a graphic artist. I took fairly low level stuff to start with (picture business cards were all the rage in the 80s) and then moved to more commercial work shooting the advertising shots for Pretty Polly and Golden Lady tights in about 1988.

I started broadcasting in July 2008, and after two weeks Amber Macarthur made us Podcast of the Week on the Net@Night show with Leo Laporte. Listenership rose and we began to grow.

The Daily News show was published… daily until November 2008, and then I started publishing the BOG Show with Marc Silk, which was opened by Andy Ihnatko on 30th November 2008. I removed Marc from the show at Christmas 2009 and installed a 'Skype Wall' in January 2010 to run a more panel-based show. More shows have been added in the intervening period and the network now has 7 active shows.