<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Online Journalism Blog &#187; semantic web</title>
	<atom:link href="http://onlinejournalismblog.com/tag/semantic-web/feed/" rel="self" type="application/rss+xml" />
	<link>http://onlinejournalismblog.com</link>
	<description>A conversation.</description>
	<lastBuildDate>Thu, 24 May 2012 08:39:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
<cloud domain='onlinejournalismblog.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
		<item>
		<title>Investigations tool DocumentCloud goes public (PS: documents drive traffic)</title>
		<link>http://onlinejournalismblog.com/2011/01/26/investigations-tool-documentcloud-goes-public-ps-documents-drive-traffic/</link>
		<comments>http://onlinejournalismblog.com/2011/01/26/investigations-tool-documentcloud-goes-public-ps-documents-drive-traffic/#comments</comments>
		<pubDate>Wed, 26 Jan 2011 20:37:07 +0000</pubDate>
		<dc:creator>Paul Bradshaw</dc:creator>
				<category><![CDATA[data journalism]]></category>
		<category><![CDATA[amanda hickman]]></category>
		<category><![CDATA[api]]></category>
		<category><![CDATA[DocumentCloud]]></category>
		<category><![CDATA[documents]]></category>
		<category><![CDATA[OCR]]></category>
		<category><![CDATA[semantic web]]></category>

		<guid isPermaLink="false">http://onlinejournalismblog.com/?p=12683</guid>
		<description><![CDATA[The rather lovely DocumentCloud &#8211; a tool that allows journalists to share, annotate, connect and organise documents &#8211; has finally emerged from its closet and made itself available to public searches. This means that anyone can now search the powerful database (some tips here) of newsworthy documents. If you want to add your own, however, [...]]]></description>
			<content:encoded><![CDATA[
<p>The rather lovely DocumentCloud &#8211; a tool that allows journalists to share, annotate, connect and organise documents &#8211; has finally <a href="http://blog.documentcloud.org/blog/2011/01/going-public/" onclick="urchinTracker('/outgoing/blog.documentcloud.org/blog/2011/01/going-public/?referer=');">emerged from its closet</a> and made itself available to public searches.</p>
<p>This means that anyone can now search the powerful database (<a href="http://www.documentcloud.org/help/searching" onclick="urchinTracker('/outgoing/www.documentcloud.org/help/searching?referer=');">some tips here</a>) of newsworthy documents. If you want to add your own, however, you still <a href="http://www.documentcloud.org/contact" onclick="urchinTracker('/outgoing/www.documentcloud.org/contact?referer=');">need approval</a>.</p>
<p>If you do end up on <a href="http://www.documentcloud.org/contributors" onclick="urchinTracker('/outgoing/www.documentcloud.org/contributors?referer=');">this list</a> you&#8217;ll find it&#8217;s quite a powerful tool, with quick conversion of PDFs into text files, analytic tools and semantic tagging (so you can connect all documents with a <a href="http://www.documentcloud.org/public/#search/organization%3A%20%22Federal%20Bureau%20of%20Investigation%22%20person%3A%20%22Barack%20Obama%22" onclick="urchinTracker('/outgoing/www.documentcloud.org/public/_search/organization_3A_20_22Federal_20Bureau_20of_20Investigation_22_20person_3A_20_22Barack_20Obama_22?referer=');">particular person</a>, or <a href="http://www.documentcloud.org/public/#search/organization%3A%20%22Federal%20Bureau%20of%20Investigation%22" onclick="urchinTracker('/outgoing/www.documentcloud.org/public/_search/organization_3A_20_22Federal_20Bureau_20of_20Investigation_22?referer=');">organisation</a>) among its best features. The site is <a href="http://www.documentcloud.org/opensource" onclick="urchinTracker('/outgoing/www.documentcloud.org/opensource?referer=');">open source</a> and has an <a href="http://www.documentcloud.org/help/api" onclick="urchinTracker('/outgoing/www.documentcloud.org/help/api?referer=');">API</a> too.</p>
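<p>As a rough sketch of how those person and organisation links are put together: the two search links above are simply a query of field/value pairs, percent-encoded and appended to the public search address. The helper name <code>search_url</code> is my own invention for illustration, not part of DocumentCloud&#8217;s API, and the only field names assumed are the two that appear in the links above.</p>

```python
from urllib.parse import quote

# Base address taken from the search links in this post.
DOCUMENTCLOUD_SEARCH = "http://www.documentcloud.org/public/#search/"


def search_url(**filters):
    """Build a public-search link from field/value pairs (hypothetical helper)."""
    # Each filter becomes: field: "value", joined by single spaces.
    query = " ".join('{}: "{}"'.format(field, value) for field, value in filters.items())
    # Encode every reserved character, including ':' and '"', as the links above do.
    return DOCUMENTCLOUD_SEARCH + quote(query, safe="")


print(search_url(organization="Federal Bureau of Investigation", person="Barack Obama"))
```

<p>Run as written, this should reproduce the &#8216;particular person&#8217; link above character for character.</p>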
<p>I asked Program Director <strong>Amanda B Hickman</strong> what she&#8217;s learned on the project so far. Her response suggests that documents have a particular appeal for online readers:</p>
<blockquote><p>&#8220;If we&#8217;ve learned anything, it is that people really love documents. It is pretty clear that when there&#8217;s something interesting going on in the news, plenty of people want to dig a little deeper. When Arizona Republic posted an annotated version of that state&#8217;s new immigration law, it got more traffic than their weekly entertainment round up. WNYC told us that <a href="http://www.wnyc.org/articles/wnyc-news/2011/jan/20/indictments-organized-crime-sweep" onclick="urchinTracker('/outgoing/www.wnyc.org/articles/wnyc-news/2011/jan/20/indictments-organized-crime-sweep?referer=');">the page listing the indictments in last week&#8217;s mob roundup</a> was still getting more traffic than any other single news story even a week later.</p>
<p>&#8220;These were big news documents, to be sure, but it still seems pretty clear that people do want to dig deeper and explore the documents behind the news, which is great for us and great for news.&#8221;</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://onlinejournalismblog.com/2011/01/26/investigations-tool-documentcloud-goes-public-ps-documents-drive-traffic/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Games, systems and context in journalism at News Rewired</title>
		<link>http://onlinejournalismblog.com/2010/12/19/games-systems-and-context-in-journalism-at-news-rewired/</link>
		<comments>http://onlinejournalismblog.com/2010/12/19/games-systems-and-context-in-journalism-at-news-rewired/#comments</comments>
		<pubDate>Sun, 19 Dec 2010 18:00:39 +0000</pubDate>
		<dc:creator>Mary Hamilton</dc:creator>
				<category><![CDATA[data journalism]]></category>
		<category><![CDATA[online journalism]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[context]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[future journalism]]></category>
		<category><![CDATA[games]]></category>
		<category><![CDATA[linked data]]></category>
		<category><![CDATA[mary hamilton]]></category>
		<category><![CDATA[news games]]></category>
		<category><![CDATA[open standards]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[semantic journalism]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://onlinejournalismblog.com/?p=12169</guid>
		<description><![CDATA[I went to News Rewired on Thursday, along with dozens of other journalists and folk concerned in various ways with news production. Some threads that ran through the day for me were discussions of how we publish our data (and allow others to do the same), how we link our stories together with each other and [...]]]></description>
			<content:encoded><![CDATA[
<div>
<p>I went to <a title="News Rewired" href="http://www.newsrewired.com" onclick="urchinTracker('/outgoing/www.newsrewired.com?referer=');">News Rewired</a> on Thursday, along with dozens of other journalists and folk concerned in various ways with news production. Some threads that ran through the day for me were discussions of how we publish our data (and allow others to do the same), how we link our stories together with each other and the rest of the web, and how we can help our readers to explore context around our stories.</p>

<p><a title="LIVE: SEO for business-to-business and specialist media | News Rewired" href="http://www.newsrewired.com/2010/12/16/live-seo-for-business-to-business-and-specialist-media/" onclick="urchinTracker('/outgoing/www.newsrewired.com/2010/12/16/live-seo-for-business-to-business-and-specialist-media/?referer=');">One session focused heavily on SEO for specialist organisations</a>, but included a few sharp lessons for all news organisations. <a title="Frank Gosch" href="http://www.newsrewired.com/frank-gosch/" onclick="urchinTracker('/outgoing/www.newsrewired.com/frank-gosch/?referer=');">Frank Gosch</a> spoke about the importance of ensuring your site&#8217;s RSS feeds are up to date and allow other people to easily subscribe to and even republish your content. Instead of clinging tight to content, it&#8217;s good for your search rankings to let other people spread it around.</p>
<p><a title="James Lowery | News Rewired" href="http://www.newsrewired.com/speakers-2/james-lowery/" onclick="urchinTracker('/outgoing/www.newsrewired.com/speakers-2/james-lowery/?referer=');">James Lowery</a> echoed this theme, suggesting that publishers, like governments, should look at providing and publishing their data in re-usable, open formats like XML. It&#8217;s easy for data journalists to get hung up on how local councils, for instance, are publishing their data in PDFs, but to miss how our own news organisations are putting out our stories, visualisations and even datasets in formats that limit or even prevent re-use and mashup.</p>
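<p>To make the point about re-usable formats concrete, here is a minimal sketch of turning a small table of figures into XML that someone else could parse and mash up. The council names, field names and numbers are all made up for illustration, and the element names (<code>dataset</code>, <code>record</code>) are arbitrary choices rather than any agreed standard.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical figures, purely for illustration.
rows = [
    {"council": "Example Borough", "year": "2010", "spend_gbp": "125000"},
    {"council": "Sample District", "year": "2010", "spend_gbp": "98000"},
]


def rows_to_xml(rows, root_tag="dataset", row_tag="record"):
    """Serialise a list of dicts as a small XML document."""
    root = ET.Element(root_tag)
    for row in rows:
        record = ET.SubElement(root, row_tag)
        # One child element per field, named after the dict key.
        for field, value in row.items():
            ET.SubElement(record, field).text = value
    return ET.tostring(root, encoding="unicode")


xml_text = rows_to_xml(rows)
print(xml_text)
```

<p>Anyone receiving that output can read it back with a standard XML parser, which is exactly what a PDF of the same table would not allow.</p>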
<p>Following on from that, in <a title="LIVE: Linked data and the semantic web | News Rewired" href="http://www.newsrewired.com/2010/12/16/live-linked-data-and-the-semantic-web/" onclick="urchinTracker('/outgoing/www.newsrewired.com/2010/12/16/live-linked-data-and-the-semantic-web/?referer=');">the session on linked data and the semantic web</a>, <a title="Currybet" href="http://www.currybet.net" onclick="urchinTracker('/outgoing/www.currybet.net?referer=');">Martin Belam</a> spoke about the Guardian&#8217;s API, which can be queried to return stories on particular subjects and which is starting to use unique identifiers &#8211; <a title="Adding Linked Data to the Guardian's API | Martin Belam" href="http://www.currybet.net/cbet_blog/2010/10/adding-linked-data-to-guardian-api.php" onclick="urchinTracker('/outgoing/www.currybet.net/cbet_blog/2010/10/adding-linked-data-to-guardian-api.php?referer=');">MusicBrainz IDs and ISBNs, for instance</a> &#8211; to allow lists of stories to be pulled out not simply by text string but using a meaningful identification system. He added that publishers have to license content in a meaningful way, so that it can be reused widely without running into legal issues.</p>
<p><a title="Silver Oliver" href="http://blockslabpillar.com" onclick="urchinTracker('/outgoing/blockslabpillar.com?referer=');">Silver Oliver</a> said that semantically tagged data, linked data, creates opportunities for pulling in contextual information for our stories from all sorts of other sources. And conversely, if we semantically tag our stories and make it possible for other people to re-use them, we&#8217;ll start to see our content popping up in unexpected ways and places.</p>
<p>And in the long term, he suggested, we&#8217;ll start to see people following stories completely independently of platform, medium or brand. Tracking a linked data tag (if that&#8217;s the right word) and following what&#8217;s new, what&#8217;s interesting, and what will work on whatever device I happen to have in my hand right now and whatever connection I&#8217;m currently on &#8211; images, video, audio, text, interactives; wifi, 3G, EDGE, offline. Regardless of who made it.</p>
<p>And this is part of the ongoing move towards creating a web that understands not only objects but also relationships, a world of meaningful nouns and verbs rather than text strings and many-to-many tables. It&#8217;s impossible to predict what will come from these developments, but &#8211; as an example &#8211; it&#8217;s not hard to imagine being able to take a photo of a front page on a newsstand and use it to search online for the story it refers to. And the results of that search might have nothing to do with the newspaper brand.</p>
<p>That&#8217;s the down side to all this. News consumption &#8211; already massively decentralised thanks to the social web &#8211; is likely to drift even further away from the cosy silos of news brands (with the honourable exception of paywalled gardens, perhaps). What can individual journalists and news organisations offer that the cloud can&#8217;t?</p>
<p>One exciting answer lies in the <a title="LIVE: Are we ready to play the journalism game? | News Rewired" href="http://www.newsrewired.com/2010/12/16/live-are-we-ready-to-play-the-journalism-game/" onclick="urchinTracker('/outgoing/www.newsrewired.com/2010/12/16/live-are-we-ready-to-play-the-journalism-game/?referer=');">last session of the day</a>, which looked at journalism and games. I <a title="What if? News games | Metamedia" href="http://maryhamilton.co.uk/2009/09/what-if-news-games/" onclick="urchinTracker('/outgoing/maryhamilton.co.uk/2009/09/what-if-news-games/?referer=');">wrote some time ago</a> about ways news organisations were harnessing games, and could do in the future &#8211; and the opportunities are now starting to take shape. With constant calls for news organisations to add context to stories, it&#8217;s easy to miss the possibility that &#8211; as <a title="Philip Trippenbach" href="http://trippenbach.com" onclick="urchinTracker('/outgoing/trippenbach.com?referer=');">Philip Trippenbach</a> said at News Rewired &#8211; <a title="Stop Telling Stories | Philip Trippenbach" href="http://trippenbach.com/2010/12/16/stop-telling-stories/" onclick="urchinTracker('/outgoing/trippenbach.com/2010/12/16/stop-telling-stories/?referer=');">you can&#8217;t explain a system with a story</a>:</p>
<blockquote><p>Stories can be a great way of transmitting understanding about things that have happened. The trouble is that they are actually a very bad way of transmitting understanding about how things work.</p></blockquote>
<p>Many of the issues we cover &#8211; climate change, government cuts, the deficit &#8211; at macro level are systems that could be interestingly and interactively explored with games. (Like this <a title="Climate Challenge | BBC" href="http://www.bbc.co.uk/sn/hottopics/climatechange/climate_challenge/" onclick="urchinTracker('/outgoing/www.bbc.co.uk/sn/hottopics/climatechange/climate_challenge/?referer=');">climate change game</a> here, for instance.) Other stories can be articulated and broadened through games in a way that allows for real empathy between the reader/player and the subject because they are experiential rather than intellectual. (Like <a title="Escape from Woomera | Wikipedia" href="http://en.wikipedia.org/wiki/Escape_From_Woomera" onclick="urchinTracker('/outgoing/en.wikipedia.org/wiki/Escape_From_Woomera?referer=');">Escape from Woomera</a>.)</p>
<p>Games allow players to explore systems, scenarios and entire universes in detail, prodding their limits and discovering their flaws and hidden logic. They can be intriguing, tricky, challenging, educational, complex like the best stories can be, but they&#8217;re also fun to experience, unlike so much news content that has a tendency to feel like work.</p>
<p>(By the by, this is true not just of computer and console games but also of live, tabletop, board and social games of all sorts &#8211; there are rich veins of community journalism that could be developed in these areas too, as the <a title="Making social gaming scale: Lessons from the Democrat and Chronicle’s adoption of alternate reality | Nieman Lab" href="http://www.niemanlab.org/2010/11/making-social-gaming-scale-lessons-from-the-democrat-and-chronicles-adaption-of-alternate-reality/" onclick="urchinTracker('/outgoing/www.niemanlab.org/2010/11/making-social-gaming-scale-lessons-from-the-democrat-and-chronicles-adaption-of-alternate-reality/?referer=');">Rochester Democrat and Chronicle is hoping to prove for a second time</a>.)</p>
<p>So the big things to take away from News Rewired, for me?</p>
<ul>
<li>The systems within which we do journalism are changing, and the semantic web will most likely bring another seismic change in news consumption and production.</li>
<li>It&#8217;s going to be increasingly important for us to produce content that both takes advantage of these new technologies and allows others to use these technologies to take advantage of it.</li>
<li>And by tapping into the interactive possibilities of the internet through games, we can help our readers explore complex systems that don&#8217;t lend themselves to simple stories.</li>
</ul>
<p>Oh, and some very decent whisky.</p>
<p><em>Cross-posted at <a title="Mary Hamilton | Metamedia" href="http://maryhamilton.co.uk/2010/12/games-systems-context-journalism-news-rewired/" onclick="urchinTracker('/outgoing/maryhamilton.co.uk/2010/12/games-systems-context-journalism-news-rewired/?referer=');">Metamedia</a>.</em></p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://onlinejournalismblog.com/2010/12/19/games-systems-and-context-in-journalism-at-news-rewired/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Extractiv: crawl webpages and make semantic connections</title>
		<link>http://onlinejournalismblog.com/2010/11/16/extractiv-crawl-webpages-and-make-semantic-connections-tool/</link>
		<comments>http://onlinejournalismblog.com/2010/11/16/extractiv-crawl-webpages-and-make-semantic-connections-tool/#comments</comments>
		<pubDate>Tue, 16 Nov 2010 15:27:01 +0000</pubDate>
		<dc:creator>Paul Bradshaw</dc:creator>
				<category><![CDATA[data journalism]]></category>
		<category><![CDATA[data analysis]]></category>
		<category><![CDATA[extractiv]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[web crawling]]></category>

		<guid isPermaLink="false">http://onlinejournalismblog.com/?p=11429</guid>
		<description><![CDATA[Here&#8217;s another data analysis tool which is worth keeping an eye on. Extractiv &#8220;lets you transform unstructured web content into highly-structured semantic data.&#8221; Eyes glazing over? Okay, over to ReadWriteWeb: &#8220;To test Extractive, I gave the company a collection of more than 500 web domains for the top geolocation blogs online and asked its technology [...]]]></description>
			<content:encoded><![CDATA[
<p><img src="http://rww.readwriteweb.netdna-cdn.com/images/extractivscreen.jpg" alt="Extractiv screenshot" /></p>
<p>Here&#8217;s another data analysis tool which is worth keeping an eye on. <a href="http://Extractiv.com/" onclick="urchinTracker('/outgoing/Extractiv.com/?referer=');">Extractiv</a> &#8220;lets you transform unstructured web content into highly-structured semantic data.&#8221; Eyes glazing over? Okay, <a href="http://www.readwriteweb.com/archives/extractiv_bulk_text_analysis.php?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+readwriteweb+%28ReadWriteWeb%29" onclick="urchinTracker('/outgoing/www.readwriteweb.com/archives/extractiv_bulk_text_analysis.php?utm_source=feedburner_amp_utm_medium=feed_amp_utm_campaign=Feed_3A+readwriteweb+_28ReadWriteWeb_29&amp;referer=');">over to ReadWriteWeb</a>:</p>
<blockquote><p>&#8220;To test Extractive, I gave the company a collection of more than 500 web domains for the top geolocation blogs online and asked its technology to sort for all appearances of the word &#8220;ESRI.&#8221; (The name of the leading vendor in the geolocation market.)</p>
<p>&#8220;The resulting output included structured cells describing some person, place or thing, some type of relationship it had with the word ESRI and the URL where the words appeared together. It was thus sortable and ready for my analysis.</p>
<p>&#8220;The task was partially completed before being rate limited due to my submitting so many links from the same domain. More than 125,000 pages were analyzed, 762 documents were found that included my keyword ESRI and about 400 relations were discovered (including duplicates). What kinds of patterns of relations will I discover by sorting all this data in a spreadsheet or otherwise? I can&#8217;t wait to find out.&#8221;</p></blockquote>
<p>What that means in even plainer language is that Extractiv will crawl thousands of webpages to identify relationships and attributes for a particular subject.</p>
<p>This has obvious applications for investigative journalists: give the software a name (of a person or company, for example) and a set of base domains (such as news websites, specialist publications and blogs, industry sites, etc.) and set it going. At the end you&#8217;ll have a broad picture of what other organisations and people have been connected with that person or company. Connections you can ask it to identify include relationships, ownership, former names, telephone numbers, companies worked for or worked with, and job positions.</p>
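<p>Once you have output in that shape (an entity, its relationship to your subject, and the URL where the two appeared together), sorting it is straightforward. The sketch below is only an illustration of that kind of sorting: the rows are invented, the relation labels are my own, and none of it reflects Extractiv&#8217;s actual export format, which I haven&#8217;t assumed anything about.</p>

```python
from collections import defaultdict

# Hypothetical rows in the shape described above: entity, relation, source URL.
rows = [
    ("Acme Ltd", "owned_by", "http://example.com/a"),
    ("J. Smith", "worked_for", "http://example.com/b"),
    ("Acme Ltd", "owned_by", "http://example.com/c"),
]


def group_relations(rows):
    """Map each (entity, relation) pair to the sorted, de-duplicated URLs mentioning it."""
    grouped = defaultdict(set)
    for entity, relation, url in rows:
        grouped[(entity, relation)].add(url)
    return {key: sorted(urls) for key, urls in grouped.items()}


# Print each connection with the number of pages that support it.
for (entity, relation), urls in sorted(group_relations(rows).items()):
    print(entity, relation, len(urls))
```

<p>Connections backed by several independent pages are the obvious ones to chase first.</p>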
<p>It won&#8217;t answer your questions, but it will suggest some avenues of enquiry, and potential sources of information. And all within an hour.</p>
<h2>Time and cost</h2>
<p>ReadWriteWeb reports that the process above took around an hour &#8220;and would have cost me less than $1, after a $99 monthly subscription fee. The next level of subscription would have been performed faster and with more simultaneous processes running at a base rate of $250 per month.&#8221;</p>
<p>As they say, the tool represents &#8220;commodity level, DIY analysis of bulk data produced by user generated or other content, sortable for pattern detection and soon, Extractiv says, sentiment analysis.&#8221;</p>
<p>Which is nice.</p>
]]></content:encoded>
			<wfw:commentRss>http://onlinejournalismblog.com/2010/11/16/extractiv-crawl-webpages-and-make-semantic-connections-tool/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data and the future of journalism panel discussion: Linked Data London</title>
		<link>http://onlinejournalismblog.com/2009/09/09/data-and-the-future-of-journalism-panel-discussion-linked-data-london/</link>
		<comments>http://onlinejournalismblog.com/2009/09/09/data-and-the-future-of-journalism-panel-discussion-linked-data-london/#comments</comments>
		<pubDate>Wed, 09 Sep 2009 21:04:54 +0000</pubDate>
		<dc:creator>Paul Bradshaw</dc:creator>
				<category><![CDATA[data journalism]]></category>
		<category><![CDATA[online journalism]]></category>
		<category><![CDATA[BBC]]></category>
		<category><![CDATA[dan brickley]]></category>
		<category><![CDATA[foaf]]></category>
		<category><![CDATA[Guardian]]></category>
		<category><![CDATA[john o'donovan]]></category>
		<category><![CDATA[leigh dodds]]></category>
		<category><![CDATA[Linked Data London]]></category>
		<category><![CDATA[linkeddata]]></category>
		<category><![CDATA[London]]></category>
		<category><![CDATA[martin belam]]></category>
		<category><![CDATA[oauth]]></category>
		<category><![CDATA[paywalls]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[skos]]></category>
		<category><![CDATA[talis]]></category>

		<guid isPermaLink="false">http://onlinejournalismblog.com/?p=3397</guid>
		<description><![CDATA[Tonight I had the pleasure of chairing an extremely informative panel discussion on data and the future of journalism at the first London Linked Data Meetup. On the panel were: Martin Belam (Information Architect, The Guardian; blogger, Currybet) John O’Donovan (Chief Architect, BBC News Online) Dan Brickley (Friend of a Friend project; VU University, Amsterdam; SpyPixel Ltd; [...]]]></description>
			<content:encoded><![CDATA[
<p>Tonight I had the pleasure of chairing an extremely informative panel discussion on data and the future of journalism at the first <a href="http://www.meetup.com/Web-Of-Data/calendar/11056905/" onclick="urchinTracker('/outgoing/www.meetup.com/Web-Of-Data/calendar/11056905/?referer=');">London Linked Data Meetup</a>. On the panel were:</p>
<ul>
<li>Martin Belam (Information Architect, The Guardian; blogger, <a href="http://www.Currybet.net/" onclick="urchinTracker('/outgoing/www.Currybet.net/?referer=');">Currybet</a>)</li>
<li><a href="http://www.bbc.co.uk/blogs/bbcinternet/john_odonovan/" onclick="urchinTracker('/outgoing/www.bbc.co.uk/blogs/bbcinternet/john_odonovan/?referer=');">John O’Donovan</a> (Chief Architect, BBC News Online)</li>
<li><a href="http://danbri.org/words/" onclick="urchinTracker('/outgoing/danbri.org/words/?referer=');">Dan Brickley</a> (<a href="http://www.foaf-project.org/" onclick="urchinTracker('/outgoing/www.foaf-project.org/?referer=');">Friend of a Friend project</a>; VU University, Amsterdam; SpyPixel Ltd; ex-W3C)</li>
<li><a href="http://www.ldodds.com/blog/" onclick="urchinTracker('/outgoing/www.ldodds.com/blog/?referer=');">Leigh Dodds</a> (Talis)</li>
</ul>
<p>What follows is a series of notes from the discussion, which I hope are of some use.</p>
<p>For a primer on Linked Data there is <a href="http://www.bbc.co.uk/blogs/radiolabs/s5/linked-data/s5.html" onclick="urchinTracker('/outgoing/www.bbc.co.uk/blogs/radiolabs/s5/linked-data/s5.html?referer=');">A Skim-Read Introduction to Linked Data</a>; <a href="http://tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf" onclick="urchinTracker('/outgoing/tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf?referer=');">Linked Data: The Story So Far</a> (PDF) by Tom Heath, Christian Bizer and Tim Berners-Lee; and <a href="http://www.ted.com/index.php/talks/tim_berners_lee_on_the_next_web.html" onclick="urchinTracker('/outgoing/www.ted.com/index.php/talks/tim_berners_lee_on_the_next_web.html?referer=');">this TED video by Sir Tim Berners-Lee</a> (who was on the panel before this one).</p>
<p>To set some brief context, I talked about how 2009 was, for me, a key year in data and journalism &#8211; largely because it has been a year of crisis in both publishing and government. The seminal point in all of this has been the MPs&#8217; expenses story, which demonstrated both the power of data in journalism and the need for transparency from government &#8211; for example, the government <a href="http://blogs.cabinetoffice.gov.uk/digitalengagement/post/2009/06/09/Data-So-what-happens-now.aspx" onclick="urchinTracker('/outgoing/blogs.cabinetoffice.gov.uk/digitalengagement/post/2009/06/09/Data-So-what-happens-now.aspx?referer=');">appointment of Sir Tim Berners-Lee</a>, seeking developers to suggest things to do with public data, and the <a href="http://basiccraft.wordpress.com/2009/08/01/uk-government-and-social-media-10-to-watch/" onclick="urchinTracker('/outgoing/basiccraft.wordpress.com/2009/08/01/uk-government-and-social-media-10-to-watch/?referer=');">imminent</a> <a href="http://www.jenitennison.com/blog/node/115" onclick="urchinTracker('/outgoing/www.jenitennison.com/blog/node/115?referer=');">launch</a> of <a href="http://blogs.cabinetoffice.gov.uk/digitalengagement/post/2009/05/22/Information-and-how-to-make-it-useful.aspx" onclick="urchinTracker('/outgoing/blogs.cabinetoffice.gov.uk/digitalengagement/post/2009/05/22/Information-and-how-to-make-it-useful.aspx?referer=');">Data.gov.uk</a> around the same issue.</p>
<p>Even before then the <a href="http://onlinejournalismblog.com/2009/02/06/new-york-times-lets-users-build-things-with-its-content-open-api/">New York Times</a> and <a href="http://onlinejournalismblog.com/2009/03/10/guardian-joins-new-york-times-in-releasing-open-api/">Guardian both launched APIs</a> at the beginning of the year, MSN Local and the BBC have both been working with Wikipedia and we&#8217;ve seen the launch of a number of startups and mashups around data including <a href="http://timetric.com/" onclick="urchinTracker('/outgoing/timetric.com/?referer=');">Timetric</a>, <a href="http://verifiable.com/" onclick="urchinTracker('/outgoing/verifiable.com/?referer=');">Verifiable</a>, <a href="http://bevocal.org.uk/" onclick="urchinTracker('/outgoing/bevocal.org.uk/?referer=');">BeVocal</a>, <a href="http://openlylocal.com/" onclick="urchinTracker('/outgoing/openlylocal.com/?referer=');">OpenlyLocal</a>, <a href="http://www.mashthestate.org.uk/" onclick="urchinTracker('/outgoing/www.mashthestate.org.uk/?referer=');">MashTheState</a>, the open source release of <a href="http://Everyblock.com" onclick="urchinTracker('/outgoing/Everyblock.com?referer=');">Everyblock</a>, and <a href="http://mapumental.channel4.com/" onclick="urchinTracker('/outgoing/mapumental.channel4.com/?referer=');">Mapumental</a>.</p>
<h2>Q: What are the implications of paywalls for Linked Data?</h2>
<p>The general view was that Linked Data &#8211; specifically standards like RDF &#8211; would allow users and organisations to access information about content even if they couldn&#8217;t access the content itself. To give a concrete example, rather than linking to a &#8216;wall&#8217; that simply requires payment, it would be clearer what the content beyond that wall related to (e.g. key people, organisations, author, etc.).</p>
<p>Leigh Dodds felt that using standards like RDF would allow organisations to more effectively package content in commercially attractive ways, e.g. &#8216;everything about this organisation&#8217;.</p>
<h2>Q: What can bloggers do to tap into the potential of Linked Data?</h2>
<p>This drew some blank responses, but Leigh Dodds was most forthright, arguing that the onus lay with developers to do things that would make it easier for bloggers to, for example, visualise data. He also pointed out that currently if someone does something with data it is not possible to track that back to the source and that better tools would allow, effectively, an equivalent of pingback for data included in charts (e.g. the person who created the data would know that it had been used, as could others).</p>
<h2>Q: Given that the problem for publishing lies in advertising rather than content, how can Linked Data help solve that?</h2>
<p>Dan Brickley suggested that OAuth technologies (where you use a single login identity for multiple sites that contains information about your social connections, rather than creating a new &#8216;identity&#8217; for each) would allow users to specify more precisely how they experience content, for instance: &#8216;I only want to see article comments by users who are also my Facebook and Twitter friends.&#8217;</p>
<p>The same technology would allow for more personalised, and therefore more lucrative, advertising.</p>
<p>John O&#8217;Donovan felt the same could be said about content itself &#8211; more accurate data about content would allow for more specific selling of advertising.</p>
<p>Martin Belam quoted James Cridland on radio: &#8220;[The different operators] agree on technology but compete on content&#8221;. The same was true of advertising but the advertising and news industries needed to be more active in defining common standards.</p>
<p>Leigh Dodds pointed out that semantic data was already being used by companies serving advertising.</p>
<h2>Other notes</h2>
<p>I asked members of the audience who they felt were the heroes and villains of Linked Data in the news industry. The Guardian and BBC came out well &#8211; The Daily Mail were named as repeat offenders who would simply refer to &#8220;a study&#8221; and not say which, nor link to it.</p>
<p>Martin Belam pointed out that The Guardian is increasingly asking itself &#8216;How will that look through an API?&#8217; when producing content, representing a key shift in editorial thinking. If users of the platform are swallowing up significant bandwidth or driving significant traffic then that would probably warrant talking to them about more formal relationships (either customer-provider or partners).</p>
<p>A number of references were made to the problem of provenance &#8211; being able to identify where a statement came from. Dan Brickley specifically spoke of the problem with identifying the source of Twitter retweets.</p>
<p>Dan also felt that the problem of journalists not linking would be solved by technology. In conversation previously, he also talked of &#8220;subject-based linking&#8221; and the impact of <a href="http://www.w3.org/2004/02/skos/" onclick="urchinTracker('/outgoing/www.w3.org/2004/02/skos/?referer=');">SKOS</a> and linked data style identifiers. He saw a problem in that, while new articles might link to older reports on the same issue, older reports were not updated with links to the new updates. Tagging individual articles was problematic in that you then had the equivalent of an overflowing inbox.</p>
<p>(I&#8217;ve invited all 4 participants to correct any errors and add anything I&#8217;ve missed)</p>
<p>Finally, here&#8217;s a bit of video from the very last question addressed in the discussion (filmed with thanks by <a href="http://twitter.com/countculture" onclick="urchinTracker('/outgoing/twitter.com/countculture?referer=');">@countculture</a>):</p>
<p><a href="http://vimeo.com/6514273" onclick="urchinTracker('/outgoing/vimeo.com/6514273?referer=');">Linked Data London 090909</a> from <a href="http://vimeo.com/paulbradshaw" onclick="urchinTracker('/outgoing/vimeo.com/paulbradshaw?referer=');">Paul Bradshaw</a> on <a href="http://vimeo.com" onclick="urchinTracker('/outgoing/vimeo.com?referer=');">Vimeo</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://onlinejournalismblog.com/2009/09/09/data-and-the-future-of-journalism-panel-discussion-linked-data-london/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Elsevier&#8217;s &#8216;Article of the Future&#8217; resembles websites of the past</title>
		<link>http://onlinejournalismblog.com/2009/07/27/elseviers-article-of-the-future-resembles-websites-of-the-past/</link>
		<comments>http://onlinejournalismblog.com/2009/07/27/elseviers-article-of-the-future-resembles-websites-of-the-past/#comments</comments>
		<pubDate>Mon, 27 Jul 2009 13:04:43 +0000</pubDate>
		<dc:creator>paulcarvill</dc:creator>
				<category><![CDATA[magazines]]></category>
		<category><![CDATA[online journalism]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[elsevier]]></category>
		<category><![CDATA[information architecture]]></category>
		<category><![CDATA[Jan Aalbersberg]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[multimedia]]></category>
		<category><![CDATA[nature.com]]></category>
		<category><![CDATA[prototype]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[science innovation]]></category>
		<category><![CDATA[science journalism]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[w3c]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://onlinejournalismblog.com/?p=3082</guid>
		<description><![CDATA[Elsevier, the Dutch scientific publisher, has announced details of their grandly titled Article of the Future project.  Their prototypes, published at http://beta.cell.com, are the result of what Emilie Marcus, Editor in Chief, Cell Press called, &#8220;&#8230;a challenge to redesign from scratch how to most effectively structure and present the content of a traditional scientific article [...]]]></description>
			<content:encoded><![CDATA[
<p>Elsevier, the Dutch scientific publisher, has <a href="http://www.elsevier.com/wps/find/authored_newsitem.cws_home/companynews05_01279" onclick="urchinTracker('/outgoing/www.elsevier.com/wps/find/authored_newsitem.cws_home/companynews05_01279?referer=');">announced details</a> of their grandly titled <strong>Article of the Future</strong> project.  Their prototypes, published at <a title="Elsevier's Article of the Future prototypes" href="http://beta.cell.com" onclick="urchinTracker('/outgoing/beta.cell.com?referer=');">http://beta.cell.com</a>, are the result of what Emilie Marcus, Editor in Chief, Cell Press called,</p>
<blockquote><p>&#8220;&#8230;a challenge to redesign from scratch how to most effectively structure and present the content of a traditional scientific article in an online environment.&#8221;</p></blockquote>
<p><strong>Prototypes</strong><br />
Several things strike me about the prototypes, the most obvious being their remarkable lack of futuristic qualities.  Let&#8217;s bear in mind that these are prototypes, and so are likely to change based on feedback from users in the scientific community and elsewhere; but they are also <em>published</em> prototypes, and so by definition are completely open for comment.  Far from looking futuristic, the prototypes resemble an enthusiastic bash at a multimedia-infused online encyclopaedia circa 1997, when multimedia was still a buzzword, or something you might have found on a cover-mounted CD-ROM magazine giveaway around the same time.<span id="more-3082"></span></p>
<p>Now, I know relatively little about the scientific journal community, or how they consume their publications online.  But I imagine they will be intrigued though underwhelmed by an international publisher of scientific journals presenting them with these prototypes as the future of their work online.</p>
<p><strong>Design</strong><br />
There is a lack of nuance, uniformity and overall cohesion in the article designs which immediately makes me feel uneasy.  From a scientific journal I expect highly structured, easily accessible data.  The inconsistency in design and layout here means I have to work harder to know where I am in the document and what exactly I&#8217;m looking at.  Additionally, the wide text columns, combined with poor line spacing, lack of subheadings and lack of general hierarchical structure within the text contribute to overall poor readability.  I&#8217;d also suggest that the tabbed element leaves little room for expansion should the number of article sections increase.  There is also too much emphasis on distracting design elements like pop-ups and modal dialogues to present small snippets of data which would have been far more suitably presented within their referencing context.</p>
<p><strong>Information architecture</strong><br />
There is a poor overall data structure, and too much repetition of content elements such as the figures.  Although this data structure undoubtedly encourages an initial period of browsing and discovery it also precludes an easy, linear reading of the content and lengthens the user journey considerably.</p>
<p><strong>Development<br />
<span style="font-weight: normal">From a development point of view it is important to note that these prototypes are completely reliant on JavaScript to present all of their content.  Without JavaScript the browser will only render the first tab&#8217;s content visibly.  Needless to say this is unacceptable on the modern web.  Separation of content, presentation and behaviour was long ago accepted as the minimum standard for a modern, user-friendly and broadly accessible web page.  Any web pages being designed now by international publishers should display this characteristic, especially those which claim to represent the future of the medium.  Similarly, these prototypes both have an obnoxious amount of inline JavaScript, making machine readability and text-browser readability a major pain, verging on the unusable.</span></strong></p>
<p><strong><span style="font-weight: normal"><strong>Data</strong><br />
Perhaps the most important point of all: where&#8217;s the data?  XML and specifications like RDF are capable of representing structured scientific data online in such a way that it can be consumed by browsers, applications and other user agents we haven&#8217;t yet developed.  Scientific research data would seem to be a perfect candidate to display the power of the semantic web, and yet Elsevier appears to have completely ignored its availability, its ability to open up and normalise data for a wider audience, and the way it allows consumers of data and information to use it in the manner most appropriate to them. </span></strong></p>
<p><strong>Maintenance</strong><br />
Are there authoring tools for creation and maintenance of these articles?  Is there a published schema and validator to ensure consistency between producers?  Which user needs does this article format address?  Will Elsevier publish their user research so we can better understand the problems users have with current scientific paper formats?  The whole format seems not to have been thought through properly.  Why not work with the <a href="http://www.w3.org/" onclick="urchinTracker('/outgoing/www.w3.org/?referer=');">W3C</a> and other organisations dedicated to the description and ongoing development and maintenance of online document specifications?  Use the power of groups already working in this space instead of reinventing the wheel, badly.</p>
<p><strong>Innovation?</strong><br />
Jan Aalbersberg, Vice President of Content Innovation for Elsevier Science &amp; Technology Journal Publishing, says &#8220;We are confident that these tools will enhance the presentation of scientific results and improve the interpretation and speed of results analysis. They are central to driving innovation in scientific publishing and represent our investment in the future of research, enabling scientists all over the world to access, interpret, and create better science more efficiently.&#8221;  But I say that these &#8220;Articles of the Future&#8221; are not tools, and they are no more innovative than using a page layout application to alter the appearance of some printed matter.  Hyperlinking and the ability to add media files to a page have existed since the web was created, and these articles add nothing more to that basic paradigm of linked data files.  There are some nods towards current trends, with a comment feature and social bookmarking links.  But overall the feel is clunky, lacking research and distinctly amateur.</p>
<p><strong>Articles of the Future?<br />
</strong>Finally, there are two noteworthy features which may tell us more about the project, its origins and its ultimate destination than the prototypes themselves.  Firstly, the copyright notice, which proclaims the pages to be copyright of Elsevier 2008.  Are these prototypes already at least 6 months old?  Marcus herself notes,</p>
<blockquote><p>&#8220;The rapid pace of technological advancements means this will undoubtedly be an evolving design,&#8221;</p></blockquote>
<p>but is this an indication of the kind of turnaround we can expect for amendments and improvements to the article&#8217;s format?  Secondly, the entire article can be downloaded as a PDF in its original published format, which is perfect for printing.  I suspect that, excepting casual browsers, this will be how most users choose to consume their scientific research and analysis.</p>
<p>I think this area of publishing is indeed long overdue a complete overhaul of its staid online publishing practices, and any move to define a new specification for doing so should be welcomed.  Even the otherwise impressive <a href="http://www.nature.com" onclick="urchinTracker('/outgoing/www.nature.com?referer=');">nature.com</a> only goes so far in its presentation of research papers, and there is much room for improvement.  But when the result is as underwhelming, cumbersome and shortsighted as this, I despair.</p>
]]></content:encoded>
			<wfw:commentRss>http://onlinejournalismblog.com/2009/07/27/elseviers-article-of-the-future-resembles-websites-of-the-past/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>The services of the &#8216;semantic web&#8217;</title>
		<link>http://onlinejournalismblog.com/2009/03/23/the-services-of-the-semantic-web/</link>
		<comments>http://onlinejournalismblog.com/2009/03/23/the-services-of-the-semantic-web/#comments</comments>
		<pubDate>Mon, 23 Mar 2009 15:33:13 +0000</pubDate>
		<dc:creator>michaelhaddon</dc:creator>
				<category><![CDATA[online journalism]]></category>
		<category><![CDATA[browsing]]></category>
		<category><![CDATA[community]]></category>
		<category><![CDATA[computer aided reporting]]></category>
		<category><![CDATA[firefox]]></category>
		<category><![CDATA[headup]]></category>
		<category><![CDATA[iGlue]]></category>
		<category><![CDATA[OpenCalais]]></category>
		<category><![CDATA[Powerset]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[SemantiFind]]></category>
		<category><![CDATA[social networking]]></category>
		<category><![CDATA[Thomson Reuters]]></category>
		<category><![CDATA[Twine]]></category>
		<category><![CDATA[web 3.0]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://onlinejournalismblog.com/?p=2453</guid>
		<description><![CDATA[Many of the services that are being developed as part of the &#8216;semantic web&#8217; are necessarily works in progress, but they all contribute to extending the success of this burgeoning area of technology. There are plenty more popping up all the time, but for the purposes of this post I have loosely grouped some prominent [...]]]></description>
			<content:encoded><![CDATA[
<p>Many of the services that are being developed as part of the &#8216;semantic web&#8217; are necessarily works in progress, but they all contribute to extending the success of this burgeoning area of technology. There are plenty more popping up all the time, but for the purposes of this post I have loosely grouped some prominent sites into specialities &#8211; social networking, search and browsing &#8211; before briefly explaining their uses.</p>
<p><span id="more-2453"></span><em>BROWSING</em></p>
<p><a href="http://www.opencalais.com/" onclick="urchinTracker('/outgoing/www.opencalais.com/?referer=');">OpenCalais</a> is a way to tag people, places, facts and events in pre-existing content to increase its value and accessibility. It makes use of RDF to annotate content intelligently and automatically so that it can be used in more meaningful ways. Developed by <a href="http://www.thomsonreuters.com/" onclick="urchinTracker('/outgoing/www.thomsonreuters.com/?referer=');">Thomson Reuters</a>, the service now has a <a href="http://sws.clearforest.com/calaisViewer/" onclick="urchinTracker('/outgoing/sws.clearforest.com/calaisViewer/?referer=');">preview tool</a> that can take any document and provide a display of the results of tagging and linking the semantic data. It provides an immediate and useful example of the way the technology works and is fun to play around with. OpenCalais is also available as a <a href="http://www.wordpress.com/" onclick="urchinTracker('/outgoing/www.wordpress.com/?referer=');">WordPress</a> <a href="http://www.readwriteweb.com/archives/calais_gets_a_wordpress_plugin.php" onclick="urchinTracker('/outgoing/www.readwriteweb.com/archives/calais_gets_a_wordpress_plugin.php?referer=');">plugin</a> which uses the service for auto tagging posts and archives with the correct themes.</p>
<p><a href="http://www.headup.com/" onclick="urchinTracker('/outgoing/www.headup.com/?referer=');">headup</a> is a <a href="http://www.mozilla-europe.org/en/firefox/" onclick="urchinTracker('/outgoing/www.mozilla-europe.org/en/firefox/?referer=');">Firefox</a> plugin that enables semantic capabilities within any web page. Extra data is displayed fully in context as the service just alerts the user with a &#8216;+&#8217; symbol when there is something else of interest to them. On encountering data about a band, headup might highlight the latest YouTube videos, tour dates and official blog-posts next to their name. This data can all be viewed without ever navigating away from the original page. Impressively, headup&#8217;s semantic engine promises to provide a personalised service by retrieving information that specifically interests the individual user. You can watch a demonstration video <a href="http://www.youtube.com/watch?v=sZnwOKvtQ6M&amp;eurl=" onclick="urchinTracker('/outgoing/www.youtube.com/watch?v=sZnwOKvtQ6M_amp_eurl=&amp;referer=');">here</a>.</p>
<p><em>SEARCH</em></p>
<p><a href="http://www.semantifind.com/" onclick="urchinTracker('/outgoing/www.semantifind.com/?referer=');">SemantiFind</a> claims to return more relevant results than traditional search engines, yet users can still continue using them as it is compatible with Google, Yahoo and Microsoft Live Search. You have to download and install a free browser plug-in, but SemantiFind results are displayed alongside normal search engine results, offering some familiarity. You can watch a demonstration video <a href="http://www.dailymotion.com/video/x7l21f_learn-semantifind-in-90-seconds_tech" onclick="urchinTracker('/outgoing/www.dailymotion.com/video/x7l21f_learn-semantifind-in-90-seconds_tech?referer=');">here</a>.</p>
<p><a href="http://www.powerset.com/" onclick="urchinTracker('/outgoing/www.powerset.com/?referer=');">Powerset</a> is among those services applying natural language processing to the web and Wikipedia already benefits from its approach. Powerset displays an interface alongside the Wiki itself so users can navigate quickly and seamlessly using the keywords, themes and sections which have been stripped out of the original article. You can watch a demonstration video <a href="http://vimeo.com/994819" onclick="urchinTracker('/outgoing/vimeo.com/994819?referer=');">here</a>.</p>
<p><a href="http://www.iglueit.com/" onclick="urchinTracker('/outgoing/www.iglueit.com/?referer=');">iGlue</a> is a search engine that tries to identify and manage entities, not keywords. The service finds relevant information even if the given element appears in a form different from that used in the original search. It understands that corresponding words can sometimes be substituted. You can see a demonstration of the technology <a href="http://iglueit.com/demo1/query.nytimes.com/gst/index.html" onclick="urchinTracker('/outgoing/iglueit.com/demo1/query.nytimes.com/gst/index.html?referer=');">here</a>.</p>
<p><em>SOCIAL NETWORKING</em></p>
<p><a href="http://www.twine.com/" onclick="urchinTracker('/outgoing/www.twine.com/?referer=');">Twine</a> seems to be the pre-eminent &#8216;semantic web&#8217; service when it comes to social networking. It acts as a means of collecting and sharing all kinds of online content, learning more about you as you fill it up and link to other content. Twine aims to build on the principles of developing communities of interest. You can even interact with <a href="http://www.twine.com/user/nova" onclick="urchinTracker('/outgoing/www.twine.com/user/nova?referer=');">Nova Spivack</a>, the site&#8217;s creator, and see what things have captured his attention.</p>
<p>It is clear that all of the services &#8211; whether targeted at browsing, search or social networking &#8211; foster more advances in the field, and my final post on the &#8216;semantic web&#8217; explores the revolutionary uses of these new technologies for the benefit of journalism. My previous post, called &#8216;The next step to the &#8216;semantic web&#8217;&#8217;, can be found <a href="http://onlinejournalismblog.com/2009/03/22/the-next-step-to-the-semantic-web/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://onlinejournalismblog.com/2009/03/23/the-services-of-the-semantic-web/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>The next step to the &#8216;semantic web&#8217;</title>
		<link>http://onlinejournalismblog.com/2009/03/22/the-next-step-to-the-semantic-web/</link>
		<comments>http://onlinejournalismblog.com/2009/03/22/the-next-step-to-the-semantic-web/#comments</comments>
		<pubDate>Sun, 22 Mar 2009 19:08:41 +0000</pubDate>
		<dc:creator>michaelhaddon</dc:creator>
				<category><![CDATA[online journalism]]></category>
		<category><![CDATA[community]]></category>
		<category><![CDATA[computer aided reporting]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[journalism]]></category>
		<category><![CDATA[Nova Spivack]]></category>
		<category><![CDATA[OWL]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[tagging]]></category>
		<category><![CDATA[tim berners-lee]]></category>
		<category><![CDATA[Twine]]></category>
		<category><![CDATA[Vint Cerf]]></category>
		<category><![CDATA[web 3.0]]></category>
		<category><![CDATA[WWC]]></category>

		<guid isPermaLink="false">http://onlinejournalismblog.com/?p=2449</guid>
		<description><![CDATA[There are billions of pages of unsorted and unclassified information online, which make up millions of terabytes of data with almost no organisation.  It is not necessarily true that some of this information is valuable whilst some is worthless; that&#8217;s just a judgement for whoever desires it.  At the moment, the most common way to [...]]]></description>
			<content:encoded><![CDATA[
<p>There are billions of pages of unsorted and unclassified information online, which make up millions of terabytes of data with almost no organisation.  It is not necessarily true that some of this information is valuable whilst some is worthless; that&#8217;s just a judgement for whoever desires it.  At the moment, the most common way to access any information is through the hegemonic search engines which act as an entry point.</p>
<p>Yet, despite Google&#8217;s dominance of the market and culture, the methodology of search still isn&#8217;t satisfactory.  Leading technologists see the next stage of development coming, where computers will become capable of effectively analysing and understanding data rather than just presenting it to us.  Search engine optimisation will eventually be replaced by the &#8216;semantic web&#8217;.</p>
<p><span id="more-2449"></span>Correctly tagging the mass of available data to provide a clear sense of meaning is the best way of achieving this according to <a href="http://en.wikipedia.org/wiki/Nova_Spivack" onclick="urchinTracker('/outgoing/en.wikipedia.org/wiki/Nova_Spivack?referer=');">Nova Spivack</a>, founder of <a href="http://www.twine.com" onclick="urchinTracker('/outgoing/www.twine.com?referer=');">Twine</a>, one of the leading sites in this field.  He says:</p>
<blockquote><p>&#8220;This next generation is actually based on enriching the meaning, enriching the structure.  The reason we want to do this is so that software can understand the web like humans can understand the web. Because the semantic web is not for humans, it is for machines.&#8221;</p></blockquote>
<p>Undertaking this task will revolutionise the way we utilise the internet, creating intelligent interaction and impacting on the way the web is perceived in popular culture. <a href="http://en.wikipedia.org/wiki/Vinton_Cerf" onclick="urchinTracker('/outgoing/en.wikipedia.org/wiki/Vinton_Cerf?referer=');">Vint Cerf</a>, one of the driving forces behind the creation of the internet, <a href="http://www.guardian.co.uk/media/pda/2008/sep/25/internet.bbc" onclick="urchinTracker('/outgoing/www.guardian.co.uk/media/pda/2008/sep/25/internet.bbc?referer=');">says</a>:</p>
<blockquote><p>&#8220;I don&#8217;t believe that we will see arising out of the current internet&#8230;conscious artificial intelligence, but we will probably see the system become easier to interact with &#8211; for example, voice interaction is becoming increasingly easy to accomplish. I&#8217;m almost certain you&#8217;ll see products emerging that will allow you to orally interact with the network &#8211; ask for something, demand something, or command something and have [it] happen.</p></blockquote>
<blockquote><p>&#8220;We may feel that this system is more intelligent because we are interacting with it in ways that don&#8217;t require us to point, click and type. The semantic web idea will make the internet seem more intelligent because we are extracting knowledge that other people put into it in a way that looks pretty intelligent.&#8221;</p></blockquote>
<p>So the aim of the &#8216;semantic web&#8217; is to allow data to be accessed and shared effectively by wider communities, yet processed automatically by computer.  In order for this to happen there needs to be a simple system to categorise data so it can be easily located and organised.</p>
<p>Much progress has been made in this infrastructure, particularly in the development of the new languages &#8211; Resource Description Framework (<a href="http://www.w3.org/RDF/" onclick="urchinTracker('/outgoing/www.w3.org/RDF/?referer=');">RDF</a>) and Web Ontology Language (<a href="http://www.w3.org/2004/OWL/" onclick="urchinTracker('/outgoing/www.w3.org/2004/OWL/?referer=');">OWL</a>) &#8211; by the <a href="http://www.w3.org/" onclick="urchinTracker('/outgoing/www.w3.org/?referer=');">World Wide Web Consortium</a>.  The languages are used to annotate code, representing &#8216;knowledge&#8217; which will enable applications to use them more intelligently.</p>
<p>At the moment <a href="http://en.wikipedia.org/wiki/HTML" onclick="urchinTracker('/outgoing/en.wikipedia.org/wiki/HTML?referer=');">HTML</a> is limited to describing static content, documents and the links between them. However, RDF, OWL, and <a href="http://www.w3.org/XML/" onclick="urchinTracker('/outgoing/www.w3.org/XML/?referer=');">XML</a> can describe arbitrary things such as people, events or objects.  An RDF description is layered on top of HTML and is built from statements, each consisting of a subject, a predicate, and an object. For example: <em>&#8220;Jeremy Paxman&#8221; &lt;subject&gt; belongs to &lt;predicate&gt; journalists &lt;object&gt;</em>.</p>
<p>These descriptions allow increased meaning behind the static content, demonstrating the structure of the knowledge behind it.  In this way a machine can process knowledge itself instead of text, using a process similar to human reasoning.  This should result in more meaningful results being returned in searches and perhaps even allow for increased automation when it comes to research by computers.</p>
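<p>The subject, predicate and object idea can be sketched in a few lines of plain Python. This is only a toy illustration, not real RDF syntax or any RDF library: the tuples, the predicate names <code>belongs_to</code> and <code>subclass_of</code>, the helper <code>groups_of</code> and the second fact are all assumptions invented for the example. It shows how a machine might follow stated links to reach a conclusion that nobody typed in directly.</p>

```python
# Toy illustration only: plain tuples standing in for subject/predicate/object statements.
# The predicate names and the second fact are assumptions made up for this sketch.
TRIPLES = [
    ("Jeremy Paxman", "belongs_to", "journalists"),
    ("journalists", "subclass_of", "media professionals"),
]


def groups_of(subject, triples):
    """Return every group the subject belongs to, following subclass links."""
    # Start with the memberships that are stated directly.
    found = {o for s, p, o in triples if s == subject and p == "belongs_to"}
    # Keep adding broader groups until a full pass finds nothing new.
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == "subclass_of" and s in found and o not in found:
                found.add(o)
                changed = True
    return found


print(sorted(groups_of("Jeremy Paxman", TRIPLES)))
```

<p>Run as written, the sketch prints <code>['journalists', 'media professionals']</code>: the second group is inferred from the two statements rather than stored against the name.</p>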
<p>The success (or failure) of these experimental technologies will motivate further research and development, not only from within the industry but also academia.  It is certain their efforts will influence the future development of information technology.  In a <a href="http://onlinejournalismblog.com/2009/03/23/the-services-of-the-semantic-web/">further post</a> I will explore the services currently being forged and in a final post on the &#8216;semantic web&#8217; I will tackle the revolutionary uses this new technology has for journalism.</p>
<p>However, the last word here must go to <a href="http://en.wikipedia.org/wiki/Tim_Berners-Lee" onclick="urchinTracker('/outgoing/en.wikipedia.org/wiki/Tim_Berners-Lee?referer=');">Tim Berners-Lee</a>, the inventor of the World Wide Web, who says:</p>
<blockquote><p>&#8220;A ‘semantic web&#8217; has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents&#8217; people have touted for ages will finally materialise.&#8221;</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://onlinejournalismblog.com/2009/03/22/the-next-step-to-the-semantic-web/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Kitemarks to save the news industry? Q&amp;A with Andrew Currah</title>
		<link>http://onlinejournalismblog.com/2009/02/23/kitemarks-to-save-the-news-industry-qa-with-andrew-currah/</link>
		<comments>http://onlinejournalismblog.com/2009/02/23/kitemarks-to-save-the-news-industry-qa-with-andrew-currah/#comments</comments>
		<pubDate>Mon, 23 Feb 2009 10:06:54 +0000</pubDate>
		<dc:creator>Paul Bradshaw</dc:creator>
				<category><![CDATA[online journalism]]></category>
		<category><![CDATA[regulation, law and ethics]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[advertising]]></category>
		<category><![CDATA[andrew currah]]></category>
		<category><![CDATA[geotagging]]></category>
		<category><![CDATA[kitemark]]></category>
		<category><![CDATA[media standards trust]]></category>
		<category><![CDATA[metatags]]></category>
		<category><![CDATA[Q&A]]></category>
		<category><![CDATA[reuters institute]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[tagging]]></category>
		<category><![CDATA[tim berners-lee]]></category>
		<category><![CDATA[what's happening to our news]]></category>

		<guid isPermaLink="false">http://onlinejournalismblog.com/?p=2104</guid>
		<description><![CDATA[The Reuters Institute recently published a report entitled: &#8216;What&#8217;s Happening to Our News: An investigation into the likely impact of the digital revolution on the economics of news publishing in the UK&#8217;. In it author Andrew Currah provides an overview of the situation facing UK publishers, and three broad suggestions as to ways forward &#8211; namely, kitemarks, [...]]]></description>
			<content:encoded><![CDATA[
<p><em>The Reuters Institute recently published a <a href="http://reutersinstitute.politics.ox.ac.uk/about/news/item/article/whats-happening-to-our-news.html" onclick="urchinTracker('/outgoing/reutersinstitute.politics.ox.ac.uk/about/news/item/article/whats-happening-to-our-news.html?referer=');">report entitled: &#8216;What&#8217;s Happening to Our News: An investigation into the likely impact of the digital revolution on the economics of news publishing in the UK</a>&#8217;. In it author <strong>Andrew Currah</strong> provides an overview of the situation facing UK publishers, and three broad suggestions as to ways forward &#8211; namely, kitemarks, public support, and digital literacy education.</em></p>
<p><em>The <a href="http://www.pressgazette.co.uk/story.asp?sectioncode=1&amp;storycode=42875&amp;c=1" onclick="urchinTracker('/outgoing/www.pressgazette.co.uk/story.asp?sectioncode=1_amp_storycode=42875_amp_c=1&amp;referer=');">kitemark idea</a> seems to have stirred up the most fuss. In the first of a series of email exchanges I asked Currah <strong>how he saw this making any difference to consumption of newspapers, and how it could work in practice</strong>. This is his response:</em></p>
<p>Yes, the kitemark idea has triggered quite a response&#8230; Unfortunately, as the discussion online suggests, the term has implied to many a top-down, centralised system of certification which would lead to some form of &#8216;apartheid&#8217; between bloggers and journalists.<span id="more-2104"></span></p>
<p>That was certainly  not our intended message. The report simply wanted to foreground the idea of  digital labelling as a means of improving transparency in online news  coverage.</p>
<p>All we meant by a kitemark was a symbol (expressed visually,  and electronically as metadata) to convey to audiences, bloggers,  journalists and others that a piece of news content had been intelligently  labelled with relevant information and that it is open to derivative  checking/use&#8230; similar in a sense to the Creative Commons &#8216;mark&#8217; that  travels with media content across the web.</p>
<p>Our report only touched upon this project of labelling, which the Media Standards Trust are busy working on. For a more detailed discussion, see <a href="http://mediastandardstrust.blogspot.com/2009/01/making-news-transparent-is-not-about.html" target="_blank" onclick="urchinTracker('/outgoing/mediastandardstrust.blogspot.com/2009/01/making-news-transparent-is-not-about.html?referer=');">the post by Martin Moore</a> or read about the related efforts of <a href="http://www.newscredit.org/" target="_blank" onclick="urchinTracker('/outgoing/www.newscredit.org/?referer=');">http://www.newscredit.org</a>.</p>
<p>So, in summary, we are in  favour of an open source, voluntary, bottom-up system of tagging NOT an  archaic, top-down system of certification dividing amateurs and  professionals. We did not envision participation in such an initiative as a  precursor to public funding &#8211; though intelligent labelling and linking to  external sites could, for example, be far more developed at the  BBC.</p>
<p>In terms of value, by intelligently labelling the news all sorts of valuable derivative uses might be enabled (e.g. helping users to filter content by criteria or triangulate stories). It might also help to avoid the failures of purely algorithmic search approaches to news (e.g. the fiasco surrounding <a href="http://technology.timesonline.co.uk/tol/news/tech_and_web/article4742147.ece" onclick="urchinTracker('/outgoing/technology.timesonline.co.uk/tol/news/tech_and_web/article4742147.ece?referer=');">the publication of an outdated United Airlines story on Google News in August last year &#8211; triggered, in part, by the lack of any embedded metadata about the story&#8217;s publication date</a>).</p>
<p><strong><em>Is this similar to the ideas that <a href="http://www.informationweek.com/news/internet/search/showArticle.jhtml?articleID=207800163" onclick="urchinTracker('/outgoing/www.informationweek.com/news/internet/search/showArticle.jhtml?articleID=207800163&amp;referer=');">Tim Berners-Lee is working on in his Knight-funded project</a>?</em></strong></p>
<p>Yes &#8211; absolutely. This is something we only briefly touch on in the report. We&#8217;re hoping to spend more time looking at this approach in follow-on research. I think the initiative being developed by Tim Berners-Lee and the Media Standards Trust has a great chance of improving transparency, especially when tagging and labelling technologies are seamlessly integrated into the workflow of the newsroom.</p>
<p><em><strong>I can see how something around metadata could help users find  original journalism, but how do you see this kitemark keeping journalism alive  in a business sense?</strong></em></p>
<p>Whether this would realistically boost the economics of news is difficult to  answer. But on the basis of our research, it seems that a more transparent,  systematic way of tagging the news could help publishers in a variety of ways&#8230;</p>
<p>For example, developing a more comprehensive network of tags connecting stories, themes and content might, in theory, keep people on a site for longer &#8211; in turn, strengthening ad revenues. It might also perpetuate the value and profile of a story long after it was published.</p>
<p>Metadata is also  the key to techniques such as search engine optimisation, social media  marketing, and the like, all of which are about attracting more attention around  the content for longer. It would also provide a system for displaying stories in  new formats, such as digital maps.</p>
<p>When or whether all of this will  translate into enough ad revenues to keep publishers afloat is an open question;  investing in the systems and training to make this archival linking possible is  another hurdle.</p>
<p>An alternative approach might be to buck the trend towards free  by introducing new forms of online paid subscription, to provide access to a  premium, searchable and fully digitized archive of all back content. Metadata  would also be a key step in that direction.</p>
<p><em><strong>The discussion continues in the comments</strong></em></p>
<p><em>NB: The Freeman&#8217;s Journal has <a href="http://freemansjournal.wordpress.com/2009/01/22/whats-happening-to-our-news/" onclick="urchinTracker('/outgoing/freemansjournal.wordpress.com/2009/01/22/whats-happening-to-our-news/?referer=');">an excellent critical overview of the report</a>, with responses from Currah in the comments.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://onlinejournalismblog.com/2009/02/23/kitemarks-to-save-the-news-industry-qa-with-andrew-currah/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>What won&#8217;t happen in 2009 &#8211; and what might</title>
		<link>http://onlinejournalismblog.com/2008/12/19/what-wont-happen-in-2009-and-what-will/</link>
		<comments>http://onlinejournalismblog.com/2008/12/19/what-wont-happen-in-2009-and-what-will/#comments</comments>
		<pubDate>Fri, 19 Dec 2008 08:30:31 +0000</pubDate>
		<dc:creator>Paul Bradshaw</dc:creator>
				<category><![CDATA[online journalism]]></category>
		<category><![CDATA[2009]]></category>
		<category><![CDATA[4ip]]></category>
		<category><![CDATA[Carnival of Journalism]]></category>
		<category><![CDATA[credit crunch]]></category>
		<category><![CDATA[free wifi]]></category>
		<category><![CDATA[funding]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[mobile web]]></category>
		<category><![CDATA[nesta]]></category>
		<category><![CDATA[predictions]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[startups]]></category>

		<guid isPermaLink="false">http://onlinejournalismblog.com/?p=1965</guid>
		<description><![CDATA[This month&#8217;s Carnival of Journalism looks forward to new media developments in the coming year. Here are my no doubt misguided and naive predictions: 2009 will not be the year of the mobile web Every year we make end of year predictions that the coming year will finally see the mobile web hit the mainstream. [...]]]></description>
			<content:encoded><![CDATA[
<p>This month&#8217;s <a href="http://carnivalofjournalism.com/" onclick="urchinTracker('/outgoing/carnivalofjournalism.com/?referer=');">Carnival of Journalism</a> looks forward to new media developments in the coming year. Here are my no doubt misguided and naive predictions:</p>
<h3>2009 will not be the year of the mobile web</h3>
<p>Every year we make end-of-year predictions that the coming year will finally see the mobile web hit the mainstream. In many ways, <a href="http://www.opera.com/smw/2008/10/" onclick="urchinTracker('/outgoing/www.opera.com/smw/2008/10/?referer=');">it already has</a>. But any expectations of there being some significant spread in 2009 will be scuppered by the credit crunch: users will be increasingly reluctant to spend money on a smart phone as the purse strings tighten. We&#8217;re not all going to be carrying around iPhones.</p>
<p>On the plus side, as a result of that slowdown we can expect mobile service providers to become more competitive in their data rates and packages, so that those who do have smart phones will have more reason to take out a mobile web package.<span id="more-1965"></span></p>
<p>We can also expect to see increasing numbers of retailers offering free wifi to attract customers, <a href="http://uk.techcrunch.com/2008/12/15/grab-some-free-wifi-with-your-coffee-at-pret/" onclick="urchinTracker('/outgoing/uk.techcrunch.com/2008/12/15/grab-some-free-wifi-with-your-coffee-at-pret/?referer=');">as Pret A Manger have done</a>, or government investment in wifi clouds to stimulate growth. So those who do access the web on the move &#8211; not just on mobile phones but on laptops and iPods &#8211; could start to do so more.</p>
<h3>2009 will not be the year of the semantic web</h3>
<p>The semantic web holds enormous promise for journalism, but it&#8217;s still early days and even <a href="http://www.readwriteweb.com/archives/top_10_semantic_web_products_2008.php" onclick="urchinTracker('/outgoing/www.readwriteweb.com/archives/top_10_semantic_web_products_2008.php?referer=');">the best products</a> are far from mass market. I don&#8217;t expect that to change any time soon. However&#8230;</p>
<h3>In 2009 Google will look more vulnerable than ever</h3>
<p><a href="http://onlinejournalismblog.com/2008/11/05/will-alternative-voices-get-pushed-off-googles-first-page-of-results/">Google has been fiddling with its successful formula</a>, trying to keep users within its verticals and getting greedy for user data. It is <a href="http://onlinejournalismblog.com/2008/11/13/is-local-search-the-chink-in-googles-armour/">weakest on local search</a> and semantic search, and both those areas should see a lot of development in 2009. In 2010, however, Google will probably simply buy the best competitors.</p>
<h3>2009 will see social media getting lean &#8211; and mean</h3>
<p>Social media startups who do not want to <a href="http://www.blogherald.com/2008/12/01/pownce-closes-team-joins-six-apart/" onclick="urchinTracker('/outgoing/www.blogherald.com/2008/12/01/pownce-closes-team-joins-six-apart/?referer=');">join Pownce on the scrapheap</a> will stop developing extra features, trim others, and focus on their core business. Oh, and they&#8217;ll be under increasing pressure to actually start coming up with business models too, which means <a href="http://www.nytimes.com/2008/11/13/technology/internet/13youtube.html" onclick="urchinTracker('/outgoing/www.nytimes.com/2008/11/13/technology/internet/13youtube.html?referer=');">more advertising</a> (if they can sell it), <a href="http://www.sitepoint.com/blogs/2008/10/08/youtube-adds-ecommerce-video-advertisings-future/" onclick="urchinTracker('/outgoing/www.sitepoint.com/blogs/2008/10/08/youtube-adds-ecommerce-video-advertisings-future/?referer=');">more e-commerce</a>, and less stuff for free. All of which will mean less innovation, fewer users and startups without deep pockets joining Pownce on the scrapheap.</p>
<h3>2009 will see a lot of thinking and little action</h3>
<p>All those redundant journalists, publishers, developers, and estate agents will have plenty of time to reflect on how their industries are changing, to play around with online tools, meet people online and offline, and come up with ideas on where to go next.</p>
<p>They&#8217;ll be doing this in an environment where funds are beginning to appear that enable them to act on those ideas. In the UK at least there is <a href="http://uk.techcrunch.com/2008/12/08/is-this-1bn-from-nesta-new-money-will-private-equity-really-join-in-and-why-is-nesta-not-answering-their-email/" onclick="urchinTracker('/outgoing/uk.techcrunch.com/2008/12/08/is-this-1bn-from-nesta-new-money-will-private-equity-really-join-in-and-why-is-nesta-not-answering-their-email/?referer=');">£1 billion from NESTA</a>, <a href="http://www.4ip.org.uk/" onclick="urchinTracker('/outgoing/www.4ip.org.uk/?referer=');">£50m from 4iP</a>, <a href="http://www.thirdsector.co.uk/news/Article/863418/scottish-government-launches-1m-social-enterprise-fund/" onclick="urchinTracker('/outgoing/www.thirdsector.co.uk/news/Article/863418/scottish-government-launches-1m-social-enterprise-fund/?referer=');">£1m from the Scottish government</a> and various other pots of money aimed at maintaining economic growth.</p>
<p>So by 2010, when the bids have been put in, funds released, and pilots completed, we should see some very interesting new media indeed.</p>
]]></content:encoded>
			<wfw:commentRss>http://onlinejournalismblog.com/2008/12/19/what-wont-happen-in-2009-and-what-will/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Semantic Journalism: Ideas</title>
		<link>http://onlinejournalismblog.com/2008/06/25/semantic-journalism-ideas/</link>
		<comments>http://onlinejournalismblog.com/2008/06/25/semantic-journalism-ideas/#comments</comments>
		<pubDate>Wed, 25 Jun 2008 07:28:34 +0000</pubDate>
		<dc:creator>nicolaskb</dc:creator>
				<category><![CDATA[online journalism]]></category>
		<category><![CDATA[future journalism]]></category>
		<category><![CDATA[journalism]]></category>
		<category><![CDATA[Nicolas Kayser-Bril]]></category>
		<category><![CDATA[semantic journalism]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[semantics]]></category>

		<guid isPermaLink="false">http://onlinejournalismblog.com/?p=1144</guid>
		<description><![CDATA[Semantic journalism is a vision for the future of journalism. As the writer works on her article, her computer would gather data on the matter, from pictures to other articles to assessing global opinion trends. It would read through the Wikipedia pages of a given theme and summarize key concepts. A semantic algorithm would bring [...]]]></description>
			<content:encoded><![CDATA[
<p>Semantic journalism is a vision for the future of journalism. As the writer works on her article, her computer would gather data on the matter, from pictures to other articles to assessing global opinion trends. It would read through the Wikipedia pages of a given theme and summarize key concepts. A semantic algorithm would bring up a selection of the most authoritative people on a subject.</p>
<p>The journalist is left with what she does best: checking and analyzing the data.</p>
<p>That means avoiding the pitfalls of <a href="http://publishing2.com/2008/05/04/the-declining-value-of-redundant-news-content-on-the-web/" onclick="urchinTracker('/outgoing/publishing2.com/2008/05/04/the-declining-value-of-redundant-news-content-on-the-web/?referer=');">redundant news content</a>. That means escaping the trap of writing about topics without having a clue of what&#8217;s at stake. That means interviewing people who do things rather than <a href="http://en.wikipedia.org/wiki/Alexis_Debat" onclick="urchinTracker('/outgoing/en.wikipedia.org/wiki/Alexis_Debat?referer=');">those who talk about it</a>.</p>
<p>This article is the first of a 4-part series. We&#8217;ll explore semantic hacks for newsgathering, writing and publishing in the coming weeks.<span id="more-1144"></span></p>
<h2>Part 1: Semantics today: What revolution?</h2>
<p>Semantic journalism is closely related to the <a href="http://en.wikipedia.org/wiki/Semantic_Web" onclick="urchinTracker('/outgoing/en.wikipedia.org/wiki/Semantic_Web?referer=');">semantic web</a>. The latter is a tidal wave that has been redesigning the web since the early 2000s, the motto of which is to make a webpage readable for machines. XML and RDF are the key words, <a href="http://en.wikipedia.org/wiki/Tim_Berners-Lee" onclick="urchinTracker('/outgoing/en.wikipedia.org/wiki/Tim_Berners-Lee?referer=');">Tim Berners-Lee</a> the guru.</p>
<p>Now, having machines precisely understand the meaning of a story is another matter. Querying a database in natural language has been done since the 1970s. Concretely, it means typing ‘What is the temperature in London?&#8217; and seeing the machine display ‘20°C&#8217;.</p>
<p>But since the 1970s, little has improved. Put simply, the computer reads the sentence, identifies a few words and their syntactical function, and runs through a database to pick relevant information. Each word is given a meaning from the multiple senses it can carry.</p>
<p>In the example above, the computer can tell from the sentence&#8217;s structure that ‘temperature&#8217; is not referring to <a href="http://www.deezer.com/#music/result/all/sean%20paul%20temperature" onclick="urchinTracker('/outgoing/www.deezer.com/_music/result/all/sean_20paul_20temperature?referer=');">Sean Paul&#8217;s hit</a>. Then, it asks the database containing weather-related data for the current temperature in London.</p>
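<p>To give a feel for how little is going on in the London example, here is a deliberately naive sketch in Python. It skips the syntactic analysis entirely and only matches keywords, so it is much cruder than a real natural-language interface. The <code>WEATHER</code> dictionary stands in for a weather database and the <code>answer</code> function is a hypothetical name; both are assumptions made up for this illustration.</p>

```python
import re

# Hypothetical stand-in for a weather database; the reading is the one quoted in the text.
WEATHER = {"london": {"temperature": "20°C"}}


def answer(question, database):
    """Spot a known place and a known measurement in the question, then look them up."""
    # Lower-case the question and split it into bare words.
    words = re.findall(r"[a-z]+", question.lower())
    # First word that names a place the database knows about.
    place = next((w for w in words if w in database), None)
    if place is None:
        return None
    # First word that names something recorded for that place.
    measure = next((w for w in words if w in database[place]), None)
    if measure is None:
        return None
    return database[place][measure]


print(answer("What is the temperature in London?", WEATHER))
```

<p>With the single record above, the question from the text gets the answer <code>20°C</code>, and a question about any place the dictionary does not know returns nothing at all.</p>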
<p>Semantics&#8217; rapid evolution has to do with <a href="http://en.wikipedia.org/wiki/Moore%27s_law" onclick="urchinTracker('/outgoing/en.wikipedia.org/wiki/Moore_27s_law?referer=');">Moore&#8217;s law</a> and its army of escorting laws, all of which say that it&#8217;s getting cheaper to store and access data. Semantic applications can add more meanings to each word. Eventually, a semantic app will know that <a href="http://us.imdb.com/title/tt0029282/combined" onclick="urchinTracker('/outgoing/us.imdb.com/title/tt0029282/combined?referer=');"><em>Temperature</em></a> is also a 1937 movie. With a large enough database, it can store an almost infinite amount of temperature-related data.</p>
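<p>Storing several meanings per word can be pictured as a lookup table of senses, each with a few cue words that hint at it. The Python sketch below is an assumption-laden toy: the <code>SENSES</code> table, its cue words and the <code>pick_sense</code> function are invented for this example and do not describe any actual semantic app. It simply counts how many cue words appear in the sentence and picks the sense with the most hits.</p>

```python
import re

# Hypothetical sense inventory for one word; the cue words are invented for this sketch.
SENSES = {
    "temperature": {
        "weather reading": {"weather", "london", "degrees", "forecast"},
        "Sean Paul song": {"sean", "paul", "song", "lyrics"},
        "1937 film": {"film", "movie", "1937"},
    }
}


def pick_sense(word, sentence, senses):
    """Return the sense whose cue words overlap most with the sentence, or None."""
    # Words and numbers in the sentence, lower-cased.
    context = set(re.findall(r"[a-z0-9]+", sentence.lower()))
    # Count cue-word hits for each candidate sense.
    scores = {
        sense: len(cues.intersection(context))
        for sense, cues in senses[word].items()
    }
    best = max(scores, key=scores.get)
    # No cue word matched at all, so the sketch refuses to guess.
    if scores[best] == 0:
        return None
    return best


print(pick_sense("temperature", "What is the temperature in London?", SENSES))
```

<p>Adding a sense here costs one more dictionary entry, which is the cheap part. Deciding which cue words really signal which sense is where the difficulty lies.</p>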
<p>However, when Sean Paul says that he <em>‘got the right temperature fi shelter you from the storm&#8217;</em>, a computer will have a hard time understanding that there&#8217;s no actual shelter and no storm, no matter how many databases it commands. The key is to know that it&#8217;s a lush R&amp;B song.</p>
<p>Some researchers argue that the traditional approach will not solve the semantic conundrum, no matter how much processing power is unleashed. Instead of a stratified method, where the program identifies the grammatical syntax, then the different possible meanings of each word, they favor a ‘what&#8217;s going on&#8217; approach (they call it <em>dynamic sense building</em>, as opposed to <em>compositional sense computing</em>, in the words of semanticist <a href="http://www.wkdialogue.ch/symposia/2006/speakers-chairs/bernard-victorri/index.html" onclick="urchinTracker('/outgoing/www.wkdialogue.ch/symposia/2006/speakers-chairs/bernard-victorri/index.html?referer=');">Bernard Victorri</a>).</p>
<p><a href="http://arxiv.org/ftp/cs/papers/0607/0607084.pdf" onclick="urchinTracker('/outgoing/arxiv.org/ftp/cs/papers/0607/0607084.pdf?referer=');">In a paper</a>, Daniel Kayser (full disclosure: that&#8217;s my dad) and Farid Nouioua explain that when a computer reads the sentence <em>The truck in front of me braked suddenly</em>, the key to extracting meaning isn&#8217;t in any of the words, but in knowing what is not said.</p>
<div style="float:left;width:300px;margin:15px"><a href="http://onlinejournalismblog.com/wp-content/uploads/2008/06/semantics.gif"><img class="alignnone size-medium wp-image-1145" src="http://onlinejournalismblog.com/wp-content/uploads/2008/06/semantics-300x221.gif" alt="" width="300" height="221" /></a>
<p style="font-size:.8em">The semantic field for the word &#8216;car&#8217;, according to <a href="http://dico.isc.cnrs.fr/fr/index.html" onclick="urchinTracker('/outgoing/dico.isc.cnrs.fr/fr/index.html?referer=');">Sabine Ploux&#8217;s very cool semantic atlas</a></p></div>
<p>What the sentence actually means does not come from putting together the senses (as found in a dictionary) of each of its words. You need to know a lot about ordinary driving situations to grasp what any reader would easily infer (e.g. that the risk of an accident was high). The knowledge required is not to be found in any dictionary or encyclopaedia, however thick it might be. They argue that sense doesn&#8217;t come from what&#8217;s written, but from what&#8217;s assumed and left unwritten.</p>
<p>Semantics has not improved dramatically over the last decade. Automated summaries, for instance, a problem that has kept semanticists busy for the past 40 years, are still a distant prospect. Worse, it&#8217;s hard to see any technological bottleneck that could, if broken, propel semantics into a higher gear.</p>
<p>In the coming weeks, the Online Journalism Blog team will test all kinds of semantic apps that could help journalists. We&#8217;ll try to separate semantic snake-oil from genuinely innovative apps and discuss the value semantics can add. Stay tuned!</p>
<p><strong>By <a href="http://windowonthemedia.com" onclick="urchinTracker('/outgoing/windowonthemedia.com?referer=');">Nicolas Kayser-Bril</a></strong></p>
]]></content:encoded>
			<wfw:commentRss>http://onlinejournalismblog.com/2008/06/25/semantic-journalism-ideas/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

