Online Journalism Blog


malcolmcoles
Should Murdoch win any lawsuit against Google?

August 6th, 2009 by malcolmcoles

Update: Chris Gaither from Google explained how to get removed from Google News while remaining in the main index here and here.

There’s a story in Australia that News Corp. is preparing to sue Google and Yahoo to stop both from linking to and quoting News Corp. content. It comes as Rupert Murdoch promises to start charging for online content across his company’s news sites.

The suing story has prompted the usual hilarity, with comments such as “if murdoch sues google & yahoo over news rather than use robots.txt file, it’ll be a short, embarrassing lawsuit”. But here’s why Murdoch might have a case (first posted here) …

Robots.txt isn’t a panacea

The usual response to newspapers’ complaints about Google is to say ‘just use robots.txt to keep them out.’ This was Google’s response in its two fingers to the news industry.

However, most people don’t seem to realise that it’s hard to stay out of Google News and remain in the main Google index:

Please keep in mind that the robot we use for Google News, called Googlebot, is the same robot that we use for Google Web Search. This means that any settings you modify for Google News will also apply to Google Web Search. (From Google Support)
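To make that concrete, here is a minimal sketch using Python’s standard-library robots.txt parser. The robots.txt contents, the example.com URL and the `SomeOtherBot` name are all made up for illustration. It assumes, as the support quote says, that the only user agent available to target is Googlebot, so a rule meant to keep a story out of Google News also keeps Googlebot away from it for web search.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary publisher (not a real site's file).
# There is no separate "news" crawler to name here, only Googlebot.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /news/

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if robots.txt lets this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical story URL on the imaginary site.
STORY_URL = "https://example.com/news/story.html"

# The same crawler feeds both products, so this one answer covers both.
googlebot_allowed = allowed("Googlebot", STORY_URL)

# Any other crawler falls through to the catch-all rule.
other_allowed = allowed("SomeOtherBot", STORY_URL)

print("Googlebot may fetch the story:", googlebot_allowed)
print("Other crawlers may fetch the story:", other_allowed)
```

The rule has nothing in it that says “News only”: it names a crawler, not a product, which is the gap the rest of this post is about.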

There’s a difference between Google News and Google Search

Google search is a way for a user to enter a term and for Google to show relevant pages. Google News these days looks like a fully fledged news aggregation service – check out its front page, and tell me how much that differs from a publisher’s news home page?

Just because publishers are happy to appear in normal search results doesn’t mean they want their content used for free to create a rival news source or product. But there’s no way to use robots.txt, Google’s supposed answer, to draw this distinction.

Google is ignoring ACAP

Publishers have attempted to help Google out with their own protocol called Automated Content Access Protocol – a way to build on robots.txt and allow better control over how their content is used.

Google won’t implement it, saying: “Our guiding principle is that whatever technical standards we introduce must work for the whole web (big publishers and small), not just for one subset or field.”

But Google already draws a distinction between big and small publishers. I publish a blog, but I’m not allowed in Google News, even though I’m in the main Google index.

Conclusion

I’m not saying that any publisher will actually want to stay out of Google. But robots.txt isn’t the answer to the problem of how publishers get paid for or control access to their content.

14 Comments

  1. That distinction that you say keeps your blog out of Google News isn’t actually much of an issue. Once you know what to do, it’s relatively easy to get into Google News. We got The Lichfield Blog on there and pretty much dominate any Google News search for Lichfield now.

    I had the blog’s founder do a guest post on it for me: Google news registration is an easy win.

  2. The range-of-contributors condition would keep me out – there’s only me at my blog. (Unlike here on OJB where there are lots of contributors).

  3. Ah my mistake, I thought you were referring to OJB.

  4. Thanks for that link Philip – have submitted OJB to see what happens. Notice BTW in the technical requirements they require 3 digits in your URLs (and it can’t be the year), which neither yours nor OJB meets.

  5. It’s a bizarre requirement and you’re right, we don’t have it and neither do the Guardian by the looks of it.

  6. The Guardian have been getting away with it for years – I never understood how. In better news all round, the 3-digit requirement is no longer true IF you submit a news sitemap.

    Here’s the link that explains the waiver for the 3-digit rule: http://www.google.com/support/news_pub/bin/answer.py?hl=en&answer=68323

    And here’s the explanation of news sitemaps: http://www.google.com/support/news_pub/bin/answer.py?answer=74288

    There’s a plugin to generate Google News sitemaps for WordPress here, although I’ve never used it myself: http://wordpress.org/extend/plugins/google-news-sitemap-generator/

  7. So, just how do you get a site crawled by the Google News bots?

  8. I’m probably missing something, but Google says it will remove sites people don’t want included, so why is there any talk of suing them or using robots?

    http://www.google.com/support/news_pub/bin/answer.py?hl=en&answer=94003

    Do they refuse to remove some sites?

  9. >Thanks for that link Philip – have submitted OJB to see what happens. Notice BTW in the technical requirements they require 3 digits in your URLs (and it can’t be the year), which neither yours nor OJB meets.

    As has been mentioned, if you use a news sitemap you can dodge that criterion.

    I think that is perhaps easier for a clearly defined news site such as the Lichfield Blog.

    Or it could be sour grapes :-) . I’ve tried a couple of times without success.

    There’s a plugin for the news sitemap (which I can’t make work), or I use a short PHP script here:

    http://www.mattwardman.com/blog/news-sitemap.php

    Matt

  10. PS The only political blog site I’m aware of that is in is Slugger.
