
silly lawsuit of the week

2007 March 19
by David Ma

OK. Short version of the story in InformationWeek: Woman puts up a website. She puts a "webwrap" agreement at the bottom - i.e. basically a contract that says if you use the site then you agree to the contract. Still some question as to whether such a mechanism is binding, but anyway...

So the Internet Archive of course comes along and indexes her site. Which apparently is a violation of the webwrap. So she sues, representing herself, I believe. The court throws out everything on a preliminary motion by IA except for the breach of contract.

InformationWeek observes that "Her suit asserts that the Internet Archive's programmatic visitation of her site constitutes acceptance of her terms, despite the obvious inability of a Web crawler to understand those terms and the absence of a robots.txt file to warn crawlers away." (my emphasis). They then conclude with this statement:

If a notice such as Shell's is ultimately construed to represent just such a "meaningful opportunity" to an illiterate computer, the opt-out era on the Net may have to change. Sites that rely on automated content gathering like the Internet Archive, not to mention Google, will have to convince publishers to opt in before indexing or otherwise capturing their content. Either that or they'll have to teach their Web spiders how to read contracts.

(my emphasis).

They already have - sort of. It's called robots.txt - the thing referred to above. For those of you who haven't heard of this, it's a little file that you put at the top level of your site and which is the equivalent of a "no solicitation" sign on your door. It's been around for at least a decade (probably longer) and most (if not all) search engines honor it.

From the Internet Archive's FAQ:

How can I remove my site's pages from the Wayback Machine?

The Internet Archive is not interested in preserving or offering access to Web sites or other Internet documents of persons who do not want their materials in the collection. By placing a simple robots.txt file on your Web server, you can exclude your site from being crawled as well as exclude any historical pages from the Wayback Machine.

Internet Archive uses the exclusion policy intended for use by both academic and non-academic digital repositories and archivists. See our exclusion policy.

You can find exclusion directions at exclude.php. If you cannot place the robots.txt file, opt not to, or have further questions, email us at info at archive dot org.
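For anyone who hasn't seen one, a robots.txt that turns every crawler away is only two lines long. Here is a rough sketch in Python, using the standard library's urllib.robotparser, of how a well-behaved crawler might check that file before fetching a page. The crawler name and the example.com URL are made up for illustration, and this is not meant to describe how the Internet Archive's crawler actually works.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "example-crawler" and the URL are hypothetical placeholders.
    print(may_fetch(ROBOTS_TXT, "example-crawler", "http://example.com/page.html"))
```

With the two-line file above, a crawler that bothers to check is told to stay away from every page. With an empty file, the same check lets it in, which is the opt-out default the InformationWeek piece is talking about.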

standardized methods of communications - privacy policies, etc. - more. Question is, will people be required to use it, or simply disregard and act dumb?
