Date online materials using the Internet Archive


We all know that the legal element of patent searching can add an interesting twist to scientific and technical literature searches. Those of us in the Intellogist community who do lots of validity investigations, for example, know that what counts is not only *if* material was publicly available, but *when*. Lots of folks I have talked to recommend the Wayback Machine, the Internet Archive’s web archive at archive.org, as a source of evidence that web content was available online before a certain date.

The Internet Archive is an ambitious project to catalog historical versions of web pages.  In other words, using the archive, you can see a certain web page not only as it exists today, but as it appeared years ago.   The archive does this by using a web crawler to take “snapshots” of web pages.  These snapshots are then stored in chronological order, meaning that you can follow a web page’s history as the content displayed on it evolves.

Some uses for the Internet Archive are obvious, like turning back the clock to see what headlines were listed on CNN’s home page ten years ago. However, it can also be used to build evidence of copyright or trademark infringement (e.g. a text passage or logo, once captured, remains in the archive along with its capture date). And of course, it can help us during patentability and validity investigations to gather evidence about the date by which materials were publicly available on the web.

To use the Wayback Machine, first go to http://www.archive.org/. Enter the URL of the page containing your content of interest into the search bar at the top of the page (the bar prompts you for a URL and already contains the protocol prefix, http://), and select “Take Me Back.” The display will show you captured versions of the page in chronological order. As you can see from this test of http://www.cnn.com, the archive does not capture the page every day, only sporadically.

[Screenshot: Wayback Machine archive for http://www.cnn.com]
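For searchers who would rather script the lookup than click through the calendar, the same question ("what is the capture closest to this date?") can be put to the Wayback Machine's availability endpoint. The following is a minimal Python sketch, not an official client. The function names are made up for illustration, and the sample response in the usage note is hand-written to show the shape of the JSON, not a real capture record.

```python
import json
from urllib.parse import urlencode

# Availability endpoint of the Wayback Machine; it returns JSON describing
# the archived capture closest to a requested timestamp.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp):
    """Build the query URL for the capture closest to `timestamp`.

    `timestamp` is a string of digits such as YYYYMMDD or YYYYMMDDhhmmss.
    (Hypothetical helper name; only the endpoint and its `url` and
    `timestamp` parameters come from the service itself.)
    """
    query = urlencode({"url": page_url, "timestamp": timestamp})
    return AVAILABILITY_ENDPOINT + "?" + query


def closest_capture(response_text):
    """Return (timestamp, snapshot_url) from a response body, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]
```

A usage example with an illustrative, hand-written response body (fetching the real one can be done with `urllib.request.urlopen` on the query URL):

```python
print(availability_query("http://www.cnn.com", "20000101"))

sample = (
    '{"archived_snapshots": {"closest": {"available": true, '
    '"url": "http://web.archive.org/web/20000229123340/http://www.cnn.com/", '
    '"timestamp": "20000229123340", "status": "200"}}}'
)
print(closest_capture(sample))
```

Note that the closest capture may fall after the date you asked for, so the returned timestamp still has to be compared against your critical date.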

There are some caveats to using the Wayback Machine. First, snapshots of a given site may not be taken very frequently, which means that you can use the archive to show that content existed on a page on a particular date, but *not* when it was first added. Second, the archive does not capture sites whose robots.txt file instructs web crawlers not to collect their content. Additionally, it may take six months to a year before snapshots are actually available in the archive for viewing.

One frequent user of the archive told me that it’s good practice to check as many iterations of the page as you can manage. Content can be put up, taken down, and put back up again during revisions to a site, so if you really need to beat a certain date it’s best to be thorough. For more information about the legal uses of the site, you can check out the site’s FAQ, which has a section written especially for attorneys.
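The first caveat and the "check every iteration" advice can be made concrete with a small bookkeeping sketch in Python. It assumes you have already reviewed each capture by hand and recorded, for each 14-digit Wayback timestamp (YYYYMMDDhhmmss), whether the content of interest was present. The function names and the capture records in the usage note are hypothetical, invented purely for illustration.

```python
from datetime import datetime


def parse_capture_time(timestamp):
    """Convert a 14-digit Wayback timestamp (YYYYMMDDhhmmss) to a datetime."""
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S")


def first_seen_window(captures):
    """Bracket the period in which the content first appeared.

    `captures` is an iterable of (timestamp, content_present) pairs.
    Returns (last capture without the content, first capture with it);
    the first element is None if the earliest capture already shows the
    content. Returns None if no capture shows the content at all.
    """
    ordered = sorted((parse_capture_time(ts), present) for ts, present in captures)
    previous = None
    for when, present in ordered:
        if present:
            return previous, when
        previous = when
    return None


def captures_with_content_before(captures, cutoff):
    """List capture times strictly before `cutoff` that show the content."""
    return sorted(
        parse_capture_time(ts)
        for ts, present in captures
        if present and parse_capture_time(ts) < cutoff
    )
```

A usage example with made-up review notes, in which the content appears, is taken down, and later returns:

```python
reviewed = [
    ("20030105120000", False),
    ("20030610080000", True),
    ("20040101000000", False),
    ("20040901000000", True),
]
print(first_seen_window(reviewed))
print(captures_with_content_before(reviewed, datetime(2004, 6, 1)))
```

The window returned by `first_seen_window` only says the content was added somewhere between two captures; the capture that shows the content is the date you can actually point to.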

What other tips and tricks do you have for dating material on the web?  Share them with us in our comments section!



This post was contributed by Intellogist team member Kristin Whitman.

5 Responses

  1. […] Date online materials using the Internet Archive « The Intellogist Blog RT @Intellogist: Date online materials using the Internet Archive – New post on the Intellogist Blog! http://bit.ly/aDLbqg #patent (tags: twitter_automatisch patent) […]

  2. […] Date online materials using the Internet Archive – A great search tip for any professional, this post can help you cite prior art where no date is available (or so it might seem). […]

  3. Be warned!
    The Boards of Appeal of the European Patent Office have specifically ruled that internet archive services do not yet provide sufficiently secure evidence of a publication date. (Case Number: T 1134/06 – 3.2.04 Konami Corp, 16 Jan 07).
    According to David Rogers, Legal Member of a Board of Appeal at the EPO, “practitioners who are looking for ‘killer’ prior art would be well advised to stick with the traditional print means, unless they have a considerable body of evidence to support the reliability of an internet disclosure. [Konami] also sets out how a party can cast doubt on the reliability of such disclosures.”
     
    ^ David Rogers, Documents on the internet as prior art, Journal of Intellectual Property Law & Practice, Vol. 2 No. 6, June 2007, pp. 354-355.

  4. Thanks for the advice Michael! I wonder if this is the same outside the EPO’s jurisdiction. To my knowledge there have not been any definitive rulings elsewhere, but it’s certainly something to keep in mind. This is sort of a “last resort” strategy after exhausting traditional, more reliable sources.

  5. […] May we described how searchers could use the Internet Archives’ Wayback Machine to find prior art on […]
