Is it possible for software to summarize a patent?

Patent searchers need to be both thorough and efficient, and sometimes a searcher needs to skim the description and/or claims of a patent document to determine its relevance to the search criteria.  Here’s where we face a problem, though: patent documents can be long.  Very long.  Instead of wading through all 50 claims, there are tools out there that allow a searcher to view a summarized version of the document.  The abstract of a patent is meant to be a summary of the document, but sometimes the abstract fails to include important technical terms or features that are relevant to a prior art search.  Therefore, an extended summary automatically generated from the claims or description may be useful in those circumstances where the abstract doesn’t provide enough information to determine the document’s relevance.

In the newest version of PatBase, a “Summarise” tool can be used to condense the Title/Abstract, Claims, or Description sections of a patent document to 1, 5, 10, 25, or 50% of its original length.  If you don’t have a PatBase subscription, free options are also available. Back in January, 3 Geeks and a Law Blog highlighted 5 Resources for Summarizing Web or Other Content.  Two of these applications, GreatSummary and FreeSummarizer, seem particularly promising for patent summation, since they allow users to cut and paste large blocks of text into a form and choose the number of sentences for the condensed output.

After the jump, we’ll experiment with the PatBase “Summarise” tool and the two free summarizing tools to see which application produces the best patent document summary!

PatBase’s “Summarise” Tool

First, I tested the new “Summarise” tool on PatBase.  When a user is viewing the full-text sections of a patent document on PatBase (arranged under three sections: Title/Abstract, Claims, and Description), they can select the “Summarise” link to automatically condense the full-text section to a selected percentage level (1, 5, 10, 25, or 50%). The user can then save or print this summary. Selecting the “View all” option will reveal the full text of the section to the user, with the summarized part of the text highlighted.  I tried summarizing the claims section of US2004234820A to 5%, and the resulting summary appeared to highlight a single claim (out of 41 claims).

The 5% summary of a patent document's claims section on PatBase.


I then cut and pasted the entire claims section of US2004234820A into GreatSummary.com.  GreatSummary is a free online tool that allows the user to paste either a large block of text or the URL of a webpage into a form on the site, select a number of sentences for the final summary, and produce a summary of the pasted text or webpage.  I chose to produce a five-sentence summary of the entire claims section, and the resulting summary was rather long because each claim is written as a single, lengthy sentence.  The summary from GreatSummary.com seemed to be made up of five individual claims.

A five-sentence summary of the claims section on GreatSummary.


Finally, I pasted the claims section of the US patent application into FreeSummarizer.com, another free automatic summary tool where the user can paste a block of text and select the number of sentences for the final summary.  I chose to produce a five-sentence summary, which again resulted in a lengthy summary consisting of five individual claims.

The five-sentence summary of the claims section on FreeSummarizer.


The main problem with trying to summarize patent text within the free online summary tools is that the condensed output is based on a number of sentences, and sentences can be rather lengthy within a patent document.  The format of patent claims, especially, leads to complex and run-on sentence structures, so a summary based on a certain number of sentences will result in that many whole claims being incorporated into the final summary.  It still may be easier to read five important claims than to read all 41 claims, but I personally preferred the summary output from the PatBase “Summarise” tool.  The PatBase tool focuses on a percentage of the text rather than a number of sentences, so the final output is guaranteed to be a specific length (in proportion to the original length).  I also appreciated the option to view the summarized text highlighted within the full text, so the user can see where important keywords may be concentrated in the patent document.
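To make the difference between the two length settings concrete, here is a minimal Python sketch of an extractive summarizer with both kinds of budget. It is only an illustration: none of the tools discussed here document how they pick sentences, so the word-frequency scoring below is an assumption, and the function names (`summarize_by_count`, `summarize_by_percent`) are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break on whitespace that follows ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentences(sentences):
    # Assumed scoring: average document-wide frequency of a sentence's words.
    words = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    freq = Counter(w for ws in words for w in ws)
    return [sum(freq[w] for w in ws) / len(ws) if ws else 0.0 for ws in words]


def rank(sentences):
    # Sentence indices ordered from highest to lowest score.
    scores = score_sentences(sentences)
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)


def summarize_by_count(text, n):
    # Sentence budget: keep the n top-scoring sentences, however long they are.
    sentences = split_sentences(text)
    keep = sorted(rank(sentences)[:n])
    return [sentences[i] for i in keep]


def summarize_by_percent(text, percent):
    # Length budget: keep top-scoring sentences only while the total number
    # of characters stays within the given percentage of the original text.
    sentences = split_sentences(text)
    budget = len(text) * percent / 100
    keep, used = [], 0
    for i in rank(sentences):
        if used + len(sentences[i]) <= budget:
            keep.append(i)
            used += len(sentences[i])
    return [sentences[i] for i in sorted(keep)]
```

With a sentence budget, the output length depends entirely on how long the chosen sentences happen to be, which is why five patent claims can still fill a page. With a percentage budget, the output can never exceed the chosen share of the original, although in this sketch a very small percentage may return nothing at all if no single sentence fits.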

The PatBase “Summarise” tool produces the most concise summaries, since it condenses the text to a certain percentage instead of a number of sentences.  Ultimately, though, the searcher should use these summary tools cautiously, since there is always the chance that the summary may overlook an important claim or keyword that makes the document relevant to the search.  Neither the PatBase manual nor the websites of the free summary tools describe exactly how the applications identify which sentences or sections of text to extract, so the user may not be getting the highest-quality summary available.  If a user wants a high-quality, hand-produced summary of a patent document, they should use the Derwent Abstracts from the Derwent World Patents Index.

Do you know of any online tools or database features that produce useful summaries of a patent document?  Let us know in the comments!

4 Responses

  1. I’ve long thought that a search function for the Summary of the Invention section of U.S. patent applications (and many foreign applications based on or intended for the U.S.) would be the best way to obtain a handle on relevancy. It was something that we pursued for a short time at Patent Lens, but unfortunately didn’t bring to completion.

  2. Try using http://www.cruxbot.com for automatic summarization

    With cruxbot you can not only generate a summary but also dynamically change the “Point of View” of the summary.

  3. We have produced a tool for patent summarization. We cut sentences into segments and after selecting the most relevant segments we combine the segments (using natural language generation) to make the text fluent.
    You can request access to a demo at http://www.topas-engine.upf.edu

