In the last few posts on the Intellogist Blog, we’ve focused on major updates to large patent search tools, such as Google Patents and TotalPatent. Today I’ll take a step back from the big players and highlight some obscure search tools that may give you the extra boost you need to locate that one relevant piece of non-patent literature (NPL) prior art which you’d have otherwise overlooked. Scholrly is a free search engine for academic papers (currently in a closed beta-testing phase) that may one day rival Google Scholar, and HQ Books is a free PDF search tool which can help you locate product manuals and user guides from all over the world. For those patent analysts who want their daily dose of obscure resources: don’t worry, I have one for you, too! We’ll take a quick look at Clustify, document clustering software that identifies important keywords, representative documents, and a hierarchy of customized tags for almost any dataset.
Continue reading for a round-up of little-known tools for prior art searchers and data analysts!
Scholrly is a search engine for academic writing that was developed by a startup company based in Atlanta, Georgia, and is currently being tested by professors at Georgia Tech, according to a profile of the tool on Betakit. Justin Lee’s article on Betakit describes the search functions planned for Scholrly:
Users can search by keyword to find relevant academic papers, and when there are multiple versions of the same work, free versions are prioritized in Scholrly’s search results. Each result shows full abstracts, the date of publication, the publisher, and details about the author. […] The site then provides the link to where publication can be downloaded or purchased, which eliminates any copyright issues. Scholrly only works as the middle man between the researcher and the publisher, and it plans to monetize by partnering with publishers and authors to promote their works and help sell them.
Scholrly plans to eventually cover all disciplines but will focus on “Computer Science and Information Technology, and will only cover academic papers in the initial launch phase.”
The site is currently in closed beta testing, but users can sign up for the open beta testing phase on the Scholrly website by submitting their e-mail address.
Clustify™ is software created by Hot Neuron LLC, and it was recently mentioned in a blog post at Beyond Search. The software has the ability to “analyze documents stored in virtually any database and export cluster information back into the database as additional columns that can be used by many review platforms and e-discovery tools.” According to a press release about the software, the system “groups related documents into labeled clusters, providing an overview of the document set and allowing the user to review and categorize related documents together for greater efficiency and consistency.” Users can choose to “group documents that are conceptually similar, near-duplicates, or elements of an email thread.”
According to the Cluster-text website, “Clustify uses a proprietary mathematical model to measure the similarity of document pairs” and performs the following functions:
- Identifies the most important keywords that cause the documents in the cluster to be considered similar to each other.
- Identifies a “representative document” for each cluster.
- Creates a hierarchy of custom tags that you can use to categorize your documents.
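For readers curious about what this kind of clustering looks like in practice, here is a minimal Python sketch of the general idea. To be clear, this is not Clustify's method: its similarity model is proprietary, so the sketch swaps in TF-IDF weighting with cosine similarity and a simple greedy threshold rule as stand-ins. The function names, the sample documents, and the 0.3 threshold are all hypothetical choices made for illustration only.

```python
import math
import re
from collections import Counter

# Hypothetical sample documents, invented for this illustration.
DOCS = [
    "patent search tools for prior art",
    "prior art patent search engine",
    "iron lung user manual",
    "iron lung user manual and repair guide",
]


def tfidf_vectors(docs):
    """Turn each document into a {term: weight} dict (smoothed TF-IDF)."""
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1.0)
            for t, c in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.3):
    """Greedy clustering: a document joins the first cluster whose seed
    (first member) is similar enough; otherwise it starts a new cluster."""
    vectors = tfidf_vectors(docs)
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters, vectors


def representative(members, vectors):
    """Pick the member with the highest total similarity to its cluster."""
    return max(
        members,
        key=lambda i: sum(cosine(vectors[i], vectors[j]) for j in members),
    )


def keywords(members, vectors, k=3):
    """Return the k terms with the largest summed weight in the cluster."""
    totals = Counter()
    for i in members:
        totals.update(vectors[i])
    return [term for term, _ in totals.most_common(k)]


clusters, vectors = cluster(DOCS)
for members in clusters:
    rep = representative(members, vectors)
    print(members, "| representative:", DOCS[rep],
          "| keywords:", keywords(members, vectors))
```

A real tool would need to handle stop words, much larger datasets, and the tag hierarchy mentioned above, none of which this sketch attempts.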
The software is currently at Version 3.1, which gives users the added option “to automatically ignore email headers and footers to produce cleaner results that are more useful during the document review phase of e-discovery.”
HQ Books (which I originally learned about through this io9 post) is a search engine for freely available PDF files. The site clarifies in its “About Us” section that the system doesn’t “store, hold or retain any files. The original creator or rights holder owns the files. TopHQBooks.com merely displays links of files available freely on the web.”
The site includes two search options:
- A basic keyword search form.
- The option to browse books by country (currently 30 countries are listed).
Search results include a thumbnail image of the first page of the document, a link to the full record on the document, an excerpt, and bibliographic details (document size, page number, date, and source). Up to five stars appear above each thumbnail result image, and these stars seem to indicate the result’s relevancy to your search. Select the document title to view the full record for the document. Select the “download” option from the full record view to open the PDF from its original location.
From the full record view for a document, you can view a bread-crumb trail at the top of the page that displays the country and subject matter of the record (e.g. Home > Argentina > Iron work > How an iron lung works). The full record view also displays links to related documents.
These resources may not be of much use during a targeted patent search or an in-depth patent landscape project, but researchers and data analysts can still find a variety of uses for these types of niche research tools. HQ Books can be used to locate user guides from specific countries, Scholrly may eventually become a search engine for academic papers that can locate relevant prior art even more efficiently than Google Scholar or Mendeley, and Clustify will be a useful product for a general data analysis project. Tuck these search tidbits away for later use, because you never know when they may come in handy!
Do you know of any miscellaneous, obscure search or analysis tools that you’ve found surprisingly useful during a prior art search or analysis project? Tell us your story in the comments!
This post was contributed by Joelle Mornini. The Intellogist blog is provided for free by Intellogist’s parent company Landon IP, a major provider of patent searches, trademark searches, technical translations, and information retrieval services.