Free Federated Search Engines for Scientific Data and Literature

Federated search portals are all the rage right now, and the US Government seems enthusiastic about using this technology to make its wide range of scientific data and literature easily accessible. Patent searchers who need to run an exhaustive search for all relevant prior art (as in Landon IP’s Scour the Earth® search service) will find these portals particularly useful, since they can browse relevant, ranked results from multiple databases in a single hit list instead of searching each database individually. The Office of Scientific and Technical Information (OSTI) of the US Department of Energy gives an excellent overview of how federated search portals work:

When you enter a query in basic search, the query is sent to every individual data resource (database, collection, and portal) searched by the discovery tool. The individual data resources send back a list of results from the search query. Results are then ranked in relevance order. You can review the results and navigate to the host site of a particular result for more detailed information.
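The flow OSTI describes (send the query to every resource, collect each resource's results, then rank the merged list) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two in-memory resources, the function names, and the term-count scoring are all hypothetical, and none of it reflects how the actual portals are implemented.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "data resources"; a real portal would query remote databases.
RESOURCES = {
    "energy_reports": [
        "Solar cell efficiency in thin films",
        "Wind turbine blade fatigue",
    ],
    "patent_abstracts": [
        "Thin film solar cell with improved coating",
        "Battery separator membrane",
    ],
}


def search_resource(name, query):
    """Return (source, title, score) tuples for titles matching any query term."""
    terms = query.lower().split()
    hits = []
    for title in RESOURCES[name]:
        words = title.lower().split()
        # Assumed scoring: count exact occurrences of each query term in the title.
        score = sum(words.count(term) for term in terms)
        if score > 0:
            hits.append((name, title, score))
    return hits


def federated_search(query):
    """Send the query to every resource in parallel, then merge and rank the hits."""
    with ThreadPoolExecutor() as pool:
        per_resource = pool.map(lambda name: search_resource(name, query), RESOURCES)
        merged = [hit for hits in per_resource for hit in hits]
    # Highest score first; ties are broken alphabetically by title.
    return sorted(merged, key=lambda hit: (-hit[2], hit[1]))


for source, title, score in federated_search("thin film solar"):
    print(score, source, title)
```

Each hit keeps the name of its source, which is what lets a portal send the user on to the host site of a particular result.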

After the jump, we’ll look at four federated search portals, see what features these search systems have in common, and find which features make each portal unique.

Science Accelerator

Science Accelerator is a free federated web search engine created by the U.S. Department of Energy’s OSTI. The tool allows users to search “key resources” from the Department of Energy’s collections, which are all listed on this page of the website.  Search features of the portal include:

  • Quick search option (keyword search form) available on the site homepage and in the upper right corner of each page on the portal.
  • Advanced Search form, where the user can search by keyword within the full record, title, or author fields. Within the advanced search form, the user can also select a date range and select which resources to search in (via checkboxes beside each resource name).
  • Refine search results in hit list (through a keyword search within the result set).
  • Re-rank the results (by rank, date, title, or author).
  • View results from a specific resource.
  • View results from Wikipedia and EurekAlert.
  • View clusters of results sorted by topic or date.
  • Select specific results from the hit list and e-mail the results.
  • Create an alert based on the query (for registered users only).
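Two of the features above, refining a hit list with a keyword search inside the result set and re-ranking it by rank, date, title, or author, amount to a filter followed by a sort. The sketch below shows the idea; the sample hits, field names, and function names are hypothetical and are not taken from Science Accelerator itself.

```python
from datetime import date

# Hypothetical hit list; every field value here is illustrative only.
HITS = [
    {"title": "Hydrogen storage in metal hydrides", "author": "Nguyen",
     "date": date(2009, 5, 1), "rank": 4},
    {"title": "Hydrogen fuel cell membranes", "author": "Alvarez",
     "date": date(2010, 2, 15), "rank": 5},
    {"title": "Geothermal well drilling", "author": "Baker",
     "date": date(2008, 11, 3), "rank": 2},
]


def refine(hits, keyword):
    """Keep only hits whose title contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [hit for hit in hits if needle in hit["title"].lower()]


def rerank(hits, by):
    """Order by rank or date (highest or newest first), or by title or author (A to Z)."""
    if by not in ("rank", "date", "title", "author"):
        raise ValueError(f"unsupported sort key: {by}")
    descending = by in ("rank", "date")
    return sorted(hits, key=lambda hit: hit[by], reverse=descending)


# Refine the result set to "hydrogen", then show the newest documents first.
for hit in rerank(refine(HITS, "hydrogen"), "date"):
    print(hit["date"], hit["author"], hit["title"])
```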

Each result in the hit list displays the document title, relevance rank (illustrated by colored star icons), source, a brief snippet from the document, and any other important bibliographic data (publication date, etc.). Selecting a search result will take the user to the document record (or full-text version, if available) on a third-party site. A small PDF icon below the result listing indicates that users can download a full-text PDF version of the document.

Search results for Science Accelerator.

WorldWideScience.org

WorldWideScience.org is a federated search portal that searches international scientific and technical literature. The service is offered by the WorldWideScience Alliance, which is a “governance structure” for the search portal. The portal was created by the OSTI (like Science Accelerator).  WorldWideScience.org has a multilingual interface, accessible on the homepage or through the advanced search form, which allows the user to select a language to search in (translations are powered by Microsoft Translator).

Search features and results are very similar to those of the Science Accelerator portal: quick and advanced search forms, all searchable resources listed below the advanced search form, results ranked by relevance, result topics (and common authors, publishers, publications, and dates) listed in a sidebar on one side of the hit list, results from Wikipedia and EurekAlert on the other side of the hit list, options to print or email results, and the option to turn the search into an automatic alert (for registered users). WorldWideScience.org divides results into two lists (accessible through tabs at the top of the hit list): Papers and Multimedia. An option at the top of the hit list also lets users “Translate Results” (although this feature didn’t seem to be functioning correctly at the time of testing).

Search results for WorldWideScience.org.

Science.gov

Science.gov is a search system with access to over 50 databases and over 2100 selected websites, offering 200 million pages of authoritative U.S. government science information, including research and development results. Science.gov is an inter-agency initiative of 18 U.S. government science organizations within 14 Federal Agencies, and it is also the U.S. contribution to WorldWideScience.org. The About Section of Science.gov lists some of the most recent features of Science.gov 5.0. Most of these listed search features are similar to the features available through Science Accelerator and WorldWideScience.org (all resources listed below the advanced search form, relevancy-ranked results, topic/author/date clusters, options to email results and create an alert of the search, etc.).

The Image Search feature (described in more detail in a previous blog post) seems to be a unique tool on Science.gov, which allows users to search by keyword and select from a list of image resources in which to conduct their search.

Search results for Science.gov.

Scitopia

Scitopia is the only federated search portal in this list that isn’t maintained by the federal government. Scitopia runs on Deep Web Technologies’ Explorit Research Accelerator federated search engine (Science.gov also runs on Deep Web technology), and the portal is maintained by a collaborating group of science and technology societies. According to the site homepage, users can search over “3.5 million documents, plus patents and government data.”

Like the previously listed federated search portals, Scitopia features simple and advanced search forms (with all resources listed under the advanced form),  a side menu of topic/author/publication/publisher/affiliation/date clusters based on the search results,  relevance-ranked results, alert and email options for queries and results, and search results that appear on third-party sites when selected.

Like WorldWideScience.org, Scitopia divides its results into separate lists, organized under three tabs: Societies, Patents, and Government.  Scitopia additionally includes a “topic browse” index, which lists common topics in alphabetical order and links to search results for each topic (Science.gov also has a topic index).

Search results for Scitopia.org.

Conclusion

All four federated science search portals have very similar interfaces and search features, such as advanced search forms with all available resources listed below, search results ranked by relevance, a side menu of topic/author/date/etc. clusters, options for emailing results and creating alerts from searches, and search results that open directly in third-party sites when selected. Some unique features exist on each portal, such as an image search feature on Science.gov or a multilingual search interface on WorldWideScience.org. The main difference between the portals, however, seems to be the databases that they search. All the portals may share some resources, but each portal seems to have a slightly different focus: Department of Energy resources for Science Accelerator, international scientific information on WorldWideScience.org, US government scientific information on Science.gov, and scientific data from both government and technical society resources on Scitopia.

All four portals seem to use very similar search technology (Deep Web specifically for Scitopia and Science.gov), and the OSTI played a large part in the creation of three of the portals (Science Accelerator, WorldWideScience.org, and Science.gov).  Despite these similarities, each portal searches some unique databases, so each portal will produce a different result set for identical queries.  Prior art searchers should therefore search all four portals, as well as other available subscription and free search systems, in order to locate all relevant non-patent literature prior art.
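Because identical queries return different result sets on each portal, a searcher who runs all four ends up with overlapping lists to reconcile. One simple way to do that is to merge the lists and record which portals returned each document, as in the rough sketch below. The titles are invented for illustration, and matching on a normalized title is an assumption; real records would usually need a more reliable key such as a report number or DOI.

```python
import re
from collections import defaultdict


def normalize(title):
    """Lowercase a title and collapse punctuation so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def combine(result_sets):
    """Merge per-portal title lists into {title: [portals that returned it]}."""
    portals_by_key = defaultdict(set)
    display_title = {}
    for portal, titles in result_sets.items():
        for title in titles:
            key = normalize(title)
            portals_by_key[key].add(portal)
            # Keep the first spelling seen as the display title.
            display_title.setdefault(key, title)
    return {display_title[key]: sorted(portals) for key, portals in portals_by_key.items()}


# Hypothetical result sets for the same query on two portals.
example = {
    "Science.gov": ["Fuel Cell Catalysts", "Grid Storage Review"],
    "Scitopia": ["Fuel cell catalysts.", "Piezoelectric Sensors"],
}

for title, portals in combine(example).items():
    print(title, "<-", ", ".join(portals))
```

Documents returned by only one portal are the ones a single-portal search would have missed.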

Have you used any of these federated search portals for prior art searching? Which portal do you think has the best search features?  Let us know in the comments!

This post was contributed by Joelle Mornini. The Intellogist blog is provided for free by Intellogist’s parent company Landon IP, a major provider of patent searches, trademark searches, technical translations, and information retrieval services.