During last month’s annual PIUG meeting, it was my good fortune to see a presentation from George Garrity, Professor of Microbiology and Molecular Genetics at Michigan State University and a co-founder of NamesforLife, LLC. From George I learned about an important challenge affecting searchers of biological information: rapidly changing organism names.
This is an exciting time to be a biologist, as new knowledge is rapidly being discovered through DNA sequencing technology. But one downside is that a fast-moving field means a quickly changing taxonomy: as bacterial strains are differentiated from one another, their specific names frequently evolve. I was astonished to learn that the list of validly published names of Bacteria and Archaea changes about 15 times a week, and that informal or trivial names are created and enter the literature at a rate of approximately 100 to 150 per day. Read on to discover how these challenges affect the patent field, and how the Names-for-Life technology is designed to help.
Changes in taxonomy create a difficult situation for the researcher, not only when formulating good search queries, but also when interpreting search results. Searchers need to be aware of both older and newer names for any organism of interest when conducting a search, and should carefully investigate all possible keyword terms. In addition, from a biological perspective, searchers need to be aware of close relatives of the organism of interest, as the invention in question may hinge on a metabolic process that could be shared by other organisms within the same family.
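To make the point about older and newer names concrete, here is a minimal sketch of expanding a single organism name into a search query that covers its synonyms. The lookup table and both function names are my own illustrative assumptions, not part of any Names-for-Life product, and the two name groups are just well-known examples of renamed bacteria; a real search would draw on a curated, current source.

```python
# Hypothetical synonym table for illustration only: current name -> older names.
SYNONYMS = {
    "Stenotrophomonas maltophilia": [
        "Pseudomonas maltophilia",
        "Xanthomonas maltophilia",
    ],
    "Clostridioides difficile": ["Clostridium difficile"],
}


def all_names(name):
    """Return the current name plus every known synonym for `name`."""
    for current, older in SYNONYMS.items():
        group = [current, *older]
        if name in group:
            return group
    # Unknown names are passed through unchanged.
    return [name]


def build_query(name):
    """Join all names for an organism into a quoted boolean OR query."""
    return " OR ".join(f'"{n}"' for n in all_names(name))


print(build_query("Clostridium difficile"))
```

Searching on either the old or the new name produces the same query, which is the behavior a searcher wants when the literature spans a renaming.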
To address this challenge, Names-for-Life provides a carefully compiled, hand-edited, up-to-date reference database that matches biological names to their current meanings. Their system relies on the theory of semiotics to track a biological concept through taxonomy shifts; it maps biological names to their taxonomic concepts and organisms, and tracks fundamental concepts using the well-known method of the DOI, or digital object identifier. The reference work is performed by a team of human editors, and bibliographies are compiled to support the data. More in-depth information can be found on the company’s website.
Using this carefully compiled information, Names-for-Life technology scans electronic text, tags the names with DOIs, resolves synonyms, and provides access to vetted bibliographies and other information that allow researchers to probe more deeply into an organism’s background. The core idea is that researchers can more effectively resolve ambiguities presented by a rapidly changing taxonomy if they have the appropriate background material right at their fingertips. The company offers a web browser extension that semantically tags the names in a web page to offer these synonyms, and they are moving forward with efforts to pre-process and tag entire databases of patent and non-patent literature collections.
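The tagging step described above can be pictured with a short sketch. This is not the company's implementation: the identifiers below are placeholders rather than real DOIs, and the dictionary and the `tag_names` function are hypothetical names I made up for illustration. The idea is only that every synonym of an organism resolves to the same persistent identifier.

```python
import re

# Placeholder identifiers (NOT real DOIs); synonyms share one identifier.
NAME_TO_ID = {
    "Pseudomonas maltophilia": "EXAMPLE-ID-1",
    "Xanthomonas maltophilia": "EXAMPLE-ID-1",
    "Stenotrophomonas maltophilia": "EXAMPLE-ID-1",
    "Clostridium difficile": "EXAMPLE-ID-2",
    "Clostridioides difficile": "EXAMPLE-ID-2",
}

# Try longer names first so a long name is never shadowed by a shorter one.
_PATTERN = re.compile(
    "|".join(re.escape(n) for n in sorted(NAME_TO_ID, key=len, reverse=True))
)


def tag_names(text):
    """Append the shared identifier in brackets after each recognized name."""
    return _PATTERN.sub(
        lambda m: f"{m.group(0)} [{NAME_TO_ID[m.group(0)]}]", text
    )


print(tag_names("A strain of Xanthomonas maltophilia was used."))
```

Because the older and newer names carry the same tag, a reader or a downstream search tool can tell that two differently worded documents refer to the same organism.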
Not only does the Names-for-Life data allow searchers to be more inclusive when constructing search strategies, it has already allowed the company’s team to perform some extremely interesting analyses with regard to the use of biological terminology in patents. The Names-for-Life team recently performed this type of analysis on the green technology collection of the Alexandria database from Fairview Research. I’ll present a summary of the results here:
In cases where there are two synonyms, we find a number of instances in which the same organism is claimed under both names in the same patent documents. In only a small number of instances is it obvious that the inventor is aware of the synonymy. Most cases like this include broad and sweeping claims with hundreds of names listed as performing an equivalent function.
In cases where three synonyms occur, we find instances of pairs of names appearing in patents, but have not found a single instance in which all three synonyms appeared in the same document.
The other pattern we find is where patents include only one of the synonymous names. What we find in these cases are instances of overlap in technology and claims.
In other words, in known instances within the test set, a first patent references an organism, and a second patent discloses very similar technical content while referencing a synonymous name. It is important to note that I am not an attorney and nothing in this post should be construed as legal advice. However, the obvious conclusion to be drawn here is that the biological nomenclature in these patents could come under intense scrutiny in case of a legal challenge.
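For readers who want to picture how such patterns might be spotted, here is a rough sketch under my own assumptions. It is not the Names-for-Life team's method; the document IDs, the sample texts, and the function names are all hypothetical. It flags documents that use two or more synonyms of one organism, and pairs of documents that each use a single, different synonym. Whether a flagged pair actually overlaps in technology or claims would still need human review.

```python
from itertools import combinations

# One illustrative group of three synonymous names.
SYNONYM_GROUP = [
    "Pseudomonas maltophilia",
    "Xanthomonas maltophilia",
    "Stenotrophomonas maltophilia",
]


def names_used(text, group=SYNONYM_GROUP):
    """Return the set of names from `group` that appear in `text`."""
    return {n for n in group if n in text}


def summarize(docs):
    """Return (docs using 2+ synonyms, doc pairs using different single synonyms)."""
    used = {doc_id: names_used(text) for doc_id, text in docs.items()}
    multi = sorted(d for d, names in used.items() if len(names) >= 2)
    single = {d: next(iter(names)) for d, names in used.items() if len(names) == 1}
    cross = sorted(
        (a, b) for a, b in combinations(sorted(single), 2) if single[a] != single[b]
    )
    return multi, cross


# Hypothetical mini-collection, invented purely for this example.
docs = {
    "doc-A": "uses Pseudomonas maltophilia and Xanthomonas maltophilia",
    "doc-B": "a Stenotrophomonas maltophilia strain",
    "doc-C": "a Xanthomonas maltophilia culture",
    "doc-D": "no organism named",
}
multi, cross = summarize(docs)
print(multi)
print(cross)
```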
Another interesting result of the analysis relates to laundry lists of named organisms. According to George:
…there are a number of patents in the green technology collection that include long lists of named species (in some cases redundantly), but fail to specify a given strain that actually performs the claimed invention… Patents that include “laundry lists” of organisms that may or may not perform according to claims (and in fact, may not even exist) open the door to what could be some interesting challenges and counter-claims in the courts dealing with both non-enablement and prior art.
Based on this initial analysis from the Names-for-Life team, the challenges faced by biological taxonomists directly affect the work of inventors and patent searchers. I think it’s likely that their data will be integrated into more patent and non-patent databases as the value of their work becomes more obvious.
Does this challenge directly affect your patent searches? Let us know in the comments!
This post was contributed by Landon IP Librarian Kristin Whitman. The Intellogist blog is provided for free by Intellogist’s parent company, Landon IP, a major provider of patent search, technical translation, and information services.