Proximity Mining

Semantic proximity operators allow users to search for keywords that appear in the same sentence, paragraph, or subfield. Regular proximity operators (or just “proximity operators”), by contrast, allow users to search for keywords that are within a certain “distance” of each other, where “distance” is measured in number of words.

Proximity searching is one of my favorite ways to search for prior art. Requiring two important keywords to appear within 5 words of each other is a good way to get more relevant results than simply joining the keywords with a Boolean “AND” operator. The search query “skateboard AND flexible” may return documents that mention “skateboard” and “flexible” pages apart, when what I really wanted was a “flexible skateboard.” Proximity operators ensure that the two terms fall within a certain distance of each other, and are thus more likely to refer to my “flexible skateboard” concept. Various search systems handle proximity operators in different ways, and one of the most interesting approaches is the semantic proximity operator.
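To make the difference between a Boolean “AND” and a regular proximity operator concrete, here is a minimal Python sketch. It is only an illustration, not how any particular search system works: the function names (tokenize, boolean_and, within_n_words) are made up for this example, and “distance” is assumed to mean the difference between word positions, which real systems may count differently.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean_and(text, term_a, term_b):
    """True if both terms occur anywhere in the text, however far apart."""
    tokens = set(tokenize(text))
    return term_a in tokens and term_b in tokens


def within_n_words(text, term_a, term_b, max_distance):
    """True if some occurrence of term_a lies within max_distance word
    positions of some occurrence of term_b (in either order).

    Assumption: distance is the difference in token positions, so adjacent
    words are at distance 1.
    """
    tokens = tokenize(text)
    positions_a = [i for i, tok in enumerate(tokens) if tok == term_a]
    positions_b = [i for i, tok in enumerate(tokens) if tok == term_b]
    return any(abs(i - j) <= max_distance
               for i in positions_a
               for j in positions_b)


# A document that mentions both terms, but far apart.
doc_far = ("The skateboard has four wheels. "
           + "filler " * 50
           + "The strap is flexible.")

# A document that actually describes a flexible skateboard.
doc_near = "A flexible skateboard deck bends under load."

print(boolean_and(doc_far, "skateboard", "flexible"))         # both terms present
print(within_n_words(doc_far, "skateboard", "flexible", 5))   # terms too far apart
print(within_n_words(doc_near, "skateboard", "flexible", 5))  # terms adjacent
```

The Boolean check accepts both documents, while the proximity check keeps only the one where the terms sit close together.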

Semantic proximity operators have an advantage over regular proximity operators in that they are less likely to retrieve a certain kind of false positive hit. A search for “flexible” within 5 words of “skateboard” can return a hit in which “flexible” and “skateboard” fall in different sentences, paragraphs, or sections of the document. In the case of a patent, perhaps “flexible” is at the end of the Background section and “skateboard” is at the beginning of the Summary section. If this is the only place where the two terms occur near each other in the document, the document is not very likely to be relevant to the concept I want to search for. Adding the condition that the keywords must be in the same sentence or paragraph increases the likelihood that the hits have to do with my desired “flexible skateboard.”
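The false positive described above can be sketched in a few lines of Python. Again, this is a rough illustration under stated assumptions rather than the behavior of any real search system: the helper names (split_sentences, split_paragraphs, same_unit) are hypothetical, sentences are split naively on ending punctuation, and paragraphs are assumed to be separated by blank lines.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace.

    Assumption: good enough for a demo; abbreviations and decimals in real
    patent text would need a proper sentence segmenter.
    """
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_paragraphs(text):
    """Split on blank lines (assumed paragraph separator)."""
    return [p for p in re.split(r"\n\s*\n", text.strip()) if p]


def same_unit(text, term_a, term_b, splitter):
    """True if both terms occur together inside at least one unit
    (sentence or paragraph) produced by the given splitter."""
    for unit in splitter(text):
        tokens = set(tokenize(unit))
        if term_a in tokens and term_b in tokens:
            return True
    return False


# The two terms are adjacent words, but they straddle a sentence boundary.
straddle = "The mounting strap is flexible. Skateboard decks are usually rigid."
# Here the terms genuinely belong to the same sentence.
relevant = "A flexible skateboard deck bends under load."
# Here the terms are split across two paragraphs.
two_paragraphs = "The strap is flexible.\n\nSkateboard decks are rigid."

print(same_unit(straddle, "flexible", "skateboard", split_sentences))
print(same_unit(relevant, "flexible", "skateboard", split_sentences))
print(same_unit(straddle, "flexible", "skateboard", split_paragraphs))
print(same_unit(two_paragraphs, "flexible", "skateboard", split_paragraphs))
```

In the first text the two terms are neighbors by word count, so a word-distance operator would match, yet the same-sentence check rejects it because the terms belong to different sentences.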

For more reading on how individual search systems handle proximity operators, you can check out the Boolean and Proximity Operators section of each Intellogist Report. Reports covering search systems with semantic proximity operators include those for TotalPatent and QPAT.

This post was contributed by Intellogist team member Chris Jagalla.
