It seems astonishing that the last annual PIUG conference took place more than three months ago. Although I learned so much at the conference, as time goes by I’m left with only my rapidly fading memories and a conference program full of illegibly scrawled notes. (I’ve heard the Handwriting without Tears program is all the rage in the public schools – I wonder if I should enroll myself.)
Before my memories fade entirely, I wanted to blog about one more conference tidbit. I thought it was interesting that Thomson Reuters chose to dedicate their workshop space to the Derwent Patent Citation Index (PCI) file.
You probably know that Thomson Reuters produces a number of value-added patent files, most notably the Derwent World Patents Index (DWPI). The Patent Citation Index (PCI) is a companion file to the DWPI, but has a unique value of its own. It groups documents into the Derwent patent family structure in order to make correlations between equivalent examiner citations on each family member. What I mean by equivalent citations is this: when a US Examiner cites a patent as prior art, it’s likely to be a US document, and when a Japanese Examiner cites the same patent, it’s likely to be a JP document, but both examiners might be referencing the same invention.
Consider a situation in which US and JP examiners were independently searching on US and JP pending applications from the same Derwent family – let’s call it family A. Derwent families are collections of related applications which represent a single new invention. In this case, each examiner could find relevant prior art from Derwent family B (an older invention covered by both US and JP patents). Within family B, it’s likely that the US Examiner is going to cite the US family member because it’s written in English, while the JP Examiner will cite the Japanese equivalent.
Now consider the plight of the independent patent analyst looking at the citations on both of these documents. At first glance, the two search reports give the impression that two independent cases of prior art exist. The PCI file was created to solve this problem by grouping both the citing and cited patents into their Derwent families, thus simplifying the relationship between four patent documents down into two inventive concepts.
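To make the family A / family B example concrete, here is a minimal Python sketch of the grouping idea. It is only an illustration and not how the PCI is actually built: the document numbers, the `FAMILY_OF` lookup, and the `collapse_to_families` helper are all hypothetical names I made up for this post.

```python
# Hypothetical document-to-family lookup. In the real file, family
# assignments come from Derwent; these numbers are invented.
FAMILY_OF = {
    "US-A1": "family A",
    "JP-A1": "family A",
    "US-B1": "family B",
    "JP-B1": "family B",
}

# Examiner citations at the document level: (citing document, cited document).
DOCUMENT_CITATIONS = [
    ("US-A1", "US-B1"),  # the US examiner cites the US member of family B
    ("JP-A1", "JP-B1"),  # the JP examiner cites the JP member of family B
]


def collapse_to_families(citations, family_of):
    """Map each document-level citation to a (citing family, cited family)
    pair, keeping each distinct pair only once."""
    return {(family_of[citing], family_of[cited]) for citing, cited in citations}


family_citations = collapse_to_families(DOCUMENT_CITATIONS, FAMILY_OF)
print(family_citations)
```

The two document-level citations collapse into a single family-level relationship: family A cites family B.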
Using the PCI can make citation searching more efficient: in a single step, users can retrieve the forward citations from all members of a Derwent family at once, rather than having to investigate the citations of one family member at a time. The second benefit is that only the unique Derwent families are retrieved, rather than an ungainly collection of document numbers containing multiple redundant family members from various authorities. (I should also note here that for family citation searching, I believe Questel’s FamPat file can produce the same effect, and because FamPat lacks the editorial content of the DWPI, it is probably also more economical.)
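A rough way to picture the single-step forward citation search is the Python sketch below. It assumes a hypothetical in-memory citation list and family lookup; the document numbers and the `forward_citing_families` function are invented for illustration and do not reflect the actual Dialog or STN commands.

```python
# Hypothetical document-to-family lookup (invented numbers).
FAMILY_OF = {
    "US-A1": "family A",
    "JP-A1": "family A",
    "US-B1": "family B",
    "JP-B1": "family B",
    "EP-B1": "family B",
    "US-C1": "family C",
    "EP-C1": "family C",
}

# Document-level citations: (citing document, cited document).
CITATIONS = [
    ("US-A1", "US-B1"),
    ("JP-A1", "JP-B1"),
    ("US-C1", "US-B1"),
    ("EP-C1", "EP-B1"),
]


def forward_citing_families(target_family, citations, family_of):
    """Return the unique families that cite any member of target_family."""
    members = {doc for doc, fam in family_of.items() if fam == target_family}
    citing = {family_of[src] for src, cited in citations if cited in members}
    citing.discard(target_family)  # ignore citations from within the family
    return sorted(citing)


result = forward_citing_families("family B", CITATIONS, FAMILY_OF)
print(result)
```

Four document-level citations spread across three family B members come back as just two citing families, which is the deduplication described above.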
The Thomson Reuters workshop at PIUG highlighted another extremely important use of this file in high-altitude patent analysis work. The PCI benefits from the editorially enhanced bibliographic data that is the hallmark of the DWPI file, which includes standardized assignee names (and Thomson has recently widened the scope of this feature to include any patenting entity with over 1,000 published patent documents). When performing citation analysis using data retrieved from the PCI, the end result will be grouped into Derwent families, which more clearly represent inventive concepts than individual patent documents do. And the presence of assignee name standardization reduces the amount of manual cleaning needed to produce meaningful analysis.
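As a toy illustration of why standardized assignee names matter for this kind of analysis, the Python sketch below counts citing families per assignee. Everything in it is an assumption made for the example: the company names, the `STANDARD_NAME` table, and the `citing_families_per_assignee` helper are hypothetical and are not Thomson’s actual standardization data.

```python
from collections import Counter

# Hypothetical standardization table: raw assignee string -> standard name.
STANDARD_NAME = {
    "Acme Corp.": "ACME",
    "ACME Corporation": "ACME",
    "Globex KK": "GLOBEX",
}

# Hypothetical citing families and the raw assignee name found on each.
CITING_FAMILIES = {
    "family A": "Acme Corp.",
    "family C": "ACME Corporation",
    "family D": "Globex KK",
}


def citing_families_per_assignee(citing_families, standard_name):
    """Count citing families per assignee after name standardization.
    Raw names missing from the table are kept unchanged."""
    return Counter(standard_name.get(raw, raw) for raw in citing_families.values())


counts = citing_families_per_assignee(CITING_FAMILIES, STANDARD_NAME)
print(counts)
```

Without the lookup, “Acme Corp.” and “ACME Corporation” would be counted as two separate citing companies; with it, they roll up into one assignee with two citing families.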
It surprised me that this database headlined the workshop at PIUG, because the PCI file has not even been loaded into Thomson Innovation yet; you can only access it through Dialog or STN. I hear that an implementation of PCI is in the pipeline, though. (I wish I could attribute this rumor, but sadly all I have is a big circled area in my notes saying “PCI on Thomson Innovation is IN THE PIPELINE!!!”)
It was also a pleasant surprise to see this value-added file on display at the conference because I’ve always wondered why it didn’t seem to have a very high profile in the patent search community. I think the analysis angle is a particularly interesting way to use the file. What do you think – had you been using PCI long before I hastily scrawled my notes, and if so, what’s your take on the best way to use it?
This post was contributed by Intellogist Team member Kristin Whitman.