The Future of Patent Translations: Human or Machine?

The EPO recently partnered with Google to offer free machine translation of patents into multiple languages on espacenet.  According to an EPO news release from March 24, 2011, “the EPO will use Google Translate technology to offer translation of patents on its website into 28 European languages, as well as into Chinese, Japanese, Korean and Russian.”   This news made me wonder: what is the future of human translation of patent documents?  Can professional translators compete with the speed and lower prices of machine translation services?

I decided to ask an expert, Sonja Olson, Director of Translation Services at Landon IP.  If anyone would know how human translation compares to machine translation, Sonja would know.  I also found some useful articles that highlighted the importance of professional input in the patent translation process, especially if the document will be used for legal purposes.  Read on to see what insights the journal articles and my conversation with Sonja provided about the future of patent translation!

Benefits of Machine Translation

Tim Cavalier illustrates the benefits of machine translation in his article “Perspectives on machine translation of patent information” in Volume 23, Issue 4 of World Patent Information (December 2001).  According to Cavalier, machine translation (also known as MT) can lessen the workload for human translators by decreasing the number of drafts that professional translators need to complete.  Cavalier describes how “in many cases, MT can definitely be used to automate the first translation (draft) stage and quite often completely removes the need for undertaking any second (post-editing) phase” (367).  Cavalier also discusses the lower cost and greater efficiency of machine translation: “the migration of patent information away from traditional manual translation methods leads to much more cost-effective and timely production methods” (369).

Machine translation is a useful tool for human translators, it’s less expensive than human translators, and the process is faster.  However, Cavalier concludes that machine translations should only be used for limited purposes.  According to Cavalier, “free or low-cost services may be fine for finding out whether the information is of any interest (i.e. browsing),” but researchers should use “manual translations when higher quality translations (e.g. legal) are required” (370-371).  Why is human (manual) translation so important?

Why Human Translation Can Never Become Obsolete

Steve Vlasta Vitek discusses the importance of human translation of patent documents in his article “Reflections of a Human Translator on Machine Translation or -Will MT Become the ‘Deus Ex Machina’ Rendering Humans Obsolete in an Age When ‘Deus Est Machina?’”, published in Volume 4, No. 3 of Translation Journal (July 2000).  Vitek acknowledges the benefits of the cheaper machine translation: “It is much cheaper to use this option—the average cost for translating a machine-translated patent is about 60 dollars, while the average cost of human translation is at least ten times higher.”  However, Vitek also highlights the main reason why machines can’t replace humans when it comes to creating high-quality translations:

The problem is that the machine does not understand the meaning of the document at all. Therefore, although most of the technical terms used by a machine will be correct, it is up to the reader to make sense of those words haphazardly jumbled up together by a non-thinking machine. (Vitek 2000)

Vitek comes to the same conclusion as Cavalier: machine translation is an excellent tool for researchers to use when browsing patent documents for relevance, but a professional translator should be used when translating the document for official purposes.  According to Vitek, patent searchers should translate documents “‘on the cheap’ by a machine and then ask a human translator to translate one or two important patents as evidence of prior art design.”

Vitek’s journal article was published a decade ago, and translation technology has greatly improved over the last 10 years.  Tools like Google Translate use statistical learning techniques to improve their machine translations by “detecting patterns in documents that have already been translated by human translators” (from “Inside Google Translate”).  However, even with these advances, translation tools still make many mistakes.

A recent article by Adam Wooten of the Deseret News, “Google Translate has great uses, disastrous misuses,” discusses some of the consequences of glitchy machine translations (quote originally found at a very informative blog post from Beyond Search):

A newspaper mistranslation repeatedly misquoted a former president of Kazakhstan as referring to the important issue of ‘passing gas.’ Israeli journalists nearly sparked an international incident when they seemed to insult a Dutch diplomat’s mother in a machine-translated message. Finally, an automatically translated furniture tag contained a racist slur that seriously offended customers in Toronto, Canada.

Obviously, machine translation technology isn’t perfect yet.

A Patent Translator’s Opinion

I wanted to hear an expert’s opinion first-hand, so I sat down with Landon IP’s Director of Translation Services Sonja Olson and Translation Services Manager Andreas Zierold to discuss the pros and cons of machine translation.  Sonja quickly summarized her opinion: If you need to know if a document contains information on a topic, you can go with a machine translation.  If you need to know how that topic relates to the document overall, then go with a professional translation.

Sonja described how IP professionals are usually divided into two camps: lawyers, who have a positive view of machine translation, and translators, who have a not-so-positive view.  According to Sonja, “machine translation will get the gist of the document, but it will lose the nuance.  Machine translation can plug words together, but it can’t understand the sentence as a whole.”

When I asked Sonja about her thoughts on the partnership between the EPO and Google, she raised one main concern.  Sonja described how the Google Translation tool is trained to translate patents by feeding English documents and the equivalent non-English applications into the system, and the system will learn that these documents are equivalent. However, “post-editing the translation for filing in certain target countries may introduce structural differences” between the documents.  Sonja acknowledged that Google will probably account for this issue, but it is a problem that should definitely be addressed.

Machine or Human?

Sonja concluded that machine translation for the EPO documents is fine for informational and reference purposes, but it shouldn’t be used for filing or legal applications.  Sonja Olson, Tim Cavalier, and Steve Vlasta Vitek all seem to be in agreement that machine translation is a useful tool for researchers quickly browsing documents for relevance, but human translators are essential for translating documents needed for legal purposes.  A machine can create a rough translation of vocabulary and grammar, but humans can translate the meaning and structure of the document.

Have you ever encountered strange errors in machine translations of patents?  What are your views on the EPO’s partnership with Google Translate?  Let us know in the comments!

Technical Translations from Landon IP

This post was contributed by Joelle Mornini. The Intellogist blog is provided for free by Intellogist’s parent company Landon IP, a major provider of patent searches, trademark searches, technical translations, and information retrieval services.

19 Responses

  1. As a patent attorney who entered the profession 30 years ago while being a Japanese-to-English technical translator, I feel some confidence in commenting on J-E MT as being “not ready for prime time” even in 2011.
    What’s good about MT? – price (low, sometimes even free) and speed (the ability to check very quickly whether a word of interest – I wouldn’t necessarily say a concept of interest – appears in the document), if the document is available for MT.
    What’s bad? – MT requires the source document in electronic text form (whereas alphabetic languages can rely much more on converted images) and poor quality of translation (because of the major differences between grammars, much more of a problem than between European languages). And vocabulary is a huge problem, among other things because of the Japanese habits of (a) using foreign words (mostly, but not solely, from English) and transliterating them – they don’t necessarily transliterate back into the original; (b) using transliterated words in odd ways – “handoru” (“handle”) means “steering wheel”, and (c) following the Germanic style of forming lengthy compound nouns – where the whole word is often much more than the sum of the translated parts.
    Even little things can hang up J-E MT – for example, the way words are not spaced in Japanese. My favorite is “Cat Cafe Leon” (http://nekocafe-leon.com/), where the slogan is “Cat Cafe Leon, where you can play with a cat” (“neko to asoberu neko cafe reon”). Not complex, but Google Translate renders it as “Kafere temperature play cat and cat”, among other things regrouping “cafe” in katakana and “reon” in hiragana together as “kafere” (untranslatable) and “on” (temperature). If all you want to know is whether the word “cat” appears, fine, but don’t look for meaning.
    I’d love to hear from others on the European language situation.

    • Thank you so much for your insight on the Japanese-to-English translation perspective. From my own experience with Spanish-to-English translation, slang or odd use of English or “Spanglish” terms within a Spanish-language context will always throw off machine translations. A human translator can work around these obstacles and actually produce a coherent, meaningful translation.

  2. As a professional patent specialist for more than 17 years, I would say MT will never understand Chinese, especially while the volume of Chinese applications worldwide is increasing rapidly. Very often, Chinese writers use certain wordings to express a meaning, and that meaning is kept even when the sequence of the wordings is changed. How about that! Can MT replace manual translation? You tell me.

    • I personally don’t think MT translation can ever replace manual translation, especially if you need an accurate and understandable translation!

  3. Interesting post. I am the guy who wrote the article about machine translation for Translation Journal 10 years ago. (Would you mind correcting the spelling of my last name? It’s Vitek, not “Vistek”).

    Although as far as I can tell, the quality of machine translation has not changed much since I wrote that article, MT is now used by hundreds of millions of people worldwide, including my family in Europe – my brother and nieces use it to read my blog, for instance.

    I use it for example when I translate Japanese chemical patents with complicated terminology because then I don’t have to look up names of long compounds in the dictionary or on the Web.

    I hope you will take a look at my blog and leave a comment once in a while.

    For example, I wrote a short post with a title that is similar to the title of this post, see below.

    http://patenttranslator.wordpress.com/2011/03/20/what-is-the-future-of-translation-in-the-translation-of-the-future/

    • Thank you so much for commenting! (And I’m sorry about the spelling mistake, it’s been corrected.) I believe I actually found your article through the “Articles of Interest” section on your blog; the blog was one of the main search results when I was researching information on the topic.

  4. Thanks for the correction.

    I put a link to your blog in my blog roll and plan to read it frequently.

    Best regards,

    Steve Vitek

  5. Good morning:

    I’ve seen your article late; however, I would like to leave my opinion. The company I work for – Vivanco & García – has specialized in patent translation since 1995, so we believe we have real background on this topic. We have a complex quality system (besides the ISO 9001 and EN15038 standards) which is almost unique, to the extent that a translation that passes the SAE J2450 metric does not pass our quality tests. I’m saying this just so you understand the meaning of quality in our company.

    That said, we are all in the market and of course we are suffering huge price pressure, so yes, we have been investigating how to reduce our cost structure in order to provide more competitive rates without losing quality.

    We have tried to increase our speed by using CAT tools, and we don’t mean just buying one license and trying it: we set up the whole infrastructure, including a dedicated TM server, two translators involved full time with this project, and thousands of hours aligning documents manually. After almost 2 years, we are about to give up this project due to poor results.

    Pre-translated contents using our TMs are below 6% and in terms of speed and reliability, our translators are faster using the “old fashioned” system than when using CAT tools. One of them reached the same production level after 5 months, but never increased speed. In terms of quality, our certified VGM-WordMetrix-100 index did not show any improvement, was the same.

    About two weeks ago we did another trial using machine translation (MT), not Google’s one, and the results were, again, pretty poor.

    We won´t say anything related to Google, I believe the opinions given here are pretty clear, nothing else to add.

    Bear in mind that sometimes a misplaced comma changes the meaning and/or the scope of claims. Sometimes, claims are drafted ambiguously and such ambiguity has to be kept in your translation, which involves mastering the domain you are translating. Sometimes you just get a document poorly translated to English, especially from Korea, China and Japan, and you have to translate it into other languages…

    How does MT handle these? We believe it just doesn’t.

    Of course technology is there, we can’t ignore it and yes, it’s improving! However, at this stage, it can be used exclusively for information purposes.

  6. @Vivanco & García

    Contrary to what many people believe, especially those who don’t know anything about translation, MT is not and never will be a substitute for human translation.

    But MT is extremely useful to me for a number of reasons.

    I use it to estimate the number of English words that will be in my translation, and I also use it as a dictionary – if I print out the MT product, I don’t have to look up words in dictionary or online as often.

    In my experience, Google Translate does a much better job with German than with Japanese, but it is still not comparable to human translation and probably never will be.

    The public does not understand that MT is only software that can be used for a number of purposes, but usually not to replace human translators.

    MT can be used instead of a human translator only if the quality of the translation is not very important.

  7. I think we’re all in agreement that MT isn’t very well-suited to complex technical text, such as patents. That may or may not change in the future … we’ll simply have to see how the technology evolves!

    Regarding Vivanco & Garcia’s remarks about Computer-Aided Translation (CAT) tools, I’d say that that’s an entirely different topic, which does warrant some discussion.

    For the uninitiated, a CAT tool is comprised of a database, which collects and stores translated sentences. As the (human) translator progresses through the file, the CAT tool examines a source sentence for translation, compares it to existing translated sentences in the database, and displays any exact or “fuzzy logic” matches for the translator’s consideration. When the translator finishes a sentence, it is automatically added to the database.
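
To make the lookup step described above concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular CAT product works: the `TranslationMemory` class, the 0.75 threshold, and the sample sentences are all hypothetical, and the similarity score comes from the standard library's `difflib.SequenceMatcher` rather than from a commercial matching algorithm.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: stores (source, target) sentence pairs."""

    def __init__(self, threshold=0.75):
        # Hypothetical cutoff below which a stored sentence is not offered.
        self.threshold = threshold
        self.pairs = {}

    def add(self, source, target):
        """Store a finished sentence so later lookups can reuse it."""
        self.pairs[source] = target

    def lookup(self, source):
        """Return (score, stored_source, stored_target) tuples, best first."""
        matches = []
        for stored_source, stored_target in self.pairs.items():
            score = SequenceMatcher(
                None, source.lower(), stored_source.lower()
            ).ratio()
            if score >= self.threshold:
                matches.append((score, stored_source, stored_target))
        return sorted(matches, key=lambda m: m[0], reverse=True)


tm = TranslationMemory()
tm.add("The device comprises a cover.", "Die Vorrichtung umfasst einen Deckel.")

exact = tm.lookup("The device comprises a cover.")        # identical sentence
fuzzy = tm.lookup("The device comprises a metal cover.")  # near match
unrelated = tm.lookup("Claims are drafted ambiguously.")  # nothing to reuse
```

An exact match scores 1.0, the near match scores somewhat lower but is still offered to the translator, and the unrelated sentence returns nothing. That last case is the situation with varied prior art, where the memory has little to offer.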

    When dealing with reference/informational translation of prior art, I agree that CAT tools seldom add much value: the source material is simply too varied for any significant leverage, even if you are fortunate enough to have accessible source text, rather than a scanned PDF. Of course, you can use Optical Character Recognition (OCR) software to generate text from such a PDF, but the results are usually shaky at best. It’s often more efficient to work the old-fashioned way, as Vivanco & Garcia suggests.

    On the other hand, when a customer is frequently translating applications related to similar technology, CAT tools can offer significant advantages. When we see an assignee filing applications for improvements to their existing patented technology, for example, then CAT tools would be very useful … using the original translated applications for leverage might then be a cost-effective solution.

    Ultimately, we have to evaluate each translation project individually and apply the most effective solution to the specific problem at hand!

    Sonja Olson
    Director, Translation Services
    Landon IP, Inc.

  8. Leonid Gornik, translator, former patent attorney
    I like the blog, and I accept almost everything above.
    I started translating patents in 1964. It is 2011 today. Technology advance can’t be ignored. The advent of computers was a giant step after decades of typewriting (I used mechanical typewriters, one for Cyrillic and the other for English). Now we have TM software. I have avoided it until recently, but I bought Memo Q half a year ago, and it’s working fine. A futile thing for patent translations though unless you are dealing with the same IPC class/group all the time. Now there is MT out there. I tried the Google tool on simple texts – not very impressive and often misleading. How about a German claim with many clauses? How about the choice of terms and even words? What about the word “head” that has so many different translations in some other languages? I also have certain doubts about a “great” help of MT in browsing (pre-search). Search is a probabilistic thing, and it is very important to know what you are looking for. Yes, you give the stupid machine some keywords or meta words, but when I used to conduct patent search (for 20+ years) in the patent library using my hands and eyes only, I was looking not only for exact matches, but “around.” It is like you are driving your car looking forward while using your side vision. I used to find “gems” very often not exactly in the places Google would go. The cost is a very important consideration, but how about the cost of missing something or mistranslating? My expectation is that post-editors for Google or for people who want to have their cost cut are going to be as cheap as “machine translators.” This is a global tendency anyway. I am not so sure that professional translators will be highly enthusiastic to do the clean-up job after computers. The whole idea of using the databases of human-made translations is a part of Google’s world-wide drive to grab any intellectual property there is out there on the web for free (and I believe that translation is indeed intellectual property).
    I would add that the databases already contain a lot of mistranslations and blunders. I have seen a lot of that in EPO/PCT abstracts translated from other languages. The errors will multiply like bugs … We know that patent specifications are formalized and contain many formal elements, but for the rest the description is as complicated as any scientific paper. I can’t say that MT will never replace human translators (never say never), but it has a long way to go until the time an artificial intellect with elements of human intuition is created and put to work. I am not sure I will see that.

  9. The quality of the translation depends not only on the type of MT, but also on the type of document and the language combination. Legal documents are handled better than literary translation. Should human translators ever be replaced by machines, legal ones would go before literary translators. More about our study at:

    http://blog.bab.la/2011/10/31/promt-systran-google-bing-%E2%80%93-has-the-age-of-machine-translation-finally-arrived/

  10. I work as a patent attorney in Asia. I would split the uses of translation into three categories.

    1. Filing a patent application. Only a fool would use a machine translation at present; it’s not good enough.

    2. Arguing with the Examiner about relevance of prior art which is not in English – machine translation will do where the issues are simple. If the issues or document are in any way complicated, you are shooting in the dark.

    3. Infringement clearance. It is possible to give advice based on a machine translation, but very risky if the patent is remotely close to what your client is doing. We did because the client wouldn’t pay for a translation. However, I wouldn’t recommend it even for German to English, never mind Chinese to English.

    In my opinion more effort should have been put on making human translations available. For years it was necessary to file human translations to validate European patents in each country. However, the translation was only available by filing in an official form and making a request to the national patent office and then waiting six weeks. Not practical for most purposes.

    These translations still exist on hundreds of thousands of patents which are in force. For these cases the EPO should focus on making these human translations available through their website, rather than providing machine translations. In my opinion the public and local companies in Europe have been robbed, by the insistence of a few multinationals that everyone understands English so translation of patents is not necessary.

  11. Great post.
