Friday Fun: Is IBM’s Watson the future of search?


This week the New York Times published a fascinating in-depth piece about Watson, a new supercomputer created by IBM specifically to answer the type of vague, allusive riddles posed on the quiz show “Jeopardy!”  The show may even host the supercomputer as a contestant as early as fall 2010.   Anyone interested in search technology, and specifically natural language query processing, should definitely give this article a look.

It’s not just that Watson can answer simple, factual questions – it’s that the machine can even put together two concepts, as in a typical Jeopardy! question that requires contestants to name a combination of a candy bar and a Supreme Court Justice – “Baby Ruth Ginsburg.” There are no major breakthroughs in search technology to report; instead, the machine is able to answer the questions in the short time frame given to contestants by virtue of its enormous computing power and memory. But it doesn’t use one single search algorithm to answer questions. Instead, Watson analyzes its knowledge base using many different statistical methods, generating various possible answers. If multiple methods pinpoint the same result, that result has a higher chance of being correct and gets a higher “confidence” score from the computer. Once a candidate answer passes an acceptable confidence threshold, Watson will buzz in and answer the question (of course, the machine will use the famous “What is ___ ?” format that the game show requires).
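To make the confidence idea concrete, here is a minimal Python sketch. It is not IBM’s actual method: the function names, the score averaging, and the 0.6 threshold are all assumptions made for illustration. Each “method” is represented simply as a dictionary of candidate answers with scores between 0 and 1, and a candidate proposed by more methods ends up with a higher combined confidence.

```python
from collections import defaultdict


def combine_candidates(scorer_outputs):
    """Average each candidate's score across all scorers (hypothetical scheme).

    scorer_outputs is a list of dicts mapping candidate answer -> score in [0, 1].
    A scorer that did not propose a candidate contributes 0 for it, so answers
    found by several methods end up with a higher combined confidence.
    """
    totals = defaultdict(float)
    for output in scorer_outputs:
        for answer, score in output.items():
            totals[answer] += score
    count = len(scorer_outputs)
    return {answer: total / count for answer, total in totals.items()}


def decide(scorer_outputs, threshold=0.6):
    """Return a Jeopardy-style response, or None to stay silent.

    The threshold value is an illustrative assumption, not a known setting.
    """
    confidences = combine_candidates(scorer_outputs)
    if not confidences:
        return None
    best, confidence = max(confidences.items(), key=lambda item: item[1])
    if confidence >= threshold:
        return f"What is {best}?"
    return None


# Made-up scores from three imaginary methods.
example_outputs = [
    {"Baby Ruth Ginsburg": 0.9, "Baby Ruth": 0.4},
    {"Baby Ruth Ginsburg": 0.8},
    {"Baby Ruth Ginsburg": 0.7, "Ruth Bader Ginsburg": 0.5},
]
print(decide(example_outputs))
```

Averaging is only one way to merge scores; the point of the sketch is that agreement between independent methods raises confidence, and that the system stays silent when no candidate clears the bar.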

After checking out the article, you can also play an interactive trivia game,  simulating a round of “Jeopardy!” with Watson.  I was amazed at the questions Watson was able to answer, but just as interested to learn about the questions it missed, and why.

The obvious practical application for Watson’s technology would be providing quick decision support. For example, a later-generation system might analyze the body of medical research to supply fast answers for doctors, or provide the first line of customer support at a virtual call center. For patent and prior art searching, a program that can produce a quick match is always interesting, but when it fails to find one, we’re often still in the position of having to prove a negative. However, we also stand to benefit from any improvements in search technology that arise from Watson’s development.

What are your ideas about how Watson’s technology might one day benefit the IP industry?  Let us know in the comments!

This post was contributed by Intellogist team member Kristin Whitman.

4 Responses

  1. I wrote this for a discussion group I am on that discusses cultural psychology but decided not to send it. It is kind of long, hope that is okay. It is a little opinion piece on search technology spurred by the article in the NY Times on the Watson device.

    – Steve

    There is a long Sunday, June 20, 2010 NY Times article entitled “What Is I.B.M.’s Watson?” by Clive Thompson. IBM now has a computer program named “Watson” that can play a respectable game of ‘Jeopardy!’, a feat unheard of a few years ago. Watson, as I see this supercomputer + program, is basically a massively expanded Wikipedia that uses statistical analysis to group facts to try to come up with the likeliest answer to any factual question. It makes lots of mistakes, but also gets enough right, and quickly enough, to at least win some of the time against good Jeopardy players.

    My own inclination is that some commentators seem too impressed with the wrong things about this machine. (One comment following the NYTimes article I did really like: “A well-trained man can answer almost any question. A thoughtful man knows which questions are worth asking.”)

    Answering trick questions is good for creating the illusion this machine can “think,” and pleases the proponents of AI. And it keeps alive the hope that machines can learn to talk like humans, a grand hope, especially in the much needed area of language translation. Of course, this computer, like those before it, can’t really “think,” not in the sense that any living organism is a thinking body (Spinoza), let alone think or talk the way a human does. IBM’s Watson supercomputer/program is not alive in any way whatsoever. (In fact, the IBM ads cleverly say this machine is “designed to rival” the human mind – an important admission). It is just a tool, an extension of human organs, although certainly an amazing one. Of course, it is always fun to see and imagine machines acting like humans – I will always love that sci-fi theme, and our attempts are always intriguing.

    And I still get a kick out of the old Eliza program, which psychoanalyzes you by turning all your responses into an intrusive question. Watson’s creators create the illusion this expensive machine can understand questions. It can’t. It does not have reactions to the world in any living sense. But this remarkable machine can, at lightning speeds, associate and relate data from enormous quantities of sources in hundreds of ways at the same time. It is a super-encyclopedia.

    So what is really interesting to me is not how Watson answers **tricky** questions, but how it seems to demonstrate new possibilities for computers assisting humans in finding answers to relatively **clear** ones. We are currently living through two amazing generations of technology in this regard – the printed reference book (the science book, dictionary, encyclopedia, library catalog, etc.) and now their digital forms in Google, Bing, Wikipedia, Google Books, etc. We library and computer users are amazed every day at how one can get an extraordinary number of factual questions accurately answered in short periods of time. Even a single science book or article reflects these capacities for instant scholarship, which have exploded in the past couple of centuries, and to new heights in the last couple of decades.

    The Watson superprogram seems to be the beginning of a possible extension of the new computer generation to a whole new level – showing an ability to not just organize factual statements in books, and not just search for keywords among vast arrays of documents on screen, but now search through masses of documents for **statistical patterns** among the words and phrases.

    This approach is turning out to be surprisingly useful. A popular and simplified version of this approach can be found, for example, on Wordle.com, a site which creates an artistic-looking version of the main words in a document, making the most frequent words the largest. This kind of program is sort of like an aerial photograph, helping us see things we know are true, but can’t fully see from the ground. Watson is a very high level version of this approach to language.

    I don’t know if Watson is going to be able to beat clever humans at the buzzer aspect of Jeopardy, apparently one of its weaker talents. But if computer programs and big computers like Watson can become accessible to the population, these tools will surely be faster than anyone can google, thumb through, copy, highlight, skim, etc. – and do so through far more documents than one can google through in an afternoon. If Google and the Wikipedia and their colleagues in the web allow us to pursue 10 factual questions for every one we could look up in an encyclopedia (not to mention do it in our kitchens!), with sufficient articles and books to search, Watson-wordling might be able to help us gain that kind of speed 1,000 fold.

    To the extent we are asking good questions and **we** are understanding the results (remember, Machine Watson understands nothing), we may learn new ways of getting new kinds of aerial snapshots that will help us soar even faster and higher over our subject matters.

    In other words, this Watson-wordle world strikes me as an entry into a new ability of the machine to assist human intellectual labor. It is not thinking, it is not “research,” it is not “intelligence,” it is not language – okay, I know some are bristling at that claim – but however one feels about that, it is a potentially powerful way of using machines to help us see patterns in masses of written materials that can help humans do what machines **cannot** do – ask better and better questions.

    Imagine, for example, if a Watson-wordle program could access the entire library of Google Books – which grows by the hour – and address some straightforward question (for example, “who holds the position that machines can think?”). And here is its glory over Google. While it isn’t understanding a question, it is searching for a **complex phrase** with important directions. “Who” and “holds the position” and “machines can think.” A smart program like Watson can take these pieces and search millions of pages. Maybe not yet the whole Google library, but maybe some of it. The point is, it will be able to do more than just look up a word or word combination – it will be able to look up phrases, and synonyms of those phrases – and then tell us how frequently those phrases and similar phrases appear, and **where**. And it will be able to use statistical formulas to boil its search into coherent answers. For example, it will, at some point, be able to look over the relevant bibliographies of sources and list authors, articles, books that frequently appear in connection to the original phrases (the “question”).

    The key, of course, is to what extent the Watson program can access these articles, chapters, books, reviews, and usefully digest them into useful parts. I am hoping it will be able to do this. And that is when the fun really begins. That initial “report” then helps the researcher create even **more** relevant questions directed toward their purpose, and then do another search. And another. Perhaps Watson 9.0 may even have some software that can help suggest various lines of questioning to the researcher, helping to keep them organized.

    The process is repeated, more questions are asked, different combinations of information are assembled, and the researcher goes this way or that – just like we do with Wikipedia when we are browsing threads – only on a higher level. With watson-wordling, I am imagining, we can survey through much larger amounts of literature, and we can more confidently search for **ideas** and not just **words**. Let me say that again – ideas, and not just words. Ideas in the form of phrases and sentences, and not just words in the form of strings.

    Will this make us lazy? LOL I don’t think so. It probably won’t cut down on our reading at all – it will probably increase it! But it might help us zero in on more of what we really want to read. It is more likely to make us more curious and more interested in pursuing information – because it will help us make the connections we need to better understand it.

    My prediction is that something very interesting would begin to emerge if the researcher sticks with this kind of inquiry. As more precise questions are posed along the way, the answers to the questions would start becoming less and less straightforward. As the questions got sharper, the answers would begin to get less linear, curvier, more complex, more multi-dimensional. A unique pattern would emerge in this topic area as the investigator discovered the specific places where the scientific literature was not so clear, where controversies hit stalemates, where evidence was murky, where the **collective** knowledge reached limits, where things weren’t so certain, where old and new ideas were in serious conflict, where science was reaching its edges. The **real** horizons of the topic would begin to emerge. Same thing with any topic – literary criticism, etc. etc.

    Now, when a reader takes a **closer** look at specific scholarly articles, after beginning with this larger survey, the same edges and horizons that the authors had to deal with may begin to get more alive and more real to the reader. They would be able to see more of the history behind the area of study – the waves of ideas that preceded it – and the limitations of the work today – the unsolved problems – and the various possible futures of the questions. This is what happens with anyone who seriously studies something, whether it is rose gardening or theoretical xenobiology. As they learn the history of their craft, art or science, they become engaged in its issues.

    My point here is that if a robot program like Watson could scan the general library of humanity using its statistical search techniques that “rival the human mind,” it could not only help us more speedily find the “factual” (accepted, authoritative, received) answers to our questions, but also help us ask better questions that will reveal the **limitations** of those answers. It will help us grasp not only fact, but theory. And conflicting theories.

    And in this process, such tools could help each of us be more than just individuals confronting the limitations of our **own** knowledge, but also become participants in exploring the edges of the **collective** knowledge of humanity – and to perhaps help, in one way or another, to expand it.

    The lack of this sense of being on the growing edge of humanity’s true expansion – as a species, and as part of an historical world culture of countless sub-cultures – is one of the great tragedies of modern youth and society. Far too few of us feel like we are any kind of a part of that, or that we ever could. Books, digital ‘pedias, and perhaps some super-wordles that can wire us into the vast store of human knowledge can help turn that around. It is time for the peoples of this planet to become the inheritors and masters of their planet and their cultures, and that includes our books, which programs like Watson can help us study.

    So when will such watson superwordles, and such generally-available-for-search digital libraries, as I am imagining, become available? I don’t know. But I hope it won’t be **too** long. In the meantime, we amuse ourselves with gameshows pretending machines are contestants – while we treat far too many people like machines. We need to find ways to get magnificent devices like the Watson-wordle-pedia into people’s real hands, minds and lives, where they belong, and can help make them not just spectators in front of screens, but real actors.

    – Steve

  2. Thanks for posting that very thoughtful analysis, Steve! I do agree that the fanfare surrounding Watson’s current abilities is overshadowed by the value that Watson’s technology can one day bring to search in general. I think that a flashy novelty like Watson is beneficial in that it draws attention to what we can do and inspires people to think of other (more useful) applications, while simultaneously generating a boatload of good press for IBM.

    Your way of looking at the benefit of Watson as a way to delineate the boundaries of our current knowledge is a take on this that I had not thought of before. If and when Watson technology searches the whole web, the possibilities for what it can be used for are greatly extended. I think that providing fast answers in a high stress environment, such as analyzing the incredible volume of new medical research to help doctors make quick decisions, is only the first obvious application of what good semantic search can do. I hope we do see the day when Watson can actually point us to the horizons of our knowledge and show us where the gaps are.

    Thanks again for posting your response!

  3. […] Friday Fun: Is IBM’s Watson the future of search? – This interesting post looks at a curious (and unbelievable?) future in information science. […]

  4. […] we’ve looked at new search technology this year including Xyggy (also see this follow-up) and IBM’s Watson (which is set to play Jeopardy next […]
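
For readers curious about the mechanics behind two ideas in Steve’s comment above, the Wordle-style word count and the search for a phrase and its equivalents that reports how often they appear and where, here is a rough Python sketch. It is purely illustrative and has nothing to do with Watson’s real implementation: the function names are hypothetical, the “synonyms” are supplied by hand as a list of equivalent phrasings, and the documents are a plain dictionary of title to text.

```python
import re
from collections import Counter


def word_frequencies(text, stopwords=frozenset()):
    """Count how often each word occurs, Wordle-style (ASCII letters only)."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in stopwords)


def locate_phrases(documents, phrase_variants):
    """Report which documents contain any of the equivalent phrasings.

    documents maps a title to its text; phrase_variants is a hand-written
    list of phrasings treated as the same idea. Returns title -> number of
    occurrences, leaving out documents with no match. Variants that overlap
    each other would be counted twice in this simple version.
    """
    patterns = [
        re.compile(r"\b" + re.escape(phrase.lower()) + r"\b")
        for phrase in phrase_variants
    ]
    hits = {}
    for title, text in documents.items():
        lowered = text.lower()
        count = sum(len(pattern.findall(lowered)) for pattern in patterns)
        if count:
            hits[title] = count
    return hits


# Tiny made-up corpus for demonstration.
library = {
    "Essay A": "Machines can think. Some say machines can think.",
    "Essay B": "Computers are able to think, he argued.",
    "Essay C": "Roses need water.",
}
print(word_frequencies("the cat and the hat", stopwords={"and"}).most_common(1))
print(locate_phrases(library, ["machines can think", "computers are able to think"]))
```

A real system would need far more than string matching to find “ideas, and not just words,” but even this toy version shows the shape of the output Steve imagines: how frequently a phrase and its variants appear, and in which sources.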
