The Miracle Called IBM Watson

IBM Watson – Technology or Magic?

 

Isaac Asimov, the science fiction author, wrote a trilogy called “Foundation”, published in the early 1950s. As retold here, Foundation is about a scientist named Hari Seldon who picks a group of high-IQ people from different fields at the very early age of 8 to 10 years and creates a civilization on an uninhabited planet. A supercomputer governs this civilization. Since all the people follow known behavioural trends, this computer not only analyses the characters and their offspring, but also governs them silently. At any given time it can predict who is going to be their leader, how long that leader is going to rule and who the successor will be. It can predict the course of the entire civilization for the next 150 years. When an issue arises, the computer can anticipate it and provide the solution. It learns from the current civilization to prepare its predictions for the next 150 years.

Now the entire story is far-fetched, but it seems plausible thanks to Watson. That is the power of Watson. Its artificial intelligence is not as accurate as the one depicted in the fiction, but it is a starting point.

 

The Miracle Called Watson

 

IBM Watson can analyse the data fed into it and come up with a prediction. This is not an easy task for any computer or logic. It really pains us when somebody thinks Watson just answers queries. Watson is not a single product or a piece of code; it is an IBM (marketing) brand used for a whole range of products and services.

Please don’t confuse a framework with an algorithm. TensorFlow is a software library that can be used to implement a number of machine learning algorithms. It’s the algorithm itself that matters, not the framework. TensorFlow is just a library that helps with parallelism, which is only useful in a handful of cases.
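To make the point concrete, here is a minimal sketch of one machine learning algorithm (gradient descent fitting a straight line) written in plain Python with no framework at all. The function name, the learning rate and the toy data are assumptions made up for this illustration; nothing here is taken from Watson or from TensorFlow.

```python
def fit_line(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 / n * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = 2 / n * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (invented for illustration).
w, b = fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
print(f"w = {w:.3f}, b = {b:.3f}")
```

A framework would mainly change how fast this runs on large data, not what the algorithm does.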

IBM developers – as far as I know – are a bit indifferent when it comes to libraries. They rely heavily on (and contribute to) open source and will use whatever works best. A lot of the components/algorithms they use are much older than TensorFlow and most machine learning libraries. If you ask me, they probably have built most of this stuff from scratch without using any particular framework.

IBM Watson is a cognitive-computing-based artificial intelligence supercomputer that uses unstructured big data as a source.

Watson is a question-answering computer system capable of answering questions posed in natural language, developed in IBM’s DeepQA project by a research team led by principal investigator David Ferrucci. Watson was named after IBM’s first CEO, industrialist Thomas J. Watson. The computer system was specifically developed to answer questions on the quiz show Jeopardy!

In 2011, Watson competed on Jeopardy! against former winners Brad Rutter and Ken Jennings.

Watson won and received the first-place prize of $1 million.

Watson had access to 200 million pages of structured and unstructured content consuming four terabytes of disk storage including the full text of Wikipedia, but was not connected to the Internet during the game. For each clue, Watson’s three most probable responses were displayed on the television screen. Watson consistently outperformed its human opponents on the game’s signaling device, but had trouble in a few categories, notably those having short clues containing only a few words.
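The idea of showing the three most probable responses can be sketched in a few lines: score every candidate answer, sort by confidence, and keep the top three. This is only an illustration of the ranking step, not Watson's actual code; the function name, the candidate answers and the confidence values are all invented.

```python
def top_responses(confidences, k=3):
    """Return the k candidate answers with the highest confidence, best first."""
    ranked = sorted(confidences.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical candidates and confidence scores, invented for illustration.
candidates = {"Chicago": 0.62, "Toronto": 0.14, "Omaha": 0.11, "Boston": 0.05}

for answer, confidence in top_responses(candidates):
    print(f"{answer}: {confidence:.0%}")
```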

In February 2013, IBM announced that the Watson software system’s first commercial application would be for utilization management decisions in lung cancer treatment at Memorial Sloan Kettering Cancer Center, New York City, in conjunction with health insurance company WellPoint. According to IBM at the time, 90% of nurses in the field who used Watson followed its guidance.

At its core, Watson is a complex NLP system. Many of the processes involved are rule-based. For example, Lucene builds a variety of indices, based on rules, as one of 20+ pre-processing steps for corpus content, i.e., the documents that contain the domain knowledge.
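To give a feel for what an index over corpus content is, here is a minimal sketch of an inverted index in plain Python. It is not Lucene's API (Lucene is a Java library with far richer analysis and scoring) and it is not Watson's pipeline; the function names and the two sample documents are assumptions invented for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Rule-based step: lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical corpus, invented for illustration.
docs = {
    "doc1": "Treatment guidelines for lung cancer patients.",
    "doc2": "History of the quiz show and its former winners.",
}
index = build_index(docs)
print(search(index, "lung cancer"))
```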

In a second phase, humans provide examples of implicit rules. Each example pairs a textual query with a portion of the corpus (a question and its answer), essentially telling Watson that when it sees the same query after training, it should respond with the indicated area of the corpus.

The challenge is that Watson, like NLP in general, is a non-deterministic system based on probabilities. The training process above is repeated thousands of times, and the algorithms build up probabilities for the relationship between a text query and an area of the corpus.
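A toy version of this training loop can be sketched as follows, assuming the simplest possible model: count how often each query word was paired with each area of the corpus, then turn the counts into probabilities. The class name, the training pairs and the area labels are hypothetical, and a real system such as Watson uses far more sophisticated features and models than word counts.

```python
from collections import Counter, defaultdict


class QueryAreaModel:
    """Toy model relating query words to areas of a corpus by counting."""

    def __init__(self):
        # counts[token][area] = number of training queries pairing them.
        self.counts = defaultdict(Counter)

    def train(self, query, area):
        """Record one human-provided example: this query maps to this area."""
        for token in query.lower().split():
            self.counts[token][area] += 1

    def predict(self, query):
        """Return a probability for each area, based on the accumulated counts."""
        scores = Counter()
        for token in query.lower().split():
            for area, count in self.counts.get(token, {}).items():
                scores[area] += count
        total = sum(scores.values())
        if total == 0:
            return {}  # No word of the query was seen during training.
        return {area: score / total for area, score in scores.items()}


# Hypothetical training pairs, invented for illustration.
model = QueryAreaModel()
model.train("what treats lung cancer", "oncology")
model.train("lung cancer therapy options", "oncology")
model.train("history of lung surgery", "history")

probabilities = model.predict("lung cancer treatment")
print(probabilities)
```

With more training examples the counts, and therefore the probabilities, shift, which is why the same query can get a different answer before and after further training.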

Some experts will suggest that IBM Watson is a failure, and others will tell you that it is the biggest technological marvel ever. The debate may go on forever; the lesson is to take the positives from Watson and build on them.

Harnessing its powers is the way forward.