Answering the tough questions: Watson vs. Humans


IBM have always been on the cutting edge of innovation. They’ve moved from being merely a computer company to what is probably the first truly all-encompassing technology company. They don’t just make fancy gadgets or shiny thingamajigs; they make actual solutions for real-world problems.

In 1996, IBM introduced the world to Deep Blue. Garry Kasparov met Deep Blue and wasn’t impressed, and he had no reason to be: he defeated Deep Blue 4-2 and walked away comfortably.

However, in 1997, IBM re-introduced the world to the second version of Deep Blue (unofficially named Deeper Blue), and this time Kasparov was beaten, but not by much. Kasparov is the Tiger Woods, Pele and Michael Jordan of the chess world, and he was beaten by a supercomputer with 11.38 GFLOPS of power.

It turns out, though, that we had nothing to be afraid of. Chess is, after all, a pretty simple game when you break it down: the number of possible moves is finite, as is the number of possible scenarios to play out. It’s not an easy game to master, but as it turns out, playing chess is infinitely easier than just plain talking.

In fact, of all the talking games, Jeopardy seems the most difficult. At the end of this post, I will make an argument to show that Jeopardy — a simple talking game — is about 6,500 times more difficult than Chess (a game we often associate with genius). Turns out Kasparov has to bow to Ken Jennings.

Talking Computers

Talking has always been the bane of computers. I remember playing with Dr. Sbaitso as a kid and thinking it was really cool that Dr. Sbaitso had all this information stored on a couple of 1.44 MB floppy disks. However, Dr. Sbaitso wasn’t exactly the best companion, and after a while the limited number of responses based on keywords meant that Dr. Sbaitso was the Siri of its generation: nice to talk to once in a while, but overall quite useless.

So after putting Kasparov out of business, IBM turned their scientific muscle to language and chose to challenge themselves with the game of Jeopardy!. For those unfamiliar with the game, here’s what Wikipedia has to say:

Jeopardy! is an American television quiz show featuring trivia in history, literature, the arts, pop culture, science, sports, geography, wordplay, and other topics. The show has a unique answer-and-question format in which contestants are presented with clues in the form of answers, and must phrase their responses in question form.

Jeopardy! isn’t a straightforward trivia game like Who Wants to Be a Millionaire. It has 3 added layers of complexity:

1. Contestants have to choose the category of each question from the 6 categories on the board.

2. Different questions have different dollar values, signifying different risk levels.

3. Questions aren’t phrased in a straightforward way. In other words, even if I gave you unlimited time and unlimited access to Google, you’d still have to figure out what searches to begin with, and you might still give me the wrong answer. If I gave you Google for Who Wants to Be a Millionaire, you’d be a millionaire.

Watson: The Jeopardy playing Computer

In 2011, an IBM computer named Watson took on the best Jeopardy! contestants of all time, and Watson won by a wide margin. This represented a significant step for IBM and the world. These weren’t questions with a clear set of rules (like chess); this was human language, something we’d thought would forever remain accessible only to humans. Watson was taking computers closer to humanity.

Watson was able to take an improperly formed question, understand it, contextualize it, and search through its database to generate hypotheses, before eliminating the bad answers and arriving at the ‘correct’ answer along with a probability that the answer was correct. If the probability was low, Watson wouldn’t answer the question; if the probability was high, Watson would answer. He wasn’t always right, but he was right a lot!
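To make the "only answer when you're confident" idea concrete, here is a toy sketch in Python. It is not Watson's actual code or architecture: the Hypothesis class, the choose_answer function, the evidence scores, and the 0.5 threshold are all made up for illustration. It just turns the scores of a few candidate answers into a rough confidence and stays silent when that confidence is too low.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Hypothesis:
    """A candidate answer with a made-up, non-negative evidence score."""
    answer: str
    evidence_score: float


def choose_answer(
    hypotheses: List[Hypothesis], threshold: float = 0.5
) -> Optional[Tuple[str, float]]:
    """Return (answer, confidence) for the best hypothesis, or None to stay silent.

    Confidence here is simply the best score's share of the total score.
    That is an assumption made for this sketch, not how Watson computed it.
    """
    total = sum(h.evidence_score for h in hypotheses)
    if total <= 0:
        return None  # no hypotheses, or no evidence at all: don't buzz in

    best = max(hypotheses, key=lambda h: h.evidence_score)
    confidence = best.evidence_score / total
    if confidence < threshold:
        return None  # too unsure: better to keep quiet than lose money
    return best.answer, confidence


if __name__ == "__main__":
    candidates = [
        Hypothesis("Who is Garry Kasparov?", 8.0),
        Hypothesis("Who is Bobby Fischer?", 2.0),
    ]
    print(choose_answer(candidates))
```

With one candidate clearly ahead, the function answers; with three equally scored candidates, each gets a one-third share, which falls below the threshold, so it returns None and the toy "contestant" doesn't buzz in.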

The videos speak for themselves: this is an amazing step. Think of it as Google answering your actual question instead of spewing out links you have to visit to get the answer. Our children will have a very different understanding of what a search engine is because of Watson. Search engines would instead be called answer engines, providing answers instead of search results. *Of course, search engines would still have to exist for porn and BitTorrent downloads.

IBM have already begun focusing their efforts on getting Watson to a level where he can medically diagnose a person based on the information available. Unlike an actual doctor, Watson can trawl through far more data, like family history, geographic location, and time and date, to come up with a more holistic answer. Add to that the fact that medical literature doubles every 5 years, and only a computer would realistically be able to keep up with the additional medical data that flows through the medical community every year.

According to the IBM country manager I spoke to, Watson is now ‘technically’ a 3rd year medical student. It’s now a question of when Watson will graduate rather than if, and that should disrupt the medical community quite a bit 🙂

Who knows what the next 10 years hold for Big Blue? They’ll probably develop a computer that can get itself elected to Parliament. It’s a realistic outcome if you ask me, and I for one welcome our computer overlords.

Not all questions have one answer

Watson may be able to diagnose a patient based on probably 15 volumes of data, but at the end of the day Watson is being honed on questions with just one answer. Jeopardy! never has a question with 2 or 3 answers; only one answer is right and all other answers are wrong.

That’s not the way the world works. Sometimes we have questions with many possible answers; it’s called ambiguity, and that’s what humans are used to. I’d be very interested to see whether Watson could accurately diagnose a patient with 2-3 diseases. If you had 2-3 diseases, would a Jeopardy! algorithm trained on single-answer questions be good enough to diagnose you?

So what happens when you have open-ended questions? Well, you have to turn to humans.

For instance: How do we stop global warming?

How do we solve the energy crisis? How do we fix the financial markets?

These are important real-world questions that have more than one answer, with interdependence between those answers. These are questions that Watson in its current form cannot answer and, based on its architecture, I predict will never be able to answer. But who knows, Big Blue might surprise me.

So I guess for the foreseeable future, we’ll still need humans.

Oh, and by the way

In 1997, Deep Blue had just under 12 GFLOPS of power, making it a supercomputer, but not something in the big league of supercomputers. In 2011, Amazon Web Services spun up a supercomputer on their cloud with 240 TeraFLOPS of power. That’s 20,000 times more power for under USD1500 an hour. That puts things into perspective for cloud computing.

Watson, on the other hand, had about 80 TeraFLOPS of power, less than the Amazon supercomputer but about 6,500 times more than Deep Blue. This isn’t just about raw computing power, but it’s a good indication of the difference between playing chess and playing Jeopardy!: Jeopardy! is 6,500 times more difficult than chess.

Of course that’s a simplistic way of looking at it, but it just shows you how much more processing power is involved in contextualizing an improperly formed question over playing a game with straightforward rules.
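For anyone who wants to check the back-of-the-envelope maths, here is a small Python sketch. It only uses the figures quoted in this post (11.38 GFLOPS for Deep Blue, 80 TeraFLOPS for Watson, 240 TeraFLOPS for the Amazon cluster); the speedup function and variable names are my own, purely for illustration. Note that the exact ratio depends on whether you plug in 11.38 GFLOPS or round up to "just under 12", so it lands a little above my rounded 6,500, which doesn't change the point.

```python
GIGA = 1e9
TERA = 1e12

# Figures as quoted in this post (floating-point operations per second).
deep_blue_flops = 11.38 * GIGA
watson_flops = 80 * TERA
aws_cluster_flops = 240 * TERA


def speedup(newer_flops: float, older_flops: float) -> float:
    """How many times more raw computing power the newer machine has."""
    return newer_flops / older_flops


watson_vs_deep_blue = speedup(watson_flops, deep_blue_flops)
aws_vs_deep_blue = speedup(aws_cluster_flops, deep_blue_flops)

print(f"Watson vs Deep Blue: roughly {watson_vs_deep_blue:,.0f}x")
print(f"AWS cluster vs Deep Blue: roughly {aws_vs_deep_blue:,.0f}x")
```

Using 12 GFLOPS instead of 11.38 gives the rounder 20,000 times for the Amazon cluster. Either way, it's a ratio of raw hardware power, not a real measure of how hard each game is.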

Looks like my super computer overlords will have to wait.
