IBM's Watson Gets A 'Swear Filter' After Learning The Urban Dictionary

 @redletterdave
on January 10 2013 3:36 PM
Logo of Watson, the supercomputer from IBM (NYSE:IBM) that understands natural language. IBM Corp.

Watson, the IBM supercomputer best known for crushing "Jeopardy!" contestants at their own game, briefly went from "smart" to "smart ass" with the help of the Urban Dictionary.

According to Eric Brown, an IBM research assistant and the "brains" behind Watson, he and his 35-person team wanted to get IBM's supercomputer to sound more like a real human. In Brown's mind, what better way to learn the intricacies of informal human communication and conversation than having Watson memorize the Urban Dictionary?

The Urban Dictionary, for those who don't know, is made up of submissions from everyday people and regulated by volunteer editors, who are given an extremely small set of rules to maintain quality control.

But for the most part, even with the help of human editors, the Urban Dictionary still turns out to be a rather profane place on the Web. The Urban Dictionary even defines itself as "a place formerly used to find out about slang, and now a place that teens with no life use as a burn book to whine about celebrities, their friends, etc., let out their sexual frustrations, show off their racist/sexist/homophobic/anti-(insert religion here) opinions, troll, and babble about things they know nothing about."

However, the Urban Dictionary has a few useful definitions, including Internet abbreviations like OMG, and slang that humans use every day, such as calling someone a "hot mess." Brown believed Watson could be more human if it could learn these kinds of language complexities, so in 2011, shortly after Watson's reign as "Jeopardy!" champ, Brown taught Watson the Urban Dictionary.

What could've been another landmark for Watson -- being able to participate in and enjoy a full conversation using natural, informal human language -- turned out to be a step in the wrong direction.

Watson may have learned the Urban Dictionary, but it never learned the all-important axiom, "There's a time and a place for everything." Watson simply couldn't distinguish polite discourse from profanity.

Watson unfortunately learned all of the Urban Dictionary's bad habits, including throwing in overly crass language at random points in its responses; Watson even reportedly used the word "bullshit" in an answer to one researcher's question. Brown told Forbes that Watson picked up similarly bad habits from reading Wikipedia.

In the end, Brown and his team were forced to remove the Urban Dictionary from Watson's vocabulary, and they also developed a smart filter to keep Watson from swearing in the future.

For now, Watson will keep doing what it's great at: helping hospitals diagnose sick patients based on their records and symptoms, and beating the snot out of game show participants. If Watson's brief stint with the Urban Dictionary teaches us anything, it's that artificial intelligence will need a long time to learn the complicated, ever-changing ins and outs of human communication.
