The Michael Scott Algorithm
by Matthew Burnside on May 3, 2011
You’ve got to give it to Chloé Kiddon and Yuriy Brun. That’s what she said. These two computer scientists at the University of Washington created my new favorite system for analyzing natural language. Okay. You caught me. It’s my first favorite system for analyzing natural language. They call it DEviaNT, or Double Entendre via Noun Transfer, and it lets a computer decide when to add “that’s what she said” to a sentence. With roughly 70% accuracy, DEviaNT identified sentences that potentially contained euphemisms following certain patterns. Effectively, they were pinpointing potential setups for the punchline that someone always seems to overuse.
By analyzing nouns, adjectives, and verbs as boolean data (data that is either true or false), in this case classified as double entendres or not, they determined the level of sexiness contained within the text. Then, if the sentence was structured correctly, they determined whether it worked as a metaphor. If the process decided it did, the computer could jam it in there. Say it with me now, “That’s what she said!”
In a wonderfully math-y way, which you can read in their paper here, they determined the accuracy. In the paper they generalize it quite well. When someone says something that could be a TWSS joke but does not state the punchline, no one cares. In fact, you probably prefer it. In that case, the accuracy is not affected. Yet, if someone says something that should not be a TWSS joke and does state the punchline, you call them out and tell them to shut their stupid mouth before you sue them for sexual harassment. This is a negative mark on the accuracy. As stated earlier, they managed to produce an accuracy of roughly 70%, although with more data they predict they could achieve 99.5% accuracy.
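For the curious, here is a minimal sketch of that scoring idea, which is usually called precision: of all the times the punchline gets dropped, how many were actually deserved? This is not the authors' code. The function name and the ten labels below are made up purely for illustration. Notice that the missed jokes (the system stays quiet when it could have spoken) never enter the math.

```python
def precision(predicted, actual):
    """Fraction of flagged sentences that really were TWSS setups.

    predicted[i] is True when the system adds the punchline to sentence i.
    actual[i] is True when sentence i really was a valid setup.
    Missed jokes (predicted False, actual True) are deliberately ignored.
    """
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    flagged = true_pos + false_pos
    # If the system never speaks up, report 0.0 rather than dividing by zero.
    return true_pos / flagged if flagged else 0.0


# Hypothetical labels for ten sentences (invented for this example).
predicted = [True, True, True, True, True, True, True, False, False, False]
actual = [True, True, True, True, True, False, False, True, True, False]

# Seven punchlines delivered, five deserved, two cringe-worthy.
print(f"precision: {precision(predicted, actual):.3f}")
```

With these invented numbers the system lands five of its seven punchlines, or about 71%, and the two jokes it let slide cost it nothing.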
I understand many of you may think this is a stupid thing to develop. Everyone has that boss or coworker who doesn’t understand why it is hilarious when Michael Scott utters (or sadly, uttered) the phrase, but not when he or she does. However, by mapping metaphors in this way, the formula could be generalized to other double entendres and brands of humor. Detecting humor, such as sarcasm in texts or comments on blogs, has proven to be an incredibly difficult problem in computer science and in life. Considering Steve Carell’s departure from The Office last week, this is a fun and well-timed step in the progression of computational linguistics.
And if anyone is wondering, yes, I cried during Michael Scott’s last episode. I’m still a man, dammit. I have hair on my chest, I urinate out of a penis, and I eat meat. That’s what she said.
via New Scientist
Is there an empty void in your life? Following Matthew on Twitter won’t fill it, but it will distract you from it.