Computers still need humans, after all

3/18/2013

BY STEVE LOHR
NEW YORK TIMES NEWS SERVICE

Trading stocks, targeting ads, steering political campaigns, arranging dates, besting people on Jeopardy, and even choosing bra sizes: Computer algorithms are doing all this work and more.

But increasingly, behind the curtain there is a decidedly retro helper — a human being.

Although algorithms are growing ever more powerful, fast, and precise, the computers themselves are literal-minded, and context and nuance often elude them. Capable as these machines are, they are not always up to deciphering the ambiguity of human language and the mystery of human reasoning. Yet these days they are being asked to be more humanlike in what they figure out.

“For all their brilliance, computers can be thick as a brick,” said Tom M. Mitchell, a computer scientist at Carnegie Mellon University.

And so, while programming experts still write the step-by-step instructions of computer code, additional people are needed to make more subtle contributions as the work the computers do has become more involved. People evaluate, edit, or correct an algorithm’s work. Or they assemble online databases of knowledge and check and verify them — creating, essentially, a crib sheet the computer can call on for a quick answer. Humans can interpret and tweak information in ways that are understandable to both computers and other humans.

Question-answering technologies like Apple Inc.’s Siri and IBM Corp.’s Watson rely particularly on the emerging machine-man collaboration. Algorithms alone are not enough.

Twitter Inc. uses a far-flung army of contract workers, whom it calls judges, to interpret the meaning and context of search terms that suddenly spike in frequency on the service.

For example, when Mitt Romney talked of cutting government money for public broadcasting in a presidential debate in the fall and mentioned Big Bird, messages with that phrase surged. Human judges recognized instantly that “Big Bird,” in that context and at that moment, was mainly a political comment, not a reference to Sesame Street, and that politics-related messages should pop up when someone searched for “Big Bird.” People can understand such references more accurately and quickly than software can, and their judgments are fed immediately into Twitter’s search algorithm.
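How a human judgment might be fed into a search ranking can be sketched in a few lines. This is an illustration only, not Twitter's system: the function names, the topic labels, and the sample messages are all hypothetical, and the sketch assumes each message already carries a topic tag.

```python
from collections import Counter


def majority_label(judgments):
    """Return the most common label among human judgments, or None if empty."""
    if not judgments:
        return None
    return Counter(judgments).most_common(1)[0][0]


def rank_messages(messages, query, query_labels):
    """Return messages containing the query, with the human-chosen topic first.

    Assumption: every message is a dict with "text" and "topic" keys.
    """
    needle = query.lower()
    label = query_labels.get(needle)
    matches = [m for m in messages if needle in m["text"].lower()]
    # sorted() is stable, so messages keep their original order within
    # the "matches the label" and "does not match" groups.
    return sorted(matches, key=lambda m: m["topic"] != label)


# Hypothetical judgments from three contract workers about a spiking term.
judgments = ["politics", "politics", "television"]
query_labels = {"big bird": majority_label(judgments)}

messages = [
    {"text": "Big Bird sings the alphabet", "topic": "television"},
    {"text": "Romney mentions Big Bird in the debate", "topic": "politics"},
    {"text": "Unrelated message", "topic": "sports"},
]
ranked = rank_messages(messages, "Big Bird", query_labels)
```

In this sketch the judges decide what the term means at that moment, and the software only applies their decision to the ordering.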

“Humans are core to this system,” two Twitter engineers wrote in a blog post in January.

Even at Google Inc., where algorithms and engineers reign supreme in the company’s business and culture, the human contribution to search results is increasing.

Google uses human helpers in two ways. Several months ago, it began presenting summaries of information on the right side of a search page when a user typed in the name of a well-known person or place, like “Barack Obama” or “New York City.” These summaries draw from databases of knowledge such as Wikipedia, the CIA World Factbook, and Freebase, whose parent company, Metaweb, Google acquired in 2010. These databases are edited by humans.

When Google’s algorithm detects a search term for which this distilled information is available, the search engine is trained to go fetch it rather than merely present links to Web pages.
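The "go fetch" behavior amounts to checking a human-edited table before falling back to ordinary links. The sketch below is a rough illustration under stated assumptions, not Google's implementation: the table, the placeholder summary text, and the stand-in search function are invented for the example.

```python
# Hypothetical hand-edited knowledge base, keyed by normalized query.
CURATED_SUMMARIES = {
    "new york city": "Placeholder summary text maintained by human editors.",
}


def normalize(query):
    """Lowercase the query and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def answer(query, web_search):
    """Return links for the query, plus a curated summary when one exists."""
    links = web_search(query)
    summary = CURATED_SUMMARIES.get(normalize(query))  # None if not curated
    return {"summary": summary, "links": links}


def fake_web_search(query):
    """Stand-in for a real search backend; returns one made-up link."""
    return ["https://example.com/search?q=" + query.replace(" ", "+")]


known = answer("  New York   City ", fake_web_search)
unknown = answer("what does king hold", fake_web_search)
```

Here `known` carries a summary alongside its links, while `unknown` has links only, because no editor has written an entry for that query.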

“There has been a shift in our thinking,” said Scott Huffman, an engineering director in charge of search quality at Google. “A part of our resources are now more human curated.”

Other human helpers, known as evaluators or raters, help Google develop tweaks to its search algorithm, a powerhouse of automation, fielding 100 billion queries a month. “Our engineers evolve the algorithm, and humans help us see if a suggested change is really an improvement,” Mr. Huffman said.

Katherine Young, 23, is a Google rater — a contract worker and a college student in Macon, Ga. She is shown an ambiguous search query like “what does king hold,” presented with two sets of Google search results and asked to rate their relevance, accuracy, and quality. The current search result for that imprecise phrase starts with links to Web pages saying that kings typically hold ceremonial scepters, a reasonable inference.

Her judgments, Ms. Young said, are “not completely black and white; some of it is subjective.” She added, “You try to put yourself in the shoes of the person who typed in the query.”
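The side-by-side comparison that raters perform can be tallied very simply. The following is a minimal sketch, assuming each rater reports a preference for result set "A" (current), "B" (proposed), or "same"; the 0.6 threshold and the function names are illustrative assumptions, not anything Google has described.

```python
def preference_share(ratings):
    """Share of decisive ratings that favor B; None if no rater chose a side."""
    a_votes = ratings.count("A")
    b_votes = ratings.count("B")
    if a_votes + b_votes == 0:
        return None
    return b_votes / (a_votes + b_votes)


def is_improvement(ratings, threshold=0.6):
    """Treat the proposed change as an improvement if enough raters prefer B.

    The threshold is an arbitrary value chosen for this example.
    """
    share = preference_share(ratings)
    return share is not None and share >= threshold


# Hypothetical ratings from five raters for one query.
ratings = ["B", "B", "A", "same", "B"]
share = preference_share(ratings)
accepted = is_improvement(ratings)
```

A tally like this captures only the final vote; the subjective part, deciding what the person who typed the query wanted, stays with the rater.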

IBM’s Watson, the powerful question-answering computer that defeated Jeopardy champions two years ago, is in training these days to help doctors make medical diagnoses. But it too is turning to humans for help.

To prepare for its role in assisting doctors, Watson is being fed medical texts, scientific papers, and digital patient records stripped of personal identifying information. Instead of answering questions, however, Watson is asking them of clinicians at the Cleveland Clinic and medical school students. They are giving answers and correcting the computer’s mistakes, using a “Teach Watson” feature.

Watson, for example, might come across this question in a medical text: “What neurological condition contraindicates the use of bupropion?” The software may have bupropion, an antidepressant, in its database, but stumble on “contraindicates.” A human helper will confirm that the word means “do not use,” and Watson returns to its data trove to reason that the neurological condition is a seizure disorder.

“We’re using medical experts to help Watson learn, make it smarter going forward,” said Eric Brown, a scientist on IBM’s Watson team.

Ben Taylor, 25, is a product manager at FindTheBest.com, a fast-growing start-up in Santa Barbara, Calif. The company calls itself a “comparison engine” for finding and comparing more than 100 topics and products, from universities to nursing homes, smartphones to dog breeds. Its Web site went up in 2010, and the company now has 60 full-time employees.

Mr. Taylor helps design and edit the site’s education pages. He is not an engineer but an English major who has become a self-taught expert in the arcane data found in Education Department studies and elsewhere. His research methods include talking to and emailing educators. He is an information sleuth.

On FindTheBest, more than 8,500 colleges can be searched quickly according to geography, programs, and tuition costs, among other criteria. Go to the page for a university, and a wealth of information appears in summaries, charts, and graphics — down to the gender and race breakdowns of the student body and faculty.

Mr. Taylor and his team write the summaries and design the initial charts and graphs. From hundreds of data points on college costs, for example, they select the ones most relevant to college students and their parents. But much of their information is prepared in templates and tagged with code a computer can read. So the process has become more automated, with Mr. Taylor and others essentially giving “go fetch” commands that the computer algorithm obeys.

The algorithms are getting better. But they cannot do it alone.

“You need judgment, and to be able to intuitively recognize the smaller sets of data that are most important,” Mr. Taylor said. “To do that, you need some level of human involvement.”