What is CAPTCHA?
By Clint Swett, The Sacramento Bee, Calif.
Feb. 20 - Nearly everyone who’s tried to register for a Web site has struggled to recognize a series of distorted letters and numbers seemingly plucked from a Salvador Dali dreamscape.
That could be changing soon, though, because automated programs called “bots” are thwarting these online defenses, no matter how unreadable they appear to humans.
“The harder we work to fool automated tools, the more difficult the sites are for people to use,” wrote analyst Rob Enderle in an e-mail.
Consequently, researchers have been developing alternatives to the contorted characters. In some cases, the new tests require users to separate dogs from cats. In others, researchers are identifying words in books that computers can’t make out, then scanning in images of the words for people to decipher.
The original idea for such tests grew out of a proposal made by Alan Turing, a British mathematician often considered the father of computer science. He suggested a theoretical test to judge whether computers were good at imitating humans.
Today, the name of the original test gives a nod to Turing: Completely Automated Public Turing test to tell Computers and Humans Apart.
The CAPTCHA system, as it’s known, remains in wide use. No one is certain how proficient bots have become at defeating it because many companies are loath to admit security breaches.
A Web site run by the Ukraine-based OCR Research Team (ocr-research.org.ua) boasts of developing software that defeated CAPTCHA defenses at such well-known sites as PayPal, MySpace and Friendster.
Aleksey Kolupaev, who works on the OCR-Research site, told the New York Times that his work highlights weaknesses in the original test system and ultimately makes the Internet more secure.
But maybe not secure enough. In one high-profile case last November, Ticketmaster won an injunction against RMG Technologies of Pittsburgh. In court papers, Ticketmaster alleged that software developed by RMG had allowed ticket brokers to breach CAPTCHA and other online security, swooping in to scoop up popular concert tickets ahead of the general public.
Thwarting such activities is “a cat and mouse game and has been ever since we first began selling tickets online,” wrote Ticketmaster spokeswoman Bonnie Poindexter in an e-mail.
“Cat and mouse” is an apt analogy for John Douceur, a researcher with Microsoft Corp. in Seattle who’s developing a CAPTCHA alternative based on recognizing animal photos.
It’s called Animal Species Image Recognition for Restricting Access, or ASIRRA.
When trying to register for a Web site, the user is shown 10 photographs of dogs and cats and asked, for instance, to click only cat pictures. If the user clicks correctly, the user is allowed past the defense.
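The test described above can be sketched in a few lines of Python. This is only an illustration of the idea as the article describes it, not Microsoft's code: the function names, the photo IDs and the all-or-nothing pass rule are assumptions made for the example.

```python
import random

def make_challenge(cat_ids, dog_ids, size=10, rng=random):
    """Pick `size` photos at random and remember which ones are cats.

    cat_ids and dog_ids are hypothetical photo identifiers; a real
    service would draw them from its own labeled photo database.
    """
    pool = [(pid, "cat") for pid in cat_ids] + [(pid, "dog") for pid in dog_ids]
    shown = rng.sample(pool, size)
    photo_ids = [pid for pid, _ in shown]
    answer = {pid for pid, label in shown if label == "cat"}
    return photo_ids, answer

def passes(clicked_ids, answer):
    """Assumed rule: the user passes only by clicking every cat and no dogs."""
    return set(clicked_ids) == answer

def blind_guess_odds(size=10):
    """Chance that coin-flip guesses label every one of `size` photos correctly."""
    return 0.5 ** size
```

Under these assumptions, a bot that guesses cat or dog at random for each of the 10 photos gets all of them right about once in 1,024 tries, which is what blind_guess_odds(10) computes.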
“Recognizing a three-dimensional object in natural lighting is very difficult (for a computer) to do,” Douceur said. “And dogs and cats in particular are very close.”
To make it even tougher for hackers, the pictures are pulled from a database of more than 3 million photos from petfinder.com. Many of the pictures are out of focus, some are of animals looking through bars of pet cages, or pets dressed in little outfits. Such variety makes it even more difficult for a bot to identify the images, Douceur said.
He also said informal tests have shown that people are less irritated by animal puzzles than by distorted letters, meaning that Web sites can use the technology more freely.
ASIRRA is still in the developmental stage and is being used for security by a handful of nonprofits. Douceur said he is not aware of any plans by Microsoft to adopt the technology for Hotmail or any of its other services.
Another alternative that’s closer to the original CAPTCHA seems to be getting more traction.
Called reCAPTCHA, the system draws upon words culled from old books and publications that are being scanned into digital archives.
When software is unable to recognize the words, because the ink is faded or smeared, the images are transmitted to the reCAPTCHA project at Carnegie Mellon University in Pittsburgh.
There they are used in the online puzzles, and when users correctly identify them, they are sent back to the digital archives project and inserted into the database.
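The round trip described above, from unreadable scan to puzzle to archive, might look something like the following Python sketch. The article does not say how the project decides that a reading is correct, so the example assumes a simple rule: a word is accepted once a set number of users have typed the same thing. The WordQueue class, its method names and the min_votes threshold are all hypothetical.

```python
from collections import Counter

class WordQueue:
    """Hypothetical queue of scanned words that software could not read."""

    def __init__(self, min_votes=3):
        self.min_votes = min_votes  # assumed agreement threshold
        self.votes = {}             # image_id -> Counter of typed readings
        self.resolved = {}          # image_id -> accepted text

    def add_unreadable(self, image_id):
        """Register a word image sent over by the scanning project."""
        self.votes.setdefault(image_id, Counter())

    def submit(self, image_id, typed):
        """Record one user's reading; return the accepted text once enough agree."""
        if image_id in self.resolved:
            return self.resolved[image_id]
        reading = typed.strip().lower()
        tally = self.votes.setdefault(image_id, Counter())
        tally[reading] += 1
        if tally[reading] >= self.min_votes:
            # Enough users agree: this is the text sent back to the archive.
            self.resolved[image_id] = reading
            del self.votes[image_id]
            return reading
        return None
```

With min_votes set to 2, for instance, two users typing "harbour" for the same image would settle it, while a lone "harbor" would be outvoted.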
Carnegie Mellon professor Luis von Ahn, one of the developers of CAPTCHA and now heading the reCAPTCHA project, said that while some bots are becoming adept at reading computer-generated CAPTCHA words, they have a much harder time with smeared or faded ink found in old publications.
So, he said, reCAPTCHA serves two purposes. First, it fools bots. And second, it harnesses the collective power of CAPTCHA users to identify the faded words.
Perhaps the biggest threat to CAPTCHA-protected sites, von Ahn said, comes from organizations, primarily overseas, that are said to hire low-wage workers to get past the character puzzles.
“That speaks to the effectiveness of CAPTCHAs,” he said. “If somebody has to resort to that, they must be hard to break.”
Copyright (c) 2008, The Sacramento Bee, Calif.
Distributed by McClatchy-Tribune Information Services.
