How Google exploits people with CAPTCHAs

A CAPTCHA is a test used to distinguish between humans and computers. It's mainly used to avoid spam.

One program that implements this test is reCAPTCHA, which was released on 27 May 2007 and acquired by Google in September 2009 [1]. It is used on websites all over the world.

reCAPTCHA has been used extensively to digitize texts thanks to the unpaid work of millions of users, who can identify words that computers cannot read. The vast majority of people don't know that Google is taking advantage of their work; others don't care.

reCAPTCHA used to digitize text

The article "Deciphering Old Texts, One Woozy, Curvy Word at a Time", published in The New York Times, reported that Internet users had finished digitizing the archives of that newspaper (published since 1851) using reCAPTCHA. As the creator of reCAPTCHA said at the time, the users, better described as "useds", deciphered around 200 million CAPTCHAs a day, spending around 10 seconds on each one. This corresponds to roughly 500,000 hours of work a day [2].
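
The daily figure can be checked with a quick back-of-the-envelope calculation. The sketch below uses only the two numbers quoted above (200 million CAPTCHAs a day, about 10 seconds each). The function and variable names are illustrative assumptions and have nothing to do with reCAPTCHA's own code.

```python
SECONDS_PER_HOUR = 3600


def daily_work_hours(captchas_per_day: int, seconds_per_captcha: float) -> float:
    """Return the total human time spent solving CAPTCHAs per day, in hours."""
    return captchas_per_day * seconds_per_captcha / SECONDS_PER_HOUR


# Figures quoted in the article: 200 million CAPTCHAs a day, ~10 seconds each.
hours = daily_work_hours(200_000_000, 10)
print(f"{hours:,.0f} hours of work a day")
```

The result is a little over 555,000 hours a day, which is consistent with the rounded figure of 500,000 hours quoted in the article.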

A CAPTCHA can be based not only on hard-to-read text but also on images. For example, Google also uses reCAPTCHA to identify images of businesses, traffic signs, etc., taken for Google Maps. In addition, Google's reCAPTCHAs are used for other purposes unknown to its useds.

reCAPTCHA made for Google Maps

Only Google knows how much economic benefit this system of exploitation provides. It's impossible to audit reCAPTCHA because it's proprietary software, and the useds have no power over it. The only thing they can do is refuse to solve CAPTCHAs like Google's in order to stop being used.

Google now uses an identification method that saves its useds some time and spares them from deciphering texts and images. The useds only have to press a button. This mechanism requires running JavaScript code unknown to the useds, which could be a serious privacy risk. Google can obtain a lot of information from the useds who go through this mechanism, which it will probably sell for a substantial sum of money.

reCAPTCHA button

Furthermore, reCAPTCHA discriminates against disabled users and Tor users: on the one hand, the challenges presented to disabled people are longer and more difficult to solve; on the other hand, those who browse the web privately have to solve a more difficult challenge that takes more time.

The article written by the creator of reCAPTCHA [3] when it was acquired by Google read:

Improving the availability and accessibility of all the information on the Internet is really important to us, so we're looking forward to advancing this technology with the reCAPTCHA team.

Nevertheless, the work done using reCAPTCHA isn't available or accessible in most cases. The data are presented in a way that economically benefits Google and other companies. The useds who helped digitize the archive of The New York Times have to pay by watching ads when they consult the archive they themselves helped to digitize, without getting anything in return.

  1. reCAPTCHA. Wikipedia. Retrieved 2017-5-5. 

  2. Deciphering Old Texts, One Woozy, Curvy Word at a Time (2011-3-28). The New York Times. Retrieved 2017-5-5. 

  3. Teaching computers to read: Google acquires reCAPTCHA (2009-9-16). Official Google Blog. Retrieved 2017-5-5.