Fuck…. "You've been training Google's AI for 15 years. You h...

npub1kyeml3tma4su8yw5aru48wgxclchp8zr3kguwhakmtmegjw40zws82sfjk
hex
ebe51cc5f243befd9d0ed36a1241495172af1af975e0164a89ae768535013ea5
nevent
nevent1qqswheguchey80han58dx6sjg9y4zu40rtuhtcqkf2y6ua59x5qnafgprpmhxue69uhhyetvv9ujuem4d36kwatvw5hx6mm9qgstzvalc4a76cwrj82w372nhyrv0utsn3pcmyw8t7md4au5f82h38g2x9m0m
Kind-1 (TextNote)
Fuck…. "You've been training Google's AI for 15 years. You had no idea. 500,000 hours of free human labor. Every single day. By people who thought they were just trying to log in to their bank.

reCAPTCHA is the most successful invisible data operation in internet history. 200 million people solved it daily at its peak. Almost none of them understood what they were actually building.

Waymo, Google's autonomous vehicle company, is worth $45 billion today. It got a critical portion of its training data from you. For free. On every website you've ever visited.

Here's the full story.

How it started: a clever idea
In 2000, spam bots were destroying the internet. Forums flooded. Inboxes crushed. Websites needed a way to separate humans from machines.

Carnegie Mellon professor Luis von Ahn solved it. He invented the CAPTCHA: a distorted word only a human could read. Bots failed. Humans passed.

But von Ahn saw something more. Millions of people were spending cognitive effort on these challenges. What if that effort could do two things at once?

In 2007, he launched reCAPTCHA. The twist: instead of random nonsense, it showed two words. One the system already knew. One scanned from a real book computers couldn't decipher yet. Your answer helped digitize it.

The books were from the New York Times archive. And Google Books. 130 million books worth.

You thought you were logging in. You were doing OCR for the world's largest digital library.

Google acquired reCAPTCHA in 2009.

Then Google changed the game
The squiggly-word era ended around 2012.

Google had a new problem. Street View cars were photographing every road on earth. But photos are raw data. For the AI to be useful, it needed to understand what it was seeing: signs, crosswalks, traffic lights, storefronts.

So Google redesigned reCAPTCHA v2. Instead of distorted text, it showed photo grids. "Click all squares with a traffic light." "Select every crosswalk." "Identify the storefronts."

Those images came directly from Google Street View.

Your clicks were the labels. Every selection told Google's computer vision model: this pixel cluster is a traffic light. This shape is a crosswalk.

You weren't passing a test. You were building a dataset.

The scale nobody talks about
At peak, 200 million reCAPTCHAs were solved every single day.

10 seconds per challenge. That's 2 billion seconds of human labor. Every day. 500,000 hours. Daily.

Paid data annotation costs $10 to $50 per hour. At the low end: $5 million in free labor extracted every single day.

And reCAPTCHA wasn't on one app. It was on every bank. Every government portal. Every e-commerce site. Every login page on the internet. You had no choice. Want to access your account? Annotate the dataset first.

Google didn't ask. Didn't pay. Didn't even tell you.

What all of it built
The data fed directly into two products.

Google Maps. The most used navigation tool on earth. Its ability to read street signs, identify businesses, and understand urban geography was built, in part, on billions of human annotations from people trying to log in to websites.

And Waymo.

Waymo is Google's self-driving car project, spun off as its own company in 2016. To navigate safely, a self-driving car needs to recognize thousands of visual patterns with near-perfect accuracy. Traffic lights. Crosswalks. Pedestrians. Stop signs.

The ground truth training data for that recognition? Annotated by millions of humans. Via reCAPTCHA. Without their knowledge.

Waymo completed over 4 million paid rides in 2024. It operates in San Francisco, Los Angeles, and Phoenix. It is expanding monthly. It is valued at $45 billion.

The foundation was built by unpaid internet users trying to check their email.

Why nobody could replicate this
Data annotation is expensive. Companies like Scale AI, Appen, and Labelbox exist entirely to solve it. They employ hundreds of thousands of workers to label images, sometimes for less than a dollar an hour.

Google solved this differently. They made annotation mandatory. Not for pay. Not with consent. As the price of entry to every site on the web.

The result: billions of labeled images. Global coverage. Every weather condition. Every time of day. Every city on earth.

No annotation company could build this. The internet itself was the factory. Every person on it was an employee who never signed a contract.

The version you're still doing today
reCAPTCHA v3, launched in 2018, doesn't show you a challenge at all. It watches how you move your mouse. How you scroll. How long you hover. Your behavioral fingerprint tells it whether you're human.

That behavioral data feeds back into Google's AI systems too.

You never opted in. There was never a box to check. You are still doing it right now, on most of the sites you visit.

The irony that should bother everyone
Luis von Ahn's original vision was brilliant: redirect the cognitive effort humans already spend on spam filters toward something useful. Digitize the world's books. Solve a real problem.

What Google did with that vision is something else.

They took a security mechanism users had no choice but to use, deployed it across the entire internet, and harvested the output to build commercial products worth tens of billions of dollars.

The users got nothing. Not even awareness.

The deepest irony: you spent years proving you were human. By doing exactly the kind of visual recognition work that AI could not do yet. The work that, once learned, made human visual annotation unnecessary.

You proved you were human. By making yourself replaceable”
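
The scale estimate in the note (200 million solves a day, 10 seconds each, annotation priced at $10 to $50 an hour) can be reproduced with a few lines of arithmetic. The sketch below takes the note's figures at face value and does not verify them; the constant and function names are made up for illustration. Note that the exact quotient is closer to 556,000 hours, which the note rounds down to 500,000.

```python
# Figures as quoted in the note; they are assumptions here, not verified data.
SOLVES_PER_DAY = 200_000_000   # claimed peak daily reCAPTCHA solves
SECONDS_PER_SOLVE = 10         # claimed time spent per challenge
HOURLY_RATE_LOW = 10           # claimed low-end annotation cost, USD per hour
HOURLY_RATE_HIGH = 50          # claimed high-end annotation cost, USD per hour


def daily_labor_hours(solves: int, seconds_each: int) -> float:
    """Convert a daily solve count into hours of human effort."""
    return solves * seconds_each / 3600


def daily_labor_value(hours: float, hourly_rate: float) -> float:
    """Price those hours at a given hourly annotation rate."""
    return hours * hourly_rate


hours = daily_labor_hours(SOLVES_PER_DAY, SECONDS_PER_SOLVE)
low = daily_labor_value(hours, HOURLY_RATE_LOW)
high = daily_labor_value(hours, HOURLY_RATE_HIGH)

print(f"{hours:,.0f} hours per day")
print(f"${low:,.0f} to ${high:,.0f} per day")
```

At the quoted rates this gives roughly $5.6 million a day at the low end and about $27.8 million at the high end, so the note's "$5 million" is the low-end figure rounded down.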
Raw JSON
{
"kind": 1,
"id": "ebe51cc5f243befd9d0ed36a1241495172af1af975e0164a89ae768535013ea5",
"pubkey": "b133bfc57bed61c391d4e8f953b906c7f1709c438d91c75fb6daf79449d5789d",
"created_at": 1773803054,
"tags": [],
"content": "Fuck…. \"You've been training Google's AI for 15 years. You had no idea. 500,000 hours of free human labor. Every single day. By people who thought they were just trying to log in to their bank.\nreCAPTCHA is the most successful invisible data operation in internet history. 200 million people solved it daily at its peak. Almost none of them understood what they were actually building.\nWaymo, Google's autonomous vehicle company, is worth $45 billion today. It got a critical portion of its training data from you. For free. On every website you've ever visited.\nHere's the full story.\nHow it started: a clever idea\n\nIn 2000, spam bots were destroying the internet. Forums flooded. Inboxes crushed. Websites needed a way to separate humans from machines.\nCarnegie Mellon professor Luis von Ahn solved it. He invented the CAPTCHA: a distorted word only a human could read. Bots failed. Humans passed.\nBut von Ahn saw something more. Millions of people were spending cognitive effort on these challenges. What if that effort could do two things at once?\nIn 2007, he launched reCAPTCHA. The twist: instead of random nonsense, it showed two words. One the system already knew. One scanned from a real book computers couldn't decipher yet. Your answer helped digitize it.\nThe books were from the New York Times archive. And Google Books. 130 million books worth.\nYou thought you were logging in. You were doing OCR for the world's largest digital library.\nGoogle acquired reCAPTCHA in 2009.\nImage\nThen Google changed the game\n\nThe squiggly-word era ended around 2012.\nGoogle had a new problem. Street View cars were photographing every road on earth. But photos are raw data. For the AI to be useful, it needed to understand what it was seeing: signs, crosswalks, traffic lights, storefronts.\nSo Google redesigned reCAPTCHA v2. Instead of distorted text, it showed photo grids. 
\"Click all squares with a traffic light.\" \"Select every crosswalk.\" \"Identify the storefronts.\"\nThose images came directly from Google Street View.\nYour clicks were the labels. Every selection told Google's computer vision model: this pixel cluster is a traffic light. This shape is a crosswalk.\nYou weren't passing a test. You were building a dataset.\nImage\nThe scale nobody talks about\n\nAt peak, 200 million reCAPTCHAs were solved every single day.\n10 seconds per challenge. That's 2 billion seconds of human labor. Every day. 500,000 hours. Daily.\nPaid data annotation costs $10 to $50 per hour. At the low end: $5 million in free labor extracted every single day.\nAnd reCAPTCHA wasn't on one app. It was on every bank. Every government portal. Every e-commerce site. Every login page on the internet. You had no choice. Want to access your account? Annotate the dataset first.\nGoogle didn't ask. Didn't pay. Didn't even tell you.\nImage\nWhat all of it built\n\nThe data fed directly into two products.\nGoogle Maps. The most used navigation tool on earth. Its ability to read street signs, identify businesses, and understand urban geography was built, in part, on billions of human annotations from people trying to log in to websites.\nAnd Waymo.\nWaymo is Google's self-driving car project, spun off as its own company in 2016. To navigate safely, a self-driving car needs to recognize thousands of visual patterns with near-perfect accuracy. Traffic lights. Crosswalks. Pedestrians. Stop signs.\nThe ground truth training data for that recognition? Annotated by millions of humans. Via reCAPTCHA. Without their knowledge.\nWaymo completed over 4 million paid rides in 2024. It operates in San Francisco, Los Angeles, and Phoenix. It is expanding monthly. It is valued at $45 billion.\nThe foundation was built by unpaid internet users trying to check their email.\n\n0:10 / 3:41\nWhy nobody could replicate this\n\nData annotation is expensive. 
Companies like Scale AI, Appen, and Labelbox exist entirely to solve it. They employ hundreds of thousands of workers to label images, sometimes for less than a dollar an hour.\nGoogle solved this differently. They made annotation mandatory. Not for pay. Not with consent. As the price of entry to every site on the web.\nThe result: billions of labeled images. Global coverage. Every weather condition. Every time of day. Every city on earth.\nNo annotation company could build this. The internet itself was the factory. Every person on it was an employee who never signed a contract.\nImage\nThe version you're still doing today\n\nreCAPTCHA v3, launched in 2018, doesn't show you a challenge at all. It watches how you move your mouse. How you scroll. How long you hover. Your behavioral fingerprint tells it whether you're human.\nThat behavioral data feeds back into Google's AI systems too.\nYou never opted in. There was never a box to check. You are still doing it right now, on most of the sites you visit.\nThe irony that should bother everyone\n\nLuis von Ahn's original vision was brilliant: redirect the cognitive effort humans already spend on spam filters toward something useful. Digitize the world's books. Solve a real problem.\nWhat Google did with that vision is something else.\nThey took a security mechanism users had no choice but to use, deployed it across the entire internet, and harvested the output to build commercial products worth tens of billions of dollars.\nThe users got nothing. Not even awareness.\nThe deepest irony: you spent years proving you were human. By doing exactly the kind of visual recognition work that AI could not do yet. The work that, once learned, made human visual annotation unnecessary.\nYou proved you were human. By making yourself replaceable”",
"sig": "0a1e76e4081810072ed74bbd5282a6e1fdbc339e2300251064c1449add8260c79471ba23dd32c978e732c3eb5e13d5a531ab18c334545a346de0a80f97dc6075"
}
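
For readers wondering how the `id` field relates to the rest of the event: under NIP-01, as I understand it, the id is the SHA-256 hash of a compact JSON array built from the other fields. The sketch below is a minimal illustration under that assumption. The helper name `nostr_event_id` and the sample event are hypothetical, and Python's `json.dumps` escapes a few rare control characters differently from the NIP-01 rules, so treat it as an approximation. The JSON as extracted above also has line breaks inside `content`, so feeding it in as-is would not necessarily reproduce the id shown.

```python
import hashlib
import json


def nostr_event_id(event: dict) -> str:
    """Hash the NIP-01 style serialization [0, pubkey, created_at, kind, tags, content]."""
    payload = [
        0,
        event["pubkey"],
        event["created_at"],
        event["kind"],
        event["tags"],
        event["content"],
    ]
    # Compact separators and raw UTF-8 output, so no extra whitespace is hashed.
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


# Hypothetical event for illustration only; this is not the note above.
sample = {
    "pubkey": "00" * 32,
    "created_at": 1700000000,
    "kind": 1,
    "tags": [],
    "content": "hello",
}

sample_id = nostr_event_id(sample)
print(len(sample_id))
```

Because the hash covers `content`, any change to the text of a note yields a different id, which is why the rendered note and the raw JSON are expected to carry the same wording.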