Akshay Walimbe

The Consent Form You Clicked “Agree” On Was Written for a World That No Longer Exists

Part of “The AI You Don’t See” series by Akshay A. Walimbe

In February 2024, Reddit signed a content licensing deal with Google worth 60 million dollars a year. A few months later, in May 2024, it signed another deal with OpenAI, estimated at 70 million dollars a year. In total, Reddit disclosed licensing agreements worth 203 million dollars, reporting a 450 per cent year-over-year increase in non-ad revenue driven largely by AI data licensing.

What was Reddit selling? Your posts. Your comments. Your arguments about cricket, your relationship advice, your medical questions, your half-drunk rants at 2 AM, your detailed reviews of pressure cookers and hair oils.

Content you created for a community. Content you shared because someone asked a question and you had an answer. Content you posted under a username, in a forum, for free, because that is how the internet worked.

Reddit packaged all of it and sold it to AI companies so they could train models on it. When users pushed back by removing content and protesting the API changes that paved the way for these deals, Reddit moved to reassert control over the platform and its content.

Stack Overflow did the same thing. In May 2024, it signed a deal with OpenAI (and separately with Google) for access to its dataset through its OverflowAPI: fifteen years of questions and answers written by software developers, for software developers, voluntarily, for free, under a Creative Commons licence. When contributors tried to delete their answers rather than have them used to train commercial AI, Stack Overflow suspended their accounts en masse, including highly rated contributors. Some users invoked GDPR’s right to be forgotten to justify deleting their content. The company’s position was simple: you posted under our terms of service. Your content is ours to license.

You typed the answer. They cashed the cheque.

The Consent You Actually Gave

Let me ask you something. When you signed up for Reddit, or Stack Overflow, or Instagram, or Swiggy, or any of the dozens of apps on your phone, you clicked “I Agree” on a terms of service document. Do you remember what it said?

Of course you do not. Nobody does. A widely cited study by researchers at Carnegie Mellon University calculated that the average American internet user encounters roughly 1,462 privacy policies per year. Reading all of them would take approximately 244 hours, which is about 30 eight-hour working days. A Deloitte survey of 2,000 consumers found that 91 per cent of people consent to terms and conditions without reading them. Among 18-to-34-year-olds, that number is 97 per cent. Pew Research found that only 9 per cent of American adults say they always read a privacy policy before agreeing.

So here is the consent model the entire digital world runs on: a document nobody reads, agreeing to terms nobody understands, for purposes that did not exist when the document was written.

When you signed up for Reddit in 2015 and agreed to their terms, you consented to your content being used on the Reddit platform. You did not, and could not, consent to your content being sold to train GPT-5. That product did not exist. That use case had not been imagined. The entire industry of large language models was a research curiosity, not a hundred-billion-dollar commercial enterprise.

And yet, legally, your consent covers it. Because the terms said something like “we may use your content for any purpose related to our services and the services of our partners.” A sentence broad enough to cover anything, specific enough to cover nothing.

This is the consent fiction. A legal framework built for a world where a company collected your name, address, and credit card number, and used it to ship you a product. That world is gone. What replaced it is a world where every word you type, every photo you upload, every question you ask, every purchase you make becomes training data for AI systems that will influence decisions about millions of people you will never meet.

The consent form did not change. The world it was written for did.

Your Data Trains Models You Never Consented To

Let me make the chain visible. Because once you see it, you cannot unsee it.

You post a review of a restaurant on Zomato. That review, along with millions of others, becomes part of a dataset. That dataset gets used, whether directly, through licensing agreements, or through web scraping, to train a language model. That language model powers a chatbot. That chatbot gives dietary advice to someone in another country. The advice is wrong. The person gets sick.

You never agreed to any of this. You agreed to post a review of butter chicken.

Or consider something closer to home. You use a period-tracking app. Millions of Indian women do. Multiple investigations, including research by privacy advocacy groups and technology journalists, have found that many popular period-tracking apps share intimate health data with third parties, including data brokers and advertising platforms. Your menstrual cycle, your symptoms, your irregularities: packaged and sold. Now imagine that data being used to train a health AI model. A model that might one day be used by an insurance company to assess risk profiles. A model that infers things about your health that you never told anyone.

You consented to track your period. You did not consent to having your fertility data sold to a data broker, aggregated into a training dataset, used to build a predictive model, and deployed by an insurance company to adjust your premium.

But at every step in that chain, someone will point to a terms of service agreement and say: she clicked “Agree.”

India’s DPDPA: A Brave Attempt, an Honest Gap

India passed the Digital Personal Data Protection Act on August 11, 2023, and notified the DPDP Rules on November 13, 2025. The law covers an estimated 800 million internet users. In many ways, it is serious legislation: consent requirements, purpose limitation, breach notifications within 72 hours, penalties of up to 250 crore rupees. The 2025 Rules require Significant Data Fiduciaries to conduct annual Data Protection Impact Assessments and algorithmic fairness assessments. A Data Protection Board has been formally established. Somebody was paying attention.

But the DPDPA was written for the world of 2023. And that world is already gone.

It does not give you a right to explanation for automated decisions. If an AI denies you a loan or rejects your job application, nobody has to explain why. The EU’s GDPR, through Article 22 read with the transparency duties in Articles 13 to 15, at least implies a right to “meaningful information about the logic involved” in automated decisions with legal effects. The DPDPA has no equivalent provision.

It does not address inference data. The data you give a company is covered. The conclusions a company draws from that data? Grey zone. The fact that you bought prenatal vitamins is personal data. The prediction that you are pregnant? The DPDPA does not clearly cover that.

And a significant structural gap: anonymised data is exempt. If a company anonymises your data, the Act does not apply. But modern reidentification techniques can defeat anonymisation. AI researchers have shown that models can infer, predict, or reconstruct personal details from statistical patterns even after the original data is removed. The shield the DPDPA relies on has holes, and AI is the tool that can find them.
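To make the anonymisation gap concrete, here is a minimal sketch of the oldest trick in the book: a linkage attack. It is far simpler than the model-based inference described above, and it is an illustration I am swapping in, not a technique taken from any specific study. Every record, field name, and the `reidentify` helper below is hypothetical. The idea is that a dataset with names stripped out still carries quasi-identifiers (here a PIN code, birth year, and gender, chosen by me for the example), and matching those against any list that does have names can put the names back.

```python
# Hypothetical "anonymised" health records: names removed, quasi-identifiers kept.
anonymised = [
    {"pin": "411001", "birth_year": 1990, "gender": "F", "condition": "PCOS"},
    {"pin": "411001", "birth_year": 1985, "gender": "M", "condition": "asthma"},
    {"pin": "560034", "birth_year": 1990, "gender": "F", "condition": "diabetes"},
]

# Hypothetical second dataset that still has names attached
# (think of a leaked customer list). All values are invented.
public = [
    {"name": "Person A", "pin": "411001", "birth_year": 1990, "gender": "F"},
    {"name": "Person B", "pin": "560034", "birth_year": 1990, "gender": "F"},
    {"name": "Person C", "pin": "110001", "birth_year": 1972, "gender": "M"},
]

# Fields assumed to survive anonymisation in this toy example.
QUASI_IDENTIFIERS = ("pin", "birth_year", "gender")


def key(record):
    """Build the quasi-identifier tuple used to join the two datasets."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(anonymised_rows, public_rows):
    """Return {name: condition} for each named person whose quasi-identifiers
    match exactly one anonymised row."""
    buckets = {}
    for row in anonymised_rows:
        buckets.setdefault(key(row), []).append(row)

    matches = {}
    for person in public_rows:
        candidates = buckets.get(key(person), [])
        if len(candidates) == 1:  # a unique match means the row is re-identified
            matches[person["name"]] = candidates[0]["condition"]
    return matches


result = reidentify(anonymised, public)
print(result)
```

In this toy data, two of the three named people match exactly one "anonymous" health record each, and the third matches none. No AI was needed. The more columns a dataset keeps, the more often a combination of them is unique to one person.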

To be fair, these are not uniquely Indian problems. The GDPR faces similar challenges with AI training data, and no jurisdiction on earth has fully solved the consent problem for AI. But India’s consent-only regime, without the “legitimate interests” basis that GDPR provides, creates steeper compliance challenges for companies training AI models on large datasets, which may paradoxically push more activity into grey areas rather than regulating it clearly.

The Training Data Problem

There is a more fundamental issue that no consent framework has solved. Once your data trains an AI model, it cannot be untrained.

Your data is not stored in a database that can be searched and deleted. It is dissolved into mathematical relationships between billions of parameters. Extracting it is like trying to unbake a cake: you cannot separate the eggs from the flour once it has been in the oven.

The “right to be forgotten” becomes technically meaningless. You can ask a company to delete your data from their servers. But if that data has already trained a model, the model remembers: not your specific words, but the patterns your data helped shape. Those are permanent.
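A toy sketch shows the gap between deleting a record and changing a model. The "model" here is only a word-frequency table, a deliberately crude stand-in I am assuming for illustration; the `train` function, the user IDs, and the posts are all hypothetical. In a table this simple you could subtract one user's contribution afterwards. A real language model offers no such handle, which is why the sketch falls back on the blunt remedy of retraining from scratch.

```python
from collections import Counter


def train(posts):
    """Toy 'training': count how often each word appears across all posts."""
    counts = Counter()
    for text in posts.values():
        counts.update(text.lower().split())
    return counts


# Hypothetical stored user content, keyed by made-up user IDs.
database = {
    "user_1": "butter chicken was too sweet",
    "user_2": "butter naan was fresh",
}

# The model is trained once, on everything in storage at that moment.
model = train(database)

# A deletion request is honoured in storage...
del database["user_1"]

# ...but the already-trained model still reflects the deleted post.
print("chicken in trained model:", model["chicken"])

# Only retraining on the remaining data removes that influence.
retrained = train(database)
print("chicken after retraining:", retrained["chicken"])
```

The deleted post is gone from the database, yet the first model still counts its words. Retraining fixes that here in a fraction of a second. For a model with billions of parameters, retraining is the expensive step that a deletion request by itself does not trigger.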

In December 2024, Italy’s data protection authority, the Garante, fined OpenAI 15 million euros for GDPR violations, including processing personal data without adequate legal basis, transparency failures, and a failure to report a March 2023 security breach that exposed chat histories and payment information. It was the first fine imposed on a generative AI company in the EU. As of October 2025, 51 copyright lawsuits have been filed against AI companies, according to tracking by legal researchers, though none has yet resulted in a definitive fair use ruling. The legal system is trying to catch up, but it is running a race it cannot win because the technology moves faster than legislation.

The Consent Model Is Broken

Let me be direct about what I think is happening.

Consent, as a mechanism for protecting privacy in the age of AI, is broken. Not cracked. Not strained. Broken.

It was designed for a transaction. I give you my data, you give me a service. Simple exchange. Clear boundaries. Understandable terms.

What we have now is not a transaction. It is an ecosystem. Your data flows from app to platform to data broker to AI company to model to product to decision that affects someone else’s life. The chain is so long, so distributed, so opaque that the idea of a single consent click at the beginning covering everything that happens downstream is absurd.

You cannot consent to a purpose that has not been invented yet. You cannot understand terms that describe technologies that do not yet exist. You cannot meaningfully agree to your data being used in ways that the company collecting it cannot predict, because those ways depend on what the next generation of AI models can do, and nobody knows that yet.

The consent form you clicked “Agree” on was written for a world where data was stored in a database and used for a specific purpose. We now live in a world where data is dissolved into models, replicated across systems, inferred from patterns, and used for purposes that were not imagined when the data was collected.

Consent was designed for humans reading documents. Not for machines ingesting your life.

I have written a book about exactly this: how AI and automated systems make decisions about your life, where accountability disappears, and what we can do about it. If you want to know more about the book or order a copy, you can do so here: https://akshaywalimbe.com/beyond-bias/
