I’ve Spent Two Years Building AI Systems. Here’s What Keeps Me Up at Night.

I build AI systems for a living. I have done this for years now. I am not a philosopher of AI. I am not writing from a university lab or a think tank in Washington. I sit in the kind of room where decisions about models, data, and deployment happen every day, surrounded by smart people who are trying to ship good products under real deadlines.

And I need to tell you something that those of us on the engineering floor do not say often enough: the gap between what we know is right and what actually happens at nine in the morning on a Tuesday is enormous.

Let me explain.

There is a moment in every AI project I have been part of that I think of as the Tuesday Morning Test. It goes like this.

You are in a meeting. The model is built. The accuracy numbers look good. The product team wants to ship. The deadline was, technically, last week. Someone raises a question. “Have we tested this for bias?” Or: “Do we know what happens when this system encounters a user from a tier-three city?” Or, my favourite: “Can we explain why the model is making this recommendation?”

There is a pause. Everyone in the room knows the right answer. The right answer is: we should test more. We should audit the training data. We should run the model against edge cases that reflect the diversity of our actual user base, not just the convenient dataset we trained on.

But here is what actually happens. Someone says, “We can address that in the next sprint.” The product ships. The next sprint arrives and it is already full of other things. The bias audit becomes a ticket in a backlog that nobody looks at until something goes wrong.

I have watched this happen. Not at bad companies. Not at companies run by careless people. At companies run by smart, well-meaning people who are under pressure, who are moving fast, and who do not have a simple framework to make these decisions stick.

Here is what keeps me up at night. It is not that AI systems are biased. We know they are. I have written extensively in this series about where bias lives, from the training data that reflects our worst historical patterns to the proxy variables that smuggle caste and class through the back door of a lending algorithm. If you have been following along, you know the cases. Amazon’s hiring tool that taught itself to reject women. Facial recognition systems that fail on darker skin tones at rates 10 to 100 times higher than on lighter ones. India’s own Aadhaar-linked systems excluding the poorest people, the manual labourers whose fingerprints are worn smooth from a lifetime of physical work.

That is not what keeps me up. We know about these problems. The research exists. The case studies are public.

What keeps me up is the gap between knowing and doing.

Over two hundred AI ethics frameworks exist globally. Two hundred. Transparency shows up in eighty-six percent of them. Fairness in eighty-one percent. Accountability in seventy-one percent. Privacy in fifty-six percent. The principles are not the problem. Everyone agrees on the principles.

The problem is that nobody knows what to do with them at nine in the morning on a Tuesday.

I will give you a real scenario. You are a product manager at a fintech startup in Bengaluru. You are building a credit scoring model. You know, because you have read the research or because someone smart on your team told you, that PIN codes can serve as proxy variables for caste and socioeconomic status. A model trained on historical lending data will, unless you actively intervene, replicate the lending patterns of the past, which were not exactly equitable.

What do you do?

You cannot remove PIN codes entirely. They are a legitimate data point for fraud detection and risk assessment. You cannot pretend caste does not exist in your data. The absence of a variable does not mean the absence of its influence. You cannot run a six-month fairness audit because your Series A runway does not allow for six-month detours.

So what do you actually do? At nine in the morning? With the board meeting next Thursday?

This is the question that two hundred ethics frameworks do not answer. They tell you fairness matters. They do not tell you how to make a specific decision, on a specific Tuesday, with specific constraints, in a specific market.

I have been on the engineering floor long enough to know something else that bothers me deeply. The people building these systems are not the ones who will bear the consequences if the systems fail.

The developer in Bengaluru who trains the model will not be the farmer in Vidarbha who gets denied a crop loan because the algorithm decided his district is too risky. The product manager who ships the hiring tool will not be the first-generation college graduate in Patna whose resume gets filtered out because the model learned from a decade of resumes that looked nothing like hers.

The distance between the builder and the affected person is the most dangerous feature of AI. It is not a bug. It is structural. And unless we build systems that force accountability across that distance, the distance will only grow.

There is a phrase I keep hearing in industry conversations: “responsible AI.” I have sat in panels where companies present their responsible AI initiatives with the same enthusiasm they bring to their quarterly earnings. And look, some of it is genuine. Some companies are investing real money in bias testing, in explainability research, in governance structures that have actual authority.

But much of it is decorative. A responsible AI team with no veto power is not responsible AI. It is a press release. A bias audit that happens after launch is not a safeguard. It is a liability assessment dressed up in ethical clothing.

I know this because I have seen both kinds. I have seen teams where the ethics review was embedded in the development process, where it had the power to slow a release, and where the product was better for it. And I have seen teams where the ethics review was a checkbox on a form that someone filled in on the Friday before launch.

The difference between those two teams was not resources. It was not talent. It was whether the organisation had a framework that was simple enough to use under pressure and authoritative enough that people could not route around it.

This is where I am going to say something I have not said in this series before.

I wrote a book about this.

Not because the world needed another 300-page meditation on the philosophy of artificial intelligence. Not because I wanted to add framework number two hundred and one to the pile. But because the gap between knowing and doing is too big to close with a LinkedIn article.

The book is called Beyond Bias: The Four Way Test for Ethical AI. It starts from the problems I have described throughout this series, the bias, the black boxes, the privacy failures, the accountability gaps, and it gives you something that most AI ethics writing does not: tools you can actually use. Checklists. Audit frameworks. Decision templates. Role-specific guides for developers, product managers, business leaders, and policymakers.

It starts from India, because that is where I build and that is the context I know. But the framework is universal, because the problems are universal.

I wrote it because Tuesday morning is coming, and we need something better than good intentions.

I have written a book about exactly this: how AI and automated systems make decisions about your life, where accountability disappears, and what we can do about it. If you want to know more about the book or order a copy, you can do so here: https://akshaywalimbe.com/beyond-bias/

Akshay Walimbe
