AI · 18 March 2026 · 8 min read

Dangers of AI Coding: What Happens When You Ship Without Review

AI writes convincing code and convincing bugs. The failures are harder to debug than anything a human wrote because the code looks authoritative.

By Jay

AI writes convincing code. That is the problem.

When a junior developer writes a bug, it usually looks uncertain. There are signs: commented-out attempts, inconsistent naming, a function that does too much. When an AI writes a bug, it looks like it was written by someone who knew exactly what they were doing. The structure is clean. The variable names make sense. The comments explain the logic clearly. The logic is wrong.

This is the part nobody talks about when they celebrate shipping entire features with Claude or ChatGPT. The promise is speed. The reality is that the code you get back is a plausible first draft, not a production-ready artefact. The gap between those two things is where projects get into trouble.

The Authoritative Bug Problem

Human-written bugs tend to cluster around areas where the developer was uncertain. You can usually find them by looking for the messy parts of the code. AI-generated bugs are distributed evenly across confident-looking code, which makes them significantly harder to track down.

We have seen this directly. Building an internal tool, we asked Claude to write a function that processed user-submitted data before storing it. The function worked. It passed our manual tests. It was only during a code review that we noticed the sanitisation step was applied after interpolation into the query string, not before. The code read as though it was doing the right thing. It was not.

SQL injection through AI-generated code is not a hypothetical risk. The model is trained on code from across the internet, including code with this exact class of bug. When it synthesises a solution, it sometimes synthesises the mistake with it.

What Actually Breaks in Production

Beyond injection vulnerabilities, there are three failure patterns we see repeatedly in AI-generated code.

First: hardcoded secrets. Ask an AI to wire up an API integration and it will sometimes include a placeholder that looks like a real key, or worse, use the key you mentioned in the prompt. If you are not reviewing the diff before committing, that key ends up in your git history. It does not matter that you delete it later. The history is public if the repo is public, and git history is permanent.

Second: silent fallbacks. AI code tends to handle errors by catching them and returning empty values or null. The code does not crash, so you do not notice it is failing. A campaign tracking integration that silently returns null instead of throwing an error will happily let your attribution data disappear for weeks before anyone notices the numbers look off.
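The silent-fallback shape is easy to recognise once you have seen it. A small illustrative sketch (the payload structure and function names are hypothetical): the first version is what AI-generated code tends to produce, and the second is the behaviour we ask for in review.

```python
def parse_campaign_id_silent(payload: dict):
    # AI-typical pattern: swallow the error and return None.
    # Nothing crashes, so nobody notices attribution data is being dropped.
    try:
        return payload["utm"]["campaign_id"]
    except (KeyError, TypeError):
        return None

def parse_campaign_id_loud(payload: dict) -> str:
    # Reviewed version: fail visibly so the missing data is noticed
    # the first time it happens, not weeks later.
    try:
        return payload["utm"]["campaign_id"]
    except (KeyError, TypeError) as exc:
        raise ValueError(f"missing campaign_id in payload: {payload!r}") from exc
```

Where a hard failure is genuinely unacceptable, logging the fallback at warning level is the minimum; returning a silent null should never be the default.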

Third: dependency conflicts. When you ask an AI to write a module using a library, it will use the version it was trained on. If your project is running a newer or older version, the method signatures may not match. This does not always cause an obvious error. Sometimes it just behaves differently from what the AI intended, and the discrepancy is subtle.

The Review Process We Use

We use AI to write code constantly. We are not against it. The output is often excellent and genuinely accelerates development. But we treat every AI-generated code block as code submitted by a developer who has not read our full codebase and may have hallucinated some of the requirements.

Every AI-generated piece of code goes through the same review checklist before it ships.

Check one: does it handle the unhappy path? Most AI code handles the success case well. The failure cases are where it gets lazy. We read through every error handler and ask whether that failure mode is actually handled or just silenced.

Check two: are there any credentials, tokens, or API keys in the output? We search the diff explicitly. This takes 30 seconds and has caught real problems.

Check three: does the query or input handling sanitise before interpolation? Not after. Not in a wrapper somewhere. At the point where the input touches the query.

Check four: do the dependencies match the version constraints in the project? We check this against the package file, not against what we remember.
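Part of check four can be automated. A rough sketch using Python's standard `importlib.metadata`; the three-part version parsing here is deliberately simplistic (a real project would use a proper version parser such as the one in the `packaging` library), and the package names in the usage are only examples.

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(package: str) -> "tuple[int, ...] | None":
    # Returns the installed version as a comparable tuple of integers,
    # or None if the package is not installed at all.
    try:
        raw = version(package)
    except PackageNotFoundError:
        return None
    return tuple(int(part) for part in raw.split(".")[:3] if part.isdigit())

# Example: confirm the library the AI assumed is actually present,
# and at least as new as the API it wrote against.
v = installed_version("requests")  # illustrative package name
if v is None:
    print("requests is not installed; the generated code cannot run")
elif v < (2, 0):
    print(f"installed requests {v} predates the API the AI used")
```

This does not replace reading the package file, but it catches the "module written against a library that is not even installed" case before it reaches review.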

Check five: can we explain what this code does to another person? If the answer is no, the code does not ship until we can.

Why This Matters More When the Stakes Are Higher

For a landing page or a marketing quiz, the risk profile of AI-generated code is relatively low. The worst case is a broken layout or a form that fails to submit. For anything that touches customer data, payment flows, authentication, or third-party APIs with billing implications, the risk profile changes completely.

We built the initial version of Doorlist, a venue check-in PWA, using AI-assisted development. The convenience logic, the UI components, the basic routing. That part went quickly. But the authentication layer and the data handling were written by hand and reviewed line by line. Not because we do not trust AI, but because the cost of getting that part wrong is a data breach, not a broken UI.

The distinction matters. AI is a tool, not a co-founder. It does not carry accountability for what ships.

The Confidence Problem Is a Feature, Not a Bug

Here is the uncomfortable part. The reason AI code looks authoritative is that it has been trained to look that way. Code that looks clean and confident is rewarded in training data. That optimisation is useful in many contexts and genuinely makes the output easier to work with. It also means the model has no strong signal for when it should look uncertain.

A human developer will sometimes write a comment that says "not sure this is right, needs review." An AI will not. It will write the confident version of its best guess, every time.

The fix is not to stop using AI for code. The fix is to build a review culture that is not fooled by confident-looking output. Treat the AI like a contractor who does excellent work but has never worked in your codebase before and may have misunderstood the spec.

What to Do With This Information

If you are shipping AI-generated code without review, stop. Not because AI is bad at coding, but because the review step is what turns fast output into reliable software.

Set up a pre-commit hook that scans for common patterns like API key formats and hardcoded secrets. Use a linter that catches obvious SQL injection patterns. Require that anyone using AI-generated code in your codebase can walk through it verbally before it merges.
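A secret scanner of the kind described above can start very small. This is a sketch, not a complete tool: the regexes cover only a couple of common key shapes (the AWS access-key prefix and generic `sk-`/`api_key =` styles are illustrative), and a real hook would scan the staged diff and use a maintained pattern set such as gitleaks or detect-secrets.

```python
import re
import sys

# Illustrative patterns only; extend for the providers you actually use.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                        # AWS access key ID shape
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                     # common "sk-" style API keys
    re.compile(r"(?i)api[_-]?key\s*=\s*['\"][^'\"]{16,}['\"]"),  # inline key assignment
]

def scan_text(text: str) -> list:
    # Returns every line that matches a known secret pattern.
    return [
        line for line in text.splitlines()
        if any(pattern.search(line) for pattern in SECRET_PATTERNS)
    ]

def main(paths: list) -> int:
    hits = []
    for path in paths:
        with open(path, encoding="utf-8", errors="ignore") as fh:
            hits += [(path, line) for line in scan_text(fh.read())]
    for path, line in hits:
        print(f"possible secret in {path}: {line.strip()}")
    # A non-zero exit code makes the pre-commit hook block the commit.
    return 1 if hits else 0

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

Wired up as a pre-commit hook, this is the 30-second check made automatic: the commit fails before the key ever reaches git history, which is the only place the fix is cheap.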

The promise of AI coding is real: faster first drafts, better boilerplate, cleaner starting points. The reality is that you still need to know what good code looks like to recognise when the AI has handed you something that only looks like it.

If you are building something real and you want a team that uses AI sensibly rather than recklessly, talk to us. We have done this work and we know where the traps are.

AI coding · code review · security · development