The Security Blind Spot in AI-Generated Code: 7 Vulnerabilities LLMs Keep Introducing

LLMs generate code that works. They also generate code that's insecure — not because they're incompetent, but because their training data is full of insecure patterns. Here are the seven vulnerabilities that show up in almost every vibecoded codebase.


Your AI agent just built a login system in 10 minutes. It also introduced three security vulnerabilities. You won't find them until someone exploits them.


"AI-generated code security" here refers to the systematic vulnerabilities introduced when LLMs generate code from training data that is full of insecure patterns: tutorials that skip validation, examples that hardcode secrets, and Stack Overflow answers that use string concatenation for SQL. These aren't random bugs; they're predictable patterns that appear in nearly every vibecoded codebase because the model's statistical default is "code that works," not "code that's safe."

Here's an uncomfortable fact about AI-generated code: LLMs are trained on the internet, and the internet is full of insecure code.

Stack Overflow answers that skip input validation. Tutorial code that hardcodes API keys. Blog posts that demonstrate SQL with string concatenation. GitHub repos with permissive CORS and missing auth checks. All of it is in the training data. All of it influences what the model generates.

The result: when you ask an AI coding agent to build a feature, it generates code that works. It handles the happy path. It looks clean. But it's built on the security assumptions of a tutorial, not a production system.

This isn't theoretical. Stanford researchers found that developers using AI assistants produced significantly less secure code than those working without AI, and — critically — were more confident that their code was secure. The AI gives you working code and a false sense of safety.

Why LLMs Default to Insecure Code

It's not that LLMs can't generate secure code. They can, if you explicitly ask for it. The problem is threefold:

1. Security is implicit. When you prompt "build a user registration endpoint," the implied requirements are: accept email and password, create a user, return success. The unspoken requirements — hash the password, validate input length, rate-limit the endpoint, prevent enumeration attacks — aren't in your prompt. The LLM generates what you asked for, not what you need.

2. Insecure patterns are more common in training data. Tutorials prioritize clarity over security. Open-source code skips hardening. The most-copied code on Stack Overflow often omits security considerations. The LLM's statistical model of "what code usually looks like" skews toward insecure defaults.

3. Security is cross-cutting. A single security requirement (e.g., "all user data must be encrypted") affects dozens of files, functions, and data paths. LLMs generate code file-by-file, prompt-by-prompt. Cross-cutting concerns that span the entire codebase are exactly what they're worst at maintaining.

The 7 Vulnerabilities

These seven patterns appear in nearly every vibecoded codebase we've analyzed. They map loosely to OWASP categories but are specific to the way LLMs generate code.

1. Missing Input Validation

What the LLM generates:

app.post('/api/users', async (req, res) => {
  const { email, password, name } = req.body;
  const user = await db.createUser({ email, password, name });
  res.json(user);
});

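What's wrong: No validation on email format, password strength, name length, or value types. An attacker can send a name that fills the entire request body (100 KB under express.json()'s default limit, far more if that limit has been raised), a password of "a", or an object or array where a string is expected. Destructuring happens to drop unexpected top-level fields here, but the equally common db.createUser(req.body) variant would let an attacker inject additional properties into the user object. The handler also returns the whole user record, which may include the stored password.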

What it should include: Schema validation (Zod, Joi, or similar) that rejects malformed input before it reaches your business logic. Input length limits. Field allowlisting.

Why the LLM misses it: Tutorials show the "clean" version. Validation code is verbose and distracts from the tutorial's point. The training data bias is toward clarity, not safety.
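To make the "what it should include" list concrete, here is a minimal sketch of validation for the registration body. It is a hand-rolled stand-in for a schema library such as Zod or Joi, not a replacement for one. The function name validateNewUser and the specific limits (254, 12 to 128, 100) are assumptions chosen for illustration.

```javascript
// Hand-rolled stand-in for a schema library such as Zod or Joi.
// Field names match the example route; the limits are illustrative assumptions.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
const ALLOWED_FIELDS = ['email', 'password', 'name'];

function validateNewUser(body) {
  if (typeof body !== 'object' || body === null || Array.isArray(body)) {
    return { ok: false, errors: ['body must be a JSON object'] };
  }
  const errors = [];

  // Field allowlisting: reject anything the endpoint does not expect.
  for (const key of Object.keys(body)) {
    if (!ALLOWED_FIELDS.includes(key)) errors.push(`unexpected field: ${key}`);
  }

  // Type checks come first so objects and arrays never reach string logic.
  const { email, password, name } = body;
  if (typeof email !== 'string' || email.length > 254 || !EMAIL_PATTERN.test(email)) {
    errors.push('email must be a valid address of at most 254 characters');
  }
  if (typeof password !== 'string' || password.length < 12 || password.length > 128) {
    errors.push('password must be a string of 12 to 128 characters');
  }
  if (typeof name !== 'string' || name.trim().length === 0 || name.length > 100) {
    errors.push('name must be a non-empty string of at most 100 characters');
  }

  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: { email, password, name: name.trim() } };
}
```

In the route handler, the result would be checked before any database call: respond with a 400 and the errors array when ok is false, and pass only result.value onward.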

2. Hardcoded Secrets

What the LLM generates:

const stripe = new Stripe('sk_live_abc123...');

or slightly better but still wrong:

const API_KEY = 'sk_live_abc123'; // TODO: move to env

What's wrong: Secrets in source code end up in version control, build logs, and error stack traces. Even with a TODO comment, the secret is committed.

What it should include: Environment variable references (process.env.STRIPE_SECRET_KEY) with validation that the variable exists at startup, plus a .env.example file documenting required variables without values.

Why the LLM misses it: It's generating a working example. Hardcoded values work. The LLM completes the pattern it sees most often, and most code examples use inline values.
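The startup check described above might look like the following sketch. STRIPE_SECRET_KEY comes from the text; DATABASE_URL and the loadConfig name are hypothetical additions for illustration.

```javascript
// Read required secrets from the environment and fail fast if any are missing.
// STRIPE_SECRET_KEY comes from the article; DATABASE_URL is a hypothetical second entry.
const REQUIRED_ENV_VARS = ['STRIPE_SECRET_KEY', 'DATABASE_URL'];

function loadConfig(env = process.env) {
  const missing = REQUIRED_ENV_VARS.filter(
    (name) => typeof env[name] !== 'string' || env[name].trim() === ''
  );
  if (missing.length > 0) {
    // Report variable names only, never their values.
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  // Frozen so no later code can silently swap a secret.
  return Object.freeze({
    stripeSecretKey: env.STRIPE_SECRET_KEY,
    databaseUrl: env.DATABASE_URL,
  });
}
```

Calling loadConfig() once at process start means a missing secret stops the deploy immediately instead of surfacing as a failed payment hours later.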

3. SQL Injection via String Concatenation

What the LLM generates:

const user = await db.query(`SELECT * FROM users WHERE email = '${email}'`);

What's wrong: Classic SQL injection. An attacker sends ' OR '1'='1 as the email and gets every user record.

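What it should include: Parameterized queries, for example db.query('SELECT * FROM users WHERE email = $1', [email]) with a PostgreSQL driver (placeholder syntax varies by driver; MySQL drivers typically use ?), or an ORM that handles parameterization automatically.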

Why the LLM misses it: String interpolation is the "obvious" way to build a dynamic string. Parameterized queries require knowing the specific database driver's API. The LLM defaults to the pattern that's syntactically simplest.
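The difference is easiest to see by printing what each approach hands to the database. The sketch below needs no database connection. The function names are hypothetical, and the { text, values } shape follows the query-config convention used by node-postgres.

```javascript
// Vulnerable: the input becomes part of the SQL text itself.
function unsafeFindUserSql(email) {
  return `SELECT * FROM users WHERE email = '${email}'`;
}

// Safer: the SQL text is constant and the input travels separately as a value.
// The { text, values } shape and the $1 placeholder follow node-postgres conventions.
function findUserQuery(email) {
  return { text: 'SELECT * FROM users WHERE email = $1', values: [email] };
}

const payload = "' OR '1'='1";

console.log(unsafeFindUserSql(payload));
// SELECT * FROM users WHERE email = '' OR '1'='1'

console.log(findUserQuery(payload).text);
// SELECT * FROM users WHERE email = $1
```

With the parameterized form, the payload is only ever compared against the email column as a literal string, so it matches no rows.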

4. Broken Authentication Boundaries

What the LLM generates:

// Public routes
app.get('/api/health', healthCheck);
app.post('/api/auth/login', login);

// Protected routes
app.get('/api/users/:id', getUser);
app.post('/api/billing/charge', chargeBilling);

// Auth middleware applied globally... or is it?
app.use(authMiddleware);

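What's wrong: Middleware ordering. Express runs middleware and routes in the order they are registered, so every route registered before app.use(authMiddleware) is unprotected, including the two under the "Protected routes" comment. Routes registered in other files that load before the middleware is applied may be unprotected too. The protection depends on registration and import order, which is fragile and invisible.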

What it should include: Explicit per-route auth, or a route registration pattern that makes the auth boundary visible. Routes should be authenticated by default, with explicit opt-out for public endpoints.

Why the LLM misses it: It generates routes and middleware in the order you ask for them. If you ask for routes first and auth later, the generated code has exactly this vulnerability.
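One way to make "authenticated by default, with explicit opt-out" visible is to route all registration through a single function that wraps handlers unless told otherwise. The sketch below is framework-free and every name in it is hypothetical. It matches exact paths only, so it illustrates the pattern rather than replacing an Express router.

```javascript
// Deny-by-default route registration: every handler is wrapped with the auth
// check unless it is explicitly marked public. All names here are hypothetical.
function requireAuth(handler) {
  return (req, res) => {
    if (!req.user) {
      res.statusCode = 401;
      return res.end('Unauthorized');
    }
    return handler(req, res);
  };
}

function createRouteTable() {
  const routes = new Map();
  return {
    // options.public must be set to true explicitly to skip the auth wrapper.
    register(method, path, handler, options = {}) {
      const wrapped = options.public === true ? handler : requireAuth(handler);
      routes.set(`${method} ${path}`, wrapped);
    },
    handle(method, path, req, res) {
      const route = routes.get(`${method} ${path}`);
      if (!route) {
        res.statusCode = 404;
        return res.end('Not found');
      }
      return route(req, res);
    },
  };
}

const table = createRouteTable();
table.register('GET', '/api/health', (req, res) => res.end('ok'), { public: true });
table.register('POST', '/api/billing/charge', (req, res) => res.end('charged'));
```

Because protection is applied at registration time, it no longer matters which file loads first, and a reviewer can find every public endpoint by searching for public: true.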

5. Overpermissive CORS

What the LLM generates:

app.use(cors({ origin: '*' }));

or:

app.use(cors({ origin: true })); // Reflects any origin

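What's wrong: With origin: '*', any website can call your API from a visitor's browser and read the responses. Browsers will not expose responses to credentialed requests when the allowed origin is a wildcard, so the worse case is the second form: origin: true reflects whatever origin asks, and combined with credentials: true it lets any website make authenticated requests using your users' cookies or other ambient auth.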

What it should include: An explicit allowlist of permitted origins, ideally loaded from configuration. Different allowlists for development vs. production.

Why the LLM misses it: origin: '*' is the simplest working configuration. Most tutorials use it. The LLM generates what works, not what's safe.
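An explicit allowlist comes down to an exact-match lookup on the request's Origin header. The sketch below computes the response headers by hand so the logic is visible. The origins listed and the corsHeadersFor name are placeholders. With the cors package, passing an array of allowed origins as the origin option should give comparable behavior.

```javascript
// Hypothetical allowlist; in practice it would be loaded from configuration,
// with separate lists for development and production.
const ALLOWED_ORIGINS = new Set(['https://app.example.com', 'https://admin.example.com']);

// Returns the CORS response headers for a request's Origin header value.
function corsHeadersFor(origin, allowed = ALLOWED_ORIGINS) {
  if (!origin || !allowed.has(origin)) {
    return {}; // No CORS headers: the browser blocks the cross-origin read.
  }
  return {
    'Access-Control-Allow-Origin': origin, // Echo only exact allowlist matches.
    'Access-Control-Allow-Credentials': 'true',
    Vary: 'Origin', // Keep caches from reusing one origin's response for another.
  };
}
```

Exact matching matters here: prefix or substring checks would accept look-alike origins such as https://app.example.com.evil.io.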

6. Information Leakage in Error Responses

What the LLM generates:

catch (err) {
  res.status(500).json({ error: err.message, stack: err.stack });
}

What's wrong: Stack traces reveal internal file paths, dependency names, and application structure, and error messages from database drivers or HTTP clients can include hostnames, query text, or connection details. This is a reconnaissance goldmine for attackers.

What it should include: Generic error messages for clients (Internal server error), detailed error logging server-side only, and never exposing stack traces in production.

Why the LLM misses it: During development, detailed errors are helpful. The LLM generates code optimized for debugging, not for production. It doesn't distinguish between development and production error handling.
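A sketch of the split between client-facing and server-side error detail, written as an Express-style error-handling middleware (the four-argument form). The logger parameter and the errorId field are assumptions for illustration. The ID lets support staff find the full log entry without the client ever seeing it.

```javascript
const crypto = require('crypto');

// Express-style error-handling middleware (four arguments).
// `logger` is a hypothetical stand-in for whatever logging library is in use.
function createErrorHandler(logger = console) {
  return function errorHandler(err, req, res, next) {
    const errorId = crypto.randomUUID();
    // Full detail stays on the server, keyed by an ID the client can quote.
    logger.error({ errorId, message: err.message, stack: err.stack });
    // The client sees a generic message and the correlation ID only.
    res.status(500).json({ error: 'Internal server error', errorId });
  };
}
```

In an Express app this would be registered last, after all routes, for example app.use(createErrorHandler()).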

7. Missing Rate Limiting

What the LLM generates: Nothing. It simply doesn't add rate limiting unless you ask.

What's wrong: Every public endpoint without rate limiting is vulnerable to brute force attacks (login), resource exhaustion (expensive operations), and abuse (automated scraping).

What it should include: Per-endpoint or per-route rate limiting, especially on authentication endpoints, password reset, and any endpoint that triggers email sends or API calls.

Why the LLM misses it: Rate limiting is infrastructure, not features. When you ask for "a login endpoint," rate limiting isn't part of "login." It's a cross-cutting non-functional requirement that the LLM doesn't add unprompted.
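For a sense of how little code the basic mechanism needs, here is a minimal fixed-window limiter kept in process memory. The names and limits are illustrative assumptions. The map is never pruned, and a deployment with more than one instance would need shared storage, so treat it as a sketch of the technique rather than production middleware.

```javascript
// Fixed-window, in-memory rate limiter keyed by client (for example, IP address).
// `now` is injectable so the window logic can be exercised without waiting.
function createRateLimiter({ limit, windowMs, now = Date.now }) {
  const windows = new Map(); // key -> { start, count }

  return function isAllowed(key) {
    const t = now();
    const entry = windows.get(key);
    // Start a fresh window on first sight or once the old window has elapsed.
    if (!entry || t - entry.start >= windowMs) {
      windows.set(key, { start: t, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}

// Example: at most 5 login attempts per client per minute.
const allowLogin = createRateLimiter({ limit: 5, windowMs: 60_000 });
```

A login handler would call allowLogin(clientKey) first and respond with 429 Too Many Requests when it returns false.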

Why "Just Review the Code" Doesn't Scale

The common response to AI security concerns is: "Just review everything the AI generates."

This worked when AI was generating 10 lines at a time in Copilot autocomplete. It doesn't work when AI agents are generating entire features — dozens of files, hundreds of lines, complex interactions between components.

Review fatigue is real. When the AI generates 500 lines of clean-looking code, your attention is on "does it do what I asked?" not "did it validate every input?" Security vulnerabilities hide in what's missing, not in what's wrong. Reviewing for absence is cognitively exhausting.

The volume exceeds human bandwidth. A single vibecoding session can produce more code than a developer would normally review in a day. If you're reviewing everything manually at that rate, you're not getting the speed benefit of AI coding.

Security requires cross-file reasoning. Vulnerability #4 (broken auth boundaries) can only be detected by understanding the relationship between route registration order, middleware application, and file import sequence. Reviewing a single file won't catch it.

The Structural Fix

Security vulnerabilities in AI-generated code aren't random. They're predictable: the same seven patterns recur in nearly every codebase.

Predictable vulnerabilities can be prevented structurally:

Encode security requirements as constraints, not suggestions. A constraint graph attaches security requirements as typed, enforceable nodes: "All endpoints in /api/ are GOVERNED_BY auth-required constraint. Exception: /api/health, /api/webhooks/* with webhook-signature-verification constraint."

Need compliance? Cutline automatically loads SOC 2 security controls, HIPAA PHI protection, and PCI-DSS payment security based on what you're building.

Propagate security through dependencies. When your user service handles PII, every component that depends on it inherits the PII-handling constraint: encryption at rest, audit logging, data minimization. The LLM doesn't need to be told — the constraint is automatically in its context.

Inject security context per-task. When the AI is building a new API endpoint, the relevant security constraints — auth requirements, input validation schema, rate limiting config, error response format — are automatically injected. Not a 50-page security policy. Just the constraints that apply to this specific task.

Detect violations before commit. The constraint graph can identify "this endpoint is missing auth middleware" or "this database query uses string interpolation" as constraint violations before the code is committed.
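To illustrate what a constraint expressed as data, rather than as prose, could look like, the sketch below encodes the auth rule quoted earlier in this section and checks a list of routes against it. The data shapes and function names are hypothetical. This shows the general idea only and is not Cutline's implementation.

```javascript
// Hypothetical constraint data modeled on the example rule in this section.
// An illustration of the idea, not Cutline's implementation.
const authConstraint = {
  appliesTo: '/api/',
  exceptions: [
    { pattern: '/api/health', requires: null },
    { pattern: '/api/webhooks/*', requires: 'webhook-signature-verification' },
  ],
};

// A trailing * matches any suffix; anything else must match exactly.
function matches(pattern, path) {
  return pattern.endsWith('*') ? path.startsWith(pattern.slice(0, -1)) : path === pattern;
}

// routes: [{ path, middleware: [names] }] describing what the code registers.
function findViolations(routes, constraint = authConstraint) {
  const violations = [];
  for (const route of routes) {
    if (!route.path.startsWith(constraint.appliesTo)) continue;
    const exception = constraint.exceptions.find((e) => matches(e.pattern, route.path));
    const required = exception ? exception.requires : 'auth-required';
    if (required && !route.middleware.includes(required)) {
      violations.push(`${route.path} is missing ${required}`);
    }
  }
  return violations;
}
```

Run against a route list extracted from the codebase, a check like this fails the build when a new /api/ endpoint is added without the required middleware, regardless of which file or prompt produced it.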

The Vibecoded Security Checklist

Until you have constraint-level enforcement, use this checklist on every vibecoded feature:

  • All user input validated with schema library (Zod, Joi)
  • No hardcoded secrets — all from environment variables
  • Database queries use parameterized queries or ORM
  • Auth middleware explicitly applied (not dependent on import order)
  • CORS restricted to allowed origins (no * in production)
  • Error responses don't leak stack traces or internal paths
  • Rate limiting on auth endpoints and expensive operations
  • File uploads validate type and enforce size limits
  • No eval(), Function(), or dynamic code execution with user input
  • Sensitive data (passwords, tokens) not logged

Print this out. Put it next to your monitor. Check every item on every feature. The AI won't do it for you.


FAQ

Q: Is AI-generated code secure?

No, not by default. LLMs are trained on internet code that is overwhelmingly insecure — tutorials skip input validation, examples hardcode API keys, and Stack Overflow answers use string concatenation for SQL. Stanford research found developers using AI assistants produced significantly less secure code and were more confident it was secure.

Q: What are the most common AI code security vulnerabilities?

The seven most common are: (1) missing input validation, (2) hardcoded secrets, (3) SQL injection via string concatenation, (4) broken authentication boundaries from middleware ordering, (5) overpermissive CORS with wildcard origins, (6) information leakage in error responses exposing stack traces, and (7) missing rate limiting on public endpoints.

Q: Why do LLMs generate insecure code?

LLMs default to insecure code for three reasons: security requirements are implicit (not in your prompt), insecure patterns are more common in training data (tutorials prioritize clarity over security), and security is cross-cutting (a single requirement affects dozens of files, but LLMs generate code file-by-file).

Q: How do you secure vibecoded code?

Secure vibecoded code by encoding security requirements as typed constraints in a constraint graph rather than relying on manual review. For immediate action, use the security checklist: validate all input with a schema library, no hardcoded secrets, parameterized queries, explicit auth middleware, restricted CORS, no stack traces in error responses, and rate limiting on sensitive endpoints.

Q: Does code review catch AI security vulnerabilities?

Manual code review doesn't scale for AI-generated code. Security vulnerabilities hide in what's missing, not what's wrong. Reviewing for absence is cognitively exhausting, the volume exceeds human bandwidth, and cross-file vulnerabilities like broken auth boundaries require understanding relationships between files that single-file review can't catch.


Take Action: Secure Your AI-Generated Code

Free Security Vibe Check: Scan your codebase for these 7 vulnerabilities →

Cutline encodes security requirements as constraint graph nodes and injects them into every AI coding session. Your security posture, enforced structurally.

