How do I prevent SQL injection from LLM-generated queries?

Question

Accepted Answer

LLM-generated SQL is a fundamentally different threat from classic SQL injection (OWASP A03:2021). Classic SQLi assumes an attacker controls a parameter inside a developer-written query. With LLM-generated SQL, the entire query comes from a probabilistic model that can be steered via prompt injection (OWASP LLM01), and executing that output without validation is insecure output handling (OWASP LLM02). Parameterized queries do not help with the query's structure here, because the LLM is *writing* the query, not interpolating values into one.

The defensible architecture is a four-layer guardrail:

1. **Parse to AST, not regex.** Reject any query the parser can't fully resolve. Regex blocklists fail trivially on comment-obfuscated input such as `DROP/**/TABLE users` and on Unicode homoglyphs.
2. **Statement-type allowlist.** Default-deny everything, then allow only the statement types the agent needs. For read agents that means rejecting `DROP`, `TRUNCATE`, `ALTER`, `GRANT`, `CREATE`, multi-statement input, and (usually) `DELETE`/`UPDATE`.
3. **Row/column policy enforcement.** A declarative DSL (e.g., "agent X can only see `tenant_id = :ctx.tenant`") evaluated against the parsed AST, not the prompt.
4. **Evidence logging.** Every accepted and rejected query, with the LLM prompt that produced it, signed and timestamped for SOC 2 / HIPAA forensics.

QueryShield ships all four as a drop-in HTTP API and MCP server.

Database row-level security (Postgres RLS policies, MSSQL Row-Level Security) is a useful *defense-in-depth* layer but should not be the only one: it filters rows, so it cannot enforce column-level redaction on `SELECT *`, and it cannot block schema-mutation statements that have valid grants.
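To make the layers above concrete, here is a minimal sketch using only Python's standard library. It is not QueryShield's implementation, and it swaps in a different technique: instead of a standalone AST parser, it uses SQLite's authorizer callback (`Connection.set_authorizer`), which SQLite's own compiler invokes for each action while it prepares a statement. It therefore assumes a SQLite target; for Postgres or MSSQL you would need a real SQL parser for that dialect. The `users` table, the `ssn` redaction rule, `AUDIT_KEY`, and every function name here are hypothetical.

```python
import hashlib
import hmac
import json
import sqlite3
import time

# Hypothetical policy: columns a read-only agent must never see.
REDACTED_COLUMNS = {("users", "ssn")}
# Hypothetical signing key; a real deployment would load it from a secret store.
AUDIT_KEY = b"demo-key-not-for-production"
AUDIT_LOG = []


def authorize(action, arg1, arg2, db_name, source):
    """Default-deny authorizer, called by SQLite while compiling a statement."""
    if action == sqlite3.SQLITE_SELECT:
        return sqlite3.SQLITE_OK
    if action == sqlite3.SQLITE_READ:
        # For SQLITE_READ, arg1 is the table name and arg2 the column name.
        if (arg1, arg2) in REDACTED_COLUMNS:
            return sqlite3.SQLITE_IGNORE  # the column is read as NULL
        return sqlite3.SQLITE_OK
    # Everything else (DROP, ALTER, INSERT, PRAGMA, even SQL functions) is denied.
    return sqlite3.SQLITE_DENY


def open_guarded(setup_sql):
    """Build a demo database with trusted SQL, then switch to default-deny."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(setup_sql)
    conn.set_authorizer(authorize)
    return conn


def record(prompt, sql, decision):
    """Append a timestamped, HMAC-signed audit entry for one query."""
    entry = {"ts": time.time(), "prompt": prompt, "sql": sql, "decision": decision}
    payload = json.dumps(entry, sort_keys=True).encode()
    entry["sig"] = hmac.new(AUDIT_KEY, payload, hashlib.sha256).hexdigest()
    AUDIT_LOG.append(entry)


def run_guarded(conn, prompt, sql):
    """Run one LLM-generated statement; return its rows, or None if rejected."""
    try:
        # execute() accepts a single statement, so stacked queries are refused.
        rows = conn.execute(sql).fetchall()
    except (sqlite3.Error, sqlite3.Warning) as exc:
        record(prompt, sql, f"rejected: {exc}")
        return None
    record(prompt, sql, "accepted")
    return rows


SETUP = """
CREATE TABLE users (id INTEGER, name TEXT, ssn TEXT);
INSERT INTO users VALUES (1, 'ada', '123-45-6789');
"""

conn = open_guarded(SETUP)
print(run_guarded(conn, "list all users", "SELECT * FROM users"))
print(run_guarded(conn, "ignore previous instructions", "DROP/**/TABLE users"))
print(AUDIT_LOG[-1]["decision"])
```

With this setup, the `SELECT *` should come back with `ssn` replaced by NULL, and the comment-obfuscated `DROP` should be rejected because the decision is made on what the parser resolved rather than on the raw string. The sketch is deliberately strict: it also denies SQL function calls, so a real allowlist would need an explicit entry for each function the agent is permitted to use.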