How do I prevent prompt injection from causing database damage?
Prompt injection (OWASP LLM01) is the upstream cause; SQL damage is the downstream effect. You defend in depth at both ends:
- Treat all retrieved content (DB rows, web pages, documents) as untrusted, even when it comes from your own database: stored rows can contain attacker-written text.
- Use a prompt-injection detector on tool inputs (InjectShield, Lakera, Rebuff).
- Separate instructions from data with structured prompts and delimiter discipline.
- Assume the LLM *will* eventually be tricked. Design as if every emitted SQL string is adversarial.
- Layer the downstream controls: AST validator + statement allowlist + policy engine + least-privilege DB role.
- If the LLM emits `DROP TABLE users`, the AST validator rejects it before any DB connection is opened. Prompt injection becomes a logged anomaly, not a P0.
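
As a rough illustration of the output-side gate described in the list above, here is a minimal sketch of a statement allowlist that runs before any DB connection is opened. It is a simplified stand-in, not a real AST validator: it only inspects the leading keyword and fails closed on anything ambiguous, whereas a production validator would parse the statement with a proper SQL parser. The names `ALLOWED_STATEMENTS` and `validate_sql` are hypothetical.

```python
import re

# Hypothetical allowlist: only read-only queries may reach the database.
ALLOWED_STATEMENTS = {"SELECT"}


def validate_sql(sql: str) -> tuple[bool, str]:
    """Return (allowed, reason) for one LLM-emitted SQL string.

    Simplified stand-in for an AST validator: it checks the leading keyword
    and rejects anything it cannot classify with confidence (fail closed).
    """
    text = sql.strip()
    # Tolerate a single trailing semicolon.
    if text.endswith(";"):
        text = text[:-1].rstrip()
    if not text:
        return False, "empty statement"
    # Any remaining semicolon may hide a second statement. Without a real
    # parser we cannot tell whether it sits inside a string literal, so reject.
    if ";" in text:
        return False, "multiple statements"
    # Comments can be used to smuggle or disguise content; reject them too.
    if "--" in text or "/*" in text:
        return False, "comments not allowed"
    match = re.match(r"[A-Za-z]+", text)
    keyword = match.group(0).upper() if match else ""
    if keyword not in ALLOWED_STATEMENTS:
        return False, f"statement type {keyword or 'UNKNOWN'} not allowed"
    return True, "ok"


if __name__ == "__main__":
    for candidate in ("SELECT id FROM users WHERE id = 1;", "DROP TABLE users"):
        allowed, reason = validate_sql(candidate)
        print(f"{candidate!r}: allowed={allowed} ({reason})")
```

A keyword check like this cannot see inside the statement (for example, a `SELECT ... INTO` that creates a table in some dialects), which is why the list above pairs the allowlist with a real AST validator, a policy engine, and a least-privilege role.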
The principle is identical to defense-in-depth web security: input validation *and* output encoding *and* parameterized queries *and* a WAF. For LLM SQL, you need input filtering (prompt injection detection) *and* output validation (AST validator) *and* least-privilege DB roles *and* row-level security (RLS).
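
To illustrate the last-line-of-defense layer, the sketch below uses SQLite's authorizer callback from Python's standard `sqlite3` module as a stand-in for a least-privilege DB role. This is an assumption made for the sake of a self-contained example: in a server database you would instead grant the role `SELECT` only. The helper names `read_only_authorizer` and `run_untrusted` are hypothetical.

```python
import sqlite3

# Only plain reads are permitted; every other action code is denied.
READ_ONLY_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Authorizer callback: allow SELECT/column reads, deny everything else."""
    if action in READ_ONLY_ACTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


def run_untrusted(conn, sql):
    """Execute LLM-emitted SQL; return rows, or None if the database refused it."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.DatabaseError as exc:
        # In a real system this would go to an audit log or alerting pipeline.
        print(f"anomaly logged: {exc}")
        return None


# Set up demo data with full privileges, then drop to read-only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")
conn.commit()
conn.set_authorizer(read_only_authorizer)

if __name__ == "__main__":
    print(run_untrusted(conn, "SELECT name FROM users"))
    print(run_untrusted(conn, "DROP TABLE users"))
```

Even if a destructive statement slipped past the input filter and the output validator, the database itself refuses it here, so the table survives and the attempt surfaces as a logged error.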