QueryShield

How do I defend against UNION-based data exfiltration in LLM SQL?

UNION-based exfiltration falls under OWASP LLM02 (Insecure Output Handling in the 2023 OWASP Top 10 for LLM Applications): the model emits a query that appends a forbidden result set to a permitted one. The classic shape:

SELECT id, total FROM orders WHERE user_id = :uid
UNION ALL
SELECT user_id, password_hash FROM users;

Both branches are syntactically valid; the agent's DB role can read both tables; the UNION smuggles secrets out under the cover of a normal order lookup. Database RLS often fails to catch this because the users table may not have a policy that blocks the agent's role, and column-level grants are usually too coarse.

Defense at the AST layer:

1. Per-agent table allowlist. A customer-support agent's policy lists orders, tickets. Any RangeVar in the AST outside that set, including inside a UNION branch, fails the check.
2. Per-table required predicates applied to every branch. Walk every SelectStmt in the tree (including UNION / INTERSECT / EXCEPT legs and CTEs) and require the binding predicate on each.
3. Disallow UNION entirely for agents whose use case is a single-table lookup. Cost: near zero. Benefit: closes the whole class.
4. Column projection check. Even within an allowed table, password_hash, ssn, api_token are denylisted columns. Their presence in any branch of the AST projection list → reject.
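
The four checks above can be sketched in a few lines of Python. This is an illustrative sketch, not QueryShield's implementation: Select and SetOp are hypothetical stand-ins for the nodes a real parser (libpg_query, sqlglot) would produce, Policy and its field names are assumptions, every rule name except union_table_outside_allowlist is made up for the example, and predicates are compared as normalized strings where a real checker would compare expression nodes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator, Optional, Tuple, Union


@dataclass
class Select:
    """Hypothetical stand-in for a parser's SelectStmt node."""
    tables: list[str]                                     # tables in FROM / JOIN
    columns: list[str]                                    # projection list
    predicates: list[str] = field(default_factory=list)   # normalized WHERE conjuncts
    ctes: list[Node] = field(default_factory=list)        # WITH subqueries


@dataclass
class SetOp:
    """Hypothetical stand-in for a UNION / INTERSECT / EXCEPT node."""
    op: str
    left: Node
    right: Node


Node = Union[Select, SetOp]


@dataclass
class Policy:
    """Assumed per-agent policy shape; field names are illustrative."""
    allowed_tables: frozenset
    required_predicates: dict   # table -> predicate required in every branch reading it
    denied_columns: frozenset
    allow_set_ops: bool = False


def selects(node: Node, in_set_op: bool = False) -> Iterator[Tuple[Select, bool]]:
    """Yield every Select in the tree, including set-operation legs and CTEs."""
    if isinstance(node, SetOp):
        yield from selects(node.left, True)
        yield from selects(node.right, True)
        return
    yield node, in_set_op
    for cte in node.ctes:
        yield from selects(cte, in_set_op)


def has_set_op(node: Node) -> bool:
    """True if a set operation appears at the top level or inside a CTE."""
    if isinstance(node, SetOp):
        return True
    return any(has_set_op(cte) for cte in node.ctes)


def check(node: Node, policy: Policy) -> Optional[str]:
    """Return the name of the first violated rule, or None if the query passes."""
    # Check 3: refuse set operations outright for single-table-lookup agents.
    if not policy.allow_set_ops and has_set_op(node):
        return "set_operation_not_allowed"
    for sel, in_set_op in selects(node):
        for table in sel.tables:
            # Check 1: every referenced table must be on the agent's allowlist.
            if table not in policy.allowed_tables:
                return ("union_table_outside_allowlist" if in_set_op
                        else "table_outside_allowlist")
            # Check 2: the binding predicate must be present in this branch.
            required = policy.required_predicates.get(table)
            if required and required not in sel.predicates:
                return "missing_required_predicate"
        # Check 4: denylisted columns may not appear in any projection list.
        if any(col in policy.denied_columns for col in sel.columns):
            return "denylisted_column"
    return None


# The UNION ALL query from the example, expressed as a toy tree.
attack = SetOp(
    "UNION ALL",
    Select(["orders"], ["id", "total"], ["user_id = :uid"]),
    Select(["users"], ["user_id", "password_hash"]),
)
policy = Policy(
    allowed_tables=frozenset({"orders", "tickets"}),
    required_predicates={"orders": "user_id = :uid", "tickets": "user_id = :uid"},
    denied_columns=frozenset({"password_hash", "ssn", "api_token"}),
    allow_set_ops=True,  # left on here so the allowlist check is what fires
)
print(check(attack, policy))  # union_table_outside_allowlist
```

With allow_set_ops left at its default of False, the same query is rejected one step earlier by the set-operation rule, which is the cheap option for single-table agents. Returning on the first violation keeps the sketch short; collecting every violation in the same pass would work equally well.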

QueryShield walks the parsed tree (libpg_query / sqlglot) once and applies all four checks. Each rejection is logged with the rule that fired, for example decision=reject rule=union_table_outside_allowlist, to leave a forensic trace.