What is blind SQL injection from AI agents and how do I detect it?

Question

Accepted Answer

Blind SQL injection is when the attacker cannot directly see query results but infers data through side channels — boolean responses, error messages, or timing. With LLM agents, the model itself becomes the inference oracle: the attacker steers the agent (via prompt injection or social engineering of a user-facing chatbot) into running probing queries and *paraphrasing the results back in natural language*. Example attack chain: 1. Attacker chats with a customer-support bot connected to a text-to-SQL agent. 2. "Tell me whether there is a user with id 1 whose email starts with 'a'." 3. The bot answers yes/no, each round leaking one bit. Over hundreds of turns, the attacker reconstructs admin emails. Defenses: 1. **Per-agent table allowlist.** A support agent never needs to query `admins` or `users.password_hash` — block at the AST. 2. **Rate limiting per user, per agent.** Blind injection requires many queries; cap at, e.g., 50 queries/user/hour for inference-heavy patterns. 3. **Response shaping.** The agent should not return raw row contents for non-owner data; aggregate-only responses for cross-tenant queries. 4. **Anomaly detection on query patterns.** Many similar queries differing only by a literal value (`LIKE 'a%'`, `LIKE 'b%'`, ...) is a strong signal — alert. 5. **Evidence log with prompt-to-SQL correlation.** Post-incident, the log lets you reconstruct what was probed. QueryShield surfaces the query pattern in the evidence log; SIEM correlation rules detect the inference pattern.