How does PCI DSS apply to AI text-to-SQL pipelines?

Question

Accepted Answer

PCI DSS v4.0 (its future-dated requirements became mandatory on 31 March 2025) governs any system that stores, processes, or transmits cardholder data (CHD). An LLM text-to-SQL pipeline that queries a CHD-adjacent database is in scope if the agent can return CHD or sensitive authentication data (SAD).

Relevant requirements:

- **Req 3: Protect stored CHD.** PAN must be rendered unreadable wherever it is stored and masked when displayed; truncation or masking at the view layer is the norm. The AST policy ensures that the agent's SQL never projects `pan_full` and reads only from the masked `pan_last4` view.
- **Req 7: Restrict access by need to know.** Give each agent its own role with access only to masked CHD views; the QueryShield policy file is the documented control.
- **Req 8: Identify and authenticate access.** Every agent has an identity, every request carries an authenticated subject, and the evidence log ties both to the SQL that ran.
- **Req 10: Log and monitor all access to CHD.** Keep a tamper-evident audit log of every query against CHD tables. Req 10.5.1 requires at least 12 months of log history, with the most recent three months immediately available for analysis.
- **Req 11.4: Penetration testing.** Tests run at least annually and after any significant change, and they must cover the LLM SQL path; QueryShield publishes a red-team test suite (prompt injection → SQL).

Realistic posture: keep the LLM out of CHD scope wherever possible by using tokenized references. Where it must touch CHD-adjacent data, enforce minimum-necessary access at the AST layer with QueryShield, masked views, and comprehensive logging. The QSA's first question will be "show me the access control list for this agent", and your policy file *is* the answer.
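To make the Req 8 and Req 10 points concrete, here is a minimal sketch of a tamper-evident log that binds the agent identity and the authenticated subject to each SQL statement using a SHA-256 hash chain. This is not QueryShield's implementation; the `AuditLog` class, its field names, and the `cards_masked` view are hypothetical and only illustrate the idea.

```python
import hashlib
import json
from dataclasses import dataclass, field

# Hypothetical starting value for the chain; any fixed constant works.
GENESIS = "0" * 64


def _digest(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


@dataclass
class AuditLog:
    """In-memory, append-only log; each entry commits to the one before it."""
    entries: list = field(default_factory=list)

    def append(self, agent_id: str, subject: str, sql: str, timestamp: str) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        payload = {"agent_id": agent_id, "subject": subject, "sql": sql, "ts": timestamp}
        entry = dict(payload, prev=prev_hash, hash=_digest(prev_hash, payload))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
        prev_hash = GENESIS
        for entry in self.entries:
            payload = {k: entry[k] for k in ("agent_id", "subject", "sql", "ts")}
            if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, payload):
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.append("agent-reporting", "alice@example.com",
           "SELECT pan_last4 FROM cards_masked", "2025-04-01T09:00:00Z")
log.append("agent-reporting", "bob@example.com",
           "SELECT COUNT(*) FROM cards_masked", "2025-04-01T09:05:00Z")
intact_before = log.verify()

# Simulate someone rewriting history after the fact.
log.entries[0]["sql"] = "SELECT pan_full FROM cards"
intact_after = log.verify()
```

A chain like this only shows tampering if the latest hash is also stored somewhere the log writer cannot modify (for example a separate write-once store); otherwise an attacker could recompute the whole chain. A production log would also need durable storage and retention handling, which this sketch leaves out.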