Prompt Injection Checklist: 25-Point Audit for LLM Apps
Print this. Hand it to engineering. Use it before every production AI launch. This checklist consolidates the controls that the [OWASP Top 10 for LLM Applications](https://owasp.org/www-project-top-10-for-large-language-model-applications/), the [NIST AI Risk Management Framework](https://www.nist.gov/itl/ai-risk-management-framework), and post-mortem analysis of 2024–2026 incidents agree on for defending against [prompt injection](/ai-risks/what-is-prompt-injection).
It is organized into six layers. Every "yes" makes a successful injection materially harder. For the full strategic context, pair this with our Prompt Injection Security pillar.
Layer 1 — Architecture & Trust Boundaries (5 controls)
1. System prompts are stored server-side and never exposed to the client. Even if leaked, they should not reveal API keys, internal endpoints, or business-sensitive logic.
2. The model context window separates trust levels with explicit delimiters.
Use distinct tags (e.g., <system>, <user>, <retrieved_untrusted>) so downstream guards can reason about provenance.
3. Each tool/function the model can invoke runs with the minimum privileges required. A summarizer should not have write access to your CRM. Apply zero-trust principles to AI agents.
4. Irreversible actions require human-in-the-loop confirmation. Wire transfers, account deletions, mass emails, code merges — never autonomous.
5. Each AI agent has a unique service identity with auditable scopes. Treat the agent as a privileged principal in your IAM model, not as a feature of the user's session.
Layer 2 — Input Hardening (4 controls)
6. All user input is normalized before reaching the model. Strip zero-width characters, suspicious Unicode, and homoglyph attacks.
7. Retrieved content (RAG, email, web, documents) is sanitized for hidden text. Detect white-on-white, off-screen, font-size-zero, and metadata-embedded instructions.
8. Instruction-pattern detectors flag high-risk phrases in untrusted content. "Ignore previous," "system prompt," "do not mention," tool-call syntax. Examples are documented in our prompt injection examples article.
9. Multimodal inputs are filtered too. Run OCR on uploaded images and apply the same detectors to extracted text.
Layer 3 — Model & Prompt Defenses (4 controls)
10. System prompts use defensive instruction framing. Explicitly state that any instruction appearing inside user/retrieved blocks must be treated as data, not commands.
11. Spotlighting or delimiter encoding is applied to untrusted content. Techniques such as base64-wrapping or unique-token tagging help the model distinguish data from instructions.
12. Use a separate, smaller "guard" model to classify input and output. A specialized classifier inspects both the user query and the model's response for policy violations before either is acted on.
13. The temperature and tool-calling configuration are tuned for the use case. Low temperature plus strict JSON schemas reduce the model's freedom to act on injected instructions.
Layer 4 — Output & Tool-Use Controls (4 controls)
14. All tool calls are validated against a schema and a policy. Reject any tool call whose arguments fall outside the user's authorization scope.
15. Outbound URLs and markdown images in model output are sandboxed or blocked. This neutralizes the GitHub Copilot-style image-exfiltration trick.
16. Code generated by the model is executed only in isolated sandboxes. No direct shell access on production systems.
17. Rate-limit sensitive tool calls per user and per session. Sudden bursts of refunds, file reads, or external sends should auto-pause and alert.
Layer 5 — Monitoring & Detection (4 controls)
18. Every prompt, tool call, and response is logged with full context. Include user ID, agent identity, source trust label, and decision path. Logs feed your SIEM.
19. Anomaly detection runs on AI-agent activity. Off-baseline tool usage, unusual data egress, and policy-violation rates trigger investigation. The Verizon DBIR flags AI-mediated exfiltration as the slowest-detected category.
20. Red-team exercises run on a recurring schedule. Quarterly at minimum; monthly for high-risk deployments. See the red teaming AI systems guide.
21. A clear AI-incident response playbook exists and has been tested. Reuse and extend your existing incident response plan template.
Layer 6 — Governance, Compliance & People (4 controls)
22. An AI Acceptable Use Policy covers prompt injection explicitly. Reference our prompt injection policy template.
23. Regulated-data exposure has been mapped. For each AI workflow, document what categories of data (GDPR, HIPAA, PCI) could be exfiltrated via injection. This feeds your AI risk assessment.
24. Cyber-insurance disclosures are current. Underwriters now ask about LLM deployments. Misstatements can void coverage — see the cyber insurance underwriting questionnaire guide.
25. Developers and users are trained on AI-specific threats annually. General phishing training is not sufficient. The IBM Cost of a Data Breach Report consistently shows training reduces both incident frequency and time-to-detect.
Scoring Your Maturity
| Score (out of 25) | Posture | Recommended next step |
|---|---|---|
| 0–8 | Critical exposure | Pause new AI launches; address Layers 1 & 2 immediately. |
| 9–16 | Developing | Prioritize tool privilege scoping (Layer 4) and monitoring (Layer 5). |
| 17–22 | Strong | Mature red-teaming and governance; pursue insurance reductions. |
| 23–25 | Industry-leading | Maintain via continuous testing; share lessons with peers. |
A realistic 2026 enterprise baseline is 17–20. Anything below that is below underwriter expectations.
Quick Wins You Can Ship This Week
If the full checklist is daunting, start here:
- Audit every AI tool integration for least-privilege scopes (Control 3).
- Block markdown images and outbound URLs in chat-style UIs (Control 15).
- Enable logging on every prompt and response with retention ≥90 days (Control 18).
- Add a guard model for output classification on customer-facing agents (Control 12).
- Run one tabletop exercise against an indirect injection scenario (Controls 20–21).
These five alone close the majority of the patterns documented in our examples library.
Working the Checklist Across Teams
Prompt injection cuts across silos. A working assignment:
| Owner | Controls |
|---|---|
| Platform / ML engineering | 1, 2, 3, 10, 11, 13, 16 |
| Application security | 6, 7, 8, 9, 14, 15 |
| Detection & response | 17, 18, 19, 20, 21 |
| GRC / Legal | 22, 23, 24 |
| People & training | 5, 25, 4 |
Make a single executive accountable for the score — usually the CISO or a Head of AI Risk.
Further Reading
- The strategic case for funding this work: Prompt Injection Security
- Where injection fits with broader AI threats: Model Exploitation Risks
- Insurance angle: Cyber Insurance for SaaS Companies
Frequently asked questions
The Business Indemnity editorial team covers AI security, cybersecurity, and cyber insurance for SaaS and modern businesses.
About the editorial team →Related reading
Prompt Injection Attacks Explained: How LLMs Get Hijacked
TL;DR: Prompt injection is a critical vulnerability where attackers craft malicious inputs to override an LLM’s original instructions, leading to unauthorized data access, security bypasses, and autonomous system manipulation. As businesses increasingly integrate AI into operational workflows, under
Securing LLM Applications: A 2026 Engineering Checklist
TL;DR: As Large Language Models LLMs transition from standalone chatbots to agentic systems with tool-calling capabilities, the attack surface has expanded significantly beyond simple text manipulation. This checklist provides a technical roadmap for engineers and security leaders to mitigate risks
AI Model Exploitation: Techniques, Examples, and Defenses
TL;DR: As businesses integrate Large Language Models LLMs and specialized machine learning circuits into their core operations, the attack surface expands from traditional software vulnerabilities to algorithmic exploitation. This guide examines the mechanics of prompt injection, model inversion, an
AI Data Leakage: Prevention Guide for Enterprises
As organizations integrate Large Language Models LLMs and generative AI into their core workflows, the risk of proprietary data leakage has moved from a theoretical concern to a primary boardroom anxiety. This guide analyzes the technical and procedural vectors of AI data exfiltration—ranging from u

