100 Questions
Your AI Architecture Either Passes or It Doesn't
Your AI product needs a security review before it ships. This checklist covers the 100 specific things auditors, pen testers, and enterprise buyers will check, organized into 10 categories with a scoring rubric and a priority remediation order.
Why AI Architecture Reviews Get Skipped
Traditional security reviews are designed for traditional software. They check whether encryption is in place, whether authentication is properly implemented, whether logging covers the right events. Those checks still matter for AI systems - but they cover maybe half the attack surface. The other half is specific to how AI systems are built: how models are integrated, how prompts are constructed, how outputs are validated, and how tenants are isolated when they share model endpoints.
Most teams shipping AI products have never done an AI-specific security review. Pen tests catch infrastructure vulnerabilities. Dependency scans catch known CVEs. Neither catches a tenant isolation failure in a shared embedding database, a system prompt that leaks via jailbreak, or an agent that executes arbitrary code because output wasn't validated before being passed to a code interpreter.
The 100 checks in this kit cover what auditors, enterprise buyers, and penetration testers will evaluate when they look at your AI architecture. They span prompt injection, model supply chain integrity, tenant data isolation, output validation, inference infrastructure hardening, and business continuity for model outages. The checks are organized into 10 categories with a scoring rubric and a prioritized remediation order, so you know which gaps to close first.
Run it before a pen test to fix the obvious issues first. Run it before an enterprise security review to know your score before they tell you. Run it quarterly as your architecture evolves. Each run takes two to four hours with your engineering team.
10 Categories. 100 Specific Checks.
Not vague best practices. Specific, actionable questions with pass/fail criteria.
Data Flow — 15 Checks
Where does user data go? Is it encrypted in transit and at rest? Do you log prompts? Can you trace data lineage from input to model to output to storage?
Tenant Isolation — 12 Checks
Can Tenant A's data leak into Tenant B's responses? Are embeddings isolated? Are API keys scoped per tenant? Is there cross-tenant prompt contamination?
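To make the embedding question concrete, here is a minimal sketch of one way to pass it: an in-memory store that partitions vectors by tenant and refuses any search that does not name one. The names `TenantScopedStore`, `Chunk`, and `cosine` are hypothetical and stand in for whatever vector database you actually run. The point is only that the tenant filter is applied by the store itself, not left to each caller.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    text: str
    vector: tuple[float, ...]


def cosine(a, b) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TenantScopedStore:
    """Hypothetical store: one partition per tenant, no cross-partition search."""

    def __init__(self) -> None:
        self._by_tenant: dict[str, list[Chunk]] = {}

    def add(self, tenant_id: str, text: str, vector) -> None:
        if not tenant_id:
            raise ValueError("tenant_id is required")
        chunk = Chunk(tenant_id, text, tuple(vector))
        self._by_tenant.setdefault(tenant_id, []).append(chunk)

    def search(self, tenant_id: str, query_vector, k: int = 3) -> list[Chunk]:
        # Only the caller's own partition is ever ranked, so another
        # tenant's chunks cannot appear even when they match better.
        if not tenant_id:
            raise ValueError("tenant_id is required")
        candidates = self._by_tenant.get(tenant_id, [])
        ranked = sorted(
            candidates,
            key=lambda c: cosine(c.vector, query_vector),
            reverse=True,
        )
        return ranked[:k]
```

A quick way to check this in your own system: store a document for Tenant B, then query as Tenant A using Tenant B's exact vector. The result set should contain nothing from Tenant B.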
Model Supply Chain — 10 Checks
Where do your models come from? Are you pinning versions? Do you validate model integrity? What happens when your provider deprecates the model you depend on?
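For self-hosted weights, "validate model integrity" can be as simple as comparing a file's SHA-256 digest against a value you pinned when you approved that model version. The sketch below assumes you keep such a pinned digest somewhere trusted. The function names `sha256_of` and `verify_artifact` are illustrative and not part of any particular toolchain.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_artifact(path, pinned_sha256: str) -> bool:
    """Return True only if the file's digest equals the pinned digest."""
    actual = sha256_of(Path(path))
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, pinned_sha256.lower())
```

Run a check like this at deploy time and refuse to start the service when it fails. For hosted models there is no file to hash, so the equivalent is pinning an explicit model version identifier instead of a floating alias.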
Prompt Injection — 12 Checks
Can users override system prompts? Do you sanitize inputs? Are there jailbreak protections? Can indirect prompt injection via retrieved documents compromise your agent?
Output Validation — 10 Checks
Do you validate model outputs before displaying them? Can the model generate harmful content, leak system prompts, or produce executable code that runs unsandboxed?
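One common shape for output validation is to treat the model's reply as untrusted input: parse it strictly, then allow only the actions you have explicitly approved. The sketch below assumes the model is asked to return a JSON object with `action` and `argument` fields. That schema, the `ALLOWED_ACTIONS` set, and the 500-character limit are all made up for illustration.

```python
import json

# Hypothetical allow-list: the only actions downstream code will execute.
ALLOWED_ACTIONS = {"search_docs", "summarize"}
MAX_ARGUMENT_LENGTH = 500  # illustrative limit, tune for your use case


class OutputRejected(ValueError):
    """Raised when model output fails validation and must not be acted on."""


def parse_tool_call(raw: str) -> dict:
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise OutputRejected("output is not valid JSON") from exc

    if not isinstance(data, dict) or set(data) != {"action", "argument"}:
        raise OutputRejected("output must contain exactly 'action' and 'argument'")

    action, argument = data["action"], data["argument"]
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise OutputRejected("action is not on the allow-list")
    if not isinstance(argument, str) or len(argument) > MAX_ARGUMENT_LENGTH:
        raise OutputRejected("argument must be a short string")
    return data
```

Anything that raises `OutputRejected` is logged and dropped, never passed to a code interpreter or tool. This is a structural check only. It does not replace sandboxing or content filtering.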
Logging & Monitoring — 10 Checks
Are you logging prompts and responses? Can you detect anomalous usage patterns? Do you have alerting for cost spikes, abuse patterns, and model behavior changes?
Access Control — 8 Checks
Who can access the model? Are API keys rotated? Is there RBAC for different AI capabilities? Can you revoke access without redeploying?
Infrastructure — 8 Checks
Is your AI infrastructure isolated from production databases? Are GPU instances hardened? Do you have rate limiting? What's your blast radius if a model endpoint is compromised?
Business Continuity — 7 Checks
What happens when your AI provider has an outage? Do you have fallback models? Can your product degrade gracefully? Is there a kill switch for AI features?
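Fallback models, graceful degradation, and a kill switch can all live in one small wrapper around your model calls. The sketch below is one possible arrangement, not a prescribed design. `complete_with_fallback`, the `ai_enabled` flag, and the provider callables are hypothetical stand-ins for your own client code and feature-flag system.

```python
from typing import Callable, Sequence

DEGRADED_REPLY = "AI assistance is temporarily unavailable."


def complete_with_fallback(
    prompt: str,
    providers: Sequence[Callable[[str], str]],
    ai_enabled: bool = True,
    degraded_reply: str = DEGRADED_REPLY,
) -> str:
    """Try each provider in order; degrade to a static reply if all fail."""
    # Kill switch: when the flag is off, no provider is called at all.
    if not ai_enabled:
        return degraded_reply

    for call_provider in providers:
        try:
            return call_provider(prompt)
        except Exception:
            # A real system would log the failure and perhaps narrow the
            # exception types; here we simply move on to the next provider.
            continue

    # Every provider failed, so the feature degrades instead of erroring.
    return degraded_reply
```

The test that matters is operational: flip the flag in production-like conditions and confirm the rest of the product keeps working without the AI feature.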
Compliance — 8 Checks
Can you demonstrate AI governance to auditors? Do you have an AI risk register? Are you tracking regulatory requirements for your industry and geography?
Built for Teams Shipping AI Products
If someone told you "make it SOC 2 ready" or "prepare for a security review," start here.
Engineering Leads
You built the AI feature. Now someone wants a security review. This checklist tells you exactly what they'll ask and what "good" looks like for each item.
Security Teams
Your company shipped an AI product and you need to audit it. The checklist covers AI-specific attack vectors that traditional security reviews miss.
CTOs Preparing for Reviews
An enterprise prospect wants a security assessment of your AI product. Score yourself first so you know where you stand before they do.
100 checks. 10 categories. Know exactly where you stand.
- 100-point scored checklist across 10 security categories
- Pass/fail criteria for each check
- Scoring rubric with 4-tier interpretation
- Priority remediation order (fix the critical gaps first)
- Category-level scoring for targeted improvement
- Single-user commercial license
Instant download. Professional Word document (.docx) for easy customization and team sharing.
Questions
Is this a pass/fail test?
No. Each individual check has pass/fail criteria, but the assessment as a whole is scored. Each check is worth 1 point, for a total score out of 100, interpreted in four tiers: Strong (90+), Good (70-89), Needs Work (50-69), and Critical Gaps (below 50). The category-level breakdown shows you exactly where to focus remediation efforts.
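If you want to tally results in a script instead of by hand, the scoring reduces to a few lines. The category sizes and tier thresholds below come straight from this page. The function names and the per-category pass ratio are illustrative additions and do not reproduce the kit's own priority remediation order.

```python
# Number of checks per category, as listed on this page (sums to 100).
CATEGORY_CHECKS = {
    "Data Flow": 15,
    "Tenant Isolation": 12,
    "Model Supply Chain": 10,
    "Prompt Injection": 12,
    "Output Validation": 10,
    "Logging & Monitoring": 10,
    "Access Control": 8,
    "Infrastructure": 8,
    "Business Continuity": 7,
    "Compliance": 8,
}


def tier(score: int) -> str:
    """Map a 0-100 total to the four tiers named in the rubric."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Strong"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Needs Work"
    return "Critical Gaps"


def score_assessment(passed: dict[str, int]):
    """Return (total, tier, per-category pass ratio) from passed-check counts."""
    for name, count in passed.items():
        if name not in CATEGORY_CHECKS:
            raise KeyError(f"unknown category: {name}")
        if not 0 <= count <= CATEGORY_CHECKS[name]:
            raise ValueError(f"invalid passed count for {name}")
    total = sum(passed.get(name, 0) for name in CATEGORY_CHECKS)
    ratios = {
        name: passed.get(name, 0) / size
        for name, size in CATEGORY_CHECKS.items()
    }
    return total, tier(total), ratios
```

Categories missing from the input count as zero passed checks, so a partial review scores conservatively.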
How long does it take?
Plan on 2-4 hours for a thorough review with your engineering team. Some checks require inspecting code or infrastructure configs. The checklist is designed to be completed in a single working session, not spread across weeks.
Does this replace a pen test?
No. This is a self-assessment that identifies gaps before a pen test finds them for you. Use this to fix the obvious issues first, then bring in a pen tester to validate and find what you missed.
Should we run this before or after building?
Ideally both. Run it at architecture review stage (before building) to validate your design decisions. Run it again before launch to catch implementation gaps. The checklist is structured to be useful at both stages - some checks evaluate design decisions, others evaluate implementation specifics that only exist after you've built the system.
How often should we re-run it?
Quarterly, or whenever you make significant architecture changes - adding a new AI capability, switching model providers, adding multi-tenant features, integrating retrieval-augmented generation. The 10-category structure lets you scope the review to just the areas that changed rather than running all 100 checks every time.
Is this relevant for AI features, not just standalone AI products?
Yes. Whether you've built an AI-native product or added AI features to an existing application, the security considerations are the same. Prompt injection, tenant isolation, and output validation are just as relevant to a copilot feature in a SaaS product as they are to a standalone AI assistant. The checklist works for both architectures.
David A. Moline, CISSP | CISM
Your AI automation, built by someone who secures DoD systems.
Find Your Gaps Before Your Auditor Does
100 questions. 2-4 hours. A clear score and a prioritized fix list. Know exactly where your AI architecture stands.