The CISO’s AI Agent Production Approval Checklist: 7 Gates to Clear Before Go-Live
Mar 6, 2026
Your security team has done the homework. You’ve built a risk taxonomy covering agent escape, prompt injection, tool misuse, and data exfiltration. You’ve mapped those threats against your agent architecture’s seven layers. You’ve classified your agents by autonomy level — separating read-only chatbots from fully autonomous workflow agents that can book meetings, modify databases, and invoke other agents. The risk assessment is thorough. The governance documentation is solid.
And none of it is running in production.
The Gravitee State of AI Agent Security 2026 report puts numbers behind what most security teams already feel: only 14.4% of organizations have achieved full security and IT approval for their entire AI agent fleet, yet 80.9% of technical teams have moved past the planning phase into active testing or production. Meanwhile, 88% of organizations reported confirmed or suspected AI agent security incidents within the past year. The governance exists. The implementation doesn’t.
We call this the governance-implementation gap: the structural disconnect between having documented AI security frameworks and actually running security controls in production. It’s not that the available governance frameworks are wrong — threat taxonomies, multi-layer threat models, and autonomy classification matrices are genuinely useful for understanding the landscape. But they answer the question “what should we worry about?” They don’t answer the question your CISO is actually asking: What do we deploy first, in what order, and how do we build the program around it?
The governance-implementation gap isn’t caused by a lack of effort or awareness. It’s structural. AI agents present a sequencing problem that traditional security programs don’t face.
With traditional workloads, you can write security policies before deployment because the workloads are deterministic. A microservice makes the same API calls, accesses the same data stores, and follows the same execution paths every time. You can define a network policy, apply a role binding, and enforce least privilege based on the service’s declared behavior — because declared behavior and actual behavior are essentially the same thing.
AI agents break this assumption. They execute generated code. They traverse permissions dynamically based on prompt inputs. They invoke tools that weren’t in their original design because a developer added an MCP connection last Thursday. A single agent’s behavior can change from one interaction to the next depending on the prompt, the context window, and which tools are available. Declared behavior and actual behavior diverge — sometimes dramatically.
This means you can’t write enforcement policies for AI agents the way you write them for traditional workloads. If you try, you get one of two outcomes: policies so permissive they don’t actually constrain anything, or policies so restrictive they break the agent in production. Security teams know this intuitively, which is why they stall. They have the threat model, they understand the risks, but they don’t know what normal behavior looks like for their agents — so they can’t define policies, so they can’t enforce, so the agents run unsecured while the governance documentation sits in a shared drive.
The solution is an implementation methodology that starts with observation rather than policy. You don’t write policies first and hope they match reality. You observe reality first and generate policies from what you see. That’s the core principle behind the Observe → Posture → Detect → Enforce framework: each stage’s outputs become the next stage’s required inputs.
The framework works as a dependency chain. You cannot assess posture meaningfully without runtime observability data — because without observing what an agent actually does, your posture assessment compares declared permissions against other declared permissions, which tells you nothing about actual risk.
You cannot detect AI-specific threats accurately without behavioral baselines — because without knowing what “normal” looks like, every unfamiliar agent action becomes a potential alert. And you cannot enforce controls safely without the evidence that observation and detection provide.
| Stage | What It Answers | Requires From Previous Stage | Enables Next Stage |
|---|---|---|---|
| 1. Observe | What AI agents exist in my clusters, and what do they actually do at runtime? | — (starting point) | Runtime behavioral data for posture assessment |
| 2. Posture | What’s the gap between declared permissions and observed behavior? | Observability data showing actual runtime activity | Behavioral baselines for accurate threat detection |
| 3. Detect | Is this new behavior an AI-specific attack or a legitimate workflow change? | Behavioral baselines defining “normal” agent activity | Confirmed threats informing enforcement decisions |
| 4. Enforce | How do I constrain agents to observed behavior without breaking production? | Baselines + detection data proving what’s normal vs. malicious | Production-safe least privilege per agent |
Each stage has its own implementation depth that goes well beyond what a summary table can convey.
The Observe stage alone involves automatic runtime discovery of AI agents, inference servers, and MCP tool runtimes, plus generation of a runtime-derived AI-BOM and full agent execution graph mapping — all of which is covered in the walkthrough on how runtime observability actually works for AI agents in Kubernetes.
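To make the execution-graph idea concrete, here is a minimal sketch of how runtime events might fold into Agent → Tool → API → Data → Identity paths. The `RuntimeEvent` shape, the sample events, and `build_execution_graph` are hypothetical illustrations, not an actual ARMO or Kubescape API:

```python
# A minimal sketch of the Observe stage's execution-graph output.
# All names and sample data here are hypothetical.
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class RuntimeEvent:
    agent: str     # workload that initiated the action
    tool: str      # MCP tool or function the agent invoked
    api: str       # API endpoint the tool reached
    data: str      # data store or dataset touched
    identity: str  # service account / identity used

def build_execution_graph(events):
    """Fold runtime events into Agent -> Tool -> API -> Data -> Identity paths."""
    graph = defaultdict(set)
    for e in events:
        # Each event contributes one observed path through the five layers.
        graph[e.agent].add((e.tool, e.api, e.data, e.identity))
    return graph

events = [
    RuntimeEvent("billing-agent", "mcp-sql", "postgres:5432", "invoices", "sa/billing"),
    RuntimeEvent("billing-agent", "mcp-http", "api.stripe.com", "charges", "sa/billing"),
    RuntimeEvent("scheduler-agent", "mcp-calendar", "graph.microsoft.com", "calendars", "sa/scheduler"),
]

for agent, paths in build_execution_graph(events).items():
    print(agent)
    for tool, api, data, identity in sorted(paths):
        print(f"  {tool} -> {api} -> {data} (as {identity})")
```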
The Posture stage introduces the concept of comparing what agents can do versus what they actually do — a distinction that runtime-informed AI security posture management makes possible but that static scanning structurally misses.
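A minimal sketch of that gap analysis, assuming you already have a declared permission set (pulled from manifests and RBAC) and an observed set (pulled from runtime data); the `api-N` identifiers are invented for illustration:

```python
# Declared vs. observed: the "can do vs. actually does" comparison.
declared = {f"api-{i}" for i in range(1, 48)}     # 47 APIs granted in config
observed = {"api-3", "api-12", "api-40"}          # 3 APIs actually used at runtime

unused = declared - observed       # candidates for revocation
undeclared = observed - declared   # agent reaching beyond its grants: investigate

print(f"Declared: {len(declared)}, observed: {len(observed)}")
print(f"Excessive access ({len(unused)} APIs): revoke after the baseline stabilizes")
print(f"Undeclared usage ({len(undeclared)} APIs): possible drift or misconfiguration")
```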
The Detect stage requires understanding why AI-specific threat detection differs fundamentally from traditional container security alerting — agents don’t just execute known-bad binaries, they generate novel behaviors that require behavioral context to evaluate.
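Here is a minimal sketch of what baseline-aware triage might look like. The baseline shape, event fields, and `classify` helper are hypothetical simplifications of what a real detection engine would do:

```python
# Baseline-aware detection: evaluate actions against observed behavior,
# not against known-bad signatures. All structures are hypothetical.
baseline = {
    "billing-agent": {
        "tools": {"mcp-sql", "mcp-http"},
        "destinations": {"postgres:5432", "api.stripe.com"},
    }
}

def classify(agent, tool, destination):
    """Return a triage label for a runtime action relative to the baseline."""
    known = baseline.get(agent)
    if known is None:
        return "ALERT: unknown agent (possible shadow AI)"
    deviations = []
    if tool not in known["tools"]:
        deviations.append(f"new tool {tool!r}")
    if destination not in known["destinations"]:
        deviations.append(f"new destination {destination!r}")
    if not deviations:
        return "ok: within baseline"
    # A deviation is a signal to correlate, not automatically an incident:
    # it may be an attack, or a legitimate workflow change shipped last week.
    return "REVIEW: " + ", ".join(deviations)

print(classify("billing-agent", "mcp-sql", "postgres:5432"))
print(classify("billing-agent", "mcp-shell", "203.0.113.7:4444"))
```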
And the Enforce stage depends on progressive sandboxing that promotes observed baselines into enforcement policies rather than starting with restrictive rules that break production.
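A minimal sketch of that promotion step, reusing the hypothetical baseline shape from the detection sketch. In a real system the resulting policy would compile down to eBPF or network-level controls rather than live as a Python dict:

```python
# Progressive enforcement: promote an observed baseline into an allowlist
# policy that starts in visibility-only mode. Hypothetical structures throughout.
def promote_baseline(agent, baseline, mode="audit"):
    return {
        "agent": agent,
        "allow_tools": sorted(baseline["tools"]),
        "allow_destinations": sorted(baseline["destinations"]),
        "mode": mode,  # "audit": log violations; "enforce": block them
    }

policy = promote_baseline("billing-agent", {
    "tools": {"mcp-sql", "mcp-http"},
    "destinations": {"postgres:5432", "api.stripe.com"},
})
print(policy)

# Only after audit mode runs clean (no false violations) is the policy promoted:
policy["mode"] = "enforce"
```

The important property is the mode flag: violations are logged and reviewed long before anything is blocked.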
If you want the complete methodology with evaluation criteria and technical specifics for each stage, the 4-pillar approach to AI agent security covers that ground in depth. The rest of this article focuses on what that guide doesn’t: how to build the organizational program around this methodology — the timelines, ownership, engineering coordination, and success metrics that turn a framework into a running security program.
Framework understanding doesn’t automatically translate into a running security program. Between “we know the methodology” and “our agents are secured in production” sits the organizational work that governance frameworks leave out: concrete timelines, team ownership, engineering buy-in, and measurable milestones that prove the program is working.
| Phase | Timeline | Key Activities | Deliverables |
|---|---|---|---|
| Phase 1: Discovery Sprint | Weeks 1–2 | Deploy eBPF-based runtime sensor across clusters. Auto-discover AI agents, inference servers, MCP tool runtimes. Generate initial AI-BOM (runtime bill of materials). Map agent execution graphs: Agent → Tool → API → Data → Identity. | Complete AI workload inventory. Shadow AI identified. Behavioral data collection active. |
| Phase 2: Posture Assessment | Weeks 3–4 | Compare declared permissions vs. observed behavior. Identify excessive access (47 APIs available, 3 actually used). Run AI supply chain risk scan. Establish behavioral baselines per workload. | Risk gap analysis. AI-specific vulnerabilities with runtime context. Behavioral baseline v1. |
| Phase 3: Detection Tuning | Month 2+ | Enable AI-specific detection: agent escape, prompt injection, tool misuse. Tune against baselines to reduce false positives. Configure attack story correlation aggregating signals across layers. | AI-aware alerting operational. False positive rate measurable. Context-rich investigation workflow. |
| Phase 4: Progressive Enforcement | Month 3+ | Start in visibility-only mode. Validate generated policies against behavior. Gradually promote to enforcement. Differentiate policies per agent type. | Production-safe least privilege per agent. eBPF sandboxing active. Continuous refinement. |
Two things make this timeline realistic rather than aspirational.
First, Phase 1 delivers value immediately. Discovery and initial behavioral visibility provide actionable intelligence from day one. Your security team can see which AI agents exist in your clusters, which ones they didn’t know about, and what those agents are doing in production. The Gravitee report found that 47% of enterprise AI agents operate without any security oversight or logging. For most organizations, simply knowing what exists and what it’s doing represents a meaningful security improvement over the current state — and it gives you the data to make the case for Phases 2 through 4.
Second, enforcement doesn’t require writing policies from scratch. By the time you reach Phase 4, you’ve accumulated weeks of behavioral data showing exactly what each agent does — which tools it calls, which APIs it reaches, which network destinations it contacts, which data it accesses. Policies are generated from evidence, not guesswork. This is what solves the paralysis that stalls most AI agent enforcement programs: you know you should enforce least privilege, but you can’t write the policies because you don’t yet understand what the agents need. The observe-first workflow eliminates that problem by definition.
AI agent security sits at the intersection of security and platform engineering. In most organizations, the security team owns detection rules, enforcement policy, and incident response, while the platform team owns sensor deployment, cluster management, and observability infrastructure. This is not a security-only initiative — deploying runtime monitoring across Kubernetes clusters requires the people who manage those clusters.
Getting platform engineering buy-in comes down to two things: performance data and workflow design. eBPF-based runtime monitoring runs at 1–2.5% CPU and 1% memory overhead — within the performance budget most platform teams already accept for observability tooling. And the observe-first approach means zero production disruption during Phases 1 and 2. No policies are enforced, no traffic is blocked, no agents are constrained. Enforcement only activates in Phase 4, after behavioral validation confirms that generated policies match observed behavior. Nobody is asked to accept controls that might break their agents on day one.
The relationship to your existing CSPM is complementary, not replacement. Your existing posture tools continue handling infrastructure-level findings — misconfigured storage, overprivileged IAM roles, unpatched images. The AI agent security program adds the runtime behavioral layer that CSPM structurally cannot provide: visibility into what agents actually do versus what they’re configured to do. If your CSPM surfaces 500 findings on AI workloads, most are theoretical. Runtime context reduces that noise by surfacing only the findings that represent actual risk — vulnerabilities in packages that are loaded, in code paths that execute, in workloads that are externally reachable.
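A minimal sketch of that reachability filter, with hypothetical finding records standing in for what a runtime sensor would supply:

```python
# Runtime-based noise reduction: keep only findings whose vulnerable package
# is actually loaded in a workload that is actually reachable.
findings = [
    {"cve": "CVE-2026-0001", "package": "libfoo", "loaded": False, "reachable": True},
    {"cve": "CVE-2026-0002", "package": "libbar", "loaded": True,  "reachable": False},
    {"cve": "CVE-2026-0003", "package": "libbaz", "loaded": True,  "reachable": True},
]

actionable = [f for f in findings if f["loaded"] and f["reachable"]]
reduction = 100 * (1 - len(actionable) / len(findings))
print(f"{len(findings)} findings -> {len(actionable)} actionable ({reduction:.0f}% noise removed)")
```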
Measure the program in the same order you build it. Start with discovery metrics: how many AI agents were found that the security team didn’t know about? Then noise reduction: runtime reachability analysis should reduce CVE findings by 90% or more, eliminating theoretical vulnerabilities that aren’t reachable in your runtime environment. Track investigation time: attack story correlation should compress what used to take hours of manual signal assembly into minutes of reviewing a unified narrative. And track enforcement coverage: what percentage of your AI agents have active behavioral policies? These metrics give leadership quantified evidence that the program is reducing risk — not just deployed, but measurably working.
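These metrics reduce to simple arithmetic once the underlying counts exist; a minimal sketch with hypothetical numbers:

```python
# Program success metrics from counts the inventory and policy engine supply.
known_before = 12   # agents the security team knew about before Phase 1
discovered = 23     # agents found by runtime discovery
enforced = 9        # agents with active behavioral policies

shadow_ai = discovered - known_before
coverage = 100 * enforced / discovered

print(f"Shadow AI surfaced: {shadow_ai} agents")
print(f"Enforcement coverage: {coverage:.0f}% of discovered agents")
```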
The Observe → Posture → Detect → Enforce methodology applies regardless of where your AI workloads run. But each cloud platform and regulated industry adds implementation considerations that affect how you execute each stage.
Every major cloud provider now offers native AI security capabilities — GuardDuty for SageMaker on AWS, Defender for AI on Azure, Security Command Center for Vertex AI on Google Cloud. These are valuable, particularly for the Observe and Posture stages. But the critical question for each platform is: does the native tooling provide runtime behavioral visibility into AI agents, or just posture scanning of the infrastructure they run on? For production agents requiring behavioral baselining and progressive enforcement, teams typically need specialized tooling that provides the full Observe-through-Enforce sequence.
The implementation details differ meaningfully by platform. On AWS, the way GuardDuty’s SageMaker coverage interacts with EKS pod-level monitoring and IAM roles for service accounts creates specific integration patterns that affect how each framework stage deploys on EKS. Azure teams face a different set of challenges — Defender for AI’s coverage boundaries, managed identity for agent workloads, and Azure Policy’s role in the Enforce stage mean that AKS implementations require their own approach to the dependency chain. On Google Cloud, Vertex AI’s native features handle parts of the Observe stage well, but Workload Identity Federation introduces identity management nuances that change how the Detect and Enforce stages operate on GKE.
The hardest implementation challenge belongs to multi-cloud teams. Consistent observability across providers, unified policy management, and cross-cluster behavioral correlation — understanding how an agent on EKS interacts with services on AKS — requires a unifying runtime layer that isn’t tied to any single cloud’s native tooling. Without that layer, you end up with three separate, inconsistent implementations of the same methodology.
Regulated industries don’t change the methodology — you still observe before you posture, posture before you detect, and detect before you enforce. But they add specific outputs at each stage: continuous monitoring mapped to regulatory controls, audit-ready evidence, data residency constraints, and industry-specific detection rules.
Financial services teams, for example, need model risk management integrated into the Observe stage, SOX and PCI controls layered into Posture, suspicious activity reporting tied to Detect, and audit evidence generated from Enforce. The quantified success metrics from the previous section matter doubly here — board-level communication of AI security program maturity requires numbers, not narratives.
Healthcare organizations face a different compliance dimension: HIPAA-specific observability around PHI flow tracking through agent interactions, posture controls mapped to administrative and technical safeguards, and enforcement evidence that satisfies auditor requirements. The way PHI data flows through AI agents in Kubernetes creates detection and enforcement requirements that general-purpose tools miss entirely.
Every phase in the program timeline above maps to a specific ARMO platform capability. Phase 1’s runtime discovery uses ARMO’s Kubernetes-first AI workload detection and AI-BOM generation. Phase 2’s posture assessment uses runtime-informed gap analysis comparing declared permissions against observed behavior. Phase 3’s detection uses ARMO’s Cloud Application Detection & Response with LLM-powered attack story correlation that assembles individual signals — a suspicious prompt, an unusual tool call, an unexpected network connection — into a single narrative. Phase 4’s enforcement uses eBPF-based progressive sandboxing that promotes behavioral baselines into production-safe policies.
That alignment isn’t coincidental. ARMO built its AI workload security capabilities around this same dependency chain — specifically to be the operational layer that closes the governance-implementation gap. The quantified outcomes: 90%+ CVE noise reduction through runtime reachability (Phase 2). 90%+ faster investigation through attack story correlation (Phase 3). 80%+ reduction in issue overload through runtime-based prioritization. All at 1–2.5% CPU and 1% memory overhead.
The platform is built on Kubescape, one of the most widely adopted open-source Kubernetes security tools available, used by over 100,000 organizations. That open-source foundation means your team can start Phase 1 with Kubescape before committing to the full platform — proving value with free runtime observability, then expanding into posture, detection, and enforcement as behavioral data accumulates. Kubernetes-native, eBPF-based, no sidecars, no code changes.
To see how each program phase maps to your environment: book a demo or start Phase 1 with Kubescape.
What’s the difference between AI agent security and traditional container security?
Traditional container security works because workloads are deterministic — they make the same calls and follow the same paths every time, so policies can be written from static configuration. AI agents break that assumption by executing generated code, traversing permissions dynamically, and behaving differently from one interaction to the next. That non-deterministic behavior means runtime behavioral observation becomes the foundation of security rather than something layered on after policies are written.
Can I secure AI agents without runtime monitoring?
You can apply baseline hygiene — network policies, RBAC, image scanning — but you can’t achieve meaningful least privilege. A deployment manifest might declare access to 47 APIs while the agent only uses 3 in normal operation. Without runtime data showing which 3, you’re either leaving all 47 open or guessing which to restrict.
What is an AI-BOM and why does it matter?
An AI-BOM (AI Bill of Materials) is a runtime-derived inventory of what an AI workload actually uses — the models it loads, RAG sources it connects to, external APIs it calls, and frameworks it runs on. The gap between what’s declared in a deployment manifest and what’s exercised at runtime is where the biggest risks hide. Without this inventory, your posture assessment compares declarations against other declarations rather than measuring actual risk.
How does this framework handle shadow AI?
Shadow AI is the primary reason Phase 1 (Observe) must come first. The framework uses automatic runtime discovery to detect AI agents, inference servers, and MCP tool runtimes as they appear across clusters — without requiring developers to tag or register workloads. Industry data shows 47% of enterprise AI agents operate without any security oversight, so discovery that depends on manual registration misses exactly the workloads that represent your biggest blind spot.