How to Harden AI Agents in Cloud Environments: The 9 Capabilities Your Stack Must Provide

May 6, 2026

Shauli Rozen
CEO & Co-founder

Key takeaways

  • Why does CIS-Benchmark-style hardening miss the point for AI agents? CIS Benchmarks check configuration before runtime. AI agents introduce non-deterministic behavior no pre-runtime configuration can fully constrain — which is why "hardened" environments still ship agents that exfiltrate through allowed channels.
  • What are the four phases of AI agent hardening? Declare (set baselines and inventory before runtime), Observe (build per-agent behavioral baselines from production behavior), Enforce (auto-generate and progressively roll out controls from observed behavior), and Reconcile (continuously detect drift and produce audit-grade evidence). Capabilities that don't map to one of these phases are features, not hardening lifecycle support.
  • Where do most tool stacks have hardening gaps? Phases 2 through 4. Most categories cover Phase 1 well because Phase 1 looks like traditional posture management. The gap opens the moment the agent starts running and behavior becomes the source of truth.

Most “hardening” advice for AI agents is a checklist of things to configure before the agent runs. CIS Kubernetes Benchmark gates. Pod Security Standards baselines. NetworkPolicy templates. None of it’s wrong — it’s just one of four phases, the one your stack already covers.

The other three are Observe, Enforce, and Reconcile. They’re where AI agents actually get breached, and they’re where most stacks have nothing. The gap is invisible on a CIS report and obvious in an incident postmortem — which is the wrong order to find it.

Consider a customer-support AI agent running on EKS at a mid-market SaaS company. The CIS Kubernetes Benchmark passes. A NetworkPolicy locks egress to three allowlisted endpoints: Salesforce, Zendesk, and the company’s S3 export bucket. Per-Pod seccomp profiles are in place. After six weeks in production, there have been zero alerts.

In week seven, an attacker submits a customer support ticket with instructions buried in the body. The agent reads it, follows the instructions, and writes 2,400 customer records to the S3 bucket — the same bucket the legitimate weekly export uses. Every dashboard stays green.

Every Phase 1 control did exactly what it was designed to do. The breach lived in territory CIS doesn’t have a category for: drift in agent behavior, exploited through a sanctioned channel. The other three phases are where the gaps live. The audit list below is how you find them.

Hardening as a Four-Phase Lifecycle

Phase 1 — Declare. What you set up before the agent runs: runtime AI inventory, per-agent class security baseline, capability map of attached tools, models, MCP servers, and prompt templates. Most CSPM tools cover this phase competently — it looks like traditional posture management. Phase 1 is necessary, not sufficient.

Phase 2 — Observe. Per-agent behavioral baseline plus a reconciliation report comparing what the agent declared at Phase 1 against what it exercised in production. Per-Pod baselines fragment under autoscaling; cluster-level baselines lose per-agent variance. The right unit is the Deployment — stable across replica churn, specific enough to encode per-agent behavior. ARMO’s Application Profile DNA is the canonical instance.
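A minimal sketch of why the Deployment is the stable unit, assuming each observed event carries the Pod name and that Pods follow the usual Deployment naming pattern `<deployment>-<replicaset-hash>-<pod-suffix>`. The event tuples, behavior strings, and the `deployment_of` helper are hypothetical and are not ARMO's implementation; a real tool would more likely resolve ownership from the Pod's owner references than by parsing names.

```python
from collections import defaultdict

def deployment_of(pod_name: str) -> str:
    """Strip the ReplicaSet hash and Pod suffix from a Deployment-managed Pod name.

    Assumption: names look like '<deployment>-<rs-hash>-<pod-suffix>'.
    """
    return pod_name.rsplit("-", 2)[0]

def build_baselines(events):
    """Group observed (pod_name, behavior) events into per-Deployment behavior sets."""
    baselines = defaultdict(set)
    for pod_name, behavior in events:
        baselines[deployment_of(pod_name)].add(behavior)
    return dict(baselines)

# Hypothetical observations: three replicas of one agent, one replica of another.
events = [
    ("support-agent-7d9f8c6b5-x2k4q", "connect:zendesk.example:443"),
    ("support-agent-7d9f8c6b5-p9w1z", "connect:salesforce.example:443"),
    ("support-agent-6c5b4a3d2-m3n8r", "connect:zendesk.example:443"),  # replica from a later rollout
    ("retrieval-agent-5f6e7d8c9-a1b2c", "open:/data/index"),
]
baselines = build_baselines(events)
```

Keyed by Pod, these four events would produce four thin baselines that disappear with their Pods. Keyed by Deployment, they produce two baselines that survive replica churn and rollouts.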

Phase 3 — Enforce. NetworkPolicies, seccomp envelopes (not tight profiles — inference workloads vary too much to survive a model update), and IAM scope tightenings derived from observed behavior. Rolled out per agent class first in monitor-only mode, then enforcement, with the blast radius measured before reaching the next agent class. ARMO’s observe-to-enforce methodology walks through the pattern.
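As an illustration of deriving an enforcement artifact from observation, the sketch below builds an egress-only NetworkPolicy manifest from observed destinations. It assumes the observations have already been resolved to IP and port pairs and that the agent's Pods carry an `app` label matching the Deployment name; the function name, the label key, and the sample addresses are all hypothetical. The manifest fields follow the `networking.k8s.io/v1` NetworkPolicy schema.

```python
import json

def egress_policy(deployment, namespace, observed):
    """Build an egress-only NetworkPolicy manifest from observed (ip, port) pairs."""
    rules = [
        {
            "to": [{"ipBlock": {"cidr": f"{ip}/32"}}],
            "ports": [{"protocol": "TCP", "port": port}],
        }
        for ip, port in sorted(set(observed))  # de-duplicate repeated observations
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"{deployment}-observed-egress", "namespace": namespace},
        "spec": {
            # Assumption: Pods are labeled app=<deployment>.
            "podSelector": {"matchLabels": {"app": deployment}},
            "policyTypes": ["Egress"],
            "egress": rules,
        },
    }

# Hypothetical observations (documentation-range addresses).
observed = [("203.0.113.10", 443), ("203.0.113.10", 443), ("198.51.100.7", 443)]
policy = egress_policy("support-agent", "agents", observed)
print(json.dumps(policy, indent=2))
```

The core NetworkPolicy API has no monitor-only switch, so the monitor-first stage of the rollout has to come from the tooling around the policy rather than from the manifest itself.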

Phase 4 — Reconcile. Every traditional hardening framework — CIS Benchmarks, STIG, NIST 800-53 — covers Phase 1. Continuous-monitoring frameworks (Cloud Controls Matrix, parts of SOC 2) reach into Phases 2 and 3. None have Phase 4. None were built for workloads whose behavior changes between deploys.

Reconcile is the continuous loop. Benign evolution (model updates, prompt revisions, tool catalog expansions) has to be separated from drift (compromise, prompt injection residue, dependency tampering) — and the residue of indirect prompt injection (OWASP LLM01:2025) often shows up as legitimate-looking tool use (LLM06:2025) days later. The phase produces both the drift signal and the cross-layer audit trail a regulator might one day demand. It maps to the Manage function in the NIST AI Risk Management Framework.

Capabilities that don’t map to a phase are features, not hardening lifecycle support.

The Nine-Capability Hardening Demand List

Two capabilities define what your stack must declare before runtime, three what it must observe, two what it must enforce, two what it must reconcile.

| # | Capability | Phase | Evidence the capability produces | Tool category that typically fails this |
|---|---|---|---|---|
| 1 | Runtime-Derived AI Inventory | Declare | Inventory of what the agent actually loads at startup: model identity, framework version with applied patches, attached MCP servers, prompt template version | Declared-manifest scanners, SCA tools, CSPM with an AI tag |
| 2 | Per-Agent Class Security Baseline Derivation | Declare | Differentiated security baseline per agent class regardless of namespace topology | Cluster-level admission controllers without per-agent variance |
| 3 | Per-Agent Behavioral Baseline at Deployment Level | Observe | Baseline convergence chart showing variance decreasing toward a stable envelope per Deployment | Tools producing per-Pod baselines or single-cluster baselines |
| 4 | Declared-vs-Observed Scope Reconciliation | Observe | Reconciliation report sorting findings into Unused Excess vs Inherited Overreach categories | CIEM tools producing single-axis “excess” findings |
| 5 | Observation-Window Risk Classification | Observe | Track classification mechanism for high-regulatory-cost agents (pre-enforced track for staging-built baselines) | Tools with a single observation methodology |
| 6 | Auto-Generated Enforcement Artifacts | Enforce | NetworkPolicies, seccomp envelopes, and IAM scope tightenings derived from observed behavior with the trace each was derived from | Tools that recommend policies without generating them, or require manual templating |
| 7 | Per-Agent Enforcement Granularity at Pod Level | Enforce | Two different agents in the same cluster running under different auto-generated syscall envelopes simultaneously | Generic eBPF tools (Tetragon, Falco) with manually authored TracingPolicies |
| 8 | Drift Classification That Separates Evolution From Threat | Reconcile | 30-day classification broken into expected evolution (correlated with deploy events) vs risky drift (uncorrelated) | Anomaly-detection tools without deployment-correlation logic |
| 9 | Cross-Layer Audit Trail That Survives a Regulatory Reconstruction | Reconcile | Audit row tying kernel events, network egress, application-layer prompt content, and tool invocation sequences to a triggering input | SIEMs operating on disconnected alert streams without cross-layer correlation |

Phase 1 capabilities (1–2). Declared manifests omit dynamically loaded models, side-loaded MCP servers, and framework-version drift the agent picks up at startup. The runtime inventory has to come from what the agent actually loads. A code-execution agent is not the same security profile as a read-only retrieval agent; namespace-level Pod Security Standards profiles can’t encode per-agent variance.
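The gap between a declared manifest and a runtime inventory can be sketched as a field-by-field comparison. This is an illustrative sketch only: the field names, the `agentlib` package, the model names, and the MCP server names are all hypothetical, and the inventory format of any real tool will differ.

```python
def inventory_gaps(declared: dict, runtime: dict) -> dict:
    """Compare declared and runtime inventories field by field.

    Scalar fields are reported when the values differ; list fields are
    reported when the runtime holds entries the manifest never declared.
    """
    gaps = {}
    for key in sorted(set(declared) | set(runtime)):
        want, got = declared.get(key), runtime.get(key)
        if isinstance(want, list) or isinstance(got, list):
            extra = sorted(set(got or []) - set(want or []))
            if extra:
                gaps[key] = {"undeclared": extra}
        elif want != got:
            gaps[key] = {"declared": want, "loaded": got}
    return gaps

# Hypothetical manifest versus what the running process actually loaded.
declared = {
    "model": "model-a",
    "framework": "agentlib==1.4.0",
    "mcp_servers": ["tickets"],
    "prompt_template": "v12",
}
runtime = {
    "model": "model-a-fallback",       # a fallback path triggered
    "framework": "agentlib==1.4.2",    # version drift picked up at startup
    "mcp_servers": ["tickets", "shell"],  # side-loaded server
    "prompt_template": "v12",
}
gaps = inventory_gaps(declared, runtime)
```

In this example the prompt template matches and is not reported, while the model, the framework version, and the side-loaded `shell` server all surface as gaps that a manifest-only scanner would never see.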

Phase 2 capabilities (3–5). Per-Pod baselines fragment because Pods are ephemeral; per-Deployment baselines converge because the Deployment is stable. The reconciliation report has to sort findings into Unused Excess and Inherited Overreach — single-axis CIEM “excess” findings miss the second category, where the agent’s effective scope lives. For agents handling regulated data, time spent being observed in production is itself a compliance exposure.
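The article names the two reconciliation categories without defining them, so the sketch below rests on an explicit assumption: Unused Excess is taken to mean permissions granted directly to the agent that it never exercised, and Inherited Overreach is taken to mean permissions the agent did exercise but reached only through an inherited identity, such as a shared role. The function name and the third `unexplained` bucket are hypothetical additions for illustration.

```python
def reconcile_scope(declared, inherited, exercised):
    """Sort permissions into reconciliation categories.

    Assumed definitions (not taken from the article):
      unused_excess       - declared for the agent, never exercised
      inherited_overreach - exercised, but only available via an inherited identity
      unexplained         - exercised, with no declared or inherited grant on record
    """
    declared, inherited, exercised = set(declared), set(inherited), set(exercised)
    return {
        "unused_excess": sorted(declared - exercised),
        "inherited_overreach": sorted((exercised & inherited) - declared),
        "unexplained": sorted(exercised - declared - inherited),
    }

report = reconcile_scope(
    declared={"s3:GetObject", "s3:PutObject", "sqs:SendMessage"},
    inherited={"s3:ListBucket", "secretsmanager:GetSecretValue"},
    exercised={"s3:GetObject", "s3:PutObject", "secretsmanager:GetSecretValue"},
)
```

A single-axis "excess" check would flag only `sqs:SendMessage` here. The second axis is what surfaces the secret read the agent performed through a grant nobody declared for it.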

Phase 3 capabilities (6–7). Hand-authored policies for AI agents lag behavior by definition. Generic eBPF tools support per-Pod TracingPolicies, but the policies require manual authoring — intractable at scale for non-deterministic workloads. Capability 7 is whether the tool generates differentiated profiles per agent class from observed behavior, without manual templating.
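A rough sketch of what differentiated, generated profiles could look like: one seccomp profile per agent class, built from the syscalls observed for that class. The output uses the JSON seccomp profile layout accepted by container runtimes (`defaultAction` plus a `syscalls` list). The `headroom` parameter is one hypothetical reading of "envelope": extra syscalls allowed beyond those observed so the profile tolerates minor variation. The agent names and syscall lists are invented for the example.

```python
import json

def seccomp_envelope(observed_syscalls, headroom=(), enforce=False):
    """Build a seccomp profile allowing observed syscalls plus declared headroom.

    In monitor-only mode, syscalls outside the envelope are logged
    (SCMP_ACT_LOG); in enforce mode they fail with an error (SCMP_ACT_ERRNO).
    """
    allowed = sorted(set(observed_syscalls) | set(headroom))
    return {
        "defaultAction": "SCMP_ACT_ERRNO" if enforce else "SCMP_ACT_LOG",
        "syscalls": [{"names": allowed, "action": "SCMP_ACT_ALLOW"}],
    }

# Hypothetical per-class observations.
observed = {
    "retrieval-agent": ["read", "openat", "close", "connect", "sendto", "recvfrom"],
    "code-exec-agent": ["read", "write", "openat", "close", "execve", "clone", "wait4"],
}
profiles = {
    name: seccomp_envelope(calls, headroom=["futex", "mmap"])
    for name, calls in observed.items()
}
print(json.dumps(profiles["retrieval-agent"], indent=2))
```

Both profiles start in monitor-only mode. Passing `enforce=True` switches the default action once the logged violations for that agent class have been reviewed.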

Phase 4 capabilities (8–9). A working drift classification combines three signals: deployment correlation, pattern continuity, resource bounds. Single-axis anomaly scoring fails this, as ARMO’s intent drift detection walks through. The audit trail is a different artifact — what a regulator’s investigator pulls three months later, tying behavior to its triggering input across kernel, network, application, and tool invocation layers.

Reading the Capability Map: Where Each Tool Category Stops

No product fits cleanly into one category. Wiz and Prisma Cloud span CSPM, CNAPP, and CWPP; Sysdig and Aqua span CWPP and runtime detection. Score whatever covers a category in your stack against the four phases.

| Tool category | Phase 1: Declare | Phase 2: Observe | Phase 3: Enforce | Phase 4: Reconcile |
|---|---|---|---|---|
| CSPM | Partial: declared posture only | None | None | None |
| CNAPP | Partial: declared posture plus some runtime | Partial: workload-level baselines | Partial: namespace-level enforcement | None: alert correlation, not drift classification |
| CWPP | Partial: declared posture | Partial: process-level baselines that fragment for AI agents | Strong: pod-level enforcement | Partial: anomaly detection without lineage |
| Generic eBPF (Tetragon, Falco) | None | Partial: kernel telemetry without AI awareness | Strong: kernel-level enforcement, manual policies | None: semantic gap between syscalls and intent |
| AI-aware runtime CADR | Strong: runtime AI-BOM | Strong: per-Deployment behavioral baselines | Strong: auto-generated, per-agent enforcement | Strong: drift classification plus cross-layer audit |

The fifth row is the architectural shape of an AI-aware runtime CADR — the category that has to cover all four phases, because none of the first four were built for non-deterministic workloads. The demand list is the architectural specification.

Where the Demand List Shifts for Managed Agent Platforms

If your agents deploy through Bedrock Agents, Vertex AI Agent Builder, or Azure AI Foundry, three capabilities have different evidence demands. Phase 1 inventory comes from provider APIs, not Kubernetes manifests. Phase 3 enforcement targets shift from kernel-level seccomp to provider IAM. Phase 4 reconciliation depends on provider-side telemetry; if the platform doesn’t surface fine-grained tool invocation telemetry, no third-party tool can produce the cross-layer audit trail. The capability map shifts toward “architecturally constrained by provider” rather than “fully covered by stack.”

The Five-Question Vendor Diagnostic

After the live demo, the question changes from “what does this tool do” to “which of the nine capabilities does it actually support.” The AI workload security tool evaluation checklist runs in the demo. This runs after.

Question 1 — Show me the runtime-derived AI inventory for this agent at 14:32 last Tuesday

Feature-level: a dashboard showing the model identity declared in the deployment manifest, framework version pinned in requirements, and tools listed in the agent config — all declared, all pre-deployment.

Lifecycle-support: the runtime inventory captured at the timestamp — model identity actually loaded by the running process (which can differ from manifest if a fallback path triggered), framework version with patches at startup, tool catalog including side-loaded MCP servers, and prompt template version. The same query against today surfaces what’s changed. If the inventory is the manifest, the vendor doesn’t have Phase 1.

Question 2 — Show me the per-agent behavioral baseline convergence chart at Deployment level over the past 14 days

Lifecycle-support: a chart with measurable convergence — variance decreasing toward a stable envelope. Feature-level: an alert stream. Alert streams aren’t baselines.
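One simple way to make "measurable convergence" concrete is to count how many previously unseen behaviors each day adds to the baseline. This is a stand-in metric chosen for illustration, not necessarily the variance measure a vendor chart would plot; the function names, the three-quiet-days threshold, and the sample data are assumptions.

```python
def novelty_per_day(daily_behaviors):
    """Return, for each day, how many behaviors were not seen on any earlier day."""
    seen, novelty = set(), []
    for behaviors in daily_behaviors:
        today = set(behaviors)
        novelty.append(len(today - seen))
        seen |= today
    return novelty

def has_converged(novelty, quiet_days=3):
    """Treat the baseline as converged once the last `quiet_days` days added nothing new."""
    return len(novelty) >= quiet_days and all(n == 0 for n in novelty[-quiet_days:])

# Hypothetical per-day syscall observations for one Deployment.
days = [
    ["read", "openat", "connect"],
    ["read", "openat", "connect", "sendto"],
    ["read", "connect", "recvfrom"],
    ["read", "openat", "sendto"],
    ["read", "connect"],
    ["openat", "recvfrom"],
]
novelty = novelty_per_day(days)
converged = has_converged(novelty)
```

A falling novelty series is a baseline approaching a stable envelope. An alert stream has no equivalent series to plot, which is the distinction the question is probing.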

Question 3 — Show me two different agents in the same cluster running under different auto-generated syscall envelopes, with the trace each was derived from

If the vendor shows one recommended profile per cluster, or per-Pod profiles requiring manual templating per agent class, they don’t have Capability 7. Lifecycle support means two agents, two envelopes, traceable to two distinct baselines.

Question 4 — Show me your tool’s drift classification for this agent over the past 30 days, broken into expected evolution and risky drift

A real answer separates the two by deployment correlation: changes correlated with a model update or prompt revision are evolution; uncorrelated changes are drift. Feature-level treats every change as an anomaly to triage.
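Deployment correlation can be sketched as a time-window check: a behavior change counts as expected evolution only if a deploy event shortly preceded it. The 24-hour window, the function name, and the sample events are assumptions for illustration. The article describes a working classifier as also using pattern continuity and resource bounds, which this sketch leaves out.

```python
from datetime import datetime, timedelta

def classify_changes(changes, deploys, window=timedelta(hours=24)):
    """Label each behavior change by whether a deploy event preceded it within `window`."""
    labeled = []
    for when, description in changes:
        correlated = any(timedelta(0) <= when - d <= window for d in deploys)
        labeled.append((description, "expected evolution" if correlated else "risky drift"))
    return labeled

# Hypothetical deploy log and observed behavior changes.
deploys = [datetime(2026, 4, 10, 9, 0)]
changes = [
    (datetime(2026, 4, 10, 11, 30), "new tool call: summarize_ticket"),
    (datetime(2026, 4, 21, 2, 15), "bulk write to export bucket"),
]
labels = classify_changes(changes, deploys)
```

The first change lands two and a half hours after a deploy and is labeled evolution. The second has no deploy anywhere near it and is labeled drift, even though it uses a sanctioned channel.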

Question 5 — Show me the audit row your tool produced when an enforcement artifact violation was correlated to a cross-layer attack story

Lifecycle-support: a single audit entry tying kernel events, network egress, application-layer prompt content, and tool invocation sequences to a triggering input. Feature-level: disconnected alerts the SOC has to correlate manually.
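The shape of such an audit entry can be sketched as a fold over per-layer events that share a correlation identifier. The sketch assumes every event already carries a `trace_id` linking it to the triggering input; producing that identifier across kernel, network, application, and tool layers is the hard part and is not shown. The event fields, layer names, and ticket identifiers are hypothetical.

```python
from collections import defaultdict

LAYERS = ("kernel", "network", "application", "tool")

def audit_rows(events):
    """Fold per-layer events sharing a trace_id into one audit row per triggering input."""
    rows = defaultdict(lambda: {layer: [] for layer in LAYERS})
    for event in sorted(events, key=lambda e: e["ts"]):
        rows[event["trace_id"]][event["layer"]].append(event["detail"])
    return dict(rows)

def is_complete(row):
    """A row reconstructs the full story only if every layer contributed evidence."""
    return all(row[layer] for layer in LAYERS)

# Hypothetical events, arriving out of order from separate telemetry sources.
events = [
    {"trace_id": "ticket-4821", "ts": 3, "layer": "network", "detail": "PUT to export bucket"},
    {"trace_id": "ticket-4821", "ts": 1, "layer": "application", "detail": "prompt contained embedded instructions"},
    {"trace_id": "ticket-4821", "ts": 2, "layer": "tool", "detail": "export_customers(limit=2400)"},
    {"trace_id": "ticket-4821", "ts": 4, "layer": "kernel", "detail": "sendto on socket fd 12"},
    {"trace_id": "ticket-4822", "ts": 5, "layer": "application", "detail": "routine password-reset question"},
]
rows = audit_rows(events)
```

Without the shared identifier, the same five events are five unrelated alerts in four different consoles, which is the feature-level answer.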

The Audit Becomes Continuous

The demand list is not a one-time audit. It’s the diagnostic you run quarterly on a stack that evolves under you, against agents whose behavior evolves under both of you. ARMO has previously argued the case for runtime context in why traditional cloud security fails for AI workloads. For regulated environments, the financial services application goes deeper on what regulators ask for. The complete buyer’s guide collects this analysis alongside the framework critiques and per-cloud evaluations it builds on.

What you’ve just walked through is what an AI-aware runtime CADR has to do across all four phases. ARMO’s Cloud Application Detection and Response platform is built around it — runtime AI inventory, Application Profile DNA, auto-generated enforcement, drift classification, and the cross-layer audit trail. To see it on a live cluster, ARMO’s AI workload security platform is the place to start, or book a demo to walk your stack through the nine capabilities.

Frequently Asked Questions

How long does the Phase 2 observation window typically take to converge?

Stable agents — read-only retrieval, well-bounded customer-support — converge in two to four weeks of production traffic. Code-execution agents and multi-agent orchestrators may not fully converge, which is why the tool needs an aging envelope, not a single-snapshot baseline.
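One possible reading of an "aging envelope" is a baseline in which each behavior expires unless it is observed again within some window, so the envelope tracks what the agent does now rather than everything it has ever done. The class below is a sketch under that assumption; the class name, the 30-day window, and the sample behaviors are hypothetical.

```python
from datetime import date, timedelta

class AgingEnvelope:
    """Behavior set where each entry expires unless it is observed again within `max_age`."""

    def __init__(self, max_age=timedelta(days=30)):
        self.max_age = max_age
        self.last_seen = {}

    def observe(self, behavior, day):
        # Keep the most recent sighting, even if events arrive out of order.
        self.last_seen[behavior] = max(day, self.last_seen.get(behavior, day))

    def active(self, today):
        """Return the behaviors still inside the envelope on `today`."""
        return {b for b, seen in self.last_seen.items() if today - seen <= self.max_age}

envelope = AgingEnvelope()
envelope.observe("connect:zendesk.example:443", date(2026, 3, 1))
envelope.observe("execve:/bin/sh", date(2026, 3, 2))
envelope.observe("connect:zendesk.example:443", date(2026, 4, 20))
current = envelope.active(date(2026, 4, 25))
```

The one-off shell execution ages out after 30 days, while the recurring Zendesk connection stays in the envelope. A single-snapshot baseline would either keep both forever or miss whichever fell outside the snapshot.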

Can I run Phase 3 enforcement without first completing Phase 2 observation?

Not for most agents. Auto-generated enforcement depends on baselines from observed behavior, so production observation comes first. The exception is regulated environments where the observation window itself carries regulatory cost; a pre-enforced track using staging-built baselines is the alternative, and Capability 5 makes it possible.

Where does the Phase 1 runtime AI-BOM come from if my agents use externally hosted models?

The runtime AI-BOM captures the call envelope, not model weights. For Bedrock or Vertex, the inventory includes model identity, API endpoint, credential scope, and the behavioral profile of calls. The model itself sits in the provider’s infrastructure and is out of scope; everything around it is in scope and capturable from your side.

What happens when an agent calls a SaaS vendor that hosts its own inference?

The reach of Capability 9 shortens. Once the call leaves your infrastructure, the audit trail can track only the call envelope, not the inference itself. Your tool needs faithful capture of that envelope plus correlation back into the agent’s action chain.

How does the demand list interact with my SOC’s existing detection stack?

The two intersect at Capabilities 8 and 9. Drift classification feeds the SOC as a categorically different signal than per-event anomaly alerts; the cross-layer audit trail is what an incident reconstruction pulls. The demand list adds evidence the SOC’s tools don’t natively produce.
