Why Your Detection Latency Budget Determines Blast Radius
Most teams buy detection on a single number. The datasheet says “millisecond detection,” the proof-of-concept...
May 29, 2026
An AI agent moving laterally through a Kubernetes cluster does not look like an intrusion. There is no foreign process, no exploit, no dropped binary — just the agent using the identity, network routes, and tools it was handed at deployment to reach targets it was technically allowed to touch. That is the entire problem. The controls you run were built to catch an outsider pivoting from host to host. This is the opposite case: the insider you provisioned, moving on the access you already granted it — and the controls built for the outsider never fire.
And it is the teams that did everything right who are most exposed. Network segmentation allows the connection because the policy permits it. Posture management stays green because nothing is misconfigured. Per-agent application logs look clean because the agent is calling its own tools. The detection stack is watching, and it is watching the wrong thing.
Most coverage of this stops at “watch for east-west traffic,” which is why most detection programs only ever see fragments of the movement. The tactic actually travels as three different hops, each one surfacing in a different place. Here is how they assemble into one sequence you can alert on.
Lateral movement gets discussed as if it were one behavior, but for an AI agent it arrives as three distinct kinds of hop. The distinction that matters is not the mechanism (on the wire, a tool call to an internal service is also a network connection) but the signal that betrays each one and the place that signal shows up. Defined that way, the three stop overlapping, and that separation is what makes each one detectable.
The network hop is a connection to a service the agent has never contacted, with no corresponding tool call or identity change to explain it: namespace hopping, a new internal destination, east-west traffic to a peer outside the agent’s established pattern. The signal lives in kernel-level network telemetry — a connect to an IP and port that does not appear in the agent’s history. This is the form of movement MITRE ATT&CK treats as reaching across remote services, and it is the hop segmentation tools are built to prevent. But preventing is not detecting — and the connection that matters is the one the policy already permits, which segmentation waves through without a sound.
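The first-seen check described above can be sketched in a few lines. This is a minimal illustration, not a description of any product's implementation: the `ConnectEvent` and `NetworkBaseline` names, the agent name, and the IP addresses are all hypothetical, and the per-agent identity tag is assumed to be attached to each event upstream.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ConnectEvent:
    """One outbound connect observed in network telemetry (hypothetical shape)."""
    agent_id: str   # per-agent identity tag, assumed to be attached upstream
    dest_ip: str
    dest_port: int


@dataclass
class NetworkBaseline:
    """Destinations each agent has contacted during normal operation."""
    seen: dict[str, set[tuple[str, int]]] = field(default_factory=dict)

    def learn(self, event: ConnectEvent) -> None:
        # Record the destination as part of this agent's history.
        self.seen.setdefault(event.agent_id, set()).add(
            (event.dest_ip, event.dest_port)
        )

    def is_new_destination(self, event: ConnectEvent) -> bool:
        # True when the (ip, port) pair is absent from this agent's history.
        history = self.seen.get(event.agent_id, set())
        return (event.dest_ip, event.dest_port) not in history


baseline = NetworkBaseline()
baseline.learn(ConnectEvent("support-agent", "10.0.1.15", 443))

known = ConnectEvent("support-agent", "10.0.1.15", 443)
novel = ConnectEvent("support-agent", "10.0.7.22", 5432)

print(baseline.is_new_destination(known))  # False
print(baseline.is_new_destination(novel))  # True
```

The history is keyed per agent rather than per cluster, so a destination that is routine for one agent still counts as new for another.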
The identity pivot is a change in which credential or scope the agent is using: a service account token reused against the Kubernetes API to enumerate adjacent namespaces, a role assumed for the first time, a Kubernetes-service-account-to-IAM exchange, or a reach to the instance metadata service (IMDS) for cloud credentials that carry broader permissions than the pod’s own. The signal lives on the identity-and-action layer — Kubernetes audit events and cloud IAM streams. A first-time role assumption that does not track a deployment is one of the highest-confidence signals available anywhere in this tactic.
The tool-and-delegation reach is an authorized tool, or a handoff to another agent, used to touch an internal target the agent has never touched: the customer-records tool fired immediately after a knowledge-base lookup it would never normally follow, or a delegation edge to a second agent carrying an instruction the first was never meant to pass. The signal lives on the tool-invocation and cross-agent surfaces, and it is the one form of the movement that per-agent application logging actually sees.
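One way to make "a tool fired after a call it would never normally follow" concrete is to baseline tool-to-tool transitions per agent. The sketch below assumes tool calls arrive as an ordered list of tool names; the class name, agent name, and tool names are illustrative only.

```python
from collections import defaultdict


class ToolTransitionBaseline:
    """Tracks which tool-to-tool transitions each agent has made before."""

    def __init__(self) -> None:
        self.transitions: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def learn(self, agent_id: str, tool_calls: list[str]) -> None:
        # Record every adjacent pair (previous tool, next tool) in the trace.
        for prev, nxt in zip(tool_calls, tool_calls[1:]):
            self.transitions[agent_id].add((prev, nxt))

    def unseen_transitions(
        self, agent_id: str, tool_calls: list[str]
    ) -> list[tuple[str, str]]:
        # Return the adjacent pairs this agent has never produced before.
        known = self.transitions.get(agent_id, set())
        return [
            (prev, nxt)
            for prev, nxt in zip(tool_calls, tool_calls[1:])
            if (prev, nxt) not in known
        ]


tool_baseline = ToolTransitionBaseline()
tool_baseline.learn("support-agent", ["kb_lookup", "draft_reply", "send_email"])

normal = tool_baseline.unseen_transitions("support-agent", ["kb_lookup", "draft_reply"])
odd = tool_baseline.unseen_transitions("support-agent", ["kb_lookup", "customer_records"])

print(normal)  # []
print(odd)     # [('kb_lookup', 'customer_records')]
```

Both tools in the flagged pair may be individually authorized; the transition is what is new.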
Notice where these land. The identity pivot and the tool reach show up on the agent’s decision surfaces; the raw network hop appears only in the runtime telemetry beneath them. Lateral movement does not respect a tidy taxonomy of surfaces: it threads the decision surfaces and the kernel telemetry together, which is exactly why it is the tactic most likely to slip past a program that instruments only one of them. Holding all three hops in view at once requires a baseline that already knows the agent’s normal reach across all three. ARMO’s Application Profile DNA builds that baseline at the deployment level, as a behavioral envelope capturing the network destinations, identity usage, and tool and API graph each agent actually exercises in normal operation, so a hop of any kind has something concrete to stand out against.
Each of the three hops is visible to some tool. The problem is that no single tool class sees more than one, and the tools that see one are blind to the others entirely. Run the movement past the stack most teams actually operate and the coverage falls apart by hop.
| Hop type | Segmentation (NetworkPolicy) | Posture & attack-path | Per-agent app logging | Cross-surface runtime correlation |
| --- | --- | --- | --- | --- |
| Network hop | Prevents — no detection event | Potential reach only | Blind — below the SDK | Detects |
| Identity pivot | Blind | Misconfiguration only | Often blind — below the SDK | Detects |
| Tool-and-delegation reach | Blind | Blind | Detects | Detects |
Network segmentation belongs in the prevention column, not the detection column. A correctly written NetworkPolicy can stop a network hop, but once the connection is permitted it produces no detection event — and it has nothing to say about an identity pivot or a tool reach that never opened a new connection. Posture and attack-path analysis map what an agent could reach if compromised; that is potential, computed before anything happens, and it goes silent on movement in progress. Per-agent application logging — the approach the field increasingly recommends — captures the tool-and-delegation reach well, because tool calls route through the agent’s framework. But the network hop and the identity pivot frequently never touch that framework: a stolen token used against the Kubernetes API and a raw cross-namespace connection both happen below the layer the SDK can see.
For a low-autonomy agent with one tool and no cluster permissions, a single surface may genuinely be enough — the movement has nowhere to go. The gap opens as autonomy and reach grow, and that is exactly the agent profile shipping to production now.
So the common program — single-surface coverage, usually the identity layer because IAM is already a discipline — ends up holding one hop type and missing two. It is not that the other hops are silent. They are loud, on surfaces nobody is correlating. The team has fragments of an attack scattered across three consoles and no reason to connect them, because each fragment, read alone, is an action the agent was allowed to take.
If the hops are loud but scattered, detection is not a matter of better single-surface rules. It is a matter of assembly: pulling the network event, the identity event, and the tool event together into one sequence attributable to one agent, and judging the sequence rather than its parts.
The join key is the agent’s identity. Every signal — a connect from kernel telemetry, a role assumption from the IAM stream, a tool invocation from the framework layer — has to carry a per-agent identity tag rather than dissolving into a shared service account, so the three can be recognized as the same actor doing three things in succession. Without that attribution, correlation has nothing to anchor on and falls back to joining alerts by timestamp, which is guesswork.
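The join described above amounts to grouping signals by agent identity before ordering them in time, rather than joining alerts on timestamp proximity. A minimal sketch, assuming every signal already carries an `agent_id` field (the `Signal` shape, surface labels, and sample data are hypothetical):

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    timestamp: float   # seconds since some epoch
    agent_id: str      # per-agent identity tag, not a shared service account
    surface: str       # "network", "identity", or "tool"
    action: str


def assemble_by_agent(signals: list[Signal]) -> dict[str, list[Signal]]:
    """Group signals by agent, then order each agent's timeline by time."""
    timelines: dict[str, list[Signal]] = defaultdict(list)
    for signal in signals:
        timelines[signal.agent_id].append(signal)
    for timeline in timelines.values():
        timeline.sort(key=lambda s: s.timestamp)
    return dict(timelines)


def surfaces_touched(timeline: list[Signal]) -> set[str]:
    """Which surfaces a single agent's timeline spans."""
    return {s.surface for s in timeline}


signals = [
    Signal(30.0, "agent-a", "tool", "invoke:customer_records"),
    Signal(10.0, "agent-a", "identity", "assume_role:reporting"),
    Signal(12.0, "agent-b", "tool", "invoke:kb_lookup"),
    Signal(20.0, "agent-a", "network", "connect:payments-ns"),
]

timelines = assemble_by_agent(signals)
print([s.action for s in timelines["agent-a"]])
print(surfaces_touched(timelines["agent-a"]))
```

With the grouping done first, the interleaved `agent-b` event no longer sits in the middle of `agent-a`'s chain, which is the failure mode of a timestamp-only join.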
What you score is the sequence, not the event. The agent reads a token it has never read, assumes a role it has never assumed, opens a connection to a namespace it has never contacted, and invokes a tool against a record set outside its normal scope — each step permissible, the order and span unprecedented. That is the whole signal. Scope, sequence, and rate are what carry it: a chain of permitted actions the agent has never strung together, executed faster or wider than its baseline. Because the judgment is made against a behavioral envelope rather than a fixed allowlist, it tolerates the non-determinism that makes AI agents resist static rules — the agent is allowed to behave variably; it is not allowed to break the shape of its own history.
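To illustrate scoring the chain rather than its parts, the sketch below computes one number each for scope, sequence, and rate from a single agent's ordered steps. The definitions are assumptions chosen for illustration (scope as the count of surfaces with novel actions, sequence as the longest run of consecutive novel steps, rate as a ratio to a baseline steps-per-minute figure); the source does not specify how any product computes these.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    timestamp: float  # seconds
    surface: str      # "network", "identity", or "tool"
    action: str


def score_sequence(
    steps: list[Step],
    baseline_actions: set[tuple[str, str]],
    baseline_rate_per_min: float,
) -> dict[str, float]:
    """Score one agent's chain by scope, sequence, and rate (illustrative)."""
    ordered = sorted(steps, key=lambda s: s.timestamp)
    if not ordered:
        return {"scope": 0, "sequence": 0, "rate_ratio": 0.0}

    is_novel = [(s.surface, s.action) not in baseline_actions for s in ordered]

    # Scope: how many distinct surfaces carry at least one novel action.
    scope = len({s.surface for s, novel in zip(ordered, is_novel) if novel})

    # Sequence: longest run of consecutive never-before-seen steps.
    longest = run = 0
    for novel in is_novel:
        run = run + 1 if novel else 0
        longest = max(longest, run)

    # Rate: observed steps per minute relative to baseline; the span is
    # clamped to one minute so very short bursts do not divide by ~zero.
    span_min = max((ordered[-1].timestamp - ordered[0].timestamp) / 60.0, 1.0)
    rate_ratio = (len(ordered) / span_min) / baseline_rate_per_min

    return {"scope": scope, "sequence": longest, "rate_ratio": rate_ratio}


chain = [
    Step(0.0, "tool", "kb_lookup"),
    Step(5.0, "identity", "read_token"),
    Step(9.0, "identity", "assume_role:reporting"),
    Step(14.0, "network", "connect:payments-ns"),
    Step(20.0, "tool", "customer_records"),
]
result = score_sequence(chain, {("tool", "kb_lookup")}, baseline_rate_per_min=2.0)
print(result)  # {'scope': 3, 'sequence': 4, 'rate_ratio': 2.5}
```

No single step here would trip a per-event rule; the score rises because four novel steps across three surfaces arrive back to back.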
One correlation cuts most of the false positives: deployment events. Legitimate change in an agent’s reach almost always tracks a rollout, a restart, or a configuration change. A capability that appears with no infrastructure event behind it is the suspicious case. We have previously drawn this same line between expected evolution and risky drift, and it applies directly here — a new role that correlates with a deploy is an upgrade; the same role assumed out of nowhere is a pivot.
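The deployment correlation can be sketched as a simple time-window check: a new capability is treated as expected only if a rollout, restart, or configuration change happened shortly before it. The 15-minute window and the function names below are assumptions for illustration, not values taken from the source.

```python
def tracks_deployment(
    change_ts: float,
    deploy_timestamps: list[float],
    window_s: float = 900.0,  # assumed 15-minute window; tune per environment
) -> bool:
    """True if some deployment event happened within window_s before the change."""
    return any(0.0 <= change_ts - d <= window_s for d in deploy_timestamps)


def classify_new_capability(
    change_ts: float,
    deploy_timestamps: list[float],
    window_s: float = 900.0,
) -> str:
    """Label a first-time capability as expected evolution or suspicious drift."""
    if tracks_deployment(change_ts, deploy_timestamps, window_s):
        return "expected_evolution"
    return "suspicious_drift"


deploys = [1000.0]  # one rollout observed at t=1000s

print(classify_new_capability(1300.0, deploys))  # expected_evolution
print(classify_new_capability(9000.0, deploys))  # suspicious_drift
```

The check is one-directional on purpose: a role assumed before the deploy that supposedly introduced it is not explained by that deploy.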
This assembly is the job ARMO’s CADR layer is built to do: it correlates signals across the cloud, Kubernetes, kernel, and application layers into a single attack story with a timeline and the entities involved, so the movement arrives as one narrative instead of three disconnected alerts a SOC analyst stitches together by hand. The baseline it scores against is a runtime-derived inventory of what the agent normally reaches — the internal APIs, identities, and destinations actually exercised in production rather than the ones declared in a manifest. The assembled story then sorts cleanly into a three-tier read: a benign event that resembles an attack, an attempt that failed, or an active movement with observable impact — the difference between a page at 2 a.m. and a note in the morning.
Detecting the movement is only useful if the response fits the hop that carried it. Killing the pod is the reflex, and it is often the wrong move — it destroys forensic state and may take down a workload the business depends on, while doing nothing about a stolen credential that already works elsewhere. Containment has to match the kind of hop the agent made. An identity pivot is contained by revoking or rescoping the credential, not the pod. A network hop is contained by tightening egress to known-good destinations. A tool reach is contained by revoking the tool’s scope. An agent spreading actively is contained by per-agent quarantine that isolates the one workload without collapsing the service around it.
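The hop-to-response pairing above can be written down as a lookup table so that a response planner never defaults to killing the pod. The enum values and action strings are hypothetical labels; in practice each would map to a concrete call in whatever IAM, network policy, or orchestration tooling is in use.

```python
from enum import Enum


class Hop(Enum):
    NETWORK = "network"
    IDENTITY = "identity"
    TOOL = "tool"
    ACTIVE_SPREAD = "active_spread"


# Containment matched to the kind of hop, following the pairing in the text.
CONTAINMENT: dict[Hop, str] = {
    Hop.IDENTITY: "revoke_or_rescope_credential",
    Hop.NETWORK: "tighten_egress_to_known_good",
    Hop.TOOL: "revoke_tool_scope",
    Hop.ACTIVE_SPREAD: "quarantine_agent_workload",
}


def plan_containment(hops: list[Hop]) -> list[str]:
    """Return de-duplicated actions in the order the hops were observed."""
    actions: list[str] = []
    for hop in hops:
        action = CONTAINMENT[hop]
        if action not in actions:
            actions.append(action)
    return actions


plan = plan_containment([Hop.IDENTITY, Hop.NETWORK, Hop.IDENTITY])
print(plan)  # ['revoke_or_rescope_credential', 'tighten_egress_to_known_good']
```

Pod termination is deliberately absent from the table, since it removes forensic state without invalidating a credential that already works elsewhere.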
This is where prevention and detection divide cleanly. A NetworkPolicy prevents the network hop and stays silent on the identity and tool hops; detection reaches the hops prevention cannot see. The durable pattern is to observe each agent’s real behavior first and promote those observations into enforcement scoped to that agent — so containment, when it fires, is precise rather than blunt.
Lateral movement is the tactic that most exposes a single-surface detection program, because it is the one tactic that refuses to stay on one surface. The network hop, the identity pivot, and the tool reach each land in a different place, each one authorized, each one quiet on its own — and a program watching any one of them holds fragments of a movement it will never assemble. The work is to attribute the hops to one agent, score the sequence against that agent’s own baseline, and let containment follow the hop the agent made. Teams running agents in production can see where their own coverage fragments by walking a live cluster against this model — ARMO’s platform for cloud-native security for AI workloads runs the full path, from runtime telemetry through per-agent baselines to cross-surface correlation, against a real deployment.
How do I detect lateral movement when every hop the agent makes is authorized?
You stop trying to catch a single bad action and start scoring the sequence. Attribute the network, identity, and tool events to one agent through a per-agent identity tag, then judge the chain against that agent’s behavioral baseline by scope, order, and rate. An individual permitted action is not the alert; a chain of permitted actions the agent has never strung together is.
Won’t NetworkPolicies stop lateral movement on their own?
A well-written NetworkPolicy prevents the network hop, which is genuinely useful, but prevention is not detection. Once a connection is permitted, the policy produces no event, and it sees nothing of an identity pivot or a tool reach that never opens a new connection. Segmentation belongs in your prevention layer; you still need detection for the hops it cannot observe.
Is per-agent application logging enough to catch it?
Application logging captures the tool-and-delegation reach well, because tool calls run through the agent’s framework. It tends to miss the network hop and the identity pivot, which happen below the framework — a raw cross-namespace connection and a stolen service account token used against the Kubernetes API never pass through the SDK. Logging is a necessary surface, not a sufficient one.
How is lateral movement different from agent escape?
They sit on different axes. Agent escape is vertical — breaking out of the container or sandbox boundary through the kernel. Lateral movement is horizontal — reaching across peer workloads, namespaces, and services, with or without an escape having happened first. We treat agent escape detection as its own discipline for that reason.
What’s the minimum telemetry to start detecting it?
Kernel-level network telemetry, identity telemetry from Kubernetes audit events and cloud IAM streams, a per-agent behavioral baseline, and a correlation layer that can join signals across them on agent identity. The kernel and identity telemetry give you the network and identity hops the application layer misses; the baseline tells you what is normal for each agent; the correlation layer assembles the sequence. Any one alone produces scattered signals; together they produce the movement.