AI Workload Baseline and Drift Detection: Defining “Normal” Agent Behavior
Apr 10, 2026
A platform security engineer gets an alert at 2:14 a.m. One of the LangChain agents running in their production Kubernetes cluster has produced an execution graph with eleven nodes, seven tool calls, and an egress edge to a domain that is not in the agent’s approved integration list. The chain is fully rendered in their console. Every signal is there. The engineer has roughly thirty seconds to decide whether to page the incident commander, open a ticket to investigate during business hours, or document the event and go back to sleep.
This is not a detection problem. The detection stack already fired. The signals are assembled and the graph is waiting. What is missing is the decision loop that sits between the assembled graph and the action the engineer has to take. We have previously mapped the five-layer observability stack that produces the graph in the first place, and walked four AI-specific attack chains to show what each detection layer catches across every stage of an incident. Neither piece teaches the decision that comes after.
That decision — page, investigate, or document — is the subject of this article. The framework that collapses the decision into thirty seconds is a three-tier classification: info-only, attack attempt, and active attack. Each tier carries a distinct graph signature, a prescriptive first action, and an escalation threshold that tells the engineer when to flip from one tier to the next. Three branches, three actions, one runbook.
Most security tooling for AI agents stops at assembly. The graph gets rendered, the attack story gets generated, the cross-layer correlation fires, and the signal lands in front of the engineer with an implicit expectation: figure out what to do with this. That expectation is where AI agent triage breaks down, because AI agents do not behave like the workloads a SOC is used to. A chain that looks alarming may be a legitimate admin debugging a tool. A chain that looks benign may be the fourth in a probing sequence an attacker has been running for six hours.
Visibility has become a commodity. Putting a sensor in the environment is easy; explaining why an alert came up — or why it did not come up — is where the work lives. But explanation is only half the value. The other half is the decision the engineer makes once the explanation lands. A SOC that understands every alert and pages for all of them is no better off than one that pages for none. The goal is to collapse the decision into a predictable action in under a minute, every time.
That collapse requires a framework. The framework requires the agent’s behavioral baseline to be learned, maintained, and queryable at the moment of decision — because the distinction between the three tiers is not in the raw signal, it is in how the signal compares to what the agent has done before.
The three-tier framework is not a severity scale. Severity describes how bad something is; the three tiers describe what the engineer should do about it. An info-only chain can be technically interesting and still not warrant a page. An attack-attempt chain can be a probe so early in the kill chain that its damage score reads low, and it still warrants immediate investigation.
The three tiers are:

- Info-only: an unusual chain from a legitimate identity that stays inside the agent's baseline envelope. First action: document.
- Attack attempt: an active probe against the edges of the envelope that has not yet succeeded. First action: investigate now.
- Active attack: a completed boundary crossing the baseline has never produced. First action: page and contain.
Assigning an incoming chain to the correct tier requires comparing the chain against the agent’s learned baseline. ARMO’s Application Profile DNA builds that baseline across the agent’s normal tool calls, API interactions, data access paths, and egress destinations, and the classification is what the comparison produces. Without a baseline, the attempt tier collapses: everything unusual looks either benign (which misses real probes) or malicious (which burns out the SOC). The middle tier only exists as a real category when the engineer can answer the question “has this agent ever done something like this before” without guessing.
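The comparison can be sketched as a small classifier. Everything below — the Chain and Baseline shapes, the field names, the set-difference logic — is a hypothetical illustration of the tier logic, not ARMO's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Baseline:
    """Learned envelope of what the agent has done before (illustrative)."""
    approved_egress: set = field(default_factory=set)
    historical_writes: set = field(default_factory=set)

@dataclass
class Chain:
    """One rendered execution chain (illustrative)."""
    completed_writes: set = field(default_factory=set)   # writes that succeeded
    successful_egress: set = field(default_factory=set)  # destinations that returned bytes
    failed_probes: int = 0                               # repeated failing calls
    injection_shaped_args: bool = False                  # SQL/shell/path shapes in inputs

def classify(chain: Chain, baseline: Baseline) -> str:
    # Tier 3: any COMPLETED action the baseline has never produced.
    if chain.completed_writes - baseline.historical_writes:
        return "active_attack"
    if chain.successful_egress - baseline.approved_egress:
        return "active_attack"
    # Tier 2: unresolved probing against the edges of the envelope.
    if chain.failed_probes > 0 or chain.injection_shaped_args:
        return "attack_attempt"
    # Tier 1: unusual, but nothing crossed a boundary the baseline disallows.
    return "info_only"
```

The point the sketch makes is structural: without the `Baseline` argument, the middle branch has nothing to compare against and collapses into one of the other two.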
The three sections that follow walk one tier at a time — the graph signature that assigns a chain to it, the first action the engineer takes, and the threshold that flips the classification upward.
The canonical info-only chain starts with a legitimate identity doing something the agent’s baseline has not seen before but has no reason to treat as hostile. An admin execs into a pod and runs a few diagnostic commands. A developer triggers a rarely-used tool in staging to validate a fix. A platform engineer runs a one-off query against a dataset the agent normally reads from, just not in this particular combination. The chain looks odd because the activity is odd. It is not an attack.
Info-only chains share three properties on a rendered graph. The root node belongs to an authorized, non-anomalous identity — typically a human or a recognized service account with a legitimate reason to drive the agent. The depth is shallow and the branching factor is small: investigative work produces short fan-outs, not long chains. And every edge carries parameters that sit inside the agent’s historical distribution, even if the specific combination is new. Nothing crosses a boundary the baseline does not already allow. No egress edge lands on a destination the agent has never talked to. No tool call carries arguments shaped like an injection payload.
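As a sketch, the three properties can be checked mechanically. The depth and branching thresholds here are hypothetical placeholders a team would tune per agent, not values from the article:

```python
def looks_info_only(root_identity: str,
                    depth: int,
                    branching_factor: int,
                    params_in_distribution: bool,
                    authorized_identities: set,
                    max_depth: int = 4,        # hypothetical tuning knob
                    max_branching: int = 3) -> bool:
    """True when a chain matches the info-only graph signature."""
    return (root_identity in authorized_identities   # authorized, non-anomalous root
            and depth <= max_depth                   # shallow chain
            and branching_factor <= max_branching    # small fan-out
            and params_in_distribution)              # edges inside the historical distribution
```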
Document the chain, tag it for business-hours review, and do not page. The documentation matters more than it seems. Info-only chains are training data the next tier’s decisions depend on — the more baseline coverage the team has of what “unusual but fine” looks like for a specific agent, the sharper the attempt classification gets later. A SOC that discards info-only chains as noise is cannibalizing its own future triage accuracy.
An info-only classification flips to Tier 2 when one of three things happens. The same pattern appears across multiple sessions within a short window, which suggests coordination rather than one-off exploration. A second agent starts exhibiting the same pattern, which suggests something upstream is changing the shared context. Or the deviation score from the baseline crosses a defined floor — teams typically set this in the low-single-digit standard deviations above the agent’s historical norm, adjusted for the agent’s own variance. The specific number matters less than the fact that it lives in the runbook and is applied consistently.
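The three flip conditions can be expressed as one predicate. The 3.0-standard-deviation floor and the session count below are hypothetical defaults; the article's point is only that whatever numbers a team picks live in the runbook:

```python
from statistics import mean, stdev

def deviation_score(observation: float, history: list) -> float:
    """z-score of a new observation against the agent's own history."""
    mu, sigma = mean(history), stdev(history)
    return 0.0 if sigma == 0 else (observation - mu) / sigma

def flips_to_attempt(sessions_with_pattern: int,
                     agents_with_pattern: int,
                     score: float,
                     floor: float = 3.0) -> bool:
    return (sessions_with_pattern >= 3   # same pattern across multiple sessions
            or agents_with_pattern >= 2  # a second agent shows the pattern
            or score >= floor)           # deviation crosses the runbook floor
```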
Tier 2 is the hardest triage call in the framework and the tier where most AI agent incidents actually live. An attack attempt is a chain that is doing something it should not be doing but has not yet succeeded. The canonical framing for this category: someone is at the door, fishing. The probe is real, the intent is questionable, the damage is zero — for now.
Attempt chains show a distinctive pattern of repetition. The agent makes the same tool call repeatedly with argument variations, each failing differently. Return codes do not match the agent’s historical success distribution. Input strings begin to carry shapes that look like injection attempts — SQL syntax in fields that have never seen SQL, command separators in parameters that have never seen shell metacharacters, path traversal fragments in identifiers that have never seen slashes. Most map to categories in the OWASP Top 10 for LLM Applications, a useful reference for naming the probe pattern in a runbook. The chain fans out toward resources just outside the agent’s baseline envelope: a database table adjacent to the authorized one, an API endpoint in the same namespace but not in the integration list, a service account that shares a prefix with an approved account but is not itself approved. This kind of edge-walking is the early surface form of intent drift — runtime behavior pulling away from what the baseline says the agent should be doing.
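The injection shapes named above can be sketched as pattern checks. These regexes are deliberately crude illustrations; a real detector would compare each field against its own historical character distribution rather than a static list:

```python
import re

# Crude shape detectors for fields that have never seen these characters.
PROBE_SHAPES = {
    "sql": re.compile(r"('|--|\bUNION\b|\bSELECT\b)", re.IGNORECASE),
    "shell": re.compile(r"[;&|`$]"),         # command separators / metacharacters
    "path_traversal": re.compile(r"\.\./"),  # traversal fragments in identifiers
}

def probe_shapes_in(argument: str) -> list:
    """Names of probe shapes present in a tool-call argument."""
    return [name for name, pat in PROBE_SHAPES.items() if pat.search(argument)]
```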
The qualifier that keeps the chain in Tier 2 rather than Tier 3 is unresolved. No probe has yet terminated in a successful cross-boundary action. No data has moved. No write has completed. The fishing is active but the line has not caught anything.
Investigate now, do not page. Open a ticket with elevated priority, notify the on-call through a non-paging channel, and begin active investigation immediately. This is the tier where the most common triage mistake happens in both directions — paging for everything burns out the SOC, paging for nothing loses the catch window. Runbook discipline prevents both errors.
Tier 2 flips to Tier 3 at three distinct trigger points. The first is the moment a probing chain resolves to a successful action outside the baseline envelope — a write completes, a read returns data from an unauthorized resource, an egress connection returns bytes. The second is when the same probing pattern appears across multiple agents simultaneously, which suggests the attacker is scripting against the agent fleet rather than poking at a single target. The third is when the arguments in the failing calls begin to look coordinated — sequential parameter values, structured enumeration, patterns consistent with a deliberate exploration strategy rather than opportunistic probing. Any one of the three trips the classification upward.
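A sketch of the three trigger points, with a naive sequential-enumeration check standing in for the coordination heuristic (names and thresholds are hypothetical):

```python
def flips_to_active(any_probe_succeeded: bool,
                    agents_probing: int,
                    numeric_args: list) -> bool:
    # Trigger 1: a probe resolved to a successful cross-boundary action.
    if any_probe_succeeded:
        return True
    # Trigger 2: the same probing pattern across multiple agents at once.
    if agents_probing >= 2:
        return True
    # Trigger 3: coordinated-looking arguments, e.g. sequential enumeration
    # (every step between consecutive argument values is exactly +1).
    steps = {b - a for a, b in zip(numeric_args, numeric_args[1:])}
    return len(numeric_args) >= 3 and steps == {1}
```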
Tier 3 is the call every runbook is built around. The chain has terminated in a successful boundary crossing, data has moved, and the window for preventing damage has already started closing. Every action from this point is containment-first. Investigation happens after the agent has been isolated, not while it continues to operate.
Active-attack chains contain at least one edge that represents a completed action the agent’s baseline has never produced. The common patterns: a write to a tool that has only ever been used for reads in the agent’s entire history. An egress edge carrying real payload bytes to a destination that does not resolve against the agent’s approved integration list. A tool call whose arguments carry extracted data in plain form. An agent identity used in a way the baseline has never seen that identity used before — a service account invoking an API it has no historical reason to touch. The common thread is completion: the action is not attempted, it is done.
Page the incident commander, contain the agent, trace the blast radius. The containment step is tactical, and this is where the distinction between runtime execution escapes and privilege boundary escapes becomes operationally useful. A runtime execution escape is the case where the agent reaches outside its process, container, or host — cross-container file operations, code planting with elevated privileges, execution outside the original sandbox. A privilege boundary escape is the case where an agent meant to read from a database begins writing to it, or invokes an API it has no authorization to invoke at the application layer. Both are documented under the MITRE ATLAS framework, and we have previously broken down agent escape detection in depth for teams hardening against either case. The two require different containment paths: runtime escapes demand workload isolation and neighbor review; privilege escapes demand permission revocation and a full audit of the access grant. The three-tier framework assigns the chain to Tier 3. The runtime-versus-privilege distinction tells the engineer which containment runbook to open.
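The runbook selection itself can be sketched as a lookup. The step names below are illustrative shorthand for the two containment paths described above, not a product feature:

```python
CONTAINMENT_RUNBOOKS = {
    # Agent reached outside its process, container, or host.
    "runtime_execution_escape": ["isolate_workload", "review_neighboring_workloads"],
    # Agent exceeded its application-layer authorization.
    "privilege_boundary_escape": ["revoke_permissions", "audit_access_grant"],
}

def containment_steps(escape_type: str) -> list:
    """Which containment path a Tier 3 chain opens."""
    return CONTAINMENT_RUNBOOKS[escape_type]
```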
There is no escalation threshold here: Tier 3 is the top of the decision tree. The concept of an escalation threshold belongs to Tiers 1 and 2, where the question is when to move upward. At Tier 3 the only forward motion is into the post-containment investigation and the root-cause review that follows. This is also the tier where cross-layer signal correlation earns its value — we have previously shown how disconnected kernel, container, and application-layer signals can be assembled into a single coherent incident narrative. The work this article extends is the decision the engineer makes once that narrative is assembled. The story is the input. The decision is the action.
A classification framework that lives in a blog post is worthless. The only version of this framework that actually changes how a SOC operates is the version baked into the team’s runbook, tied to specific detection signatures, and referenced during every post-incident review.
The operational conversion is three paragraphs. The first names Tier 1, describes the graph signature in the team’s vocabulary, prescribes the document-and-review action, and names the agreed escalation threshold. The second does the same for Tier 2 — probing signature, investigate-now action, the three flip conditions. The third covers Tier 3 — completed boundary crossing, page-and-contain action, and the two sub-cases for containment path selection. Three paragraphs, one page, readable before every rotation.
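The one-page runbook can also live as data next to the alerting pipeline, so the first action is looked up rather than debated at 2 a.m. A hypothetical sketch:

```python
RUNBOOK = {
    "info_only": {
        "signature": "authorized identity, shallow chain, parameters inside baseline",
        "first_action": "document and tag for business-hours review",
        "escalates_when": "pattern repeats, a second agent shows it, or deviation crosses the floor",
    },
    "attack_attempt": {
        "signature": "repeated failing probes against the edges of the envelope",
        "first_action": "investigate now via a non-paging channel",
        "escalates_when": "a probe succeeds, the pattern spans agents, or arguments look coordinated",
    },
    "active_attack": {
        "signature": "a completed action the baseline has never produced",
        "first_action": "page the incident commander and contain the agent",
        "escalates_when": None,  # top of the decision tree
    },
}

def first_action(tier: str) -> str:
    return RUNBOOK[tier]["first_action"]
```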
Classification is half of the operational loop. The other half is enforcement — the observe-to-enforce path that turns a Tier 2 or Tier 3 verdict into a constraint on the agent’s future behavior. The three-tier framework is the hinge between them.
The framework’s real value shows up in post-incident reviews. Every Tier 3 incident gets walked backward through the classification history: was there an earlier Tier 2 chain that should have been caught? Was there an info-only chain that matched the attempt signature in hindsight? The framework turns the post-mortem into a training signal for the classifier, not just an after-action report. That is what portable explainability looks like — the decision and its rationale travel together through every review, every auditor conversation, and every shift handoff.
Having the execution graph is not enough. Understanding what the signals mean is not enough. The decision the engineer makes in the next thirty seconds is what determines whether the incident is contained, investigated, or documented — and the decision has to be predictable, runbook-backed, and defensible. The three-tier framework collapses thousands of possible chains into three possible actions, anchored in the behavioral baseline the agent has been learning since it was deployed. Security teams running AI agents in production can see how each tier is assigned to live chains in their own tooling with ARMO’s cloud-native security for AI workloads.
What is an AI agent execution graph? An AI agent execution graph maps the complete chain of a single agent interaction, from the initial prompt through tool invocations, API calls, data access, and system operations. The graph is the assembled artifact that triage decisions are made against.
How is a rendered chain triaged? Classify it into one of three tiers — info-only, attack attempt, or active attack — based on how the chain compares to the agent’s learned behavioral baseline. Each tier carries a distinct first action: document, investigate, or page. The classification happens in under thirty seconds; the action depends entirely on which tier was assigned.
What distinguishes a misbehaving agent from an attacked one? Misbehaving agents produce chains that look unusual but stay inside the agent’s historical baseline envelope. Attacked agents produce chains that either probe against the edges of that envelope or cross them in a completed action. The distinction requires a baseline to compare against — without one, both categories look identical.
When does an alert warrant paging? When a completed boundary crossing has occurred — a write to a tool that has only ever been read, an egress connection to an unapproved destination, a tool call that succeeded in extracting data outside the baseline. Probing, failed attempts, and unusual-but-contained chains do not meet the paging bar. They meet the investigate-now bar.
Why not triage with developer observability tools? LangSmith and similar tools were built to answer debugging questions — why did the agent hallucinate, which prompt produced which output, how many tokens did the chain consume. They do not carry the authorization, egress approval, or baseline comparison signals a security engineer needs to triage an incident. The telemetry overlaps. The reading discipline does not.
What telemetry does the triage decision require? The graph needs application-layer visibility into prompts and tool calls, runtime visibility into the syscalls and network activity the agent produces, and identity context linking every action back to a service account and its permissions. The triage decision assumes those signals have already been correlated into a single chain.