Runtime-Derived Least Privilege for AI Agents: From Observed Behavior to Enforcement

May 12, 2026

Yossi Ben Naim
VP of Product Management

Key takeaways

  • What is runtime-derived least privilege for AI agents? Least privilege built from observed agent behavior rather than declared intent. Instead of asking developers to enumerate the API calls, network destinations, and syscalls an agent should be allowed to make, you run the agent in observation mode, collect evidence over a defined window, and let the evidence define the enforcement boundary. The derivation step — converting evidence into an enforceable artifact — is where most teams stall.
  • Why is the derivation step harder than the observation step? Because observation produces a distribution of behaviors, but a policy must commit to a deterministic set of allowed behaviors. The conversion requires four decisions — granularity, closure, confidence, validity — and there is no universal default. A literal encoding of every observation is too brittle to survive prompt variation; a generous envelope around them re-creates the gap the methodology was meant to close.

A platform team finishes a two-week observation window on a new internal research agent. The baseline is stable; the sensor produced a clean profile. By Friday, no policy has shipped — and the blocker isn’t tooling. “Turn observed behavior into a policy” is shorthand for a pipeline with four decisions: granularity (which substrate to encode against), closure (literal observations or a structural envelope), confidence (what to do with behaviors observed once or twice), and validity (when the policy expires). 

Public observe-to-enforce guidance treats this step as automatic. Default any of the four and the policy ships either too tight to survive the first new prompt or too loose to count as least privilege.

A security practitioner will ask whether a runtime-derived policy is “based on data that is incomplete.” The objection is sound, and the answer lives inside the four decisions. Defaulting any of them is still a choice — just an invisible one. Each decision below names its failure mode and shows the artifact it produces when made deliberately.

Granularity: The Enforcement Substrate Decides What Can Be Encoded

Each decision trades off against the next — substrate against encoding, closure against tolerance, confidence against coverage, validity against cadence — and the first of them is forced by what the enforcement substrate can express. Three substrates dominate Kubernetes AI agent deployments, each capturing a different layer of agent behavior.

  • Network and identity (NetworkPolicy, IRSA or Workload Identity scoping, IAM permissions boundaries). Coarse but cloud-native, enforced at the pod-to-resource boundary, blind to anything that happens inside an authorized connection.
  • Syscalls and process behavior (seccomp profiles, Linux capabilities, LSM hooks). Finer-grained, encoded against a stable system-call surface. Agents that generate code at inference time produce variable syscall sequences, so a tight syscall profile from week one breaks on week three.
  • Application-layer behavior (HTTP method and path on outbound calls, MCP tool invocations, model API parameters). The substrate where most agent behavior actually lives. Network and syscall enforcement can both authorize a request that an application-layer policy would deny — the destination is allowlisted, the syscalls look normal, but the request body asks for data the agent has no legitimate reason to retrieve.

The default failure mode is encoding against whichever substrate the team’s existing tooling supports. CNAPP teams default to network-layer enforcement. Generic eBPF teams default to syscall enforcement. The substrate inherited from existing tooling is rarely the substrate the agent’s threat model lives in. We have previously described how generic kernel-level enforcement hits a semantic gap where syscalls no longer encode the agent’s actual decision; granularity selection is where that gap surfaces inside the derivation pipeline.

The deliberate counter-default is to encode against the substrate where observed behavior carries information about intent. For most AI agents, that is the application layer: tool calls, API parameters, request shapes. A policy generated across multiple substrates simultaneously — application-layer for tool calls, network-layer for egress, syscall-layer for code-execution constraint — is the most defensible artifact, and the only one that survives prompt-driven variation in any single dimension. ARMO’s sensor produces multi-layer enforcement from a single evidence stream rather than three policies that drift independently.
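As a rough illustration of producing per-substrate views from one evidence stream, the sketch below groups hypothetical observation events into application, network, and syscall layers. The event fields (`layer`, `key`), the sample values, and the `derive_layers` helper are assumptions made for this example, not ARMO’s schema, and the grouping shown is a plain literal encoding with no closure rule applied yet.

```python
from collections import defaultdict

# Hypothetical observation events; field names are assumptions for illustration.
EVENTS = [
    {"layer": "application", "key": "tool:search_docs"},
    {"layer": "network", "key": "egress:api.example.internal:443"},
    {"layer": "syscall", "key": "openat"},
    {"layer": "application", "key": "tool:search_docs"},
    {"layer": "syscall", "key": "read"},
]


def derive_layers(events):
    """Group one evidence stream into per-substrate sets of observed behavior."""
    layers = defaultdict(set)
    for event in events:
        layers[event["layer"]].add(event["key"])
    # Sort each layer so the artifact is stable and easy to diff at review time.
    return {layer: sorted(keys) for layer, keys in layers.items()}


policy = derive_layers(EVENTS)
```

Because all three layers come from the same events, a change in the evidence shows up in every layer at once instead of in three separately maintained policies.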

Closure: A Policy of Observed Behaviors Is Not the Same as a Policy of Allowed Behaviors

The second decision is how to convert a set of observed behaviors into the set of allowed behaviors the policy will encode. The observed set is a sample. The allowed set is a commitment. Closing the gap between them requires deciding how much structural latitude to grant.

Literal closure

The allowed set equals the observed set. If the agent called get_object through the boto3 S3 client on bucket research-corpus-v2, the policy permits that exact call on that exact bucket and nothing else. This is the closure model most observation-derived policy generators default to. It produces an extremely tight artifact and fails the first time the bucket is renamed, the model is updated, or a prompt produces a slightly different code path. Within days, the platform team is in the security team’s Slack channel asking why the agent stopped working.

Structural closure

The allowed set is the structural envelope around the observed set: same API surface, same identity scope, same destination patterns, with the specific values free to vary. Observing get_object on research-corpus-v2 produces a policy that permits any get_object on buckets under the research corpus prefix, with writes still denied. This is the closure model behind ARMO’s Application Profile DNA — behavioral profiles attached at the Kubernetes Deployment level, encoding the structural envelope of observed behavior rather than the literal sequence of events. The envelope tolerates variation in dimensions that do not matter and constrains the dimensions that do.
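The difference between literal and structural closure can be sketched with the get_object example. This is a minimal illustration, not how Application Profile DNA is implemented: the `(action, bucket)` tuple representation, the helper names, and the `research-corpus-*` pattern are all assumptions, and in practice the pattern would be an explicitly reviewed envelope dimension.

```python
from fnmatch import fnmatchcase

# One observed call as (action, bucket), taken from the example in the text.
OBSERVED = [("get_object", "research-corpus-v2")]


def literal_allows(observed, action, bucket):
    """Literal closure: only exact observed (action, bucket) pairs pass."""
    return (action, bucket) in observed


def structural_rules(observed, bucket_pattern):
    """Structural closure: keep the observed actions, generalize the bucket.

    bucket_pattern is a reviewer-supplied envelope dimension (an assumption).
    """
    return {(action, bucket_pattern) for action, _bucket in observed}


def structural_allows(rules, action, bucket):
    """Allow when the action was observed and the bucket fits the pattern."""
    return any(action == a and fnmatchcase(bucket, p) for a, p in rules)


RULES = structural_rules(OBSERVED, "research-corpus-*")
```

With these rules, get_object on a hypothetical research-corpus-v3 bucket passes the structural check but fails the literal one, while put_object stays denied under both.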

Probabilistic closure

The allowed set is the structural envelope encoded as a distribution rather than a hard boundary. Each observed dimension (tool call frequencies, parameter shape variation, destination request rates) carries a probability mass, and the policy permits behaviors within k standard deviations of the observed distribution. A tool call sequence that lands inside the structural envelope but outside the typical distribution is treated as a soft violation that triggers a review rather than a silent allow. This is the most defensible artifact against novel-but-similar inputs, but it commits the team to a confidence threshold, which is the next decision.
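A minimal sketch of the soft-violation check for a single dimension, assuming the behavior has already passed the structural envelope. The hourly call rates, the `classify` helper, and the default multiplier (called `k` in the code, standing in for the standard-deviation bound in the text) are illustrative assumptions. A real policy would track many dimensions and might not assume a roughly normal distribution.

```python
from statistics import mean, pstdev


def classify(observed_rates, new_rate, k=3.0):
    """Return 'allow' within k standard deviations, else 'review'."""
    mu = mean(observed_rates)
    sigma = pstdev(observed_rates)
    if sigma == 0:
        # No variation was observed, so any different value is unusual.
        return "allow" if new_rate == mu else "review"
    z = abs(new_rate - mu) / sigma
    return "allow" if z <= k else "review"


# Hypothetical calls per hour for one tool during the observation window.
RATES = [10, 12, 11, 9, 13, 10, 12]
```

Here `classify(RATES, 12)` stays inside the distribution, while `classify(RATES, 40)` is routed to review even though the tool call itself is inside the envelope.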

The default failure mode is silently defaulting to literal closure because the tooling makes it the path of least resistance. The deliberate counter-default is structural closure with explicit envelope dimensions named — which API surfaces, which network CIDR blocks, which IAM action prefixes — so the platform team can reason about what changed when a deviation alert fires. The operational question to ask any runtime-derived policy generator is: “show me a policy you generated, and the closure rule that produced it.” If the closure logic is not surfaced, the tool is encoding literally and the brittleness is already accumulating.

Confidence: Rare Observations Are the Hardest Encoding Decision in the Pipeline

The third decision is what to do with behaviors observed once, twice, or three times during the observation window. The frequency tail is structurally ambiguous: a single rare observation could be a legitimate edge case, a prompt-induced one-time path, or the first signal of an attack that started during the observation window itself.

Confidence is the hardest decision because the cost of getting it wrong is bidirectional. Set the threshold too low, requiring too few occurrences before a behavior is encoded, and the policy encodes noise as permission, pre-authorizing the deviations the policy was supposed to catch. Set it too high and the policy refuses legitimate behaviors the agent will encounter on its next prompt. This is also the decision that answers the practitioner’s objection from the introduction: a confidence threshold is how a derived policy treats its own incompleteness, surfacing the gap in the observation data rather than encoding around it.

The deliberate counter-default is a confidence threshold: behaviors must appear more than N times across more than M distinct sessions before being encoded. Behaviors below the threshold are flagged in a derived-policy review report — not silently dropped, but not silently included either. The numbers depend on window length and agent call volume; for a stable agent observed over fourteen days, N=5 and M=3 is a reasonable starting point, with the agent’s historical session distribution defining what counts as “distinct.”
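The threshold rule can be sketched as a small filter over observation events. The `(session_id, behavior)` event shape, the behavior labels, and the `apply_threshold` helper are assumptions for illustration. The comparison follows the wording above literally (more than N occurrences across more than M distinct sessions), so a team that means "at least" would adjust the operators.

```python
from collections import defaultdict


def apply_threshold(events, n=5, m=3):
    """Split behaviors into an encoded set and a review set.

    events: iterable of (session_id, behavior) pairs.
    A behavior is encoded only if it appears more than n times
    across more than m distinct sessions.
    """
    counts = defaultdict(int)
    sessions = defaultdict(set)
    for session_id, behavior in events:
        counts[behavior] += 1
        sessions[behavior].add(session_id)

    encoded, review = set(), set()
    for behavior, count in counts.items():
        if count > n and len(sessions[behavior]) > m:
            encoded.add(behavior)
        else:
            # Surfaced in the review report: neither dropped nor included.
            review.add(behavior)
    return encoded, review


# Hypothetical window: one common behavior, one frequent but
# single-session behavior, and one behavior seen exactly once.
EVENTS = (
    [(f"s{i % 4}", "s3:get_object") for i in range(8)]
    + [("s0", "http:POST /export")] * 6
    + [("s2", "tool:shell_exec")]
)
ENCODED, REVIEW = apply_threshold(EVENTS)
```

The single-session behavior is the interesting case: it clears the count threshold but not the session threshold, so it lands in the review report rather than in the policy.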

The mechanism reuses the frequency and pattern-continuity tests we have described for distinguishing behavioral drift from anomaly. The same tests apply during derivation — the question is different (is this behavior part of the baseline?) but the underlying signal is the same.

The hardest case is the behavior observed once that turns out to be the first event of an attack already underway. No threshold makes that visible at derivation time. The mitigation is coordination between the threshold and the detection pipeline: behaviors that did not clear the threshold stay visible to detection after enforcement deploys, so a re-occurrence triggers an alert rather than a silent allow.
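One way to picture that coordination is a post-enforcement decision function that keeps the sub-threshold set alongside the encoded set. The function name, the return shape, and the behavior labels below are hypothetical; the point is only that a re-occurrence carries its derivation-time history into the alert.

```python
def post_enforcement_decision(behavior, encoded, below_threshold):
    """Return (action, reason) for a behavior seen after enforcement deploys."""
    if behavior in encoded:
        return ("allow", "in derived policy")
    if behavior in below_threshold:
        # Observed during derivation but never cleared the threshold:
        # alert with that history attached instead of allowing silently.
        return ("alert", "re-occurrence of sub-threshold behavior")
    return ("alert", "never observed")


# Hypothetical outputs of the derivation step.
ENCODED = {"s3:get_object"}
BELOW_THRESHOLD = {"tool:shell_exec"}
```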

Validity: Derived Policies Have a Shelf Life

The fourth decision is when the derived policy expires and re-derivation is required. Observation-derived policies are valid at the moment of derivation. They become progressively invalid as the agent evolves — and AI agents evolve along axes that traditional workloads do not.

Four upstream changes invalidate a runtime-derived policy:

  • Model version change. A new model version can produce different tool selection, different parameter shapes, and different syscall sequences for the same prompt. The behavioral envelope shifts even though nothing in the deployment manifest moved.
  • Prompt template change. Updates to system prompts or RAG context templates change which tools the agent invokes and with what parameters. The change is often invisible to the platform team because it happens in a separate repo.
  • Tool catalog change. Adding a new MCP tool, expanding an existing tool’s parameter surface, or rotating a downstream API expands the dimensional space the envelope was derived against.
  • Identity scope change. New IAM permissions, expanded RBAC bindings, or new ServiceAccount mappings change what the agent can do — and therefore what the policy should now constrain.

The default failure mode is treating the derived policy as evergreen. Most observe-to-enforce content treats derivation as a one-time exercise and then frames “drift” as a detection problem rather than a re-derivation trigger. The result is a policy that detects deviations from a baseline an upstream change quietly invalidated weeks earlier.

The deliberate counter-default is event-driven re-derivation, with triggers wired into the same change-management surfaces the upstream events live on. Model version updates trigger a re-baseline. Prompt template commits trigger a re-baseline. Tool catalog changes trigger a re-baseline. Identity scope changes trigger a re-baseline. A runtime-derived AI Bill of Materials is the operational substrate for those triggers: the AI-BOM catches the upstream changes that should invalidate the derived policy, and the link between AI-BOM updates and re-derivation is what keeps the policy from decaying invisibly.
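A simple way to wire such triggers is to store a fingerprint of the upstream inputs with the derived policy and compare it against the current state. The sketch below is an assumption about how that could look, not a description of the AI-BOM format: the four inputs mirror the list of invalidating changes, and the function names, model label, and sample values are made up.

```python
import hashlib
import json


def upstream_fingerprint(model_version, prompt_template, tool_catalog, identity_scope):
    """Hash the upstream inputs the derived policy depends on."""
    payload = json.dumps(
        {
            "model": model_version,
            "prompt_sha": hashlib.sha256(prompt_template.encode()).hexdigest(),
            "tools": sorted(tool_catalog),        # order-insensitive
            "identity": sorted(identity_scope),   # order-insensitive
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def needs_rederivation(policy_fingerprint, current_fingerprint):
    """Any upstream change invalidates the policy and triggers a re-baseline."""
    return policy_fingerprint != current_fingerprint


# Hypothetical state at derivation time and after a tool is added.
AT_DERIVATION = upstream_fingerprint(
    "model-2026-04", "You are a research agent.", ["search_docs"], ["s3:GetObject"]
)
AFTER_TOOL_ADD = upstream_fingerprint(
    "model-2026-04", "You are a research agent.", ["search_docs", "run_sql"], ["s3:GetObject"]
)
```

Comparing `AT_DERIVATION` with `AFTER_TOOL_ADD` flags the policy for re-derivation even though the Deployment manifest is unchanged.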

Defaulting these decisions is still a choice — just an invisible one. Naming them is what converts “we have observation data” into a defensible enforcement artifact an audit can trust and a security team can ship. That’s what ARMO’s cloud-native security platform for AI workloads is built around: closure rule, confidence threshold, validity triggers, and substrate selection surfaced in every generated policy, visible at review time rather than buried in defaults.

Frequently Asked Questions

How long should the observation window be before deriving a policy?

Length depends on the agent’s prompt variance, not the calendar. A stable agent with a fixed prompt template and a small tool surface can stabilize within seven days. A code-generation agent with prompt-driven syscalls may not stabilize within thirty. The operational measure is coverage-stop: when the rate of newly observed API surfaces, syscalls, and network destinations drops below a threshold (e.g., fewer than one new dimension per twenty-four hours), the window is long enough to derive against.
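The coverage-stop measure can be sketched as a scan over daily sets of observed dimensions. The daily-set input, the dimension labels, and the `coverage_stop_day` helper are assumptions for illustration. The `quiet_days` parameter is an addition not found in the text, included so that a single quiet day does not end the window prematurely.

```python
def coverage_stop_day(daily_observations, max_new_per_day=1, quiet_days=3):
    """Return the 1-based day on which coverage stopped growing, or None.

    daily_observations: list of sets, one per 24-hour period.
    A day is quiet when fewer than max_new_per_day new dimensions appear.
    Coverage stops after quiet_days consecutive quiet days.
    """
    seen = set()
    streak = 0
    for day, observed in enumerate(daily_observations, start=1):
        new = observed - seen
        seen |= observed
        streak = streak + 1 if len(new) < max_new_per_day else 0
        if streak >= quiet_days:
            return day
    return None


# Hypothetical five-day window of observed dimensions.
WINDOW = [
    {"s3:get_object", "egress:api:443"},
    {"s3:get_object", "tool:search_docs"},
    {"s3:get_object"},
    {"egress:api:443"},
    {"tool:search_docs", "s3:get_object"},
]
```

For this window, days three through five add nothing new, so the function reports day five; truncating the window to four days returns `None`.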

Should I generate one policy per agent, or one policy per Deployment?

Per Deployment. Attaching behavioral profiles at the Kubernetes Deployment level produces policies that survive replica churn and autoscaling. Per-pod baselines reset on every restart; cluster-level baselines lose per-agent variance. For the per-Deployment workflow in detail, see the discussion of per-agent guardrails, which covers the operational application.

What happens if the derived policy is wrong — can the agent be unblocked?

A deliberate runtime-derived enforcement workflow always supports a controlled bypass: alert-only mode for policies derived under low confidence, time-boxed exception windows for incident response, and a re-baseline trigger when behavioral evidence supports a new legitimate behavior. The bypass is part of the artifact design, not a separate workflow, and the audit log captures every override with the evidence that justified it.
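A time-boxed exception window, for example, can be modeled as a record that carries its own expiry and justification. The `ExceptionWindow` type, its fields, and the sample values are hypothetical and only sketch the shape such an artifact might take.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ExceptionWindow:
    behavior: str
    justification: str  # kept so the audit log can explain the override
    expires_at: datetime


def is_bypassed(behavior, exceptions, now):
    """True only while an unexpired exception covers this behavior."""
    return any(e.behavior == behavior and now < e.expires_at for e in exceptions)


# Hypothetical four-hour exception opened during incident response.
NOW = datetime(2026, 5, 12, 9, 0, tzinfo=timezone.utc)
EXCEPTIONS = [
    ExceptionWindow("tool:run_sql", "incident response", NOW + timedelta(hours=4))
]
```

Because the expiry is part of the record, the bypass closes on its own instead of depending on someone remembering to revoke it.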

Can I derive least-privilege IAM policies from observed AWS API calls?

Yes. Observed AWS SDK calls produce IAM action sets that can be encoded as permissions boundaries on the IRSA role or as session policies. The closure decision is especially load-bearing here: literal closure on resource ARNs produces policies that break on any new resource creation; structural closure on action prefixes and resource patterns is the operational sweet spot. Pair the IAM-layer policy with a network-layer policy so the IAM boundary cannot be reached through an unexpected egress path.
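As a minimal sketch of structural closure at the IAM layer, the code below turns hypothetical observed (action, ARN) pairs into a standard IAM policy document. The observed pairs and the `generalize_arn` rule (keep the bucket, wildcard the object key) are assumptions. A real derivation would need per-service rules and a reviewed choice of which ARN segments may vary.

```python
import json

# Hypothetical observed (IAM action, resource ARN) pairs.
OBSERVED = [
    ("s3:GetObject", "arn:aws:s3:::research-corpus-v2/papers/a.pdf"),
    ("s3:GetObject", "arn:aws:s3:::research-corpus-v2/papers/b.pdf"),
    ("s3:ListBucket", "arn:aws:s3:::research-corpus-v2"),
]


def generalize_arn(arn):
    """Structural closure on S3 ARNs: keep the bucket, wildcard the key."""
    bucket, sep, _key = arn.partition("/")
    return f"{bucket}/*" if sep else bucket


def derive_iam_policy(observed):
    """Group observed actions by generalized resource into Allow statements."""
    by_resource = {}
    for action, arn in observed:
        by_resource.setdefault(generalize_arn(arn), set()).add(action)
    statements = [
        {"Effect": "Allow", "Action": sorted(actions), "Resource": resource}
        for resource, actions in sorted(by_resource.items())
    ]
    return {"Version": "2012-10-17", "Statement": statements}


POLICY = derive_iam_policy(OBSERVED)
POLICY_JSON = json.dumps(POLICY, indent=2)
```

The two observed object reads collapse into one statement on the bucket’s object path, so a new object in the same bucket does not break the agent, while writes remain absent from the policy.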

Does runtime-derived least privilege replace static policy review?

No. Static review is necessary for declared intent; runtime derivation is necessary for actual behavior. They are complementary, and the reconciliation between declared and observed catches both over-permission and under-coverage. A complete cloud-native security stack for AI workloads maintains both surfaces side by side and treats their reconciliation as a first-class output.
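The reconciliation itself reduces to set comparisons between what was declared and what was observed. The permission labels and the `reconcile` helper below are illustrative assumptions.

```python
def reconcile(declared, observed):
    """Compare declared permissions with observed behavior."""
    return {
        "over_permission": sorted(declared - observed),  # granted, never used
        "under_coverage": sorted(observed - declared),   # used, never declared
        "confirmed": sorted(declared & observed),
    }


# Hypothetical declared and observed permission sets.
DECLARED = {"s3:GetObject", "s3:PutObject", "tool:search_docs"}
OBSERVED = {"s3:GetObject", "tool:search_docs", "tool:run_sql"}
REPORT = reconcile(DECLARED, OBSERVED)
```

In this example the unused s3:PutObject grant shows up as over-permission and the undeclared tool:run_sql call as under-coverage, which are the two findings the reconciliation is meant to surface.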
