Runtime Observability for AI Agents: What to Instrument and Why

May 22, 2026

Ben Hirschberg
CTO & Co-founder

Key takeaways

  • What is the Signal Source Trust Hierarchy and why does it matter? A tamper-resistance taxonomy of the instrumentation sources available across the AI agent observability stack. At the top sit signals the workload cannot reach — kernel-level eBPF observation and cloud audit logs that fire outside the workload's reachability surface. At the bottom sit signals the workload emits about itself — application logs, framework callbacks, in-process telemetry. The hierarchy matters because incident response uses signals as evidence, and not all signals are admissible.
  • How do you sequence an AI agent observability buildout when you can't instrument everything at once? Tamper-resistant first. Phase 1 deploys kernel-level eBPF as a single DaemonSet, covering Discovery, syscall-level side effects, and identity actually exercised. Phase 2 adds Layer 3 depth through framework callbacks or MCP protocol capture. Phase 3 integrates managed-service audit logs for cloud-data-plane coverage. The threat model determines what gets prioritized.

Every guide to AI agent observability tells you what to capture — prompts, tool calls, token usage, traces, syscalls. Almost none address which of those signal sources you can still trust when the agent itself is part of the threat model. That distinction is the entire difference between observability that helps your SRE team debug a slow reasoning chain and observability that helps your security team investigate a breach.

The dev-observability default — token usage dashboards, latency trace trees, cost-per-trace breakdowns, evaluator scores — was built for debugging. When a product manager asks why a chain produced a wrong answer, that stack is exactly the right tool. When an SOC analyst asks what an agent actually did between receiving a prompt and exfiltrating data, the same stack is the wrong tool. Not because it captures the wrong events, but because it captures them in a way the threat can control.

What follows is a signal-source map of the AI agent observability stack — organized by where each signal physically lives, who controls it, and what tampering tolerance it has. The five-layer model that defines what to capture at each layer — Discovery, AI-BOM, Behavioral Visibility, Execution Graph, Identity Mapping — is the subject of our end-to-end walkthrough of runtime observability for AI agents. This piece works one layer below it: which instrumentation sources actually produce each layer’s signals, and which of those sources survive in an adversarial threat model. It extends the architectural argument behind why traditional security tools fail for AI workloads — telemetry exists; the question is whose telemetry your incident response process can actually use.

The Trust Hierarchy Determines What Survives a Compromise

Every instrumentation source in the AI agent observability stack has a trust property: how much the workload being observed can do to the signal before it reaches the security team. That trust property is the difference between data that holds up as evidence in an investigation and data an attacker can plausibly deny.

The dev-observability ecosystem converges almost entirely on the cooperative tier. OpenTelemetry’s GenAI semantic conventions, LangSmith, Arize Phoenix, Langfuse, groundcover’s eBPF-to-trace bridge for LLM traffic — every dominant vendor in the AI agent observability category collects telemetry by asking the workload to emit it. That works for the use cases they were built for. It does not work as the sole signal source when the workload is the threat.

Four tiers describe the trust property of every instrumentation source in the stack.

Tier 1 — Tamper-resistant. Kernel-level eBPF and cloud provider audit logs. Both fire outside the workload’s reachability surface. eBPF programs run inside the Linux kernel, observing syscalls, network events, and file operations from below the application layer — a compromised agent that strips its own OTel instrumentation cannot strip kernel observation. Cloud audit logs are emitted by the provider’s control plane in response to API calls; the workload sees the call but does not control the log. ARMO combines an eBPF substrate with application-layer correlation: Tier 1 ground truth carrying the evidence, Tier 3 semantic context carrying what it means.

Tier 2 — Partially resistant. Sidecar capture and service mesh telemetry sit out of process but inside the application’s reachability radius. A compromised pod cannot reach a sidecar in another container, but service mesh policy and configuration are within reach of an attacker who has compromised the cluster control plane. MCP protocol capture at a proxy sits here: useful, more resistant than in-process, not kernel-grade.

Tier 3 — Cooperative. Framework callbacks (LangChain handlers, LlamaIndex events) and OpenTelemetry or ADOT SDK instrumentation. These sources exist only because the framework chooses to emit them. A compromised framework can strip the handlers. A prompt-injected agent that bypasses framework abstractions to call tools directly emits nothing.

Tier 4 — Repudiable. Application logs and agent self-reports. The agent writes them. The agent can suppress them. The agent can falsify them. Useful for debugging when nobody is lying. Not evidence.

The pattern is consistent across the tiers: the more cooperative the source, the richer the semantic context, and the more the security team has to assume good faith from the workload to read the signal. Tier 1 sources are semantically poorer (a syscall doesn’t tell you what prompt produced it) but evidentially stronger. Effective AI agent security observability layers them: kernel-level signals for what survives, application-layer signals for what those events mean.
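One way to make the four tiers operational is to tag every signal source with its tier and derive how an investigator should treat it. The following is a minimal sketch, not ARMO's implementation: the source labels and the role names ("evidence", "corroboration", "enrichment", "debug-only") are illustrative assumptions, and "corroboration" for Tier 2 is this sketch's shorthand rather than a term from the hierarchy itself.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Trust tiers; a lower number is harder for the workload to tamper with."""
    TAMPER_RESISTANT = 1     # kernel eBPF, cloud provider audit logs
    PARTIALLY_RESISTANT = 2  # sidecars, service mesh, MCP proxy capture
    COOPERATIVE = 3          # framework callbacks, OTel / ADOT SDKs
    REPUDIABLE = 4           # application logs, agent self-reports


# Hypothetical source labels mapped to the tier described in the text.
SOURCE_TIERS = {
    "kernel_ebpf": Tier.TAMPER_RESISTANT,
    "cloud_audit_log": Tier.TAMPER_RESISTANT,
    "sidecar_capture": Tier.PARTIALLY_RESISTANT,
    "service_mesh": Tier.PARTIALLY_RESISTANT,
    "mcp_proxy": Tier.PARTIALLY_RESISTANT,
    "langchain_callback": Tier.COOPERATIVE,
    "otel_sdk": Tier.COOPERATIVE,
    "app_log": Tier.REPUDIABLE,
    "agent_self_report": Tier.REPUDIABLE,
}

# Assumed role names: how a signal from each tier is treated in an investigation.
ROLES = {
    Tier.TAMPER_RESISTANT: "evidence",
    Tier.PARTIALLY_RESISTANT: "corroboration",
    Tier.COOPERATIVE: "enrichment",
    Tier.REPUDIABLE: "debug-only",
}


def role_of(source: str) -> str:
    """Return the investigative role of a signal from `source`."""
    return ROLES[SOURCE_TIERS[source]]


print(role_of("kernel_ebpf"), role_of("otel_sdk"), role_of("app_log"))
```

Keeping the tier as an ordered integer means "is this source at least as trustworthy as that one" becomes a plain comparison.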

Every Layer Has Both a Trusted Source and a Repudiable One

Before mapping signal sources to layers, the five layers in summary:

| Layer | The question it answers | Example signal |
| --- | --- | --- |
| 1. Discovery | What AI agents exist in my clusters? | Process tree showing an unannounced LangChain pod |
| 2. AI-BOM | What is each agent actually using at runtime? | Dynamic library load of a vector DB client not in the manifest |
| 3. Behavioral | What is each agent doing at the prompt and tool-call layer? | MCP method invocation with arguments |
| 4. Execution Graph | What’s the full chain: Agent → Tool → API → Data? | Correlated syscall + tool-call sequence for a single request |
| 5. Identity | If this agent is compromised, what can it reach? | IAM roles actually exercised vs. roles granted |

Each layer has at least one tamper-resistant source and at least one cooperative source. Which one you collect determines whether the layer produces evidence — or just commentary.

Layer 1 — Discovery. The tamper-resistant source is kernel eBPF observing process creation. A new LangChain pod, an inference server, an MCP tool runtime — all appear in the syscall stream the moment they execute, regardless of whether the developer registered them anywhere. The cooperative source is Kubernetes admission webhook telemetry and manifest tagging — useful when the team self-reports, blind to shadow AI deployments by definition.

Layer 2 — AI-BOM. The tamper-resistant source is observed runtime behavior: which dynamic libraries the agent actually loaded, which network destinations it actually called, which model artifacts it actually pulled. The cooperative source is the deployment manifest and the framework’s declared dependencies. The gap between declared and actual is where supply chain attacks unfold — a runtime-derived AI-BOM closes it.

Layer 3 — Behavioral. This is where the trust split is sharpest, and where most observability stacks collect the wrong half. The same tool-call event produces two records. The first comes from the framework: a LangChain handler fires, a LangSmith span is emitted, an OTel trace records method name and arguments. Rich semantic context. Cooperative tier. The second comes from the kernel: a syscall opens a network connection, a file descriptor reads, bytes flow out. No semantic context. Tamper-resistant tier. When the agent is the threat, only the second survives — and most stacks collect only the first.

Layer 4 — Execution Graph. The graph itself is a derived artifact: it joins the framework’s record of which tools were called in which order with the kernel’s record of which syscalls and network calls actually fired. Neither source alone produces an execution graph. Cloud audit logs anchor the graph at the managed-service boundary — when an agent calls Bedrock or Vertex AI, the provider’s audit log is the kernel-equivalent for that segment of the chain.

Layer 5 — Identity. The control-plane source is Kubernetes RBAC and cloud IAM audit logs — definitive for what permissions exist, less useful for what permissions get exercised. The data-plane source is kernel-level observation of which service account tokens the workload actually used to authenticate which calls. The gap between granted and exercised is the blast radius question.
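The declared-versus-actual gap in Layer 2 and the granted-versus-exercised gap in Layer 5 are both, at their simplest, set differences between a cooperative inventory and a runtime-observed one. The sketch below assumes both inventories have already been collected and normalized to plain strings; the library names and IAM actions are made-up illustrative values.

```python
def gap(declared: set[str], observed: set[str]) -> dict[str, set[str]]:
    """Compare a declared (cooperative) inventory with an observed (runtime) one."""
    return {
        "undeclared": observed - declared,  # seen at runtime, never declared
        "unused": declared - observed,      # declared or granted, never exercised
    }


# Layer 2: manifest dependencies vs. libraries seen loading at runtime.
aibom = gap(
    declared={"langchain", "openai"},
    observed={"langchain", "openai", "vector_db_client"},
)

# Layer 5: IAM actions granted vs. actions actually exercised.
identity = gap(
    declared={"s3:GetObject", "s3:PutObject", "bedrock:InvokeModel"},
    observed={"s3:GetObject", "bedrock:InvokeModel"},
)

print(aibom["undeclared"])  # {'vector_db_client'}
print(identity["unused"])   # {'s3:PutObject'}
```

The "undeclared" side is the supply chain question; the "unused" side is the blast radius question, since every granted-but-unexercised permission is reach an attacker inherits for free.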

Three deployment patterns produce three different coverage profiles. A fully self-hosted stack — vector DB, embedding service, agent, framework all in the cluster — reaches every layer through eBPF and produces complete kernel-grade coverage. A managed vector DB with a self-hosted agent loses kernel reach into index-time signals; closing that gap requires provider audit telemetry. A fully managed RAG stack — Bedrock Knowledge Bases, Vertex AI Agent Builder, Azure AI Studio — shrinks kernel reach to network calls and identity attribution only; everything else depends on cloud provider audit logs as the Tier 1 substitute.

Most Existing Stacks Sit in the Repudiable Tier

Before adding instrumentation, audit what’s already in place. Five questions to run against the namespaces and accounts where your AI agents operate:

1. For agent tool-call telemetry, list every source you collect today. For each, ask: can the workload disable this source without your operator team noticing? If the source is an in-process OTel SDK, the answer is yes: an environment variable, a swallowed exception, or an alternate code path is enough to silence it. That source belongs in Tier 3.

2. For agent identity actions, are you observing at the IAM control plane or at the data plane? Most stacks observe the control plane only. The control plane tells you which permissions exist; only the data plane tells you which permissions get exercised.

3. For Layer 3 behavioral signals, is your collector in-process or out-of-process? In-process collectors share fate with the workload they observe. Out-of-process collectors — sidecars, service meshes, proxies, kernel sensors — survive a compromise of the workload itself.

4. For cloud-managed AI services, do you ingest the provider’s audit logs into your security pipeline, or only their performance dashboards? Provider dashboards are SRE telemetry. Provider audit logs are the Tier 1 substitute for kernel reach into managed services.

5. For your AI-BOM, is it derived from runtime observation or declared at deploy time? A declared AI-BOM is a compliance artifact. A runtime-derived AI-BOM is a security control.

Most teams complete this audit and find the same pattern: strong cooperative-tier coverage, partial out-of-process (Tier 2) coverage, and near-zero kernel-grade visibility into agent decisions. For the broader posture exercise across the full security stack, not just AI workloads, work through the visibility stack audit framework in the AI workload security buyer’s guide.
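The result of the audit can be recorded as a small inventory: for each of the five layers, the tier numbers of every source collected today. A minimal sketch, assuming that representation, flags any layer whose best available source is Tier 3 or Tier 4. The inventory values are illustrative, not measured.

```python
LAYERS = ["Discovery", "AI-BOM", "Behavioral", "Execution Graph", "Identity"]


def weakest_layers(inventory: dict[str, list[int]]) -> list[str]:
    """Return layers whose best (lowest-numbered) collected tier is 3 or 4.

    `inventory` maps a layer name to the tier numbers (1-4) of every source
    collected for that layer; a missing or empty entry counts as no coverage.
    """
    flagged = []
    for layer in LAYERS:
        tiers = inventory.get(layer, [])
        best = min(tiers) if tiers else None
        if best is None or best >= 3:
            flagged.append(layer)
    return flagged


# Illustrative inventory for a team with strong dev-observability tooling.
current = {
    "Discovery": [3],        # manifest tagging only
    "AI-BOM": [3],           # declared dependencies only
    "Behavioral": [3, 4],    # OTel spans plus application logs
    "Execution Graph": [3],  # framework trace tree only
    "Identity": [1],         # cloud IAM audit logs already ingested
}

print(weakest_layers(current))
# ['Discovery', 'AI-BOM', 'Behavioral', 'Execution Graph']
```

A Tier 1 entry does not by itself mean the layer is fully covered; for Identity, control-plane audit logs still leave the data-plane question from item 2 open.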

Tamper-Resistant Sources First, Application-Layer Second

The buildout sequence follows the trust hierarchy. Start with what’s tamper-resistant; layer cooperative sources on top once you have signals that survive a compromise.

Phase 1 — Deploy kernel eBPF. A single DaemonSet on every node covers Layer 1 discovery, Layer 4 syscall-level side effects, and Layer 5 identity actually exercised at the data plane. No framework integration, no SDK injection, no code changes required — the sensor observes any container on the node regardless of language, framework, or whether the developer instrumented it. ARMO’s eBPF sensor deploys here at 1–2.5% CPU and roughly 1% memory per node, comparable to standard CWPP eBPF substrates. Kubescape — the open-source foundation underneath the platform — is available for the team to audit against its own workloads before any commercial commitment.

Phase 2 — Add Layer 3 depth. Framework callbacks for in-cluster agent code (LangChain handlers, LlamaIndex events) and MCP JSON-RPC capture at the proxy for tool-call semantics. These sources are cooperative — they tell you what the agent thought it was doing, which is exactly the context the kernel-level signals lack. Treat them as semantic enrichment on top of the Tier 1 substrate, not as primary evidence. When Phase 1 says a syscall opened an unfamiliar outbound connection and Phase 2 says the agent invoked a database tool with arguments referencing a table it has never touched, you have a connected attack story rather than two disconnected anomalies.
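The enrichment step described here is essentially a join between two event streams. The sketch below pairs each kernel event with the most recent tool call from the same process inside a short time window. It assumes both streams carry a process ID and a comparable timestamp, which a real pipeline would have to arrange; the record types, field names, and the 2-second window are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class KernelEvent:
    pid: int
    ts: float    # seconds since epoch
    detail: str  # e.g. "connect 203.0.113.7:443"


@dataclass
class ToolCall:
    pid: int
    ts: float
    tool: str
    args: str


def correlate(kernel, calls, window=2.0):
    """Pair each kernel event with the latest tool call from the same pid
    that started at most `window` seconds earlier; None means no match."""
    pairs = []
    for ev in kernel:
        candidates = [
            c for c in calls
            if c.pid == ev.pid and 0 <= ev.ts - c.ts <= window
        ]
        match = max(candidates, key=lambda c: c.ts, default=None)
        pairs.append((ev, match))
    return pairs


kernel = [
    KernelEvent(pid=41, ts=100.4, detail="connect 203.0.113.7:443"),
    KernelEvent(pid=41, ts=250.0, detail="connect 198.51.100.9:443"),
]
calls = [ToolCall(pid=41, ts=100.1, tool="db_query", args="SELECT * FROM payroll")]

for ev, call in correlate(kernel, calls):
    label = f"{call.tool}({call.args})" if call else "NO FRAMEWORK RECORD"
    print(ev.detail, "<-", label)
```

The unmatched case is the interesting one: a kernel event with no framework record is exactly what a stripped handler or a direct tool call that bypasses the framework would look like.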

Phase 3 — Integrate managed-service audit. CloudTrail, Cloud Audit Logs, and Azure Diagnostic Settings for managed AI service calls. Vector DB audit logs for managed retrieval. Provider-side telemetry for any segment of the agent’s execution graph that runs outside your cluster. This phase closes the cloud-data-plane gap that kernel reach cannot cover by design.

Threat priority shapes how teams sequence the three phases. Prompt injection chains are visible at Phase 1 plus Phase 2 — the kernel observes the resulting syscalls and network egress, framework callbacks observe the tool-call sequence that produced them. Data exfiltration to external endpoints is visible at Phase 1 plus Phase 3 when the destination is a managed service. Supply chain compromise is visible at Phase 1 alone, through runtime-derived AI-BOM.
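The threat-to-phase mapping in the paragraph above can be written down as a lookup so that a prioritized threat list yields the phases to build. This is a sketch only; the dictionary keys are illustrative labels, and the phase sets restate the paragraph rather than add to it.

```python
# Threat-to-phase mapping as stated in the text; keys are illustrative labels.
THREAT_PHASES = {
    "prompt_injection_chain": {1, 2},
    "exfiltration_to_managed_service": {1, 3},
    "supply_chain_compromise": {1},
}


def phases_needed(priorities: list[str]) -> list[int]:
    """Return the sorted union of phases required to see the given threats."""
    needed: set[int] = set()
    for threat in priorities:
        needed |= THREAT_PHASES[threat]
    return sorted(needed)


print(phases_needed(["supply_chain_compromise"]))  # [1]
print(phases_needed(["prompt_injection_chain", "exfiltration_to_managed_service"]))  # [1, 2, 3]
```

Phase 1 appears in every entry, which is the sequencing argument in miniature: no threat in the list is visible without the kernel-level substrate.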

Observability Is a Trust Architecture, Not a Signal List

Every layer of the AI agent observability stack has both a tamper-resistant source and a cooperative one. The choice between them determines whether the layer produces evidence or atmosphere.

The exercise that matters for any security architect reading this is concrete: place every current and proposed observability tool in your environment on the Trust Hierarchy. Identify which layers sit only in the cooperative or repudiable tier. Prioritize the tamper-resistant gap before adding more cooperative signals. The dev observability stack you already have is not the wrong stack — it is half the stack.

The other half is kernel-grade ground truth correlated with application-layer agent decisions. That is the substrate underneath ARMO’s platform for cloud-native security for AI workloads. To see it running against a live agent workload in your environment, book a demo.

FAQ

Can I rely on OpenTelemetry alone for AI agent security observability?

No. OTel produces cooperative-tier instrumentation — the workload emits telemetry through SDKs the workload itself controls. That is sufficient for cost, latency, and quality monitoring under a non-adversarial threat model. Under an adversarial threat model, where the workload may be compromised or prompt-injected, OTel telemetry can be skipped, stripped, or falsified by the code being observed. Pair OTel with a tamper-resistant source — kernel-level eBPF or out-of-cluster audit logs — and treat OTel as the semantic enrichment layer, not the primary evidence stream.

What does kernel-level eBPF see that framework callbacks don’t?

Syscall-level ground truth that survives a compromised framework. eBPF observes process creation, network connections, file access, and credential reads from inside the Linux kernel, below the application layer. A compromised agent can strip its LangChain handlers and bypass its OTel instrumentation, but it cannot skip the kernel observation of the syscalls it actually executes. The trade-off is semantic context: eBPF sees that a network connection opened, not what prompt produced it. The two signals are complements, not substitutes.

Where do managed AI services like Bedrock AgentCore and Vertex AI Agent Builder sit in the Trust Hierarchy?

The provider’s audit logs are Tier 1 — they fire outside the workload’s reachability surface and cannot be tampered with by the agent calling the service. The trade-off is that managed services shrink your kernel reach to the network call boundary; everything that happens inside the managed service depends on the provider’s audit telemetry. CloudTrail for AWS, Cloud Audit Logs for Google Cloud, and Azure Diagnostic Settings are the operational anchors.

How much overhead does kernel-level instrumentation add to an AI workload?

ARMO’s eBPF-based sensor operates at 1–2.5% CPU and roughly 1% memory per node at production scale, comparable to standard CWPP eBPF deployments. Overhead scales with nodes, not with the number of pods or agents: a cluster with 500 pods on 10 nodes pays the same per-node cost as 50 pods on 10 nodes. Validate against your specific workload in staging before committing.

Does security observability for AI agents replace my existing SRE observability stack?

No. Security observability sits above the existing telemetry stack and correlates back into it. The OTel traces, Prometheus metrics, and LLM observability vendor data you already collect remain useful for reliability questions. What security observability adds is the tamper-resistant layer the cooperative stack cannot produce on its own, plus the correlation logic that turns syscall-level events and framework-level decisions into a single attack story.
