Your security team sees an MCP tool server throw an error. Your APM dashboard shows a latency spike. Your logs capture the JSON-RPC request with its method name and parameters. But none of that tells you whether the tool just read a harmless config file or dumped credentials to an external IP.
Traditional observability tools—the APM platforms, the OpenTelemetry traces, the centralized logging pipelines—track performance across your Model Context Protocol deployments. They measure latency, error rates, and throughput. What they miss is the security context that matters: which files tools actually touched, where they sent data, and how agent behavior changed in response.
The core problem is architectural. MCP tool servers sit at the junction of three separate observability domains—protocol traffic, operating system execution, and agent decision-making—each typically owned by a different team using different tools. The spaces between those domains are where attacks live. This guide explains how to build runtime observability across those three layers so you can detect real threats instead of chasing dashboard noise.
Every MCP interaction follows the same lifecycle: a JSON-RPC request arrives specifying a method and parameters, the server validates capabilities, tool invocation triggers process execution, the tool accesses system resources, and a response returns to the agent. That lifecycle has three transition points where security context disappears.
Protocol to tool execution. The protocol layer knows the method name and parameters. The moment the tool process starts, it loses visibility. A request to read_config looks identical at the protocol layer whether the tool reads the intended file or also reads /etc/shadow and ~/.aws/credentials.
Tool execution to OS effects. The tool issues system calls—openat, read, execve, connect—but the kernel has no concept of “MCP tool invocation.” Without mapping syscalls to the JSON-RPC method that triggered them, your monitoring produces generic container security alerts that lack the context to determine intent.
OS effects to agent response. The tool’s return value flows back to the agent, which sees results but not side effects. Was the response from a legitimate query, or did the tool silently exfiltrate data before returning? The agent layer can’t tell.
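To make the first of these gaps concrete, here is roughly what the protocol layer records for a read_config invocation. This is a minimal sketch: the tool name, path, and request ID are illustrative, and the envelope follows MCP's standard JSON-RPC tools/call shape. The payload looks the same whether the tool behaves or misbehaves once it starts executing.

```python
import json

# Hypothetical "read_config" invocation as seen at the protocol layer.
# The same payload is logged whether the tool reads only the intended file
# or also walks /etc/shadow and ~/.aws/credentials after it starts.
request = {
    "jsonrpc": "2.0",
    "id": 42,
    "method": "tools/call",
    "params": {
        "name": "read_config",
        "arguments": {"path": "/app/config/settings.yaml"},
    },
}
print(json.dumps(request, indent=2))
```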
Attackers exploit these gaps deliberately—crafting inputs that look normal at every individual layer while the attack spans the boundaries between them. This is what makes MCP security fundamentally different from standard API monitoring. The five-layer observability model for AI agents addresses this at the architectural level. For MCP tool servers specifically, the operational question is: how do you instrument each transition point and correlate signals across them?
Your existing observability stack answers the wrong questions for MCP security. We’ve broken down why legacy security tools fail to protect cloud AI workloads in detail elsewhere; here’s how those failures manifest specifically for MCP.
APM platforms see requests entering an MCP server and responses leaving. They measure latency and error rates. They cannot see what the tool process did between those points—which files it read, which outbound connections it opened, whether it spawned child processes.
OpenTelemetry instrumentation depends on code being instrumented—creating a trust problem when the threat model includes the tool itself being compromised. Malicious tools can strip or falsify instrumentation. You see only what the code chooses to report, not what the kernel observed.
Protocol-level monitoring captures every MCP JSON-RPC request and response but ends at the protocol boundary. It proves a tool was invoked; it cannot prove what executed on the host afterward.
Agent-layer analytics profile which tools agents call and where they fail. Without kernel-level telemetry, you can’t distinguish a benign error from an exploitation attempt—they produce identical patterns at the agent layer.
If you can’t trace an incident from MCP protocol request through OS-level execution to agent behavior, you have telemetry, not security observability.
Structure your MCP observability around three layers, each producing different signals that become meaningful only when correlated. The MITRE ATLAS framework catalogs the adversary techniques that span these layers.
| Layer | Threat Vector | Key Signal | Required Context | Detection Question |
| --- | --- | --- | --- | --- |
| Protocol | Schema manipulation, session hijacking, capability escalation | Method anomalies, auth failures, payload size spikes | Correlation IDs, client identity, capability set | Is this client requesting tools outside its normal scope? |
| Tool Execution | Credential harvesting, data exfiltration, supply chain compromise | File access, network egress, process lineage, binary integrity | Container context, syscall events, namespace/pod identity | Did this tool access sensitive paths or unexpected endpoints? |
| Agent Behavior | Prompt injection, capability drift, automated probing | Tool-call sequences, retry patterns, policy violations | Reasoning context, invocation records, trace IDs | Is this agent chaining tools in a pattern never seen before? |
The protocol layer covers everything on the wire—transport, JSON-RPC handshake, and ongoing request/response messages. For security, you care about how the protocol is used and whether usage shifts in dangerous ways.
Threat scenarios: Parameter manipulation pushing tools into privileged states (paths outside allowed directories, injected fields). Session hijacking through replayed tokens or captured MCP messages. Capability escalation abusing the MCP handshake to claim unauthorized tool access.
Signals: Method frequency anomalies—clients suddenly calling sensitive tools beyond baseline. Payload size spikes, malformed requests, or repeated invalid schemas suggesting probing. Auth failures with token reuse from new IPs. Unusual error code patterns aligning with known probing signatures.
Logging: Every protocol event needs a correlation ID that flows into tool execution telemetry, client identity and session details, and the capability set at request time. Without correlation IDs that survive the protocol-to-execution boundary, anomalies stay isolated.
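A minimal sketch of that propagation, assuming a Python MCP server that launches tools as subprocesses: the handler stamps each request with a correlation ID, logs the protocol-layer context, and hands the ID to the tool process so execution-layer telemetry can be joined back later. The field names, the environment-variable mechanism, and the tool path are all illustrative.

```python
import json
import logging
import os
import subprocess
import time
import uuid

log = logging.getLogger("mcp.protocol")

def handle_tools_call(request: dict, client_id: str, capabilities: list[str]) -> None:
    """Log a protocol-layer event and propagate its correlation ID into tool execution."""
    correlation_id = str(uuid.uuid4())

    # Protocol-layer record: who asked for what, under which capability set.
    log.info(json.dumps({
        "ts": time.time(),
        "correlation_id": correlation_id,
        "client_id": client_id,
        "method": request["method"],
        "tool": request["params"]["name"],
        "capabilities": capabilities,
    }))

    # Hand the ID to the tool process (here via an environment variable) so
    # execution-layer telemetry can be joined back to this request.
    subprocess.run(
        ["/opt/tools/read_config"],  # hypothetical tool binary
        env={**os.environ, "MCP_CORRELATION_ID": correlation_id},
        check=False,
    )
```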
This is where ground truth lives. When the MCP server invokes a tool, that tool becomes one or more running processes making system calls. This is the layer most monitoring ignores—and where compromised tools reveal their true behavior.
Scenario: Credential harvesting. A compromised read_config tool targets /etc/shadow, cloud metadata endpoints, and credential stores via openat and read syscalls. The protocol layer still sees a valid call with normal parameters.
Scenario: Unexpected network egress. A tool that never used the network opens outbound connections to unknown IPs and sends data disproportionate to its normal operation—especially after reading sensitive files. The exfiltration is visible only at the execution layer.
Key signals: Process lineage mapping MCP server to tool processes to child processes. File access classified by path sensitivity. Network flows tied to specific tool processes. Binary integrity checks for unexpected executables.
The most practical capture method for Kubernetes is eBPF-based runtime monitoring—attaching programs to kernel events to collect ground-truth behavior without modifying application code. ARMO’s Cloud Application Detection and Response (CADR) sensors capture these signals, recording syscalls tagged with Kubernetes metadata (namespace, pod, container, node) and tied to the specific MCP tool process. That execution-layer ground truth is what makes protocol and agent anomalies actionable.
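The enrichment step is easier to reason about with a concrete shape in front of you. The sketch below uses hypothetical field names rather than ARMO's actual schema: a syscall event from an eBPF sensor gets tagged with Kubernetes identity, path sensitivity, and the correlation ID of the MCP request whose process triggered it.

```python
from dataclasses import dataclass

# Illustrative list of paths treated as sensitive; a real deployment would use policy.
SENSITIVE_PREFIXES = ("/etc/shadow", "/root/.aws", "/var/run/secrets")

@dataclass
class SyscallEvent:
    # Fields an eBPF sensor could emit for an openat() call (illustrative schema).
    pid: int
    syscall: str
    path: str
    namespace: str
    pod: str
    container: str

def enrich(event: SyscallEvent, pid_to_correlation: dict[int, str]) -> dict:
    """Tag a kernel event with the MCP request that spawned the process."""
    return {
        "correlation_id": pid_to_correlation.get(event.pid, "unattributed"),
        "syscall": event.syscall,
        "path": event.path,
        "sensitive": event.path.startswith(SENSITIVE_PREFIXES),
        "k8s": {"namespace": event.namespace, "pod": event.pod, "container": event.container},
    }
```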
The agent layer is about decisions—which tools the agent calls, in what order, and how sequencing changes over time. From a security perspective, this is where prompt injection, jailbreaks, and policy bypasses become visible as unusual tool-call patterns.
Scenario: Prompt injection driving abnormal chaining. Hidden instructions cause the agent to jump from benign tools to file-read and network-send tools in a single turn. Each individual call looks valid at the protocol and execution layers. Only the agent’s tool-call graph reveals the anomalous sequence—and only correlation with execution-layer signals (sensitive file access followed by outbound transfer) produces a high-confidence alert.
Scenario: Agent capability drift. After a protocol anomaly, the agent starts invoking tools outside its historical scope—potentially signaling session hijacking or the early stages of agent escape, where an agent acts outside its intended boundaries.
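One lightweight way to surface sequences like these is to baseline tool-to-tool transitions and flag pairs never observed during normal operation. The sketch below is illustrative only: the tool names are hypothetical, and production systems use richer behavioral models than bigrams.

```python
from collections import defaultdict

class ToolSequenceBaseline:
    """Learn which tool-to-tool transitions an agent normally makes,
    then flag transitions never seen during the baseline window."""

    def __init__(self) -> None:
        self.seen = defaultdict(set)

    def learn(self, sequence: list[str]) -> None:
        for a, b in zip(sequence, sequence[1:]):
            self.seen[a].add(b)

    def anomalies(self, sequence: list[str]) -> list[tuple[str, str]]:
        return [(a, b) for a, b in zip(sequence, sequence[1:]) if b not in self.seen[a]]

baseline = ToolSequenceBaseline()
baseline.learn(["search_docs", "read_config", "summarize"])

# A prompt-injected turn jumping from a benign tool to file read plus network send.
print(baseline.anomalies(["search_docs", "read_file", "send_http"]))
# -> [('search_docs', 'read_file'), ('read_file', 'send_http')]
```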
ARMO’s behavioral detection builds baselines automatically through Application Profile DNA, learning normal tool sequences and resource access patterns for each MCP tool server. When new patterns emerge, ARMO calculates anomaly scores combining signals from all three layers into a single risk score—cross-layer correlation rather than single-source guesswork.
The OWASP Top 10 for Agentic AI catalogs threat categories that apply to MCP environments. The following scenarios show how those threats manifest across all three layers—not as detection problems, but as observability architecture problems.
Prompt injection becomes uniquely dangerous in MCP environments because injected instructions translate directly into tool invocations with real system effects. A user message includes hidden instructions telling the agent to call a file-read tool on sensitive directories, then chain to a network tool that sends data externally.
At the protocol layer, you see valid MCP methods with normal parameters—nothing malformed, nothing outside the session’s capability set. At the execution layer, the file-read tool accesses /etc, secret paths, and credential stores while the network tool connects to an external host not in its historical destination list. At the agent layer, the tool-call graph shows a never-before-seen jump from benign tools to high-risk tools in a single turn. Without cross-layer correlation, each layer either sees nothing or generates a low-priority alert that gets triaged as noise.
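This is the core argument for a correlation step that groups per-layer findings by a shared correlation or trace ID and promotes only cross-layer matches. A minimal sketch, with an assumed event shape rather than any particular product's format:

```python
def correlate(events: list[dict]) -> list[dict]:
    """Group per-layer findings by correlation ID and promote cross-layer matches.

    Each event is assumed to look like:
      {"correlation_id": "...", "layer": "protocol" | "execution" | "agent", "finding": "..."}
    """
    by_id: dict[str, set[str]] = {}
    for e in events:
        by_id.setdefault(e["correlation_id"], set()).add(e["layer"])

    alerts = []
    for cid, layers in by_id.items():
        if {"execution", "agent"} <= layers:
            # Individually low-priority findings become one high-confidence incident
            # when the same request shows anomalies at multiple layers.
            alerts.append({"correlation_id": cid, "severity": "high", "layers": sorted(layers)})
    return alerts
```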
MCP tools often run with broad permissions—they can read files, query databases, and call APIs. When a tool is misconfigured, compromised, or misused through prompt manipulation, it becomes a data exfiltration channel. A tool designed to read “any file” becomes a secrets reader. A tool that runs database queries gets pushed to dump entire tables.
Detection depends on execution-layer signals: tools reading directories they never normally touch, database queries selecting far more rows than typical, API calls reaching admin or export endpoints from tools that previously operated in narrow scopes. You need an audit trail of resource access tied to MCP context—which tool accessed which path, which query ran against which database—connected back to the protocol request and agent session that initiated it.
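In practice that audit trail is a join: given structured events that carry tool, resource, and session identifiers (field names here are illustrative), you can reconstruct exactly which resources each tool touched on behalf of a given agent session.

```python
from collections import defaultdict

def audit_trail(events: list[dict], session_id: str) -> dict[str, list[str]]:
    """Reconstruct which resources each tool touched within one agent session.

    Assumes events carry "agent_session", "tool", and "resource" fields;
    the field names are illustrative, not a standard.
    """
    touched = defaultdict(list)
    for e in events:
        if e.get("agent_session") == session_id and "resource" in e:
            touched[e["tool"]].append(f'{e["resource"]["type"]}:{e["resource"]["path"]}')
    return dict(touched)
```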
MCP tools ship as packages, containers, or plugins—making them software supply chain targets. There is no CVE for “tool now exfiltrates data when called with certain parameters.” SCA and SBOM tools report clean because the attack added malicious logic, not a published vulnerability. Static analysis sees legitimate code structure—the malicious behavior triggers only when specific MCP parameters arrive at runtime.
The only reliable detection is runtime behavioral comparison: profile normal behavior and watch for deviations after updates. New outbound connections, new privileged file reads, new child processes with no justification. This is the same principle behind progressive enforcement for AI agents: observe what’s normal first, then flag what deviates.
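A minimal sketch of that comparison, assuming you already collect per-version behavior profiles from runtime telemetry; the profile fields and example values are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class ToolProfile:
    """Observed runtime behavior for one tool version (illustrative fields)."""
    outbound_hosts: set[str] = field(default_factory=set)
    file_paths: set[str] = field(default_factory=set)
    child_binaries: set[str] = field(default_factory=set)

def diff(before: ToolProfile, after: ToolProfile) -> dict[str, set[str]]:
    """Return behaviors that appeared only after the update."""
    return {
        "new_outbound_hosts": after.outbound_hosts - before.outbound_hosts,
        "new_file_paths": after.file_paths - before.file_paths,
        "new_child_binaries": after.child_binaries - before.child_binaries,
    }

v1 = ToolProfile({"api.internal"}, {"/app/config"}, set())
v2 = ToolProfile({"api.internal", "203.0.113.7"}, {"/app/config", "/root/.aws/credentials"}, {"curl"})
print(diff(v1, v2))  # new egress, credential path, and child process after the update
```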
MCP security logging differs from debug logging in its objective: enabling threat detection and incident investigation, not bug reproduction. Six event categories matter most:

- Authentication events: all auth attempts, token issuance, permission changes.
- Tool invocations: every tool call with name, caller identity, timestamp, and parameter metadata.
- Resource access: file reads/writes, database queries, and API calls tagged with tool and process identity.
- Network connections: outbound connections with destination, port, and byte counts.
- Policy violations: any action blocked or flagged by security rules.
- Anomalies: events deviating from baseline, such as uncommon tool sequences, unusual error patterns, and access volume spikes.
When these event types are collected in a structured, correlated way, analysts can reconstruct an attack path from protocol entry to system impact instead of chasing disconnected log lines across multiple tools.
Use a consistent JSON schema with high-precision timestamps, trace/span IDs, MCP method name and version, Kubernetes pod/namespace identity, and security-specific fields (resource sensitivity classification, baseline deviation flags). Redact raw prompts and secrets; retain data type, size, and security tags (like “contained secret-like pattern”). Align with OTLP format to integrate with existing platforms while adding MCP-specific security fields that standard OTLP schemas don’t include.
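Here is one hypothetical event following that structure. The field names and values are illustrative rather than a standard, and the payload carries only metadata and security tags, never raw content.

```python
import json

# A hypothetical resource-access event in the structure described above.
event = {
    "timestamp": "2026-04-28T14:03:07.412893Z",
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "span_id": "00f067aa0ba902b7",
    "mcp": {"method": "tools/call", "tool": "read_config", "protocol_version": "2025-03-26"},
    "k8s": {"namespace": "agents", "pod": "mcp-tools-7d9f", "container": "read-config"},
    "resource": {"type": "file", "path": "/var/run/secrets/token", "sensitivity": "secret"},
    "payload": {"bytes": 2048, "tags": ["contained-secret-like-pattern"]},  # metadata only, no raw content
    "baseline_deviation": True,
}
print(json.dumps(event, indent=2))
```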
eBPF lets you run small programs inside the Linux kernel observing syscalls, network events, and file operations from all containers on a node—without modifying applications. Three properties make it ideal for MCP security: kernel-level ground truth that compromised tools can’t evade, no code changes to MCP servers or tools, and predictable overhead suitable for production.
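If your team wants to see the mechanism before adopting a platform, a minimal illustration is possible with the BCC Python bindings. This is not ARMO's sensor; it assumes root privileges, the bcc package, and a reasonably recent kernel, and it simply prints every openat() path observed node-wide.

```python
# Minimal illustration of kernel-level capture with BCC (not a production sensor).
from bcc import BPF

bpf_text = r"""
TRACEPOINT_PROBE(syscalls, sys_enter_openat) {
    char path[256];
    // Copy the user-space filename argument onto the BPF stack, then emit it.
    bpf_probe_read_user_str(&path, sizeof(path), args->filename);
    bpf_trace_printk("openat: %s\n", path);
    return 0;
}
"""

b = BPF(text=bpf_text)
print("Tracing openat() across all processes on this node... Ctrl-C to stop")
b.trace_print()
```

Even this toy tracer demonstrates the key property: the monitored tools need no code changes and cannot opt out of being observed.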
ARMO’s eBPF sensors deploy as a DaemonSet—one pod per node—attaching to kernel hooks like kprobes and uprobes to observe syscalls. Events buffer through ring buffers before shipping to a central collector. Treat eBPF monitoring overhead like any production dependency: measure CPU and memory impact in staging, adjust event filtering to focus on security-relevant signals, and monitor latency impact on critical MCP workloads. The ARMO Platform is built on Kubescape, an open-source Kubernetes security platform used by over 100,000 organizations, so your team can validate sensor behavior and overhead before committing to the full commercial platform.
For multi-cluster environments, ship events into a central correlation plane that aggregates telemetry and runs cross-layer detection logic. ARMO’s CADR platform connects protocol anomalies from one cluster with execution-layer signals from another and agent behavior from a third into a single incident timeline—transforming distributed alerts into coherent attack stories that show exactly who made the request, which tool was invoked, where it ran, what resources it touched, and where data went.
Design retention around investigation needs. Protocol logs—relatively compact—can have longer retention for audit purposes. High-volume syscall events from the execution layer may need shorter retention with summary rollups. Agent traces sit in a middle tier. Most MCP incidents require 7–30 days of execution-layer data and 90+ days of protocol and agent data.
Use this framework to assess any product claiming MCP security coverage. These criteria align with the broader AI agent security framework for cloud environments organizations should use when evaluating AI workload security posture.
| Layer | Coverage Question |
| --- | --- |
| Protocol | Can you detect schema manipulation, session abuse, and capability escalation in MCP JSON-RPC traffic? |
| Tool Execution | Do you have kernel-level visibility into file access, network connections, and process behavior tied to specific MCP tool invocations? |
| Agent Behavior | Can you identify abnormal tool-call sequences, capability drift, and policy violations in agent behavior? |
| Correlation | Can you connect anomalies across all three layers into a single incident timeline? |
If you can’t answer “yes” to all four, you have runtime gaps attackers will find. Most organizations discover they have strong protocol logging, partial or absent execution-layer visibility, fragmented agent behavior data, and no cross-layer correlation at all. That’s the typical starting point—and closing those gaps is where the security value lives.
You can’t secure MCP tool servers with static scans and protocol logs. The attacks that matter span the boundaries between protocol, execution, and agent layers—boundaries no single tool category monitors. The three-layer model gives your team a framework for building and evaluating observability, and the transition from observability to enforcement follows the same observe-to-enforce progression that applies to AI agent security broadly: watch first, build baselines, then enforce with evidence.
To see all three layers working together—MCP tool servers profiled automatically, behavioral baselines built from runtime observation, attack chains traced end to end—watch a demo of how ARMO builds complete attack stories from MCP runtime telemetry.
How does runtime observability differ from static MCP security scanning? Static scanning catches known vulnerabilities before deployment. Runtime observability watches how tools actually behave in production. Supply chain compromises and prompt injection only manifest at runtime, making static scanning necessary but insufficient.
Can traditional APM tools monitor MCP tool servers for security? APM tracks latency and error rates but misses which files tools touch, which connections they open, and how agent tool-call sequences change. It lacks the cross-layer correlation MCP threat detection requires.
What is the performance impact of eBPF-based MCP monitoring? eBPF runs inside the kernel with strict safety guarantees. CPU and memory overhead is low and suitable for always-on production use. With event filtering focused on security-relevant signals, most teams find the impact acceptable.
How does MCP observability connect to broader cloud security? MCP tool servers are gateways into your cloud environment. Their observability must connect to IAM logs, network policy data, and Kubernetes workload signals to trace full attack chains. ARMO’s cloud-native security platform for AI workloads provides this unified visibility.
How can teams demonstrate MCP observability for compliance audits? Map the three-layer framework to specific logs and signals collected at each layer, then show how they correlate in your security platform to produce incident timelines. Sample reconstructions and documented detection rules provide concrete audit evidence.