AI Agent Security Framework on AWS EKS: Implementation Guide

Mar 30, 2026

Yossi Ben Naim
VP of Product Management

Key takeaways

  • What does implementing an AI agent security framework on EKS actually require? AWS gives you the building blocks—IRSA, EKS Pod Identity, GuardDuty EKS Runtime Monitoring, CloudTrail, VPC Flow Logs—but none of them understand AI agent behavior at the application layer. Implementation means wiring those primitives together with runtime behavioral intelligence so each framework phase (Observe, Posture, Detect, Enforce) draws on EKS-specific telemetry, not generic Kubernetes assumptions.
  • Where does GuardDuty EKS Runtime Monitoring stop? GuardDuty deploys an eBPF-based agent as an EKS add-on that detects known threat signatures—reverse shells, crypto miners, credential exfiltration—inside containers. It does not build behavioral baselines for AI agents, cannot distinguish legitimate tool invocations from malicious ones, and has no concept of prompt-driven behavior. For AI-specific detection, you need a layer that understands what “normal” means for each agent workload.
  • Should I use IRSA or EKS Pod Identity for AI agent workloads? EKS Pod Identity is simpler to configure and eliminates the OIDC provider setup that IRSA requires, but IRSA works across EKS, EKS Anywhere, and self-managed clusters. For AI agents that call Bedrock, SageMaker, S3, and DynamoDB, the choice matters less than the workflow: start with observation-mode permissions, profile which AWS APIs each agent actually calls, then tighten to least privilege based on observed behavior.

You’ve enabled GuardDuty EKS Runtime Monitoring across your clusters. You’ve configured IRSA for your Bedrock-calling agents. CloudTrail is logging every bedrock:InvokeModel event. And last Tuesday, one of your AI agents exfiltrated 12,000 customer records through a sequence of API calls that every one of those tools recorded as completely normal—because at the control plane level, they were.

AWS-native tools handle identity, encryption, and control-plane logging well — but they stop at the workload boundary, leaving a blind spot exactly where agentic AI threats happen: inside your containers, at runtime, where agents make autonomous decisions about which tools to call and which data to access. 

This gap is structural, not a configuration miss — and closing it requires a phased approach: 

Observe what your agents actually do, assess Posture gaps between permitted and observed behavior, Detect deviations from established baselines, then Enforce least privilege based on evidence. 

This Observe → Posture → Detect → Enforce methodology applies regardless of cloud provider. This article is the execution layer: how to run it on EKS specifically, using the AWS primitives you already have and the runtime behavioral intelligence you need to add.

If your AI agents run on EKS, call AWS AI services, and make autonomous tool decisions, this is your implementation playbook.

What You Already Have: AWS-Native Primitives for AI Agent Security

Before deploying anything new, take inventory of what EKS already gives you. Each AWS primitive contributes to a specific framework phase—but each also has a boundary where it stops being useful for AI agent behavior.

| AWS Primitive | What It Contributes | Framework Phase | Where It Stops |
|---|---|---|---|
| GuardDuty EKS Runtime Monitoring | eBPF agent detects reverse shells, crypto miners, credential exfiltration, suspicious process execution inside containers | Detect (known threat signatures) | No behavioral baselines. Cannot distinguish legitimate AI tool calls from malicious ones. No AI agent context. |
| CloudTrail | API-level audit trail: who called what, when, from which role. Captures bedrock:InvokeModel, sagemaker:InvokeEndpoint, sts:AssumeRoleWithWebIdentity | Observe (API activity), Posture (permission usage) | Logs the request, not the consequence. Cannot see what the agent did after the API returned a response. |
| VPC Flow Logs | Network connection metadata: source/destination IPs, ports, protocols, bytes transferred per flow | Observe (egress mapping) | Shows that a connection happened, not why. Cannot attribute network activity to a specific tool call or prompt. |
| IRSA / EKS Pod Identity | Per-pod IAM role mapping. Temporary credentials via STS. No long-lived keys in manifests. | Enforce (identity boundaries) | Grants the permissions ceiling, not the actual permissions needed. Cannot tell you which of 47 permitted APIs the agent actually uses. |
| Security Groups for Pods | Pod-level network segmentation using AWS security groups, applied via the VPC CNI plugin | Enforce (network isolation) | Static rules only. Cannot adapt to dynamic agent egress patterns without runtime observation data. |

The pattern is clear: AWS primitives handle identity, audit, and infrastructure-level detection well. The gap is behavioral—understanding what your AI agents actually do at runtime, distinguishing normal tool invocations from malicious ones, and deriving enforcement policies from observed behavior. That’s the gap the Observe → Posture → Detect → Enforce framework was built to close, and the sections below show how each phase wires into EKS specifically.

Phase 1: Observe — Deploying Runtime Visibility on EKS

Observation is the foundation. You cannot assess posture, detect threats, or enforce policies for AI agents until you know what they actually do—which tools they call, which endpoints they reach, which models they load, and which data sources they access. AWS-native telemetry captures the control-plane view. The Observe phase adds the runtime behavioral view.

eBPF sensor deployment on EKS. ARMO’s sensor deploys as a DaemonSet on EKS managed node groups via Helm. The sensor instruments kernel-level events—syscalls, process trees, network connections, file access—for every pod on each node, including your AI agent workloads. At 1–2.5% CPU and 1% memory overhead, it falls within the performance budget most platform teams already accept for observability tooling. One important constraint: EKS Fargate does not support DaemonSets. GuardDuty’s own EKS Runtime Monitoring has the same limitation—it runs on EC2-backed nodes only. If you’re running AI agent workloads on Fargate, you’ll need to move them to managed node groups for runtime observability, or accept that Fargate workloads operate without kernel-level visibility.

CloudTrail event filtering for AI workloads. Configure CloudTrail to capture the specific API actions that matter for AI agent observation: bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream for Bedrock calls, sagemaker:InvokeEndpoint for SageMaker inference, sts:AssumeRoleWithWebIdentity for IRSA token exchanges, and secretsmanager:GetSecretValue for credential access. These events tell you which agent called which AWS service, from which role, at what time—the control-plane half of the observation picture.
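As a minimal sketch of that filtering step, the snippet below narrows a batch of CloudTrail records down to the AI-relevant actions listed above. The record shapes and ARNs are hypothetical sample data, not output from a real trail; in practice you would apply the same predicate to records delivered to S3 or queried via CloudTrail Lake.

```python
import json

# API actions worth isolating for AI agent observation (from the list above)
AI_WATCHLIST = {
    ("bedrock.amazonaws.com", "InvokeModel"),
    ("bedrock.amazonaws.com", "InvokeModelWithResponseStream"),
    ("sagemaker.amazonaws.com", "InvokeEndpoint"),
    ("sts.amazonaws.com", "AssumeRoleWithWebIdentity"),
    ("secretsmanager.amazonaws.com", "GetSecretValue"),
}

def filter_ai_events(records):
    """Keep only CloudTrail records whose (eventSource, eventName)
    pair is on the AI watchlist."""
    return [
        r for r in records
        if (r.get("eventSource"), r.get("eventName")) in AI_WATCHLIST
    ]

# Hypothetical records: one Bedrock invocation, one unrelated EC2 call
records = [
    {"eventSource": "bedrock.amazonaws.com", "eventName": "InvokeModel",
     "userIdentity": {"arn": "arn:aws:sts::123456789012:assumed-role/agent-role/pod"}},
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances"},
]
print(json.dumps([r["eventName"] for r in filter_ai_events(records)]))
```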

VPC Flow Logs for egress mapping. Enable VPC Flow Logs at the ENI level for your agent pods. This catalogs every outbound connection—model API endpoints, vector database hosts, external SaaS tools, RAG data sources. Flow Logs show you that your agent connected to api.openai.com:443 at 14:32:07 and transferred 2.3 MB. What they cannot show you is why—was that a legitimate embedding request, or was the agent exfiltrating context window contents through a permitted endpoint? That “why” comes from correlating network events with application-layer behavior, which is what ARMO’s runtime sensor provides.
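The egress-mapping half of that picture can be sketched as a small aggregator over Flow Log lines in the default format (version, account, ENI, src/dst address, src/dst port, protocol, packets, bytes, start, end, action, log-status). The log lines below are fabricated samples; the point is the shape of the catalog you build, not the specific values.

```python
from collections import defaultdict

def egress_summary(flow_log_lines):
    """Aggregate accepted bytes per (dstaddr, dstport) from VPC Flow Log
    records in the default space-separated format."""
    totals = defaultdict(int)
    for line in flow_log_lines:
        fields = line.split()
        # Default format has 14 fields; field 12 is the ACCEPT/REJECT action
        if len(fields) < 14 or fields[12] != "ACCEPT":
            continue
        dstaddr, dstport, nbytes = fields[4], fields[6], int(fields[9])
        totals[(dstaddr, dstport)] += nbytes
    return dict(totals)

# Fabricated sample lines: two accepted HTTPS flows, one rejected flow
lines = [
    "2 123456789012 eni-0a1b 10.0.1.5 52.119.0.10 49152 443 6 25 2400000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b 10.0.1.5 52.119.0.10 49153 443 6 10 100000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b 10.0.1.5 203.0.113.9 49154 8443 6 3 500 1700000000 1700000060 REJECT OK",
]
print(egress_summary(lines))
```

The output is exactly the "that, not why" catalog described above: destinations and volumes per flow, with no link back to the tool call that caused them.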

The first deliverable: Runtime AI-BOM. Within the first 48 hours of observation, ARMO generates a runtime AI-BOM—an inventory of what your AI agents actually use at runtime. Not what your deployment manifest declares, but what’s actually loaded: which AI frameworks (LangChain, LlamaIndex, CrewAI), which models, which tool integrations, which external data sources. On EKS, this catches the gap between ECR image contents and runtime reality—the adapter model your agent downloads from Hugging Face at startup, the MCP server connection a developer added last Thursday, the Python package that loads a transitive dependency nobody audited.
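Conceptually, the manifest-versus-runtime gap is a set difference. The sketch below illustrates it with made-up component identifiers (the adapter and MCP server names are placeholders); a real runtime AI-BOM is built from kernel-level observation, not from hand-maintained lists.

```python
def aibom_drift(declared, observed):
    """Compare manifest-declared components against runtime-observed ones.
    Returns components seen only at runtime (the blind spot) and
    components declared but never loaded."""
    declared, observed = set(declared), set(observed)
    return {
        "runtime_only": sorted(observed - declared),
        "declared_unused": sorted(declared - observed),
    }

declared = ["langchain==0.2.1", "boto3==1.34.0"]
observed = ["langchain==0.2.1", "boto3==1.34.0",
            "hf-adapter:mistral-lora",   # pulled at startup, not in the ECR image
            "mcp-server:jira"]           # added by a developer, never audited
print(aibom_drift(declared, observed))
```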

Phase 2: Posture and Detect — Building Baselines from Combined Telemetry

Once observation is running, the framework’s Posture and Detect stages draw on the same telemetry pool. Posture assesses the gap between what agents can do and what they actually do. Detection watches for deviations from what they normally do. Both require behavioral baselines—and on EKS, those baselines fuse AWS-native signals with runtime sensor data.

IRSA and Pod Identity role profiling. Your agent’s IRSA role might permit access to 47 AWS APIs. CloudTrail data from Phase 1 shows it actually calls three: bedrock:InvokeModel, s3:GetObject on a specific RAG bucket, and dynamodb:Query on a session table. That 47-to-3 ratio is your posture gap—the difference between declared permissions and observed behavior. ARMO’s Application Profile DNA captures this per agent, creating a behavioral profile that includes not just which APIs are called but the call patterns, frequencies, payload sizes, and data access volumes that constitute “normal” for each workload.
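The 47-to-3 comparison reduces to comparing the permitted action set against the actions observed in CloudTrail. A sketch, with a shortened permitted list standing in for the full 47:

```python
def posture_gap(permitted_actions, observed_actions):
    """Compare IAM actions a role permits against actions actually
    observed in CloudTrail; the surplus is the posture gap."""
    permitted, observed = set(permitted_actions), set(observed_actions)
    return {
        "permitted": len(permitted),
        "observed": len(observed & permitted),
        "unused": sorted(permitted - observed),
    }

# Abbreviated stand-in for a 47-action role
permitted = ["bedrock:InvokeModel", "s3:GetObject", "dynamodb:Query",
             "s3:PutObject", "dynamodb:Scan", "kms:Decrypt"]
observed = ["bedrock:InvokeModel", "s3:GetObject", "dynamodb:Query"]
gap = posture_gap(permitted, observed)
print(f"{gap['permitted']} permitted, {gap['observed']} used, "
      f"{len(gap['unused'])} never called")
```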

Network destination baselining. VPC Flow Logs mapped your agents’ egress destinations in Phase 1. Now the baseline separates expected from suspicious. Your Bedrock-calling agent normally connects to bedrock-runtime.us-east-1.amazonaws.com and your internal vector DB at 10.0.4.x:6333. If it suddenly reaches 45.33.x.x:8443—an IP nobody has seen before—that deviation against the baseline triggers investigation. Without the baseline, that connection is either invisible (if you haven’t written a deny rule for it) or one of thousands of network events your SOC has to manually triage.
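The baseline check itself is simple once the baseline exists: a destination either matches a known hostname or internal CIDR, or it is a deviation. A minimal sketch, assuming a baseline of one Bedrock hostname and one internal subnet:

```python
import ipaddress

def check_egress(baseline_hosts, baseline_cidrs, dst_host, dst_ip):
    """Classify a destination against baselined hostnames and
    internal CIDR ranges."""
    if dst_host in baseline_hosts:
        return "baseline"
    ip = ipaddress.ip_address(dst_ip)
    if any(ip in ipaddress.ip_network(cidr) for cidr in baseline_cidrs):
        return "baseline"
    return "deviation"

hosts = {"bedrock-runtime.us-east-1.amazonaws.com"}
cidrs = ["10.0.4.0/24"]  # internal vector DB subnet from the baseline

print(check_egress(hosts, cidrs, "", "10.0.4.17"))  # vector DB -> baseline
print(check_egress(hosts, cidrs, "", "45.33.12.8"))  # never seen -> deviation
```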

Where GuardDuty EKS Runtime Monitoring fits—and where it stops. GuardDuty’s eBPF agent on EKS detects known threat patterns: reverse shells, crypto mining processes, credential exfiltration to known malicious IPs, and suspicious binary execution. This is valuable detection coverage, and it contributes to the Detect phase. But GuardDuty operates on pre-defined threat signatures, not behavioral baselines specific to your workloads. It cannot answer “is this tool invocation normal for this agent?” because it has no concept of per-agent behavioral profiles. The practical distinction: GuardDuty catches an agent spawning /bin/sh (known-bad). ARMO catches an agent making 500 s3:GetObject calls when its baseline shows 5 per hour (behavioral anomaly). Both detections matter. They cover different threat categories, which is why the layered approach works.
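The 500-versus-5 example is a rate deviation against a per-agent baseline. The sketch below uses a simple multiplicative threshold; the factor of 10 is an illustrative assumption, and a production baseline would track variance per agent rather than a fixed multiplier.

```python
def rate_anomaly(baseline_per_hour, observed_per_hour, factor=10.0):
    """Flag an API call rate that exceeds the per-agent baseline by a
    multiplicative factor (the factor is an assumed, tunable threshold)."""
    return observed_per_hour > baseline_per_hour * factor

print(rate_anomaly(5, 500))  # 500 s3:GetObject/hour vs baseline of 5 -> True
print(rate_anomaly(5, 8))    # within normal variation -> False
```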

Attack story correlation across layers. The most dangerous AI agent attacks chain through multiple layers. A prompt injection at the application layer triggers an abnormal tool call, which causes unexpected process execution in the container, which abuses a service account to call the Kubernetes API, which assumes an IAM role and exfiltrates data from S3. CloudTrail sees the S3 access. GuardDuty might catch the suspicious process. VPC Flow Logs record the network connection. But each tool produces a separate, uncorrelated alert. ARMO’s CADR platform correlates signals across cloud, Kubernetes, container, and application layers into a single attack story—reducing what used to be hours of manual log assembly into minutes of reviewing a unified narrative. Customers report 90%+ reduction in investigation and triage time.

Phase 3–4: Enforce — Progressive Policy Rollout on EKS

Enforcement is where most AI agent security programs stall. You know you should restrict agent permissions, but writing policies for workloads whose behavior changes with every prompt feels like guessing. The framework solves this by making enforcement the output of observation and baselining, not the starting point. On EKS, enforcement maps to three concrete mechanisms.

Seccomp profiles from observed syscall behavior. During Phases 1 and 2, ARMO’s sensor records every system call each agent pod makes. From that record, the platform generates a least-privilege seccomp profile that allows exactly the syscalls the agent actually uses and blocks everything else. On EKS, you deploy these as Kubernetes SeccompProfile resources referenced in the pod’s security context. The progression: deploy in audit mode first (log what would be blocked, don’t block yet), validate for a week against live traffic, then graduate to enforce. Because the profile was generated from observed behavior, not guessed from a manifest, the risk of breaking production drops dramatically.
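The shape of such a generated profile can be sketched as follows. This emits the standard Kubernetes/OCI localhost profile JSON; the syscall list is an abbreviated illustration, and the audit/enforce toggle maps to SCMP_ACT_LOG versus SCMP_ACT_ERRNO as the default action.

```python
import json

def seccomp_from_observed(observed_syscalls, audit=True):
    """Build a least-privilege seccomp profile from observed syscalls.
    audit=True logs would-be violations (SCMP_ACT_LOG); audit=False
    blocks anything outside the observed set (SCMP_ACT_ERRNO)."""
    return {
        "defaultAction": "SCMP_ACT_LOG" if audit else "SCMP_ACT_ERRNO",
        "architectures": ["SCMP_ARCH_X86_64"],
        "syscalls": [{
            "names": sorted(observed_syscalls),
            "action": "SCMP_ACT_ALLOW",
        }],
    }

# Abbreviated stand-in for a recorded syscall set
observed = {"read", "write", "openat", "close", "futex", "epoll_wait",
            "sendto", "recvfrom", "mmap", "exit_group"}
profile = seccomp_from_observed(observed, audit=True)
print(json.dumps(profile, indent=2)[:60])
```

Graduating from audit to enforce is then a one-line change (audit=False) applied after the validation week, mirroring the progression described above.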

Network policy enforcement. EKS supports Kubernetes NetworkPolicy through the Amazon VPC CNI plugin (with network policy support enabled) or through third-party CNIs like Calico or Cilium. For AI agent workloads, Cilium is worth noting because its eBPF foundation is architecturally complementary to ARMO’s runtime sensor—both operate at the kernel level. The enforcement workflow: deploy NetworkPolicies in audit/logging mode, compare real egress traffic against the Phase 2 baseline, tighten to deny-all with explicit allows for baselined destinations, and maintain monitoring for legitimate new destinations that require policy updates. For pod-level segmentation using AWS primitives specifically, Security Groups for Pods lets you assign AWS security groups directly to individual pods via the VPC CNI, giving you VPC-level isolation without leaving the AWS networking model.
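A deny-by-default egress policy derived from a baseline can be sketched as below. The code renders a standard networking.k8s.io/v1 NetworkPolicy as a dict; the namespace, labels, and CIDRs are placeholder values, and in practice the allowed list would come straight from the Phase 2 baseline rather than being hand-written.

```python
import json

def egress_policy(name, namespace, app_label, allowed):
    """Render a deny-by-default egress NetworkPolicy allowing only
    baselined destinations. `allowed` is a list of (cidr, port) pairs."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": app_label}},
            # Listing only Egress makes all egress denied unless allowed below
            "policyTypes": ["Egress"],
            "egress": [{
                "to": [{"ipBlock": {"cidr": cidr}}],
                "ports": [{"protocol": "TCP", "port": port}],
            } for cidr, port in allowed],
        },
    }

# Placeholder destinations: internal vector DB subnet, plus one external CIDR
policy = egress_policy("rag-agent-egress", "ai-agents", "rag-agent",
                       [("10.0.4.0/24", 6333), ("203.0.113.0/24", 443)])
print(json.dumps(policy["spec"]["egress"][0]))
```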

IAM permission tightening from observed behavior. This is where the 47-to-3 API ratio from Phase 2 becomes actionable. Your agent’s IRSA or Pod Identity role started with broad permissions so you wouldn’t break anything during observation. Now you have weeks of CloudTrail data showing exactly which APIs the agent calls, which S3 buckets it reads, which DynamoDB tables it queries. Generate a replacement IAM policy scoped to only those observed actions and resources. For IRSA, update the role’s policy document. For EKS Pod Identity, update the associated role. The principle is the same: enforcement based on evidence, not guesswork.
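Generating that replacement policy is mechanical once the observed (action, resource) pairs are in hand. A sketch with hypothetical ARNs:

```python
import json

def least_privilege_policy(observed):
    """Build an IAM policy document from observed (action, resource)
    pairs, grouping actions that target the same resource."""
    by_resource = {}
    for action, resource in observed:
        by_resource.setdefault(resource, set()).add(action)
    statements = [{
        "Effect": "Allow",
        "Action": sorted(actions),
        "Resource": resource,
    } for resource, actions in sorted(by_resource.items())]
    return {"Version": "2012-10-17", "Statement": statements}

# Hypothetical observed usage from weeks of CloudTrail data
observed = [
    ("bedrock:InvokeModel", "arn:aws:bedrock:us-east-1::foundation-model/*"),
    ("s3:GetObject", "arn:aws:s3:::rag-corpus-bucket/*"),
    ("dynamodb:Query", "arn:aws:dynamodb:us-east-1:123456789012:table/sessions"),
]
doc = least_privilege_policy(observed)
print(json.dumps(doc, indent=2)[:60])
```

The resulting document replaces the broad observation-mode policy on the IRSA or Pod Identity role.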

ARMO integrates into this workflow by generating policy candidates from baseline data—seccomp profiles, network policies, and identity constraints—that you review and promote to enforcement with a single action. The platform maintains rollback capability throughout, so if a policy blocks legitimate behavior that wasn’t captured during the baseline period, you can relax it immediately and re-observe.

Identity Security for AI Agents on EKS: IRSA vs. EKS Pod Identity

AI agents present a specific identity challenge: they need AWS credentials to call Bedrock, SageMaker, S3, and other services, and those credential needs can shift based on which tools the agent invokes. The identity mechanism you choose affects how you observe, baseline, and enforce.

IRSA maps a Kubernetes service account to an IAM role via an OIDC provider tied to your EKS cluster. It’s the established approach, works across EKS, EKS Anywhere, and self-managed clusters, and has broad tooling support. The drawback: OIDC provider setup adds complexity, trust policy management becomes unwieldy at scale (especially across multiple clusters), and role bindings are tightly coupled to service accounts.

EKS Pod Identity eliminates the OIDC provider requirement entirely. Role-to-service-account mappings are managed through the EKS API using a DaemonSet-based credential agent. Setup is simpler, cross-account role assumption is built in, and you don’t hit the IAM trust policy size limits that IRSA encounters at scale. The trade-off: Pod Identity is EKS-only (no EKS Anywhere or self-managed support) and requires the Pod Identity Agent add-on.

For AI agent security, the choice matters less than the workflow around it. Whichever mechanism you use, the pattern is the same: start with an observation-mode role that permits the broad set of APIs your agent might need, let ARMO’s profiling capture which APIs it actually calls over a representative observation period, then replace the role with a least-privilege policy derived from observed behavior. CloudTrail’s AssumeRoleWithWebIdentity events (IRSA) or AssumeRoleForPodIdentity events (Pod Identity) feed directly into this workflow—they show exactly when and how often each agent assumes its role.
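Those role-assumption events are easy to tally per agent role. A sketch over fabricated CloudTrail records, covering both the IRSA and Pod Identity event names mentioned above:

```python
from collections import Counter

def role_assumption_counts(events):
    """Count role assumptions per role ARN from CloudTrail events,
    showing how often each agent refreshes its credentials."""
    assume_events = {"AssumeRoleWithWebIdentity", "AssumeRoleForPodIdentity"}
    return Counter(
        e["requestParameters"]["roleArn"]
        for e in events
        if e.get("eventName") in assume_events
    )

# Fabricated sample events
events = [
    {"eventName": "AssumeRoleWithWebIdentity",
     "requestParameters": {"roleArn": "arn:aws:iam::123456789012:role/rag-agent"}},
    {"eventName": "AssumeRoleWithWebIdentity",
     "requestParameters": {"roleArn": "arn:aws:iam::123456789012:role/rag-agent"}},
    {"eventName": "InvokeModel", "requestParameters": {}},
]
print(role_assumption_counts(events))
```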

The critical security question for AI agents isn’t which identity mechanism to choose. It’s whether you can detect when an agent uses its permitted credentials in unexpected ways—calling an API it’s authorized to use but has never used before, accessing an S3 bucket that’s in scope but outside its normal pattern. IAM policies can’t catch this because the call is authorized. CloudTrail can’t catch this because the event is valid. Only behavioral baselines that know what “normal” looks like for each specific agent can flag it.

Network Isolation for AI Agent Workloads on EKS

AI agents reach a wider and more dynamic set of external endpoints than traditional workloads. A single agent might call Bedrock for inference, a Pinecone or Weaviate instance for vector search, an external SaaS API through MCP tool integration, and an internal microservice for business logic—and which endpoints it calls can change based on the prompt. Static network rules written before deployment either block legitimate tool calls or leave egress wide open.

On EKS, you have three network isolation mechanisms that layer together. Kubernetes NetworkPolicy (supported natively through the VPC CNI plugin or via Calico/Cilium) controls pod-to-pod and pod-to-external traffic at L3/L4. Security Groups for Pods applies AWS security groups to individual pods, giving you VPC-level segmentation using the same security group rules your network team already understands. And VPC-level controls (NACLs, route tables, VPC endpoints) provide perimeter boundaries.

The implementation sequence follows the framework: observe all egress during Phase 1, baseline legitimate destinations during Phase 2, then progressively restrict during Phases 3–4. ARMO maps each agent’s network destinations from kernel-level connection data—not just IP/port from Flow Logs, but correlated with the process and tool invocation that initiated the connection. That correlation is what turns a list of IP addresses into an enforceable network policy: you know that api.pinecone.io:443 is the vector DB your RAG pipeline uses, not just an anonymous outbound connection, so you can confidently allow it and deny everything else.

From Framework to Running Controls on EKS

ARMO’s Cloud Application Detection & Response (CADR), built on Kubescape, provides the integrated runtime layer that connects each framework phase to your EKS environment:

Phase 1 (Observe): Deploy ARMO’s eBPF sensor to EKS clusters via Helm. The sensor auto-discovers AI agents, inference servers, and MCP tool runtimes. Runtime AI-BOM generates within hours. Agent-to-tool-to-data-source interaction maps appear in the ARMO console alongside CloudTrail and VPC Flow Log data you’re already collecting.

Phase 2 (Posture + Detect): Application Profile DNA builds behavioral baselines for each agent workload—capturing syscalls, network patterns, file access, Kubernetes API usage, and tool invocation sequences. Detection rules grounded in those baselines cut false positives by surfacing only genuine deviations. CADR correlates signals across cloud and cluster layers for full attack stories.

Phase 3–4 (Enforce): Auto-generated seccomp profiles, network policies, and identity constraints deploy in audit mode first. The platform monitors for false positives, lets you adjust, then graduates to enforcement. Per-agent granularity means your high-risk autonomous agent gets stricter controls than your read-only chatbot. All without writing policies from scratch.

The quantified outcomes from this workflow: 90%+ CVE noise reduction through runtime reachability analysis, 90%+ faster investigation through LLM-powered attack story generation, 80%+ reduction in issue overload through runtime-based prioritization. All at 1–2.5% CPU and 1% memory overhead. Kubernetes-native, eBPF-based, no sidecars, no code changes.

To see how the full framework maps to your EKS environment: book a demo today.

Frequently Asked Questions

How does GuardDuty EKS Runtime Monitoring differ from ARMO’s runtime detection?

GuardDuty deploys an eBPF agent that detects known threat signatures—reverse shells, crypto miners, credential exfiltration. ARMO builds behavioral baselines per agent workload and detects deviations from observed normal behavior, including AI-specific threats like anomalous tool invocation patterns and prompt-driven behavioral shifts. GuardDuty catches known-bad. ARMO catches abnormal-for-this-agent. Both are valuable, and they’re complementary—run both.

Should I use IRSA or EKS Pod Identity for AI agent workloads?

If you’re EKS-only and want simpler setup, Pod Identity eliminates OIDC provider management and scales better across clusters. If you need cross-environment support (EKS Anywhere, self-managed), IRSA is still the right choice. For AI agent security specifically, the identity mechanism matters less than the observe-then-enforce workflow: profile actual API usage per agent, then tighten permissions to match reality.

Can I deploy eBPF runtime sensors on EKS Fargate?

No. Neither ARMO’s sensor nor GuardDuty’s EKS Runtime Monitoring supports Fargate—both require DaemonSet access to node-level kernel instrumentation, which Fargate’s serverless model doesn’t expose. If AI agent workloads require runtime behavioral monitoring, run them on EKS managed node groups or EKS Auto Mode.

How do I baseline AI agent behavior against Bedrock and SageMaker endpoints?

CloudTrail captures every Bedrock and SageMaker invocation with role, timestamp, and endpoint details. ARMO correlates those API events with runtime sensor data showing what the agent did before and after each invocation—which tool triggered the call, what data was accessed, and which network connections followed. Together, they build a complete behavioral profile that defines “normal” for each agent’s interaction with AWS AI services.
