MITRE ATLAS for AI Agent Attack Detection: A Complete Mapping

May 17, 2026

Ben Hirschberg
CTO & Co-founder

Key takeaways

Why isn’t MITRE ATLAS a detection plan? ATLAS catalogs what adversaries do; detection programs are organized around where the program looks. The two structures share no unit of analysis. A list of techniques organized by adversary objective serves a different purpose than an operating model — it needs translation through the surfaces and layers a runtime detection program actually runs on before it produces a coverage answer.
Which ATLAS techniques is runtime detection blind to? Training-time techniques, build-time supply-chain compromises, and model-theft scenarios sit outside the runtime layer entirely. Configuration-state techniques — overprivileged service accounts, exposed model artifacts — belong to CSPM and CIEM, not the runtime detection program. The honest version of the coverage report names these explicitly rather than padding the matrix to look universal.

MITRE ATLAS catalogs sixteen tactics and eighty-four techniques adversaries use against AI systems, including fourteen agent-focused techniques added through the October 2025 Zenity Labs collaboration. It is the canonical taxonomy a security architect’s CISO, auditor, or RFP will name. It is not a detection plan.

ATLAS organizes around adversary objectives. Detection programs organize around surfaces a program looks at, signals it captures, and layers that assemble those signals into an attack story. Two structures, two purposes — neither does the other’s job. Asking “is AML.T0051 covered?” has no operational answer without a translation between them.

This article produces that translation. Every agent-relevant ATLAS technique mapped to the four detection surfaces and five operating layers from the AI agent attack detection framework, with the techniques runtime cannot evidence named explicitly so the coverage report draws the line where the line actually is.

ATLAS Is an Adversary Taxonomy, Not a Detection Surface Map

MITRE ATLAS is built on the conventions inherited from ATT&CK. A tactic names an adversary objective. A technique names a method used to achieve it. Each technique carries an AML.TXXXX identifier the security community can reference unambiguously. That structure is excellent for communication — every defender globally knows what AML.T0051 means without ambiguity — and that’s the value ATLAS delivers.

It’s a different structure than what operating detection programs run on. The unit of analysis the architect’s program runs on is the detection surface, not the adversary objective. A program that looks at “Initial Access” doesn’t exist; a program that looks at the Input & Reasoning surface does. The mismatch shows up in any specific coverage question: “is AML.T0051 covered?” requires answering which surface the technique crosses, which layers of the stack are required to detect it at that surface, and what signal source feeds those layers. Without that translation, the question has no operational answer — only a vendor claim.

The CISO’s report still has to use ATLAS vocabulary. So the architect needs both: the canonical taxonomy for the report’s audience, and the runtime structure for the program that actually catches the techniques.

ATLAS Techniques Sort Into Three Buckets — Only One Lives at Runtime

Before any per-technique mapping, the eighty-four ATLAS techniques sort into three buckets that determine which tool class actually owns the detection. The sort is the first step that makes the matrix tractable: it cuts the technique set the runtime program is responsible for from the set that belongs elsewhere.

Bucket 1 — Runtime-evidenced. Techniques that produce a signal crossing at least one of the four detection surfaces during agent execution. Prompt injection crosses Input & Reasoning. Tool misuse crosses Tool Invocation. Container escape crosses Identity & Action. Multi-agent coordination attacks cross Cross-Agent Coordination. The majority of agent-relevant ATLAS techniques sit here — the runtime detection program’s responsibility, and where the matrix walk in the next section spends its budget.

The criteria for recognizing a runtime-evidenced technique: the attack produces observable behavior during agent execution, that behavior crosses one or more of the four surfaces, and the evidence is sequence-shaped or context-shaped rather than configuration-shaped. If detection requires comparing what the agent did against what its baseline says is normal, the technique sits in Bucket 1.

Bucket 2 — Posture-evidenced. Techniques that produce evidence in configuration state rather than runtime behavior. Overprivileged service accounts, exposed model artifacts, missing network segmentation around agent workloads — these are detectable in CSPM and CIEM tools that read configuration data, not in the runtime detection program. Detailed below.

Bucket 3 — Lifecycle-evidenced. Techniques whose evidence sits in the model lifecycle before runtime exists. ML Attack Staging at training time. ML Supply Chain Compromise on model artifacts. Model Theft via the training pipeline. The runtime program cannot see these — by the time the model is loaded into a running container, the attack already happened. Detailed below.

The Translation Matrix: Every Agent-Relevant Technique Maps to a Surface, a Layer, and a Signal

Per runtime-evidenced technique, the matrix names four things: which surface(s) the technique crosses, which layers of the five-layer stack must be running to detect it, the specific runtime signal that evidences it, and the structural blind spot if any required layer is missing. The signal source matters more than the technique ID — the architect’s report will list the signal sources as the coverage evidence, with the technique IDs as the auditor-facing labels on top.

Surface 1 — Input & Reasoning. Prompt Injection (AML.T0051) is the canonical example, and we’ve previously broken down the eight-stage pattern this technique follows. At Surface 1, the runtime signal is content and provenance: instruction-shaped tokens entering the agent’s context window from a source not previously trusted with that role. Indirect Prompt Injection (AML.T0054) and AI Agent Context Poisoning — contributed in the October 2025 Zenity Labs collaboration — sit here as well. Both produce signal as cross-session behavioral drift: no single anomalous event, just gradual deviation as the agent’s reasoning context gets manipulated across sessions. The Memory and Thread techniques from the same Zenity contribution operate similarly, persisting changes either into future chat sessions (Memory) or within a single thread (Thread). All four techniques need Layers 1, 2, and 3 — runtime telemetry capturing the prompt and context-window state, per-agent baselines that make drift detectable, and cross-layer correlation that links the input event to the downstream action it triggered.

Surface 2 — Tool Invocation. The agent-focused additions concentrate manifestation here. “Publish Poisoned AI Agent Tool” — added in v5.4.0 — is detected at the syscall pattern the poisoned tool produces during execution, not at the source-code review that would have caught it in supply chain scanning. AI Agent Tools (AML.T0085.001) sits here, alongside Modify AI Agent Configuration from the Zenity contribution; both produce signal where the action is authorized but the combination violates the baseline. We’ve previously broken down the scope, sequence, and rate categories of tool misuse and the framework SDK callbacks and eBPF process trees that catch them. Surface 2 detection requires Layer 1 (kernel and framework telemetry), Layer 2 (per-agent baselines tied to deployment-level behavior), and Layer 3 (cross-layer correlation that links the framework-layer tool call to the kernel-layer syscall pattern that followed it).

Surface 3 — Identity & Action. “Escape to Host” — added in v5.4.0 — is detected at the kernel-level syscall sequence (unshare, mount, capability acquisition) that the breakout produces, not at the IAM policy that allowed it. Service-API abuse and credential access against agent identities sit here at the moment the credential is exercised. We’ve previously laid out the five-stage agent escape pattern these techniques follow. Exfiltration via AI Inference API (AML.T0024) crosses Surfaces 1 and 3 — input that triggers the exfiltration on Surface 1, identity that authorizes the egress on Surface 3 — and the destination is always allowlisted, so the destination-layer blind spot is structural. Surface 3 detection requires Layer 1 (Kubernetes audit logs and cloud IAM events), Layer 2 (per-agent identity envelopes), and Layer 3 (correlation that ties the authorizing action back to the input that triggered it).

Surface 4 — Cross-Agent Coordination. Multi-agent attacks are visible only at the inter-agent layer that per-agent baselines structurally cannot see. We’ve previously laid out the three surfaces multi-agent systems introduce: delegation edges, shared-context layers, orchestrator nodes. The framework-layer telemetry that catches them — LangGraph state transitions, CrewAI delegation events, AutoGen speaker selections — feeds the four-surface picture for multi-agent systems. AI Agent Context Poisoning also crosses Surface 4 when the poisoned context propagates across delegated agents through shared scratchpads or vector stores. The publicly documented Zenity contributions don’t yet name techniques specific to this surface — multi-agent coordination is one of the active areas of ATLAS expansion. Surface 4 detection requires Layer 1 (orchestrator telemetry and shared-context store audit), Layer 2 (per-edge behavioral envelopes, not just per-agent), and Layer 3 (graph-shape correlation across the delegation network).

Technique (ATLAS)	Surface(s)	Layers	Runtime Signal	Bucket
AML.T0051 Prompt Injection	1	L1, L2, L3	Instruction-shaped tokens from untrusted source in context window	Runtime
AML.T0054 LLM Prompt Injection: Indirect	1	L1, L2, L3	Provenance break: untrusted source writing instruction content into retrieved context	Runtime
AI Agent Context Poisoning (AML.T0080, v5.0.0)	1, 2, 4	L1, L2, L3	Cross-session behavioral drift correlated with tool-selection deviation	Runtime
AI Agent Context Poisoning: Memory (v5.0.0)	1	L1, L2, L3	Persistent behavioral shift across LLM sessions; no single anomalous event	Runtime
AI Agent Context Poisoning: Thread (v5.0.0)	1	L1, L2, L3	Malicious instructions embedded within a chat thread	Runtime
RAG Credential Harvesting (v5.0.0)	1, 3	L1, L2, L3	LLM query patterns toward credential-shaped strings in RAG index	Runtime
Modify AI Agent Configuration (v5.0.0)	2	L1, L2, L3	Configuration change followed by tool-behavior deviation across agent fleet	Runtime
AML.T0085.001 AI Agent Tools	2	L1, L2, L3	Authorized tool invoked outside scope/sequence/rate envelope	Runtime
Publish Poisoned AI Agent Tool (v5.4.0)	2	L1, L2, L3	Syscall pattern from tool execution divergent from per-tool baseline	Runtime
AI Service API (AML.T0096, v5.2.0)	3	L1, L2, L3	Living-off-the-land via AI service APIs as covert C2 channel	Runtime
Escape to Host (AML.T0105, v5.4.0)	3	L1, L2, L3	Kernel syscalls: unshare, mount, capability acquisition	Runtime
AML.T0024 Exfiltration via AI Inference API	1, 3	L1, L2, L3	Volume/pattern deviation to allowlisted destinations correlated with input	Runtime

The pattern in the matrix is structural, not coincidental. The agent-focused techniques contributed in the October 2025 Zenity Labs collaboration and the v5.4.0 release characteristically enter on Surface 1 — Context Poisoning, Memory, Thread, RAG Credential Harvesting all manipulate what the agent reasons over — and manifest on Surface 2 or Surface 3, where modified tool behavior, host escape, and service-API abuse produce the detectable signal. Single-surface detection is structurally insufficient for this technique class. Layer 3 — cross-layer correlation — is what assembles the input on Surface 1 with the downstream action on Surface 2 or 3 into a single attack story. That’s the architectural slot ARMO’s CADR was built to occupy, with Application Profile DNA producing the per-Deployment baselines that make sequence detection feasible at all.

The Other Two Buckets — Posture and Lifecycle Belong to Different Tool Classes

The matrix above covers Bucket 1. The credibility of the coverage report depends on what the architect says about Buckets 2 and 3 — and the honest answer is that the runtime detection program isn’t responsible for them.

Posture-evidenced techniques. Overprivileged service accounts, exposed model artifacts in object storage, missing network segmentation around agent workloads — these are configuration-state findings, detectable in CSPM and CIEM tools that read declarative state from cloud APIs and Kubernetes manifests. The runtime program sees the consequence (an action that exercised the standing permission) but not the root cause (the permission existed in the first place). On the coverage report, posture-evidenced techniques get marked as covered by the posture management program, with the specific tool named. The architect’s report should list both stacks side by side; the auditor doesn’t care which tool produced the evidence, only that the technique has a named owner.

Lifecycle-evidenced techniques. ML Attack Staging at training time, ML Supply Chain Compromise on model artifacts, Poison Training Data, Model Theft — the evidence for these techniques sits in the model lifecycle before runtime exists. By the time the model is loaded into a running container, the compromise is already baked into the weights. The runtime program cannot see the training-data injection that produced the malicious behavior; it can only see the behavior itself, often weeks or months later. These techniques belong to supply-chain scanning of model artifacts, training-data integrity controls, and model artifact provenance pipelines. A credible report names them and points at the responsible program. Naming the boundary explicitly is what makes the rest of the matrix defensible — universal-coverage claims collapse under the first lifecycle-bucket question an auditor asks.

One partial bridge does exist: the runtime-derived AI-BOM tags what actually loaded into runtime, with principal, source, and timing. That doesn’t prevent the lifecycle attack, but it produces the forensic trail that lets an incident response team trace a runtime anomaly back to the specific model artifact and the supply-chain path that produced it. The trail closes the gap on attribution, not on prevention.

Producing the Coverage Report Your CISO and Auditor Will Accept

The deliverable lands as a four-step workflow on top of the matrix.

Step one: enumerate the agents. A runtime-derived inventory of every agent running in production — names, deployments, identities, dependencies, tools — built from observed execution rather than from a JIRA spreadsheet, is the artifact this step needs. Without per-agent identity, the matrix has no anchor to run against. We’ve laid out before why static AI-BOMs fall short for this step.

Step two: run the matrix per agent. For each agent, walk the twelve representative techniques (and the rest of Bucket 1 from atlas.mitre.org as needed) against the agent’s instrumentation. For each technique: is the signal source named? Is the layer running? If yes, mark covered; if no, mark gap. The output per agent is a four-row score — one row per surface — naming which techniques have evidence and which don’t.

Step three: assemble program-level coverage. The per-agent scores aggregate into a program-level coverage line. A program with all twelve techniques named across all agents covers Bucket 1. A program with Surface 2 gaps covers most agent attacks except tool-misuse and the v5.4.0 agent-focused additions; that’s a coverage statement, not a failure.

Step four: feed the output into the maturity grid. The matrix output produces the surface-coverage axis of the 2×2 maturity grid from the detection framework directly — count surfaces where at least one technique has a named signal source. Stack depth comes from auditing how many of the five layers run end-to-end for those signals. Two numbers, one point on the grid, defensible against either the CISO or the auditor. The conversation with the CISO is not “are we mature” — it’s “we’re at surface coverage 2, stack depth 3, and the next quarter’s instrumentation closes Surface 4.”

The Runtime Context Test in the buyer’s guide does the same diagnostic for vendor evaluation — three “show me” demo questions a CISO can run against a vendor. The matrix here is the program version of that test: the architect runs it against the program they already operate, not against a vendor.

Where the Matrix Leaves You

The framework lands as three pieces. MITRE ATLAS is the vocabulary the CISO conversation will use. The four detection surfaces and the five-layer stack are the program that actually runs. The matrix between them is the translation that makes the coverage report defensible.

The honest version of the report — naming what runtime can see, what posture sees instead, and what training-time and supply-chain prevention own outside the runtime layer — holds up under audit scrutiny in a way that aggregate-coverage claims don’t. Auditors check specific techniques against specific evidence; a report that maps every technique to a named program survives that check, and a report that doesn’t, doesn’t.

The architect’s next quarter follows directly from the matrix output. Which surface has the largest cluster of uncovered techniques? Instrument that surface next. Which layer is missing for the surfaces already instrumented? Run that layer next. The matrix is the artifact that turns the ATLAS conversation into program decisions. Walking it against a real environment is the fastest way to see where the program actually stands. The cloud-native security for AI workloads platform was built to occupy the framework end-to-end.

FAQ

How do I produce an ATLAS coverage report for a CISO or auditor?

Four steps: enumerate agents from a runtime-derived inventory, run the matrix per agent producing a per-surface coverage line, aggregate into program-level coverage, feed the result into the surface coverage and stack depth axes of the maturity grid. The report names the technique, the signal source, and the responsible program (runtime, posture, or lifecycle) per technique — explicit boundaries hold up better than universal claims.

Do I need to instrument all 84 ATLAS techniques?

No. Only the runtime-evidenced bucket — roughly sixty to seventy percent of agent-relevant techniques — falls to the runtime detection program. The posture-evidenced bucket belongs to CSPM and CIEM tools that read configuration state. The lifecycle-evidenced bucket belongs to supply-chain scanning and training-data integrity controls. A credible report names which program owns which technique.

How often should I rerun the ATLAS coverage matrix?

Re-run when ATLAS publishes a new version (the v5.1.0 to v5.4.0 cadence was four months), when the agent fleet changes meaningfully (new deployments, new tool integrations, new orchestration framework), or on the audit cadence the CISO or auditor requests. The matrix is cheap to rerun once the signal-source mapping is documented; what changes between versions is usually the technique inventory, not the surface mapping.

Should I produce ATLAS coverage and OWASP LLM Top 10 coverage as separate reports or one combined report?

Separate. ATLAS answers the auditor’s question (“do we detect adversary techniques”) with a technique-level coverage list. OWASP LLM Top 10 answers the governance question (“have we addressed prioritized risks”) with a risk-category coverage list. Combining them blurs both — the auditor gets a fuzzy answer on technique coverage, governance gets a fuzzy answer on risk treatment. Two reports, two audiences, two reusable artifacts.

Can I use the matrix for red team exercises against my own agents?

Yes. The runtime-evidenced bucket becomes the test plan directly — each technique is a scenario the red team executes against a non-production agent, and the matrix predicts which surface should fire and which layer should assemble the alert. Lifecycle-bucket techniques require separate red-team methodology (training-data injection, supply-chain compromise simulation) that runs at build and training time rather than at runtime.