Prompt and Tool Call Visibility: What Your AI Agents Are Actually Doing
It is 11:47 p.m. and the on-call security engineer is staring at two dashboards. On...
Apr 30, 2026
Observe-to-enforce builds behavioral baselines from observed agent traffic — what tools the agent calls, which networks it reaches, which syscalls it executes — and converts them into per-agent enforcement policies. Baselines persist at the Deployment level because pods churn and the envelope has to outlive any single restart. The methodology runs as a four-stage progression: discovery, observation, selective enforcement, continuous least privilege. We have previously walked the observe-to-enforce methodology end-to-end. The healthcare extension lives on top of that foundation — same engine, additional dimensions the standard primitives can’t natively express.
A platform engineer for a hospital system sits down to translate an observed envelope into NetworkPolicy, seccomp, and IRSA YAML for an ambient scribe. The privacy office’s boundary requirements don’t fit those primitives’ vocabulary. NetworkPolicy can deny traffic to a CIDR but can’t deny traffic to “any endpoint not in the active BAA registry.” Seccomp can constrain syscalls but can’t constrain FHIR resource type access. IRSA can scope IAM permissions but can’t enforce minimum-necessary attestation per agent. The standard primitives are necessary. They are not in the right language.
What follows walks the five primitives, the three live agent classes — ambient scribe, clinical decision support, prior authorization — and the Track 2 question that staging raises in healthcare: real PHI in staging is itself a regulatory disclosure, so the financial-services Track 2 model needs three healthcare-specific parity criteria before staging-built baselines can ship to production. The starting assumption is that the AI workload security buyer’s guide framework has been adopted at the program level — what follows is the engineering layer beneath it.
Five Kubernetes enforcement primitives hold up observe-to-enforce in non-clinical environments. Each one expresses something cleanly in its own native vocabulary. Each one runs out of vocabulary at a different HIPAA boundary. The next three sections walk the rows that need the most engineering work — FHIR semantics, BAA scope, break-glass — and the section after addresses the Track 2 staging question separately.
| Standard primitive | Native vocabulary | Where it stops | Healthcare extension |
|---|---|---|---|
| NetworkPolicy | Pod-level egress allowlist by CIDR | BAA-scoped egress (recipient-keyed, not CIDR-keyed) | Runtime AI-BOM as the BAA-scoped layer above CIDR |
| Seccomp | Syscall constraints at the kernel layer | FHIR resource type semantics (HTTP-layer, not kernel-layer) | Behavioral envelope at the application layer |
| IRSA / Workload Identity | IAM role scope per workload | Minimum-necessary attestation (observed-vs-declared) | Per-agent identity bound to the observed envelope |
| Kubernetes RBAC | K8s API permissions | Clinical context attribution (which patient/encounter) | Identity chain through application-layer correlation |
| Observe-to-enforce maturity | Four-stage progression | HIPAA-defensible attestation as byproduct | Each stage produces a HIPAA-keyed output |
How do you sandbox AI agents in HIPAA environments? You run observe-to-enforce. Then you extend each enforcement layer with the healthcare semantics the primitive’s native vocabulary can’t express. The per-agent enforcement policy is the deliverable. The table above is its shape.
Every AI agent already has per-agent guardrails along four enforcement dimensions: tool and API calls, network destinations, syscalls, and file access. For clinical agents, FHIR resource type and operation become a fifth dimension layered on top — what the agent reads, at what operation, at what frequency, in what clinical-context grouping. The Application Profile DNA at the Deployment level captures this fifth dimension as part of the per-agent envelope.
What the envelope captures per agent: FHIR resource type, operation (read, write, search, history), frequency, clinical-context grouping pattern, and the relationship between retrieval pattern and write-back pattern. Three live agent classes show what this looks like concretely.
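One way to picture the fifth dimension is as structured data attached to the per-agent envelope. The sketch below is hypothetical — the platform's actual envelope schema is not shown in this article — and only illustrates the shape: resource type, operation, frequency ceiling, and clinical-context grouping per observed pattern.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class FhirAccessPattern:
    """One observed (resource type, operation) pair in the agent's envelope."""
    resource_type: str           # e.g. "Patient", "Observation"
    operation: str               # "read", "write", "search", "history"
    max_rate_per_encounter: int  # observed frequency ceiling (illustrative)
    context_grouping: str        # e.g. "encounter-window", "batch-queue"

@dataclass
class AgentEnvelope:
    """Per-agent behavioral envelope: the fifth (FHIR) dimension layered on
    top of tool calls, network destinations, syscalls, and file access."""
    agent: str
    patterns: list[FhirAccessPattern] = field(default_factory=list)

    def allows(self, resource_type: str, operation: str) -> bool:
        return any(p.resource_type == resource_type and p.operation == operation
                   for p in self.patterns)

# A hypothetical ambient-scribe envelope (values invented for illustration)
scribe = AgentEnvelope("ambient-scribe", [
    FhirAccessPattern("Patient", "read", 1, "encounter-window"),
    FhirAccessPattern("Encounter", "read", 1, "encounter-window"),
    FhirAccessPattern("Observation", "read", 40, "encounter-window"),
    FhirAccessPattern("MedicationStatement", "read", 20, "encounter-window"),
    FhirAccessPattern("DocumentReference", "write", 1, "post-signature"),
])

print(scribe.allows("Observation", "read"))  # True: inside the envelope
print(scribe.allows("Claim", "read"))        # False: outside the envelope
```

The point of the structure is that enforcement decisions key on the (resource type, operation) pair, not on network-layer identifiers — exactly the vocabulary NetworkPolicy and seccomp cannot express.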
The ambient scribe reads Patient, Encounter, Observation, and MedicationStatement at encounter-trigger time, in tight clinical-context grouping — one encounter, one set of resources retrieved together, one chart context assembled. It writes DocumentReference and Encounter back through the FHIR gateway after the clinician signs the draft note. The envelope captures the burst pattern: reads cluster around the encounter window, writes follow the signature event, and no resource access happens outside that window without a break-glass identity.
The clinical decision support agent is heavier on Condition and diagnostic Observation reads. Its retrieval is batch-shaped against guideline corpora — it pulls dozens of references per query against an external embedding service. Write paths run into the ordering workflow as MedicationRequest or ServiceRequest drafts that a clinician approves. The envelope captures both the corpus-retrieval pattern and the ordering-workflow write surface as distinct shapes. The CDS agent is the loudest of the three on egress; its BAA scope is also the widest.
The prior authorization agent is batch-shaped and processes claims case by case. It reads Coverage, Claim, Condition, Procedure, and supporting Encounter and Observation resources for each claim under review. Writes are rare and structural — ClaimResponse resources back to payer integration. The envelope is the steadiest of the three: queue-driven, predictable, with little variance per claim type. Anomalies in this envelope are coarse — a claim type the agent has not processed before, a payer endpoint outside the registered BAA list.
The AI Bill of Materials names the external endpoints each agent calls; the FHIR envelope names what the agent does internally with the data those endpoints return. Both layers are runtime-derived. Both update continuously as model versions and prompt templates change.
How this becomes minimum-necessary attestation under §164.502(b): declared scope minus observed scope is the attestation. The gap is typically large. Clinical agents are deployed with broad permissions to accommodate variable clinical context; the observed scope is the envelope they exercise. The §164.502(b) standard requires that disclosures be limited to the minimum amount reasonably necessary for the intended purpose — and for non-deterministic agents, the only defensible “minimum necessary” is what the agent retrieves under load, not what it was provisioned to retrieve.
A privacy auditor asking “show me minimum-necessary evidence for this ambient scribe” gets a behavioral envelope showing the FHIR resource types the agent reads at encounter time, the operations it performs, the frequency distribution, and the gap between that and its declared permissions. The evidence is the byproduct of running enforcement, not a separately produced report.
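The attestation arithmetic is literally a set difference. A minimal sketch, assuming scopes are modeled as (FHIR resource type, operation) pairs; all scope values below are illustrative:

```python
def minimum_necessary_attestation(declared: set[tuple[str, str]],
                                  observed: set[tuple[str, str]]) -> dict:
    """Attestation artifact: declared FHIR scope minus observed FHIR scope.
    Each scope entry is a (resource type, operation) pair."""
    unexercised = declared - observed  # provisioned but never used: the gap
    undeclared = observed - declared   # used but never declared: a policy gap
    return {
        "observed_scope": sorted(observed),
        "unexercised_grants": sorted(unexercised),
        "undeclared_access": sorted(undeclared),
    }

# Hypothetical ambient-scribe scopes: broad declaration, narrow exercise
declared = {("Patient", "read"), ("Encounter", "read"), ("Observation", "read"),
            ("Condition", "read"), ("AllergyIntolerance", "read"),
            ("DocumentReference", "write")}
observed = {("Patient", "read"), ("Encounter", "read"), ("Observation", "read"),
            ("DocumentReference", "write")}

report = minimum_necessary_attestation(declared, observed)
print(report["unexercised_grants"])
# the §164.502(b) gap: grants the scribe holds but never exercised under load
```

Generated continuously, this report is the evidence byproduct described above rather than a separately produced document.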
The runtime AI-BOM joined against the active BAA registry is the operational extension of §164.504(e). Every new endpoint gets matched against that registry at first contact. Endpoints outside BAA scope produce immediate enforcement events. Response mode — kill, quarantine, or tagged-and-allowed-with-alert — depends on agent criticality tier.
The drift problem is the reason continuous enforcement matters. BAA registries update on legal-team timelines, which run in weeks. Agent egress lists drift on deploy-cycle timelines, which run in days. Continuous enforcement is the only way to keep the gap from accumulating into a §164.504(e) violation. A one-time deployment scan against the BAA registry catches what was visible at deploy time. It does not catch the fallback model endpoint added in a hotfix three days later.
A fallback model endpoint gets added in a hotfix to handle a primary model rate-limit issue. The deployment scan that ran a week ago didn’t see it. Under continuous enforcement, the new endpoint appears as a new runtime AI-BOM entry at first contact. The endpoint fails the BAA-registry lookup. An enforcement event fires before any PHI transmits. The hotfix gets rolled back; the BAA review that should have preceded the endpoint addition gets initiated; the auditor sees an enforcement event that worked.
Under one-time-check enforcement, the same scenario produces no event. The endpoint sits in production processing PHI to a recipient with no active BAA. The discovery arrives by accident, weeks later, during an unrelated review. The §164.504(e) violation is already on the record by the time anyone notices.
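The first-contact check can be sketched as a lookup against a live registry, with the response mode selected by criticality tier. Everything here is an assumption — registry contents, tier names, and the tier-to-response mapping would all be deployment-specific:

```python
from datetime import datetime, timezone

# Hypothetical BAA registry: endpoint hostname -> active BAA record id
BAA_REGISTRY = {
    "fhir.hospital.example": "BAA-2024-017",
    "embeddings.vendor.example": "BAA-2025-003",
}

# Response mode per agent criticality tier (kill / quarantine / tag-and-alert)
RESPONSE_BY_TIER = {
    "critical": "tag-and-alert",  # clinical-path agents: never hard-kill
    "standard": "quarantine",
    "low": "kill",
}

def on_first_contact(agent: str, endpoint: str, tier: str,
                     registry_version: str) -> dict:
    """Runtime AI-BOM hook: match every new endpoint against the active
    BAA registry at first contact; out-of-scope endpoints produce an
    enforcement event before any PHI transmits."""
    baa = BAA_REGISTRY.get(endpoint)
    return {
        "agent": agent,
        "endpoint": endpoint,
        "baa": baa,
        "registry_version": registry_version,  # evidence carries the version
        "seen_at": datetime.now(timezone.utc).isoformat(),
        "enforcement": "allow" if baa else RESPONSE_BY_TIER[tier],
    }

# The hotfix scenario: a fallback endpoint that never had a BAA review
event = on_first_contact("cds-agent", "fallback-llm.othervendor.example",
                         "standard", "registry-v42")
print(event["enforcement"])  # prints: quarantine
```

Because the registry is consulted at contact time rather than baked into the policy, a BAA registration or termination takes effect at the next endpoint contact with no policy redeploy, and each logged event carries the registry version it was evaluated against.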
Pattern: identify break-glass-initiated workflows during the observation phase, document them as sanctioned-exception classes, build per-class fail-open paths into the per-agent enforcement policy. Each fail-open use produces its own attestation evidence — trigger, identity, access, expiry — rather than relaxing the policy globally.
Break-glass deviations are not statistical anomalies of legitimate behavior. They are categorically separate workflows triggered by separate identity claims: a break-glass role activation, an emergency context flag set at the application layer, a documented Joint Commission-required override. The per-agent envelope can recognize the workflow class and apply a different policy without losing audit-trail integrity. The envelope doesn’t widen permanently; it switches policy lanes for the duration of the break-glass event.
A clinician triggers a Code Blue. The break-glass identity claim activates at the EHR layer and propagates into the agent’s request context. The ambient scribe attempts retrieval against a Patient and Encounter combination it has never seen for that clinician — outside the clinician’s normal panel, outside the scribe’s observed envelope for that user. The fail-open path recognizes the break-glass workflow class from the identity claim. Access proceeds. The attestation event captures what happened: workflow class set to Code Blue break-glass, identity recorded as the on-call physician under break-glass authorization, accessed resources logged as Patient, Encounter, and Observation for the specific MRN, expiry set to 60 minutes from activation. The audit log writes the full chain.
When the 60 minutes expire, the policy lane switches back. The next routine access by the same scribe outside a break-glass window goes through the standard envelope. The fail-open path doesn’t persist. No global relaxation. Only the time-bounded, identity-bounded, attestation-producing exception.
The §164.312(b) audit trail is generated as a byproduct. Every break-glass event has its own record. A privacy officer asking “show me every break-glass-initiated PHI access by AI agents in the past 90 days” gets a list with timestamps, identities, accessed records, and trigger events.
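The lane switch described above can be sketched as a policy decision keyed on the identity claim. This is a hedged illustration, not the platform's implementation: field names, the claim shape, and the 60-minute TTL as a constant are all assumptions drawn from the scenario above.

```python
from datetime import datetime, timedelta, timezone

BREAK_GLASS_TTL = timedelta(minutes=60)  # expiry from the scenario above

def evaluate(request: dict, envelope_allows: bool) -> dict:
    """Policy-lane selection: a recognized, unexpired break-glass identity
    claim routes the request down a time-bounded fail-open lane that emits
    its own attestation record; everything else uses the standard envelope."""
    claim = request.get("break_glass_claim")  # e.g. propagated from the EHR
    now = datetime.now(timezone.utc)
    if claim and now < claim["activated_at"] + BREAK_GLASS_TTL:
        return {
            "decision": "allow",
            "lane": "break-glass",
            "attestation": {  # audit byproduct, one record per event
                "workflow_class": claim["workflow_class"],
                "identity": claim["identity"],
                "resources": request["resources"],
                "expires_at": (claim["activated_at"] + BREAK_GLASS_TTL).isoformat(),
            },
        }
    # Standard lane: in-envelope or blocked; no global relaxation occurs
    return {"decision": "allow" if envelope_allows else "deny",
            "lane": "standard"}

# The Code Blue scenario: out-of-envelope access under an active claim
code_blue = {
    "resources": ["Patient/123", "Encounter/456", "Observation/789"],
    "break_glass_claim": {
        "workflow_class": "code-blue",
        "identity": "on-call-physician",
        "activated_at": datetime.now(timezone.utc),
    },
}
print(evaluate(code_blue, envelope_allows=False)["lane"])  # break-glass
```

Once the claim expires, the same request falls through to the standard lane, which is the time-bounded, identity-bounded behavior the scenario requires.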
The two-track enforcement model — standard observe-to-enforce for most agents, pre-enforced deployment for agents whose production observation window itself carries regulatory cost — was developed for financial services. The model assumes you can build behavioral baselines in staging using production-equivalent traffic. For clinical AI agents, real PHI in staging is itself a HITECH §13402 disclosure. Staging baselines have to be built using synthetic or de-identified traffic, which introduces a parity problem the financial-services analog doesn’t face.
Three healthcare-specific parity criteria extend the financial-services parity model. They are the validation gate before a staging-built envelope ships to production as enforcement policy.
Does the synthetic data reproduce the FHIR resource shape, frequency, and clinical-context distribution of production traffic? Most synthetic PHI generators produce schema-valid records that pass JSON validation but miss the clustering pattern. Real encounters happen in bursts at clinical-workflow trigger points — admission, medication reconciliation, discharge — not uniformly across a 24-hour window. An ambient scribe’s behavioral envelope captures the burst pattern, not the average rate. A staging baseline built on uniformly distributed synthetic encounters never sees that pattern. When the agent ships to production, the burst itself reads as anomalous to the enforcement engine, and clinicians get blocked at admission rounds.
Does staging traffic exercise every FHIR resource type and operation the agent will encounter in production? The behavioral envelope captures patterns that don’t appear if synthetic data only exercises the common 80% of resource types. A CDS agent that occasionally retrieves DiagnosticReport for a complex case never sees DiagnosticReport in staging if the synthetic dataset omits it; the first production DiagnosticReport read fires as anomaly, the agent gets quarantined, and a clinician’s order workflow stalls at the moment they needed the recommendation.
Does staging include simulated break-glass activations? Production agents will hit break-glass paths. If staging never exercises them, the envelope doesn’t include the sanctioned-exception class. The first production break-glass event reads as anomaly, the policy fires, and a Code Blue runs into a security alert that should have been a documented sanctioned exception. Synthetic break-glass simulation isn’t an edge case to handle later — it’s part of the baseline.
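The three criteria lend themselves to mechanical checks before a staging baseline is promoted. A sketch under stated assumptions: a coefficient-of-variation comparison stands in for whatever burst-shape metric a real validation pipeline would use, and the thresholds, field names, and traffic samples are all invented for illustration.

```python
import statistics

def burst_parity(prod_counts: list[int], synth_counts: list[int],
                 tolerance: float = 0.5) -> bool:
    """Criterion 1: does synthetic traffic reproduce the burst shape?
    Compares the coefficient of variation of per-hour encounter counts;
    uniform synthetic traffic has a far lower CV than real traffic
    driven by admission, med-rec, and discharge triggers."""
    def cv(xs: list[int]) -> float:
        return statistics.pstdev(xs) / statistics.fmean(xs)
    return abs(cv(prod_counts) - cv(synth_counts)) <= tolerance

def coverage_gap(prod_types: set[str], synth_types: set[str]) -> set[str]:
    """Criterion 2: FHIR resource types production will exercise that
    staging never saw. An empty set means coverage parity holds."""
    return prod_types - synth_types

def break_glass_exercised(staging_events: list[dict]) -> bool:
    """Criterion 3: staging must include simulated break-glass activations
    so the sanctioned-exception class lands in the baseline."""
    return any("break-glass" in e.get("workflow_class", "")
               for e in staging_events)

# Uniform synthetic load vs. bursty production: criterion 1 fails
uniform = [10] * 24
bursty = [2, 1, 1, 1, 2, 5, 30, 45, 20, 8, 5, 4,
          6, 5, 4, 25, 40, 15, 6, 3, 2, 2, 1, 1]
print(burst_parity(bursty, uniform))                                  # False
print(coverage_gap({"Condition", "DiagnosticReport"}, {"Condition"}))  
```

A promotion gate would require all three checks to pass; any failure routes the agent back to the observation queue, which is the fallback the next paragraph describes.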
When parity holds across all three criteria, the staging-built envelope ports cleanly to production. The agent ships to production already enforced; the production validation window confirms baseline accuracy without putting PHI in observation mode. When parity fails — when the synthetic data missed the encounter-burst pattern, or DiagnosticReport coverage was incomplete, or break-glass simulation was skipped — the agent goes back to the observation queue or stays under stricter enforcement until production-validated baselines accumulate.
The validation discipline is what makes Track 2 operationally usable in healthcare. Without parity validation, the staging-built baseline either over-blocks legitimate clinical work — false positives that interfere with patient care — or under-blocks because it never saw the production envelope, letting real attack patterns through. The three parity criteria are how the team demonstrates to the privacy office, the clinical informatics review, and the eventual auditor that the staging baseline meets the standard production-grade enforcement requires. The criteria are the evidence artifact alongside the baseline.
Six months later, the same platform engineer who started with a vocabulary gap takes an OCR audit request from the privacy officer. The request is for documentation of every AI-agent-initiated PHI disclosure in the past 90 days, including minimum-necessary attestation per agent, BAA scope confirmation per endpoint, and a record of every break-glass activation tied to AI agent activity.
The engineer pulls four artifacts. The minimum-necessary attestation per agent — declared FHIR scope minus observed FHIR scope — generated continuously as a byproduct of running. The runtime AI-BOM joined against the active BAA registry, with a flagged event log for every endpoint that fell outside BAA scope and the enforcement action that followed. The break-glass attestation log, with workflow class, identity, accessed records, and expiry per event. The per-agent enforcement policy version history showing each baseline update against each model and prompt template change.
The auditor doesn’t ask the team to reconstruct evidence. The evidence is what the agents produce by running.
Standard observe-to-enforce stays the methodology. The healthcare extension adds FHIR semantics as a fifth enforcement dimension, BAA-scoped egress as continuous enforcement, break-glass as engineered sanctioned-exception paths, and de-identified staging parity as the Track 2 validation layer. The standard primitives are necessary. The healthcare extension is what makes them defensible.
To see Application Profile DNA at the Deployment level, eBPF-based enforcement at 1–2.5% CPU overhead, and per-agent attack-story correlation tied to specific patient-record access in a clinical Kubernetes environment, book a demo or see the cloud-native security platform for AI workloads.
The methodology runs continuously, but rolling out new enforcement policies during a survey window is rarely the right operational call. Joint Commission surveys typically consume two to three weeks; during that window, change-control tightens, on-call rotations narrow, and clinical leadership has limited bandwidth for incident response. Plan rollouts around survey calendars and EHR release freezes the same way you plan around quarter-end. If a Track 2 staging baseline is ready to promote during a survey window, hold the promotion until the window closes.
For an ambient scribe, start with the resources it touches per encounter: Patient, Encounter, Observation, and MedicationStatement on the read side; DocumentReference and Encounter on the write side. Add Condition and AllergyIntolerance once the encounter-window envelope is stable, because both surface in clinical context assembly. The instrumentation order matters less than the observation duration — two to three weeks of clinical traffic captures more pattern variation than four days against a synthetic load.
Plan for two to four weeks of staging observation against synthetic traffic that satisfies all three parity criteria, then a one-week production validation window in audit mode before full enforcement engages. The window should cover at least one full clinical operational cycle — admission patterns, weekend-versus-weekday differences, end-of-shift handoff bursts — so the envelope captures the operational shape, not just the average. Re-baselining after model updates or prompt template changes restarts the clock.
Workflows triggered by a documented role activation — Code Blue, rapid response, emergency department triage — belong in the per-agent policy as sanctioned-exception classes with their own fail-open paths. Workflows that run on a regular emergency-context flag set by the EHR but don’t change the agent’s normal envelope shape belong in the standard envelope; the flag is metadata, not a policy switch. The decision criterion is whether the access pattern itself differs during the workflow, not whether the situation is clinically urgent.
The enforcement policy reads the BAA registry as a continuously consulted layer rather than a baked-in list, so registry changes take effect at the next endpoint contact rather than requiring a policy redeploy. New BAA registrations open up the matching endpoint for legitimate calls; BAA terminations close the endpoint for future calls. The runtime AI-BOM logs every endpoint contact with the BAA-registry status at the moment of contact, so the §164.504(e) compliance evidence carries the registry version as part of the record.