The Library That Holds All Your AI Keys Was Just Backdoored: The LiteLLM Supply Chain Compromise

Mar 24, 2026

Ben Hirschberg
CTO & Co-founder

We just published a deep breakdown of the Trivy supply chain attacks yesterday. Twenty-four hours later, we’re writing about the next one. Same threat actor. Different target. Worse implications.

This time it’s LiteLLM, the Python library that acts as a universal API gateway for over 100 LLM providers. If you’re building anything with AI agents, MCP servers, or LLM orchestration, there’s a good chance LiteLLM is somewhere in your dependency tree. It has over 95 million monthly downloads on PyPI. And as of this morning, versions 1.82.7 and 1.82.8 contain a credential-stealing backdoor that harvests every secret it can find, moves laterally into your Kubernetes clusters, and installs a persistent backdoor on every node.

AI tooling is incredible. It’s also becoming the fattest, most credential-rich target in your entire infrastructure. Let’s talk about why.

Key Takeaways

  • LiteLLM is everywhere. With 95 million monthly downloads on PyPI and integrations across virtually every major AI agent framework, MCP server, and LLM orchestration tool, LiteLLM is one of the most critical building blocks in the AI ecosystem. If you’re working with AI, there’s a good chance it’s somewhere in your dependency tree, even if you never installed it directly.
  • It was compromised by TeamPCP this morning. On March 24, 2026, versions 1.82.7 and 1.82.8 were backdoored via maintainer account takeover. This is the same threat actor group behind the compromise of the Trivy vulnerability scanner (March 19), the hijack of the Checkmarx/KICS GitHub Action (March 21), and the defacement of 44 Aqua Security internal repositories (March 22). They are systematically working their way through the most trusted tools in the cloud-native and AI ecosystems.
  • The payload steals everything, persists on any Linux host, and spreads into Kubernetes clusters. The malware harvests SSH keys, cloud credentials, LLM API keys, .env files, database passwords, and crypto wallets from any machine it runs on. It installs a persistent systemd backdoor (sysmon.py) that polls for additional payloads. If it also finds a Kubernetes service account token, it escalates further: reading all cluster secrets across all namespaces and deploying privileged pods to every node. Version 1.82.8 uses a .pth file that executes on every Python process startup, no import needed. Developer laptops, CI runners, and production servers are all equally at risk.
  • Check your systems and rotate credentials now. Run pip show litellm across all environments. If you’re on 1.82.7 or 1.82.8, the damage may already be done. Remove the package, check for persistence artifacts (~/.config/sysmon/sysmon.py), audit kube-system for unauthorized pods, and rotate every credential that was accessible from the compromised machine. The last known-clean version is 1.82.6.

What Happened

On March 24, 2026, at 10:52 UTC, LiteLLM version 1.82.8 was published to PyPI. No corresponding tag or release exists on the LiteLLM GitHub repository. The package was uploaded directly to PyPI, bypassing the normal release workflow, which is a strong indicator of a maintainer account takeover.

Version 1.82.7 was also compromised, with malicious code injected into litellm/proxy/proxy_server.py.

The attack was discovered by researchers at FutureSearch when the package was pulled in as a transitive dependency by an MCP plugin running inside Cursor. That last detail matters: the victim never ran pip install litellm. It was pulled in automatically by their AI development tooling.

The Technical Breakdown

The compromise uses two injection vectors across the two malicious versions, and operates as a three-stage attack.

Stage 0: Getting Code Execution

Version 1.82.7 injects an obfuscated base64 payload into litellm/proxy/proxy_server.py at line 128, between two unrelated legitimate code blocks. The payload executes at import time, meaning any process that imports the LiteLLM proxy triggers the malware. According to Endor Labs’ analysis, the injected code is triple-nested: base64 in proxy_server.py decodes to an orchestrator, which decodes a second base64 blob containing the actual harvester.

Version 1.82.8 adds a second, more aggressive vector: a .pth file called litellm_init.pth (34,628 bytes). This is where things get especially nasty. Python’s .pth files are processed by the site module during interpreter startup. They don’t need an import statement. They don’t need your code to reference LiteLLM. If the package is installed in your environment, the .pth file executes on every single Python process invocation. Opening a Python REPL? It runs. Running a Django management command? It runs. Your CI pipeline executing pytest? It runs.
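To make the mechanism concrete, here is a small, benign sketch (the file and marker names are ours, not the malware's) showing that a line in a .pth file executes without any package ever being imported:

```python
# Benign demonstration of the .pth mechanism: any .pth line that starts
# with "import" is executed verbatim by the site module -- no import of
# the package is required.
import os
import site
import tempfile

def demo_pth_execution() -> str:
    """Drop a harmless .pth into a fresh site dir and show its code runs."""
    with tempfile.TemporaryDirectory() as d:
        marker = os.path.join(d, "marker.txt")
        with open(os.path.join(d, "demo_init.pth"), "w") as f:
            f.write("import pathlib; pathlib.Path(%r).write_text('executed')\n"
                    % marker)
        # site.addsitedir() processes .pth files exactly the way the
        # interpreter processes site-packages at startup.
        site.addsitedir(d)
        with open(marker) as f:
            return f.read()
```

Anything the attacker puts after that leading "import" runs on every interpreter start, which is precisely how litellm_init.pth gets code execution with zero imports from your code.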

The .pth launcher spawns a child Python process via subprocess.Popen, but because .pth files trigger on every interpreter startup, the child process re-triggers the same .pth, creating an exponential fork bomb. This is actually a bug in the malware, and it’s what led to the initial discovery: a developer’s machine ran out of RAM and crashed.

The attacker also rebuilt the wheel with a regenerated RECORD file. The RECORD entry for proxy_server.py contains the SHA-256 of the backdoored file, so standard integrity checks against the wheel’s own metadata pass cleanly. Source-to-artifact comparison is the only reliable detection method.
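For context, this is roughly what a standard RECORD check does (format per PEP 376/627), and it is exactly why the check passes against a regenerated RECORD; it only helps when the RECORD or source tree comes from an independently trusted copy:

```python
# RECORD-based integrity check (PEP 376/627 format:
# path,sha256=<unpadded urlsafe base64>,size). Against a RECORD the
# attacker regenerated, every backdoored file verifies cleanly.
import base64
import csv
import hashlib
import os

def verify_record(dist_info_dir: str, site_packages: str) -> list:
    """Return installed files whose hash differs from their RECORD entry."""
    mismatches = []
    with open(os.path.join(dist_info_dir, "RECORD"), newline="") as f:
        for row in csv.reader(f):
            if len(row) != 3 or not row[1].startswith("sha256="):
                continue  # RECORD lists itself without a hash; skip such rows
            path, hash_spec, _size = row
            with open(os.path.join(site_packages, path), "rb") as installed:
                digest = hashlib.sha256(installed.read()).digest()
            encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
            if encoded != hash_spec[len("sha256="):]:
                mismatches.append(path)
    return mismatches
```

Run the same function against hashes taken from the GitHub source tree (rather than the wheel's own RECORD) and the backdoored proxy_server.py stands out immediately.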

Stage 1: Credential Harvesting

Once executing, the payload sweeps the host for everything of value:

  • SSH keys and configs (~/.ssh/)
  • Cloud provider credentials: AWS (~/.aws/credentials, ~/.aws/config), GCP (Application Default Credentials), Azure tokens
  • Kubernetes configs (~/.kube/config)
  • Environment variables, dumped in full, which for AI workloads typically includes OPENAI_API_KEY, ANTHROPIC_API_KEY, HUGGINGFACE_TOKEN, and every other LLM provider key that LiteLLM is designed to proxy
  • .env files, recursively discovered
  • .gitconfig and shell history
  • Cryptocurrency wallet files
  • Database connection strings
  • Cloud metadata endpoints (IMDS, container credential endpoints)
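To gauge what an in-process harvester would have seen, here is a quick triage sketch (the name patterns are our assumptions; extend them for your stack) that reports exposed variable names without ever printing their values:

```python
# Triage sketch: list sensitive-looking environment variable NAMES
# (never values) that any code running in this environment could read.
# The pattern set below is an assumption -- broaden it for your stack.
import os
import re

SENSITIVE = re.compile(
    r"(_API_KEY|_TOKEN|_SECRET|_PASSWORD|AWS_ACCESS_KEY_ID|"
    r"AWS_SECRET_ACCESS_KEY|GOOGLE_APPLICATION_CREDENTIALS)$"
)

def exposed_variables(environ=os.environ) -> list:
    """Names of environment variables an in-process harvester would see."""
    return sorted(name for name in environ if SENSITIVE.search(name))
```

Every name this returns on an affected host belongs on your rotation list.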

The collected data is encrypted with a hardcoded 4096-bit RSA public key using AES-256-CBC (random session key, encrypted with the RSA key), bundled into a tar archive named tpcp.tar.gz, and exfiltrated via HTTP POST to models.litellm[.]cloud, a domain designed to look like legitimate LiteLLM infrastructure.

Stage 2: Kubernetes Lateral Movement

If the malware detects a Kubernetes service account token on the host (which is present by default on every pod unless explicitly disabled), it escalates:

  1. Reads all cluster secrets across all namespaces. Service account tokens with broad RBAC permissions, common in development clusters and overly permissive production setups, give the malware access to every Secret object in the cluster.
  2. Creates a privileged alpine:latest pod on every node in kube-system. Each pod mounts the host filesystem at /host and runs with full privileges.
  3. Installs a persistent backdoor at /root/.config/sysmon/sysmon.py on each node’s host filesystem, with a corresponding systemd user service (sysmon.service).
  4. The backdoor polls for updated payloads from checkmarx.zone/raw, establishing ongoing C2 communication.

On non-Kubernetes hosts, the same persistence mechanism is installed locally at ~/.config/sysmon/sysmon.py.
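Based on the behavior described above, the deployed pods would look roughly like this. This manifest is a reconstruction for hunting purposes, not the attacker’s actual artifact:

```yaml
# Approximate shape of the pods to hunt for in kube-system --
# reconstructed from the reported behavior, not the recovered manifest.
apiVersion: v1
kind: Pod
metadata:
  name: node-setup-abc12        # matches the reported node-setup-* pattern
  namespace: kube-system
spec:
  containers:
    - name: setup
      image: alpine:latest
      securityContext:
        privileged: true         # full access to the node
      volumeMounts:
        - name: host
          mountPath: /host       # entire host filesystem exposed
  volumes:
    - name: host
      hostPath:
        path: /
```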

The Kill Chain Summarized

```
.pth file auto-executes on Python startup
  → Spawns child process (base64-encoded orchestrator)
    → Decodes + executes credential harvester
      → Encrypts stolen data (AES-256-CBC + RSA-4096)
        → Exfiltrates to models.litellm[.]cloud
    → Checks for K8s service account token
      → Reads all cluster secrets
      → Deploys privileged pods to every node
        → Installs persistent backdoor (sysmon.py + systemd)
          → Polls checkmarx.zone/raw for additional payloads
```

Attribution: TeamPCP’s Expanding Campaign

Endor Labs attributes this attack to TeamPCP with high confidence. The C2 infrastructure, persistence tradecraft, encryption scheme, and kill switch logic are identical to the tooling used in the Trivy and KICS compromises earlier this month.

The timeline tells the story of a systematic campaign:

| Date | Target | Vector | Ecosystem |
| --- | --- | --- | --- |
| Feb 27 | Trivy | pull_request_target workflow exploitation | GitHub Actions |
| Mar 19 | Trivy (again) | Tag force-push, malicious v0.69.4 binary | GitHub Actions, Docker Hub, GHCR, ECR |
| Mar 21 | Checkmarx/KICS | GitHub Action compromise | GitHub Actions |
| Mar 22 | Aqua Security org | 44 internal repositories defaced | GitHub |
| Mar 24 | LiteLLM | PyPI maintainer account takeover | PyPI |

The progression is deliberate: security tools first (Trivy, Checkmarx), then AI infrastructure (LiteLLM). TeamPCP is targeting the tools that organizations trust implicitly, specifically vulnerability scanners and API gateways, because those tools have the broadest access to credentials and infrastructure.

Why This Matters More Than You Think: The AI Credential Problem

Here’s the part that should keep you up at night.

LiteLLM isn’t just any Python library. It’s an LLM API gateway. Its entire purpose is to hold API keys for dozens of AI providers (OpenAI, Anthropic, Google, Azure OpenAI, Hugging Face, Replicate, Bedrock, and many more) and proxy requests to them. A typical LiteLLM deployment has more API keys in its environment than almost any other service in your infrastructure.

And LiteLLM sits at the center of the fastest-growing software ecosystem in the world: AI agents. It’s a transitive dependency of agent frameworks, MCP servers, and LLM orchestration tools. The developer who discovered this compromise wasn’t even using LiteLLM directly. It was pulled in by a Cursor MCP plugin.

This is the fundamental tension we’re facing: the AI ecosystem is amazing, and it’s also a credential dumpster fire.

Think about what a typical AI agent deployment looks like:

  • LLM API keys for one or more providers (often with high rate limits and associated billing)
  • Tool-use credentials: database access, API tokens for SaaS services, cloud provider keys for infrastructure operations
  • MCP server credentials: whatever access the MCP servers expose, which can be anything from Slack to GitHub to your production Kubernetes cluster
  • Vector database credentials: Pinecone, Weaviate, Qdrant, or your self-hosted Postgres with pgvector
  • Memory and context stores: often containing sensitive conversation history, user data, and proprietary information

All of this lives in environment variables, .env files, Kubernetes Secrets, and cloud credential files. Those are exactly the targets that the LiteLLM malware harvests.

What You Should Do Right Now

Immediate Response

1. Check if you’re affected. Run pip show litellm in every environment. Check for versions 1.82.7 or 1.82.8. Don’t forget CI/CD environments, Docker images built today, and developer machines. Check uv caches too: find ~/.cache/uv -name "litellm_init.pth".
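The same check can be scripted for a fleet sweep. This sketch uses importlib.metadata, which reads dist-info metadata from disk without importing the package; one caveat is that with 1.82.8’s .pth vector, merely starting Python on an infected host executes the payload, so prefer running checks from a known-clean interpreter or via filesystem inspection:

```python
# Sweep sketch: classify the installed litellm version against the
# known-bad releases. importlib.metadata reads dist-info from disk and
# does NOT import the package itself.
from importlib import metadata
from typing import Optional

COMPROMISED = {"1.82.7", "1.82.8"}

def classify(version: Optional[str]) -> str:
    if version is None:
        return "not installed"
    return "COMPROMISED" if version in COMPROMISED else "clean (%s)" % version

def litellm_status() -> str:
    try:
        return classify(metadata.version("litellm"))
    except metadata.PackageNotFoundError:
        return classify(None)
```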

2. Check for persistence. Look for ~/.config/sysmon/sysmon.py and ~/.config/systemd/user/sysmon.service on every affected host. In Kubernetes, audit the kube-system namespace for pods matching node-setup-* with suspicious alpine:latest images and host filesystem mounts.
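The host-side sweep can also be scripted. A minimal sketch covering the paths reported in this incident; run it as each local user, and point it at /root as well, since the Kubernetes variant drops the backdoor there:

```python
# Persistence-artifact sweep for the paths reported in this incident.
from pathlib import Path

ARTIFACTS = [
    ".config/sysmon/sysmon.py",
    ".config/systemd/user/sysmon.service",
]

def find_persistence(home: Path = Path.home()) -> list:
    """Return any known sysmon persistence artifacts found under `home`."""
    return [str(home / rel) for rel in ARTIFACTS if (home / rel).exists()]
```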

3. Rotate everything. If you ran the compromised version, assume every credential accessible from that environment is stolen. That means:

  • All LLM API keys (OpenAI, Anthropic, Google, etc.)
  • Cloud provider credentials (AWS access keys, GCP service account keys, Azure tokens)
  • SSH keys
  • Kubernetes configs and service account tokens
  • Database passwords
  • Any secrets in .env files

4. Audit Kubernetes clusters. If the compromised code ran in a pod with a mounted service account token, check whether unauthorized pods were created in kube-system. Review RBAC policies: if the service account had list secrets permissions across namespaces, assume all secrets are compromised.
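The pod audit can be automated against kubectl output. A triage sketch, where the pattern is drawn from the IoCs in this post and pod is a dict shaped like an item from kubectl get pods -o json:

```python
# Triage sketch: flag pods matching the reported lateral-movement
# pattern (name node-setup-*, privileged, hostPath mount of "/").
# `pod` is a dict shaped like one item of `kubectl get pods -o json`.
def is_suspect_pod(pod: dict) -> bool:
    name = pod.get("metadata", {}).get("name", "")
    spec = pod.get("spec", {})
    privileged = any(
        c.get("securityContext", {}).get("privileged", False)
        for c in spec.get("containers", [])
    )
    mounts_root = any(
        v.get("hostPath", {}).get("path") == "/"
        for v in spec.get("volumes", [])
    )
    return name.startswith("node-setup-") and privileged and mounts_root
```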

Structural Defenses

This is where we shift from incident response to the question of how to prevent this class of attack from being this devastating in the first place.

Pin dependencies to exact versions with hash verification. pip install litellm==1.82.6 wouldn’t have saved you if your lockfile updated automatically. What does help: lockfiles with hash pinning (pip-tools with --generate-hashes, uv lock, or poetry lock). These ensure you’re installing exactly the artifact you reviewed, not whatever happens to be on PyPI right now.
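For illustration, hash-pinned output looks like this; the digest below is a placeholder, since pip-compile fills in the real values:

```
# Generated by: pip-compile --generate-hashes requirements.in
# Install with: pip install --require-hashes -r requirements.txt
litellm==1.82.6 \
    --hash=sha256:<digest emitted by pip-compile>
```

With hashes present, pip refuses any artifact whose digest differs from the lockfile, even one published under the same version number.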

Don’t mount service account tokens by default. Kubernetes mounts a service account token into every pod unless you explicitly set automountServiceAccountToken: false. Most workloads don’t need it. The LiteLLM malware’s Kubernetes lateral movement is completely dependent on finding this token. Kubescape’s C-0034 control flags pods with automatically mounted service account tokens.
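In pod-spec terms, the opt-out is a single line (the workload names here are illustrative):

```yaml
# Opting a workload out of the automatically mounted token; any pod
# that never calls the Kubernetes API can safely set this.
apiVersion: v1
kind: Pod
metadata:
  name: web-frontend            # illustrative name
spec:
  automountServiceAccountToken: false
  containers:
    - name: app
      image: registry.example.com/web-frontend:1.4.2   # illustrative image
```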

Apply least-privilege RBAC. Even when a service account token is needed, it shouldn’t have list secrets across all namespaces. The malware reads every secret in the cluster, but only if the RBAC policy allows it. ARMO Platform’s runtime-based RBAC recommendations can show you which permissions are actually used versus what’s granted.

Monitor for anomalous runtime behavior. A Python web server suddenly spawning curl processes, reading /var/run/secrets/kubernetes.io/serviceaccount/token, or creating pods in kube-system is not normal behavior. Runtime detection, based on eBPF observation of actual system calls, catches supply chain attacks regardless of the entry vector. ARMO’s Cloud Application Detection & Response (CADR) profiles expected application behavior and alerts on deviations: unexpected process execution, anomalous network connections, suspicious file access patterns, and unauthorized Kubernetes API calls.

Treat AI workloads as high-value targets. The density of credentials in AI agent deployments makes them disproportionately valuable to attackers. Apply the same security rigor you’d apply to a payment processing service: network segmentation, secret rotation policies, runtime monitoring, and above all, the principle of least privilege for every API key and credential.

The Bigger Picture

We said it in our Trivy post yesterday, and we’ll say it again: supply chain attacks are not theoretical risks. They are active, ongoing, and escalating.

In the span of five days, TeamPCP has compromised a security scanner (Trivy), a SAST tool (Checkmarx/KICS), and now an AI API gateway (LiteLLM). The pattern is clear: compromise trusted infrastructure, steal credentials, use those credentials to compromise more infrastructure, repeat.

AI and AI agents are transforming how we build software. The productivity gains are real and massive. But the ecosystem around them (the package registries, the transitive dependency chains, the MCP servers, the agent frameworks) is a target-rich environment where a single compromised package can cascade into thousands of affected systems, each loaded with API keys and cloud credentials.

The question isn’t whether to adopt AI tooling. It’s whether your security posture is ready for the reality that every library in your AI stack is a potential attack vector, and the credentials it has access to are exactly what attackers are after.

Indicators of Compromise (IoCs)

| Indicator | Type |
| --- | --- |
| litellm==1.82.7 | Compromised PyPI package |
| litellm==1.82.8 | Compromised PyPI package |
| litellm_init.pth (SHA-256: ceNa7wMJnNHy1kRnNCcwJaFjWX3pORLfMh7xGL8TUjg) | Malicious .pth file |
| models.litellm[.]cloud | Exfiltration C2 domain |
| checkmarx[.]zone/raw | Payload delivery C2 |
| ~/.config/sysmon/sysmon.py | Persistence backdoor |
| ~/.config/systemd/user/sysmon.service | Persistence systemd service |
| tpcp.tar.gz | Exfiltration archive filename |
| Privileged pods in kube-system matching node-setup-* with alpine:latest | K8s lateral movement |

Last known-clean version: litellm==1.82.6 (published March 22, 2026).


This post references analysis from FutureSearch, Endor Labs, Wiz, and Sysdig. The community is tracking the issue at BerriAI/litellm#24512.
