When Supply Chain Attacks Meet NHI Sprawl
What Mercor and Other Breaches Reveal About Non-Human Identity Risk
TL;DR: Aqua rotated their credentials after the initial Trivy compromise. Three weeks later, TeamPCP used residual access to launch the cascading campaign that hit LiteLLM, Mercor, and thousands of others. The supply chain was the vector, but it was non-human identity (NHI) access that defined the blast radius. It's the same NHI escalation pattern behind the Cloudflare breach, the tj-actions compromise, and the Cisco incident. The industry's universal remediation advice, "rotate your credentials," assumes capabilities most organizations don't have.
This post breaks down what went wrong, why rotation alone keeps failing, and what security teams can do to actually close the gap. You can't always predict the next supply chain attack but you can prepare for how far it reaches.
What happened: the Trivy-LiteLLM-Mercor attack chain
In late February 2026, a threat actor group called TeamPCP exploited a misconfigured GitHub Actions workflow in the Aqua Trivy vulnerability scanner to steal a privileged access token. The vulnerability (CVE-2026-33634, CVSS 9.4) was later added to CISA's Known Exploited Vulnerabilities catalog. Aqua Security disclosed the breach on March 1 and rotated credentials. But the rotation wasn't complete.
On March 19, using residual access, TeamPCP published malicious releases that turned trusted CI/CD tooling into credential-harvesting infrastructure. Within days, the campaign expanded to LiteLLM (present in ~36% of cloud environments), 47 npm packages, Checkmarx, and Telnyx, a new target every one to three days. What followed was the largest documented supply chain cascade of 2026, culminating in the Mercor data breach.
The payload was exhaustive. If a credential existed on a compromised machine — cloud keys, Kubernetes tokens, database passwords, VPN configs, TLS private keys — it was harvested. Wiz's incident response team documented what happened next. TeamPCP validated stolen credentials within hours using TruffleHog, then moved to full cloud enumeration. Attackers moved quickly across IAM, compute, storage, and secrets management layers, prioritizing speed over stealth.
At Mercor, harvested Tailscale VPN keys gave the attackers network-level access. They exfiltrated 4TB: source code, a user database with PII, and storage buckets containing video interviews and identity documents.
For the full technical breakdown, see the analyses from Upwind, Palo Alto Unit 42, and Microsoft Security.
"Rotate your credentials" is not a remediation plan
Every post-breach advisory ends the same way. Aqua's own incident response page lists "rotate all potentially exposed secrets" as a required action, followed by ten categories of credentials across cloud providers, container registries, SSH keys, Kubernetes tokens, database passwords, VPN configs, TLS private keys, and cryptocurrency wallets. Microsoft's advisory says the same. So does every CISO playbook ever written.
The advice is correct. The problem is that it assumes capabilities most organizations don't have.
Rotation requires knowing what to rotate. That means a complete inventory of every non-human identity and credential in the blast radius. It requires knowing what depends on each credential: which services, applications, and pipelines will break if you revoke it. And it requires knowing whether the rotation actually worked. Not just the paths you remembered to check. Every credential path, confirmed dead.
Aqua rotated. It wasn't enough. Their own disclosure acknowledges the rotation "did not fully sever access." Three weeks later, that gap enabled the entire cascading campaign — Trivy to LiteLLM to Mercor to thousands of others. One service account that survived rotation. One PAT that wasn't revoked. That's the distance between containment and catastrophe.
This pattern of failure isn't unique to Aqua. When Cloudflare was breached via compromised Okta credentials in 2023, their team rotated aggressively. They still missed one access token and three service accounts they mistakenly believed were unused. Those four overlooked NHIs gave the attacker access to Confluence, Jira, and Bitbucket. When Cloudflare later attempted a large-scale credential rotation as a remediation measure, the rotation itself caused a production outage. They didn't have a complete map of what depended on what.
"Rotate your credentials" is the right answer. But without the ability to discover, map dependencies, understand blast radius, prioritize, and verify? It's an aspiration. Not a plan.
The pattern that keeps repeating
Every detail changes from breach to breach: the entry point, the target, the exfiltration method. But the escalation path is always the same:
Initial access → credential harvesting → NHI validation → lateral movement → full compromise.
The initial access gets the headlines. But the force multiplier is the same every single time. Non-human identity credentials that are over-privileged, unmonitored, and outlive their intended purpose.
This is the same pattern behind:
- Codecov breach in 2021: a compromised CI tool harvesting environment variables and credentials from thousands of customer pipelines.
- Okta's 2022 compromise by Lapsus$ (the same extortion group now auctioning Mercor's data): NHI credentials found on accessible systems enabled the escalation.
- tj-actions/changed-files incident in 2025: a single compromised GitHub Action leaked secrets from CI/CD pipelines across 23,000 repositories.
- Cisco breach: hard-coded NHI credentials (API tokens, certificates, private keys) were the prize.
The Cloud Security Alliance's research note on TeamPCP names it directly: the attack succeeded not because of sophisticated zero-days, but because of "the endemic mismanagement of non-human identities across modern software supply chains."
The entry point will keep changing. The NHI sprawl is always what turns a breach into a catastrophe.
Five questions that determine your blast radius
Secret scanning and detection are necessary, but they solve for the moment a credential leaks. The organizations that contain incidents fastest are the ones that can answer a harder set of questions about what happens after a credential is compromised. These are the questions that separate "we found it" from "we fixed it."
1. Do you see every credential and its full context profile?
You can't rotate what you don't know about. When the LiteLLM advisory dropped, the first question every affected organization needed to answer was: which of our credentials are at risk, and what can they reach? Not just the credential itself — the identity behind it, its entitlements, its consumers, the downstream systems it connects to, the data it can access. Without that inventory, triage takes days. TeamPCP was enumerating AWS environments within 24 hours. That's the window.
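What a per-credential "context profile" might look like can be sketched in a few lines. This is a minimal model, not a real product schema; every field name and identifier below is illustrative:

```python
from dataclasses import dataclass

# Hypothetical context profile for a single NHI credential.
# All identifiers here are invented for illustration.
@dataclass
class CredentialContext:
    credential_id: str
    identity: str                 # the non-human identity behind the credential
    entitlements: list[str]       # what it is allowed to do
    consumers: list[str]          # services and pipelines that use it
    reachable_systems: list[str]  # downstream systems it can touch

def triage(inventory: list[CredentialContext],
           advisory_scope: set[str]) -> list[CredentialContext]:
    """Return credentials whose consumers overlap an advisory's scope."""
    return [c for c in inventory if advisory_scope & set(c.consumers)]

inventory = [
    CredentialContext("akia-demo-1", "ci-deployer", ["s3:*"],
                      ["litellm-proxy"], ["prod-bucket"]),
    CredentialContext("tok-demo-2", "dashboard-bot", ["read:metrics"],
                      ["grafana"], ["metrics-db"]),
]
at_risk = triage(inventory, {"litellm-proxy"})
```

With an inventory like this in place, the triage question "which of our credentials are at risk?" becomes a lookup instead of a days-long investigation.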
2. Do you understand the blast radius before you act?
At Mercor, a single Tailscale VPN credential unlocked full network access. Across the broader campaign, stolen credentials reached IAM roles, secrets managers, ECS clusters, and databases. Least privilege matters but understanding blast radius means more than checking permission policies. It means knowing which consumers actually use a credential, what they connect from, and what resources they touch. A live dependency map, not a static config review. That's what makes the difference between safe rotation and a rotation that takes production down.
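A dependency map like that is, at its core, a reachability problem over a graph. Here is a toy sketch, with invented edges loosely modeled on the Mercor chain described above; a real map would be built from observed connections, not hand-written:

```python
from collections import deque

# Toy dependency map: credential -> directly reachable systems, and
# systems -> further systems they grant access to. Edges are illustrative.
edges = {
    "tailscale-key": ["internal-network"],
    "internal-network": ["source-repo", "user-db", "storage-buckets"],
    "storage-buckets": ["video-interviews", "id-documents"],
}

def blast_radius(start: str) -> set[str]:
    """BFS over the dependency map: everything the credential can reach."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

reach = blast_radius("tailscale-key")
```

The same traversal, run in reverse over consumer edges, answers the other half of the question: what breaks if this credential is revoked.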
3. Are stale credentials still active?
TeamPCP used TruffleHog to filter harvested credentials for valid, active ones. Many were likely stale — created for a purpose months or years ago and never deprovisioned. Proactive lifecycle management means that when credentials are inevitably harvested, many are already useless.
The Aqua incident is the starkest illustration: credentials that were supposed to have been rotated still worked three weeks later. Shortening credential lifetimes and decommissioning what's unused is the single highest-leverage thing most organizations can do to reduce their blast radius before a breach happens.
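A staleness check is the simplest of these controls to mechanize. The sketch below assumes a last-used timestamp per credential and an illustrative 90-day idle policy; the window and names are assumptions, not a standard:

```python
from datetime import datetime, timedelta, timezone

MAX_IDLE = timedelta(days=90)  # illustrative policy window, not a standard

def is_stale(last_used: datetime, now: datetime) -> bool:
    """A credential unused longer than the policy window is a decommission candidate."""
    return now - last_used > MAX_IDLE

now = datetime(2026, 3, 19, tzinfo=timezone.utc)
# Invented credentials with hypothetical last-used timestamps.
creds = {
    "vpn-config-old": datetime(2025, 6, 1, tzinfo=timezone.utc),
    "deploy-key": datetime(2026, 3, 10, tzinfo=timezone.utc),
}
stale = [name for name, used in creds.items() if is_stale(used, now)]
```

Run continuously, a check like this means harvested credentials are more likely to already be dead when an attacker validates them.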
4. When a credential behaves anomalously, do you catch it and can you remediate at scale?
The post-compromise phase is textbook anomalous behavior for a non-human identity. Think: ListUsers, ListRoles, DescribeInstances, ListSecrets, ListBuckets. Credentials suddenly active from unfamiliar networks, accessing resources they've never touched, or waking up after months of dormancy. These are detectable signals. But only if you're baselining NHI behavior, where they connect from, and what they normally access.
The organizations that caught TeamPCP before exfiltration were the ones monitoring identity behavior in real time, not scanning for exposed secrets after the fact.
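The core of such a baseline can be sketched as a per-identity allowlist of normal networks and API actions. This is a deliberately minimal model with invented names; real baselining would be learned from telemetry rather than declared by hand:

```python
# Minimal behavioral baseline per NHI: the source networks and API actions
# it normally uses. Identity and network names are illustrative.
baseline = {
    "ci-deployer": {
        "networks": {"office-vpn"},
        "actions": {"s3:PutObject", "ecs:UpdateService"},
    },
}

def is_anomalous(identity: str, network: str, action: str) -> bool:
    """Flag any event outside the identity's learned baseline."""
    profile = baseline.get(identity)
    if profile is None:
        return True  # unknown identity: always flag
    return (network not in profile["networks"]
            or action not in profile["actions"])

# A CI deployer suddenly enumerating IAM from an unfamiliar host is exactly
# the post-compromise signal described above.
alert = is_anomalous("ci-deployer", "unknown-vps", "iam:ListUsers")
```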
5. When you rotate, can you verify it actually worked?
This is the lesson Aqua learned the hard way. Rotation that leaves one valid credential behind is not rotation, it's a countdown to re-compromise. Verification means confirming that every credential in the blast radius has been invalidated, every consumer has been migrated to the new credential, and no residual access path remains. It's the difference between incident response and incident recurrence.
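Verification reduces to two checks that must both pass for every credential in scope: the old credential is invalid, and every consumer has migrated. The sketch below stands in for real probes (an actual check would attempt authentication with the revoked credential and confirm consumer health):

```python
# Rotation is verified only when no old credential still authenticates AND
# every consumer runs on the new credential. Inputs stand in for real probes.
def rotation_verified(old_creds_valid: dict[str, bool],
                      consumers_migrated: dict[str, bool]) -> tuple[bool, list[str]]:
    gaps = [c for c, valid in old_creds_valid.items() if valid]
    gaps += [c for c, done in consumers_migrated.items() if not done]
    return (not gaps, gaps)

# One service account survived rotation: the Aqua failure mode in miniature.
ok, gaps = rotation_verified(
    {"pat-ci": False, "svc-account-7": True},
    {"build-pipeline": True, "nightly-job": True},
)
```

The point of making the check explicit is that "rotation complete" becomes a verifiable claim with a named gap list, not an assumption.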
From "found" to "fixed": treating leaked credentials as identity problems
The standard playbook treats a leaked credential as a string-matching problem: find it, flag it, tell someone to rotate it. That's necessary. It's also where most programs get stuck — because the alert fires and nobody can answer the five questions above.
The shift that changes this is treating a leaked credential as an identity problem. When a scanner finds a secret, the question that matters isn't "is this a real credential?" It's: which identity does this belong to? Is it still valid? What can it reach? What consumes it? What breaks if I revoke it? Who owns it?
That's what we mean by identity-aware secret scanning. It connects exposed credentials to the non-human identities behind them, so that "rotate your credentials" stops being an open-ended investigation and becomes an executable workflow. This is what non-human identity management looks like in practice: inventory, dependency mapping, behavioral baselines, blast radius analysis, and verified rotation. All anchored to the identity itself.
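The enrichment step can be sketched as a lookup that attaches identity context to a raw scanner hit before the alert reaches anyone. The index and every identifier below are hypothetical stand-ins for a real NHI inventory:

```python
# Hypothetical identity index mapping a matched secret to its NHI context.
# All names are invented for illustration.
identity_index = {
    "AKIA-EXAMPLE": {
        "identity": "ci-deployer",
        "owner": "platform-team",
        "valid": True,
        "consumers": ["build-pipeline"],
    },
}

def enrich(secret_match: str) -> dict:
    """Turn a raw scanner hit into an alert that arrives already answerable."""
    ctx = identity_index.get(secret_match)
    if ctx is None:
        return {"match": secret_match, "identity": None, "action": "investigate"}
    action = "rotate-with-dependency-check" if ctx["valid"] else "close-as-inactive"
    return {"match": secret_match, **ctx, "action": action}

alert = enrich("AKIA-EXAMPLE")
```

The difference from plain scanning is the output: instead of a string and a severity, the responder gets the owning identity, its consumers, and a next action.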
Every organization will face a moment where credentials are compromised. The question isn't whether you'll be told to rotate. It's whether you'll have the context to do it completely, safely, and fast enough to matter.
The bottom line
The Mercor breach will be remembered as a supply chain attack. But the supply chain compromise was just the key in the lock. Everything that followed — the credential validation, the cloud enumeration, the lateral movement, the 4TB exfiltration — happened because non-human identity credentials were sitting everywhere, had access to everything, and nobody was watching.
The entry point will keep changing. The NHI sprawl is always the force multiplier.