The Claude Code source leak revealed that Anthropic’s coding tool contacts its servers every hour to poll remote settings — including 6+ killswitches that can force-quit the application, bypass permission prompts, or toggle features without user-initiated updates. The code also contains “Undercover Mode,” a stealth protocol that strips all AI attribution from commits made by Anthropic employees in public open-source repositories.
On March 31, 2026, a misconfigured .npmignore file caused Anthropic to accidentally ship a 59.8 MB source map inside the @anthropic-ai/claude-code npm package (version 2.1.88). The result: 512,000 lines of unobfuscated TypeScript across 1,906 files became public — mirrored across GitHub before Anthropic could react. If you want the full story of how the leak happened and what the Claude Code leak exposed overall, we covered that in a separate deep dive. This article zooms in on three specific subsystems that raise serious questions about transparency, user control, and the ethics of AI-assisted open-source contributions.
## What are Claude Code’s remote killswitches and how do they work?
A killswitch is a mechanism that allows a vendor to remotely change the behavior of software running on your machine — without requiring you to update the application or explicitly consent. The Claude Code source reveals at least 6 such remote killswitches, all managed through GrowthBook, an open-source feature flagging and A/B testing platform.
The architecture works in two layers. First, a remote settings endpoint on Anthropic’s servers is polled every 60 minutes by every running instance of Claude Code. This endpoint returns a policySettings object — a configuration payload that can override local settings, activate or deactivate features, and in some cases force the application to shut down entirely. Second, GrowthBook feature flags (prefixed tengu_ — Claude Code’s internal project codename) gate individual behaviors at a granular level.
What can these killswitches actually control? Based on analysis of the leaked source, the documented capabilities include: bypassing permission prompts that normally guard file system and terminal access, enabling or disabling “fast mode” (which reduces safety checks for speed), toggling voice mode, controlling analytics collection, and forcing a full application exit. If a pushed configuration change is classified as “dangerous” internally, a blocking dialog appears — but if the user rejects the change, the application quits anyway. You don’t get to say no and keep working.
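The control flow described above can be sketched in TypeScript. This is an illustrative reconstruction, not Anthropic’s code: the type and function names (PolicySettings, resolvePolicy) are invented, and only the behaviors listed in the leak analysis are modeled.

```typescript
// Hourly poll interval described in the leak.
const POLL_INTERVAL_MS = 60 * 60 * 1000;

// Hypothetical shape of the remote policySettings payload.
type PolicySettings = {
  forceExit?: boolean;              // remote killswitch: quit the app
  bypassPermissionPrompts?: boolean;
  fastMode?: boolean;               // reduced safety checks for speed
  voiceMode?: boolean;
  analytics?: boolean;
  dangerous?: boolean;              // triggers the blocking dialog
};

type PolicyAction =
  | { kind: "apply"; settings: PolicySettings }
  | { kind: "confirm-or-exit"; settings: PolicySettings } // reject => app quits
  | { kind: "exit" };

// Decide what a freshly polled policy payload forces the client to do.
function resolvePolicy(p: PolicySettings): PolicyAction {
  if (p.forceExit) return { kind: "exit" };
  if (p.dangerous) return { kind: "confirm-or-exit", settings: p };
  return { kind: "apply", settings: p };
}
```

The key asymmetry is in the confirm-or-exit branch: the user can acknowledge a dangerous change or quit, but cannot decline it and keep working.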
Remote killswitches are not unique to Claude Code — most cloud-connected SaaS products have similar capabilities. What makes this case notable is the combination: Claude Code requests filesystem access, terminal command execution, and full codebase read/write privileges on your development machine. A tool with that level of access being remotely controllable deserves scrutiny.
The feature flags themselves reveal Anthropic’s internal naming conventions. Some examples from the leaked source: tengu_attribution_header controls the billing attestation header, tengu_anti_distill_fake_tool_injection toggles decoy tool definitions to poison model distillation attempts, and tengu_penguins_off is a kill-switch for what appears to be an organizational feature mode. The naming is whimsical (penguins, tengu), but the capabilities are not.
## How does Claude Code’s telemetry work — and what data does it send?
The telemetry system in Claude Code (managed by firstPartyEventLoggingExporter.ts) is more granular than most users would expect from a developer tool. When Claude Code launches, it phones home with a payload that includes: user ID, session ID, application version, platform and terminal type, organization UUID, account UUID, email address (if defined), and which feature gates are currently enabled.
Anthropic originally used Statsig for analytics — until OpenAI acquired Statsig in September 2025, which presumably made continuing that arrangement uncomfortable. The switch to GrowthBook followed. If the network is unavailable at the time of a telemetry event, data is cached locally in ~/.claude/telemetry/ and transmitted later.
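Based on the fields listed above, the startup payload and the offline spool can be sketched as follows. The type name StartupEvent and the exact field spellings are assumptions; only the field list and the ~/.claude/telemetry/ location come from the leak.

```typescript
// Illustrative shape of the startup "phone home" payload; field names are
// guesses, not the exact identifiers from firstPartyEventLoggingExporter.ts.
type StartupEvent = {
  userId: string;
  sessionId: string;
  appVersion: string;
  platform: string;
  terminal: string;
  organizationUuid?: string;
  accountUuid?: string;
  email?: string;          // included only if defined
  enabledGates: string[];  // currently active feature gates
};

// When the network is unavailable, events are cached under
// ~/.claude/telemetry/ and flushed later. Path construction sketch:
function spoolPath(home: string, eventId: string): string {
  return `${home}/.claude/telemetry/${eventId}.json`;
}
```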
Beyond standard product analytics, the leaked code reveals a frustration-tracking subsystem. A regex in userPromptKeywords.ts scans user input for profanity and anger signals — patterns like “wtf,” “this sucks,” “piece of shit,” and dozens of similar expressions. An LLM company using regex for sentiment analysis is ironic (as multiple commentators pointed out), but it’s also pragmatic: a regex check is orders of magnitude cheaper than an inference call just to determine if someone is swearing at the tool.
```javascript
// Frustration detection regex (simplified)
/\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass|horrible|awful|piss(ed|ing)?\s*off|piece\s*of\s*(shit|crap|junk)|what\s*the\s*(fuck|hell)|fuck(ing)?\s*(broken|useless|terrible)|so\s*frustrating|this\s*sucks|damn\s*it)\b/
```
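The simplified pattern is a valid JavaScript regex literal once rejoined on a single line, so the check itself is a one-liner. In this sketch the function name isFrustrated and the case-insensitive flag are assumptions:

```typescript
// The (simplified) frustration pattern from userPromptKeywords.ts, rejoined;
// the /i flag is an assumption for illustration.
const FRUSTRATION_RE =
  /\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass|horrible|awful|piss(ed|ing)?\s*off|piece\s*of\s*(shit|crap|junk)|what\s*the\s*(fuck|hell)|fuck(ing)?\s*(broken|useless|terrible)|so\s*frustrating|this\s*sucks|damn\s*it)\b/i;

// Cheap sentiment check: one regex test instead of an inference call.
function isFrustrated(prompt: string): boolean {
  return FRUSTRATION_RE.test(prompt);
}
```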
There is an opt-out mechanism — users can disable all telemetry via the environment variable CLAUDE_CODE_DISABLE_AUTO_MEMORY=1, or run in --bare mode, which strips both memory and telemetry. But there is no documented way to disable frustration tracking independently while keeping other features active. It’s all-or-nothing.
A separate telemetry channel, tengu_api_query, transmits the payload size of every API call — including the byte length of the system prompt, messages, and tool schemas. This isn’t the content of your code, but it gives Anthropic a precise picture of how much context each user session consumes — data directly relevant to capacity planning and pricing decisions.
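What a size-only channel like this measures can be illustrated with a short sketch. The function below is hypothetical; it counts UTF-8 byte lengths of the three request parts named above, never their content.

```typescript
// Hypothetical sketch: measure request-part sizes the way a channel like
// tengu_api_query is described as doing -- byte lengths only, no content.
function payloadSizes(
  systemPrompt: string,
  messages: string[],
  toolSchemas: string[],
): { systemPromptBytes: number; messageBytes: number; toolSchemaBytes: number } {
  const bytes = (s: string) => new TextEncoder().encode(s).length;
  return {
    systemPromptBytes: bytes(systemPrompt),
    messageBytes: messages.reduce((n, m) => n + bytes(m), 0),
    toolSchemaBytes: toolSchemas.reduce((n, t) => n + bytes(t), 0),
  };
}
```

Aggregated over sessions, these numbers alone are enough to model per-user context consumption, which is exactly the capacity-planning signal the article describes.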
## What is Undercover Mode and why is it controversial?
Undercover Mode is perhaps the most discussed discovery from the entire leak — and the most ethically complex. It’s a stealth protocol that activates automatically when an Anthropic employee (identified by the flag USER_TYPE === 'ant') uses Claude Code to work on a public or open-source repository.
When active, it injects a system prompt that reads (verbatim from the leaked source):
```
## UNDERCOVER MODE - CRITICAL
You are operating UNDERCOVER in a PUBLIC/OPEN-SOURCE
repository. Your commit messages, PR titles, and PR
bodies MUST NOT contain ANY Anthropic-internal
information. Do not blow your cover.
NEVER include in commit messages or PR descriptions:
- Internal model codenames (Capybara, Tengu, etc.)
- Unreleased model version numbers
- Internal repo or project names
- Internal tooling, Slack channels, or short links
- The phrase "Claude Code" or any mention that you
are an AI
- Co-Authored-By lines or any other attribution
```
The system also actively strips Co-Authored-By metadata — the standard Git convention used to indicate that AI participated in writing code. An engineer can force Undercover Mode on manually with CLAUDE_CODE_UNDERCOVER=1, but critically, there is no way to force it off when the employee flag is active. This is a one-way door: if you are identified as an Anthropic employee working on a public repo, the AI will hide its own involvement by default.
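The stripping step can be sketched as a commit-message filter. This is a hypothetical reconstruction: the leak confirms that Co-Authored-By trailers are removed, while the "Generated with" pattern is an assumed extra case added for illustration.

```typescript
// Hypothetical sketch of attribution stripping: drop Co-Authored-By trailers
// (confirmed behavior) and "Generated with" lines (assumed) from a commit message.
function stripAttribution(commitMessage: string): string {
  return commitMessage
    .split("\n")
    .filter(
      (line) =>
        !/^\s*co-authored-by:/i.test(line) && // standard Git trailer
        !/generated with/i.test(line),        // assumed extra pattern
    )
    .join("\n")
    .trimEnd();
}
```

The result is a commit that carries no machine-readable trace of AI involvement, which is precisely what the critics object to.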
### The two sides of the argument
Defenders argue that Undercover Mode is a leak-prevention tool. Anthropic employees working on upstream open-source dependencies (like the Bun runtime, which Anthropic acquired) shouldn’t accidentally expose internal codenames like “Capybara” or “Fennec” in public commit logs. Stripping internal project names from external contributions is basic operational security — and plenty of companies do something similar manually.
Critics counter that the system goes beyond preventing internal leaks. The explicit instruction to never mention being an AI — and the active removal of Co-Authored-By attribution — means that Anthropic’s AI-generated code contributions to open-source projects are designed to be indistinguishable from human-written code. This is especially sensitive given that several major open-source projects (including the Linux kernel) have policies restricting or banning AI-generated contributions. If Claude Code is writing PRs that are reviewed and merged without reviewers knowing AI was involved, it undermines the trust model that open-source collaboration depends on.
Anthropic built Undercover Mode specifically to prevent internal information from leaking into external contexts — and then leaked everything through a .npmignore oversight. As one viral comment put it: nothing says “agentic future” like shipping the source by accident.
## What are the 44 feature flags and what do they reveal about Anthropic’s roadmap?
The leaked source contains 44 compile-time feature flags that gate unreleased capabilities. Feature flags themselves are standard engineering practice — the insight here is what Anthropic is building behind them. The most significant flags, based on community analysis and multiple independent code reviews:
| Flag | Capability | Status |
|---|---|---|
| KAIROS | Persistent always-on autonomous agent — runs as a background daemon, maintains daily logs, proactively acts on observations with a 15-second blocking budget | Fully built, compile-gated |
| COORDINATOR_MODE | Multi-agent orchestration — spawns parallel worker agents communicating via XML | Partially released |
| VOICE_MODE | Push-to-talk voice interface with dedicated CLI entry point | Built, not shipped |
| ULTRAPLAN | 30-minute remote planning sessions with a dedicated Claude instance | Built, not shipped |
| BUDDY | Tamagotchi-style terminal pet with 18 species, rarity tiers, and stats like CHAOS and SNARK | Built, internal comments suggest April teaser |
| ANTI_DISTILLATION_CC | Injects fake tool definitions to poison model distillation attempts | Active in first-party sessions |
| NATIVE_CLIENT_ATTESTATION | Zig-level binary hash verification to prevent unauthorized Claude Code forks | Active in official Bun binary |
As one Hacker News commenter noted, the feature flag names alone are more strategically damaging than the code itself. KAIROS reveals Anthropic’s autonomous agent strategy, the model codenames reveal the Capybara product roadmap, and the anti-distillation mechanisms confirm that Anthropic views output-based model cloning as a real competitive threat. You can refactor code in a week — you cannot un-leak a roadmap.
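Compile-time gating of this kind is conventionally a constant object that bundlers can dead-code-eliminate. A minimal sketch, using flag names from the table but with illustrative values and an invented helper:

```typescript
// Illustrative compile-time gate: flag names from the leak, values invented.
const FLAGS = {
  KAIROS: false,           // autonomous background agent, compile-gated
  VOICE_MODE: false,       // built, not shipped
  COORDINATOR_MODE: true,  // partially released
} as const;

// Runtime check; in the false case, bundlers can strip the guarded branch.
function isEnabled(flag: keyof typeof FLAGS): boolean {
  return FLAGS[flag];
}
```

When a flag constant is false at build time, the bundler removes the guarded branch entirely — which is how a feature can be "fully built" in the source yet absent from shipped behavior until the constant flips.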
## How do these controls compare to other AI coding tools?
The natural question is: does everyone do this? The answer is nuanced. Google’s Gemini CLI and OpenAI’s Codex CLI are open source, but those are deliberate releases: agent toolkits and SDKs, not the full internal wiring of a flagship commercial product. The Claude Code leak exposed the production implementation, not a public-facing API wrapper.
On telemetry specifically, most developer tools collect usage data. VS Code, JetBrains IDEs, and GitHub Copilot all have telemetry systems. The difference with Claude Code is the combination of: filesystem and terminal access, remote configuration changes without explicit user consent, frustration-tracking that monitors emotional state, and a stealth mode that actively hides the tool’s involvement in code contributions.
No other major AI coding tool has a documented equivalent of Undercover Mode. AI agents in general are moving toward greater autonomy, but the question of attribution — whether AI involvement should be disclosed — remains an open policy debate. Anthropic’s internal answer, as revealed by the leaked code, is: no, at least not when Anthropic employees are the users.
## What should developers using Claude Code actually do?
The practical implications depend on your threat model. For most individual developers, Claude Code’s telemetry is comparable to other cloud-connected tools. But if you’re working on sensitive codebases or in regulated industries, several steps are worth considering.
First, understand what runs on your machine. The leaked source confirms that Claude Code makes hourly outbound connections to Anthropic’s settings endpoint. If you use a DNS-level monitoring tool like Pi-hole, you can observe this traffic directly.
Second, use the available opt-out mechanisms. Setting CLAUDE_CODE_DISABLE_AUTO_MEMORY=1 disables all memory and telemetry write operations. Running with --bare strips both memory and the autoDream background process entirely. You can also reroute API calls to a private endpoint using ANTHROPIC_BASE_URL.
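How those opt-outs combine can be sketched as a small predicate. The environment variable and flag names are the ones quoted above, but the precedence logic here is an assumption:

```typescript
// Sketch of opt-out resolution, assuming the documented env var and --bare
// flag; the precedence order is illustrative, not confirmed by the leak.
function telemetryEnabled(
  env: Record<string, string | undefined>,
  bareMode: boolean,
): boolean {
  if (bareMode) return false; // --bare strips memory and telemetry entirely
  if (env["CLAUDE_CODE_DISABLE_AUTO_MEMORY"] === "1") return false;
  return true;
}
```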
Third, if your work must stay completely private, consider running open-weight models locally via Ollama or LM Studio. No telemetry, no remote killswitch, no hourly polling. The trade-off is reduced model capability, but for privacy-sensitive work, that trade-off is often worth making.
A separate supply chain attack on the axios npm package occurred in the same time window as the leak. If you installed or updated Claude Code via npm on March 31 between 00:21 and 03:29 UTC, check your lockfiles for axios versions 1.14.1 or 0.30.4. If found, treat the machine as fully compromised.
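Checking a lockfile for the affected versions can be automated. The sketch below assumes the package-lock.json v2/v3 layout, where installed packages are keyed by their node_modules path:

```typescript
// Scan a parsed package-lock.json (v2/v3 "packages" layout) for the axios
// versions flagged in the supply chain incident.
const COMPROMISED_AXIOS = new Set(["1.14.1", "0.30.4"]);

function findCompromisedAxios(lock: {
  packages?: Record<string, { version?: string }>;
}): string[] {
  const hits: string[] = [];
  for (const [path, meta] of Object.entries(lock.packages ?? {})) {
    // Matches both top-level and nested axios installs.
    if (
      path.endsWith("node_modules/axios") &&
      meta.version !== undefined &&
      COMPROMISED_AXIOS.has(meta.version)
    ) {
      hits.push(`${path}@${meta.version}`);
    }
  }
  return hits;
}
```

A non-empty result means the machine should be treated as compromised, per the guidance above.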
## What does this mean for AI transparency in open source?
The Undercover Mode revelation lands at a particularly sensitive moment for AI and open-source collaboration. The Linux kernel has restrictions on AI-generated contributions. Several major projects require Co-Authored-By attribution for AI-assisted code. The broader AI ecosystem is actively debating disclosure norms.
Anthropic positions itself as the safety-focused AI company — the one that publishes detailed model specs, invests in interpretability research, and emphasizes responsible deployment. Undercover Mode complicates that narrative. Not because stealth contributions are inherently malicious (there are legitimate operational security reasons to strip internal codenames), but because the explicit instruction to never disclose AI involvement goes beyond what most people would consider “responsible” in the context of open-source collaboration.
The OpenAI Model Spec explicitly addresses disclosure norms. The EU AI Act requires transparency about AI-generated content in certain contexts. Whether Undercover Mode technically violates any of these frameworks depends on interpretation — but it certainly sits in tension with the spirit of transparency that both regulatory bodies and open-source communities are trying to establish.
The cat is out of the bag. The source code has been mirrored, analyzed, rewritten in Python and Rust, and studied by tens of thousands of developers. Anthropic has filed DMCA takedowns that briefly removed over 8,000 copies from GitHub. But as one analyst observed: DMCA works on centralized platforms, and the code has already spread to places that are harder to reach. The strategic damage isn’t the code — it’s the questions the code forces us to ask about how much control we hand to the tools we build with.
## Bibliography
VentureBeat. (2026). Claude Code Source Leak Analysis. VentureBeat. https://venturebeat.com/technology/claude-codes-source-code-appears-to-have-leaked-heres-what-we-know
The Hacker News. (2026). Claude Code npm Packaging Error Confirmation. The Hacker News. https://thehackernews.com/2026/04/claude-code-tleaked-via-npm-packaging.html
Kim, A. (2026, March 31). Claude Code Source Leak: Technical Deep Dive. Alex000kim.com. https://alex000kim.com/posts/2026-03-31-claude-code-source-leak/
The Register. (2026). Claude Code Telemetry and Privacy Analysis. The Register. https://www.theregister.com/2026/04/01/claude_code_source_leak_privacy_nightmare/
BleepingComputer. (2026). Claude Code Source Code Leak and axios RAT Analysis. BleepingComputer. https://www.bleepingcomputer.com/news/artificial-intelligence/claude-code-source-code-accidentally-leaked-in-npm-package/
Kuberwastaken. (2026). Claurst Repository (Undercover Mode System Prompt). GitHub. https://github.com/Kuberwastaken/claurst
GrowthBook. (2026). Feature Flagging Platform. GrowthBook. https://www.growthbook.io/
European Union. (2026). Artificial Intelligence Act (EU AI Act). ArtificialIntelligenceAct.eu. https://artificialintelligenceact.eu/