Archives: Cybersecurity

Project Glasswing: Anthropic’s $100M Cyber Defense Push

2026-04-08 by Ignacy

Last updated: April 2026 · By DecodeTheFuture.org

Project Glasswing is an industry-wide cybersecurity initiative announced by Anthropic in April 2026, uniting AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Partners get access to Claude Mythos Preview, an unreleased frontier model that has already found thousands of … Read more

OpenAI, Google, Anthropic vs Chinese AI Distillation 2026

2026-04-07 by Ignacy

Last updated: April 2026 · Reading time: ~9 min · By Ignacy Kwiecień

OpenAI, Google and Anthropic are sharing threat intelligence through the Frontier Model Forum to detect adversarial distillation: automated query attacks where Chinese labs extract outputs from frontier US models and use them to train cheaper copycat systems. Anthropic has documented over … Read more

AI Agent Traps: 6 Attack Types Hijacking AI Agents in 2026

2026-04-06 by Ignacy

Last updated: April 2026

AI Agent Traps are adversarial content elements embedded in websites, documents, and APIs, engineered to manipulate, deceive, or hijack autonomous AI agents navigating the open web. A March 2026 Google DeepMind paper introduced the first systematic framework, identifying 6 attack categories that target perception, reasoning, memory, action, multi-agent dynamics, and … Read more

EU AI Act Explained: 7 Risk Tiers, Penalties & 2026 Timeline

2026-04-02 by Ignacy

Last updated: April 2026

The EU AI Act (Regulation 2024/1689) is the world’s first comprehensive AI law. It classifies AI systems into four risk tiers (unacceptable, high, limited, and minimal), with fines up to €35 million or 7% of global annual turnover. As of April 2026, prohibited practices and GPAI model rules are … Read more

Claude Code Undercover Mode, Killswitches & Telemetry: 6 Hidden Controls Exposed

2026-04-02 by Ignacy

Last updated: April 2026

The Claude Code source leak revealed that Anthropic’s coding tool contacts its servers every hour to poll remote settings, including 6+ killswitches that can force-quit the application, bypass permission prompts, or toggle features without user-initiated updates. The code also contains “Undercover Mode,” a stealth protocol that strips all AI attribution … Read more

Claude Code Source Leak 2026: The Complete Guide to What Was Exposed

2026-04-01 by Ignacy

Last updated: March 2026

On March 31, 2026, a 57 MB source map file shipped inside the @anthropic-ai/claude-code npm package exposed the entire TypeScript source code of Claude Code: 1,900 files and 512,000+ lines of code. The leak revealed unreleased features (KAIROS always-on agent, autoDream memory consolidation, ULTRAPLAN, Buddy System), future model codenames (Opus … Read more

Axios npm Attack: RAT Hits 100M-Download Package on Claude Code Leak Day

2026-03-31 by Ignacy

Last updated: March 2026

On March 31, 2026, attackers compromised the npm account of Axios’s lead maintainer and published two malicious versions (1.14.1 and 0.30.4) of the HTTP client library used by over 100 million projects weekly. The poisoned packages silently installed a cross-platform Remote Access Trojan (RAT) via a hidden dependency called plain-crypto-js. The … Read more

OpenAI Model Spec Explained: 5 Rules That Shape ChatGPT

2026-03-28 by Ignacy

Last updated: March 2026

The OpenAI Model Spec is a ~100-page public document that defines how ChatGPT and OpenAI API models should behave. It establishes a hierarchical chain of command (root rules that can never be overridden, system-level instructions from developers, and user-level preferences) to resolve conflicts between safety, helpfulness, and user freedom. … Read more

What Is Claude Mythos? 7 Facts About Anthropic’s Leaked AI

2026-03-28 by Ignacy

Last updated: March 28, 2026

Claude Mythos is Anthropic’s most powerful AI model to date, accidentally revealed on March 27, 2026 through a CMS misconfiguration that exposed ~3,000 unpublished assets. The model introduces a new tier called Capybara, above Opus, with substantially higher benchmark scores in coding, reasoning, and cybersecurity. Anthropic has confirmed … Read more