🗞️ AI Daily Briefing — 2026-04-13

🔥 Top Story

Anthropic’s Project Glasswing goes live: Claude Mythos Preview finds thousands of zero-days across every major OS and browser. Anthropic launched Project Glasswing this week, giving partners including AWS, Apple, Google, Microsoft, and Cisco access to Claude Mythos Preview, a model so capable at cybersecurity that Anthropic won’t release it publicly. Mythos autonomously discovered and exploited a 17-year-old FreeBSD RCE vulnerability (CVE-2026-4747), chained four browser exploits including a JIT heap spray that escaped both renderer and OS sandboxes, and found bugs as old as 27 years in OpenBSD. Anthropic is committing $100M in compute credits to the initiative, plus $4M to the Linux Foundation and the Apache Software Foundation. The debate is already raging: is this responsible disclosure at scale, or the most dangerous AI capability demo yet? (Anthropic) (TechCrunch) (The Hacker News)

🚀 Model & Research News

  • Claude Opus 4.6 Thinking takes #1 on Arena leaderboard: Anthropic now holds the top two global spots on LMArena with Opus 4.6 Thinking (1504 Elo) and Opus 4.6. It’s also #1 on coding with a specialized score of 1549. The top six models are separated by only 20 Elo points — the frontier is a knife fight. (Arena.ai) (AI Dev Day India)
  • NVIDIA drops a full robotics stack for National Robotics Week: Isaac GR00T N1.7 (early access, commercial license) enables natural language instruction for multistep robot tasks. Newton 1.0 physics engine hits GA (co-developed with Google DeepMind and Disney Research, open-source under Linux Foundation). Isaac Sim 6.0 and Isaac Lab 3.0 also GA. Jim Fan’s GEAR lab continues pushing the embodied AI frontier. (NVIDIA Blog) (NVIDIA Newsroom)
  • Google’s TurboQuant heading to ICLR 2026 in Rio: The training-free algorithm compresses KV cache to ~3.5 bits per value for 6x memory reduction with near-zero quality loss. On H100s, that’s up to 8x speedup on attention computation. No calibration data needed. Official Google implementation expected Q2 2026. (Google Research) (TechCrunch)
  • ARC-AGI-3 humbles every frontier model: François Chollet’s new interactive benchmark — hundreds of handcrafted turn-based environments with no instructions, no rules, no stated goals — launched March 25 at YC HQ with a fireside chat between Chollet and Sam Altman. Humans score 100%. The best frontier AI? 0.37% (Gemini 3.1 Pro). $2M+ in prizes now live. (ARC Prize) (Fast Company)
  • OpenAI enterprise AI push: Codex hit 3M weekly active users, APIs process 15B+ tokens/minute, enterprise is 40%+ of revenue and on track for parity with consumer by year-end. OpenAI is now at $25B annualized revenue; Anthropic approaching $19B. (OpenAI)

🛠️ Tools & Developer Updates

  • LangSmith Fleet + Sandboxes + NVIDIA partnership: LangChain rebranded Agent Builder to LangSmith Fleet, adding agent identity, sharing, and permissions. Sandboxes (private preview) give agents locked-down temporary environments. LangChain also announced an enterprise agentic AI platform built with NVIDIA. Harrison Chase is hosting an AI Agents Workshop in NYC on April 16. (LangChain Blog)
  • Google Gemma 4 continues gaining traction: The Apache 2.0 open model family (2B–31B), with 256K context, native vision+audio, and 140+ languages, is seeing rapid adoption. The 31B dense model’s benchmark jumps are dramatic: AIME math from 20.8% to 89.2%, LiveCodeBench from 29.1% to 80.0%. Over 400M total Gemma downloads. (Google Blog) (Hugging Face)
  • Anthropic MCP at 97M installs: The Model Context Protocol is now foundational infrastructure with 5,800+ servers in production. Every major AI provider ships MCP-compatible tooling. The protocol war is definitively over. (Dev.to)

💰 Funding & Business

  • Q1 2026: $300B in global VC — AI ate 80% of it: Investors poured $300 billion into 6,000 startups in Q1, smashing all records. AI startups captured $242B (80%). OpenAI, Anthropic, xAI, and Waymo alone absorbed 65% of all global VC. Foundational AI startup funding in Q1 was double all of 2025. (Crunchbase)
  • SpaceX-xAI merger sets up $1.75T IPO: The all-stock deal (SpaceX at $1T, xAI at $250B) closed in February. Musk is targeting a July 2026 IPO that would be the largest in history. The combined entity plans orbital data centers powered by space-based solar to feed xAI’s compute demands. Meanwhile, xAI is “rebuilding” after a wave of co-founder departures. (Bloomberg) (CNBC)
  • Meta’s Muse Spark fallout continues: A week after launch, Meta’s first closed-source model (built by Alexandr Wang’s Superintelligence Labs) claims benchmark parity with GPT-5.4 and Claude Sonnet 4.6. The shift away from the open-source Llama strategy is generating heated debate. (CNBC)

🐦 Notable from the Timeline

  • New Yorker bombshell still reverberating: The Ronan Farrow / Andrew Marantz investigation into Sam Altman (100+ interviews, Ilya Sutskever’s 70-page secret memos, Dario Amodei’s notes) alleging a “consistent pattern of lying” continues to dominate AI discourse. Hours after publication, OpenAI announced a new Safety Fellowship program. (Semafor)
  • @alexandr_wang: Muse Spark is Wang’s first major deliverable at Meta. The closed-source pivot is a statement — Scale AI’s influence now shapes both Meta’s frontier models and its data infrastructure.
  • @DrJimFan / NVIDIA GEAR lab: National Robotics Week showcase with Isaac GR00T N1.7, Newton 1.0, and the full sim-to-real pipeline. The embodied AI stack is maturing fast.
  • @fchollet: ARC-AGI-3 launch is the most dramatic benchmark reset in years — frontier models scoring under 0.4% while humans hit 100% is a powerful statement about the gap between memorization and genuine reasoning.
  • @hwchase17: LangSmith Fleet rebrand + NVIDIA enterprise partnership signals LangChain is going all-in on managed agent infrastructure. NYC workshop April 16, Google Cloud Next April 22-24.

📊 Benchmark Watch

  • LMArena Text: Claude Opus 4.6 Thinking at #1 (1504 Elo), Gemini 3.1 Pro Preview at #3 (1493 Elo), and Grok 4.20 Beta1 at #4 (1491 Elo). Top six separated by 20 Elo points. 5.78M+ votes across 339 models. (Arena.ai)
  • ARC-AGI-3: Every frontier model below 0.4%. Humans at 100%. This is the new “are we AGI yet?” yardstick. (ARC Prize)
  • ARC-AGI-2: Gemini 3.1 Pro at 77.1%, Gemini 3 Deep Think at 84.6% (current best). Rapid progress here, but ARC-AGI-3 just moved the goalposts dramatically. (HumAI)
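
To put the 20-point Elo spread in the LMArena entry above in perspective, here is a minimal sketch of the textbook Elo expected-score formula. It assumes Arena ratings can be read on the standard 400-point logistic Elo scale, which is an assumption rather than something the leaderboard entry states; the function name is illustrative.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the textbook Elo model.

    Assumption: ratings sit on the usual 400-point logistic scale.
    The result is a win probability if ties are ignored.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    top_rating = 1504  # Claude Opus 4.6 Thinking, per the entry above
    rivals = [
        ("Gemini 3.1 Pro Preview (#3)", 1493),
        ("Grok 4.20 Beta1 (#4)", 1491),
        ("hypothetical model 20 points back", 1484),
    ]
    for name, rating in rivals:
        p = elo_expected_score(top_rating, rating)
        print(f"#1 vs {name}: expected score {p:.3f}")
```

Under that assumption, a 20-point gap works out to roughly a 53% expected win rate for the higher-rated model, which is why the top six are so hard to separate.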

🎙️ Podcast Highlights

  • All-In E267 (April 10): “Anthropic’s $30B Ramp, Mythos Doomsday, OpenClaw Ankled, Iran War Ceasefire, Israel’s Influence” — the besties dig into the Mythos cybersecurity implications, Anthropic’s growth trajectory, compute costs, and AI investment concentration. (All-In)
  • Lex Fridman #490: “State of AI in 2026” deep-dive with Nathan Lambert (Ai2) and Sebastian Raschka — covering LLMs, coding agents, scaling laws, China’s AI ecosystem, and the path to AGI. (Spotify)

🔗 Worth Reading

  • Anthropic’s Project Glasswing technical writeup — What Mythos actually found, how it works, and the responsible disclosure framework. The most important AI security document of the year. (Anthropic)
  • “GPT-5, Claude, Gemini All Score Below 1%” — ARC-AGI-3 analysis — Why the new interactive benchmark breaks every frontier model and what it reveals about the memorization vs. reasoning gap. (Dev.to)
  • Google TurboQuant deep-dive: How ~3.5-bit KV cache compression achieves 6x memory reduction without retraining. The “Pied Piper of AI,” as TechCrunch calls it. (Google Research)