Sarmadi AI Digest July 4, 2026 Updated 7:00 AM CT Today Archive Topics Saved Subscribe RSS

Anthropic wants to develop its own drugs; DeepMind + A24 sign a research pact; CVE spike around Mythos Preview

Anthropic said it wants to develop its own drugs — the biggest extension of Claude Science's flagship-workflow bet since Tuesday's launch. Google DeepMind and A24 announced a first-of-its-kind research partnership, converting last month's $75M investment into a joint research surface. Epoch AI documented a spike in serious CVEs around the Claude Mythos Preview release — quantitative evidence that new frontier-model releases now measurably move the vulnerability curve. Google DeepMind unionization talks got off to a rocky start (Wired). The Verge kept the Midjourney medical-scanner story alive with a behind-the-scenes look that still leaves the clinical claims unanswered. A holiday-quiet news day with heavy weight on research papers around agentic capability evaluation, on-device memory agents, and autonomous policy evolution.

8 papers 7 news 6 sources ← Latest

News

7 items

Anthropic wants to develop its own drugs; DeepMind + A24 pair up

Anthropic told The Verge it wants to develop its own drugs — the most ambitious extension of Claude Science since Tuesday's launch. Google DeepMind and A24 announced a first-of-its-kind research partnership, operationalizing last month's $75M Google-A24 investment into a joint research pipeline.

News The Verge AI

Anthropic wants to develop its own drugs

The Verge: Anthropic is now saying it wants to develop its own drug candidates on top of the Claude Science workbench.

Why it matters
  • Largest extension of Claude Science since Tuesday's launch — moves from workflow-provider to drug-developer.
  • Direct challenge to Isomorphic Labs and the Google-DeepMind therapeutics angle.
  • Materially expands Anthropic's addressable market beyond dev tools and inference.
  • Sets a new frontier-lab template: sell the workbench, then use it yourself.
News Google DeepMind

Google DeepMind and A24 announce first-of-its-kind research partnership

DeepMind and A24 announce a first-of-its-kind research partnership — operationalizing the $75M investment into a joint pipeline.

Why it matters
  • Converts last month's $75M A24 investment into an actual research surface.
  • First frontier-lab / independent-studio structured R&D partnership on media.
  • Sets the counter-precedent to the Hollywood-bows-to-OpenAI story from June 24.

CVE spike around Mythos Preview; DeepMind union talks rocky

Epoch AI published data showing new serious vulnerabilities spike around the Claude Mythos Preview release (114 HN) — quantitative evidence that frontier releases measurably move the vulnerability curve. Wired: Google DeepMind's unionization talks are off to a rocky start.

News Hacker News

New serious vulnerabilities spiked around release of Claude Mythos Preview

Epoch data (114 HN points): CVE severity spikes correlate with the Claude Mythos Preview release window.

Why it matters
  • First quantitative evidence that frontier-model releases move the vulnerability curve.
  • Directly relevant to the Anthropic-Alibaba fight and Wired's music-festival Claude story.
  • Names a new procurement question: what does release timing do to your attack surface?

Midjourney's medical scanner still unclear; browser-war alternatives

The Verge published a behind-the-scenes look at Midjourney's medical ultrasound scanner that leaves the substantive clinical claims unanswered. TC ran an alt-browser guide framing the post-Chrome/Safari agentic-browser wave. TC also published a general AI-glossary explainer for readers.

Papers

5 items

PACE, AgenticSTS, EvoPolicyGym; on-device memory agents

Holiday HF list is heavy on agent-evaluation infrastructure: PACE proxies for agentic capability, AgenticSTS provides a bounded-memory long-horizon testbed, and EvoPolicyGym evaluates autonomous policy evolution. AutoMem and DuoMem push on-device memory as a first-class agent skill.

Paper Hugging Face

PACE: A Proxy for Agentic Capability Evaluation

PACE — a proxy metric for agentic capability that correlates with expensive downstream benchmarks.

Why it matters
  • Cheap-proxy metric for agent capability — practical for the Arena / Patronus / Semgrep-style evaluators.
  • Lands the same week as the Zuckerberg-agents-slow story — measurable proxies now matter for internal roadmap conversations.

Also today