Sarmadi AI Digest, May 24, 2026. Updated 7:00 AM CT.

DeepSeek ships a coding agent; memory eats AI chip costs

DeepSeek followed its permanent price cut with Reasonix, a native coding agent built around high caching and low cost — the open-weight frontier is now competing on agent tooling, not just model price. The compute-economics story sharpened: new data shows memory has grown to nearly two-thirds of AI chip component costs, reframing the supply bottleneck from logic to DRAM and HBM. Agent reliability stayed under scrutiny, with a study on 'constraint decay' showing how LLM agents degrade on backend code generation as requirements accumulate. Google acknowledged it is navigating AI security in real time, and consumer AI kept getting stranger — an Amazon Bee wearable and a fleet of meal-making robots in San Francisco's Tenderloin. On the research side, agent-skill work is maturing from hand-crafted prompts toward a systematic discipline.

8 papers, 6 news, 4 sources

News

6 items

DeepSeek's coding push and the memory bottleneck

DeepSeek shipped Reasonix, a native coding agent optimized for caching and cost, extending its price-led pressure into agent tooling. In parallel, new data shows memory has grown to nearly two-thirds of AI chip component costs — the hardware bottleneck is shifting from logic to memory, which reshapes where the next capacity constraint bites.

News Hacker News

DeepSeek Reasonix: DeepSeek native coding agent with high caching and low cost

DeepSeek released Reasonix, a native coding agent built around aggressive caching and low cost (445 HN points).

Why it matters
  • Extends DeepSeek's price advantage into the coding-agent layer where Claude Code and Codex compete.
  • Caching-first design directly targets the token cost that dominates agentic coding bills.
  • Open-weight coding agents pressure the subscription pricing of incumbent tools.
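
To make the caching point concrete, here is a minimal sketch of how a cache hit rate changes the blended price of input tokens. The prices and the function name are hypothetical placeholders chosen for illustration; they are not DeepSeek's or Reasonix's actual pricing, which the item above does not state.

```python
def blended_input_cost(tokens, hit_rate, miss_price, hit_price):
    """Return the dollar cost of `tokens` input tokens.

    Prices are in dollars per million tokens. `hit_rate` is the fraction
    of input tokens served from the prompt cache (0.0 to 1.0).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    per_million = hit_rate * hit_price + (1.0 - hit_rate) * miss_price
    return tokens / 1_000_000 * per_million


# Hypothetical prices, for illustration only (not real vendor pricing).
MISS_PRICE = 1.00  # dollars per million uncached input tokens
HIT_PRICE = 0.10   # dollars per million cached input tokens

if __name__ == "__main__":
    # An agent session that re-sends a large, mostly unchanged context.
    session_tokens = 50_000_000
    for rate in (0.0, 0.5, 0.9):
        cost = blended_input_cost(session_tokens, rate, MISS_PRICE, HIT_PRICE)
        print(f"hit rate {rate:.0%}: ${cost:.2f}")
```

Under these assumed prices, the blended rate falls roughly linearly with the hit rate, which is why an agent that repeatedly re-sends the same context benefits most from a caching-first design.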
News Hacker News

Memory has grown to nearly two-thirds of AI chip component costs

Epoch data shows memory now accounts for nearly two-thirds of AI chip component costs — the bottleneck is shifting from logic to DRAM/HBM (299 HN points).

Memory share of chip cost: ~2/3
Why it matters
  • Reframes the compute-supply story: memory, not logic, is the binding cost.
  • Explains the wave of KV-cache and quantization research aimed at the memory wall.
  • Has direct pricing implications for anyone forecasting inference cost curves.
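
As a rough illustration of why KV-cache and quantization work targets memory, the sketch below estimates the KV-cache footprint of a standard transformer decoder. The model dimensions are hypothetical and do not describe any specific model; the formula assumes one key and one value tensor per layer, as in ordinary multi-head or grouped-query attention.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len,
                   batch=1, bytes_per_value=2):
    """Estimate KV-cache size in bytes.

    The leading factor of 2 counts one key tensor and one value tensor
    per layer. `bytes_per_value` is 2 for 16-bit values, 1 for 8-bit.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value


if __name__ == "__main__":
    # Hypothetical model shape, for illustration only.
    shape = dict(layers=32, kv_heads=8, head_dim=128, seq_len=32_768)

    fp16 = kv_cache_bytes(**shape, bytes_per_value=2)
    int8 = kv_cache_bytes(**shape, bytes_per_value=1)

    gib = 1024 ** 3
    print(f"16-bit cache: {fp16 / gib:.1f} GiB per sequence")
    print(f"8-bit cache:  {int8 / gib:.1f} GiB per sequence")
```

Because the footprint scales linearly with sequence length and batch size, halving the bytes per value halves the memory each concurrent request occupies, which matters more as memory takes a larger share of chip cost.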

Agent reliability under load

A widely shared study on 'constraint decay' shows LLM agents degrade on backend code generation as requirements pile up, and two papers push agent-skill construction from one-shot prompts toward a systematic, deep-learning-like discipline. The common thread: agents work in demos but strain under accumulated real-world constraints.

AI out in the physical and security world

Google acknowledged it is figuring out AI security in real time, an Amazon Bee wearable drew the now-standard intrigue-and-unease reaction, and a fleet of robots began making meals for a San Francisco nonprofit. The consumer and physical edge of AI keeps advancing faster than the norms around it.

Papers

5 items

Text-to-image efficiency and a scaling rethink

Lens shows a 3.8B text-to-image model matching or beating much larger systems on a tighter training budget, PiD speeds high-resolution latent decoding, and a Shannon-channel view of LLMs tries to explain the non-monotonic scaling behavior that simple power laws miss.

Also today