Sarmadi AI Digest, May 22, 2026 (updated 7:00 AM CT)

DeepSeek makes its V4 Pro price cut permanent; agents move onto real tools

DeepSeek made its V4 Pro price discount permanent, the day's clearest market signal — sustained price pressure from the open-weight frontier keeps compressing what proprietary labs can charge. The research feeds pushed agents toward the surfaces businesses actually use: Spreadsheet-RL trains agents on real Excel and Sheets tasks, TerminalWorld reverse-engineers evaluation from in-the-wild terminal work, and π-Bench measures proactive personal-assistant agents over long horizons. Clinical agents got more rigorous with multimodal evidence-seeking and clinical-event prediction. Efficient attention and KV serving stayed busy — full-to-sparse attention transfer in a hundred training steps, a second-generation gated DeltaNet, and service-aware cache compression for disaggregated serving.

15 papers, 1 news item, 2 sources

News

1 item

DeepSeek resets the price floor

DeepSeek made its V4 Pro price discount permanent. Open-weight frontier pricing keeps dropping, and each permanent cut tightens the band proprietary labs can charge for comparable capability — a direct tailwind for cost-sensitive SMB builders.

News Hacker News

DeepSeek makes the V4 Pro price discount permanent

DeepSeek converted its temporary V4 Pro price discount into permanent pricing, locking in another cut to frontier-grade API costs (534 HN points).

Why it matters
  • Sustained open-weight price pressure compresses what OpenAI, Anthropic, and Google can charge for equivalent tiers.
  • Lowers the per-token cost floor for agentic workloads where volume dominates the bill.
  • Strengthens the case for multi-model routing with DeepSeek on the cheap tier.
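
A minimal sketch of the routing idea in the last bullet, assuming each agent step arrives with a difficulty estimate between 0 and 1. Everything here is hypothetical: the tier names, the per-million-token prices, and the threshold are illustrative placeholders, not DeepSeek's or any other provider's actual pricing or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                # hypothetical label, not a real model identifier
    usd_per_mtok_in: float   # placeholder price per million input tokens
    usd_per_mtok_out: float  # placeholder price per million output tokens


# Placeholder prices for illustration only; substitute current published rates.
CHEAP = ModelTier("cheap-open-weight", 0.50, 1.00)
PREMIUM = ModelTier("premium-proprietary", 5.00, 15.00)


def cost_usd(tier: ModelTier, tokens_in: int, tokens_out: int) -> float:
    """Cost of one call in USD at the tier's per-million-token prices."""
    total = tokens_in * tier.usd_per_mtok_in + tokens_out * tier.usd_per_mtok_out
    return total / 1_000_000


def route(difficulty: float, threshold: float = 0.7) -> ModelTier:
    """Send a step to the premium tier only when its estimated difficulty is high."""
    return PREMIUM if difficulty >= threshold else CHEAP


def run_cost(steps, threshold: float = 0.7) -> float:
    """Total cost of a run; each step is (difficulty, tokens_in, tokens_out)."""
    return sum(cost_usd(route(d, threshold), tin, tout) for d, tin, tout in steps)


if __name__ == "__main__":
    # One easy step and one hard step, each using 1M input and 1M output tokens.
    steps = [(0.2, 1_000_000, 1_000_000), (0.9, 1_000_000, 1_000_000)]
    print(f"routed:      ${run_cost(steps):.2f}")
    print(f"all premium: ${run_cost(steps, threshold=0.0):.2f}")
```

With these placeholder numbers, sending only the hard step to the premium tier costs less than sending both. The actual savings depend on real prices and on how well the difficulty estimate separates the steps that need the stronger model.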

Papers

11 items

Agents move onto real business tools

Three papers pull agent training and evaluation onto the surfaces businesses actually run on: spreadsheets, terminals, and proactive long-horizon assistance. The shared move is reverse-engineering tasks from real-world usage rather than synthetic sandboxes, the same controlled-to-realistic shift seen all week.

Paper Hugging Face

Spreadsheet-RL: Advancing Large Language Model Agents on Realistic Spreadsheet Tasks via Reinforcement Learning

Trains agents with RL on realistic Excel and Google Sheets tasks — the data-work surface most businesses actually live in.

Why it matters
  • Spreadsheets are the highest-leverage, least-glamorous automation target for SMBs.
  • RL on realistic tasks beats prompt-only approaches on the messy operations that matter.
  • Direct relevance to finance, ops, and analytics teams evaluating agent tooling.

Clinical agents get more rigorous

Healthcare AI research moved past assuming evidence is handed to the model: ClinSeekAgent automates multimodal evidence-seeking for clinical reasoning, and a separate effort trains LLMs to predict clinical events from longitudinal notes. Both respond to the deployment-risk concerns that have dominated the field this month.

Attention and serving keep getting cheaper

Efficiency work continued across the stack: converting full attention to sparse in a hundred training steps, a second-generation gated DeltaNet that decouples erase and write in linear attention, and service-aware KV-cache compression for disaggregated serving.

Sharper credit assignment for RLVR

Two papers continue the week's refinement of reinforcement learning with verifiable rewards (RLVR): discriminative token-level credit assignment and unsupervised process reward models. Both aim to give training a signal more precise than one reward per rollout.

Also today