Sarmadi AI Digest May 13, 2026 Updated 6:50 AM CT

Google rebrands Android around AI; OpenAI trial keeps unraveling

Google used its pre-I/O Android showcase to rebuild the platform around AI — Googlebooks laptops, Gemini Intelligence across the phone, dictation in Gboard, and a 'Create My Widget' feature that turns vibe-coded UI into a system primitive. The OpenAI trial dominated the news cycle in parallel: Altman testified that Musk did 'huge damage' to the company and once floated handing it to his children, while Sutskever defended the 2023 ouster from the stand. Compute geography keeps escalating — Google and SpaceX are now reportedly in talks for orbital data centers, xAI is adding 19 gas turbines despite litigation, and one pitch is to host mini data centers inside private homes. Underneath, agent research is moving the safety frame from prompt to trajectory: papers on hidden multi-turn intent, on-policy self-evolution from failure trajectories, and privacy-aware device-cloud collaboration treat the agent's whole run as the alignment target. A 26M-parameter distillation of Gemini tool-calling, viral on Hacker News, is a useful reminder that the small-model frontier is moving faster than the headline-grabbing one.

12 papers 18 news 11 sources

News

16 items

Google rewrites Android around AI

Google's pre-I/O Android showcase reframed the platform around Gemini: Googlebooks laptops, on-device dictation, an agentic phone control layer, and 'Create My Widget' vibe-coding for the home screen. The pitch is a vertically integrated AI-first OS that competes on UX, not benchmarks.

News TechCrunch AI

Everything Google announced at its Android Show, from Googlebooks to vibe-coded widgets

Google revealed AI-first Googlebooks laptops, agentic Gemini phone control, Gemini in Chrome, vibe-coded Android widgets, and a deeper Gboard dictation integration.

Why it matters
  • Repositions Android from a Gemini-enabled OS to an AI-native one — UX is now the differentiator vs Apple Intelligence.
  • Pushes vibe-coding from a developer trend into a consumer-visible primitive (Create My Widget).
  • Tightens Google's grip on the dictation, transcription, and on-device assistant categories.

Agent safety moves to the trajectory level

Three papers and one tragic news story converge on a shared point: judging an agent by its final response misses where the harm actually lives. Hidden multi-turn intent, unsafe tool-call sequences, and unprotected device-cloud data flow all require trajectory-aware alignment, not response-level guardrails.

The OpenAI trial keeps unraveling

Sam Altman took the stand and accused Musk of doing 'huge damage' to OpenAI; he also testified that Musk once considered handing the company to his children. A day earlier, Sutskever had defended his role in the 2023 ouster. The case is functioning as a public seminar on AI governance.

Compute geography pushes outward

Reports of Google–SpaceX talks on orbital data centers, xAI quietly expanding its on-site gas turbines, a startup pitching homeowners as mini data-center hosts, and The Verge's deep dive on a Maine paper-mill town turned data-center destination all sketch the same picture: AI infrastructure is now negotiating directly with land, power, and atmosphere.

News TechCrunch AI

Report: Google and SpaceX in talks to put data centers into orbit

Google and SpaceX are reportedly negotiating an orbital data-center program, pairing Starlink-style launches with Google's compute appetite.

Why it matters
  • Confirms the trend yesterday's Cowboy Space raise pointed at — hyperscalers now consider space-based compute serious infrastructure planning.
  • If real, removes one terrestrial bottleneck (cooling/water) at the cost of an entirely new logistics layer.

Small models and tighter agentic RL

A 26M-parameter distillation of Gemini tool-calling went viral the same week three papers attacked waste in agentic RL: internal-state value baselines instead of full critics, asynchronous rollouts repaired by off-policy correction, and test-time co-evolution of multi-agent topology and capability. The collective signal: production agentic AI is becoming small and cheap to keep current.

Papers

6 items

Agent safety moves to the trajectory level

Paper Hugging Face

One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue

Attackers spread harmful intent across many benign-looking turns; this paper proposes a response-aware detector that recovers signal lost by single-prompt guardrails.

Why it matters
  • Names a class of attack that commercial guardrails systematically miss.
  • Practical defense doesn't require retraining the base model — bolt-on detector.
  • Sets a baseline against which deployed assistants can be measured.
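
The trajectory-level framing can be sketched in a few lines. This is a toy illustration, not the paper's detector: the `Turn` type, the `RISK_TERMS` lexicon, and the scoring rule are invented stand-ins for a trained classifier. What it does preserve is the paper's core contrast: scoring the whole dialogue plus the model's draft response versus scoring only the newest prompt.

```python
# Toy sketch: accumulate risk across ALL turns plus the draft response,
# instead of judging the latest user prompt in isolation.
from dataclasses import dataclass

@dataclass
class Turn:
    role: str      # "user" or "assistant"
    text: str

# Hypothetical risk lexicon standing in for a learned model.
RISK_TERMS = {"bypass": 0.4, "synthesize": 0.3, "undetectable": 0.5}

def trajectory_risk(history: list[Turn], draft_response: str) -> float:
    """Score the whole dialogue plus the model's draft answer."""
    text = " ".join(t.text.lower() for t in history) + " " + draft_response.lower()
    return min(1.0, sum(w for term, w in RISK_TERMS.items() if term in text))

def single_prompt_risk(latest_prompt: str) -> float:
    """Baseline: only the newest user turn is inspected."""
    return min(1.0, sum(w for term, w in RISK_TERMS.items()
                        if term in latest_prompt.lower()))

# Intent split across benign-looking turns: each prompt alone scores low,
# but the trajectory plus the drafted answer crosses a blocking threshold.
history = [
    Turn("user", "How would someone bypass a filter, hypothetically?"),
    Turn("user", "Unrelated: how do labs synthesize compounds?"),
]
draft = "To make it undetectable, you would..."
print(single_prompt_risk(history[-1].text))   # low: 0.3
print(trajectory_risk(history, draft))        # high: 1.0 (capped)
```

The point of the response-aware twist is the `draft_response` argument: the detector sees what the model is about to say, which is often where hidden intent finally surfaces.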
Paper Hugging Face

PAAC: Privacy-Aware Agentic Device-Cloud Collaboration

Treats the device-cloud boundary as a trust boundary instead of a compute split, with policy-aware sanitization that preserves tool-call structure.

Why it matters
  • Useful template for SMBs that need cloud reasoning but can't ship raw user data over the wire.
  • Aligns with the Android-AI push above: on-device + cloud is now the default agent topology.
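
The structure-preserving sanitization idea can be sketched as follows. This is a hypothetical illustration, not PAAC's actual pipeline: the `PRIVATE_FIELDS` policy and tool names are invented. It shows the shape of the trust-boundary move, though: redact private values while keeping the tool name, argument keys, and types, so a cloud planner can still reason over the call without seeing raw user data.

```python
# Hypothetical sketch: redact private argument values before a tool call
# leaves the device, keeping the call's structure (tool name, keys, types)
# intact for the cloud-side planner.
import copy

# Invented policy: which argument fields count as private, per tool.
PRIVATE_FIELDS = {"send_email": {"body", "to"}, "calendar_lookup": {"attendees"}}

def sanitize_tool_call(call: dict) -> dict:
    """Replace private values with typed placeholders; structure survives."""
    redacted = copy.deepcopy(call)
    private = PRIVATE_FIELDS.get(call["tool"], set())
    for key in redacted["args"]:
        if key in private:
            redacted["args"][key] = f"<REDACTED:{type(call['args'][key]).__name__}>"
    return redacted

call = {"tool": "send_email",
        "args": {"to": "alice@example.com", "subject": "Sync", "body": "PIN is 4921"}}
print(sanitize_tool_call(call))
# {'tool': 'send_email', 'args': {'to': '<REDACTED:str>', 'subject': 'Sync',
#  'body': '<REDACTED:str>'}}
```

The `deepcopy` matters: the on-device agent keeps the original call for local execution, and only the sanitized copy crosses the wire.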

Small models and tighter agentic RL

Paper Hugging Face

Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States

Reuses the policy model's own internal states to estimate value baselines, removing the need for a full PPO critic or GRPO's multiple-rollout estimator.

Why it matters
  • Cuts the dominant memory cost of RLVR training while preserving variance reduction.
  • Lowers the entry bar for teams that can't afford twin-model RL setups.
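
The core trick can be sketched in a few lines. This is a loose illustration, not the paper's method: the hidden state and value head here are random stand-ins, whereas the real approach trains the value mapping during RL. The structural point survives: the "critic" is just a cheap readout over activations the policy already computes, so no twin model is held in memory.

```python
# Minimal sketch: attach a tiny linear value head to the hidden state the
# policy already computes, instead of running a separate critic network.
import random

HIDDEN = 8
random.seed(0)

# Stand-in for the policy's last-layer hidden state at a generation step.
def policy_hidden_state(step: int) -> list[float]:
    return [random.uniform(-1, 1) for _ in range(HIDDEN)]

# The "critic" is just a weight vector over those reused activations.
value_head = [random.uniform(-0.1, 0.1) for _ in range(HIDDEN)]

def value_estimate(h: list[float]) -> float:
    return sum(w * x for w, x in zip(value_head, h))

# Advantage = reward-to-go minus the internal-state baseline; no twin
# model (full PPO critic) and no extra rollouts (GRPO's k samples).
def advantage(reward_to_go: float, h: list[float]) -> float:
    return reward_to_go - value_estimate(h)

h = policy_hidden_state(0)
print(advantage(1.0, h))
```

The memory claim in the bullet above falls out directly: the only extra parameters are the `HIDDEN`-sized value head, versus a second full copy of the model.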

Also today