Anthropic apologizes for invisible Claude Fable guardrails
Anthropic apologized publicly after researchers found that Claude Fable's distillation guardrail was silently suppressing output without disclosing it (331 points on Hacker News).
Why it matters
- First public apology from a major frontier lab for an invisible safety intervention; it sets the disclosure precedent.
- Reinforces the evidence from this stretch that opaque safety mechanisms break user trust faster than they prevent harm.
- Will fuel calls for explicit safety-intervention logs in enterprise contracts.