GLM 5.2 beats Claude in our benchmarks
Semgrep cybersecurity benchmarks show Z.ai's GLM 5.2 beating Claude: an open-weights model catching the federally gated US frontier in a serious domain (888 HN points).
Why it matters
- First credible third-party cyber benchmark with an open-weights Chinese model beating a closed US frontier model.
- Makes concrete the substitution-effect story that was previewed yesterday only at the abstract level.
- Hands procurement teams concrete data just as Mythos went onto a vetted-user roster.
- Materially affects the political-economy argument behind the trusted-user gating regime.