Two new open‑weight models hit the scene on Thursday. Kimi K2.7 Code, built for code‑base navigation, and GLM‑5.2, targeting long‑horizon software tasks, both tout a six‑times efficiency advantage over Anthropic’s Claude on standard coding benchmarks. The claim comes from a Chinese release video that positions the models as “6X more efficient than Claude.”[YouTube] For operators, a genuine efficiency jump could translate into fewer GPUs per inference node or lower power draw, directly affecting total‑cost‑of‑ownership calculations. However, the benchmark methodology is unclear; the models were tested on undisclosed hardware and workloads, so the advertised gain may not hold on NVIDIA Blackwell or AMD MI300X servers that most data centers already run.
DeepSeek also announced a partnership with Microsoft to surface the models via Azure Copilot Cowork, hinting at a “lower‑cost” hosted option. Yet the press release omits pricing tiers, SLA details, or any comparison to existing Azure OpenAI offerings. Until Microsoft publishes concrete per‑token rates, procurement teams should treat the cost‑advantage narrative as unverified.
In short, the models’ efficiency promise is worth a pilot on existing rigs, but operators should demand transparent benchmark data and clear pricing before reshaping hardware roadmaps.
Composed by the MadCoolStuff editor pipeline · Groq · openai/gpt-oss-120b · 2026-06-19