Casper Hansen · @casper_hansen_

View this X/Twitter post by @casper_hansen_, published on July 28, 2025 at 13:07. It contains 4 images.

Published
July 28, 2025, 13:07
Thread entries
6
Media count
4

Tweet overview

o3 competitor: GLM 4.5 by Zhipu AI
- hybrid reasoning model (on by default)
- trained on 15T tokens
- 128k context, 96k output tokens
- $0.11 / 1M tokens
- MoE: 355B A32B and 106B A12B

Benchmark details:
- tool calling: 90.6% success rate vs Sonnet’s 89.5% vs Kimi K2 86.2%
- coding: 40.4% win rate vs Sonnet, 53.9% vs Kimi K2, 80.8% vs Qwen3 Coder
[Image]
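The price and MoE figures in the post lend themselves to some quick arithmetic. Below is a minimal sketch that uses only the numbers quoted above. It assumes that $0.11 per 1M tokens is a flat rate (the post does not say whether it covers input, output, or both) and that "355B A32B" means 355B total parameters with 32B active per token. The helper names `token_cost_usd` and `active_fraction` are hypothetical, chosen for illustration.

```python
# Illustrative arithmetic only; every figure comes from the post above.

# "$0.11 / 1M tokens" as quoted in the post.
# Assumption: a flat rate, since the post does not split input and output pricing.
PRICE_PER_MILLION_USD = 0.11

# (total parameters, active parameters) in billions.
# Assumption: "355B A32B" reads as 355B total, 32B active per token.
MODELS = {
    "355B A32B": (355, 32),
    "106B A12B": (106, 12),
}


def token_cost_usd(num_tokens, price_per_million=PRICE_PER_MILLION_USD):
    """Cost in USD of processing num_tokens at a flat per-million-token rate."""
    return num_tokens / 1_000_000 * price_per_million


def active_fraction(total_b, active_b):
    """Share of the model's parameters that is active for each token."""
    return active_b / total_b


if __name__ == "__main__":
    # Cost of one full 128k-token context and one maximum 96k-token output.
    for label, tokens in [("128k context", 128_000), ("96k output", 96_000)]:
        print(f"{label}: ${token_cost_usd(tokens):.5f}")

    # Active-parameter share for each MoE variant.
    for name, (total_b, active_b) in MODELS.items():
        print(f"{name}: {active_fraction(total_b, active_b):.1%} of parameters active")
```

Under these assumptions, a full 128k-token context costs a little over one cent, and the two variants activate roughly 9% and 11% of their parameters per token.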
Zhipu AI has also released their entire post-training infrastructure: https://github.com/THUDM/slime 
[Image]
Slight correction to the post:
- it was not just 15T tokens
- it was 23.1T tokens total! 
[Image]
For those wondering how to get GLM 4.5 cheaply: you need to use their Mainland China API.
https://x.com/casper_hansen_/status/1949828862096949314
Update: Zhipu AI says the initial benchmarks I posted are not up to date, so here is the updated version.

Note that I found the original benchmark comparison in their bigmodel documentation.
[Image]
