GLM-5.2 by @Zai_org is 5th in Mobile App Arena on Design Arena with an Elo of 1248.
This is a 2-position jump from GLM-5.1, putting GLM-5.2 in the same performance band as Claude Sonnet 4.6 by @AnthropicAI. @Zai_org is the top open-weight lab in Mobile App Arena and the third
The Intelligence Company
100 posts
Joined January 2026
- The Intelligence Company reposted: Design Arena’s benchmarks now help power @OpenRouter’s MCP. Get live model intelligence directly in your agent! Replying to @OpenRouter: The model performance rankings come from our new Benchmarks API, allowing your agent to query live benchmark scores (incl. Artificial Analysis and Design Arena). Fun result: @Zai_org’s GLM-5.2 is the best available model for both coding & design. Docs: openrouter.ai/docs/api/api-r…
- The Intelligence Company reposted: GLM-5.2 by @Zai_org is 2nd on Game Dev Arena on Design Arena with an Elo of 1368. This is a 6-position and 29-Elo jump from GLM-5.1, putting GLM-5.2 in the same performance band as Claude Fable 5 by @Anthropic. GLM-5.2 is the top open weight lab in Game Dev and second lab
- The Intelligence Company reposted: Fable 5 vs GLM 5.2
- The Intelligence Company reposted: BREAKING: Riverflow Pro 2.5, a reasoning model by @riverflow_ai that calls a mix of proprietary and open diffusion models, has scored 1st on Image Arena (Models + Routers), 1st on Graphic Design Arena, and 1st in Image Edit (Models + Routers). Riverflow Pro 2.5 averages 10 Elo
- BREAKING: GLM-5.2 is now 1st on Design Arena. With an Elo of 1360, GLM-5.2 has jumped ahead of the now unavailable Claude Fable 5. And it's open weights. This is an improvement of 4 positions and 27 Elo points to achieve one of the highest Elo scores in our code categories
- The Intelligence Company reposted: BREAKING: Reve 2.0 by @reve debuts at 2nd on Image Editing Arena with an Elo of 1325. Reve establishes a new Pareto frontier for Preference vs. Speed, faster than any model at this preference level with an average generation time of 86.8 seconds. Reve is now the highest-ranked
- The Intelligence Company reposted: BREAKING: Le Chaton Fat has fully saturated our benchmark. We are at a loss for words. In response, we are retiring Design Arena. Congratulations to the @MistralAI team, and thanks for putting us on vacation.
- The Intelligence Company reposted: Opus 4.8’s hyperfocus on agents may be making it worse at design. Opus 4.8 ranks 23rd overall on single-turn HTML Web Dev, a dramatic regression from Fable (1st), Opus 4.6 (2nd), and Opus 4.7 (3rd). This was particularly surprising as @AnthropicAI models have held the top spots