VentureBeat Apr 7, 09:27 PM
AI joins the 8-hour work day as GLM ships 5.1 open source LLM, beating Opus 4.6 and GPT-5.4 on SWE-Bench Pro
Is China picking back up the open source AI baton?
Z.ai, also known as Zhipu AI, a Chinese AI startup best known for its powerful, open source GLM family of models, unveiled GLM-5.1 today under a permissive MIT License, allowing enterprises to download it from Hugging Face, customize it, and use it for commercial purposes.
This follows last month's release of GLM-5 Turbo, a faster version, under a proprietary license only.
The new GLM-5.1 is designed to work autonomously for up to eight hours on a single task, marking a definitive shift from vibe coding to agentic engineering.
The release represents a pivotal moment in the evolution of artificial intelligence. While competitors have focused on increasing reasoning tokens for better logic, Z.ai is optimizing for productive horizons: how long a model can keep making progress on a single task.
GLM-5.1 is a 754-billion-parameter Mixture-of-Experts model engineered to maintain goal alignment over extended execution traces that span thousands of tool calls.
"agents could do about 20 steps by the end of last year," wrote z.ai leader Lou on X. "glm-5.1 can do 1,700 rn. autonomous work time may be the most important curve after scaling laws. glm-5.1 will be the first point on that curve that the open-source community can verify with their own hands. hope y'all like it^^"
In a market increasingly crowded with fast models, Z.ai is betting on the marathon runner. The company, which listed on the Hong Kong Stock Exchange in early 2026 with a market capitalization of $52.83 billion, is using this release to cement its position as the leading independent developer of large language models in the region.
Technology: the staircase pattern of optimization
GLM-5.1's core technological breakthrough isn't just its scale, though its 754 billion parameters and 202,752-token context window are formidable, but its ability to avoid the plateau effect seen in previous models.
In traditional agentic workflows, a model typically applies a few familiar techniques for quick initial gains and then stalls. Giving it more time or more tool calls usually results in diminishing returns or strategy drift.
Z.ai's research demonstrates that GLM-5.1 operates via what the company calls a staircase pattern: periods of incremental tuning within a fixed strategy, punctuated by structural changes that shift the performance frontier.
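To make the staircase idea concrete, here is a minimal Python sketch that separates small tuning gains from large structural jumps in a best-so-far score trajectory. It is an illustration only, not Z.ai's methodology: the function name `split_staircase`, the `jump_ratio` threshold, and the trajectory values are all hypothetical and chosen purely for demonstration.

```python
from typing import List, Tuple


def split_staircase(
    best_scores: List[float], jump_ratio: float = 1.25
) -> Tuple[List[int], List[int]]:
    """Classify each improving iteration as a structural jump or a tuning step.

    best_scores[i] is the best score seen up to iteration i.
    jump_ratio is an assumed threshold: a gain of at least this factor over
    the previous best counts as a structural change.
    """
    jumps: List[int] = []
    tuning: List[int] = []
    for i in range(1, len(best_scores)):
        prev, cur = best_scores[i - 1], best_scores[i]
        if prev > 0 and cur / prev >= jump_ratio:
            jumps.append(i)   # large step: the strategy itself changed
        elif cur > prev:
            tuning.append(i)  # small step: refinement within a strategy
    return jumps, tuning


# Synthetic trajectory (made-up numbers, not taken from the report).
trajectory = [100.0, 104.0, 107.0, 108.0, 240.0, 246.0, 251.0, 251.0, 600.0, 612.0]
jumps, tuning = split_staircase(trajectory)
print("structural jumps at iterations:", jumps)
print("incremental tuning at iterations:", tuning)
```

A plateauing model, by contrast, would show a few early tuning steps and then no entries in either list, no matter how many more iterations it is given.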
In Scenario 1 of their technical report, the model was tasked with optimizing a high-performance vector database, a challenge known as VectorDBBench.
The model is provided with a Rust skeleton and empty implementation stubs, then uses tool-call-based agents to edit code, compile, test, and profile. While previous state-of-the-art results from models like Claude Opus 4.6 reached a performance ceiling of 3,547 queries per second, GLM-5.1 ran through 655 iterations and over 6,000 tool calls. The optimization trajectory was not linear but punctuated by structural breakthroughs.
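The edit, compile, test, and profile cycle described above can be pictured as a simple keep-the-best loop. The Python sketch below is a rough approximation under stated assumptions, not the harness Z.ai used: `propose_edit`, `build_and_test`, and `benchmark_qps` are hypothetical callables standing in for the agent's tool calls, and the toy candidates and scores are invented.

```python
from typing import Callable, Dict, List, Tuple


def run_optimization_loop(
    propose_edit: Callable[[str], str],
    build_and_test: Callable[[str], bool],
    benchmark_qps: Callable[[str], float],
    initial_source: str,
    max_iterations: int,
) -> Tuple[str, float, List[float]]:
    """Repeatedly propose a candidate, verify it, and keep it only if faster."""
    best_source = initial_source
    best_qps = benchmark_qps(initial_source) if build_and_test(initial_source) else 0.0
    history = [best_qps]  # best-so-far throughput after each iteration
    for _ in range(max_iterations):
        candidate = propose_edit(best_source)
        # A candidate must compile and pass tests before it is benchmarked.
        if build_and_test(candidate):
            qps = benchmark_qps(candidate)
            if qps > best_qps:
                best_source, best_qps = candidate, qps
        history.append(best_qps)
    return best_source, best_qps, history


# Toy stand-ins for the real tools (all values are invented).
candidates = iter(["v1", "broken", "v2-slow", "v3"])
scores: Dict[str, float] = {"v0": 100.0, "v1": 120.0, "v2-slow": 90.0, "v3": 300.0}

best, qps, history = run_optimization_loop(
    propose_edit=lambda _best: next(candidates),
    build_and_test=lambda src: src != "broken",
    benchmark_qps=lambda src: scores[src],
    initial_source="v0",
    max_iterations=4,
)
print(best, qps, history)
```

In this toy run the failed build and the slower candidate are discarded, so the best-so-far history never decreases.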
At iteration 90, the model shifted from full-corpus scanning to IVF cl