---
title: "KNOWLEDGE ATLAS releases GLM-5 technical details: engineering-grade intelligence, compatible with domestic computing power"
description: "For China, GLM-5 is more like a declaration: we can not only create large models but also adapt our own computing power and integrate the two"
type: "news"
locale: "en"
url: "https://longbridge.com/en/news/276533880.md"
published_at: "2026-02-22T11:20:51.000Z"
---

# KNOWLEDGE ATLAS releases GLM-5 technical details: engineering-grade intelligence, compatible with domestic computing power

> For China, GLM-5 is more like a declaration: we can not only create large models but also adapt our own computing power and integrate the two

On February 12, KNOWLEDGE ATLAS released GLM-5, surprising everyone. The technical report, published 10 days later, gives a glimpse into the model's inner workings. What is interesting is not which leaderboards it has topped this time, but that the whole approach has changed: the contest is no longer about parameter counts, but about systems-engineering capability.

GLM-5 accomplishes three practical things:

1. The model can genuinely complete complex tasks, not just write a few lines of code;
2. Training efficiency has improved significantly, so ultra-large models are no longer purely a money-burning game;
3. Comprehensive adaptation to domestic chips, from the bottom layer up to the inference framework. This is the most important point.

Where the story used to be "China is catching up," China has now begun to build its own technological system.

## From "Giving Code" to "Building Systems"

The report describes a conceptual shift: from Vibe Coding to Agentic Engineering. In the former, you describe something and the model hands back a piece of code; in the latter, you provide a goal and the model plans, decomposes the work, writes code, calls tools, debugs, and iterates until the whole system is complete.

The focus of GLM-5 is no longer single-question scores, but rather:

> 200K context (on the order of several hundred pages of documents)
>
> Cross-file software engineering tasks
>
> Continuous planning and correction in long-horizon tasks
>
> Maintaining consistency of thought across multi-round interactions

For example, Vending-Bench 2 requires "simulating the operation of a vending machine for a year," and ultimately judges the result by the account balance. GLM-5 ranks first among open-source models, close to Claude Opus 4.5. This measures long-term decision-making ability, not just Q&A. The model is beginning to exhibit "engineering-level intelligence."

## Sparse Attention: No More Mindless Computing Power Consumption

GLM-5 has 744 billion parameters (with 40 billion activated) and was trained on 28.5 trillion tokens. Under a traditional architecture, the compute cost would explode.

The core innovation is DSA (DeepSeek Sparse Attention). The traditional attention mechanism "looks at all content," so computation grows quadratically; DSA dynamically determines "which tokens are truly important" and computes attention only over the key parts. In a 200K-token context, DSA cuts attention computation by a factor of 1.5–2. Moreover, it does so without accuracy loss: other efficient-attention methods usually sacrifice quality, while DSA maintains performance through continued pre-training and a smooth transition.

The results are:

- Same computing power → longer context
- Same cost → higher inference capability
- Same hardware → larger models

For China, efficiency innovation matters far more than stacking computing power.
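To make the sparse-attention idea concrete: the report describes DSA at the level of ideas rather than code, so the snippet below is only a minimal sketch of "score every past token cheaply, then attend exactly to a small top-k subset." The function name, the low-rank scoring trick, and the top-k rule are assumptions made for illustration, not GLM-5's actual DSA implementation.

```python
import numpy as np

def sparse_attention_sketch(q, K, V, proj, k_keep=64):
    """Toy sketch of dynamically sparse attention for a single query.

    q:      (d,)    query vector for the current token
    K, V:   (T, d)  keys/values over a long context of T tokens
    proj:   (d, r)  cheap low-rank projection used only for scoring (r << d)
    k_keep: how many "important" tokens receive full attention
    """
    # Step 1: cheap relevance scores in a reduced r-dimensional space.
    # (In a real system the projected keys would be cached, so this
    # scoring pass stays far cheaper than full attention.)
    scores = (K @ proj) @ (proj.T @ q)                  # (T,)

    # Step 2: keep only the k most relevant tokens; the rest are skipped.
    top = np.argpartition(scores, -k_keep)[-k_keep:]

    # Step 3: exact scaled-dot-product attention over the selected subset.
    logits = (K[top] @ q) / np.sqrt(q.shape[0])
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ V[top]                                   # (d,)

# A long context, but only k_keep tokens enter the expensive softmax step.
rng = np.random.default_rng(0)
T, d, r = 50_000, 128, 16
q, K, V = rng.normal(size=d), rng.normal(size=(T, d)), rng.normal(size=(T, d))
out = sparse_attention_sketch(q, K, V, proj=rng.normal(size=(d, r)))
print(out.shape)  # (128,)
```

The point of the sketch is only the shape of the computation: a lightweight scorer ranks all tokens, and exact attention is then restricted to a small subset, which is where the claimed savings at 200K context would come from.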
## Reinforcement Learning Architecture Reconstruction

The RL system of GLM-5 has undergone a complete overhaul. Generation and training are decoupled: the model generates trajectories while training happens asynchronously in another system. In the past, training had to wait for the slowest task to complete; now, whoever finishes first trains first, which significantly improves throughput. This is crucial for long-horizon agent tasks.

The asynchronous agent RL algorithm addresses tasks that last several hours in real software engineering. It introduces:

- Token-in-Token-out (to avoid re-tokenization errors)
- Bidirectional importance sampling
- DP-aware routing optimization for the KV cache

The model can learn stably in complex environments without collapsing from policy drift. In simple terms, it answers the question of "how to enable large models to continuously self-improve on real tasks."
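The report describes this decoupling at the systems level; the asyncio sketch below illustrates only the scheduling idea ("whoever finishes first trains first"): rollout workers of very different durations push finished trajectories onto a queue, and a trainer consumes them in completion order instead of waiting for the slowest episode. The worker, queue, and toy trajectory are assumptions for the example, not GLM-5's actual infrastructure.

```python
import asyncio
import random

async def rollout_worker(worker_id: int, queue: asyncio.Queue) -> None:
    """Simulates one agent episode; real software-engineering tasks may run for hours."""
    duration = random.uniform(0.1, 1.0)        # stand-in for episode length
    await asyncio.sleep(duration)              # "the model generates a trajectory"
    trajectory = {"worker": worker_id, "reward": random.random()}
    await queue.put(trajectory)                # hand off as soon as it is done

async def trainer(queue: asyncio.Queue, total: int) -> None:
    """Consumes trajectories in completion order, not submission order."""
    for step in range(total):
        traj = await queue.get()               # whoever finishes first trains first
        # A real trainer would apply importance sampling here, because each
        # trajectory was generated by a slightly older policy (off-policy drift).
        print(f"step {step}: update from worker {traj['worker']}, "
              f"reward {traj['reward']:.2f}")

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    n_workers = 8
    rollouts = [rollout_worker(i, queue) for i in range(n_workers)]
    # Generation and training run concurrently; there is no global barrier on
    # the slowest rollout, which is what lifts throughput for long agent tasks.
    await asyncio.gather(trainer(queue, n_workers), *rollouts)

if __name__ == "__main__":
    asyncio.run(main())
```

The same shape scales out to separate processes or machines; the essential property is simply that the trainer never blocks on the slowest episode.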
## The Truly Key Step: Adapting Domestic Computing Power

For China's AI, the most important part of the report lies here. GLM-5 natively adapts to the domestic GPU ecosystem and is already compatible with Huawei Ascend, Moore Threads, Haiguang, Cambricon, Kunlun Core, Tianxu Zhixin, and Suiruan.

This is not just "it can run" compatibility, but:

- KV cache scheduling optimization
- Communication mechanism adaptation
- Mixed-precision training matching
- INT4 quantization-aware training alignment
- Distributed parallel strategy reconstruction

Many of the hard problems in the domestic chip ecosystem are not about raw computing power but about the software stack. The significance of GLM-5 is that it is not designed around a single overseas hardware architecture; instead, it makes system-level adaptations for a range of domestic computing platforms. This is a qualitative change: China's large models are beginning to optimize engineering around the local hardware ecosystem rather than passively porting to it.

The report states that, thanks to extreme optimization of this software-hardware synergy, GLM-5's performance on a single domestic computing node is already comparable to a cluster built from two international mainstream GPUs, and that in long-sequence scenarios its deployment cost has been cut by 50%.

## A Software-Hardware Closed Loop is Forming

Breaking down the technical path of GLM-5 reveals a complete closed loop:

Model architecture innovation (DSA) → training efficiency optimization (asynchronous RL) → memory and communication compression (ZeRO, activation offloading) → low-precision alignment (INT4 QAT) → deep adaptation to domestic chips

This is a complete domestic AI engineering chain. In the past, China's AI advantage lay in the application layer; now it is moving into full-stack optimization across architecture innovation, algorithm engineering, training systems, chip adaptation, and inference frameworks. The true significance of this technical report is not any particular benchmark score, but that Chinese AI has, for the first time, demonstrated competitiveness through "system capability."

## From Showmanship to Maturity

The GLM-5 report does not dwell on "how much better we are than everyone else." Instead, it discloses training processes, algorithm choices, engineering trade-offs, and ablation studies in detail. That in itself is a sign of maturity.

When a model's makers start discussing GPU utilization, long-tail latency, KV cache reuse, quantized kernel alignment, and catastrophic-forgetting control, they are no longer showcasing capabilities; they are building an industrial-grade system.

**For China, GLM-5 is more like a declaration: we can not only create large models, but also adapt our own computing power, and integrate the two.**

This is the true leap.

### Related Stocks

- [02513.HK - KNOWLEDGE ATLAS](https://longbridge.com/en/quote/02513.HK.md)

---

> **Disclaimer**: This article is for reference only and does not constitute any investment advice.