GLM-4.5 and GLM-4.5-Air are our latest flagship models, purpose-built as foundational models for agent-oriented applications. Both leverage a Mixture-of-Experts (MoE) architecture. GLM-4.5 has a total parameter count of 355B with 32B active parameters per forward pass, while GLM-4.5-Air adopts a more streamlined design with 106B total parameters and 12B active parameters.
Both models share a similar training pipeline: an initial pretraining phase on 15 trillion tokens of general-domain data, followed by targeted fine-tuning on datasets covering code, reasoning, and agent-specific tasks. The context length has been extended to 128k tokens, and reinforcement learning was applied to further enhance reasoning, coding, and agent performance.
GLM-4.5 and GLM-4.5-Air are optimized for tool invocation, web browsing, software engineering, and front-end development. They can be integrated into code-centric agents such as Claude Code and Roo Code, and also support arbitrary agent applications through tool invocation APIs.
Both models are hybrid reasoning models with two execution modes: Thinking Mode for complex reasoning and tool usage, and Non-Thinking Mode for instant responses. The modes are toggled via the thinking.type parameter (enabled or disabled), and dynamic thinking is enabled by default.
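As a rough illustration, a request to an OpenAI-compatible chat completions endpoint might toggle the mode as in the sketch below. The base URL, model identifier, and exact placement of the thinking parameter here are assumptions; verify them against the official GLM-4.5 API documentation.

```python
from openai import OpenAI

# Assumed OpenAI-compatible endpoint and model name; check the official docs.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.z.ai/api/paas/v4/",
)

response = client.chat.completions.create(
    model="glm-4.5",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    # Thinking Mode for complex reasoning; use {"type": "disabled"} for instant responses.
    extra_body={"thinking": {"type": "enabled"}},
)

print(response.choices[0].message.content)
```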
Input Modalities: Text
Output Modalities: Text
Context Length: 128K
Maximum Output Tokens: 96K
GLM-4.5: our most powerful reasoning model, with 355 billion parameters
GLM-4.5-Air: cost-effective and lightweight, with strong performance
GLM-4.5-X: high performance, strong reasoning, ultra-fast response
GLM-4.5-AirX: lightweight, strong performance, ultra-fast response
GLM-4.5-Flash: free, strong performance, excellent for reasoning, coding, and agents
Deep Thinking: enable deep thinking mode for more advanced reasoning and analysis
Streaming Output: real-time streaming responses to enhance the user interaction experience
Function Call: powerful tool invocation capabilities, enabling integration with various external toolsets
Context Caching: an intelligent caching mechanism that optimizes performance in long conversations
Structured Output: support for structured output formats such as JSON, facilitating system integration
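For the function-call capability, a minimal sketch using the same assumed OpenAI-compatible interface is shown below. The endpoint, model name, and the get_weather tool are illustrative assumptions, not part of the official API.

```python
import json
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.z.ai/api/paas/v4/")  # assumed endpoint

# A hypothetical tool definition in the standard function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="glm-4.5",
    messages=[{"role": "user", "content": "What's the weather in Beijing?"}],
    tools=tools,
)

# If the model decides to invoke the tool, the arguments arrive as a JSON string.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```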
A first-principles measure of progress toward AGI is the ability to integrate more general intelligence capabilities without compromising existing ones. GLM-4.5 is our first complete realization of this idea: it natively fuses advanced reasoning, coding, and agent capabilities within a single model to meet the complex demands of agent-based applications.
To comprehensively evaluate the model's general intelligence, we selected 12 of the most representative benchmarks: MMLU Pro, AIME24, MATH 500, SciCode, GPQA, HLE, LiveCodeBench, SWE-Bench, Terminal-bench, TAU-Bench, BFCL v3, and BrowseComp. Based on the aggregated average scores, GLM-4.5 ranks second globally among all models, first among models developed in China, and first among open-source models.