Friday, June 19, 2026

Z.ai Launches GLM-5.2 With a Usable 1M-Token Context, Two Pondering-Effort Ranges, and No Benchmarks at Launch


GLM-5.2 is the newest giant language mannequin from Z.ai, turning into the third main launch within the GLM-5 line. It follows GLM-5 (February 11), GLM-5-Turbo (March 15), and GLM-5.1 (April 7). That makes 4 flagship-tier coding releases in roughly 4 months.

Usable 1M-Token Context Window

GLM-5.2’s standout spec is a 1,000,000-token context window. Z.ai labels the variant glm-5.2[1m] in its personal configuration. Every response can return as much as 131,072 output tokens. That’s roughly a 5x bounce from GLM-5.1’s 200,000-token window.

A 1M-token window adjustments how a coding agent works in follow. The agent can maintain a whole mid-sized repository in working reminiscence. That features supply recordsdata, checks, configuration, and dialog historical past. It avoids the fixed summarization that smaller home windows power.

The discharge additionally provides two thinking-effort ranges: Excessive and Max. Z.ai recommends Max effort for complicated, multi-step coding work. In Claude Code, the /effort command controls this setting. The xhigh, max, and ultracode choices all map to GLM-5.2’s Max effort.

Structure and What Modified

Z.ai didn’t specify GLM-5.2’s structure in its launch supplies. However based mostly on neighborhood notes, the GLM-5 base is a 744-billion-parameter Combination-of-Specialists mannequin. It prompts 40 billion parameters per token. GLM-5.1 stored that very same spine with retargeted post-training.

MTP Explainer Playground

Interactive Demo

GLM-5.2 Setup Generator & Context Visualizer

Decide your agent and energy mode. Copy the precise config. See what 1M tokens buys you.

1. Coding agent




2. Context window


3. Pondering effort


Context window: GLM-5.1 vs GLM-5.2

GLM-5.2 at a look

1,000,000enter tokens in a single context window

131,072max output tokens per response

5xbigger than GLM-5.1’s window

8agentic instruments supported day one

The Benchmark Query

Right here is the essential caveat. Z.ai revealed no benchmark scores for GLM-5.2 at launch. There isn’t any SWE-bench, Terminal-Bench, or Code Enviornment quantity but. The announcement centered on availability, context, and the open-source roadmap.

Specification Comparability: GLM-5.2 vs GLM-5.1

Attribute GLM-5.2 GLM-5.1
Launched June 13, 2026 April 7, 2026
Context window 1,000,000 tokens (glm-5.2[1m]) ~200,000 tokens
Max output tokens 131,072 Not disclosed
Reasoning modes Excessive, Max Single mode
Structure Not specified at launch (GLM-5 lineage) 744B MoE, 40B lively
License MIT (weights pending subsequent week) MIT (open weights launched)
Launch benchmarks None revealed 58.4 SWE-bench Professional
Entry at launch GLM Coding Plan (all tiers) Coding Plan, API, and weights

Use Circumstances With Examples

  • Complete-repository refactors: Load a mid-sized repo into one context window. The agent tracks cross-file dependencies with out re-fetching. Instance: refactor a 40-file Python knowledge pipeline in a single session.
  • Lengthy-horizon agent runs: GLM-5.2 targets sustained plan, execute, check, repair loops. GLM-5.1 sustained roughly 1,700 agent steps in a single session. It ran autonomous loops for as much as eight hours. GLM-5.2 inherits that trajectory, although its personal numbers are pending.
  • Drop-in Claude Code substitute: Swap the bottom URL and mannequin identifier solely. Hold your present agent harness and workflow. This issues when frontier API entry is disrupted.
  • Massive-document evaluation: Feed lengthy specs, logs, or transcripts previous 200K tokens. The 1M window holds materials that smaller fashions truncate.

Tips on how to Set Up GLM-5.2

For Claude Code, edit ~/.claude/settings.json. Level the Sonnet and Opus slots on the 1M variant. Increase the auto-compact window so the agent makes use of the complete context.

{
  "env": {
    "CLAUDE_CODE_AUTO_COMPACT_WINDOW": "1000000",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "glm-4.5-air",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "glm-5.2[1m]",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "glm-5.2[1m]"
  }
}

Alternatively, set the endpoint by surroundings variables. The Anthropic-compatible endpoint accepts a base-URL swap.

export ANTHROPIC_AUTH_TOKEN="your-zai-api-key"
export ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic"
export ANTHROPIC_DEFAULT_OPUS_MODEL="glm-5.2[1m]"
export ANTHROPIC_DEFAULT_SONNET_MODEL="glm-5.2[1m]"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="glm-4.5-air"
claude

Then run /effort in a session and choose max. Run /standing to verify GLM-5.2 is lively. For Cline, select the OpenAI Appropriate supplier. Set the bottom URL to https://api.z.ai/api/coding/paas/v4. Enter the customized mannequin glm-5.2 and set context to 1,000,000.

GLM-5.2 is suitable with eight agentic coding instruments from day one. The record contains Claude Code, Cline, OpenCode, and OpenClaw.

Key Takeaways

  • Z.ai shipped GLM-5.2 on June 13, 2026, reside instantly throughout all GLM Coding Plan tiers (Lite, Professional, Max, Staff).
  • 1M-token context window (glm-5.2[1m]) with as much as 131,072 output tokens.
  • No benchmarks had been revealed at launch
  • It drops into Claude Code, Cline, and OpenClaw through an Anthropic-compatible endpoint with only a base-URL and mannequin swap.

Take a look at the Technical particularsAdditionally, be happy to observe us on Twitter and don’t neglect to hitch our 150k+ML SubReddit and Subscribe to our Publication. Wait! are you on telegram? now you may be part of us on telegram as nicely.

Have to associate with us for selling your GitHub Repo OR Hugging Face Web page OR Product Launch OR Webinar and so forth.? Join with us


Michal Sutter is an information science skilled with a Grasp of Science in Knowledge Science from the College of Padova. With a strong basis in statistical evaluation, machine studying, and knowledge engineering, Michal excels at remodeling complicated datasets into actionable insights.

Related Articles

Latest Articles