Qwen3.7 Max Is Now Available in Puter.js

Reynaldi Chernando

May 22, 2026


Alibaba's Qwen team unveiled Qwen3.7 Max at the 2026 Alibaba Cloud Summit, and it is now available through Puter.js.

What is Qwen3.7 Max?

Qwen3.7 Max is Alibaba's flagship proprietary reasoning model, purpose-built for long-horizon agentic workloads. It pairs a chain-of-thought reasoning architecture with a massive context window, and is designed to sustain complex, multi-step autonomous tasks for extended periods. Highlights include:

1M Token Context Window: Process entire codebases, lengthy documents, or thousand-step agent traces in a single request
65K Output Tokens: Generate long-form responses, complete implementations, and detailed plans without truncation
Long-Horizon Agentic Execution: Demonstrated to sustain autonomous runs of up to 35 hours, chaining 1,000+ tool calls in a single session without measurable degradation
Frontier Benchmarks: 56.6 on the Artificial Analysis Intelligence Index (highest-ranked Chinese model), 90.2 on Arena-Hard v2, 72.5 on SWE-Bench Verified, and a top-15 spot on LM Arena's text leaderboard
Native Tool Use: Function calling and tool use out of the box, making it well-suited for coding agents and research pipelines
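
To make the tool-use highlight concrete, here is a minimal sketch of an agent loop. It assumes that `puter.ai.chat` accepts an OpenAI-style `tools` array and returns any requested calls under `message.tool_calls`; check the Puter.js documentation for the exact shape. The `get_weather` tool, the `runToolLoop` helper, and the `maxSteps` limit are hypothetical names used only for illustration. The chat function is passed in as a parameter so the loop can be exercised without a network call.

```javascript
// Hypothetical local tool that the model may ask us to run.
const localTools = {
  get_weather: ({ location }) => `Sunny in ${location}`,
};

// OpenAI-style tool schema (assumed to be the shape the `tools` option accepts).
const toolSchemas = [{
  type: 'function',
  function: {
    name: 'get_weather',
    description: 'Get the current weather for a city',
    parameters: {
      type: 'object',
      properties: { location: { type: 'string' } },
      required: ['location'],
    },
  },
}];

// `chat` is injected; in a page you might pass
// (messages, options) => puter.ai.chat(messages, options).
async function runToolLoop(chat, prompt, maxSteps = 5) {
  const messages = [{ role: 'user', content: prompt }];
  for (let step = 0; step < maxSteps; step++) {
    const completion = await chat(messages, {
      model: 'qwen/qwen3.7-max',
      tools: toolSchemas,
    });
    const calls = completion?.message?.tool_calls ?? [];
    // No tool requests means the model has produced its final answer.
    if (calls.length === 0) return completion?.message?.content;
    // Keep the assistant turn, then append one tool result per requested call.
    messages.push(completion.message);
    for (const call of calls) {
      const handler = localTools[call.function.name];
      const result = handler
        ? handler(JSON.parse(call.function.arguments))
        : `Unknown tool: ${call.function.name}`;
      messages.push({ role: 'tool', tool_call_id: call.id, content: String(result) });
    }
  }
  throw new Error(`No final answer after ${maxSteps} steps`);
}
```

The step cap is a deliberate design choice: even with a model built for long sessions, an explicit upper bound keeps a misbehaving loop from running indefinitely.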

Examples

Long-horizon agentic coding

puter.ai.chat("Refactor this Express API into a NestJS service, port the tests, and update the OpenAPI spec to match",
  { model: 'qwen/qwen3.7-max' }
);

Whole-codebase reasoning

puter.ai.chat("Here is our entire monorepo. Find every place we read a JWT without verifying its signature, and propose fixes",
  { model: 'qwen/qwen3.7-max' }
);
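
The prompt above refers to a monorepo, but the source files still have to be placed in the request. A minimal sketch of one way to do that follows. The `files` array of `{ path, content }` objects, the `buildCodebasePrompt` helper, and the four-characters-per-token estimate are all assumptions for illustration; real tokenizers differ, so treat the estimate only as a coarse guard against the 1M-token window.

```javascript
const CONTEXT_LIMIT_TOKENS = 1_000_000; // context window stated in this post
const CHARS_PER_TOKEN = 4;              // rough heuristic (assumption), not the real tokenizer

// Crude size estimate, only meant to catch prompts that are obviously too large.
function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// `files` is a hypothetical array of { path, content } gathered by your own tooling.
function buildCodebasePrompt(files, question) {
  // Label each file with its path so the model can cite locations in its answer.
  const sections = files.map(({ path, content }) => `=== ${path} ===\n${content}`);
  const prompt = `${question}\n\n${sections.join('\n\n')}`;
  const tokens = estimateTokens(prompt);
  if (tokens > CONTEXT_LIMIT_TOKENS) {
    throw new Error(
      `Estimated ${tokens} tokens exceeds the ${CONTEXT_LIMIT_TOKENS}-token window`
    );
  }
  return { prompt, tokens };
}
```

The resulting `prompt` string could then be passed as the first argument to `puter.ai.chat` with `{ model: 'qwen/qwen3.7-max' }`, as in the example above.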

Deep document analysis

puter.ai.chat("Read this 400-page regulatory filing and produce a structured summary of every disclosed material risk",
  { model: 'qwen/qwen3.7-max' }
);

Streaming with chain-of-thought reasoning

const response = await puter.ai.chat(
  "Design a globally distributed rate limiter that handles 1M req/sec with strong consistency, and walk through the trade-offs",
  { model: 'qwen/qwen3.7-max', stream: true }
);

for await (const part of response) {
  // Print reasoning and answer tokens as they arrive; skip parts that carry neither.
  if (part?.reasoning) puter.print(part.reasoning);
  else if (part?.text) puter.print(part.text);
}
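
If you would rather keep the reasoning trace separate from the final answer than print both to the same place, the same loop can accumulate them. This is a minimal sketch that assumes each streamed part carries either a `reasoning` or a `text` field, as in the loop above; `collectStream` is a hypothetical helper name, and it works on any async iterable.

```javascript
// Accumulates a streamed response into separate reasoning and answer strings.
async function collectStream(stream) {
  let reasoning = '';
  let text = '';
  for await (const part of stream) {
    // Parts with neither field (for example, metadata chunks) are ignored.
    if (part?.reasoning) reasoning += part.reasoning;
    if (part?.text) text += part.text;
  }
  return { reasoning, text };
}
```

With the `response` from the streaming call above, `const { reasoning, text } = await collectStream(response);` would let you show the answer and hide or log the reasoning separately.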

Get Started Now

Just add one library to your project:

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

Or add one script tag to your HTML:

<script src="https://js.puter.com/v2/"></script>

No API keys needed. Start building with Qwen3.7 Max immediately.
