Qwen API

Access Qwen instantly with Puter.js and add AI to any app in a few lines of code, with no backend or API keys required.

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

puter.ai.chat("Explain AI like I'm five!", {
    model: "qwen/qwen-2.5-72b-instruct"
}).then(response => {
    console.log(response);
});

The same call from a plain HTML page, loading Puter.js from a script tag:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat("Explain AI like I'm five!", {
            model: "qwen/qwen-2.5-72b-instruct"
        }).then(response => {
            console.log(response);
        });
    </script>
</body>
</html>
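Because every model is selected by its ID string, a request can fall back from one Qwen model to another when a call fails. The following is a minimal sketch, not part of Puter.js: `chatWithFallback` is a hypothetical helper, and it assumes only that the object passed as `ai` exposes `chat(prompt, options)` the way `puter.ai` does in the examples above.

```javascript
// Hypothetical helper (not a Puter.js API): try each model ID in order
// and return the first successful reply together with the model that produced it.
// `ai` is any object exposing chat(prompt, options), such as puter.ai.
async function chatWithFallback(ai, prompt, models) {
    if (models.length === 0) {
        throw new Error("No models given");
    }
    let lastError;
    for (const model of models) {
        try {
            const response = await ai.chat(prompt, { model });
            return { model, response };
        } catch (error) {
            lastError = error; // remember the failure and try the next model
        }
    }
    // Every model failed: surface the last failure as the cause.
    throw new Error("All models failed", { cause: lastError });
}

// Usage in a page that has loaded Puter.js (model IDs taken from the list on this page):
// chatWithFallback(puter.ai, "Explain AI like I'm five!", [
//     "qwen/qwen3-max",
//     "qwen/qwen-2.5-72b-instruct",
// ]).then(({ model, response }) => console.log(model, response));
```

Passing `ai` in as a parameter keeps the helper independent of how Puter.js was loaded (npm import or script tag).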

List of Qwen Models

Chat

Qwen3 Max Thinking

qwen/qwen3-max-thinking

Qwen3 Max Thinking is Alibaba Cloud's flagship proprietary reasoning model with a 256K context window, featuring test-time scaling and adaptive tool-use capabilities (web search, code interpreter, memory) that allow it to reason iteratively and autonomously. It scores competitively against GPT-5.2 and Gemini 3 Pro on benchmarks like Humanity's Last Exam and HMMT, excelling in math, complex reasoning, and instruction following.

Chat

Qwen3 Coder Next

qwen/qwen3-coder-next

Qwen3-Coder-Next is an open-weight coding model from Alibaba's Qwen team with 80B total parameters but only 3B active per token, designed specifically for coding agents and local development with a 256K context window. It uses a sparse Mixture-of-Experts (MoE) architecture with hybrid attention, trained on 800K executable coding tasks using reinforcement learning to excel at long-horizon reasoning, tool calling, and recovering from execution failures. It achieves performance comparable to models with 10-20x more active parameters on benchmarks like SWE-Bench while maintaining low inference costs.

Chat

Qwen3 Next 80B A3B Instruct

qwen/qwen3-next-80b-a3b-instruct

Qwen3 Next 80B A3B Instruct is an MoE model with hybrid attention (Gated DeltaNet + Gated Attention). It reaches roughly 10x the inference throughput of Qwen3-32B on contexts longer than 32K tokens while matching Qwen3-235B performance.

Chat

Qwen3 Next 80B A3B Thinking

qwen/qwen3-next-80b-a3b-thinking

Qwen3 Next 80B A3B Thinking is the reasoning-enhanced variant. It uses hybrid attention and multi-token prediction and outperforms Gemini-2.5-Flash-Thinking on complex reasoning tasks.

Chat

Qwen Plus 0728

qwen/qwen-plus-2025-07-28

Qwen Plus (2025-07-28) is a snapshot version of Qwen Plus from July 2025, offering consistent behavior and performance for production deployments requiring version stability.

Chat

Qwen Plus 0728 (thinking)

qwen/qwen-plus-2025-07-28:thinking

Qwen Plus (2025-07-28) Thinking is the reasoning-enhanced version that uses chain-of-thought processing for complex problems, providing step-by-step reasoning before delivering answers.

Chat

Qwen3 235B A22B Instruct 2507

qwen/qwen3-235b-a22b-2507

Qwen3 235B A22B (2507) is the July 2025 updated version with significant improvements in instruction following, reasoning, coding, tool usage, and 256K long-context understanding.

Chat

Qwen3 235B A22B Thinking 2507

qwen/qwen3-235b-a22b-thinking-2507

Qwen3 235B A22B Thinking (2507) is the reasoning-enhanced variant using extended chain-of-thought processing for complex math, coding, and logical problems with enhanced performance.

Chat

Qwen3 30B A3B Instruct 2507

qwen/qwen3-30b-a3b-instruct-2507

Qwen3 30B A3B Instruct (2507) is the July 2025 updated instruction-tuned version with improved capabilities in reasoning, coding, and tool usage at high efficiency.

Chat

Qwen3 30B A3B Thinking 2507

qwen/qwen3-30b-a3b-thinking-2507

Qwen3 30B A3B Thinking (2507) is the reasoning-enhanced variant optimized for complex problem-solving with extended chain-of-thought processing at high parameter efficiency.
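Several of the models above come as an instruct/thinking pair, and the pairs do not share one naming rule: Qwen Plus appends `:thinking`, while the Qwen3 models put `thinking` inside the ID. A small lookup table avoids guessing the ID by string manipulation. This is a sketch only; `pickModel` and `THINKING_VARIANT` are hypothetical names, and the IDs are copied from the entries above.

```javascript
// Instruct model ID -> reasoning ("thinking") counterpart, copied from the list above.
const THINKING_VARIANT = new Map([
    ["qwen/qwen3-next-80b-a3b-instruct", "qwen/qwen3-next-80b-a3b-thinking"],
    ["qwen/qwen-plus-2025-07-28", "qwen/qwen-plus-2025-07-28:thinking"],
    ["qwen/qwen3-235b-a22b-2507", "qwen/qwen3-235b-a22b-thinking-2507"],
    ["qwen/qwen3-30b-a3b-instruct-2507", "qwen/qwen3-30b-a3b-thinking-2507"],
]);

// Hypothetical helper: return the thinking variant when step-by-step reasoning
// is wanted and a counterpart is known; otherwise return the ID unchanged.
function pickModel(baseModel, needsReasoning) {
    if (!needsReasoning) return baseModel;
    return THINKING_VARIANT.get(baseModel) ?? baseModel;
}

// Usage with Puter.js loaded:
// puter.ai.chat("Prove that the square root of 2 is irrational.", {
//     model: pickModel("qwen/qwen3-30b-a3b-instruct-2507", true),
// }).then(response => console.log(response));
```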

Chat

Qwen3 Coder 480B A35B

qwen/qwen3-coder

Qwen3 Coder is the most agentic code model in the Qwen series, available in 30B and 480B MoE variants. It achieves SOTA on SWE-Bench with 256K native context, extendable to 1M tokens.

Chat

Qwen3 Coder 30B A3B Instruct

qwen/qwen3-coder-30b-a3b-instruct

Qwen3 Coder 30B A3B Instruct is an efficient MoE coding model with 30B total and 3.3B active parameters, offering strong agentic coding capabilities with 256K context support.

Chat

Qwen3 Coder Flash

qwen/qwen3-coder-flash

Qwen3 Coder Flash is a cost-effective coding model balancing performance and speed, suitable for scenarios requiring fast responses at lower cost while maintaining coding quality.

Chat

Qwen3 Coder Plus

qwen/qwen3-coder-plus

Qwen3 Coder Plus is the strongest Qwen coding API model, ideal for complex project generation and in-depth code reviews with up to 1M token context support.
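For a code review with one of the coder models, the request can be sent as a list of role-tagged messages instead of a single prompt string. The sketch below only builds that list; `buildReviewMessages` is a hypothetical helper, and it assumes `puter.ai.chat` accepts an array of `{ role, content }` messages followed by an options object.

```javascript
// Hypothetical helper: build a two-message conversation for a code review.
// The system message sets the reviewer's behavior; the user message carries the code.
function buildReviewMessages(code, language) {
    return [
        {
            role: "system",
            content: `You are a careful ${language} code reviewer. List bugs first, then style issues.`,
        },
        {
            role: "user",
            content: "Review this code:\n\n" + code,
        },
    ];
}

// Usage with Puter.js loaded (assumes chat() accepts a message array):
// const source = "function add(a, b) { return a - b; }";
// puter.ai.chat(buildReviewMessages(source, "JavaScript"), {
//     model: "qwen/qwen3-coder-plus",
// }).then(response => console.log(response));
```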

Chat

Qwen3 VL 235B A22B Instruct

qwen/qwen3-vl-235b-a22b-instruct

Qwen3 VL 235B A22B Instruct is the flagship vision-language MoE model with 256K context, offering superior visual coding, spatial understanding, and long video comprehension up to 20 minutes.

Chat

Qwen3 VL 235B A22B Thinking

qwen/qwen3-vl-235b-a22b-thinking

Qwen3 VL 235B A22B Thinking is the reasoning-enhanced vision-language model excelling at visual math, detail analysis, and causal reasoning with extended chain-of-thought processing.

Chat

Qwen3 VL 30B A3B Instruct

qwen/qwen3-vl-30b-a3b-instruct

Qwen3 VL 30B A3B Instruct is an efficient vision-language MoE model offering strong image/video understanding with 3B active parameters and 256K context support.

Chat

Qwen3 VL 30B A3B Thinking

qwen/qwen3-vl-30b-a3b-thinking

Qwen3 VL 30B A3B Thinking is the reasoning-enhanced vision-language variant optimized for complex visual reasoning tasks with extended thinking capabilities.

Chat

Qwen3 VL 32B Instruct

qwen/qwen3-vl-32b-instruct

Qwen3 VL 32B Instruct is a dense vision-language model with strong text and visual capabilities, featuring visual coding, spatial understanding, and 256K context support.

Chat

Qwen3 VL 8B Instruct

qwen/qwen3-vl-8b-instruct

Qwen3 VL 8B Instruct is a compact vision-language model matching flagship text performance while supporting image/video understanding, visual coding, and 256K context length.

Chat

Qwen3 VL 8B Thinking

qwen/qwen3-vl-8b-thinking

Qwen3 VL 8B Thinking is the reasoning-enhanced compact vision model for complex visual analysis requiring step-by-step reasoning with efficient resource usage.

Chat

Qwen3 Max

qwen/qwen3-max

Qwen3 Max is the most powerful Qwen3 API model with SOTA agent programming and tool usage capabilities. It runs in non-thinking mode and is optimized for complex agent scenarios.

Chat

Qwen3 14B

qwen/qwen3-14b

Qwen3 14B is a dense language model with hybrid thinking/non-thinking modes, matching Qwen2.5-32B performance. It supports 119 languages and excels in math, coding, and reasoning tasks.

Chat

Qwen3 235B A22B

qwen/qwen3-235b-a22b

Qwen3 235B A22B is the flagship MoE model with 235B total and 22B active parameters, rivaling DeepSeek-R1 and o1. It features hybrid thinking modes and supports 119 languages with strong agentic capabilities.

Chat

Qwen3 30B A3B

qwen/qwen3-30b-a3b

Qwen3 30B A3B is an efficient MoE model with 30B total and 3B active parameters, outperforming QwQ-32B with about one-tenth of its active parameters. It offers hybrid thinking modes and supports 119 languages.

Chat

Qwen3 32B

qwen/qwen3-32b

Qwen3 32B is a dense language model matching Qwen2.5-72B performance with hybrid thinking/non-thinking modes. It excels in STEM, coding, and reasoning while supporting 119 languages.

Chat

Qwen3 4B

qwen/qwen3-4b:free

Qwen3 4B (Free) is a compact model rivaling Qwen2.5-72B-Instruct performance, featuring hybrid thinking modes and support for 119 languages. Available at no cost for lightweight deployments.

Chat

Qwen3 8B

qwen/qwen3-8b

Qwen3 8B is a dense model matching Qwen2.5-14B performance with hybrid thinking modes and 128K context. It offers strong reasoning, coding, and multilingual capabilities in a mid-sized package.

Chat

Qwen2.5 VL 32B Instruct

qwen/qwen2.5-vl-32b-instruct

Qwen 2.5 VL 32B Instruct is a mid-sized vision-language model offering enhanced image/video understanding with better alignment to human preferences. It bridges the gap between 7B and 72B variants.

Chat

QwQ 32B

qwen/qwq-32b

QwQ 32B is a 32B parameter reasoning model rivaling DeepSeek-R1 (671B) through scaled reinforcement learning. It excels in math, coding, and complex reasoning with 131K context and agent capabilities.

Chat

Qwen-Max

qwen/qwen-max

Qwen Max is Alibaba's most powerful proprietary API model, a large-scale MoE with hundreds of billions of parameters. It delivers top-tier performance in reasoning, coding, math, and multilingual tasks via Alibaba Cloud Model Studio.

Chat

Qwen-Plus

qwen/qwen-plus

Qwen Plus is a high-performance proprietary API model balancing capability and cost, suitable for complex tasks requiring strong reasoning and multilingual support. Available through Alibaba Cloud Model Studio.

Chat

Qwen-Turbo

qwen/qwen-turbo

Qwen Turbo is a fast, cost-effective API model with up to 1M context length, ideal for simple tasks requiring quick responses. It supports multiple languages and offers flexible tiered pricing.

Chat

Qwen2.5-VL 7B Instruct

qwen/qwen-2.5-vl-7b-instruct

Qwen 2.5 VL 7B Instruct is a vision-language model capable of understanding images, documents, charts, and videos up to 1 hour. It supports OCR, visual reasoning, and can act as a visual agent for computer/phone use.
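A vision-language model needs an image alongside the prompt. The sketch below assumes that `puter.ai.chat` accepts an image URL as its second argument, followed by the options object; `describeImage` is a hypothetical wrapper, and the image URL in the usage comment is a placeholder.

```javascript
// Hypothetical wrapper: ask a Qwen vision-language model about an image.
// `ai` is any object exposing chat(prompt, imageUrl, options), such as puter.ai.
async function describeImage(ai, imageUrl, question = "What do you see in this image?") {
    // Reject malformed URLs early; the URL constructor throws a TypeError on bad input.
    new URL(imageUrl);
    return ai.chat(question, imageUrl, { model: "qwen/qwen-2.5-vl-7b-instruct" });
}

// Usage with Puter.js loaded (placeholder URL):
// describeImage(puter.ai, "https://example.com/receipt.png", "Extract the total amount.")
//     .then(response => console.log(response));
```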

Chat

Qwen VL Max

qwen/qwen-vl-max

Qwen VL Max is Alibaba's most capable vision-language API model based on Qwen2.5-VL, offering superior image/video understanding, OCR, document analysis, and visual reasoning capabilities.

Chat

Qwen VL Plus

qwen/qwen-vl-plus

Qwen VL Plus is a balanced vision-language API model offering good performance at lower cost, suitable for image understanding, OCR, and multimodal tasks without requiring maximum capability.

Chat

Qwen2.5 VL 72B Instruct

qwen/qwen2.5-vl-72b-instruct

Qwen 2.5 VL 72B Instruct is the flagship open-source vision-language model excelling in document understanding, visual reasoning, and long video comprehension up to 1 hour with event pinpointing.

Image

Qwen/Qwen-Image

qwen/qwen-image

Chat

Qwen2.5 Coder 32B Instruct

qwen/qwen-2.5-coder-32b-instruct

Qwen 2.5 Coder 32B Instruct is a code-specialized model matching GPT-4o's coding capabilities, supporting 40+ programming languages. It excels in code generation, repair, and reasoning with 128K context support.

Chat

Qwen2.5 72B Instruct

qwen/qwen-2.5-72b-instruct

Qwen 2.5 72B Instruct is Alibaba's flagship open-source language model with 72 billion parameters, trained on 18 trillion tokens with 128K context support. It excels in coding, math, instruction following, and multilingual tasks across 29+ languages.

Chat

Qwen2.5 7B Instruct

qwen/qwen-2.5-7b-instruct

Qwen 2.5 7B Instruct is a compact yet capable language model offering strong performance in coding, math, and general tasks. It supports 128K context length and 29+ languages while being efficient enough for smaller deployments.

Chat

Qwen2.5 Coder 7B Instruct

qwen/qwen2.5-coder-7b-instruct

Qwen 2.5 Coder 7B Instruct is a compact code-specialized model with strong code generation, reasoning, and repair capabilities. It supports multiple programming languages while being deployable on consumer hardware.

Frequently Asked Questions

What is this Qwen API about?

The Qwen API gives you access to models for AI chat and image generation. Through Puter.js, you can start using Qwen models instantly with zero setup or configuration.
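Image generation goes through a different call than chat. The sketch below assumes that `puter.ai.txt2img(prompt, options)` resolves to an image element and that the `qwen/qwen-image` ID listed on this page is accepted through its `model` option; `generateImage` is a hypothetical helper.

```javascript
// Hypothetical helper: generate an image and attach it to a container element.
// `ai` is any object exposing txt2img(prompt, options), such as puter.ai.
async function generateImage(ai, prompt, container) {
    // Assumption: the resolved value is an element that can be appended to the DOM.
    const image = await ai.txt2img(prompt, { model: "qwen/qwen-image" });
    container.appendChild(image);
    return image;
}

// Usage in a page that has loaded Puter.js:
// generateImage(puter.ai, "A watercolor painting of a lighthouse at dawn", document.body);
```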

Which Qwen models can I use?

Puter.js supports a variety of Qwen models, including Qwen3 Max Thinking, Qwen3 Coder Next, Qwen3 Next 80B A3B Instruct, and more. Find all AI models supported by Puter.js in the AI model list.
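With this many models available, one practical way to choose is to send the same prompt to a few of them and compare the replies. The sketch below is illustrative: `compareModels` is a hypothetical helper, and it assumes the chat response converts to readable text via `String(...)`.

```javascript
// Hypothetical helper: send one prompt to several models in parallel.
// Promise.allSettled keeps going when a model fails, so one bad ID
// does not discard the other results.
async function compareModels(ai, prompt, models) {
    const settled = await Promise.allSettled(
        models.map((model) => ai.chat(prompt, { model }))
    );
    return settled.map((result, index) => ({
        model: models[index],
        ok: result.status === "fulfilled",
        // Assumption: String(response) yields the reply text.
        output: result.status === "fulfilled"
            ? String(result.value)
            : String(result.reason),
    }));
}

// Usage with Puter.js loaded:
// compareModels(puter.ai, "Summarize the CAP theorem in two sentences.", [
//     "qwen/qwen3-max",
//     "qwen/qwen3-32b",
//     "qwen/qwen-turbo",
// ]).then(rows => console.table(rows));
```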

How much does it cost?

With the User-Pays model, users cover their own AI costs through their Puter account. This means you can build apps without worrying about infrastructure expenses.

What is Puter.js?

Puter.js is a JavaScript library that provides access to AI, storage, and other cloud services directly from a single API. It handles authentication, infrastructure, and scaling so you can focus on building your app.

Does this work with React / Vue / Vanilla JS / Node / etc.?

Yes, the Qwen API through Puter.js works with any JavaScript framework, Node.js, or plain HTML. Just include the library and start building. See the documentation for more details.