Qwen API
Access Qwen instantly with Puter.js and add AI to any app in a few lines of code, with no backend or API keys required.
// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

puter.ai.chat("Explain AI like I'm five!", {
    model: "qwen/qwen-2.5-72b-instruct"
}).then(response => {
    console.log(response);
});
<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat("Explain AI like I'm five!", {
            model: "qwen/qwen-2.5-72b-instruct"
        }).then(response => {
            console.log(response);
        });
    </script>
</body>
</html>
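For long replies you can stream the output as it is generated rather than waiting for the full response. This is a sketch, assuming the `stream: true` option and per-chunk `text` fields described in the Puter.js AI docs; verify the exact response shape against the current documentation.

```javascript
// Browser sketch: stream a Qwen reply chunk by chunk.
// Assumes Puter.js accepts { stream: true } and yields parts with a `text` field.
async function streamQwen(prompt) {
    const response = await puter.ai.chat(prompt, {
        model: "qwen/qwen-2.5-72b-instruct",
        stream: true,
    });
    for await (const part of response) {
        if (part?.text) document.body.append(part.text); // render incrementally
    }
}

streamQwen("Write a short poem about the sea.");
```

Streaming is most useful for chat UIs, where showing partial output keeps the interface responsive during long generations.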
List of Qwen Models
Qwen3 Max Thinking
qwen/qwen3-max-thinking
Qwen3 Max Thinking is Alibaba Cloud's flagship proprietary reasoning model with a 256K context window, featuring test-time scaling and adaptive tool-use capabilities (web search, code interpreter, memory) that allow it to reason iteratively and autonomously. It scores competitively against GPT-5.2 and Gemini 3 Pro on benchmarks like Humanity's Last Exam and HMMT, excelling in math, complex reasoning, and instruction following.
Qwen3 Coder Next
qwen/qwen3-coder-next
Qwen3-Coder-Next is an open-weight coding model from Alibaba's Qwen team with 80B total parameters but only 3B active per token, designed specifically for coding agents and local development with a 256K context window. It uses a sparse Mixture-of-Experts (MoE) architecture with hybrid attention, trained on 800K executable coding tasks using reinforcement learning to excel at long-horizon reasoning, tool calling, and recovering from execution failures. It achieves performance comparable to models with 10-20x more active parameters on benchmarks like SWE-Bench while maintaining low inference costs.
Qwen3 Next 80B A3B Instruct
qwen/qwen3-next-80b-a3b-instruct
Qwen3 Next 80B A3B Instruct is an innovative MoE model with hybrid attention (Gated DeltaNet + Gated Attention), achieving 10x inference throughput for 32K+ contexts while matching Qwen3-235B performance.
Qwen3 Next 80B A3B Thinking
qwen/qwen3-next-80b-a3b-thinking
Qwen3 Next 80B A3B Thinking is the reasoning-enhanced variant outperforming Gemini-2.5-Flash-Thinking on complex reasoning tasks with hybrid attention and multi-token prediction.
Qwen Plus 0728
qwen/qwen-plus-2025-07-28
Qwen Plus (2025-07-28) is a snapshot version of Qwen Plus from July 2025, offering consistent behavior and performance for production deployments requiring version stability.
Qwen Plus 0728 (thinking)
qwen/qwen-plus-2025-07-28:thinking
Qwen Plus (2025-07-28) Thinking is the reasoning-enhanced version that uses chain-of-thought processing for complex problems, providing step-by-step reasoning before delivering answers.
Qwen3 235B A22B Instruct 2507
qwen/qwen3-235b-a22b-2507
Qwen3 235B A22B (2507) is the July 2025 updated version with significant improvements in instruction following, reasoning, coding, tool usage, and 256K long-context understanding.
Qwen3 235B A22B Thinking 2507
qwen/qwen3-235b-a22b-thinking-2507
Qwen3 235B A22B Thinking (2507) is the reasoning-enhanced variant using extended chain-of-thought processing for complex math, coding, and logical problems with enhanced performance.
Qwen3 30B A3B Instruct 2507
qwen/qwen3-30b-a3b-instruct-2507
Qwen3 30B A3B Instruct (2507) is the July 2025 updated instruction-tuned version with improved capabilities in reasoning, coding, and tool usage at high efficiency.
Qwen3 30B A3B Thinking 2507
qwen/qwen3-30b-a3b-thinking-2507
Qwen3 30B A3B Thinking (2507) is the reasoning-enhanced variant optimized for complex problem-solving with extended chain-of-thought processing at high parameter efficiency.
Qwen3 Coder 480B A35B
qwen/qwen3-coder
Qwen3 Coder is the most agentic code model in the Qwen series, available in 30B and 480B MoE variants. It achieves SOTA on SWE-Bench with 256K native context, extendable to 1M tokens.
Qwen3 Coder 30B A3B Instruct
qwen/qwen3-coder-30b-a3b-instruct
Qwen3 Coder 30B A3B Instruct is an efficient MoE coding model with 30B total and 3.3B active parameters, offering strong agentic coding capabilities with 256K context support.
Qwen3 Coder Flash
qwen/qwen3-coder-flash
Qwen3 Coder Flash is a cost-effective coding model balancing performance and speed, suitable for scenarios requiring fast responses at lower cost while maintaining coding quality.
Qwen3 Coder Plus
qwen/qwen3-coder-plus
Qwen3 Coder Plus is the strongest Qwen coding API model, ideal for complex project generation and in-depth code reviews with up to 1M token context support.
Qwen3 VL 235B A22B Instruct
qwen/qwen3-vl-235b-a22b-instruct
Qwen3 VL 235B A22B Instruct is the flagship vision-language MoE model with 256K context, offering superior visual coding, spatial understanding, and long video comprehension up to 20 minutes.
Qwen3 VL 235B A22B Thinking
qwen/qwen3-vl-235b-a22b-thinking
Qwen3 VL 235B A22B Thinking is the reasoning-enhanced vision-language model excelling at visual math, detail analysis, and causal reasoning with extended chain-of-thought processing.
Qwen3 VL 30B A3B Instruct
qwen/qwen3-vl-30b-a3b-instruct
Qwen3 VL 30B A3B Instruct is an efficient vision-language MoE model offering strong image/video understanding with 3B active parameters and 256K context support.
Qwen3 VL 30B A3B Thinking
qwen/qwen3-vl-30b-a3b-thinking
Qwen3 VL 30B A3B Thinking is the reasoning-enhanced vision-language variant optimized for complex visual reasoning tasks with extended thinking capabilities.
Qwen3 VL 32B Instruct
qwen/qwen3-vl-32b-instruct
Qwen3 VL 32B Instruct is a dense vision-language model with strong text and visual capabilities, featuring visual coding, spatial understanding, and 256K context support.
Qwen3 VL 8B Instruct
qwen/qwen3-vl-8b-instruct
Qwen3 VL 8B Instruct is a compact vision-language model matching flagship text performance while supporting image/video understanding, visual coding, and 256K context length.
Qwen3 VL 8B Thinking
qwen/qwen3-vl-8b-thinking
Qwen3 VL 8B Thinking is the reasoning-enhanced compact vision model for complex visual analysis requiring step-by-step reasoning with efficient resource usage.
Qwen3 Max
qwen/qwen3-max
Qwen3 Max is the most powerful Qwen3 API model with SOTA agent programming and tool usage capabilities. It features a non-thinking mode optimized for complex agent scenarios.
Qwen3 14B
qwen/qwen3-14b
Qwen3 14B is a dense language model with hybrid thinking/non-thinking modes, matching Qwen2.5-32B performance. It supports 119 languages and excels in math, coding, and reasoning tasks.
Qwen3 235B A22B
qwen/qwen3-235b-a22b
Qwen3 235B A22B is the flagship MoE model with 235B total and 22B active parameters, rivaling DeepSeek-R1 and o1. It features hybrid thinking modes and supports 119 languages with strong agentic capabilities.
Qwen3 30B A3B
qwen/qwen3-30b-a3b
Qwen3 30B A3B is an efficient MoE model with 30B total and 3B active parameters, outperforming QwQ-32B while using 10x fewer active parameters. It offers hybrid thinking modes and 119 language support.
Qwen3 32B
qwen/qwen3-32b
Qwen3 32B is a dense language model matching Qwen2.5-72B performance with hybrid thinking/non-thinking modes. It excels in STEM, coding, and reasoning while supporting 119 languages.
Qwen3 4B
qwen/qwen3-4b:free
Qwen3 4B (Free) is a compact model rivaling Qwen2.5-72B-Instruct performance, featuring hybrid thinking modes and 119 language support. Available at no cost for lightweight deployments.
Qwen3 8B
qwen/qwen3-8b
Qwen3 8B is a dense model matching Qwen2.5-14B performance with hybrid thinking modes and 128K context. It offers strong reasoning, coding, and multilingual capabilities in a mid-sized package.
Qwen2.5 VL 32B Instruct
qwen/qwen2.5-vl-32b-instruct
Qwen 2.5 VL 32B Instruct is a mid-sized vision-language model offering enhanced image/video understanding with better alignment to human preferences. It bridges the gap between 7B and 72B variants.
QwQ 32B
qwen/qwq-32b
QwQ 32B is a 32B parameter reasoning model rivaling DeepSeek-R1 (671B) through scaled reinforcement learning. It excels in math, coding, and complex reasoning with 131K context and agent capabilities.
Qwen-Max
qwen/qwen-max
Qwen Max is Alibaba's most powerful proprietary API model, a large-scale MoE with hundreds of billions of parameters. It delivers top-tier performance in reasoning, coding, math, and multilingual tasks via Alibaba Cloud Model Studio.
Qwen-Plus
qwen/qwen-plus
Qwen Plus is a high-performance proprietary API model balancing capability and cost, suitable for complex tasks requiring strong reasoning and multilingual support. Available through Alibaba Cloud Model Studio.
Qwen-Turbo
qwen/qwen-turbo
Qwen Turbo is a fast, cost-effective API model with up to 1M context length, ideal for simple tasks requiring quick responses. It supports multiple languages and offers flexible tiered pricing.
Qwen2.5-VL 7B Instruct
qwen/qwen-2.5-vl-7b-instruct
Qwen 2.5 VL 7B Instruct is a vision-language model capable of understanding images, documents, charts, and videos up to 1 hour. It supports OCR, visual reasoning, and can act as a visual agent for computer/phone use.
Qwen VL Max
qwen/qwen-vl-max
Qwen VL Max is Alibaba's most capable vision-language API model based on Qwen2.5-VL, offering superior image/video understanding, OCR, document analysis, and visual reasoning capabilities.
Qwen VL Plus
qwen/qwen-vl-plus
Qwen VL Plus is a balanced vision-language API model offering good performance at lower cost, suitable for image understanding, OCR, and multimodal tasks without requiring maximum capability.
Qwen2.5 VL 72B Instruct
qwen/qwen2.5-vl-72b-instruct
Qwen 2.5 VL 72B Instruct is the flagship open-source vision-language model excelling in document understanding, visual reasoning, and long video comprehension up to 1 hour with event pinpointing.
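Vision-language models like the ones above take an image alongside the text prompt. A minimal browser sketch, assuming the documented `puter.ai.chat(prompt, imageURL, options)` call form; the image URL below is a hypothetical placeholder.

```javascript
// Ask a Qwen VL model to describe an image.
// "https://example.com/photo.jpg" is a placeholder; use a real image URL.
puter.ai.chat(
    "Describe what you see in this image.",
    "https://example.com/photo.jpg",
    { model: "qwen/qwen2.5-vl-72b-instruct" }
).then(response => {
    console.log(response);
});
```

The same pattern works for the other VL models in the list; only the `model` string changes.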
Qwen Image
qwen/qwen-image
Qwen Image is Alibaba's image generation model, known for high-quality image synthesis and accurate text rendering within generated images.
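Image generation can be sketched with `puter.ai.txt2img`. Whether `txt2img` accepts a `model` option may depend on your Puter.js version, so treat the options object here as an assumption to verify against the current docs.

```javascript
// Sketch: generate an image with Qwen Image.
// Assumes txt2img accepts an options object with a `model` field (verify in docs).
puter.ai.txt2img("A watercolor painting of a lighthouse at dusk", {
    model: "qwen/qwen-image",
}).then(image => {
    document.body.appendChild(image); // txt2img resolves to an <img> element
});
```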
Qwen2.5 Coder 32B Instruct
qwen/qwen-2.5-coder-32b-instruct
Qwen 2.5 Coder 32B Instruct is a code-specialized model matching GPT-4o's coding capabilities, supporting 40+ programming languages. It excels in code generation, repair, and reasoning with 128K context support.
Qwen2.5 72B Instruct
qwen/qwen-2.5-72b-instruct
Qwen 2.5 72B Instruct is Alibaba's flagship open-source language model with 72 billion parameters, trained on 18 trillion tokens with 128K context support. It excels in coding, math, instruction following, and multilingual tasks across 29+ languages.
Qwen2.5 7B Instruct
qwen/qwen-2.5-7b-instruct
Qwen 2.5 7B Instruct is a compact yet capable language model offering strong performance in coding, math, and general tasks. It supports 128K context length and 29+ languages while being efficient enough for smaller deployments.
Qwen2.5 Coder 7B Instruct
qwen/qwen2.5-coder-7b-instruct
Qwen 2.5 Coder 7B Instruct is a compact code-specialized model with strong code generation, reasoning, and repair capabilities. It supports multiple programming languages while being deployable on consumer hardware.
Frequently Asked Questions

What is the Qwen API?
The Qwen API gives you access to models for AI chat and image generation. Through Puter.js, you can start using Qwen models instantly with zero setup or configuration.

Which Qwen models does Puter.js support?
Puter.js supports a variety of Qwen models, including Qwen3 Max Thinking, Qwen3 Coder Next, Qwen3 Next 80B A3B Instruct, and more. Find all AI models supported by Puter.js in the AI model list.

How much does the Qwen API cost?
With the User-Pays model, users cover their own AI costs through their Puter account. This means you can build apps without worrying about infrastructure expenses.

What is Puter.js?
Puter.js is a JavaScript library that provides access to AI, storage, and other cloud services directly from a single API. It handles authentication, infrastructure, and scaling so you can focus on building your app.

Can I use the Qwen API with my framework?
Yes: the Qwen API through Puter.js works with any JavaScript framework, Node.js, or plain HTML. Just include the library and start building. See the documentation for more details.