Moonshot AI: Moonshot v1 32K
moonshotai/moonshot-v1-32k
Access Moonshot v1 32K from Moonshot AI using the Puter.js AI API.
Get Started

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

puter.ai.chat("Explain quantum computing in simple terms", {
    model: "moonshotai/moonshot-v1-32k"
}).then(response => {
    document.body.innerHTML = response.message.content;
});
<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat("Explain quantum computing in simple terms", {
            model: "moonshotai/moonshot-v1-32k"
        }).then(response => {
            document.body.innerHTML = response.message.content;
        });
    </script>
</body>
</html>
# pip install openai
from openai import OpenAI

client = OpenAI(
    base_url="https://api.puter.com/puterai/openai/v1/",
    api_key="YOUR_PUTER_AUTH_TOKEN",
)

response = client.chat.completions.create(
    model="moonshotai/moonshot-v1-32k",
    messages=[
        {"role": "user", "content": "Explain quantum computing in simple terms"}
    ],
)

print(response.choices[0].message.content)
curl https://api.puter.com/puterai/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_PUTER_AUTH_TOKEN" \
  -d '{
    "model": "moonshotai/moonshot-v1-32k",
    "messages": [
      {"role": "user", "content": "Explain quantum computing in simple terms"}
    ]
  }'
Model Card
Moonshot V1 32K is a general-purpose text generation model from Moonshot AI with a 32,000-token context window. It sits in the middle of the Moonshot V1 family, balancing context capacity with cost.
All Moonshot V1 variants share the same model quality — only the context length differs. The 32K window is well-suited for multi-turn conversations, medium-length document summarization, and tasks where inputs and outputs together exceed 8K tokens but don't require the full 128K capacity.
The API is fully OpenAI-compatible, supporting streaming, tool calling, and standard chat completion parameters. The model performs well in both English and Chinese.
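As a sketch of the streaming path mentioned above: with `stream: true`, `puter.ai.chat` is commonly documented to resolve to an async iterable of parts, each of which may carry a text fragment. The `partText` helper and the exact shape of each part are assumptions for illustration, not a definitive contract.

```javascript
// Streaming sketch. Assumes Puter.js is loaded (e.g. via
// <script src="https://js.puter.com/v2/">) and that each streamed
// part may expose a `text` fragment.

// Pure helper: pull the text fragment out of a streamed part, if any.
function partText(part) {
    return (part && typeof part.text === "string") ? part.text : "";
}

async function streamChat(prompt) {
    const stream = await puter.ai.chat(prompt, {
        model: "moonshotai/moonshot-v1-32k",
        stream: true,
    });
    for await (const part of stream) {
        // Append each fragment as it arrives instead of waiting
        // for the full reply.
        document.body.textContent += partText(part);
    }
}

// Only run when Puter.js is actually present (i.e. in the browser).
if (typeof puter !== "undefined") {
    streamChat("Explain quantum computing in simple terms");
}
```

Streaming is worthwhile for the 32K window: long completions render progressively rather than after a multi-second wait.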
| | |
|---|---|
| Context Window | 33K tokens |
| Max Output | 33K tokens |
| Input Cost | $1 per million tokens |
| Output Cost | $3 per million tokens |
| Input Modalities | text |
| Tool Use | Yes |
| Release Date | Jan 31, 2024 |
Model Playground
Try Moonshot v1 32K instantly in your browser.
This playground uses the Puter.js AI API — no API keys or setup required.
More AI Models From Moonshot AI
Find other Moonshot AI models →
Kimi K2.6
Kimi K2.6 is Moonshot AI's latest open-weight multimodal model, built on a 1-trillion-parameter mixture-of-experts architecture with a 256K context window. It excels at agentic coding and long-horizon execution, supporting sustained autonomous workflows with 4,000+ tool calls across languages like Rust, Go, and Python. On key benchmarks, it scores 58.6 on SWE-Bench Pro, 54.0 on HLE with Tools, and 50.0 on Toolathlon — competitive with GPT-5.4 and Claude Opus 4.6 on coding and agent tasks, though trailing them on pure reasoning. The model accepts text, image, and video input, supports both thinking and non-thinking modes, and offers an OpenAI-compatible API. It's a strong pick for developers building multi-step agentic workflows and complex software engineering pipelines.
Kimi K2.5
Kimi K2.5 is Moonshot AI's most capable open-source model, a natively multimodal (vision + text) trillion-parameter MoE with 32B active parameters released in January 2026. Built through continual pretraining on ~15 trillion mixed visual and text tokens atop the K2 base, it supports both thinking and instant modes with a 256K context window. It scored 76.8% on SWE-bench Verified, 96.1% on AIME 2025, and 50.2% on Humanity's Last Exam with tools — outperforming Claude Opus 4.5 and GPT-5.2 on the latter. Its standout feature is Agent Swarm, which coordinates up to 100 parallel sub-agents for complex tasks. K2.5 excels at vision-to-code generation, frontend development from screenshots, and large-scale agentic workflows, making it a strong choice for developers building multimodal AI agents.
Kimi K2 0905
Kimi K2 0905 is Moonshot AI's September 2025 update to the original Kimi K2, delivering enhanced coding performance and improved tool-calling reliability. It shares the same 1-trillion-parameter MoE architecture with 32B active parameters but doubles the context window from 128K to 256K tokens. Key improvements include stronger frontend development capabilities — producing cleaner, more polished UI code for frameworks like React, Vue, and Angular — along with better integration across popular agent scaffolds. It scored 53.7% Pass@1 on LiveCodeBench. This version is ideal for developers who want K2's agentic strengths with improved real-world coding quality and longer context support for large codebases.
Frequently Asked Questions
How do I access Moonshot v1 32K?

You can access Moonshot v1 32K by Moonshot AI through the Puter.js AI API. Include the library in your web app or Node.js project and start making calls with just a few lines of JavaScript — no backend and no configuration required. You can also use it with Python or cURL via Puter's OpenAI-compatible API.
Is Moonshot v1 32K free to use?

Yes, it is free if you're using it through Puter.js. With the User-Pays Model, you can add Moonshot v1 32K to your app at no cost — your users pay for their own AI usage directly, making it completely free for you as a developer.
How much does Moonshot v1 32K cost?

| | Price per 1M tokens |
|---|---|
| Input | $1 |
| Output | $3 |
Who created Moonshot v1 32K?

Moonshot v1 32K was created by Moonshot AI and released on Jan 31, 2024.
What is the context window of Moonshot v1 32K?

Moonshot v1 32K supports a context window of 33K tokens. For reference, that is roughly equivalent to 66 pages of text.
How long can Moonshot v1 32K's responses be?

Moonshot v1 32K can generate up to 33K tokens in a single response.
What input and output types does Moonshot v1 32K support?

Moonshot v1 32K accepts text input and produces text output.
Does Moonshot v1 32K support tool use (function calling)?

Yes, Moonshot v1 32K supports tool use (function calling), allowing it to interact with external tools, APIs, and data sources as part of its response flow.
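As a sketch of how tool use could look with Puter.js: the example below assumes `puter.ai.chat` accepts OpenAI-style tool definitions via a `tools` option and surfaces requested calls on `response.message.tool_calls`. The `get_time` tool, the `localTools` registry, and the `dispatch` helper are hypothetical names introduced for illustration.

```javascript
// Tool-use sketch. Assumes Puter.js is loaded and accepts OpenAI-style
// tool schemas; verify the option names against the Puter.js docs.

// A hypothetical local function the model may ask us to run.
const localTools = {
    get_time: () => new Date().toISOString(),
};

// OpenAI-style schema describing get_time to the model.
const tools = [{
    type: "function",
    function: {
        name: "get_time",
        description: "Return the current time as an ISO 8601 string",
        parameters: { type: "object", properties: {} },
    },
}];

// Pure helper: run one requested tool call against the local registry.
function dispatch(toolCall) {
    const fn = localTools[toolCall.function.name];
    const args = JSON.parse(toolCall.function.arguments || "{}");
    return fn ? fn(args) : null;
}

async function askWithTools(prompt) {
    const response = await puter.ai.chat(prompt, {
        model: "moonshotai/moonshot-v1-32k",
        tools,
    });
    // If the model requested a tool, run it locally. A complete loop
    // would send each result back in a follow-up "tool" message.
    for (const call of response.message.tool_calls ?? []) {
        console.log(call.function.name, "->", dispatch(call));
    }
    return response;
}

// Only run when Puter.js is actually present (i.e. in the browser).
if (typeof puter !== "undefined") {
    askWithTools("What time is it right now?");
}
```

Keeping the tool registry and dispatcher separate from the chat call makes the model's requested actions easy to validate before executing them.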
Does the Moonshot v1 32K API work with JavaScript frameworks?

Yes — the Moonshot v1 32K API works with any JavaScript framework, Node.js, or plain HTML through Puter.js. Just include the library and start building. See the documentation for more details.
Get started with Puter.js
Add Moonshot v1 32K to your app without worrying about API keys or setup.
Read the Docs View Tutorials