Moonshot AI: Kimi K2 0711

moonshotai/kimi-k2

Access Kimi K2 0711 from Moonshot AI using the Puter.js AI API.

Get Started

JavaScript (npm)

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

puter.ai.chat("Explain quantum computing in simple terms", {
    model: "moonshotai/kimi-k2"
}).then(response => {
    document.body.innerHTML = response.message.content;
});

HTML

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat("Explain quantum computing in simple terms", {
            model: "moonshotai/kimi-k2"
        }).then(response => {
            document.body.innerHTML = response.message.content;
        });
    </script>
</body>
</html>

Python

# pip install openai
from openai import OpenAI

client = OpenAI(
    base_url="https://api.puter.com/puterai/openai/v1/",
    api_key="YOUR_PUTER_AUTH_TOKEN",
)

response = client.chat.completions.create(
    model="moonshotai/kimi-k2",
    messages=[
        {"role": "user", "content": "Explain quantum computing in simple terms"}
    ],
)

print(response.choices[0].message.content)

cURL

curl https://api.puter.com/puterai/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_PUTER_AUTH_TOKEN" \
  -d '{
    "model": "moonshotai/kimi-k2",
    "messages": [
      {"role": "user", "content": "Explain quantum computing in simple terms"}
    ]
  }'
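
Where neither the `openai` package nor cURL is available, the same request can be sketched with Python's standard library. This is a minimal sketch, not Puter's official client: the endpoint, headers, and model ID are copied from the examples above, the helper names `build_chat_request` and `send_chat_request` are illustrative, and the response is assumed to follow the OpenAI chat-completions shape used in the Python example.

```python
import json
import urllib.request

# Endpoint and model ID copied from the cURL example above.
PUTER_CHAT_URL = "https://api.puter.com/puterai/openai/v1/chat/completions"
MODEL_ID = "moonshotai/kimi-k2"


def build_chat_request(prompt: str, auth_token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the cURL call."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        PUTER_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {auth_token}",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> str:
    """Send the request and return the first choice's text.

    Assumes an OpenAI-style response body: choices[0].message.content.
    """
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

To use it, pass the result of `build_chat_request("Explain quantum computing in simple terms", "YOUR_PUTER_AUTH_TOKEN")` to `send_chat_request`. Keeping construction and sending separate makes the request easy to inspect before any network call is made.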

Model Card

Kimi K2 is a trillion-parameter Mixture-of-Experts model by Moonshot AI that activates 32 billion parameters per token. Designed as a non-thinking model optimized for agentic use, it targets tool use, code generation, and autonomous problem-solving, with a 128K-token (131,072-token) context window.

On benchmarks reported by Moonshot AI, K2 scored 65.8% on SWE-bench Verified, 75.1% on GPQA-Diamond, 49.5% on AIME 2025, and 66.1 on Tau2-bench, surpassing most open- and closed-source models in non-thinking settings. These figures differ from the Artificial Analysis scores listed under Benchmarks, which come from a separate evaluation. It ranked as the #1 open-source model on the LMSYS Arena leaderboard upon release in July 2025.

K2 is well suited for developers building AI agents and tool-calling pipelines who need strong coding and reasoning without extended thinking overhead.
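
Since the model card highlights tool calling, the following sketch shows what one tool-calling round trip might look like in the OpenAI chat-completions format. It assumes Puter's OpenAI-compatible endpoint forwards the `tools` parameter to the model, which this page does not confirm. The `get_weather` tool and all helper names are hypothetical.

```python
import json

# Hypothetical tool, described in the OpenAI chat-completions "tools" format.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def get_weather(city: str) -> str:
    """Stand-in implementation; a real tool would query a weather service."""
    return f"Sunny in {city}"


# Maps tool names the model may request to local Python callables.
LOCAL_TOOLS = {"get_weather": get_weather}


def build_tool_request(prompt: str) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": "moonshotai/kimi-k2",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [WEATHER_TOOL],
    }


def run_tool_call(tool_call: dict) -> dict:
    """Execute one OpenAI-style tool call and build the reply message.

    `tool_call` is expected to look like:
    {"id": "...", "function": {"name": "...", "arguments": "<JSON string>"}}
    """
    name = tool_call["function"]["name"]
    arguments = json.loads(tool_call["function"]["arguments"])
    result = LOCAL_TOOLS[name](**arguments)
    return {"role": "tool", "tool_call_id": tool_call["id"], "content": result}
```

In an agent loop, the message returned by `run_tool_call` would be appended to `messages` and the conversation sent back to the model so it can produce a final answer.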

Context Window: 131K tokens
Max Output: 100K tokens
Input Cost: $0.57 per million tokens
Output Cost: $2.30 per million tokens
Release Date: Jul 11, 2025
Output Speed: 36 tokens / sec
Latency: 1.18s time to first token
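
The speed and latency figures give a rough way to estimate how long a response takes: time to first token plus output tokens divided by output speed. The sketch below assumes both figures stay constant, which real workloads will not guarantee. The function name is illustrative.

```python
# Figures copied from the Output Speed and Latency entries above;
# real-world values vary with load and prompt size.
TIME_TO_FIRST_TOKEN_S = 1.18
OUTPUT_TOKENS_PER_S = 36.0


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate wall-clock time for a response of `output_tokens` tokens."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TIME_TO_FIRST_TOKEN_S + output_tokens / OUTPUT_TOKENS_PER_S


print(f"{estimate_response_seconds(360):.2f} s")  # 1.18 + 360/36 = 11.18 s
```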

Benchmarks

How Kimi K2 0711 performs on standard evaluations.

Artificial Analysis Intelligence Index: 19.4 (better than 59% of tracked models)

Artificial Analysis Math Index: 57.0 (better than 54% of tracked models)

Benchmark	Description	Score
GPQA Diamond	Graduate-level science Q&A	76.6%
Humanity's Last Exam	Cross-domain reasoning	7.0%
LiveCodeBench	Recent coding problems	55.6%
SciCode	Scientific programming	34.5%
MATH-500	Competition math	97.1%
AIME 2024	Advanced math exam	69.3%
AIME 2025	Advanced math exam	57.0%
IFBench	Instruction following	41.5%
LCR	Long-context reasoning	51.0%
Terminal-Bench Hard	Agentic terminal tasks	15.9%
τ²-Bench	Tool use / agents	61.1%

Scores sourced from Artificial Analysis.

Find other Moonshot AI models →

Kimi K3

Kimi K3 is Moonshot AI's flagship open-weight model, released July 16, 2026, with full weights following on July 27. At roughly 2.8 trillion parameters in a Mixture-of-Experts architecture, Moonshot positions it as the largest open-source model released to date, built on two new components: Kimi Delta Attention, a hybrid linear attention mechanism, and Attention Residuals, a replacement for standard residual connections. It runs in an always-on thinking mode with a 1-million-token context window and accepts text, image, and video input. Reported results include 93.5% on GPQA Diamond, 91.2% on BrowseComp, 88.3% on Terminal-Bench 2.1, and a first-place finish on Arena.ai's Frontend Code Arena, putting it close to Claude Opus 4.8 and GPT-5.5 on several agentic and coding tasks. These figures come from Moonshot and early testers, not independently confirmed leaderboards. It suits developers building long-horizon coding agents and tool-calling pipelines who want frontier-level performance at open-weight pricing.

Kimi K2.7 Code

Kimi K2.7 Code is Moonshot AI's open-weight coding-agent model, released June 2026 and purpose-built for long-horizon, autonomous coding tasks. It shares the same 1-trillion-parameter Mixture-of-Experts architecture (32B active parameters) as K2.6 but is entirely focused on software engineering workloads. Compared to K2.6, it improves 21.8% on Kimi Code Bench v2, 11% on Program Bench, and 31.5% on MLS Bench Lite, while cutting reasoning-token usage by roughly 30%. It always runs in thinking mode — non-thinking mode is not supported. With a 262K-token context window, K2.7 Code is well-suited for multi-file, repository-scale coding pipelines and agentic workflows where sustained reasoning and deep code understanding matter.

Kimi K2.6

Kimi K2.6 is Moonshot AI's latest open-weight multimodal model, built on a 1-trillion-parameter mixture-of-experts architecture with a 256K context window. It excels at agentic coding and long-horizon execution, supporting sustained autonomous workflows with 4,000+ tool calls across languages like Rust, Go, and Python. On key benchmarks, it scores 58.6 on SWE-Bench Pro, 54.0 on HLE with Tools, and 50.0 on Toolathlon — competitive with GPT-5.4 and Claude Opus 4.6 on coding and agent tasks, though trailing them on pure reasoning. The model accepts text, image, and video input, supports both thinking and non-thinking modes, and offers an OpenAI-compatible API. It's a strong pick for developers building multi-step agentic workflows and complex software engineering pipelines.

Frequently Asked Questions

How do I use Kimi K2 0711?

You can access Kimi K2 0711 by Moonshot AI through the Puter.js AI API. Include the library in your web app or Node.js project and start making calls with a few lines of JavaScript, with no backend and no configuration required. You can also use it from Python or cURL via Puter's OpenAI-compatible API, which requires a Puter auth token.

Is Kimi K2 0711 free?

Yes, it is free if you're using it through Puter.js. With the User-Pays Model, you can add Kimi K2 0711 to your app at no cost — your users pay for their own AI usage directly, making it completely free for you as a developer.

What is the pricing for Kimi K2 0711?

Kimi K2 0711 costs $0.57 per 1M input tokens and $2.30 per 1M output tokens.

Token type	Price per 1M tokens
Input	$0.57
Output	$2.30
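
As a worked example of these rates, the sketch below estimates the cost of a single request. The prices are the ones listed above; the function name and the sample token counts are illustrative.

```python
INPUT_PRICE_PER_M = 0.57   # USD per 1M input tokens (from the pricing above)
OUTPUT_PRICE_PER_M = 2.30  # USD per 1M output tokens (from the pricing above)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    return (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_M
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    )


# 10,000 input tokens and 2,000 output tokens:
# 0.01 * 0.57 + 0.002 * 2.30 = 0.0057 + 0.0046 = 0.0103
print(f"${estimate_cost_usd(10_000, 2_000):.4f}")  # $0.0103
```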

Who created Kimi K2 0711?

Kimi K2 0711 was created by Moonshot AI and released on Jul 11, 2025.

What is the context window of Kimi K2 0711?

Kimi K2 0711 supports a context window of 131K tokens. For reference, that is roughly equivalent to 262 pages of text.
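
The page figure appears to assume about 500 tokens per page (131,000 / 262 = 500). That ratio is inferred from the numbers above rather than stated by the source, and actual token counts per page depend on language and formatting. A minimal sketch of the conversion:

```python
TOKENS_PER_PAGE = 500  # Inferred: 131,000 tokens / 262 pages


def tokens_to_pages(tokens: int) -> int:
    """Convert a token count to an approximate page count (rounded down)."""
    return tokens // TOKENS_PER_PAGE


print(tokens_to_pages(131_000))  # 262
```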

What is the max output length of Kimi K2 0711?

Kimi K2 0711 can generate up to 100K tokens in a single response.
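
When choosing an output limit, it can help to check that the prompt and the requested output fit together. The sketch below assumes that input and output tokens share the context window, which is common for chat models but not stated on this page, and it uses round figures of 131,000 and 100,000 tokens for the "131K" and "100K" limits. The function name is illustrative.

```python
CONTEXT_WINDOW_TOKENS = 131_000  # "131K" from this page, rounded
MAX_OUTPUT_TOKENS = 100_000      # "100K" from this page, rounded


def clamp_max_output(prompt_tokens: int, requested_output: int) -> int:
    """Return the largest output budget that fits both limits.

    Assumes prompt and output tokens share the same context window.
    """
    if prompt_tokens >= CONTEXT_WINDOW_TOKENS:
        raise ValueError("prompt alone exceeds the context window")
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    return min(requested_output, MAX_OUTPUT_TOKENS, remaining)


print(clamp_max_output(50_000, 100_000))  # 81000
```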

How does Kimi K2 0711 perform on benchmarks?

Kimi K2 0711 scores 19.4 on the Artificial Analysis Intelligence Index, outperforming 59% of tracked models. On the Artificial Analysis Math Index, it scores 57.0, outperforming 54% of tracked models.

Does it work with React / Vue / Vanilla JS / Node / etc.?

Yes — the Kimi K2 0711 API works with any JavaScript framework, Node.js, or plain HTML through Puter.js. Just include the library and start building. See the documentation for more details.
