Xiaomi MiMo V2.5 Models Are Now Available in Puter.js
Puter.js now supports Xiaomi's MiMo V2.5 models: MiMo-V2.5 and MiMo-V2.5-Pro. Both ship with a 1M-token context window, native multimodal perception, and frontier-class agentic performance at a fraction of the cost of comparable closed models. Add them to your application for free, with no API keys required.
What is MiMo-V2.5?
MiMo-V2.5 is Xiaomi's native omnimodal model — text, images, video, and audio all flow through a single unified architecture. It's trained from the start to see, hear, and act on what it perceives, and it surpasses the earlier MiMo-V2-Pro on agentic performance at roughly half the token cost. Key highlights:
- Native omnimodal: text, image, video, and audio in one model, no bolt-on encoders
- Strong video understanding: 87.7 on Video-MME, competitive with Gemini 3 Pro
- Image reasoning: 81.0 on CharXiv RQ and 77.9 on MMMU-Pro
- Agentic frontier: 62.3 on ClawEval (general) and 23.8 on ClawEval Multimodal, matching Claude Sonnet 4.6
- 1M-token context window with up to 131K output tokens
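The context and output limits above are easy to sanity-check before sending a large request. The sketch below is an illustrative helper, not part of Puter.js, and the token counts are assumptions:

```javascript
// Illustrative budget check for MiMo-V2.5 (not a Puter.js API):
// a request fits if prompt tokens plus the reserved output budget
// stay within the 1M-token context window.
const CONTEXT_WINDOW = 1_000_000; // 1M-token context window
const MAX_OUTPUT = 131_000;       // up to 131K output tokens

function fitsInContext(promptTokens, reservedOutput = MAX_OUTPUT) {
    // Never reserve more output than the model can actually emit
    const output = Math.min(reservedOutput, MAX_OUTPUT);
    return promptTokens + output <= CONTEXT_WINDOW;
}

console.log(fitsInContext(800_000)); // true  (800K + 131K <= 1M)
console.log(fitsInContext(900_000)); // false (900K + 131K >  1M)
```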
What is MiMo-V2.5-Pro?
MiMo-V2.5-Pro is Xiaomi's most capable model, purpose-built for long-horizon agentic workflows and complex software engineering that can span over a thousand tool calls in a single trajectory. It sits alongside Claude Opus 4.6 and GPT-5.4 on most agentic evaluations while using roughly 40–60% fewer tokens per trajectory. Key highlights:
- SWE-bench Pro: 57.2 — matching flagship closed models on real-world software engineering
- ClawEval: 63.8 and τ3-Bench: 72.9 — frontier-tier agentic reasoning
- 1M-token context window with 131K max output for entire codebases and extended sessions
- Sustained coherence across long trajectories — demonstrated on a 4.3-hour Rust compiler build using 672 tool calls
- Cost-efficient: $1 / $3 per million input/output tokens, roughly one-fifth the price of comparable frontier models
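That pricing works out as simple arithmetic. The helper below is hypothetical and purely illustrative (Puter.js itself requires no API keys or billing code):

```javascript
// Hypothetical cost estimator based on the published pricing:
// $1 per million input tokens, $3 per million output tokens.
// (Illustrative only -- not part of Puter.js.)
function estimateCostUSD(inputTokens, outputTokens) {
    const INPUT_RATE = 1;  // USD per 1M input tokens
    const OUTPUT_RATE = 3; // USD per 1M output tokens
    return (inputTokens / 1e6) * INPUT_RATE + (outputTokens / 1e6) * OUTPUT_RATE;
}

// A long agentic trajectory: 800K tokens in, 120K tokens out
console.log(estimateCostUSD(800_000, 120_000)); // ~ $1.16
```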
Examples
Visual analysis with MiMo-V2.5
```javascript
puter.ai.chat(
    "Describe what's happening in this image.",
    "https://assets.puter.site/doge.jpeg",
    { model: 'xiaomi/mimo-v2.5' }
).then(response => {
    puter.print(response);
});
```
Long-horizon agentic coding with MiMo-V2.5-Pro
```javascript
const response = await puter.ai.chat(
    "Plan the implementation of a distributed rate limiter using Redis. Cover token bucket vs sliding window tradeoffs, failure modes, and a phased rollout strategy.",
    { model: 'xiaomi/mimo-v2.5-pro', stream: true }
);

for await (const part of response) {
    puter.print(part?.text);
}
```
Streaming with reasoning
```javascript
const response = await puter.ai.chat(
    "Refactor this codebase to extract the auth layer into its own service. Walk through the migration step by step.",
    { model: 'xiaomi/mimo-v2.5-pro', stream: true }
);

for await (const part of response) {
    if (part?.reasoning) puter.print(part.reasoning);
    else if (part?.text) puter.print(part.text);
}
```
Get Started Now
Just add one library to your project:
```javascript
// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';
```
Or add one script tag to your HTML:
```html
<script src="https://js.puter.com/v2/"></script>
```
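In the browser, a minimal complete page might look like this (the prompt is just an example):

```html
<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        // Ask MiMo-V2.5 a question and print the reply -- no API key needed
        puter.ai.chat(
            "What can you tell me about MiMo-V2.5?",
            { model: 'xiaomi/mimo-v2.5' }
        ).then(response => {
            puter.print(response);
        });
    </script>
</body>
</html>
```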
No API keys needed. Start building with the MiMo V2.5 models immediately.
Learn more:
Free, Serverless AI and Cloud