Xiaomi: MiMo-V2-Flash API
Access Xiaomi: MiMo-V2-Flash using the Puter.js AI API.
Model ID: xiaomi/mimo-v2-flash
Model Card
MiMo-V2-Flash is Xiaomi's open-source Mixture-of-Experts language model with 309B total parameters (15B active), designed for high-speed reasoning, coding, and agentic workflows. It uses a hybrid attention architecture with Multi-Token Prediction to achieve up to 150 tokens/second inference while keeping costs extremely low. The model excels at software engineering benchmarks and supports a 256K context window.
Context Window
256K tokens
Max Output
N/A
Input Cost
$0.09 per million tokens
Output Cost
$0.29 per million tokens
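At these rates, the cost of a request is easy to estimate from its token counts. Below is a minimal sketch; estimateCost and the sample token counts are illustrative only and not part of the Puter.js API.

// Rough cost estimate at $0.09 per 1M input tokens and $0.29 per 1M output tokens.
// estimateCost is a hypothetical helper for illustration, not a Puter.js function.
function estimateCost(inputTokens, outputTokens) {
  const inputCost = (inputTokens / 1_000_000) * 0.09;
  const outputCost = (outputTokens / 1_000_000) * 0.29;
  return inputCost + outputCost;
}

// Example: a 2,000-token prompt with a 500-token reply
console.log(estimateCost(2000, 500).toFixed(6)); // ≈ $0.000325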
API Usage Example
Add Xiaomi: MiMo-V2-Flash to your app with just a few lines of code.
No API keys, no backend, no configuration required.
<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        // Send a prompt to MiMo-V2-Flash and render the reply in the page
        puter.ai.chat("Explain quantum computing in simple terms", {
            model: "xiaomi/mimo-v2-flash"
        }).then(response => {
            document.body.innerHTML = response.message.content;
        });
    </script>
</body>
</html>
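For longer replies, you can stream the output as it is generated instead of waiting for the full message. This is a sketch that assumes puter.ai.chat accepts a stream option and yields chunks with a text field, as described in the general Puter.js AI documentation.

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        // Stream the model's reply chunk by chunk.
        // Assumption: { stream: true } returns an async iterable whose parts carry a `text` field.
        (async () => {
            const stream = await puter.ai.chat("Explain quantum computing in simple terms", {
                model: "xiaomi/mimo-v2-flash",
                stream: true
            });
            for await (const part of stream) {
                if (part?.text) {
                    document.body.innerHTML += part.text;
                }
            }
        })();
    </script>
</body>
</html>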
Get started with Puter.js
Add Xiaomi: MiMo-V2-Flash to your app without worrying about API keys or setup.
Read the Docs · View Tutorials