Inception: Mercury
This model is no longer available.

Model Card
Mercury is the world's first commercial-scale diffusion large language model from Inception Labs. It generates text through iterative parallel refinement rather than sequential token prediction, enabling dramatically higher throughput without sacrificing output quality.
It matches the performance of frontier speed-optimized models such as GPT-4o Mini and Gemini 1.5 Flash across knowledge, coding, instruction-following, and math benchmarks, while running up to 10x faster. It is OpenAI API-compatible for straightforward integration.
Mercury is well-suited for API use cases that demand high concurrency, fast response times, or cost efficiency — including chat, summarization, and general-purpose text generation at scale.
| Spec | Value |
|---|---|
| Context Window | 128K tokens |
| Max Output | 32K tokens |
| Input Cost | $0.25 per million tokens |
| Output Cost | $0.75 per million tokens |
| Release Date | Feb 24, 2025 |
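The per-token prices above translate directly into a request-level cost estimate. A minimal sketch (the rates are taken from the pricing table; the helper function is illustrative, not part of any API):

```javascript
// USD per million tokens, from the pricing table above.
const INPUT_RATE = 0.25;
const OUTPUT_RATE = 0.75;

// Estimate the cost of a single request in USD.
function estimateCostUSD(inputTokens, outputTokens) {
  return (inputTokens / 1e6) * INPUT_RATE + (outputTokens / 1e6) * OUTPUT_RATE;
}

// A request using the full 128K-token context and a 32K-token response
// costs roughly $0.056:
console.log(estimateCostUSD(128_000, 32_000).toFixed(4));
```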
Code Example
Add AI to your app with the Puter.js AI API — no API keys or setup required.
In Node.js or a bundled project:

```javascript
// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

puter.ai.chat("Explain quantum computing in simple terms").then(response => {
  document.body.innerHTML = response.message.content;
});
```

Or directly in the browser:

```html
<html>
<body>
  <script src="https://js.puter.com/v2/"></script>
  <script>
    puter.ai.chat("Explain quantum computing in simple terms").then(response => {
      document.body.innerHTML = response.message.content;
    });
  </script>
</body>
</html>
```
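`puter.ai.chat` accepts an options object as its second argument for selecting a specific model. The exact Mercury model identifier on Puter is an assumption here, and the helper is illustrative only; check Puter's model list for the correct string:

```javascript
// Hypothetical model identifier — confirm the exact string in Puter's model list.
const MERCURY_MODEL_ID = "mercury";

// Build the options object puter.ai.chat accepts: `model` selects the
// backend model, `stream` requests incremental output.
function chatOptions(model, stream = false) {
  return { model, stream };
}

// Browser usage (assumes https://js.puter.com/v2/ is loaded):
// puter.ai.chat("Explain quantum computing in simple terms", chatOptions(MERCURY_MODEL_ID))
//   .then(response => document.body.innerHTML = response.message.content);
```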
Frequently Asked Questions
How can I access Mercury?

You can access Mercury by Inception through the Puter.js AI API. Include the library in your web app or Node.js project and start making calls with just a few lines of JavaScript — no backend and no configuration required. You can also use it with Python or cURL via Puter's OpenAI-compatible API.
Is Mercury free to use?

Yes, it is free if you're using it through Puter.js. With the User-Pays Model, you can add Mercury to your app at no cost — your users pay for their own AI usage directly, making it completely free for you as a developer.
How much does Mercury cost?

| | Price per 1M tokens |
|---|---|
| Input | $0.25 |
| Output | $0.75 |
Who created Mercury, and when was it released?

Mercury was created by Inception and released on Feb 24, 2025.
What is Mercury's context window?

Mercury supports a context window of 128K tokens. For reference, that is roughly equivalent to 256 pages of text.
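The 256-page figure follows from two rough heuristics — about 0.75 words per token and about 375 words per page — neither of which is a property of the model itself:

```javascript
// Convert a token count into an approximate page count.
// wordsPerToken (~0.75) and wordsPerPage (~375) are rough heuristics.
function approxPages(tokens, wordsPerToken = 0.75, wordsPerPage = 375) {
  return (tokens * wordsPerToken) / wordsPerPage;
}

// 128K tokens × 0.75 words/token = 96,000 words; 96,000 / 375 = 256 pages.
console.log(approxPages(128_000)); // 256
```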
What is Mercury's maximum output length?

Mercury can generate up to 32K tokens in a single response.
Can I use Mercury with JavaScript frameworks?

Yes — the Mercury API works with any JavaScript framework, Node.js, or plain HTML through Puter.js. Just include the library and start building. See the documentation for more details.