Inception: Mercury

This model is no longer available.

Model Card

Mercury is the world's first commercial-scale diffusion large language model from Inception Labs. It generates text through iterative parallel refinement rather than sequential token prediction, enabling dramatically higher throughput without sacrificing output quality.

It matches the performance of frontier speed-optimized models such as GPT-4o Mini and Gemini 1.5 Flash across knowledge, coding, instruction-following, and math benchmarks, while running up to 10x faster. It is OpenAI API-compatible for straightforward integration.
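As a sketch of what that compatibility implies, the request body below follows the standard OpenAI Chat Completions schema. The model id "mercury" and the endpoint path are assumptions; substitute whatever identifier and base URL your provider lists.

```javascript
// Sketch: an OpenAI-style chat request for Mercury. Because the API is
// OpenAI-compatible, the payload follows the Chat Completions schema.
// The model id "mercury" is an assumption -- use the id your provider lists.
function buildMercuryRequest(prompt) {
  return {
    model: "mercury",
    messages: [{ role: "user", content: prompt }],
  };
}

// POST this JSON to the provider's /v1/chat/completions endpoint.
const body = buildMercuryRequest("Explain quantum computing in simple terms");
console.log(JSON.stringify(body));
```

Any existing OpenAI client library can send this payload unchanged once pointed at the right base URL.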

Mercury is well-suited for API use cases that demand high concurrency, fast response times, or cost efficiency — including chat, summarization, and general-purpose text generation at scale.

Context Window: 128K tokens
Max Output: 32K tokens
Input Cost: $0.25 per million tokens
Output Cost: $0.75 per million tokens
Release Date: Feb 24, 2025

Code Example

Add AI to your app with the Puter.js AI API — no API keys or setup required.

Node.js (npm):

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

puter.ai.chat("Explain quantum computing in simple terms").then(response => {
    console.log(response.message.content);
});

Browser (no build step):

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat("Explain quantum computing in simple terms").then(response => {
            document.body.innerHTML = response.message.content;
        });
    </script>
</body>
</html>

Frequently Asked Questions

How do I use Mercury?

You can access Mercury by Inception through the Puter.js AI API. Include the library in your web app or Node.js project and start making calls with just a few lines of JavaScript, with no backend and no configuration required. You can also call it from Python or cURL via Puter's OpenAI-compatible API.
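As a minimal sketch, `puter.ai.chat` accepts an options object as its second argument for selecting a model. The model id "mercury" below is an assumption; check Puter's model list for the exact identifier.

```javascript
// Sketch: selecting a specific model in Puter.js. The model id "mercury"
// is an assumption -- consult Puter's model list for the exact name.
function mercuryOptions() {
  return { model: "mercury" };
}

// In a page that loads <script src="https://js.puter.com/v2/"></script>:
if (typeof puter !== "undefined") {
  puter.ai.chat("Summarize diffusion language models in one sentence", mercuryOptions())
    .then(response => console.log(response.message.content));
}
```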

Is Mercury free?

Yes, it is free if you're using it through Puter.js. With the User-Pays Model, you can add Mercury to your app at no cost — your users pay for their own AI usage directly, making it completely free for you as a developer.

What is the pricing for Mercury?

Pricing for Mercury is based on the number of input and output tokens used per request.

Input: $0.25 per 1M tokens
Output: $0.75 per 1M tokens

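As a quick sanity check on these rates, the sketch below estimates the dollar cost of a single request from its token counts:

```javascript
// Sketch: estimating per-request cost for Mercury from token counts,
// using the listed rates ($0.25 input / $0.75 output per million tokens).
const INPUT_RATE = 0.25 / 1_000_000;   // dollars per input token
const OUTPUT_RATE = 0.75 / 1_000_000;  // dollars per output token

function estimateCost(inputTokens, outputTokens) {
  return inputTokens * INPUT_RATE + outputTokens * OUTPUT_RATE;
}

// Example: a 2,000-token prompt with a 500-token reply
console.log(estimateCost(2000, 500)); // 0.000875
```

Note that under Puter's User-Pays Model this cost is borne by the end user, not the developer.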
Who created Mercury?

Mercury was created by Inception and released on Feb 24, 2025.

What is the context window of Mercury?

Mercury supports a context window of 128K tokens. For reference, that is roughly equivalent to 256 pages of text.

What is the max output length of Mercury?

Mercury can generate up to 32K tokens in a single response.

Does it work with React / Vue / Vanilla JS / Node / etc.?

Yes — the Mercury API works with any JavaScript framework, Node.js, or plain HTML through Puter.js. Just include the library and start building. See the documentation for more details.

Get started with Puter.js

Add AI to your application without worrying about API keys or setup.
