NVIDIA Nemotron 3 Ultra Is Now Available in Puter.js

June 5, 2026


Puter.js now supports Nemotron 3 Ultra, NVIDIA's open-weight frontier reasoning model. Add it to your application for free, with no API keys or NVIDIA account required.

What is Nemotron 3 Ultra?

Nemotron 3 Ultra is NVIDIA's most powerful open reasoning model, released on June 4, 2026. It's a 550B-parameter Mixture-of-Experts model with 55B active parameters, built on a hybrid Mamba-Transformer architecture. The architecture interleaves Mamba-2 layers, which give sub-quadratic efficiency on long sequences, with selective attention layers for precise factual recall. Key highlights include:

1M token context window — built for long-running agentic workflows, deep document analysis, and reasoning-heavy tasks across code, math, and science
Hybrid Mamba-Transformer MoE — 512 experts per layer with top-22 routing, delivering up to 5.9x higher inference throughput than comparable open MoE models at on-par accuracy
Frontier-level intelligence — scores 48 on the Artificial Analysis Intelligence Index, leading US open-weight models
Highest non-hallucination score in its class — 78.7 on AA-Omniscience, the best in its comparison set
Configurable reasoning — reasoning-off, regular, and medium-effort modes with inference-time budget control
Fully open — weights, training data, and recipes ship openly, trained with Multi-teacher On-Policy Distillation that distills 10+ specialized teachers into a single student

Choose it for production agentic pipelines, deep document analysis, or reasoning-heavy API workloads where both accuracy and throughput matter.
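
To put the sparsity figures above in perspective, here is a small arithmetic sketch. It uses only the numbers quoted in this post; the variable names are illustrative. The two ratios need not match, since active parameters typically also include layers that run for every token, which is an assumption about MoE models in general rather than something stated here.

```javascript
// Figures quoted in this post
const totalParams = 550e9;    // 550B total parameters
const activeParams = 55e9;    // 55B active parameters per token
const expertsPerLayer = 512;  // experts available in each MoE layer
const expertsRouted = 22;     // top-22 routing

// Share of the model's weights used for any single token
const activeFraction = activeParams / totalParams;
// Share of experts selected in each MoE layer
const routedFraction = expertsRouted / expertsPerLayer;

console.log(`Active parameters per token: ${(activeFraction * 100).toFixed(1)}%`); // 10.0%
console.log(`Experts routed per layer: ${(routedFraction * 100).toFixed(1)}%`);    // 4.3%
```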

Examples

Complex reasoning

puter.ai.chat(
    "Analyze the potential impacts of quantum computing on current encryption methods and suggest strategies for post-quantum cryptography.",
    { model: "nvidia/nemotron-3-ultra-550b-a55b:free" }
).then((response) => {
    // Print the model's reply to the page
    puter.print(response);
});

Streaming with reasoning

const response = await puter.ai.chat(
    "Design a distributed task scheduler with fault tolerance and exactly-once execution guarantees, and explain the trade-offs",
    { model: "nvidia/nemotron-3-ultra-550b-a55b:free", stream: true }
);

for await (const part of response) {
    // Print reasoning chunks and answer chunks as they arrive; skip empty parts
    if (part?.reasoning) puter.print(part.reasoning);
    else if (part?.text) puter.print(part.text);
}
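
If you would rather keep the reasoning trace separate from the final answer than print both inline, one option is to accumulate them into two strings. The sketch below is a minimal illustration, not part of Puter.js: `collectStream` and `mockStream` are hypothetical names, and it assumes each streamed part carries an optional `reasoning` field and an optional `text` field, as in the example above. The mock generator stands in for a real streamed response so the snippet runs on its own.

```javascript
// Hypothetical helper: gathers reasoning and answer text from a streamed
// response into two separate strings.
async function collectStream(stream) {
    let reasoning = "";
    let text = "";
    for await (const part of stream) {
        if (part?.reasoning) reasoning += part.reasoning;
        if (part?.text) text += part.text;
    }
    return { reasoning, text };
}

// Stand-in for a streamed response, so the sketch runs without Puter.js.
async function* mockStream() {
    yield { reasoning: "Consider leases. " };
    yield { reasoning: "Then idempotency keys." };
    yield { text: "Use a lease-based " };
    yield { text: "scheduler." };
}

collectStream(mockStream()).then(({ reasoning, text }) => {
    console.log("Reasoning:", reasoning);
    console.log("Answer:", text);
});
```

In an application, you would pass the `response` returned by `puter.ai.chat(..., { stream: true })` in place of `mockStream()`.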

Get Started Now

Just add one library to your project:

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

Or add one script tag to your HTML:

<script src="https://js.puter.com/v2/"></script>

No API keys and no infrastructure setup. Start building with Nemotron 3 Ultra immediately.
