NVIDIA Nemotron 3 Nano Omni Is Now Available in Puter.js

April 29, 2026

Puter.js now supports Nemotron 3 Nano Omni, NVIDIA's open multimodal model that unifies text, image, video, and audio understanding in a single inference pass.

What is NVIDIA Nemotron 3 Nano Omni?

Nemotron 3 Nano Omni is a 30-billion-parameter hybrid Mamba-Transformer Mixture-of-Experts (MoE) model with only ~3B active parameters per token. By integrating vision and audio encoders directly into the MoE backbone, it removes the need for separate perception models and replaces fragmented multi-model pipelines with a single call. Highlights include:

Unified Multimodal Input: Natively processes text, image, video, and audio in one model with a shared multimodal context across agent loops
Best-in-Class Document Intelligence: Tops leaderboards on MMLongBench-Doc and OCRBenchV2 for long-document reasoning and OCR
Strong Video and Audio Understanding: Leads on WorldSense, DailyOmni, and VoiceBench across video and audio tasks
Up to 9x Higher Throughput: Delivers up to 9x the throughput of other open omni models at the same interactivity, with the highest throughput of any benchmarked model on MediaPerf's video tasks
256K Context Window: Includes an optional reasoning mode for deeper analysis when needed

The model is designed as a multimodal perception sub-agent in agentic systems, excelling at document reasoning, GUI-based computer use, speech transcription, and audio-video analysis.

Examples

Image analysis

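puter.ai.chat(
    "Describe this image in detail and identify any objects you see.",
    "https://assets.puter.site/doge.jpeg",
    { model: "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free" }
).then(response => {
    // Print the model's description once the call resolves.
    puter.print(response);
});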

Document reasoning

puter.ai.chat(
    "Extract the key terms and obligations from this document and summarize them.",
    "https://assets.puter.site/sample-contract.png",
    { model: "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free" }
).then(response => {
    // Print the extracted summary once the call resolves.
    puter.print(response);
});
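
For documents that span several page images, one option is to send the same prompt once per page and collect the results. The following is a minimal sketch, not part of Puter.js: `analyzePages` and its `chat` parameter are hypothetical names, and it assumes the response converts to its text via `String()`. The chat function is passed in so the helper can be tried without a live Puter.js session.

```javascript
const MODEL = "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free";

// Hypothetical helper: run one prompt against several page images, one call per page.
// `chat` is injected; in a page that loads Puter.js, pass (...args) => puter.ai.chat(...args).
async function analyzePages(chat, prompt, pageUrls) {
    const results = [];
    for (const url of pageUrls) {
        try {
            const response = await chat(prompt, url, { model: MODEL });
            // Assumption: the response stringifies to the model's text output.
            results.push({ url, ok: true, summary: String(response) });
        } catch (error) {
            // Record the failure and continue with the remaining pages.
            results.push({ url, ok: false, error: String(error?.message ?? error) });
        }
    }
    return results;
}
```

Pages are processed sequentially so that a failure on one page is recorded without stopping the rest. Whether parallel requests would be appropriate depends on usage limits that this post does not describe.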

Streaming with reasoning

async function streamResponse() {
    const response = await puter.ai.chat(
        "Walk through how you would analyze a video frame-by-frame for object tracking.",
        { model: "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free", stream: true }
    );

    for await (const part of response) {
        // Print reasoning chunks when present, otherwise answer text; skip empty parts.
        if (part?.reasoning)
            puter.print(part.reasoning);
        else if (part?.text)
            puter.print(part.text);
    }
}

streamResponse();
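
If you want to keep the reasoning trace separate from the final answer instead of printing both as they arrive, the streamed parts can be accumulated into two strings. This is a minimal sketch under the same assumption as the example above, namely that each part carries either a `reasoning` or a `text` field. `collectStream` and `mockStream` are hypothetical names used for illustration; the mock stands in for the stream returned by `puter.ai.chat` with `stream: true`.

```javascript
// Hypothetical helper: split a streamed response into reasoning and answer text.
async function collectStream(parts) {
    let reasoning = "";
    let text = "";
    for await (const part of parts) {
        if (part?.reasoning) reasoning += part.reasoning;
        else if (part?.text) text += part.text;
        // Parts with neither field are ignored.
    }
    return { reasoning, text };
}

// Stand-in stream for local experiments; replace with the real streamed response.
async function* mockStream() {
    yield { reasoning: "Think about frames. " };
    yield { reasoning: "Track objects." };
    yield { text: "Step 1: " };
    yield {};
    yield { text: "detect objects." };
}

collectStream(mockStream()).then(({ reasoning, text }) => {
    console.log("Reasoning:", reasoning);
    console.log("Answer:", text);
});
```

Separating the two makes it straightforward to show only the answer to users while logging the reasoning for debugging.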

Get Started Now

Just add one library to your project:

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

Or add one script tag to your HTML:

<script src="https://js.puter.com/v2/"></script>

No API keys or NVIDIA account needed. Start building with Nemotron 3 Nano Omni immediately.
