NVIDIA Nemotron 3 Nano Omni Is Now Available in Puter.js

Puter.js now supports Nemotron 3 Nano Omni, NVIDIA's open multimodal model that unifies text, image, video, and audio understanding in a single inference pass.

What is NVIDIA Nemotron 3 Nano Omni?

Nemotron 3 Nano Omni is a 30-billion-parameter hybrid Mamba-Transformer Mixture-of-Experts (MoE) model in which only about 3B parameters are active per token. By integrating vision and audio encoders directly into the MoE backbone, it removes the need for separate perception models and replaces fragmented multi-model pipelines with a single call. Key highlights include:

  • Unified Multimodal Input: Natively processes text, image, video, and audio in one model with a shared multimodal context across agent loops
  • Best-in-Class Document Intelligence: Tops leaderboards on MMLongBench-Doc and OCRBenchV2 for long-document reasoning and OCR
  • Strong Video and Audio Understanding: Leads on WorldSense, DailyOmni, and VoiceBench across video and audio tasks
  • Up to 9x Higher Throughput: Delivers up to 9x the throughput of other open omni models at the same interactivity, the highest of any model benchmarked on MediaPerf's video tasks
  • 256K Context Window: With an optional reasoning mode for deeper analysis when needed

The model is designed as a multimodal perception sub-agent in agentic systems, excelling at document reasoning, GUI-based computer use, speech transcription, and audio-video analysis.

Examples

Image analysis

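// puter.ai.chat() returns a promise; print the reply once it resolves.
puter.ai.chat(
    "Describe this image in detail and identify any objects you see.",
    "https://assets.puter.site/doge.jpeg",
    { model: "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free" }
).then(response => puter.print(response));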

Document reasoning

// Same call shape as above: prompt, image URL, then options.
puter.ai.chat(
    "Extract the key terms and obligations from this document and summarize them.",
    "https://assets.puter.site/sample-contract.png",
    { model: "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free" }
).then(response => puter.print(response));
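Both calls above share one shape: a prompt, an image URL, and an options object naming the model. As a rough illustration of the "perception sub-agent" role described earlier, that shape could be wrapped in a small helper that an orchestrating agent calls. This is only a sketch: `createPerceptionAgent`, `perceive`, and `stubChat` are hypothetical names, not part of Puter.js, and the chat function is injected so the example runs without a network call. It also assumes the response converts to its text via `String()`.

```javascript
// Hypothetical wrapper (not part of Puter.js): exposes the model as a
// perception step that an orchestrating agent can call.
const MODEL = "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free";

function createPerceptionAgent(chat, model = MODEL) {
    return {
        // Ask one question about one image URL and return the reply as a string.
        // Assumption: the chat response stringifies to its text content.
        async perceive(question, imageUrl) {
            const response = await chat(question, imageUrl, { model });
            return String(response);
        },
    };
}

// Stub standing in for puter.ai.chat; it only echoes its arguments.
async function stubChat(prompt, url, options) {
    return `[${options.model}] ${prompt} (${url})`;
}
```

In a page that has loaded Puter.js, the stub would presumably be swapped for `(...args) => puter.ai.chat(...args)`, leaving the rest of the agent code unchanged.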

Streaming with reasoning

async function streamResponse() {
    const response = await puter.ai.chat(
        "Walk through how you would analyze a video frame-by-frame for object tracking.",
        { model: "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free", stream: true }
    );

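    // Each streamed part carries either a reasoning chunk or an answer chunk.
    for await (const part of response) {
        if (part?.reasoning)
            puter.print(part.reasoning);
        else if (part?.text)
            // Skip parts with no text so "undefined" is never printed.
            puter.print(part.text);
    }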
}

streamResponse();
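If the reasoning trace and the final answer need to be handled separately (for example, to show the answer and log the reasoning), the same loop can accumulate them into two strings instead of printing each part. The sketch below assumes only what the example above already relies on: that each streamed part may carry a `reasoning` or a `text` field. `collectStream` and `fakeStream` are hypothetical helper names, and the fake stream exists only so the block runs without a network call.

```javascript
// Collect a streamed response into separate reasoning and answer strings.
// `stream` is any async iterable whose parts may carry `reasoning` or `text`.
async function collectStream(stream) {
    let reasoning = "";
    let text = "";
    for await (const part of stream) {
        if (part?.reasoning) reasoning += part.reasoning;
        else if (part?.text) text += part.text;
        // Parts with neither field are ignored.
    }
    return { reasoning, text };
}

// Stand-in for a real stream, for illustration only.
async function* fakeStream() {
    yield { reasoning: "Consider each frame. " };
    yield { reasoning: "Match objects across frames." };
    yield { text: "Step 1: detect. " };
    yield {};
    yield { text: "Step 2: associate." };
}
```

With Puter.js, the value passed in would be the result of `await puter.ai.chat(prompt, { model, stream: true })`, as in `streamResponse()` above.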

Get Started Now

Just add one library to your project:

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

Or add one script tag to your HTML:

<script src="https://js.puter.com/v2/"></script>

No API keys or NVIDIA account needed. Start building with Nemotron 3 Nano Omni immediately.
