NVIDIA Nemotron 3 Nano Omni Is Now Available in Puter.js
Puter.js now supports Nemotron 3 Nano Omni, NVIDIA's open multimodal model that unifies text, image, video, and audio understanding in a single inference pass.
What is NVIDIA Nemotron 3 Nano Omni?
Nemotron 3 Nano Omni is a 30-billion-parameter hybrid Mamba-Transformer Mixture-of-Experts (MoE) model that activates only ~3B parameters per token. Because its vision and audio encoders are integrated directly into the MoE backbone, it needs no separate perception models, so a single call replaces a fragmented multi-model pipeline. Key highlights include:
- Unified Multimodal Input: Natively processes text, image, video, and audio in one model with a shared multimodal context across agent loops
- Best-in-Class Document Intelligence: Tops leaderboards on MMLongBench-Doc and OCRBenchV2 for long-document reasoning and OCR
- Strong Video and Audio Understanding: Leads on WorldSense, DailyOmni, and VoiceBench across video and audio tasks
- Up to 9x Higher Throughput: Delivers up to 9x the throughput of other open omni models at the same interactivity, and the highest throughput of any model benchmarked on MediaPerf's video tasks
- 256K Context Window: Handles up to 256K tokens of context, with an optional reasoning mode for deeper analysis when needed
The model is designed as a multimodal perception sub-agent in agentic systems, excelling at document reasoning, GUI-based computer use, speech transcription, and audio-video analysis.
Examples
Image analysis
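// Send a prompt plus an image URL, then print the model's reply to the page
puter.ai.chat(
    "Describe this image in detail and identify any objects you see.",
    "https://assets.puter.site/doge.jpeg",
    { model: "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free" }
).then((response) => puter.print(response));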
Document reasoning
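// Pass a scanned document as an image and ask for a structured summary
puter.ai.chat(
    "Extract the key terms and obligations from this document and summarize them.",
    "https://assets.puter.site/sample-contract.png",
    { model: "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free" }
).then((response) => puter.print(response));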
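The two image-based calls above share the same shape, so they can be wrapped in a small helper that also handles failures. The following is a minimal sketch, not part of Puter.js: the helper name askAboutImage and the MODEL constant are hypothetical, the chat function is passed in as a parameter so the helper can be tried without a network call, and it assumes the response converts to its text content through String().

```javascript
const MODEL = "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free";

// Hypothetical helper: asks the model a question about one image.
// `chat` is injected; in a page with Puter.js loaded you might pass
// (...args) => puter.ai.chat(...args).
async function askAboutImage(chat, prompt, imageUrl) {
    try {
        const response = await chat(prompt, imageUrl, { model: MODEL });
        // Assumption: String(response) yields the reply text.
        return { ok: true, answer: String(response) };
    } catch (err) {
        // Normalize whatever was thrown into a readable message.
        return { ok: false, error: err?.message ?? String(err) };
    }
}
```

Returning an { ok, ... } object instead of throwing keeps the calling code linear, which may be convenient when the model is one step in a larger agent loop.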
Streaming with reasoning
async function streamResponse() {
    // With stream: true, the call resolves to an async iterable of partial results
    const response = await puter.ai.chat(
        "Walk through how you would analyze a video frame-by-frame for object tracking.",
        { model: "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free", stream: true }
    );
    for await (const part of response) {
        // Print reasoning tokens when present, otherwise the answer text
        if (part?.reasoning) {
            puter.print(part.reasoning);
        } else {
            puter.print(part?.text);
        }
    }
}
streamResponse();
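If you would rather keep the reasoning trace apart from the final answer than print both as they arrive, the parts can be accumulated into two strings. This is a minimal sketch under the same assumption as the streaming example above, namely that each streamed part may carry a reasoning field, a text field, or neither; the function name collectStream is hypothetical and not part of Puter.js.

```javascript
// Hypothetical helper: drains a streamed chat response into two strings.
// Works with any async iterable whose items look like { reasoning?, text? }.
async function collectStream(stream) {
    let reasoning = "";
    let text = "";
    for await (const part of stream) {
        if (part?.reasoning) reasoning += part.reasoning;
        if (part?.text) text += part.text;
    }
    return { reasoning, text };
}

// Possible usage in a page where Puter.js is loaded:
// const stream = await puter.ai.chat(prompt, { model: "...", stream: true });
// const { reasoning, text } = await collectStream(stream);
```

Separating the two makes it easy to show only the answer by default and reveal the reasoning on demand.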
Get Started Now
Just add one library to your project:
// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';
Or add one script tag to your HTML:
<script src="https://js.puter.com/v2/"></script>
No API keys or NVIDIA account needed. Start building with Nemotron 3 Nano Omni immediately.