NVIDIA: Nemotron Nano 9B V2 API

Access NVIDIA's Nemotron Nano 9B V2 model through the Puter.js AI API.


Model Card

Nemotron Nano 9B V2 is a 9B-parameter hybrid Mamba-Transformer model trained from scratch by NVIDIA. It has a 128K context window and achieves up to 6x higher inference throughput than similarly sized models such as Qwen3-8B. It also features a controllable reasoning budget, which lets developers balance accuracy against response time for edge deployment.

Context Window: N/A

Max Output: N/A

Input Cost: $0.04 per million tokens

Output Cost: $0.16 per million tokens
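
To make the per-token pricing concrete, here is a minimal sketch that estimates the cost of a single request from the two prices listed above. The function name `estimateCostUsd` and the token counts in the usage line are illustrative assumptions, not part of Puter.js; actual billing may differ from this simple arithmetic.

```javascript
// Prices as listed on this page, in US dollars per one million tokens.
const INPUT_USD_PER_MILLION = 0.04;
const OUTPUT_USD_PER_MILLION = 0.16;

// Hypothetical helper (not a Puter.js API): estimates the cost of one request.
function estimateCostUsd(inputTokens, outputTokens) {
    for (const n of [inputTokens, outputTokens]) {
        if (!Number.isFinite(n) || n < 0) {
            throw new RangeError("Token counts must be non-negative finite numbers");
        }
    }
    const inputCost = inputTokens * INPUT_USD_PER_MILLION;
    const outputCost = outputTokens * OUTPUT_USD_PER_MILLION;
    return (inputCost + outputCost) / 1_000_000;
}

// Example: a 2,000-token prompt that produces a 500-token reply.
console.log(estimateCostUsd(2000, 500).toFixed(6));
```

Because output tokens cost four times as much as input tokens, a long reply to a short prompt can dominate the total.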

API Usage Example

Add NVIDIA: Nemotron Nano 9B V2 to your app with just a few lines of code.
No API keys, no backend, no configuration required.

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
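        // Send a single prompt to Nemotron Nano 9B V2 and display the reply.
        puter.ai.chat("Explain quantum computing in simple terms", {
            model: "nvidia/nemotron-nano-9b-v2"
        }).then(response => {
            // textContent shows the reply as plain text, so any markup in it is not parsed as HTML.
            document.body.textContent = response.message.content;
        }).catch(error => {
            // Surface request failures instead of leaving the page blank.
            console.error("Chat request failed:", error);
        });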
    </script>
</body>
</html>
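
For apps that call the model from more than one place, the same request can be wrapped in a small reusable function. The sketch below assumes the response has the `message.content` shape used in the example above and that the content is a plain string. The helper name `askNemotron` and its `client` parameter are hypothetical; `client` stands in for the global `puter` object loaded from the script tag.

```javascript
// Hypothetical helper (not part of Puter.js). `client` is expected to be
// the global `puter` object loaded from https://js.puter.com/v2/.
async function askNemotron(client, prompt) {
    const response = await client.ai.chat(prompt, {
        model: "nvidia/nemotron-nano-9b-v2"
    });
    // Assumption: the reply text is a string at response.message.content,
    // as in the example above. Fail loudly if the shape is different.
    const content = response?.message?.content;
    if (typeof content !== "string") {
        throw new Error("Unexpected response shape from puter.ai.chat");
    }
    return content;
}

// Usage in the browser, after the Puter.js script has loaded:
//   askNemotron(puter, "Explain quantum computing in simple terms")
//       .then(text => { document.body.textContent = text; })
//       .catch(error => console.error("Chat request failed:", error));
```

Passing the client in as a parameter keeps the function independent of the global scope, which also makes it straightforward to exercise with a stub.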


Get started with Puter.js

Add NVIDIA: Nemotron Nano 9B V2 to your app without worrying about API keys or setup.
