
AI21 Labs API

Access AI21 Labs models instantly with Puter.js, and add AI to any app in a few lines of code, with no backend and no API keys required.

Using npm:

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

puter.ai.chat("Explain AI like I'm five!", {
    model: "ai21/jamba-large-1.7"
}).then(response => {
    console.log(response);
});

Or directly in the browser, with no build step:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat("Explain AI like I'm five!", {
            model: "ai21/jamba-large-1.7"
        }).then(response => {
            console.log(response);
        });
    </script>
</body>
</html>

List of AI21 Labs Models

Chat

Jamba Large 1.7

ai21/jamba-large-1.7

Jamba Large 1.7 is AI21 Labs' flagship open-weight language model, built on a hybrid SSM-Transformer (Mamba-Transformer) architecture with a Mixture of Experts design — 398B total parameters with 94B active during inference. Its standout feature is a 256K-token context window, making it well suited for processing lengthy documents, contracts, and knowledge bases. The model supports function calling, JSON mode, and nine languages including English, Spanish, French, German, and Arabic. Jamba Large 1.7 emphasizes grounding and instruction-following, delivering contextually faithful responses with strong steerability. It generates output at roughly 69 tokens per second via the AI21 API. It targets enterprise workflows in domains like finance, healthcare, and legal — where long-context accuracy and data control matter most.
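A minimal sketch of putting the 256K-token context window to work on a long document, assuming the browser script above has been loaded (so a global `puter` object exists) and that Puter.js's `stream` option yields response parts with a `text` field, as in its other chat examples. `buildSummaryRequest` and `summarize` are illustrative helper names, not part of Puter.js:

```javascript
// Sketch: summarizing a long document with Jamba Large 1.7 via Puter.js.
// Assumes <script src="https://js.puter.com/v2/"></script> has been loaded.

// Build the chat request; Jamba's 256K-token window lets you inline
// lengthy documents (contracts, reports) directly in the prompt.
function buildSummaryRequest(documentText) {
    return {
        messages: [
            { role: "system", content: "Summarize the document faithfully; state only facts it contains." },
            { role: "user", content: documentText }
        ],
        options: { model: "ai21/jamba-large-1.7", stream: true }
    };
}

// Browser usage: stream tokens as they arrive instead of waiting
// for the full completion of a long summary.
async function summarize(documentText) {
    const { messages, options } = buildSummaryRequest(documentText);
    const response = await puter.ai.chat(messages, options);
    for await (const part of response) {
        if (part?.text) console.log(part.text);
    }
}
```

Streaming matters most at this context scale: with very long inputs, time-to-first-token is far shorter than time-to-full-completion.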

Chat

Jamba Mini 1.7

ai21/jamba-mini-1.7

Jamba Mini 1.7 is a compact, efficiency-focused model from AI21 Labs, sharing the same hybrid SSM-Transformer architecture as its larger sibling but with just 12B active parameters (52B total) in a Mixture of Experts configuration. It retains the full 256K-token context window and supports function calling, making it capable of handling long-document tasks at a fraction of the cost — priced at $0.20 per million input tokens and $0.40 per million output tokens. Like Jamba Large 1.7, this version improves on grounding and instruction-following over earlier releases. It's a practical choice for cost-sensitive production workloads, high-volume pipelines, and use cases where speed and low latency matter more than peak reasoning power.
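Since Mini and Large share the same architecture and context window, a common pattern is routing cheap, high-volume requests to Mini and reserving Large for harder tasks. A hypothetical sketch (the helpers below are illustrative, not part of Puter.js; the pricing figures are the Jamba Mini rates quoted above):

```javascript
// Sketch (hypothetical helpers): route requests between Jamba Mini and
// Jamba Large based on how much reasoning the task needs.

const MODELS = {
    mini: "ai21/jamba-mini-1.7",   // cheap, fast: extraction, classification
    large: "ai21/jamba-large-1.7"  // flagship: complex multi-step reasoning
};

// Rough Jamba Mini cost estimate using the rates above:
// $0.20 per million input tokens, $0.40 per million output tokens.
function estimateMiniCostUSD(inputTokens, outputTokens) {
    return (inputTokens * 0.20 + outputTokens * 0.40) / 1_000_000;
}

function pickModel(needsDeepReasoning) {
    return needsDeepReasoning ? MODELS.large : MODELS.mini;
}

// Browser usage, with the Puter.js script loaded:
// puter.ai.chat("Extract all dates from this text: ...", {
//     model: pickModel(false)
// }).then(response => console.log(response));
```

At these rates, a pipeline processing a million input tokens and half a million output tokens through Mini costs on the order of $0.40, which is what makes it attractive for high-volume workloads.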

Frequently Asked Questions

What is this AI21 Labs API about?

The AI21 Labs API gives you access to models for AI chat. Through Puter.js, you can start using AI21 Labs models instantly with zero setup or configuration.

Which AI21 Labs models can I use?

Puter.js supports a variety of AI21 Labs models, including Jamba Large 1.7 and Jamba Mini 1.7. Find all AI models supported by Puter.js in the AI model list.

How much does it cost?

With the User-Pays model, users cover their own AI costs through their Puter account. This means you can build apps without worrying about infrastructure expenses.

What is Puter.js?

Puter.js is a JavaScript library that provides access to AI, storage, and other cloud services directly from a single API. It handles authentication, infrastructure, and scaling so you can focus on building your app.

Does this work with React / Vue / Vanilla JS / Node / etc.?

Yes — the AI21 Labs API through Puter.js works with any JavaScript framework, Node.js, or plain HTML. Just include the library and start building. See the documentation for more details.