Allen AI API

Access Allen AI models instantly with Puter.js, and add AI to any app in a few lines of code, with no backend and no API keys.

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

puter.ai.chat("Explain AI like I'm five!", {
    model: "allenai/olmo-3-32b-think"
}).then(response => {
    console.log(response);
});

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat("Explain AI like I'm five!", {
            model: "allenai/olmo-3-32b-think"
        }).then(response => {
            console.log(response);
        });
    </script>
</body>
</html>

List of Allen AI Models

Chat

Molmo2 8B

allenai/molmo-2-8b

Molmo 2 8B is an open vision-language model from the Allen Institute for AI (AI2), built on a Qwen3-8B language backbone with a SigLIP 2 vision encoder. It supports single images, multi-image inputs, and video clips. On its 11-benchmark image average, Molmo 2 8B leads all open-weight models in its class. It achieves 32.9% on video pointing versus 17% for Gemini 2.5 Pro, and tops open-weight scores across seven video benchmarks including Video-MME and MVBench. It also outperforms the original Molmo 72B on grounding tasks despite being far smaller. A strong choice for multimodal applications requiring precise spatial reasoning, visual grounding, or video understanding via API.
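As a rough sketch of what image input might look like with this model: Puter.js documents a form of puter.ai.chat that takes an image URL as its second argument. The helper name describeImage, the example URL, and the assumption that the reply text is available at response.message.content are illustrative and not taken from this page. Whether Molmo 2 8B accepts images through this exact call form should be checked against the Puter.js documentation.

```javascript
// Hypothetical helper: ask a vision-language model about one image.
// `client` is the `puter` object (the global in a browser, or the npm import).
async function describeImage(client, imageUrl, question = "What is in this image?") {
    // Assumed call form: chat(prompt, imageURL, options).
    const response = await client.ai.chat(question, imageUrl, {
        model: "allenai/molmo-2-8b"
    });
    // Assumption: the reply text sits at response.message.content;
    // otherwise fall back to the response's string form.
    const content = response?.message?.content;
    return typeof content === "string" ? content : String(response);
}

// Usage (in a page that has loaded Puter.js; the URL is a placeholder):
// describeImage(puter, "https://example.com/photo.jpg").then(console.log);
```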

Chat

Olmo 3.1 32B Instruct

allenai/olmo-3.1-32b-instruct

OLMo 3.1 32B Instruct is a fully open instruction-tuned language model from the Allen Institute for AI (AI2), designed for chat, tool use, and multi-turn dialogue at the 32B parameter scale. AI2 positions it as the most capable fully open 32B-scale instruct model, with strong performance on math (GSM8K, MATH), coding (HumanEval, MBPP+), and instruction-following (IFEval). It uses a hybrid attention architecture and maintains strong long-context retrieval performance (96.1 on RULER at 4K). Released under Apache 2.0 with full data and training transparency, it's a well-rounded choice for instruction-following, tool-augmented, or multi-turn chat applications.
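For the multi-turn use case mentioned above, one possible approach is to keep the conversation as an array of role/content messages and send the whole array on each turn. This is a minimal sketch: the helper names withUserTurn and continueChat are hypothetical, and both the chat(messages, options) call form and the response.message.content field are assumptions to verify against the Puter.js documentation.

```javascript
// Hypothetical helper: append a user turn to an existing conversation.
// Returns a new array so the caller's history is not mutated.
function withUserTurn(history, text) {
    return [...history, { role: "user", content: text }];
}

// Send the whole conversation and return the updated history.
// `client` is the `puter` object.
async function continueChat(client, history, text) {
    const messages = withUserTurn(history, text);
    // Assumed call form: chat(messages, options), where each message
    // is a { role, content } object.
    const response = await client.ai.chat(messages, {
        model: "allenai/olmo-3.1-32b-instruct"
    });
    // Assumption: the reply text sits at response.message.content.
    const reply = response?.message?.content ?? String(response);
    return [...messages, { role: "assistant", content: reply }];
}

// Usage:
// let history = [];
// history = await continueChat(puter, history, "Name a prime number.");
// history = await continueChat(puter, history, "Now name the next one.");
```

Returning a fresh array on every turn keeps earlier snapshots of the conversation intact, which can make retries after a failed request simpler.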

Chat

Olmo 3.1 32B Think

allenai/olmo-3.1-32b-think

OLMo 3.1 32B Think is the updated flagship reasoning model from the Allen Institute for AI (AI2), an improved successor to OLMo 3 32B Think with an additional 21 days of extended reinforcement learning training. The extended training yielded gains of 5+ points on AIME, 4+ points on ZebraLogic and IFEval, and 20+ points on IFBench over its predecessor. It supports a 64K context window and is licensed under Apache 2.0 with full training transparency. For API developers needing a high-performance open reasoning model for math, code, and complex instruction-following, OLMo 3.1 32B Think is AI2's most capable reasoning offering, competitive with Qwen 3 32B at the same scale.

Chat

Olmo 3 32B Think

allenai/olmo-3-32b-think

OLMo 3 32B Think is a fully open reasoning model from the Allen Institute for AI (AI2), and the first fully open 32B thinking model to expose intermediate chain-of-thought reasoning traces. Trained on multi-step math, code, and general problem-solving tasks using a thinking SFT, DPO, and RLVR training flow, it is the strongest fully open reasoning model at the 32B scale, narrowing the gap to open-weight models like Qwen 3-32B-Think while being trained on 6x fewer tokens. All training data, code, weights, and checkpoints are publicly available under Apache 2.0. Best suited for complex multi-step reasoning and mathematical problem solving via API.
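Reasoning models tend to produce long replies, so streaming the output can make an app feel more responsive. The sketch below assumes the stream: true option of puter.ai.chat, which returns an async iterable whose parts carry their text in part.text. The helper name streamAnswer is hypothetical, and this page does not say whether the reasoning traces arrive in the same stream, so only text parts are handled here.

```javascript
// Hypothetical helper: stream a reply and hand each text chunk to `onText`.
// `client` is the `puter` object.
async function streamAnswer(client, prompt, onText) {
    const stream = await client.ai.chat(prompt, {
        model: "allenai/olmo-3-32b-think",
        stream: true
    });
    let full = "";
    for await (const part of stream) {
        // Assumption: each streamed part carries its text in `part.text`.
        if (part?.text) {
            full += part.text;
            onText(part.text);
        }
    }
    // Resolve with the complete reply once the stream ends.
    return full;
}

// Usage:
// streamAnswer(puter, "What is 17 * 24? Show your steps.", chunk => {
//     document.body.append(chunk);
// });
```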

Chat

Olmo 3 7B Instruct

allenai/olmo-3-7b-instruct

OLMo 3 7B Instruct is a lightweight, fully open instruction-tuned chat model from the Allen Institute for AI (AI2), designed for instruction-following, question-answering, and multi-turn conversational dialogue. Among 7B-scale models, it is competitive with Qwen 2.5 and Gemma 3 equivalents, and represents a clear step up from Llama 3.1 8B in instruction-following quality. It supports a 66K token context window with a knowledge cutoff of December 2024. Released under a fully open license with complete training weights and data publicly available, it's well-suited for cost-efficient API usage where a capable small model is preferred.

Chat

Olmo 3 7B Think

allenai/olmo-3-7b-think

OLMo 3 7B Think is an efficient reasoning model from the Allen Institute for AI (AI2), purpose-built for multi-step problem solving in math, coding, and general analytical tasks. Trained using a thinking SFT, thinking DPO, and RLVR pipeline, it generates structured chain-of-thought reasoning traces. On math benchmarks it matches Qwen 3 8B on MATH and comes within a few points on AIME 2024 and 2025. On coding, it leads similarly-sized models on HumanEvalPlus. Fully open under a permissive license, it is the most capable fully open reasoning option at the 7B scale — ideal for API use cases requiring strong reasoning at a small model footprint.

Chat

Olmo 2 32B Instruct

allenai/olmo-2-0325-32b-instruct

OLMo 2 32B Instruct is a fully open, 32-billion-parameter instruction-tuned language model from the Allen Institute for AI (AI2), post-trained using supervised fine-tuning, DPO, and RLVR. It is the first fully open model to outperform both GPT-3.5 Turbo and GPT-4o mini across popular multi-skill academic benchmarks including GSM8K, MATH, and IFEval. The model supports a 128K token context window and targets math reasoning, instruction-following, and general chat. Released under Apache 2.0 with full transparency into training data, code, and weights, it's a strong choice for developers who need a capable, commercially permissive instruction model.

Frequently Asked Questions

What is this Allen AI API about?

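The Allen AI API gives you access to chat models from the Allen Institute for AI (AI2), including the Olmo language models and the Molmo vision-language model. Through Puter.js, you can start using these models instantly with zero setup or configuration.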

Which Allen AI models can I use?

Puter.js supports a variety of Allen AI models, including Molmo2 8B, Olmo 3.1 32B Instruct, Olmo 3.1 32B Think, and more. Find all AI models supported by Puter.js in the AI model list.

How much does it cost?

With the User-Pays model, users cover their own AI costs through their Puter account. This means you can build apps without worrying about infrastructure expenses.
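Because usage is tied to each user's Puter account, an app may want to confirm that the user is signed in before sending a request. The following is a minimal sketch using puter.auth.isSignedIn() and puter.auth.signIn(); the helper name chatAsSignedInUser and the default model choice are illustrative assumptions, not a pattern this page prescribes.

```javascript
// Hypothetical helper: make sure the user is signed in, then chat.
// Call it from a click handler, since signIn() opens a popup that
// browsers usually block outside a user action.
async function chatAsSignedInUser(client, prompt, model = "allenai/olmo-3-7b-instruct") {
    if (!client.auth.isSignedIn()) {
        await client.auth.signIn();
    }
    return client.ai.chat(prompt, { model });
}

// Usage:
// button.addEventListener("click", () => {
//     chatAsSignedInUser(puter, "Summarize this page.").then(console.log);
// });
```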

What is Puter.js?

Puter.js is a JavaScript library that provides access to AI, storage, and other cloud services directly from a single API. It handles authentication, infrastructure, and scaling so you can focus on building your app.

Does this work with React / Vue / Vanilla JS / Node / etc.?

Yes — the Allen AI API through Puter.js works with any JavaScript framework, Node.js, or plain HTML. Just include the library and start building. See the documentation for more details.
