Deep Cogito: Cogito v2.1 671B
deepcogito/cogito-v2-1-671b
Access Cogito v2.1 671B from Deep Cogito using the Puter.js AI API.
Get Started
// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

puter.ai.chat("Explain quantum computing in simple terms", {
    model: "deepcogito/cogito-v2-1-671b"
}).then(response => {
    document.body.innerHTML = response.message.content;
});
<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat("Explain quantum computing in simple terms", {
            model: "deepcogito/cogito-v2-1-671b"
        }).then(response => {
            document.body.innerHTML = response.message.content;
        });
    </script>
</body>
</html>
# pip install openai
from openai import OpenAI

client = OpenAI(
    base_url="https://api.puter.com/puterai/openai/v1/",
    api_key="YOUR_PUTER_AUTH_TOKEN",
)

response = client.chat.completions.create(
    model="deepcogito/cogito-v2-1-671b",
    messages=[
        {"role": "user", "content": "Explain quantum computing in simple terms"}
    ],
)

print(response.choices[0].message.content)
curl https://api.puter.com/puterai/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_PUTER_AUTH_TOKEN" \
  -d '{
    "model": "deepcogito/cogito-v2-1-671b",
    "messages": [
      {"role": "user", "content": "Explain quantum computing in simple terms"}
    ]
  }'
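The cURL request above can also be reproduced with only the Python standard library, which may help when the `openai` package is not available. The following is a minimal sketch, not an official client: the endpoint, headers, and payload mirror the cURL command, while the `PUTER_AUTH_TOKEN` environment variable name and the helper names `build_chat_request` and `chat` are assumptions made for illustration. It also assumes the response follows the same `choices[0].message.content` shape used in the Python snippet above.

```python
import json
import os
import urllib.request

# Endpoint and model ID copied from the cURL example above.
API_URL = "https://api.puter.com/puterai/openai/v1/chat/completions"
MODEL = "deepcogito/cogito-v2-1-671b"


def build_chat_request(prompt: str, token: str) -> urllib.request.Request:
    """Build the same POST request that the cURL command sends."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def chat(prompt: str) -> str:
    """Send the prompt and return the model's reply text."""
    # Hypothetical variable name; store your Puter auth token wherever suits you.
    token = os.environ["PUTER_AUTH_TOKEN"]
    with urllib.request.urlopen(build_chat_request(prompt, token)) as resp:
        body = json.load(resp)
    # Assumes an OpenAI-style response body.
    return body["choices"][0]["message"]["content"]
```

Calling `chat("Explain quantum computing in simple terms")` with the environment variable set would send the same request as the cURL command. Reading the token from the environment keeps it out of source control.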
Model Card
Cogito v2.1 671B is an open-weight Mixture-of-Experts model from Deep Cogito, with 671B total and 37B active parameters. It is trained with Iterated Distillation and Amplification (IDA) plus reinforcement learning and self-play, giving it a hybrid design that can answer directly or self-reflect before responding.
It is built for token-efficient reasoning, scoring 98.57% on MATH-500, 89.47% on AIME 2025, 77.72% on GPQA Diamond, and 84.69% on MMLU-Pro. The maker reports it matches DeepSeek R1 0528 while using roughly 60% shorter reasoning chains.
With strong math, coding, and instruction following plus tool calling, it suits developers who want frontier-level reasoning at lower cost and latency.
| Specification | Value |
|---|---|
| Context Window | 164K tokens |
| Max Output | 164K tokens |
| Input Cost | $1.25 per million tokens |
| Output Cost | $1.25 per million tokens |
| Input Modalities | Text |
| Tool Use | Yes |
| Release Date | Nov 2025 |
Model Playground
Try Cogito v2.1 671B instantly in your browser.
This playground uses the Puter.js AI API — no API keys or setup required.
Frequently Asked Questions
You can access Cogito v2.1 671B by Deep Cogito through the Puter.js AI API. Include the library in your web app or Node.js project and start making calls with a few lines of JavaScript, with no backend and no configuration required. You can also use it from Python or cURL via Puter's OpenAI-compatible API.
Cogito v2.1 671B is free for developers to use through Puter.js. Under the User-Pays Model, you can add Cogito v2.1 671B to your app at no cost to you: your users pay for their own AI usage directly.
| Token type | Price per 1M tokens |
|---|---|
| Input | $1.25 |
| Output | $1.25 |
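To see what these rates mean for a single request, the arithmetic can be sketched as below. The prices come from the table above; the function name `estimate_cost_usd` and the example token counts are illustrative assumptions, and actual billing through Puter may differ.

```python
# Prices from the table above, in USD per one million tokens.
INPUT_PRICE_PER_M = 1.25
OUTPUT_PRICE_PER_M = 1.25


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 10,000-token prompt with a 2,000-token reply.
print(f"${estimate_cost_usd(10_000, 2_000):.4f}")  # prints "$0.0150"
```

Because input and output are priced the same, the estimate depends only on the total number of tokens processed.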
Cogito v2.1 671B was created by Deep Cogito and released in November 2025.
Cogito v2.1 671B supports a context window of 164K tokens. For reference, that is roughly equivalent to 328 pages of text.
Cogito v2.1 671B can generate up to 164K tokens in a single response.
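The two limits above can be turned into a quick budgeting check. This is a rough sketch under stated assumptions: "164K" is read as 164,000 tokens (the exact limit may differ), the 500 tokens-per-page ratio is simply what the 164K tokens and 328 pages figures imply, and the prompt and response are assumed to share one context window, which this page does not state explicitly. The function names are illustrative.

```python
# "164K" read as 164,000 tokens; the exact limits may differ.
CONTEXT_WINDOW_TOKENS = 164_000
MAX_OUTPUT_TOKENS = 164_000

# Ratio implied by "164K tokens is roughly 328 pages": 164,000 / 328 = 500.
TOKENS_PER_PAGE = 500


def approx_pages(tokens: int) -> float:
    """Convert a token count to an approximate page count."""
    return tokens / TOKENS_PER_PAGE


def max_output_budget(prompt_tokens: int) -> int:
    """Tokens left for the response.

    Assumes the prompt and the response share the context window.
    """
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    return max(0, min(remaining, MAX_OUTPUT_TOKENS))


print(approx_pages(164_000))       # 328.0
print(max_output_budget(100_000))  # 64000
```

Under these assumptions, a 100,000-token prompt (about 200 pages) leaves room for a response of up to 64,000 tokens.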
Cogito v2.1 671B accepts the following input types: text. It produces: text.
Cogito v2.1 671B supports tool use (function calling), which lets it interact with external tools, APIs, and data sources as part of its response flow.
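As a rough illustration of function calling, the sketch below attaches one tool definition to a chat request and runs whatever tool calls come back. It uses the OpenAI-style `tools` parameter and `tool_calls` response field from the `openai` Python client; whether Puter's OpenAI-compatible endpoint forwards these fields unchanged is an assumption. The `get_weather` tool and the `run_tool_calls` helper are hypothetical and exist only for this example. The `client` argument is an `OpenAI` client configured with Puter's base URL and your auth token.

```python
import json

MODEL = "deepcogito/cogito-v2-1-671b"


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"Sunny in {city}"


# Maps tool names the model may request to local Python functions.
LOCAL_TOOLS = {"get_weather": get_weather}

# Tool definition in the OpenAI function-calling format.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]


def run_tool_calls(client, question: str) -> list:
    """Send one request with tools attached and execute any returned tool calls.

    Returns "tool" role messages that can be appended to the conversation.
    """
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": question}],
        tools=TOOLS,
    )
    message = response.choices[0].message
    results = []
    for call in message.tool_calls or []:
        # Arguments arrive as a JSON-encoded string.
        args = json.loads(call.function.arguments)
        output = LOCAL_TOOLS[call.function.name](**args)
        results.append(
            {"role": "tool", "tool_call_id": call.id, "content": output}
        )
    return results
```

To get a final answer, the returned messages would be sent back to the model in a follow-up request, after the assistant message that contained the tool calls.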
The Cogito v2.1 671B API works with any JavaScript framework, Node.js, or plain HTML through Puter.js. Include the library and start building. See the documentation for more details.
Get started with Puter.js
Add Cogito v2.1 671B to your app without worrying about API keys or setup.