DeepSeek V4 Flash and DeepSeek V4 Pro Are Now Available in Puter.js
Puter.js now supports DeepSeek V4 Flash and DeepSeek V4 Pro, DeepSeek's latest open-weight Mixture-of-Experts models. Add them to your application for free, no API keys or DeepSeek account required.
What is DeepSeek V4 Flash?
DeepSeek V4 Flash is the lightweight, efficiency-focused model in the V4 family, released on April 24, 2026. It uses a Mixture-of-Experts architecture with 284B total parameters and 13B activated per token, making it fast and economical for high-throughput workloads. Key highlights include:
- 1M token context window with 384K max output tokens
- Configurable reasoning with standard, high, and max thinking modes
- $0.14/M input and $0.28/M output — one of the cheapest frontier-tier models available
- Strong open-weight performance — leads all current open models on math, STEM, and coding, trailing only Gemini 3.1 Pro on world knowledge
- Well suited for coding assistants, chat systems, and agent pipelines where latency and cost matter most
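At these rates, per-request cost is simple arithmetic. A quick sketch for budgeting against DeepSeek's listed prices (the token counts in the example are made up for illustration):

```javascript
// DeepSeek V4 Flash pricing from the list above (USD per million tokens).
const FLASH_INPUT_USD_PER_M = 0.14;
const FLASH_OUTPUT_USD_PER_M = 0.28;

// Estimated USD cost of one request given its token counts.
function estimateFlashCost(inputTokens, outputTokens) {
  return (inputTokens / 1e6) * FLASH_INPUT_USD_PER_M +
         (outputTokens / 1e6) * FLASH_OUTPUT_USD_PER_M;
}

// e.g. a 1M-token input with a 500K-token output comes to about $0.28.
console.log(estimateFlashCost(1_000_000, 500_000));
```

(Through Puter.js itself, usage is free; the rates above are the model's listed API pricing.)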
What is DeepSeek V4 Pro?
DeepSeek V4 Pro is the flagship of the V4 family, positioned as the strongest open-weight model currently available. It's a 1.6T-parameter MoE with 49B parameters activated per token, supporting the same 1M-token context window as Flash. Key highlights include:
- 93.5 on LiveCodeBench — ahead of Gemini 3.1 Pro (91.7) and Claude Opus 4.6 (88.8)
- Codeforces rating of 3206 — tops GPT-5.4 (3168)
- Near-parity with Opus 4.6 on agentic tool-use benchmarks like MCPAtlas
- $1.74/M input and $3.48/M output — a fraction of comparable closed-source models
- Built for complex reasoning, agentic coding, and knowledge-intensive tasks
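Given the roughly 12x price gap between Flash and Pro, a common pattern is to route routine requests to Flash and reserve Pro for harder tasks. Here is a minimal sketch of that idea; the keyword heuristic and length threshold are illustrative app-level choices, not part of the Puter.js API:

```javascript
// Model IDs as used with puter.ai.chat.
const FLASH = "deepseek/deepseek-v4-flash";
const PRO = "deepseek/deepseek-v4-pro";

// Illustrative heuristic: long prompts or "design"/"prove"-style requests
// go to Pro; everything else goes to Flash. Tune for your own workload.
function pickModel(prompt) {
  const hardKeywords = ["design", "prove", "architect", "trade-off"];
  const looksHard = hardKeywords.some(k => prompt.toLowerCase().includes(k));
  return looksHard || prompt.length > 2000 ? PRO : FLASH;
}

// Usage:
// puter.ai.chat(prompt, { model: pickModel(prompt) });
```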
Examples
Basic chat with Flash
puter.ai.chat("Write a TypeScript function that debounces an async call", {
    model: "deepseek/deepseek-v4-flash"
}).then(response => puter.print(response));
Complex reasoning with Pro
puter.ai.chat("Design a sharded rate limiter that survives a single-region outage and explain the trade-offs", {
    model: "deepseek/deepseek-v4-pro"
}).then(response => puter.print(response));
Streaming responses
const response = await puter.ai.chat(
    "Walk through how Mixture-of-Experts routing works at inference time",
    { model: "deepseek/deepseek-v4-pro", stream: true }
);
for await (const part of response) {
    if (part?.reasoning) puter.print(part.reasoning);
    else if (part?.text) puter.print(part.text);
}
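The loop above prints parts as they arrive. If you also want the complete reply once the stream finishes, accumulate the fields as you go. A small helper, assuming each part may carry an optional reasoning or text string as in the loop above:

```javascript
// Collect streamed parts into complete { reasoning, text } strings.
// Assumes each part may carry an optional `reasoning` or `text` field,
// matching the streaming loop shown above.
function collectParts(parts) {
  let reasoning = "";
  let text = "";
  for (const part of parts) {
    if (part?.reasoning) reasoning += part.reasoning;
    if (part?.text) text += part.text;
  }
  return { reasoning, text };
}
```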
Get Started Now
Just add one library to your project:
// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';
Or add one script tag to your HTML:
<script src="https://js.puter.com/v2/"></script>
No API keys and no infrastructure setup. Start building with DeepSeek V4 Flash and DeepSeek V4 Pro immediately.
Learn more:
Free, Serverless AI and Cloud