DeepSeek V4 Flash and DeepSeek V4 Pro Are Now Available in Puter.js

Puter.js now supports DeepSeek V4 Flash and DeepSeek V4 Pro, DeepSeek's latest open-weight Mixture-of-Experts models. Add them to your application for free: no API keys and no DeepSeek account required.

What is DeepSeek V4 Flash?

DeepSeek V4 Flash is the lightweight, efficiency-focused model in the V4 family, released on April 24, 2026. It uses a Mixture-of-Experts architecture with 284B total parameters and 13B activated per token, making it fast and economical for high-throughput workloads. Key highlights include:

  • 1M token context window with 384K max output tokens
  • Configurable reasoning with standard, high, and max thinking modes
  • $0.14/M input tokens and $0.28/M output tokens — one of the cheapest frontier-tier models available
  • Strong open-weight performance — leads all current open models on math, STEM, and coding, trailing only Gemini 3.1 Pro on world knowledge
  • Well suited for coding assistants, chat systems, and agent pipelines where latency and cost matter most
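For latency- and cost-sensitive workloads like these, a single browser page is enough to try Flash. This is a minimal sketch using only the public `puter.ai.chat` call and the v2 script tag shown later in this post; the prompt is arbitrary:

```html
<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        // Ask DeepSeek V4 Flash a question and print the reply.
        // puter.ai.chat returns a promise that resolves with the response.
        puter.ai.chat(
            "Summarize Mixture-of-Experts routing in two sentences",
            { model: "deepseek/deepseek-v4-flash" }
        ).then(response => puter.print(response));
    </script>
</body>
</html>
```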

What is DeepSeek V4 Pro?

DeepSeek V4 Pro is the flagship of the V4 family, positioned as the strongest open-weight model currently available. It's a 1.6T-parameter MoE with 49B parameters activated per token, supporting the same 1M-token context window as Flash. Key highlights include:

  • 93.5 on LiveCodeBench — ahead of Gemini 3.1 Pro (91.7) and Claude Opus 4.6 (88.8)
  • Codeforces rating of 3206 — tops GPT-5.4 (3168)
  • Near-parity with Opus 4.6 on agentic tool-use benchmarks like MCPAtlas
  • $1.74/M input tokens and $3.48/M output tokens — a fraction of the cost of comparable closed-source models
  • Built for complex reasoning, agentic coding, and knowledge-intensive tasks

Examples

Basic chat with Flash

puter.ai.chat("Write a TypeScript function that debounces an async call", {
    model: "deepseek/deepseek-v4-flash"
}).then(response => puter.print(response));

Complex reasoning with Pro

puter.ai.chat("Design a sharded rate limiter that survives a single-region outage and explain the trade-offs", {
    model: "deepseek/deepseek-v4-pro"
}).then(response => puter.print(response));

Streaming responses

const response = await puter.ai.chat(
    "Walk through how Mixture-of-Experts routing works at inference time",
    { model: "deepseek/deepseek-v4-pro", stream: true }
);

for await (const part of response) {
    if (part?.reasoning) puter.print(part.reasoning);
    else if (part?.text) puter.print(part.text);
}
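Pro's agentic strength can also be exercised through function calling. The sketch below assumes Puter.js accepts OpenAI-style tool definitions via a `tools` option, as it does for its other chat models; the `get_weather` tool and its schema are hypothetical, used only for illustration:

```javascript
// Hypothetical tool definition in OpenAI-style function-calling format.
const tools = [{
    type: "function",
    function: {
        name: "get_weather", // illustrative name, not a real API
        description: "Get the current weather for a city",
        parameters: {
            type: "object",
            properties: {
                city: { type: "string", description: "City name" }
            },
            required: ["city"]
        }
    }
}];

async function askWithTools() {
    const response = await puter.ai.chat("What's the weather in Paris?", {
        model: "deepseek/deepseek-v4-pro",
        tools
    });

    // If the model decided to call the tool, the call shows up on the message.
    const toolCalls = response.message?.tool_calls;
    if (toolCalls?.length) {
        puter.print(`Model requested: ${toolCalls[0].function.name}`);
    } else {
        puter.print(response);
    }
}

// puter is only defined in the browser; guard so the sketch loads anywhere.
if (typeof puter !== "undefined") askWithTools();
```

Your application would execute the requested function itself and send the result back in a follow-up `puter.ai.chat` call.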

Get Started Now

Just add one library to your project:

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

Or add one script tag to your HTML:

<script src="https://js.puter.com/v2/"></script>

No API keys and no infrastructure setup. Start building with DeepSeek V4 Flash and DeepSeek V4 Pro immediately.
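Putting the pieces together, a complete page needs nothing beyond the script tag above. This sketch streams a Flash response into the page as it arrives; the prompt is arbitrary:

```html
<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        (async () => {
            // Request a streamed response from DeepSeek V4 Flash.
            const response = await puter.ai.chat(
                "Explain what a context window is in one paragraph",
                { model: "deepseek/deepseek-v4-flash", stream: true }
            );
            // Print each chunk of text as it arrives.
            for await (const part of response) {
                if (part?.text) puter.print(part.text);
            }
        })();
    </script>
</body>
</html>
```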
