StepFun Step 3.7 Flash Is Now Available in Puter.js
Puter.js now supports Step 3.7 Flash, StepFun's latest high-efficiency model—a multimodal Mixture-of-Experts system built for coding agents, search-augmented reasoning, and multimodal task automation. Add it to your application for free without any API keys.
What is Step 3.7 Flash?
Step 3.7 Flash is a 198B sparse Mixture-of-Experts vision-language model that pairs a 196B-parameter language backbone with a 1.8B Vision Transformer encoder, activating only ~11B parameters per token. Released and open-sourced on May 29, 2026, it builds directly on Step 3.5 Flash and adds native multimodality along with stronger, more consistent agentic performance. Key highlights include:
- Native Multimodality: Unlike the text-only Step 3.5 Flash, version 3.7 natively understands images through a dedicated vision encoder, with a Visual Search pathway for long-tail entity recognition and a Python tool pathway for fine-grained tasks like cropping, zooming, and bounding-box analysis
- Frontier Coding Agent: Scores 56.3% on SWE-Bench Pro and 59.6% on Terminal-Bench 2.1—improvements of roughly +5 and +6 points over Step 3.5 Flash—plus 76.5% on SWE-Bench Verified
- Selectable Reasoning Tiers: Exposes low, medium, and high reasoning depths, letting you trade latency and cost against answer depth on a per-call basis
- Cross-Harness Consistency: Where Step 3.5 ranged 43–73% across coding scaffolds, Step 3.7 narrows to 64.5–71.5%, making behavior far more predictable across different tool harnesses
- 256K Context, Up to 400 tokens/sec: Long-context reasoning with real-time responsiveness
- Advisor Mode: Reaches ~97% of Claude Opus 4.6's coding performance at roughly one-ninth the per-task cost
| | Step 3.5 Flash | Step 3.7 Flash |
|---|---|---|
| SWE-Bench Pro | 51.3% | 56.3% |
| Terminal-Bench 2.1 | 53.4% | 59.6% |
| SWE-Bench Verified | — | 76.5% |
| Multimodal Input | Text only | Text + Image |
| Reasoning Tiers | — | Low / Medium / High |
| Context Window | 256K | 256K |
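The "roughly +5 and +6 points" and the narrower cross-harness range follow directly from the figures quoted above. The short sketch below only redoes that arithmetic; the object and function names (`scores`, `harnessRange`, `pointGain`, `spread`) are illustrative and are not part of Puter.js or any StepFun API.

```javascript
// Figures copied from the comparison table and the highlights list above.
const scores = {
  "SWE-Bench Pro": { step35: 51.3, step37: 56.3 },
  "Terminal-Bench 2.1": { step35: 53.4, step37: 59.6 },
};

// Cross-harness score ranges (percent) quoted in the highlights list.
const harnessRange = {
  step35: { min: 43, max: 73 },
  step37: { min: 64.5, max: 71.5 },
};

// Round to one decimal place to hide floating-point noise in the subtraction.
const round1 = (x) => Math.round(x * 10) / 10;

// Percentage-point gain of Step 3.7 Flash over Step 3.5 Flash on one benchmark.
function pointGain(benchmark) {
  const { step35, step37 } = scores[benchmark];
  return round1(step37 - step35);
}

// Width of the score range across coding scaffolds for one model.
function spread(model) {
  const { min, max } = harnessRange[model];
  return round1(max - min);
}

console.log(pointGain("SWE-Bench Pro"));      // 5
console.log(pointGain("Terminal-Bench 2.1")); // 6.2
console.log(spread("step35"));                // 30
console.log(spread("step37"));                // 7
```

In other words, the spread across scaffolds shrinks from 30 percentage points to 7, which is what the "Cross-Harness Consistency" bullet is describing.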
Examples
Multimodal image understanding
puter.ai.chat(
  "What do you see in this image? Describe it in detail.",
  "https://assets.puter.site/doge.jpeg",
  { model: "stepfun/step-3.7-flash" }
);
Agentic coding task
puter.ai.chat(
  `Design a Python script that automatically monitors
a directory for new files, processes them based on file type,
and generates a summary report. Include error handling and tests.`,
  { model: "stepfun/step-3.7-flash", stream: true }
);
Search-augmented reasoning
puter.ai.chat(
  "Research the trade-offs between server-side and client-side rendering for a content-heavy web app, then recommend an approach with justification.",
  { model: "stepfun/step-3.7-flash", stream: true }
);
Step-by-step reasoning with streaming
puter.ai
  .chat(
    "A farmer has 17 sheep. All but 9 run away. How many sheep does the farmer have left? Explain your reasoning step by step.",
    { model: "stepfun/step-3.7-flash", stream: true }
  )
  .then(async (resp) => {
    // Each streamed part carries either reasoning tokens or answer text.
    for await (const part of resp) {
      if (part?.reasoning) puter.print(part.reasoning);
      // Skip parts that carry neither field instead of printing "undefined".
      else if (part?.text) puter.print(part.text);
    }
  });
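If you would rather keep the model's reasoning separate from its final answer than print both as they arrive, the same loop can accumulate them into two strings. The following is a minimal sketch, not part of Puter.js: `collectStream` and `fakeStream` are hypothetical names, and the sketch assumes only what the example above shows, namely that the stream is an async iterable whose parts may carry a `reasoning` or a `text` field.

```javascript
// Hypothetical helper (not a Puter.js API): drains any async iterable of
// parts shaped like { reasoning?: string, text?: string }.
async function collectStream(stream) {
  let reasoning = "";
  let text = "";
  for await (const part of stream) {
    if (part?.reasoning) reasoning += part.reasoning;
    else if (part?.text) text += part.text;
  }
  return { reasoning, text };
}

// Stand-in stream so the sketch runs without a network call.
async function* fakeStream() {
  yield { reasoning: "All but 9 run away, " };
  yield { reasoning: "so 9 remain." };
  yield { text: "The farmer has " };
  yield { text: "9 sheep left." };
}

collectStream(fakeStream()).then(({ reasoning, text }) => {
  console.log("reasoning:", reasoning);
  console.log("answer:", text);
});
```

With Puter.js loaded, the response of a `puter.ai.chat(...)` call made with `stream: true` could be passed to `collectStream` in place of `fakeStream()`, for example to show the answer in the page and keep the reasoning in a collapsible panel.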
Get Started Now
Just add one library to your project:
// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';
Or add one script tag to your HTML:
<script src="https://js.puter.com/v2/"></script>
No API keys needed. Start building with Step 3.7 Flash immediately.