r/LocalLLaMA · 1 min read

StepFun 3.7 Flash

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

StepFun dropped Step 3.7 Flash, 196B total / 11B active MoE, runs locally on 128GB RAM

It's a multimodal MoE (196B total params, only 11B active) with a built-in 1.8B ViT for vision.
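For a rough sense of why 128GB is the number that matters, here's a back-of-envelope sketch of weight memory at a few precisions. The bit-widths are my assumption (the post doesn't say which quantization the 128GB claim refers to), and this counts weights only: no KV cache, no activations, no runtime overhead. It's also unclear whether the 1.8B ViT is included in the 196B total, so it's left out.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Weights only: ignores KV cache, activations, and runtime overhead.
    """
    return params_billion * bits_per_weight / 8


TOTAL_B = 196.0   # total parameters in billions, from the post
ACTIVE_B = 11.0   # active parameters per token in billions, from the post
RAM_GB = 128.0    # RAM figure quoted in the post

# Bit-widths below are illustrative assumptions, not figures from the post.
for bits in (16, 8, 4):
    gb = weight_memory_gb(TOTAL_B, bits)
    verdict = "fits" if gb < RAM_GB else "does not fit"
    print(f"{bits}-bit: ~{gb:.0f} GB of weights, {verdict} in {RAM_GB:.0f} GB")

# Share of the weights actually used per token in the MoE.
active_fraction = ACTIVE_B / TOTAL_B
print(f"active fraction: {active_fraction:.1%}")
```

Under these assumptions only something around 4 bits per weight leaves headroom in 128GB, while all the weights still have to be resident even though only about 1 in 18 is used per token.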

Benchmark highlights vs. other flash-tier models:

- SWE-Bench Pro: 56.26% (ahead of DeepSeek V4 Flash at 55.6% and Gemini 3.5 Flash at 55.1%)

- DeepSearchQA F1: 92.82%, competitive with GPT 5.5 (93.98%)

- HLE w/ tools: 47.2%, solid for a flash-class model

It punches well above its active-parameter weight on agentic and coding tasks. If you've got the RAM for it, it looks like a genuinely interesting local option, especially for agent workflows.

Available on OpenRouter and NVIDIA NIM if you don't want to self-host.

submitted by /u/Everlier