StepFun 3.7 Flash
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
StepFun dropped Step 3.7 Flash, 196B total / 11B active MoE, runs locally on 128GB RAM
It's a multimodal MoE (196B total params, only 11B active) with a built-in 1.8B ViT for vision.
Benchmark highlights vs. other flash-tier models:
- SWE-Bench Pro: 56.26% (beats DeepSeek V4 Flash at 55.6%, matches Gemini 3.5 Flash at 55.1%)
- DeepSearchQA F1: 92.82%, competitive with GPT 5.5 (93.98%)
- HLE w/ tools: 47.2%, solid for a flash-class model
Essentially punches well above its active parameter weight on agentic and coding tasks. If you've got the RAM for it, looks like a genuinely interesting local option, especially for agent workflows.
Available on OpenRouter and NVIDIA NIM if you don't want to self-host.
[link] [comments]
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.