StepFun 3.7 Flash - Speed Benchmark on M5 Max
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Just ran a benchmark with the llama.cpp branch that shipped on day 0. Setup: M5 Max with 128 GB, Q4_K_S quant. Memory peaked at 120+ GB, which made the system sluggish, but it was still usable once cmd+tab landed. Short contexts (< 16k) feel fast and very responsive. At 32k-64k the speed is not bad and still usable.
Now the Pelican bench: a very nice one, but with quite a long hand lol