r/LocalLLaMA · 1 min read

Quick note on sudden performance loss when running GGUFs


Had a couple of GGUFs (Qwen3.5-35B-A3B-APEX-I-Quality and an Unsloth model as well) that suddenly started showing erratic performance: token generation would abruptly drop from 20+ t/s to 5 t/s. Turned out both files had been damaged, quite possibly while I was manually embedding MTP layers (which, logically, shouldn't touch the source model at all..). I discovered it by running sha256sum and seeing that the checksums no longer matched. Redownloaded the models and everything was sorted.
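For anyone without sha256sum at hand, here is a minimal Python sketch of the same check. The function names `sha256_of_file` and `matches_expected` are made up for illustration, and it assumes you already have a reference hash for the file from wherever you downloaded it (the post doesn't say where the reference value came from).

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file.

    The file is read in chunks (1 MiB by default) so a multi-gigabyte
    GGUF never has to fit in memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest against a reference hash (hypothetical helper).

    The reference is stripped and lowercased so pasted values with stray
    whitespace or uppercase hex digits still compare correctly.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_expected("model.gguf", reference_hash)` (both arguments are placeholders) returns `False` if even a single byte of the file has changed, which is the situation described above.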

TLDR: if things get iffy, check that the model file's sha256sum still matches.

submitted by /u/yeah-ok