Is there a definitive or cookie-cutter way to benchmark variations of the same model for their KLD?
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
I'm looking to do some comparisons between different Qwopus3.6-27B-v2-NVFP4 models, namely A, B, and C.
In particular, I would like to measure the KLD (KL divergence) of each one.
How do I go about doing this?
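For context, here is a minimal sketch of what the measurement usually means. It assumes there is a reference model to compare against (for example the unquantized original, which this post does not name) and that per-token logits have been dumped from both the reference and each variant on the same text with the same tokenizer. The function names and the toy logits are hypothetical and for illustration only.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over one token position.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def token_kld(ref_logits, test_logits):
    # KL(P_ref || P_test) in nats for a single token position.
    ref_logp = log_softmax(ref_logits)
    test_logp = log_softmax(test_logits)
    return sum(math.exp(r) * (r - t) for r, t in zip(ref_logp, test_logp))

def mean_kld(ref_rows, test_rows):
    # Average the per-position KLD over all token positions.
    values = [token_kld(r, t) for r, t in zip(ref_rows, test_rows)]
    return sum(values) / len(values)

# Toy stand-ins for real logit dumps (hypothetical values, 2 positions, 3-token vocab).
reference = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
variants = {
    "A": [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]],   # identical to the reference
    "B": [[1.8, 0.6, -0.9], [0.2, 0.1, 2.7]],   # mildly perturbed
    "C": [[1.0, 1.0, 0.0], [1.0, 0.5, 1.5]],    # strongly perturbed
}

for name, rows in variants.items():
    print(name, mean_kld(reference, rows))
```

A lower mean KLD means the variant's next-token distribution stays closer to the reference. The numbers are only comparable across A, B, and C if all three are scored against the same reference on the same text.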