r/LocalLLaMA · 1 min read

NVFP4 Kimi K2.6 and Kimi K2.5 released by NVIDIA

Mirrored from r/LocalLLaMA.

The NVIDIA Kimi-K2.6-NVFP4 model is the NVFP4-quantized version of Moonshot AI's Kimi-K2.6, an auto-regressive language model built on an optimized transformer architecture. For more information, please check here. The model was quantized with NVIDIA's Model Optimizer.
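For readers unfamiliar with the format: NVFP4 stores weights as 4-bit floating-point values (E2M1) with a shared scale per small block of elements. The sketch below is a rough pure-Python illustration of block-scaled E2M1 rounding, not NVIDIA's implementation — real NVFP4 uses FP8 (E4M3) block scales plus a tensor-level scale, while here the block scale is kept as a plain float for clarity.

```python
# Representable E2M1 magnitudes: 0, 0.5, 1, 1.5, 2, 3, 4, 6 (plus sign bit).
E2M1_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(block):
    """Quantize one block: pick a scale so max |x| maps to 6, the largest E2M1 magnitude."""
    amax = max(abs(x) for x in block)
    scale = amax / 6.0 if amax > 0 else 1.0
    codes = []
    for x in block:
        # Round |x| / scale to the nearest representable E2M1 magnitude.
        mag = min(E2M1_LEVELS, key=lambda lv: abs(abs(x) / scale - lv))
        codes.append(-mag if x < 0 else mag)
    return scale, codes

def dequantize_block(scale, codes):
    return [c * scale for c in codes]

# Toy 8-element block (NVFP4 blocks are larger; values are illustrative).
weights = [0.12, -0.5, 0.33, 0.9, -1.2, 0.05, 0.0, 0.7]
scale, codes = quantize_block(weights)
recon = dequantize_block(scale, codes)
```

Because the scale is chosen per block, one outlier only degrades resolution within its own block rather than across the whole tensor — the main appeal of micro-block formats like NVFP4 over per-tensor 4-bit schemes.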

This model is ready for commercial/non-commercial use.

The accuracy benchmark results are presented in the table below:

| Precision | GPQA Diamond | SciCode | τ²-Bench Telecom | MMMU Pro | AA-LCR | IFBench |
|---|---|---|---|---|---|---|
| Baseline (INT4) | 90.9 | 52.6 | 98.2 | 75.6 | 71.0 | 73.9 |
| NVFP4 | 90.4 | 54.4 | 98.0 | 76.5 | 71.8 | 73.9 |

Baseline: Kimi-K2.6 in its native INT4 format. Benchmarks were run with temperature = 1.0, top_p = 0.95, and a maximum of 128,000 tokens.
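For context on why NVFP4 is roughly comparable to the INT4 baseline in footprint: assuming the publicly documented NVFP4 layout (4-bit E2M1 values with one FP8 E4M3 scale per 16-element micro-block — these block parameters come from the format description, not this post), the scales amortize to about half a bit per weight:

```python
# Back-of-envelope effective bits per weight under the assumed NVFP4 layout.
BITS_PER_VALUE = 4   # E2M1 payload
BITS_PER_SCALE = 8   # E4M3 block scale
BLOCK_SIZE = 16      # elements sharing one scale (per-tensor scale ignored)

effective_bits = BITS_PER_VALUE + BITS_PER_SCALE / BLOCK_SIZE
print(effective_bits)  # 4.5

# Approximate weight storage for a hypothetical 1e12-parameter model
# (the parameter count is illustrative, not taken from the post):
gib = 1e12 * effective_bits / 8 / 2**30
```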

Links:

https://huggingface.co/nvidia/Kimi-K2.6-NVFP4

https://huggingface.co/nvidia/Kimi-K2.5-NVFP4

submitted by /u/Opening-Broccoli9190

