NVFP4 Kimi K2.6 and Kimi K2.5 released by NVIDIA
Mirrored from r/LocalLLaMA.
The NVIDIA Kimi-K2.6-NVFP4 model is a quantized version of Moonshot AI's Kimi-K2.6, an auto-regressive language model built on an optimized transformer architecture. For more information, please check here. The NVFP4 version was quantized with NVIDIA's Model Optimizer.
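As a rough illustration of what NVFP4-style quantization does, here is a minimal pure-Python sketch: 4-bit E2M1 values sharing one scale per block. The block handling, scale encoding (real NVFP4 stores block scales in FP8 and uses hardware rounding), and the scale-selection rule are simplified assumptions for illustration, not NVIDIA's implementation.

```python
# Simplified NVFP4-style block quantization (illustrative only; real NVFP4
# uses 16-element blocks with FP8 block scales plus a per-tensor scale).

# E2M1 representable magnitudes; with a sign bit this is the full 4-bit grid.
E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
GRID = sorted({s * v for v in E2M1 for s in (-1.0, 1.0)})

def quantize_block(block):
    """Pick a scale so the largest |x| maps to 6.0 (the E2M1 max),
    then snap each scaled value to the nearest grid point."""
    amax = max(abs(x) for x in block)
    scale = amax / 6.0 if amax > 0 else 1.0
    q = [min(GRID, key=lambda g: abs(x / scale - g)) for x in block]
    return scale, q

def dequantize_block(scale, q):
    """Reconstruct approximate values from the shared scale and 4-bit codes."""
    return [scale * v for v in q]
```

Values that already sit on the scaled grid round-trip exactly; everything else lands on the nearest representable point, which is where the (small) accuracy deltas in the table below come from.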
This model is ready for commercial/non-commercial use.
The accuracy benchmark results are presented in the table below:
| Precision | GPQA Diamond | SciCode | τ²-Bench Telecom | MMMU Pro | AA-LCR | IFBench |
|---|---|---|---|---|---|---|
| Baseline (INT4) | 90.9 | 52.6 | 98.2 | 75.6 | 71.0 | 73.9 |
| NVFP4 | 90.4 | 54.4 | 98.0 | 76.5 | 71.8 | 73.9 |
Baseline: Kimi-K2.6 in its native INT4 format. Both configurations were benchmarked with temperature = 1.0, top_p = 0.95, and a maximum of 128,000 tokens.
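The benchmark settings above use standard nucleus (top-p) sampling. A minimal sketch of that procedure, using made-up logits rather than anything model-specific:

```python
import math
import random

def top_p_sample(logits, top_p=0.95, temperature=1.0, rng=random):
    """Nucleus sampling: keep the smallest set of tokens whose cumulative
    probability reaches top_p, then sample from that renormalized set."""
    # Softmax with temperature scaling.
    exps = [math.exp(l / temperature) for l in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Accumulate tokens in descending-probability order until top_p is covered.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Sample within the kept set, proportional to the original probabilities.
    total = sum(probs[i] for i in kept)
    r = rng.random() * total
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With temperature = 1.0 and top_p = 0.95, only the lowest-probability tail of the vocabulary is excluded at each step, so the evaluation exercises the model's full distribution rather than greedy decoding.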