r/LocalLLaMA · June 4, 2026 · 1 min read

cyankiwi AWQ 4-bit — 26.05 update, NVFP4 + FP8 Dynamic quantization and benchmarks across Qwen3.6 4-bit quants

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

We are happy to share the cyankiwi AWQ 26.05 update: an improved AWQ implementation, now with NVFP4 and FP8 Dynamic quantization support. We measured KL divergence (KLD) against the BF16 baseline for 4-bit Qwen3.6 quants, on GPQA Diamond responses synthesized by the BF16 Qwen3.6 models.

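The cyankiwi AWQ releases have the lowest KLD (lower is better, meaning closer to BF16) on both the 27B dense model and the 35B-A3B MoE: 0.018299 (AWQ-BF16-INT4) and 0.020443 (AWQ-INT4) on 27B, and 0.017126 (AWQ-4bit) on 35B-A3B.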
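For readers who want to sanity-check numbers like these, here is a minimal sketch of how a KLD score against a baseline can be computed from logits. The post does not spell out the exact procedure, so everything here is an assumption: the direction KL(BF16 || quant), averaging per token over the full vocabulary, natural-log units, and the helper names `log_softmax` and `mean_token_kld`. The toy arrays stand in for real model logits.

```python
import numpy as np

def log_softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable log-softmax over the last (vocabulary) axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def mean_token_kld(ref_logits, quant_logits) -> float:
    """Mean over token positions of KL(P_ref || P_quant), in nats.

    Both inputs have shape (num_tokens, vocab_size). Treating the
    reference (e.g. BF16) distribution as P is an assumption.
    """
    ref_logp = log_softmax(np.asarray(ref_logits, dtype=np.float64))
    quant_logp = log_softmax(np.asarray(quant_logits, dtype=np.float64))
    # KL per token: sum_v P(v) * (log P(v) - log Q(v))
    per_token = (np.exp(ref_logp) * (ref_logp - quant_logp)).sum(axis=-1)
    return float(per_token.mean())

# Toy stand-ins for real logits: 8 token positions, 32-entry vocabulary.
rng = np.random.default_rng(0)
ref = rng.normal(size=(8, 32))
noisy = ref + 0.05 * rng.normal(size=ref.shape)  # pretend quantization error

print(mean_token_kld(ref, ref))    # identical distributions -> 0.0
print(mean_token_kld(ref, noisy))  # small positive value
```

In a real run, both sets of logits would come from feeding the same token sequences (here, the BF16-generated responses) through the baseline and the quantized model, so the score reflects only the shift in predicted distributions.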

Qwen3.6-27B (dense)

Model	Weight size	KLD
Lorbus/Qwen3.6-27B-int4-AutoRound	17.69 GiB	0.031682
Intel/Qwen3.6-27B-int4-AutoRound	17.69 GiB	0.032569
sakamakismile/Qwen3.6-27B-NVFP4	18.36 GiB	0.092948
rdtand/Qwen3.6-27B-PrismaSCOUT-Blackwell-NVFP4-BF16-vllm	18.79 GiB	0.040911
cyankiwi/Qwen3.6-27B-AWQ-INT4	19.04 GiB	0.020443
berkerdooo/Qwen3.6-27B-NVFP4	19.15 GiB	0.043821
ocicek/Qwen3.6-27B-NVFP4	19.15 GiB	0.092993
QuantTrio/Qwen3.6-27B-AWQ	20.35 GiB	0.034925
unsloth/Qwen3.6-27B-NVFP4	24.57 GiB	0.039140
QuantTrio/Qwen3.6-27B-AWQ-6Bit	25.79 GiB	0.028084
cyankiwi/Qwen3.6-27B-AWQ-BF16-INT4	26.37 GiB	0.018299
cyankiwi/Qwen3.6-27B-AWQ-BF16-NVFP4	26.59 GiB	0.032549

Qwen3.6-35B-A3B (MoE)

Model	Weight size	KLD
Intel/Qwen3.6-35B-A3B-int4-mixed-AutoRound	20.02 GiB	0.032453
rdtand/Qwen3.6-35B-A3B-PrismaQuant-4.75bit-vllm	21.31 GiB	0.036303
nvidia/Qwen3.6-35B-A3B-NVFP4	21.82 GiB	0.029490
unsloth/Qwen3.6-35B-A3B-NVFP4	22.99 GiB	0.052754
cyankiwi/Qwen3.6-35B-A3B-AWQ-4bit	23.25 GiB	0.017126
RedHatAI/Qwen3.6-35B-A3B-NVFP4	23.32 GiB	0.046624
QuantTrio/Qwen3.6-35B-A3B-AWQ	23.71 GiB	0.020767
cyankiwi/Qwen3.6-35B-A3B-AWQ-NVFP4	23.86 GiB	0.026335

[Image: Qwen3.6 KLD]

submitted by /u/_cpatonn
