NVIDIA Nemotron 3 Ultra is out.
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Not sure how much this belongs in the "local" world, but it's interesting to see what they are putting out.