r/LocalLLaMA · May 26, 2026 · 2 min read

Qwen3.5 27B Uncensored Heretic Native MTP Preserved is Out Now With the Full 15 MTPs Preserved and Retained, Available in Safetensors, GGUFs, NVFP4, NVFP4 GGUFs and GPTQ-Int4 Formats!

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Qwen3.5 27B Uncensored Heretic Native MTP Preserved is Out Now With the Full 15 MTPs Preserved and Retained, Available in Safetensors, GGUFs, NVFP4, NVFP4 GGUFs and GPTQ-Int4 Formats!

Safetensors, llmfan46/Qwen3.5-27B-uncensored-heretic-v2-Native-MTP-Preserved: https://huggingface.co/llmfan46/Qwen3.5-27B-uncensored-heretic-v2-Native-MTP-Preserved

GGUFs, llmfan46/Qwen3.5-27B-uncensored-heretic-v2-Native-MTP-Preserved-GGUF: https://huggingface.co/llmfan46/Qwen3.5-27B-uncensored-heretic-v2-Native-MTP-Preserved-GGUF

NVFP4, llmfan46/Qwen3.5-27B-uncensored-heretic-v2-Native-MTP-Preserved-NVFP4: https://huggingface.co/llmfan46/Qwen3.5-27B-uncensored-heretic-v2-Native-MTP-Preserved-NVFP4

NVFP4 GGUFs, llmfan46/Qwen3.5-27B-uncensored-heretic-v2-Native-MTP-Preserved-NVFP4-GGUF: https://huggingface.co/llmfan46/Qwen3.5-27B-uncensored-heretic-v2-Native-MTP-Preserved-NVFP4-GGUF

GPTQ-Int4, llmfan46/Qwen3.5-27B-uncensored-heretic-v2-Native-MTP-Preserved-GPTQ-Int4: https://huggingface.co/llmfan46/Qwen3.5-27B-uncensored-heretic-v2-Native-MTP-Preserved-GPTQ-Int4

Comes with benchmark too.

Find all my models here: HuggingFace-LLMFan46

Now in case some people might ask, why release Qwen3.5 MTPs version when there is already Qwen3.6 MTPs version? Well the thing is, most people would assume that higher number = newer and better model, but the thing is both Qwen3.5 and Qwen3.6 models uses the qwen35 architecture, they just had different training and their focus are meant for different primary usecases, Qwen3.6 models are mainly meant for agentic and coding AI assistance and Qwen3.5 models are mainly meant for general purpose AI assistance, now Qwen3.6 can definitely be used for general AI assistance just like Qwen3.5 can definitely be used for agentic and coding, but if you want the most optimal usecases it would be Qwen3.6 for agentic and coding and Qwen3.5 for general AI assistance that is where each of them excels at.

Also for extra info, in case anyone is wondering, despite Qwen3.5 and Qwen3.6 both sharing the qwen35 architecture, they behave very diferently to abliteration. Qwen3.5 models can have a KL divergence in the 300's or 400's but on benchmarks this does not really translate to big loss of accuracy at all, for Qwen3.6 usually a KL divergence in the 400's+ could very well indicate a disatrous loss of accuracy and quality of the model, for pointer my Qwen3.6-35B-A3B had a KL divergence of only 0.0015 and yet already had a loss of accuracy of 0.32% while my Qwen3.6-27B had a KL divergence of 0.0021 and had an accuracy loss of 0.98%, while here with Qwen3.5-35B-A3B the model has a KL divergence of 0.0487 with an accuracy loss of 0.40% and my Qwen3.5-27B has a KL divergence of 0.0308 with an accuracy loss of 0.35%.

submitted by /u/LLMFan46
[link] [comments]

Discussion (0)

No comments yet. Sign in and be the first to say something.

Discussion (0)

More from r/LocalLLaMA