News / #model-release Tag Model releases 500 articles archived under #model-release · RSS Sign in to follow r/LocalLLaMA community 2d ago 10x Kaioken SSJ1 4th grade, worth it in 2026? Can it run Qwen3.6?   submitted by   /u/Ice94k [link]   [comments] 20 r/LocalLLaMA community 2d ago Koboldcpp v1.116 released   submitted by   /u/Fcking_Chuck [link]   [comments] 19 r/LocalLLaMA community 2d ago I had 55 LLMs blind-grade each other (22k judgments, all open). Every model family with enough data is biased toward its own siblings. Qwen judges favor Qwen by ~0.9 points. Mistral penalizes its own by ~1.0. I have been running an open evaluation setup where N models answer the same prompt, then blind-grade each other in an N x N matrix with self-judgments excluded. No single privileged judge. So far: 286 evaluations, 198 hand-written questions, 22,254 valid judgments across 55… 35 r/LocalLLaMA community 2d ago 2x RX 9060xt 16gb, is it worth it? I'm planning to buy 2x RX 9060xt with 16gb each to run Qwen 3.6 27B and alike. Would it be a good investment? How much tk/s should i expect in generation and prefill? I'm planning to use this as a coding agent in a large codebase. Currently I'm running this on my i7 64gb laptop… 35 Hugging Face Daily Papers research 2d ago COrigami: An AI Pipeline for Co-Designing Flat-Foldable Visually Recognisable Origami Abstract A computational origami system generates crease patterns from natural language using AI-driven optimization and aesthetic evaluation, enabling human-AI collaboration in mathematically constrained design. Generated by Qwen/Qwen2.5-Coder-32B-Instruct While generative AI… 11 r/LocalLLaMA community 2d ago I built a tool to turn your Claude Code sessions into fine-tuning data for local models If you use Claude Code, every session is already sitting on disk as a .jsonl file under ~/.claude/projects/ . It has real coding conversations: multi-turn edits, tool calls, reasoning traces. That's training data you already generated for free. The problem is the format is not… 36 r/LocalLLaMA community 2d ago Full document redaction with Qwen 3.6 27B with a Pi agent harness Link to full blog post with all method details, results, and links to all relevant code/skills/prompts at the bottom of this post. Apologies for not having more links throughout, it seems this subreddit restricts too many links in posts. Document redaction tasks are complex… 10 The Algorithmic Bridge news-outlet 2d ago The AI Industry as You Know It Died Today OpenAI announced GPT-5.6 but a terrible thing has happened 30 Hugging Face Daily Papers research 2d ago Fast LeWorldModel Abstract Fast-LeWM accelerates visual planning by replacing autoregressive rollout with parallel action-prefix prediction, reducing computational costs and latency accumulation during long-horizon predictions. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Joint-Embedding… 20 r/LocalLLaMA community 2d ago Ornith 35B is great so far Tried creating a quick 3d game with it, after 3 prompts, it got me this(checkvideo). If I compare this with qwen3.5-35b-a3b, it was not able to successfully generate this and was failing even after multiple prompts. Harness: Claude Code How is your experience so far ?… 4 r/LocalLLaMA community 2d ago Mythos was the first, now GPT-5.6 https://techcrunch.com/2026/06/26/openai-limits-gpt-5-6-rollout-after-government-request-says-restrictions-shouldnt-be-the-norm/ Either a hype before IPO, or they have just shot themselves in a foot. This is pretty much it for more advanced online models. Local LLM is one of the… 17 r/LocalLLaMA community 2d ago [NEW MODEL] - SupraSafety-18M · Tiny Content-Moderation Model Hey r/LocalLLaMA ! SupraLabs is back with a new model: SupraSafety-18M . It's a BERT-style 18M params model trained from scratch on 2 T4 GPUs in Kaggle on the nvidia/Nemotron-3.5-Content-Safety-Dataset dataset for 7 epochs. It's built to run on edge devices , mobile phones , or… 13 TechCrunch — AI news-outlet 2d ago Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on New models are launching in Asia that promise Mythos-like capabilities without fear of an export ban. U.S. AI labs may never recover this enormous market. 29 r/LocalLLaMA community 2d ago We built a calibration-aware Q4_K_M quant of Qwen3.5 0.8B that recovers 96.5% of the BF16 gap vs pure llama.cpp Q4_K_M (SpectralQuant) Hey everyone, We just released our first release candidate from Spectral Labs: a Qwen3.5 0.8B Q4_K_M built using a new calibration-aware quantization approach we're calling SpectralQuant . The goal here was to see if we could make a standard Q4_K_M footprint behave more like a… 15 Ahead of AI (Sebastian Raschka) research 2d ago Using Local Coding Agents Using Open-Weight Models in Local Coding Harnesses as an Alternative to Claude Code and Codex Subscriptions 17 r/LocalLLaMA community 2d ago Orthrus (diffusion head) trained Qwen 3.5/3.6 and Gemma 4 models are dropping soon "Hi all, we are finalized with our testing and are preparing the release pipeline. We will be releasing support for the Qwen3.5, Qwen3.6, and Gemma4 very soon. Alongside the model checkpoints, we will be open-sourcing our complete end-to-end training and evaluation code. Stay… 19 r/LocalLLaMA community 2d ago New deepseek vision model incoming? Hello guys, it seems like DeepSeek added a new vision mode to their application. Does this mean, that they will release a new vision model? Edit: Guys.it is not an OCR model. I have just asked it to describe multiple images, which had no text in them.   submitted by  … 19 Hacker News — AI on Front Page community 2d ago DeepSeek open-sources inference optimizations with 60–85% faster generation [pdf] Article URL: https://github.com/deepseek-ai/DeepSpec/blob/main/DSpark_paper.pdf Comments URL: https://news.ycombinator.com/item?id=48696585 Points: 219 # Comments: 43 19 llama.cpp releases dev-tools 2d ago b9823 ci : add windows-openvino to check-release ( #25022 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64… 11 r/LocalLLaMA community 2d ago Are there any qwen finetunes that were genuinely stronger than the base? It's pretty popular to finetune qwen models but I never hear anyone say anything positive about them.   submitted by   /u/MrMrsPotts [link]   [comments] 30 r/LocalLLaMA community 2d ago deepseek-ai/DeepSeek-V4-Pro-DSpark • Huggingface https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro-DSpark https://github.com/deepseek-ai/DeepSpec/blob/main/DSpark_paper.pdf   submitted by   /u/External_Mood4719 [link]   [comments] 18 Latent.Space news-outlet 3d ago [AINews] OpenAI GPT-5.6 Sol / Terra / Luna — restricted to trusted partners Oddly tiered releases to both OAI and ANT on the same day. 30 r/LocalLLaMA community 3d ago When can we expect merged DeepSeek V4 Flash / MiniMax M3 llama.cpp support? I am relatively new here, I have little experience in how long support development takes. I know there are forks. But not merged status means AFAIK that support is far from perfect. When can we expect stable full support for DeepSeek V4 Flash and/or MiniMax M3 in llama.cpp?… 4 TechCrunch — AI news-outlet 3d ago Trump Admin releases Anthropic Mythos to be used by more than 100 US companies, agencies Over 100 companies and government agencies are reportedly authorized to use Mythos 5, including their non-American employees. 12 Hugging Face Daily Papers research 3d ago ABACUS: Adapting Unified Foundation Model for Bridging Image Count Understanding and Generation Abstract ABACUS is a unified vision-language model that performs object counting and related tasks through innovative spatial grounding, boundary-aware counting policies, and self-critical learning strategies. Generated by Qwen/Qwen2.5-Coder-32B-Instruct ABACUS is a unified… 16 vLLM releases dev-tools 3d ago v0.24.0 [CI] Raise gsm8k startup timeout for MoE Refactor Qwen3 NVFP4 configs… 23 Hacker News — AI on Front Page community 3d ago US allows Anthropic to release Mythos to 'trusted partners' Article URL: https://www.reuters.com/technology/us-releases-anthropic-model-mythos-some-us-companies-semafor-reports-2026-06-26/ Comments URL: https://news.ycombinator.com/item?id=48692995 Points: 207 # Comments: 166 33 Hacker News — AI on Front Page community 3d ago U.S. allows Anthropic to release Mythos AI to ‘trusted’ US organizations https://archive.md/ArXuF https://www.nbcnews.com/tech/tech-news/us-government-gives-a... Comments URL: https://news.ycombinator.com/item?id=48692995 Points: 336 # Comments: 333 21 Simon Willison community 3d ago Quoting Dean W. Ball This is a bad state of affairs. Consider, in particular, some industry dynamics: Frontier models are trained at an enormous cost, and a significant fraction of that cost is recouped in the few post-release months that they are broadly available. After that period elapses, the… 34 LangChain releases dev-tools 3d ago langchain-anthropic==1.4.8 Changes since langchain-anthropic==1.4.7 release(anthropic): 1.4.8 ( #38490 ) fix(anthropic): keep initial text on content_block_start ( #38442 ) chore: bump langgraph-checkpoint from 4.1.0 to 4.1.1 in /libs/partners/anthropic ( #38479 ) fix(core): add messages to bare raise… 22 r/LocalLLaMA community 3d ago Can Qwen3.6-35B-A3B on an RTX 3060 Replace Google Vision for Receipt-to-JSON Extraction? I tried replacing Google Vision in my receipt pipeline with a local Qwen model. I had an old LINE message bot where I could send a receipt photo, it would go to Google Vision, get parsed into JSON, and saved in SQLite. Recently I tried again, but locally. Setup: RTX 3060 12GB… 8 Hugging Face Daily Papers research 3d ago Neglected Free Lunch from Post-training: Progress Advantage for LLM Agents Abstract Reinforcement learning post-training enables effective step-level scoring for language models without requiring dedicated reward model training by deriving an implicit advantage function called progress advantage. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Process… 6 Hugging Face Daily Papers research 3d ago Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation Abstract A unified agentic framework called Qwen-Image-Agent is proposed to address the context gap in text-to-image generation by progressively constructing complete generation context through planning, reasoning, searching, and memory mechanisms. Generated by… 22 Ollama releases dev-tools 3d ago v0.30.11 What's Changed launch: add thinking capability detection to opencode by @hoyyeva in #15434 launch: auto-install Claude Code by @hoyyeva in #16802 launch: auto-install opencode when missing by @hoyyeva in #16806 discover: fix inverted iGPU/dGPU Vulkan classification on Windows… 28 TechCrunch — AI news-outlet 3d ago OpenAI limits GPT-5.6 rollout after government request, says restrictions shouldn’t be the norm “We don’t believe this kind of government access process should become the long-term default,” says OpenAI. “It keeps the best tools from users, developers, enterprises, cyber defenders, and global partners who need them.” 7 Hacker News — AI on Front Page community 3d ago U.S. government will decide who gets to use GPT-5.6 https://archive.ph/PCQQl Comments URL: https://news.ycombinator.com/item?id=48690101 Points: 444 # Comments: 618 11 r/LocalLLaMA community 3d ago Streaming medical STT running locally on a MacBook Quick teaser of what I’ve been working on over the last few weeks: a streaming medical speech-to-text model that runs fully on-device. This demo is running locally on a MacBook through MLX. Still doing more evals, but planning to release the open weights next week.  … 22 llama.cpp releases dev-tools 3d ago b9817 openvino: Update to OV 2026.2.1, self-contained release packages, operator improvements ( #24974 ) Update to OV 2026.2.1, Make OV release packages self-contained Update to OV 2026.2.1, Make OV release packages self-contained OpenVINO Backend: Remove compute_op_type hardcoded… 23 Hacker News — AI on Front Page community 3d ago Previewing GPT‑5.6 Sol: a next-generation model Article URL: https://openai.com/index/previewing-gpt-5-6-sol/ Comments URL: https://news.ycombinator.com/item?id=48689028 Points: 222 # Comments: 199 29 Hugging Face Daily Papers research 3d ago Information-Aware KV Cache Compression for Long Reasoning Abstract InfoKV is an entropy-aware KV cache compression framework that enhances long-context reasoning in LLMs by incorporating information-theoretic signals alongside attention weights. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Reasoning capability has advanced rapidly in… 10 r/LocalLLaMA community 3d ago Gemma 4 12b needs glasses Having a lot of fun using Gemma 4 as an assistant, but is growing frustrated with the poor default image resolution setting for image vision. Tasks like identifying smaller text in an image that Qwen 3.6 flies through, Gemma 4 are never able to decipher. Even larger overall… 31 Don't Worry About the Vase community 3d ago White House Will Ad Hoc Decide Who Can Individually Access GPT-5.6 We have a new standard policy for releasing frontier AI models. It is not good. 6 Hugging Face Daily Papers research 3d ago EO-WM: A Physically Informed World Model for Probabilistic Earth Observation Forecasting Abstract EO-WM is a video diffusion transformer for multispectral Earth Observation forecasting that incorporates physically informed conditioning frameworks to better capture weather-driven uncertainties in land-surface dynamics. Generated by Qwen/Qwen2.5-Coder-32B-Instruct… 10 Hugging Face Daily Papers research 3d ago LISA: Likelihood Score Alignment for Visual-condition Controllable Generation Abstract Score-based generative modeling reveals that side networks contribute likelihood scores to conditional control, leading to improved training efficiency through likelihood score alignment regularization. Generated by Qwen/Qwen2.5-Coder-32B-Instruct The prevalent… 36 r/LocalLLaMA community 3d ago Combined RTX5080 & 4060 for inference ? Hey, I currently use my RTX 4060 8G for inference with Qwen 3.6-35B-A3B Q8 (q8 for everything weight,value,key) max 60k context per agent (for quality over speed, with CPU &DDR4 offloading) but : I only get ~100pp & 20tg at max when context is still low on Qwen 3.6-35B-A3B Q8,… 38 Hugging Face Daily Papers research 3d ago When Does Combining Language Models Help? A Co-Failure Ceiling on Routing, Voting, and Mixture-of-Agents Across 67 Frontier Models Abstract Multi-model systems face fundamental accuracy limits determined by the rate at which all models fail simultaneously, regardless of their individual correlations or ensemble strategies. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Multi-model LLM systems such as routing,… 11 OpenAI official-blog 3d ago Previewing GPT-5.6 Sol: a next-generation model OpenAI previews GPT-5.6 Sol, a next-generation model with stronger capabilities in coding, science, and cybersecurity, paired with its most advanced safety stack. 10 r/LocalLLaMA community 3d ago Help optimizing llama.cpp + Qwen 27B on RTX PRO 6000 Blackwell for coding agents Our company recently acquired a workstation with an RTX PRO 6000 Blackwell , and we're experimenting with local LLMs to reduce part of our Claude token usage. Right now we’re running Qwen3.6 27B MTP Q8_K_XL with llama.cpp on Windows 11 . I've been using both Claude Opus and… 13 Hugging Face Daily Papers research 3d ago CoffeeBench: Benchmarking Long-Horizon LLM Agents in Heterogeneous Multi-Agent Economies Abstract CoffeeBench evaluates LLM agents in a multi-agent economic simulation where firms interact over 90 days to maximize profits, revealing differences in communication patterns and performance among various models. Generated by Qwen/Qwen2.5-Coder-32B-Instruct As LLM agents… 4 LangChain releases dev-tools 3d ago langchain-fireworks==1.4.3 Changes since langchain-fireworks==1.4.2 release(fireworks): 1.4.3 chore: bump vcrpy from 8.1.1 to 8.2.1 in /libs/partners/fireworks ( #38314 ) chore: bump langsmith from 0.8.16 to 0.8.18 in /libs/partners/fireworks ( #38313 ) chore: bump langsmith from 0.8.14 to 0.8.16 in… 24 Page 2 of 10 · 500 articles ← Newer Older →