Tag

Model releases

500 articles archived under #model-release · RSS

r/LocalLLaMA community 2d ago

10x Kaioken SSJ1 4th grade, worth it in 2026? Can it run Qwen3.6?

  submitted by   /u/Ice94k [link]   [comments]

20
r/LocalLLaMA community 2d ago

Koboldcpp v1.116 released

  submitted by   /u/Fcking_Chuck [link]   [comments]

19
r/LocalLLaMA community 2d ago

I had 55 LLMs blind-grade each other (22k judgments, all open). Every model family with enough data is biased toward its own siblings. Qwen judges favor Qwen by ~0.9 points. Mistral penalizes its own by ~1.0.

I have been running an open evaluation setup where N models answer the same prompt, then blind-grade each other in an N x N matrix with self-judgments excluded. No single privileged judge. So far: 286 evaluations, 198 hand-written questions, 22,254 valid judgments across 55…

35
r/LocalLLaMA community 2d ago

2x RX 9060xt 16gb, is it worth it?

I'm planning to buy 2x RX 9060xt with 16gb each to run Qwen 3.6 27B and alike. Would it be a good investment? How much tk/s should i expect in generation and prefill? I'm planning to use this as a coding agent in a large codebase. Currently I'm running this on my i7 64gb laptop…

35
Hugging Face Daily Papers research 2d ago

COrigami: An AI Pipeline for Co-Designing Flat-Foldable Visually Recognisable Origami

Abstract A computational origami system generates crease patterns from natural language using AI-driven optimization and aesthetic evaluation, enabling human-AI collaboration in mathematically constrained design. Generated by Qwen/Qwen2.5-Coder-32B-Instruct While generative AI…

11
r/LocalLLaMA community 2d ago

I built a tool to turn your Claude Code sessions into fine-tuning data for local models

If you use Claude Code, every session is already sitting on disk as a .jsonl file under ~/.claude/projects/ . It has real coding conversations: multi-turn edits, tool calls, reasoning traces. That's training data you already generated for free. The problem is the format is not…

36
r/LocalLLaMA community 2d ago

Full document redaction with Qwen 3.6 27B with a Pi agent harness

Link to full blog post with all method details, results, and links to all relevant code/skills/prompts at the bottom of this post. Apologies for not having more links throughout, it seems this subreddit restricts too many links in posts. Document redaction tasks are complex…

10
The Algorithmic Bridge news-outlet 2d ago

The AI Industry as You Know It Died Today

OpenAI announced GPT-5.6 but a terrible thing has happened

30
Hugging Face Daily Papers research 2d ago

Fast LeWorldModel

Abstract Fast-LeWM accelerates visual planning by replacing autoregressive rollout with parallel action-prefix prediction, reducing computational costs and latency accumulation during long-horizon predictions. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Joint-Embedding…

20
r/LocalLLaMA community 2d ago

Ornith 35B is great so far

Tried creating a quick 3d game with it, after 3 prompts, it got me this(checkvideo). If I compare this with qwen3.5-35b-a3b, it was not able to successfully generate this and was failing even after multiple prompts. Harness: Claude Code How is your experience so far ?…

4
r/LocalLLaMA community 2d ago

Mythos was the first, now GPT-5.6

https://techcrunch.com/2026/06/26/openai-limits-gpt-5-6-rollout-after-government-request-says-restrictions-shouldnt-be-the-norm/ Either a hype before IPO, or they have just shot themselves in a foot. This is pretty much it for more advanced online models. Local LLM is one of the…

17
r/LocalLLaMA community 2d ago

[NEW MODEL] - SupraSafety-18M · Tiny Content-Moderation Model

Hey r/LocalLLaMA ! SupraLabs is back with a new model: SupraSafety-18M . It's a BERT-style 18M params model trained from scratch on 2 T4 GPUs in Kaggle on the nvidia/Nemotron-3.5-Content-Safety-Dataset dataset for 7 epochs. It's built to run on edge devices , mobile phones , or…

13
TechCrunch — AI news-outlet 2d ago

Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on

New models are launching in Asia that promise Mythos-like capabilities without fear of an export ban. U.S. AI labs may never recover this enormous market.

29
r/LocalLLaMA community 2d ago

We built a calibration-aware Q4_K_M quant of Qwen3.5 0.8B that recovers 96.5% of the BF16 gap vs pure llama.cpp Q4_K_M (SpectralQuant)

Hey everyone, We just released our first release candidate from Spectral Labs: a Qwen3.5 0.8B Q4_K_M built using a new calibration-aware quantization approach we're calling SpectralQuant . The goal here was to see if we could make a standard Q4_K_M footprint behave more like a…

15
Ahead of AI (Sebastian Raschka) research 2d ago

Using Local Coding Agents

Using Open-Weight Models in Local Coding Harnesses as an Alternative to Claude Code and Codex Subscriptions

17
r/LocalLLaMA community 2d ago

Orthrus (diffusion head) trained Qwen 3.5/3.6 and Gemma 4 models are dropping soon

"Hi all, we are finalized with our testing and are preparing the release pipeline. We will be releasing support for the Qwen3.5, Qwen3.6, and Gemma4 very soon. Alongside the model checkpoints, we will be open-sourcing our complete end-to-end training and evaluation code. Stay…

19
r/LocalLLaMA community 2d ago

New deepseek vision model incoming?

Hello guys, it seems like DeepSeek added a new vision mode to their application. Does this mean, that they will release a new vision model? Edit: Guys.it is not an OCR model. I have just asked it to describe multiple images, which had no text in them.   submitted by  …

19
Hacker News — AI on Front Page community 2d ago

DeepSeek open-sources inference optimizations with 60–85% faster generation [pdf]

Article URL: https://github.com/deepseek-ai/DeepSpec/blob/main/DSpark_paper.pdf Comments URL: https://news.ycombinator.com/item?id=48696585 Points: 219 # Comments: 43

19
llama.cpp releases dev-tools 2d ago

b9823

ci : add windows-openvino to check-release ( #25022 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…

11
r/LocalLLaMA community 2d ago

Are there any qwen finetunes that were genuinely stronger than the base?

It's pretty popular to finetune qwen models but I never hear anyone say anything positive about them.   submitted by   /u/MrMrsPotts [link]   [comments]

30
r/LocalLLaMA community 2d ago

deepseek-ai/DeepSeek-V4-Pro-DSpark • Huggingface

https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro-DSpark https://github.com/deepseek-ai/DeepSpec/blob/main/DSpark_paper.pdf   submitted by   /u/External_Mood4719 [link]   [comments]

18
Latent.Space news-outlet 3d ago

[AINews] OpenAI GPT-5.6 Sol / Terra / Luna — restricted to trusted partners

Oddly tiered releases to both OAI and ANT on the same day.

30
r/LocalLLaMA community 3d ago

When can we expect merged DeepSeek V4 Flash / MiniMax M3 llama.cpp support?

I am relatively new here, I have little experience in how long support development takes. I know there are forks. But not merged status means AFAIK that support is far from perfect. When can we expect stable full support for DeepSeek V4 Flash and/or MiniMax M3 in llama.cpp?…

4
TechCrunch — AI news-outlet 3d ago

Trump Admin releases Anthropic Mythos to be used by more than 100 US companies, agencies

Over 100 companies and government agencies are reportedly authorized to use Mythos 5, including their non-American employees.

12
Hugging Face Daily Papers research 3d ago

ABACUS: Adapting Unified Foundation Model for Bridging Image Count Understanding and Generation

Abstract ABACUS is a unified vision-language model that performs object counting and related tasks through innovative spatial grounding, boundary-aware counting policies, and self-critical learning strategies. Generated by Qwen/Qwen2.5-Coder-32B-Instruct ABACUS is a unified…

16
vLLM releases dev-tools 3d ago

v0.24.0

[CI] Raise gsm8k startup timeout for MoE Refactor Qwen3 NVFP4 configs…

23
Hacker News — AI on Front Page community 3d ago

US allows Anthropic to release Mythos to 'trusted partners'

Article URL: https://www.reuters.com/technology/us-releases-anthropic-model-mythos-some-us-companies-semafor-reports-2026-06-26/ Comments URL: https://news.ycombinator.com/item?id=48692995 Points: 207 # Comments: 166

33
Hacker News — AI on Front Page community 3d ago

U.S. allows Anthropic to release Mythos AI to ‘trusted’ US organizations

https://archive.md/ArXuF https://www.nbcnews.com/tech/tech-news/us-government-gives-a... Comments URL: https://news.ycombinator.com/item?id=48692995 Points: 336 # Comments: 333

21
Simon Willison community 3d ago

Quoting Dean W. Ball

This is a bad state of affairs. Consider, in particular, some industry dynamics: Frontier models are trained at an enormous cost, and a significant fraction of that cost is recouped in the few post-release months that they are broadly available. After that period elapses, the…

34
LangChain releases dev-tools 3d ago

langchain-anthropic==1.4.8

Changes since langchain-anthropic==1.4.7 release(anthropic): 1.4.8 ( #38490 ) fix(anthropic): keep initial text on content_block_start ( #38442 ) chore: bump langgraph-checkpoint from 4.1.0 to 4.1.1 in /libs/partners/anthropic ( #38479 ) fix(core): add messages to bare raise…

22
r/LocalLLaMA community 3d ago

Can Qwen3.6-35B-A3B on an RTX 3060 Replace Google Vision for Receipt-to-JSON Extraction?

I tried replacing Google Vision in my receipt pipeline with a local Qwen model. I had an old LINE message bot where I could send a receipt photo, it would go to Google Vision, get parsed into JSON, and saved in SQLite. Recently I tried again, but locally. Setup: RTX 3060 12GB…

8
Hugging Face Daily Papers research 3d ago

Neglected Free Lunch from Post-training: Progress Advantage for LLM Agents

Abstract Reinforcement learning post-training enables effective step-level scoring for language models without requiring dedicated reward model training by deriving an implicit advantage function called progress advantage. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Process…

6
Hugging Face Daily Papers research 3d ago

Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation

Abstract A unified agentic framework called Qwen-Image-Agent is proposed to address the context gap in text-to-image generation by progressively constructing complete generation context through planning, reasoning, searching, and memory mechanisms. Generated by…

22
Ollama releases dev-tools 3d ago

v0.30.11

What's Changed launch: add thinking capability detection to opencode by @hoyyeva in #15434 launch: auto-install Claude Code by @hoyyeva in #16802 launch: auto-install opencode when missing by @hoyyeva in #16806 discover: fix inverted iGPU/dGPU Vulkan classification on Windows…

28
TechCrunch — AI news-outlet 3d ago

OpenAI limits GPT-5.6 rollout after government request, says restrictions shouldn’t be the norm

“We don’t believe this kind of government access process should become the long-term default,” says OpenAI. “It keeps the best tools from users, developers, enterprises, cyber defenders, and global partners who need them.”

7
Hacker News — AI on Front Page community 3d ago

U.S. government will decide who gets to use GPT-5.6

https://archive.ph/PCQQl Comments URL: https://news.ycombinator.com/item?id=48690101 Points: 444 # Comments: 618

11
r/LocalLLaMA community 3d ago

Streaming medical STT running locally on a MacBook

Quick teaser of what I’ve been working on over the last few weeks: a streaming medical speech-to-text model that runs fully on-device. This demo is running locally on a MacBook through MLX. Still doing more evals, but planning to release the open weights next week.  …

22
llama.cpp releases dev-tools 3d ago

b9817

openvino: Update to OV 2026.2.1, self-contained release packages, operator improvements ( #24974 ) Update to OV 2026.2.1, Make OV release packages self-contained Update to OV 2026.2.1, Make OV release packages self-contained OpenVINO Backend: Remove compute_op_type hardcoded…

23
Hacker News — AI on Front Page community 3d ago

Previewing GPT‑5.6 Sol: a next-generation model

Article URL: https://openai.com/index/previewing-gpt-5-6-sol/ Comments URL: https://news.ycombinator.com/item?id=48689028 Points: 222 # Comments: 199

29
Hugging Face Daily Papers research 3d ago

Information-Aware KV Cache Compression for Long Reasoning

Abstract InfoKV is an entropy-aware KV cache compression framework that enhances long-context reasoning in LLMs by incorporating information-theoretic signals alongside attention weights. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Reasoning capability has advanced rapidly in…

10
r/LocalLLaMA community 3d ago

Gemma 4 12b needs glasses

Having a lot of fun using Gemma 4 as an assistant, but is growing frustrated with the poor default image resolution setting for image vision. Tasks like identifying smaller text in an image that Qwen 3.6 flies through, Gemma 4 are never able to decipher. Even larger overall…

31
Don't Worry About the Vase community 3d ago

White House Will Ad Hoc Decide Who Can Individually Access GPT-5.6

We have a new standard policy for releasing frontier AI models. It is not good.

6
Hugging Face Daily Papers research 3d ago

EO-WM: A Physically Informed World Model for Probabilistic Earth Observation Forecasting

Abstract EO-WM is a video diffusion transformer for multispectral Earth Observation forecasting that incorporates physically informed conditioning frameworks to better capture weather-driven uncertainties in land-surface dynamics. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…

10
Hugging Face Daily Papers research 3d ago

LISA: Likelihood Score Alignment for Visual-condition Controllable Generation

Abstract Score-based generative modeling reveals that side networks contribute likelihood scores to conditional control, leading to improved training efficiency through likelihood score alignment regularization. Generated by Qwen/Qwen2.5-Coder-32B-Instruct The prevalent…

36
r/LocalLLaMA community 3d ago

Combined RTX5080 & 4060 for inference ?

Hey, I currently use my RTX 4060 8G for inference with Qwen 3.6-35B-A3B Q8 (q8 for everything weight,value,key) max 60k context per agent (for quality over speed, with CPU &DDR4 offloading) but : I only get ~100pp & 20tg at max when context is still low on Qwen 3.6-35B-A3B Q8,…

38
Hugging Face Daily Papers research 3d ago

When Does Combining Language Models Help? A Co-Failure Ceiling on Routing, Voting, and Mixture-of-Agents Across 67 Frontier Models

Abstract Multi-model systems face fundamental accuracy limits determined by the rate at which all models fail simultaneously, regardless of their individual correlations or ensemble strategies. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Multi-model LLM systems such as routing,…

11
OpenAI official-blog 3d ago

Previewing GPT-5.6 Sol: a next-generation model

OpenAI previews GPT-5.6 Sol, a next-generation model with stronger capabilities in coding, science, and cybersecurity, paired with its most advanced safety stack.

10
r/LocalLLaMA community 3d ago

Help optimizing llama.cpp + Qwen 27B on RTX PRO 6000 Blackwell for coding agents

Our company recently acquired a workstation with an RTX PRO 6000 Blackwell , and we're experimenting with local LLMs to reduce part of our Claude token usage. Right now we’re running Qwen3.6 27B MTP Q8_K_XL with llama.cpp on Windows 11 . I've been using both Claude Opus and…

13
Hugging Face Daily Papers research 3d ago

CoffeeBench: Benchmarking Long-Horizon LLM Agents in Heterogeneous Multi-Agent Economies

Abstract CoffeeBench evaluates LLM agents in a multi-agent economic simulation where firms interact over 90 days to maximize profits, revealing differences in communication patterns and performance among various models. Generated by Qwen/Qwen2.5-Coder-32B-Instruct As LLM agents…

4
LangChain releases dev-tools 3d ago

langchain-fireworks==1.4.3

Changes since langchain-fireworks==1.4.2 release(fireworks): 1.4.3 chore: bump vcrpy from 8.1.1 to 8.2.1 in /libs/partners/fireworks ( #38314 ) chore: bump langsmith from 0.8.16 to 0.8.18 in /libs/partners/fireworks ( #38313 ) chore: bump langsmith from 0.8.14 to 0.8.16 in…

24

10x Kaioken SSJ1 4th grade, worth it in 2026? Can it run Qwen3.6?

Koboldcpp v1.116 released

I had 55 LLMs blind-grade each other (22k judgments, all open). Every model family with enough data is biased toward its own siblings. Qwen judges favor Qwen by ~0.9 points. Mistral penalizes its own by ~1.0.

2x RX 9060xt 16gb, is it worth it?

COrigami: An AI Pipeline for Co-Designing Flat-Foldable Visually Recognisable Origami

I built a tool to turn your Claude Code sessions into fine-tuning data for local models

Full document redaction with Qwen 3.6 27B with a Pi agent harness

The AI Industry as You Know It Died Today

Fast LeWorldModel

Ornith 35B is great so far

Mythos was the first, now GPT-5.6

[NEW MODEL] - SupraSafety-18M · Tiny Content-Moderation Model

Asian AI startups launch Mythos-like models as Anthropic&#8217;s export ban drags on

We built a calibration-aware Q4_K_M quant of Qwen3.5 0.8B that recovers 96.5% of the BF16 gap vs pure llama.cpp Q4_K_M (SpectralQuant)

Using Local Coding Agents

Orthrus (diffusion head) trained Qwen 3.5/3.6 and Gemma 4 models are dropping soon

New deepseek vision model incoming?

DeepSeek open-sources inference optimizations with 60–85% faster generation [pdf]

b9823

Are there any qwen finetunes that were genuinely stronger than the base?

deepseek-ai/DeepSeek-V4-Pro-DSpark • Huggingface

[AINews] OpenAI GPT-5.6 Sol / Terra / Luna — restricted to trusted partners

When can we expect merged DeepSeek V4 Flash / MiniMax M3 llama.cpp support?

Trump Admin releases Anthropic Mythos to be used by more than 100 US companies, agencies

ABACUS: Adapting Unified Foundation Model for Bridging Image Count Understanding and Generation

v0.24.0

US allows Anthropic to release Mythos to 'trusted partners'

U.S. allows Anthropic to release Mythos AI to ‘trusted’ US organizations

Quoting Dean W. Ball

langchain-anthropic==1.4.8

Can Qwen3.6-35B-A3B on an RTX 3060 Replace Google Vision for Receipt-to-JSON Extraction?

Neglected Free Lunch from Post-training: Progress Advantage for LLM Agents

Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation

v0.30.11

OpenAI limits GPT-5.6 rollout after government request, says restrictions shouldn’t be the norm

U.S. government will decide who gets to use GPT-5.6

Streaming medical STT running locally on a MacBook

b9817

Previewing GPT‑5.6 Sol: a next-generation model

Information-Aware KV Cache Compression for Long Reasoning

Gemma 4 12b needs glasses

White House Will Ad Hoc Decide Who Can Individually Access GPT-5.6

EO-WM: A Physically Informed World Model for Probabilistic Earth Observation Forecasting

LISA: Likelihood Score Alignment for Visual-condition Controllable Generation

Combined RTX5080 & 4060 for inference ?

When Does Combining Language Models Help? A Co-Failure Ceiling on Routing, Voting, and Mixture-of-Agents Across 67 Frontier Models

Previewing GPT-5.6 Sol: a next-generation model

Help optimizing llama.cpp + Qwen 27B on RTX PRO 6000 Blackwell for coding agents

CoffeeBench: Benchmarking Long-Horizon LLM Agents in Heterogeneous Multi-Agent Economies

langchain-fireworks==1.4.3

Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on