Tag

Model releases

500 articles archived under #model-release · RSS

arXiv — NLP / Computation & Language research 30m ago

Open but Incompatible: A License Compatibility Analysis of Corpora for Low-Resource African Languages

arXiv:2606.28867v1 Announce Type: new Abstract: Creative Commons licenses dominate African NLP corpus releases, but their compatibility rules are rarely applied. CC-BY-SA and CC-BY-NC cannot be combined in a single published dataset; a NoDerivs clause silently prohibits…

28
arXiv — NLP / Computation & Language research 30m ago

Fine-Tuning General-Purpose Large Language Models for Agricultural Applications:A Reproducible Framework and Evaluation Protocol Based on Qwen3-8B

arXiv:2606.28992v1 Announce Type: new Abstract: General-purpose large language models (LLMs) have demonstrated strong abilities in opendomain question answering, information extraction, and text generation. Agricultural applications, however, are domain-specific,…

20
arXiv — NLP / Computation & Language research 30m ago

Fast Numbers, Slow Language: Bridging Quantitative and Qualitative Earnings Signals

arXiv:2606.29734v1 Announce Type: new Abstract: Earnings announcements release two types of information sequentially: quantitative surprise (numeric earnings-per-share (EPS)/revenue versus analyst estimate) arrives first in press releases and financial news, processed by…

12
arXiv — NLP / Computation & Language research 30m ago

Are We Measuring Strategy or Phrasing? The Gap Between Surface- and Approach-Level Diversity in LLM Math Reasoning

arXiv:2606.29985v1 Announce Type: new Abstract: Diversity in LLM mathematical reasoning is critical for exploration, but common diversity metrics mostly capture surface-level variation rather than differences in how a problem is solved. We address this gap by introducing…

27
TechCrunch — AI news-outlet 2h ago

Vibe coding platform Base44 launches own model as AI startups seek defensibility

Wix-owned vibe coding platform Base44 has started rolling out its own AI model — with hopes that it will eventually outperform frontier models.

8
r/LocalLLaMA community 4h ago

Been running Qwen3.6-27B through a 3-critic harness. The harness matters more than I thought

Been running Qwen3.6-27B (8-bit) through my coding harness for a few days, alongside GLM5.2. The harness uses 3 critics — code review, test review, Playwright e2e — each with fresh context before accepting output. Qwen3.6 is legit for a 27B dense model. Benchmarks weren't lying.…

19
r/LocalLLaMA community 5h ago

Introducing LongCat-2.0 - , a large-scale MoE language model with 1.6 trillion total parameters and ~48 billion activated per token. This was the stealth model that was on Openrouter under the name 'owl-alpha'.

  submitted by   /u/AnticitizenPrime [link]   [comments]

18
LangChain releases dev-tools 8h ago

langchain-openrouter==0.2.5

Changes since langchain-openrouter==0.2.4 release(openrouter): 0.2.5 ( #38553 ) fix(openrouter): deduplicate repeated finish metadata ( #38552 ) fix(openrouter): strip Responses reasoning IDs ( #38383 )

32
r/LocalLLaMA community 9h ago

It’s time, Sam, it’s time.

I mean….. I’m no CEO…. but it seems like this would be the absolute perfect time to drop a super powerful GPT-OSS-2 to throw a big ol’ wet blanket on Anthropic’s IPO. It doesn’t need to be like frontier or anything, just a 20b and a 120b that is as fast as the old versions, add…

31
Ollama releases dev-tools 9h ago

v0.31.0

launch: check for min version for hermes desktop ( #16912 )

4
r/LocalLLaMA community 10h ago

DeepSeek V4, PR merged into llama.cpp !

The PR : https://github.com/ggml-org/llama.cpp/pull/24162 All to git pull, cmake , and download GGUFs ! A vos marques, prêt, partez !   submitted by   /u/Squik67 [link]   [comments]

4
r/LocalLLaMA community 10h ago

Qwen3-tts.cpp + Compose Desktop GUI

I improved my qwen3-tts.cpp implementation to be about 5x realtime on my RTX 5080. It is GGML based, so it should compile and run anywhere - however I only tested it with CPU & CUDA under Windows & Linux: https://github.com/Danmoreng/qwen3-tts.cpp Additionally I made a Desktop…

13
TechCrunch — AI news-outlet 10h ago

Anthropic and Gov. Newsom forge deal allowing California government to use Claude at half price

As Anthropic forges a closer relationship with the state of California, the federal government has made an enemy out of the OpenAI rival.

26
TechCrunch — AI news-outlet 10h ago

Arena, the AI leaderboard everyone uses, is now a $100M business

The startup, which runs a popular free AI leaderboard, launched its commercial service just last September.

23
Hacker News — AI on Front Page community 11h ago

Qwen 3.6 27B is the sweet spot for local development

Article URL: https://quesma.com/blog/qwen-36-is-awesome/ Comments URL: https://news.ycombinator.com/item?id=48721903 Points: 204 # Comments: 133

7
TechCrunch — AI news-outlet 11h ago

Cursor now has a mobile app for guiding your coding agent on the go

Cursor has launched a new mobile app for remote oversight over coding agents.

29
r/MachineLearning community 12h ago

I'm trying to implement CALM paper, and I have some questions. [P]

Hello, I'm trying to implement the Pocket TTS by kyutai-labs represented by this paper . Since they have didn't released the training/fine-tuning code. I'm trying to implement it on my own for learning some stuff. I have read the paper, tried to implement it with much more…

34
Simon Willison community 12h ago

Ornith-1.0: Self-Scaffolding LLMs for Agentic Coding

Ornith-1.0: Self-Scaffolding LLMs for Agentic Coding This is an interesting new open weights (MIT licensed) model, the first model release from DeepReinforce. [...] with variants including 9B Dense, 31B Dense, 35B MoE, and 397B MoE. Built on top of pretrained Gemma 4 and Qwen…

5
r/MachineLearning community 12h ago

Adaptive Mixture of Experts Gate (AMG) [R]

[Project] Post-hoc Adaptive MoE Gating on Qwen3.6-35B — empirical benchmarking of an open research gap Adaptive MoE routing — selecting a variable number of experts per token based on routing confidence — has been studied in papers (XMoE 2024, DynMoE ICLR 2025, TopP routing…

5
r/LocalLLaMA community 13h ago

Going from single GPU to dual GPU is nice but not in the way I expected

I was expecting what when doubling my VRAM from 24gb to 2x24gb I'd use higher quants with more context, and thus get smarter LLMs, but that's not what it ended up happening. At least for coding, I found that the difference in quality from, say, qwen 27B UD-Q4-XL to a Q6 or Q8 is…

21
r/LocalLLaMA community 13h ago

Instead of decentralized training effort we should build the “One dataset”

There are many threads here calling for united LLM training run of a new open model. Mainly, after govt. stunt of banning commercial frontier models. And also due to the lack of small-medium open-weight models releases lately. I genuinelly believe at some point we’ll have “SETI…

38
Hacker News — AI on Front Page community 14h ago

Rocketlab acquires Iridium

Article URL: https://investors.rocketlabcorp.com/news-releases/news-release-details/rocket-lab-acquire-iridium-historic-deal-creating-fully Comments URL: https://news.ycombinator.com/item?id=48719485 Points: 222 # Comments: 136

25
r/LocalLLaMA community 16h ago

Deepseek V4 Official Launch to be released mid-July with API price changes

Is this the official release for deepseek? I hope it has huge improvements https://preview.redd.it/dm5l0qn8k7ah1.png?width=694&format=png&auto=webp&s=12eadfd0a52c0f1a65bcd685f2cdbb29aff457be   submitted by   /u/jmorant555 [link]   [comments]

22
llama.cpp releases dev-tools 18h ago

b9840

DeepSeek V4 ( #24162 ) convert: add dsv4 conversion add basic setup add llm_graph_input_dsv4 add save-load state add sinkhorn eps - correction by @fairydreaming add rope fix cleanup dead code fix bugs support pro model: added by @fairydreaming remove redundant V cache Chat…

26
r/LocalLLaMA community 18h ago

DeepSeek V4 official version will be launch on mid-July

https://preview.redd.it/n7rwh262b7ah1.jpg?width=1024&format=pjpg&auto=webp&s=33d775b456843cd2dbd458de89384a6a7d6d87d1 Source: Email sent from deepseek (email only available for chinese user) used gpt image 2 translate image into english   submitted by  …

34
r/LocalLLaMA community 19h ago

DeepSeek V4 by am17an · Pull Request #24162 · ggml-org/llama.cpp

now you can run DeepSeek V4 locally   submitted by   /u/jacek2023 [link]   [comments]

26
r/LocalLLaMA community 20h ago

GLM 5.2 Q1_S vs Qwen 27B Q8

TL;DR; GLM-5.2 Q1_S beats Qwen 3.6 27B Q8, both run at KV Q8 edit: GLM run a K & V Q8, Qwen run with KV cache at full FP16., with preserve thinking on. Disclaimer : This is a hobby/amateur comparison with n=1, so go easy on it. I just thought it would be fun to share. The…

11
r/LocalLLaMA community 20h ago

MiCA is now part of Hugging Face PEFT

Glad to share that MiCA, short for Minor Component Adaptation, has now been merged into the HuggingFace PEFT library. It is not yet included in the latest PyPI release, but you can already install it directly from PEFT main: pip install --upgrade…

18
Vercel — AI dev-tools 21h ago

Build realtime voice agents on AI Gateway

AI Gateway now supports audio/voice. You can add realtime voice, text to speech, and speech to text with the same calls you already use for text, image, and video, routed through AI Gateway alongside every other modality. Audio launches with models from OpenAI and xAI . Each…

26
arXiv — Machine Learning research 1d ago

Unified Zero-Shot Time Series Forecasting: A Darts Foundation

arXiv:2606.27438v1 Announce Type: new Abstract: Since its initial release in 2020, Darts has become a widely used open-source Python library for time series analysis. A series of foundation models have recently claimed accuracy improvements in zero-shot forecasting, promising a…

15
arXiv — Machine Learning research 1d ago

FoggyTrust: Robust Federated Learning with Hierarchical Trust Networks

arXiv:2606.27622v1 Announce Type: new Abstract: Byzantine-robust federated learning seeks to protect distributed model training from malicious or corrupted clients without requiring access to their private data. FLTrust addresses this challenge by introducing a trusted…

33
arXiv — Machine Learning research 1d ago

Recovering Sharp Conductivity Features in the Finite-Data Calder\'on Problem with Physics-Informed Neural Networks

arXiv:2606.28158v1 Announce Type: new Abstract: Physics-informed neural networks (PINNs) have recently emerged as a promising framework for addressing the Calder\'on inverse problem from limited boundary data. In this work, we revisit neural Calder\'on inversion by introducing…

31
arXiv — Machine Learning research 1d ago

Qwen-Image-2.0-RL Technical Report

arXiv:2606.27608v1 Announce Type: cross Abstract: We present Qwen-Image-2.0-RL, a post-training pipeline that applies reinforcement learning from human feedback (RLHF) and on-policy distillation (OPD) to improve both the visual quality and instruction-following capability of the…

34
r/LocalLLaMA community 1d ago

I built an agent Harness for Small Models. I got Qwen 3.5 4b managing servers.

This is something I've been working on, I like playing around with smaller local models but found most agent harness's not well suited for them. The failure modes across different model family's tend to be the same: Failed tool calls Poor varication of environment variables Poor…

12
Vercel — AI dev-tools 1d ago

xAI Grok audio models now available on Vercel AI Gateway

xAI's audio models are now live on AI Gateway. Realtime voice, text to speech, and speech to text are all available through the AI SDK with the same routing, observability, and spend controls as your other models. These capabilities are available on the AI SDK 7 release.…

11
r/LocalLLaMA community 1d ago

Qwen3.6-27B UD Q3 with kv at q8 is quite amazing for simple proof of concepts

Preface, technology is not my industry, but I am a very passionate poor man. So much so that I discovered 'AI' - ChatGPT in the beginning of 2025. So go easy on me, I only try. I kind of understand MOE vs. Dense models, MOEs are much forgiving when it comes to running as there…

22
r/LocalLLaMA community 1d ago

Tensor split performance on low-bandwidth (TB3) eGPUs, and a question

Hey everyone! I've got a pair of Morefine G1 4090M 16gb eGPUs connected at 40Gbps via TB3 (daisy-chained). I normally run them in layer split mode as it doesn't seem to need much bandwidth; I'm seeing around 1300t/s PP and 26t/s TG (35-40 with MTP), qwen3.6-27B @ Q4. Which is…

20
TechCrunch — AI news-outlet 1d ago

Ford rehires ‘gray beard’ engineers after AI falls short

"Mistakenly we thought that by just introducing artificial intelligence ... that would produce a high-quality product.”

19
Hacker News — AI on Front Page community 1d ago

GLM 5.2 beats Claude in our benchmarks

Article URL: https://semgrep.dev/blog/2026/we-have-mythos-at-home-glm-52-beats-claude-in-our-cyber-benchmarks/ Comments URL: https://news.ycombinator.com/item?id=48709670 Points: 273 # Comments: 109

22
OpenAI official-blog 1d ago

HP Inc. launches Frontier strategic partnership with OpenAI

HP Inc. scales its OpenAI Frontier partnership to deploy AI across customer experiences, software development, and enterprise operations.

18
Hacker News — AI on Front Page community 1d ago

I used Claude Code to get a second opinion on my MRI

Article URL: https://antoine.fi/mri-analysis-using-claude-code-opus Comments URL: https://news.ycombinator.com/item?id=48708941 Points: 206 # Comments: 309

38
r/LocalLLaMA community 1d ago

Script to monitor llama cpp and analyze memory usage

My goal has always been to be productive with commodity hardware. So far my workhorses have been the MoE editions of gemma 4 and Qwen 3.6 on an old desktop with a single 9060XT with 16GB ram. The problem has always been that every source is vague about Vram/ram requirements.…

33
Don't Worry About the Vase community 1d ago

GPT-5.6: The System Card

While we wait for a general release, the system card is the best hint as to what is going on with the new candidate for America’s Next Top Model, GPT-5.6.

5
Hacker News — AI on Front Page community 1d ago

EU to legislate about Chat Control behind closed doors

Article URL: https://www.patrick-breyer.de/en/double-threat-to-private-communications-undemocratic-chat-control-backroom-deals-and-imminent-concessions-spark-relaunch-of-fightchatcontrol-eu/ Comments URL: https://news.ycombinator.com/item?id=48707719 Points: 249 # Comments: 132

11
r/LocalLLaMA community 1d ago

DeepSpec - a deepseek-ai Collection

DeepSpec DeepSpec is a full-stack codebase for training and evaluating draft models for speculative decoding. It contains data preparation utilities, draft model implementations, training code, and evaluation scripts. Released Checkpoints The checkpoints below are the ones used…

26
r/LocalLLaMA community 1d ago

Qwen3.6 27B local vs Opus 4.8, voxel engine in raw C with zero frameworks

Sunday experiment. Same prompt to both. Build a voxel world in plain C. No engine, no game library, no framework, just the compiler. The model does its own chunk meshing, render loop and memory management by hand. Left is Claude Code on Opus 4.8. Right is Qwen3.6 27B local on…

37
r/LocalLLaMA community 1d ago

How many of you do use Q1 or Q2 of Big models(100-250B)? How's it?

Sharing popular(also recent) models for reference: 151-250B : DeepSeek-V4-Flash Step-3.X-Flash Command-a-plus-05-2026 Laguna-M.1 MiniMax-M2.X Qwen3-235B-A22B 100-150B : GLM-4.5-Air Qwen3.5-122B-A10B NVIDIA-Nemotron-3-Super-120B-A12B Mistral-Small-4-119B-2603…

34
llama.cpp releases dev-tools 1d ago

b9830

common : allow --offline in llama download ( #25091 ) Expose the existing --offline flag to llama download so a script can run it to check whether a model is already cached and ready to be served without touching the network. Also fix a latent use-after-free in the URL-task…

4
r/LocalLLaMA community 1d ago

A barebones CPU-only inference engine for Qwen 3, written from scratch in pure C

TL;DR: The (very messy) code and writeups can be found at https://github.com/jakint0sh/qwen3-engine Read the README for instructions on how to get started. And for those who just want a bulleted list: - Inference engine for Qwen 3 sizes 4B and below - Written from scratch in…

37
r/LocalLLaMA community 1d ago

Is Qwen3-VL-2B the only viable VLM for JSON extraction on a "potato"?

After spending countless hours testing on 3 "potato" laptops (Intel i3, 8GB RAM, Win11, integrated GPU), that's my conclusion. For reliably extracting data from images to JSON on low-end hardware, nothing else even comes close. Yet, it’s completely missing from major benchmarks…

23

Open but Incompatible: A License Compatibility Analysis of Corpora for Low-Resource African Languages

Fine-Tuning General-Purpose Large Language Models for Agricultural Applications:A Reproducible Framework and Evaluation Protocol Based on Qwen3-8B

Fast Numbers, Slow Language: Bridging Quantitative and Qualitative Earnings Signals

Are We Measuring Strategy or Phrasing? The Gap Between Surface- and Approach-Level Diversity in LLM Math Reasoning

Vibe coding platform Base44 launches own model as AI startups seek defensibility

Been running Qwen3.6-27B through a 3-critic harness. The harness matters more than I thought

Introducing LongCat-2.0 - , a large-scale MoE language model with 1.6 trillion total parameters and ~48 billion activated per token. This was the stealth model that was on Openrouter under the name 'owl-alpha'.

langchain-openrouter==0.2.5

It’s time, Sam, it’s time.

v0.31.0

DeepSeek V4, PR merged into llama.cpp !

Qwen3-tts.cpp + Compose Desktop GUI

Anthropic and Gov. Newsom forge deal allowing California government to use Claude at half price

Arena, the AI leaderboard everyone uses, is now a $100M business

Qwen 3.6 27B is the sweet spot for local development

Cursor now has a mobile app for guiding your coding agent on the go

I'm trying to implement CALM paper, and I have some questions. [P]

Ornith-1.0: Self-Scaffolding LLMs for Agentic Coding

Adaptive Mixture of Experts Gate (AMG) [R]

Going from single GPU to dual GPU is nice but not in the way I expected

Instead of decentralized training effort we should build the “One dataset”

Rocketlab acquires Iridium

Deepseek V4 Official Launch to be released mid-July with API price changes

b9840

DeepSeek V4 official version will be launch on mid-July

DeepSeek V4 by am17an · Pull Request #24162 · ggml-org/llama.cpp

GLM 5.2 Q1_S vs Qwen 27B Q8

MiCA is now part of Hugging Face PEFT

Build realtime voice agents on AI Gateway

Unified Zero-Shot Time Series Forecasting: A Darts Foundation

FoggyTrust: Robust Federated Learning with Hierarchical Trust Networks

Recovering Sharp Conductivity Features in the Finite-Data Calder\'on Problem with Physics-Informed Neural Networks

Qwen-Image-2.0-RL Technical Report

I built an agent Harness for Small Models. I got Qwen 3.5 4b managing servers.

xAI Grok audio models now available on Vercel AI Gateway

Qwen3.6-27B UD Q3 with kv at q8 is quite amazing for simple proof of concepts

Tensor split performance on low-bandwidth (TB3) eGPUs, and a question

Ford rehires ‘gray beard’ engineers after AI falls short

GLM 5.2 beats Claude in our benchmarks

HP Inc. launches Frontier strategic partnership with OpenAI

I used Claude Code to get a second opinion on my MRI

Script to monitor llama cpp and analyze memory usage

GPT-5.6: The System Card

EU to legislate about Chat Control behind closed doors

DeepSpec - a deepseek-ai Collection

Qwen3.6 27B local vs Opus 4.8, voxel engine in raw C with zero frameworks

How many of you do use Q1 or Q2 of Big models(100-250B)? How's it?

b9830

A barebones CPU-only inference engine for Qwen 3, written from scratch in pure C

Is Qwen3-VL-2B the only viable VLM for JSON extraction on a "potato"?