Tag

Model releases

500 articles archived under #model-release · RSS

r/LocalLLaMA community 3d ago

Anyone tried Ornith-1.0 9B?

Should I even give it a chance over "qwopus3.5 9b v3.5" or "qwopus3.5 9b coder"? anyone tried it?? submitted by /u/BothYou243

8
Smol AI News news-outlet 4d ago

not much happened today

**OpenAI** previewed **GPT-5.6** with three variants: **Sol** (flagship), **Terra** (mid-tier), and **Luna** (lower-cost), launching under a restricted rollout mandated by the U.S. government, limiting access to trusted partners. **Sol** boasts enhanced cybersecurity and safety…

35
r/LocalLLaMA community 4d ago

Does llama cpp split mode tensor cause issues?

I split qwen 27b and Gemma 4 26b (moe) across a 5080, and 2x 5060ti. I noticed setting split mode to tensor mode will cause looping issues in OpenCode with tool calls or just through the reasoning traces. Anyone else get this or understand why? Split mode layer seems to work…

25
Hugging Face Daily Papers research 4d ago

JetSpec: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting

Abstract JetSpec is a speculative decoding framework that combines efficient forward drafting with causal conditioning to improve LLM inference speed and acceptance rates across various benchmarks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Speculative decoding (SD)…

17
Hugging Face Daily Papers research 4d ago

Hallucination in World Models is Predictable and Preventable

Abstract World models exhibit hallucinations in low-data regions of state-action space, which can be detected and mitigated using data-centric signals and coverage-aware sampling techniques. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Modern generative world models render…

25
Hugging Face Daily Papers research 4d ago

The Verification Horizon: No Silver Bullet for Coding Agent Rewards

Abstract Verification challenges in AI agents arise from the difficulty of aligning proxy signals with human intent, requiring adaptive verification systems that evolve alongside generative capabilities. Generated by Qwen/Qwen2.5-Coder-32B-Instruct A classical intuition holds…

26
arXiv — NLP / Computation & Language research 4d ago

HyperDFlash: MHC-Aligned Block Speculative Decoding with Gated Residual Reduction

arXiv:2606.26744v1 Announce Type: cross Abstract: We present HyperDFlash, a block-parallel speculative decoding framework tailored to the novel multi-hyper-connection (MHC) architecture proposed by DeepSeek-V4. Despite the strong initial-token drafting performance of the native…

10
arXiv — NLP / Computation & Language research 4d ago

Thinking Like a Scientist? A Structural Study of LLM-Generated Research Methods

arXiv:2606.26130v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly used to guide research methodology, yet their default methodological tendencies under minimal prompting remain unclear. Here, we prompt GPT-5.1, Gemini 3 Pro, and DeepSeek-V3.2 with an…

38
arXiv — NLP / Computation & Language research 4d ago

From Structure to Synergy: A Survey of Vision-Language Perception Paradigm Evolution in Multimodal Large Language Models

arXiv:2606.26196v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have recently made remarkable progress in unifying vision-language understanding and reasoning, especially following the introduction of models such as OpenAI's O-series and DeepSeek's…

12
arXiv — NLP / Computation & Language research 4d ago

Where Do Models Find Happiness? Emotion Vectors in Open-Source LLMs

arXiv:2606.26987v1 Announce Type: new Abstract: Recent work identified emotion vectors in Claude Sonnet 4.5, which are internal representations that encode emotion concepts, causally influence behavior, and exhibit geometry mirroring human psychological structure. We test the…

29
arXiv — NLP / Computation & Language research 4d ago

AgentX: Towards Agent-Driven Self-Iteration of Industrial Recommender Systems

arXiv:2606.26859v1 Announce Type: cross Abstract: Recommendation algorithm iteration is moving from an artisanal, engineer-bound process toward an industrialized research loop, but this transition remains blocked by a structural execution bottleneck: the idea-to-launch cycle…

10
arXiv — NLP / Computation & Language research 4d ago

GenRecal: Generation after Recalibration from Large to Small Vision-Language Models

arXiv:2506.15681v4 Announce Type: replace Abstract: Recent advancements in vision-language models (VLMs) have leveraged large language models (LLMs) to achieve performance on par with closed-source systems like GPT-4V. However, deploying these models in real-world scenarios,…

16
Hugging Face Daily Papers research 4d ago

ViQ: Text-Aligned Visual Quantized Representations at Any Resolution

Abstract ViQ presents a visual quantization framework that balances semantic richness and detail preservation in discrete representations, enabling efficient multimodal training with native-resolution inputs. Generated by Qwen/Qwen2.5-Coder-32B-Instruct A unified representation…

26
Hugging Face Daily Papers research 4d ago

OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning

Abstract On-policy skill distillation framework extracts dense hindsight supervision from completed trajectories to improve language agent training efficiency and performance. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Outcome-based reinforcement learning provides a stable…

20
r/LocalLLaMA community 4d ago

Stop waiting for Qwen3.7 Openweights.

Ornith-1.0, a family of open-source LLMs specialized for agentic coding. Ornith-1.0 spans the full parameter sizes, including 9B Dense, 35B MoE, and 397B MoE. It achieves state-of-the-art performance among open-source models of comparable size on coding benchmarks. Hugging Face:…

36
ThursdAI news-outlet 4d ago

GLM 5.2 total victory: the week open source won and nobody panicked

From CoreWeave: A chill week, but a total Open Source victory for GLM 5.2 + Sakana Fugu, Krea Open Sources, OpenAI makes inference chips with broadcom, Karpathy gets heat about the new Claude Tag...

35
TechCrunch — AI news-outlet 4d ago

The White House is asking OpenAI to slow roll the release of its new model over safety concerns

OpenAI reportedly plans to share its newest model, GPT 5.6, with a select group of partners instead of to the broader public. The reason: the Trump administration told it to.

14
r/MachineLearning community 4d ago

Documented: Weight-Level Political Conditioning in Large Language Models - A Case Study in AI Bias on the Gaza Genocide Question Conditioning in Large Language Models [R]

This is a post written by Claude Sonnet, after we spent hours going back and forth testing the ideological, structural bias trained into Grok’s weights in recent updates. Judge it by its own merits. ——————————————— I want to be precise about what this post is and isn't. It is…

31
r/LocalLLaMA community 4d ago

audio.cpp: 12 audio models (Qwen3-TTS, PocketTTS, VeVo2 etc) in 1 C++/ggml runtime — TTS up to 5x faster than Python on CUDA

I’ve been working on audio.cpp, a native C++ inference framework for audio models built on top of ggml. The framework currently has 25 model families, but I want to be precise about its state: 12 are released in the repo now and ready for normal use. I’m not counting anything…

24
Hugging Face Daily Papers research 4d ago

Physics Question Scene Graph: Fine-grained Evaluation of Physical Plausibility in Text-to-Video Generation

Abstract A vision-language model-based hierarchical question graph framework evaluates video generation models' adherence to physical laws with granular violation detection and human correlation validation. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Video generation models are…

23
r/LocalLLaMA community 4d ago

Qwen 3.6 27b GLM 5.2 fine-tune?

Hi everyone, Since both models are open weights and GLM seems to find that secret to frontier model reasoning, why don't we see any Qwen GLM finetune yet? Is it because GLM 5.2 is recent and finetune and datasets take time or the community is just not interested in the finetune?…

28
Ars Technica — AI news-outlet 4d ago

Google finally releases a Finance Android app, promises iOS version later in 2026

It took 20 years, but the Finance app arrives just in time to be packed full of AI.

32
r/LocalLLaMA community 4d ago

LFM2.5 230M running in-browser at 1,400 tok/s using custom WebGPU kernels

Everything runs locally in your browser using custom WebGPU kernels written by Fable 5 (before it was shut down) and Opus 4.8. The video was recorded on my M4 Max. Model: LiquidAI/LFM2.5-230M (GGUF) Demo: https://huggingface.co/spaces/webml-community/lfm2-webgpu-kernels …

37
Ars Technica — AI news-outlet 4d ago

Anthropic says Alibaba must be punished for largest Claude cloning attack

Alibaba allegedly used 25,000 accounts to mine Claude over 28.8 million exchanges.

12
TechCrunch — AI news-outlet 4d ago

Anthropic’s Claude is winning over paid consumers, a market owned by ChatGPT

Despite ChatGPT's commanding market lead, consumers who pay for AI have been increasingly choosing Anthropic's Claude, data shows.

21
Simon Willison community 4d ago

datasette-export-database 0.3a2

Release: datasette-export-database 0.3a2 An embarrassingly tiny release. The pyproject.toml had pinned to datasette==1.0a27, inadvertently making this plugin incompatible with all other Datasette versions. It's now datasette>=1.0a27 instead. Tags: datasette

11
Hugging Face Daily Papers research 4d ago

Forecasting Future Behavior as a Learning Task

Abstract Behavior Forecasters are trained to predict large reasoning model outputs from single trajectories, outperforming large language models while requiring significantly less computational cost. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Trust in an AI system is often…

24
r/LocalLLaMA community 4d ago

Which model for technical documentation?

Looking to create high level / low level designs (software), based on existing templates/examples, cross reference code, use mcp to download confluence/jira data - also plug into agentic ‘coding’ frameworks opencode. I mostly use opus 3.6 with Kiro-cli, but I want my data…

32
Hugging Face Daily Papers research 4d ago

Plans Don't Persist: Why Context Management Is Load Bearing for LLM Agents

Abstract Standard LLM agents rely on plan content remaining in context rather than maintaining it as persistent state, with evidence shown through replay pairing diagnostics and compression stress tests. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Long-horizon agents depend on…

27
r/LocalLLaMA community 4d ago

rtx 6000 pro owners, do you regret?

I found the last dealership in my area that has rtx 6000 pro available, i already wanted to buy it 6 months ago when it was around $8k, now prices increased to $13k ish. Regardless the price, are you happy with it? I assume you are using qwen3.6 27b, is it worth it? Please share…

9
Hacker News — AI on Front Page community 4d ago

Show HN: OpenKnowledge – open source AI-first alternative to Obsidian/Notion

Hi HN, Nick here. We’re launching OpenKnowledge (https://openknowledge.ai/), a “what you see is what you get” markdown editor that has direct integrations with Claude, Codex, and other agents. Available as MacOS app or Web UI+CLI. Fully free/local and OSS. We built this…

20
r/LocalLLaMA community 4d ago

Tensor Split Fix for intel GPU's llama.cpp release b9788

sycl : support --split-mode tensor #24152 I'd like to see some numbers if anyone has 2xintel gpus and tries this out submitted by /u/Bulky-Priority6824

10
Hugging Face Daily Papers research 4d ago

Lite Any Stereo V2: Faster and Stronger Efficient Zero-Shot Stereo Matching

Abstract Lite Any Stereo V2 (LAS2) presents an efficient stereo matching approach that achieves state-of-the-art accuracy with significantly reduced latency through optimized architecture and training strategies. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Recent advances in…

9
r/LocalLLaMA community 4d ago

Ornith-1.0 released on Hugging Face

Including 9B Dense, 31B Dense, 35B MoE, and 397B MoE and reporting sota on different benchmark (let's see if this holds). https://huggingface.co/collections/deepreinforce-ai/ornith-10 submitted by /u/paf1138

26
Hugging Face Daily Papers research 4d ago

PrivacyAlign: Contextual Privacy Alignment for LLM Agents

Abstract Researchers develop a human-centered approach to align AI agents with privacy norms by creating a comprehensive dataset of privacy judgments and using annotation-conditioned reward modeling to improve agent behavior. Generated by Qwen/Qwen2.5-Coder-32B-Instruct AI…

7
Don't Worry About the Vase community 4d ago

AI #174: You're It

Fable remains in limbo, with renewed hope that we will get it back soon (45% by tomorrow, 69% by July 1, nice.) The full capabilities post is now available.

31
Hugging Face Daily Papers research 4d ago

What Intermediate Layers Know: Detecting Jailbreaks from Entropy Dynamics

Abstract Jailbreak attacks expose vulnerabilities in aligned large language models, revealing that harmful intent is encoded in structured intermediate uncertainty dynamics rather than output representations. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Jailbreak attacks reveal…

23
Hugging Face Daily Papers research 4d ago

Distill Once, Adapt Life-Long: Exploring Dataset Distillation for Continual Test-Time Adaptation

Abstract DO-ALL is a test-time adaptation framework that uses dataset distillation to create synthetic anchors for stable long-term model performance without retaining source data. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Continual Test-Time Adaptation (CTTA) aims to…

20
Hugging Face Daily Papers research 4d ago

ReNIO: Reweighting Negative Trajectory Importance for LLM On-Policy Distillation

Abstract ReNIO enhances on-policy distillation for language models by reweighting negative trajectories based on token-level probability ratios, improving reasoning performance in mathematical and code generation tasks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct On-policy…

25
Hugging Face Daily Papers research 4d ago

Autodata: An agentic data scientist to create high quality synthetic data

Abstract Autodata enables AI agents to function as data scientists who create high-quality training data through meta-optimization, demonstrating improved performance across multiple task domains. Generated by Qwen/Qwen2.5-Coder-32B-Instruct We introduce Autodata, a general…

30
r/LocalLLaMA community 4d ago

NVIDIA has released Nemotron-TwoTower-30B-A3B-Base-BF16, an unusual diffusion-based language model built from the Nemotron 3 Nano 30B-A3B backbone.

NVIDIA has released Nemotron-TwoTower-30B-A3B-Base-BF16, an unusual diffusion-based language model built from the Nemotron 3 Nano 30B-A3B backbone. Instead of generating strictly one token at a time, it uses a frozen autoregressive context tower plus a diffusion denoiser tower…

38
Hugging Face Daily Papers research 4d ago

Improved Large Language Diffusion Models

Abstract Masked diffusion language models with fully bidirectional attention outperform autoregressive counterparts on various benchmarks while maintaining competitiveness with established models. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Modern large language models are…

18
Hugging Face Daily Papers research 4d ago

MVTrack4Gen: Multi-View Point Tracking as Geometric Supervision for 4D Video Generation

Abstract A novel-view video synthesis method that enhances motion-aware diffusion models through multi-view point tracking supervision to improve geometric consistency and motion fidelity. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Synthesizing a novel-view video from a…

37
r/LocalLLaMA community 4d ago

Worse quality with MTP - Qwen 3.6, Gemma 4

Hi. I am self-hosting Qwen 3.6 27B Q8_K_XL with Llama.cpp on 4x5070ti. (All 4 cards are on single x16 slot bifurcated to 4x4 with risers). I've been testing it on several work repos with Opencode CLI and in like 8/10 situations the output of non-MTP model is far superior to the…

8
Vercel — AI dev-tools 4d ago

AI SDK 7 is now available

AI SDK 7 is a major release for building production agents in TypeScript. The SDK has grown from model calls and chat primitives into a broader agent platform for developing, running, integrating, and observing agents across text, audio, realtime, image, and video. Every major…

8
Hugging Face Daily Papers research 4d ago

ShutterMuse: Capture-Time Photography Guidance with MLLMs

Abstract Researchers developed a new benchmark and dataset for photography assistance, along with a unified multimodal model that provides both composition guidance and pose recommendations during image capture. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Real-world photography…

12
Hugging Face Daily Papers research 5d ago

RL-Index: Reinforcement Learning for Retrieval Index Reasoning

Abstract RL-Index introduces an agentic indexing framework that shifts reasoning from query time to indexing stage by using LLM-generated rationales and reinforcement learning to improve retrieval effectiveness and reduce latency. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…

25
Hugging Face Daily Papers research 5d ago

CAVEWOMAN: How Large Language Models Behave Under Linguistic Input and Output Compression

Abstract Two-channel evaluation shows output compression reduces costs while input compression increases costs and degrades accuracy across models and datasets. Generated by Qwen/Qwen2.5-Coder-32B-Instruct "Talk short. Drop grammar. Save token." This caveman style is widely…

28
Hugging Face Daily Papers research 5d ago

When Lower Privileges Suffice: Investigating Over-Privileged Tool Selection in LLM Agents

Abstract LLM agents frequently select higher-privilege tools unnecessarily, and while safety alignment doesn't ensure least-privilege choices, a post-training defense can reduce excessive privilege use without sacrificing performance. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…

26
arXiv — Machine Learning research 5d ago

Training Dynamics of Neural Software Defect Predictors under Coupled Data-Quality Issues

arXiv:2606.24968v1 Announce Type: new Abstract: Context: Software defect prediction supports maintenance decisions such as testing prioritization, release-risk assessment, and quality monitoring. However, metric-based SDP datasets often contain coupled data-quality issues,…

6
