Tag: Version Bump · 242 articles archived under #version-bump

r/LocalLLaMA · community · 29d ago · Llama Studio v0.2.0 · I have made an update to my llama-server WebUI based on some awesome feedback and interaction with the community. 1) JSON model config replaced by per-model shell scripts. Run from CLI, paste from unsloth, email to your buddy or post to reddit: Using real shell scripts to store… · 17

Hacker News — AI on Front Page · community · 1mo ago · The AV2 Video Standard Has Released (Final v1.0 Specification) · Article URL: https://av2.aomedia.org | Comments URL: https://news.ycombinator.com/item?id=48340910 | Points: 203 | # Comments: 80 · 34

r/LocalLLaMA · community · 1mo ago · this new Moss tts 1.5 is damn good with voice cloning · https://huggingface.co/spaces/OpenMOSS-Team/MOSS-TTS-v1.5 I prefer this over fish audio s2 pro because fish audio doesn't allow commercial use. Long Cat DiT 3.5 is also another good model. Submitted by /u/9r4n4y · 38

vLLM releases · dev-tools · 1mo ago · v0.22.1rc0: [CI] Make Model Executor test hangs fail fast with a traceback (#43971) · Signed-off-by: khluu | Co-authored-by: Claude · 10

llama.cpp releases · dev-tools · 1mo ago · b9411 · model : support for DeepseekV32ForCausalLM with generic DeepSeek Sparse Attention (DSA) implementation (#23346); llama : support DeepSeek V3.2 model family (with DSA lightning indexer); convert : handle DeepseekV32ForCausalLM architecture; ggml : support for f16 GGML_OP_FILL… · 34

Ollama releases · dev-tools · 1mo ago · v0.30.0-rc31 · ci fix - non-shallow MLX checkout · 29

Ollama releases · dev-tools · 1mo ago · v0.30.0-rc30 · version bump · 18

Anthropic SDK (Python) releases · dev-tools · 1mo ago · v0.105.2 · 0.105.2 (2026-05-29) | Full Changelog: v0.105.1...v0.105.2 · 14

Anthropic SDK (Python) releases · dev-tools · 1mo ago · v0.105.1 · 0.105.1 (2026-05-29) | Full Changelog: v0.105.0...v0.105.1 | Chores: internal: use Trusted Publishing for PyPI releases (1d04fc5) · 34

Ollama releases ·
dev-tools · 1mo ago · v0.30.0-rc29 · review comments · 24

Anthropic SDK (Python) releases · dev-tools · 1mo ago · v0.105.0 · 0.105.0 (2026-05-28) | Full Changelog: v0.104.1...v0.105.0 | Features: api: Add support for claude-opus-4-8, mid-conversation system blocks, and usage.output_tokens_details (f18b014); support custom file size caps (#1825) (7e5f944) | Chores: examples: rename managed-agents… · 12

r/LocalLLaMA · community · 1mo ago · Krasis update: Qwen3.6-35B-A3B (Q4) at reading speed, 1x 8GB 3070 Mobile laptop (32GB RAM) · Context: Krasis is an LLM runtime for running models that don't fit into VRAM. Krasis streams the model through VRAM from system RAM efficiently and handles prefill and decode as separate architectures and optimised usecases. Latest results (v1.0 release): 1x Laptop RTX 3070… · 22

vLLM releases · dev-tools · 1mo ago · v0.22.0rc3: [BugFix] Fix hard-coded timeout for multi-API-server startup (#43768) · Signed-off-by: Vadim Gimpelson | Co-authored-by: Nick Hill · 20

vLLM releases · dev-tools · 1mo ago · v0.22.0: [BugFix] Fix hard-coded timeout for multi-API-server startup (#43768) · Signed-off-by: Vadim Gimpelson | Co-authored-by: Nick Hill · 29

vLLM releases · dev-tools · 1mo ago · v0.22.0rc2: Fix early CUDA init (#43791) · Signed-off-by: Harry Mellor (cherry picked from commit 41688e2) · 11

Ollama releases · dev-tools · 1mo ago · v0.30.0-rc28 · add OLLAMA_IGPU_ENABLE and largely disable iGPUs by default · 14

ComfyUI releases · dev-tools · 1mo ago · v0.22.3 · ComfyUI v0.22.3 · 36

r/MachineLearning · community · 1mo ago · Best Text to Text Translation Model? [D] · I'm working on a project that translates any language into English. So far, I've tried NMT models like NLLB, MADLAD, and SeamlessM4T v2.
The main issue is that they struggle with proper nouns such as names, places, dates, and organizations. I also tried LLMs like Gemma 4, Qwen… · 22

r/LocalLLaMA · community · 1mo ago · Info: Nvidia Cuda 13.3 landed · Cuda 13.3 Downloads | Release Notes | Anybody already tried llama.cpp with 13.3? Submitted by /u/parrot42 · 18

vLLM releases · dev-tools · 1mo ago · v0.22.0rc1: [MRV2][BugFix] Fix KV connector handling in spec decode case (#43719) · Signed-off-by: Nick Hill | Co-authored-by: Wentao Ye (cherry picked from commit 8c94938) · 18

Ollama releases · dev-tools · 1mo ago · v0.30.0-rc27 · ci: windows path workaround for CPU build · 20

Ollama releases · dev-tools · 1mo ago · v0.30.0-rc26: Merge remote-tracking branch 'upstream/main' into llama-runner-phase-0 · Conflicts: server/images.go, server/images_test.go · 33

r/LocalLLaMA · community · 1mo ago · OpenMOSS-Team/MOSS-TTS-v1.5 · Hugging Face · MOSS-TTS-v1.5 is continued from MOSS-TTS 1.0. It preserves the main 1.0 capabilities, including zero-shot voice cloning, long-form speech generation, token-level duration control, Pinyin/IPA pronunciation control, multilingual synthesis, and code-switching. For… · 10

r/LocalLLaMA · community · 1mo ago · Harbor v0.4.19 - vllm/sglang/llama.cpp launch codex/claude/pi/opencode · I'm usually not posting about Harbor releases out of respect for the community here, but I think v0.4.19 might save a lot of people some time. Harbor can now launch your local agentic coding tools with local inference backends. For example, to run pi + vllm: # model… · 26

Ollama releases · dev-tools · 1mo ago · v0.30.0-rc25 · ci: fix WoA cross-compile · 13

r/LocalLLaMA · community · 1mo ago · MiMo-V2.5-coder · Hi, I've just released MiMo-V2.5-coder. If you have 128 GB, this is an excellent alternative to Qwen3.6 and DS4, especially for coding. Fast, and with reliable tool calling. Give it a try!
Submitted by /u/jedisct1 · 7

Ollama releases · dev-tools · 1mo ago · v0.30.0-rc24 · version bump · 20

r/MachineLearning · community · 1mo ago · LQS v3.1 — an open methodology for rating AI training data (multi-oracle consensus + signed certificates) [P] · Solo author here. I spent the last six months building (and then sunsetting) a marketplace for AI training data. The marketplace failed for an interesting reason: the actual bottleneck isn't supply. There's tons of data. The bottleneck is that buyers can't independently evaluate… · 14

r/LocalLLaMA · community · 1mo ago · BeeLlama v0.2.0 – major DFlash update. Single RTX 3090: Qwen 3.6 27B up to 164 tps (4.40x), Gemma 4 31B up to 177.8 tps (4.93x). Prompt processing speed near baseline. · BeeLlama v0.2.0 is here! Not quite a pegasus, but close enough. GitHub | Qwen 3.6 27B Quick Start | Gemma 4 31B Quick Start. Full Gemma 4 31B support with efficient DFlash implementation and vision. Major Qwen 3.6 27B performance update from lower DFlash overhead, cleaner prefill… · 28

ComfyUI releases · dev-tools · 1mo ago · v0.22.2 · ComfyUI v0.22.2 · 6

r/LocalLLaMA · community · 1mo ago · trained a prompt injection detector using ml-intern and DeepSeek v4 Flash, runs in the browser · Trained a prompt injection classifier using ml-intern + DeepSeek v4 Flash. DistilBERT, F1 99%, ONNX int8, ~65 MB, runs in browser with Transformers.js v3.
You can try it here: https://huggingface.co/spaces/av-codes/prompt-injection-detector I've been interested in prompt… · 5

Ollama releases · dev-tools · 1mo ago · v0.30.0-rc23 · lint fix · 8

Anthropic SDK (Python) releases · dev-tools · 1mo ago · v0.104.1 · 0.104.1 (2026-05-21) | Full Changelog: v0.104.0...v0.104.1 | Bug Fixes: streaming: carry encrypted_content through beta compaction accumulator (#1821) (f7a720c) · 29

Hacker News — AI on Front Page · community · 1mo ago · Deno 2.8 · Article URL: https://deno.com/blog/v2.8 | Comments URL: https://news.ycombinator.com/item?id=48234380 | Points: 215 | # Comments: 98 · 27

ComfyUI releases · dev-tools · 1mo ago · v0.22.1 · ComfyUI v0.22.1 · 18

OpenAI Python SDK releases · dev-tools · 1mo ago · v2.38.0 · 2.38.0 (2026-05-21) | Full Changelog: v2.37.0...v2.38.0 | Features: api: api update (33d1d01); api: manual updates (a21700a); api: update OpenAPI spec or Stainless config (00265c5) | Chores: api: docs updates (ee10152); check release PR custom code sync (2638779); remove release… · 26

Anthropic SDK (Python) releases · dev-tools · 1mo ago · v0.104.0 · 0.104.0 (2026-05-21) | Full Changelog: v0.103.1...v0.104.0 | Features: api: Add support for thinking-token-count beta for estimated tokens in thinking block deltas when streaming (80d0fdf) · 7

Ollama releases · dev-tools · 1mo ago · v0.30.0-rc22 · version bump · 5

r/LocalLLaMA · community · 1mo ago · LlamaStation v0.9 — llama.cpp GUI for Windows with multi-backend support, TurboQuant, MTP and more · I've been building this for the past few months as a side project, started because I didn't want to run llama.cpp from the command line every time I wanted to try a model. I just wanted something that worked with a click. Fair warning: I'm not a developer.
This is 100% vibe… · 33

ComfyUI releases · dev-tools · 1mo ago · v0.22.0 · ComfyUI v0.22.0 · 30

llama.cpp releases · dev-tools · 1mo ago · b9246: snapdragon: update toolchain to v0.6 (#23369) · snapdragon: update compiler flags to enable all CPU features; snapdragon: update readme to point to toolchain v0.6; snapdragon: bump toolchain docker to v0.6 · 37

r/LocalLLaMA · community · 1mo ago · Google AI Edge Gallery v1.0.13 & v1.0.14 updates: Gemma 4 Multi-Token Prediction, Pixel TPU support, experimental MCP, new skills, now saves chat history · Submitted by /u/AnticitizenPrime · 26

Ollama releases · dev-tools · 1mo ago · v0.30.0-rc21 · improve windows exit error logs · 32

r/LocalLLaMA · community · 1mo ago · Why is LM-Studio download page showing me 0.4.7 to download when the latest version is 0.4.13? · I'm currently running LM-Studio 0.4.12. In the app, if I check for updates it says there's a new version (0.4.13), and I can read the changelog for 0.4.13, but when I go to https://lmstudio.ai/download it shows 0.4.7. What's going on here? Anyone know? Submitted by … · 37

Hugging Face · official-blog · 1mo ago · OlmoEarth v1.1: A more efficient family of models · Team Article | Published May 19, 2026 | Kyle Wiggers Ai2Comms allenai | 🧠 Models: https://huggingface.co/collections/allenai/olmoearth | 📄 Tech Report: https://allenai.org/papers/olmoearth_v1_1 | 💻 Code:… · 38

r/LocalLLaMA · community · 1mo ago · Here are my KV cache quantization benchmarks: TurboQuant is overrated but saved by TCQ, q5 deserves more attention, and symmetric q8 might be a waste of VRAM · Greetings from former TurboQuant's biggest defender, now middle-sized niche-aware TurboQuant defender.
Today I'm presenting to you the results of me thoroughly exploring the world of PPL and KLD benchmarks with my single RTX 3090 using BeeLlama v0.1.2, with some backstory of… · 31

Anthropic SDK (Python) releases · dev-tools · 1mo ago · v0.103.1 · 0.103.1 (2026-05-19) | Full Changelog: v0.103.0...v0.103.1 | Bug Fixes: runner: skip tool calls SessionToolRunner does not own (#1817) (9425c6a) · 8

Anthropic SDK (Python) releases · dev-tools · 1mo ago · v0.103.0 · 0.103.0 (2026-05-19) | Full Changelog: v0.102.0...v0.103.0 | Features: client: Add support for self-hosted sandboxes in CMA with sandbox helpers (e5625b0) · 22

Ollama releases · dev-tools · 1mo ago · v0.30.0-rc20 · ci: fix cache miss on rocm build · 6

Ollama releases · dev-tools · 1mo ago · v0.30.0-rc19 · missing file · 27

Page 3 of 5 · 242 articles