r/LocalLLaMA
-
r/LocalLLaMA community 6h ago
Introducing LongCat-2.0, a large-scale MoE language model with 1.6 trillion total parameters and ~48 billion activated per token. This was the stealth model that was on OpenRouter under the name 'owl-alpha'.
  submitted by   /u/AnticitizenPrime [link]   [comments]
18 -
r/LocalLLaMA community 9h ago
on Dario’s statement
  submitted by   /u/turtle-toaster [link]   [comments]
32 -
r/LocalLLaMA community 10h ago
It’s time, Sam, it’s time.
I mean….. I’m no CEO…. but it seems like this would be the absolute perfect time to drop a super powerful GPT-OSS-2 to throw a big ol’ wet blanket on Anthropic’s IPO. It doesn’t need to be like frontier or anything, just a 20b and a 120b that is as fast as the old versions, add…
31 -
r/LocalLLaMA community 12h ago
Amodei: "Open Source Models Will Eat Your Children"
  submitted by   /u/johnnyApplePRNG [link]   [comments]
35 -
r/LocalLLaMA community 13h ago
Samsung, SK hynix, Micron Sued in US Over Memory Price Fixing
  submitted by   /u/johnnyApplePRNG [link]   [comments]
15 -
r/LocalLLaMA community 14h ago
Effect of GLM 5.2 !!
All hail Z.ai   submitted by   /u/Independent-Wind4462 [link]   [comments]
13 -
r/LocalLLaMA community 15h ago
Mellum2 local deployments
Hey local community, I work at JetBrains with the team that trained the Mellum2 models, 12B-2.5A LLMs. These models were trained completely from scratch, targeting fast inference: our primary goal was H100/H200 production deployments, but local deployments are good as well. We…
37 -
r/LocalLLaMA community 16h ago
Kimi and GLM on frontier code
  submitted by   /u/Charuru [link]   [comments]
36 -
r/LocalLLaMA community 21h ago
GLM 5.2 Q1_S vs Qwen 27B Q8
TL;DR: GLM-5.2 Q1_S beats Qwen 3.6 27B Q8, with preserve thinking on. (Edit: GLM was run with the K and V caches at Q8; Qwen was run with the KV cache at full FP16.) Disclaimer: this is a hobby/amateur comparison with n=1, so go easy on it. I just thought it would be fun to share. The…
11 -
r/LocalLLaMA community 21h ago
LibreChat or OpenWebUI?
Hello, I have a friend who, while technical, doesn't know too much about AI. I've helped him with the infrastructure setup, and that works like a charm, but he's interested in something I don't have much experience with, and that is a flashy chat "do everything"…
14 -
r/LocalLLaMA community 21h ago
MiCA is now part of Hugging Face PEFT
Glad to share that MiCA, short for Minor Component Adaptation, has now been merged into the Hugging Face PEFT library. It is not yet included in the latest PyPI release, but you can already install it directly from PEFT main: pip install --upgrade…
18 -
r/LocalLLaMA community 22h ago
AMD MI210 64GB vs DCU K100 64GB
On the Chinese eBay there are many DCU K100 64 GB GPUs available at a very attractive price, between 6,000 and 19,000 RMB (air- or water-cooled versions, new or second-hand), and 15,000 to 20,000 for the AMD MI210 (4,000-6,000 RMB for the PCIe bridge). There is very little…
25 -
r/LocalLLaMA community 1d ago
China Has Matched Anthropic in Cybersecurity, Resetting AI Race
  submitted by   /u/pscoutou [link]   [comments]
30 -
r/LocalLLaMA community 1d ago
DFlash support merged into llama.cpp
  submitted by   /u/sammcj [link]   [comments]
36