New Release of ROCm-based MLX LLM Engine - lemon-mlx-engine
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Hey everyone, we just finished integrating TheRock / ROCm 7.13 into lemon-mlx-engine, which means you can try the latest ROCm on your local hardware with the MLX engine! This release also includes various bug fixes and kernel fixes for issues we have been seeing with Qwen3, 3.5, and 3.6 MoE and dense models. Try it out! https://github.com/lemonade-sdk/lemon-mlx-engine/releases/tag/b1034-stable