Looking to migrate off of Ollama and LMStudio
Mirrored from r/LocalLLaMA for archival readability.
Hello,
I'm currently using Ollama / LM Studio for things like code inference, proofreading emails, etc. I'm definitely not experienced in this space, but I'm looking to grow.
It's been working great, but it's a bit slow at times. I use Gemma 4 / Qwen, and I also recently tried OpenBioLLM 70B for some health questions (just for testing), in addition to hooking up VS Code / JetBrains to it. I also use Open WebUI so my wife and I have our own chats going.
I was thinking of trying either vLLM or llama.cpp to see if I'd get some speed improvements.
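For anyone in a similar spot: both llama.cpp's bundled server and vLLM expose an OpenAI-compatible HTTP API, so Open WebUI and editor plugins can point at either one. A rough sketch of how each might be launched; the model names, file path, and flag values below are placeholders/assumptions, so check each project's docs for your hardware:

```shell
# llama.cpp: llama-server ships with the project.
# -ngl offloads layers to the GPU; the GGUF path is a placeholder.
llama-server -m ./qwen2.5-7b-instruct-q4_k_m.gguf -ngl 99 --port 8080

# vLLM: serves a Hugging Face model with an OpenAI-compatible API
# (default port 8000); the model ID and context length are examples.
vllm serve Qwen/Qwen2.5-7B-Instruct --max-model-len 8192
```

Open WebUI could then be pointed at `http://localhost:8080/v1` or `http://localhost:8000/v1` as an OpenAI-style endpoint.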
Specs: 64 GB RAM + Blackwell 5000, Ubuntu 26.04.
I asked ChatGPT which one I should use, and it told me to just stick with Ollama :/
Thanks for your time.