r/LocalLLaMA · June 10, 2026 · 1 min read

I'm brand new to running LLMs and the sheer number of tools is overwhelming


Hey everyone. I'm brand new to running LLMs in general, even newer to running them locally, and the sheer number of tools available is absolutely overwhelming.

Regarding applications, I look at GitHub and see so many different options that I don't know what to pick. I can't really decipher the differences between the tools either, mostly because their descriptions/taglines are filled with so many AI buzzwords. What's the go-to GUI for Windows? The built-in Ollama GUI seems pretty barebones.

Regarding model differences, like qwen vs gemma, is there a resource that shows a comprehensive benchmark?

I currently have ollama installed on Windows, downloaded gemma4 and qwen3.6 with

ollama pull gemma4
ollama pull qwen3.6

I don't understand the small differences between models, for example qwen3.6:27b vs qwen3.6:35b. I see the size is 17GB vs 24GB, but does one run faster than the other? If the entire model fits within VRAM, should I always use the larger one? How will I know if a model is too big or will run super slow? Do I judge that purely based on the size listed on https://ollama.com/library/?
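For reference, here's a rough sketch of how I imagine the "does it fit" check working. It assumes the RTX 5090 has 32 GB of VRAM, and the 4 GB of headroom for context/KV cache is a number I made up for illustration, not a measured value. The function name is hypothetical too. It says nothing about speed, only whether the listed download size plus headroom stays under the VRAM total.

```python
def fits_in_vram(model_gb, vram_gb=32.0, headroom_gb=4.0):
    """Naive check: listed model size plus a guessed headroom must fit in VRAM.

    vram_gb defaults to 32.0 (RTX 5090); headroom_gb is an arbitrary
    placeholder for context/KV cache and other overhead, not a real figure.
    """
    return model_gb + headroom_gb <= vram_gb


# Sizes are the ones listed for the two tags mentioned above.
candidates = [("qwen3.6:27b", 17.0), ("qwen3.6:35b", 24.0)]

for tag, size_gb in candidates:
    verdict = "fits" if fits_in_vram(size_gb) else "does not fit"
    print(f"{tag}: {size_gb} GB {verdict} under these assumptions")
```

If that logic is roughly right, both tags would fit on my card, which is why I'm asking whether bigger is then always better.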

I also found this post: https://old.reddit.com/r/LocalLLaMA/comments/1snxzqi/its_just_me_or_qwen36_feels_kinda_dumb_or_its/

How do I decipher the differences between the 3 models tested? I see lots of letters and numbers that don't mean much to me:

gemma4-26B-A4B-it-UD-Q4_K_M
gemma4-31B-it-Q4_K_M
qwen3.6-35B-A3B-UD-IQ4_XS
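My best attempt so far at splitting those names into parts is below. The field labels in the comments are my own guesses at what each tag means (total parameters, active parameters for mixture-of-experts models, instruction-tuned, an Unsloth Dynamic variant, quantization type), not an official naming spec, and the regex is only written to handle these three names. Corrections welcome.

```python
import re

# Hypothetical field labels: a best-guess reading of the tags, not a spec.
NAME_PATTERN = re.compile(
    r"^(?P<family>[a-z]+[\d.]+)"   # model family and version, e.g. qwen3.6
    r"-(?P<total>\d+B)"            # total parameter count, e.g. 35B
    r"(?:-A(?P<active>\d+B))?"     # guessed: active params per token (MoE)
    r"(?:-(?P<tune>it))?"          # guessed: "it" = instruction-tuned
    r"(?:-(?P<variant>UD))?"       # guessed: "UD" = Unsloth Dynamic quant
    r"-(?P<quant>I?Q\d\w*)$"       # quantization type, e.g. Q4_K_M, IQ4_XS
)


def parse_name(name):
    """Return a dict of the guessed fields, or None if the name doesn't match."""
    match = NAME_PATTERN.match(name)
    return match.groupdict() if match else None


names = [
    "gemma4-26B-A4B-it-UD-Q4_K_M",
    "gemma4-31B-it-Q4_K_M",
    "qwen3.6-35B-A3B-UD-IQ4_XS",
]

for name in names:
    print(name, "->", parse_name(name))
```

Fields that a name doesn't include (for example no `A…B` part in the 31B model) come back as `None`.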

My specs:

Component	Item
CPU	9950X3D
RAM	64GB DDR5 @ 6000MT/s
GPU	RTX 5090

I'm open to any and all tips you're willing to provide. TIA!

submitted by /u/cryptospartan