how would you set up a local llm server for a business of 7 people?
Mirrored from r/LocalLLaMA for archival readability.
Okay so I've been lurking this sub for a while, and I run the occasional small 2-8B model on my laptop (nothing special) for fun.
But say my role at a company is to set up a local LLM, since we obviously don't want confidential data going to other companies.
The main use case would be queries, RAG, and general use, nothing crazy, except maybe 1 or 2 people using it for programming.
I was thinking of Gemma 3 27B or Qwen 3 32B. How do these models scale with concurrent users? I know I could run one of these on a 5090 plus some extra, or a 48GB MacBook Pro with unified memory, but I'm not sure how these scale with multiple users.
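For sizing with concurrent users, the main thing that grows per user is the KV cache: the weights are loaded once, but each active request holds its own cache proportional to its context length. A rough back-of-envelope sketch, with illustrative parameters only (the layer count, KV head count, head dim, and quantized weight size below are assumptions, not any specific model's card; read real values from the model's config.json):

```python
# Rough VRAM sizing for serving a quantized ~27B-class model to N concurrent
# users. All model parameters here are illustrative assumptions.

def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    # K and V each store n_kv_heads * head_dim values per layer (fp16 = 2 bytes).
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem

def serving_vram_gb(weight_gb, users, ctx_tokens, n_layers, n_kv_heads, head_dim):
    # Total = weights (shared) + one KV cache per concurrent sequence.
    kv_gb = (users * ctx_tokens *
             kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim)) / 1024**3
    return weight_gb + kv_gb

# Assumed: ~27B weights at 4-bit quant ~= 16 GB; GQA with 8 KV heads,
# head_dim 128, 62 layers; 7 users each holding an 8K context.
total = serving_vram_gb(weight_gb=16, users=7, ctx_tokens=8192,
                        n_layers=62, n_kv_heads=8, head_dim=128)
print(f"~{total:.1f} GB")
```

Under these assumptions the KV cache for 7 users adds on the order of 13-14 GB on top of the weights, which is why grouped-query attention and quantized KV caches matter so much more for multi-user serving than for a single chat session.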