how would you set up a local llm server for a business of 7 people?
Mirrored from r/LocalLLaMA for archival readability.
Okay so I've been lurking this sub for a while, and I run the occasional small 2-8B model on my laptop (nothing special) for fun.
But say my role at a company is to set up a local LLM, since we obviously don't want confidential data going to other companies.
The main use case would be queries, RAG, and general use, nothing crazy, except maybe 1 or 2 people using it for programming.
I was thinking of Gemma 3 27B or Qwen 3 32B. How do these models scale with concurrent users? I know I could run one of these on a 5090 plus some extra, or a 48GB MacBook Pro with unified memory, but I'm not sure how these scale with multiple users.
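For sizing with concurrent users, the main thing that grows per user is the KV cache: the weights are loaded once, but each active request holds its own cache proportional to its context length. A rough back-of-envelope sketch, with illustrative parameters only (the layer count, KV head count, head dim, and quantized weight size below are assumptions, not any specific model's card; read real values from the model's config.json):

```python
# Rough VRAM sizing for serving a quantized ~27B-class model to N concurrent
# users. All model parameters here are illustrative assumptions.

def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    # K and V each store n_kv_heads * head_dim values per layer (fp16 = 2 bytes).
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem

def serving_vram_gb(weight_gb, users, ctx_tokens, n_layers, n_kv_heads, head_dim):
    # Total = weights (shared) + one KV cache per concurrent sequence.
    kv_gb = (users * ctx_tokens *
             kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim)) / 1024**3
    return weight_gb + kv_gb

# Assumed: ~27B weights at 4-bit quant ~= 16 GB; GQA with 8 KV heads,
# head_dim 128, 62 layers; 7 users each holding an 8K context.
total = serving_vram_gb(weight_gb=16, users=7, ctx_tokens=8192,
                        n_layers=62, n_kv_heads=8, head_dim=128)
print(f"~{total:.1f} GB")
```

Under these assumptions the KV cache for 7 users adds on the order of 13-14 GB on top of the weights, which is why grouped-query attention and quantized KV caches matter so much more for multi-user serving than for a single chat session.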