r/LocalLLaMA · June 5, 2026 · 1 min read

Finally finished my LLM server: EPYC 9575F, 4× RTX 3090 (96GB VRAM), 768GB ECC RAM

#inference

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Like Read original ↗

Finally finished my LLM server: EPYC 9575F, 4× RTX 3090 (96GB VRAM), 768GB ECC RAM

Took a while, but Nalthis is finally up and assembled.

Specs:

Supermicro H13SSL-N
AMD EPYC 9575F (64C/128T Zen 5)
768GB DDR5-5600 ECC RDIMM
4× RTX 3090 (96GB VRAM total)
1× 2TB NVMe OS
2× 3.94TB NVMe data
2050W ATX 3.1 PSU
Corsair 9000D

Planned use:

vLLM - high throughput small models
llamacpp - larger reasoning models

I have been making a space simulation and finally ready to integrate AI into how the NPCs doing planning, hoping to get decent throughput on smaller models with lots of requests

The original plan involved a lot more MCIO risers and custom mounting, but I was able to fit two of the 3090s directly on the motherboard and front-mount the other two.

Planning to run all four cards power-limited to 250W since this box is primarily for LLM inference.

The 9000D has been surprisingly good for a 4×3090 build. I also used these fan mounts for additional airflow:

https://www.thingiverse.com/thing:2804306

Still need to finish thermal testing, but the hardware side is finally done.

Head of Cluster Operations: Stannis leading from the couch as well

submitted by /u/C0smo777
[link] [comments]

Discussion (0)

No comments yet. Sign in and be the first to say something.

Discussion (0)

More from r/LocalLLaMA