r/LocalLLaMA · 2 min read

If it doesn't make my PP better, I don't want it

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Highlights:

  • 4 x 48GB modded 4090s - 192GB VRAM
  • 128GB DDR5
  • Pro WS WRX90E-SAGE SE
  • 3000W PSU
  • 240V/30A dryer line
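
For a rough sense of how much room the dryer line leaves, here is a back-of-the-envelope sketch using only the numbers in the list above. The 80% continuous-load derate is an assumption (a common rule of thumb, not something stated in the post), and the PSU's 3000W rating is treated as its worst-case draw, ignoring conversion losses at the wall.

```python
# Rough power budget for the circuit and PSU listed above.
LINE_VOLTS = 240          # from the post
BREAKER_AMPS = 30         # from the post
PSU_WATTS = 3000          # PSU rating from the post, used as worst-case draw
CONTINUOUS_DERATE = 0.8   # ASSUMPTION: common rule of thumb for continuous loads


def circuit_watts(volts, amps, derate=1.0):
    """Power available on a circuit, optionally derated for continuous use."""
    return volts * amps * derate


peak_watts = circuit_watts(LINE_VOLTS, BREAKER_AMPS)
continuous_watts = circuit_watts(LINE_VOLTS, BREAKER_AMPS, CONTINUOUS_DERATE)
headroom_watts = continuous_watts - PSU_WATTS
psu_amps = PSU_WATTS / LINE_VOLTS  # current drawn if the PSU were fully loaded

print(f"Circuit peak:        {peak_watts:.0f} W")
print(f"Circuit continuous:  {continuous_watts:.0f} W (assumed 80% derate)")
print(f"PSU at full load:    {PSU_WATTS} W (~{psu_amps:.1f} A at {LINE_VOLTS} V)")
print(f"Headroom:            {headroom_watts:.0f} W")
```

Under these assumptions the server by itself sits well inside the circuit's capacity, which fits the post: the constraint is sharing the line with the dryer, not the server's own draw.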

Q. Is putting a server on a dryer line a good idea?

A. No, or emphatically yes. Splitters on this line are not code compliant, so I have to turn off the server to use the dryer, or buy a smaller dryer that can go on the 20A circuit. I've also had two nuisance trips while idle in the past month due to the laundry GFCI. A double-conversion pure sine wave UPS is on the way. This room is my only option in the house.

Q. Is it super hot?

A. YES. But the laundry room has an exhaust fan, which I set up with a thermometer to switch on automatically at ~79°F. It works surprisingly well, and the room is usually only a few degrees warmer than outside. The cards themselves put out about half a hand dryer's worth of heat at idle and 2-3 hand dryers at full blast. This is going to heat half my house in the winter. So far I have not seen the cards go beyond ~71°C.
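
The post doesn't say how the fan control is wired up, so the following is only a minimal sketch of one way a temperature-triggered exhaust fan could be driven. The ~79°F switch-on point comes from the post; the 77°F switch-off point and the function name `fan_should_run` are hypothetical, added so the fan doesn't rapidly toggle when the room hovers around the threshold.

```python
# Minimal hysteresis controller for a temperature-triggered exhaust fan.
ON_AT_F = 79.0    # from the post: exhaust starts at about 79°F
OFF_AT_F = 77.0   # ASSUMPTION: hypothetical lower threshold to avoid rapid cycling


def fan_should_run(temp_f, fan_is_on, on_at=ON_AT_F, off_at=OFF_AT_F):
    """Return the new fan state given the current temperature and state."""
    if temp_f >= on_at:
        return True
    if temp_f <= off_at:
        return False
    # Between the two thresholds, keep doing whatever the fan is already doing.
    return fan_is_on


# Illustrative readings only (not measurements from the post).
readings_f = [75.0, 78.0, 79.5, 78.0, 77.5, 76.9, 78.0]
fan_on = False
states = []
for temp in readings_f:
    fan_on = fan_should_run(temp, fan_on)
    states.append(fan_on)
    print(f"{temp:5.1f} F -> fan {'ON' if fan_on else 'off'}")
```

The gap between the two thresholds is the design choice here: with a single cutoff, a room sitting right at 79°F would switch the fan on and off with every small fluctuation.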

Q. Is it noisy?

A. YES. It's barely audible outside of the room, though.

Use-case: I have been working on a private Jarvis-class assistant for a while now. It has premium voice capabilities including, most notably, the ability to change voices mid-turn to speak as different characters for effect. This is absolutely surreal. It also has voice verification, wake words with continuous conversation, turn-taking, long-term memory, a dynamic system prompt, Home Assistant integration, Hermes Agent integration, and deep research capabilities. It is deployed across the house on clients with conference speaker-mics. Of course, I'm always experimenting with other stuff as well.

Performance:
I have tried many models, including high quants of Qwen 397B, MiniMax M3, Nemotron 3 Ultra, GLM 4.7, and an extremely lobotomized GLM 5.2. It's actually very difficult to find anything as good as, let alone better than, Gemma 4 31B QAT. MiMo V2.5 has looked pretty good over the day or so I've been running it, although I have encountered a few loops. This model is shockingly fast for its size.
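
To see why quantization level matters on a 192GB rig, here is a rough weights-only estimate. The parameter counts (397B, 31B) and the 4 x 48GB total come from the post; the bit-widths are purely illustrative assumptions, not the quants actually used, and the estimate ignores KV cache, activations, and runtime overhead, so real requirements are higher.

```python
# Weights-only memory estimate: parameters * bits per weight / 8 bits per byte.
TOTAL_VRAM_GB = 4 * 48  # from the post: four 48GB cards


def weight_gb(params_billion, bits_per_weight):
    """Approximate weight size in GB (1 GB = 1e9 bytes), ignoring all overhead."""
    return params_billion * bits_per_weight / 8


# ASSUMPTION: bit-widths below are illustrative, not taken from the post.
candidates = [
    ("397B model", 397, 4),
    ("397B model", 397, 3),
    ("31B model", 31, 4),
    ("31B model", 31, 8),
]

for name, params_b, bits in candidates:
    size = weight_gb(params_b, bits)
    verdict = "fits" if size <= TOTAL_VRAM_GB else "does not fit"
    print(f"{name} @ {bits}-bit: ~{size:.1f} GB of weights, {verdict} in {TOTAL_VRAM_GB} GB")
```

By this estimate the weights of a 397B model at 4 bits per weight already slightly exceed 192GB before any context is allocated, while a 31B model leaves most of the VRAM free.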

submitted by /u/dangerous_inference