r/LocalLLaMA · May 26, 2026 · 3 min read

Small comparison of full compute performance (Anima) of the 5090 (600, 475 and 400W) vs the 6000 PRO MaxQ (325W) and the 6000 PRO WS/SE (600W).



Hello guys, hoping you're doing fine!

After selling some cards, I got a 6000 PRO MaxQ, whose power limit ranges from 250W to 325W.

I still have a 5090, whose power limit ranges from 400W to 600W.

Since I had these, and I like to do compute for diffusion (txt2img, txt2video, img2img, etc.), I wanted to compare them.

I also rented a 6000 PRO WS edition on RunPod, whose power limit ranges from 150W to 600W (yes, its minimum is lower than the MaxQ's).

Important note: I did undervolt+overclock the 5090 and the 6000 PRO MaxQ. I can't modify the clocks or power on the rented GPUs on runpod.

So for this test, I ran these settings for the software:

Torch 2.12.0.dev20260310+cu130 for the 5090 and 6000 PRO MaxQ.
Torch 2.12.0+cu130 stable for the 6000 PRO WS.
SageAttention 2.1 (on commit e9b072f0fc2682f104abbda306af3d42fc33b969), self-built on CUDA 13.1.
Forge neo on commit 91c2e0adbefd06bc3475da34fbdb21a4c5736faa
Installed extensions for RTX Upscaling (https://github.com/Haoming02/sd-forge-nvidia-vfx) and for extra samplers (https://github.com/Panchovix/sd_forge_neo_extra_samplers)

I ran these settings for the samplers and steps:

Sampler settings

On text:

EXP Heun 2 x0 SDE for first 25 steps
ER SDE for 10 hires pass steps
Upscale by 1.5x
896x1088 resolution
Batch size 4
CFG 5
Shift 3
Denoise Strength: 0.2
Upscaler: NVIDIA Ultra
Seed: 999999999
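
As a rough sketch of what these settings imply for the hires pass, the snippet below works out the upscaled resolution and pixel count. It assumes the 1.5x factor is applied to each side of the 896x1088 base image; the variable names are only illustrative.

```python
# Base generation settings taken from the list above.
base_w, base_h = 896, 1088
scale = 1.5          # "Upscale by 1.5x", assumed to apply per side
batch_size = 4

# Resolution of the hires pass.
hires_w = int(base_w * scale)
hires_h = int(base_h * scale)

# Pixel counts per image, before and after upscaling.
base_px = base_w * base_h
hires_px = hires_w * hires_h

print(f"Base:  {base_w}x{base_h} ({base_px:,} px per image)")
print(f"Hires: {hires_w}x{hires_h} ({hires_px:,} px per image)")
print(f"Pixel ratio: {hires_px / base_px:.2f}x, batch of {batch_size} images")
```

So each of the 10 hires steps works on 2.25x the pixels of a first-pass step, which is part of why the hires pass weighs on the total time despite having fewer steps.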

Prompt used was:

Positive:

masterpiece, high quality, score_7, '@' \(orange maru\), sfw, 1girl, solo, fully clothed, cynthia \(sygna suit\) \(aura\) \(pokemon\), pokemon masters ex, blonde hair, long hair, ponytail, hair over one eye, grey eyes, :|, full body, blurry background

Negative:

worst quality, low quality, bad anatomy, (jpeg artifacts:0.8), watermark, sketch, no pupils,

For the hardware, I ran the cards headless, with LACT:

RTX 5090:
- 2930MHz max core clock
- 1000MHz core clock offset
- +4400MHz on VRAM (total 16000MHz)
- 400, 475 and 600W
RTX 6000 PRO MaxQ:
- 550MHz core clock offset
- No max core clock
- +5270MHz on VRAM (total 16000MHz)
- 325W
RTX 6000 PRO WS:
- Stock
- 600W

With all this data, I have these results:

GPU	Power	Notes	Time	Time vs baseline (negative = slower)
RTX 5090	600W	Baseline (OC + UV)	36s	-
RTX 6000 PRO SE/WS	600W	No tuning	39s	-8.3%
RTX 5090	475W	UV+OC	42s	-16.7%
RTX 6000 PRO MaxQ	325W	OC	48s	-33.3%
RTX 5090	400W	UV+OC	48s	-33.3%

Or, using the 5090 at 400W as the baseline:

GPU	Power	Notes	Time	Time saved vs baseline
RTX 5090	400W	Baseline (OC + UV)	48s	-
RTX 6000 PRO MaxQ	325W	OC	48s	0%
RTX 5090	475W	UV+OC	42s	+12.5%
RTX 6000 PRO WS/SE	600W	No tuning	39s	+18.8%
RTX 5090	600W	UV+OC	36s	+25.0%
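
For anyone who wants to recheck the percent columns, here is a minimal sketch of the arithmetic. Both tables compare times, so the second table's figures are time reductions, not throughput gains; the throughput figure is shown alongside for contrast. The function names are my own and only illustrative.

```python
# Measured times from the tables above, in seconds.
times_s = {
    "RTX 5090 @ 600W": 36,
    "RTX 6000 PRO WS/SE @ 600W": 39,
    "RTX 5090 @ 475W": 42,
    "RTX 6000 PRO MaxQ @ 325W": 48,
    "RTX 5090 @ 400W": 48,
}

def extra_time_pct(t, baseline):
    """Percent more time than the baseline (first table, shown there as negative)."""
    return (t - baseline) / baseline * 100

def time_saved_pct(t, baseline):
    """Percent less time than the baseline (second table)."""
    return (baseline - t) / baseline * 100

def speedup_pct(t, baseline):
    """Percent higher throughput than the baseline (images per second)."""
    return (baseline / t - 1) * 100

fastest = times_s["RTX 5090 @ 600W"]
slowest = times_s["RTX 5090 @ 400W"]

for name, t in times_s.items():
    print(
        f"{name}: {t}s | "
        f"{extra_time_pct(t, fastest):.1f}% more time than the 600W 5090 | "
        f"{time_saved_pct(t, slowest):.1f}% less time than the 400W 5090 | "
        f"{speedup_pct(t, slowest):.1f}% higher throughput than the 400W 5090"
    )
```

Note that 25% less time (48s down to 36s) is the same thing as about 33% higher throughput, so the two tables agree; they only differ in which run is used as the denominator.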

While running this task, the cards hovered around these core clocks:

5090 600W: ~2500MHz core clock
5090 475W: ~2100MHz core clock
6000 PRO WS/SE 600W: ~2200MHz core clock
5090 400W: ~1800MHz core clock
6000 PRO MaxQ: 1400-1500MHz core clock

So, as you can see, the 5090 at 600W finishes in 25% less time than the 6000 PRO MaxQ (about 33% higher throughput), but it does so with a roughly 85% higher power limit (600W vs 325W).

At the same time, the untuned 6000 PRO WS/SE finishes in 18.8% less time than the MaxQ, also with a roughly 85% higher power limit. In theory though, if you undervolt + overclock the WS/SE, it would be faster than the 5090.

And lastly, the 6000 PRO MaxQ performs the same as the 5090 at 400W while using about 81% of the power (325W vs 400W), which is quite impressive for how power limited it is.
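
The power side of these comparisons can be sketched the same way. The snippet below computes power-limit ratios and an upper bound on energy per batch. It assumes each card draws its full power limit for the whole run, which the post does not measure, so treat the Wh figures as ceilings and not as measured consumption.

```python
# (GPU, power limit in W, measured time in s) from the results tables.
runs = [
    ("RTX 5090", 600, 36),
    ("RTX 6000 PRO WS/SE", 600, 39),
    ("RTX 5090", 475, 42),
    ("RTX 6000 PRO MaxQ", 325, 48),
    ("RTX 5090", 400, 48),
]

def energy_wh(limit_w, time_s):
    """Upper bound on energy per batch, assuming the card sits at its power limit."""
    return limit_w * time_s / 3600

def power_ratio(a_w, b_w):
    """Ratio of power limit a to power limit b."""
    return a_w / b_w

for name, limit_w, time_s in runs:
    print(f"{name} @ {limit_w}W: {time_s}s, at most {energy_wh(limit_w, time_s):.2f} Wh per batch")

# 600W cards vs the MaxQ at 325W, and the MaxQ vs the 5090 at 400W.
print(f"600W vs 325W: {(power_ratio(600, 325) - 1) * 100:.1f}% higher power limit")
print(f"325W vs 400W: {power_ratio(325, 400) * 100:.2f}% of the power limit")
```

Under that assumption the MaxQ needs the least energy per batch of the five runs, even though it ties for the slowest time.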

If anyone with a tuned 6000 PRO WS/SE can run the test, let me know!

submitted by /u/panchovix