Reduce your GPU power limit
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
I'd like to note that I'm effectively a layman at this and have no idea what I'm talking about. Inspired by another post, I wanted to test how power limit adjustments affect token processing and generation. I have no idea if this applies to more pro hardware, but it's absolutely applicable to your gaming GPU! Just open up MSI Afterburner from back in high school, when you thought you were going to overclock.

I believe the testing was with qwen3.5:9b, but it was a few days ago and I forgot to write it down.

The second image is data from testing adjustments to core and memory clocks. Very little impact, though if you're really trying to squeeze out every last token, increasing your memory clock by 700-1000 MHz will improve token generation moderately across the board (I did not test this at stock power limit, but now I'm curious).

The only test I think could still be helpful would be to log the actual power draw by the system, though that would only really be useful to see whether adjusting core clocks can impact power consumption and performance simultaneously, so I haven't bothered yet.

TG128 -> generate 128 tokens
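For anyone who wants to try the power-draw logging mentioned above, here is a minimal sketch, not something that was run for this post. It assumes an NVIDIA card with `nvidia-smi` on the PATH and reads the `power.draw` and `power.limit` query fields. That is GPU board power as reported by the driver, not whole-system draw at the wall. The helper names (`parse_power_line`, `read_gpu_power`, `tokens_per_joule`) are made up for illustration.

```python
import subprocess

# Assumption: NVIDIA GPU with nvidia-smi installed. power.draw and power.limit
# are nvidia-smi query fields; with "nounits" both are printed as plain watts.
QUERY = [
    "nvidia-smi",
    "--query-gpu=power.draw,power.limit",
    "--format=csv,noheader,nounits",
]


def parse_power_line(line: str) -> tuple[float, float]:
    """Parse one CSV line like '187.45, 250.00' into (draw_watts, limit_watts)."""
    draw, limit = (float(field.strip()) for field in line.split(","))
    return draw, limit


def read_gpu_power(gpu_index: int = 0) -> tuple[float, float]:
    """Ask nvidia-smi for the current draw and the configured limit of one GPU."""
    result = subprocess.run(
        QUERY + ["-i", str(gpu_index)],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_power_line(result.stdout.strip().splitlines()[0])


def tokens_per_joule(tokens_per_second: float, watts: float) -> float:
    """Efficiency figure: (tokens/s) / (joules/s) = tokens per joule."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tokens_per_second / watts
```

Calling `read_gpu_power()` about once a second during a TG128 run and averaging the draw would give a wattage to pair with the measured tokens per second at each power limit. `tokens_per_joule` then turns each pair into a single efficiency number, which makes it easier to see where lowering the limit stops paying off.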