I compared all specs of the major GPUs/machines that are being used here, because bandwidth is not everything. Some of ya'll need a reality check.
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
| Hot takes: I understand that this sub is now filled with gamers who do nothing but ERP with anime waifus on their setups, but for people who do something actually productive, prefill is still very important and this is completely hidden by the "generate 1000 word story" benchmarks that most posts or big AI youtube channels do. Especially with multimodal models that eat up context like mad. I'm still collecting data for prefill and generation charts I'd like to do in the future... I also couldn't find much reliable power data, so if you could provide that from your own setups in the comments I'll be glad. Thanks for coming to my ted talk. [link] [comments] |
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.