It felt good to return my Asus Spark
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
It's an incredible little package, but the price is too high for the performance, and I simply didn't want to be part of the great "Superchip lie". It could be super, but it's ruined by its limited memory bandwidth: it *could* have 2x the throughput, but it doesn't. (The C2C link is 600 GB/s, but the memory isn't.)
I wanted a better experience with larger models. These fail miserably at 27B and do OK with MoEs, which actually shine on much cheaper hardware, and I don't have the interest or the wallet to buy 3-8 of these to go "all in private" and still be at just a few tokens a second. Qwen 3.5 122B A10B was about the perfect sweet spot for this hardware, but I'm not sure we'll see more models like that going forward, and that's part of what stinks about this: if someone trained models for this architecture, they might run OK, but no one is. Not even Nvidia.
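To put rough numbers on the bandwidth complaint, here is a back-of-the-envelope sketch, not a benchmark. It assumes decode is purely memory-bound, so every generated token has to stream all active weights from memory once. The 273 GB/s memory figure, the 4-bit (0.5 bytes per parameter) quantization, and treating the 27B model as dense are my assumptions for illustration; only the 600 GB/s figure and the model sizes come from the post above. The function name is made up for this example.

```python
def decode_tokens_per_second(active_params_billions, bytes_per_param, bandwidth_gb_s):
    """Upper bound on decode speed when memory bandwidth is the bottleneck.

    Ignores KV-cache reads, compute time, and overhead, so real numbers
    will be lower than what this returns.
    """
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    return (bandwidth_gb_s * 1e9) / bytes_per_token


# Assumed values, labeled as such: 4-bit weights and two bandwidth scenarios.
BYTES_PER_PARAM = 0.5      # assumption: ~4-bit quantization
ASSUMED_MEMORY_GB_S = 273  # assumption: unified memory bandwidth
C2C_GB_S = 600             # from the post: chip-to-chip link speed

models = {
    "27B (assumed dense, 27B active)": 27,
    "122B A10B MoE (10B active)": 10,
}

for name, active_b in models.items():
    slow = decode_tokens_per_second(active_b, BYTES_PER_PARAM, ASSUMED_MEMORY_GB_S)
    fast = decode_tokens_per_second(active_b, BYTES_PER_PARAM, C2C_GB_S)
    print(f"{name}: ~{slow:.1f} tok/s ceiling at {ASSUMED_MEMORY_GB_S} GB/s, "
          f"~{fast:.1f} tok/s ceiling at {C2C_GB_S} GB/s")
```

Under these assumptions the ceiling scales linearly with bandwidth and inversely with active parameters, which is why an MoE with 10B active parameters feels fine while a 27B model that reads all its weights per token does not.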
It's a product looking for a new market while not doing that great in any of them, yet carrying a huge premium on the price tag. They honestly should have put in additional memory controllers, made the current chip design a more affordable 32GB system, and given DGX Spark users the full 600 GB/s for the price they're demanding.
Will the RTX Spark offer some of these options, and at a better price point? I know the ConnectX port drove a huge chunk of the cost, but I still think the 128GB is largely wasted until they fix the memory controller / bandwidth performance.