Local vs Frontier on low-level systems engineering
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Hey r/LocalLLaMA,
Before anyone jumps on me, this is absolutely not a post about how great Qwen is 😄
Even though I use Qwen 3.6 35B-A3B daily, I’ve found a massive gap between Opus and every other model, local or frontier (including GPT 5), when it comes to low-level systems engineering. I’m using that phrase deliberately to avoid just calling it "hacking" 😄
I’ve spent the last few months burning through a serious amount of Opus tokens on this project: https://github.com/mihailescu2m/woodbourne
In a nutshell: I wanted to modify the firmware of an AirPlay speaker to disable an annoying idle standby timer that puts the speaker to sleep after 20 minutes of inactivity. Local models and even GPT hit a wall at the very beginning: they couldn't even accurately map out the firmware layout, let alone correctly reverse-engineer the CRC structure. Only Opus eventually figured out the checksum constraints, correctly disassembled massive amounts of the firmware code, and automated the binary patching so I could safely generate and test dozens of different patched firmware variants.
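For anyone who hasn't done this kind of thing, here is a rough sketch of what "patch the binary and keep the checksum valid" means in general. This is not the woodbourne tooling and not the speaker's real format: the layout (a plain CRC32 over the whole body, stored little-endian in the last 4 bytes) and the names `patch_and_fix_crc` and `crc_ok` are made up for illustration. Real firmware usually has several regions with their own checksums, and finding those is exactly the hard part.

```python
import struct
import zlib


def crc_ok(image: bytes) -> bool:
    """Check the trailing CRC32 of a firmware image.

    Assumed (hypothetical) layout: body, then CRC32 of the body
    as a 4-byte little-endian integer.
    """
    stored = struct.unpack("<I", image[-4:])[0]
    return stored == zlib.crc32(image[:-4])


def patch_and_fix_crc(image: bytes, offset: int, new_bytes: bytes) -> bytes:
    """Overwrite bytes at `offset`, then recompute the trailing CRC32."""
    body_len = len(image) - 4
    if offset < 0 or offset + len(new_bytes) > body_len:
        raise ValueError("patch falls outside the checksummed body")

    body = bytearray(image[:body_len])
    body[offset:offset + len(new_bytes)] = new_bytes

    # Without this step the device would reject the image,
    # because the stored CRC would no longer match the patched body.
    return bytes(body) + struct.pack("<I", zlib.crc32(bytes(body)))
```

Generating many variants then comes down to calling `patch_and_fix_crc` once per candidate offset and byte sequence, and checking each result with `crc_ok` before flashing anything.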
Going through this process really solidified my impression that for hard, low-level binary analysis, Opus is in an entirely different tier from everything else. Check out the repo if you're interested in the firmware tooling side of things. Would love to hear about other experiences on similar projects!