r/LocalLLaMA · June 5, 2026 · 1 min read

PSA: Gemma 4 12B is NOT completely broken for coding and tool calling, you need a special chat template

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

This is a PSA for people like me who tried it and hit the wall with tool calls failing left and right, so much so that harnesses like OpenCode just didn't work:

There is a fix for that. You need to pass a better chat template file, which is available (I did not write it). See also this comment.

To actually use it with llama.cpp, first compile llama.cpp from source, then download the chat template file I linked above, then try this (8 bit quant in this case):

./build/bin/llama-server -hf unsloth/gemma-4-12b-it-GGUF:UD-Q8_K_XL --host 127.0.0.1 --port 8899 --jinja --chat-template-file ./custom-pub-chat-template-gemma4.jinja

I'm not saying the results are great, or good, or better or worse than Qwen 3 9B or any other model! But with this setting, the tool calling bugs go away and you can genuinely evaluate its capabilities in opencode.

So, please do that before forming a judgement of the model's coding ability.

But once you've done that, judge away 😀

I'm posting because I see so many "I can't code with Gemma 4 12B, tool calls never work" comments that it's tough to cut through the noise when discussing the model.

Thanks to u/HVACcontrolsGuru for bringing the solution to my attention. I hope I'm not stealing their thunder, just thought it was time to call more eyeballs to this.

submitted by /u/boutell
[link] [comments]

Discussion (0)

No comments yet. Sign in and be the first to say something.

Discussion (0)

More from r/LocalLLaMA