r/LocalLLaMA · June 26, 2026 · 1 min read

Gemma 4 12b needs glasses


I'm having a lot of fun using Gemma 4 as an assistant, but I'm growing frustrated with its poor default image resolution setting for vision.

Qwen 3.6 flies through tasks like identifying smaller text in an image, but Gemma 4 is never able to decipher it.

It consistently fails even at larger, overall elements of composition.

I tried adding some params to llama.cpp that supposedly worked with Gemma 4 31b:

 --image-min-tokens 560 --image-max-tokens 2240

But that just makes the server crash and quit.
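
For reference, here is roughly where those two flags would sit in a full server command. This is only a sketch: it assumes the server binary is `llama-server` and that the model is loaded with `-m` plus a separate `--mmproj` projector file, and the two file names are placeholders rather than the actual files used. It prints the command instead of running it.

```shell
# Placeholder file names (hypothetical; substitute your own GGUF files).
MODEL="model.gguf"
MMPROJ="mmproj.gguf"

# The image-token flags quoted in the post, kept in one variable so they
# can be removed or changed in a single place while testing.
IMAGE_FLAGS="--image-min-tokens 560 --image-max-tokens 2240"

# Assemble the full command line (assumes the binary is named llama-server).
CMD="llama-server -m $MODEL --mmproj $MMPROJ $IMAGE_FLAGS"

# Print the command for inspection; it is not executed here.
echo "$CMD"
```

Keeping the flags in a single variable makes it easy to test whether the crash goes away with them removed or with smaller values.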

Is there a way to get Gemma 12b some new glasses, so it can be a do-it-all assistant for me?
