r/LocalLLaMA

Gemma 4 12b needs glasses

I'm having a lot of fun using Gemma 4 as an assistant, but I'm growing frustrated with its poor default image resolution for vision tasks.

Tasks like identifying smaller text in an image, which Qwen 3.6 flies through, Gemma 4 is never able to handle.

It consistently fails even at larger overall elements of composition.

I tried adding some params to llama.cpp that supposedly worked with Gemma 4 31b:

 --image-min-tokens 560 --image-max-tokens 2240 

But that just makes the server crash and quit.

Is there a way to get Gemma 4 12b some new glasses, so it can be a do-it-all assistant for me?

submitted by /u/nixudos