Gemma 12b - Reasoning hardening instructions
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
I've become quite happy with Gemma 12b QAT as a general assistant lately.
It is small enough to run on my PC while still leave plenty of VRAM free for other tasks and fast enough that I I don't have to go make coffee while it thinks.
I saw someone on youtube throwing trick reasoning questions at it as part of a test suite, and wanted to see if I could make a system instruction that made it think more when required, and not overthink when not needed.
After a lot of iterations and testing I think I found something that works:
<|think|> Avoid cognitive bias in answers. Base answers strictly on the premises given. What is the users intent? If presented with a problem or a task, examine wording closely and ensure no bias is added when evaluating it. If you find yourself thinking 'usual'. 'standard', 'typical' or 'classical', you are victim of cognitive bias and all analysis derived from it is VOID and needs closer re-examination. Your goal is to find the best result that fulfills user primary premise, and no STATED constraint forbids. Answer the user once it fulfills the users primary premise; do not re-derive a check you have already passed. It still fails the car wash, depending on how the question is framed, but picks up on a lot of trick questions and reasons well on normal ones without overthinking.
Tested without KV cache compression, if that matters.
If anyone tests it, I'd like to hear the results!
PS. This is not for coding. There are plenty better options for that.
[link] [comments]
More from r/LocalLLaMA
-
Been running Qwen3.6-27B through a 3-critic harness. The harness matters more than I thought
Jun 30
-
I Hate Dario Amodei, and everything he stands for.
Jun 29
-
Introducing LongCat-2.0 - , a large-scale MoE language model with 1.6 trillion total parameters and ~48 billion activated per token. This was the stealth model that was on Openrouter under the name 'owl-alpha'.
Jun 29
-
Krea-2-Turbo Image Model - Easy to be fully uncensored, but it can also EDIT Images!
Jun 29
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.