r/LocalLLaMA · June 3, 2026 · 1 min read

Gemma 4 12B first coding agent test on a 4080 Super

Just threw the new Gemma 4 12B into VSCodium with the Pi Agent extension to see how it handles tools, and it nailed the test on the first try. I gave it a prompt to write a Python script that reads a log file line by line, pulls the module name out of each error entry, and dumps the per-module counts to a JSON file. I also told it to make its own mock log data and run a live terminal test to verify the results.

Instead of just spitting out a block of code for me to copy and paste, the agent actually went to work. It created the script, populated a dummy app.log file with a mix of random logs, opened up a terminal shell to run the code, and verified the output with zero bugs or path errors.
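For anyone who wants to try the same prompt, here is a rough sketch of the kind of script it asks for. This is not the agent's actual output, which the post doesn't include. The log line format, the function names, and the `error_counts.json` file name are all assumptions made up for illustration; only `app.log` comes from the post.

```python
import json
import re
import tempfile
from collections import Counter
from pathlib import Path

# Assumed line format (not from the post):
#   "<date> <time> <LEVEL> [<module>] <message>"
ERROR_LINE = re.compile(r"^\S+ \S+ ERROR \[(?P<module>[^\]]+)\]")

# Hypothetical mock data, mixing error and non-error entries.
MOCK_LOG = """\
2026-06-03 10:00:00 INFO [auth] user logged in
2026-06-03 10:00:01 ERROR [auth] invalid token
2026-06-03 10:00:02 ERROR [db] connection timed out
2026-06-03 10:00:03 WARNING [cache] slow response
2026-06-03 10:00:04 ERROR [auth] password check failed
"""


def count_error_modules(log_path):
    """Read the log one line at a time and count ERROR entries per module."""
    counts = Counter()
    with open(log_path, encoding="utf-8") as handle:
        for line in handle:  # iterating the file streams it line by line
            match = ERROR_LINE.match(line)
            if match:
                counts[match.group("module")] += 1
    return dict(counts)


def write_counts(counts, out_path):
    """Dump the per-module counts to a JSON file."""
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(counts, handle, indent=2, sort_keys=True)


def run_mock_test(directory):
    """Write the mock app.log, run the counter, and save the JSON result."""
    directory = Path(directory)
    log_path = directory / "app.log"
    log_path.write_text(MOCK_LOG, encoding="utf-8")
    counts = count_error_modules(log_path)
    write_counts(counts, directory / "error_counts.json")
    return counts


if __name__ == "__main__":
    # Use a throwaway directory so the demo leaves no files behind.
    with tempfile.TemporaryDirectory() as tmp:
        print(run_mock_test(tmp))
```

Iterating over the file handle keeps memory flat on large logs, which is the point of the line-by-line requirement. A real log format would need its own regex.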

Model: Gemma 4 12B (Unsloth UD-Q4_K_XL)
Context: 32K (--ctx-size 32768)
KV Cache: 8-bit (--cache-type-k q8_0 --cache-type-v q8_0)
GPU layers: -1 (full offload to GPU)
Flash Attention: ON
Samplers: --temp 1.0, --top-p 0.95, --top-k 64, --min-p 0.05, --repeat-penalty 1.15
Backend: llama.cpp + CUDA
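Roughly how those settings could be put together into a single launch command, as a sketch only. The `llama-server` binary and the model file name are assumptions (the post only says llama.cpp and doesn't give a path). The GPU layer and Flash Attention flags are left out because the post doesn't show how they were passed.

```python
import shlex

# Hypothetical file name; the post does not give the actual model path.
MODEL_PATH = "gemma-4-12b-UD-Q4_K_XL.gguf"


def build_server_command(model_path):
    """Assemble the argument list from the flags quoted in the post."""
    return [
        "llama-server",            # assumed binary; the post only says llama.cpp
        "--model", model_path,
        "--ctx-size", "32768",     # 32K context
        "--cache-type-k", "q8_0",  # 8-bit K cache
        "--cache-type-v", "q8_0",  # 8-bit V cache
        "--temp", "1.0",
        "--top-p", "0.95",
        "--top-k", "64",
        "--min-p", "0.05",
        "--repeat-penalty", "1.15",
        # GPU offload and Flash Attention flags are omitted on purpose:
        # the post lists the settings but not the flags used for them.
    ]


command = build_server_command(MODEL_PATH)
print(shlex.join(command))
```

Building the command as a list means it can go straight into `subprocess.run(command)` without any shell quoting issues.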

submitted by /u/Wrong_Mushroom_7350