r/LocalLLaMA · June 3, 2026 · 1 min read

google/gemma-4-12B · Hugging Face

#model-release #multimodal #training #open-source #gpu #music

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on E2B, E4B, and 12B) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages.

Featuring both Dense and Mixture-of-Experts (MoE) architectures, Gemma 4 is well-suited for tasks like text generation, coding, and reasoning. The models are available in five distinct sizes: E2B, E4B, 12B, 26B A4B, and 31B. Their diverse sizes make them deployable in environments ranging from high-end phones to laptops and servers, democratizing access to state-of-the-art AI.

Gemma 4 introduces key capability and architectural advancements:

Reasoning – All models in the family are designed as highly capable reasoners, with configurable thinking modes.
Extended Multimodalities – Processes Text, Image with variable aspect ratio and resolution support (all models), Video, and Audio (featured natively on the E2B, E4B, and 12B models).
Diverse & Efficient Architectures – Offers Dense and Mixture-of-Experts (MoE) variants of different sizes for scalable deployment.
Optimized for On-Device – Smaller models are specifically designed for efficient local execution on laptops and mobile devices.
Increased Context Window – The small models feature a 128K context window, while the medium models support 256K.
Enhanced Coding & Agentic Capabilities – Achieves notable improvements in coding benchmarks alongside native function-calling support, powering highly capable autonomous agents.
Native System Prompt Support – Gemma 4 introduces native support for the system role, enabling more structured and controllable conversations.

https://developers.googleblog.com/gemma-4-12b-the-developer-guide/

feed your potato!!!

https://huggingface.co/ggml-org/gemma-4-12b-it-GGUF

https://huggingface.co/unsloth/gemma-4-12b-it-GGUF

submitted by /u/jacek2023
[link] [comments]

Discussion (0)

No comments yet. Sign in and be the first to say something.

Discussion (0)

More from r/LocalLLaMA