r/LocalLLaMA · · 2 min read

Gryphe/Pantheon-Reasoning-27B · Hugging Face

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Gryphe/Pantheon-Reasoning-27B · Hugging Face

from Gryphe:

An experiment in bringing reasoning capability to the Pantheon roleplay series in the form of an uncensored dense Qwen 3.6 27B. This specific model can be thought of as a successor to both the Pantheon series and the one-time Codex release since I used such a large variety of data this time around.

Yet another theory being tested this time around: take the data that Pantheon is built on, pair it with full thinking traces, and let the model reason its way through character work — weighing tone, planning narrative beats, considering how a character would actually respond before committing to a line. Whether that meaningfully improves roleplay quality over a non-reasoning model is a question you'll hopefully be able to help me answer.

GGUF quants are available here.

Model details

Base model is llmfan46/Qwen3.6-27B-uncensored-heretic-v2-Native-MTP-Preserved, and from what I can tell this worked out very, very nicely in regards to refusal reduction and writing capabilities.

I considered Gemma 4 31B but that model has been an absolute pain to train. Something something special snowflake architectures. (grumble, grumble)

All training sources include full reasoning traces, with thinking active across every assistant turn:

  • Pantheon data (~28%) - the core Pantheon roleplay corpus with reasoning traces back-generated using the method described below
  • Opus-4.6-Reasoning-24k (~21%) - a cleaned and deduplicated aggregation of Claude Opus 4.6 reasoning traces covering general instruction-following, STEM, and coding; provides the broad reasoning backbone
  • WorldSim data (~16%) - long-form Opus 4.6 narrative roleplay with native reasoning traces, focusing on extended storytelling, character immersion, and emergent world logic, cobbled together through various experiments - mainly third person present tense but has a bit of everything + cliché cleaned, of course!
  • Text adventure data (~16%) - high stakes interactive fiction and text adventure content with reasoning back-generated, lending the model a more grounded, prose-forward writing style
  • General roleplay data (~16%) - a broad collection of highly varied roleplay transcripts with reasoning back-generated, helping the model generalise well to arbitrary character setups
  • Tiamat data (~3%) - character and roleplay dataset originally built for Tiamat-24B-Magistral, featuring a multi-step generation/extension/improvement pipeline with critic-improver rewrites to reduce AI clichés, with reasoning back-generated for each exchange

The model was trained with preserve_thinking: true, so thinking tags remain active across all assistant turns in multi-turn conversations, not just the first.

submitted by /u/jacek2023
[link] [comments]

Discussion (0)

Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.

Sign in →

No comments yet. Sign in and be the first to say something.

More from r/LocalLLaMA