Sapient Intelligence releases HRM-Text 1B: 40B tokens, ~$1k pretrain, beats Llama3.2 3B on MATH and DROP
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Sapient Intelligence (the HRM/hierarchical reasoning folks) dropped HRM-Text 1B today. Posting because the benchmark chart is interesting enough to be worth a look even if you're skeptical of the marketing.
The training numbers:
Where it actually wins (per their chart):
Where it's roughly tied or behind:
So the pattern is what you'd expect from something called a "Hierarchical Reasoning Model": it punches well above its weight on multi-step reasoning (MATH, DROP) and is only middling on knowledge recall (MMLU). The MMLU gap is the validating part of the story: 40B tokens is just not enough to pack in world knowledge.
Links:
Caveats worth flagging before anyone gets too hyped:
Anyone tried it yet?