r/LocalLLaMA · June 9, 2026 · 1 min read

Our ICML paper on predictable hallucination (information-budget abstention gate), + ntkMirror: a training-free open-weight implementation we're releasing today


Our paper, Predictable Compression Failures: Order Sensitivity and Information Budgeting for Evidence-Grounded Binary Adjudication, was accepted at ICML 2026. Paper: https://arxiv.org/abs/2509.11208

The idea: in evidence-grounded QA, the order in which you present exchangeable evidence changes the model's answer probability (permutation dispersion). We treat order as a nuisance variable, derive the Expectation-level Decompression Law (EDFL) relating expected information budget to achievable reliability, and turn it into a fixed ISR=1 answer/abstain gate with no threshold tuning. When information is insufficient, the model abstains instead of guessing. In the paper's pre-specified held-out audit, the gate reaches 0.0–0.7% hallucination at ~24% abstention (80.5% accuracy on attempts), with the ISR=1 boundary fixed by theory rather than tuned.
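To make the answer/abstain rule concrete, here is a minimal sketch of an ISR-style gate. It is not the paper's or ntkMirror's code. It assumes ISR is the ratio of an available information budget to the budget required to lift a prior probability to a target reliability, and that the required budget is a Bernoulli KL divergence in nats. The names `bernoulli_kl`, `isr_gate`, `info_budget`, `prior_prob` and `target_reliability` are hypothetical; check the paper for the exact definitions.

```python
import math


def bernoulli_kl(p: float, q: float, eps: float = 1e-12) -> float:
    """KL(Ber(p) || Ber(q)) in nats, clamped away from 0 and 1 to avoid log(0)."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def isr_gate(info_budget: float, prior_prob: float, target_reliability: float):
    """Return (isr, decision) under the ASSUMED definition
    ISR = info_budget / KL(Ber(target_reliability) || Ber(prior_prob)).
    The boundary is fixed at 1.0, so there is no threshold to tune."""
    if prior_prob >= target_reliability:
        # The prior already meets the target, so no extra information is needed.
        return math.inf, "answer"
    required = bernoulli_kl(target_reliability, prior_prob)
    isr = info_budget / required
    return isr, ("answer" if isr >= 1.0 else "abstain")


# Illustrative numbers only; they are not taken from the paper.
print(isr_gate(info_budget=0.1, prior_prob=0.5, target_reliability=0.95))
print(isr_gate(info_budget=1.0, prior_prob=0.5, target_reliability=0.95))
```

With a prior of 0.5 and a 0.95 target, the required budget is about 0.49 nats, so the first call abstains and the second answers.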

What we're releasing today (ntkMirror): a training-free implementation of that gate for local open-weight models. It scores each claim under multiple evidence orderings (order-marginal verifier, exact tied-branch scoring), computes ISR from the per-permutation probabilities, and gates answer/abstain. No fine-tuning, no second model, runs on your own weights offline. We also ship a fused kernel that batches the permutation forwards: bit-identical to the naive loop at fp32, 2.6–10× faster.
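The scoring step can be sketched roughly as follows. This is an illustration, not the ntkMirror API: `score_over_orderings`, `score_fn`, `max_perms` and the toy scorer are hypothetical names, and a plain average over orderings stands in for the order-marginal verifier. Exact tied-branch scoring and the fused kernel are not reproduced here.

```python
import itertools
import math
import random
import statistics


def score_over_orderings(claim, evidence, score_fn, max_perms=24, seed=0):
    """Score one claim under several evidence orderings.

    score_fn(claim, ordered_evidence) -> probability is a HYPOTHETICAL
    callable wrapping a local model; it is not part of any real library.
    """
    evidence = list(evidence)
    if math.factorial(len(evidence)) <= max_perms:
        # Few enough passages: enumerate every ordering.
        orderings = [list(p) for p in itertools.permutations(evidence)]
    else:
        # Otherwise sample random orderings with a fixed seed.
        rng = random.Random(seed)
        orderings = [rng.sample(evidence, len(evidence)) for _ in range(max_perms)]
    probs = [score_fn(claim, order) for order in orderings]
    return {
        "probs": probs,
        "mean": statistics.fmean(probs),          # order-averaged probability
        "dispersion": statistics.pstdev(probs),   # spread across orderings
    }


def toy_score_fn(claim, ordered_evidence):
    """Toy stand-in for a model: confidence drops the later the supporting passage appears."""
    return 0.9 - 0.2 * ordered_evidence.index("supports")


result = score_over_orderings("claim", ["supports", "neutral", "distractor"], toy_score_fn)
print(len(result["probs"]), round(result["mean"], 3), round(result["dispersion"], 3))
```

A nonzero `dispersion` is the order sensitivity the gate is built around; the per-permutation `probs` are what the ISR computation would consume.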

New results (not in the paper): run as a hallucination detector on small local models, AUROC on VitaminC / BoolQ / SciFact is:

Model	VitaminC	BoolQ	SciFact
Qwen2.5-0.5B	0.78	0.69	0.80
Qwen2.5-1.5B	0.69	0.78	0.91
Gemma E4B	0.88	0.84	0.96
Qwen2.5-7B	0.90	0.87	0.94

Separation generally improves with model size, though not monotonically (Qwen2.5-1.5B trails the 0.5B model on VitaminC, 0.69 vs 0.78), and is strongest on SciFact and the larger models. Used as a gate on balanced data, the grounded fraction of accepted claims rises from 50% to roughly 75–90% depending on model/dataset, at the cost of dropping ~10–20% of valid claims. The kernel doesn't affect accuracy (AUROC gap ≤0.008); it just makes the gate cheap.
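For readers who want to check these two metrics on their own runs, here is a small sketch of how they relate. AUROC is the probability that a grounded claim scores above an ungrounded one, and the grounded fraction of accepted claims follows from the acceptance rates on each class. The function names and all numeric rates below are hypothetical illustrations, not figures from our evaluation.

```python
def auroc(grounded_scores, ungrounded_scores):
    """Probability that a grounded claim outscores an ungrounded one (ties count 0.5)."""
    wins = 0.0
    for g in grounded_scores:
        for u in ungrounded_scores:
            wins += 1.0 if g > u else 0.5 if g == u else 0.0
    return wins / (len(grounded_scores) * len(ungrounded_scores))


def grounded_fraction(keep_rate_valid, keep_rate_invalid, prevalence=0.5):
    """Share of accepted claims that are grounded.

    keep_rate_valid:   fraction of valid claims the gate accepts
    keep_rate_invalid: fraction of invalid claims the gate accepts
    prevalence:        fraction of valid claims in the data (0.5 = balanced)
    """
    kept_valid = keep_rate_valid * prevalence
    kept_invalid = keep_rate_invalid * (1 - prevalence)
    return kept_valid / (kept_valid + kept_invalid)


# Toy scores: three of the four grounded/ungrounded pairs are ranked correctly.
print(auroc([0.9, 0.8], [0.1, 0.85]))          # 0.75

# HYPOTHETICAL operating points on balanced data, chosen only to show the arithmetic:
print(round(grounded_fraction(0.9, 0.3), 3))   # drop 10% of valid claims -> 0.75
print(round(grounded_fraction(0.8, 0.1), 3))   # drop 20% of valid claims -> 0.889
```

With no gate (both keep rates equal to 1.0) the grounded fraction on balanced data is 0.5, which is the 50% baseline quoted above.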

Please let me know if you find it useful: https://github.com/leochlon/ntkmirror

submitted by /u/Upset-Presentation28