Our ICML paper on predictable hallucination (information-budget abstention gate), + ntkMirror: a training-free open-weight implementation we're releasing today

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Our paper, Predictable Compression Failures: Order Sensitivity and Information Budgeting for Evidence-Grounded Binary Adjudication, was accepted at ICML 2026. Paper: https://arxiv.org/abs/2509.11208

The idea: in evidence-grounded QA, the order in which you present exchangeable evidence changes the model's answer probability (permutation dispersion). We treat order as a nuisance variable, derive the Expectation-level Decompression Law (EDFL), which relates the expected information budget to achievable reliability, and turn it into a fixed ISR=1 (information sufficiency ratio) answer/abstain gate whose boundary comes from the theory rather than from threshold tuning. When the information is insufficient, the model abstains instead of guessing. In the paper's pre-specified held-out audit, the gate reaches 0.0–0.7% hallucination at ~24% abstention (80.5% accuracy on attempts).
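To make "permutation dispersion" concrete, here is a minimal sketch of how one might measure it. It is not code from the paper or from ntkMirror: `score` is a hypothetical callable standing in for whatever returns the model's probability that a claim is supported given evidence in a particular order, and `toy_score` is a made-up stand-in so the example runs without a model.

```python
import itertools
import math
import random
import statistics
from typing import Callable, List, Sequence


def orderings(evidence: Sequence[str], max_perms: int = 24, seed: int = 0) -> List[List[str]]:
    """Return every ordering if there are few, otherwise a random sample of orderings."""
    if math.factorial(len(evidence)) <= max_perms:
        return [list(p) for p in itertools.permutations(evidence)]
    rng = random.Random(seed)
    # Random shuffles; duplicates are possible but harmless for a rough estimate.
    return [rng.sample(list(evidence), len(evidence)) for _ in range(max_perms)]


def permutation_dispersion(
    claim: str,
    evidence: Sequence[str],
    score: Callable[[str, Sequence[str]], float],  # hypothetical model scorer
    max_perms: int = 24,
) -> dict:
    """Score one claim under several evidence orderings and summarize the spread."""
    probs = [score(claim, order) for order in orderings(evidence, max_perms)]
    return {
        "probs": probs,
        "mean": statistics.fmean(probs),    # order-marginal estimate
        "stdev": statistics.pstdev(probs),  # dispersion across orderings
        "range": max(probs) - min(probs),
    }


def toy_score(claim: str, ordered_evidence: Sequence[str]) -> float:
    """Made-up scorer: pretends the model over-weights whichever snippet comes first."""
    key = claim.split()[0].lower()
    return 0.9 if key in ordered_evidence[0].lower() else 0.6


evidence = [
    "Paris has been the French capital for centuries.",
    "France is in Western Europe.",
    "The Seine runs through the city.",
]
summary = permutation_dispersion("Paris is the capital of France", evidence, toy_score)
print(summary["mean"], summary["stdev"], summary["range"])
```

If the evidence really is exchangeable, an order-insensitive model would give a range of zero; any spread is the nuisance variation the post describes.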

What we're releasing today (ntkMirror): a training-free implementation of that gate for local open-weight models. It scores each claim under multiple evidence orderings (order-marginal verifier, exact tied-branch scoring), computes ISR from the per-permutation probabilities, and gates answer/abstain. No fine-tuning and no second model; it runs offline on your own weights. We also ship a fused kernel that batches the permutation forward passes: bit-identical to the naive loop at fp32 and 2.6–10× faster.
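The post does not spell out the ISR formula, so the following is only an illustrative sketch of what a fixed ISR=1 gate could look like, not ntkMirror's API or the paper's exact definition. It assumes ISR is a ratio of "information available" to "information required": the available budget is taken as the mean clipped log-ratio between each per-ordering probability and an evidence-free prior, and the requirement is a Bernoulli KL divergence between a target reliability and that prior. The names `p_prior`, `target_reliability`, and `isr_gate` are all hypothetical.

```python
import math
from typing import Sequence, Tuple


def bernoulli_kl(p: float, q: float, eps: float = 1e-12) -> float:
    """KL(Ber(p) || Ber(q)) in nats, with inputs clamped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def isr_gate(
    p_perms: Sequence[float],         # claim probability under each evidence ordering
    p_prior: float,                   # ASSUMED evidence-free baseline probability
    target_reliability: float = 0.95, # ASSUMED reliability target
    eps: float = 1e-12,
) -> Tuple[float, str]:
    """Return (ISR, decision) under the assumed budget / requirement definition."""
    # Assumed budget: mean information gained over the prior, clipped at zero per ordering.
    budget = sum(
        max(0.0, math.log(max(p, eps) / max(p_prior, eps))) for p in p_perms
    ) / len(p_perms)
    # Assumed requirement: nats needed to move from the prior to the target reliability.
    if p_prior >= target_reliability:
        required = 0.0
    else:
        required = bernoulli_kl(target_reliability, p_prior)
    isr = math.inf if required == 0.0 else budget / required
    # The boundary is fixed at 1: no threshold is tuned on data.
    return isr, ("answer" if isr >= 1.0 else "abstain")


# Consistent support across orderings versus order-sensitive support.
print(isr_gate([0.90, 0.90, 0.90], p_prior=0.5))
print(isr_gate([0.90, 0.55, 0.60], p_prior=0.5))
```

Because the budget is averaged over orderings, a claim that only looks well supported under one lucky ordering is pulled toward abstention, which is the behavior the post describes.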

New results (not in the paper): AUROC when the gate is run as a hallucination detector on small local models, on VitaminC / BoolQ / SciFact:

| Model | VitaminC | BoolQ | SciFact |
|---|---|---|---|
| Qwen2.5-0.5B | 0.78 | 0.69 | 0.80 |
| Qwen2.5-1.5B | 0.69 | 0.78 | 0.91 |
| Gemma E4B | 0.88 | 0.84 | 0.96 |
| Qwen2.5-7B | 0.90 | 0.87 | 0.94 |
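
For anyone less familiar with the metric: AUROC is the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition. It assumes a higher score means "more likely grounded" and a label of 1 means grounded; the post does not say which direction ntkMirror uses, and the scores here are toy values, not results.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUROC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy scores only: two grounded (1) and two ungrounded (0) claims.
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
print(auroc(toy_scores, toy_labels))
```

A value of 0.5 is chance and 1.0 is perfect separation. The pairwise loop is quadratic, which is fine for a sanity check but not for large evaluation sets.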

Separation generally improves with model size, though not monotonically: Qwen2.5-1.5B trails the 0.5B model on VitaminC, and Gemma E4B edges out Qwen2.5-7B on SciFact. It is strongest on SciFact and on the larger models. Used as a gate on balanced data, the grounded fraction of accepted claims rises from 50% to roughly 75–90% depending on model/dataset, at the cost of dropping ~10–20% of valid claims. The kernel doesn't affect accuracy (AUROC gap ≤0.008); it just makes the gate cheap.
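The 50% to 75–90% figure is a precision-among-accepted number, and the arithmetic behind it is short. In the sketch below, `keep_valid` is the share of valid claims the gate lets through (0.8–0.9 if it drops 10–20%). `keep_invalid`, the share of ungrounded claims that slip through, is not given in the post, so the values used are illustrative assumptions chosen only to show how the reported range could arise.

```python
def grounded_fraction_accepted(
    keep_valid: float, keep_invalid: float, base_rate: float = 0.5
) -> float:
    """Fraction of accepted claims that are grounded, given per-class pass rates."""
    accepted_valid = base_rate * keep_valid
    accepted_invalid = (1.0 - base_rate) * keep_invalid
    return accepted_valid / (accepted_valid + accepted_invalid)


# No gate: everything passes, so balanced data stays at 50% grounded.
print(grounded_fraction_accepted(1.0, 1.0))
# Illustrative pass rates for ungrounded claims (assumed, not from the post).
print(grounded_fraction_accepted(keep_valid=0.9, keep_invalid=0.3))
print(grounded_fraction_accepted(keep_valid=0.8, keep_invalid=0.1))
```

On unbalanced data the same pass rates give a different accepted fraction, so `base_rate` matters when comparing against your own workload.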

Please let me know if you find it useful: https://github.com/leochlon/ntkmirror

submitted by /u/Upset-Presentation28