MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling
Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.
MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling
Abstract
MaxProof is a test-time scaling framework that enhances mathematical proof generation by combining multiple proof-oriented capabilities and using population-level search with tournament selection to achieve competitive performance on high-level mathematical competitions.
We present MaxProof, a population-level test-time scaling framework for competition-level mathematical proof in the MiniMax-M3 series. M3 first trains three proof-oriented capabilities -- proof generation, proof verification, and critique-conditioned proof repair -- using a defense-in-depth generative verifier engineered for low false-positive rate. These capabilities are merged into a single released M3 model. At test time, MaxProof treats the model as a generator, verifier, refiner, and ranker, searches over a population of candidate proofs, and returns one final proof through tournament selection. With MaxProof test-time scaling, the M3 model reaches 35/42 on IMO 2025 and 36/42 on USAMO 2026, exceeding the human gold-medal threshold on both.
Get this paper in your agent:
hf papers read 2606.13473 curl -LsSf https://hf.co/cli/install.sh | bash Models citing this paper 0
No model linking this paper
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper
More from Hugging Face Daily Papers
-
On the Limits of LLM Adaptability: Impact of Model-Internalized Priors on Annotation Task Performance
Jun 12
-
Rethinking Psychometric Evaluation of LLMs: When and Why Self-Reports Predict Behavior
Jun 12
-
See What I See, Know What I Think: Dense Latent Communication Across Heterogeneous Agents
Jun 12
-
Getting Better at Working With You: Compiling User Corrections into Runtime Enforcement for Coding Agents
Jun 12
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.