Hugging Face Daily Papers · · 5 min read

Lean Refactor: Multi-Objective Controllable Proof Optimization via Agentic Strategy Search

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

We present Lean Refactor, a plug-and-play retrieval-augmented agentic framework for multi-objective, controllable, and version-robust refactoring of Lean proofs. LLM-generated proofs are notoriously correct-but-verbose and brittle across library versions, yet existing refactoring works overlook three practical challenges: 1) Lean refactoring is natively multi-objective (proof length, compilation cost, and version compatibility are often in tension); 2) Lean repositories have fragile compatibility, whereas LLM releases are unaware of Lean/Mathlib versions; 3) Training-based pipelines require repeated fine-tuning with each new LLM release, scaling neither with model churn nor with Lean's release cycle. Lean Refactor steers a frozen agentic LLM with retrievals from a curated database of multi-objective refactoring strategies, each densely annotated with metadata such as supported Lean/Mathlib versions and expected compilation-cost reduction. Experiments show over 70% token-level compression on competition benchmarks, over 20% on research repositories, and up to 60% compilation-time reduction, outperforming prior work and Claude Code. Version-filtered retrieval further improves compression on the target Lean version, and refactored miniF2F proofs exhibit stronger zero-shot version transfer to future Lean releases than their unrefactored counterparts.</p>\n","updatedAt":"2026-05-22T12:30:00.145Z","author":{"_id":"64df97c628d5d234ce0bf83c","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64df97c628d5d234ce0bf83c/BIEH7Ep4ffytiAcUIPFe-.jpeg","fullname":"Wuyang Chen","name":"wuyangchen","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":1,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.8901482820510864},"editors":["wuyangchen"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/64df97c628d5d234ce0bf83c/BIEH7Ep4ffytiAcUIPFe-.jpeg"],"reactions":[],"isReport":false}}],"primaryEmailConfirmed":false,"paper":{"id":"2605.20244","authors":[{"_id":"6a104c0fa53a61ce2e422fb1","name":"Jialin Lu","hidden":false},{"_id":"6a104c0fa53a61ce2e422fb2","name":"Soonho Kong","hidden":false},{"_id":"6a104c0fa53a61ce2e422fb3","name":"Rodrigo Stehling","hidden":false},{"_id":"6a104c0fa53a61ce2e422fb4","name":"Kaiyu Yang","hidden":false},{"_id":"6a104c0fa53a61ce2e422fb5","name":"Zhangyang Wang","hidden":false},{"_id":"6a104c0fa53a61ce2e422fb6","name":"Weiran Sun","hidden":false},{"_id":"6a104c0fa53a61ce2e422fb7","user":{"_id":"64df97c628d5d234ce0bf83c","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64df97c628d5d234ce0bf83c/BIEH7Ep4ffytiAcUIPFe-.jpeg","isPro":false,"fullname":"Wuyang Chen","user":"wuyangchen","type":"user","name":"wuyangchen"},"name":"Wuyang Chen","status":"claimed_verified","statusLastChangedAt":"2026-05-22T15:59:12.233Z","hidden":false}],"publishedAt":"2026-05-18T00:00:00.000Z","submittedOnDailyAt":"2026-05-22T00:00:00.000Z","title":"Lean Refactor: Multi-Objective Controllable Proof Optimization via Agentic Strategy Search","submittedOnDailyBy":{"_id":"64df97c628d5d234ce0bf83c","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64df97c628d5d234ce0bf83c/BIEH7Ep4ffytiAcUIPFe-.jpeg","isPro":false,"fullname":"Wuyang Chen","user":"wuyangchen","type":"user","name":"wuyangchen"},"summary":"We present Lean Refactor, a plug-and-play retrieval-augmented agentic framework for multi-objective, controllable, and version-robust refactoring of Lean proofs. LLM-generated proofs are notoriously correct-but-verbose and brittle across library versions, yet existing refactoring works overlook three practical challenges: 1) Lean refactoring is natively multi-objective (proof length, compilation cost, and version compatibility are often in tension); 2) Lean repositories have fragile compatibility, whereas LLM releases are unaware of Lean/Mathlib versions; 3) Training-based pipelines require repeated fine-tuning with each new LLM release, scaling neither with model churn nor with Lean's release cycle. Lean Refactor steers a frozen agentic LLM with retrievals from a curated database of multi-objective refactoring strategies, each densely annotated with metadata such as supported Lean/Mathlib versions and expected compilation-cost reduction. Experiments show over 70% token-level compression on competition benchmarks, over 20% on research repositories, and up to 60% compilation-time reduction, outperforming prior work and Claude Code. Version-filtered retrieval further improves compression on the target Lean version, and refactored miniF2F proofs exhibit stronger zero-shot version transfer to future Lean releases than their unrefactored counterparts.","upvotes":1,"discussionId":"6a104c0fa53a61ce2e422fb8","projectPage":"https://arxiv.org/abs/2605.20244","ai_summary":"Lean Refactor presents a retrieval-augmented agentic framework that improves Lean proof refactoring by addressing multi-objective optimization, version compatibility, and scalability challenges through curated strategy databases and version-filtered retrieval.","ai_keywords":["retrieval-augmented agentic framework","multi-objective optimization","version-robust refactoring","Lean proofs","Mathlib versions","compilation cost reduction","token-level compression","zero-shot version transfer"]},"canReadDatabase":false,"canManagePapers":false,"canSubmit":false,"hasHfLevelAccess":false,"upvoted":false,"upvoters":[{"_id":"64df97c628d5d234ce0bf83c","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/64df97c628d5d234ce0bf83c/BIEH7Ep4ffytiAcUIPFe-.jpeg","isPro":false,"fullname":"Wuyang Chen","user":"wuyangchen","type":"user"}],"acceptLanguages":["en"],"dailyPaperRank":0,"markdownContentUrl":"https://huggingface.co/buckets/huggingchat/papers-content/resolve/2605/2605.20244.md"}">
Papers
arxiv:2605.20244

Lean Refactor: Multi-Objective Controllable Proof Optimization via Agentic Strategy Search

Published on May 18
· Submitted by
Wuyang Chen
on May 22
Authors:
,
,
,
,
,
,

Abstract

Lean Refactor presents a retrieval-augmented agentic framework that improves Lean proof refactoring by addressing multi-objective optimization, version compatibility, and scalability challenges through curated strategy databases and version-filtered retrieval.

AI-generated summary

We present Lean Refactor, a plug-and-play retrieval-augmented agentic framework for multi-objective, controllable, and version-robust refactoring of Lean proofs. LLM-generated proofs are notoriously correct-but-verbose and brittle across library versions, yet existing refactoring works overlook three practical challenges: 1) Lean refactoring is natively multi-objective (proof length, compilation cost, and version compatibility are often in tension); 2) Lean repositories have fragile compatibility, whereas LLM releases are unaware of Lean/Mathlib versions; 3) Training-based pipelines require repeated fine-tuning with each new LLM release, scaling neither with model churn nor with Lean's release cycle. Lean Refactor steers a frozen agentic LLM with retrievals from a curated database of multi-objective refactoring strategies, each densely annotated with metadata such as supported Lean/Mathlib versions and expected compilation-cost reduction. Experiments show over 70% token-level compression on competition benchmarks, over 20% on research repositories, and up to 60% compilation-time reduction, outperforming prior work and Claude Code. Version-filtered retrieval further improves compression on the target Lean version, and refactored miniF2F proofs exhibit stronger zero-shot version transfer to future Lean releases than their unrefactored counterparts.

Community

Paper author Paper submitter about 14 hours ago

We present Lean Refactor, a plug-and-play retrieval-augmented agentic framework for multi-objective, controllable, and version-robust refactoring of Lean proofs. LLM-generated proofs are notoriously correct-but-verbose and brittle across library versions, yet existing refactoring works overlook three practical challenges: 1) Lean refactoring is natively multi-objective (proof length, compilation cost, and version compatibility are often in tension); 2) Lean repositories have fragile compatibility, whereas LLM releases are unaware of Lean/Mathlib versions; 3) Training-based pipelines require repeated fine-tuning with each new LLM release, scaling neither with model churn nor with Lean's release cycle. Lean Refactor steers a frozen agentic LLM with retrievals from a curated database of multi-objective refactoring strategies, each densely annotated with metadata such as supported Lean/Mathlib versions and expected compilation-cost reduction. Experiments show over 70% token-level compression on competition benchmarks, over 20% on research repositories, and up to 60% compilation-time reduction, outperforming prior work and Claude Code. Version-filtered retrieval further improves compression on the target Lean version, and refactored miniF2F proofs exhibit stronger zero-shot version transfer to future Lean releases than their unrefactored counterparts.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.
Tap or paste here to upload images

· Sign up or log in to comment

Get this paper in your agent:

hf papers read 2605.20244
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2605.20244 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2605.20244 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2605.20244 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.

Discussion (0)

Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.

Sign in →

No comments yet. Sign in and be the first to say something.

More from Hugging Face Daily Papers