arXiv: 2605.11518 · University of Notre Dame
Authors: Taicheng Guo, Nitesh V. Chawla, Olaf Wiest, Xiangliang Zhang
GitHub: https://github.com/taichengguo/AutoLLMResearch
AutoLLMResearch: Training Research Agents for Automating LLM Experiment Configuration -- Learning from Cheap, Optimizing Expensive
Abstract
An agentic framework called AutoLLMResearch automates high-cost large language model experiment configurations by learning from multi-fidelity experimental environments and enabling efficient configuration identification through cross-fidelity extrapolation.
AI-generated summary
Effectively configuring scalable large language model (LLM) experiments, spanning architecture design, hyperparameter tuning, and beyond, is crucial for advancing LLM research, as poor configuration choices can waste substantial computational resources and prevent models from realizing their full potential. Prior automated methods are designed for low-cost settings where repeated trial and error is feasible, but scalable LLM experiments are too expensive for such extensive iteration. To our knowledge, no work has addressed the automation of high-cost LLM experiment configurations, leaving this problem labor-intensive and dependent on expert intuition. Motivated by this gap, we propose AutoLLMResearch, an agentic framework that mimics how human researchers learn generalizable principles from low-fidelity experiments and extrapolate from them to efficiently identify promising configurations in expensive LLM settings. The core challenge is enabling an agent to learn through interaction with a multi-fidelity experimental environment that captures the structure of the LLM configuration landscape. To achieve this, we propose a systematic framework with two key components: 1) LLMConfig-Gym, a multi-fidelity environment encompassing four critical LLM experiment tasks, supported by over one million GPU hours of verifiable experiment outcomes; 2) a structured training pipeline that formulates configuration research as a long-horizon Markov Decision Process and accordingly incentivizes cross-fidelity extrapolation reasoning. Extensive evaluation against diverse strong baselines on held-out experiments demonstrates the effectiveness, generalization, and interpretability of our framework, supporting its potential as a practical and general solution for scalable real-world LLM experiment automation.
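The abstract's framing, a long-horizon MDP over a multi-fidelity environment where cheap low-fidelity runs inform a few expensive high-fidelity ones, can be sketched in miniature as below. This is an illustrative toy, not the paper's LLMConfig-Gym API: the class and method names (`MultiFidelityConfigEnv`, `step`, `cheap_then_expensive`), the cost and noise numbers, and the hidden objective are all assumptions made for the sake of the example.

```python
import random

class MultiFidelityConfigEnv:
    """Toy multi-fidelity configuration environment (illustrative only).

    A hypothetical stand-in for an LLMConfig-Gym-style setting: the agent
    proposes a scalar configuration x and a fidelity level; low-fidelity
    evaluations are cheap but noisy, high-fidelity ones are expensive but
    accurate. The episode ends when the budget runs out.
    """

    COSTS = {"low": 1.0, "high": 10.0}   # assumed relative costs (e.g. GPU hours)
    NOISE = {"low": 0.30, "high": 0.02}  # assumed observation noise per fidelity

    def __init__(self, budget=30.0, seed=0):
        self.budget = budget
        self.rng = random.Random(seed)
        self.history = []  # the MDP state: all (x, fidelity, score) observed so far

    def _true_score(self, x):
        # Hidden objective; the best configuration sits at x = 0.7.
        return 1.0 - (x - 0.7) ** 2

    def step(self, x, fidelity):
        """One MDP transition: run an experiment, pay its cost, observe a score."""
        cost = self.COSTS[fidelity]
        if cost > self.budget:
            raise ValueError("insufficient budget for this fidelity")
        self.budget -= cost
        score = self._true_score(x) + self.rng.gauss(0.0, self.NOISE[fidelity])
        self.history.append((x, fidelity, score))
        done = self.budget < min(self.COSTS.values())
        return score, done


def cheap_then_expensive(env, candidates):
    """Learn from cheap, optimize expensive: screen every candidate at low
    fidelity, then spend one expensive run verifying the apparent best."""
    screened = [(env.step(x, "low")[0], x) for x in candidates]
    best_x = max(screened)[1]
    final_score, _ = env.step(best_x, "high")
    return best_x, final_score
```

A trained agent would replace the fixed screen-then-verify policy with learned cross-fidelity extrapolation, deciding adaptively which configuration and fidelity to query next given the history; the environment interface stays the same.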