Hugging Face Daily Papers · May 29, 2026 · 5 min read

LiteCoder-Terminal: Scaling Long-Horizon Terminal Environments for Learning Language Agents

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

arXiv:2605.29559 · Code: https://github.com/icip-cas/LiteCoder

Published on May 28 · Submitted by Xinyu Lu on May 29

Authors: Xiaoxuan Peng, Kaiqi Zhang, Xinyu Lu, Boxi Cao, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun

Abstract

AI-generated summary: LiteCoder-Terminal-Gen enables scalable training of language agents for terminal environments through synthetic, executable environments that outperform traditional methods.

Mastering terminal environments requires language agents capable of multi-step planning, feedback-grounded execution, and dynamic state adaptation. However, training such agents is currently bottlenecked by a reliance on scraped external repositories, which limits domain diversity, environment controllability, and the targeting of specific capability deficits. We introduce LiteCoder-Terminal-Gen, a zero-dependency synthesis pipeline that autonomously generates executable and verifiable terminal training environments directly from domain specifications. Using this framework, we construct two large-scale resources: LiteCoder-Terminal-SFT, comprising 11,255 expert trajectories across 10 domains, and LiteCoder-Terminal-RL, featuring 602 verifiable environments for trajectory-level preference optimization. Supervised fine-tuning of Qwen-family models on our SFT dataset yields agents that significantly outperform their base counterparts. Notably, our 32B variant achieves 29.06%, 18.54%, and 34.00% pass@1 on Terminal Bench 1.0, 2.0, and Pro, respectively. Furthermore, applying Direct Multi-turn Preference Optimization (DMPO) on our RL environments yields additional performance gains. These results systematically demonstrate that fully synthetic, executable environments offer a scalable and verifiable supervision signal for mastering complex, real-world command-line workflows.
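
For readers unfamiliar with the metric, the pass@1 figures quoted above measure how often a single attempt at a task succeeds, averaged over the benchmark. The Python sketch below shows one common way to compute it, using the widely used unbiased pass@k estimator, which reduces to the plain success fraction when k = 1. It is an illustration only: the abstract does not describe the authors' evaluation harness, and the function names and the toy results dictionary are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one task.

    n: number of attempts sampled for the task
    c: number of those attempts that passed the task's checks
    k: attempt budget being scored (k = 1 gives pass@1)
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so every size-k subset contains a pass.
        return 1.0
    # 1 minus the probability that k attempts drawn without replacement all fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: dict[str, list[bool]], k: int = 1) -> float:
    """Average the per-task pass@k over every task in a benchmark."""
    scores = [pass_at_k(len(runs), sum(runs), k) for runs in results.values()]
    return sum(scores) / len(scores)


# Hypothetical outcomes: task id -> pass/fail for each sampled attempt.
toy_results = {
    "task-a": [True, False, False, False],
    "task-b": [False, False, False, False],
    "task-c": [True, True, True, True],
}

print(f"pass@1 = {benchmark_pass_at_k(toy_results, k=1):.4f}")
```

With a single attempt per task (n = 1), the estimator is simply 1 for a solved task and 0 otherwise, so the benchmark score is the fraction of tasks solved.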