From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills
Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.
Abstract
Language agents benefit from reusable skills that encode domain-specific procedures, but the effectiveness of those skills varies significantly with the model that extracts them and the model that consumes them, so careful evaluation and meta-skill guidance are needed to optimize performance.
Language agents increasingly improve by reusing skills -- structured procedural artifacts distilled from past experience. Domain-level and model-generated skills are especially promising: they offer fast adaptation within a domain by encoding recurring domain-specific procedures, and they scale beyond labor-intensive hand-crafting. However, while extraction methods continue to proliferate, understanding of them remains limited. No comprehensive study spans the full skill lifecycle -- experience generation, skill extraction, and skill consumption -- to ask whether such skills actually work, when they work, and what makes them succeed or fail. To close this gap, we build a utility-grounded evaluation framework that provides systematic experimental results across extractors and target agents, covering five diverse agentic task domains. We find that model-generated skills are beneficial on average but exhibit non-trivial negative transfer, and that neither extractors nor targets behave uniformly. A model can be a strong extractor yet a weak consumer, or vice versa, and skill utility is independent of model scale and of baseline task strength. To explain these patterns, we then dissect each lifecycle stage in depth, analyzing how experience composition shapes skill quality, what properties characterize useful skills, and how the same skill transfers across different consumers. Finally, we translate these findings into a concrete meta-skill that guides skill extraction toward the features tied to actual utility; this meta-skill consistently improves skill quality across domains and substantially reduces negative transfer.
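The abstract does not spell out how utility is computed, so the following is only a minimal sketch of one plausible reading: utility as the change in a target agent's task score when a skill is supplied, and negative transfer as the fraction of runs where that change is below zero. The record layout, the model and domain names, the `utility` and `summarize` helpers, and every number are hypothetical placeholders, not the paper's data or method.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (extractor, target, domain, baseline_score, score_with_skill).
# Scores are integer success percentages. All names and numbers are made-up
# placeholders for illustration, NOT results from the paper.
RUNS = [
    ("model_a", "model_a", "web", 50, 60),
    ("model_a", "model_b", "web", 40, 35),
    ("model_b", "model_a", "web", 50, 50),
    ("model_b", "model_b", "web", 40, 55),
]


def utility(baseline, with_skill):
    """Assumed utility: score change caused by giving the target the skill."""
    return with_skill - baseline


def summarize(runs, key_index):
    """Group runs by one field (0 = extractor, 1 = target) and report the
    mean utility and the share of runs with negative utility."""
    groups = defaultdict(list)
    for run in runs:
        groups[run[key_index]].append(utility(run[3], run[4]))
    return {
        key: {
            "mean_utility": mean(deltas),
            "negative_rate": sum(d < 0 for d in deltas) / len(deltas),
        }
        for key, deltas in groups.items()
    }


by_extractor = summarize(RUNS, 0)
by_target = summarize(RUNS, 1)
print(by_extractor)
print(by_target)
```

Summarizing the same runs once by extractor and once by target is what lets the two roles be compared separately: in this toy data, model_b has the higher mean utility as an extractor, while the only negative-transfer case occurs with model_b as the consumer.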