r/LocalLLaMA · · 1 min read

SkillOpt treats markdown skill files as trainable parameters with proper optimization machinery

A recent paper formalizes something a lot of agent builders have been doing ad hoc. The authors use a frontier model to propose bounded edits (add/delete/replace) to markdown skill files, then gate every edit against a held-out validation set. Only strict improvements are accepted, ties are rejected, and rejected edits become negative signal for the next round.
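
For anyone who wants to picture the gate, here is a minimal sketch of an accept/reject loop along those lines. It is not the paper's code: `optimize_skill`, `propose`, and `score` are hypothetical names, `propose` stands in for the frontier model call, and `score` stands in for the held-out validation run.

```python
from typing import Callable, List, Tuple


def optimize_skill(
    skill: str,
    propose: Callable[[str, List[str]], str],
    score: Callable[[str], float],
    rounds: int,
) -> Tuple[str, List[str]]:
    """Greedy gated search over skill text (illustrative sketch only).

    propose(skill, rejected) returns a candidate skill; the rejected list is
    passed back in so the proposer can treat it as negative signal.
    score(skill) returns the validation score, higher is better.
    """
    best_score = score(skill)
    rejected: List[str] = []
    for _ in range(rounds):
        candidate = propose(skill, rejected)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            # Strict improvement only: a tie falls through to the else branch.
            skill, best_score = candidate, candidate_score
        else:
            rejected.append(candidate)
    return skill, rejected
```

The strict `>` is the whole point of the gate: a candidate that merely ties the current score is logged as rejected rather than kept.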

A few things worth noting:

The best skills converge after 1 to 4 accepted edits out of many more proposals. An edit budget of 4 to 8 per step works best; remove the cap and performance collapses. The median final skill is ~920 tokens.
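
To make the edit budget concrete, here is a rough sketch of applying a capped batch of add/delete/replace edits to a skill file. The line-based `Edit` format and the `apply_edits` helper are assumptions for illustration; the post does not say how the paper actually represents edits.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Edit:
    op: str         # "add", "delete", or "replace" (assumed edit vocabulary)
    line: int       # 0-based index into the original skill's lines
    text: str = ""  # new text for "add" and "replace"


def apply_edits(skill: str, edits: List[Edit], budget: int = 8) -> str:
    """Apply at most `budget` line-level edits; refuse larger batches."""
    if len(edits) > budget:
        raise ValueError(f"{len(edits)} edits exceeds budget of {budget}")
    lines = skill.split("\n")
    # Work from the bottom up so indices of earlier lines stay valid.
    for e in sorted(edits, key=lambda e: e.line, reverse=True):
        if e.op == "add":
            lines.insert(e.line, e.text)  # inserts before the given index
        elif e.op == "delete":
            del lines[e.line]
        elif e.op == "replace":
            lines[e.line] = e.text
        else:
            raise ValueError(f"unknown op: {e.op}")
    return "\n".join(lines)
```

The default of 8 just mirrors the upper end of the 4 to 8 range mentioned above.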

A skill optimized on Codex transferred to Claude Code with zero modification and gained +59.7 on SpreadsheetBench. GPT-4.1 nano with an optimized skill also roughly matched frontier models on procedural benchmarks.

The limitation is that the validation gate requires an auto-grader with clear correct answers. It works for code and spreadsheets but breaks for anything open-ended.
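
As a sketch of why the grader matters, the simplest possible gate is an exact-match score over (task, expected answer) pairs. `exact_match_score` and `run_task` are hypothetical names; `run_task` stands in for running the agent with the candidate skill loaded.

```python
from typing import Callable, List, Tuple


def exact_match_score(
    run_task: Callable[[str], str],
    validation: List[Tuple[str, str]],
) -> float:
    """Fraction of validation tasks whose output equals the expected answer."""
    if not validation:
        raise ValueError("validation set is empty")
    correct = sum(
        1
        for task, expected in validation
        if run_task(task).strip() == expected.strip()
    )
    return correct / len(validation)
```

A check like this only makes sense when each task has one clearly correct answer, which is exactly why the approach struggles on open-ended work.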

Paper: https://arxiv.org/pdf/2605.23904

submitted by /u/agentic-doc
