SkillOpt treats markdown skill files as trainable parameters with proper optimization machinery
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
| Paper came out recently that formalizes something a lot of agent builders have been doing ad hoc. They use a frontier model to propose bounded edits (add/delete/replace) to markdown skill files, then gate every edit against a held out validation set. Only strict improvements accepted, ties rejected, rejected edits become negative signal for the next round. Few things worth noting: Best skills converge with 1 to 4 accepted edits out of many more proposals. Edit budget of 4 to 8 per step works best, remove the cap and performance collapses. Median final skill is ~920 tokens. A skill optimized on Codex transferred to Claude Code with zero modification and gained +59.7 on SpreadsheetBench. And GPT 4.1 nano with an optimized skill roughly matched frontier on procedural benchmarks. The limitation is the validation gate requires an auto grader with clear correct answers. Works for code and spreadsheets, breaks for anything open ended. [link] [comments] |
More from r/LocalLLaMA
-
Qwen3.5 27B Uncensored Heretic Native MTP Preserved is Out Now With the Full 15 MTPs Preserved and Retained, Available in Safetensors, GGUFs, NVFP4, NVFP4 GGUFs and GPTQ-Int4 Formats!
May 26
-
Strix Halo users, a rejected PR can give you up to 30% faster PP for MOEs.
May 26
-
I finally put my NPU (Intel Arrow Lake) to use doing ASR for my smart home
May 26
-
Qwen3.5 35B A3B uncensored heretic Native MTP Preserved is Out Now With the Full 785 MTPs Preserved and Retained, Available in Safetensors, GGUFs. NVFP4, NVFP4 GGUFs and GPTQ-Int4 Formats
May 26
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.