Learn-by-Wire Training Control Governance: Bounded Autonomous Training Under Stress for Stability and Efficiency
Author: Anis Radianis
Organization: Qluon
Published: 2026-05-18
arXiv: 2605.19008
Code: https://github.com/Qluon/LBW-Guard
Abstract
Modern language-model training is increasingly exposed to instability, degraded runs, and wasted compute, especially under aggressive learning-rate, scale, and runtime-stress conditions. This paper introduces Learn-by-Wire Guard (LBW-Guard), a bounded autonomous training-control governance layer that operates above AdamW. Rather than replacing the optimizer update rule, LBW-Guard observes training telemetry, interprets instability-sensitive regimes, and applies bounded control to optimizer execution while preserving fixed training objectives.
We evaluate LBW-Guard in a Qwen2.5-centered stress-and-robustness suite using WikiText-103, with Qwen2.5-7B as the empirical anchor, model-size comparisons against Qwen2.5-3B and Qwen2.5-14B, learning-rate stress tests, gradient-clipping baselines, and a no-LoRA TinyLlama-1B full-parameter sanity check. In the 7B reference setting, LBW-Guard reduces final perplexity from 13.21 to 10.74, an 18.7% improvement, while reducing end-to-end time from 392.54s to 357.02s, a 1.10x speedup. Under stronger learning-rate stress, AdamW degrades to 1885.24 final perplexity at LR=3e-3 and 659.76 at LR=1e-3, whereas LBW-Guard remains trainable at 11.57 and 10.33, respectively. Gradient-clipping baselines do not reproduce this effect.
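The headline percentages in the 7B reference setting follow directly from the reported raw numbers. The short check below recomputes them; the variable and function names are illustrative and do not come from the paper or its code.

```python
# Figures reported in the abstract for the Qwen2.5-7B reference setting.
ppl_adamw, ppl_guard = 13.21, 10.74          # final perplexity
time_adamw_s, time_guard_s = 392.54, 357.02  # end-to-end time in seconds


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


def speedup(baseline_time: float, improved_time: float) -> float:
    """How many times faster the improved run is than the baseline run."""
    return baseline_time / improved_time


ppl_reduction = relative_reduction(ppl_adamw, ppl_guard)
time_speedup = speedup(time_adamw_s, time_guard_s)

print(f"perplexity reduction: {ppl_reduction:.1%}")  # 18.7%
print(f"speedup: {time_speedup:.2f}x")               # 1.10x

# Ratio of AdamW to LBW-Guard final perplexity under learning-rate stress.
stress_ratio = {
    "LR=3e-3": 1885.24 / 11.57,
    "LR=1e-3": 659.76 / 10.33,
}
for lr_label, ratio in stress_ratio.items():
    print(f"{lr_label}: AdamW perplexity is {ratio:.1f}x that of LBW-Guard")
```

Note that the 1.10x speedup is a ratio of wall-clock times, so it corresponds to roughly a 9% reduction in end-to-end time rather than a 10% one.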
These results support a scoped systems conclusion that stability-sensitive LLM training can benefit from a governance plane above the optimizer. LBW-Guard provides evidence that bounded runtime control can preserve productive compute under stress while remaining distinct from optimizer replacement and local gradient suppression.
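The abstract does not spell out LBW-Guard's control law, so the following is only a hypothetical sketch of what "bounded control above the optimizer" could look like in general. It is not the paper's algorithm and not the API of the LBW-Guard package. Every name here (`BoundedLRGovernor`, `spike_ratio`, `backoff`, and so on) is an assumption made for illustration. The controller reads one telemetry signal (the training loss), flags a destabilizing regime when the loss jumps well above its running average or becomes non-finite, and responds by scaling the learning rate inside fixed bounds. The optimizer's update rule and the training objective are left untouched.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class BoundedLRGovernor:
    """Hypothetical controller (not LBW-Guard): bounded LR scaling from loss telemetry."""

    base_lr: float
    min_scale: float = 0.1    # lower bound on the LR multiplier
    spike_ratio: float = 1.5  # loss / EMA ratio treated as a destabilizing regime
    backoff: float = 0.5      # multiplicative cut applied when a spike is seen
    recovery: float = 1.05    # multiplicative growth applied on calm steps
    ema_decay: float = 0.9    # smoothing factor for the running loss average
    scale: float = 1.0        # current multiplier, always in [min_scale, 1.0]
    loss_ema: Optional[float] = None

    def observe(self, loss: float) -> float:
        """Consume one loss reading and return the LR to use for the next step."""
        finite = math.isfinite(loss)
        if self.loss_ema is None and finite:
            self.loss_ema = loss
        spiking = (not finite) or (
            self.loss_ema is not None and loss > self.spike_ratio * self.loss_ema
        )
        if spiking:
            # Back off, and keep the spike out of the running average.
            self.scale *= self.backoff
        else:
            self.scale *= self.recovery
            self.loss_ema = (
                self.ema_decay * self.loss_ema + (1 - self.ema_decay) * loss
            )
        # The control is bounded: it never exceeds the base LR or drops below the floor.
        self.scale = min(1.0, max(self.min_scale, self.scale))
        return self.base_lr * self.scale


# Usage: feed per-step losses, apply the returned LR to the optimizer before stepping.
governor = BoundedLRGovernor(base_lr=3e-3)
lrs = [governor.observe(loss) for loss in [2.0, 1.9, 1.8, 6.0, float("nan"), 1.8]]
print(lrs)
```

In a PyTorch training loop the returned value would typically be written into each entry of `optimizer.param_groups` under the `"lr"` key before calling `optimizer.step()`. Unlike gradient clipping, a controller of this kind acts on optimizer execution across steps rather than rescaling a single gradient, which is the distinction the abstract draws.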
Community
LBW-Guard is a bounded training-control governance layer above AdamW. It observes training telemetry and applies corrective control when destabilizing regimes form, which makes it distinct from gradient clipping. Benchmarked on Qwen2.5-3B/7B/14B and TinyLlama-1B: an 18.7% perplexity reduction and a 1.10x speedup in the 7B reference setting, and stable training at LR=3e-3, where AdamW degrades to a final perplexity of 1885.24. A PyPI package and a live HF Space are available.