<a href=\"https://cdn-uploads.huggingface.co/production/uploads/648905d1a15c43c791d4381f/OEFNSEfiqPmctgE1dJWuu.png\" rel=\"nofollow\"><img src=\"https://cdn-uploads.huggingface.co/production/uploads/648905d1a15c43c791d4381f/OEFNSEfiqPmctgE1dJWuu.png\" alt=\"image\"></a></p>\n","updatedAt":"2026-05-13T09:24:14.521Z","author":{"_id":"648905d1a15c43c791d4381f","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/648905d1a15c43c791d4381f/GpqGBzsLiMHX0gWZEz3qn.jpeg","fullname":"Weiyang Liu","name":"wy1iu","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":8,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.5396015644073486},"editors":["wy1iu"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/648905d1a15c43c791d4381f/GpqGBzsLiMHX0gWZEz3qn.jpeg"],"reactions":[],"isReport":false}}],"primaryEmailConfirmed":false,"paper":{"id":"2605.12492","authors":[{"_id":"6a0442e086b054ce2fa410d3","name":"Kexuan Shi","hidden":false},{"_id":"6a0442e086b054ce2fa410d4","name":"Hanxuan Li","hidden":false},{"_id":"6a0442e086b054ce2fa410d5","name":"Zeju Qiu","hidden":false},{"_id":"6a0442e086b054ce2fa410d6","name":"Yandong Wen","hidden":false},{"_id":"6a0442e086b054ce2fa410d7","name":"Simon Buchholz","hidden":false},{"_id":"6a0442e086b054ce2fa410d8","name":"Weiyang Liu","hidden":false}],"publishedAt":"2026-05-12T00:00:00.000Z","submittedOnDailyAt":"2026-05-13T00:00:00.000Z","title":"Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation","submittedOnDailyBy":{"_id":"648905d1a15c43c791d4381f","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/648905d1a15c43c791d4381f/GpqGBzsLiMHX0gWZEz3qn.jpeg","isPro":false,"fullname":"Weiyang Liu","user":"wy1iu","type":"user","name":"wy1iu"},"summary":"We introduce Pion, a spectrum-preserving optimizer for large language model (LLM) training based on orthogonal equivalence transformation. 
Unlike additive optimizers such as Adam and Muon, Pion updates each weight matrix through left and right orthogonal transformations, preserving its singular values throughout training. This yields an optimization mechanism that modulates the geometry of weight matrices while keeping their spectral norm fixed. We derive the Pion update rule, systematically examine its design choices, and analyze its convergence behavior along with several key properties. Empirical results show that Pion offers a stable and competitive alternative to standard optimizers for both LLM pretraining and finetuning.","upvotes":3,"discussionId":"6a0442e186b054ce2fa410d9","projectPage":"https://spherelab.ai/pion/","githubRepo":"https://github.com/Sphere-AI-Lab/pion","githubRepoAddedBy":"user","ai_summary":"Pion is a spectrum-preserving optimizer for large language model training that uses orthogonal equivalence transformations to maintain singular values during weight updates, offering stable performance comparable to standard optimizers.","ai_keywords":["orthogonal equivalence transformation","singular values","spectral norm","weight matrices","optimization mechanism","convergence behavior","orthogonal transformations","large language model training","additive optimizers","Adam","Muon"],"githubStars":9,"organization":{"_id":"6390c6fdd00f25601f445cd4","name":"CUHK-CSE","fullname":"The Chinese University of Hong Kong","avatar":"https://cdn-avatars.huggingface.co/v1/production/uploads/621f2eb36e152b56a7cf0248/o8RRAczRjfNEzq70GzUwQ.png"}},"canReadDatabase":false,"canManagePapers":false,"canSubmit":false,"hasHfLevelAccess":false,"upvoted":false,"upvoters":[{"_id":"648905d1a15c43c791d4381f","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/648905d1a15c43c791d4381f/GpqGBzsLiMHX0gWZEz3qn.jpeg","isPro":false,"fullname":"Weiyang 
Liu","user":"wy1iu","type":"user"},{"_id":"67a6fb00b8b21202c96dcbd5","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/uqyE3VcaRPwgofIriBgI_.png","isPro":false,"fullname":"Hanxuan Li","user":"lihanxuan","type":"user"},{"_id":"67cff3de91473f9c5ccf0fa8","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/67cff3de91473f9c5ccf0fa8/lyHDTF4foLuOHBPmKc2NH.jpeg","isPro":false,"fullname":"Kexuan Shi","user":"KexuanShi","type":"user"}],"acceptLanguages":["en"],"dailyPaperRank":0,"organization":{"_id":"6390c6fdd00f25601f445cd4","name":"CUHK-CSE","fullname":"The Chinese University of Hong Kong","avatar":"https://cdn-avatars.huggingface.co/v1/production/uploads/621f2eb36e152b56a7cf0248/o8RRAczRjfNEzq70GzUwQ.png"},"markdownContentUrl":"https://huggingface.co/buckets/huggingchat/papers-content/resolve/2605/2605.12492.md"}">
Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation
Abstract
We introduce Pion, a spectrum-preserving optimizer for large language model (LLM) training based on orthogonal equivalence transformation. Unlike additive optimizers such as Adam and Muon, Pion updates each weight matrix through left and right orthogonal transformations, preserving its singular values throughout training. This yields an optimization mechanism that modulates the geometry of weight matrices while keeping their spectral norm fixed. We derive the Pion update rule, systematically examine its design choices, and analyze its convergence behavior along with several key properties. Empirical results show that Pion offers a stable and competitive alternative to standard optimizers for both LLM pretraining and finetuning.
AI-generated summary
Pion is a spectrum-preserving optimizer for large language model training that uses orthogonal equivalence transformations to maintain singular values during weight updates, offering stable performance comparable to standard optimizers.
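The spectrum-preserving property described above can be illustrated numerically. The sketch below is not Pion's actual update rule (the paper derives that separately); it only demonstrates that an orthogonal equivalence transformation W → Q_L W Q_R leaves the singular values, and hence the spectral norm, of a weight matrix unchanged. The `cayley` helper, which builds orthogonal matrices from skew-symmetric generators, is an illustrative assumption, not something taken from the paper.

```python
import numpy as np

def cayley(A):
    # Cayley transform: maps a skew-symmetric matrix A to an
    # orthogonal matrix (I + A)^{-1}(I - A).
    I = np.eye(A.shape[0])
    return np.linalg.solve(I + A, I - A)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))  # a toy "weight matrix"

# Random skew-symmetric generators for the left/right orthogonal factors.
L = rng.standard_normal((4, 4)); L = L - L.T
R = rng.standard_normal((3, 3)); R = R - R.T
Q_left, Q_right = cayley(L), cayley(R)

# Orthogonal equivalence transformation: W -> Q_left @ W @ Q_right.
W_new = Q_left @ W @ Q_right

# Singular values (and therefore the spectral norm) are preserved.
print(np.allclose(np.linalg.svd(W, compute_uv=False),
                  np.linalg.svd(W_new, compute_uv=False)))  # True
```

Because both factors are orthogonal, this holds for any choice of generators; an additive update like Adam's, by contrast, changes the singular values in general.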
Paper: arxiv.org/abs/2605.12492