Geometric Latent Reasoning Induces Shorter Generations in LLMs
Authors: Shashi Kumar, Yacouba Kaloga, Petr Motlicek, Ina Kodrasi, Andrea Cavallaro
Organization: Idiap Research Institute
Published: 2026-06-01
arXiv: 2606.02248
Abstract
Geometric Latent Reasoning formulates latent reasoning as a geometric path-approximation problem in pretrained token-embedding space, enabling continuous intermediate reasoning states that reduce generation length while maintaining accuracy.
Large language models solve complex problems by generating lengthy chains of explicit reasoning tokens. Although effective, this makes reasoning expensive, length-sensitive, and constrained to discrete natural language. Latent reasoning offers a continuous alternative, but determining useful structures for intermediate latent states remains an open challenge. In this paper, we formulate latent reasoning as a geometric path-approximation problem within the model's pretrained token-embedding space. We introduce Geometric Latent Reasoning (GLR), which uses a lightweight transition head to predict iterative direction updates in embedding space. Using textual chain-of-thought traces as anchors, GLR learns to approximate discrete reasoning trajectories while permitting continuous deviations from exact token embeddings. Evaluations on mathematical reasoning benchmarks using Qwen3 models reveal an emergent phenomenon: geometric latent reasoning induces substantially shorter generations without an explicit length objective. By replacing early explicit reasoning with continuous latent steps, models often reach correct answers using substantially fewer total generation steps. These findings suggest that continuous trajectories act as compact intermediate reasoning states, exposing a new tradeoff between latent computation budget, output length, and accuracy.
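To make the idea of "iterative direction updates in embedding space" concrete, the following toy sketch rolls a latent state forward along predicted directions and scores the resulting path against anchor embeddings. It is not the authors' implementation: the single linear map standing in for the transition head, the fixed `step_size`, the unit-normalised direction, and the squared-distance `path_error` are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
import math


def unit(v):
    """Scale v to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def transition_head(state, weights):
    """Hypothetical stand-in for a transition head: one linear map from
    the current latent state to an unnormalised direction (assumption)."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def latent_rollout(start, weights, num_steps, step_size=0.5):
    """Move the latent state along the predicted direction num_steps times.
    The fixed step size is an illustrative assumption."""
    path = [list(start)]
    for _ in range(num_steps):
        direction = unit(transition_head(path[-1], weights))
        path.append([s + step_size * d for s, d in zip(path[-1], direction)])
    return path


def path_error(path, anchors):
    """Mean squared distance between each post-start latent state and the
    anchor embedding it is paired with (assumed training signal)."""
    total = 0.0
    for state, anchor in zip(path[1:], anchors):
        total += sum((s - a) ** 2 for s, a in zip(state, anchor))
    return total / len(anchors)


# Toy 2-D example: identity weights push the state along its own direction.
identity = [[1.0, 0.0], [0.0, 1.0]]
toy_path = latent_rollout([1.0, 0.0], identity, num_steps=2)
# Anchors play the role of chain-of-thought token embeddings; the first one
# is deliberately off the path, so the latent state deviates from it.
toy_anchors = [[1.5, 0.1], [2.0, 0.0]]
toy_error = path_error(toy_path, toy_anchors)
```

In this toy setting the latent states are free to sit near, rather than exactly on, the anchor embeddings, which mirrors the abstract's "continuous deviations from exact token embeddings". How the actual method parameterises the head, scales the update, or weights the anchors is not specified in the text above.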