Hugging Face Daily Papers · 5 min read

Geometric Latent Reasoning Induces Shorter Generations in LLMs

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

arxiv:2606.02248

Published on Jun 1, 2026 · Submitted by Shashi Kumar on Jun 2, 2026
Authors: Shashi Kumar, Yacouba Kaloga, Petr Motlicek, Ina Kodrasi, Andrea Cavallaro
Organization: Idiap Research Institute

Abstract

Geometric Latent Reasoning formulates latent reasoning as a geometric path-approximation problem in pretrained token-embedding space, enabling continuous intermediate reasoning states that reduce generation length while maintaining accuracy.

Large language models solve complex problems by generating lengthy chains of explicit reasoning tokens. While effective, this makes reasoning expensive, length-sensitive, and constrained to (discrete) natural language. While latent reasoning offers a continuous alternative, determining useful structures for intermediate latent states is an open challenge. In this paper, we formulate latent reasoning as a geometric path-approximation problem within the model's pretrained token-embedding space. We introduce Geometric Latent Reasoning (GLR), which uses a lightweight transition head to predict iterative direction updates in embedding space. Using textual chain-of-thought traces as anchors, GLR learns to approximate discrete reasoning trajectories while permitting continuous deviations from exact token embeddings. Evaluations on mathematical reasoning benchmarks using Qwen3 models reveal an emergent phenomenon: geometric latent reasoning induces substantially shorter generations without an explicit length objective. By replacing early explicit reasoning with continuous latent steps, models often reach correct answers using substantially fewer total generation steps. These findings suggest that continuous trajectories act as compact intermediate reasoning states, exposing a new tradeoff between latent computation budget, output length, and accuracy.
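
The abstract does not spell out GLR's equations, so the following is only a toy sketch of the general idea it describes: a small head predicts a direction, the latent state takes iterative steps in embedding space, and the resulting path is scored against anchor embeddings taken from a chain-of-thought trace. Everything concrete here is an assumption made for illustration and not the authors' method: the 2-D vectors, the names `transition_head`, `rollout`, `path_loss`, `ANCHORS`, and `ROTATE`, the unit-norm directions, the fixed step size, and the squared-distance loss.

```python
import math

# Hypothetical 2-D stand-ins for the embeddings of three chain-of-thought tokens.
ANCHORS = [(1.0, 1.0), (0.0, 2.0), (-1.0, 2.0)]


def transition_head(state, weights):
    """Toy linear head (an assumption): map the latent state to a unit direction."""
    (a, b), (c, d) = weights
    dx = a * state[0] + b * state[1]
    dy = c * state[0] + d * state[1]
    norm = math.hypot(dx, dy) or 1.0  # fall back to 1.0 to avoid dividing by zero
    return (dx / norm, dy / norm)


def rollout(start, weights, num_steps, step_size):
    """Apply iterative direction updates: z <- z + step_size * direction."""
    path, state = [], start
    for _ in range(num_steps):
        direction = transition_head(state, weights)
        state = (state[0] + step_size * direction[0],
                 state[1] + step_size * direction[1])
        path.append(state)
    return path


def path_loss(path, anchors):
    """Mean squared distance between each latent state and its anchor."""
    total = sum((p[0] - a[0]) ** 2 + (p[1] - a[1]) ** 2
                for p, a in zip(path, anchors))
    return total / len(anchors)


# Fixed weights that rotate the state by 90 degrees; a real head would be trained.
ROTATE = ((0.0, -1.0), (1.0, 0.0))
START = (1.0, 0.0)

latent_path = rollout(START, ROTATE, num_steps=len(ANCHORS), step_size=1.0)
loss = path_loss(latent_path, ANCHORS)

for step, (state, anchor) in enumerate(zip(latent_path, ANCHORS), start=1):
    print(f"step {step}: latent={state} anchor={anchor}")
print(f"path loss: {loss:.4f}")
```

In this toy run the first latent state lands on its anchor and the later ones drift slightly off theirs, so the loss is small but nonzero. That is one way to picture a path that approximates a discrete trajectory while being allowed to deviate from exact token embeddings. In training, the head's weights would be adjusted to reduce such a loss; that step is omitted here.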

Get this paper in your agent:

hf papers read 2606.02248
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash