Hugging Face Daily Papers · · 4 min read

Latent Reasoning with Normalizing Flows

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

NF-CoT introduces a latent reasoning framework for LLMs that models intermediate thoughts as continuous states using normalizing flows, preserving autoregressive decoding while improving performance and efficiency.</p>\n<p><a href=\"https://cdn-uploads.huggingface.co/production/uploads/6039478ab3ecf716b1a5fd4d/JGTy7j-3LEB5DLgb_ZOLO.png\" rel=\"nofollow\"><img src=\"https://cdn-uploads.huggingface.co/production/uploads/6039478ab3ecf716b1a5fd4d/JGTy7j-3LEB5DLgb_ZOLO.png\" alt=\"image\"></a></p>\n","updatedAt":"2026-06-05T02:23:01.722Z","author":{"_id":"6039478ab3ecf716b1a5fd4d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6039478ab3ecf716b1a5fd4d/_Thy4E7taiSYBLKxEKJbT.jpeg","fullname":"taesiri","name":"taesiri","type":"user","isPro":true,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":310,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.7678427696228027},"editors":["taesiri"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/6039478ab3ecf716b1a5fd4d/_Thy4E7taiSYBLKxEKJbT.jpeg"],"reactions":[],"isReport":false}}],"primaryEmailConfirmed":false,"paper":{"id":"2606.06447","authors":[{"_id":"6a2232e73490a593e87b1458","name":"Guancheng Tu","hidden":false},{"_id":"6a2232e73490a593e87b1459","name":"Xiangjun Fu","hidden":false},{"_id":"6a2232e73490a593e87b145a","name":"Suhao Yu","hidden":false},{"_id":"6a2232e73490a593e87b145b","name":"Yao Tang","hidden":false},{"_id":"6a2232e73490a593e87b145c","name":"Haoqiang Kang","hidden":false},{"_id":"6a2232e73490a593e87b145d","name":"Lianhui Qin","hidden":false},{"_id":"6a2232e73490a593e87b145e","name":"Yizhe Zhang","hidden":false},{"_id":"6a2232e73490a593e87b145f","name":"Jiatao Gu","hidden":false}],"publishedAt":"2026-06-04T00:00:00.000Z","submittedOnDailyAt":"2026-06-05T00:00:00.000Z","title":"Latent Reasoning with Normalizing Flows","submittedOnDailyBy":{"_id":"6039478ab3ecf716b1a5fd4d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6039478ab3ecf716b1a5fd4d/_Thy4E7taiSYBLKxEKJbT.jpeg","isPro":true,"fullname":"taesiri","user":"taesiri","type":"user","name":"taesiri"},"summary":"Large language models often improve reasoning by generating explicit chain-of-thought (CoT), demonstrating the importance of intermediate computation. However, textual CoT forces this computation through a discrete, serial, and communication-oriented token stream: each reasoning step must be verbalized before the model can proceed, even when the underlying update is semantic, uncertain, or only partially formed. Latent reasoning offers a higher-bandwidth alternative by performing intermediate computation in compact continuous states before committing to text. Yet existing latent-reasoning methods often sacrifice key advantages that make CoT effective in autoregressive language models, including native left-to-right generation, probabilistic sampling, compatibility with KV-cache decoding, and tractable likelihood estimation. We propose NF-CoT, a latent reasoning framework that preserves these advantages by modeling continuous thoughts with normalizing flows. NF-CoT instantiates a TARFlow-style normalizing flow inside the LLM backbone, defining a tractable probability model over compact continuous thoughts distilled from explicit CoT. Continuous-thought positions are generated by an NF head, while text positions are generated by the standard LM head within the same causal stream. This design provides exact likelihoods for latent thoughts, enables probabilistic left-to-right decoding with the original KV cache, and supports direct policy-gradient optimization in the latent reasoning space. On code-generation benchmarks, NF-CoT improves pass rates over explicit-CoT and prior latent-reasoning baselines while substantially reducing intermediate-reasoning cost.","upvotes":1,"discussionId":"6a2232e73490a593e87b1460","projectPage":"https://nf-cot.vercel.app/","ai_summary":"Latent reasoning framework using normalizing flows preserves autoregressive generation advantages while enabling efficient, probabilistic intermediate computation in large language models.","ai_keywords":["chain-of-thought","latent reasoning","normalizing flows","TARFlow","probabilistic sampling","KV-cache decoding","likelihood estimation","policy-gradient optimization","code-generation benchmarks"],"ai_summary_model":"Qwen/Qwen2.5-Coder-32B-Instruct"},"canReadDatabase":false,"canManagePapers":false,"canSubmit":false,"hasHfLevelAccess":false,"upvoted":false,"upvoters":[{"_id":"6039478ab3ecf716b1a5fd4d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6039478ab3ecf716b1a5fd4d/_Thy4E7taiSYBLKxEKJbT.jpeg","isPro":true,"fullname":"taesiri","user":"taesiri","type":"user"}],"acceptLanguages":["en"],"dailyPaperRank":0,"markdownContentUrl":"https://huggingface.co/buckets/huggingchat/papers-content/resolve/2606/2606.06447.md"}">
Papers
arxiv:2606.06447

Latent Reasoning with Normalizing Flows

Published on Jun 4
· Submitted by
taesiri
on Jun 5
Authors:
,
,
,
,
,
,
,

Abstract

Latent reasoning framework using normalizing flows preserves autoregressive generation advantages while enabling efficient, probabilistic intermediate computation in large language models.

Large language models often improve reasoning by generating explicit chain-of-thought (CoT), demonstrating the importance of intermediate computation. However, textual CoT forces this computation through a discrete, serial, and communication-oriented token stream: each reasoning step must be verbalized before the model can proceed, even when the underlying update is semantic, uncertain, or only partially formed. Latent reasoning offers a higher-bandwidth alternative by performing intermediate computation in compact continuous states before committing to text. Yet existing latent-reasoning methods often sacrifice key advantages that make CoT effective in autoregressive language models, including native left-to-right generation, probabilistic sampling, compatibility with KV-cache decoding, and tractable likelihood estimation. We propose NF-CoT, a latent reasoning framework that preserves these advantages by modeling continuous thoughts with normalizing flows. NF-CoT instantiates a TARFlow-style normalizing flow inside the LLM backbone, defining a tractable probability model over compact continuous thoughts distilled from explicit CoT. Continuous-thought positions are generated by an NF head, while text positions are generated by the standard LM head within the same causal stream. This design provides exact likelihoods for latent thoughts, enables probabilistic left-to-right decoding with the original KV cache, and supports direct policy-gradient optimization in the latent reasoning space. On code-generation benchmarks, NF-CoT improves pass rates over explicit-CoT and prior latent-reasoning baselines while substantially reducing intermediate-reasoning cost.

Community

Paper submitter about 9 hours ago

NF-CoT introduces a latent reasoning framework for LLMs that models intermediate thoughts as continuous states using normalizing flows, preserving autoregressive decoding while improving performance and efficiency.

image

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.
Tap or paste here to upload images

· Sign up or log in to comment

Get this paper in your agent:

hf papers read 2606.06447
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2606.06447 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2606.06447 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2606.06447 in a Space README.md to link it from this page.

Collections including this paper 1

Discussion (0)

Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.

Sign in →

No comments yet. Sign in and be the first to say something.

More from Hugging Face Daily Papers