Hugging Face Daily Papers · June 25, 2026 · 5 min read

Forecasting Future Behavior as a Learning Task

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.




arxiv:2606.11445


Published on Jun 9 · Submitted by Mosh Levy on Jun 25


Authors: Mosh Levy, Yoav Goldberg, Asa Cooper Stickland

Abstract

Behavior Forecasters are trained to predict a large reasoning model's future behavior from a single reasoning trajectory, outperforming GPT-5.4 and Claude Opus-4.6 as naive readers at a small fraction of their inference cost.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

Trust in an AI system is often anchored by explanations of how it works, which one then uses to forecast its behavior on new inputs. For large reasoning models (LRMs), this conventional route is particularly difficult to follow: explanation methods for single-token generations do not naturally generalize to long trajectories, and the trajectories themselves are often not faithful when read as natural language. We propose an alternative that bypasses the explanation step: treat behavior forecasting as a learnable task and train Behavior Forecasters that operate on a single reasoning trajectory to make the same forecasts one would typically seek from an explanation. The forecaster's training data is obtained by querying the LRM with no human annotation, and its inference is done in a single forward pass. We instantiate this approach on two tasks: how likely the LRM is to repeat its answer on re-runs, and how removing parts of the input changes its answer. We evaluate this approach on both tasks across three diverse reasoning datasets and find that trained Behavior Forecasters are more accurate than GPT-5.4 and Claude Opus-4.6 reading the same trajectories as naive readers, at a small fraction of their inference cost. We find that fine-tuning the backbone end-to-end and initializing it from the target LRM are each necessary for strong performance. These results show that the reasoning trajectory carries information about the LRM's future behavior that goes beyond what naive reading conveys.
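The abstract says the forecaster's training labels come from querying the LRM itself, with no human annotation. As a rough illustration only, the two forecasting targets could be turned into labels along the following lines. This is a minimal sketch and not the authors' implementation: the function names, the default number of re-runs, the way input parts are joined, and the `toy_lrm` stand-in are all assumptions made here for illustration.

```python
from typing import Callable, List, Sequence


def repeat_rate(
    sample_answer: Callable[[str], str],
    prompt: str,
    original_answer: str,
    n_reruns: int = 8,  # assumed value, not taken from the paper
) -> float:
    """Fraction of re-runs whose final answer equals the original answer."""
    if n_reruns <= 0:
        raise ValueError("n_reruns must be positive")
    matches = sum(
        sample_answer(prompt) == original_answer for _ in range(n_reruns)
    )
    return matches / n_reruns


def ablation_changes(
    answer_fn: Callable[[str], str],
    parts: Sequence[str],
    original_answer: str,
) -> List[bool]:
    """For each input part, report whether removing it changes the answer."""
    changed = []
    for i in range(len(parts)):
        # Rebuild the prompt without part i (joining with spaces is an assumption).
        ablated = " ".join(p for j, p in enumerate(parts) if j != i)
        changed.append(answer_fn(ablated) != original_answer)
    return changed


def toy_lrm(prompt: str) -> str:
    """Hypothetical stand-in for an LRM call, used only to exercise the helpers."""
    return "4" if "2+2" in prompt else "unknown"
```

In a real setup, `sample_answer` and `answer_fn` would wrap calls to the target LRM, and the resulting labels would be paired with the original reasoning trajectory as training examples for the forecaster. With the toy stand-in, `ablation_changes(toy_lrm, ["Compute", "2+2", "carefully."], "4")` flags only the middle part as answer-changing.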

arXiv: https://arxiv.org/abs/2606.11445 · GitHub: https://github.com/Mosh0110/behavior-forecasters

Community

Mosh

Paper submitter about 9 hours ago

This comment has been hidden (marked as Resolved)

librarian-bot

12 minutes ago

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

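The following papers were recommended by the Semantic Scholar API:

- STOP: Structured On-Policy Pruning of Long-Form Reasoning in Low-Data Regimes (2026), https://huggingface.co/papers/2605.13165
- Unlocking the Working Memory of Large Language Models for Latent Reasoning (2026), https://huggingface.co/papers/2605.30343
- Predicting Inference-Time Scaling Gains from Labeled Validation-Set Output Statistics (2026), https://huggingface.co/papers/2606.02981
- SD-Search: On-Policy Hindsight Self-Distillation for Search-Augmented Reasoning (2026), https://huggingface.co/papers/2605.18299
- LaTER: Efficient Test-Time Reasoning via Latent Exploration and Explicit Verification (2026), https://huggingface.co/papers/2605.07315
- Monitoring the Internal Monologue: Probe Trajectories Reveal Reasoning Dynamics (2026), https://huggingface.co/papers/2605.18549
- Learning When to Think While Listening in Large Audio-Language Models (2026), https://huggingface.co/papers/2605.27190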

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space: https://huggingface.co/spaces/librarian-bots/recommend_similar_papers

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


Get this paper in your agent:

hf papers read 2606.11445

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash

