Latent Chain-of-Thought Improves Structured-Data Transformers
arXiv:2605.11262v1 Announce Type: new
Abstract: Chain-of-thought and, more broadly, test-time compute are known to augment the expressive capabilities of language models and have led to major innovations in reasoning. Motivated by this success, this paper explores latent chain-of-thought, as well as the impact of depth and looping, for time-series and tabular data. We propose a recurrent scheme in which a structured-data transformer, after an initial forward pass, compresses its query-position hidden states into feedback tokens that are appended to the input and processed again, allowing multiple rounds of latent computation before prediction. We compare CoT models against a same-depth no-CoT baseline, a deeper baseline matched to the CoT model in effective depth, and a looped transformer with weight-tied recurrence but no additional chain-of-thought tokens. Across 36 datasets in time-series forecasting and tabular prediction, latent chain-of-thought improves over the no-CoT baseline on 8/9 time-series datasets (+10.99% average gain) and 22/27 tabular datasets (+5.31% average gain). Across both settings, the CoT models perform best on average. These results demonstrate that chain-of-thought is a useful axis for scaling test-time compute for structured data.
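The recurrence described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration, not the authors' implementation: the names (`LatentCoTWrapper`, `compress`, `n_feedback`, `n_rounds`) and the choice of a linear compression map are assumptions; the paper only specifies that query-position hidden states are compressed into feedback tokens, appended to the input, and re-processed by the same transformer.

```python
# Hypothetical sketch of latent chain-of-thought recurrence for a
# structured-data transformer. The backbone is assumed to map
# (batch, seq, d_model) -> (batch, seq, d_model); all module and
# parameter names here are placeholders, not the authors' code.
import torch
import torch.nn as nn


class LatentCoTWrapper(nn.Module):
    def __init__(self, backbone: nn.Module, d_model: int,
                 n_feedback: int, n_rounds: int):
        super().__init__()
        self.backbone = backbone        # weight-tied transformer, reused every round
        self.n_feedback = n_feedback    # number of feedback tokens per round
        self.n_rounds = n_rounds        # extra latent rounds after the initial pass
        self.d_model = d_model
        # One plausible compression: a learned linear map from the query-position
        # hidden state to n_feedback token embeddings.
        self.compress = nn.Linear(d_model, n_feedback * d_model)

    def forward(self, tokens: torch.Tensor, query_idx: torch.Tensor) -> torch.Tensor:
        # tokens:    (batch, seq, d_model) embedded structured-data inputs
        # query_idx: (batch,) index of the query position in each sequence
        batch = torch.arange(tokens.size(0), device=tokens.device)
        h = self.backbone(tokens)                         # initial forward pass
        for _ in range(self.n_rounds):
            q = h[batch, query_idx]                       # query-position hidden states
            fb = self.compress(q).view(-1, self.n_feedback, self.d_model)
            x = torch.cat([tokens, fb], dim=1)            # append feedback tokens to input
            h = self.backbone(x)                          # process again (shared weights)
        return h[batch, query_idx]                        # final state used for prediction
```

Because the backbone weights are shared across rounds, the only parameters beyond the base model are those of the compression map, which keeps the setup comparable to the looped-transformer baseline in the abstract; the difference is the extra feedback tokens carrying latent state between rounds.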