Dynamics of the Transformer Residual Stream: Coupling Spectral Geometry to Network Topology
Abstract: Large language models are remarkably capable, yet how computation propagates through their layers remains poorly understood. A growing line of work treats depth as discrete time and the residual stream as a dynamical system, where each layer's nonlinear update has a local linear description. However, previous analyses have relied on scalar summaries or approximate linearizations, leaving the full spectral geometry of trained LLMs unknown. We perform full Jacobian eigendecomposition across three production-scale LLMs and show that training installs a monotonic spectral gradient through depth -- from non-normal, rotation-dominated early layers to near-symmetric late layers -- together with a cumulative low-rank bottleneck that funnels perturbations into a small fraction of the residual stream's effective dimensions. Our experiments reveal that this gradient and the dimensional collapse are learned rather than architectural, and both largely dissolve when structured non-normality is removed. We further show that the topological positioning of graph communities predicts whether the Jacobian amplifies or suppresses them, with the sign of the coupling determined by the local operator type, a relationship absent at initialization. These results map a learned spectral geometry in LLMs that links perturbation propagation and compression to the network's functional topology.
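The quantities named in the abstract have standard numerical proxies. The sketch below is a minimal illustration under stated assumptions, not the paper's protocol: it assumes a per-token residual update F_l exposed as a PyTorch callable (the hypothetical `layer_fn`), forms the Jacobian of the full update x + F_l(x), and computes Henrici's departure from normality (non-normality), the antisymmetric-to-symmetric Frobenius ratio (rotation dominance), and an entropy-based effective rank (a proxy for the dimensional bottleneck).

```python
# Minimal sketch (not the authors' code): probing the local linear
# description of one residual-stream update x_{l+1} = x_l + F_l(x_l).
# `layer_fn` is a hypothetical callable for one transformer block's
# update F_l restricted to a single token position (d-dim in, d-dim out).
import torch
import numpy as np

def residual_jacobian(layer_fn, x):
    """Jacobian of the full update x + F_l(x) at the point x (d x d)."""
    J_F = torch.autograd.functional.jacobian(layer_fn, x)  # dF_l/dx
    return (torch.eye(x.numel()) + J_F).detach().numpy()

def spectral_summary(J):
    """Per-layer summaries of the operator's spectral geometry."""
    eigvals = np.linalg.eigvals(J)  # full (non-symmetric) eigendecomposition
    # Henrici's departure from normality: zero for a normal operator,
    # large when the eigenvector basis is far from orthogonal.
    henrici = np.sqrt(max(np.linalg.norm(J, "fro") ** 2
                          - np.sum(np.abs(eigvals) ** 2), 0.0))
    # Rotation dominance: Frobenius mass of the antisymmetric part of J
    # relative to its symmetric part.
    S, A = (J + J.T) / 2.0, (J - J.T) / 2.0
    rotation_ratio = np.linalg.norm(A, "fro") / np.linalg.norm(S, "fro")
    # Entropy-based effective rank of the singular spectrum: how many
    # residual-stream directions the layer's update actually uses.
    s = np.linalg.svd(J, compute_uv=False)
    p = s / s.sum()
    eff_rank = np.exp(-np.sum(p * np.log(p + 1e-12)))
    return henrici, rotation_ratio, eff_rank
```

Under the abstract's findings, one would expect the non-normality and rotation-dominance measures to fall monotonically with layer index (early layers rotation-dominated, late layers near-symmetric), and the effective rank to collapse cumulatively through depth.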
| Subjects: | Machine Learning (cs.LG); Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2605.14258 [cs.LG] (or arXiv:2605.14258v1 [cs.LG] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.14258 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Grigori Guitchounts
[v1] Thu, 14 May 2026 01:57:47 UTC (5,637 KB)