Expand More, Shrink Less: Shaping Effective-Rank Dynamics for Dense Scaling in Recommendation
Mirrored from arXiv — Machine Learning for archival readability. Support the source by reading on the original site.
Computer Science > Machine Learning
Title:Expand More, Shrink Less: Shaping Effective-Rank Dynamics for Dense Scaling in Recommendation
Abstract:Scaling recommendation models is a central challenge in recommender systems. Recently, RankMixer has emerged as an effective solution, operating on a unified token representation and alternating between token mixing and per-token feedforward networks (P-FFNs) to achieve scalable performance. However, RankMixer suffers from \textit{embedding collapse}, where learned representations have low effective rank, limiting expressivity and underutilizing the expanded representation space. Through empirical analysis and theoretical insights, we identify rigid token mixing and P-FFN modules as the primary causes of this phenomenon, jointly inducing a \textbf{damped oscillatory trajectory} in effective-rank evolution across layers. To address it, we propose RankElastor, a novel architecture that produces spectrum-robust representations with provable collapse mitigation. RankElastor introduces two components: (i) \textbf{parameterized full mixing}, which enables expressive token mixing with improved spectral robustness; and (ii) \textbf{GLU-improved P-FFNs}, which stabilize representation spectra through GLU-style FFN modules. Extensive experiments on large-scale industrial datasets demonstrate that RankElastor consistently improves recommendation performance, mitigates embedding collapse, and exhibits robust scaling behavior. Code is available at this GitHub repository: this https URL
| Comments: | Accepted at the 32st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (Research Track), KDD 2026 February Cycle |
| Subjects: | Machine Learning (cs.LG); Information Retrieval (cs.IR); Numerical Analysis (math.NA) |
| Cite as: | arXiv:2605.23191 [cs.LG] |
| (or arXiv:2605.23191v1 [cs.LG] for this version) | |
| https://doi.org/10.48550/arXiv.2605.23191
arXiv-issued DOI via DataCite (pending registration)
|
|
| Related DOI: | https://doi.org/10.1145/3770855.3818049
DOI(s) linking to related resources
|
Access Paper:
- View PDF
- HTML (experimental)
- TeX Source
Current browse context:
References & Citations
Bibliographic and Citation Tools
Code, Data and Media Associated with this Article
Demos
Recommenders and Search Tools
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.
More from arXiv — Machine Learning
-
Latent Cache Flow: Model-to-Model Communication Without Text
May 25
-
Reading Calibrated Uncertainty from Language Model Trajectories
May 25
-
FusionSense: Tri-Stage Near-Sensor Learning for Runtime-Adaptive Multimodal Edge Intelligence
May 25
-
FuRA: Full-Rank Parameter-Efficient Fine-Tuning with Spectral Preconditioning
May 25
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.