Hugging Face Daily Papers · June 16, 2026 · 5 min read

OneRank: Unified Transformer-Native Ranking Architecture for Multi-Task Recommendation

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.




arxiv:2606.16838


Published on Jun 15 · Submitted by TangJiakai on Jun 16

Authors: Jiakai Tang, Sunhao Dai, Kun Wang, Zhiluohan Guo, Yu Zhao, Cong Fu, Kangle Wu, Yabo Ni, Anxiang Zeng, Xu Chen, Jun Xu

Abstract

OneRank presents a Transformer-native multi-task learning framework that integrates feature encoding and prediction to reduce inter-task interference and improve ranking performance in recommender systems.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

Multi-task learning (MTL) is essential in recommender systems to enable complementary learning among diverse user feedback. While modern industrial practices have shifted from DNNs to Transformer-centric architectures to strengthen sequence modeling and scaling capacity, they still decouple feature encoding from multi-task prediction, treating the Transformer as a task-agnostic encoder. This design fundamentally limits the performance and scalability by (1) creating an information bottleneck under heterogeneous task objectives, (2) inducing gradient interference that leads to the seesaw phenomenon, and (3) forcing a dataflow transition in which attention-based, context-adaptive representation learning is converted to static feed-forward task prediction with incompatible information read-write dynamics.

We propose OneRank, a Transformer-native multi-task ranking framework that eliminates encoder-predictor separation and introduces task-private channels for forward representation learning and backward optimization, enabling task-specialized learning while reducing inter-task interference. In the forward pass, OneRank learns task-specific representations bottom-up through task-conditioned information selection, candidate-aware contextualization, and controlled cross-task interaction. In the backward pass, cross-task gradient detachment isolates task-private parameter updates from shared knowledge extraction modules, preventing negative transfer. We further replace static task-specific MLP scorers with dynamic matching-based scoring for context-aware personalized ranking. By internalizing multi-task reasoning within the Transformer stack, OneRank establishes a unified and scalable architectural paradigm. Offline and online experiments on large-scale industrial datasets show that OneRank significantly outperforms state-of-the-art baselines while maintaining computational efficiency.
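Editor's illustration (not from the paper): the abstract does not spell out how cross-task gradient detachment is implemented, so the following is only a toy sketch of the general idea. It assumes a hypothetical scalar model in which task A reads task B's representation with a made-up mixing weight of 0.5. Every name here (`task_a_loss`, `numeric_grad`, `detached_h_b`) is an assumption for illustration. Detachment is mimicked by passing B's forward value in as a constant, which is what a stop-gradient operation does in an autodiff framework.

```python
# Toy illustration of a "read-only" cross-task read via gradient detachment.
# Hypothetical scalar model; this is NOT OneRank's implementation.

def task_a_loss(w_a, w_b, x, target, detached_h_b=None):
    """Squared error of task A, which also reads task B's representation.

    When detached_h_b is given, task A reads that constant instead of
    recomputing h_b from w_b. The forward value is unchanged, but the
    loss no longer depends on w_b, mimicking a stop-gradient read.
    """
    h_b = w_b * x if detached_h_b is None else detached_h_b
    h_a = w_a * x + 0.5 * h_b  # 0.5 is an arbitrary mixing weight
    return (h_a - target) ** 2


def numeric_grad(f, value, eps=1e-6):
    """Central finite-difference estimate of df/dvalue."""
    return (f(value + eps) - f(value - eps)) / (2 * eps)


x, target, w_a, w_b = 2.0, 1.0, 0.3, 0.7
h_b_const = w_b * x  # task B's forward value, frozen for the detached read

# Attached read: task A's loss pushes a gradient into task B's parameter.
# Analytically this is 2 * (h_a - target) * 0.5 * x = 0.6.
grad_wb_attached = numeric_grad(
    lambda wb: task_a_loss(w_a, wb, x, target), w_b)

# Detached read: same forward value, but no gradient reaches w_b.
grad_wb_detached = numeric_grad(
    lambda wb: task_a_loss(w_a, wb, x, target, detached_h_b=h_b_const), w_b)

# Task A's own parameter still receives its usual gradient.
grad_wa_detached = numeric_grad(
    lambda wa: task_a_loss(wa, w_b, x, target, detached_h_b=h_b_const), w_a)

print(grad_wb_attached, grad_wb_detached, grad_wa_detached)
```

In this sketch the detached read leaves task A's prediction unchanged while keeping task A's loss from updating task B's parameter, which is one way to read the "isolates task-private parameter updates" wording above. In a real framework the same effect would typically come from a stop-gradient or detach call rather than from passing constants by hand.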


Community

TangJiakai5704 (paper author, paper submitter) · about 9 hours ago

🎉 Excited to share that our paper "OneRank: Unified Transformer-Native Ranking Architecture for Multi-Task Recommendation" has been accepted at KDD 2026! See you in Jeju Island 🇰🇷

🔍 The problem: Traditional multi-task recommenders follow an encoder–predictor paradigm that creates information bottlenecks, suffers from the seesaw phenomenon, and forces a dataflow mismatch between attention-based encoding and static feed-forward prediction.

💡 Our solution: OneRank completely removes the encoder–predictor split and internalizes all multi-task reasoning within a unified Transformer stack — from task-specific encoding to dynamic context-aware ranking.

✨ Key highlights:

- Task-specific token injection with mutual invisibility for early task specialization
- Candidate-aware contextualization via situational descriptors to bridge the training-serving gap
- Cross-task relational attention with strategic gradient detachment, turning cross-task attention into a read-only memory for knowledge transfer
- Dynamic matching-based scoring that replaces static MLP heads with context-adaptive ranking

📈 Validated on large-scale industrial datasets with both offline experiments and online A/B tests at Shopee, showing significant gains in ranking effectiveness while maintaining computational efficiency.

Joint work with amazing collaborators from Gaoling School of AI @RUC, Shopee, and NTU 🙌
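Editor's illustration (not part of the comment above): the first highlight mentions task tokens with "mutual invisibility". The post does not give the exact masking rule, so this sketch assumes one plausible layout: feature tokens come first, task tokens are appended, every position may read the feature tokens, and a task token may additionally read only itself. The function names and the token layout are hypothetical.

```python
import math

# Sketch of an attention visibility mask with mutually invisible task tokens.
# The masking rule is an assumption, not taken from the paper.

def build_visibility_mask(num_feature_tokens, num_task_tokens):
    """Return mask where mask[i][j] is True if query i may attend to key j."""
    n = num_feature_tokens + num_task_tokens
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            key_is_feature = j < num_feature_tokens
            # Feature tokens are visible to all queries; besides those,
            # a position sees only itself, so task tokens never see each other.
            mask[i][j] = key_is_feature or i == j
    return mask


def masked_softmax(scores, allowed):
    """Softmax over scores, giving zero weight to disallowed positions."""
    peak = max(s for s, ok in zip(scores, allowed) if ok)
    exps = [math.exp(s - peak) if ok else 0.0
            for s, ok in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


# Three feature tokens followed by two task tokens (indices 3 and 4).
mask = build_visibility_mask(num_feature_tokens=3, num_task_tokens=2)

# Attention weights for the first task token: the other task token gets
# zero weight even though its raw score (5.0) is the largest.
weights = masked_softmax([0.1, 0.2, 0.3, 0.4, 5.0], mask[3])
print(mask[3], weights)
```

Under this assumed rule, each task token builds its representation from the shared features alone. Any sharing between tasks would then have to happen through a separate, explicitly controlled step such as the cross-task attention named in the third highlight.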


Get this paper in your agent:

hf papers read 2606.16838

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash

