Active Learners as Efficient PRP Rerankers
Authors: Jeremías Figueiredo Paschmann, Juan Kaplan, Francisco Nattero, Santiago Barron, Juan Wisznia, Luciano del Corro
Organization: Universidad de San Andrés (UdeSA)
Published: 2026-05-15 (arXiv:2605.14236)
Code: https://github.com/jerecoder/IReranker
AI-generated summary
Pairwise ranking prompting is reformulated as active learning from noisy comparisons, with improved rankers that enhance ranking quality under call constraints and address position bias through a randomized oracle.
Abstract
Pairwise Ranking Prompting (PRP) elicits pairwise preference judgments from an LLM, which are then aggregated into a ranking, usually via classical sorting algorithms. However, judgments are noisy, order-sensitive, and sometimes intransitive, so sorting assumptions do not match the setting. Because sorting aims to recover a full permutation, truncating it to meet a call budget does not produce a dependable top-K. We thus reframe PRP reranking as active learning from noisy pairwise comparisons and show that active rankers are drop-in replacements that improve NDCG@10 per call in the call-constrained regime. Our noise-robust framework also introduces a randomized-direction oracle that uses a single LLM call per pair. This approach converts systematic position bias into zero-mean noise, enabling unbiased aggregate ranking without the cost of bidirectional calls.
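The randomized-direction idea can be illustrated with a small simulation. The sketch below is not the authors' implementation: the `judge(first, second)` signature, the Gaussian noise model, the additive `first_slot_bonus`, and all function names are assumptions made for illustration. It shows one judge call per pair with the presentation order chosen by a fair coin flip, and checks that for two equally relevant items a first-slot bias no longer favors either one on average.

```python
import random
from typing import Callable, Dict

# Hypothetical judge signature (an assumption, not from the paper):
# judge(first, second) returns True when the item shown FIRST is preferred.
# In a real system this would wrap a single LLM call.
Judge = Callable[[str, str], bool]


def randomized_direction_oracle(judge: Judge, a: str, b: str,
                                rng: random.Random) -> bool:
    """Return True if `a` is judged better than `b`, using ONE judge call.

    A fair coin decides which item is shown first, so a systematic
    preference for either slot helps `a` and `b` equally often.
    """
    if rng.random() < 0.5:
        return judge(a, b)        # `a` shown first
    return not judge(b, a)        # `b` shown first; invert the answer


def make_biased_judge(scores: Dict[str, float], first_slot_bonus: float,
                      noise: float, rng: random.Random) -> Judge:
    """Toy stand-in for an LLM judge (assumed noise model).

    It prefers the higher-scored item, adds a fixed bonus to whichever
    item is shown first, and perturbs the margin with Gaussian noise.
    """
    def judge(first: str, second: str) -> bool:
        margin = scores[first] - scores[second] + first_slot_bonus
        return margin + rng.gauss(0.0, noise) > 0
    return judge


def win_rate(prefers: Callable[[str, str], bool], a: str, b: str,
             trials: int) -> float:
    """Fraction of trials in which `a` is reported as preferred over `b`."""
    return sum(prefers(a, b) for _ in range(trials)) / trials


rng = random.Random(0)
# Two equally relevant documents: an unbiased comparison should be a coin flip.
judge = make_biased_judge({"a": 1.0, "b": 1.0},
                          first_slot_bonus=0.5, noise=1.0, rng=rng)

fixed_order = win_rate(judge, "a", "b", 20000)
randomized = win_rate(
    lambda x, y: randomized_direction_oracle(judge, x, y, rng),
    "a", "b", 20000)

print(f"fixed order, a always first: {fixed_order:.3f}")
print(f"randomized direction:        {randomized:.3f}")
```

Under this toy model, the fixed-order win rate for `a` sits well above one half because `a` always occupies the favored slot, while the randomized-direction rate stays close to one half. The bidirectional alternative mentioned in the abstract would call the judge on both orders and cost two calls per pair.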