Hugging Face Daily Papers · 4 min read

DuMate-DeepResearch: An Auditable Multi-Agent System with Recursive Search and Rubric-Grounded Reasoning

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

Code: https://github.com/baidubce/qianfan-deepresearch
arxiv:2606.07299

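Published on Jun 5, 2026 · Submitted by Lingyong Yan on Jun 9, 2026
Authors: Lingyong Yan, Can Xu, Yukun Zhao, Wenxuan Li, Qingyang Chen, Jiulong Wu, Wenli Song, Xiangnan Li, Weixian Shi, Yiqun Chen, Xuchen Ma, Yuchen Li, Jiashu Zhao, Shuaiqiang Wang, Jianmin Wu, Dawei Yin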

Abstract

A multi-agent framework for deep research tasks that addresses planning, evidence acquisition, and report synthesis through decoupled components and dynamic optimization mechanisms.

Deep Research (DR) has emerged as a new agentic paradigm to tackle complex, open-ended research tasks, demanding systems that can iteratively frame problems, acquire evidence, verify sources, and synthesize long-form reports. In practice, however, current DR systems are constrained by four interrelated limitations: long-horizon planning over an underspecified scope, the bottleneck of decomposing and scheduling such tasks within a single agent, hallucination risk in long-form synthesis, and limited process auditability. This technical report presents DuMate-DeepResearch, a multi-agent DR framework built on the Qianfan Agent Foundry. The framework decouples the Agent Core, which handles task understanding, planning, and scheduling, from an extensible Tool Ecosystem for retrieval, evidence acquisition, and report rendering, making every intermediate decision and tool invocation explicitly traceable. Building on this infrastructure, DuMate-DeepResearch further introduces three mechanisms: (i) a graph-based dynamic planning strategy that expands the research roadmap coarse-to-fine and continuously revises it through reflection, re-planning, backtracking, and parallel branching; (ii) a recursive two-level execution design that delegates each complex search sub-task to an inner Search Agent, which runs its own planning loop, isolating noisy retrieval and stabilizing long-horizon execution; (iii) a rubric-based test-time optimization mechanism that dynamically generates task-specific quality criteria and uses them as live reasoning scaffolds for evidence-grounded synthesis and adaptive stopping. Across two deep research benchmarks, DuMate-DeepResearch establishes new state-of-the-art results: the best overall score (58.03%) on DeepResearch Bench, and the best overall score (61.95%) on DeepResearch Bench II, where it also ranks first in information recall and analysis.
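
To make the abstract's three mechanisms concrete, here is a minimal toy sketch in Python. It is not the authors' implementation and does not use the Qianfan Agent Foundry API. Every name in it (`PlanNode`, `TraceEvent`, `search_agent`, `rubric_coverage`, `deep_research`, `toy_retrieve`) is hypothetical. Several simplifications are assumptions: a tree stands in for the plan graph, keyword matching stands in for the dynamically generated rubric, and a dictionary lookup stands in for real retrieval. The sketch shows only the control flow: an outer core that delegates each sub-task to an inner search loop, a trace that records every decision, and a rubric score that triggers adaptive stopping.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class PlanNode:
    """One sub-task in the research roadmap (a tree stands in for the plan graph)."""
    question: str
    children: List["PlanNode"] = field(default_factory=list)


@dataclass
class TraceEvent:
    """Audit record: which agent did what, so each step can be inspected later."""
    agent: str
    action: str
    detail: str


Retriever = Callable[[str], List[str]]


def search_agent(question: str, retrieve: Retriever,
                 trace: List[TraceEvent], max_steps: int = 3) -> List[str]:
    """Inner loop: re-query until no new snippets appear, then return what was kept."""
    kept: List[str] = []
    for step in range(max_steps):
        query = question if step == 0 else f"{question} (refined {step})"
        trace.append(TraceEvent("search", "retrieve", query))
        new = [hit for hit in retrieve(query) if hit not in kept]
        if not new:
            break
        kept.extend(new)
    return kept


def rubric_coverage(rubric: List[str], evidence: List[str]) -> float:
    """Fraction of rubric criteria matched by at least one evidence snippet."""
    if not rubric:
        return 1.0
    met = sum(any(c.lower() in e.lower() for e in evidence) for c in rubric)
    return met / len(rubric)


def deep_research(root: PlanNode, rubric: List[str], retrieve: Retriever,
                  threshold: float = 1.0) -> Tuple[List[str], List[TraceEvent]]:
    """Outer loop: expand the plan breadth-first, delegate, score, stop when covered."""
    trace: List[TraceEvent] = []
    evidence: List[str] = []
    frontier = [root]
    while frontier:
        node = frontier.pop(0)
        trace.append(TraceEvent("core", "delegate", node.question))
        evidence.extend(search_agent(node.question, retrieve, trace))
        score = rubric_coverage(rubric, evidence)
        trace.append(TraceEvent("core", "score", f"{score:.2f}"))
        if score >= threshold:
            break  # adaptive stopping: the rubric is satisfied
        frontier.extend(node.children)  # coarse-to-fine expansion
    return evidence, trace


def toy_retrieve(query: str) -> List[str]:
    """Hypothetical retriever over a three-entry in-memory corpus."""
    corpus = {
        "cost": ["cost: module prices fell"],
        "policy": ["policy: subsidies expanded"],
        "risk": ["risk: grid congestion"],
    }
    return [s for key, snippets in corpus.items() if key in query for s in snippets]


if __name__ == "__main__":
    plan = PlanNode("solar overview", [PlanNode("solar cost"),
                                       PlanNode("solar policy"),
                                       PlanNode("solar risk")])
    found, log = deep_research(plan, ["cost", "policy"], toy_retrieve)
    for event in log:
        print(event.agent, event.action, event.detail)
    print(found)
```

In this toy run the outer loop stops as soon as both rubric criteria are covered, so the third sub-task is never delegated. The real system additionally revises its plan through reflection, re-planning, backtracking, and parallel branching, none of which this sketch attempts.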

Community

Paper submitter about 3 hours ago

Deep Research technical report by the DuMate Team.


Get this paper in your agent:

hf papers read 2606.07299
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash
