Hugging Face Daily Papers · 6 min read

MapAgent: An Industrial-Grade Agentic Framework for City-scale Lane-level Map Generation

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

Code: https://github.com/eadst/KDD-2026-MapAgent
arXiv:2606.04513

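Published on Jun 3, 2026 · Submitted by Dong Xie on Jun 4, 2026
Authors: Deguo Xia, Zihan Li, Haochen Zhao, Dong Xie, Yuyao Kong, Xiyan Liu, Jizhou Huang, Mengmeng Yang, Diange Yang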

Abstract

MapAgent is an industrial-grade agentic architecture that combines vision-language processing with constraint-aware reasoning to produce specification-compliant lane maps, achieving high automation rates in large-scale urban mapping.

Lane-level maps are critical infrastructure for autonomous driving and lane-level navigation, yet constructing and maintaining standardized lane networks for hundreds of cities remains highly labor-intensive. Recent end-to-end vectorized mapping methods can predict lane geometry and topology directly from sensor data, but they typically treat mapping specifications and traffic regulations as implicit, dataset-dependent supervision. Moreover, in complex scenes (e.g., worn or missing markings and occlusions), correct lane configurations are often under-determined by visual evidence alone, making specification violations a major source of human post-editing. We propose MapAgent, an industrial-grade agentic architecture that augments a vectorization backbone for specification-compliant lane-map production. Rather than merely adding an agent loop to map prediction, MapAgent couples backbone perception with explicit specification verification, constraint-aware reasoning, and deterministic map editing under a bounded, verification-driven Judge-Planner-Worker loop. A vision-language Judge diagnoses errors by jointly inspecting visual evidence and draft vectors, while a tool-calling Planner generates minimal corrective edits with post-edit re-validation. To remain scalable for city-scale production, MapAgent is selectively triggered only on tiles with low backbone confidence, adding modest overhead while preserving throughput. Experiments on real-world datasets show consistent gains over strong production baselines, especially in complex and long-tail scenarios. Additionally, MapAgent has been integrated into Baidu Maps, supporting lane-level map generation for over 360 cities nationwide and elevating the overall production automation to over 95%, demonstrating MapAgent's practicality and effectiveness for large-scale lane-level map generation.
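
The control flow the abstract describes (confidence-gated triggering, then a bounded Judge-Planner-Worker loop with re-validation after each edit) can be illustrated with a toy sketch. This is not the authors' implementation: the `Tile` structure, the two lane rules, the `CONF_THRESHOLD`, `MAX_ROUNDS`, and `DEFAULT_WIDTH_M` values, and every function name below are hypothetical. In particular, the Judge in the paper is a vision-language model that inspects imagery together with draft vectors, whereas here it is replaced by a plain rule check so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# All constants are illustrative assumptions, not values from the paper.
CONF_THRESHOLD = 0.8    # tiles at or above this confidence skip the agent
MAX_ROUNDS = 3          # hard bound on Judge-Planner-Worker iterations
DEFAULT_WIDTH_M = 3.5   # placeholder repair value for an invalid lane width

Issue = Tuple[int, str]          # (lane index, violated field)
Edit = Tuple[int, str, object]   # (lane index, field, new value)


@dataclass
class Tile:
    tile_id: str
    confidence: float                 # backbone confidence in [0, 1]
    lanes: List[Dict[str, object]]    # draft lane vectors from the backbone


def judge(tile: Tile) -> List[Issue]:
    """Stand-in for the Judge: report specification violations per lane."""
    issues: List[Issue] = []
    for i, lane in enumerate(tile.lanes):
        width = lane.get("width_m")
        if not isinstance(width, (int, float)) or width <= 0:
            issues.append((i, "width_m"))
        if lane.get("direction") not in ("forward", "backward"):
            issues.append((i, "direction"))
    return issues


def planner(issues: List[Issue]) -> List[Edit]:
    """Stand-in for the Planner: propose minimal edits for fixable issues."""
    edits: List[Edit] = []
    for lane_index, field_name in issues:
        if field_name == "width_m":
            edits.append((lane_index, "width_m", DEFAULT_WIDTH_M))
        # A missing direction cannot be inferred in this toy, so no edit.
    return edits


def worker(tile: Tile, edits: List[Edit]) -> Tile:
    """Stand-in for the Worker: apply edits deterministically to a copy."""
    lanes = [dict(lane) for lane in tile.lanes]
    for lane_index, field_name, value in edits:
        lanes[lane_index][field_name] = value
    return Tile(tile.tile_id, tile.confidence, lanes)


def refine(tile: Tile) -> Tuple[Tile, str]:
    """Run the bounded loop only on low-confidence tiles."""
    if tile.confidence >= CONF_THRESHOLD:
        return tile, "skipped"
    for _ in range(MAX_ROUNDS):
        issues = judge(tile)
        if not issues:
            return tile, "accepted"
        edits = planner(issues)
        if not edits:
            break  # nothing the planner can fix; stop early
        tile = worker(tile, edits)
    # Final re-validation decides whether the tile goes to human review.
    return tile, ("accepted" if not judge(tile) else "needs_review")
```

Calling `refine` on a low-confidence tile whose only defect is a non-positive `width_m` repairs it and returns the status `"accepted"`. A tile with an unresolvable defect comes back as `"needs_review"`, and a high-confidence tile is returned untouched as `"skipped"`. The fixed iteration bound and the final re-check are what keep the added cost per tile predictable in this sketch.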


