Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories
Authors: Kevin Qinghong Lin, Batu EI, Yuhong Shi, Pan Lu, Philip Torr, James Zou
Organization: University of Oxford
Published: 2026-06-09 (arXiv: 2606.11176)
Project page: https://data2story.github.io/
Code: https://github.com/QinghongLin/data2story-skill
Abstract
A multi-agent framework automates data journalism by generating evidence-grounded, multimodal news stories while maintaining transparency and verifiability.
Data tells stories that shape society; the data journalist's job is to turn raw information into stories non-experts can trust. A high-quality news feature takes a newsroom team weeks: hunting for context, running statistics, choosing an angle, and designing visuals. Recent agents handle individual steps well: data-science agents close the analysis loop, while design agents synthesize beautiful websites. But can an agent serve as a data journalist end to end? We introduce Data Journalist Agent (Data2Story), a multi-agent framework that orchestrates specialized roles into a single virtual newsroom. Data2Story contributes two innovations. (i) Claims are evidence-grounded: an Inspector links every number, angle, and asset back to data, code, or an external reference. (ii) Articles are multimodally generative: rather than defaulting to plain text and static charts, Data2Story reasons about what readers will want to see, then deploys multimodal tools, such as interactive maps for geography and audio for music. We evaluate Data2Story on 18 articles, each paired with the originally published expert piece, along four axes: (a) human-agent angle coverage; (b) rubric evaluation with 53 participants across five dimensions; (c) computer-use agents as judges, a cost-saving proxy for how readers navigate interactive articles; and (d) verifiability, where a coding verifier re-executes statements against the data and checks claims against references. Data2Story produces competitive, evidence-traceable multimedia stories, with particular strength in transparency and auditability. Human articles retain an edge in editorial angle, creative design, and presentation. We position Data2Story as a collaborator for journalists, enabling more evidence-based, transparent, and verifiable reporting. Code and demos are available at https://data2story.github.io.
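The verifiability axis described above (re-executing statements against the data) can be illustrated with a minimal sketch. This is not Data2Story's code or API: the `Claim` record, the `verify` helper, the tolerance, and the toy ride-count rows are all hypothetical, chosen only to show what it could mean for a quoted number to be traceable to code and data.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Claim:
    """Hypothetical record tying a sentence in a story to the code behind it."""
    text: str                                   # sentence as it appears in the story
    stated_value: float                         # number quoted in that sentence
    compute: Callable[[Sequence[dict]], float]  # code that should reproduce the number
    rel_tol: float = 0.01                       # assumed relative tolerance for a match


def verify(claim: Claim, rows: Sequence[dict]) -> bool:
    """Re-run the claim's code on the data and compare with the stated value."""
    recomputed = claim.compute(rows)
    return math.isclose(recomputed, claim.stated_value, rel_tol=claim.rel_tol)


# Toy dataset (made up for illustration only).
rows = [
    {"city": "A", "rides": 120},
    {"city": "B", "rides": 80},
    {"city": "C", "rides": 100},
]

claims = [
    # Supported by the data: (120 + 80 + 100) / 3 == 100.
    Claim(
        text="Cities averaged 100 rides.",
        stated_value=100.0,
        compute=lambda r: sum(x["rides"] for x in r) / len(r),
    ),
    # Not supported by the data: city A has 120 rides, not 150.
    Claim(
        text="City A logged 150 rides.",
        stated_value=150.0,
        compute=lambda r: next(x["rides"] for x in r if x["city"] == "A"),
    ),
]

# Map each sentence to whether re-execution reproduces its number.
report = {c.text: verify(c, rows) for c in claims}

for text, ok in report.items():
    print(("VERIFIED   " if ok else "UNSUPPORTED"), text)
```

Keeping the computation next to the sentence is the design point of the sketch: a reader or an automated checker can rerun it and flag any number that no longer matches the data.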
Community
Can an AI agent act like an editor, turning raw data into readable, verifiable, multimodal stories?
Data2Story brings data analysis, information retrieval, narrative writing, visual design, and fact-checking into one “virtual newsroom.” Every key finding can be traced back to code, data, or external sources, making the story more transparent, reproducible, and verifiable.
In short, we want AI not only to generate stories, but to generate stories with evidence, sources, and accountability.
Neat paper. The idea of using a multi-agent framework to act as a virtual newsroom for data journalism is pretty compelling, especially the focus on forcing the model to link every claim back to specific code or references to ensure auditability.
I'm curious, since the authors mention human articles still have an edge in creative design and editorial angle, what are the primary bottlenecks preventing the agent from closing that gap?
I made a podcast on it with ResearchPod, which makes it easy to get the key concepts on the go:
https://researchpod.app/episode/4aa1ad00-55fe-4bc5-97c3-8141f4f8692e