Hugging Face Daily Papers · 7 min read

LLM Anonymization Against Agentic Re-Identification

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

Authors: Ziwen Li, Jianing Wen, Tianshi Li
Organization: Northeastern PEACH Lab
Project page: https://peach-research-lab.github.io/AURA/
Code: https://github.com/PEACH-Research-Lab/AURA
arxiv:2605.30848

Published on Jun 1 · Submitted by Ziwen Li on Jun 5
Abstract

AURA is an LLM-powered anonymization framework that balances privacy protection against agentic web-search re-identification while preserving contextual utility through adaptive privacy scopes and mask-reconstruct methods.

Agentic LLMs with web search change the threat model for text anonymization: weak contextual cues can become cross-referenceable evidence for re-identification, yet those same details also carry the text's downstream analytic value. Existing defenses either remove explicit identifiers, perturb text for formal privacy, or test rewritten text against non-web inference models, which leaves the operating region between resistance to agentic web-search re-identification and utility retention underexplored. We introduce AURA (Anonymization with Utility-Retention Adaptation), an LLM-powered mask-reconstruct framework that decouples privacy localization from utility-preserving reconstruction and selects candidates with adversarial privacy and utility-retention checks. We evaluate AURA on real-user interview transcripts using re-identification attacks carried out by web-search agents, along with a utility evaluation based on interviewee-profile facts, codebook facts, and the joint contextual utility grid. Our results show that AURA improves the privacy-utility frontier in two ways: an adaptive privacy scope strengthens resistance to agentic re-identification, and the mask-reconstruct anonymization method better preserves contextual utility under a fixed privacy scope.
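The abstract describes a mask-reconstruct loop with candidate selection but gives no implementation detail, so the following is only a minimal sketch of that general shape, not AURA's actual code. Every name here (select_candidate, mask_spans, the toy_* functions, CUES, FILLERS, FACTS) is hypothetical. The toy stand-ins use plain string matching where a real system would presumably call an LLM for localization and reconstruction and a web-search agent for the adversarial check.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


@dataclass
class Candidate:
    text: str
    reidentified: bool  # True if the stand-in attacker succeeded
    utility: float      # fraction of utility facts still present


def mask_spans(text: str, spans: List[Span], token: str = "[MASK]") -> str:
    """Mask non-overlapping spans, right to left so earlier offsets stay valid."""
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + token + text[end:]
    return text


def select_candidate(
    text: str,
    localize: Callable[[str], List[Span]],
    reconstruct: Callable[[str, int], str],
    attack: Callable[[str], bool],
    utility: Callable[[str], float],
    n_candidates: int = 4,
) -> Optional[Candidate]:
    """Mask, reconstruct several candidates, drop re-identified ones, keep the most useful."""
    masked = mask_spans(text, localize(text))
    candidates = []
    for i in range(n_candidates):
        rewritten = reconstruct(masked, i)
        candidates.append(Candidate(rewritten, attack(rewritten), utility(rewritten)))
    safe = [c for c in candidates if not c.reidentified]
    return max(safe, key=lambda c: c.utility) if safe else None


# --- Toy stand-ins (all hypothetical, for illustration only) ---
CUES = ("Acme Robotics", "Boston")
FILLERS = [
    ("Acme Robotics", "Boston"),
    ("a robotics startup", "Boston"),
    ("a robotics startup", "a large US city"),
    ("a company", "a city"),
]
FACTS = ("vision team", "robotics", "ship weekly")


def toy_localize(text: str) -> List[Span]:
    """Privacy localization stand-in: find known identifying cues."""
    spans = []
    for cue in CUES:
        idx = text.find(cue)
        if idx != -1:
            spans.append((idx, idx + len(cue)))
    return spans


def toy_reconstruct(masked: str, i: int) -> str:
    """Reconstruction stand-in: fill the two masks, left to right, with the i-th filler pair."""
    employer, city = FILLERS[i % len(FILLERS)]
    return masked.replace("[MASK]", employer, 1).replace("[MASK]", city, 1)


def toy_attack(text: str) -> bool:
    """Attacker stand-in: weak cues combine, so industry plus city is enough."""
    lowered = text.lower()
    return "acme robotics" in lowered or ("robotics" in lowered and "boston" in lowered)


def toy_utility(text: str) -> float:
    """Utility stand-in: share of analytic facts that survive the rewrite."""
    lowered = text.lower()
    return sum(fact in lowered for fact in FACTS) / len(FACTS)


source = "I lead the vision team at Acme Robotics in Boston and we ship weekly."
result = select_candidate(source, toy_localize, toy_reconstruct, toy_attack, toy_utility)
print(result.text if result else "no safe candidate")
```

In this toy run the candidate that keeps "robotics" but generalizes the city passes the attack check with all three facts intact, while the fully generic rewrite is also safe but loses a fact. That is the trade-off the abstract refers to as the privacy-utility frontier. Keeping localization, reconstruction, attack, and utility as separate callables mirrors the decoupling the abstract mentions and lets each piece be swapped independently.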
