arXiv — NLP / Computation & Language · June 2, 2026 · 4 min read

Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism

Mirrored from arXiv — NLP / Computation & Language for archival readability. Support the source by reading on the original site.

Like Read original ↗

Computer Science > Computation and Language

arXiv:2606.00408 (cs)

[Submitted on 29 May 2026]

Title:Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism

Authors:Haoxiang Zhang, Qixin Xu, Zhuofeng Li, Lei Zhang, Pengcheng Jiang, Yu Zhang, Julian McAuley

View a PDF of the paper titled Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism, by Haoxiang Zhang and 6 other authors

View PDF

Abstract:Long-horizon search agents accumulate large amounts of retrieved content across many tool calls, making context-budget efficiency increasingly important. A minimal intervention is to mask stale observations from the context as the trajectory progresses, but it remains unclear when this form of context management helps and why. We study observation masking through a systematic sweep over various agent backbones (4B to 284B parameters) and three retrievers on offline and live-web agentic search benchmarks. We find that the accuracy gain from masking follows an asymmetric inverted-U shape when plotted against the model's accuracy without context management: a plateau under weak retrievers, a peak when a strong retriever meets a mid-capacity model, and a sharp collapse when the model is saturated. This pattern reflects the interaction between retriever recall and the model's implicit filtering capacity, rather than either factor in isolation. Mechanistically, masking implements a token-for-turn trade-off: it removes observations the model has largely stopped attending to and pages the agent rarely re-opens. The added turns help when they convert failures into successes, but they fail when masking removes evidence the model would otherwise have used. We therefore reframe context management as a regime-dependent intervention and provide a holistic perspective for analyzing context use in agentic deep search. We release our scaffold and trajectories here (this https URL) to support future research.

Comments:	47 pages, 7 figures
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
Cite as:	arXiv:2606.00408 [cs.CL]
	(or arXiv:2606.00408v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2606.00408 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Haoxiang Zhang [view email]
[v1] Fri, 29 May 2026 22:51:58 UTC (4,704 KB)

Full-text links:

Access Paper:

View a PDF of the paper titled Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism, by Haoxiang Zhang and 6 other authors

View PDF
TeX Source

view license

Current browse context:

cs.CL

< prev | next >

new | recent | 2026-06

Change to browse by:

cs
cs.AI
cs.IR

References & Citations

BibTeX formatted citation

Data provided by:

Bookmark

Bibliographic Tools

Bibliographic and Citation Tools

Bibliographic Explorer Toggle

Bibliographic Explorer (What is the Explorer?)

Connected Papers Toggle

Connected Papers (What is Connected Papers?)

Litmaps Toggle

Litmaps (What is Litmaps?)

scite.ai Toggle

scite Smart Citations (What are Smart Citations?)

Code, Data, Media

Code, Data and Media Associated with this Article

alphaXiv Toggle

alphaXiv (What is alphaXiv?)

Links to Code Toggle

CatalyzeX Code Finder for Papers (What is CatalyzeX?)

DagsHub Toggle

DagsHub (What is DagsHub?)

GotitPub Toggle

Gotit.pub (What is GotitPub?)

Huggingface Toggle

Hugging Face (What is Huggingface?)

ScienceCast Toggle

ScienceCast (What is ScienceCast?)

Demos

Replicate Toggle

Replicate (What is Replicate?)

Spaces Toggle

Hugging Face Spaces (What is Spaces?)

Spaces Toggle

TXYZ.AI (What is TXYZ.AI?)

Recommenders and Search Tools

Link to Influence Flower

Influence Flower (What are Influence Flowers?)

Core recommender toggle

CORE Recommender (What is CORE?)

Author
Venue
Institution
Topic

About arXivLabs

arXivLabs: experimental projects with community collaborators

arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.

Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.

Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.

Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)

Discussion (0)

No comments yet. Sign in and be the first to say something.