Hugging Face Daily Papers · May 21, 2026 · 3 min read

Mem-π: Adaptive Memory through Learning When and What to Generate

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

Mem-π is an adaptive memory framework for LLM agents that replaces traditional retrieval with a model-generated, RL-optimized guidance mechanism to improve performance on complex, context-dependent agentic tasks.

Authors: Xiaoqiang Wang, Chao Wang, Hadi Nekoei, Christopher Pal, Alexandre Lacoste, Spandana Gella, Bang Liu, Perouz Taslakian

arxiv:2605.21463

Published on May 20 · Submitted by taesiri on May 21

Abstract

AI-generated summary: Mem-π is a framework for adaptive memory in LLM agents that generates context-specific guidance using a separate language or vision-language model trained with decision-content decoupled reinforcement learning.

We present Mem-π, a framework for adaptive memory in large language model (LLM) agents, where useful guidance is generated on demand rather than retrieved from external memory stores. Existing memory-augmented agents typically rely on similarity-based retrieval from episodic memory banks or skill libraries, returning static entries that often misalign with the current context. In contrast, Mem-π uses a dedicated language or vision-language model with its own parameters, separate from the downstream agent, to generate context-specific guidance for complex tasks. Conditioned on the current agent context, the model jointly decides when to produce guidance and what guidance to produce. We train it with a decision-content decoupled reinforcement learning (RL) objective, enabling it to abstain when generation would not help and otherwise produce concise, useful guidance. Across diverse agentic benchmarks spanning web navigation, terminal-based tool use, and text-based embodied interaction, Mem-π consistently outperforms retrieval-based and prior RL-optimized memory baselines, achieving over 30% relative improvement on web navigation tasks.
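To make the "when and what" idea concrete, here is a minimal sketch of how a separate guidance model might sit in front of a downstream agent at inference time. It is an illustration under stated assumptions, not the authors' implementation: `AgentContext`, `GuidanceModel`, `toy_guidance_model`, and `build_agent_prompt` are hypothetical names, and the keyword rule merely stands in for the trained, RL-optimized model described in the abstract. The only behavior taken from the abstract is that the guidance model may abstain or produce context-specific guidance.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AgentContext:
    """Hypothetical container for what the downstream agent currently sees."""
    task: str
    observation: str


# A guidance model maps a context to guidance text, or None to abstain.
GuidanceModel = Callable[[AgentContext], Optional[str]]


def toy_guidance_model(ctx: AgentContext) -> Optional[str]:
    """Stand-in for the trained guidance model (a keyword rule, NOT the paper's policy)."""
    if "login" in ctx.observation.lower():
        # "What": produce short guidance tailored to the current context.
        return "Fill in the credentials form before navigating further."
    # "When": abstain if guidance is not expected to help.
    return None


def build_agent_prompt(ctx: AgentContext, guidance_model: GuidanceModel) -> str:
    """Assemble the agent's prompt, adding guidance only when some is produced."""
    parts = [f"Task: {ctx.task}", f"Observation: {ctx.observation}"]
    guidance = guidance_model(ctx)
    if guidance is not None:
        parts.append(f"Guidance: {guidance}")
    return "\n".join(parts)


if __name__ == "__main__":
    ctx = AgentContext(
        task="Check order status",
        observation="Login page with username field",
    )
    print(build_agent_prompt(ctx, toy_guidance_model))
```

Returning `None` for abstention keeps the decision (whether to speak) separate from the content (what to say), so the downstream agent's prompt is left unchanged whenever the guidance model stays silent.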

Get this paper in your agent:

hf papers read 2605.21463

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash
