How LoRA Remembers? A Parametric Memory Law for LLM Finetuning
Authors: Ziwen Xu, Haiwen Hong, Linsong Yu, Benglei Cui, Longtao Huang, Hui Xue, Ningyu Zhang
Published: 2026-05-28 · arXiv: 2605.30260 · Code: https://github.com/zjunlp/ParametricMemoryLaw
Abstract
AI-generated summary: This research investigates the quantitative limits of parametric memory in large language models using LoRA as a probe, establishes a power-law relationship, and develops a threshold-guided optimization method to improve memory performance.
Large Language Models (LLMs) must continuously learn and update knowledge to remain effective in dynamic real-world environments. While Low-Rank Adaptation (LoRA) is widely used for such memory updates, existing studies mainly rely on qualitative downstream evaluations, leaving the quantitative capacity limits and underlying dynamics of exact parametric memory largely unexplored. To bridge this gap, we employ LoRA as a controlled memory capacity probe within the latent space to systematically quantify exact parametric memory. We introduce the Parametric Memory Law, a robust power law linking loss reduction ΔL to effective parameters and sequence length. At the token level, fine-grained analysis reveals a deterministic phase transition, demonstrating that a prediction probability of p > 0.5 constitutes a sufficient condition for verbatim recall under greedy decoding. Driven by these insights, we introduce MemFT, a threshold-guided optimization strategy that dynamically redistributes the training budget toward sub-threshold tokens. Empirical evaluations demonstrate that MemFT can enhance memory fidelity and efficiency. Code will be released at https://github.com/zjunlp/ParametricMemoryLaw.
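The threshold claim in the abstract follows from a simple counting argument: next-token probabilities sum to 1, so a target token with p > 0.5 is necessarily the argmax, and greedy decoding then reproduces the same prefix at the next step. The sketch below illustrates that check, along with one possible way to concentrate the loss on sub-threshold tokens. It is not the authors' MemFT implementation: the abstract does not specify the weighting scheme, so the hard mask in `sub_threshold_loss`, and all function names here, are assumptions made for illustration.

```python
import math

THRESHOLD = 0.5  # from the abstract: p > 0.5 is sufficient for greedy recall


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def target_probs(step_logits, targets):
    """Probability assigned to each target token (teacher-forced logits)."""
    return [softmax(logits)[t] for logits, t in zip(step_logits, targets)]


def guaranteed_recall(probs, threshold=THRESHOLD):
    """True if every target token clears the threshold, which guarantees
    that greedy decoding reproduces the sequence verbatim."""
    return all(p > threshold for p in probs)


def greedy_tokens(step_logits):
    """Argmax token index at each step."""
    return [max(range(len(l)), key=l.__getitem__) for l in step_logits]


def sub_threshold_loss(probs, threshold=THRESHOLD):
    """Mean negative log-likelihood over sub-threshold tokens only.
    ASSUMPTION: a hard mask standing in for MemFT's budget redistribution."""
    losses = [-math.log(p) for p in probs if p <= threshold]
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    # Toy 3-token vocabulary; the middle step is the argmax but below 0.5.
    step_logits = [[2.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 3.0]]
    targets = [0, 1, 2]
    probs = target_probs(step_logits, targets)
    print(guaranteed_recall(probs), greedy_tokens(step_logits) == targets)
    print(sub_threshold_loss(probs))
```

In the toy example the middle token is still recalled under greedy decoding even though its probability is below 0.5, which shows that the threshold is a sufficient condition and not a necessary one. Only that token contributes to the masked loss.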
Community
We uncover a quantitative law of parametric memory in LLMs, showing that exact recall emerges through a sharp probability threshold and can be significantly improved with threshold-guided fine-tuning.
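This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
* Parameter Efficiency Is Not Memory Efficiency: Rethinking Fine-Tuning for On-Device LLM Adaptation (2026), https://huggingface.co/papers/2604.22783
* TSUBASA: Improving Long-Horizon Personalization via Evolving Memory and Self-Learning with Context Distillation (2026), https://huggingface.co/papers/2604.07894
* IndexMem: Learned KV-Cache Eviction with Latent Memory for Long-Context LLM Inference (2026), https://huggingface.co/papers/2605.25475
* Self-Pruned Key-Value Attention: Learning When to Write by Predicting Future Utility (2026), https://huggingface.co/papers/2605.14037
* Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference (2026), https://huggingface.co/papers/2604.07394
* MixSD: Mixed Contextual Self-Distillation for Knowledge Injection (2026), https://huggingface.co/papers/2605.16865
* FocuSFT: Bilevel Optimization for Dilution-Aware Long-Context Fine-Tuning (2026), https://huggingface.co/papers/2605.09932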