Hugging Face Daily Papers · May 29, 2026 · 6 min read

How LoRA Remembers? A Parametric Memory Law for LLM Finetuning

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.


We uncover a quantitative law of parametric memory in LLMs, showing that exact recall emerges through a sharp probability threshold and can be significantly improved with threshold-guided fine-tuning.


arxiv:2605.30260


Published on May 28, 2026 · Submitted by Ningyu Zhang on May 29, 2026 · alibaba-inc


Authors: Ziwen Xu, Haiwen Hong, Linsong Yu, Benglei Cui, Longtao Huang, Hui Xue, Ningyu Zhang

AI-generated summary

Research investigates the quantitative limits of parametric memory in large language models using LoRA as a probe, establishing a power law relationship and developing a threshold-guided optimization method for improved memory performance.

Abstract

Large Language Models (LLMs) must continuously learn and update knowledge to remain effective in dynamic real-world environments. While Low-Rank Adaptation (LoRA) is widely used for such memory updates, existing studies mainly rely on qualitative downstream evaluations, leaving the quantitative capacity limits and underlying dynamics of exact parametric memory largely unexplored. To bridge this gap, we employ LoRA as a controlled memory capacity probe within the latent space to systematically quantify exact parametric memory. We introduce the Parametric Memory Law, a robust power law linking loss reduction ΔL to effective parameters and sequence length. At the token level, fine-grained analysis reveals a deterministic phase transition, demonstrating that a prediction probability of p > 0.5 constitutes a sufficient condition for verbatim recall under greedy decoding. Driven by these insights, we introduce MemFT, a threshold-guided optimization strategy that dynamically redistributes the training budget toward sub-threshold tokens. Empirical evaluations demonstrate that MemFT can enhance memory fidelity and efficiency. Code will be released at https://github.com/zjunlp/ParametricMemoryLaw.
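The threshold claim in the abstract follows from simple arithmetic: probabilities sum to 1, so a token with p > 0.5 leaves less than 0.5 for all other tokens combined and is therefore the unique argmax that greedy decoding emits. If that holds at every position given the correct prefix, the whole sequence is reproduced. The condition is sufficient but not necessary, since a token with p = 0.4 can still be the argmax. The sketch below illustrates this in plain Python. The names `recalled_verbatim` and `sub_threshold_loss` are hypothetical, and the loss is only one possible reading of "redistributing the training budget toward sub-threshold tokens". It is not the MemFT objective, whose details are not given here.

```python
import math

THRESHOLD = 0.5  # probability threshold named in the abstract


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(probs):
    """Return the index greedy decoding would emit (the argmax)."""
    return max(range(len(probs)), key=lambda i: probs[i])


def recalled_verbatim(step_probs, targets):
    """True if greedy decoding emits the target token at every step.

    step_probs[t] is the distribution at step t given the correct prefix.
    """
    return all(greedy_pick(p) == tok for p, tok in zip(step_probs, targets))


def sub_threshold_loss(target_probs, threshold=THRESHOLD):
    """Hypothetical objective (an assumption, not MemFT itself): mean negative
    log-likelihood over tokens whose target probability has not yet cleared
    the threshold. Tokens already above it contribute nothing."""
    pending = [-math.log(p) for p in target_probs if p <= threshold]
    return sum(pending) / len(pending) if pending else 0.0


# Toy two-step sequence over a three-token vocabulary (made-up logits).
steps = [softmax([2.0, 0.5, 0.1]), softmax([0.0, 3.0, 1.0])]
targets = [0, 1]
target_probs = [dist[tok] for dist, tok in zip(steps, targets)]

# Every target clears the threshold, so greedy decoding must recall the sequence.
print(all(p > THRESHOLD for p in target_probs), recalled_verbatim(steps, targets))

# Only the two sub-threshold tokens (0.4 and 0.2) contribute to the loss.
print(sub_threshold_loss([0.9, 0.4, 0.2]))
```

Masking tokens above the threshold is the bluntest way to shift effort toward the tokens that still block exact recall. A softer weighting would serve the same illustrative purpose.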