Canonicalized Stable-List Replay for Private Federated Continual Learning over Language-Model Embeddings
Mirrored from arXiv — Machine Learning for archival readability. Support the source by reading on the original site.
arXiv:2606.00426v1 Announce Type: new
Abstract: Federated continual learning (FCL) lets distributed clients adapt language-model heads to evolving NLP tasks without sharing raw text. Under user-level differential privacy (DP), replay-based continual learning faces a structural obstacle: clients can release only small noisy lists of candidate replay summaries, and those lists are unordered across clients. We introduce Canonicalized Stable-List Replay (CSLR), where clients privately produce candidate replay distributions over a shared sentence-embedding space and the server aligns them using signatures induced by public anchor sentences. The anchors provide identifiability for aggregation rather than additional replay data. We prove that, under an observable anchor-signature margin, $O(\log(N/\eta)/p)$ anchors distinguish $N$ candidate list elements with probability at least $1-\eta$, and we give a scoped anchorless non-identifiability result for unordered-label oracle models. Across five seeds on continual classification, NER, and dialogue benchmarks, CSLR improves the final average task metric by 3.9--5.6 points over the strongest non-CSLR DP baseline at $\varepsilon=4$ under the reported replay-release budget, while also outperforming Hungarian and optimal-transport matchers. The formal privacy guarantee covers replay release; end-to-end private training additionally requires composition with a private optimizer for task-head updates.
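To make the alignment problem concrete, the following is a minimal sketch of why unordered candidate lists cannot be averaged directly and how anchor-induced signatures can give every client the same canonical ordering. It is an illustration under stated assumptions, not the paper's algorithm: the function names (`anchor_signature`, `canonicalize`, `aggregate`), the dot-product signature, and the lexicographic sort are all hypothetical choices, and the DP noise on released lists is omitted.

```python
import numpy as np

def anchor_signature(candidates, anchors):
    """Signature of each candidate: its dot product with every public anchor.

    candidates: (k, d) array of candidate replay summaries (assumed embeddings).
    anchors:    (m, d) array of public anchor embeddings.
    Returns a (k, m) array. The dot-product form is an assumption of this sketch.
    """
    return candidates @ anchors.T

def canonicalize(candidates, anchors):
    """Reorder one client's list by lexicographic order of anchor signatures."""
    sig = anchor_signature(candidates, anchors)
    # np.lexsort treats the LAST key as the primary key, so reverse the
    # anchor rows to make anchor 0 the primary sort key.
    order = np.lexsort(sig.T[::-1])
    return candidates[order]

def aggregate(client_lists, anchors):
    """Average client lists slot by slot after canonical reordering."""
    return np.mean([canonicalize(c, anchors) for c in client_lists], axis=0)

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 16))   # hypothetical public anchors
X = rng.normal(size=(5, 16))         # one client's candidate list
perm = np.array([3, 0, 4, 1, 2])     # a second client holds the same items, shuffled

naive = np.mean([X, X[perm]], axis=0)        # slot-wise mean mixes unrelated items
aligned = aggregate([X, X[perm]], anchors)   # canonical order removes the shuffle
```

In this noise-free toy case the aligned average reproduces the canonicalized list, while the naive slot-wise mean blends unrelated candidates. With noisy releases, a sort like this is only stable when the signatures are separated by a sufficient margin, which is the kind of condition the abstract's anchor-count bound is stated under.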