Copyright
22 articles archived under #copyright

arXiv — Machine Learning · research · 1d ago
CBD: API-Only LLM Black-Box Unlearning through Controlled Behavioral Divergence
arXiv:2606.27683v1 Announce Type: new Abstract: Edge devices increasingly invoke large language models (LLMs) through API services for context aware edge intelligence, while edge generated data may be collected to improve LLMs and may introduce sensitive, copyrighted, harmful,… 11

arXiv — NLP / Computation & Language · research · 1d ago
Position: The Term "Machine Unlearning" Is Overused in LLMs
arXiv:2606.27379v1 Announce Type: new Abstract: Large language models increasingly face demands to "forget" training data, knowledge, or behaviors due to regulatory deletion obligations, copyright/licensing disputes, and safety or product-policy requirements. This position paper… 15

Ars Technica — AI · news-outlet · 3d ago
NYT slams Microsoft for building copyright-infringing supercomputer for OpenAI
NYT shifts OpenAI/Microsoft copyright claims after SCOTUS ruling against Sony. 20

arXiv — NLP / Computation & Language · research · 12d ago
Output Vector Editing for Memorization Mitigation in Large Language Models
arXiv:2606.18767v1 Announce Type: new Abstract: Large language models memorize and reproduce sequences from their training data, creating privacy, copyright, and security risks. Existing neuron-level mitigation methods equate editing with zeroing out neuron activations, but the… 24

llama.cpp releases · dev-tools · 12d ago
b9684
[SYCL] Add conv_3d (#24691) add conv_3d optimize update ops.md restore test script rm unused code rm copyright notes macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu… 15

Hacker News — AI on Front Page · community · 19d ago
H.R. 6028 would fundamentally change the U.S. Copyright Office
Article URL: https://www.eff.org/deeplinks/2026/06/congress-just-rushed-through-disastrous-copyright-office-overhaul Comments URL: https://news.ycombinator.com/item?id=48484496 Points: 209 # Comments: 65 25

r/MachineLearning · community · 22d ago
ICML rejected paper visibility [D]
If ICML conference paper is rejected and no one opts-in or opts-out to keep the reviews visible, will the reviews be visible to everyone? There was clear instruction that only papers with at-least 1 opt-in AND zero opt-out options will be visible. None of the authors selected… 7

arXiv — Machine Learning · research · 22d ago
Where Rectified Flows Leak: Characterising Membership Signals Along the Interpolation Path
arXiv:2606.07271v1 Announce Type: new Abstract: Understanding what generative models retain from training data remains challenging, with implications for copyright and privacy. Beyond verbatim reproduction, models can encode subtler traces of their training data that never… 12

TechCrunch — AI · news-outlet · 26d ago
Publishers will be able to opt out of AI Search, thanks to new regulation
U.K. regulators are requiring Google offer a tool allowing website publishers to opt-out of generative AI search features. The option will be tested in the UK then rolled out globally. 10

arXiv — Machine Learning · research · 28d ago
Geometric Erasure by Contrastive Velocity Matching in Rectified Flows
arXiv:2606.00140v1 Announce Type: new Abstract: While the rapid adoption of multimodal generative models offers immense potential, it has also increased the risks of harmful content synthesis, deepfakes, and copyright infringements. To address these challenges, concept erasure… 21

arXiv — NLP / Computation & Language · research · 29d ago
Divergence Decoding: Inference-Time Unlearning via Auxiliary Models
arXiv:2605.31293v1 Announce Type: new Abstract: Large Language Models (LLMs) frequently memorize sensitive training data thereby creating significant privacy and copyright risks. Addressing these risks, i.e., removing such knowledge from an existing model checkpoint, has proven… 34

arXiv — Machine Learning · research · 1mo ago
Localizing Memorized Regions in Diffusion Models via Coordinate-Wise Curvature Differences
arXiv:2605.26756v1 Announce Type: new Abstract: Diffusion models can unintentionally memorize training samples, raising concerns about privacy and copyright. While recent methods can detect memorization, they often rely on global or model-specific signals and provide limited… 30

arXiv — NLP / Computation & Language · research · 1mo ago
Translators as Invisible Teachers of AI: Copyright, Translation Memory, and the Political Economy of Linguistic Data
arXiv:2605.24842v1 Announce Type: new Abstract: This paper examines how the labour of translators has been transformed into foundational data capital for the age of artificial intelligence (AI). Translation memories (TM) and parallel corpora preserve a one-to-one correspondence… 6

Hacker News — AI on Front Page · community · 1mo ago
Show HN: Auto-identity-remove – Automated data broker opt-out runner for macOS
Article URL: https://github.com/stephenlthorn/auto-identity-remove Comments URL: https://news.ycombinator.com/item?id=48178184 Points: 282 # Comments: 112 15

arXiv — NLP / Computation & Language · research · 1mo ago
Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training
arXiv:2506.01732v3 Announce Type: replace Abstract: Large Language Models (LLMs) are pre-trained on large amounts of data from different sources and domains. Such datasets often contain trillions of tokens, including large portions of copyrighted or proprietary content, which… 11

Ars Technica — AI · news-outlet · 1mo ago
Authors fight for higher payouts from Anthropic’s $1.5B copyright settlement
Lawyers accused of rushing historic settlement to seize $320 million in fees. 8

arXiv — NLP / Computation & Language · research · 1mo ago
To See is Not to Learn: Protecting Multimodal Data from Unauthorized Fine-Tuning of Large Vision-Language Model
arXiv:2605.14291v1 Announce Type: cross Abstract: The rapid advancement of Large Vision-Language Models (LVLMs) is increasingly accompanied by unauthorized scraping and training on multimodal web data, posing severe copyright and privacy risks to data owners. Existing… 8

arXiv — Machine Learning · research · 1mo ago
Inference-Time Machine Unlearning via Gated Activation Redirection
arXiv:2605.12765v1 Announce Type: new Abstract: Large Language Models memorize vast amounts of training data, raising concerns regarding privacy, copyright infringement, and safety. Machine unlearning seeks to remove the influence of a targeted forget set while preserving model… 10

arXiv — NLP / Computation & Language · research · 1mo ago
Robust LLM Unlearning Against Relearning Attacks: The Minor Components in Representations Matter
arXiv:2605.11685v1 Announce Type: new Abstract: Large language model (LLM) unlearning aims to remove specific data influences from pre-trained model without costly retraining, addressing privacy, copyright, and safety concerns. However, recent studies reveal a critical… 17

Vercel — AI · dev-tools · 2mo ago
Team-wide Zero Data Retention and prompt training controls now on AI Gateway
AI Gateway now supports Zero Data Retention (ZDR) at the team level, removing the need to configure opt-outs or reach agreements with each provider individually. It routes requests only to providers where ZDR agreements are in place, with support for Anthropic, OpenAI, Google,… 35

Smol AI News · news-outlet · 5mo ago
not much happened today
**Stanford paper** reveals **Claude 3.7 Sonnet** memorized **95.8% of Harry Potter 1**, highlighting copyright extraction risks compared to **GPT-4.1**. **Google AI Studio** sponsors **TailwindCSS** amid OSS funding debates. **Google** and **Sundar Pichai** launch **Gmail Gemini… 21

Eugene Yan · research · 27mo ago
Task-Specific LLM Evals that Do & Don't Work
Evals for classification, summarization, translation, copyright regurgitation, and toxicity. 9