Tag: Outages
62 articles archived under #outage

arXiv — Machine Learning research 2h ago Towards Improved Anomaly Detection for Cloud Cybersecurity via Graph Neural Networks arXiv:2606.28923v1 Announce Type: new Abstract: Detecting security threats in an organization's cloud computing environment has become necessary due to the increased reliance on cloud infrastructure. Logging of all cloud computing events enables investigation into any incidents… 24

arXiv — Machine Learning research 1d ago Layerwise Progressive Freezing: A Training Scaffold for Depth-Scalable Binary Networks arXiv:2606.27759v1 Announce Type: new Abstract: Training binary neural networks (BNNs) from scratch is dominated by the straight-through estimator (STE), whose forward/backward mismatch produces severe accuracy degradation as networks deepen. We study an orthogonal axis: when… 12

arXiv — NLP / Computation & Language research 1d ago Cross-Platform Chinese Offensive Comment Detection via Dual-Threshold Hard Example Mining arXiv:2606.27629v1 Announce Type: new Abstract: Cross-platform deployment of offensive comment detection for Chinese social media suffers performance degradation. The paper proposes a dual-threshold hard mining method to address this. First, the clean-Chinese-base RoBERTa is… 16

arXiv — NLP / Computation & Language research 1d ago DMV-Bench: Diagnosing Long-Horizon Multimodal Agents' Visual Memory with Incidental Cue Injection arXiv:2606.27499v1 Announce Type: cross Abstract: Research on agent memory has matured rapidly, but almost entirely on the text side: few existing benchmarks ask, in an interactive environment, when an agent genuinely needs to remember what it saw rather than what it could write… 11

Simon Willison community 3d ago Incident Report: CVE-2026-LGTM Spectacular hypothetical incident report by Andrew Nesbitt. Day 2, 16:00 UTC --- Two AI review agents from competing vendors, both attached to a downstream pull request bumping foxhole-lz4, enter a disagreement loop over whether the package is… 5

Hacker News — AI on Front Page community 3d ago Incident CVE-2026-LGTM Article URL: https://nesbitt.io/2026/06/26/incident-report-cve-2026-lgtm.html Comments URL: https://news.ycombinator.com/item?id=48686093 Points: 225 # Comments: 39 17

arXiv — NLP / Computation & Language research 4d ago Helpfulness Hurts: Domain-Dependent Degradation of Mid-Trained Compassion Values Under Post-Training arXiv:2606.26102v1 Announce Type: new Abstract: Standard post-training pipelines apply supervised fine-tuning (SFT) and reinforcement learning (RL) to make language models helpful, but these processes may inadvertently degrade values instilled during pre-training. We investigate… 22

arXiv — Machine Learning research 5d ago The Gentle Collapse: Distributional Metrics for Continual Learning arXiv:2606.25165v1 Announce Type: new Abstract: Accuracy degradation is the standard metric for Catastrophic Forgetting (CF), however, it records only whether forgetting occurred or not. It saturates at the extremes and collapses discretely at task boundaries, hiding the… 35

arXiv — NLP / Computation & Language research 5d ago How Robust is OCR-Reasoning? Evaluating OCR-Reasoning Robustness of Vision-Language Models under Visual Perturbations arXiv:2606.26041v1 Announce Type: cross Abstract: Vision-language models (VLMs) have achieved strong performance on OCR-based benchmarks and increasingly focused on text-rich understanding, but their robustness under controlled visual degradation remains insufficiently… 29

arXiv — NLP / Computation & Language research 6d ago Pigeonholing: Bad prompts hurt models to collapse and make mistakes arXiv:2606.24267v1 Announce Type: new Abstract: While in-context learning is generally shown to be effective in Large Language Models (LLMs), bad contexts can cause performance degradation and mode collapse, a phenomenon we call "pigeonholing." **Unintentionally bad** contexts… 26

arXiv — Machine Learning research 12d ago Quantum Annealing Enhanced Reinforcement Learning for Accurate Remaining Useful Lifetime Prediction arXiv:2606.18503v1 Announce Type: new Abstract: Remaining useful life (RUL) estimation is central to predictive maintenance, where an unplanned failure can cost far more than the asset itself. Statistical degradation models miss the strong nonlinearity of real systems, and… 38

arXiv — NLP / Computation & Language research 13d ago Scaling Enterprise Agent Routing: Degradation, Diagnosis, and Recovery arXiv:2606.17519v1 Announce Type: new Abstract: Production LLM assistants route user requests to growing libraries of specialized tools, but how does routing accuracy degrade as the catalog scales? We study single-step routing on a 110-agent, 584-tool catalog from a deployed… 14

arXiv — NLP / Computation & Language research 13d ago The Slop Paradox: How Synthetic Standardization Erodes Clinical Uncertainty and Cross-Modal Alignment in AI-Rewritten Radiology Reports arXiv:2606.17791v1 Announce Type: new Abstract: AI-assisted clinical documentation tools increasingly summarize, standardize, and reformat radiology reports using large language models (LLMs). We present a controlled measurement of the resulting information degradation. Using… 24

arXiv — NLP / Computation & Language research 15d ago LLMs Contain Multitudes: How Deployment Context Reshapes Model-Level Preferences and Values arXiv:2606.13944v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly characterised in recent evaluation work as having stable, model-level preference and value systems. However, accompanying robustness checks are limited to incidental prompt… 33

Hacker News — AI on Front Page community 16d ago Arch Linux Now Believes Malware Incident Under Control: More Than 1,500 Packages Article URL: https://www.phoronix.com/news/Arch-Linux-AUR-More-Than-1500 Comments URL: https://news.ycombinator.com/item?id=48516379 Points: 238 # Comments: 146 32

arXiv — NLP / Computation & Language research 18d ago Multi-Turn Reasoning When Context Arrives in Pieces: Scalable Sharding and Memory-Augmented RL arXiv:2606.12941v1 Announce Type: new Abstract: When a user reveals task-critical information across several conversation turns, LLM accuracy drops by up to 65% despite full context availability. We show that this Lost in Conversation degradation can be substantially mitigated… 31

Smol AI News news-outlet 19d ago not much happened today **Anthropic** reversed its covert degradation policy on **Claude Fable 5** after public backlash, sparking debates on governance, transparency, and access to frontier AI models. The model shows strong capabilities with mixed benchmark results, including **87.8% on WeirdML** and… 19

arXiv — NLP / Computation & Language research 19d ago Lius: Translation Model Based Instructional Lingustic Using Continual Instruction Tuning In Kupang Malay arXiv:2606.11786v1 Announce Type: new Abstract: Large Language Models (LLMs) offer new potential for translation tasks but often experience performance degradation when handling low-resource languages. To address this limitation, we propose an approach for fine-tuning LLMs on a… 37

r/MachineLearning community 19d ago [R] AI Agent Security: The Complete Guide to Threats, Defenses, and the Future of Autonomous AI Safety [R] This is a comprehensive living reference guide to AI agent security — synthesizing 18 articles from The Agent Report covering the 75-day period (April–June 2026) when agent security went from theoretical concern to operational crisis. What's inside: • Incident… 4

arXiv — NLP / Computation & Language research 20d ago Small Data, Big Noise: Adversarial Training for Robust Parameter-Efficient Fine-Tuning arXiv:2606.10610v1 Announce Type: new Abstract: Parameter-Efficient Fine-Tuning (PEFT) has become essential for adapting foundation models to downstream NLP tasks. However, current PEFT methods often struggle with robustness to noise and performance degradation on limited… 25

arXiv — NLP / Computation & Language research 20d ago SpenseGPT: Practical One-shot Pruning Enabling Sparse and Dense GEMMs for LLM Inference arXiv:2606.10445v1 Announce Type: cross Abstract: Semi-structured 2:4 sparsity is widely supported by modern accelerators, providing up to a 2x theoretical speedup. However, its strict 50% sparsity constraint often causes non-negligible accuracy degradation under post-training… 21

arXiv — Machine Learning research 21d ago STARIXNet: Multivariate and Multi-attribute Deep Learning Approach to Real-Time Resource Allocation in Cloud Platforms arXiv:2606.07565v1 Announce Type: new Abstract: Intelligent scaling of microservices in cloud platforms is crucial for mitigating escalating compute costs while avoiding service disruptions. Current solutions are limited to the univariate space, typically focusing on CPU usage… 24

arXiv — Machine Learning research 21d ago Outage Detection in Self-Healing Smart Grids Using Reinforcement Learning with Spectral Graph Neural Networks arXiv:2606.07583v1 Announce Type: new Abstract: Self-healing smart grids can quickly adjust their network configuration during outages to minimize power disruptions. During an outage, several actions can be taken, such as network reconfiguration through switching operations and… 25

arXiv — NLP / Computation & Language research 22d ago Signal-Driven Observation for Long-Horizon Web Agents arXiv:2606.06708v1 Announce Type: new Abstract: Web agents operating over long horizons ingest raw DOM and accessibility trees -- routinely tens of thousands of tokens -- at every action step, causing progressive context degradation that erodes reasoning well before tasks… 7

TechCrunch — AI news-outlet 22d ago Notion restores access to Anthropic after service disruption Notion's head of product said he was "astonished" at “the amount of people RT-ing this." 5

arXiv — NLP / Computation & Language research 25d ago Epidemiology of Model Collapse: Modeling Synthetic Data Contamination via Bilayer SIR Dynamics arXiv:2606.05168v1 Announce Type: new Abstract: Training on synthetic data causes model collapse, but existing analyses treat this as single-chain degradation. In reality, the AI ecosystem involves cross-contamination: models ingest synthetic data from other models, produce new… 19

Hugging Face Daily Papers research 25d ago Token Budgets: An Empirical Catalog of 63 LLM-Agent Budget-Overrun Incidents, with an Affine-Typed Rust Mitigation as a Case Study Abstract LLM-agent budget overruns are a documented production failure class: a single retry loop can spend thousands of dollars before an operator notices, and the in-process integrity properties that would prevent it (no aliasing, no double-spend, no use-after-delegation of a… 18

arXiv — Machine Learning research 26d ago Recover-LoRA for Aggressive Quantization: Reclaiming Accuracy in 2-Bit Language Models via Low-Rank Adaptation with Knowledge Distillation on Synthetic Data arXiv:2606.04238v1 Announce Type: new Abstract: Aggressive weight quantization to 2-bit precision offers substantial throughput and memory gains for large language model (LLM) inference, but typically incurs severe accuracy degradation. These gains are particularly relevant for… 22

arXiv — Machine Learning research 26d ago Revisiting Privacy Amplification by Subsampling in Selective Release DPSGD arXiv:2606.04384v1 Announce Type: new Abstract: Machine learning's reliance on sensitive data necessitates privacy-preserving techniques like Differentially Private Stochastic Gradient Descent (DPSGD). However, DPSGD suffers from substantial utility degradation and slow… 28

Hugging Face Daily Papers research 27d ago A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL Abstract Multi-domain reinforcement learning in language models causes performance degradation through shared computational pathways, but targeted refresh and rollback techniques can selectively recover lost capabilities with minimal side effects. Generated by… 34

arXiv — Machine Learning research 27d ago Learning to See via Epiretinal Implant Stimulation in silico with Model-Based Deep Reinforcement Learning arXiv:2606.03118v1 Announce Type: new Abstract: Objective: Diseases such as age-related macular degeneration and retinitis pigmentosa cause the degradation of the photoreceptor layer. One approach to restore vision is to electrically stimulate the surviving retinal ganglion… 38

arXiv — Machine Learning research 28d ago RAFT: Data Refinement and Adaptive Distillation for Domain Fine-Tuning with Alleviated Forgetting arXiv:2606.00147v1 Announce Type: new Abstract: Domain-specific supervised fine-tuning (SFT) often improves in-domain performance at the cost of degrading a model's general capabilities. We view this degradation through two practical gaps in domain SFT: a… 10

TechCrunch — AI news-outlet 28d ago Florida sues OpenAI, Sam Altman, in first-of-its-kind lawsuit over violent incidents The lawsuit partially revolves around a shooting at Florida State University last year, and ChatGPT's alleged role in the incident. 18

arXiv — Machine Learning research 29d ago idSCD: Identifying Training Datasets through Semantic Correlation Descriptors arXiv:2605.30462v1 Announce Type: new Abstract: Can a dataset be recognized from the spurious correlations it induces during training? We argue that datasets leave dataset-specific traces in a model's learned semantic correlation structure: incidental regularities that are… 25

arXiv — Machine Learning research 29d ago Benchmarking Machine Learning Uncertainty Quantification Methodologies for Predicting Turbine Gas Temperature Degradation arXiv:2605.30585v1 Announce Type: new Abstract: Effective prognostics and health management of modern engines relies on accurate turbine gas temperature predictions and robust uncertainty quantification to ensure reliability and safety. This paper investigates five major… 38

r/LocalLLaMA community 29d ago Qwen3.6-35B vs Gemma4-26B on 7900 XTX Ran a fair comparison between Qwen3.6-35B-A3B and Gemma4-26B-A4B on my Radeon 7900 XTX. Both reasoning-enabled at matching 32K budgets, no output caps, six generic real-world prompts (meeting notes, incident postmortem, log triage to JSON, code review, a build-vs-buy decision, a… 9

arXiv — Machine Learning research 1mo ago SCOPE: A Lightweight-training LLM Framework for Air Traffic Control Readback Monitoring arXiv:2605.29543v1 Announce Type: new Abstract: Pilot readback of Air Traffic Control (ATC) voice instructions is a primary safeguard against miscommunication in air transportation. However, readback anomalies remain implicated in approximately 80% of aviation incidents. This… 10

The Information — AI news-outlet 1mo ago Blue Origin New Glenn Rocket Explodes During Test Jeff Bezos’ space company Blue Origin suffered a serious setback Thursday evening when its New Glenn rocket exploded on a launch pad in Florida during a test. Video clips of the incident show a giant fireball engulfing the rocket and surrounding structures. No one was on board… 18

Hacker News — AI on Front Page community 1mo ago Incident with Pull Requests, Issues, Git Operations and API Requests Article URL: https://www.githubstatus.com/incidents/xy1tt3hs572m Comments URL: https://news.ycombinator.com/item?id=48293080 Points: 211 # Comments: 164 20

arXiv — NLP / Computation & Language research 1mo ago MAIGO: Mitigating Lost-in-Conversation with History-Cleaned On-Policy Self-Distillation arXiv:2605.27186v1 Announce Type: new Abstract: Large language models often solve tasks from a fully specified prompt but degrade when the same requirements unfold over multiple turns, known as the lost-in-conversation (LiC) gap. We trace part of this degradation to… 33

Vercel — AI dev-tools 1mo ago Building a real-time power outage map with Next.js on Vercel Endeavour Energy on Vercel Sub-1s page loads during peak storm traffic Five-minute data sync cycle across all upstream systems 38% faster deployments compared to their previous platform When a summer storm rolls through New South Wales in Australia, hundreds of thousands of… 7

arXiv — Machine Learning research 1mo ago Coupling-Robust Accuracy in Multiphysics Physics Informed Neural Networks via Kronecker-Preconditioned Optimization arXiv:2605.23391v1 Announce Type: new Abstract: Physics-informed neural networks (PINNs) for coupled multiphysics systems suffer systematic accuracy degradation as inter-equation coupling strengthens. We provide a theoretical explanation for this phenomenon through neural… 5

arXiv — NLP / Computation & Language research 1mo ago Model Collapse as Cultural Evolution arXiv:2605.23054v1 Announce Type: new Abstract: Model collapse, the progressive degradation of LLMs trained on their own outputs, has been characterized statistically but lacks a linguistic explanation for which structures degrade, in what order, and why. We show that iterated… 28

r/LocalLLaMA community 1mo ago Gemma4 26b a4b Apex quant is quite good I tried mudler's apex quant for gemma4 26b a4b and it was amazing! I got 38tps at 90.000 context with no loop and suprisingly no quality degradation. I used mudler/gemma-4-26B-A4B-it-APEX-GGUF / APEX-I-Compact (15gb) on my RX 9060 XT 16 GB with llama.cpp Vulkan. For comperison,… 19

Hugging Face Daily Papers research 1mo ago SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation Abstract SpaceDG dataset and benchmark evaluate multimodal language models' spatial reasoning robustness under visual degradations, revealing significant performance gaps and demonstrating improved robustness through targeted training. AI-generated summary Multimodal Large… 26

arXiv — Machine Learning research 1mo ago On-Policy Consistency Training Improves LLM Safety with Minimal Capability Degradation arXiv:2605.21834v1 Announce Type: new Abstract: Aligned models can misbehave in several ways: they are often sycophantic, fall victim to jailbreaks, or fail to include appropriate safety warnings. Consistency training is a promising new alignment paradigm to mitigate such… 31

arXiv — NLP / Computation & Language research 1mo ago SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation arXiv:2605.22536v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) have made rapid progress in spatial intelligence, yet existing spatial reasoning benchmarks largely assume pristine visual inputs and overlook the degradations that commonly occur in… 10

The Information — AI news-outlet 1mo ago Cursor Sees Opening as GitHub Flounders Microsoft's GitHub unit has been on the defensive lately. Amid a series of outages and other snags, Jay Parikh, who oversees the software-project management platform, recently warned deputies that coding tools from Cursor and Anthropic could eventually make GitHub obsolete, my… 12

arXiv — Machine Learning research 1mo ago Decomposing MXFP4 quantization error for LLM reinforcement learning: reducible bias, recoverable deadzone, and an irreducible floor arXiv:2605.20402v1 Announce Type: new Abstract: MXFP4 arithmetic can dramatically accelerate reinforcement learning (RL) post-training of large language models (LLMs), yet the quantization error introduces severe accuracy degradation. Existing work treats the quantization error… 9

arXiv — Machine Learning research 1mo ago Graph-Driven Cross-Industry Real-Time Monitoring Framework for Anti-Money Laundering Detection in Converged Mobility-Energy Supply Chain Networks arXiv:2605.18844v1 Announce Type: new Abstract: With the deep integration of the travel and energy industries, cross-industry supply chain finance has gradually become a high-risk field of hidden money laundering incidents. For this reason, this work proposes a graph-driven… 22

Page 1 of 2 · 62 articles