Tag

Outages

62 articles archived under #outage · RSS

arXiv — Machine Learning research 2h ago

Towards Improved Anomaly Detection for Cloud Cybersecurity via Graph Neural Networks

arXiv:2606.28923v1 Announce Type: new Abstract: Detecting security threats in an organization's cloud computing environment has become necessary due to the increased reliance on cloud infrastructure. Logging of all cloud computing events enables investigation into any incidents…

24
arXiv — Machine Learning research 1d ago

Layerwise Progressive Freezing: A Training Scaffold for Depth-Scalable Binary Networks

arXiv:2606.27759v1 Announce Type: new Abstract: Training binary neural networks (BNNs) from scratch is dominated by the straight-through estimator (STE), whose forward/backward mismatch produces severe accuracy degradation as networks deepen. We study an orthogonal axis: when…

12
arXiv — NLP / Computation & Language research 1d ago

Cross-Platform Chinese Offensive Comment Detection via Dual-Threshold Hard Example Mining

arXiv:2606.27629v1 Announce Type: new Abstract: Cross-platform deployment of offensive comment detection for Chinese social media suffers performance degradation. The paper proposes a dual-threshold hard mining method to address this. First, the clean-Chinese-base RoBERTa is…

16
arXiv — NLP / Computation & Language research 1d ago

DMV-Bench: Diagnosing Long-Horizon Multimodal Agents' Visual Memory with Incidental Cue Injection

arXiv:2606.27499v1 Announce Type: cross Abstract: Research on agent memory has matured rapidly, but almost entirely on the text side: few existing benchmarks ask, in an interactive environment, when an agent genuinely needs to remember what it saw rather than what it could write…

11
Simon Willison community 3d ago

Incident Report: CVE-2026-LGTM

Spectacular hypothetical incident report by Andrew Nesbitt. Day 2, 16:00 UTC --- Two AI review agents from competing vendors, both attached to a downstream pull request bumping foxhole-lz4, enter a disagreement loop over whether the package is…

5
Hacker News — AI on Front Page community 3d ago

Incident CVE-2026-LGTM

Article URL: https://nesbitt.io/2026/06/26/incident-report-cve-2026-lgtm.html Comments URL: https://news.ycombinator.com/item?id=48686093 Points: 225 # Comments: 39

17
arXiv — NLP / Computation & Language research 4d ago

Helpfulness Hurts: Domain-Dependent Degradation of Mid-Trained Compassion Values Under Post-Training

arXiv:2606.26102v1 Announce Type: new Abstract: Standard post-training pipelines apply supervised fine-tuning (SFT) and reinforcement learning (RL) to make language models helpful, but these processes may inadvertently degrade values instilled during pre-training. We investigate…

22
arXiv — Machine Learning research 5d ago

The Gentle Collapse: Distributional Metrics for Continual Learning

arXiv:2606.25165v1 Announce Type: new Abstract: Accuracy degradation is the standard metric for Catastrophic Forgetting (CF), however, it records only whether forgetting occurred or not. It saturates at the extremes and collapses discretely at task boundaries, hiding the…

35
arXiv — NLP / Computation & Language research 5d ago

How Robust is OCR-Reasoning? Evaluating OCR-Reasoning Robustness of Vision-Language Models under Visual Perturbations

arXiv:2606.26041v1 Announce Type: cross Abstract: Vision-language models (VLMs) have achieved strong performance on OCR-based benchmarks and increasingly focused on text-rich understanding, but their robustness under controlled visual degradation remains insufficiently…

29
arXiv — NLP / Computation & Language research 6d ago

Pigeonholing: Bad prompts hurt models to collapse and make mistakes

arXiv:2606.24267v1 Announce Type: new Abstract: While in-context learning is generally shown to be effective in Large Language Models (LLMs), bad contexts can cause performance degradation and mode collapse, a phenomenon we call "pigeonholing." **Unintentionally bad** contexts…

26
arXiv — Machine Learning research 12d ago

Quantum Annealing Enhanced Reinforcement Learning for Accurate Remaining Useful Lifetime Prediction

arXiv:2606.18503v1 Announce Type: new Abstract: Remaining useful life (RUL) estimation is central to predictive maintenance, where an unplanned failure can cost far more than the asset itself. Statistical degradation models miss the strong nonlinearity of real systems, and…

38
arXiv — NLP / Computation & Language research 13d ago

Scaling Enterprise Agent Routing: Degradation, Diagnosis, and Recovery

arXiv:2606.17519v1 Announce Type: new Abstract: Production LLM assistants route user requests to growing libraries of specialized tools, but how does routing accuracy degrade as the catalog scales? We study single-step routing on a 110-agent, 584-tool catalog from a deployed…

14
arXiv — NLP / Computation & Language research 13d ago

The Slop Paradox: How Synthetic Standardization Erodes Clinical Uncertainty and Cross-Modal Alignment in AI-Rewritten Radiology Reports

arXiv:2606.17791v1 Announce Type: new Abstract: AI-assisted clinical documentation tools increasingly summarize, standardize, and reformat radiology reports using large language models (LLMs). We present a controlled measurement of the resulting information degradation. Using…

24
arXiv — NLP / Computation & Language research 15d ago

LLMs Contain Multitudes: How Deployment Context Reshapes Model-Level Preferences and Values

arXiv:2606.13944v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly characterised in recent evaluation work as having stable, model-level preference and value systems. However, accompanying robustness checks are limited to incidental prompt…

33
Hacker News — AI on Front Page community 16d ago

Arch Linux Now Believes Malware Incident Under Control: More Than 1,500 Packages

Article URL: https://www.phoronix.com/news/Arch-Linux-AUR-More-Than-1500 Comments URL: https://news.ycombinator.com/item?id=48516379 Points: 238 # Comments: 146

32
arXiv — NLP / Computation & Language research 18d ago

Multi-Turn Reasoning When Context Arrives in Pieces: Scalable Sharding and Memory-Augmented RL

arXiv:2606.12941v1 Announce Type: new Abstract: When a user reveals task-critical information across several conversation turns, LLM accuracy drops by up to 65% despite full context availability. We show that this Lost in Conversation degradation can be substantially mitigated…

31
Smol AI News news-outlet 19d ago

not much happened today

**Anthropic** reversed its covert degradation policy on **Claude Fable 5** after public backlash, sparking debates on governance, transparency, and access to frontier AI models. The model shows strong capabilities with mixed benchmark results, including **87.8% on WeirdML** and…

19
arXiv — NLP / Computation & Language research 19d ago

Lius: Translation Model Based Instructional Lingustic Using Continual Instruction Tuning In Kupang Malay

arXiv:2606.11786v1 Announce Type: new Abstract: Large Language Models (LLMs) offer new potential for translation tasks but often experience performance degradation when handling low-resource languages. To address this limitation, we propose an approach for fine-tuning LLMs on a…

37
r/MachineLearning community 19d ago

[R] AI Agent Security: The Complete Guide to Threats, Defenses, and the Future of Autonomous AI Safety [R]

This is a comprehensive living reference guide to AI agent security — synthesizing 18 articles from The Agent Report covering the 75-day period (April–June 2026) when agent security went from theoretical concern to operational crisis.  What's inside:  • Incident…

4
arXiv — NLP / Computation & Language research 20d ago

Small Data, Big Noise: Adversarial Training for Robust Parameter-Efficient Fine-Tuning

arXiv:2606.10610v1 Announce Type: new Abstract: Parameter-Efficient Fine-Tuning (PEFT) has become essential for adapting foundation models to downstream NLP tasks. However, current PEFT methods often struggle with robustness to noise and performance degradation on limited…

25
arXiv — NLP / Computation & Language research 20d ago

SpenseGPT: Practical One-shot Pruning Enabling Sparse and Dense GEMMs for LLM Inference

arXiv:2606.10445v1 Announce Type: cross Abstract: Semi-structured 2:4 sparsity is widely supported by modern accelerators, providing up to a 2x theoretical speedup. However, its strict 50% sparsity constraint often causes non-negligible accuracy degradation under post-training…

21
arXiv — Machine Learning research 21d ago

STARIXNet: Multivariate and Multi-attribute Deep Learning Approach to Real-Time Resource Allocation in Cloud Platforms

arXiv:2606.07565v1 Announce Type: new Abstract: Intelligent scaling of microservices in cloud platforms is crucial for mitigating escalating compute costs while avoiding service disruptions. Current solutions are limited to the univariate space, typically focusing on CPU usage…

24
arXiv — Machine Learning research 21d ago

Outage Detection in Self-Healing Smart Grids Using Reinforcement Learning with Spectral Graph Neural Networks

arXiv:2606.07583v1 Announce Type: new Abstract: Self-healing smart grids can quickly adjust their network configuration during outages to minimize power disruptions. During an outage, several actions can be taken, such as network reconfiguration through switching operations and…

25
arXiv — NLP / Computation & Language research 22d ago

Signal-Driven Observation for Long-Horizon Web Agents

arXiv:2606.06708v1 Announce Type: new Abstract: Web agents operating over long horizons ingest raw DOM and accessibility trees -- routinely tens of thousands of tokens -- at every action step, causing progressive context degradation that erodes reasoning well before tasks…

7
TechCrunch — AI news-outlet 22d ago

Notion restores access to Anthropic after service disruption

Notion's head of product said he was "astonished" at "the amount of people RT-ing this."

5
arXiv — NLP / Computation & Language research 25d ago

Epidemiology of Model Collapse: Modeling Synthetic Data Contamination via Bilayer SIR Dynamics

arXiv:2606.05168v1 Announce Type: new Abstract: Training on synthetic data causes model collapse, but existing analyses treat this as single-chain degradation. In reality, the AI ecosystem involves cross-contamination: models ingest synthetic data from other models, produce new…

19
Hugging Face Daily Papers research 25d ago

Token Budgets: An Empirical Catalog of 63 LLM-Agent Budget-Overrun Incidents, with an Affine-Typed Rust Mitigation as a Case Study

Abstract LLM-agent budget overruns are a documented production failure class: a single retry loop can spend thousands of dollars before an operator notices, and the in-process integrity properties that would prevent it (no aliasing, no double-spend, no use-after-delegation of a…

18
arXiv — Machine Learning research 26d ago

Recover-LoRA for Aggressive Quantization: Reclaiming Accuracy in 2-Bit Language Models via Low-Rank Adaptation with Knowledge Distillation on Synthetic Data

arXiv:2606.04238v1 Announce Type: new Abstract: Aggressive weight quantization to 2-bit precision offers substantial throughput and memory gains for large language model (LLM) inference, but typically incurs severe accuracy degradation. These gains are particularly relevant for…

22
arXiv — Machine Learning research 26d ago

Revisiting Privacy Amplification by Subsampling in Selective Release DPSGD

arXiv:2606.04384v1 Announce Type: new Abstract: Machine learning's reliance on sensitive data necessitates privacy-preserving techniques like Differentially Private Stochastic Gradient Descent (DPSGD). However, DPSGD suffers from substantial utility degradation and slow…

28
Hugging Face Daily Papers research 27d ago

A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL

Abstract Multi-domain reinforcement learning in language models causes performance degradation through shared computational pathways, but targeted refresh and rollback techniques can selectively recover lost capabilities with minimal side effects. Generated by…

34
arXiv — Machine Learning research 27d ago

Learning to See via Epiretinal Implant Stimulation in silico with Model-Based Deep Reinforcement Learning

arXiv:2606.03118v1 Announce Type: new Abstract: Objective: Diseases such as age-related macular degeneration and retinitis pigmentosa cause the degradation of the photoreceptor layer. One approach to restore vision is to electrically stimulate the surviving retinal ganglion…

38
arXiv — Machine Learning research 28d ago

RAFT: Data Refinement and Adaptive Distillation for Domain Fine-Tuning with Alleviated Forgetting

arXiv:2606.00147v1 Announce Type: new Abstract: Domain-specific supervised fine-tuning (SFT) often improves in-domain performance at the cost of degrading a model's general capabilities. We view this degradation through two practical gaps in domain SFT: a…

10
TechCrunch — AI news-outlet 28d ago

Florida sues OpenAI, Sam Altman, in first-of-its-kind lawsuit over violent incidents

The lawsuit partially revolves around a shooting at Florida State University last year, and ChatGPT's alleged role in the incident.

18
arXiv — Machine Learning research 29d ago

idSCD: Identifying Training Datasets through Semantic Correlation Descriptors

arXiv:2605.30462v1 Announce Type: new Abstract: Can a dataset be recognized from the spurious correlations it induces during training? We argue that datasets leave dataset-specific traces in a model's learned semantic correlation structure: incidental regularities that are…

25
arXiv — Machine Learning research 29d ago

Benchmarking Machine Learning Uncertainty Quantification Methodologies for Predicting Turbine Gas Temperature Degradation

arXiv:2605.30585v1 Announce Type: new Abstract: Effective prognostics and health management of modern engines relies on accurate turbine gas temperature predictions and robust uncertainty quantification to ensure reliability and safety. This paper investigates five major…

38
r/LocalLLaMA community 29d ago

Qwen3.6-35B vs Gemma4-26B on 7900 XTX

Ran a fair comparison between Qwen3.6-35B-A3B and Gemma4-26B-A4B on my Radeon 7900 XTX. Both reasoning-enabled at matching 32K budgets, no output caps, six generic real-world prompts (meeting notes, incident postmortem, log triage to JSON, code review, a build-vs-buy decision, a…

9
arXiv — Machine Learning research 1mo ago

SCOPE: A Lightweight-training LLM Framework for Air Traffic Control Readback Monitoring

arXiv:2605.29543v1 Announce Type: new Abstract: Pilot readback of Air Traffic Control (ATC) voice instructions is a primary safeguard against miscommunication in air transportation. However, readback anomalies remain implicated in approximately 80% of aviation incidents. This…

10
The Information — AI news-outlet 1mo ago

Blue Origin New Glenn Rocket Explodes During Test

Jeff Bezos’ space company Blue Origin suffered a serious setback Thursday evening when its New Glenn rocket exploded on a launch pad in Florida during a test. Video clips of the incident show a giant fireball engulfing the rocket and surrounding structures. No one was on board…

18
Hacker News — AI on Front Page community 1mo ago

Incident with Pull Requests, Issues, Git Operations and API Requests

Article URL: https://www.githubstatus.com/incidents/xy1tt3hs572m Comments URL: https://news.ycombinator.com/item?id=48293080 Points: 211 # Comments: 164

20
arXiv — NLP / Computation & Language research 1mo ago

MAIGO: Mitigating Lost-in-Conversation with History-Cleaned On-Policy Self-Distillation

arXiv:2605.27186v1 Announce Type: new Abstract: Large language models often solve tasks from a fully specified prompt but degrade when the same requirements unfold over multiple turns, known as the lost-in-conversation (LiC) gap. We trace part of this degradation to…

33
Vercel — AI dev-tools 1mo ago

Building a real-time power outage map with Next.js on Vercel

Endeavour Energy on Vercel: sub-1s page loads during peak storm traffic; five-minute data sync cycle across all upstream systems; 38% faster deployments compared to their previous platform. When a summer storm rolls through New South Wales in Australia, hundreds of thousands of…

7
arXiv — Machine Learning research 1mo ago

Coupling-Robust Accuracy in Multiphysics Physics Informed Neural Networks via Kronecker-Preconditioned Optimization

arXiv:2605.23391v1 Announce Type: new Abstract: Physics-informed neural networks (PINNs) for coupled multiphysics systems suffer systematic accuracy degradation as inter-equation coupling strengthens. We provide a theoretical explanation for this phenomenon through neural…

5
arXiv — NLP / Computation & Language research 1mo ago

Model Collapse as Cultural Evolution

arXiv:2605.23054v1 Announce Type: new Abstract: Model collapse, the progressive degradation of LLMs trained on their own outputs, has been characterized statistically but lacks a linguistic explanation for which structures degrade, in what order, and why. We show that iterated…

28
r/LocalLLaMA community 1mo ago

Gemma4 26b a4b Apex quant is quite good

I tried mudler's apex quant for gemma4 26b a4b and it was amazing! I got 38tps at 90.000 context with no loop and surprisingly no quality degradation. I used mudler/gemma-4-26B-A4B-it-APEX-GGUF / APEX-I-Compact (15gb) on my RX 9060 XT 16 GB with llama.cpp Vulkan. For comparison,…

19
Hugging Face Daily Papers research 1mo ago

SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation

Abstract SpaceDG dataset and benchmark evaluate multimodal language models' spatial reasoning robustness under visual degradations, revealing significant performance gaps and demonstrating improved robustness through targeted training. AI-generated summary Multimodal Large…

26
arXiv — Machine Learning research 1mo ago

On-Policy Consistency Training Improves LLM Safety with Minimal Capability Degradation

arXiv:2605.21834v1 Announce Type: new Abstract: Aligned models can misbehave in several ways: they are often sycophantic, fall victim to jailbreaks, or fail to include appropriate safety warnings. Consistency training is a promising new alignment paradigm to mitigate such…

31
arXiv — NLP / Computation & Language research 1mo ago

SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation

arXiv:2605.22536v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) have made rapid progress in spatial intelligence, yet existing spatial reasoning benchmarks largely assume pristine visual inputs and overlook the degradations that commonly occur in…

10
The Information — AI news-outlet 1mo ago

Cursor Sees Opening as GitHub Flounders

Microsoft's GitHub unit has been on the defensive lately. Amid a series of outages and other snags, Jay Parikh, who oversees the software-project management platform, recently warned deputies that coding tools from Cursor and Anthropic could eventually make GitHub obsolete, my…

12
arXiv — Machine Learning research 1mo ago

Decomposing MXFP4 quantization error for LLM reinforcement learning: reducible bias, recoverable deadzone, and an irreducible floor

arXiv:2605.20402v1 Announce Type: new Abstract: MXFP4 arithmetic can dramatically accelerate reinforcement learning (RL) post-training of large language models (LLMs), yet the quantization error introduces severe accuracy degradation. Existing work treats the quantization error…

9
arXiv — Machine Learning research 1mo ago

Graph-Driven Cross-Industry Real-Time Monitoring Framework for Anti-Money Laundering Detection in Converged Mobility-Energy Supply Chain Networks

arXiv:2605.18844v1 Announce Type: new Abstract: With the deep integration of the travel and energy industries, cross-industry supply chain finance has gradually become a high-risk field of hidden money laundering incidents. For this reason, this work proposes a graph-driven…

22
