Is Anyscale Down?

Check live Anyscale status — managed Ray clusters, LLM serving (Llama, Mistral), and distributed ML training. See recent incidents and set up free email alerts.

Anyscale live status

Updated every 5 minutes. Full incident history at prismix.dev/service/anyscale.

Quick check: is Anyscale down right now?

  1. Prismix: prismix.dev/service/anyscale — live status + 30-day uptime + incidents.
  2. Anyscale official status: status.anyscale.com — Anyscale's own status page with per-region breakdown.
  3. API call: curl https://prismix.dev/api/v1/statuses | jq '.services[] | select(.id=="anyscale")'
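
The curl one-liner in step 3 can also be reproduced in Python for use in scripts. This is a minimal sketch using only the standard library. It assumes the response is a JSON object with a top-level "services" array whose entries carry an "id" field, which is all the jq filter above implies; any other fields are unknown here, and the helper names find_service and fetch_service are illustrative.

```python
import json
import urllib.request

# URL taken from the curl command above.
STATUSES_URL = "https://prismix.dev/api/v1/statuses"


def find_service(payload: dict, service_id: str):
    """Return the first entry of payload["services"] whose "id" matches, else None.

    Mirrors the jq filter: .services[] | select(.id == service_id)
    """
    for service in payload.get("services", []):
        if service.get("id") == service_id:
            return service
    return None


def fetch_service(service_id: str, url: str = STATUSES_URL, timeout: float = 10.0):
    """Download the status document and pick out one service entry."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return find_service(payload, service_id)
```

Calling fetch_service("anyscale") performs the HTTP request and returns the matching entry, or None if the service is not listed.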

Set up free email alerts for Anyscale

  1. Sign in: go to prismix.dev/sign-in and use email OTP or GitHub sign-in.
  2. Star Anyscale: on prismix.dev/service/anyscale, click the ☆ star icon.
  3. Alerts are live: you'll get an email within minutes of any status change.

Common causes of "Anyscale not working"

If Prismix shows Anyscale as "Operational" but your clusters or endpoints are failing:

  • Ray cluster not scaling (autoscaler stuck) — the Anyscale autoscaler can stall when the cloud provider (AWS/GCP) has insufficient capacity for the requested instance type. Check the cluster event log in the Anyscale console for InsufficientInstanceCapacity errors and switch to a different instance family or region with available GPU stock.
  • LLM endpoint cold start over 60 seconds — Anyscale LLM endpoints on serverless tiers may take 60–90 seconds to start from a cold state when no replicas are warm. Enable a minimum replica count of 1 in your endpoint configuration to keep at least one replica always warm, avoiding cold starts at the cost of idle compute.
  • Out-of-memory on GPU instance (CUDA OOM): large models like Llama-70B need roughly 140 GB for the weights alone at 16-bit precision, so they require multi-GPU tensor parallelism. If you deploy to a single A10G (24 GB VRAM), the model will run out of memory during loading. Verify the minimum recommended GPU configuration for your model size and set num_gpus accordingly in your serve config.
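  • Ray object store eviction during long training runs: Ray's in-memory object store reclaims space under memory pressure. Older releases evicted objects in LRU order; current releases spill objects to disk and free those with no remaining references. During distributed training, large checkpoints or datasets kept in the object store can be spilled or lost mid-run if no reference to them is held. Increase the object_store_memory setting, or pin critical objects explicitly by calling ray.put(obj) and holding the returned ObjectRef for as long as the data is needed.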
  • API endpoint returning 503 during scheduled maintenance — Anyscale performs rolling maintenance windows for the control plane. During these windows, new cluster launches and endpoint deployments may fail with 503. Check status.anyscale.com for scheduled maintenance notices; existing running clusters are unaffected during control-plane maintenance.
  • Region-specific GPU availability exhausted — certain GPU types (A100, H100) are in high demand in us-west-2 and eu-west-1. If your cluster config hard-codes a single region, you may be unable to scale when GPUs are sold out. Add fallback regions to your cluster config or use Anyscale's multi-cloud compute config to allow cross-region autoscaling.
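
The CUDA OOM case above comes down to simple arithmetic, which can be sketched as follows. The estimate assumes 2 bytes per parameter (16-bit weights) and counts weights only, ignoring KV cache, activations, and framework overhead, so real requirements are higher. The function names are illustrative, and GB here means 10^9 bytes.

```python
import math


def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_gpus_for_weights(num_params: float, vram_gb: float,
                         bytes_per_param: int = 2) -> int:
    """Smallest GPU count whose combined VRAM can hold the weights.

    This is a lower bound: it leaves no headroom for KV cache or activations.
    """
    return math.ceil(weight_memory_gb(num_params, bytes_per_param) / vram_gb)


# Example: a 70-billion-parameter model on 24 GB GPUs (A10G-sized VRAM).
llama_70b_params = 70e9
needed_gb = weight_memory_gb(llama_70b_params)
gpus = min_gpus_for_weights(llama_70b_params, vram_gb=24)
print(f"weights: {needed_gb:.0f} GB, at least {gpus} x 24 GB GPUs")
```

Treat the result as a floor rather than a recommendation. Many serving stacks also require the tensor-parallel GPU count to divide the model's attention heads evenly, so the practical count may be larger.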

Stop manually checking — get alerts instead

Star Anyscale on Prismix and get emailed the moment status changes. Free, no credit card.

Monitor related ML infrastructure tools?

Full status dashboard: prismix.dev/status