Is Cloudflare AI Down?
Check live Cloudflare AI (Workers AI) status — GPU inference on the Cloudflare edge, Llama/Mistral/SDXL/Whisper models, and AI Gateway. See recent incidents and set up free email alerts.
Cloudflare AI — live status
Updated every 5 minutes. Full incident history at prismix.dev/service/cloudflare-ai.
Quick check: is Cloudflare AI down right now?
- Prismix: prismix.dev/service/cloudflare-ai — live status + 30-day uptime + incidents.
- Cloudflare status: www.cloudflarestatus.com — Cloudflare's official status page covering Workers, AI, R2, and the broader network.
- API call:
curl https://prismix.dev/api/v1/statuses | jq '.services[] | select(.id=="cloudflare-ai")'
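The same check can be scripted instead of run by hand. The sketch below is a minimal TypeScript version of the curl and jq pipeline above. The response shape (a top-level `services` array whose entries carry an `id`) is an assumption inferred only from the jq filter, and the names `findService` and `fetchCloudflareAiStatus` are hypothetical helpers, not part of any published client.

```typescript
// Assumed response shape, inferred only from the jq filter above:
// { "services": [ { "id": "cloudflare-ai", ... }, ... ] }
interface ServiceEntry {
  id: string;
  [field: string]: unknown; // other fields are not documented on this page
}

interface StatusesResponse {
  services: ServiceEntry[];
}

// Pure helper: pick one service out of an already-parsed response.
// Mirrors `.services[] | select(.id=="...")` for a single match.
function findService(body: StatusesResponse, id: string): ServiceEntry | undefined {
  return body.services.find((service) => service.id === id);
}

// Fetch the statuses endpoint and return the Cloudflare AI entry, if present.
async function fetchCloudflareAiStatus(
  url: string = "https://prismix.dev/api/v1/statuses",
): Promise<ServiceEntry | undefined> {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Status API returned HTTP ${response.status}`);
  }
  const body = (await response.json()) as StatusesResponse;
  return findService(body, "cloudflare-ai");
}
```

Calling `fetchCloudflareAiStatus()` from a script or a scheduled job returns the matching entry, or `undefined` if the service ID is absent. Keeping the lookup in a separate pure function makes it easy to test without touching the network.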
Set up free email alerts for Cloudflare AI
1. Sign in: go to prismix.dev/sign-in and use email OTP or GitHub sign-in.
2. Star Cloudflare AI: on prismix.dev/service/cloudflare-ai, click the ☆ star icon.
3. Alerts are live: you'll get an email within minutes of any status change.
Common causes of "Cloudflare AI not working"
If Prismix shows Cloudflare AI as "Operational" but you're having issues:
- Model not found 400 error — Workers AI model names are version-specific. Using a deprecated or misspelled model ID (e.g. @cf/meta/llama-2-7b vs. the current @cf/meta/llama-3.1-8b-instruct) returns a 400. Check the Cloudflare AI model catalog for current identifiers before deploying.
- Rate limit 429 on the free tier — The Workers AI free tier has daily neuron limits per account. After exhausting the quota, every inference call returns HTTP 429 for the remainder of the day. Upgrade to Workers Paid or wait for the daily quota reset at midnight UTC.
- AI Gateway latency spike — Cloudflare AI Gateway is a reverse proxy that adds observability and caching. During high-traffic periods or when the Gateway region is under load, it can add hundreds of milliseconds of latency or time out before the inference even starts. Temporarily bypass AI Gateway by calling the Workers AI endpoint directly to confirm whether the Gateway or the model inference is the bottleneck.
- Wrangler binding not found locally — When developing with Wrangler locally (wrangler dev), the env.AI binding must be declared in wrangler.toml as an [ai] table containing binding = "AI". Missing this entry causes a runtime TypeError ("Cannot read properties of undefined") that looks like a platform outage but is actually a configuration error.
- GPU quota exceeded for the account — Enterprise and high-volume accounts can exhaust allocated GPU capacity during traffic spikes. Requests queue and eventually time out with a 503. Contact Cloudflare support to increase GPU quota, or add retry logic with exponential backoff in your Worker code.
- Response streaming broken in Workers preview — Workers AI streaming (SSE/ReadableStream) works differently in the local Wrangler preview versus the live Cloudflare edge. Some streaming-related bugs only appear in production. Use wrangler dev --remote to test against the actual Cloudflare network when debugging streaming issues.
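The retry logic with exponential backoff suggested for 503 responses can be sketched as follows. This is a minimal illustration, not Cloudflare's recommended implementation: `withBackoff`, `backoffDelayMs`, and `isRetryable` are hypothetical names, and the 250 ms base delay, 8 s cap, and four-attempt limit are arbitrary assumptions. The `call` argument is any function that performs one inference request and resolves to an object with a numeric `status`, such as a `fetch` Response.

```typescript
// Statuses the list above describes as quota or capacity related.
function isRetryable(status: number): boolean {
  return status === 429 || status === 503;
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
// The default values are illustrative assumptions, not documented limits.
function backoffDelayMs(attempt: number, baseMs: number = 250, maxMs: number = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Run `call` up to `maxAttempts` times, waiting longer between each retry.
// Returns the last response, whether or not it succeeded.
async function withBackoff<T extends { status: number }>(
  call: () => Promise<T>,
  maxAttempts: number = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let response = await call();
  for (let attempt = 0; attempt < maxAttempts - 1; attempt++) {
    if (!isRetryable(response.status)) break; // success or a non-transient error such as 400
    await sleep(backoffDelayMs(attempt));
    response = await call();
  }
  return response;
}
```

A 400 from a bad model ID is returned immediately, since retrying cannot fix it. Retrying a 429 only helps with short-lived throttling: once the free-tier daily quota is exhausted, the calls keep failing until the reset, so a low `maxAttempts` avoids wasting requests. The `sleep` parameter is injectable so the function can be tested without real delays.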
Stop manually checking — get alerts instead
Star Cloudflare AI on Prismix and get emailed the moment status changes. Free, no credit card.
Monitor related edge inference & GPU tools?
Full status dashboard: prismix.dev/status