Why is the Anthropic Claude API not working?

Common causes: (1) 401 — invalid or missing API key (get at console.anthropic.com); (2) 429 — rate limit exceeded; Anthropic enforces requests-per-minute, tokens-per-minute, AND concurrent request limits; (3) 529 — API overloaded, Anthropic infrastructure issue, retry with backoff; (4) model name wrong (use claude-3-5-sonnet-20241022 not claude-3.5-sonnet); (5) streaming delta.text accessed before streaming completes.

Anthropic API 401 — how to fix?

(1) API keys start with sk-ant-api03- — verify format; (2) get or rotate keys at console.anthropic.com/settings/keys; (3) Claude.ai subscription does NOT include API access — you need a separate API account with billing; (4) ensure ANTHROPIC_API_KEY env var is set and not expired; (5) if using workspace keys, ensure the workspace has API credits.

Anthropic API 429 rate limit?

Anthropic has three rate limit types: (a) requests per minute (RPM); (b) tokens per minute (TPM, input + output combined); (c) concurrent requests (usually 5-50 depending on tier). Check the response headers: anthropic-ratelimit-requests-limit, anthropic-ratelimit-requests-remaining, and anthropic-ratelimit-requests-reset, plus the matching anthropic-ratelimit-tokens-* headers for the token limit. A 429 response also carries a retry-after header with the number of seconds to wait. To get higher limits, add billing and buy more credits: Anthropic tiers by cumulative credit purchases (Tier 1: $5, Tier 4: $400).
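To see which limit was hit, it helps to pull the rate limit fields out of the response headers in one place. The sketch below works on a plain mapping of headers, using Anthropic's anthropic-ratelimit-* header names plus retry-after; the function name summarize_rate_limit and the sample values are made up for illustration.

```python
from typing import Any, Dict, Mapping, Optional


def _to_int(value: Optional[str]) -> Optional[int]:
    """Parse an integer header value, returning None if missing or malformed."""
    try:
        return int(value)  # type: ignore[arg-type]
    except (TypeError, ValueError):
        return None


def _to_float(value: Optional[str]) -> Optional[float]:
    """Parse a numeric header value, returning None if missing or malformed."""
    try:
        return float(value)  # type: ignore[arg-type]
    except (TypeError, ValueError):
        return None


def summarize_rate_limit(headers: Mapping[str, str]) -> Dict[str, Any]:
    """Collect the request and token rate limit fields from response headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "requests_limit": _to_int(lowered.get("anthropic-ratelimit-requests-limit")),
        "requests_remaining": _to_int(lowered.get("anthropic-ratelimit-requests-remaining")),
        # The reset value is a timestamp string; it is passed through unparsed here.
        "requests_reset": lowered.get("anthropic-ratelimit-requests-reset"),
        "tokens_remaining": _to_int(lowered.get("anthropic-ratelimit-tokens-remaining")),
        "retry_after_seconds": _to_float(lowered.get("retry-after")),
    }


# Hypothetical header values, for illustration only.
sample_headers = {
    "Anthropic-RateLimit-Requests-Limit": "50",
    "anthropic-ratelimit-requests-remaining": "0",
    "anthropic-ratelimit-requests-reset": "2025-01-01T00:00:30Z",
    "anthropic-ratelimit-tokens-remaining": "12000",
    "Retry-After": "7",
}
print(summarize_rate_limit(sample_headers))
```

With the Python SDK, the headers should be reachable through client.messages.with_raw_response.create(...).headers on success, or e.response.headers on a caught anthropic.RateLimitError; treat that wiring as an assumption to verify against your SDK version.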

Anthropic API 529 overloaded?

Status 529 means Anthropic's servers are overloaded: it's a temporary server-side issue, not a bug in your code. Always retry 529 with exponential backoff. Also check prismix.dev/service/anthropic for active incidents. If 529s persist for hours, try falling back to a smaller model such as claude-3-haiku, which may have spare capacity, or run the workload at off-peak hours.

Claude API context window exceeded?

Each Claude model has a maximum context window: claude-3-5-sonnet, claude-3-opus, and claude-3-haiku each accept 200K tokens. When the prompt exceeds it, the API rejects the request with an error such as "prompt is too long: X tokens > 200000 maximum". Solutions: summarize conversation history before sending, count tokens before sending with the token counting endpoint (messages.count_tokens in the Python SDK, messages.countTokens in TypeScript), or chunk long documents.
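One simple way to stay under the limit is to drop the oldest turns until the estimated size fits. The sketch below is a minimal version of that idea; trim_history and rough_token_count are hypothetical helpers, and the 4-characters-per-token estimate is a crude assumption, not Claude's tokenizer. In real use you would pass a count_tokens function backed by the token counting endpoint.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def rough_token_count(messages: List[Message]) -> int:
    """Crude estimate: assume about 4 characters per token (an assumption)."""
    return sum(len(m["content"]) for m in messages) // 4 + 1


def trim_history(
    messages: List[Message],
    max_tokens: int,
    count_tokens: Callable[[List[Message]], int] = rough_token_count,
) -> List[Message]:
    """Drop the oldest messages until the estimated prompt size fits max_tokens."""
    trimmed = list(messages)  # work on a copy; the caller's list is untouched
    # Always keep at least the latest message so the request still has a prompt.
    while len(trimmed) > 1 and count_tokens(trimmed) > max_tokens:
        trimmed.pop(0)
    # Keep the trimmed history starting on a user turn.
    while len(trimmed) > 1 and trimmed[0]["role"] != "user":
        trimmed.pop(0)
    return trimmed


history = [
    {"role": "user", "content": "a" * 400},
    {"role": "assistant", "content": "b" * 400},
    {"role": "user", "content": "c" * 40},
]
print(len(trim_history(history, max_tokens=50)))
```

Dropping turns loses information, so for long-running conversations a summary of the removed turns is usually a better first message than nothing at all.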

Anthropic Claude API Not Working? Fix Auth, Rate Limits & SDK Errors

Troubleshoot Anthropic Claude API errors — 401 invalid API key, 429 rate limits (RPM, TPM, concurrent), 529 overloaded, context window exceeded, and streaming issues when calling the API with the Python or TypeScript SDK.

Common errors and fixes

401 Unauthorized / invalid API key

The most common cause is a missing, expired, or incorrectly formatted API key. Use the official SDK with an environment variable:

import anthropic
import os

client = anthropic.Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY")  # sk-ant-api03-...
)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello"}
    ]
)
print(message.content[0].text)

// TypeScript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const message = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello' }],
});

Key format: all Anthropic API keys start with sk-ant-api03- — verify this prefix exactly.
Get or rotate keys: go to console.anthropic.com/settings/keys — deleted or expired keys show as inactive.
Claude.ai ≠ API: a Claude.ai subscription does NOT include API access — you need a separate API account at console.anthropic.com with billing enabled.
Workspace keys: if using workspace-scoped keys, ensure the workspace has API credits and the key has not been restricted.
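Several of the checks above can be run locally before making any request. The following sketch inspects the ANTHROPIC_API_KEY environment variable for the most common formatting problems; diagnose_api_key is a hypothetical helper, and the prefix is the one stated in this guide. It cannot tell whether a key is revoked or out of credits; only the API can.

```python
import os
from typing import List, Mapping

EXPECTED_PREFIX = "sk-ant-api03-"  # prefix as stated in this guide


def diagnose_api_key(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return a list of likely problems with ANTHROPIC_API_KEY (empty if none found)."""
    raw = env.get("ANTHROPIC_API_KEY")
    if raw is None:
        return ["ANTHROPIC_API_KEY is not set"]

    problems: List[str] = []
    stripped = raw.strip()
    if stripped != raw:
        problems.append("key has leading or trailing whitespace")

    # Quotes often sneak in from .env files or shell exports.
    unquoted = stripped.strip("\"'")
    if unquoted != stripped:
        problems.append("key is wrapped in quotes")

    if not unquoted.startswith(EXPECTED_PREFIX):
        problems.append(f"key does not start with {EXPECTED_PREFIX}")
    return problems


if __name__ == "__main__":
    issues = diagnose_api_key()
    print("No obvious problems" if not issues else "\n".join(issues))
```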

429 Rate limit — RPM, TPM, concurrent

Anthropic enforces three separate rate limit types simultaneously: requests per minute (RPM), tokens per minute (TPM, input + output combined), and concurrent requests. The Anthropic SDK has built-in retry logic you can configure:

# The Anthropic SDK has built-in retry for 429 and 529
import os

import anthropic

client = anthropic.Anthropic(
    api_key=os.environ["ANTHROPIC_API_KEY"],
    max_retries=5,  # default is 2
)

# Or manual retry with backoff
import time
import random

def create_with_retry(client, **kwargs):
    for attempt in range(5):
        try:
            return client.messages.create(**kwargs)
        except anthropic.RateLimitError as e:
            if attempt == 4:
                raise
            wait = (2 ** attempt) + random.random()
            print(f"Rate limited. Retry {attempt + 1}/5 in {wait:.1f}s")
            time.sleep(wait)

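Check rate limit headers: inspect anthropic-ratelimit-requests-remaining and anthropic-ratelimit-tokens-remaining (each has matching -limit and -reset headers) to see which limit you hit. A 429 response also includes a retry-after header with the number of seconds to wait.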
Tier progression: Anthropic tiers by cumulative credit purchases. Tier 1 unlocks after a $5 purchase; Tier 4 after $400. Check your tier at console.anthropic.com/settings/limits.
Concurrent limit: easy to hit in parallel code — use asyncio.Semaphore or a queue to cap simultaneous in-flight requests.
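The asyncio.Semaphore approach mentioned above can be sketched as follows. run_capped and fake_call are hypothetical names: fake_call stands in for a real request (for example, a call on anthropic.AsyncAnthropic) so the example runs without network access, and it records the peak number of in-flight jobs to show the cap holding.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple, TypeVar

T = TypeVar("T")


async def run_capped(
    jobs: List[Callable[[], Awaitable[T]]], max_in_flight: int
) -> List[T]:
    """Run async jobs with at most max_in_flight active at once; results keep input order."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(job: Callable[[], Awaitable[T]]) -> T:
        async with semaphore:  # waits here while max_in_flight jobs are already running
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


async def demo() -> Tuple[List[int], int]:
    in_flight = 0
    peak = 0

    async def fake_call(i: int) -> int:
        # Stand-in for an API request; tracks how many calls overlap.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return i * 2

    jobs = [lambda i=i: fake_call(i) for i in range(10)]
    results = await run_capped(jobs, max_in_flight=3)
    return results, peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Passing zero-argument callables rather than already-created coroutines means no request starts until it has acquired the semaphore.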

529 Overloaded — Anthropic server issue

Status 529 means Anthropic's servers are overloaded — it is a temporary server-side issue, not your code. Always retry 529 with exponential backoff:

# 529 should always be retried: it's a temporary server issue
import random
import time

from anthropic import APIStatusError

def robust_create(client, **kwargs):
    for attempt in range(6):
        try:
            return client.messages.create(**kwargs)
        except APIStatusError as e:
            if e.status_code in (429, 529) and attempt < 5:
                wait = min(60, (2 ** attempt) + random.random())
                time.sleep(wait)
            else:
                raise

Note: Check prismix.dev/service/anthropic for active incidents. During outages, 529s may last minutes to hours, so implement a circuit breaker pattern for production workloads. Falling back to a smaller model such as claude-3-haiku may reduce 529 frequency during partial outages, since it can have spare capacity.
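A circuit breaker stops sending requests for a cooldown period after repeated failures, then lets a single trial call through. The sketch below is one minimal way to write it; CircuitBreaker, CircuitOpenError, and the default thresholds are illustrative choices, not part of the Anthropic SDK.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the breaker is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock  # injectable so the breaker can be tested without sleeping
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(
        self,
        fn: Callable[[], T],
        is_overload: Callable[[Exception], bool] = lambda exc: True,
    ) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_seconds:
                raise CircuitOpenError("circuit open; skipping call")
            # Cooldown elapsed: fall through and allow one trial call.
        try:
            result = fn()
        except Exception as exc:
            if is_overload(exc):
                self.failures += 1
                if self.failures >= self.failure_threshold:
                    self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Any success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

To count only overload errors, pass something like is_overload=lambda e: getattr(e, "status_code", None) == 529 and wrap the request as breaker.call(lambda: client.messages.create(...)).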

Wrong model name

Claude model IDs are exact strings: any typo or wrong format returns a 404 not_found_error. Current valid model IDs (as of June 2026):

claude-opus-4-8 — most capable
claude-sonnet-4-6 — balanced performance
claude-3-5-sonnet-20241022 — previous gen, stable
claude-3-5-haiku-20241022 — fastest, cheapest
claude-3-opus-20240229 — previous Opus

Common wrong names that fail: claude-3.5-sonnet (dot instead of dash), claude-opus-4 (missing version suffix), claude-3-5-sonnet (missing date suffix for older models).
Always check: console.anthropic.com/docs/models for the current canonical model IDs.
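The wrong names above share a pattern: a dot where a dash belongs, or a missing suffix. A small lookup can catch both before a request is sent. In this sketch, suggest_model_id is a hypothetical helper and KNOWN_MODEL_IDS is simply copied from the list in this guide, so treat it as example data rather than an authoritative catalogue.

```python
from typing import List, Optional

# Copied from the list in this guide; keep it in sync with the official models page.
KNOWN_MODEL_IDS: List[str] = [
    "claude-opus-4-8",
    "claude-sonnet-4-6",
    "claude-3-5-sonnet-20241022",
    "claude-3-5-haiku-20241022",
    "claude-3-opus-20240229",
]


def suggest_model_id(name: str, known: List[str] = KNOWN_MODEL_IDS) -> Optional[str]:
    """Return the known ID matching name, or one that name is a truncated form of."""
    # Normalize the usual slips: stray whitespace, capitals, dots instead of dashes.
    candidate = name.strip().lower().replace(".", "-")
    if candidate in known:
        return candidate
    for model_id in known:
        # A missing version or date suffix shows up as a proper prefix.
        if model_id.startswith(candidate + "-"):
            return model_id
    return None


for wrong in ["claude-3.5-sonnet", "claude-opus-4", "claude-3-5-sonnet"]:
    print(wrong, "->", suggest_model_id(wrong))
```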

Streaming with the SDK

Use the stream() context manager and iterate — do not call message.content on a stream object before it completes:

# Python streaming
with client.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a poem"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

    # Get the final message while the stream is still open
    final_message = stream.get_final_message()
    print(f"\nUsage: {final_message.usage}")

// TypeScript streaming
const stream = await client.messages.stream({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a poem' }],
});

for await (const chunk of stream) {
  if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
    process.stdout.write(chunk.delta.text);
  }
}

Common mistake: calling message.content on a stream object before it completes — always use the stream iterator or await stream.finalMessage().
Usage stats: token counts are available via stream.get_final_message().usage (Python) or (await stream.finalMessage()).usage (TypeScript) after the stream completes.
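To see why reading content mid-stream gives partial text, it can help to look at how the text is assembled from events. The sketch below uses simulated events as plain dicts with the same type fields the TypeScript example checks; collect_text and the simplified payloads are for illustration, since the SDK yields typed objects rather than dicts.

```python
from typing import Any, Iterable, List, Mapping


def collect_text(events: Iterable[Mapping[str, Any]]) -> str:
    """Join the text carried by text_delta events, ignoring every other event type."""
    parts: List[str] = []
    for event in events:
        if event.get("type") != "content_block_delta":
            continue
        delta = event.get("delta", {})
        if delta.get("type") == "text_delta":
            parts.append(delta["text"])
    return "".join(parts)


# Simulated event sequence with simplified payloads.
simulated_events = [
    {"type": "message_start"},
    {"type": "content_block_start", "index": 0},
    {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Roses "}},
    {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "are red"}},
    {"type": "content_block_stop", "index": 0},
    {"type": "message_stop"},
]

print(repr(collect_text(simulated_events[:3])))  # text seen part-way through the stream
print(repr(collect_text(simulated_events)))      # text once the stream has finished
```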

FAQ

Does Claude API support function calling / tool use?

Yes. Use the tools parameter in messages.create(). Define tools with name, description, and input_schema (JSON Schema). Claude returns tool_use content blocks which you execute and send back as tool_result. The SDK includes helpers for tool use.
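As a rough illustration of that round trip, the sketch below defines one tool and turns a tool_use block into the matching tool_result block. The get_weather tool, its stub handler, and the ID are invented for the example, and blocks are shown as plain dicts, whereas the SDK returns typed objects.

```python
from typing import Any, Callable, Dict

# Hypothetical tool definition: name, description, and JSON Schema for its input.
WEATHER_TOOL: Dict[str, Any] = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> str:
    """Stub implementation; a real tool would call a weather service."""
    return f"Sunny in {city}"


HANDLERS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def to_tool_result(tool_use_block: Dict[str, Any]) -> Dict[str, Any]:
    """Run the requested tool and build the tool_result block to send back."""
    handler = HANDLERS[tool_use_block["name"]]
    output = handler(**tool_use_block["input"])
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_block["id"],  # must echo the ID from the tool_use block
        "content": output,
    }


# Example tool_use block, shaped like the one Claude returns (values are made up).
example_block = {
    "type": "tool_use",
    "id": "toolu_example",
    "name": "get_weather",
    "input": {"city": "Paris"},
}
print(to_tool_result(example_block))
```

The tool definition goes in the tools list of messages.create(), and the tool_result block goes in the content of the next user message.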

Anthropic API pricing — most cost-effective model?

claude-3-5-haiku-20241022 is the cheapest of the model IDs listed in this guide at $0.80/1M input tokens. claude-3-5-sonnet-20241022 is the best value for quality at $3/1M input. claude-3-opus is $15/1M input for the hardest tasks. Prompt caching cuts the price of repeated context (system prompts, documents) by 90% when it is read from the cache.
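A quick calculation makes the difference concrete. This sketch uses only the input prices quoted above and the 10% cached-read rate; estimate_input_cost is a hypothetical helper, and it deliberately ignores output tokens and cache write charges, so treat the result as a lower bound.

```python
from typing import Dict

# Input prices in USD per million tokens, as quoted in this guide.
INPUT_PRICE_PER_MTOK: Dict[str, float] = {
    "claude-3-5-haiku-20241022": 0.80,
    "claude-3-5-sonnet-20241022": 3.00,
    "claude-3-opus-20240229": 15.00,
}
CACHE_READ_MULTIPLIER = 0.10  # cached tokens are billed at 10% of the input price


def estimate_input_cost(model: str, input_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate input cost in USD; cached_tokens is the part of input read from cache."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = INPUT_PRICE_PER_MTOK[model]
    uncached_tokens = input_tokens - cached_tokens
    total = uncached_tokens * price + cached_tokens * price * CACHE_READ_MULTIPLIER
    return total / 1_000_000


# 100K input tokens on Sonnet, with and without 90K of them served from cache.
print(estimate_input_cost("claude-3-5-sonnet-20241022", 100_000))
print(estimate_input_cost("claude-3-5-sonnet-20241022", 100_000, cached_tokens=90_000))
```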

Prompt caching — how to enable?

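Add cache_control: {type: 'ephemeral'} to the content blocks you want cached. The cache lasts 5 minutes by default, and the timer is refreshed each time the cached content is used. Cached tokens cost 10% of the normal input price on re-use, while writing to the cache costs 25% more than normal input. Best for: long system prompts, reference documents, few-shot examples that stay constant across requests.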
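In request terms, that means sending the system prompt as a list of content blocks rather than a bare string, with cache_control on the block to cache. A minimal sketch, where build_cached_request is a hypothetical helper that only assembles the keyword arguments:

```python
from typing import Any, Dict


def build_cached_request(
    model: str, system_text: str, user_text: str, max_tokens: int = 1024
) -> Dict[str, Any]:
    """Build messages.create() arguments with the system prompt marked for caching."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "system": [
            {
                "type": "text",
                "text": system_text,
                # Marks this block (and the prompt prefix before it) as cacheable.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_text}],
    }


request = build_cached_request(
    model="claude-3-5-sonnet-20241022",
    system_text="You are a support assistant. <long reference document here>",
    user_text="How do I reset my password?",
)
print(request["system"][0]["cache_control"])
```

The result can be passed as client.messages.create(**request). Keep the cached text byte-for-byte identical between requests, since any change to it produces a cache miss.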
