Qwen Chat Not Working? Fix Access, API & Model Errors

Troubleshoot Qwen Chat — geographic access restrictions, DashScope API authentication errors, wrong model names, slow responses, and language output issues.

Common errors and fixes

Qwen Chat access blocked (outside China)

qianwen.aliyun.com is geo-restricted in many countries. Alternatives for users outside China:

  1. OpenRouter (best for API): qwen/qwen-2.5-72b-instruct works globally through an OpenAI-compatible API. Sign up at openrouter.ai for a single key that accesses all Qwen models.
  2. Hugging Face: Search "Qwen2.5" on huggingface.co → Spaces for free interactive demos with no account required.
  3. Local with Ollama: Run Qwen models on your own hardware with no API limits or geo-restrictions.
  4. Alibaba Cloud international: Sign up at alibabacloud.com (not aliyun.com) with a credit card for global access to the full DashScope API.
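
As a rough sketch of the OpenRouter route, the request below is built with only the Python standard library. The endpoint URL, the OPENROUTER_API_KEY variable name, and both helper names are assumptions made for illustration; confirm them against OpenRouter's own documentation.

```python
import json
import os
import urllib.request

# Assumed endpoint for OpenRouter's OpenAI-compatible chat API
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_openrouter_request(prompt, api_key, model="qwen/qwen-2.5-72b-instruct"):
    """Build (but do not send) a chat completion request for a Qwen model."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]


# Only touch the network when a key is configured (variable name is an assumption)
api_key = os.environ.get("OPENROUTER_API_KEY")
if api_key:
    print(send(build_openrouter_request("Hello", api_key)))
```

Keeping request construction separate from sending makes it easy to inspect the payload when a call is rejected.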

Ollama local setup:

ollama pull qwen2.5:7b     # 4.7GB — good for most tasks
ollama pull qwen2.5:14b    # 9GB — better quality
ollama pull qwen2.5-coder:7b  # for coding tasks
ollama run qwen2.5:7b
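
Once a model is pulled, it can also be called from code. The sketch below assumes Ollama is serving on its default local address (http://localhost:11434) and uses its /api/chat endpoint; the helper names are hypothetical.

```python
import json
import urllib.request

# Assumed default address of a local Ollama server
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_payload(prompt, model="qwen2.5:7b"):
    """Return the JSON body for a single non-streaming chat turn."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one complete JSON reply instead of chunks
    }


def extract_reply(response_body):
    """Pull the assistant text out of a non-streaming chat response."""
    return response_body["message"]["content"]


def chat(prompt, model="qwen2.5:7b", timeout=120):
    """Send one prompt to the local server and return the reply text."""
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(build_chat_payload(prompt, model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_reply(json.loads(resp.read().decode("utf-8")))
```

Calling chat("Hello") should return the model's reply; if the server is not running, urllib raises a URLError instead.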

DashScope API setup

DashScope offers an OpenAI-compatible endpoint alongside its native SDK. Get your key at dashscope.aliyuncs.com → API Key Management, then set DASHSCOPE_API_KEY as an environment variable.
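
One simple cause of authentication failures is a key that never reached the process. The helper below is a minimal sketch that fails early with a readable message. The function name and region labels are hypothetical, and the international base URL is an assumption to verify in Alibaba Cloud's documentation.

```python
import os

# "cn" is the endpoint used in this guide; the "intl" URL is an assumption
BASE_URLS = {
    "cn": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "intl": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
}


def load_dashscope_config(region="cn", env=os.environ):
    """Return (api_key, base_url), or raise a readable error if the key is missing."""
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set; export it before creating a client"
        )
    if region not in BASE_URLS:
        raise ValueError(f"unknown region {region!r}; expected one of {sorted(BASE_URLS)}")
    return key, BASE_URLS[region]
```

The returned pair can be passed straight to a client constructor as api_key and base_url.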

Option 1 — OpenAI-compatible SDK:

# pip install openai
import os

from openai import OpenAI

# Point the OpenAI client at DashScope's OpenAI-compatible endpoint
client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen-max",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)

Option 2 — native DashScope SDK:

# pip install dashscope
import os

import dashscope
from dashscope import Generation

dashscope.api_key = os.environ["DASHSCOPE_API_KEY"]

response = Generation.call(
    model="qwen-max",
    messages=[{"role": "user", "content": "Hello"}],
    result_format="message",  # needed so the reply is returned under output.choices
)
print(response.output.choices[0].message.content)

Model names and capabilities

Use the correct model identifier for your platform. Qwen model tiers:

Model                      | Use case                     | Context
qwen-max                   | Most capable, best reasoning | 32K
qwen-plus                  | Balanced quality/cost        | 32K
qwen-turbo                 | Fastest, cheapest            | 8K
qwen2.5-72b-instruct       | Open weights, best quality   | 128K
qwen2.5-coder-32b-instruct | Code generation              | 128K
qwq-32b                    | Reasoning/math (like o1)     | 32K

QwQ is Qwen's reasoning model: it thinks step by step, like OpenAI's o1. That makes it best suited to math, logic, and complex problem solving, and much slower than qwen-max because of the long chain of thought.

Identifiers differ by platform. On OpenRouter, add the qwen/ prefix (e.g. qwen/qwq-32b). On Ollama, use tags such as qwen2.5:7b, qwen2.5:14b, and qwen2.5-coder:7b.
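
Because a wrong identifier is easy to send to the wrong platform, a small lookup table can help. This is only a sketch: the identifiers are the ones quoted in this guide, while the tier labels ("general", "fast", and so on) and the function name are invented for illustration.

```python
# Identifiers quoted in this guide; the tier labels are hypothetical
MODEL_IDS = {
    "dashscope": {"general": "qwen-max", "balanced": "qwen-plus", "fast": "qwen-turbo"},
    "openrouter": {"general": "qwen/qwen-2.5-72b-instruct", "reasoning": "qwen/qwq-32b"},
    "ollama": {"general": "qwen2.5:7b", "larger": "qwen2.5:14b", "coding": "qwen2.5-coder:7b"},
}


def resolve_model(platform, tier):
    """Return the model identifier for a platform, with a clear error if unknown."""
    try:
        tiers = MODEL_IDS[platform]
    except KeyError:
        raise ValueError(f"unknown platform {platform!r}; known: {sorted(MODEL_IDS)}") from None
    if tier not in tiers:
        raise ValueError(f"{platform} has no {tier!r} tier; available: {sorted(tiers)}")
    return tiers[tier]
```

For example, resolve_model("openrouter", "reasoning") yields the prefixed name, while asking Ollama for a "reasoning" tier raises an error rather than sending a name the platform would reject.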

Slow responses / timeouts

Qwen models can be slow, especially via DashScope from outside Asia. Solutions:

  • Use qwen-turbo instead of qwen-max for speed-sensitive tasks. It is the fastest tier, at some cost in output quality.
  • QwQ is intentionally slow: it generates a long thinking process before answering. This is expected behavior, not a bug.
  • For lower latency outside China, try Qwen via OpenRouter, which routes requests to optimized infrastructure.

Use streaming to get tokens as they generate:

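# Reuses the client created in the DashScope setup above
stream = client.chat.completions.create(
    model="qwen-turbo",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in stream:
    # Some chunks carry no choices, so guard before indexing
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()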
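
For occasional timeouts, retrying with exponential backoff is a common complement to streaming. The wrapper below is a generic sketch, not part of any Qwen SDK; its name, defaults, and the choice of retryable exceptions are assumptions.

```python
import time


def call_with_retries(
    fn,
    attempts=3,
    base_delay=1.0,
    retry_on=(TimeoutError, ConnectionError),
    sleep=time.sleep,
):
    """Call fn(); on a retryable error wait base_delay * 2**n seconds and retry."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

With the OpenAI SDK, the call to wrap would be a lambda around client.chat.completions.create(...), and retry_on would need to name the SDK's own timeout exception. The SDK client also accepts timeout and max_retries settings, which may be enough on their own.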

Language and output format

Qwen models are multilingual, but their training data leans toward Chinese, so replies can switch to Chinese unexpectedly. To control output:

  • System prompt for consistent English: "Always respond in English. If the user writes in another language, still respond in English unless explicitly asked otherwise."
  • Garbled output or mojibake: make sure your HTTP client decodes responses as UTF-8. All Qwen API responses are UTF-8.
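
Both points can be sketched in a few lines. The helper names here are hypothetical. The first function prepends the English-only system prompt quoted above, and the rest shows how decoding UTF-8 bytes with the wrong codec produces mojibake.

```python
ENGLISH_ONLY = (
    "Always respond in English. If the user writes in another language, "
    "still respond in English unless explicitly asked otherwise."
)


def build_messages(user_text, system_prompt=ENGLISH_ONLY):
    """Prepend the system prompt so every request carries the language rule."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]


def decode_body(raw_bytes):
    """Decode a response body as UTF-8, the encoding this guide says Qwen uses."""
    return raw_bytes.decode("utf-8")


raw = "你好".encode("utf-8")
garbled = raw.decode("latin-1")  # wrong codec: every byte becomes its own character
print(decode_body(raw))  # the original two characters
print(len(garbled))      # 6, one character per UTF-8 byte
```

If output looks like the garbled string, the usual culprit is a client or terminal that assumed a single-byte encoding.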

JSON output mode (Qwen2.5+):

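# Reuses the client created in the DashScope setup above
response = client.chat.completions.create(
    model="qwen-max",
    messages=[
        {"role": "system", "content": "Respond with valid JSON only."},
        {"role": "user", "content": "List 3 colors"},
    ],
    response_format={"type": "json_object"},
)
print(response.choices[0].message.content)  # a JSON string, ready for json.loads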
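
Even in JSON mode it is safer to validate the reply before using it. A minimal sketch, with a hypothetical helper name:

```python
import json


def parse_json_reply(text):
    """Return the parsed object, or None if the reply is missing or not valid JSON."""
    try:
        return json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return None
```

Typical use would be data = parse_json_reply(response.choices[0].message.content), followed by a retry or fallback when data is None.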
FAQ

How does Qwen compare with GPT-4o and Claude on quality?

Qwen2.5-72B is competitive with GPT-4o and Claude 3.5 Sonnet on coding and reasoning benchmarks. QwQ-32B is strong at math and logic. Qwen is particularly strong on Chinese language tasks. Users outside China can run the same prompts against Qwen through OpenRouter to compare quality for themselves without geo-restrictions.

Is Qwen free to use?

Qwen Chat (the app) has a free tier with limits. The DashScope API gives new accounts free trial credit. Self-hosted Qwen (via Ollama) is free with no API limits, but it needs sufficient hardware: 8GB+ of RAM for 7B models, and well over 32GB for 72B, whose weights take roughly 36GB even at 4-bit quantization.
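
The hardware figures follow from simple arithmetic: parameters times bits per weight, divided by 8. The function below is a back-of-the-envelope sketch that assumes 4-bit quantization by default and counts the weights only. Context cache and runtime overhead come on top, so real requirements are higher.

```python
def weight_memory_gb(params_billion, bits_per_weight=4):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


print(weight_memory_gb(7))       # 3.5
print(weight_memory_gb(72))      # 36.0
print(weight_memory_gb(72, 16))  # 144.0
```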

What is QwQ and how is it different from Qwen?

QwQ (Qwen with Questions) is Alibaba's reasoning model, similar to OpenAI's o1. It generates an extended chain of thought before answering, which makes it much slower but more accurate on math, logic, and complex reasoning. Use QwQ when accuracy matters more than speed; use qwen-max for general-purpose tasks where response time matters.
