Qwen Chat Not Working? Fix Access, API & Model Errors
Troubleshoot Qwen Chat — geographic access restrictions, DashScope API authentication errors, wrong model names, slow responses, and language output issues.
Common errors and fixes
Qwen Chat access blocked (outside China)
qianwen.aliyun.com is geo-restricted in many countries. Alternatives for users outside China:
1. OpenRouter (best for API): qwen/qwen-2.5-72b-instruct works globally through an OpenAI-compatible API. Sign up at openrouter.ai for a single key that covers the Qwen models it hosts.
2. Hugging Face: search "Qwen2.5" on huggingface.co → Spaces for free interactive demos with no account required.
3. Local with Ollama: run Qwen models on your own hardware with no API limits or geo-restrictions.
4. Alibaba Cloud international: sign up at alibabacloud.com (not aliyun.com) for global access to the full DashScope API; a credit card is required.
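As a rough illustration of the OpenRouter route, the sketch below builds an OpenAI-style chat request using only the Python standard library. The endpoint path follows OpenRouter's OpenAI-compatible convention; the helper names and the OPENROUTER_API_KEY variable are assumptions made for this example, not names from OpenRouter's documentation.

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_openrouter_request(prompt, api_key, model="qwen/qwen-2.5-72b-instruct"):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask_qwen(prompt):
    """Send the request and return the reply text (needs network access)."""
    # OPENROUTER_API_KEY is an assumed environment variable name.
    request = build_openrouter_request(prompt, os.environ["OPENROUTER_API_KEY"])
    with urllib.request.urlopen(request, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling ask_qwen("Hello") with a valid key should return the model's reply as a string. Keeping the request builder separate makes it easy to inspect the URL, headers, and payload without sending anything.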
Ollama local setup:
ollama pull qwen2.5:7b # 4.7GB — good for most tasks
ollama pull qwen2.5:14b # 9GB — better quality
ollama pull qwen2.5-coder:7b # for coding tasks
ollama run qwen2.5:7b
DashScope API setup
DashScope offers an OpenAI-compatible endpoint alongside its native SDK. Get your key at dashscope.aliyuncs.com → API Key Management, then set DASHSCOPE_API_KEY as an environment variable.
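A missing or empty key is a common cause of authentication errors, so it can help to fail early with a readable message. The following is a minimal sketch; require_api_key is a hypothetical helper written for this guide, not part of any SDK.

```python
import os


def require_api_key(env=None, name="DASHSCOPE_API_KEY"):
    """Return the API key from the environment or raise a readable error."""
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell before running the client."
        )
    return key
```

The returned value can be passed as api_key to either client shown in this section. Stripping whitespace guards against keys pasted with a trailing newline.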
Option 1 — OpenAI-compatible SDK:
import os
from openai import OpenAI
# Read the key from the environment instead of hard-coding it
client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
response = client.chat.completions.create(
    model="qwen-max",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
Option 2 — native DashScope SDK:
pip install dashscope
import os
import dashscope
from dashscope import Generation
dashscope.api_key = os.environ["DASHSCOPE_API_KEY"]
response = Generation.call(
    model="qwen-max",
    messages=[{"role": "user", "content": "Hello"}],
    result_format="message",  # return output.choices in chat format rather than plain text
)
print(response.output.choices[0].message.content)
Model names and capabilities
Use the correct model identifier for your platform. Qwen model tiers:
| Model | Use case | Context |
|---|---|---|
| qwen-max | Most capable, best reasoning | 32K |
| qwen-plus | Balanced quality/cost | 32K |
| qwen-turbo | Fastest, cheapest | 8K |
| qwen2.5-72b-instruct | Open weights, best quality | 128K |
| qwen2.5-coder-32b-instruct | Code generation | 128K |
| qwq-32b | Reasoning/math (like o1) | 32K |
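QwQ is Qwen's reasoning model: it thinks step by step, like OpenAI's o1. It is best for math, logic, and complex problem solving, and much slower than qwen-max because of its chain-of-thought output.
Identifiers differ by platform. On OpenRouter, model IDs carry the qwen/ prefix (e.g. qwen/qwq-32b), and the rest of the name can differ from the DashScope one (qwen/qwen-2.5-72b-instruct rather than qwen2.5-72b-instruct). On Ollama, use tags such as qwen2.5:7b, qwen2.5:14b, and qwen2.5-coder:7b.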
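Because the same model is named differently on each platform, a small lookup can catch a wrong identifier before a request is sent. The sketch below uses only the identifiers mentioned in this guide; KNOWN_MODELS and check_model are hypothetical names, and each platform's real catalog is larger and may change.

```python
from difflib import get_close_matches

# Identifiers taken from this guide; treat the lists as illustrative, not complete.
KNOWN_MODELS = {
    "dashscope": [
        "qwen-max", "qwen-plus", "qwen-turbo",
        "qwen2.5-72b-instruct", "qwen2.5-coder-32b-instruct", "qwq-32b",
    ],
    "openrouter": ["qwen/qwen-2.5-72b-instruct", "qwen/qwq-32b"],
    "ollama": ["qwen2.5:7b", "qwen2.5:14b", "qwen2.5-coder:7b"],
}


def check_model(platform, model):
    """Return the model ID if listed for the platform, else raise with a suggestion."""
    known = KNOWN_MODELS[platform]
    if model in known:
        return model
    guess = get_close_matches(model, known, n=1)
    hint = f" Did you mean {guess[0]!r}?" if guess else ""
    raise ValueError(f"{model!r} is not a listed {platform} model.{hint}")
```

For example, passing the DashScope name qwen2.5-72b-instruct with platform "openrouter" raises an error that points to the OpenRouter spelling.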
Slow responses / timeouts
Qwen models can be slow, especially via DashScope from outside Asia. Solutions:
- Use qwen-turbo instead of qwen-max for speed-sensitive tasks — significantly faster with acceptable quality tradeoff.
- QwQ is intentionally slow — it generates a long thinking process before answering. Expected behavior, not a bug.
- For lowest latency outside China, use Qwen via OpenRouter which routes to optimized infrastructure.
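To check which of these options actually helps from your location, it is worth measuring rather than guessing. The helper below is a minimal sketch that times any callable; measure_latency is a hypothetical name, and the callable you pass in would wrap your own API call.

```python
import statistics
import time


def measure_latency(call, runs=3):
    """Run `call` several times and return the median wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

A typical use would be to wrap one request per model, for instance measure_latency(lambda: client.chat.completions.create(model="qwen-turbo", messages=msgs)), and compare the result against the same call with qwen-max. The median is used so that a single slow request does not dominate the comparison.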
Use streaming to get tokens as they generate:
response = client.chat.completions.create(
    model="qwen-turbo",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in response:
    # Skip chunks that carry no choices or no text
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
Language and output format
Qwen models are multilingual but have a Chinese training bias. To control output:
- System prompt for consistent English:
"Always respond in English. If the user writes in another language, still respond in English unless explicitly asked otherwise."
- Garbled output or mojibake: make sure your HTTP client decodes responses as UTF-8; all Qwen API responses are UTF-8.
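The mojibake case is easy to reproduce offline. The sketch below simulates a UTF-8 response body (the payload is made up for illustration) and decodes it once with the wrong codec and once with the right one.

```python
import json

# Simulated raw response bytes; a real body would come from your HTTP client.
raw = json.dumps({"content": "你好, Qwen"}, ensure_ascii=False).encode("utf-8")

wrong = raw.decode("latin-1")  # mis-decoded: multi-byte characters become mojibake
right = raw.decode("utf-8")    # decoded with the encoding the API actually uses

garbled = json.loads(wrong)["content"]
clean = json.loads(right)["content"]

print(repr(garbled))
print(repr(clean))
```

If your output resembles the first line, look for a place where response bytes are decoded with a default or guessed encoding, and decode them explicitly as UTF-8 instead.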
JSON output mode (Qwen2.5+):
response = client.chat.completions.create(
    model="qwen-max",
    messages=[
        {"role": "system", "content": "Respond with valid JSON only."},
        {"role": "user", "content": "List 3 colors"},
    ],
    response_format={"type": "json_object"},
)
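When using the JSON output mode described in the Language and output format section, the reply still arrives as a string and has to be parsed. The following defensive sketch parses it and tolerates a Markdown code fence around the JSON, which some models add when JSON mode is not enforced; parse_json_reply is a hypothetical helper, not an SDK function.

```python
import json

FENCE = "`" * 3  # a Markdown code fence marker


def parse_json_reply(text):
    """Parse a model reply as JSON, tolerating a code fence around it."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line, then everything from the closing fence on.
        first_newline = cleaned.find("\n")
        cleaned = cleaned[first_newline + 1:] if first_newline != -1 else ""
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model reply was not valid JSON: {exc}") from exc
```

With the OpenAI-compatible client, the input would be response.choices[0].message.content. Raising ValueError gives callers a single exception type to catch, whether the reply is empty, truncated, or plain prose.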
FAQ
Qwen vs GPT-4o vs Claude — quality comparison?
Qwen2.5-72B is reported to be competitive with GPT-4o and Claude 3.5 Sonnet on several coding and reasoning benchmarks. QwQ-32B is strong at math and logic, and Qwen is particularly strong on Chinese language tasks. For global users, accessing Qwen via OpenRouter makes it possible to compare it side by side with other models without geo-restrictions.
Is Qwen free to use?
Qwen Chat (the app) has a free tier with limits. The DashScope API gives new accounts free trial credit. Self-hosted Qwen (via Ollama) is free with no API limits, but it needs sufficient hardware: roughly 8GB+ of RAM for 7B models, and noticeably more than 32GB for 72B, whose 4-bit weights alone take about 36GB before runtime overhead.
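A quick way to sanity-check hardware requirements is to estimate the memory taken by the weights alone: parameters times bits per weight, divided by 8. The sketch below applies that arithmetic with rounded parameter counts. It ignores the KV cache and runtime overhead, and real quantized downloads are somewhat larger (the 7B Ollama build listed earlier is 4.7GB), so treat the results as a lower bound.

```python
def weight_memory_gb(params_billion, bits_per_weight):
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


# Nominal sizes; actual parameter counts differ slightly from the model names.
for name, params in [("7B", 7), ("14B", 14), ("72B", 72)]:
    print(
        f"{name}: ~{weight_memory_gb(params, 4):.1f} GB at 4-bit, "
        f"~{weight_memory_gb(params, 16):.1f} GB at 16-bit"
    )
```

Leave headroom on top of the estimate for context length and for the operating system.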
What is QwQ and how is it different from Qwen?
QwQ (Qwen with Questions) is Alibaba's reasoning model, similar to OpenAI's o1. It generates extended chain-of-thought before answering, making it much slower but more accurate on math, logic, and complex reasoning. Use QwQ when accuracy matters more than speed; use qwen-max for general fast tasks.