r/LocalLLaMA · June 24, 2026 · 1 min read

Qwen-AgentWorld-35B-A3B for Coding?

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

The benchmark below is from the model card. I removed the online models and Qwen-AgentWorld-397B-A17B from the table.

Open models only:

Model	MCP	Search	Term.	SWE	Android	Web	OS	Overall
DeepSeek-V4-Pro	63.27	27.61	51.26	59.44	55.17	50.32	63.70	52.97
GLM-5.1	67.60	22.46	47.32	52.07	59.10	51.50	59.13	51.31
Kimi K2.6	65.23	27.48	52.54	58.77	58.93	50.20	60.80	53.42
MiniMax-M2.7	55.82	27.30	41.62	37.44	52.40	50.52	57.73	46.12
Qwen3.5-35B-A3B	57.87	25.98	46.13	47.58	53.18	47.10	56.27	47.73
Qwen3.5-397B-A17B	68.31	30.81	55.30	64.44	54.90	48.55	60.85	54.74
Qwen-AgentWorld-35B-A3B	64.79	36.69	53.96	65.63	58.17	49.55	65.92	56.39

Qwen models only:

Model	MCP	Search	Term.	SWE	Android	Web	OS	Overall
Qwen3.5-35B-A3B	57.87	25.98	46.13	47.58	53.18	47.10	56.27	47.73
Qwen3.5-397B-A17B	68.31	30.81	55.30	64.44	54.90	48.55	60.85	54.74
Qwen-AgentWorld-35B-A3B	64.79	36.69	53.96	65.63	58.17	49.55	65.92	56.39

AgentWorld's numbers look good compared to the other models: it has the highest Overall score in the table (56.39) and leads on Search, SWE, and OS.
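For anyone who wants to check the comparison rather than eyeball it, here is a minimal Python sketch. The scores are copied from the table above; the helper names (`leaders`, `deltas`, `mean_matches_overall`) are made up for this example and are not from the model card.

```python
# Scores copied from the table above; list order follows the table header.
DOMAINS = ["MCP", "Search", "Term.", "SWE", "Android", "Web", "OS"]

# model -> (per-domain scores, reported Overall)
SCORES = {
    "DeepSeek-V4-Pro": ([63.27, 27.61, 51.26, 59.44, 55.17, 50.32, 63.70], 52.97),
    "GLM-5.1": ([67.60, 22.46, 47.32, 52.07, 59.10, 51.50, 59.13], 51.31),
    "Kimi K2.6": ([65.23, 27.48, 52.54, 58.77, 58.93, 50.20, 60.80], 53.42),
    "MiniMax-M2.7": ([55.82, 27.30, 41.62, 37.44, 52.40, 50.52, 57.73], 46.12),
    "Qwen3.5-35B-A3B": ([57.87, 25.98, 46.13, 47.58, 53.18, 47.10, 56.27], 47.73),
    "Qwen3.5-397B-A17B": ([68.31, 30.81, 55.30, 64.44, 54.90, 48.55, 60.85], 54.74),
    "Qwen-AgentWorld-35B-A3B": ([64.79, 36.69, 53.96, 65.63, 58.17, 49.55, 65.92], 56.39),
}


def leaders():
    """Return {domain: model with the highest score in that domain}."""
    return {
        domain: max(SCORES, key=lambda model: SCORES[model][0][i])
        for i, domain in enumerate(DOMAINS)
    }


def deltas(model, baseline):
    """Return {domain: model score minus baseline score}, rounded to 2 places."""
    a, b = SCORES[model][0], SCORES[baseline][0]
    return {domain: round(a[i] - b[i], 2) for i, domain in enumerate(DOMAINS)}


def mean_matches_overall(tol=0.006):
    """Check whether Overall equals the plain mean of the seven domain scores."""
    return all(
        abs(sum(values) / len(values) - overall) < tol
        for values, overall in SCORES.values()
    )


if __name__ == "__main__":
    for domain, model in leaders().items():
        print(f"{domain:8s} best: {model}")
    print(deltas("Qwen-AgentWorld-35B-A3B", "Qwen3.5-35B-A3B"))
    print("Overall is the plain mean:", mean_matches_overall())
```

From these numbers, the gap between AgentWorld and the same-size Qwen3.5-35B-A3B is largest on SWE, and Overall in this table appears to be just the unweighted mean of the seven columns. That is an observation from the rounded figures, not something I have confirmed against the model card.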

I know many people are still waiting for the Qwen3.7-27B/35B/etc. models.

So in the meantime, is this AgentWorld model worth using for coding?
And first of all, is it suitable for chatting and writing at all?

I found this comment in the model's HF discussion:

first impression:
much better quality and accuracy than Qwen3.6-27B when handling long-term agent tasks.
Huge Thanks to Qwen Team !
Local Agent Model No.1

submitted by /u/pmttyji