Qwen3.6 35B - TXT vs Markdown vs HTML vs HTML+CSS
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
There's been talk of late about using HTML rather than markdown in Claude Code. I was curious how this worked with a local model, so I loaded up Qwen3.6 35B A3B at Q8 with an F16 KV cache.
Then I gave it the same prompt each time (write a detailed explanation of the Blazor render cycle), first asking for raw text, then markdown, then unstyled HTML, then HTML+CSS, and finally with no format constraint (where it chose markdown). I measured the token counts for the reasoning, the total response (including the md or HTML formatting), and the raw response content stripped of formatting.
I also recorded the tokens per second (running MTP with 3 draft tokens) and the total time taken.
| Output | Reasoning tokens | Output tokens | Raw content tokens | Tokens per second | Time taken |
|---|---|---|---|---|---|
| Raw text | 1,873 | 1,080 | 1,080 | 146 | 20s |
| Markdown | 1,264 | 1,496 | 1,269 | 123.5 | 23s |
| Unstyled HTML | 166 | 7,346 | 4,857 | 139 | 56s |
| Styled HTML | 108 | 10,290 | 3,418 | 139 | 82s |
| No constraint (chose markdown) | 1,465 | 2,256 | 2,002 | 122 | 31s |
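To make the formatting cost easier to see, here is a minimal Python sketch that takes the figures from the table above and works out how many output tokens went on formatting rather than content. The `rows` dict and the `formatting_overhead` helper are names made up for illustration; the only inputs are the numbers already reported.

```python
# Figures copied from the table above: (reasoning, output, raw content) tokens.
rows = {
    "Raw text": (1873, 1080, 1080),
    "Markdown": (1264, 1496, 1269),
    "Unstyled HTML": (166, 7346, 4857),
    "Styled HTML": (108, 10290, 3418),
    "No constraint (chose markdown)": (1465, 2256, 2002),
}


def formatting_overhead(output_tokens: int, raw_tokens: int) -> float:
    """Fraction of output tokens spent on formatting rather than content."""
    return (output_tokens - raw_tokens) / output_tokens


for name, (reasoning, output, raw) in rows.items():
    share = formatting_overhead(output, raw)
    print(f"{name}: {output - raw} formatting tokens ({share:.1%} of output)")
```

On these numbers, markdown spends roughly 15% of its output on formatting, unstyled HTML about a third, and styled HTML about two thirds.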
Finally, I got ChatGPT 5.5 Extended Reasoning to score the quality of each output based on:
- How much correct useful information is present
- How well it is explained
- How many errors it contains
- How efficiently it uses its length
| Rank | Output | Coverage | Explanation | Errors | Density | Total |
|---|---|---|---|---|---|---|
| 1 | Markdown | 31/40 | 21/25 | 18/25 | 8/10 | 78/100 |
| 2 | No constraint (chose markdown) | 32/40 | 18/25 | 13/25 | 8/10 | 71/100 |
| 3 | Raw text | 30/40 | 19/25 | 11/25 | 6/10 | 66/100 |
| 4 | Unstyled HTML | 34/40 | 17/25 | 6/25 | 4/10 | 61/100 |
| 5 | Styled HTML | 33/40 | 19/25 | 3/25 | 3/10 | 58/100 |
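As a quick sanity check on the scoring table, this short Python sketch sums the four sub-scores for each output and sorts by total. The `scores` dict just restates the table above; it assumes the four columns carry the 40/25/25/10 weights shown and that a higher score is better in every column, including the errors one.

```python
# Sub-scores copied from the table above: (Cov /40, Expl /25, Err /25, Dens /10).
scores = {
    "Markdown": (31, 21, 18, 8),
    "No constraint (chose markdown)": (32, 18, 13, 8),
    "Raw text": (30, 19, 11, 6),
    "Unstyled HTML": (34, 17, 6, 4),
    "Styled HTML": (33, 19, 3, 3),
}

# Total out of 100 is the plain sum of the four sub-scores.
totals = {name: sum(parts) for name, parts in scores.items()}

# Highest total first.
ranking = sorted(totals, key=totals.get, reverse=True)

for rank, name in enumerate(ranking, start=1):
    print(f"{rank}. {name}: {totals[name]}/100")
```

The sums match the Total column, and the spread comes mostly from the errors score: the two HTML runs score highest on coverage but lowest on errors and density.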