r/LocalLLaMA · · 1 min read

Qwen3-tts.cpp + Compose Desktop GUI

I improved my qwen3-tts.cpp implementation, which now runs at about 5x realtime on my RTX 5080. It is GGML-based, so it should compile and run anywhere, but I have only tested it with CPU and CUDA on Windows and Linux: https://github.com/Danmoreng/qwen3-tts.cpp
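
For anyone unsure what "5x realtime" means: it is the length of the generated audio divided by the wall-clock time it took to generate. A minimal sketch of that arithmetic follows. The function name is made up for illustration and is not part of qwen3-tts.cpp, and the 24 kHz sample rate in the usage note is an assumed example value.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from qwen3-tts.cpp): speed relative to realtime,
// i.e. seconds of audio produced per second of wall-clock generation time.
// A result of 5.0 means 5 seconds of speech per second of compute.
double realtime_speedup(std::size_t num_samples, int sample_rate_hz,
                        double generation_seconds) {
    assert(sample_rate_hz > 0 && generation_seconds > 0.0);
    const double audio_seconds =
        static_cast<double>(num_samples) / sample_rate_hz;
    return audio_seconds / generation_seconds;
}
```

With these example numbers, `realtime_speedup(240000, 24000, 2.0)` gives 5.0: ten seconds of audio generated in two seconds.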

Additionally, I made a desktop GUI with Kotlin Compose Multiplatform, which also works on Windows and Linux: https://github.com/Danmoreng/qwen-tts-studio

Prebuilt Windows releases are available to download and run directly. On Linux, it must be built from source.

Qwen-TTS-Studio

Features:

  • fastest GGML implementation I know of, 15x faster than the Python reference
  • 0.6B & 1.7B models
  • base model with voice cloning
  • CustomVoice model with instructions
  • VoiceDesign model with instructions
  • save speaker embeddings
  • mix & merge speaker embeddings
  • streaming (including semi-accurate text highlighting)
  • built-in download of pre-converted GGUF models from Hugging Face (https://huggingface.co/Serveurperso/Qwen3-TTS-GGUF)
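
To give a rough idea of what mixing speaker embeddings can look like: one common approach is a plain linear blend of two embedding vectors. The sketch below assumes that approach. The post does not say how the studio actually mixes or merges embeddings, and `mix_embeddings` is a hypothetical name, not a function from either repository.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (assumed technique, not taken from the project):
// linear blend of two speaker embeddings of equal length.
// weight_b = 0.0 returns a copy of a, weight_b = 1.0 returns a copy of b.
std::vector<float> mix_embeddings(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  float weight_b) {
    assert(a.size() == b.size());
    assert(weight_b >= 0.0f && weight_b <= 1.0f);
    std::vector<float> mixed(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        mixed[i] = (1.0f - weight_b) * a[i] + weight_b * b[i];
    }
    return mixed;
}
```

For example, `mix_embeddings({1.0f, 0.0f}, {0.0f, 1.0f}, 0.5f)` yields a vector halfway between the two inputs.
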

submitted by /u/Danmoreng