r/LocalLLaMA · June 6, 2026 · 1 min read

Serving TTS/cloning models on llama.cpp?

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Are there any quality voice cloning and speech generation models that already have support in llama.cpp or, more likely, vLLM-Omni? It would be nice to swap them out like any other inference model and use a common API, rather than setting up a separate container or conda environment for each model I want to try.

MOSS looks decent, but it seems to fall into the latter category and need its own container or conda environment.

The same goes for image and video generation, honestly.

submitted by /u/FrozenBuffalo25
