Serving TTS/cloning models on llama.cpp?
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Are there any quality voice cloning and speech generation models that are already supported in llama.cpp or, more likely, vLLM-Omni? It would be nice to swap them out like any other inference model and use a common API, rather than setting up a separate container or conda environment for each model I want to try.
MOSS looks decent, but it seems to fall into the latter category: it needs its own separate setup.
The same goes for image and video generation, honestly.
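To make the "common API" idea concrete for the speech case, here is a rough sketch of what swapping models could look like. It assumes the server exposes an OpenAI-style `/v1/audio/speech` endpoint, which I have not confirmed for llama.cpp or vLLM-Omni. The base URL, the model names, and the `voice` value are all hypothetical placeholders.

```python
import json
import urllib.request


def build_speech_request(base_url, model, text, voice="default"):
    """Build (but do not send) a POST request for an OpenAI-style speech endpoint.

    Assumption: the server accepts the OpenAI request shape
    (model / input / voice / response_format). Not verified for any
    particular backend.
    """
    payload = {
        "model": model,            # hypothetical model name; the only thing that changes per model
        "input": text,
        "voice": voice,            # hypothetical voice identifier
        "response_format": "wav",
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(base_url, model, text, out_path, voice="default"):
    """Send the request and write the returned audio bytes to out_path."""
    req = build_speech_request(base_url, model, text, voice)
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
```

With something like this, trying another model would only mean changing the `model` argument, for example `synthesize("http://localhost:8000", "some-tts-model", "Hello there", "out.wav")`, instead of standing up a new environment per model.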