I released Inflect-Nano, an ultra-tiny 4.63M-parameter TTS model.
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
I’ve been experimenting with how small a usable neural TTS model can realistically get, and I just released Inflect-Nano-v1. As far as I could find (though I could be wrong on this), Inflect-Nano-v1 is the second-smallest publicly released TTS model (after TinyTTS), and it performs surprisingly well for its size. Even if you have a certified potato computer, it can run on that.

It is not SOTA, and I’m not pretending it beats large models. The interesting part is the size-to-functionality ratio:

- 4.63M total inference params
- 3.46M acoustic model
- 1.17M vocoder
- 24 kHz audio
- English-only, single male voice
- Runs locally with a simple PyTorch inference script

For comparison, it is ~17x smaller than Kokoro, ~108x smaller than Chatterbox, and almost 1000x smaller than Fish Audio S2 Pro.

The quality is still limited: it can sound robotic, it can stumble on difficult, unseen text, and the vocoder is a big bottleneck. But for under 5M parameters total, I think it is an interesting baseline for extremely tiny local speech synthesis, offline assistants, embedded devices, browser/WASM-style projects, and local voice agents.

Model: https://huggingface.co/owensong/Inflect-Nano-v1 (audio examples in README)

I’d love feedback, especially from people interested in tiny models, local voice assistants, efficient inference, or small vocoders. If people find it useful and the model is successful, I'm open to making a v2 with a much larger training budget!
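For anyone who wants to sanity-check the size figures, here is a minimal arithmetic sketch. It uses only the numbers stated in the post (3.46M acoustic, 1.17M vocoder, and the ~17x / ~108x / ~1000x ratios). The weight precisions (fp32, fp16, int8) are assumptions for illustration, since the post does not say how the weights are stored, and the "implied" sizes are simply back-computed from the post's rounded ratios, not official parameter counts for those models. The function names are hypothetical helpers, not part of the Inflect-Nano code.

```python
# Parameter counts as stated in the post, in millions of parameters.
ACOUSTIC_M = 3.46
VOCODER_M = 1.17

# Size ratios as stated in the post (rounded by the author).
STATED_RATIOS = {"Kokoro": 17, "Chatterbox": 108, "Fish Audio S2 Pro": 1000}

# Assumed storage precisions (bytes per parameter); not stated in the post.
ASSUMED_PRECISIONS = {"fp32": 4, "fp16": 2, "int8": 1}


def total_params_m(acoustic_m: float, vocoder_m: float) -> float:
    """Total inference parameters in millions, rounded to 2 decimals."""
    return round(acoustic_m + vocoder_m, 2)


def weight_size_mb(params_m: float, bytes_per_param: int) -> float:
    """Approximate raw weight size in MB (1 MB = 10**6 bytes).

    Millions of parameters times bytes per parameter gives millions of
    bytes. File-format overhead and runtime activations are ignored.
    """
    return params_m * bytes_per_param


def implied_size_m(params_m: float, ratio: float) -> float:
    """Parameter count (in millions) implied by an 'Nx smaller' claim."""
    return params_m * ratio


if __name__ == "__main__":
    total = total_params_m(ACOUSTIC_M, VOCODER_M)
    print(f"total inference params: {total}M")

    for name, nbytes in ASSUMED_PRECISIONS.items():
        print(f"  weights at {name}: ~{weight_size_mb(total, nbytes):.2f} MB")

    for model, ratio in STATED_RATIOS.items():
        implied = implied_size_m(total, ratio)
        print(f"  {model}: ~{ratio}x implies roughly {implied:,.0f}M params")
```

The two components do add up to the stated 4.63M total, and under the fp32 assumption the raw weights come to under 20 MB, which is consistent with the "runs on a potato" claim.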