Phonikud: Overcoming Phonetic Underspecification for Hebrew Text-To-Speech
Abstract: Text-to-speech (TTS) for Modern Hebrew is challenged by the language's orthographic complexity, with existing solutions ignoring underspecified phonetic features such as stress. We present a framework for more phonetically accurate Hebrew TTS with four contributions: (1) Phonikud, an open-source Hebrew grapheme-to-phoneme (G2P) system that outputs fully-specified International Phonetic Alphabet (IPA) transcriptions, designed by augmenting a base diacritizer. (2) The ILSpeech corpus of paired Hebrew audio, text, and expert IPA annotations. (3) A benchmark for the previously unmeasured task of Hebrew G2P conversion. (4) Hebrew audio-to-IPA models capturing previously disregarded phonetic details for automatic TTS evaluation. Our results show that Phonikud more accurately predicts Hebrew phonemes than prior methods, and that small, local TTS models with phonetic input from Phonikud approach large proprietary systems. We release our code, data, and models at this https URL.
| Comments: | Accepted to Interspeech 2026. Project page: this https URL |
| Subjects: | Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS) |
| Cite as: | arXiv:2506.12311 [cs.CL] (or arXiv:2506.12311v3 [cs.CL] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2506.12311 (arXiv-issued DOI via DataCite) |
Submission history
From: Morris Alper
[v1] Sat, 14 Jun 2025 02:16:38 UTC (1,380 KB)
[v2] Fri, 10 Oct 2025 00:10:56 UTC (1,433 KB)
[v3] Tue, 16 Jun 2026 20:55:53 UTC (608 KB)