From: | Marcelo Y. Matuda |
Subject: | Re: [gnuspeech-contact] Quickstart for the latest Gnuspeech? |
Date: | Sun, 8 Nov 2015 00:31:56 -0200 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 |
Hi,

On 11/01/2015 09:53 PM, Advrk Aplmrkt wrote:
> Thanks Marcelo for the explanation. So *that's* why Siri sounds so good! I can see how articulatory synthesis, when fully developed, can be more powerful because you don't need to pre-record everything!
And the user can (more or less) easily change the voices. Articulatory synthesis will (hopefully) allow users to change accent / intonation / emotion. Another application is singing synthesis (see Pavarobotti, http://www.cs.princeton.edu/~prc/SingingSynth.html and VocalTractLab).
Gnuspeech already allows changing the voices and testing custom intonation curves.
Articulatory synthesis can also be used to study the phonatory system and to simulate speech problems.
> Also, as a non-programmer and complete non-expert on the subject, how can a user support and expedite development of Gnuspeech?
Users can tell other people about the advantages of Gnuspeech (while not hiding its disadvantages). For example, with Gnuspeech you can easily change the voices (vocal tract length, breathiness, etc.), and Gnuspeech is still the only _articulatory_ text-to-speech system (it converts English text to speech).
> Finally, other than Gnuspeech, is there other Free Software text-to-speech software that can produce equal or better quality synthesis? Thanks!!
The perceived quality depends on the person. I know these:

- Espeak
- Festival
- Flite (Festival lite)
- MaryTTS
- RHVoice

Regards,
Marcelo
> On 01/11/2015, Marcelo Y. Matuda <address@hidden> wrote:
>> Hi,
>>
>> On 11/01/2015 03:45 PM, Advrk Aplmrkt wrote:
>>> Thanks for the links, and I agree a proper man page or quickstart guide would be super useful for end users! (and not just speech synthesis researchers) I checked out the YouTube videos, and I confess it was hard for me to understand what Gnuspeech was saying... Is there a reason why it doesn't sound nearly as natural as, say, Siri yet???
>>
>> Siri uses a method called Unit Selection (AFAIK), which joins segments of recorded speech. That is why the quality can be so good. Gnuspeech uses articulatory synthesis, which uses a mathematical model of the human vocal tract to synthesize the speech from scratch. It is very difficult to adjust the many parameters.
>>
>> Also GnuspeechSA is a C++ port of the original TTS_Server (for NeXTSTEP), developed a long time ago. It doesn't yet incorporate the research done in all these years. Hopefully articulatory synthesis will reach the quality of unit selection, but there is much work to do.
>>
>> Regards,
>> Marcelo