Blog
-- Thoughts on data analysis, software
development and innovation management. Comments are welcome
Post 41
TTS in the future
21-Nov-2010
At the FALA 2010 conference, Dr. Heiga Zen
gave a talk entitled
"Fundamentals and recent advances in HMM-based speech synthesis".
He reviewed the growth of Hidden Markov Models (HMMs) in recent years
in the TTS research community. Indeed, this trend was also
evident in the Speech Synthesis Albayzin 2010 Evaluation,
where out of the 10 participating systems, 3 were purely concatenative,
6 were based on HMMs, and one was a hybrid approach (HMM-based + concatenative).
And it was the hybrid system that won the competition.
Midway through his presentation, he cited Dr. Simon King's talk
at the Interspeech 2010 conference, which stated
that TTS synthesis is easy as long as some recommendations are
followed. Overall, the advice is to avoid non-professional speakers,
small corpora, noisy recordings and labelling
mistakes, and to acquire a good deal of knowledge of the language
targeted by the system. This amounts to a redefinition of the core problem for research to tackle.
Lastly, Dr. Zen encouraged the audience to join the research in TTS synthesis, and he
provided some directions to get involved, beginning with text processing,
i.e. the first stage in a TTS synthesis system. Thus, it seems that
there is an especially nice and promising framework for my Ph.D. :)