Research Article Open Access

Improvement of Tone Intelligibility for Average-Voice-Based Thai Speech Synthesis

Suphattharachai Chomphan (1) and Chutarat Chompunth (2)
  • 1 Department of Electrical Engineering, Faculty of Engineering at Si Racha, Kasetsart University, 199 M.6, Tungsukhla, Si Racha, Chonburi, 20230, Thailand
  • 2 School of Social and Environmental Development, National Institute of Development Administration, 118 M.3, Serithai Road, Klong-Chan, Bangkapi, Bangkok, 10240, Thailand

Abstract

Problem statement: Tone intelligibility is an important attribute of synthetic speech. In average-voice-based HMM-based Thai speech synthesis, the tone correctness of the synthetic speech is degraded considerably. The tying mechanism in decision-tree-based context clustering, when applied without an appropriate criterion, causes unexpected tone neutralization. Incorporating phrase intonation into the context clustering process in the training stage was proposed earlier; however, the resulting tone correctness is still unsatisfactory. Approach: This study proposes a number of tonal features, including tone-geometrical features and phrase intonation features, to be exploited in the context clustering process of the HMM training stage. Results: In the experiments, subjective evaluations of tone intelligibility were conducted for both the average voice and the adapted voice. The effects of the extracted features on the decision trees were also evaluated. Two core experiments were conducted, taking the gender of the training speech into account. The first experiment shows that the proposed tonal features improve tone intelligibility more for the female speech model than for the male speech model, while the second shows that they improve tone intelligibility more for the gender-dependent model than for the gender-independent model. Conclusion: All of the experimental results confirm that the tone correctness of speech synthesized by average-voice-based HMM-based Thai speech synthesis is significantly improved when most of the extracted features are used.

American Journal of Applied Sciences
Volume 9 No. 3, 2012, 358-364

DOI: https://doi.org/10.3844/ajassp.2012.358.364

Submitted On: 11 October 2011
Published On: 24 January 2012

How to Cite: Chomphan, S. & Chompunth, C. (2012). Improvement of Tone Intelligibility for Average-Voice-Based Thai Speech Synthesis. American Journal of Applied Sciences, 9(3), 358-364. https://doi.org/10.3844/ajassp.2012.358.364

Keywords

  • Thai speech
  • speech synthesis
  • tone intelligibility
  • tone correctness
  • generative model
  • context clustering
  • average voice
  • hidden Markov models