Improvement of Tone Intelligibility for Average-Voice-Based Thai Speech Synthesis

Suphattharachai Chomphan; Chutarat Chompunth

doi:10.3844/ajassp.2012.358.364

Research Article Open Access

Suphattharachai Chomphan¹ and Chutarat Chompunth²

¹ Department of Electrical Engineering, Faculty of Engineering at Si Racha, Kasetsart University, 199 M.6, Tungsukhla, Si Racha, Chonburi, 20230, Thailand
² School of Social and Environmental Development, National Institute of Development Administration, 118 M.3, Serithai Road, Klong-Chan, Bangkapi, Bangkok, 10240, Thailand

Abstract

Problem statement: Tone intelligibility is an important attribute of synthetic speech. In average-voice-based HMM-based Thai speech synthesis, the tone correctness of the synthetic speech is degraded considerably. The tying mechanism in decision-tree-based context clustering, when applied without an appropriate criterion, causes unexpected tone neutralization. Incorporating phrase intonation into the context clustering process at the training stage was proposed earlier; however, the resulting tone correctness is still unsatisfactory. Approach: This study proposes a number of tonal features, including tone-geometrical features and phrase intonation features, to be exploited in the context clustering process of the HMM training stage. Results: Subjective evaluations of tone intelligibility were conducted for both the average voice and the adapted voice. The effects of the extracted features on the decision trees were also evaluated. Two core experiments were conducted, taking the gender of the training speech into account. The first experiment shows that the proposed tonal features improve tone intelligibility more for the female speech model than for the male speech model, while the second shows that they improve tone intelligibility more for the gender-dependent model than for the gender-independent model. Conclusion: All of the experimental results confirm that the tone correctness of speech synthesized by average-voice-based HMM-based Thai speech synthesis is significantly improved with most of the extracted features.

American Journal of Applied Sciences

Volume 9 No. 3, 2012, 358-364

DOI: https://doi.org/10.3844/ajassp.2012.358.364

Submitted On: 11 October 2011
Published On: 24 January 2012

How to Cite: Chomphan, S. & Chompunth, C. (2012). Improvement of Tone Intelligibility for Average-Voice-Based Thai Speech Synthesis. American Journal of Applied Sciences, 9(3), 358-364. https://doi.org/10.3844/ajassp.2012.358.364

Copyright: © 2012 Suphattharachai Chomphan and Chutarat Chompunth. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

Thai speech
speech synthesis
tone intelligibility
tone correctness
generative model
context clustering
average voice
hidden Markov models