Research Article Open Access

Comparative Evaluation of Phone Duration Models for Greek Emotional Speech

Alexandros Lazaridis, Vasiliki Bourna and Nikos Fakotakis

Abstract

Problem statement: This study addresses phone duration modeling for Greek emotional speech synthesis. Approach: Several well-established machine learning techniques were applied to a Modern Greek emotional speech database covering five archetypal emotional categories: anger, fear, joy, neutral and sadness. The phone duration prediction models were built on phonetic, morphosyntactic and prosodic features that can be extracted from text alone. The techniques evaluated were model and regression trees, linear regression, lazy learning algorithms and meta-learning algorithms using regression trees as base classifiers. Results: Model trees based on the M5’ algorithm, and meta-learning algorithms using M5’-based regression trees as the base classifier, performed better than the other techniques. Conclusion: The emotional categories of the speech database with the most uniform distribution of phone durations yielded the most accurate models.

Journal of Computer Science
Volume 6 No. 3, 2010, 341-349

DOI: https://doi.org/10.3844/jcssp.2010.341.349

Submitted On: 18 December 2009
Published On: 31 March 2010

How to Cite: Lazaridis, A., Bourna, V. & Fakotakis, N. (2010). Comparative Evaluation of Phone Duration Models for Greek Emotional Speech. Journal of Computer Science, 6(3), 341-349. https://doi.org/10.3844/jcssp.2010.341.349

Keywords

  • Phone duration modeling
  • Statistical modeling
  • Emotional speech
  • Text-to-speech synthesis