TU Berlin

Quality and Usability Lab

Florian Hinterleitner

Work

Research Field

quality, speech technology


Research Topics

quality and quality dimensions of synthetic speech
instrumental quality prediction of synthetic speech


Biography
Florian Hinterleitner studied Communication and Computer Science at the Technical University Berlin. In 2010 he completed his Magister thesis, "Signalbased Quality Prediction of Synthetic Speech". He currently works as a research assistant at the Quality and Usability Lab of Telekom Innovation Laboratories, TU Berlin, in the domain of quality prediction of synthetic speech.

 

Projects

 

Teaching

Speech Communication (since winter semester 2010/2011)

Publications

Quality Prediction of Synthesized Speech based on Perceptual Quality Dimensions
Citation key norrenbrock2015a
Author Norrenbrock, Christoph Ritter and Hinterleitner, Florian and Heute, Ulrich and Möller, Sebastian
Pages 17–35
Year 2015
ISSN 0167-6393
DOI 10.1016/j.specom.2014.06.003
Address New York, USA
Journal Speech Communication
Volume 66
Month February
Note print/online
Publisher Elsevier
How published full
Abstract Instrumental speech-quality prediction for text-to-speech signals is explored in a twofold manner. First, the perceptual quality space of TTS is structured by means of three perceptual quality dimensions which are derived from multiple auditory tests. Second, quality-prediction models are evaluated for each dimension using prosodic and MFCC-based measurands. Linear and nonlinear model types are compared under cross-validation restrictions, giving detailed insight into model-generalizability aspects. Perceptually regularized properties, denoted as quality elements, are introduced in order to encode the quality-indicative effect of individual signal characteristics. These elements integrate a perceptual model reference which is derived in a semi-supervised fashion from natural and synthetic speech. The results highlight the feasibility of instrumental quality prediction for TTS signals provided that broad training material is employed. High prediction accuracy, however, requires nonlinear model structures.
