TU Berlin

Dr.-Ing. Babak Naderi


Research Interests:

  • Subjective quality assessment
  • Speech Quality Assessment in Crowdsourcing
  • Motivation, Workload, and Performance in Crowdsourcing
  • Statistical Modeling, field data and applied statistics
  • Speech Enhancement
  • Text Complexity and Simplification


Babak Naderi obtained his Dr.-Ing. degree (PhD) in September 2017 with a thesis titled "Motivation of Workers on Microtask Crowdsourcing Platforms". He holds a Master's degree in Geodesy and Geoinformation Science from the Technical University Berlin, with a thesis on "Monte Carlo Localization for Pedestrian Indoor Navigation Using a Map Aided Movement Model". He also holds a Bachelor's degree in Software Engineering.

Since August 2012, Babak Naderi has been working as a research scientist at the Quality and Usability Lab of TU Berlin.

From 2013 to 2015, Babak was awarded a place in Softwarecampus, a BMBF-funded education program for future IT and development leadership that involves Bosch, Datev, Deutsche Telekom AG, Holtzbrinck, SAP, Scheer Group, Siemens, and Software AG alongside highly ranked academic institutions. He took part by leading the CrowdMAQA project.

In his dissertation, Babak studied the motivation of crowdworkers in detail. He developed the Crowdwork Motivation Scale for measuring general motivation, based on the Self-Determination Theory of motivation. The scale has been validated in several studies. In addition, he studied the factors that influence motivation and the influence of different motivation types on the quality of outcomes. He also developed models for predicting the task selection strategy of workers, including models for automatically predicting the expected workload associated with a task from its design, task acceptance, and performance.

Besides his other research activities, Babak is actively working on the standardization of methods for speech quality assessment in crowdsourcing environments in the P.CROWD work program of Study Group 12 of the ITU Telecommunication Standardization Sector (ITU-T).

Reviewed for WWW, CHI, ICASSP, CSCW, MMSys, PQS, HCOMP, ICWE, QoMEX, International Journal of Human-Computer Studies, Computer Networks, Behaviour & Information Technology, Quality and User Experience.


Selected talks:

  • "Motivation of Crowd Workers, does it matter?", Schloss Dagstuhl, Evaluation in the Crowd: Crowdsourcing and Human-Centred Experiments, November 2015.
  • "Motivation and Quality Assessment in Online Paid Crowdsourcing Micro-task Platforms", Schloss Dagstuhl, Crowdsourcing: From Theory to Practice and Long-Term Perspectives, September 2013.


Office Hours: By appointment



Quality and Usability Lab

Technische Universität Berlin
Ernst-Reuter-Platz 7
D-10587 Berlin

Tel.: +49 (30) 8353-54221
Fax: +49 (30) 8353-58409



Intra- and Inter-rater Agreement in a Subjective Speech Quality Assessment Task in Crowdsourcing
Citation key zequeirajimenez2019c
Author Zequeira Jiménez, Rafael and Llagostera, Anna and Naderi, Babak and Möller, Sebastian and Berger, Jens
Title of Book Companion Proceedings of The 2019 World Wide Web Conference
Pages 1138–1143
Year 2019
ISBN 978-1-4503-6675-5
DOI 10.1145/3308560.3317084
Address New York, NY, USA
Month May
Publisher ACM
Series WWW '19
How Published Full paper
Abstract Crowdsourcing is a great tool for conducting subjective user studies with large amounts of users. Collecting reliable annotations about the quality of speech stimuli is challenging. The task itself is of high subjectivity and users in crowdsourcing work without supervision. This work investigates the intra- and inter-listener agreement within a subjective speech quality assessment task. To this end, a study has been conducted in the laboratory and in crowdsourcing in which listeners were requested to rate speech stimuli with respect to their overall quality. Ratings were collected on a 5-point scale in accordance with the ITU-T Rec. P.800 and P.808, respectively. The speech samples were taken from the database ITU-T Rec. P.501 Annex D, and were presented four times to the listeners. Finally, the crowdsourcing results were contrasted to the ratings collected in the laboratory. Strong and significant Spearman's correlation was achieved when contrasting the ratings collected in both environments. Our analysis shows that while the inter-rater agreement increased the more the listeners conducted the assessment task, the intra-rater reliability remained constant. Our study setup helped to overcome the subjectivity of the task and we found that disagreement can represent a source of information to some extent.