TU Berlin

Quality and Usability Lab

Steven Schmidt


Research Field

  • Quality of Experience (QoE) for Cloud Gaming Services
  • Engagement in Virtual Reality

Research Topics

  • Identification and quantification of perceptual quality dimensions for gaming QoE
  • Prediction of gaming QoE based on encoding and network parameters
  • Classification of game content
  • Crowdsourcing for gaming evaluation


Steven Schmidt received his M.Sc. degree in Electrical Engineering from TU Berlin with a major in Communication Systems. Since 2016 he has been employed as a research assistant at the Quality and Usability Lab, where he is working towards a PhD in the field of Quality of Experience in Mobile Gaming.


ITU-T SG12 Activities:

  • ITU-T Rec. G.1032 - Influence Factors on Gaming Quality of Experience (2017)
  • ITU-T Rec. P.809 - Subjective Evaluation Methods for Gaming Quality (2018)
  • ITU-T Rec. G.1072 - Opinion Model Predicting Gaming QoE for Cloud Gaming Services (2020)


Quality and Usability Lab
Technische Universität Berlin
Ernst-Reuter-Platz 7
D-10587 Berlin, Germany

Tel:  +49 151 12044969


NDNetGaming - development of a no-reference deep CNN for gaming video quality prediction
Citation key utke2020a
Author Utke, Markus and Zadtootaghaj, Saman and Schmidt, Steven and Bosse, Sebastian and Möller, Sebastian
Pages 1–23
Year 2020
ISSN 1573-7721
DOI 10.1007/s11042-020-09144-6
Journal Multimedia Tools and Applications
Month jul
Publisher Springer
How Published Full paper
Abstract Gaming video streaming services are growing rapidly due to new services such as passive video streaming of gaming content, e.g. Twitch.tv, as well as cloud gaming, e.g. Nvidia GeForce NOW and Google Stadia. In contrast to traditional video content, gaming content has special characteristics such as extremely high and special motion patterns, synthetic content and repetitive content, which pose new opportunities for the design of machine learning-based models to outperform the state-of-the-art video and image quality approaches for this special computer-generated content. In this paper, we train a Convolutional Neural Network (CNN) using an objective quality model, VMAF, as ground truth and fine-tune it on subjective image quality ratings. In addition, we propose a new temporal pooling method to predict gaming video quality based on frame-level predictions. Finally, the paper also describes how an appropriate CNN architecture can be chosen and how well the model performs on different contents. Our results show that among the four popular network architectures that we investigated, DenseNet performs best for image quality assessment based on the training dataset. By training the last 57 convolutional layers of DenseNet on VMAF values, we obtained a high-performance model to predict the VMAF of distorted frames of video games with a Spearman's rank correlation coefficient (SRCC) of 0.945 and a Root Mean Square Error (RMSE) of 7.07 on the image level, while achieving a higher performance on the video level, leading to an SRCC of 0.967 and an RMSE of 5.47 for the KUGVD dataset. Furthermore, we fine-tuned the model on subjective quality ratings of images from gaming content, which resulted in an SRCC of 0.93 and an RMSE of 0.46 using one-hold-out cross-validation. Finally, on the video level, using the proposed pooling method, the model achieves very good performance, indicated by an SRCC of 0.968 and an RMSE of 0.30 for the gaming video dataset used.
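The abstract reports video-level SRCC and RMSE values obtained from pooled frame-level predictions. As a rough illustration of how such figures are typically computed, here is a minimal sketch in plain Python. It is not the paper's method: the abstract does not describe the proposed temporal pooling, so a simple mean over frames is used here as a stand-in baseline. The function names, video names, and all scores are hypothetical.

```python
import math
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def srcc(predicted, target):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(predicted), average_ranks(target))


def rmse(predicted, target):
    """Root mean square error between predictions and targets."""
    return math.sqrt(mean((p - t) ** 2 for p, t in zip(predicted, target)))


def mean_pool(frame_scores):
    """Baseline temporal pooling (an assumption, NOT the paper's method)."""
    return mean(frame_scores)


# Hypothetical frame-level predictions and video-level reference scores.
frame_predictions = {
    "video_a": [80.0, 82.0, 78.0],
    "video_b": [55.0, 60.0, 50.0],
    "video_c": [30.0, 35.0, 25.0],
}
reference_scores = {"video_a": 85.0, "video_b": 50.0, "video_c": 35.0}

names = sorted(frame_predictions)
video_scores = [mean_pool(frame_predictions[n]) for n in names]
targets = [reference_scores[n] for n in names]

print("SRCC:", srcc(video_scores, targets))
print("RMSE:", rmse(video_scores, targets))
```

SRCC depends only on the ordering of the scores, while RMSE also penalizes offsets in scale, which is why quality models are commonly reported with both.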

