TU Berlin

Subjective assessment and instrumental prediction of mobile online gaming on the basis of perceptual dimensions


The assessment of users' perceived quality (Quality of Experience, QoE) of pure audio and video material differs in many ways from the quality assessment of computer games. The latter involve a variety of additional factors due to their interactive nature. Not only the factors of complex and innovative game systems have an impact on the QoE, but also the players themselves. A quality judgment, which results from comparing the expected and the perceived composition of an entity, depends strongly on the preferences, expectations, and abilities of the player.

In this still young area of research, standard methods for determining the QoE are not directly applicable. This becomes apparent when considering that in task-oriented human-computer interaction the goal should be achieved with minimal effort, whereas in a game the player willingly exerts effort in order to influence the outcome and thereby becomes emotionally engaged. Thus, new concepts emerge, such as immersion and flow, a state of happiness experienced while in an equilibrium between competence and challenge.

An analysis of the gaming market shows that the proportion of mobile games has risen sharply in recent years. Mobile games are special in the sense that mobile devices such as smartphones or tablets were originally not designed for gaming and are therefore not optimally adapted to it. This also applies to the newer concept of cloud gaming, where the entire game is executed on a server and only the video and audio streams are transferred to the end user.
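The thin-client principle behind cloud gaming can be illustrated with a minimal sketch: the client only forwards input events upstream and decodes the frames it receives, while all game logic, rendering, and encoding happen on the server. All class and method names below are hypothetical placeholders, not part of any real cloud-gaming system.

```python
# Purely illustrative sketch of the cloud-gaming loop: the client sends
# input events upstream; the server runs the game, renders a frame,
# encodes it, and streams it back. Names are hypothetical.

class CloudGameServer:
    def __init__(self):
        self.state = {"x": 0}          # toy game state held on the server

    def step(self, input_event):
        # Run one game tick on the server using the client's input.
        if input_event == "RIGHT":
            self.state["x"] += 1
        return self.render_and_encode()

    def render_and_encode(self):
        # Stand-in for rendering a frame and video-encoding it.
        return f"frame(x={self.state['x']})".encode()

class ThinClient:
    def __init__(self, server):
        self.server = server

    def play(self, inputs):
        # The client only forwards inputs and decodes received frames;
        # it never executes game logic itself.
        return [self.server.step(e).decode() for e in inputs]

client = ThinClient(CloudGameServer())
print(client.play(["RIGHT", "RIGHT"]))  # → ['frame(x=1)', 'frame(x=2)']
```

In a real system, the upstream input channel and the downstream audio/video stream both traverse the network, so delay, jitter, and packet loss on either path become quality factors that a mobile gaming QoE model must capture.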


Aim of the project

The aim of this research project is to develop methods for assessing the QoE of mobile games. In addition, based on a database of subjective quality judgments, a model similar to the well-known E-model is to be constructed to predict the QoE. The following concrete steps are planned for this purpose:

  • Set up and modification of a testbed for conducting experiments including a cloud gaming system for mobile games
  • Development of a classification of games to choose representative games and identify system and user factors
  • Building a questionnaire covering a large space of relevant quality dimensions
  • Identification of quality-relevant perceptual dimensions and analysis of their impact on the overall quality
  • Analyzing how well current objective metrics, originally proposed for other contents and services, perform for mobile gaming
  • Building a QoE model based on game, system and network characteristics as well as user and context factors
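By analogy with the E-model, which combines additive impairment factors into a single transmission rating, a gaming QoE model could map network and system parameters onto a predicted score. The sketch below is purely illustrative: the impairment factors, weights, thresholds, and the linear rating-to-MOS mapping are all assumptions, not results of this project.

```python
# Illustrative sketch of an E-model-style additive QoE predictor for
# mobile cloud gaming. All impairment factors and weights are
# hypothetical placeholders, not results of this project.

def predict_gaming_qoe(delay_ms: float, packet_loss_pct: float,
                       bitrate_kbps: float) -> float:
    """Map network parameters to a 0-100 rating, then to a 1-5 MOS."""
    r_max = 100.0                                 # rating with no impairments
    i_delay = 0.05 * max(delay_ms - 50.0, 0.0)    # delay impairment
    i_loss = 8.0 * packet_loss_pct                # packet-loss impairment
    i_coding = 20.0 * max(1.0 - bitrate_kbps / 5000.0, 0.0)  # compression
    r = max(r_max - i_delay - i_loss - i_coding, 0.0)
    # Linear mapping of the rating to a 1-5 MOS scale (simplified; the
    # real E-model uses a non-linear R-to-MOS conversion).
    return 1.0 + 4.0 * r / 100.0

print(round(predict_gaming_qoe(delay_ms=80, packet_loss_pct=0.5,
                               bitrate_kbps=3000), 2))  # → 4.46
```

The appeal of such an additive structure is diagnosability: each impairment factor can be estimated from a separate measurable parameter, and their individual contributions to the overall score remain visible.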
Time Frame: 01/2016 - 06/2019
T-Labs Team Members: Steven Schmidt
Funding by: Deutsche Forschungsgemeinschaft (DFG)
Project Number: MO 1038/21-1

List of Publications

NDNetGaming - development of a no-reference deep CNN for gaming video quality prediction
Citation key: utke2020a
Authors: Utke, Markus; Zadtootaghaj, Saman; Schmidt, Steven; Bosse, Sebastian; Möller, Sebastian
Journal: Multimedia Tools and Applications
Publisher: Springer
Month/Year: July 2020
Pages: 1–23
ISSN: 1573-7721
DOI: 10.1007/s11042-020-09144-6
Type: Fullpaper
Abstract: Gaming video streaming services are growing rapidly due to new services such as passive video streaming of gaming content, e.g. Twitch.tv, as well as cloud gaming, e.g. Nvidia GeForce NOW and Google Stadia. In contrast to traditional video content, gaming content has special characteristics such as extremely high and special motion patterns, synthetic content and repetitive content, which pose new opportunities for the design of machine learning-based models to outperform the state-of-the-art video and image quality approaches for this special computer-generated content. In this paper, we trained a Convolutional Neural Network (CNN) based on an objective quality model, VMAF, as ground truth and fine-tuned it based on subjective image quality ratings. In addition, we propose a new temporal pooling method to predict gaming video quality based on frame-level predictions. Finally, the paper also describes how an appropriate CNN architecture can be chosen and how well the model performs on different contents. Our results show that, among the four popular network architectures we investigated, DenseNet performs best for image quality assessment on the training dataset. By training the last 57 convolutional layers of DenseNet based on VMAF values, we obtained a high-performance model that predicts VMAF of distorted frames of video games with a Spearman's rank correlation coefficient (SRCC) of 0.945 and Root Mean Squared Error (RMSE) of 7.07 on the image level, while achieving a higher performance on the video level, leading to a SRCC of 0.967 and RMSE of 5.47 for the KUGVD dataset. Furthermore, we fine-tuned the model based on subjective quality ratings of images from gaming content, which resulted in a SRCC of 0.93 and RMSE of 0.46 using one-hold-out cross validation. Finally, on the video level, using the proposed pooling method, the model achieves a very good performance, indicated by a SRCC of 0.968 and RMSE of 0.30 for the used gaming video dataset.
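The frame-to-video aggregation and the reported metrics can be illustrated with a small sketch: frame-level predictions are pooled into one video-level score, which is then compared against subjective MOS labels via SRCC and RMSE. The mean pooling and the toy data below are assumptions for illustration only; the paper proposes its own, more elaborate temporal pooling method.

```python
# Illustrative evaluation of frame-level quality predictions: plain mean
# temporal pooling (an assumption; the paper's pooling method differs)
# plus SRCC and RMSE, implemented in pure Python.
import math

def rank(values):
    """Assign 1-based ranks, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def srcc(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    ra, rb = rank(a), rank(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    num = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    den = math.sqrt(sum((x - ma) ** 2 for x in ra)
                    * sum((y - mb) ** 2 for y in rb))
    return num / den

def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Toy data: per-video frame-level predictions and subjective MOS labels.
frame_preds = [[3.9, 4.1, 4.0], [2.0, 2.4, 2.2], [4.8, 4.6, 4.7]]
mos = [4.0, 2.1, 4.9]
video_preds = [sum(f) / len(f) for f in frame_preds]   # mean pooling
print(round(srcc(video_preds, mos), 3), round(rmse(video_preds, mos), 3))
# → 1.0 0.129
```

SRCC measures only the monotonic agreement of the ranking (here perfect, 1.0), while RMSE captures the absolute deviation of the pooled scores from the MOS labels, which is why both are reported together.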
