Klaus-Peter Engelbrecht's Research
MeMo usability workbench:
Framework for the evaluation of GUI or VUI mock-ups with user simulation. It supports the creation of mock-ups from screen designs, the definition of tasks, the simulation of interactions, and the evaluation of the resulting data. To test the workbench, I used it to replicate an experiment previously conducted with real users and compared the results (dissertation thesis, Interspeech 2008).
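The simulation idea can be sketched roughly as follows; the screens, buttons, and the simple error model are invented for illustration and do not reflect the actual MeMo implementation or its API:

```python
import random

# Hypothetical mock-up: screens as states, buttons as transitions
# (illustrative only -- not the actual MeMo data model).
MOCKUP = {
    "start":   {"search": "results", "help": "help"},
    "help":    {"back": "start"},
    "results": {"select": "details", "back": "start"},
    "details": {"confirm": "done"},
    "done":    {},
}

def simulate(goal, max_steps=20, error_rate=0.1, seed=0):
    """Simulate one user interaction on the mock-up.

    The simulated user prefers a button whose target screen is the
    goal (or leads directly to it); with probability `error_rate`,
    or when no such button exists, it presses a random button.
    Returns the interaction log and whether the goal was reached.
    """
    rng = random.Random(seed)
    state, log = "start", []
    for _ in range(max_steps):
        if state == goal:
            return log, True
        buttons = list(MOCKUP[state])
        if not buttons:
            break
        on_path = [b for b in buttons
                   if MOCKUP[state][b] == goal
                   or goal in MOCKUP.get(MOCKUP[state][b], {}).values()]
        if on_path and rng.random() > error_rate:
            button = on_path[0]
        else:
            button = rng.choice(buttons)
        log.append((state, button))
        state = MOCKUP[state][button]
    return log, state == goal
```

Running many such simulations yields interaction logs that can then be evaluated, e.g. counted for detours or task failures.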
Formative usability evaluation with user models:
While user simulation has been applied in the formative evaluation of dialog systems, its performance in discovering usability problems had never been analyzed. I tried to fill this gap by replicating an experiment conducted with real users. I also analyzed how the simulated data can be searched efficiently for problematic situations (IWSDS 2011, ICCM 2012).
Activation-based user model:
Most recently, I proposed modeling user behavior based on the activation of concepts (ESSV 2012, IWSDS 2012, submitted). So far I have only been able to show that this leads to reasonably realistic simulations of user behavior; the goal, however, is to model the motivational and affective aspects of the interaction in order to learn how system characteristics impact the different quality aspects perceived by the user.
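A minimal sketch of what activation-based concept modeling could look like (the concept network, decay, and spreading parameters below are hypothetical, not taken from the cited papers): concepts gain activation when mentioned, pass a fraction of it to associated concepts, and decay each turn; the most active concept can be read off as the user's current focus.

```python
DECAY = 0.5    # fraction of activation kept per turn (illustrative)
SPREAD = 0.3   # fraction passed to each associated concept (illustrative)

ASSOCIATIONS = {  # hypothetical concept network for a travel domain
    "destination": ["date", "ticket"],
    "date": ["destination"],
    "ticket": ["destination", "price"],
    "price": ["ticket"],
}

def step(activation, mentioned):
    """One dialog turn: decay all concepts, boost mentioned ones,
    then spread activation along the association links (based on
    the pre-spread values, so increments do not cascade)."""
    new = {c: a * DECAY for c, a in activation.items()}
    for c in mentioned:
        new[c] = new.get(c, 0.0) + 1.0
    for c, a in list(new.items()):
        for neighbor in ASSOCIATIONS.get(c, []):
            new[neighbor] = new.get(neighbor, 0.0) + SPREAD * a
    return new

activation = {}
activation = step(activation, ["destination"])
activation = step(activation, ["date"])
focus = max(activation, key=activation.get)
```

After the two turns above, "date" is the most recently mentioned and hence most active concept, while "destination" retains residual and spread activation.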
Prediction of users' quality judgments
Prediction error is smaller for mean judgments than for individual judgments:
Traditionally, the performance of judgment prediction models has been tested on predictions for individual dialogs. I showed that the prediction error is much smaller if mean ratings for different systems or configurations are compared (Sigdial 2007). Furthermore, I found that the size of the error is related to the memory performance of the users (ESSV 2008).
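The effect can be illustrated with synthetic numbers (all judgments and model predictions below are invented): per-dialog rating noise largely cancels when ratings are averaged per system, so the error measured against per-system means is much smaller than against individual ratings.

```python
ratings = {            # hypothetical user judgments (1..5) per system
    "system_A": [3, 4, 2, 5, 3, 4],
    "system_B": [2, 3, 2, 4, 1, 3],
}
predictions = {        # hypothetical model predictions per system
    "system_A": 3.4,
    "system_B": 2.6,
}

def rmse(pairs):
    """Root-mean-square error over (rating, prediction) pairs."""
    return (sum((y - p) ** 2 for y, p in pairs) / len(pairs)) ** 0.5

# Error against each individual dialog's rating
individual = rmse([(y, predictions[s])
                   for s, ys in ratings.items() for y in ys])
# Error against the mean rating per system
per_system = rmse([(sum(ys) / len(ys), predictions[s])
                   for s, ys in ratings.items()])
```

For these numbers, the per-system error is roughly a tenth of the per-dialog error, although the model predictions are the same in both cases.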
Modeling the prediction error:
To gain insight into the user judgments, I tested new modeling approaches that also capture errors due to the judgment process. I used Markov Chains and Hidden Markov Models, which process dialog data as sequences of user and system exchanges (Speech Communication 2010). These models can be extended to also include parameters derived from the user's speech signal (IWSDS 2009).
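As an illustration of the Markov Chain variant (the exchange labels, training dialogs, and the per-judgment-class setup are invented and are not the models from the paper), one chain can be trained per judgment class and a new dialog scored against each class:

```python
from collections import defaultdict
import math

def train(sequences):
    """Estimate first-order transition probabilities from
    sequences of dialog exchanges."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
            for a, nxt in counts.items()}

def log_likelihood(model, seq, floor=1e-6):
    """Score a dialog under a trained chain; unseen transitions
    get a small floor probability."""
    return sum(math.log(model.get(a, {}).get(b, floor))
               for a, b in zip(seq, seq[1:]))

# Hypothetical training dialogs, grouped by judgment class
good = [["greet", "inform", "confirm", "bye"],
        ["greet", "inform", "inform", "confirm", "bye"]]
bad = [["greet", "noinput", "repeat", "noinput", "repeat", "bye"],
       ["greet", "repeat", "noinput", "bye"]]

models = {"good": train(good), "bad": train(bad)}
dialog = ["greet", "inform", "confirm", "bye"]
judged = max(models, key=lambda j: log_likelihood(models[j], dialog))
```

The smooth dialog above scores far higher under the "good" chain, since the "bad" chain has never seen its transitions.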
Prediction of user judgments in unseen databases:
Judgment prediction models usually perform poorly when applied to dialogs with systems that differ from those in the training database. I used a Mixture-of-Experts approach, in which several expert models are trained and the expert expected to predict the current case best is selected automatically based on the individual experts' prediction results. This yielded a minimal improvement over the baseline model (Interspeech 2010). I also examined how well predictions based on simulated dialogs match the mean judgments for different system configurations in a database not available during training (dissertation thesis).
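The selection idea can be sketched as follows; the concrete gating criterion used here (picking the expert with the most confident prediction distribution) and all data are invented for illustration and may well differ from the paper's actual mechanism:

```python
from collections import Counter, defaultdict

def make_expert(cases):
    """Tiny 'expert' trained on one system's dialogs: judgment
    counts per feature value (here a single task-success flag)."""
    by_feat = defaultdict(Counter)
    for feat, judgment in cases:
        by_feat[feat][judgment] += 1
    return by_feat

def predict(expert, feat, smooth=1e-3):
    """Smoothed distribution over judgment classes 1..5 for a case;
    near-uniform if the feature value was never seen."""
    counts = expert.get(feat, Counter())
    total = sum(counts.values())
    return {j: (counts[j] + smooth) / (total + 5 * smooth)
            for j in range(1, 6)}

def mixture_predict(experts, feat):
    """Every expert predicts; the most confident distribution is
    selected and its most probable judgment class returned."""
    dists = [predict(e, feat) for e in experts]
    best = max(dists, key=lambda d: max(d.values()))
    return max(best, key=best.get)

# Hypothetical training data: (task-success flag, judgment on 1..5)
expert_a = make_expert([(1, 4), (1, 5), (0, 2)])
expert_b = make_expert([(1, 3), (0, 2), (0, 2), (0, 1)])
judgment = mixture_predict([expert_a, expert_b], feat=1)
```

Confidence-based gating of this kind is known to favor experts with sparse, peaked training data, which is one reason such schemes tend to yield only modest gains.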
I have also tried to integrate judgment prediction models and user models, in order to capture the users' experience of the dialogs and use this information in the predictions (Interspeech 2012, ICCM 2012). I hope this will lead to more general predictors of user judgments.