
Colloquium WS 11/12

Mondays, starting October 15, 2011
TEL Auditorium 1 and 2 (20th floor), Ernst-Reuter-Platz 7, 10587 Berlin

This research colloquium is a weekly event with various invited speakers. It is open to anyone interested in the general area of usability and human-computer interaction. Researchers in this area will present overviews of their work. The colloquium is organized by Deutsche Telekom Laboratories. If you have any questions, please contact Hamed Ketabdar, Sebastian Möller, or Klaus-Peter Engelbrecht.

Please contact us if you want your email to be added to our colloquium mailing list.

Colloquium Talks
Location: Ernst-Reuter-Platz 7, 14th Floor, Room Futurum 1414 (@DAI-Labor)

Emilien Ghomis

HOST: Gilles Bailly
No Regular Colloquium (Holiday)!

Note the exceptional talk on March 26, listed above!
Location: Ernst-Reuter-Platz 7, 20th Floor, Auditorium 1

TITLE: Adaptation of Aesthetic Appreciation

SPEAKER: Claus-Christian Carbon

Personal taste develops over time and is highly susceptible to Zeitgeist-dependent effects. Using an adaptation paradigm frequently employed in the domain of face research (e.g., Carbon & Ditye, 2011), we showed in a series of experiments that not only the representation of designs (e.g., car designs; Carbon, 2010) or artworks (Carbon & Leder, 2006), but also taste itself quickly adapts towards adaptors (Carbon, Ditye, & Leder, 2006). This talk will demonstrate adaptation not only towards specific design characteristics in product design, but also towards specific properties of artworks, underlining the power of adaptation for the development and advancement of taste in important domains of everyday life.

Claus-Christian Carbon is head of the Department of General Psychology and Methodology at the University of Bamberg, Germany. His research concentrates on empirical aesthetics, design appreciation, face recognition/face processing/prosopagnosia (face blindness), optical illusions, cognitive maps, the advancement of methods, and applied cognition (design, the role of innovation, HCI and ergonomics).

HOST: Klaus-Peter Engelbrecht

Location: Ernst-Reuter-Platz 7, 20th Floor, Auditorium 1

TITLE: Designing Gestural Interaction for Different Contexts

SPEAKER: Tanja Döring

Gestural interaction offers rich opportunities for human-computer interaction and has evolved into an important research topic in interaction design. Different types of gesture sets are currently being developed, among them touch gestures, mobile phone gestures, and in-air gestures. An important factor in designing gestures is the context in which they are used. This talk will focus on designing gestural interaction for different contexts and present case studies from automotive and entertainment user interfaces. Aspects of gestural interaction such as gestures combined with speech or tangible interaction will also be discussed.

Tanja Döring is a researcher and PhD student in the Digital Media Group at the University of Bremen. Her research interests include tangible, gestural and mobile interaction with interactive surfaces. From April 2008 until October 2011, Tanja worked in the "Pervasive Computer and User Interface Engineering Group" at the University of Duisburg-Essen. She has a master’s degree in computer science from the University of Hamburg.
dm.tzi.de/de/people/staff/tanja-doering/

HOST: Katrin Wolf

Last updated 20.01.2012 by Hamed Ketabdar
Location: Ernst-Reuter-Platz 7, 20th Floor, Auditorium 1

TITLE: Introduction to W3C Multimodal Interaction Working Group

SPEAKER: Ingmar Kliche

Almost 10 years ago the “Multimodal Interaction Working Group” was launched within the W3C to “extend the Web to allow users to dynamically select the most appropriate mode of interaction for their current needs”. This presentation will give an overview of the technologies developed within the Multimodal Working Group (such as Multimodal Architecture and Interfaces, InkML, EMMA, EmotionML, ...) and related developments in other W3C Working Groups (such as VoiceXML, SCXML, ...).

Ingmar Kliche is a project manager at Telekom Innovation Laboratories (T-Labs). He received his Master of Science degree in Electrical Engineering from the Berlin University of Technology in 1997. He has been working in speech technology for more than ten years in various capacities, including implementation of voice and multimodal applications, speech technology consulting and R&D project management. Ingmar is a member of several W3C working groups, including the W3C Voice Browser Working Group and the W3C Multimodal Interaction Working Group.

HOST: Ina Wechsung

Last updated 11.01.2012 by Hamed Ketabdar
Location: Ernst-Reuter-Platz 7, 20th Floor, Auditorium 1

TITLE: Grasp Sensing for Human-Computer Interaction

SPEAKER: Raphael Wimmer

The ability to grasp objects was a prerequisite to man's use of hand tools. These tools - such as hammers, computer mice, or mobile phones - afford different ways of grasping them. Sensing and interpreting these grasps gives us additional information about a user's goals or requirements while using the tool. This talk presents advantages and applications of grasp sensing on mobile devices as well as challenges for sensing and interpreting grasps.

Raphael Wimmer is a researcher in the Media Informatics Group at the University of Regensburg and a PhD student in the Media Informatics Group at the University of Munich. Raphael's research focuses on touch sensing technologies and novel interaction techniques. He developed CapToolKit, an open-source toolkit for prototyping capacitive sensing systems, and currently investigates how to enhance touch-sensing on small and deformable objects.

Blog: raphaelwimmer.wordpress.com
Publications: www.medien.ifi.lmu.de/team/raphael.wimmer/

HOST: Katrin Wolf

Please note the exceptional time: Thursday, 12:00-13:00!

Location: Ernst-Reuter-Platz 7, 18th Floor, Room Plicht

SPEAKER: Konrad Krenzlin

TITLE: Optimizing Auditory Stimuli for a Novel Brain-Computer-Interface Paradigm

Brain-Computer Interfaces (BCIs) enable touchless and motionless interaction by classifying the brain's response to stimuli presented to the user. Usually, visual stimuli are used, while auditory stimuli are rare. An exception is the PASS2D paradigm, which uses different artificial tones to drive a text entry system. The choice of stimuli is very important not only for the performance of the classifier, but also for the usability of the BCI.
This talk presents my work on using short natural phonemes with different characteristics instead of artificial tones as stimuli, along with promising results.

Konrad Krenzlin is currently finishing his Bachelor's degree in Computer Engineering (Technische Informatik) at TU Berlin. Recently, he started a Master's course in Audio Communication and Technology, also at TU Berlin. The talk is a presentation of his Bachelor's thesis.

HOST: Sebastian Möller
Location: Ernst-Reuter-Platz 7, 20th Floor, Auditorium 1 

TITLE: Is phonetics still needed in speech synthesis research?

SPEAKER: Petra Wagner

Recent developments have shown a rapid decrease in genuinely phonetic topics in the field of speech synthesis research. This development can be traced back to several reasons:

  • State-of-the-art synthesis systems have to meet challenges lying outside the scope of traditional phonetics research
  • State-of-the-art synthesis systems do most modeling in the symbolic domain, without the need for fine-grained phonetic analyses and prediction
  • Progress in (semi/un)supervised machine learning has reduced the need for phonetic expertise in the manual annotation of large corpora
  • Research in synthesis evaluation has made significant progress in developing objective metrics, thus reducing the need for time-consuming tests
I argue that despite these developments, phonetics should not "drop out" of application-oriented research employing synthesis. Instead, the community has a chance to shift its attention to areas where phonetic insight and methods are desperately needed, e.g. CALL, dialogue systems, and multimodal systems. Phonetic research can help to better understand and model listener expectations, communicative dynamics such as timing in turn-taking and appropriate feedback, and how the user can be provided with a better, more intuitive understanding of system states.

Petra Wagner is full professor of Phonetics and Phonology at Bielefeld University. After her studies in Linguistics, she finished her dissertation on the prediction and perception of German stress patterns in 2002 at Bonn University. Her research is concerned with perceptual prominence and its relation to linguistic entities; speech, language, and gestural rhythm; and other topics in prosody and speech synthesis.

HOST: Benjamin Weiss

Last updated 11.1.2012 by Hamed Ketabdar
Location: Ernst-Reuter-Platz 7, 20th Floor, Auditorium 1

TITLE: Laughing, crying and other non-verbal vocalisations in conversational speech
SPEAKER: Jürgen Trouvain

Affect in spoken communication can be identified based on prosodic features such as average pitch, pitch variation, intensity, timing, or voice quality. However, in addition to speech, there are vocalisations such as crying or laughing that prominently signal affect. Analysing such vocalisations is important for applications such as affective speech synthesis and improved speech recognition.

Jürgen Trouvain is a senior researcher at the Institute of Computational Linguistics and Phonetics at Saarland University in Saarbrücken. His 1995 master's thesis was about building and evaluating spectral and durational features of a German formant synthesis. He published his dissertation, entitled "Tempo Variation in Speech Production: Implications for Speech Synthesis", in 2004. Currently, he is working on laughter and other non-verbal vocalisations, non-native speech, Luxembourgish, and von Kempelen's speaking machine.

There will be an ADDITIONAL SHORT PRESENTATION of Jürgen Trouvain's work on human performance in understanding ultra-fast speech synthesis.

HOST: Benjamin Weiss

Location: Ernst-Reuter-Platz 7, 20th Floor, Auditorium 1

TITLE: Information Retrieval using context-free Descriptions of Audio

SPEAKER: Carola Trahms

The amount of freely accessible media on the Internet is growing rapidly. These media need to be indexed sensibly to allow fast and efficient search. Indexing media such as audio clips is by no means a trivial task. On the one hand, the indexes should be as precise and unambiguous as possible; on the other hand, they should enable efficient queries. Indexing media by their content is a very precise method and has been done successfully in the past, but queries on media do not always search for the content itself, but for a certain association connected to it. Associations are a less exact way of describing media, as they are usually context-free and differ from person to person. The descriptions used in this thesis are spontaneous comments on audio clips given by a number of listeners.
Finding a way to index media by such spontaneous comments is a challenging task. This thesis suggests a method for clustering audio clips using such context-free descriptions.
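To illustrate the general idea (this is a minimal sketch, not the method of the thesis), audio clips could be grouped by the similarity of the free-text comments listeners attach to them; the clip names, comments, and similarity threshold below are invented for the example:

```python
# Sketch: cluster audio clips by bag-of-words similarity of their
# listener comments. All data here is made up for illustration.
from collections import Counter
import math

def vectorize(text):
    """Bag-of-words vector of a lowercased, whitespace-tokenized comment."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two Counter-based term vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(clips, threshold=0.3):
    """Greedy clustering: add a clip to the first cluster whose
    representative comment is similar enough, else open a new cluster."""
    clusters = []  # list of (representative_vector, member_names)
    for name, comment in clips.items():
        vec = vectorize(comment)
        for rep, members in clusters:
            if cosine(rep, vec) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((vec, [name]))
    return [members for _, members in clusters]

clips = {
    "clip1": "birds singing in a forest morning",
    "clip2": "morning forest with singing birds",
    "clip3": "heavy traffic noise city street",
    "clip4": "cars honking on a busy city street",
}
print(cluster(clips))
```

A real system would need normalization (stemming, stop words) and a more robust clustering scheme, but the sketch shows how context-free descriptions alone can induce a grouping of the clips.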

Carola Trahms studies Computer Engineering at TU Berlin. The talk is a presentation of her Bachelor's thesis, which she did at T-Labs. She started a Master's course in Computer Engineering at TU Berlin this year and works at T-Labs as a student worker.

HOST: Shiva Sundaram

Location: Ernst-Reuter-Platz 7, 20th Floor, Auditorium 1

TITLE: Binaural Model-Based Speech Intelligibility Enhancement and Assessment in Hearing Aids

SPEAKER: Anton Schlesinger

In Europe, about one fifth of the population has difficulties understanding speech in noisy and complex environments. Improving speech intelligibility in these conditions allows for the reintegration of the hearing impaired into a communication-oriented society and restores individual well-being to a high degree. Commercially available hearing aid solutions are generally based on the amplification principle and successfully enhance speech understanding for severe hearing loss in silence. However, current hearing aid solutions do not restore speech intelligibility in noisy surroundings to the extent required by the majority of the hearing impaired. Successful solutions that restore the intelligibility of noise-corrupted speech are based on the principle of spatial sampling. By such means, a target speaker's signal can be enhanced and interference can be suppressed.
In this talk, a set of standard binaural speech processors that draw on models of auditory scene analysis is reviewed, optimized, and compared. These binaural speech processors are furthermore applied at the output of hearing aids with and without beamforming. As a result, two efficient spatial sampling schemes are combined to achieve a high improvement of speech intelligibility in noisy environments.
A statistical study on binaural parameters in different real-world acoustic scenes is given. Binaural parameters of the fine structure of the waveform are compared with binaural parameters of the corresponding envelope. In addition, natural binaural parameters are compared to binaural parameters at the output of a set of hearing aids. As a result, the study provides comprehensive insight into the behaviour of binaural parameters in noise, thereby gauging the possibilities of binaural-parameter-based source segregation.
Furthermore, a stochastic optimization of binaural speech processors at the output of different front-ends as well as in different acoustic environments is performed. To that end, a genetic algorithm is applied, which maximizes an objective function of binaural speech intelligibility. The robustness of the optimized binaural speech processors is assessed throughout modified acoustic scenes.
The holistic approach of model-based improvement and model-based assessment of speech intelligibility offers an efficient and task-oriented means for improving speech intelligibility. However, the development of an objective function for binaurally and nonlinearly processed speech still remains an unsolved problem. To date, there exists no comprehensive model of speech intelligibility. Therefore, the talk concludes with a summary of recent advances towards such a model.

Anton Schlesinger enrolled in the study of Media Technology at Ilmenau University of Technology, Germany, in 2000. In 2004/05 he took a one-year break from his studies for an internship as a consultant for classical room acoustics for recording studios and venues at the Walters-Storyk Design Group in Basel, Switzerland. Subsequently, he was engaged as a founding member of a start-up for location-based services in Weimar, Germany. In August 2006, he received his Diplom in Media Technology with his thesis work on the three-dimensional measurement and holographic reconstruction of sound fields.
At the end of 2006, he joined the Laboratory of Acoustical Imaging and Sound Control at Delft University of Technology, The Netherlands. In the succeeding years, he continued research in room-acoustical holography and conducted the work in audiology that resulted in his PhD thesis "Binaural Model-Based Speech Intelligibility Enhancement and Assessment in Hearing Aids". Since March 2011, Anton Schlesinger has been employed as a postdoctoral researcher at the Institute of Communication Acoustics at the Ruhr-Universität Bochum, Germany. His current scientific interest is in binaural models of speech intelligibility, classification, and model-based speech intelligibility enhancement.

HOST: Pablo Ramirez

Last updated on 23.11.2011 by Hamed Ketabdar
Please note: there are two talks on this date!

Ernst-Reuter-Platz 7, 20th Floor, Auditorium 1

First Talk:

TITLE: Effects of Delayed System Feedback in Mobile Contexts

SPEAKER: Arne Denneler

Touch-sensitive screens (touchscreens) are widely used, above all in modern mobile devices. Their use in mobile phones with extended computing functionality (smartphones) allows users to operate a variety of applications intuitively by tapping or dragging graphical elements.
Dispensing with a physical keyboard, however, brings a decisive disadvantage. Conventional keyboards give the user immediate tactile and subtle auditory feedback when a key is pressed. A virtual keyboard does not provide this feedback by itself.
For this reason, artificial feedback is usually generated when virtual keyboards are used, but this feedback is subject to system-induced delays.
A previous study used a static experimental setup to investigate how different delays in artificially generated feedback affect the input results and the users' judgments.
The Bachelor's thesis presented here extends that study to mobile situations, i.e. a more realistic usage context, and compares the results of both studies.

HOST: Julia Seebode

Second Talk:

TITLE: The effect of feature alteration in anthropomorphic robot heads on social facilitation

SPEAKER: Patrick Ehrenbrink

Human performance can be influenced by the presence of other persons. As artificial intelligence and automation develop, robots are becoming an ever more important part of society. The presented study examines whether social facilitation effects can be triggered by robots. Three robots that differ in human-likeness serve as artificial experimenters in a speed task, and their influence on the participants' performance is investigated.

Patrick Ehrenbrink is a student research assistant at the Quality and Usability Lab in Berlin. He is currently a Master's student of Human Factors at the Technische Universität Berlin. 

HOST: Ina Wechsung

Last updated 8.12.2011 by Klaus-Peter Engelbrecht
Location: Ernst-Reuter-Platz 7, 20th Floor, Auditorium 1

TITLE: Building multimodal systems is for everyone: an HTML+Voice tutorial

SPEAKER: David Griol and Zoraida Callejas Carríon

X+V (XHTML + Voice) is a markup language for developing multimodal web-based applications. X+V is especially useful for bringing voice to web applications, as it provides the flexibility and facilities of mature web technologies such as XHTML and XML Events, along with the possibility to encode voice interaction using a subset of the VoiceXML standard. We will present the basics of X+V and explain how to structure multimodal interaction in several sample applications. Please bring your computers – it might get interactive.

HOST: Ina Wechsung

Last updated 24.11.2011 by Hamed Ketabdar
Please note: there are two talks on this date!

Ernst-Reuter-Platz 7, 20th Floor, Auditorium 1

First talk:

TITLE: Definition and Implementation of a Human-Readable Storage and Interchange Format for Spatial Audio Scenes

SPEAKER: Lukas Kaser

Research in the domain of spatial audio has created a new paradigm: object-based audio. The separation of an audio source and its parameters, such as its position in space, allows the specification of a format for the storage and interchange of spatial audio scenes.
This thesis extends the Audio Scene Description Format (ASDF), which was proposed in the literature, with the possibility to store time-varying spatial audio parameters. A flexible timing concept adapted from the Synchronized Multimedia Integration Language (SMIL) is specified, and a solution for the notation of parameterized trajectories, defining the movement of sources, is proposed. These concepts significantly ease the text-based authoring of complex audio scenes in ASDF.
The implementation of basic features of the defined format is described. In addition, an idea for the sonification of surfaces is described, which uses trajectories defined on a surface to convert geometrical data into audio signals.

Lukas Kaser is a Master's student of Computer Science at the Berlin Institute of Technology.

HOST: Sascha Spors

Second talk:

TITLE: Touchless Interaction at Fraunhofer HHI
SPEAKER: Paul Chojecki
HOST: Jörg Müller 
Last updated 23.11.2011 by Hamed Ketabdar
Please note: there are exceptionally two talks on this date!
Location: Ernst-Reuter-Platz 7, 20th Floor, Auditorium 1
1st talk:

TITLE: A $99 eye tracker

SPEAKER: Youri Marko, EPFL

Eye tracking is a widely used technology employed in many different research contexts to probe perceptual and cognitive processes. However, current gaze trackers are expensive and have strong limitations. Most commercial eye trackers are designed to observe the gaze patterns of users interacting with fixed digital monitors. In addition, the accuracy of these devices decreases rapidly if the subject moves during the experiment.
As we were interested in analyzing the gaze patterns of subjects interacting in a NATURAL way with PAPER documents, we investigated solutions that could potentially relax these limitations. After testing different remote setups, we built a wearable eye-tracking device and tested it in a reading experiment on paper documents. The prototype is constructed from low-cost off-the-shelf components. It integrates two cameras and an infrared LED in a pair of glasses. We used a hacked Sony PlayStation camera to track the pupil of the subject and a commercial board camera to look at the scene and track the paper document.
The results obtained with this first prototype are very encouraging and show that we perhaps don't need to spend several thousand euros on an expensive commercial device in order to conduct eye-tracking studies. We really hope that this device will inspire people from different fields and allow them to start working on new eye-tracking projects.
Youri Marko is a research assistant in the Peripheral Systems Laboratory at EPFL (École Polytechnique Fédérale de Lausanne, Switzerland), where he works on printing techniques and color prediction models in order to develop new security solutions for printed documents. He obtained his Master of Science (MSc) in Microengineering in February 2011 at EPFL. During his Master's thesis ("Design of an Eye-Tracking System for Analyzing Reading Behaviours"), he developed a $99 eye tracker. Youri is particularly interested in human-computer interfaces, computer vision, design, and image processing. He is currently looking for a challenging position where he could work on interdisciplinary projects.
HOST: Robert Schleicher
2nd talk:

TITLE: Multimodal interaction research at 'Tampere Unit for Computer-Human Interaction' (TAUCHI)

SPEAKER: Markku Turunen

I will present an overview of multimodal interaction research activities at the Tampere Unit for Computer-Human Interaction (TAUCHI). The University of Tampere is one of the largest universities in Finland, with about 14,500 students. It is a full-scale university with nine schools. As a sizeable part of the School of Information Sciences, the Tampere Unit for Computer-Human Interaction (TAUCHI) has a staff of 45. It is the largest research unit in its area in Finland and one of the largest in the world. The unit has been identified as the "strongest Finnish group in HCI" in a report of the Academy of Finland. TAUCHI is further divided into four research groups: Multimodal Interaction; Emotions, Sociality, and Computing; Speech-based and Pervasive Interaction; and Visual Interaction.
I will introduce the research activities of these groups and provide some detailed examples from relevant case studies of the Speech-based and Pervasive Interaction Group.

HOST: Klaus-Peter Engelbrecht

Last updated 19.10 by Hamed Ketabdar

6 pm - 8 pm
Please note the exceptional time!

Ernst-Reuter-Platz 7, 20th Floor, Auditorium 2

TITLE: A Comparison of Eyes-Free Touch Remote Controls for Big Screen Interaction

SPEAKER: Jonas Willaredt

HOST: Michael Nischt

Last updated 12.10.11 by Hamed Ketabdar
