Crowdsourcing and Open Data
Topics:
Micro-task crowdsourcing provides a remarkable opportunity for the academic and industry sectors by offering a high-scale, on-demand, and low-cost pool of geographically distributed workers for completing complex tasks that can be divided into sets of short and simple online tasks, such as annotation, data collection, or participation in a subjective test.
Our focus is to investigate “quality” in every aspect of the crowdsourcing process, from the design of workflows and the integration of AI systems to application domains.
Among other activities, we build state-of-the-art methods for conducting valid, reliable, and reproducible subjective tests using crowdsourcing for different media: speech, video, gaming, and text. These methods can be used for evaluating the output of AI models (e.g., speech enhancement, denoising, translation, or summarization), codecs, or for studying the trade-offs between influencing factors and perceived quality. Our group actively participates in the standardization activities of ITU-T Study Group 12, leading and contributing to several Work Items, including P.Crowd (speech and crowdsourcing), P.CrowdV (video and crowdsourcing), P.CrowdG (gaming and crowdsourcing), and P.CrowdCon (conversation tests in crowdsourcing).
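To illustrate the kind of quality-control step such test methods prescribe, the sketch below filters out workers who fail trapping questions before averaging ratings into mean opinion scores (MOS). This is a minimal, hypothetical example in the spirit of crowdsourced listening tests such as ITU-T P.808; the function names, data layout, and the all-traps-correct threshold are our own illustrative assumptions, not taken from any recommendation.

```python
# Minimal sketch of MOS aggregation with a trapping-question filter.
# All names and the "every trap correct" threshold are illustrative.
from collections import defaultdict
from statistics import mean

def aggregate_mos(ratings, trap_answers, expected_traps):
    """ratings: list of (worker, condition, score on a 1-5 scale);
    trap_answers: dict worker -> answers given to trapping questions;
    expected_traps: dict worker -> correct answers for those questions."""
    # Keep only workers who answered every trapping question correctly.
    reliable = {w for w in trap_answers
                if trap_answers[w] == expected_traps.get(w)}
    per_condition = defaultdict(list)
    for worker, condition, score in ratings:
        if worker in reliable:
            per_condition[condition].append(score)
    # MOS = arithmetic mean of the retained ratings per condition.
    return {c: mean(scores) for c, scores in per_condition.items()}

ratings = [("w1", "codecA", 4), ("w1", "codecB", 2),
           ("w2", "codecA", 5), ("w2", "codecB", 1),
           ("w3", "codecA", 1), ("w3", "codecB", 5)]  # w3 fails the trap
traps = {"w1": ["cat"], "w2": ["cat"], "w3": ["dog"]}
gold  = {"w1": ["cat"], "w2": ["cat"], "w3": ["cat"]}
print(aggregate_mos(ratings, traps, gold))  # {'codecA': 4.5, 'codecB': 1.5}
```

Dropping unreliable raters before averaging is what keeps the per-condition MOS stable even when a fraction of the crowd answers randomly.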
Our research can be categorized as the following:
- Speech, Video, Gaming, and Text Quality Assessment using a Crowdsourcing Approach
- Quality Control Mechanisms (Data Reliability, Agreement)
- Crowd and user biases, subjective normalization
- High Quality Crowd-Workflow Design
- Combination of Human Computation and AI
- Crowd- and AI-based NLP (Translation, Summarization, Knowledge graph, Chatbot, and Dialog Flow) Workflows
- Crowd assessments: Usability, UX, QoE
- Open Data and Open Science (open-science.berlin)
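To make the "crowd and user biases, subjective normalization" item above concrete, per-rater z-score normalization removes individual rating-scale offsets before scores are compared across raters. This is a hypothetical sketch of one common normalization technique, not code from any of the group's tools.

```python
# Hypothetical sketch of per-rater z-score normalization, a common way to
# compensate for individual rating-scale biases in subjective tests.
from statistics import mean, pstdev

def zscore_per_rater(ratings):
    """ratings: dict rater -> list of raw scores.
    Returns dict rater -> list of z-normalized scores."""
    normalized = {}
    for rater, scores in ratings.items():
        mu, sigma = mean(scores), pstdev(scores)
        # A rater with zero variance carries no ranking information.
        normalized[rater] = [(s - mu) / sigma if sigma else 0.0
                             for s in scores]
    return normalized

# A "lenient" and a "strict" rater who agree on the ranking of three
# stimuli produce identical normalized score patterns.
raw = {"lenient": [4, 5, 5], "strict": [1, 2, 2]}
print(zscore_per_rater(raw))
```

After normalization, the two raters' scores coincide, so downstream aggregation reflects their shared ranking rather than their different uses of the scale.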
Start-Up:
- Crowdee: High-Quality, Large-Scale Crowdsourcing for Studies and AI-related Data Acquisition (Start-Up from QU TU Berlin)
Projects:
- RUBYDemenz - Robot with Accompaniment (BMBF)
- Automated Chatlog Analysis for Self-Learning NLU and Dialog Update in Customer Support Domain (DFKI)
- BOP - Berlin Open Science Platform for the Curation of Research Data (TU Berlin)
- News-Polygraph
- Emonymous
- Evaluating the quality of speech services using crowdsourcing (DFG) [2021-2024]
- ITU-T Work Item P.CrowdV on Subjective evaluation of video and audiovisual quality with the crowdsourcing approach
- ITU-T Work Item P.CrowdG on Subjective evaluation of gaming quality with a crowdsourcing approach
- ITU-T Work Item P.CrowdCon on Subjective evaluation of conversational quality with a crowdsourcing approach
- AnonymPrevent (VW-Stiftung)
Past Projects:
- DEKA - Design and Development of a Collaborative Digital Work Platform for the Digitalization of Innovation Processes (BMBF)
- SMESS - Towards a Standardized Methodology for Evaluating the Quality of Speech Services using Crowdsourcing
- BRIDGE - Data & Fact Driven Decision-Making for Skills Based Inclusion of Migrants (EIT-Digital)
- CrowdMAQA (Motivation and Quality Control in Crowdsourcing)
- AUNUMAP (Automated User Segmentation from Speech and Text for Market Research Applications)
- Vocalytics & SWYM (Fully Automated User Characterization and Personality Estimation)
- Speaker Recognition and Speaker Characterization through different Communication Channels
- Affect-based Indexing
- Anomaly Detection and Early Warning Systems
- Predicting the Perceived Quality of Audiovisual Speech (Perc Qual AVS)