Crowdsourcing and Open Data
Topics:
Micro-task crowdsourcing provides a remarkable opportunity for the academic and industry sectors by offering a high-scale, on-demand, and low-cost pool of geographically distributed workers for completing complex tasks that can be divided into sets of short and simple online tasks, such as annotation, data collection, or participation in a subjective test.
Our focus is to investigate “quality” in every aspect of the crowdsourcing process, from the design of workflows and the integration of AI systems to application domains.
Among other activities, we build state-of-the-art methods for conducting valid, reliable, and reproducible subjective tests using crowdsourcing for different media: speech, video, gaming, and text. These methods can be used for evaluating the output of AI models (e.g., speech enhancement, denoising, translation, or summarization), codecs, or for studying the trade-offs between influencing factors and perceived quality. Our group actively participates in the standardization activities of ITU-T Study Group 12, leading and contributing to several Work Items, including P.Crowd (speech and crowdsourcing), P.CrowdV (video and crowdsourcing), P.CrowdG (gaming and crowdsourcing), and P.CrowdCon (conversation tests in crowdsourcing).
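To illustrate the kind of quality-control step such test methods prescribe, the sketch below filters out workers who fail trapping questions before averaging ratings into mean opinion scores (MOS). This is a minimal, hypothetical example in the spirit of crowdsourced listening tests such as ITU-T P.808; the function names, data layout, and the all-traps-correct threshold are our own illustrative assumptions, not taken from any recommendation.

```python
# Minimal sketch of MOS aggregation with a trapping-question filter.
# All names and the "every trap correct" threshold are illustrative.
from collections import defaultdict
from statistics import mean

def aggregate_mos(ratings, trap_answers, expected_traps):
    """ratings: list of (worker, condition, score on a 1-5 scale);
    trap_answers: dict worker -> answers given to trapping questions;
    expected_traps: dict worker -> correct answers for those questions."""
    # Keep only workers who answered every trapping question correctly.
    reliable = {w for w in trap_answers
                if trap_answers[w] == expected_traps.get(w)}
    per_condition = defaultdict(list)
    for worker, condition, score in ratings:
        if worker in reliable:
            per_condition[condition].append(score)
    # MOS = arithmetic mean of the retained ratings per condition.
    return {c: mean(scores) for c, scores in per_condition.items()}

ratings = [("w1", "codecA", 4), ("w1", "codecB", 2),
           ("w2", "codecA", 5), ("w2", "codecB", 1),
           ("w3", "codecA", 1), ("w3", "codecB", 5)]  # w3 fails the trap
traps = {"w1": ["cat"], "w2": ["cat"], "w3": ["dog"]}
gold  = {"w1": ["cat"], "w2": ["cat"], "w3": ["cat"]}
print(aggregate_mos(ratings, traps, gold))  # {'codecA': 4.5, 'codecB': 1.5}
```

Dropping unreliable raters before averaging is what keeps the per-condition MOS stable even when a fraction of the crowd answers randomly.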
Our research can be categorized as the following:
- Speech, Video, Gaming, and Text Quality Assessment using a Crowdsourcing Approach
- Quality Control Mechanisms (Data Reliability, Agreement)
- Crowd and user biases, subjective normalization
- High Quality Crowd-Workflow Design
- Combination of Human Computation and AI
- Crowd- and AI-based NLP (Translation, Summarization, Knowledge graph, Chatbot, and Dialog Flow) Workflows
- Crowd assessments: Usability, UX, QoE
- Open Data and Open Science (open-science.berlin)
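To make the "crowd and user biases, subjective normalization" item above concrete, per-rater z-score normalization removes individual rating-scale offsets before scores are compared across raters. This is a hypothetical sketch of one common normalization technique, not code from any of the group's tools.

```python
# Hypothetical sketch of per-rater z-score normalization, a common way to
# compensate for individual rating-scale biases in subjective tests.
from statistics import mean, pstdev

def zscore_per_rater(ratings):
    """ratings: dict rater -> list of raw scores.
    Returns dict rater -> list of z-normalized scores."""
    normalized = {}
    for rater, scores in ratings.items():
        mu, sigma = mean(scores), pstdev(scores)
        # A rater with zero variance carries no ranking information.
        normalized[rater] = [(s - mu) / sigma if sigma else 0.0
                             for s in scores]
    return normalized

# A "lenient" and a "strict" rater who agree on the ranking of three
# stimuli produce identical normalized score patterns.
raw = {"lenient": [4, 5, 5], "strict": [1, 2, 2]}
print(zscore_per_rater(raw))
```

After normalization, the two raters' scores coincide, so downstream aggregation reflects their shared ranking rather than their different uses of the scale.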
Start-Up:
- Crowdee: High-Quality, Large-Scale Crowdsourcing for Studies and AI-related Data Acquisition (Start-Up from QU TU Berlin)
Projects:
- RUBYDemenz - Robot with Accompaniment (BMBF)
- Automated Chatlog Analysis for Self-Learning NLU and Dialog Update in Customer Support Domain (DFKI)
- BOP - Berlin Open Science Platform for the Curation of Research Data (TU Berlin)
- News-Polygraph
- Emonymous
- Evaluating the quality of speech services using crowdsourcing (DFG) [2021-2024]
- ITU-T Work Item P.CrowdV on Subjective evaluation of video and audiovisual quality with the crowdsourcing approach
- ITU-T Work Item P.CrowdG on Subjective evaluation of gaming quality with a crowdsourcing approach
- ITU-T Work Item P.CrowdCon on Subjective evaluation of conversational quality with a crowdsourcing approach
- AnonymPrevent (VW-Stiftung)
Past Projects:
- DEKA - Design and Development of a Collaborative Digital Work Platform for the Digitalization of Innovation Processes (BMBF)
- SMESS - Towards a Standardized Methodology for Evaluating the Quality of Speech Services using Crowdsourcing
- BRIDGE - Data & Fact Driven Decision-Making for Skills Based Inclusion of Migrants (EIT-Digital)
- CrowdMAQA (Motivation and Quality Control in Crowdsourcing)
- AUNUMAP (Automated User Segmentation from Speech and Text for Market Research Applications)
- Vocalytics & SWYM (Fully Automated User Characterization and Personality Estimation)
- Speaker Recognition and Speaker Characterization through different Communication Channels
- Affect-based Indexing
- Anomaly Detection and Early Warning Systems
- Predicting the Perceived Quality of Audiovisual Speech (Perc Qual AVS)