TU Berlin

Word relatedness from Word Embedding in text analysis and information retrieval

LOCATION: TEL, Room 208 (2nd floor), Ernst-Reuter-Platz 7, 10587 Berlin

DATE: 15.01.2018, 12:00-12:45

SPEAKER: Allan Hanbury (TU Wien)


Word embedding approaches such as word2vec are increasingly used as the basis for a wide variety of text analysis and information retrieval applications. In this talk, I present some recent contributions to this area from my research group. The first part of the talk analyses the similarity values produced by word2vec, in particular to determine the range of similarity values that indicates actual term relatedness. Based on these results, I present uses of the similarity values in sentiment analysis and information retrieval. Finally, I discuss the problem of topic shifting in information retrieval that results from incorporating word2vec term similarities, mainly due to the local context of these similarities. I present a solution that combines the local context of word2vec with the global context provided by Latent Semantic Indexing (LSI).
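
As a rough illustration of the idea of a similarity range that indicates relatedness, the sketch below computes cosine similarity between word vectors and keeps only the terms above a cut-off. It is not the method from the talk: the three-dimensional vectors, the vocabulary, the threshold value of 0.7, and the helper names are all invented for this example, and real word2vec vectors would come from a trained model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings; real word2vec vectors typically have
# hundreds of dimensions and are learned from a corpus.
EMBEDDINGS = {
    "doctor": [0.9, 0.1, 0.3],
    "physician": [0.85, 0.15, 0.35],
    "banana": [0.1, 0.9, 0.2],
}

# Assumed cut-off for this sketch only; the abstract does not state a value.
THRESHOLD = 0.7


def related_terms(term, embeddings, threshold):
    """Return the other terms whose similarity to `term` meets the threshold."""
    query = embeddings[term]
    return sorted(
        other
        for other, vector in embeddings.items()
        if other != term and cosine_similarity(query, vector) >= threshold
    )


print(related_terms("doctor", EMBEDDINGS, THRESHOLD))
```

In a retrieval setting, a filter of this kind would decide which similar terms are allowed to expand a query; a cut-off that is too low admits loosely associated terms, which is one way the topic shifting mentioned in the abstract can arise.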


Allan Hanbury is Professor for Data Intelligence at the TU Wien, Austria, and Faculty Member of the Complexity Science Hub. He is coordinator of the Austrian ICT Lighthouse Project, Data Market Austria, which is creating a Data-Services Ecosystem in Austria. He was scientific coordinator of the EU-funded Khresmoi Integrated Project on medical and health information search and analysis, and is co-founder of contextflow, the spin-off company commercialising the radiology image search technology developed in the Khresmoi project. He also coordinated the EU-funded VISCERAL project on evaluation of algorithms on big data, and the EU-funded KConnect project on technology for analysing medical text.
His areas of research include Data Science, Information Retrieval, Semantic Analysis and Search, Information Retrieval Evaluation, Recommender Systems, Data Mining and Machine Learning. He is author or co-author of over 140 publications in refereed journals and refereed international conferences.
