The project aims to exploit large collections of unlabeled multi-modal data, mainly video footage, to advance the state of the art in video, audio, and natural language understanding, interpretation, annotation, and retrieval by combining unsupervised and semi-supervised learning.
Czech Technical University, Charles University in Prague, Masaryk University, University of West Bohemia