Grants

Dialog
Duration Provider
EDU-AI: AI asistent pro žáky a učitele [AI Assistant for Pupils and Teachers] 04/2021-12/2023 TAČR
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
Dialogue systems focused on combining tasks and chit-chat 2021-2023 GAUK
ECSS: Evaluation of conversational speech synthesis 2022-2024 GAUK
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Low resource methods for dialogue systems applications 2020 - 2022 GAUK
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
NaMuDDiS: Natural multi-domain dialogue systems 2019-2021 UK
NG-NLG: Next-Generation Natural Language Generation 2022-2027 Horizon Europe, ERC
THEaiTRE: THEAITRE: Umělá inteligence autorem divadelní hry? [Artificial Intelligence as the Author of a Theatre Play?] April 2020 - September 2022 TAČR
Generation
Duration Provider
EDU-AI: AI asistent pro žáky a učitele [AI Assistant for Pupils and Teachers] 04/2021-12/2023 TAČR
AIAI: AI: Authorship and Interpretation 2025-2027 GAČR
Arithmetic Properties in the space of Language Model Prompts 2023 GAUK
Controllable NLG: Controllable Natural Language Generation 2021-2023 GAUK
Dialogue systems focused on combining tasks and chit-chat 2021-2023 GAUK
Domain Adaptation for Natural Language Generation 2020-2022 GAUK
ECSS: Evaluation of conversational speech synthesis 2022-2024 GAUK
EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech Poetry in Educational and Multimedia Environments] 09/2023 - 11/2026 TAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR
NaMuDDiS: Natural multi-domain dialogue systems 2019-2021 UK
NG-NLG: Next-Generation Natural Language Generation 2022-2027 Horizon Europe, ERC
The Anthropology of Artificial Intelligence: Ethics, Understanding, Human Nature 2023-2024 ETF UK
THEaiTRE: THEAITRE: Umělá inteligence autorem divadelní hry? [Artificial Intelligence as the Author of a Theatre Play?] April 2020 - September 2022 TAČR
AIvK Exponát Didaktikon: Život s umělou inteligencí: upgrade [Didaktikon Exhibit: Life with Artificial Intelligence: Upgrade] 2023-09-01 - 2023-12-31 UK
Information Retrieval
Duration Provider
EDU-AI: AI asistent pro žáky a učitele [AI Assistant for Pupils and Teachers] 04/2021-12/2023 TAČR
CEDMO 2.0 NPO 1.9. 2024 - 30. 4. 2026 MPO
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR
Annotations
Duration Provider
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START
PONK: Asistent přístupné úřední komunikace [Assistant for Accessible Official Communication] 9/2023-12/2025 TAČR
CzeDParse: Automatická analýza diskurzních vztahů v češtině [Automatic Analysis of Discourse Relations in Czech] 2019-2021 GAČR
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
HVar: Disagreement in corpus annotation and variation of human understanding of text 2024-2026 GAČR
SEEM-CZ: Epistemic and Evidential Markers in Czech 2023-2025 GAČR
ELG: European Language Grid 2019-2021 H2020
ForFun2: Functions and Forms of Circumstantial Modifications 2023-2025 GAČR
EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech Poetry in Educational and Multimedia Environments] 09/2023 - 11/2026 TAČR
Global Coherence: Global Coherence of Czech Texts in the Corpus-Based Perspective 2020 - 2023 GAČR
Independent component analysis of continuous word representations 2021–2022 GAUK
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
OP VVV LINDAT: LINDAT/CLARIN - Research infrastructure for language technologies – extension of the repository and its computational power 2017–2019 MŠMT - OP VVV
LiFR: Linguistic Factors of Readability in Czech Administrative and Educational Texts 2019-2021 GAČR
RapiDisc: Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for Rapid Discourse Annotation in Selected Corpora] 2022-2024 GAČR
INTERCOST-Readability: Modelování komplexity českých literárních textů [Modelling the Complexity of Czech Literary Texts] VI 2018 - X 2021 MŠMT
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
OmniOMR: OmniOMR - optical music recognition using machine learning for digital libraries 2023-2027 NAKI
LAPPS-CLARIN: Transatlantic Collaboration between LAPPS and CLARIN: Semantic, Technical and Infrastructural Interoperability of Services 2016-2018, 2019-2021 Mellon Foundation (USA)
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR
Data
Duration Provider
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START
Adapting Uniform Meaning Representation (UMR) for the Italic/Romance languages 2024-2026 GAUK
CzeDParse: Automatická analýza diskurzních vztahů v češtině [Automatic Analysis of Discourse Relations in Czech] 2019-2021 GAČR
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] 2023–2027 NAKI
CEDMO 2.0 NPO 1.9. 2024 - 30. 4. 2026 MPO
UNCE VITRI: Center for the Transdisciplinary Research of Violence, Trauma and Justice 2018-2023 UK
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
HVar: Disagreement in corpus annotation and variation of human understanding of text 2024-2026 GAČR
Domain Adaptation for Natural Language Generation 2020-2022 GAUK
SEEM-CZ: Epistemic and Evidential Markers in Czech 2023-2025 GAČR
ELG: European Language Grid 2019-2021 H2020
ECSS: Evaluation of conversational speech synthesis 2022-2024 GAUK
Global Coherence: Global Coherence of Czech Texts in the Corpus-Based Perspective 2020 - 2023 GAČR
HPLT: High Performance Language Technologies 2022-2025 HE
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
OP VVV LINDAT: LINDAT/CLARIN - Research infrastructure for language technologies – extension of the repository and its computational power 2017–2019 MŠMT - OP VVV
LiFR: Linguistic Factors of Readability in Czech Administrative and Educational Texts 2019-2021 GAČR
RapiDisc: Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for Rapid Discourse Annotation in Selected Corpora] 2022-2024 GAČR
INTERCOST-Readability: Modelování komplexity českých literárních textů [Modelling the Complexity of Czech Literary Texts] VI 2018 - X 2021 MŠMT
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
Named Entity Linking 2020-2022 GAUK
OmniOMR: OmniOMR - optical music recognition using machine learning for digital libraries 2023-2027 NAKI
EdUKate: Promoting digital education of foreign-language children through machine translation 2023-2026 TAČR
Mashcima: Synthetic training data generation and other methods for handwritten music recognition 2023-2025 GAUK
LAPPS-CLARIN: Transatlantic Collaboration between LAPPS and CLARIN: Semantic, Technical and Infrastructural Interoperability of Services 2016-2018, 2019-2021 Mellon Foundation (USA)
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
VALLEX - Between Reciprocity and Reflexivity: The Case of Czech Reciprocal Constructions 2018-2020 GAČR
Word-formation structure of Czech words: a data-based research 2019-2021 GAČR
Lexicons
Duration Provider
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START
CzeDParse: Automatická analýza diskurzních vztahů v češtině [Automatic Analysis of Discourse Relations in Czech] 2019-2021 GAČR
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
SEEM-CZ: Epistemic and Evidential Markers in Czech 2023-2025 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Modeling Morpheme Flow among Languages Jan 2024 - Dec 2026 GAUK
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
NomVallex II.: Valency of Non-verbal Predicates. An Extension of Valency Studies to Adjectives and Deadjectival Nouns. 2019-2021 GAČR
VALLEX - Between Reciprocity and Reflexivity: The Case of Czech Reciprocal Constructions 2018-2020 GAČR
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR
Morphology
Duration Provider
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
Compound Identification and Splitting in Four Languages: A Deep Learning Approach 2022-2024 GAUK
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
LSD: Linguistic Structure Representation in Neural Networks 2018-2020 GAČR
Modeling Morpheme Flow among Languages Jan 2024 - Dec 2026 GAUK
Morphological complexity of the verbal lexicon in four languages: Quantitative research based on corpus data 2023-2025 GAUK
Word-formation structure of Czech words: a data-based research 2019-2021 GAČR
Multilingual
Duration Provider
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START
Babel Octopus: Robust Multi-Source Speech Translation 2021-2023 START
Compound Identification and Splitting in Four Languages: A Deep Learning Approach 2022-2024 GAUK
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
CELL: Contextual Machine Learning of Language Translations 2020-2022 CELSA
ELG: European Language Grid 2019-2021 H2020
Exploring Multilingual Representations of Language Units in Neural Networks 2021 - 2023 GAUK
HPLT: High Performance Language Technologies 2022-2025 HE
Language Neutral and Culturally Aware Multilingual Neural Sentence Representations 2023-2026 UK
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
LSD: Linguistic Structure Representation in Neural Networks 2018-2020 GAČR
Mnohojazyčný strojový překlad [Multilingual Machine Translation] 2018-2020 GAČR
Modeling Morpheme Flow among Languages Jan 2024 - Dec 2026 GAUK
LangTech: Modernizace oboru Matematická lingvistika [Modernization of the Field of Mathematical Linguistics] MŠMT - OP VVV
Morphological complexity of the verbal lexicon in four languages: Quantitative research based on corpus data 2023-2025 GAUK
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
Named Entity Linking 2020-2022 GAUK
NEUREM3: Neuronové reprezentace v multimodálním a mnohojazyčném modelování (Neural Representations in Multi-modal and Multi-lingual Modelling) 2019-2023 GAČR
EdUKate: Promoting digital education of foreign-language children through machine translation 2023-2026 TAČR
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
Semantics
Duration Provider
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START
Adapting Uniform Meaning Representation (UMR) for the Italic/Romance languages 2024-2026 GAUK
UNCE VITRI: Center for the Transdisciplinary Research of Violence, Trauma and Justice 2018-2023 UK
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
HVar: Disagreement in corpus annotation and variation of human understanding of text 2024-2026 GAČR
SEEM-CZ: Epistemic and Evidential Markers in Czech 2023-2025 GAČR
ELG: European Language Grid 2019-2021 H2020
ForFun2: Functions and Forms of Circumstantial Modifications 2023-2025 GAČR
Global Coherence: Global Coherence of Czech Texts in the Corpus-Based Perspective 2020 - 2023 GAČR
Independent component analysis of continuous word representations 2021–2022 GAUK
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
LiFR: Linguistic Factors of Readability in Czech Administrative and Educational Texts 2019-2021 GAČR
INTERCOST-Readability: Modelování komplexity českých literárních textů [Modelling the Complexity of Czech Literary Texts] VI 2018 - X 2021 MŠMT
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
NG-NLG: Next-Generation Natural Language Generation 2022-2027 Horizon Europe, ERC
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
Using Auxiliary Subtasks for Learning Constraints in NLP 2023-2025 GAUK
VALLEX - Between Reciprocity and Reflexivity: The Case of Czech Reciprocal Constructions 2018-2020 GAČR
Duration Provider
ATRIUM: Advancing FronTier Research In the Arts and hUManities 2024 - 2027 HE
RES-Q Plus: Comprehensive solutions of healthcare improvement based on the global Registry of Stroke Care Quality 2022-2026 HE
ELE 2: European Language Equality 2 2022-2023 PPPA (EU)
EVERSE: European Virtual Institute for Research Software Excellence 2024-2027 HE
HumanE-AI-Net: HumanE AI Network 1. 9. 2020 - 31. 8. 2024 H2020
Identification and Prevention of Unwanted Gender Bias in Neural Language Models 2023-2024 GAČR
Improving stomach examinations with Artificial Intelligence: A deep learning approach for assisted gastroscopy 1. 7. 2024 - 31. 12. 2026 MŠMT
InCroMin: Interactive Crosslingual Minutes 2024 HE
Methods for improving neural machine translation of diverse texts 2023-2025 GAUK
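Prameny Krkonoš: Prameny Krkonoš. Vývoj systému evidence, zpracování a prezentace pramenů k historii a kultuře Krkonoš a jeho využití ve výzkumu a edukaci [Sources of the Krkonoše Mountains: Development of a System for Recording, Processing, and Presenting Sources on the History and Culture of the Krkonoše Mountains and Its Use in Research and Education] 2020-2022 NAKI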
PROGRES Q48 - Informatika: Programy progres [Computer Science: PROGRES Programmes] 2017-2021 UK
PROGRES Q18 - Společenské vědy: Programy progres [Social Sciences: PROGRES Programmes] 2017-2021 UK
SSHOC: Social Sciences & Humanities Open Cloud 2019-30/04/2022 H2020
MEMORISE: Virtualisation and Multimodal Exploration of Heritage on Nazi Persecution 2022-2026 HE
Machine Learning
Duration Provider
Arithmetic Properties in the space of Language Model Prompts 2023 GAUK
PONK: Asistent přístupné úřední komunikace [Assistant for Accessible Official Communication] 9/2023-12/2025 TAČR
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
Compound Identification and Splitting in Four Languages: A Deep Learning Approach 2022-2024 GAUK
CELL: Contextual Machine Learning of Language Translations 2020-2022 CELSA
Dialogue systems focused on combining tasks and chit-chat 2021-2023 GAUK
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
Domain Adaptation for Natural Language Generation 2020-2022 GAUK
ECSS: Evaluation of conversational speech synthesis 2022-2024 GAUK
Exploring Multilingual Representations of Language Units in Neural Networks 2021 - 2023 GAUK
HPLT: High Performance Language Technologies 2022-2025 HE
Independent component analysis of continuous word representations 2021–2022 GAUK
Language Neutral and Culturally Aware Multilingual Neural Sentence Representations 2023-2026 UK
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
LSD: Linguistic Structure Representation in Neural Networks 2018-2020 GAČR
Low resource methods for dialogue systems applications 2020 - 2022 GAUK
Mnohojazyčný strojový překlad [Multilingual Machine Translation] 2018-2020 GAČR
LangTech: Modernizace oboru Matematická lingvistika [Modernization of the Field of Mathematical Linguistics] MŠMT - OP VVV
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR
Multimodal Optical Music Recognition using Deep Learning 2017-2019 GAUK
Named Entity Linking 2020-2022 GAUK
NEUREM3: Neuronové reprezentace v multimodálním a mnohojazyčném modelování (Neural Representations in Multi-modal and Multi-lingual Modelling) 2019-2023 GAČR
NG-NLG: Next-Generation Natural Language Generation 2022-2027 Horizon Europe, ERC
OmniOMR: OmniOMR - optical music recognition using machine learning for digital libraries 2023-2027 NAKI
Mashcima: Synthetic training data generation and other methods for handwritten music recognition 2023-2025 GAUK
THEaiTRE: THEAITRE: Umělá inteligence autorem divadelní hry? [Artificial Intelligence as the Author of a Theatre Play?] April 2020 - September 2022 TAČR
Using Auxiliary Subtasks for Learning Constraints in NLP 2023-2025 GAUK
Corpora
Duration Provider
PONK: Asistent přístupné úřední komunikace [Assistant for Accessible Official Communication] 9/2023-12/2025 TAČR
CzeDParse: Automatická analýza diskurzních vztahů v češtině [Automatic Analysis of Discourse Relations in Czech] 2019-2021 GAČR
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] 2023–2027 NAKI
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
SEEM-CZ: Epistemic and Evidential Markers in Czech 2023-2025 GAČR
ELG: European Language Grid 2019-2021 H2020
EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech Poetry in Educational and Multimedia Environments] 09/2023 - 11/2026 TAČR
Global Coherence: Global Coherence of Czech Texts in the Corpus-Based Perspective 2020 - 2023 GAČR
HPLT: High Performance Language Technologies 2022-2025 HE
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
OP VVV LINDAT: LINDAT/CLARIN - Research infrastructure for language technologies – extension of the repository and its computational power 2017–2019 MŠMT - OP VVV
LiFR: Linguistic Factors of Readability in Czech Administrative and Educational Texts 2019-2021 GAČR
RapiDisc: Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for Rapid Discourse Annotation in Selected Corpora] 2022-2024 GAČR
INTERCOST-Readability: Modelování komplexity českých literárních textů [Modelling the Complexity of Czech Literary Texts] VI 2018 - X 2021 MŠMT
Morphological complexity of the verbal lexicon in four languages: Quantitative research based on corpus data 2023-2025 GAUK
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK
LAPPS-CLARIN: Transatlantic Collaboration between LAPPS and CLARIN: Semantic, Technical and Infrastructural Interoperability of Services 2016-2018, 2019-2021 Mellon Foundation (USA)
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
NomVallex II.: Valency of Non-verbal Predicates. An Extension of Valency Studies to Adjectives and Deadjectival Nouns. 2019-2021 GAČR
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR
Discourse
Duration Provider
CzeDParse: Automatická analýza diskurzních vztahů v češtině [Automatic Analysis of Discourse Relations in Czech] 2019-2021 GAČR
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] 2023–2027 NAKI
UNCE VITRI: Center for the Transdisciplinary Research of Violence, Trauma and Justice 2018-2023 UK
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
Global Coherence: Global Coherence of Czech Texts in the Corpus-Based Perspective 2020 - 2023 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
LiFR: Linguistic Factors of Readability in Czech Administrative and Educational Texts 2019-2021 GAČR
Low resource methods for dialogue systems applications 2020 - 2022 GAUK
RapiDisc: Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for Rapid Discourse Annotation in Selected Corpora] 2022-2024 GAČR
INTERCOST-Readability: Modelování komplexity českých literárních textů [Modelling the Complexity of Czech Literary Texts] VI 2018 - X 2021 MŠMT
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK
NaMuDDiS: Natural multi-domain dialogue systems 2019-2021 UK
Parsers
Duration Provider
CzeDParse: Automatická analýza diskurzních vztahů v češtině [Automatic Analysis of Discourse Relations in Czech] 2019-2021 GAČR
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
ELG: European Language Grid 2019-2021 H2020
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
LSD: Linguistic Structure Representation in Neural Networks 2018-2020 GAČR
RapiDisc: Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for Rapid Discourse Annotation in Selected Corpora] 2022-2024 GAČR
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
Monolingual
Duration Provider
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] 2023–2027 NAKI
EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech Poetry in Educational and Multimedia Environments] 09/2023 - 11/2026 TAČR
HPLT: High Performance Language Technologies 2022-2025 HE
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
LiFR: Linguistic Factors of Readability in Czech Administrative and Educational Texts 2019-2021 GAČR
VALLEX - Between Reciprocity and Reflexivity: The Case of Czech Reciprocal Constructions 2018-2020 GAČR
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR
Tools
Duration Provider
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] 2023–2027 NAKI
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
Compound Identification and Splitting in Four Languages: A Deep Learning Approach 2022-2024 GAUK
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
ELG: European Language Grid 2019-2021 H2020
EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech Poetry in Educational and Multimedia Environments] 09/2023 - 11/2026 TAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
OP VVV LINDAT: LINDAT/CLARIN - Research infrastructure for language technologies – extension of the repository and its computational power 2017–2019 MŠMT - OP VVV
Mashcima: Synthetic training data generation and other methods for handwritten music recognition 2023-2025 GAUK
The Anthropology of Artificial Intelligence: Ethics, Understanding, Human Nature 2023-2024 ETF UK
THEaiTRE: THEAITRE: Umělá inteligence autorem divadelní hry? [Artificial Intelligence as the Author of a Theatre Play?] April 2020 - September 2022 TAČR
LAPPS-CLARIN: Transatlantic Collaboration between LAPPS and CLARIN: Semantic, Technical and Infrastructural Interoperability of Services 2016-2018, 2019-2021 Mellon Foundation (USA)
AIvK Exponát Didaktikon: Život s umělou inteligencí: upgrade [Didaktikon Exhibit: Life with Artificial Intelligence: Upgrade] 2023-09-01 - 2023-12-31 UK
Machine Translation
Duration Provider
Babel Octopus: Robust Multi-Source Speech Translation 2021-2023 START
Bergamot: Browser-based Multilingual Translation 2019-2021 H2020
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CELL: Contextual Machine Learning of Language Translations 2020-2022 CELSA
ELG: European Language Grid 2019-2021 H2020
ELITR: European Live Translator 2019-2021 H2020
HPLT: High Performance Language Technologies 2022-2025 HE
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
LSD: Linguistic Structure Representation in Neural Networks 2018-2020 GAČR
Machine Translation of Interpreted Speech 2020-2022 GAUK
Mnohojazyčný strojový překlad [Multilingual Machine Translation] 2018-2020 GAČR
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
EdUKate: Promoting digital education of foreign-language children through machine translation 2023-2026 TAČR
Research of Methods of Neural Machine Translation Evaluation 2018-2020 GAUK
Using Auxiliary Subtasks for Learning Constraints in NLP 2023-2025 GAUK
Utilising Linguistic Knowledge in Neural Machine Translation 2018 - 2020 GAUK
Speech Recognition
Duration Provider
Babel Octopus: Robust Multi-Source Speech Translation 2021-2023 START
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
ELG: European Language Grid 2019-2021 H2020
ELITR: European Live Translator 2019-2021 H2020
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Machine Translation of Interpreted Speech 2020-2022 GAUK
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
Information Structure
Duration Provider
CEDMO 2.0 NPO 1.9. 2024 - 30. 4. 2026 MPO
Exploring Multilingual Representations of Language Units in Neural Networks 2021 - 2023 GAUK
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
LiFR: Linguistic Factors of Readability in Czech Administrative and Educational Texts 2019-2021 GAČR
INTERCOST-Readability: Modelování komplexity českých literárních textů [Modelling the Complexity of Czech Literary Texts] VI 2018 - X 2021 MŠMT
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK
Multi-modality
Duration Provider
CEDMO 2.0 NPO 1.9. 2024 - 30. 4. 2026 MPO
CEMI: Center for large-scale multi-modal data interpretation 2012 - 2019 GAČR
UNCE VITRI: Center for the Transdisciplinary Research of Violence, Trauma and Justice 2018-2023 UK
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CELL: Contextual Machine Learning of Language Translations 2020-2022 CELSA
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
Language Neutral and Culturally Aware Multilingual Neural Sentence Representations 2023-2026 UK
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Machine Translation of Interpreted Speech 2020-2022 GAUK
Multimodal Optical Music Recognition using Deep Learning 2017-2019 GAUK
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
NEUREM3: Neuronové reprezentace v multimodálním a mnohojazyčném modelování (Neural Representations in Multi-modal and Multi-lingual Modelling) 2019-2023 GAČR
EdUKate: Promoting digital education of foreign-language children through machine translation 2023-2026 TAČR
Coreference
Duration Provider
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Using Auxiliary Subtasks for Learning Constraints in NLP 2023-2025 GAUK
Linked data
Duration Provider
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
ELG: European Language Grid 2019-2021 H2020
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
NG-NLG: Next-Generation Natural Language Generation 2022-2027 Horizon Europe, ERC
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
Publications
Duration Provider
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Taggers
Duration Provider
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
ELG: European Language Grid 2019-2021 H2020
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
LSD: Linguistic Structure Representation in Neural Networks 2018-2020 GAČR
Named Entity Linking 2020-2022 GAUK
Valency
Duration Provider
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
NomVallex II.: Valency of Non-verbal Predicates. An Extension of Valency Studies to Adjectives and Deadjectival Nouns. 2019-2021 GAČR
VALLEX - Between Reciprocity and Reflexivity: The Case of Czech Reciprocal Constructions 2018-2020 GAČR
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR
Teaching
Duration Provider
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
LCT: European Masters Program Language and Communication Technologies IX.2007-VIII.2013, IX.2013-VIII.2019, IX.2019-VIII.2025 EU ERASMUS MUNDUS
EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech Poetry in Educational and Multimedia Environments] 09/2023 - 11/2026 TAČR
INTERCOST-Readability: Modelování komplexity českých literárních textů [Modelling the Complexity of Czech Literary Texts] VI 2018 - X 2021 MŠMT
LangTech: Modernizace oboru Matematická lingvistika [Modernization of the Field of Mathematical Linguistics] MŠMT - OP VVV
NaMuDDiS: Natural multi-domain dialogue systems 2019-2021 UK
AIvK Exponát Didaktikon: Život s umělou inteligencí: upgrade [Didaktikon Exhibit: Life with Artificial Intelligence: Upgrade] 2023-09-01 - 2023-12-31 UK
Psycholinguistics
Duration Provider
HVar: Disagreement in corpus annotation and variation of human understanding of text 2024-2026 GAČR
Syntax
Duration Provider
ELG: European Language Grid 2019-2021 H2020
ForFun2: ForFun2: Functions and Forms of Circumstantial Modifications 2023-2025 GAČR
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
LiFR: Linguistic Factors of Readability in Czech Administrative and Educational Texts 2019-2021 GAČR
LSD: Linguistic Structure Representation in Neural Networks 2018-2020 GAČR
INTERCOST-Readability: Modelování komplexity českých literárních textů [Modelling the Complexity of Czech Literary Texts] VI 2018 - X 2021 MŠMT
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
VALLEX - Between Reciprocity and Reflexivity: The Case of Czech Reciprocal Constructions 2018-2020 GAČR
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR
Multiword Expressions
Duration Provider
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
Speech Retrieval
Duration Provider
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Spellcheckers
Duration Provider
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Provider: HE
Duration Provider Grant ID PI Area
EVERSE: European Virtual Institute for Research Software Excellence 2024-2027 HE 101129744 Pavel Straňák
ATRIUM: Advancing FronTier Research In the Arts and hUManities 2024 - 2027 HE 101132163 Pavel Straňák
InCroMin: Interactive Crosslingual Minutes 2024 HE 101070631 Ondřej Bojar
MEMORISE: Virtualisation and Multimodal Exploration of Heritage on Nazi Persecution 2022-2026 HE 101061016 Pavel Pecina
RES-Q Plus: Comprehensive solutions of healthcare improvement based on the global Registry of Stroke Care Quality 2022-2026 HE 101057603 Pavel Pecina
HPLT: High Performance Language Technologies 2022-2025 HE 101070350 Jan Hajič Corpora, Data, Machine Learning, Machine Translation, Monolingual, Multilingual
Provider: Social Sciences and Humanities Research Council of Canada
Duration Provider Grant ID PI Area
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada 895-2023-1002 Jan Hajič jr. Corpora, Data, Information Retrieval, Linked data, Machine Learning, Multi-modality, Tools
Provider: ETF UK
Duration Provider Grant ID PI Area
The Anthropology of Artificial Intelligence: Ethics, Understanding, Human Nature 2023-2024 ETF UK 247002 Rudolf Rosa Generation, Tools
Provider: Horizon Europe, ERC
Duration Provider Grant ID PI Area
NG-NLG: Next-Generation Natural Language Generation 2022-2027 Horizon Europe, ERC 101039303 Ondřej Dušek Dialog, Generation, Linked data, Machine Learning, Semantics
Provider: PPPA (EU)
Duration Provider Grant ID PI Area
ELE 2: European Language Equality 2 2022-2023 PPPA (EU) LC-01884166 (Project 101075356) Jan Hajič
Provider: CELSA
Duration Provider Grant ID PI Area
CELL: Contextual Machine Learning of Language Translations 2020-2022 CELSA CELSA/19/018 Pavel Pecina Machine Learning, Machine Translation, Multi-modality, Multilingual

MŠMT - velké infrastruktury

Duration Provider Grant ID PI Area
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury LM2015071 Jan Hajič Annotations, Coreference, Corpora, Data, Dialog, Discourse, Lexicons, Linked data, Machine Learning, Machine Translation, Morphology, Multi-modality, Parsers, Publications, Semantics, Speech Recognition, Taggers, Tools, Valency
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury LM2023062 Jan Hajič Annotations, Coreference, Corpora, Data, Dialog, Discourse, Generation, Information Structure, Lexicons, Linked data, Machine Learning, Machine Translation, Monolingual, Morphology, Multi-modality, Multilingual, Multiword Expressions, Parsers, Publications, Semantics, Speech Recognition, Speech Retrieval, Spellcheckers, Syntax, Taggers, Tools, Valency
Provider: MPO
Duration Provider Grant ID PI Area
CEDMO 2.0 NPO 1.9. 2024 - 30. 4. 2026 MPO MPO 60273/24/21300/21000 Ondřej Bojar Data, Information Retrieval, Information Structure, Multi-modality

Institutional support for research at the Charles University

Duration Provider Grant ID PI Area
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK UNCE/24/SSH/009 Zdeněk Žabokrtský Annotations, Corpora, Data, Discourse, Information Structure, Multilingual
Language Neutral and Culturally Aware Multilingual Neural Sentence Representations 2023-2026 UK PRIMUS/23/SCI/023 Jindřich Libovický Machine Learning, Multi-modality, Multilingual
AIvK Exponát Didaktikon: Život s umělou inteligencí: upgrade [Didaktikon Exhibit: Life with Artificial Intelligence: upgrade] 2023-09-01 - 2023-12-31 UK Unknown Rudolf Rosa Generation, Teaching, Tools
NaMuDDiS: Natural multi-domain dialogue systems 2019-2021 UK PRIMUS 19/SCI/10 Ondřej Dušek Dialog, Discourse, Generation, Teaching
UNCE VITRI: Center for the Transdisciplinary Research of Violence, Trauma and Justice 2018-2023 UK UNCE/HUM/009 Jakub Mlynář Data, Discourse, Multi-modality, Semantics
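PROGRES Q48 - Informatika: Programy progres [PROGRES Q48 - Computer Science: PROGRES programmes] 2017-2021 UK Q48 Jan Hajič
PROGRES Q18 - Společenské vědy: Programy progres [PROGRES Q18 - Social Sciences: PROGRES programmes] 2017-2021 UK Q18 Jan Hajič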

Horizon 2020 - European Commission

Duration Provider Grant ID PI Area
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020 101004984 Silvie Cinková Annotations, Corpora, Data, Multilingual, Parsers, Semantics, Taggers, Teaching, Tools
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020 870930 Pavel Pecina Annotations, Data, Dialog, Linked data, Machine Translation, Multi-modality, Multilingual, Parsers, Semantics, Speech Recognition
SSHOC: Social Sciences & Humanities Open Cloud 2019-30/04/2022 H2020 823782 Jan Hajič
ELG: European Language Grid 2019-2021 H2020 825627 Jan Hajič Annotations, Corpora, Data, Linked data, Machine Translation, Multilingual, Parsers, Semantics, Speech Recognition, Syntax, Taggers, Tools
Bergamot: Browser-based Multilingual Translation 2019-2021 H2020 825303 Ondřej Bojar Machine Translation
ELITR: European Live Translator 2019-2021 H2020 825460 Ondřej Bojar Machine Translation, Speech Recognition
HumanE-AI-Net: HumanE AI Network 1. 9. 2020 - 31. 8. 2024 H2020 952026 Jan Hajič

EU ERASMUS MUNDUS

Duration Provider Grant ID PI Area
LCT: European Masters Program Language and Communication Technologies IX.2007-VIII.2013, IX.2013-VIII.2019, IX.2019-VIII.2025 EU ERASMUS MUNDUS 610622-EPP-1-2019-1-DE-EPPKA1-JMD-MOB Vladislav Kuboň Teaching

Mellon Foundation (USA)

Duration Provider Grant ID PI Area
LAPPS-CLARIN: Transatlantic Collaboration between LAPPS and CLARIN: Semantic, Technical and Infrastructural Interoperability of Services 2016-2018, 2019-2021 Mellon Foundation (USA) G-1901-06505 Jan Hajič Annotations, Corpora, Data, Tools

MŠMT - OP VVV

Duration Provider Grant ID PI Area
OP VVV LINDAT: LINDAT/CLARIN - Research infrastructure for language technologies – extension of the repository and its computational power 2017–2019 MŠMT - OP VVV CZ.02.1.01/0.0/0.0/16_013/0001781 Jan Hajič Annotations, Corpora, Data, Tools
LangTech: Modernizace oboru Matematická lingvistika [Modernization of the Mathematical Linguistics Field of Study] MŠMT - OP VVV CZ.02.2.69/0.0/0.0/16_018/0002373 Zdeněk Žabokrtský Machine Learning, Multilingual, Teaching

Technology Agency (Czech Republic)

Duration Provider Grant ID PI Area
THEaiTRE: THEAITRE: Umělá inteligence autorem divadelní hry? [Artificial Intelligence as the Author of a Theatre Play?] April 2020 - September 2022 TAČR TL03000348 Rudolf Rosa Dialog, Generation, Machine Learning, Tools
PONK: Asistent přístupné úřední komunikace [Assistant for Accessible Official Communication] 9/2023-12/2025 TAČR TQ01000526 Barbora Vidová Hladká Annotations, Corpora, Machine Learning
EdUKate: Promoting digital education of foreign-language children through machine translation 2023-2026 TAČR TQ01000458 Lucie Poláková Data, Machine Translation, Multi-modality, Multilingual
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR FW03010656 Pavel Pecina Generation, Information Retrieval, Information Structure, Machine Learning, Machine Translation, Semantics
EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech Poetry in Educational and Multimedia Environments] 09/2023 - 11/2026 TAČR TQ01000153 Rudolf Rosa Annotations, Corpora, Generation, Monolingual, Teaching, Tools
EDU-AI: AI asistent pro žáky a učitele [AI Assistant for Pupils and Teachers] 04/2021-12/2023 TAČR TL05000236 Ondřej Dušek Dialog, Generation, Information Retrieval

Czech Science Foundation

Duration Provider Grant ID PI Area
AIAI: AI: Authorship and Interpretation 2025-2027 GAČR 25-14501L Rudolf Rosa Generation
HVar: Disagreement in corpus annotation and variation of human understanding of text 2024-2026 GAČR 24-11132S Šárka Zikánová Annotations, Data, Psycholinguistics, Semantics
SEEM-CZ: Epistemic and Evidential Markers in Czech 2023-2025 GAČR 23-05240S Barbora Štěpánková Annotations, Corpora, Data, Lexicons, Semantics
ForFun2: ForFun2: Functions and Forms of Circumstantial Modifications 2023-2025 GAČR 23-05238S Marie Mikulová Annotations, Semantics, Syntax
Identification and Prevention of Unwanted Gender Bias in Neural Language Models 2023-2024 GAČR 23-06912S David Mareček
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR 22-20927S Veronika Kolářová Annotations, Corpora, Lexicons, Monolingual, Syntax, Valency
RapiDisc: Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for Rapid Discourse Annotation in Selected Corpora] 2022-2024 GAČR 22-03269S Jiří Mírovský Annotations, Corpora, Data, Discourse, Parsers
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR GX20-16819X Jan Hajič Coreference, Machine Learning, Machine Translation, Parsers, Semantics, Syntax, Valency
Global Coherence: Global Coherence of Czech Texts in the Corpus-Based Perspective 2020 - 2023 GAČR 20-09853S Lucie Poláková Annotations, Corpora, Data, Discourse, Semantics
NEUREM3: Neuronové reprezentace v multimodálním a mnohojazyčném modelování (Neural Representations in Multi-modal and Multi-lingual Modelling) 2019-2023 GAČR 19-26934X Ondřej Bojar Machine Learning, Multi-modality, Multilingual
CzeDParse: Automatická analýza diskurzních vztahů v češtině [Automatic Analysis of Discourse Relations in Czech] 2019-2021 GAČR 19-03490S Jiří Mírovský Annotations, Corpora, Data, Discourse, Lexicons, Parsers
NomVallex II.: Valency of Non-verbal Predicates. An Extension of Valency Studies to Adjectives and Deadjectival Nouns. 2019-2021 GAČR 19-16633S Veronika Kolářová Corpora, Lexicons, Valency
LiFR: Linguistic Factors of Readability in Czech Administrative and Educational Texts 2019-2021 GAČR 19-19191S Silvie Cinková Annotations, Corpora, Data, Discourse, Information Structure, Monolingual, Semantics, Syntax
Word-formation structure of Czech words: a data-based research 2019-2021 GAČR 19-14534S Magda Ševčíková Data, Morphology
Mnohojazyčný strojový překlad [Multilingual Machine Translation] 2018-2020 GAČR 18-24210S Ondřej Bojar Machine Learning, Machine Translation, Multilingual
LSD: Linguistic Structure Representation in Neural Networks 2018-2020 GAČR 18-02196S David Mareček Machine Learning, Machine Translation, Morphology, Multilingual, Parsers, Syntax, Taggers
VALLEX - Between Reciprocity and Reflexivity: The Case of Czech Reciprocal Constructions 2018-2020 GAČR 18-03984S Markéta Lopatková Data, Lexicons, Monolingual, Semantics, Syntax, Valency
CEMI: Center for large-scale multi-modal data interpretation 2012 - 2019 GAČR GAP103/12/G084 Pavel Pecina Multi-modality

Ministry of Education, Youth and Sport (Czech Republic)

Duration Provider Grant ID PI Area
INTERCOST-Readability: Modelování komplexity českých literárních textů [Modelling the Complexity of Czech Literary Texts] VI 2018 - X 2021 MŠMT LTC18020 Silvie Cinková Annotations, Corpora, Data, Discourse, Information Structure, Semantics, Syntax, Teaching
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT LUAUS23283 Jan Hajič Corpora, Data, Lexicons, Linked data, Multilingual, Multiword Expressions, Semantics, Syntax, Valency
Improving stomach examinations with Artificial Intelligence: A deep learning approach for assisted gastroscopy 1. 7. 2024 - 31. 12. 2026 MŠMT LUABA24136 Pavel Pecina

Ministry of Culture

Duration Provider Grant ID PI Area
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] 2023–2027 NAKI DH23P03OVV037 Kateřina Rysová Corpora, Data, Discourse, Monolingual, Tools
OmniOMR: OmniOMR - optical music recognition using machine learning for digital libraries 2023-2027 NAKI DH23P03OVV008 Jan Hajič jr. Annotations, Data, Machine Learning
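Prameny Krkonoš: Prameny Krkonoš. Vývoj systému evidence, zpracování a prezentace pramenů k historii a kultuře Krkonoš a jeho využití ve výzkumu a edukaci [Sources of the Krkonoše Mountains: Development of a system for the registration, processing and presentation of sources on the history and culture of the Krkonoše Mountains and its use in research and education] 2020-2022 NAKI DG20P02OVV010 Petra Hoffmannová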

Program START (UK - OP VVV)

Duration Provider Grant ID PI Area
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START START/HUM/010 Annotations, Data, Lexicons, Morphology, Multilingual, Semantics
Babel Octopus: Robust Multi-Source Speech Translation 2021-2023 START START/SCI/089 Peter Polák Machine Translation, Multilingual, Speech Recognition

Grant Agency of the Charles University

Duration Provider Grant ID PI Area
Modeling Morpheme Flow among Languages Jan 2024- Dec 2026 GAUK 101924 Abishek Stephen Lexicons, Morphology, Multilingual
Adapting Uniform Meaning Representation (UMR) for the Italic/Romance languages 2024-2026 GAUK 104924 Federica Gamba Data, Semantics
Mashcima: Synthetic training data generation and other methods for handwritten music recognition 2023-2025 GAUK 289623 Jiří Mayer Data, Machine Learning, Tools
Methods for improving neural machine translation of diverse texts 2023-2025 GAUK 244523 Josef Jon
Using Auxiliary Subtasks for Learning Constraints in NLP 2023-2025 GAUK 272323 Dávid Javorský Coreference, Machine Learning, Machine Translation, Semantics
Morphological complexity of the verbal lexicon in four languages: Quantitative research based on corpus data 2023-2025 GAUK 246723 Hana Hledíková Corpora, Morphology, Multilingual
Arithmetic Properties in the space of Language Model Prompts 2023 GAUK 291923 Generation, Machine Learning
ECSS: Evaluation of conversational speech synthesis 2022-2024 GAUK 40222 Ondřej Plátek Data, Dialog, Generation, Machine Learning
Compound Identification and Splitting in Four Languages: A Deep Learning Approach 2022-2024 GAUK 128122 Emil Svoboda Machine Learning, Morphology, Multilingual, Tools
Independent component analysis of continuous word representations 2021–2022 GAUK 370721 Tomáš Musil Annotations, Machine Learning, Semantics
Controllable NLG: Controllable Natural Language Generation 2021-2023 GAUK 39221 Sourabrata Mukherjee Generation
Dialogue systems focused on combining tasks and chit-chat 2021-2023 GAUK 373921 Dialog, Generation, Machine Learning
Exploring Multilingual Representations of Language Units in Neural Networks 2021 - 2023 GAUK 338521 Tomasz Limisiewicz Information Structure, Machine Learning, Multilingual
Named Entity Linking 2020-2022 GAUK 1280120 Data, Machine Learning, Multilingual, Taggers
Machine Translation of Interpreted Speech 2020-2022 GAUK 398120 Dominik Macháček Machine Translation, Multi-modality, Speech Recognition
Domain Adaptation for Natural Language Generation 2020-2022 GAUK 140320 Zdeněk Kasner Data, Generation, Machine Learning
Low resource methods for dialogue systems applications 2020 - 2022 GAUK 302120 Dialog, Discourse, Machine Learning
Research of Methods of Neural Machine Translation Evaluation 2018-2020 GAUK 1140218 Dušan Variš Machine Translation
Utilising Linguistic Knowledge in Neural Machine Translation 2018 - 2020 GAUK 976518 Jindřich Helcl Machine Translation
Multimodal Optical Music Recognition using Deep Learning 2017-2019 GAUK 1444217 Jan Hajič jr. Machine Learning, Multi-modality
National Science Foundation
Duration Provider Area
PIRE: Partnership for International Research and Education till 2014 NSF Machine Translation, Semantics, Speech Recognition, Teaching

Horizon 2020 - European Commission

Duration Provider Area
CLARIN-PLUS September 2015 – August 2017 H2020
QT21: Quality Translation 21 II.2015-I.2018 H2020 Data, Lexicons, Linked data, Machine Learning, Machine Translation, Tools
KConnect: Khresmoi Multilingual Medical Text Analysis, Search and Machine Translation Connected in a Thriving Data-Value Chain 2015-2017 H2020 Information Retrieval, Machine Translation, Semantics
HimL: Health in my Language 2.2015–1.2018 H2020 Data, Lexicons, Machine Translation, Morphology
CRACKER: Cracking the Language Barrier: Coordination, Evaluation and Resources for European MT Research 1.2015-12.2017 H2020 Data, Machine Translation

FP6: Research - European Commission

Duration Provider Area
EuroMatrix IX.2006-II.2009 FP6 Annotations, Corpora, Machine Translation, Tools, Valency

Grant Agency of the Charles University

Duration Provider Area
Neural machine translation for low-resource languages 2019-2021 GAUK Machine Translation, Monolingual
Developing derivational networks for multiple languages 2019-2021 GAUK Data, Morphology, Multilingual
Vektorová reprezentace textu založená na neuronových sítích [Neural-Network-Based Vector Representation of Text] 2019 - 2021 GAUK Information Retrieval, Machine Learning, Machine Translation
Universal morphosyntactic annotation of language data 2017-2019 GAUK Annotations, Corpora, Machine Learning, Multilingual, Parsers
DeepSynt: Deep Syntactic Representation across Languages 2017-2018 GAUK Corpora, Data, Multilingual
Open domain dialog management with knowledge graphs 2016-2018 GAUK Data, Dialog, Machine Learning
open-domain SLU: Spoken Language Understanding in open-domain environment 2016-2018 GAUK Dialog, Information Retrieval, Linked data, Machine Learning, Semantics
ANNMT: Utilization of artificial neural networks in machine translation 2016-2018 GAUK Machine Translation
Using Language Knowledge in Scene Text Recognition 2015-2017 GAUK Multi-modality
cross-coref: Cross-lingual approaches to coreference resolution 2015-2017 GAUK Annotations, Coreference, Corpora, Data, Machine Learning, Machine Translation, Multilingual
DiaMine: Information mining from spoken dialogue 2015-2017 GAUK Data, Dialog, Machine Learning, Speech Recognition
Čapek GAUK: An alternative way of getting more annotated linguistic data 2014-2016 GAUK Annotations, Tools
AdaNLG: An adaptive natural language generator 2014-2016 GAUK Dialog, Generation, Multilingual, Semantics
croSSSynt: Modelling dependency syntax across languages 2014-2016 GAUK Annotations, Corpora, Data, Multilingual, Parsers
MSDS: Modern Spoken Dialog Systems 2014, 2015, 2016 GAUK Data, Dialog, Machine Learning, Speech Recognition
DepRefSet: Utilizing a Multitude of References in Machine Translation 2013-2015 GAUK Data, Machine Translation
Interactive information retrieval in audiovisual dialogue corpora 2013-2015 GAUK Information Retrieval, Speech Retrieval
Tools and data for Machine Translation between Related Languages 2012-2013 GAUK Corpora, Data, Machine Translation, Tools, Valency
Utilization of coreference in MT: Utilization of coreference in Machine Translation 2011-2013 GAUK Linked data, Machine Translation
Sentence-Level Polarity Detection in a Computer Corpus 2011-2013 GAUK Annotations, Corpora, Data, Lexicons, Tools

Czech Science Foundation

Duration Provider Area
AnaConn: Anaphoricity in Connectives: Lexical Description and Bilingual Corpus Analysis 2017–2019 GAČR Discourse, Lexicons, Multilingual
ForFun: Subcategorization of adverbial meanings based on corpus data 2017-2019 GAČR Annotations, Corpora, Data, Monolingual, Semantics
IRTC: Implicit Relations in Text Coherence 2017-2019 GAČR Annotations, Corpora, Data, Discourse, Psycholinguistics
CzEngClass: Contextually-based synonymy and valency of verbs in a bilingual setting 2017-2019 GAČR Annotations, Corpora, Data, Lexicons, Semantics, Valency
CorefChains: Structure of coreferential chains in parallel language data 2016-2018 GAČR Annotations, Coreference, Corpora, Data
NomVallex: Corpus-based Valency Lexicon of Czech Nouns 2016-2018 GAČR Corpora, Lexicons, Valency
DerInfMorph: An Integrated Approach to Derivational and Inflectional Morphology of Czech 2016-2018 GAČR Data, Monolingual, Morphology
Manyla: Morphologically and Syntactically Annotated Corpora of Many Languages 2015–2017 GAČR Annotations, Corpora, Data, Morphology, Multilingual, Parsers, Taggers
zelligharris: Reviving Zellig S. Harris: More linguistic information for distributional lexical analysis of English and Czech 2015-2017 GAČR Annotations, Corpora, Data, Semantics, Taggers
On Linguistic Structure of Evaluative Meaning in Czech 2015-2017 GAČR Annotations, Corpora, Data, Lexicons, Semantics
Combining Words: Syntactic Properties of Czech Multiword Expressions with Light Verbs 2015-2017 GAČR Annotations, Data, Lexicons, Multiword Expressions, Valency
LiStr: Sentence structure induction without annotated corpora 2014 - 2016 GAČR Machine Learning, Multilingual, Parsers
CzEngVallex: A comparison of Czech and English verbal valency based on corpus material (theory and practice) 2013-2015 GAČR Annotations, Corpora, Data, Lexicons
Vybrané derivační vztahy pro automatické zpracování češtiny [Selected Derivational Relations for Automatic Processing of Czech] 2012–2014 GAČR Morphology
VALLEX: Delving Deeper: Lexicographic Description of Syntactic and Semantic Properties of Czech Verbs 2012-2015 GAČR Annotations, Data, Lexicons, Semantics, Syntax, Valency
Systematic, economical and corpus-based description of valency properties of Czech deverbal nouns (theory and practice) 2012-2014 GAČR Lexicons, Valency
CorefDisk: Coreference, Discourse Relations and Information Structure in a Contrastive Perspective 2012 - 2015 GAČR Annotations, Coreference, Corpora, Data, Discourse, Information Structure
CZECHMATE: Čeština ve věku strojového překladu [Czech in the Age of Machine Translation] 2011 – 2013 GAČR Annotations, Corpora, Data, Machine Translation, Morphology, Parsers
NoSCoM: Non-Standard Computational Models and Their Applications in Complexity, Linguistics, and Learning 2010-2014 GAČR
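Komputační lingvistika: Explicitní popis jazyka a anotovaná data se zřetelem na češtinu [Computational Linguistics: Explicit Description of Language and Annotated Data with a Focus on Czech] 2010-2013 GAČR Annotations, Coreference, Corpora, Data, Discourse, Information Structure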

OP Praha – Pól růstu ČR

Duration Provider Area
MTviet: Machine Translation from Vietnamese into Czech for the Purposes of the Police of the Czech Republic 2017-2018 Praha OP PPR Machine Translation

Ministry of Culture

Duration Provider Area
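ÚSTR: Systém pro trvalé uchování dokumentace a prezentaci historických pramenů z období totalitních režimů [System for the Permanent Preservation of Documentation and Presentation of Historical Sources from the Period of Totalitarian Regimes] 2016-2019 NAKI
VIADAT: Virtuální asistent pro zpřístupnění historických audiovizuálních dat [Virtual Assistant for Accessing Historical Audiovisual Data] 2016-2019 NAKI Annotations, Speech Recognition, Tools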
AMALACH 2012-2015 NAKI Information Retrieval, Machine Translation, Multi-modality, Speech Recognition, Speech Retrieval, Teaching
EVALD (Evaluator of Discourse): Automatic Evaluation of Text Coherence in Czech 1. 3. 2016 – 31. 12. 2019 NAKI Coreference, Discourse, Information Structure

Ministry of Education, Youth and Sport (Czech Republic)

Duration Provider Area
Multilingual Corpus Annotation as a Support for Language Technologies 2014-2016 MŠMT Annotations, Coreference, Corpora, Data, Discourse
MOBAme: Modern Bayesian methods in machine learning 2013-2013 MŠMT Teaching
VYSTADIAL: Development of statistical methods for spoken dialogue systems 2012-2016 MŠMT Corpora, Dialog, Speech Recognition, Tools
KontaktII: Strojový překlad se sémantickou informací [Machine Translation with Semantic Information] 2012-2014 MŠMT Annotations, Corpora, Data, Lexicons, Machine Translation, Semantics, Valency
LINDAT/Clarin: Establishing and operating the Czech node of pan-European infrastructure for research (Vybudování a provoz českého uzlu pan-evropské infrastruktury pro výzkum) 2010-2015 MŠMT Annotations, Coreference, Corpora, Data, Dialog, Discourse, Lexicons, Linked data, Machine Learning, Machine Translation, Morphology, Multi-modality, Parsers, Publications, Semantics, Speech Recognition, Taggers, Tools, Valency
Kontakt: Towards a Computational Analysis of Text Structure 2010 - 2012 MŠMT Annotations, Coreference, Corpora, Data, Discourse
TextLink-cz: TextLink: Skladba diskurzu v evropských jazycích [Discourse Structure in European Languages] 1.11.2015 - 31.12.2017 MŠMT Annotations, Corpora, Data, Discourse, Lexicons, Linked data, Monolingual
LD-Parseme: PARSEME: Parsing a víceslovné výrazy – k jazykovědné přesnosti a výpočetní efektivitě ve zpracování přirozeného jazyka [Parsing and Multiword Expressions: Towards Linguistic Precision and Computational Efficiency in Natural Language Processing] 04-2014 – 03-2017 MŠMT Lexicons, Multiword Expressions, Semantics, Valency

FP7: Research - European Commission

Duration Provider Area
TextLink: TextLink: Structuring Discourse in Multilingual Europe 2014 - 2017 FP7 Coreference, Corpora, Discourse, Linked data, Multilingual
QTLeap: Quality Translation by Deep Language Engineering Approaches 2013–2016 FP7 Linked data, Machine Translation
PARSEME: PARSEME: Parsing and Multiword Expressions 2013-2017 FP7 Lexicons, Multiword Expressions, Semantics, Valency
MosesCore 2012-2015 FP7 Data, Machine Translation, Teaching, Tools
EUDAT: EUDAT: European Data Infrastructure 2011–2014 FP7 Data
FAUST: Feedback Analysis for User adaptive Statistical Translation 2010–2013 FP7 Machine Translation
KHRESMOI: Medical information analysis and retrieval 2010-2014 FP7 Information Retrieval, Machine Translation
CLARA: Common Language Resources and their Applications - a Marie Curie ITN 2009-2013 FP7 Annotations, Corpora, Data, Machine Translation, Teaching
EuroMatrixPlus 2009-2012 FP7 Machine Translation

Institutional support for research at the Charles University

Duration Provider Area
PRVOUK: Programy rozvoje vědních oblastí na Univerzitě Karlově - Informatika [Programmes for the Development of Scientific Fields at Charles University - Computer Science] 2012-2016 UK

Technology Agency (Czech Republic)

Duration Provider Area
INTLIB: Intelligent library 2012-2015 TAČR Data, Linked data, Tools

EU Lifelong Learning Programme

Duration Provider Area
Merlin 2012-2014 LLP Annotations, Corpora, Data

MVČR

Duration Provider Area
PoliSys: Systém pro analýzu policejních dat pro potřeby Policie ČR [System for the Analysis of Police Data for the Needs of the Police of the Czech Republic] 03/2017-03/2018 MVČR Data, Information Retrieval, Machine Learning, Morphology

Inspire

Duration Provider Area
INSPIRE: INSPIRE in Pocket Inspire Machine Translation