All Papers (a mirror of biblio)

Ibrahim Sa'id Ahmad, Antonios Anastasopoulos, Ondřej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, William Chen, Qianqian Dong, Marcello Federico, Barry Haddow, Dávid Javorský, Mateusz Krubiński, Tsz Kin Lam, Xutai Ma, Prashant Mathur, Evgeny Matusov, Chandresh Kumar Maurya, John McCrae, Kenton Murray, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, Atul Kr. Ojha, John Ortega, Sara Papi, Peter Polák, Pavel Pecina, Adam Pospíšil, Elizabeth Salesky, Nivedita Sethiya, Anoop Sarkar, Jiatong Shi, Claytone Sikasote, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Brian Thompson, Alex Waibel, Shinji Watanabe, Patrick Wilken, Petr Zemánek, Rodolfo Zevallos (2024): FINDINGS OF THE IWSLT 2024 EVALUATION CAMPAIGN. In: Proceedings of the 21st International Conference on Spoken Language Translation, pp. 1-11, Association for Computational Linguistics, Stroudsburg, USA, ISBN 979-8-89176-141-4 (url, bibtex)
Sunit Bhattacharya, Vilém Zouhar, Věra Kloudová, Ondřej Bojar (2024): Stroop Effect in Multi-Modal Sight Translation (Electronic). In: ArXiv.org Computing Research Repository, ISSN 2331-8422, pp. 1-5 (url, local PDF)
Dominika Ďurišková, Daniela Jurášová, Matúš Žilinec, Eduard Šubert, Ondřej Bojar (2024): Khan Academy Corpus: A multilingual corpus of Khan Academy lectures. In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 9743-9752, European Language Resources Association, Torino, Italy, ISBN 978-2-493814-10-4 (url, bibtex)
Michelle Elizabeth, Ondřej Bojar (2024): Revamping the SLTev Tool for Evaluation of Spoken Language Translation. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 121, pp. 5-14 (pdf, bibtex)
Miroslav Hrabal, Josef Jon, Martin Popel, Nam Hoang Luu, Danil Semin, Ondřej Bojar (2024): CUNI at WMT24 General Translation Task: LLMs, (Q)LoRA, CPO and Model Merging. In: Proceedings of the Ninth Conference on Machine Translation, pp. 232-246, Association for Computational Linguistics, Kerrville, TX, USA, ISBN 979-8-89176-179-7 (url, bibtex)
Josef Jon, Ondřej Bojar (2024): GAATME: A Genetic Algorithm for Adversarial Translation Metrics Evaluation. In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 7562-7569, European Language Resources Association, Torino, Italy, ISBN 978-2-493814-10-4 (url, bibtex)
Josef Jon, Ondřej Bojar (2024): An Analysis of Surprisal Uniformity in Machine and Human Translations. In: Proceedings of the 1st Workshop on Creative-text Translation and Technology, pp. 40-56, European Association for Machine Translation, Sheffield, UK, ISBN 9781068690730 (bibtex)
Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Anton Dvorkovich, Christian Federmann, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Marzena Karpinska, Philipp Koehn, Benjamin Marie, Christof Monz, Kenton Murray, Masaaki Nagata, Martin Popel, Maja Popović, Mariya Shmatova, Steinþór Steingrímsson, Vilém Zouhar (2024): Findings of the WMT24 General Machine Translation Shared Task: The LLM Era is Here but MT is Not Solved Yet. In: Proceedings of the Ninth Conference on Machine Translation, pp. 1-46, Association for Computational Linguistics, Kerrville, TX, USA, ISBN 979-8-89176-179-7 (pdf, bibtex)
Michal Novák, Peter Polák, Kateřina Rysová, Magdaléna Rysová, Ondřej Bojar (2024): Towards Automated Spoken Language Assessment: A Study of ASR Transcription of Examinations for Non-Native Speakers of Czech. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 122, pp. 43-70 (pdf, local PDF, bibtex)
Adam Osuský, Dávid Javorský, Ondřej Bojar (2024): InsBERT: Word importance from artificial insertions. In: Proceedings of the 24th Conference Information Technologies – Applications and Theory (ITAT 2024), pp. 96-106, CEUR-WS.org, Košice, Slovakia (pdf, bibtex)
Shantipriya Parida, Ondřej Bojar, Idris Abdulmumin, Shamsuddeen Hassan Muhammad, Ibrahim Sa'id Ahmad (2024): Findings of WMT2024 English-to-Low Resource Multimodal Translation Task. In: Proceedings of the Ninth Conference on Machine Translation, pp. 677-683, Association for Computational Linguistics, Kerrville, TX, USA, ISBN 979-8-89176-179-7 (url, bibtex)
Matthias Sperber, Ondřej Bojar, Barry Haddow, Dávid Javorský, Xutai Ma, Matteo Negri, Jan Niehues, Peter Polák, Elizabeth Salesky, Katsuhito Sudoh, Marco Turchi (2024): Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation. In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 6484-6495, European Language Resources Association, Torino, Italy, ISBN 978-2-493814-10-4 (url, bibtex)
Hening Wang, Leixin Zhang, Ondřej Bojar (2024): Human and Machine: Language Processing in Translation Tasks. In: Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024), pp. 243-250, Association for Computational Linguistics, Online (url, bibtex)
Uladzislau Yorsh, Martin Holeňa, Ondřej Bojar, David Herel (2024): On Difficulties of Attention Factorization through Shared Memory. In: The Second Tiny Papers Track at ICLR 2024, pp. 1-8, OpenReview.net (bibtex)
Patrik Zavoral, Dušan Variš, Ondřej Bojar (2024): Adversarial Testing as a Tool for Interpretability: Length-based Overfitting of Elementary Functions in Transformers (Electronic). In: ArXiv.org Computing Research Repository, ISSN 2331-8422, pp. 1-9 (url)
Leixin Zhang, David Burian, Vojtěch John, Ondřej Bojar (2024): Unveiling Semantic Information in Sentence Embeddings. In: Proceedings of the Fifth International Workshop on Designing Meaning Representations (DMR 2024) @ LREC-COLING 2024, pp. 39-47, ELRA Language Resources Association, ISBN 978-2-493814-39-5 (url, bibtex)
Vilém Zouhar, Ondřej Bojar (2024): Quality and Quantity of Machine Translation References for Automatic Metrics. In: Fourth Workshop on Human Evaluation of NLP Systems (HumEval) @ LREC-COLING 2024, pp. 1-11, ELRA, Paris, France, ISBN 978-2-493814-41-8 (url, bibtex)
Vilém Zouhar, Věra Kloudová, Martin Popel, Ondřej Bojar (2024): Evaluating Optimal Reference Translations. In: Natural Language Processing, ISSN 2977-0424, 2024, pp. 1-24 (url, bibtex)
Milind Agarwal, Sweta Agrawal, Antonios Anastasopoulos, Luisa Bentivogli, Ondřej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, Mingda Chen, William Chen, Khalid Choukri, Alexandra Chronopoulou, Thierry Declerck, Qianqian Dong, Kevin Duh, Yannick Estève, Marcello Federico, Souhir Gahbiche, Barry Haddow, Benjamin Hsu, Phu Mon Htut, Hirofumi Inaguma, Dávid Javorský, John Judge, Yasumasa Kano, Tom Ko, Rishu Kumar, Pengwei Li, Xutai Ma, Prashant Mathur, Evgeny Matusov, Paul McNamee, John McCrae, Kenton Murray, Maria Nadejde, Satoshi Nakamura, Matteo Negri, Ha Nguyen, Jan Niehues, Xing Niu, Atul Kr. Ojha, John Ortega, Proyag Pal, Juan Pino, Lonneke van der Plas, Peter Polák, Elijah Rippeth, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Yun Tang, Brian Thompson, Kevin Tran, Marco Turchi, Alex Waibel, Mingxuan Wang, Shinji Watanabe, Rodolfo Zevallos (2023): FINDINGS OF THE IWSLT 2023 EVALUATION CAMPAIGN. In: Proceedings of the 20th International Conference on Spoken Language Translation, pp. 1-61, Association for Computational Linguistics, Stroudsburg, USA, ISBN 978-1-959429-84-5 (url, bibtex)
Sunit Bhattacharya, Ondřej Bojar (2023): Unveiling Multilinguality in Transformer Models: Exploring Language Specificity in Feed-Forward Networks (Electronic). In: ArXiv.org Computing Research Repository, ISSN 2331-8422, pp. 120-126 (pdf)
Tirthankar Ghosal, Ondřej Bojar, Marie Hledíková, Tom Kocmi, Anna Nedoluzhko (2023): Overview of the Second Shared Task on Automatic Minuting (AutoMin) at INLG 2023. In: Proceedings of the 16th International Natural Language Generation Conference, pp. 138-167, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 979-8-89176-001-1 (bibtex)
Dávid Javorský, Ondřej Bojar, François Yvon (2023): Assessing Word Importance Using Models Trained for Semantic Tasks. In: Findings of the Association for Computational Linguistics: ACL 2023, pp. 8846-8856, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-959429-62-3 (pdf, bibtex)
Josef Jon, Ondřej Bojar (2023): Character-level NMT and language similarity. In: Proceedings of Machine Translation Summit XIX vol. 1: Research Track, pp. 360-371, Asia-Pacific Association for Machine Translation (AAMT), Kyoto, Japan, ISBN 978-4-9913461-0-1 (pdf, bibtex)
Josef Jon, Ondřej Bojar (2023): Breeding Machine Translations: Evolutionary approach to survive and thrive in the world of automated evaluation. In: Proceedings of 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2191-2212, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-959429-72-2 (url, bibtex)
Josef Jon, Martin Popel, Ondřej Bojar (2023): CUNI at WMT23 General Translation Task: MT and a Genetic Algorithm. In: Proceedings of the Eighth Conference on Machine Translation, pp. 119-127, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 979-8-89176-041-7 (pdf, bibtex)
Josef Jon, Dušan Variš, Michal Novák, Joao Paulo Aires, Ondřej Bojar (2023): Negative Lexical Constraints in Neural Machine Translation. In: Proceedings of Machine Translation Summit XIX vol. 1: Research Track, pp. 372-384, Asia-Pacific Association for Machine Translation (AAMT), Kyoto, Japan, ISBN 978-4-9913461-0-1 (pdf, bibtex)
Věra Kloudová, David Mraček, Ondřej Bojar, Martin Popel (2023): Možnosti a meze tvorby tzv. optimálních referenčních překladů: po stopách „překladatelštiny“ v profesionálních překladech zpravodajských textů. In: Slovo a slovesnost, ISSN 0037-7031, vol. 84, no. 2, pp. 122-156 (url, bibtex)
František Kmječ, Ondřej Bojar (2023): Team Iterate @ AutoMin 2023 - Experiments with Iterative Minuting. In: Proceedings of the 16th International Natural Language Generation Conference: Generation Challenges, pp. 114-120, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 979-8-89176-003-5 (url, bibtex)
Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Anton Dvorkovich, Christian Federmann, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Philipp Koehn, Benjamin Marie, Christof Monz, Makoto Morishita, Kenton Murray, Makoto Nagata, Toshiaki Nakazawa, Martin Popel, Maja Popović, Mariya Shmatova (2023): Findings of the 2023 Conference on Machine Translation (WMT23): LLMs Are Here but Not Quite There Yet. In: Proceedings of the Eighth Conference on Machine Translation, pp. 1-42, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 979-8-89176-041-7 (url, bibtex)
Ivana Kvapilíková, Ondřej Bojar (2023): Low-Resource Machine Translation Systems for Indic Languages. In: Proceedings of the Eighth Conference on Machine Translation, pp. 954-958, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 979-8-89176-041-7 (bibtex)
Ivana Kvapilíková, Ondřej Bojar (2023): Boosting Unsupervised Machine Translation with Pseudo-Parallel Data. In: Proceedings of Machine Translation Summit XIX vol. 1: Research Track, pp. 135-147, Asia-Pacific Association for Machine Translation (AAMT), Kyoto, Japan, ISBN 978-4-9913461-0-1 (bibtex)
Dominik Macháček, Ondřej Bojar, Raj Dabre (2023): MT Metrics Correlate with Human Ratings of Simultaneous Speech Translation. In: Proceedings of the 20th International Conference on Spoken Language Translation, pp. 169-179, Association for Computational Linguistics, Stroudsburg, USA, ISBN 978-1-959429-84-5 (pdf, local PDF, bibtex)
Dominik Macháček, Raj Dabre, Ondřej Bojar (2023): Turning Whisper into Real-Time Transcription System. In: Proceedings of the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 13th International Joint Conference on Natural Language Processing: System Demonstrations, pp. 17-24, Asian Federation of Natural Language Processing, Bali, Indonesia (pdf, bibtex)
Dominik Macháček, Peter Polák, Ondřej Bojar, Raj Dabre (2023): Robustness of Multi-Source MT to Transcription Errors. In: Findings of the Association for Computational Linguistics: ACL 2023, pp. 3707-3723, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-959429-62-3 (pdf, bibtex)
Toshiaki Nakazawa, Kazutaka Kinugawa, Hideya Mino, Isao Goto, Raj Dabre, Shohei Higashiyama, Shantipriya Parida, Makoto Morishita, Ondřej Bojar, Akiko Eriguchi, Yusuke Oda, Chenhui Chu, Sadao Kurohashi (2023): Overview of the 10th Workshop on Asian Translation. In: Proceedings of the 10th Workshop on Asian Translation, pp. 1-28, International Conference on Computational Linguistics, Macau, China (bibtex)
Kristýna Neumannová, Ondřej Bojar (2023): The Role of Compounds in Human vs. Machine Translation Quality. In: Proceedings of Machine Translation Summit XIX vol. 1: Research Track, pp. 248-260, Asia-Pacific Association for Machine Translation (AAMT), Kyoto, Japan, ISBN 978-4-9913461-0-1 (pdf, bibtex)
Shantipriya Parida, Ondřej Bojar (2023): HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language. In: Findings of the Association for Computational Linguistics: ACL 2023, pp. 10162-10183, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-959429-62-3 (bibtex)
Andrej Perković, Jernej Vičič, Dávid Javorský, Ondřej Bojar (2023): Shortening of the results of machine translation using paraphrasing dataset. In: Proceedings of the 23rd Conference Information Technologies – Applications and Theory (ITAT 2023), pp. 121-130, 23rd Conference on Information Technologies – Applications and Theory, Košice, Slovakia (pdf, bibtex)
Peter Polák, Danni Liu, Ngoc-Quan Pham, Jan Niehues, Alex Waibel, Ondřej Bojar (2023): Towards Efficient Simultaneous Speech Translation: CUNI-KIT System for Simultaneous Track at IWSLT 2023. In: Proceedings of the 20th International Conference on Spoken Language Translation, pp. 389-396, Association for Computational Linguistics, Stroudsburg, USA, ISBN 978-1-959429-84-5 (url, bibtex)
Peter Polák, Brian Yan, Shinji Watanabe, Alex Waibel, Ondřej Bojar (2023): Incremental Blockwise Beam Search for Simultaneous Speech Translation with Controllable Quality-Latency Tradeoff. In: Proceedings of the 24th Annual Conference of the International Speech Communication Association, pp. 3979-3983, International Speech Communication Association, Baixas, France (url, bibtex)
František Trebuňa, Kristína Szabová, Ondřej Bojar (2023): Searching for Reasons of Transformers’ Success: Memorization vs Generalization. In: 26th International Conference, TSD 2023, pp. 25-32, Springer, Cham, Switzerland, ISBN 978-3-031-40497-9 (url, bibtex)
Iryna Tryhubyshyn, Aleš Tamchyna, Ondřej Bojar (2023): Bad MT Systems are Good for Quality Estimation. In: Proceedings of Machine Translation Summit XIX vol. 1: Research Track, pp. 200-208, Asia-Pacific Association for Machine Translation (AAMT), Kyoto, Japan, ISBN 978-4-9913461-0-1 (url, bibtex)
Idris Abdulmumin, Satya Ranjan Dash, Musa Abdullahi Dawud, Shantipriya Parida, Shamsuddeen Hassan Muhammad, Ibrahim Sa'id Ahmad, Subhadarshi Panda, Ondřej Bojar, Bashir Shehu Galadanci, Bello Shehu Bello (2022): Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation. In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), pp. 6471-6479, European Language Resources Association, Marseille, France, ISBN 979-10-95546-72-6 (url, local PDF, bibtex)
Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Věra Kloudová, Surafel Melaku Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nadejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Yun Tang, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alex Waibel, Changhan Wang, Shinji Watanabe (2022): FINDINGS OF THE IWSLT 2022 EVALUATION CAMPAIGN. In: Proceedings of the 19th International Conference on Spoken Language Translation, pp. 98-157, Association for Computational Linguistics, Stroudsburg, USA, ISBN 978-1-955917-41-4 (url, local PDF, bibtex)
Niyati Bafna, Martin Vastl, Ondřej Bojar (2022): Constrained Decoding for Technical Term Retention in English-Hindi MT. In: Proceedings of ICON 2021: 18th International Conference on Natural Language Processing, pp. 1-6, NLP Association India, Centre for Natural Language Processing, Department of Computer Science and Engineering, Silchar, India (local PDF, bibtex)
Rachel Bawden, Ondřej Bojar, Rajen Chatterjee, Anton Dvorkovich, Christian Federmann, Mark Fishel, Markus Freitag, Thamme Gowda, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Matthias Huck, Rebecca Knowles, Tom Kocmi, Philipp Koehn, Christof Monz, Makoto Morishita, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri, Michal Novák, Martin Popel, Maja Popović, Mariya Shmatova, Marco Turchi (2022): Findings of the 2022 Conference on Machine Translation (WMT22). In: Proceedings of the Seventh Conference on Machine Translation, pp. 1-34, Association for Computational Linguistics, Stroudsburg, PA, USA (pdf, local PDF, bibtex)
Sunit Bhattacharya, Rishu Kumar, Ondřej Bojar (2022): Team ÚFAL at CMCL 2022 Shared Task: Figuring out the correct recipe for predicting Eye-Tracking features using Pretrained Language Model. In: Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, pp. 130-135, Association for Computational Linguistics, Stroudsburg, PA, USA (local PDF, bibtex)
Sunit Bhattacharya, Vilém Zouhar, Ondřej Bojar (2022): Sentence Ambiguity, Grammaticality and Complexity Probes. In: Proceedings of the 5th Workshop on Analyzing and Interpreting Neural Networks for NLP, pp. 1-11, Association for Computational Linguistics, Stroudsburg, PA, USA (pdf, local PDF, bibtex)
Lukáš Burget, Ondřej Bojar (2022): Průběžná zpráva NEUREM3 (technical report). (pdf, bibtex)
Satya Ranjan Dash, Shantipriya Parida, Esau Villatoro Tello, Biswaranjan Acharya, Ondřej Bojar (2022): Natural Language Processing In Healthcare, A Special Focus on Low Resource Languages. ISBN 9780367685393 (bibtex)
Muskan Garg, Seema Wazarkar, Muskaan Singh, Ondřej Bojar (2022): Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers. In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), pp. 6837-6847, European Language Resources Association, Marseille, France, ISBN 979-10-95546-72-6 (url, local PDF, bibtex)
Christian Huber, Rishu Kumar, Ondřej Bojar, Alex Waibel (2022): Short-Term Word-Learning in a Dynamically Changing Environment (Electronic). In: ArXiv.org Computing Research Repository, ISSN 2331-8422, pp. 1-4 (url)
Dávid Javorský, Dominik Macháček, Ondřej Bojar (2022): Continuous Rating as Reliable Human Evaluation of Simultaneous Speech Translation. In: Proceedings of the Seventh Conference on Machine Translation, pp. 154-164, Association for Computational Linguistics, Stroudsburg, PA, USA (pdf, local PDF, bibtex)
Josef Jon, Martin Popel, Ondřej Bojar (2022): CUNI-Bergamot Submission at WMT22 General Task. In: Proceedings of the Seventh Conference on Machine Translation, pp. 280-289, Association for Computational Linguistics, Stroudsburg, PA, USA (pdf, local PDF, bibtex)
Nalin Kumar, Ondřej Bojar (2022): Genre Transfer in NMT: Creating Synthetic Spoken Parallel Sentences using Written Parallel Data. In: 19th International Conference on Natural Language Processing, pp. 224-233, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-959429-38-8 (url, local PDF, bibtex)
Ivana Kvapilíková, Ondřej Bojar (2022): CUNI Submission to MT4All Shared Task. In: Proceedings of the LREC 2022 Workshop of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages (SIGUL 2022), pp. 78-82, European Language Resources Association (ELRA), Paris, France, ISBN 979-10-95546-91-7 (bibtex)
Toshiaki Nakazawa, Hideya Mino, Isao Goto, Raj Dabre, Shohei Higashiyama, Shantipriya Parida, Anoop Kunchukuttan, Makoto Morishita, Ondřej Bojar, Chenhui Chu, Kaori Abe, Yusuke Oda, Sadao Kurohashi (2022): Overview of the 9th Workshop on Asian Translation. In: Proceedings of the 9th Workshop on Asian Translation, pp. 1-36, International Conference on Computational Linguistics, Gyeongju, Korea (url, bibtex)
Anna Nedoluzhko, Muskaan Singh, Marie Hledíková, Tirthankar Ghosal, Ondřej Bojar (2022): ELITR Minuting Corpus: A Novel Dataset for Automatic Minuting from Multi-Party Meetings in English and Czech. In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), pp. 3174-3182, European Language Resources Association, Marseille, France, ISBN 979-10-95546-72-6 (pdf, local PDF, bibtex)
Peter Polák, Ngoc-Quan Pham, Tuan-Nam Nguyen, Danni Liu, Carlos Mullov, Jan Niehues, Ondřej Bojar, Alex Waibel (2022): CUNI-KIT System for Simultaneous Speech Translation Task at IWSLT 2022. In: Proceedings of the 19th International Conference on Spoken Language Translation, pp. 277-285, Association for Computational Linguistics, Stroudsburg, USA, ISBN 978-1-955917-41-4 (url, local PDF, bibtex)
Peter Polák, Muskaan Singh, Anna Nedoluzhko, Ondřej Bojar (2022): ALIGNMEET: A Comprehensive Tool for Meeting Annotation, Alignment, and Evaluation. In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), pp. 1771-1779, European Language Resources Association, Marseille, France, ISBN 979-10-95546-72-6 (pdf, local PDF, bibtex)
Borek Požár, Klára Tauchmanová, Kristýna Neumannová, Ivana Kvapilíková, Ondřej Bojar (2022): CUNI Submission to the BUCC 2022 Shared Task on Bilingual Term Alignment. In: Proceedings of the LREC 2022 15th Workshop on Building and Using Comparable Corpora, pp. 43-49, European Language Resources Association, Paris, France, ISBN 979-10-95546-94-8 (local PDF, bibtex)
Kirill Semenov, Ondřej Bojar (2022): Automated Evaluation Metric for Terminology Consistency in MT. In: Proceedings of the Seventh Conference on Machine Translation, pp. 1-6, Association for Computational Linguistics, Stroudsburg, PA, USA (bibtex)
Sukanta Sen, Ondřej Bojar, Barry Haddow (2022): Simultaneous Translation for Unsegmented Input: A Sliding Window Approach (Electronic). In: ArXiv.org Computing Research Repository, ISSN 2331-8422, pp. 1-8 (url)
Kartik Shinde, Tirthankar Ghosal, Ondřej Bojar (2022): Automatic minuting: A pipeline method for generating minutes from multi-party meeting proceedings. In: Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation, pp. 1-12, ACL, Stroudsburg PA 18360, USA (url, local PDF, bibtex)
Patrícia Schmidtová, Rudolf Rosa, David Košťák, Tomáš Studeník, Daniel Hrbek, Tomáš Musil, Josef Doležal, Ondřej Dušek, David Mareček, Klára Vosecká, Marie Nováková, Petr Žabka, Alisa Zakhtarenko, Dominik Jurko, Martina Kinská, Tom Kocmi, Ondřej Bojar (2022): THEaiTRE: Generating Theatre Play Scripts using Artificial Intelligence. ISBN 978-80-88132-14-1 (url, bibtex)
Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussà, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher M. Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri (2021): Findings of the 2021 Conference on Machine Translation (WMT21). In: Proceedings of the Sixth Conference on Machine Translation, pp. 1-88, Association for Computational Linguistics, Online, ISBN 978-1-954085-94-7 (pdf, local PDF, bibtex)
Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alex Waibel, Changhan Wang, Matthew Wiesner (2021): FINDINGS OF THE IWSLT 2021 EVALUATION CAMPAIGN. In: Proceedings of the 18th International Conference on Spoken Language Translation, pp. 1-29, Association for Computational Linguistics, Stroudsburg, USA, ISBN 978-1-954085-74-9 (url, local PDF, bibtex)
Ebrahim Ansari, Ondřej Bojar, Barry Haddow, Mohammad Mahmoudi (2021): SLTev: Comprehensive Evaluation of Spoken Language Translation. In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, pp. 71-79, Association for Computational Linguistics (ACL), Stroudsburg, PA, USA, ISBN 978-1-954085-05-3 (url, local PDF, local PDF, bibtex)
Ondřej Bojar, Dominik Macháček, Sangeet Sagar, Otakar Smrž, Jonáš Kratochvíl, Peter Polák, Ebrahim Ansari, Mohammad Mahmoudi, Rishu Kumar, Dario Franceschini, Chiara Canton, Ivan Simonini, Thai-Son Nguyen, Felix Schneider, Sebastian Stüker, Alex Waibel, Barry Haddow, Rico Sennrich, Philip Williams (2021): ELITR Multilingual Live Subtitling: Demo and Strategy. In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, pp. 271-277, Association for Computational Linguistics (ACL), Stroudsburg, PA, USA, ISBN 978-1-954085-05-3 (bibtex)
Ondřej Bojar, Vojtěch Srdečný, Rishu Kumar, Otakar Smrž, Felix Schneider, Barry Haddow, Phil Williams, Chiara Canton (2021): Operating a Complex SLT System with Speakers and Human Interpreters. In: Proceedings of Machine Translation Summit XVIII 1st Workshop on Automatic Spoken Language Translation in Real-World Settings, pp. 23-34, Association for Machine Translation in the Americas, Stroudsburg, PA, USA (pdf, bibtex)
Markus Freitag, Ricardo Rei, Nitika Mathur, Chi-kiu Lo, Craig Stewart, George Foster, Alon Lavie, Ondřej Bojar (2021): Results of the WMT21 Metrics Shared Task: Evaluating Metrics with Expert-based Human Evaluations on TED and News Domain. In: Proceedings of the Sixth Conference on Machine Translation, pp. 733-774, Association for Computational Linguistics, Online, ISBN 978-1-954085-94-7 (url, local PDF, bibtex)
Petr Gebauer, Ondřej Bojar, Vojtěch Švandelík, Martin Popel (2021): CUNI Systems in WMT21: Revisiting Backtranslation Techniques for English-Czech NMT. In: Proceedings of the Sixth Conference on Machine Translation, pp. 123-129, Association for Computational Linguistics, Online, ISBN 978-1-954085-94-7 (url, local PDF, bibtex)
Tirthankar Ghosal, Muskaan Singh, Anna Nedoluzhko, Ondřej Bojar (2021): Report on the SIGDial 2021 Special Session on Summarization of Dialogues and Multi-Party Meetings (SummDial). In: ACM SIGIR Forum, ISSN 0163-5840, December 2021, pp. 1-17 (pdf, bibtex)
Michael Hanna, Ondřej Bojar (2021): A Fine-Grained Analysis of BERTScore. In: Proceedings of the Sixth Conference on Machine Translation, pp. 507-517, Association for Computational Linguistics, Online, ISBN 978-1-954085-94-7 (url, local PDF, bibtex)
Daniel Hrbek, 1.0 THEaiTRobot, Tomáš Studeník, David Košťák, Martina Kinská, Rudolf Rosa, Ondřej Dušek, Tom Kocmi, David Mareček, Tomáš Musil, Patrícia Schmidtová, Dominik Jurko, Ondřej Bojar, Klára Vosecká, Josef Doležal, Marie Nováková, Petr Žabka (2021): AI: Když robot píše hru (online premiéra divadelní hry) (Electronic). (url)
Josef Jon, João Paulo de Souza Aires, Dušan Variš, Ondřej Bojar (2021): End-to-End Lexically Constrained Machine Translation for Morphologically Rich Languages. In: Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, pp. 4019-4033, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-954085-52-7 (url, local PDF, bibtex)
Josef Jon, Michal Novák, João Paulo de Souza Aires, Dušan Variš, Ondřej Bojar (2021): CUNI systems for WMT21: Multilingual Low-Resource Translation for Indo-European Languages Shared Task. In: Proceedings of the Sixth Conference on Machine Translation, pp. 354-361, Association for Computational Linguistics, Online, ISBN 978-1-954085-94-7 (url, local PDF, bibtex)
Josef Jon, Michal Novák, João Paulo de Souza Aires, Dušan Variš, Ondřej Bojar (2021): CUNI systems for WMT21: Terminology translation Shared Task. In: Proceedings of the Sixth Conference on Machine Translation, pp. 828-834, Association for Computational Linguistics, Online, ISBN 978-1-954085-94-7 (url, local PDF, bibtex)
Věra Kloudová, Ondřej Bojar, Martin Popel (2021): Detecting Post-edited References and Their Effect on Human Evaluation. In: Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval), pp. 114-119, Association for Computational Linguistics, Stroudsburg, USA, ISBN 978-1-954085-10-7 (pdf, local PDF, bibtex)
Tom Kocmi, Dominik Macháček, Ondřej Bojar (2021): The Reality of Multi-Lingual Machine Translation. ISBN 978-80-88132-11-0 (pdf, local PDF, bibtex)
Matyáš Kopp, Vladislav Stankov, Jan Oldřich Krůza, Pavel Straňák, Ondřej Bojar (2021): ParCzech 3.0: A Large Czech Speech Corpus with Rich Metadata. In: 24th International Conference on Text, Speech and Dialogue, pp. 293-304, Springer, Cham, Switzerland, ISBN 978-3-030-83526-2 (pdf, local PDF, bibtex)
Ivana Kvapilíková, Ondřej Bojar (2021): Machine Translation of Covid-19 Information Resources via Multilingual Transfer. In: ITAT 2021 2nd Workshop on Automata, Formal and Natural Languages – WAFNL 2021, pp. 176-181, Faculty of Mathematics and Physics, Praha, Czechia (pdf, local PDF, bibtex)
Dominik Macháček, Matúš Žilinec, Ondřej Bojar (2021): Lost in Interpreting: Speech Translation from Source or Interpreter?. In: Proceedings of INTERSPEECH 2021, pp. 2376-2380, ISCA, Baixas, France (pdf, local PDF, bibtex)
Toshiaki Nakazawa, Hideki Nakayama, Chenchen Ding, Raj Dabre, Shohei Higashiyama, Hideya Mino, Isao Goto, Win Pa Pa, Anoop Kunchukuttan, Shantipriya Parida, Ondřej Bojar, Chenhui Chu, Akiko Eriguchi, Kaori Abe, Yusuke Oda, Sadao Kurohashi (2021): Overview of the 8th Workshop on Asian Translation. In: Proceedings of the 8th Workshop on Asian Translation, pp. 1-45, Association for Computational Linguistics, Stroudsburg, USA (url, local PDF, bibtex)
Shantipriya Parida, Subhadarshi Panda, Ketan Kotwal, Amulya Ratna Dash, Satya Ranjan Dash, Yashvardhan Sharma, Petr Motlíček, Ondřej Bojar (2021): NLPHut’s Participation at WAT2021. In: Proceedings of the 8th Workshop on Asian Translation, pp. 146-154, Association for Computational Linguistics, Stroudsburg, USA (pdf, bibtex)
Peter Polák, Ondřej Bojar (2021): Coarse-To-Fine And Cross-Lingual ASR Transfer. In: ITAT 2021 2nd Workshop on Automata, Formal and Natural Languages – WAFNL 2021, pp. 154-160, Faculty of Mathematics and Physics, Praha, Czechia (pdf, local PDF, bibtex)
Peter Polák, Muskaan Singh, Ondřej Bojar (2021): Explainable Quality Estimation: CUNI Eval4NLP Submission. In: Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems, pp. 250-255, Association for Computational Linguistics, Stroudsburg, PA, USA (pdf, local PDF, bibtex)
Rudolf Rosa, Tomáš Musil, Ondřej Dušek, Dominik Jurko, Patrícia Schmidtová, David Mareček, Ondřej Bojar, Tom Kocmi, Daniel Hrbek, David Košťák, Martina Kinská, Marie Nováková, Josef Doležal, Klára Vosecká, Tomáš Studeník, Petr Žabka (2021): When a Robot Writes a Play: Automatically Generating a Theatre Play Script. In: Proceedings of the ALIFE 2021: The 2021 Conference on Artificial Life, pp. 565-567, MIT Press, Cambridge, MA, USA (url, local PDF, local PDF, local ZIP, bibtex)
Rudolf Rosa, Tomáš Musil, Ondřej Dušek, Dominik Jurko, Patrícia Schmidtová, David Mareček, Ondřej Bojar, Tom Kocmi, Daniel Hrbek, David Košťák, Martina Kinská, Marie Nováková, Josef Doležal, Klára Vosecká, Tomáš Studeník, Petr Žabka (2021): THEaiTRE 1.0: Interactive Generation of Theatre Play Scripts. In: Proceedings of the Text2Story’21 Workshop, pp. 71-76, RWTH Aachen University, Aachen, Germany (pdf, local PDF, local PDF, local ZIP, bibtex)
Arghyadeep Sen, Shantipriya Parida, Ketan Kotwal, Subhadarshi Panda, Ondřej Bojar, Satya Ranjan Dash (2021): Bengali Visual Genome: A Multimodal Dataset for Machine Translation and Image Captioning. In: 9th International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA 2021), pp. 63-70, Springer Nature Singapore, Singapore, ISBN 978-981-16-6624-7 (local PDF, bibtex)
Muskaan Singh, Tirthankar Ghosal, Ondřej Bojar (2021): An Empirical Performance Analysis of State-of-the-Art Summarization Models for Automatic Minuting. In: Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation, pp. 50-60, ACL, 209 N. Eighth Street, Stroudsburg PA 18360, USA (url, bibtex)
Dušan Variš, Ondřej Bojar (2021): Sequence Length is a Domain: Length-based Overfitting in Transformer Models. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 8246-8257, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-955917-09-4 (pdf, local PDF, local PDF, local PDF, bibtex)
Vilém Zouhar, Michal Novák, Matúš Žilinec, Ondřej Bojar, Mateo Obregón, Robin L. Hill, Frédéric Blain, Marina Fomicheva, Lucia Specia, Lisa Yankovskaya (2021): Backtranslation Feedback Improves User Confidence in MT, Not Quality. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 151-161, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-954085-46-6 (url, local PDF, bibtex)
Vilém Zouhar, Aleš Tamchyna, Martin Popel, Ondřej Bojar (2021): Neural Machine Translation Quality and Post-Editing Performance. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 10204-10214, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-955917-09-4 (pdf, local PDF, bibtex)
Ebrahim Ansari, Amittai Axelrod, Nguyen Bach, Ondřej Bojar, Roldano Cattoni, Fahim Dalvi, Nadir Durrani, Marcello Federico, Christian Federmann, Jiatao Gu, Fei Huang, Kevin Knight, Xutai Ma, Ajay Nagesh, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Xing Shi, Sebastian Stüker, Marco Turchi, Alex Waibel, Changhan Wang (2020): FINDINGS OF THE IWSLT 2020 EVALUATION CAMPAIGN. In: Proceedings of the 17th International Conference on Spoken Language Translation, pp. 1-34, Association for Computational Linguistics, Online, ISBN 978-1-952148-07-1 (pdf, local PDF, bibtex)
Petra Barančíková, Ondřej Bojar (2020): Costra 1.1: An Inquiry into Geometric Properties of Sentence Spaces. In: 23rd International Conference on Text, Speech and Dialogue, pp. 135-143, Springer, Cham, Switzerland, ISBN 978-3-030-58322-4 (local PDF, bibtex)
Petra Barančíková, Ondřej Bojar (2020): COSTRA 1.0: A Dataset of Complex Sentence Transformations. In: Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), pp. 3535-3541, European Language Resources Association, Marseille, France, ISBN 979-10-95546-34-4 (url, local PDF, bibtex)
Loïc Barrault, Magdalena Biesialska, Ondřej Bojar, Marta R. Costa-Jussà, Christian Federmann, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Matthias Huck, Eric Joanis, Tom Kocmi, Philipp Koehn, Chi-kiu Lo, Nikola Ljubešić, Christof Monz, Makoto Morishita, Masaaki Nagata, Toshiaki Nakazawa, Santanu Pal, Matt Post, Marcos Zampieri (2020): Findings of the 2020 Conference on Machine Translation (WMT20). In: Fifth Conference on Machine Translation - Proceedings of the Conference, pp. 1-55, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-81-0 (pdf, local PDF, bibtex)
Ondřej Bojar, Dominik Macháček, Sangeet Sagar, Otakar Smrž, Jonáš Kratochvíl, Ebrahim Ansari, Dario Franceschini, Chiara Canton, Ivan Simonini, Thai-Son Nguyen, Felix Schneider, Sebastian Stüker, Alex Waibel, Barry Haddow, Rico Sennrich, Philip Williams (2020): ELITR: European Live Translator. In: Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (2020), pp. 463-464, European Association for Machine Translation, Lisboa, Portugal, ISBN 978-989-33-0589-8 (url, bibtex)
Erion Çano, Ondřej Bojar (2020): How Many Pages? Paper Length Prediction from the Metadata. In: 4th International Conference on Natural Language Processing and Information Retrieval, pp. 91-95, ACM, New York, USA, ISBN 978-1-4503-7760-7 (url, local PDF, bibtex)
Erion Çano, Ondřej Bojar (2020): Human or Machine: Automating Human Likeliness Evaluation of NLG Texts (Electronic). In: ArXiv.org Computing Research Repository, ISSN 2331-8422 (url)
Erion Çano, Ondřej Bojar (2020): Automating Text Naturalness Evaluation of NLG Systems (Electronic). In: ArXiv.org Computing Research Repository, ISSN 2331-8422 (url)
Erion Çano, Ondřej Bojar (2020): Two Huge Title and Keyword Generation Corpora of Research Articles. In: Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), pp. 6663-6671, European Language Resources Association, Marseille, France, ISBN 979-10-95546-34-4 (url, local PDF, bibtex)
Dario Franceschini, Chiara Canton, Ivan Simonini, Armin Schweinfurth, Adelheid Glott, Sebastian Stüker, Thai-Son Nguyen, Felix Schneider, Thanh-Le Ha, Alex Waibel, Barry Haddow, Phil Williams, Rico Sennrich, Ondřej Bojar, Sangeet Sagar, Dominik Macháček, Otakar Smrž (2020): Removing European Language Barriers with Innovative Machine Translation Technology. In: Proceedings of the 1st International Workshop on Language Technology Platforms, pp. 44-49, ELRA, Paris, France, ISBN 979-10-95546-64-1 (url, local PDF, bibtex)
Tom Kocmi, Ondřej Bojar (2020): Efficiently Reusing Old Models Across Languages via Transfer Learning. In: Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (2020), pp. 1-10, European Association for Machine Translation, Lisboa, Portugal, ISBN 978-989-33-0589-8 (bibtex)
Tom Kocmi, Martin Popel, Ondřej Bojar (2020): Announcing CzEng 2.0 Parallel Corpus with over 2 Gigawords (technical report). In: , pp. 1-6 (pdf, bibtex)
Jonáš Kratochvíl, Peter Polák, Ondřej Bojar (2020): Large Corpus of Czech Parliament Plenary Hearings. In: Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), pp. 6363-6367, European Language Resources Association, Marseille, France, ISBN 979-10-95546-34-4 (url, local PDF, bibtex)
Ivana Kvapilíková, Mikel Artetxe, Gorka Labaka, Eneko Agirre, Ondřej Bojar (2020): Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pp. 255-262, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-952148-03-3 (url, local PDF, bibtex)
Ivana Kvapilíková, Tom Kocmi, Ondřej Bojar (2020): CUNI Systems for the Unsupervised and Very Low Resource Translation Task in WMT20. In: Fifth Conference on Machine Translation - Proceedings of the Conference, pp. 1123-1128, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-81-0 (pdf, local PDF, bibtex)
Dominik Macháček, Ondřej Bojar (2020): Presenting Simultaneous Translation in Limited Space. In: Proceedings of the 20th Conference Information Technologies - Applications and Theory (ITAT 2020), pp. 32-37, Tomáš Horváth, Košice, Slovakia (pdf, bibtex)
Dominik Macháček, Jonáš Kratochvíl, Sangeet Sagar, Matúš Žilinec, Ondřej Bojar, Thai-Son Nguyen, Felix Schneider, Philip Williams, Yuekun Yao (2020): ELITR Non-Native Speech Translation at IWSLT 2020. In: Proceedings of the 17th International Conference on Spoken Language Translation, pp. 200-208, Association for Computational Linguistics, Online, ISBN 978-1-952148-07-1 (pdf, local PDF, bibtex)
Nitika Mathur, Johnny Tian-Zheng Wei, Markus Freitag, Qingsong Ma, Ondřej Bojar (2020): Results of the WMT20 Metrics Shared Task. In: Fifth Conference on Machine Translation - Proceedings of the Conference, pp. 688-725, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-81-0 (pdf, local PDF, bibtex)
Toshiaki Nakazawa, Hideki Nakayama, Chenchen Ding, Raj Dabre, Shohei Higashiyama, Hideya Mino, Isao Goto, Win Pa Pa, Anoop Kunchukuttan, Shantipriya Parida, Ondřej Bojar, Sadao Kurohashi (2020): Overview of the 7th Workshop on Asian Translation. In: Proceedings of the 7th Workshop on Asian Translation (WAT2020), pp. 1-44, Association for Computational Linguistics, Stroudsburg, USA (url, local PDF, bibtex)
Shantipriya Parida, Satya Ranjan Dash, Ondřej Bojar, Petr Motlíček, Priyanka Pattnaik, Debasish Kumar Mallick (2020): OdiEnCorp 2.0: Odia-English Parallel Corpus for Machine Translation. In: Proceedings of the LREC 2020 WILDRE5 – 5th Workshop on Indian Language Data: Resources and Evaluation, pp. 14-19, European Language Resources Association, Paris, France, ISBN 979-10-95546-67-2 (local PDF, bibtex)
Shantipriya Parida, Petr Motlíček, Amulya Ratna Dash, Satya Ranjan Dash, Debasish Kumar Mallick, Satya Prakash Biswal, Priyanka Pattnaik, Biranchi Narayan Nayak, Ondřej Bojar (2020): ODIANLP’s Participation in WAT2020. In: Proceedings of the 7th Workshop on Asian Translation (WAT2020), pp. 103-108, Association for Computational Linguistics, Stroudsburg, USA (url, local PDF, bibtex)
Peter Polák, Sangeet Sagar, Dominik Macháček, Ondřej Bojar (2020): CUNI Neural ASR with Phoneme-Level Intermediate Step for Non-Native SLT at IWSLT 2020. In: Proceedings of the 17th International Conference on Spoken Language Translation, pp. 191-199, Association for Computational Linguistics, Online, ISBN 978-1-952148-07-1 (url, local PDF, bibtex)
Martin Popel, Marketa Tomkova, Jakub Tomek, Łukasz Kaiser, Jakob Uszkoreit, Ondřej Bojar, Zdeněk Žabokrtský (2020): Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals. In: Nature Communications, ISSN 2041-1723, vol. 11, no. 4381, pp. 1-15 (url, local PDF, bibtex)
Rudolf Rosa, Ondřej Dušek, Tom Kocmi, David Mareček, Tomáš Musil, Patrícia Schmidtová, Dominik Jurko, Ondřej Bojar, Daniel Hrbek, David Košťák, Martina Kinská, Josef Doležal, Klára Vosecká (2020): THEaiTRE: Artificial Intelligence to Write a Theatre Play. In: Proceedings of AI4Narratives — Workshop on Artificial Intelligence for Narratives, pp. 9-13, RWTH Aachen University, Aachen, Germany (pdf, local PDF, local PDF, local ZIP, bibtex)
Vilém Zouhar, Ondřej Bojar (2020): Outbound Translation User Interface Ptakopet: A Pilot Study. In: Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), pp. 6967-6975, European Language Resources Association, Marseille, France, ISBN 979-10-95546-34-4 (url, local PDF, bibtex)
Vilém Zouhar, Tereza Vojtěchová, Ondřej Bojar (2020): WMT20 Document-Level Markable Error Exploration. In: Fifth Conference on Machine Translation - Proceedings of the Conference, pp. 371-380, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-81-0 (url, local PDF, bibtex)
Petra Barančíková, Ondřej Bojar (2019): In Search for Linear Relations in Sentence Embedding Spaces. In: Proceedings of the 19th Conference ITAT 2019: Slovenskočeský NLP workshop (SloNLP 2019), pp. 125-132, CreateSpace Independent Publishing Platform, Košice, Slovakia (local PDF, bibtex)
Loïc Barrault, Ondřej Bojar, Marta R. Costa-Jussà, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Philipp Koehn, Shervin Malmasi, Christof Monz, Mathias Müller, Santanu Pal, Matt Post, Marcos Zampieri (2019): Findings of the 2019 Conference on Machine Translation (WMT19). In: Fourth Conference on Machine Translation - Proceedings of the Conference, pp. 1-61, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-27-7 (url, bibtex)
Ondřej Bojar, Raffaella Bernardi, Bonnie L. Webber (2019): Representation of sentence meaning (A JNLE Special Issue). In: Natural Language Engineering, ISSN 1351-3249, vol. 25, no. 4, pp. 427-432 (pdf, local PDF, bibtex)
Erion Çano, Ondřej Bojar (2019): Keyphrase Generation: A Multi-Aspect Survey. In: Proceedings of the 25th Conference of Open Innovations Association FRUCT 2019, pp. 85-94, Finnish-Russian University Cooperation in Telecommunications, Helsinki, Finland, ISBN 978-952-69244-0-3 (pdf, bibtex)
Erion Çano, Ondřej Bojar (2019): Keyphrase Generation: A Text Summarization Struggle. In: The 17th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 666-672, NAACL-HLT 2019, Minneapolis, MN, USA, ISBN 978-1-950737-13-0 (url, bibtex)
Erion Çano, Ondřej Bojar (2019): Sentiment Analysis of Czech Texts: An Algorithmic Survey. In: Proceedings of the 11th International Conference on Agents and Artificial Intelligence, pp. 973-979, SCITEPRESS Digital Library, Setúbal, Portugal, ISBN 978-989-758-350-6 (url, bibtex)
Erion Çano, Ondřej Bojar (2019): Efficiency Metrics for Data-Driven Models: A Text Summarization Case Study. In: Proceedings of the 12th International Conference on Natural Language Generation (INLG 2019), pp. 229-239, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-94-9 (url, bibtex)
Tom Kocmi, Ondřej Bojar (2019): CUNI Submission for Low-Resource Languages in WMT News 2019. In: Fourth Conference on Machine Translation - Proceedings of the Conference, pp. 234-240, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-27-7 (pdf, bibtex)
Daniel Kondratyuk, Ronald Cardenas, Ondřej Bojar (2019): Replacing Linguists with Dummies: A Serious Need for Trivial Baselines in Multi-Task Neural Machine Translation. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 113, pp. 31-40 (pdf, bibtex)
Ivana Kvapilíková, Dominik Macháček, Ondřej Bojar (2019): CUNI Systems for the Unsupervised News Translation Task in WMT 2019. In: Fourth Conference on Machine Translation - Proceedings of the Conference, pp. 241-248, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-27-7 (pdf, bibtex)
Dominik Macháček, Jonáš Kratochvíl, Tereza Vojtěchová, Ondřej Bojar (2019): A Speech Test Set of Practice Business Presentations with Additional Relevant Texts. In: Statistical Language and Speech Processing, pp. 151-161, Springer Nature Switzerland AG, Cham, Switzerland, ISBN 978-3-030-31371-5 (url, bibtex)
Qingsong Ma, Johnny Tian-Zheng Wei, Ondřej Bojar, Yvette Graham (2019): Results of the WMT19 Metrics Shared Task: Segment-Level and Strong MT Systems Pose Big Challenges. In: Fourth Conference on Machine Translation - Proceedings of the Conference, pp. 62-90, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-27-7 (url, bibtex)
Toshiaki Nakazawa, Nobushige Doi, Shohei Higashiyama, Chenchen Ding, Raj Dabre, Hideya Mino, Isao Goto, Win Pa Pa, Anoop Kunchukuttan, Shantipriya Parida, Ondřej Bojar, Sadao Kurohashi (2019): Overview of the 6th Workshop on Asian Translation. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 1-35, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-90-1 (pdf, bibtex)
Anna Nedoluzhko, Ondřej Bojar (2019): Towards Automatic Minuting of Meetings. In: Proceedings of the 19th Conference ITAT 2019: Slovenskočeský NLP workshop (SloNLP 2019), pp. 112-119, CreateSpace Independent Publishing Platform, Košice, Slovakia (url, local PDF, bibtex)
Shantipriya Parida, Ondřej Bojar, Satya Ranjan Dash (2019): OdiEnCorp: Odia-English and Odia-Only Corpus for Machine Translation. In: Proceedings of the Third International Conference on Smart Computing and Informatics, Volume 1, pp. 495-504, Springer, Singapore, ISBN 978-981-13-9282-5 (url, bibtex)
Shantipriya Parida, Ondřej Bojar, Satya Ranjan Dash (2019): Hindi Visual Genome: A Dataset for Multimodal English-to-Hindi Machine Translation. In: Computación y Sistemas, ISSN 1405-5546, vol. 23, no. 4, pp. 1499-1505 (url, bibtex)
Shantipriya Parida, Petr Motlíček, Ondřej Bojar (2019): Idiap NMT System for WAT 2019 Multi-Modal Translation Task. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 175-180, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-90-1 (pdf, bibtex)
Thuong-Hai Pham, Dominik Macháček, Ondřej Bojar (2019): Promoting the Knowledge of Source Syntax in Transformer NMT Is Not Needed. In: Computación y Sistemas, ISSN 1405-5546, vol. 23, no. 3, pp. 923-934 (url, bibtex)
Martin Popel, Dominik Macháček, Michal Auersperger, Ondřej Bojar, Pavel Pecina (2019): English-Czech Systems in WMT19: Document-Level Transformer. In: Fourth Conference on Machine Translation - Proceedings of the Conference, pp. 342-348, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-27-7 (pdf, local PDF, bibtex)
Kateřina Rysová, Magdaléna Rysová, Tomáš Musil, Lucie Poláková, Ondřej Bojar (2019): A Test Suite and Manual Evaluation of Document-Level NMT at WMT19. In: Fourth Conference on Machine Translation - Proceedings of the Conference, pp. 455-463, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-27-7 (url, local PDF, bibtex)
Dušan Variš, Ondřej Bojar (2019): Unsupervised Pretraining for Neural Machine Translation Using Elastic Weight Consolidation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pp. 130-135, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-47-5 (pdf, local PDF, local PDF, bibtex)
Tereza Vojtěchová, Michal Novák, Miloš Klouček, Ondřej Bojar (2019): SAO WMT19 Test Suite: Machine Translation of Audit Reports. In: Fourth Conference on Machine Translation - Proceedings of the Conference, pp. 680-692, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-27-7 (url, bibtex)
Ondřej Bojar, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Philipp Koehn, Christof Monz (2018): Findings of the 2018 Conference on Machine Translation (WMT18). In: Proceedings of the Third Conference on Machine Translation, Volume 2: Shared Tasks, pp. 272-307, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-81-0 (pdf, bibtex)
Ondřej Bojar, Jiří Mírovský, Kateřina Rysová, Magdaléna Rysová (2018): EvalD Reference-Less Discourse Evaluation for WMT18. In: Proceedings of the Third Conference on Machine Translation, Volume 2: Shared Tasks, pp. 545-549, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-81-0 (url, local PDF, bibtex)
Franck Burlot, Yves Scherrer, Vinit Ravishankar, Ondřej Bojar, Stig-Arne Grönroos, Maarit Koponen, Tommi Nieminen, François Yvon (2018): The WMT’18 Morpheval test suites for English-Czech, English-German, English-Finnish and Turkish-English. In: Proceedings of the Third Conference on Machine Translation, Volume 2: Shared Tasks, pp. 550-564, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-81-0 (pdf, bibtex)
Ondřej Cífka, Ondřej Bojar (2018): Are BLEU and Meaning Representation in Opposition?. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1362-1371, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-32-2 (url, bibtex)
Silvie Cinková, Ondřej Bojar (2018): Testsuite on Czech–English Grammatical Contrasts. In: Proceedings of the Third Conference on Machine Translation, Volume 2: Shared Tasks, pp. 565-575, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-81-0 (pdf, bibtex)
Jindřich Helcl, Jindřich Libovický, Tom Kocmi, Tomáš Musil, Ondřej Cífka, Dušan Variš, Ondřej Bojar (2018): Neural Monkey: The Current State and Beyond. In: The 13th Conference of The Association for Machine Translation in the Americas, Vol. 1: MT Researchers’ Track, pp. 168-176, The Association for Machine Translation in the Americas, Stroudsburg, PA, USA (url, local PDF, local PDF, bibtex)
Tom Kocmi, Ondřej Bojar (2018): Trivial Transfer Learning for Low-Resource Neural Machine Translation. In: Proceedings of the Third Conference on Machine Translation, Volume 1: Research Papers, pp. 244-252, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-81-0 (url, local PDF, bibtex)
Tom Kocmi, Shantipriya Parida, Ondřej Bojar (2018): CUNI NMT System for WAT 2018 Translation Tasks. In: Proceedings of the 5th Workshop on Asian Translation (WAT2018), pp. 1-7, Asian Federation of Natural Language Processing, Hong Kong, China (url, bibtex)
Tom Kocmi, Roman Sudarikov, Ondřej Bojar (2018): CUNI Submissions in WMT18. In: Proceedings of the Third Conference on Machine Translation, Volume 2: Shared Tasks, pp. 435-441, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-81-0 (pdf, bibtex)
Tom Kocmi, Dušan Variš, Ondřej Bojar (2018): CUNI Basque-to-English Submission in IWSLT18. In: Proceedings of the International Workshop on Spoken Language Translation, pp. 142-146, Karlsruhe Institute of Technology, Karlsruhe, Germany (pdf, bibtex)
Dominik Macháček, Jonáš Vidra, Ondřej Bojar (2018): Morphological and Language-Agnostic Word Segmentation for NMT. In: Proceedings of the 21st International Conference on Text, Speech and Dialogue—TSD 2018, pp. 277-284, Springer-Verlag, Cham, Switzerland, ISBN 978-3-030-00794-2 (url, bibtex)
Qingsong Ma, Ondřej Bojar, Yvette Graham (2018): Results of the WMT18 Metrics Shared Task: Both characters and embeddings achieve good performance. In: Proceedings of the Third Conference on Machine Translation, Volume 2: Shared Tasks, pp. 682-701, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-81-0 (pdf, bibtex)
Shantipriya Parida, Ondřej Bojar (2018): Translating Short Segments with NMT: A Case Study in English-to-Hindi. In: Proceedings of the 21st Annual Conference of the European Association for Machine Translation (2018), pp. 1-392, European Association for Machine Translation, Allschwil, Switzerland, ISBN 978-84-09-01901-4 (url, local PDF, local PDF, bibtex)
Martin Popel, Ondřej Bojar (2018): Training Tips for the Transformer Model. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 110, pp. 43-70 (pdf, bibtex)
Mostafa Abdou, Vladan Glončák, Ondřej Bojar (2017): Variable Mini-Batch Sizing and Pre-Trained Embeddings. In: Proceedings of the Second Conference on Machine Translation, Volume 2: Shared Task Papers, pp. 680-686, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-96-8 (pdf, bibtex)
Ondřej Bojar, Yvette Graham, Amir Kamran (2017): Results of the WMT17 Metrics Shared Task. In: Proceedings of the Second Conference on Machine Translation, Volume 2: Shared Task Papers, pp. 489-513, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-96-8 (pdf, bibtex)
Ondřej Bojar, Jindřich Helcl, Tom Kocmi, Jindřich Libovický, Tomáš Musil (2017): Results of the WMT17 Neural MT Training Task. In: Proceedings of the Second Conference on Machine Translation, Volume 2: Shared Task Papers, pp. 525-533, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-96-8 (url, local PDF, bibtex)
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Shujian Huang, Matthias Huck, Philipp Koehn, Qun Liu, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Raphael Rubino, Lucia Specia, Marco Turchi (2017): Findings of the 2017 Conference on Machine Translation (WMT17). In: Proceedings of the Second Conference on Machine Translation, Volume 2: Shared Task Papers, pp. 169-214, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-96-8 (pdf, bibtex)
Ondřej Bojar, Tom Kocmi, David Mareček, Roman Sudarikov, Dušan Variš (2017): CUNI Submission in WMT17: Chimera Goes Neural. In: Proceedings of the Second Conference on Machine Translation, Volume 2: Shared Task Papers, pp. 248-256, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-96-8 (url, bibtex)
Matthias Huck, Aleš Tamchyna, Ondřej Bojar, Alexander Fraser (2017): Producing Unseen Morphological Variants in Statistical Machine Translation. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers, pp. 369-375, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-35-7 (url, bibtex)
Tom Kocmi, Ondřej Bojar (2017): An Exploration of Word Embedding Initialization in Deep-Learning Tasks. In: Proceedings of the 14th International Conference on Natural Language Processing, pp. 56-64, NLP Association of India, Kolkata, India (bibtex)
Tom Kocmi, Ondřej Bojar (2017): Curriculum Learning and Minibatch Bucketing in Neural Machine Translation. In: Proceedings of the International Conference Recent Advances in Natural Language Processing, pp. 379-386, INCOMA Ltd., Šumen, Bulgaria, ISBN 978-954-452-048-9 (url, bibtex)
Tom Kocmi, Ondřej Bojar (2017): LanideNN: Multilingual Language Identification on Character Window. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers, pp. 927-936, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-35-7 (url, bibtex)
Tom Kocmi, Dušan Variš, Ondřej Bojar (2017): CUNI NMT System for WAT 2017 Translation Tasks. In: Proceedings of the 4th Workshop on Asian Translation (WAT2017), pp. 154-159, Asian Federation of Natural Language Processing, Taipei, Taiwan, ISBN 978-1-948087-06-3 (bibtex)
David Mareček, Ondřej Bojar, Ondřej Hübsch, Rudolf Rosa, Dušan Variš (2017): CUNI Experiments for WMT17 Metrics Task. In: Proceedings of the Second Conference on Machine Translation, Volume 2: Shared Task Papers, pp. 604-611, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-96-8 (url, bibtex)
Jan-Thorsten Peter, Hermann Ney, Ondřej Bojar, Ngoc-Quan Pham, Jan Niehues, Alex Waibel, Franck Burlot, François Yvon, Marcis Pinnis, Valters Sics, Joost Bastings, Miguel Rios, Wilker Aziz, Phil Williams, Frédéric Blain, Lucia Specia (2017): The QT21 Combined Machine Translation System for English to Latvian. In: Proceedings of the Second Conference on Machine Translation, Volume 2: Shared Task Papers, pp. 348-357, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-96-8 (pdf, bibtex)
Matiss Rikters, Ondřej Bojar (2017): Paying Attention to Multi-Word Expressions in Neural Machine Translation. In: Proceedings of MT Summit XVI, vol. 1: Research Track, pp. 86-95, IAMT, Nagoya, Japan (url, bibtex)
Matiss Rikters, Mark Fishel, Ondřej Bojar (2017): Visualizing Neural Machine Translation Attention and Confidence. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 109, pp. 39-50 (pdf, bibtex)
Dušan Variš, Ondřej Bojar (2017): CUNI System for WMT17 Automatic Post-Editing Task. In: Proceedings of the Second Conference on Machine Translation, Volume 2: Shared Task Papers, pp. 661-666, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-96-8 (pdf, bibtex)
Antonio Jimeno Yepes, Aurelie Névéol, Mariana Neves, Karin Verspoor, Ondřej Bojar, Arthur Boyer, Cristian Grozea, Barry Haddow, Madeleine Kittner, Yvonne Lichtblau, Pavel Pecina, Roland Roller, Rudolf Rosa, Amy Siu, Philippe Thomas, Saskia Trescher (2017): Findings of the WMT 2017 Biomedical Translation Shared Task. In: Proceedings of the Second Conference on Machine Translation, Volume 2: Shared Task Papers, pp. 234-247, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-96-8 (pdf, bibtex)
Amal Abdelsalam, Ondřej Bojar (2016): Bilingual Embeddings and Word Alignments for Translation Quality Estimation. In: Proceedings of the First Conference on Machine Translation (WMT). Volume 2: Shared Task Papers, pp. 764-771, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-10-4 (pdf, bibtex)
Gabriel Altman, Jan Andres, Johan van der Auwera, Jarmila Bachmannová, Jan Balhar, Aleš Bičan, Lenka Bičanová, Jana Bílková, Petr Biskup, Ondřej Bláha, Izabela Blaszczyk, Ondřej Bojar, Tomáš Bořil, Máša Bořkovcová, Ivana Bozděchová, Pavel Caha, Václav Cvrček, Radek Čech, Marie Čechová, František Čermák, David S. Danaher, František Daneš, Jaroslav David, Mojmír Dočekal, Jakub Dotlačil, Vít Dovalil, Věra Dvořák, Eva Eckertová, Viktor Elšík, Joseph Emonds, Adolf Erhart, François Esvan, Dan Faltýnek, Masako Fidler, Alena Andrlová Fidlerová, Zbyněk Fišer, Eva Flanderková, Mirjam Fried, Markus Giger, Miroslav Grepl, Jan Hajič, Eva Hajičová, Ernst Hansack, Björn Hansen, Radoslav Harman, Milan Harvalík, Martin Havlík, Eva Havlová, Elke Hentschel, Milada Hirschová, Zdeňka Hladká, Jana Hoffmannová, Jiří Homoláč, Milada Homolková, Tomáš Hoskovec, Jan Hric, Jaroslav Hubáček, Jan Chloupek, Leonid L. Iomdin, Pavel Ircing, Laura Janda, Ilona Janyšková, Milan Jelínek, Tomáš Jelínek, Lucie Jílková, Filip Jurčíček, Michal Jurka, Petr Karlík, Petr Karlík mladší, Helena Karlíková, Stanislava Kloferová, Martina Kloudová, Miroslava Knappová, Robert Kolár, Ivana Kolářová, Marie Kopřivová, Jan Kořenský, Pavel Kosek, Peter Kosta, Michaela Koščová, Jiří Koten, Ondřej Koupil, Michal Kovář, Michala Králíková, Marie Krappmann, Jiří Kraus, Marie Krčmová, Susan Kresin, Michal Křen, Michal Křístek, Pavel Kubaník, Miroslav Kubát, Tomáš Kubík, Vladislav Kuboň, Ivona Kučerová, Natalia Levshina, Alena Macurová, Ján Mačutek, Jarosław Malicki, Petr Mareš, Olga Martincová, Jiří Marvan, Jindřich Matoušek, Barbara Mertins, Roland Meyer, Krzysztof Migdalski, Eva Minářová, Kamila Mrázková, Iveta Mrázová, Richard Müller, Olga Müllerová, Mira Nábělková, Olga Navrátilová, Iva Nebeská, Anna Nedoluzhko, Marek Nekula, Zuzana Nevěřilová, Stefan Michael Newerkla, Mark Newson, Pavel Novák, Renata Novotná, Norbert Nübler, Radek Ocelák, Karel Oliva, Ivo Osolsobě, Klára Osolsobě, Ludmila Pacnerová, Karel Pala, Zdena Palková, Jarmila Panevová, Pavel Pecina, Jaroslav Peregrin, Anna Maria Perissutti, Ondřej Pešek, Vladimír Petkevič, Petr Plecháč, Jana Pleskalová, Jan Radimský, Paul Rastall, Alexandr Rosen, Zdenka Rusínová, Lucie Saicová Římalová, Tamah Sherman, Tobias Scheer, Boris Skalka, Radek Skarnitzl, Marián Sloboda, Olga Stehlíková, Hana Strachoňová, Jana Straková, Roman Sukač, Zbyněk Sviták, Aleš Svoboda, Josef Syka, Ondřej Šefčík, Radek Šimík, Hana Gruet Škrabalová, Dušan Šlosar, Rudolf Šrámek, Jan Štěpán, František Štícha, Michaela Tabakovičová, Knut Tarald Taraldsen, Lucie Taraldsen Medová, Jiří Trávníček, Vladimír Trpka, Jana Marie Tušková, Ludmila Uhlířová, Lenka Uličná, Oldřich Uličný, Jana Valdrová, Irena Vaňková, Ivo Vasiljev, Radoslav Večerka, Jarmil Vepřek, Ljuba Veselinova, Kateřina Veselovská, Ludmila Veselovská, Jan Volín, Taťána Vykypělová, Roland Wagner, James Wilson, Uliana Yazhinova, Daniel Zeman, Jiří Zeman, Šárka Zikánová, Markéta Ziková, Petr Zima, Ilse Zimmermann, Zdeněk Žabokrtský, Stanislav Žaža (2016): Nový encyklopedický slovník češtiny. In: , ISBN 978-80-7422-480-5 (url, bibtex)
Ondřej Bojar, Ondřej Cífka, Jindřich Helcl, Tom Kocmi, Roman Sudarikov (2016): UFAL Submissions to the IWSLT 2016 MT Track. In: Proceedings of the 13th International Workshop on Spoken Language Translation (IWSLT), pp. 1-8, Karlsruhe Institute of Technology (pdf, bibtex)
Ondřej Bojar, Filip Děchtěrenko, Maria Zelenina (2016): A Pilot Eye-Tracking Study of WMT-Style Ranking Evaluation. In: Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem, pp. 20-26, LREC, Portorož, Slovenia (bibtex)
Ondřej Bojar, Ondřej Dušek, Tom Kocmi, Jindřich Libovický, Michal Novák, Martin Popel, Roman Sudarikov, Dušan Variš (2016): CzEng 1.6: Enlarged Czech-English Parallel Corpus with Processing Tools Dockered. In: Text, Speech, and Dialogue: 19th International Conference, TSD 2016, Lecture Notes in Computer Science, ISSN 0302-9743, 9924, pp. 231-238, Springer International Publishing, Cham / Heidelberg / New York / Dordrecht / London, ISBN 978-3-319-45509-9 (url, bibtex)
Ondřej Bojar, Christian Federmann, Barry Haddow, Philipp Koehn, Matt Post, Lucia Specia (2016): Ten Years of WMT Evaluation Campaigns: Lessons Learnt. In: Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem, pp. 27-34, LREC, Portorož, Slovenia (bibtex)
Ondřej Bojar, Yvette Graham, Amir Kamran, Miloš Stanojević (2016): Results of the WMT16 Metrics Shared Task. In: Proceedings of the First Conference on Machine Translation (WMT). Volume 2: Shared Task Papers, pp. 199-231, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-10-4 (pdf, bibtex)
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurelie Névéol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, Marcos Zampieri (2016): Findings of the 2016 Conference on Machine Translation (WMT16). In: Proceedings of the First Conference on Machine Translation (WMT). Volume 2: Shared Task Papers, pp. 131-198, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-10-4 (pdf, bibtex)
Bushra Jawaid, Amir Kamran, Ondřej Bojar (2016): Enriching Source for English-to-Urdu Machine Translation. In: Proceedings of the 6th Workshop on South and Southeast Asian NLP, pp. 54-63, International Committee for Computational Linguistics, Ōsaka, Japan (bibtex)
Bushra Jawaid, Amir Kamran, Miloš Stanojević, Ondřej Bojar (2016): Results of the WMT16 Tuning Shared Task. In: Proceedings of the First Conference on Machine Translation (WMT). Volume 2: Shared Task Papers, pp. 232-238, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-10-4 (pdf, bibtex)
Tom Kocmi, Ondřej Bojar (2016): SubGram: Extending Skip-gram Word Representation with Substrings. In: Text, Speech, and Dialogue: 19th International Conference, TSD 2016, Lecture Notes in Computer Science, ISSN 0302-9743, 9924, pp. 182-189, Springer International Publishing, Cham / Heidelberg / New York / Dordrecht / London, ISBN 978-3-319-45509-9 (url, bibtex)
Viktor Kocúr, Ondřej Bojar (2016): Particle Swarm Optimization Submission for WMT16 Tuning Task. In: Proceedings of the First Conference on Machine Translation (WMT). Volume 2: Shared Task Papers, pp. 518-524, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-10-4 (pdf, bibtex)
Jindřich Libovický, Jindřich Helcl, Marek Tlustý, Pavel Pecina, Ondřej Bojar (2016): CUNI System for WMT16 Automatic Post-Editing and Multimodal Translation Tasks. In: Proceedings of the First Conference on Machine Translation (WMT). Volume 2: Shared Task Papers, pp. 646-654, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-10-4 (url, bibtex)
Jan-Thorsten Peter, Tamer Alkhouli, Hermann Ney, Matthias Huck, Fabienne Braune, Alexander Fraser, Aleš Tamchyna, Ondřej Bojar, Barry Haddow, Rico Sennrich, Frédéric Blain, Lucia Specia, Jan Niehues, Alex Waibel, Alexandre Allauzen, Lauriane Aufrant, Franck Burlot, Elena Knyazeva, Thomas Lavergne, François Yvon, Stella Frank, Marcis Pinnis (2016): The QT21/HimL Combined Machine Translation System. In: Proceedings of the First Conference on Machine Translation (WMT). Volume 2: Shared Task Papers, pp. 344-355, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-10-4 (pdf, bibtex)
Martin Popel, Roman Sudarikov, Ondřej Bojar, Rudolf Rosa, Jan Hajič (2016): TectoMT – a deep-linguistic core of the combined Chimera MT system. In: Baltic Journal of Modern Computing, ISSN 2255-8942, vol. 4, no. 2, pp. 377-377 (pdf, local PDF, local PDF, local PDF, bibtex)
Rudolf Rosa, Martin Popel, Ondřej Bojar, David Mareček, Ondřej Dušek (2016): Moses & Treex Hybrid MT Systems Bestiary. In: Proceedings of the 2nd Deep Machine Translation Workshop, pp. 1-10, ÚFAL MFF UK, Praha, Czechia, ISBN 978-80-88132-02-8 (url, local PDF, local PDF, bibtex)
Rudolf Rosa, Roman Sudarikov, Michal Novák, Martin Popel, Ondřej Bojar (2016): Dictionary-based Domain Adaptation of MT Systems without Retraining. In: Proceedings of the First Conference on Machine Translation (WMT). Volume 2: Shared Task Papers, pp. 449-455, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-10-4 (pdf, bibtex)
Roman Sudarikov, Ondřej Dušek, Martin Holub, Ondřej Bojar, Vincent Kríž (2016): Verb Sense Disambiguation in Machine Translation. In: Sixth Workshop on Hybrid Approaches to Translation (HyTra-6), pp. 42-50, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-4-87974-713-6 (pdf, bibtex)
Roman Sudarikov, Martin Popel, Ondřej Bojar, Aljoscha Burchardt, Ondřej Klejch (2016): Using MT-ComparEval. In: Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem, pp. 76-82, LREC, Portorož, Slovenia (pdf, bibtex)
Aleš Tamchyna, Alexander Fraser, Ondřej Bojar, Marcin Junczys-Dowmunt (2016): Target-Side Context for Discriminative Models in Statistical Machine Translation. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1704-1714, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-00-5 (pdf, bibtex)
Aleš Tamchyna, Roman Sudarikov, Ondřej Bojar, Alexander Fraser (2016): CUNI-LMU Submissions in WMT2016: Chimera Constrained and Beaten. In: Proceedings of the First Conference on Machine Translation (WMT). Volume 2: Shared Task Papers, pp. 385-390, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-10-4 (url, bibtex)
Le Thanh, Hoa Vu Trong, Jonathan Oberländer, Ondřej Bojar (2016): Using Term Position Similarity and Language Modeling for Bilingual Document Alignment. In: Proceedings of the First Conference on Machine Translation (WMT). Volume 2: Shared Task Papers, pp. 710-716, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-10-4 (pdf, bibtex)
Phil Williams, Rico Sennrich, Maria Nadejde, Matthias Huck, Barry Haddow, Ondřej Bojar (2016): Edinburgh’s Statistical Machine Translation Systems for WMT16. In: Proceedings of the First Conference on Machine Translation (WMT). Volume 2: Shared Task Papers, pp. 399-410, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-10-4 (pdf, bibtex)
Ondřej Bojar (2015): Machine translation. In: The Oxford Handbook of Inflection, pp. 323-347, Oxford University Press, Oxford, UK, ISBN 978-0-19-959142-8 (url, bibtex)
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Matthias Huck, Chris Hokamp, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Carolina Scarton, Lucia Specia, Marco Turchi (2015): Findings of the 2015 Workshop on Statistical Machine Translation. In: Proceedings of the 10th Workshop on Machine Translation, pp. 1-46, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-941643-32-7 (pdf, bibtex)
Ondřej Bojar, Aleš Tamchyna (2015): CUNI in WMT15: Chimera Strikes Again. In: Proceedings of the 10th Workshop on Machine Translation, pp. 79-83, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-941643-32-7 (url, bibtex)
Franky, Ondřej Bojar, Kateřina Veselovská (2015): Resources for Indonesian Sentiment Analysis. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 103, pp. 21-41 (pdf, bibtex)
Tam Hoang, Ondřej Bojar (2015): TmTriangulate: A Tool for Phrase Table Triangulation. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 104, pp. 75-86 (pdf, bibtex)
Matouš Macháček, Ondřej Bojar (2015): Evaluating Machine Translation Quality Using Short Segments Annotations. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 103, pp. 85-110 (pdf, bibtex)
Miloš Stanojević, Amir Kamran, Ondřej Bojar (2015): Results of the WMT15 Tuning Shared Task. In: Proceedings of the 10th Workshop on Machine Translation, pp. 274-281, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-941643-32-7 (pdf, bibtex)
Miloš Stanojević, Amir Kamran, Philipp Koehn, Ondřej Bojar (2015): Results of the WMT15 Metrics Shared Task. In: Proceedings of the 10th Workshop on Machine Translation, pp. 256-273, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-941643-32-7 (pdf, bibtex)
Roman Sudarikov, Ondřej Bojar (2015): Giving a Sense: A Pilot Study in Concept Annotation from Multiple Resources. In: Proceedings of the 15th conference ITAT 2015: Slovenskočeský NLP workshop (SloNLP 2015), pp. 88-94, CreateSpace Independent Publishing Platform, Praha, Czechia, ISBN 978-1515120650 (pdf, bibtex)
Roman Sudarikov, Ondřej Bojar (2015): Giving a Sense: A Pilot Study in Concept Annotation from Multiple Resources. In: UFAL WDS 2015 (Conference of PhD Students in Mathematical Linguistics), pp. 14-21, Institute of Formal and Applied Linguistics, Charles University in Prague, Praha, Czechia (bibtex)
Roman Sudarikov, Petr Fanta, Ondřej Bojar (2015): TeamUFAL: WSD+EL as Document Retrieval. In: Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pp. 350-354, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-941643-40-2 (url, bibtex)
Aleš Tamchyna, Ondřej Bojar (2015): What a Transfer-Based System Brings to the Combination with PBMT. In: Proceedings of the Fourth Workshop on Hybrid Approaches to Translation (HyTra), pp. 11-20, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-941643-67-9 (bibtex)
Ondřej Bojar, Christian Buck, Christian Federmann, Barry Haddow, Philipp Koehn, Johannes Leveling, Christof Monz, Pavel Pecina, Matt Post, Hervé Saint-Amand, Radu Soricut, Lucia Specia, Aleš Tamchyna (2014): Findings of the 2014 Workshop on Statistical Machine Translation. In: Proceedings of the Ninth Workshop on Statistical Machine Translation, pp. 12-58, Association for Computational Linguistics, Baltimore, MD, USA, ISBN 978-1-941643-17-4 (bibtex)
Ondřej Bojar, Vojtěch Diatka, Pavel Rychlý, Pavel Straňák, Vít Suchomel, Aleš Tamchyna, Daniel Zeman (2014): HindEnCorp – Hindi-English and Hindi-only Corpus for Machine Translation. In: Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), pp. 3550-3555, European Language Resources Association, Reykjavík, Iceland, ISBN 978-2-9517408-8-4 (pdf, local PDF, local PDF, bibtex)
Ondřej Bojar, Daniel Zeman (2014): Czech Machine Translation in the project CzechMATE. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 101, pp. 71-96 (pdf, local PDF, local PDF, bibtex)
Bushra Jawaid, Ondřej Bojar (2014): Two-Step Machine Translation with Lattices. In: Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), pp. 682-686, European Language Resources Association, Reykjavík, Iceland, ISBN 978-2-9517408-8-4 (local PDF, bibtex)
Bushra Jawaid, Amir Kamran, Ondřej Bojar (2014): English to Urdu Statistical Machine Translation: Establishing a Baseline. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pp. 37-42, Dublin City University and Association for Computational Linguistics, Dublin, Ireland, ISBN 978-1-941643-26-6 (bibtex)
Bushra Jawaid, Amir Kamran, Ondřej Bojar (2014): A Tagged Corpus and a Tagger for Urdu. In: Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), pp. 2938-2943, European Language Resources Association, Reykjavík, Iceland, ISBN 978-2-9517408-8-4 (attachment1, local PDF, bibtex)
Matouš Macháček, Ondřej Bojar (2014): Results of the WMT14 Metrics Shared Task. In: Proceedings of the Ninth Workshop on Statistical Machine Translation, pp. 293-301, Association for Computational Linguistics, Baltimore, MD, USA, ISBN 978-1-941643-17-4 (pdf, bibtex)
Eduard Šubert, Ondřej Bojar (2014): Twitter Crowd Translation – Design and Objectives. In: Translating and the Computer 36, pp. 217-227, Editions Tradulex; AsLing, Geneva, Switzerland, ISBN 9782970073628 (url, bibtex)
Aleš Tamchyna, Martin Popel, Rudolf Rosa, Ondřej Bojar (2014): CUNI in WMT14: Chimera Still Awaits Bellerophon. In: Proceedings of the Ninth Workshop on Statistical Machine Translation, pp. 195-200, Association for Computational Linguistics, Baltimore, MD, USA, ISBN 978-1-941643-17-4 (pdf, local PDF, local PDF, bibtex)
Zdeňka Urešová, Jan Hajič, Ondřej Bojar (2014): Comparing Czech and English AMRs. In: Proceedings of Workshop on Lexical and Grammatical Resources for Language Processing (LG-LP 2014, at Coling 2014), pp. 55-64, Association for Computational Linguistics and Dublin City University, Dublin, Ireland, ISBN 978-1-873769-44-7 (pdf, local PDF, bibtex)
Dušan Variš, Ondřej Bojar (2014): Japonsko-český strojový překlad. In: Proceedings of the 14th conference ITAT 2014, pp. 1-8, Institute of Computer Science AS CR, Praha, Czechia, ISBN 978-80-87136-18-8 (bibtex)
Nianwen Xue, Ondřej Bojar, Jan Hajič, Martha Palmer, Zdeňka Urešová, Xiuhong Zhang (2014): Not an Interlingua, But Close: Comparison of English AMRs to Chinese and Czech. In: Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), pp. 1765-1772, European Language Resources Association, Reykjavík, Iceland, ISBN 978-2-9517408-8-4 (pdf, local PDF, local PDF, local PDF, bibtex)
Jan Berka, Ondřej Bojar, Mark Fishel, Maja Popović, Daniel Zeman (2013): Tools for Machine Translation Quality Inspection (technical report). In: (url, local PDF, local PDF, bibtex)
Ondřej Bojar, Christian Buck, Chris Callison-Burch, Christian Federmann, Barry Haddow, Philipp Koehn, Christof Monz, Matt Post, Radu Soricut, Lucia Specia (2013): Findings of the 2013 Workshop on Statistical Machine Translation. In: Proceedings of the Eighth Workshop on Statistical Machine Translation, pp. 1-44, Association for Computational Linguistics, Sofija, Bulgaria, ISBN 978-1-937284-57-2 (url, bibtex)
Ondřej Bojar, Matouš Macháček, Aleš Tamchyna, Daniel Zeman (2013): Scratching the Surface of Possible Translations. In: Text, Speech and Dialogue: 16th International Conference, TSD 2013. Proceedings, Lecture Notes in Computer Science, ISSN 0302-9743, 8082, pp. 465-474, Springer Verlag, Berlin / Heidelberg, ISBN 978-3-642-40584-6 (local PDF, bibtex)
Ondřej Bojar, Rudolf Rosa, Aleš Tamchyna (2013): Chimera – Three Heads for English-to-Czech Translation. In: Proceedings of the Eighth Workshop on Statistical Machine Translation, pp. 92-98, Association for Computational Linguistics, Sofija, Bulgaria, ISBN 978-1-937284-57-2 (url, local PDF, local PDF, bibtex)
Ondřej Bojar, Aleš Tamchyna (2013): The Design of Eman, an Experiment Manager. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 99, pp. 39-58 (pdf, bibtex)
Petra Galuščáková, Martin Popel, Ondřej Bojar (2013): PhraseFix: Statistical Post-Editing of TectoMT. In: Proceedings of the Eighth Workshop on Statistical Machine Translation, pp. 141-147, Association for Computational Linguistics, Sofija, Bulgaria, ISBN 978-1-937284-57-2 (bibtex)
Michal Kalina, Ondřej Bojar (2013): Jak překladač z Matfyzu porazil Google. In: Hospodářské noviny IHNED, ISSN 1213-7693 (url, bibtex)
Matouš Macháček, Ondřej Bojar (2013): Results of the WMT13 Metrics Shared Task. In: Proceedings of the Eighth Workshop on Statistical Machine Translation, pp. 45-51, Association for Computational Linguistics, Sofija, Bulgaria, ISBN 978-1-937284-57-2 (pdf, bibtex)
Aleš Tamchyna, Ondřej Bojar (2013): No Free Lunch in Factored Phrase-Based Machine Translation. In: Lecture Notes in Computer Science, ISSN 0302-9743, 7817, pp. 210-223 (url, bibtex)
Jan Berka, Ondřej Bojar, Mark Fishel, Maja Popović, Daniel Zeman (2012): Automatic MT Error Analysis: Hjerson Helping Addicter. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), pp. 2158-2163, European Language Resources Association, İstanbul, Turkey, ISBN 978-2-9517408-7-7 (url, local PDF, bibtex)
Ondřej Bojar (2012): Čeština a strojový překlad. In: , ISBN 978-80-904571-4-0 (bibtex)
Ondřej Bojar (2012): Strojový překlad. In: Vesmír, ISSN 0042-4544, 91, pp. 488-490 (bibtex)
Ondřej Bojar, Mauro Cettolo, Silvie Cinková, Philipp Koehn, Miroslav Týnovský, Zdeněk Žabokrtský (2012): Scientific Report on Rich Tree-Based SMT (technical report). In: (bibtex)
Ondřej Bojar, Silvie Cinková, Jan Hajič, Barbora Hladká, Vladislav Kuboň, Jiří Mírovský, Jarmila Panevová, Nino Peterek, Johanka Spoustová, Zdeněk Žabokrtský (2012): The Czech Language in the Digital Age. In: , ISBN 978-3-642-30705-8 (local PDF, bibtex)
Ondřej Bojar, Bushra Jawaid, Amir Kamran (2012): Probes in a Taxonomy of Factored Phrase-Based Models. In: Proceedings of the Seventh Workshop on Statistical Machine Translation, pp. 253-260, Association for Computational Linguistics, Montréal, Canada, ISBN 978-1-937284-20-6 (url, local PDF, bibtex)
Ondřej Bojar, Dekai Wu (2012): Towards a Predicate-Argument Evaluation for MT. In: Proceedings of Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-6), ACL, pp. 30-38, Association for Computational Linguistics, Jeju, Korea, ISBN 978-1-937284-38-1 (url, local PDF, bibtex)
Ondřej Bojar, Zdeněk Žabokrtský, Ondřej Dušek, Petra Galuščáková, Martin Majliš, David Mareček, Jiří Maršík, Michal Novák, Martin Popel, Aleš Tamchyna (2012): The Joy of Parallelism with CzEng 1.0. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), pp. 3921-3928, European Language Resources Association, İstanbul, Turkey, ISBN 978-2-9517408-7-7 (url, local PDF, bibtex)
Mark Fishel, Ondřej Bojar, Maja Popović (2012): Terra: a Collection of Translation Error-Annotated Corpora. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), pp. 7-14, European Language Resources Association, İstanbul, Turkey, ISBN 978-2-9517408-7-7 (local PDF, bibtex)
Mark Fishel, Rico Sennrich, Maja Popović, Ondřej Bojar (2012): TerrorCat: a Translation Error Categorization-based MT Quality Metric. In: Proceedings of the Seventh Workshop on Statistical Machine Translation, pp. 64-70, Association for Computational Linguistics, Montréal, Canada, ISBN 978-1-937284-20-6 (url, bibtex)
Petra Galuščáková, Ondřej Bojar (2012): Improving SMT by Using Parallel Data of a Closely Related Language. In: Human Language Technologies – The Baltic Perspective - Proceedings of the Fifth International Conference Baltic HLT 2012, pp. 58-65, IOS Press, Amsterdam, Netherlands, ISBN 978-1-61499-132-8 (url, bibtex)
Jan Hajič, Eva Hajičová, Jarmila Panevová, Petr Sgall, Ondřej Bojar, Silvie Cinková, Eva Fučíková, Marie Mikulová, Petr Pajas, Jan Popelka, Jiří Semecký, Jana Šindlerová, Jan Štěpánek, Josef Toman, Zdeňka Urešová, Zdeněk Žabokrtský (2012): Announcing Prague Czech-English Dependency Treebank 2.0. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), pp. 3153-3160, European Language Resources Association, İstanbul, Turkey, ISBN 978-2-9517408-7-7 (url, local PDF, bibtex)
Bushra Jawaid, Ondřej Bojar (2012): Tagger Voting for Urdu. In: Proceedings of the Workshop on South and Southeast Asian Natural Language Processing (WSSANLP) at Coling 2012, pp. 135-144, Coling 2012 Organizing Committee, Mumbai, India (pdf, bibtex)
Jiří Maršík, Ondřej Bojar (2012): TrTok: A Fast and Trainable Tokenizer for Natural Languages. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 98, pp. 75-85 (pdf, bibtex)
Loganathan Ramasamy, Ondřej Bojar, Zdeněk Žabokrtský (2012): Morphological Processing for English-Tamil Statistical Machine Translation. In: Proceedings of the Workshop on Machine Translation and Parsing in Indian Languages (MTPIL-2012), pp. 113-122, The COLING 2012 Organizing Committee, Mumbai, India (bibtex)
Aleš Tamchyna, Petra Galuščáková, Amir Kamran, Miloš Stanojević, Ondřej Bojar (2012): Selecting Data for English-to-Czech Machine Translation. In: Proceedings of the Seventh Workshop on Statistical Machine Translation, pp. 374-381, Association for Computational Linguistics, Montréal, Canada, ISBN 978-1-937284-20-6 (url, local PDF, bibtex)
Jan Berka, Martin Černý, Ondřej Bojar (2011): Quiz-Based Evaluation of Machine Translation. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 95, pp. 77-86 (pdf, local PDF, bibtex)
Ondřej Bojar (2011): Analyzing Error Types in English-Czech Machine Translation. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 95, pp. 63-76 (pdf, local PDF, bibtex)
Ondřej Bojar, Miloš Ercegovčević, Martin Popel, Omar F. Zaidan (2011): A Grain of Salt for the WMT Manual Evaluation. In: Proceedings of the Sixth Workshop on Statistical Machine Translation, pp. 1-11, Association for Computational Linguistics, Edinburgh, UK, ISBN 978-1-937284-12-1 (pdf, local PDF, local PDF, bibtex)
Ondřej Bojar, Petra Galuščáková, Miroslav Týnovský (2011): Evaluating Quality of Machine Translation from Czech to Slovak. In: Information Technologies – Applications and Theory, pp. 3-9, Univerzita Pavla Jozefa Šafárika v Košiciach, Košice, Slovakia, ISBN 978-80-89557-01-1 (local PDF, bibtex)
Ondřej Bojar, Aleš Tamchyna (2011): Improving Translation Model by Monolingual Data. In: Proceedings of the Sixth Workshop on Statistical Machine Translation, pp. 330-336, Association for Computational Linguistics, Edinburgh, UK, ISBN 978-1-937284-12-1 (url, local PDF, bibtex)
Mark Fishel, Ondřej Bojar, Daniel Zeman, Jan Berka (2011): Automatic Translation Error Analysis. In: Lecture Notes in Computer Science, ISSN 0302-9743, 6836, pp. 72-79 (url, local ODP, local PDF, local PDF, bibtex)
Petra Galuščáková, Ondřej Bojar (2011): Czech-Slovak Parallel Corpora. In: Natural Language Processing, Multilinguality, pp. 65-71, Tribun EU, Bratislava, Slovakia, ISBN 978-80-263-0049-6 (local PDF, bibtex)
Ondřej Hálek, Rudolf Rosa, Aleš Tamchyna, Ondřej Bojar (2011): Named Entities from Wikipedia for Machine Translation. In: Information Technologies – Applications and Theory, pp. 23-30, Univerzita Pavla Jozefa Šafárika v Košiciach, Košice, Slovakia, ISBN 978-80-89557-02-8 (local PDF, local PDF, local PDF, bibtex)
Matouš Macháček, Ondřej Bojar (2011): Approximating a Deep-Syntactic Metric for MT Evaluation and Tuning. In: Proceedings of the Sixth Workshop on Statistical Machine Translation, pp. 92-98, Association for Computational Linguistics, Edinburgh, UK, ISBN 978-1-937284-12-1 (url, local PPT, local PDF, local PDF, bibtex)
David Mareček, Rudolf Rosa, Petra Galuščáková, Ondřej Bojar (2011): Two-step translation with grammatical post-processing. In: Proceedings of the Sixth Workshop on Statistical Machine Translation, pp. 426-432, Association for Computational Linguistics, Edinburgh, UK, ISBN 978-1-937284-12-1 (url, local PDF, local PDF, bibtex)
Česlav Przywara, Ondřej Bojar (2011): eppex: Epochal Phrase Table Extraction for Statistical Machine Translation. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 96, pp. 89-98 (url, local PDF, bibtex)
Daniel Zeman, Mark Fishel, Jan Berka, Ondřej Bojar (2011): Addicter: What Is Wrong with My Translations? In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 96, pp. 79-88 (pdf, local PDF, local PDF, bibtex)
Ondřej Bojar (2010): Vládce jazyků. In: Maxim, ISSN 1214-1569, 2010/10, pp. 52-53 (bibtex)
Ondřej Bojar, Kamil Kos (2010): 2010 Failures in English-Czech Phrase-Based MT. In: Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, pp. 60-66, Association for Computational Linguistics, Uppsala, Sweden, ISBN 978-1-932432-71-8 (url, bibtex)
Ondřej Bojar, Kamil Kos, David Mareček (2010): Tackling Sparse Data Issue in Machine Translation Evaluation. In: Proceedings of the ACL 2010 Conference Short Papers, pp. 86-91, Association for Computational Linguistics, Uppsala, Sweden, ISBN 978-1-932432-69-5 (url, bibtex)
Ondřej Bojar, Adam Liška, Zdeněk Žabokrtský (2010): Evaluating Utility of Data Sources in a Large Parallel Czech-English Corpus CzEng 0.9. In: Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), pp. 447-452, European Language Resources Association, Valletta, Malta, ISBN 2-9517408-6-7 (bibtex)
Ondřej Bojar, Pavel Straňák, Daniel Zeman (2010): Data Issues in English-to-Hindi Machine Translation. In: Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), pp. 1771-1777, European Language Resources Association, Valletta, Malta, ISBN 2-9517408-6-7 (local PDF, local ODP, local PDF, bibtex)
Ondřej Bojar, Jana Šindlerová (2010): Building a Bilingual ValLex Using Treebank Token Alignment: First Observations. In: Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), pp. 304-309, European Language Resources Association, Valletta, Malta, ISBN 2-9517408-6-7 (local PDF, bibtex)
Jiří Diviš, Ondřej Bojar (2010): Automatic Source Code Reduction. In: Information Technologies – Applications and Theory, pp. 9-16, PONT s. r. o., Seňa, Slovakia, ISBN 978-80-970179-4-1 (bibtex)
Aleš Tamchyna, Ondřej Bojar (2010): Bohatá anotace ve frázovém strojovém překladu. In: Informačné technológie – Aplikácie a Teória, Zborník príspevkov prezentovaných na konferencii ITAT, pp. 99-106, PONT s. r. o., Seňa, Slovakia, ISBN 978-80-970179-3-4 (bibtex)
Ondřej Bojar (2009): Exploiting Linguistic Data in Machine Translation. In: , ISBN 978-80-904175-8-8 (local PDF, bibtex)
Ondřej Bojar, David Mareček, Václav Novák, Martin Popel, Jan Ptáček, Jan Rouš, Zdeněk Žabokrtský (2009): English-Czech MT in 2008. In: Proceedings of the Fourth Workshop on Statistical Machine Translation, pp. 125-129, Association for Computational Linguistics, Athina, Greece (pdf, local PDF, bibtex)
Ondřej Bojar, Pavel Straňák, Daniel Zeman, Gaurav Jain, Michal Hrušecký, Michal Richter, Jan Hajič (2009): English-Hindi Translation – Obtaining Mediocre Results with Bad Data and Fancy Models. In: Proceedings of ICON 2009: 7th International Conference on Natural Language Processing, pp. 316-321, Macmillan Publishers, India, Hyderabad, India, ISBN 978-023-032-845-7 (local PDF, local PDF, bibtex)
Ondřej Bojar, Miroslav Týnovský (2009): Evaluation of Tree Transfer System (technical report). In: (local PDF, bibtex)
Ondřej Bojar, Zdeněk Žabokrtský (2009): CzEng 0.9, Building a Large Czech-English Automatic Parallel Treebank. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 92, pp. 63-83 (pdf, local PDF, bibtex)
Petr Homola, Natalia Klyueva, Ondřej Bojar (2009): Towards a Rule-Based Machine Translation System Between Czech and Russian. In: Formal Description of Slavic Languages, pp. 37-38, Universität Potsdam, Potsdam, Germany (local PDF, bibtex)
Hana Klempová, Michal Novák, Peter Fabian, Jan Ehrenberger, Ondřej Bojar (2009): Získávání paralelních textů z webu. In: Informačné Technológie – Aplikácie a Teória. Zborník príspevkov, ITAT 2009, pp. 47-54, PONT s.r.o., Seňa, Slovakia, ISBN 978-80-970179-1-0 (local PDF, bibtex)
David Kolovratník, Natalia Klyueva, Ondřej Bojar (2009): Statistical Machine Translation between Related and Unrelated Languages. In: Information Technologies – Applications and Theory, pp. 31-36, PONT s.r.o., Seňa, Slovakia, ISBN 978-80-970179-2-7 (local PDF, local PDF, bibtex)
Kamil Kos, Ondřej Bojar (2009): Evaluation of Machine Translation Metrics for Czech as the Target Language. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 92, pp. 135-148 (pdf, local PDF, bibtex)
Ondřej Odcházel, Ondřej Bojar (2009): Computer Aided Translation Backed by Machine Translation. In: Translating and the Computer 31, pp. 1-8, ASLIB, London, UK (bibtex)
Jana Šindlerová, Ondřej Bojar (2009): Towards English-Czech Parallel Valency Lexicon via Treebank Examples. In: Proceedings of 8th Treebanks and Linguistic Theories Workshop (TLT), pp. 185-195, Università Cattolica del Sacro Cuore, Milano, Italy, ISBN 978-88-8311-712-1 (local PDF, bibtex)
Ondřej Bojar (2008): Exploiting Linguistic Data in Machine Translation (PhD thesis). In: (local PDF, local PDF, local PDF, bibtex)
Ondřej Bojar, Silvie Cinková, Jan Ptáček (2008): Towards English-to-Czech MT via Tectogrammatical Layer. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 90, pp. 57-68 (pdf, local PDF, bibtex)
Ondřej Bojar, Jan Hajič (2008): Phrase-Based and Deep Syntactic English-to-Czech Statistical Machine Translation. In: ACL 2008 WMT: Proceedings of the Third Workshop on Statistical Machine Translation, pp. 143-146, Association for Computational Linguistics, Columbus, OH, USA, ISBN 978-1-932432-09-1 (url, local PDF, bibtex)
Ondřej Bojar, Miroslav Janíček, Miroslav Týnovský (2008): Implementation of Tree Transfer System (technical report). In: (local PDF, bibtex)
Ondřej Bojar, Miroslav Janíček, Zdeněk Žabokrtský, Pavel Češka, Peter Beňa (2008): CzEng 0.7: Parallel Corpus with Community-Supplied Translations. In: Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), pp. 1203-1208, European Language Resources Association, Marrakech, Morocco, ISBN 2-9517408-4-0 (local PDF, bibtex)
Ondřej Bojar, Adam Lopez (2008): Tree-based Translation. In: Proceedings of MT Marathon 2008, University of Edinburgh, Edinburgh, Scotland (url, local PDF, bibtex)
Ondřej Bojar, Pavel Straňák, Daniel Zeman (2008): English-Hindi Translation in 21 Days. In: Proceedings of the 6th International Conference On Natural Language Processing (ICON-2008) NLP Tools Contest, International Institute of Information Technologies, Hyderabad, Pune, India (url, local PDF, local PPT, bibtex)
Natalia Klyueva, Ondřej Bojar (2008): UMC 0.1: Czech-Russian-English Multilingual Corpus. In: Proceedings of the Conference "Korpusnaja lingvistika - 2008", pp. 188-195, St. Petersburg State University, Sankt-Peterburg, Russia, ISBN 978-5-288-04769-5 (pdf, local PDF, bibtex)
Zdeněk Žabokrtský, Ondřej Bojar (2008): TectoMT, Developer's Guide (technical report). In: (bibtex)
Ondřej Bojar (2007): English-to-Czech Factored Machine Translation. In: ACL 2007 WMT: Proceedings of the Second Workshop on Statistical Machine Translation, pp. 232-239, Association for Computational Linguistics, Praha, Czechia, ISBN 978-1-932432-86-2 (url, bibtex)
Ondřej Bojar, Silvie Cinková, Jan Ptáček (2007): Towards English-to-Czech MT via Tectogrammatical Layer. In: Proceedings of the 6th International Workshop on Treebanks and Linguistic Theories (TLT 2007), NEALT Proceedings Series, ISSN 1736-6305, 1, pp. 7-18, North European Association for Language Technology, Bergen, Norway (url, local PDF, bibtex)
Ondřej Bojar, Martin Čmejrek (2007): Mathematical Model of Tree Transformations (technical report). In: (local PDF, bibtex)
Ondřej Bojar, Magdalena Prokopová (2007): Czech-English Machine Translation Dictionary (technical report). In: (local PDF, bibtex)
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Corbett Moran, Richard Zens, Chris Dyer, Ondřej Bojar, Alexandra Constantin, Evan Herbst (2007): Moses: Open Source Toolkit for Statistical Machine Translation. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume, Proceedings of the Student Research Workshop, Proceedings of Demo and Poster Sessions, Tutorial Abstracts, pp. 177-180, Association for Computational Linguistics, Praha, Czechia, ISBN 978-1-932432-87-9 (url, local PDF, bibtex)
Václava Benešová, Ondřej Bojar (2006): Czech Verbs of Communication and the Extraction of their Frames. In: Text, Speech and Dialogue. 9th International Conference, TSD 2006, Brno, Czech Republic, September 11–15, 2006, Proceedings, Lecture Notes in Computer Science, ISSN 0302-9743, 4188, pp. 29-36, Springer, Berlin / Heidelberg, ISBN 978-3-540-39090-9 (url, bibtex)
Ondřej Bojar (2006): Strojový překlad: zamyšlení nad účelností hloubkových jazykových analýz. In: Proceedings of Malý informatický seminář (MIS), pp. 3-13, Matfyzpress, Praha, Czechia, ISBN 80-7378-000-3 (bibtex)
Ondřej Bojar, Evgeny Matusov, Hermann Ney (2006): Czech-English Phrase-Based Machine Translation. In: Advances in Natural Language Processing (5th International Conference on NLP, FinTAL 2006), pp. 214-224, Springer, Berlin / Heidelberg, ISBN 978-3-540-37334-6 (url, bibtex)
Ondřej Bojar, Magdalena Prokopová (2006): Czech-English Word Alignment. In: Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), pp. 1236-1239, ELRA, Genova, Italy, ISBN 2-9517408-2-4 (url, local PDF, bibtex)
Ondřej Bojar, Zdeněk Žabokrtský (2006): CzEng: Czech-English Parallel Corpus, Release version 0.5. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 86, pp. 59-62 (bibtex)
Philipp Koehn, Marcello Federico, Wade Shen, Nicola Bertoldi, Ondřej Bojar, Chris Callison-Burch, Brooke Cowan, Chris Dyer, Hieu Hoang, Richard Zens, Alexandra Constantin, Christine Corbett Moran, Evan Herbst (2006): Open Source Toolkit for Statistical Machine Translation: Factored Translation Models and Confusion Network Decoding (technical report). In: (local PDF, bibtex)
Ondřej Bojar (2005): Budování česko-anglického slovníku pro strojový překlad. In: ITAT 2005 Information Technologies - Applications and Theory, pp. 201-211, Univerzita Pavla Jozefa Šafárika, Račkova dolina, Slovakia, ISBN 80-7097-609-8 (bibtex)
Ondřej Bojar, Cyril Brom, Milan Hladík, Vojtěch Toman (2005): The Project ENTs: Towards Modelling Human-like Artificial Agents. In: SOFSEM 2005: Communications, pp. 111-122, Society for Computer Science, Liptovský Ján, Slovakia, ISBN 80-969255-4-7 (bibtex)
Ondřej Bojar, Jan Hajič (2005): Extracting Translations Verb Frames. In: Proceedings of Modern Approaches in Translation Technologies, pp. 2-6, Bulgarian Academy of Sciences, Borovec, Bulgaria, ISBN 954-90906-9-8 (bibtex)
Ondřej Bojar, Petr Homola, Vladislav Kuboň (2005): Problems of Reusing an existing MT System. In: Second International Joint Conference on Natural Language Processing: Companion Volume including Posters/Demos and tutorial abstracts, pp. 179-184, Asian Federation of Natural Language Processing, Jeju Island, Korea, ISBN 978-3-540-29172-5 (bibtex)
Ondřej Bojar, Petr Homola, Vladislav Kuboň (2005): Problémy recyklování systému automatického překladu. In: ITAT 2005 Information Technologies - Applications and Theory, pp. 335-344, Univerzita Pavla Jozefa Šafárika, Račkova dolina, Slovakia, ISBN 80-7097-609-8 (bibtex)
Ondřej Bojar, Petr Homola, Vladislav Kuboň (2005): An MT System Recycled. In: Proceedings of the 10th Machine Translation Summit, pp. 380-387, Phuket, Thailand, ISBN 974-7431-26-2 (bibtex)
Ondřej Bojar, Jiří Semecký, Václava Benešová (2005): VALEVAL: Testing VALLEX Consistency and Experimenting with Word-Frame Disambiguation. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 83, pp. 5-17 (bibtex)
Markéta Lopatková, Ondřej Bojar, Jiří Semecký, Václava Benešová, Zdeněk Žabokrtský (2005): Valency Lexicon of Czech Verbs VALLEX: Recent Experiments with Frame Disambiguation. In: Proceedings of the 8th International Conference, TSD 2005, Lecture Notes in Computer Science, ISSN 0302-9743, 3658, pp. 99-106, Springer, Berlin / Heidelberg, ISBN 3-540-28789-2 (bibtex)
Ondřej Bojar (2004): Problems of Inducing Large Coverage Constraint-Based Dependency Grammar for Czech. In: Proceedings of International Workshop on Constraint Solving and Language Processing, CSLP 2004, pp. 29-42, Roskilde University, Roskilde (bibtex)
Ondřej Bojar (2004): Automated Extraction of Lexico-Syntactic Information. In: WDS, pp. 211-217, Charles University, Matfyzpress, Prague (bibtex)
Ondřej Bojar (2004): Czech Syntactic Analysis Constraint-Based, XDG: One Possible Start. In: , pp. 43-54 (bibtex)
Ondřej Bojar, Jiří Semecký, Shravan Vasishth, Ivana Kruijff-Korbayová (2004): Processing noncanonical word order in Czech. In: Proceedings of Architectures and Mechanisms for Language Processing, AMLaP 2004, pp. 91-91, Université de Provence, Aix en Provence (bibtex)
Ondřej Bojar (2003): Building Subcorpora Suitable for Extraction of Lexico-Syntactic Information. In: Proceedings of the Student Session, ESSLLI, pp. 25-34 (pdf, local PDF, bibtex)
Ondřej Bojar (2003): AX - Systém pro automatizovanou extrakci lexikálně-syntaktických údajů. In: MIS 2003, pp. 15-24, MATFYZPRESS, Praha, ISBN 80-86732-22-3 (url, local PostScript, bibtex)
Ondřej Bojar (2003): Towards Automatic Extraction of Verb Frames. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 79-80, pp. 101-120 (bibtex)
Ondřej Bojar, Cyril Brom, Milan Hladík, Mikuláš Vejlupek, Vojtěch Toman, David Voňka (2003): ENTI - Simulátor přirozeného prostředí lidského světa. In: MIS 2003, pp. 3-14, MATFYZPRESS, Praha, ISBN 80-86732-22-3 (url, local PostScript, bibtex)

Institute of Formal and Applied Linguistics

Charles University, Czech Republic
Faculty of Mathematics and Physics
