Josef Jon
Main Research Interests
Machine translation
Curriculum Vitae
Education
- 2013–2017 Bachelor, Faculty of Information Technology, Brno University of Technology, Computer Science.
- 2014 Bachelor (Erasmus), Escuela Técnica Superior de Ingeniería Informática, Universidad de Sevilla, Computer Science.
- 2017–2019 Master's, Faculty of Information Technology, Brno University of Technology, Bioinformatics.
  - Master’s thesis: Exploring Contextual Information in Neural Machine Translation.
- 2022–present PhD, Faculty of Mathematics and Physics, Charles University, Natural language processing.
Experience
- 2016–present NLP developer, Lingea, Brno.
  - Development and deployment of machine translation systems and related tools.
  - Integration of terminology databases and dictionaries into neural machine translation.
  - Document-level NMT.
  - Domain adaptation and self-adapting NMT.
  - Efficient and multilingual NMT models suitable for deployment.
  - Low-resource NMT (mainly focused on Slavic languages, capitalizing on language similarity).
  - Integration of NMT into our CAT system.
  - Management of EU projects.
- 2020–present Machine translation researcher, Charles University, Prague.
  - Bergamot project – efficient and private client-side NMT for browsers.
Selected Bibliography
- Google Scholar
- ORCID: 0000-0002-6163-4889
- Scopus ID: 57219789644
- Researcher ID: ADU-3839-2022
- CUNI at WMT24 General Translation Task: LLMs, (Q)LoRA, CPO and Model Merging. In: Proceedings of the Ninth Conference on Machine Translation, pp. 232-246, Association for Computational Linguistics, Kerrville, TX, USA, ISBN 979-8-89176-179-7 (url, bibtex)
- GAATME: A Genetic Algorithm for Adversarial Translation Metrics Evaluation. In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 7562-7569, European Language Resources Association, Torino, Italy, ISBN 978-2-493814-10-4 (url, bibtex)
- An Analysis of Surprisal Uniformity in Machine and Human Translations. In: Proceedings of the 1st Workshop on Creative-text Translation and Technology, pp. 40-56, European Association for Machine Translation, Sheffield, UK, ISBN 9781068690730 (bibtex)
- Character-level NMT and language similarity. In: Proceedings of Machine Translation Summit XIX vol. 1: Research Track, pp. 360-371, Asia-Pacific Association for Machine Translation (AAMT), Kyoto, Japan, ISBN 978-4-9913461-0-1 (pdf, bibtex)
- Breeding Machine Translations: Evolutionary approach to survive and thrive in the world of automated evaluation. In: Proceedings of 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2191-2212, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-959429-72-2 (url, bibtex)
- CUNI at WMT23 General Translation Task: MT and a Genetic Algorithm. In: Proceedings of the Eighth Conference on Machine Translation, pp. 119-127, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 979-8-89176-041-7 (pdf, bibtex)
- Negative Lexical Constraints in Neural Machine Translation. In: Proceedings of Machine Translation Summit XIX vol. 1: Research Track, pp. 372-384, Asia-Pacific Association for Machine Translation (AAMT), Kyoto, Japan, ISBN 978-4-9913461-0-1 (pdf, bibtex)
- CUNI-Bergamot Submission at WMT22 General Task. In: Proceedings of the Seventh Conference on Machine Translation, pp. 280-289, Association for Computational Linguistics, Stroudsburg, PA, USA (pdf, local PDF, bibtex)
- End-to-End Lexically Constrained Machine Translation for Morphologically Rich Languages. In: Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, pp. 4019-4033, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-954085-52-7 (url, local PDF, bibtex)
- CUNI systems for WMT21: Multilingual Low-Resource Translation for Indo-European Languages Shared Task. In: Proceedings of the Sixth Conference on Machine Translation, pp. 354-361, Association for Computational Linguistics, Online, ISBN 978-1-954085-94-7 (url, local PDF, bibtex)
- CUNI systems for WMT21: Terminology translation Shared Task. In: Proceedings of the Sixth Conference on Machine Translation, pp. 828-834, Association for Computational Linguistics, Online, ISBN 978-1-954085-94-7 (url, local PDF, bibtex)
Supervisor: Ondřej Bojar