Ondřej Dušek - Bibliography

Papers

2023

  • Zdeněk Kasner, Ioannis Konstas, Ondřej Dušek. Mind the Labels: Describing Relations in Knowledge Graphs With Pretrained Models, in EACL [Anthology] [Github] [Poster]
  • Sourabrata Mukherjee, Vojtěch Hudeček, Ondřej Dušek. Polite Chatbot: A Text Style Transfer Application, in EACL Student Research Workshop [Anthology] [Github] [Poster]
  • Zdeněk Kasner, Ekaterina Garanina, Ondřej Plátek, Ondřej Dušek. TabGenie: A Toolkit for Table-to-Text Generation, in ACL Demo [Anthology] [Demo] [PyPi] [Poster]
  • Belz et al. (29 authors). Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP, in Workshop on Insights from Negative Results in NLP [Anthology]

2022

  • Zdeněk Kasner, Ondřej Dušek. Neural Pipeline for Zero-Shot Data-to-Text Generation, in: ACL [Anthology] [Github] [Poster]
  • Tomáš Nekvinda, Ondřej Dušek. AARGH! End-to-end Retrieval-Generation for Task-Oriented Dialog, in: SIGdial. [arXiv] [video] [Github]
  • Sourabrata Mukherjee, Zdeněk Kasner, Ondřej Dušek. Balancing the Style-Content Trade-Off in Sentiment Transfer Using Polarity-Aware Denoising, in: Text, Speech and Dialogue [SpringerLink]
  • Vojtěch Hudeček, Léon-Paul Schaub, Daniel Stancl, Patrick Paroubek, Ondřej Dušek. A Unifying View On Task-oriented Dialogue Annotation, in: LREC [Anthology] [Github]
  • Vojtěch Hudeček, Ondřej Dušek. Learning Interpretable Latent Dialogue Actions With Less Supervision, in: AACL-IJCNLP [arXiv] [Github]
  • Rudolf Rosa, Patrícia Schmidtová, Ondřej Dušek, Tomáš Musil, David Mareček, Saad Obaid, Marie Nováková, Klára Vosecká, Josef Doležal. GPT-2-based Human-in-the-loop Theatre Play Script Generation, in: Workshop on Narrative Understanding [Anthology]
  • Rudolf Rosa, Patrícia Schmidtová, Alisa Zakhtarenko, Ondřej Dušek, Tomáš Musil, David Mareček, Saad Obaid Ul Islam, Marie Nováková, Klára Vosecká, Daniel Hrbek, David Košťák. GPT-2-based Human-in-the-loop Theatre Play Script Generation, in: INLG [Anthology] [Github]
  • Rudali Huidrom, Ondřej Dušek, Zdeněk Kasner, Thiago Castro Ferreira, Anya Belz. Two Reproductions of a Human-Assessed Comparative Evaluation of a Semantic Error Detection System, in: INLG GenChal [Anthology]
  • Gabor Baranyi, Bruno Carlos Dos Santos Melício, Zsófia Gaál, Levente Hajder, András Simonyi, Dániel Sindely, Joul Skaf, Ondřej Dušek, Tomáš Nekvinda, András Lőrincz. AI Technologies for Machine Supervision and Help in a Rehabilitation Scenario, in: Multimodal Technologies and Interaction 6(7) [Web]

2021

  • Jonáš Kulhánek, Vojtěch Hudeček, Tomáš Nekvinda, Ondřej Dušek. AuGPT: Auxiliary Tasks and Data Augmentation for End-To-End Dialogue with Pre-Trained Language Models, in: NLP4ConvAI Workshop. [arXiv]
  • Xinnuo Xu, Ondřej Dušek, Shashi Narayan, Verena Rieser, Ioannis Konstas. MiRANews: Dataset and Benchmarks for Multi-Resource-Assisted News Summarization, In: EMNLP Findings. [Anthology]
  • Emiel van Miltenburg, Miruna Clinciu, Ondřej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson, Luou Wen. Underreporting of errors in NLG output, and what to do about it, In: INLG (Commendation for an outstanding position paper). [Anthology]
  • Zdeněk Kasner, Simon Mille and Ondřej Dušek. Text-in-Context: Token-Level Error Detection for Table-to-Text Generation, In: INLG [Anthology / Poster].
  • Vojtěch Hudeček, Ondřej Dušek and Zhou Yu. Discovering Dialogue Slots with Weak Supervision, In: ACL. [Anthology]
  • Xinnuo Xu, Ondřej Dušek, Verena Rieser and Ioannis Konstas. AggGen: Ordering and Aggregating while Generating, In: ACL. [Anthology]
  • Tomáš Nekvinda and Ondřej Dušek. Shades of BLEU, Flavours of Success: The Case of MultiWOZ, In: GEM Workshop. [Anthology]
  • Sebastian Gehrmann et al. (50+ authors). The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics, In: GEM Workshop. [Anthology]
  • Léon-Paul Schaub, Vojtěch Hudeček, Daniel Štancl, Ondřej Dušek and Patrick Paroubek. Defining And Detecting Inconsistent System Behavior in Task-oriented Dialogues, In: TALN-RECITAL. [Anthology]

2020

  • Ondřej Dušek and Zdeněk Kasner. Evaluating Semantic Accuracy of Data-to-Text Generation with Natural Language Inference, In: INLG (Best Paper Award). [ACL anthology / video / Github]
  • Zdeněk Kasner and Ondřej Dušek. Data-to-Text Generation with Iterative Text Editing, In: INLG. [ACL anthology]
  • Zdeněk Kasner and Ondřej Dušek. Train Hard, Finetune Easy: Multilingual Denoising for RDF-to-Text Generation, In: WebNLG+ Workshop. [ACL Anthology]
  • Jindřich Libovický, Zdeněk Kasner, Jindřich Helcl, and Ondřej Dušek. Expand and Filter: CUNI and LMU Systems for the WNGT 2020 Duolingo Shared Task, In: WNGT Workshop. [ACL anthology]
  • Tomáš Nekvinda and Ondřej Dušek. One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech, In: Interspeech. [ISCA archive / Github]
  • Jan Vainer and Ondřej Dušek. SpeedySpeech: Efficient Neural Speech Synthesis, In: Interspeech. [ISCA archive / Github]
  • Xinnuo Xu, Ondřej Dušek, Jingyi Li, Verena Rieser, and Ioannis Konstas. Fact-based Content Weighting for Evaluating Abstractive Summarisation, In: ACL. [ACL anthology / video / Github]

2019

  • Ondřej Dušek, Jekaterina Novikova, and Verena Rieser. Evaluating the State-of-the-Art of End-to-End Natural Language Generation: The E2E NLG Challenge, In: Computer Speech and Language. [ScienceDirect / arXiv / web]
  • Ondřej Dušek, Karin Sevegnani, Ioannis Konstas, and Verena Rieser. Automatic Quality Estimation for Natural Language Generation: Ranting (Jointly Rating and Ranking), In: INLG, Tokyo. [arXiv / slides / Github]
  • Ondřej Dušek, David M. Howcroft, and Verena Rieser. Semantic Noise Matters for Neural Natural Language Generation, In: INLG, Tokyo. [PDF / poster / Github]
  • Ondřej Dušek and Filip Jurčíček. Neural Generation for Czech: Data and Baselines, In: INLG, Tokyo. [arXiv / slides / Github (code) / Github (data)]
  • Simon Keizer, Ondřej Dušek, Xingkun Liu, and Verena Rieser. User Evaluation of a Multi-dimensional Statistical Dialogue System, In: SIGDIAL, Stockholm. [ACL / arXiv / poster / code]

2018

  • Ondřej Dušek, Jekaterina Novikova, and Verena Rieser. Findings of the E2E NLG Challenge, In: INLG, Tilburg. [arXiv / web / slides]
  • Xinnuo Xu, Ondřej Dušek, Ioannis Konstas, and Verena Rieser. Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity, In: EMNLP, Brussels. [arXiv / Github / poster]
  • Jekaterina Novikova, Ondřej Dušek, and Verena Rieser. RankME: Reliable Human Ratings for Natural Language Generation, In: NAACL, New Orleans, 2018. [arXiv / Poster / Github]
  • Shubham Agarwal, Ondřej Dušek, Ioannis Konstas, and Verena Rieser. Improving Context Modelling in Multimodal Dialogue Generation, In: INLG, Tilburg. [arXiv / Github / poster]
  • Shubham Agarwal, Ondřej Dušek, Ioannis Konstas, and Verena Rieser. A Knowledge-Grounded Multimodal Search-Based Conversational Agent, In: SCAI EMNLP workshop, Brussels. [arXiv / Github / poster]
  • Igor Shalyminov, Ondřej Dušek, and Oliver Lemon. Neural Response Ranking for Social Conversation: A Data-Efficient Approach, In: SCAI EMNLP workshop, Brussels. [arXiv / Github / slides]

2017

  • Ondřej Dušek, Jekaterina Novikova, and Verena Rieser. Referenceless Quality Estimation for Natural Language Generation, In: LGNL, Sydney, 2017. [arXiv / Poster / Slides / Github]
  • Jekaterina Novikova, Ondřej Dušek, Amanda Cercas Curry, and Verena Rieser. Why We Need New Evaluation Metrics for NLG, In: EMNLP, Copenhagen, 2017. [arXiv / Github]
  • Jekaterina Novikova, Ondřej Dušek, and Verena Rieser. The E2E Dataset: New Challenges For End-to-End Generation, In: SIGDIAL, Saarbrücken, 2017. [arXiv / Web / Poster / Slides / Video]
  • Jekaterina Novikova, Ondřej Dušek, and Verena Rieser. Data-driven Natural Language Generation: Paving the Road to Success, In: WiNLP, Vancouver, 2017. [arXiv]

2016

  • Ondřej Dušek and Filip Jurčíček. A Context-aware Natural Language Generator for Dialogue Systems, In: SIGDIAL, Los Angeles, 2016. [PDF / arXiv / Software]
  • Ondřej Dušek and Filip Jurčíček. Sequence-to-Sequence Generation for Spoken Dialogue via Deep Syntax Trees and Strings, In: ACL, Berlin, 2016. [PDF / arXiv / Software]
  • Ondřej Bojar, Ondřej Dušek, Tom Kocmi, Jindřich Libovický, Michal Novák, Martin Popel, Roman Sudarikov, and Dušan Variš. CzEng 1.6: Enlarged Czech-English Parallel Corpus with Processing Tools Dockered, In: TSD, Brno, 2016. [View on SpringerLink]
  • Rudolf Rosa, Martin Popel, Ondřej Bojar, David Mareček, and Ondřej Dušek. Moses & Treex Hybrid MT Systems Bestiary, In: DMTW, Lisbon, 2016. [PDF]
  • Roman Sudarikov, Ondřej Bojar, Ondřej Dušek, Martin Holub, and Vincent Kríž. Verb Sense Disambiguation in Machine Translation, In: HyTra-6, Osaka, 2016. [PDF]
  • Ondřej Dušek and Filip Jurčíček. A Context-aware Natural Language Generation Dataset for Dialogue Systems, In: RE-WOCHAT, Portorož, 2016. [PDF / PDF slides]

2015

  • Rudolf Rosa, Ondřej Dušek, Michal Novák, and Martin Popel. Translation Model Interpolation for Domain Adaptation in TectoMT, In: Deep MT Workshop, Prague, 2015 [PDF / PDF slides]
  • Ondřej Dušek, Luís Gomes, Michal Novák, Martin Popel, and Rudolf Rosa. New Language Pairs in TectoMT, In: WMT, Lisbon, 2015 [PDF / PDF poster]
  • Ondřej Dušek and Filip Jurčíček. Training a Natural Language Generator from Unaligned Data, In: ACL-IJCNLP, Beijing, 2015. [PDF / PDF slides / PDF poster (for YRRSDS) / Presentation video / Software]
  • Ondřej Dušek, Eva Fučíková, Jan Hajič, Martin Popel, Jana Šindlerová, and Zdeňka Urešová. Using Parallel Texts and Lexicons for Verbal Word Sense Disambiguation, In: Depling, Uppsala, 2015. [PDF / PDF slides]
  • Zdeňka Urešová, Ondřej Dušek, Eva Fučíková, Jan Hajič, and Jana Šindlerová. Bilingual English-Czech Valency Lexicon Linked to a Parallel Corpus, In: LAW IX - The 9th Linguistic Annotation Workshop, Denver, 2015. [PDF]

2014

  • Daniela Majchráková, Ondřej Dušek, Jan Hajič, Agáta Karčová, Radovan Garabík. Semi-automatic Detection of Multiword Expressions in the Slovak Dependency Treebank, In: Computational Linguistics in Bulgaria, Sofia, 2014. [PDF]
  • Daniel Zeman, Ondřej Dušek, David Mareček, Martin Popel, Loganathan Ramasamy, Jan Štěpánek, Zdeněk Žabokrtský and Jan Hajič. HamleDT: Harmonized multi-language dependency treebank, in: Language Resources and Evaluation (48) 4, December 2014. [View on SpringerLink]
  • Ondřej Dušek, Ondřej Plátek, Lukáš Žilka, and Filip Jurčíček. Alex: Bootstrapping a Spoken Dialogue System for a New Domain by Real Users, in: Proceedings of Sigdial, Philadelphia, 2014. [PDF / PDF poster]
  • Ondřej Dušek, Jan Hajič, Jaroslava Hlaváčová, Michal Novák, Pavel Pecina, Rudolf Rosa, Aleš Tamchyna, Zdeňka Urešová and Daniel Zeman. Machine Translation of Medical Texts in the Khresmoi Project, in: Ninth Workshop on Statistical Machine Translation, Baltimore, 2014. [PDF]
  • Ondřej Dušek, Jan Hajič, and Zdeňka Urešová: Verbal Valency Frame Detection and Selection in Czech and English, in: The 2nd Workshop on EVENTS, Baltimore, 2014. [PDF / PDF poster]
  • Pavel Pecina, Ondřej Dušek, Lorraine Goeuriot, Jan Hajič, Jaroslava Hlaváčová, Gareth Jones, Liadh Kelly, Johannes Leveling, David Mareček, Michal Novák, Martin Popel, Rudolf Rosa, Aleš Tamchyna, and Zdeňka Urešová: Adaptation of Machine Translation for Multilingual Information Retrieval in the Medical Domain, in: Artificial Intelligence in Medicine (61) 3, 2014. [View on ScienceDirect]
  • Matěj Korvas, Ondřej Plátek, Ondřej Dušek, Lukáš Žilka, and Filip Jurčíček: Free English and Czech Telephone Speech Corpus Shared Under the CC-BY-SA 3.0 License, in: Proceedings of LREC, Reykjavík, 2014. [PDF / PDF slides]
  • Zdeňka Urešová, Ondřej Dušek, Jan Hajič, and Pavel Pecina: Multilingual Test Sets for Machine Translation of Search Queries for Cross-lingual Information Retrieval in the Medical Domain, in: Proceedings of LREC, Reykjavík, 2014. [PDF / PDF poster]

2013

  • Ondřej Dušek, Filip Jurčíček: Robust Multilingual Statistical Morphological Generation Models, in: ACL Student Research Workshop, Sofia, 2013. [PDF / PDF slides / Presentation video / Software used for the experiments]
  • Ondřej Dušek: Towards a Truly Statistical Natural Language Generator for Spoken Dialogues, in: Week of Doctoral Students. Prague, 2013. [PDF / PDF slides]
  • Aleš Tamchyna, Ondřej Dušek, Rudolf Rosa, Pavel Pecina: MTMonkey: A Scalable Infrastructure for a Machine Translation Web Service, in: The Prague Bulletin of Mathematical Linguistics 100, 31-40. [PDF / PDF poster / Software]

2012

  • Ondřej Dušek, Zdeněk Žabokrtský, Martin Popel, Martin Majliš, Michal Novák, David Mareček: Formemes in English-Czech Deep Syntactic MT, in: Proceedings of the Seventh Workshop on Statistical Machine Translation, Montréal, 2012. [PDF]
  • Rudolf Rosa, David Mareček, Ondřej Dušek: DEPFIX: A System for Automatic Correction of Czech MT Outputs, in: Proceedings of the Seventh Workshop on Statistical Machine Translation, Montréal, 2012. [PDF]
  • Rudolf Rosa, Ondřej Dušek, David Mareček, Martin Popel: Using Parallel Features in Parsing of Machine-Translated Sentences for Correction of Grammatical Errors, in: Proceedings of SSST-6, Jeju, 2012. [PDF]
  • Ondřej Bojar, Zdeněk Žabokrtský, Ondřej Dušek, Petra Galuščáková, Martin Majliš, David Mareček, Jiří Maršík, Michal Novák, Martin Popel, Aleš Tamchyna: The Joy of Parallelism with CzEng 1.0, in: Proceedings of LREC, Istanbul, 2012. [PDF]

Theses

  • Novel Methods for Natural Language Generation in Spoken Dialogue Systems. Ph.D. Thesis, Faculty of Mathematics and Physics, Charles University, Prague, 2017. [PDF / PDF summary / PDF slides]
  • Confrontation of Czech and German valency lexicons. Master's thesis, Faculty of Arts, Charles University in Prague, 2013. [PDF (in German)]
  • Deep automatic analysis of English. Master's thesis, Faculty of Mathematics and Physics, Charles University in Prague, 2010. [PDF]
  • BashCommander. Bachelor's thesis, Faculty of Mathematics and Physics, Charles University in Prague, 2007. [PDF]

Talks

  • Large Language Models for Dialogue Applications. 4EU+ AI Days, Charles University. June 13, 2024. [PDF slides]
  • Large Language Models: How they work and what they are good for. Challenges of AI in Teaching Foreign Languages, Czech University of Life Sciences Prague. May 3, 2024. [PDF slides]
  • Looking for LLMs' Limits in Dialogue & Data-to-text. SCICHAT Workshop at EACL. Mar 21, 2024 [PDF slides]
  • Dialogue Systems (introduction). AI in HCI, Czech Technical University. Mar 8, 2024 [PDF slides]
  • Getting Structure in Dialogue with Large Language Models. Hora Informaticae, Czech Academy of Sciences. Jan 23, 2024 [PDF]
  • AI/Large Language Models. Jan 23, 2024 [PDF slides]
  • Skipping Chit-chat with ChatGPT: Large Language Models and Structured Outputs. Lecture Series: Machines That Understand? University of Vienna. Dec 7, 2023 [PDF slides] [Video]
  • Getting Past Chit-chat with ChatGPT: Large Language Models and Structured Outputs. Responsible Use of AI in Universities, Charles University. Nov 23, 2023 [PDF slides]
  • Getting Structure in Dialogue with Large Language Models. Data, AI, Znalosti meetup, University of Economics Prague. Nov 9, 2023 [PDF slides]
  • Large Language Models for Text Generation. Den s Katedrou jazykové přípravy. Sep 21, 2023 [PDF slides]
  • Neural Networks for Dialogue Systems. ČSOB/KBC Data Boot Camp. Jun 13, 2023 [PDF slides]
  • Data-to-text Generation with Neural Language Models. Scandinavian Conference on Image Analysis (SCIA). Apr 20, 2023 [PDF slides]
  • Dialogue Systems (introduction). AI in HCI, Czech Technical University. Mar 24, 2023 [PDF slides]
  • AI in Context of Text Generation. AI in Context Seminar, Charles University. Mar 9, 2023 [Seminar website] [PDF slides]
  • Robust Data-to-text Generation with Pretrained Language Models. Prague Computer Science Seminar. Feb 9, 2023 [Seminar website] [PDF slides]
  • Robust Data-to-text Generation with Pretrained Language Models. Heinrich-Heine University of Düsseldorf seminar on Selected Topic in Machine Learning and Natural Language Processing. Jan 26, 2023 [PDF slides]
  • End-to-end Neural Dialogue Systems. VOCALLS AI Afternoon, Prague. Oct 19, 2022 [PDF slides]
  • Neural Conversational AI. MLSS^N Summer School, Kraków. Jun 30, 2022 [PDF slides] [Live recording]
  • Large Neural Language Models for Data-to-text Generation. AICZECHIA Seminar, Online. Mar 22, 2022 [PDF slides] [Video]
  • Better Supervision for End-to-end Neural Dialogue Systems. VSG Invited Talks @ FIT, Brno University of Technology. Dec 1, 2021 [Web] [PDF slides] [Video]
  • Accuracy in Neural Text Generation. Heinrich-Heine University of Düsseldorf seminar on Selected Topic in Machine Learning and Natural Language Processing. Jul 23, 2021 [PDF slides]
  • Dialogue Systems at Charles University. Czechbots conference. Mar 3, 2020. [PDF slides]
  • Challenges in Neural NLG. ÚFAL Monday seminar. Dec 2, 2019. [PDF slides]
  • Challenges in Neural NLG. Apple Cambridge. Oct 16, 2019. [PDF slides]
  • Challenges in Response Generation and Conversational AI. ILCC/HCRC Seminar, University of Edinburgh. Sep 14, 2018. [PPTX slides (24MB)]
  • Can You Be Friends with a Smart Speaker Device? Pint of Science Festival, Edinburgh. May 15, 2018. [PPTX slides (63MB)]
  • Sequence-to-sequence Natural Language Generation. University of Sheffield. Jun 1, 2017. [PDF slides]
  • Home Intelligent? Assistants. Edinburgh Science Festival. Apr 8, 2017. [PPTX slides (63MB)]
  • Sequence-to-sequence Natural Language Generation for Spoken Dialogue Systems. ÚFAL Monday seminar. Mar 28, 2017. [PDF slides / Presentation video]
  • Sequence-to-sequence Natural Language Generation. HWU Interaction Lab meeting. Nov 16, 2016. [PDF slides]
  • Sequence-to-sequence Natural Language Generation. Diligent project meeting. Nov 10, 2016. [PDF slides]
  • Natural Language Generation (Mostly) for Spoken Dialogue Systems. Lecture in Filip Jurčíček's Statistical Dialogue Systems Course. May 11, 2016. [PDF slides]
  • Natural Language Generation for Spoken Dialogue Systems. Lecture in Filip Jurčíček's Statistical Dialogue Systems Course. May 14, 2015. [PDF slides]
  • A Two-stage Syntax-based Natural Language Generator. ÚFAL Monday seminar. Mar 9, 2015. [PDF slides / Presentation video]
  • Tecto to AMR and Translation (with Tim O'Gorman and others). JHU/CLSP Fred Jelinek Memorial PIRE Workshop, Aug 1, 2014. [PDF slides / Video]
  • Ein Vergleich der deutschen und tschechischen Valenzwörterbücher durch Korpusanalyse und Befragung unter Linguisten. The 4th PRAGESTT Students' German Philology Conference. Mar 21, 2014. [PDF slides / PDF handout (in German)]
  • Learning Morphology from the Corpus. ÚFAL Monday seminar. Nov 11, 2013. [PDF slides / Presentation video]
  • Natural Language Generation (Not Only) in Dialogue Systems. Lecture in Filip Jurčíček's Statistical Dialogue Systems Course. May 22, 2013. [PDF slides]