Biomedical and Clinical NLP
- Vojtěch Lanz, Pavel Pecina (2024). Paragraph Retrieval for Enhanced Question Answering in Clinical Documents. In Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, pp. 580–590, Bangkok, Thailand (bib).
Multimodality
- Mateusz Krubiński, Pavel Pecina (2024). Towards Unified Uni- and Multi-modal News Headline Generation. In Findings of the Association for Computational Linguistics: EACL 2024, pp. 437-450. St. Julian's, Malta (bib).
- Mateusz Krubiński, Pavel Pecina (2023). MLASK: Multimodal Summarization of Video-based News Articles. In Findings of the Association for Computational Linguistics: EACL 2023, pp. 910-924. ISBN 978-1-959429-47-0, Dubrovnik, Croatia (bib).
- Jindřich Libovický, Jindřich Helcl, Marek Tlustý, Ondřej Bojar, Pavel Pecina. (2016). CUNI at Post-editing and Multimodal Translation Tasks. In Proceedings of the First Conference on Machine Translation, pp. 646-654, Berlin, Germany (bib).
- Jindřich Libovický, Pavel Pecina. (2016). A Dataset and Evaluation Metric for Coherent Text Recognition from Scene Images. In Multimodal Corpora: Computer vision and language processing, pp. 33-36, Portorož, Slovenia (bib).
- Jan Hajič jr., Pavel Pecina. (2015). Matching Illustrative Images to “Soft News” Articles. In UFAL WDS 2015, Institute of Formal and Applied Linguistics, Charles University, Prague, pp. 49-56, Praha, Czechia (bib).
- Jindřich Libovický, Lukáš Neumann, Pavel Pecina, Jiří Matas. (2015). A Machine Learning Approach to Hypothesis Decoding in Scene Text Recognition. In Computer Vision - ACCV 2014 Workshops. Singapore, 2014. Revised Selected Papers, Part II. Lecture Notes in Computer Science, vol. 9009, pp. 169-180, Springer International Publishing (bib).
Optical Music Recognition
- Jiří Mayer, Milan Straka, Jan Hajič jr., Pavel Pecina (2024). Practical End-to-End Optical Music Recognition for Pianoform Music. In Document Analysis and Recognition - ICDAR 2024, pp. 55–73, Lecture Notes in Computer Science, vol 14809. Springer, Cham (bib).
- Jan Hajič jr., Petr Žabička, Jan Rychtář, Jiří Mayer, Martina Dvořáková, Filip Jebavý, Markéta Vlková, Pavel Pecina (2023). The OmniOMR Project. In Proceedings of the 5th International Workshop on Reading Music Systems, pp. 12-14, University of Alicante, Alicante, Spain (bib).
- Jonáš Havelka, Jiří Mayer, Pavel Pecina (2023). Symbol Generation via Autoencoders for Handwritten Music Synthesis. In Proceedings of the 5th International Workshop on Reading Music Systems, pp. 20-24, University of Alicante, Alicante, Spain (bib).
- Jiří Mayer, Pavel Pecina (2022). Obstacles with Synthesizing Training Data for OMR. In Proceedings of the 4th International Workshop on Reading Music Systems, pp. 15-19, University of Alicante, Alicante, Spain (bib).
- Jiří Mayer, Pavel Pecina (2021). Synthesizing Training Data for Handwritten Music Recognition. In Document Analysis and Recognition - ICDAR 2021, Lecture Notes in Computer Science, ISSN 0302-9743, 12823, pp. 626-641, Springer International Publishing, Cham, Switzerland, ISBN 978-3-030-86333-3 (bib).
- Jan Hajič jr., Matthias Dorfer, Gerhard Widmer, Pavel Pecina (2018). Towards Full-Pipeline Handwritten OMR with Musical Symbol Detection by U-Nets. In Proceedings of the 19th Conference of the International Society for Music Information Retrieval, pp. 225-232, International Society for Music Information Retrieval, New York, NY, USA, ISBN 978-2-9540351-2-3 (bib).
- Jan Hajič jr., Pavel Pecina. (2017). The MUSCIMA++ Dataset for Handwritten Optical Music Recognition. In Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, vol 1, pp. 39-46, Kyoto, Japan (bib).
- Jan Hajič jr., Pavel Pecina. (2017). Groundtruthing (Not Only) Music Notation with MUSICMarker: A Practical Overview. In Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, vol 2, pp. 47-48, Kyoto, Japan (bib).
- Jan Hajič jr., Pavel Pecina. (2017). How to Exploit Music Notation Syntax for OMR? In Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, pp. 55-56, Kyoto, Japan (bib).
- Jan Hajič jr., Pavel Pecina. (2016). Further Steps Towards a Standard Testbed for Optical Music Recognition. In Proceedings of the 17th International Society for Music Information Retrieval Conference, pp. 157-163, New York City, USA (bib).
Machine Translation
- Ibrahim Said Ahmad et al. (2024). Findings of the IWSLT 2024 Evaluation Campaign. In Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024), pp. 1–11, Bangkok, Thailand (bib).
- Mateusz Krubiński, Hashem Sellat, Shadi Saleh, Adam Pospíšil, Petr Zemánek, Pavel Pecina (2023). Multi-Parallel Corpus of North Levantine Arabic. In Proceedings of ArabicNLP 2023, pp. 411-417, Singapore (Hybrid), ISBN 978-1-959429-27-2 (bib).
- Mateusz Krubiński, Erfan Ghadery, Pavel Pecina, Marie-Francine Moens (2021). Just Ask! Evaluating Machine Translation by Asking and Answering Questions. In Proceedings of the Sixth Conference on Machine Translation, pp. 495-506, Online, ISBN 978-1-954085-94-7 (bib).
- Mateusz Krubiński, Erfan Ghadery, Pavel Pecina, Marie-Francine Moens (2021). MTEQA at WMT21 Metrics Shared Task. In Proceedings of the Sixth Conference on Machine Translation, pp. 1024-1029, Online, ISBN 978-1-954085-94-7 (bib).
- Martin Popel, Dominik Macháček, Michal Auersperger, Ondřej Bojar, Pavel Pecina (2019). English-Czech Systems in WMT19: Document-Level Transformer. In Fourth Conference on Machine Translation - Proceedings of the Conference, pp. 342-348, ISBN 978-1-950737-27-7 (bib).
- Antonio Jimeno Yepes, Aurelie Neveol, Mariana Neves, Karin Verspoor, Ondrej Bojar, Arthur Boyer, Cristian Grozea, Barry Haddow, Madeleine Kittner, Yvonne Lichtblau, Pavel Pecina, Roland Roller, Rudolf Rosa, Amy Siu, Philipp Thomas, Saskia Trescher. (2017). Findings of the WMT 2017 Biomedical Translation Shared Task. In Proceedings of the Second Conference on Machine Translation, Volume 2: Shared Task Papers, pp. 234-247, Copenhagen, Denmark (bib).
- Pavel Pecina, Antonio Toral, Vassilis Papavassiliou, Prokopis Prokopidis, Aleš Tamchyna, Andy Way, Josef van Genabith. (2015). Domain adaptation of statistical machine translation with domain-focused web crawling. Language Resources and Evaluation, 49(1), pp. 147-193. Springer Netherlands (bib).
- Antonio Toral, Pavel Pecina, Longyue Wang, Josef van Genabith. (2015). Linguistically-augmented Perplexity-based Data Selection for Language Models. In Computer Speech & Language, Special Issue on Hybrid Machine Translation: Integration of Linguistics and Statistics, 32(1), pp. 11-26, Elsevier (bib).
- Pavel Pecina, Ondřej Dušek, Lorraine Goeuriot, Jan Hajič, Jaroslava Hlaváčová, Gareth J. F. Jones, Liadh Kelly, Johannes Leveling, David Mareček, Michal Novák, Martin Popel, Rudolf Rosa, Aleš Tamchyna, Zdeňka Urešová. (2014). Adaptation of machine translation for multilingual information retrieval in the medical domain. In Artificial Intelligence in Medicine 61, pp. 165-185, Elsevier (bib).
- Ondřej Bojar, Christian Buck, Christian Federmann, Barry Haddow, Johannes Leveling, Philipp Koehn, Christof Monz, Pavel Pecina, Matt Post, Herve Saint-Amand, Radu Soricut, Lucia Specia, Aleš Tamchyna. (2014). Findings of the 2014 Workshop on Statistical Machine Translation. In Proceedings of the Ninth Workshop on Statistical Machine Translation, pp. 12-58, Baltimore, USA (bib).
- Ondřej Dušek, Jan Hajič, Jaroslava Hlaváčová, Michal Novák, Pavel Pecina, Rudolf Rosa, Aleš Tamchyna, Zdeňka Urešová, Daniel Zeman. (2014). Machine Translation of Medical Texts in the Khresmoi Project. In Proceedings of the Ninth Workshop on Statistical Machine Translation, pp. 221-228, Baltimore, USA (bib).
- Jindřich Libovický, Pavel Pecina. (2014). Tolerant BLEU: a Submission to the WMT14 Metrics Task. In Proceedings of the Ninth Workshop on Statistical Machine Translation, pp. 409-413, Baltimore, USA (bib).
- Zdeňka Urešová, Ondřej Dušek, Jan Hajič, Pavel Pecina. (2014). Multilingual Test Sets for Machine Translation of Search Queries for Cross-Lingual Information Retrieval in the Medical Domain. In Proceedings of the Ninth International Conference on Language Resources and Evaluation, pp. 3244-3247, Reykjavik, Iceland (bib).
- Aleš Tamchyna, Ondřej Dušek, Rudolf Rosa, Pavel Pecina. (2013). MTMonkey: A Scalable Infrastructure for a Machine Translation Web Service. In The Prague Bulletin of Mathematical Linguistics, No. 100, pp. 31-40 (bib).
- Pavel Pecina. (2013). Jörg Tiedemann: Bitext Alignment (book review). Machine Translation, Volume 27, Issue 1, pp. 77-79, Springer Netherlands (bib).
- Pavel Pecina, Antonio Toral, Josef van Genabith. (2012). Simple and Effective Parameter Tuning for Domain Adaptation of Statistical Machine Translation. In Proceedings of the 24th International Conference on Computational Linguistics, pp. 2209-2224, Mumbai, India (bib).
- Antonio Toral, Leroy Finn, Dominic Jones, Pavel Pecina, David Lewis, Declan Groves. (2012). Retraining Machine Translation with Post-edits to Increase Post-editing Productivity in Content Management Systems. In International Workshop on Expertise in Translation and Post-editing Research and Application, pp. 39-40, Copenhagen, Denmark (bib).
- Pavel Pecina, Antonio Toral, Vassilis Papavassiliou, Prokopis Prokopidis, Josef van Genabith. (2012). Domain Adaptation of Statistical Machine Translation using Web-Crawled Resources: A Case Study. In Proceedings of the 16th Annual Conference of the European Association for Machine Translation, pp. 145-152, Trento, Italy (bib).
- Antonio Toral, Marc Poch, Pavel Pecina, Gregor Thurmair (2012). Efficiency-Based Evaluation of Aligners for Industrial Applications. In Proceedings of the 16th Annual Conference of the European Association for Machine Translation, pp. 57-60, Trento, Italy (bib).
- Christian Federmann, Maite Melero, Pavel Pecina, Josef van Genabith. (2012). Towards Optimal Choice Selection for Improved Hybrid Machine Translation. In The Prague Bulletin of Mathematical Linguistics, 97, pp. 5-22 (bib).
- Eleftherios Avramidis, Marta R. Costa-jussa, Christian Federmann, Maite Melero, Pavel Pecina, Josef van Genabith. (2012). A Richly Annotated, Multilingual Parallel Corpus for Hybrid Machine Translation. In Proceedings of the Eighth International Conference on Language Resources and Evaluation, pp. 3430-3435, Istanbul, Turkey (bib).
- Eleftherios Avramidis, Marta R. Costa-jussa, Christian Federmann, Maite Melero, Pavel Pecina, Josef van Genabith. (2012). The ML4HMT Workshop on Optimising the Division of Labour in Hybrid Machine Translation. In Proceedings of the Eighth International Conference on Language Resources and Evaluation, pp. 2189-2139, Istanbul, Turkey (bib).
- Pavel Pecina, Antonio Toral, Andy Way, Vassilis Papavassiliou, Prokopis Prokopidis, Maria Giagkou. (2011). Towards Using Web-Crawled Data for Domain Adaptation in Statistical Machine Translation. In Proceedings of the 15th Annual Conference of the European Association for Machine Translation, pp. 297-304, Leuven, Belgium (bib).
- Antonio Toral, Pavel Pecina, Andy Way, Marc Poch. (2011). Towards a User-Friendly Webservice Architecture for Statistical Machine Translation in the PANACEA project. In Proceedings of the 15th Annual Conference of the European Association for Machine Translation, pp. 63-72, Leuven, Belgium (bib).
- Santanu Pal, Sudip Kumar Naskar, Pavel Pecina, Sivaji Bandyopadhyay, Andy Way. (2010). Handling Named Entities and Compound Verbs in Phrase-Based Statistical Machine Translation. In Proceedings of the 2010 Workshop on Multiword Expressions: from Theory to Applications, pp. 46-54, Beijing, China (bib).
- Jinhua Du, Pavel Pecina, Andy Way. (2010). An Augmented Three-Pass System Combination Framework: DCU Combination System for WMT 2010. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, pp. 290-295, Uppsala, Sweden (bib).
- Sergio Penkale, Rejwanul Haque, Sandipan Dandapat, Pratyush Banerjee, Ankit K. Srivastava, Jinhua Du, Pavel Pecina, Sudip Kumar Naskar, Mikel L. Forcada, Andy Way. (2010). MATREX: The DCU MT System for WMT 2010. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, pp. 143-148, Uppsala, Sweden (bib).
- Jimmy Lin, Craig G. Murray, Bonnie Dorr, Jan Hajič, Pavel Pecina. (2009). A Cost-Effective Lexical Acquisition Process for Large-Scale Thesaurus Translation. Language Resources and Evaluation, 43, pp. 27-40. Springer Netherlands (bib).
- Petr Homola, Vladislav Kuboň, Pavel Pecina. (2009). A Simple Automatic MT Evaluation Metric. In Proceedings of the Fourth Workshop on Statistical Machine Translation, pp. 33-36, Athens, Greece (bib).
- Craig G. Murray, Bonnie J. Dorr, Jimmy Lin, Jan Hajič, Pavel Pecina. (2006). Leveraging Reusability: Cost-Effective Lexical Acquisition for Large-Scale Ontology Translation. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pp. 945-952, Sydney, Australia (bib).
- Craig G. Murray, Bonnie Dorr, Jimmy Lin, Pavel Pecina, Jan Hajič. (2006). Leveraging Recurrent Phrase Structure in Large-scale Ontology Translation. In Proceedings of the 11th Annual Conference of the European Association for Machine Translation, pp. 1-10, Oslo, Norway (bib).
- William Byrne, Sanjeev Khudanpur, Woosung Kim, Shankar Kumar, Pavel Pecina, Paola Virga, Peng Xu, David Yarowsky. (2003). The Johns Hopkins University 2003 Chinese-English Machine Translation System. In Proceedings of the Ninth Machine Translation Summit of the International Association for Machine Translation, pp. 447-450, New Orleans, Louisiana, USA (bib).
Lexical Association Measures, Multiword Expressions
- Lubomír Krčmář, Karel Ježek, Pavel Pecina. (2013). Determining Compositionality of Word Expressions Using Various Word Space Models and Measures. In Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality, pp. 64-73, Sofia, Bulgaria (bib).
- Eduard Bejček, Pavel Straňák, Pavel Pecina. (2013). Syntactic Identification of Occurrences of Multiword Expressions in Text using a Lexicon with Dependency Structures. In Proceedings of the 9th Workshop on Multiword Expressions, pp. 106-115, Atlanta, Georgia, USA (bib).
- Lubomír Krčmář, Karel Ježek, Pavel Pecina. (2013). Determining Compositionality of Word Expressions Using Word Space Models. In Proceedings of the 9th Workshop on Multiword Expressions, pp. 42-50, Atlanta, Georgia, USA (bib).
- Pavel Pecina. (2011). Book Reviews: Syntax-Based Collocation Extraction by Violeta Seretan. Computational Linguistics, 37, pp. 631-633 (bib).
- Pavel Pecina. (2010). Lexical association measures and collocation extraction. Language Resources and Evaluation, 44, pp. 137-158. Springer Netherlands (bib).
- Pavel Pecina. (2009). Lexical Association Measures: Collocation Extraction. vol. 4 of Studies in Computational and Theoretical Linguistics. UFAL, Praha, Czech Republic (bib).
- Pavel Pecina. (2008). Lexical Association Measures: Collocation Extraction. PhD thesis, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic (bib).
- Pavel Pecina. (2008). A Machine Learning Approach to Multiword Expression Extraction. In Proceedings of the LREC 2008 Workshop Towards a Shared Task for Multiword Expressions, pp. 54-57, Marrakech, Morocco (bib).
- Pavel Pecina. (2008). Reference Data for Czech Collocation Extraction. In Proceedings of the LREC 2008 Workshop Towards a Shared Task for Multiword Expressions, pp. 11-14, Marrakech, Morocco (bib).
- Pavel Pecina, Pavel Schlesinger. (2006). Combining Association Measures for Collocation Extraction. In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pp. 651-658, Sydney, Australia (bib).
- Silvie Cinková, Petr Podveský, Pavel Pecina, Pavel Schlesinger. (2006). Semi-automatic Building of Swedish Collocation Lexicon. In Proceedings of the 5th International Conference on Language Resources and Evaluation, pp. 1890-1893, Genova, Italy (bib).
- Pavel Pecina. (2005). An Extensive Empirical Study of Collocation Extraction Methods. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, Student Research Workshop, pp. 13-18, Ann Arbor, Michigan (bib).
Web as a Corpus
- Drahomíra Spoustová, Miroslav Spousta, Pavel Pecina. (2010). Building a Web Corpus of Czech. In Proceedings of the 7th International Conference on Language Resources and Evaluation, pp. 998-1001, Valletta, Malta (bib).
- Miroslav Spousta, Michal Marek, Pavel Pecina. (2008). Victor: the Web-Page Cleaning Tool. In Proceedings of the 4th Web as Corpus Workshop - Can we beat Google?, pp. 12-17, Marrakech, Morocco (bib).
- Michal Marek, Pavel Pecina, Miroslav Spousta. (2007). Web Page Cleaning with Conditional Random Fields. In Proceedings of the 3rd Web As a Corpus Workshop, Incorporating CLEANEVAL, pp. 155-162, Louvain-la-Neuve, Belgium (bib).
Information Retrieval
- Shadi Saleh, Hadi Abdi Khojasteh, Hashem Sellat, Pavel Pecina (2021). CUNI-MTIR at COVID-19 MLIA @ Eval Task 2. Multilingual Information Access (MLIA) systems, Online (bib).
- Shadi Saleh, Hashem Sellat, Hadi Abdi Khojasteh, Pavel Pecina (2021). CUNI-MTIR at COVID-19 MLIA @ Eval Task 3. Multilingual Information Access (MLIA) systems, Online (bib).
- Shadi Saleh, Pavel Pecina (2020). Document Translation vs. Query Translation for Cross-Lingual Information Retrieval in the Medical Domain. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 6849-6860, ISBN 978-1-952148-25-5 (bib).
- Shadi Saleh, Pavel Pecina (2019). Term Selection for Query Expansion in Medical Cross-Lingual Information Retrieval. In Advances in Information Retrieval; 41st European Conference on IR Research, ECIR 2019, Lecture Notes in Computer Science, ISSN 0302-9743, 1, pp. 507-522, Springer International Publishing, Berlin, Germany, ISBN 978-3-030-15719-7 (bib).
- Shadi Saleh, Pavel Pecina (2019). An Extended CLEF eHealth Test Collection for Cross-lingual Information Retrieval in the Medical Domain. In Advances in Information Retrieval; 41st European Conference on IR Research, ECIR 2019, Lecture Notes in Computer Science, ISSN 0302-9743, 1, pp. 188-195, Springer International Publishing, Berlin, Germany, ISBN 978-3-030-15719-7 (bib).
- Shadi Saleh, Pavel Pecina (2018). CUNI team: CLEF eHealth Consumer Health Search Task 2018. In Working Notes of CLEF 2018 - Conference and Labs of the Evaluation Forum, pp. 1-11, CEUR-WS, Aachen, Germany (bib).
- Joao Palotti, Guido Zuccon, Jimmy, Pavel Pecina, Mihai Lupu, Lorraine Goeuriot, Liadh Kelly, Allan Hanbury. (2017). CLEF 2017 Task Overview: The IR Task at the eHealth Evaluation Lab - Evaluating Retrieval Methods for Consumer Health Search. In Working Notes of CLEF 2017 - Conference and Labs of the Evaluation Forum, Dublin, Ireland (bib).
- Shadi Saleh, Pavel Pecina. (2017). Task3 Patient-Centred Information Retrieval: Team CUNI. In Working Notes of CLEF 2017 - Conference and Labs of the Evaluation Forum, Dublin, Ireland (bib).
- Petra Galuščáková, Michal Batko, Jan Čech, Jiří Matas, David Novák, Pavel Pecina. (2017). Visual Descriptors in Methods for Video Hyperlinking. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, pp. 294-300, Bucharest, Romania (bib).
- Shadi Saleh, Pavel Pecina. (2016). Reranking Hypotheses of Machine-Translated Queries for Cross-Lingual Information Retrieval. In Experimental IR Meets Multilinguality, Multimodality, and Interaction. 7th International Conference of the CLEF Association, CLEF 2016, pp. 54-66, Évora, Portugal (bib).
- Guido Zuccon, Joao Palotti, Lorraine Goeuriot, Liadh Kelly, Mihai Lupu, Pavel Pecina, Henning Müller, Julie Budaher, Anthony Deacon. (2016). The IR Task at the CLEF eHealth Evaluation Lab 2016: User-centred Health Information Retrieval. In Working Notes of CLEF 2016 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings 1609, pp. 15-27, Évora, Portugal (bib).
- Shadi Saleh, Pavel Pecina. (2016). Task3 Patient-Centred Information Retrieval: Team CUNI. In Working Notes of CLEF 2016 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings 1609, pp. 123-129, Évora, Portugal (bib).
- Shadi Saleh, Pavel Pecina (2016). Adapting SMT Query Translation Reranker to New Languages in Cross-Lingual Information Retrieval. In Proceedings of the Medical Information Retrieval (MedIR) Workshop. A SIGIR 2016 workshop, Pisa, Italy (bib).
- Petra Galuščáková, Michal Batko, Martin Kruliš, Jakub Lokoč, David Novák, Pavel Pecina (2016). CUNI at TRECVID 2015 Video Hyperlinking Task. In TRECVID 2015 Workshop Notebook, Gaithersburg, MD, USA (bib).
- Petra Galuščáková, Shadi Saleh, Pavel Pecina. (2016). SHAMUS: UFAL Search and Hyperlinking Multimedia System. In Proceedings of the 38th European Conference on Information Retrieval, demo papers, pp. 853-856, Padova, Italy (bib).
- Petra Galuščáková, Pavel Pecina. (2015). Audio Information for Hyperlinking of TV Content. In Proceedings of the Third Edition Workshop on Speech, Language & Audio in Multimedia, pp. 27-30. Brisbane, Australia (bib).
- Petra Galuščáková, Pavel Pecina. (2015). CUNI at MediaEval 2015 Search and Anchoring in Video Archives: Anchoring via Information Retrieval. In Working Notes Proceedings of the MediaEval 2015 Workshop, CEUR Workshop Proceedings, vol. 1436, Wurzen, Germany (bib).
- João Palotti, Guido Zuccon, Lorraine Goeuriot, Liadh Kelly, Allan Hanbury, Gareth JF Jones, Mihai Lupu, Pavel Pecina. (2015). CLEF eHealth Evaluation Lab 2015, Task 2: Retrieving information about medical symptoms. In Working Notes of CLEF 2015 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings 1391, Toulouse, France (bib).
- Shadi Saleh, Feraena Bibyna, Pavel Pecina. (2015). CUNI at the CLEF eHealth 2015 Task 2. In Working Notes of CLEF 2015 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings 1391, Toulouse, France (bib).
- Petra Galuščáková, Pavel Pecina. (2014). CUNI at MediaEval 2014 Search and Hyperlinking Task: Search Task Experiments. In Working Notes Proceedings of the MediaEval 2014 Workshop, CEUR Workshop Proceedings, vol. 1263, Barcelona, Spain (bib).
- Petra Galuščáková, Martin Kruliš, Jakub Lokoč, Pavel Pecina. (2014). CUNI at MediaEval 2014 Search and Hyperlinking Task: Visual and Prosodic Features in Hyperlinking. In Working Notes Proceedings of the MediaEval 2014 Workshop, CEUR Workshop Proceedings, vol. 1263, Barcelona, Spain (bib).
- Lorraine Goeuriot, Liadh Kelly, Wei Li, Joao Palotti, Pavel Pecina, Guido Zuccon, Allan Hanbury, Gareth Jones, Henning Müller. (2014). ShARe/CLEF eHealth Evaluation Lab 2014, Task 3: User-centred health information retrieval. In CLEF Online Working Notes, CEUR Workshop Proceedings 1180, pp. 43-61, Sheffield, UK (bib).
- Shadi Saleh, Pavel Pecina. (2014). CUNI at the ShARe/CLEF eHealth Evaluation Lab 2014. In CLEF Online Working Notes, CEUR Workshop Proceedings 1180, pp. 226-235, Sheffield, UK (bib).
- Petra Galuščáková, Pavel Pecina. (2014). Experiments with Segmentation Strategies for Passage Retrieval in Audio-Visual Documents. In Proceedings of International Conference on Multimedia Retrieval, pp. 217-224, Glasgow, UK (bib).
- Petra Galuščáková, Pavel Pecina. (2013). CUNI at MediaEval 2013 Similar Segments in Social Speech Task. In Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, CEUR Workshop Proceedings, vol. 1043, Barcelona, Spain (bib).
- Petra Galuščáková, Pavel Pecina. (2013). CUNI at MediaEval 2013 Search and Hyperlinking Task. In Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, CEUR Workshop Proceedings, vol. 1043, Barcelona, Spain (bib).
- Niraj Aswani et al. (2013). Khresmoi - multilingual semantic search of medical text and images. In Proceedings of the 14th World Congress on Medical and Health Informatics, Copenhagen, Denmark, Volume 192 of Studies in Health Technology and Informatics, pp. 1266 (bib).
- Niraj Aswani et al. (2013). Khresmoi Professional: Multilingual Semantic Search for Medical Professionals. In Proceedings of the ACM SIGIR Workshop on Health Search and Discovery: Helping Users and Advancing Medicine, pp. 31-34, Dublin, Ireland (bib).
- Maria Eskevich, Gareth J.F. Jones, Shu Chen, Robin Aly, Roeland Ordelman, Danish Nadeem, Camille Guinaudeau, Guillaume Gravier, Pascale Sébillot, Tom De Nies, Pedro Debevere, Rik Van de Walle, Petra Galuščáková, Pavel Pecina, Martha Larson. (2013). Multimedia Information Seeking through Search And Hyperlinking. In Proceedings of the 3rd ACM conference on International conference on multimedia retrieval, pp. 287-294, Dallas, Texas, USA (bib).
- Petra Galuščáková, Pavel Pecina. (2012). CUNI at MediaEval 2012 Search and Hyperlinking Task. In Working Notes Proceedings of the MediaEval 2012 Workshop, CEUR Workshop Proceedings, vol. 927, Pisa, Italy (bib).
- Niraj Aswani et al. (2012). Khresmoi: Multimodal Multilingual Medical Information Search. In Proceedings of the 24th International Conference of the European Federation for Medical Informatics, Quality of Life through Quality of Information, Village of the future, IOS Press, Pisa, Italy (bib).
- Petra Galuščáková, Pavel Pecina, Jan Hajič. (2012). Penalty Functions for Evaluation Measures of Unsegmented Speech Retrieval. In Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics. Proceedings of the Third International Conference of the CLEF Initiative - CLEF 2012, Lecture Notes in Computer Science, vol. 7488, pp. 100-111, Springer Berlin Heidelberg (bib).
- Jana Straková, Pavel Pecina. (2010). Czech Information Retrieval with Syntax-based Language Models. In Proceedings of the 7th International Conference on Language Resources and Evaluation, pp. 1359-1362, Valletta, Malta (bib).
- Pavel Pecina, Petra Hoffmannová, Gareth Jones, Ying Zhang, Douglas Oard. (2008). Overview of the CLEF-2007 Cross-Language Speech Retrieval Track. In Advances in Multilingual and Multimodal Information Retrieval, vol. 5152 of Lecture Notes in Computer Science, pp. 674-686. Springer Berlin Heidelberg (bib).
- Pavel Češka, Pavel Pecina. (2008). Charles University at CLEF 2007 Ad-Hoc Track. In Advances in Multilingual and Multimodal Information Retrieval, vol. 5152 of Lecture Notes in Computer Science, pp. 33-36. Springer Berlin Heidelberg (bib).
- Pavel Ircing, Pavel Pecina, Douglas Oard, Jianqiang Wang, Ryen White, Jan Hoidekr. (2007). Information Retrieval Test Collection for Searching Spontaneous Czech Speech. In Text, Speech and Dialogue, vol. 4629 of Lecture Notes in Computer Science, pp. 439-446. Springer Berlin Heidelberg (bib).
- Douglas Oard, Jianqiang Wang, Gareth Jones, Ryen White, Pavel Pecina, Dagobert Soergel, Xiaoli Huang, Izhak Shafran. (2007). Overview of the CLEF-2006 Cross-Language Speech Retrieval Track. In Evaluation of Multilingual and Multi-modal Information Retrieval, vol. 4730 of Lecture Notes in Computer Science, pp. 744-758. Springer Berlin Heidelberg (bib).
- Pavel Pecina, Petra Hoffmannová, Gareth Jones, Ying Zhang, Douglas Oard. (2007). Overview of the CLEF-2007 Cross-Language Speech Retrieval Track. In Working Notes for the CLEF 2007 Workshop on Cross-Language Information Retrieval and Evaluation, Budapest, Hungary (bib).
- Pavel Češka, Pavel Pecina. (2007). Charles University at CLEF 2007 Ad-Hoc Track. In Working Notes for the CLEF 2007 Workshop on Cross-Language Information Retrieval and Evaluation, Budapest, Hungary (bib).
- Pavel Češka, Pavel Pecina. (2007). Charles University at CLEF 2007 CL-SR Track. In Working Notes for the CLEF 2007 Workshop on Cross-Language Information Retrieval and Evaluation, Budapest, Hungary (bib).
- Douglas Oard, Jianqiang Wang, Gareth Jones, Ryen White, Pavel Pecina, Dagobert Soergel, Xiaoli Huang, Izhak Shafran. (2006). Overview of the CLEF-2006 Cross-Language Speech Retrieval Track. In Working Notes for the CLEF 2006 Workshop on Cross-Language Information Retrieval and Evaluation, Alicante, Spain (bib).
Arabic Language Processing
- Mohammed Attia, Pavel Pecina, Younes Samih, Khaled Shaalan, Josef van Genabith. (2016). Arabic Spelling Error Detection and Correction. In Natural Language Engineering, 22(5), pp. 751-773, Cambridge University Press (bib).
- Mohammed Attia, Pavel Pecina, Antonio Toral, Josef van Genabith. (2014). A corpus-based finite-state morphological toolkit for contemporary Arabic. In Journal of Logic and Computation 24 (2), pp. 455-472, Oxford Journals (bib).
- Mohammed Attia, Pavel Pecina, Younes Samih, Khaled Shaalan, Josef van Genabith. (2012). Improved Spelling Error Detection and Correction for Arabic. In Proceedings of the 24th International Conference on Computational Linguistics, pp. 103-112, Mumbai, India (bib).
- Khaled Shaalan, Younes Samih, Mohammed Attia, Pavel Pecina, Josef van Genabith. (2012). Arabic Word Generation and Modelling for Spell Checking. In Proceedings of the Eighth International Conference on Language Resources and Evaluation, pp. 719-725, Istanbul, Turkey (bib).
- Mohammed Attia, Pavel Pecina, Antonio Toral, Lamia Tounsi, Josef van Genabith. (2011). A Lexical Database for Modern Standard Arabic Interoperable with a Finite State Morphological Transducer. In Systems and Frameworks for Computational Morphology, vol. 100 of Communications in Computer and Information Science, pp. 98-118, Springer Berlin Heidelberg (bib).
- Mohammed Attia, Pavel Pecina, Antonio Toral, Lamia Tounsi, Josef van Genabith. (2011). An Open-Source Finite State Morphological Transducer for Modern Standard Arabic. In Proceedings of the 9th International Workshop on Finite-State Methods and Natural Language Processing, pp. 125-133, Blois, France (bib).
- Mohammed Attia, Pavel Pecina, Lamia Tounsi, Antonio Toral, Josef van Genabith. (2011). Lexical Profiling for Arabic. In Electronic Lexicography in the 21st Century, pp. 22-33, Bled, Slovenia (bib).
- Mohammed Attia, Antonio Toral, Lamia Tounsi, Pavel Pecina, Josef van Genabith. (2010). Automatic Extraction of Arabic Multiword Expressions. In Proceedings of the 2010 Workshop on Multiword Expressions: from Theory to Applications, pp. 19-27, Beijing, China (bib).
Morphology and Tagging
- Thanh Long Duong, Steven Bird, Paul Cook, Pavel Pecina. (2013). Increasing the quality and quantity of source language data for unsupervised cross-lingual POS tagging. In Proceedings of the Sixth International Joint Conference on Natural Language Processing, pp. 1243-1249, Nagoya, Japan (bib).
- Thanh Long Duong, Paul Cook, Steven Bird, Pavel Pecina. (2013). Simpler unsupervised POS tagging with bilingual projections. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 634-639, Sofia, Bulgaria (bib).
- Drahomíra Spoustová, Pavel Pecina, Jan Hajič, Miroslav Spousta. (2008). Validating the Quality of Full Morphological Annotation. In Proceedings of the 6th International Conference on Language Resources and Evaluation, pp. 1-4, Marrakech, Morocco (bib).
Summarization
- Mateusz Krubiński, Pavel Pecina (2022). From COMET to COMES – Can Summary Evaluation Benefit from Translation Evaluation? In Proceedings of the 3rd Workshop on Evaluation and Comparison of NLP Systems, pp. 21-31, Online (bib).
Neural Representations
- Michal Auersperger, Pavel Pecina (2022). Defending Compositionality in Emergent Languages. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop, pp. 285-291, Hybrid: Seattle, Washington + Online, ISBN 978-1-955917-73-5 (bib).
- Michal Auersperger, Pavel Pecina (2021). Solving SCAN Tasks with Data Augmentation and Input Embeddings. In Proceedings of the Recent Advances in Natural Language Processing, pp. 86-91, INCOMA Ltd., Shoumen, Bulgaria, ISBN 978-954-452-072-4 (bib).
Named Entities
- Victor Mireles, Stephanie Billib, Artem Revenko, Stephan Jänicke, Frank Uiterwaal, Pavel Pecina (2023). Exploratory Analysis of the Applicability of Formalised Knowledge to Personal Experience Narration. In Data Science — Analytics and Applications. iDSC 2023, pp. 75-80, Springer, Cham, ISBN 978-3-031-42171-6 (bib).
Lexical Semantics
- Christopher Brückner, Leixin Zhang, Pavel Pecina (2024). Similarity-Based Cluster Merging for Semantic Change Modeling. In Proceedings of the 5th Workshop on Computational Approaches to Historical Language Change, pp. 23–28, Bangkok, Thailand (bib).
- Jan Hajič, Martin Holub, Marie Hučínová, Martin Pavlík, Pavel Pecina, Pavel Straňák, Pavel M. Šidák. (2004). Validating and Improving the Czech WordNet via Lexico-Semantic Annotation of the Prague Dependency Treebank. In Proceedings of the Fourth International Conference on Language Resources and Evaluation Workshop: Building Lexical Resources from Semantically Annotated Corpora, pp. 25-30, Lisbon, Portugal (bib).
Language Identification
- Vincent Kríž, Martin Holub, Pavel Pecina. (2015). Feature Extraction for Native Language Identification Using Language Modeling. In Proceedings of Recent Advances in Natural Language Processing, pp. 298-306. Hissar, Bulgaria (bib).
Misc
- Gabriel Altman et al. (2016). Nový encyklopedický slovník češtiny. Nakladatelství Lidové noviny, Praha, Czechia, ISBN 978-80-7422-480-5 (bib).
- Leo Wanner et al. (2021). Towards a Versatile Intelligent Conversational Agent as Personal Assistant for Migrants. In Advances in Practical Applications of Agents, Multi-Agent Systems, and Social Good. The PAAMS Collection, Salamanca, Spain, Lecture Notes in Computer Science, ISSN 0302-9743, 12946, pp. 316-327, Springer, Cham, Switzerland, ISBN 978-3-030-85739-4 (bib).