Principal investigator (ÚFAL):
VALLEX
Delving Deeper: Lexicographic Description of Syntactic and Semantic Properties of Czech Verbs
Language phenomena at the syntax-semantics interface have been studied extensively, yet an adequate framework for their lexicographic description is still missing. The goal of the project is to propose such a framework and to apply it in the lexicographic processing of language data. Theoretical and applied research are pursued together, since each benefits the other. The research focuses on two areas. First, it aims to deepen insight into various changes in the valency structure of verbs; a formal model for the lexicographic representation of such changes is designed. The model is used to describe grammatical, syntactic and semantic diatheses in the Valency Lexicon of Czech Verbs (VALLEX). Second, the project deals with the mapping of lexical resources, primarily aiming to enhance VALLEX with a semantic classification based on the FrameNet lexical database. The main applied output of the project is a qualitatively and quantitatively enhanced version of VALLEX, available to a wide professional audience, to students and other language users, and to NLP applications.
The main goal of the project is to propose an adequate framework for the theoretical description of language phenomena at the syntax-semantics interface; this framework will be applied in the lexicographic processing of language data. A close interplay between theoretical research and its application to extensive data annotation is a fruitful strategy that strengthens both.
The following areas are addressed in the project:
-
The lexicographic representation of various changes in valency structure of verbs
-
Theoretical research; design of a formal model for lexicographic description
-
Grammatical and syntactic diatheses: theoretical and practical aspects
-
Semantic diatheses: theoretical and practical aspects
-
Other types of changes in valency structure of verbs
-
Comparative aspects of diatheses
-
Application in an electronic language resource
-
Mapping lexical resources: an effective way of enriching lexical information
-
Enhancing Czech valency lexicon with semantic classes and semantic roles
-
Strengthening lexical resources with corpus evidence
The full project description can be found in the project proposal.
Project partners
Project Summary
The accomplishments of the four-year project Delving Deeper: Lexicographic Description of Syntactic and Semantic Properties of Czech Verbs are twofold:
1. Theoretical insight into various language phenomena at the syntax-semantics interface has been deepened: both grammaticalized alternations (diatheses and reciprocity) and lexicalized alternations (lexical-semantic conversions, structural splitting of a situational participant and multiple structural realization of a situational participant) were put under scrutiny. A contrastive perspective was also applied: we focused in particular on differences in the syntactic behavior of Czech, Polish and Russian verbs undergoing changes in their valency structure. Moreover, a typologically new type of syntactic change, related to syntactic reflexivity in Czech, was identified and studied in detail, especially changes in the morphemic expression of verbal complementations conditioned by the long and clitic variants of the reflexive pronoun.
Based on these theoretical achievements, the formal model of the lexicon has been refined and enriched; as a result, the final model provides an adequate and economical lexicographic representation of the studied phenomena. Further, elaborate lexicographic rules for describing changes in the valency behavior of verbs undergoing alternations have been designed.
2. The electronic Valency Lexicon of Czech Verbs (VALLEX) has been substantially enhanced, both qualitatively and quantitatively, in two dimensions. First, information on the applicability of individual alternations (both grammaticalized and lexicalized) was added to its data component. Moreover, the lexicon has been enriched with sample annotations of Polish and Russian lexical units undergoing alternations. Second, VALLEX has been interlinked with the annotation lexicon of the Prague Dependency Treebank; as a result, the lexicon has been enriched with examples from the treebank. The lexicon is accessible at http://ufal.mff.cuni.cz/vallex/3.0/.
The results were made available to the research community in journals dedicated to Czech and other Slavic languages (especially categories Jimp, Jneimp and Jrec; 5 articles already published and 3 accepted for publication), in thematic anthologies and as chapters of monographs (category C; 1 published, 4 accepted for publication) and in 1 theoretical monograph (already published). In addition, these results were presented at international and Czech conferences on both theoretical and computational linguistics (especially at those with proceedings monitored in WoS, category D; 5 published texts).
The main applied output of the project is a valency lexicon of Czech verbs, enhanced both qualitatively and quantitatively, that is available to a wide professional audience as well as to students and other language users. Emphasis was laid on both human and machine readability, so that both linguists and developers of applications within the Natural Language Processing domain can use it.
The lexicon is prepared for publication as a monograph (category B) and has already been released as an electronic language resource (software, category R).
Another positive aspect of the project was the involvement of students, which resulted in 1 MSc thesis (Gregoire Labbé) and 2 PhD theses (Václava Kettnerová and Eduard Bejček). Five more PhD students also participated in the project (Anna Vernerová, Marie Podobová, Natalia Klyueva, Katarzyna Vaculová and Adriana Filas) and thus gained important research experience.
Publications
2016
-
book
-
Lopatková Markéta, Kettnerová Václava, Bejček Eduard, Vernerová Anna, Žabokrtský Zdeněk: Valenční slovník českých sloves VALLEX. Karolinum, Praha, 698 pp., 2016.
-
chapters in books
-
Lopatková Markéta, Vernerová Anna, Kettnerová Václava: Diateze ve Valenčním slovníku českých sloves VALLEX. In Skwarska, K., Kaczmarska, E. (eds.) Výzkum slovesné valence ve slovanských zemích. Slovanský ústav AV ČR, v.v.i., Praha, pp. 149-167, 2016.
-
Panevová Jarmila: Valence v gramatice, valence ve slovníku. In Skwarska, K., Kaczmarska, E. (eds.) Výzkum slovesné valence ve slovanských zemích. Slovanský ústav AV ČR, v.v.i., Praha, pp. 13-25, 2016.
-
Skwarska Karolína, Bolbot Katarzyna, Filas Adriana: Zpracování polských zvratných konstrukcí ve slovníku VALLEX. In Skwarska, K., Kaczmarska, E. (eds.) Výzkum slovesné valence ve slovanských zemích. Slovanský ústav AV ČR, v.v.i., Praha, pp. 169-180, 2016.
2015
-
articles
-
Kettnerová, V., Lopatková, M., Panevová, J.: Shoda doplňku v reflexivních konstrukcích v češtině. Slovo a slovesnost, Vol. 76, No. 3, pp. 198-214, 2015.
-
Podobová, M.: Bezobjektové posesivní rezultativní konstrukce od sloves tranzitivních a bezobjektové i objektové posesivní rezultativní konstrukce od sloves intranzitivních – jejich tvorba a funkce v českém jazyce. Nová čeština doma a ve světě, 1/2015, pp. 82-113, 2015.
-
Skwarska, K.: Opis leksykograficzny syntaktycznych i semantycznych cech czasowników w języku polskim i rosyjskim na podstawie korpusów narodowych. Prace Filologiczne. Vol. LXVII, pp. 349-366, 2015.
-
proceedings/collections
-
Hajič, J., Hajičová, E., Mikulová, M., Mírovský, J., Panevová, J., Zeman, D.: Deletions and node reconstructions in a dependency-based multilevel annotation scheme. In CICLing 2015: 16th International Conference on Computational Linguistics and Intelligent Text Processing, LNCS 9041, Springer, Berlin / Heidelberg, pp. 17-31, 2015.
-
Hajičová, E., Mikulová, M., Panevová, J.: Reconstruction of Deletions in a Dependency-based Description of Czech: Selected Issues. In Depling 2015: Proceedings of the Third International Conference on Dependency Linguistics, Uppsala University, Uppsala, Sweden, pp. 131-140, 2015.
-
Kettnerová, V., Lopatková, M.: At the Lexicon-Grammar Interface: The Case of Complex Predicates in the Functional Generative Description. In Depling 2015: Proceedings of the Third International Conference on Dependency Linguistics, Uppsala University, Uppsala, Sweden, pp. 191-200, 2015.
2014
-
book
-
Kettnerová, V.: Lexikálně-sémantické konverze ve valenčním slovníku. Praha: Karolinum, 282 pp., 2014.
-
chapter in book
-
Kettnerová, V., Lopatková, M.: Alternace slovesných rámců. Chapter in Jarmila Panevová et al.: Mluvnice současné češtiny 2 / Syntax češtiny na základě anotovaného korpusu. Praha: Karolinum, pp. 117-133, 2014.
-
articles
-
Kettnerová, V., Lopatková, M., Panevová, J.: An Interplay between Valency Information and Reflexivity. The Prague Bulletin of Mathematical Linguistics No. 102, pp. 105–126, 2014.
-
Skwarska, K.: Sintaksicheskije i semanticheskije xarakteristiki russkix, pol´skix i cheshskix glagolov v slovare sochetajemostej VALLEX. (Syntactic and semantic properties of Russian, Polish and Czech verbs in valency lexicon VALLEX). Uchenyje zapiski Petrozavodskogo gosudarstvennogo universiteta, Nr. 1, pp. 42-46, 2014.
-
proceedings/collections
-
Bejček, E., Kettnerová, V., Lopatková, M.: Automatic Mapping Lexical Resources: A Lexical Unit as the Keystone. In Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014. ELRA, pp. 2826-2832, 2014.
-
Kettnerová, V., Lopatková, M.: Reflexive Verbs in a Valency Lexicon: The Case of Czech Reflexive Morphemes. In Proceedings of the XVI EURALEX International Congress: The User in Focus (EURALEX 2014), EURAC research, Bolzano/Bozen, Italy, pp. 1007-1023, 2014.
-
Klyueva, N., Kuboň, V.: Automatic Valency Derivation for Related Languages. In FLAIRS 2014: Proceedings of the Twenty-Seventh International Florida Artificial Intelligence Research Society Conference, AAAI Press, Palo Alto, California, pp. 437-442, 2014.
-
Skwarska, K.: Rol´ leksicheskogo napolnenia v konstrukcijach s otnosheniem semanticheskoj diatezy v cheshskom, russkom i pol´skom jazykach. [The Role of Lexical Realization in the Constructions Related through Semantic Diathesis]. In Motoki Nomachi, Andrii Danylenko, Predrag Piper (eds.) Proceedings from the 36th Meeting of the Commission on the Grammatical Structure of the Slavic Languages of the International Committee of Slavists, pp. 347-359, 2014.
-
Vernerová, A., Kettnerová, V., Lopatková, M.: To pay or to get paid: Enriching a Valency Lexicon with Diatheses. In Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014. ELRA, pp. 2452-2459, 2014.
2013
-
proceedings/collections
-
Kettnerová, V., Lopatková, M.: The Representation of Czech Light Verb Constructions in a Valency Lexicon. In Hajičová, E., Gerdes, K., Wanner, L. (eds.) Proceedings of the Second International Conference on Dependency Linguistics, Depling 2013, pp. 147-156, 2013. Matfyzpress, Charles University in Prague, Prague, Czech Republic.
-
Kettnerová, V., Lopatková, M., Bejček, E., Vernerová, A., Podobová, M.: Corpus Based Identification of Czech Light Verbs. In Gajdošová, K., Žáková, A. (eds.) Proceedings of the Seventh International Conference Slovko 2013; Natural Language Processing, Corpus Linguistics, E-learning, pp. 118-128, 2013. RAM-Verlag, Lüdenscheid, Germany.
-
Vernerová, A., Lopatková, M.: Towards Automatic Detection of Applicable Diatheses. In Vinař, T. (ed.) ITAT 2013: Information Technologies – Applications and Theory (Proceedings), pp. 10-17, 2013. Slovenská spoločnosť pre umelú inteligenciu. CreateSpace Independent Publishing Platform.
-
others
-
Labbé, G.: Traitement de la valence de verbes de mouvement slovènes sur la base de la valence de verbes tchèques. Mémoire de master 1. Institut National des Langues et Civilisations Orientales / Univerzita Karlova v Praze, 64 pp., 2013.
2012
-
articles
-
Kettnerová, V.: Syntaktické konstrukce typu Včely se hemží na zahradě – Zahrada se hemží včelami. Korpus – gramatika – axiologie, Vol. 05, pp. 54-71, 2012.
-
Kettnerová, V., Lopatková, M., Bejček, E.: Mapping Semantic Information from FrameNet onto VALLEX. The Prague Bulletin of Mathematical Linguistics, Vol. 97, pp. 23-41, 2012.
-
proceedings/collections
-
Kettnerová, V., Lopatková, M., Bejček, E.: The Syntax-Semantics Interface of Czech Verbs in the Valency Lexicon. In Proceedings of the 15th EURALEX International Congress, Department of Linguistics and Scandinavian Studies, University of Oslo, Oslo, Norway, pp. 434-443, 2012.
-
Kettnerová, V., Lopatková, M., Urešová, Z.: The Rule-Based Approach to Czech Grammaticalized Alternations. In P. Sojka, A. Horák, I. Kopeček, K. Pala (eds.) Proceedings of Text, Speech and Dialogue International Conference, TSD 2012, LNCS 7499, Springer Verlag, Berlin / Heidelberg, pp. 158-165, 2012.
Data Release
-
Lopatková, M., Kettnerová, V., Bejček, E., Vernerová, A., Žabokrtský, Z.: VALLEX 3.0 - Valenční slovník českých sloves. LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University, http://hdl.handle.net/11234/1-2307, 2016
-
Lopatková, M., Kettnerová, V., Bejček, E., Skwarska, K., Žabokrtský, Z.: VALLEX 2.6. Data/software, ÚFAL MFF UK, http://ufal.mff.cuni.cz/vallex/2.6/, Dec 2012.