Dependency Grammars and Treebanks
Academic year 2023/2024
In the official schedule in SIS, the course is listed as a 45-minute lecture and a 45-minute practical every week. In reality, we will usually have a 90-minute practical one week and a 90-minute lecture the other week. (Nevertheless, it is not guaranteed that all lectures will be in even weeks and all practicals in odd weeks. Sometimes we may have to change the order based on the availability of the teachers.)
Both lectures and practicals take place in the SW1 lab on Thursdays 9:00 – 10:30.
Lectures and schedule
The distribution of topics to future lectures is only approximate, and so are the planned dates when the topic will be covered. The slides and other materials here will be updated after they have been used in the lecture.
-
Practical 1 (2024-02-22)
-
Lecture 1 (2024-02-29): Introduction, trees, dependencies; representation of non-dependency relations; word order and (non-)projectivity. Slides 1A. Slides 1B.
-
Lecture 2 (2024-03-07): Stratificational approach to language description (stratificational grammar, FGD, MTT); PDT introduction + morphology. Slides 2A. Slides 2B. Slides 2C.
-
Practical 2 (2024-03-14)
-
Lecture 3 (2024-03-21): UD introduction + morphology. Slides 3.
-
Practical 3 (2024-03-28)
-
Lecture 4 (2024-04-04): PDT a-layer (surface syntax). Slides 4.
-
Practical 4 (2024-04-11)
-
Lecture 5 (2024-04-18): UD basic syntax. Slides 5 (today covered up to slide 63).
-
Practical 5 (2024-04-25)
-
Lecture 6 (2024-05-02): Enhanced UD. Slides 6.
-
Practical 6 (2024-05-09): PML-TQ
-
Lecture 7 (2024-05-16): PDT t-layer (deep syntax); coreference; valency etc. Slides 7A, Slides 7B, Slides 7C, Slides 7D, Slides 7E, Slides 7F, Slides 7G, Slides 7H, Slides 7I.
-
no session 2024-05-23 (conflict with LREC-COLING)
Final grade
To pass the course, you will be required to submit all of the homework tasks. The quality of your homework solutions will determine your grade.
-
Excellent: ≥ 90%
-
Very good: ≥ 70%
-
Good: ≥ 50%
Archive – academic year 2022/23
In comparison to previous years, the course will be compressed, as it now has fewer hours than before. In the official schedule in SIS, the course is listed as a 45-minute lecture and a 45-minute practical every week. In reality, we will usually have a 90-minute lecture one week and a 90-minute practical the following week. (Nevertheless, it is not guaranteed that all lectures will be in odd weeks and all practicals in even weeks. Sometimes we may have to change the order based on the availability of the teachers.)
Lectures and schedule
The distribution of topics to future lectures is only approximate. The details on the lectures will appear here during the semester. If you wish to study ahead, refer to the archived information from 2020/2021 below.
-
Lecture 1 (2023-02-16): Introduction, trees, dependencies; representation of non-dependency relations; word order and (non-)projectivity. Slides 1A. Slides 1B.
-
Practical 1 (2023-02-23)
-
Lecture 2 (2023-03-02): Stratificational approach to language description (stratificational grammar, FGD, MTT); PDT introduction + morphology. Slides 2A. Slides 2B. Slides 2C.
-
Practical 2 (2023-03-09)
-
Practical 3 (2023-03-16)
-
Lecture 3 (2023-03-23): UD introduction + morphology. Slides 3 (today covered up to slide 42/47).
-
Lecture 4 (2023-03-30): PDT a-layer (surface syntax). Slides 4.
-
Practical 4 (2023-04-06)
-
Lecture 5 (2023-04-13): UD basic syntax. Slides 5 (today covered up to slide 69/105).
-
Practical 5 (2023-04-20)
-
Lecture 6 (2023-04-27): Enhanced UD. Slides 6.
-
Lecture 7 (2023-05-04): PDT t-layer (deep syntax); coreference; valency etc. Slides 7A, Slides 7B, Slides 7C, Slides 7D, Slides 7E, Slides 7F, Slides 7G, Slides 7H, Slides 7I.
-
Practical 6 (2023-05-11): PML-TQ
-
Practical 7 (2023-05-18): Udapi
Final grade
To pass the course, you will be required to actively participate in the classes and to submit all of the homework tasks. The quality of your homework solutions will determine your grade.
-
Excellent: ≥ 90%
-
Very good: ≥ 70%
-
Good: ≥ 50%
Archive – academic year 2020/21
Lectures
-
Lecture 1 (March 3, 2021): Introduction, trees, dependencies
-
reading:
-
Osborne, T. (2019) A Dependency Grammar of English. John Benjamins Publishing Company, Amsterdam/Philadelphia (available in my office)
-
Hajičová, E., Panevová, J., Sgall, P. (2002) Úvod do teoretické a počítačové lingvistiky, sv. I. Karolinum, Praha (available in the secretariat)
-
Štekauer, P., ed. (2000) Rudiments of English Linguistics. Slovacontact, Prešov. (available in my office)
-
Štěpánek, J. (2006) Závislostní zachycení větné struktury v anotovaném syntaktickém korpusu. PhD Thesis, MFF UK (link)
-
Wikipedia - basic articles on dependency grammar are consistent with Timothy Osborne's approach
-
Universal dependencies (intro): https://universaldependencies.org/
-
Prague Dependency Treebank (intro): https://ufal.mff.cuni.cz/pdt3.5/
-
Lecture 2 (March 10, 2021): Non-dependency relations and their representation; Word order and (non-)projectivity
-
reading:
-
Kuhlmann, M., Nivre, J. (2006): Mildly Non-Projective Dependency Structures. In COLING/ACL Main Conference Poster Sessions, 507–514 (link).
-
Petkevič, V. (1995) A New Formal Specification of Underlying Structure. Theoretical Linguistics, vol. 21, No. 1
-
Štěpánek, J. (2006) Závislostní zachycení větné struktury v anotovaném syntaktickém korpusu. PhD Thesis, MFF UK (link)
-
Havelka, J. (2007): Mathematical Properties of Dependency Trees and their Application to Natural Language Syntax. PhD Thesis, MFF UK (link)
-
Universal Dependencies: https://universaldependencies.org/
-
Prague Dependency Treebank: https://ufal.mff.cuni.cz/pdt3.5/
-
Lecture 3 (March 17, 2021): Intro to Stratificational Approach to Language Description (stratificational grammar, FGD, MTT)
-
reading:
-
Hajičová, E., Panevová, J., Sgall, P. (2002) Úvod do teoretické a počítačové lingvistiky, sv. I. Karolinum, Praha (available in the secretariat)
-
Štekauer, P., ed. (2000) Rudiments of English Linguistics. Slovacontact, Prešov. (available in my office)
-
Sgall, P. (1967) Generativní popis jazyka a česká deklinace. Academia, Praha (available in my office)
-
Žabokrtský, Z. (2006) Resemblances between Meaning-Text Theory and Functional Generative Description. In Proceedings of the 2nd International Conference of Meaning-Text Theory, Slavic Culture Languages Publishers House, Moskva, pp. 549-557. (link)
-
https://www.britannica.com/science/linguistics/Stratificational-grammar
-
advanced:
Sgall, P., Hajičová, E., Panevová, J. (1986) The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. Reidel, Dordrecht.
-
Lecture 4 (March 24, 2021):
-
TOPIC 1: Intro to Prague Dependency Treebank
-
reading:
-
Hajič, J., Hajičová, E., Mírovský, J., Panevová, J.: Linguistically Annotated Corpus as an Invaluable Resource for Advancements in Linguistic Research: A Case Study. The Prague Bulletin of Mathematical Linguistics, No. 106, ISSN 0032-6585, pp. 69-124, 2016 (link)
-
Hajičová, E., Panevová, J., Sgall, P. (2002) Úvod do teoretické a počítačové lingvistiky, sv. I. Karolinum, Praha (available in the secretariat)
-
Hajič, J. (1998) Building a Syntactically Annotated Corpus: The Prague Dependency Treebank. In E. Hajičová (ed.): Issues of Valency and Meaning. Studies in Honour of Jarmila Panevová, Karolinum, Charles University Press, Prague, Czech Republic, pp. 106-132 (link)
-
Štekauer, P., ed. (2000) Rudiments of English Linguistics. Slovacontact, Prešov. (available in my office)
-
PDT-C webpage
-
PDT 2.0 guide
-
TOPIC 2: PDT and its morphological annotation
-
note: You are not supposed to memorize the tag structure, but you might be asked to provide examples (using the following table: pdf)
-
reading:
-
Matthews, P. H. (1997) The Concise Oxford Dictionary of Linguistics. Oxford University Press, Oxford
-
Filipec, J. (1994) Lexicology and Lexicography: Development and State of the Research. In Luelsdorff, P.A. (ed.) The Prague School of Structural and Functional Linguistics, Amsterdam-Philadelphia, John Benjamins, pp. 163–183
-
Hajič, J. (2004) Disambiguation of Rich Inflection (Computational Morphology of Czech). Karolinum, Charles University Press, Prague.
-
Straková, J., Straka, M., Hajič, J. (2014) Open-Source Tools for Morphology, Lemmatization, POS Tagging and Named Entity Recognition. In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Association for Computational Linguistics, Baltimore, Maryland, pp. 13-18.
-
Table with morphological tags in PDT 2.0 (pdf)
-
Lecture 5 (March 31, 2021): Intro to UD, morphology
-
reading:
-
Nivre, J. et al. (2020) Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection. In Proceedings of LREC 2020, ELRA, Marseille, France, pp. 4034-4043 (link).
-
https://universaldependencies.org/
-
Lecture 6 (April 7, 2021): Surface syntactic annotation in PDT (a-layer)
-
reading:
-
Hajič, J. (1998) Building a Syntactically Annotated Corpus: The Prague Dependency Treebank. In E. Hajičová (ed.): Issues of Valency and Meaning. Studies in Honour of Jarmila Panevová, Karolinum, Charles University Press, Prague, Czech Republic, pp. 106-132 (link)
-
Štekauer, P., ed. (2000) Rudiments of English Linguistics. Slovacontact, Prešov (chapter 4, Syntax)
-
Quirk, R., Greenbaum, S., Leech, G., Svartvik, J. (1985) A Comprehensive Grammar of the English Language, Longman, London.
-
PDT documentation: Manual for Analytical Annotation (link)
-
Table with analytical functions in PDT 2.0 (pdf)
-
Lectures 7, 8 and 9 (April 14, 21 and 28, 2021): Syntax in UD
-
Lecture 10 (May 5, 2021):
-
May 12, 2021 - Rector's Day (lecture cancelled)
-
Lecture 11 (May 19, 2021):
-
Lecture 12 (May 26, 2021): Valency in the PDT family
-
reading:
-
Hajič, J. et al. (2003) PDT-VALLEX: Creating a Large-coverage Valency Lexicon for Treebank Annotation. In Proceedings of The Second Workshop on Treebanks and Linguistic Theories, Vaxjo University Press, Vaxjo, Sweden, pp. 57-68 (link)
-
PDT documentation: PDT-C (link)
-
A bit of history: see the PDT 2.0 Guide (link)
-
Lecture 13 (June 2, 2021):
Other Useful Links and Other Materials
Recordings
A subset of the lectures has been recorded and is available for viewing. These are old recordings from 2020 (remote teaching during the Covid-19 lockdown), but their content still largely overlaps with today's course, so they may help you if you missed a class. The recordings that are still available do not cover the whole course, though: only those related to Universal Dependencies remain.