Instructor: | Jirka Hana |
e-mail for homeworks etc: | Jirka.LastName@gmail.com (start the email's subject with NPFL096) |
Time & Place: | Tue 9:00-10:30 (S6 Malá Strana) |
This course introduces methods for processing the morphology of natural languages. It covers both supervised and unsupervised methods of morphological analysis, morpheme segmentation, lexicon creation, etc. Most of the course consists of discussing important papers in the field. You will replicate or extend a system from one of the papers.
In each class, we will discuss one or more papers (sometimes books or dissertations). It is expected that everybody will have read the papers. For each paper, one or two people will be responsible for leading the discussion (in some cases it will be me).
There are two small homework assignments and one larger project.
Homework | Due |
[HW 1 - Goldsmith] | Mar 20 |
[HW 2 - Edit distance] | Jul 31 |
[Project] | Jul 31 |
"Active participation" refers to your comments and questions during class, your answers to my questions, etc. I do not keep track of whether your answers are correct, only whether or not you participate. It is important that you read the assigned papers (especially if you are leading the discussion).
Homeworks | 0-20 |
Project | 0-50 |
Active class participation | 0-30 |
Total: | 0-100 |
Grade | Points |
1 | 90-100 |
2 | 76-89 |
3 | 60-75 |
4 | 0-59 |
This is a preliminary schedule; some of the papers will change.
Date | Topic | Related/Other papers | ||
18 Feb | me | Introduction; Morphology [slides] | ||
25 Feb | No class | |||
4 Mar | me | FS Technology; Morphological Analysis; Two-level morphology [slides] | ||
11 Mar | me | Corpora, Tagsets, Annotation | ||
18 Mar | me | A. Feldman & J. Hana (2010). A resource-light approach to morpho-syntactic tagging (Chapter 6, 7) [slides] | ||
25 Mar | J. Goldsmith (2001). Unsupervised Learning of the Morphology of a Natural Language. | |||
1 Apr | D. Yarowsky & R. Wicentowski (2000): Minimally Supervised Morphological Analysis by Multimodal Alignment. | R. Wicentowski (2004): Multilingual noise-robust supervised morphological analysis using the WordFrame model. | ||
8 Apr | No class | |||
15 Apr | S. Cucerzan & D. Yarowsky (2002): Bootstrapping a Multilingual Part-of-speech Tagger in One Person-day; S. Cucerzan & D. Yarowsky (2003): Minimally Supervised Induction of Grammatical Gender | |||
22 Apr | P. Schone & D. Jurafsky (2001): Knowledge-free induction of inflectional morphologies | P. J. Schone (2001): Toward knowledge-free induction of machine-readable dictionaries. | ||
29 Apr | M. G. Snover & M. R. Brent (2001): A Bayesian model for morpheme and paradigm identification. | |||
6 May | C. Monson et al. (2007): ParaMor: Minimally Supervised Induction of Paradigm Structure and Morphological Analysis. | C. Monson (2009): ParaMor: From Paradigm Structure to Natural Language Morphology Induction (thesis); C. Monson et al. (2009): ParaMor and Morpho Challenge 2008; T. Tchoukalov, C. Monson & B. Roark (2010): Morphological Analysis by Multiple Sequence Alignment. | ||
13 May | Morfessor: M. Creutz & K. Lagus (2007): Unsupervised models for morpheme segmentation and morphology learning. Helping Morfessor/ParaMor (Kohonen et al. 2010, Tepper and Xia 2008, Klíč and Hana) | M. Creutz (2006): Induction of the Morphology of Natural Language (thesis) | ||
20 May |
Links: