PDT & Monolingual Corpora

The Prague Dependency Treebank

The Prague Dependency Treebank (PDT) contains a large collection of Czech texts with complex, interlinked morphological, syntactic and semantic annotation; in addition, certain properties of sentence information structure and coreference relations are annotated at the semantic level. ...

Prague Discourse Treebank

Annotation of discourse relations is a project related to the Prague Dependency Treebank 2.5 (PDT; Bejček et al. 2011), a revised, updated and extended version of the Prague Dependency Treebank 2.0 (Hajič et al. 2006). It adds a new manually annotated layer of language description above the existing layers of the PDT (morphology, surface syntax and underlying syntax) and describes linguistic phenomena from the perspective of discourse structure and coherence. ...

Prague Database of Spoken Language

The project focuses on speech reconstruction of Czech and English. It is part of the Prague Dependency Treebank family of annotated corpus resources and tools, to which it adds the spoken language layer(s). It consists of the Prague DaTabase of Spoken English and the Prague DaTabase of Spoken Czech ...

Other Monolingual Corpora

Project | Tags
Czech Academic Corpus | Corpora, Data, Monolingual
Czech Legal Text Treebank | Annotations, Corpora, Data, Information Retrieval, Linked data, Monolingual, Semantics
Czech Named Entity Corpus | Corpora, Data, Monolingual
Czech RST Discourse Treebank 1.0 | Annotations, Corpora, Data, Discourse, Monolingual
CzeDLex - A Lexicon of Czech Discourse Connectives | Annotations, Corpora, Data, Discourse, Lexicons, Linked data, Monolingual
EngVallex - English valency lexicon linked to corpora | Annotations, Corpora, Data, Lexicons, Monolingual, Semantics, Valency
EVALD 3.0 (Evaluator of Discourse) | Coreference, Corpora, Data, Discourse, Information Structure, Monolingual, Multi-modality, Tools
HindEnCorp | Corpora, Data, Machine Translation, Monolingual, Multilingual
Lindat KonText | Annotations, Corpora, Data, Monolingual, Multilingual, Tools
Modeling of Complexity in Czech Literary Texts | Annotations, Coreference, Corpora, Data, Discourse, Information Structure, Monolingual, Publications, Semantics, Syntax, Teaching
MorfFlex CZ | Corpora, Data, Lexicons, Monolingual, Morphology
NomVallex: Valency Lexicon of Czech Nouns and Adjectives | Corpora, Data, Lexicons, Monolingual, Semantics, Syntax, Valency
PDT-Vallex: Valency Lexicon Linked to Czech Corpora | Annotations, Corpora, Data, Lexicons, Monolingual, Semantics, Valency
PDTSC 2.0 | Annotations, Corpora, Data, Linked data, Monolingual, Morphology, Multi-modality, Semantics, Speech Recognition, Speech Retrieval, Valency
Prague Dependency Treebank | Annotations, Coreference, Corpora, Data, Discourse, Information Structure, Lexicons, Monolingual, Morphology, Multiword Expressions, Parsers, Semantics, Syntax, Taggers, Tools, Valency
Prague Dependency Treebank 3.0 | Annotations, Coreference, Corpora, Data, Discourse, Information Structure, Monolingual, Morphology, Multiword Expressions, Semantics
Prague Dependency Treebank 3.5 | Annotations, Coreference, Corpora, Data, Discourse, Information Structure, Lexicons, Machine Learning, Monolingual, Morphology, Multiword Expressions, Parsers, Publications, Semantics, Syntax, Taggers, Tools, Valency
Prague Discourse Treebank 1.0 | Annotations, Coreference, Corpora, Data, Discourse, Information Structure, Monolingual, Morphology, Multiword Expressions, Semantics
Prague Discourse Treebank 2.0 | Annotations, Coreference, Corpora, Data, Discourse, Information Structure, Monolingual, Morphology, Multiword Expressions, Semantics
Prague Discourse Treebank 3.0 | Annotations, Coreference, Corpora, Data, Discourse, Information Structure, Monolingual, Morphology, Multiword Expressions, Semantics, Syntax, Valency
Prague Discourse Treebank 4.0 | Annotations, Coreference, Corpora, Data, Discourse, Information Structure, Monolingual, Morphology, Multiword Expressions, Semantics, Syntax, Valency
Prague English Dependency Treebank | Annotations, Corpora, Data, Lexicons, Monolingual, Valency
ROMi 1.0 | Corpora, Data, Dialog, Monolingual, Speech Recognition
Semantic Pattern Recognition | Annotations, Corpora, Data, Lexicons, Monolingual, Morphology, Parsers, Publications, Semantics, Taggers, Tools, Valency
Sentiment Analysis in Czech | Annotations, Corpora, Data, Lexicons, Monolingual, Semantics, Tools
Slovakoczech NLP workshop | Annotations, Coreference, Corpora, Data, Dialog, Discourse, Information Retrieval, Information Structure, Lexicons, Linked data, Machine Learning, Machine Translation, Monolingual, Morphology, Multi-modality, Multilingual, Multiword Expressions, Parsers, Publications, Semantics, Speech Recognition, Speech Retrieval, Spellcheckers, Taggers, Tools, Valency
SumeCzech | Corpora, Data, Monolingual
UrMonoCorp | Corpora, Data, Monolingual
VPS-30-En: Verb Pattern Sample - 30 English | Annotations, Corpora, Data, Lexicons, Monolingual, Semantics, Valency
VPS-GradeUp | Annotations, Corpora, Data, Lexicons, Machine Learning, Monolingual, Semantics, Valency
Working with the Penn Discourse Treebank | Annotations, Corpora, Data, Discourse, Linked data, Monolingual, Tools