UDPipe 2 is a Python prototype that performs tagging, lemmatization, and syntactic analysis of CoNLL-U input. It took part in several competitions and achieved excellent results in all of them.
Compared to UDPipe 1, it is Python-only, does not perform tokenization, and its models require more computational power.
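Because the prototype does not tokenize, its input has to arrive already split into sentences and tokens in CoNLL-U form. The following is a minimal sketch of producing such input from pre-tokenized text. The helper name `tokens_to_conllu` is hypothetical and not part of UDPipe. It relies only on the standard CoNLL-U layout of ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), with `_` marking an unfilled column.

```python
def tokens_to_conllu(sentences):
    """Render pre-tokenized sentences as CoNLL-U with unfilled annotation columns.

    Hypothetical helper for illustration; it is not part of UDPipe.
    `sentences` is a list of sentences, each a list of token strings.
    """
    lines = []
    for sentence in sentences:
        for index, form in enumerate(sentence, start=1):
            # ID and FORM are filled in; the remaining eight columns
            # (LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC) stay "_".
            lines.append("\t".join([str(index), form] + ["_"] * 8))
        lines.append("")  # a blank line terminates each sentence
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(tokens_to_conllu([["Hello", "world", "."]]), end="")
```

The tagger, lemmatizer, and parser would then be expected to fill in the placeholder columns.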
The UDPipe 2 models are currently available from the LINDAT UDPipe REST Service. Apart from the web interface, you can use the udpipe2_client.py script to process your files through the service.
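For readers who prefer to call the service directly, the sketch below shows roughly what such a request could look like. It is not the `udpipe2_client.py` script. The endpoint URL, the form field names (`data`, `model`, `input`, `tagger`, `parser`), and the `result` key of the JSON response are assumptions about the service's public API and should be checked against its documentation. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the LINDAT UDPipe REST service; verify against the API docs.
PROCESS_URL = "https://lindat.mff.cuni.cz/services/udpipe/api/process"


def build_request(conllu_text, model):
    """Build a POST request asking the service to tag and parse CoNLL-U input."""
    fields = {
        "data": conllu_text,
        "model": model,      # model name as listed by the service
        "input": "conllu",   # the input is already tokenized (assumed field)
        "tagger": "",        # empty value: default tagger options (assumed)
        "parser": "",        # empty value: default parser options (assumed)
    }
    body = urllib.parse.urlencode(fields).encode("utf-8")
    # Supplying `data` makes urllib send the request as a POST.
    return urllib.request.Request(PROCESS_URL, data=body)


def process(conllu_text, model):
    """Send the request and return the annotated CoNLL-U text (network access needed)."""
    with urllib.request.urlopen(build_request(conllu_text, model)) as response:
        payload = json.load(response)
    return payload["result"]  # assumed name of the response field
```

Keeping request construction separate from the network call makes the request easy to inspect or test without contacting the service.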
You can get the UDPipe 2 sources from the udpipe-2 branch of the UDPipe repository. The sources can be used both to train a new model and to run a local REST server for inference.
The available models are described on a separate page.
This work has been supported by the Ministry of Education, Youth and Sports of the Czech Republic, Project No. LM2018101 LINDAT/CLARIAH-CZ.
@InProceedings{straka-2018-udpipe,
title = "{UDP}ipe 2.0 Prototype at {C}o{NLL} 2018 {UD} Shared Task",
author = "Straka, Milan",
booktitle = "Proceedings of the {C}o{NLL} 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies",
month = oct,
year = "2018",
address = "Brussels, Belgium",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/K18-2020",
doi = "10.18653/v1/K18-2020",
pages = "197--207",
}