CRAC 2023 Shared Task on Multilingual Coreference Resolution

Overview

Coreference resolution is the task of clustering together multiple mentions of the same entity appearing in a textual document (e.g. Joe Biden, the U.S. President and he). This CodaLab-powered shared task deals with multilingual coreference resolution and is associated with the CRAC 2023 Workshop (the Sixth Workshop on Computational Models of Reference, Anaphora and Coreference) held at EMNLP 2023.

Official Results

The following table shows four versions of the CoNLL metric macro-averaged over all datasets:

head-match excluding singletons (the primary metric, see below),
partial-match excluding singletons,
exact-match excluding singletons and
head-match with singletons.

A more detailed evaluation will be provided in the shared task overview paper.

system	head-match	partial-match	exact-match	with singletons
1. CorPipe	74.90	73.33	71.46	76.82
2. Anonymous	70.41	69.23	67.09	73.20
3. Ondfa	69.19	68.93	53.01	68.37
4. McGill	65.43	64.56	63.13	68.23
5. DeepBlueAI	62.29	61.32	59.95	54.51
6. DFKI-Adapt	61.86	60.83	59.18	53.94
7. Morfbase	59.53	58.49	56.89	52.07
8. BASELINE	56.96	56.28	54.75	49.32
9. DFKI-MPrompt	53.76	51.62	50.42	46.83

Important Dates

February 27, 2023 - shared task opening
- registration form available
- training and development data available at the LINDAT/CLARIAH-CZ repository
March 20, 2023 - start of the development phase:
- evaluation tool available at GitHub
- UDPipe-reparsed development data available (gold morphological and syntactic annotation replaced with UDPipe predictions, to make the coreferefence resolution task more realistic)
March 28, 2023
- baseline system available at GitHub
- development evaluation possible via CodaLab
June 1, 2023 - start of the evaluation phase: test evaluation via CodaLab
June 21, 2023 (AoE) - end of evaluation phase
September 15, 2023 (was September 1) - submission of system description papers to the CRAC 2023 workshop
see CRAC's Important Dates for the dates of acceptance notification and camera-ready papers
December 6-7, 2023 – the CRAC workshop at EMNLP

Background

Recently, inspired by the Universal Dependencies initiative (UD) [1], the coreference community has started discussions on establishing a universal annotation scheme and using it to harmonize existing corpora. The discussions at the CRAC 2020 workshop led to proposing the Universal Anaphora initiative. One of the lines of effort related to Universal Anaphora resulted in CorefUD, which is a multilingual collection of coreference data resources harmonized under a common scheme [2]. The current public edition of CorefUD 1.1 contains 17 datasets for 12 languages, namely Catalan, Czech (2×), English (2×), French, German (2×), Hungarian (2×), Lithuanian, Norwegian (2×), Polish, Russian, Spanish, and Turkish. The CRAC 2023 shared task deals with coreference resolution in all these languages. It is the 2nd edition of the shared task; findings of the first edition can be found in [8].

The file format used in CorefUD 1.1 represents coreference using the bracketing notation inspired by the CoNLL-2011 and CoNLL-2012 shared tasks [3], and inserts it into the MISC column of the CoNLL-U, the file format used in UD. The content of the other columns is fully compatible with morphological and syntactic annotations of the UD framework in CorefUD (with, for instance, automatically parsed trees added to resources that miss manual syntactic annotations). Thus, the shared task participant can easily employ UD-style morphosyntactic features for coreference prediction for all resources in a unified way, if they want to (pilot studies of the relation between coreference and dependency syntax can be found in [4] and [5]).

Task Description

The main rules of the CRAC 2023 shared task are the following:

Shared task participants are supposed to both (a) identify mentions in texts and (b) predict which mentions belong to the same coreference cluster (i.e., refer to the same entity or event).
Training and development data was published first; evaluation data (without gold annotations) will be published only after the beginning of the evaluation phase and must not be used for improving the models.
Participants are expected to deliver their submissions exclusively via CodaLab; a submission must have the form of a zip file containing test set files with predicted coreference, ideally for all 17 CorefUD datasets; however, participants who are unable to predict coreference for all CorefUD datasets are encouraged to submit at least a subset of test set files.
Technically, ‘files with predicted coreference’ means that coreference attributes using the CorefUD notation are filled into the MISC column of test set CoNLL-U files.
In this shared task, only identity coreference is supposed to be predicted (even if some of the CorefUD datasets contain annotation of bridging).
There is a single official evaluation criterion that will be used for the main ranking of all submissions within the evaluation phase. In other words, there are no subtasks delimited within this shared task. We define the criterion as the arithmetic mean (macro-average) of the CoNLL score (an average of the F1 values of MUC, B-cubed and CEAFe scores) [6, 3] across the 17 datasets.
Even if there are no subtasks declared, additional evaluation criteria might be evaluated for all submissions and presented by the organizers (for instance, secondary rankings of submissions according to scores reached for individual languages).
A deep-learning-based baseline system [7] will be available to participants, and it is up to their decision whether they start developing their system from scratch, or by incremental improvements of this baseline.
After the evaluation period, participants will be invited to submit their system description papers to the CRAC 2023 workshop.

Even if all datasets included in the shared task are available in the same file format, systems competing in the shared task are supposed to be flexible enough to accommodate various types of variablity present in the CorefUD collection, such as

different training data sizes - for instance, the size difference between ParCorFull and PCEDT datasets spans two orders of magnitude,
different grammatical types of mentions (and hence also different density of mentions) - for instance, some datasets are limited to pronouns and selected NPs, while others are less limited; this is also related to presence/absence of annotated zero mentions (empty mentions are annotated e.g. for Czech or Polish),
completeness of annotation - in most datasets, all sentences have been processed in the same way, but for instance some documents in the Turkish dataset contain sentences that have not been annotated yet.

Changes to 2022 edition

new and updated datasets
original morpho-syntax features in dev and test sets replaced by the output of UDPipe 2 in order to make the evaluation scheme more realistic
head-matching score used as the primary score instead of partial matching

References

[1] De Marneffe, M.-C., Manning, C. D., Nivre, J., & Zeman, D. (2021). Universal Dependencies. Computational Linguistics, 47(2), 255-308.
[2] Nedoluzhko, A., Novák M., Popel M., Žabokrtský Z., Zeldes A., Zeman D. (2022). CorefUD 1.0: Coreference Meets Universal Dependencies. In Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022) (pp. 4859-4872).
[3] Pradhan, S., Moschitti, A., Xue, N., Uryupina, O., & Zhang, Y. (2012). CoNLL-2012 shared task: Modeling multilingual unrestricted coreference in OntoNotes. In Joint Conference on EMNLP and CoNLL-Shared Task (pp. 1-40).
[4] Popel, M., Žabokrtský, Z., Nedoluzhko, A., Novák, M., & Zeman, D. (2021). Do UD Trees Match Mention Spans in Coreference Annotations? In Findings of the Association for Computational Linguistics: EMNLP 2021 (pp. 3570-3576).
[5] Nedoluzhko, A., Novák, M., Popel, M., Žabokrtský, Z., & Zeman, D. (2021). Is one head enough? Mention heads in coreference annotations compared with UD-style heads. In Proceedings of the Sixth International Conference on Dependency Linguistics (Depling, SyntaxFest 2021) (pp. 101-114).
[6] Denis, P. & Baldridge, J. (2009). Global joint models for coreference resolution and named entity classification. Procesamiento del lenguaje natural, Nº. 42, (pp. 87-96).
[7] Pražák, O., Konopík, M., & Sido, J. (2021). Multilingual Coreference Resolution with Harmonized Annotations. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021). 2021.
[8] Žabokrtský Z., Konopík M., Nedoluzhko A., Novák M., Ogrodniczuk M., Popel M., Pražák O., Sido J., Zeman D., Zhu Y. (2022). Findings of the Shared Task on Multilingual Coreference Resolution. In Proceedings of the CRAC 2022 Shared Task on Multilingual Coreference Resolution (pp. 1-17).

Organizers

Charles University (Prague, Czechia): Anna Nedoluzhko, Michal Novák, Martin Popel, Zdeněk Žabokrtský, Daniel Zeman
Polish Academy of Sciences (Warsaw, Poland): Maciej Ogrodniczuk
University of West Bohemia (Pilsen, Czechia): Miloslav Konopík, Ondřej Pražák, Jakub Sido

Registration Form and Contact to Organizers

If you are interested in participating in this shared task, please fill the registration form as soon as possible.

Technically, this registration will not be connected with participants' CodaLab accounts in any way. In other words, it will be possible to upload your CodaLab submissions without being registered here. However, we strongly recommend that at least one person from each participating team fills this registration form so that we can keep you informed about all updates regarding the shared task.

In addition, you can send any questions about the shared task to the organizers via corefud@googlegroups.com.

Acknowledgements

This shared task is supported by the Grants No. 20-16819X (LUSyD) of the Czech Science Foundation, and LM2018101 (LINDAT/CLARIAH-CZ) of the Ministry of Education, Youth, and Sports of the Czech Republic.

Data and Evaluation

CorefUD datasets

The public edition of CorefUD 1.1 data is used in this shared task, both for training and evaluation purposes. CorefUD 1.1 is a collection of previously existing datasets annotated with coreference, converted into a common annotation scheme. The datasets are enriched with automatic morphological and syntactic annotations that are fully compliant with the standards of the Universal Dependencies project. All the datasets are stored in the CoNLL-U format, with coreference-specific information captured in the MISC column.

The public edition of CorefUD 1.1 contains 17 datasets for 12 languages, labeled as follows:

Catalan-AnCora, based on the Catalan part of Coreferentially annotated corpus AnCora,
Czech-PCEDT, based on the Czech part of the Prague Czech-English Dependency Treebank,
Czech-PDT, based on the Prague Dependency Treebank,
English-GUM, based on the Georgetown University Multilayer Corpus,
English-ParCorFull, based on the English part of ParCorFull,
French-Democrat, based on the Democrat corpus,
German-ParCorFull, based on the German part of ParCorFull,
German-PotsdamCC, based on the Potsdam Commentary Corpus,
Hungarian-SzegedKoref, based on the Hungarian coreference corpus SzegedKoref,
Hungarian-KorKor, based on the a Hungarian coreference corpus KorKor (newly added into CorefUD in version 1.1),
Lithuanian-LCC, based on the Lithuanian Coreference Corpus,
Norwegian-BokmaalNARC, based on the Bokmaal part of the Norwegian Anaphora Resolution Corpus (newly added into CorefUD in version 1.1),
Norwegian-NynorskNARC, based on the Nynorsk part of the Norwegian Anaphora Resolution Corpus (newly added into CorefUD in version 1.1),
Polish-PCC, based on the Polish Coreference Corpus,
Russian-RuCor, based on the Russian Coreference Corpus RuCor,
Spanish-AnCora, based on the Spanish part of Coreferentially annotated corpus AnCora,
Turkish-ITCC, based on the Turkish Coreference Corpus (newly added into CorefUD in version 1.1).

(There is also a non-public edition of CorefUD 1.1, containing 4 more datasets, however, they cannot be used for this shared task purposes because of their license limitations.)

File format

The full specification of the CoNLL-U format is available at the website of Universal Dependencies. In a nutshell: every token has its own line; lines starting with # are sentence-level comments, and empty lines terminate a sentence. Regular token lines start with an integer number. There are also lines starting with intervals (e.g. 4-5), which introduce what UD calls “multi-word tokens”; these lines must be preserved in the output but otherwise the participants do not have to care about them (coreference annotation does not occur on them). Finally, there are also lines starting with decimal numbers (e.g. 2.1), which correspond to empty nodes in the dependency graph; these nodes may represent zero mentions and may contain coreference annotation. Every token/node line contains 10 tab-separated fields (columns). The first column is the numeric ID of the token/node, the next column contains the word FORM; any coreference annotation, if present, will appear in the last column, which is called MISC. The file must use Linux-style line breaks, that is, a single LF character, rather than CR LF, which is common on Windows.

The MISC column is either a single underscore (_), meaning there is no extra annotation, or one or more pieces of annotation (typically in the Attribute=Value form), separated by vertical bars (|). The annotation pieces relevant for this shared task always start with Entity=; these should be learned from the training data and predicted for the test data. Any other annotation that is present in the MISC column of the input file should be preserved in the output (especially note that if you discard SpaceAfter=No, or introduce a new one, the validator may report the file as invalid).

For more information on the Entity attribute, see the PDF with the description of the CorefUD 1.0 format (the CorefUD 1.1 format is identical).

Example:

# global.Entity = eid-etype-head-minspan-infstat-link-identity
# sent_id = GUM_academic_art-3
# text = Claire Bailey-Ross xxx@port.ac.uk University of Portsmouth, United Kingdom
1	Claire	Claire	PROPN	NNP	Number=Sing	0	root	0:root	Entity=(e5-person-1-1,2,4-new-coref|Discourse=attribution:3->57:7
2	Bailey	Bailey	PROPN	NNP	Number=Sing	1	flat	1:flat	SpaceAfter=No
3	-	-	PUNCT	HYPH	_	4	punct	4:punct	SpaceAfter=No
4	Ross	Ross	PROPN	NNP	Number=Sing	2	flat	2:flat	Entity=e5)
5	xxx@port.ac.uk	xxx@port.ac.uk	PROPN	NNP	Number=Sing	1	list	1:list	Entity=(e6-abstract-1-1-new-sgl)
6	University	university	NOUN	NNP	Number=Sing	1	list	1:list	Entity=(e7-organization-1-3,5,6-new-sgl-University_of_Portsmouth
7	of	of	ADP	IN	_	8	case	8:case	_
8	Portsmouth	Portsmouth	PROPN	NNP	Number=Sing	6	nmod	6:nmod:of	Entity=(e8-place-1-3,4-new-sgl-Portsmouth|SpaceAfter=No
9	,	,	PUNCT	,	_	11	punct	11:punct	_
10	United	unite	VERB	NNP	Tense=Past|VerbForm=Part	11	amod	11:amod	Entity=(e9-place-2-1,2-new-coref-United_Kingdom
11	Kingdom	kingdom	NOUN	NNP	Number=Sing	1	list	1:list	Entity=e9)e8)e7)

Data download

Each CorefUD dataset is divided into a training section, a development section, and a test section (train/dev/test for short). Technically, each CorefUD dataset consists of three CoNLL-U files containing disjoint sets of documents; boundaries between the three sections can be placed only on document boundaries.

Training and development files containing gold coreference annotations are identical to the CoNLL-U files available in CorefUD 1.1 (the link leads to the LINDAT/CLARIAH-CZ repository where the data can be downloaded from). In addition, the development set with gold coreference annotation stripped off and original morpho-syntax features replaced by the output of UDPipe 2 (a pipeline for an automatic UD-like annotation) is available for download (blind dev set). It might be useful for development purposes.

Test sets without gold coreference annotations are available to participants since the beginning of the evaluation phase. In these data sets, the original morpho-syntax features are again replaced by the output of UDPipe 2. Test data with gold coreference annotation will be used internally in CodaLab for evaluation of submissions.

Submissions of all participant on the dev set were published after the shared task.

Evaluation Metric and Evaluation Tool

The official scorer for the shared task is corefud-scorer.py.

Run the following command to calculate the primary score (CoNLL score) that will be used to rank the submissions (KEY_FILE is the file with gold annotations, RESPONSE_FILE is the file with your predictions):

python corefud-scorer.py KEY_FILE RESPONSE_FILE

The main evaluation metric for the task is the CoNLL score, which is an unweighted average of the F1 values of MUC, B-cubed, and CEAFe scores. To encourage the participants to develop multilingual systems, the primary ranking score will be computed by macro-averaging CoNLL F1 scores over all datasets.

For the same reason, singletons (entities with a single mention) will not be taken into account in calculation of the primary score, as many of the datasets do not have singletons annotated.

Although some of the datasets also comprise annotation of split antecedents, bridging and other anaphoric relations, these are not going to be evaluated.

Besides the primary ranking, the overview paper on the shared task will also introduce multiple secondary rankings, e.g. by CoNLL score for individual languages, or by CoNLL scores calculated with exact matching.

Head-match score

The primary score is calculated using the head match. That is, to compare gold and predicted mentions, we compare their heads. Submitted systems are thus expected to predict a mention head word by filling in its relative position within all words in the corresponding mention span to the Entity attribute. For example, the annotation Entity=(e9-place-2- identifies the second word of the mention as its head. Note that this differs from the previous edition of the shared task, where we used partial matching, which ignored any setting of heads of predicted mentions during evaluation. Partial matching led several teams to optimize their predicted mentions by reducing them to their syntactic heads, thereby losing the information about full mention spans. This is something we want to avoid, while at the same time preventing methods as strict as exact matching.

However, it is still advisable to predict full mention spans, too. Evaluation with head matching uses them to disambiguate between mentions with the same head token. In addition, systems that predict only mention heads are likely to fail in the evaluation with exact matching, which will be calculated as one of the supplementary scores.

If the submitted system is not able to predict the mention heads (i.e. it predicts mention spans only, and the head index is always 1), mention heads can be estimated using the provided dependency tree and heuristics, e.g. the ones provided by Udapi (see below), using the following command: udapy -s corefud.MoveHead < in.conllu > out.conllu

Submission Instructions

In a typical case, shared task participants should proceed as follows.

In the development phase (Phase 1):

download training and development data
decide whether to start from the baseline system, or from scratch
use training data for developing your coreference resolution model
apply the trained model to predict coreference in the blind development sets
check validity of the development set files with predicted coreference
evaluate the quality of your prediction by applying the evaluation script
if you can see space for improvement of your model, improve it and go to 3
create your CodaLab account
practice the CodaLab submission process by uploading a zip file containing the development set files to CodaLab

The zip file uploaded to CodaLab must contain the following 17 files (without any other files or subdirectories):

ca_ancora-corefud-dev.conllu
cs_pcedt-corefud-dev.conllu
cs_pdt-corefud-dev.conllu
de_parcorfull-corefud-dev.conllu
de_potsdamcc-corefud-dev.conllu
en_gum-corefud-dev.conllu
en_parcorfull-corefud-dev.conllu
es_ancora-corefud-dev.conllu
fr_democrat-corefud-dev.conllu
hu_szegedkoref-corefud-dev.conllu
hu_korkor-corefud-dev.conllu
lt_lcc-corefud-dev.conllu
no_bokmaalnarc-corefud-dev.conllu
no_nynorsknarc-corefud-dev.conllu
pl_pcc-corefud-dev.conllu
ru_rucor-corefud-dev.conllu
tr_itcc-corefud-dev.conllu

In the evaluation phase (Phase 2):

download the test sets (or test sets with predictions by the baseline system)
apply the trained model to predict coreference in the test sets
check validity of the test set files with predicted coreference
submit a zip file with test set files to CodaLab, and see the evaluation results. The zip file must contain 17 files (ca_ancora-corefud-test.conllu, cs_pcedt-corefud-test.conllu etc.)
if something goes wrong, fix your prediction model and go to step 2

Let us emphasize that even if multiple submissions are possible within the evaluation phase, their number is limited, they are allowed rather because of resolving some unexpected situations, but definitely should not be used for systematic optimization of parameters or hyperparameters of your model towards the scores shown by CodaLab.

Participants who have developed multiple coreference prediction systems are encouraged to submit their predictions separately, up to 3 systems per team, as long as the systems are different in some interesting ways (e.g. using different architectures, not just different hyperparameter settings). In order to submit an additional system of yours, please create an additional team account at CodaLab.

Pre-submission tests

Many things can go wrong when filling the predicted coreference annotation in the CoNLL-U format (incorrect syntax in the MISC column, unmatched brackets etc.) It is highly recommended to always check validity prior to submitting the files, so that you do not run out of the maximum daily (2) and total submission (10) trials specified for the shared task.

That said, even files not passing the validation tests will be considered for the evaluation and contributing to the final score (provided the evaluation script does not fail on such files).

There are two basic requirements for each submitted CoNLL-U file:

It must be accepted by the official UD validator script with the settings described below.
The contents of the input file must be preserved, i.e., the participating system is supposed to add coreference annotation but not to change anything else. In particular, tokenization and sentence segmentation must not change.
- Within the submitted zip file, the CoNLL-U files output by the system must reside in the root folder and their names must be identical to the names of the corresponding input files. If this naming (and placement) convention is not observed, the scorer will not be able to pair the output with the input, and the output will not be scored.

The official UD validator will be used to check the validity of the CoNLL-U format. Anyone can obtain it by cloning the UD tools repository from Github and running the script validate.py. Python 3 is needed to run the script (depending on your system, it may be available under the command python or python3; if in doubt, try python -V to see the version).

$ git clone git@github.com:UniversalDependencies/tools.git
$ cd tools
$ python3 validate.py -h

In addition, a third-party module called regex must be installed via pip. Try this if you do not have the module already:

$ sudo apt-get install python3-pip; python3 -m pip install regex

The validation script distinguishes several levels of validity; level 2 is sufficient in the shared task, as the higher levels deal with morphosyntactic requirements on the UD-released treebanks. On the other hand, we will use the --coref option to turn on tests specific to coreference annotation. The validator also requires the option --lang xx where xx is the ISO language code of the data set.

$ python3 validate.py --level 2 --coref --lang cs cs_pdt-corefud-test.conllu
*** PASSED ***

If there are errors, the script will print messages describing the location and the nature of the error, it will print *** FAILED *** with (number of) errors, and it will return a non-zero exit value. If the file is OK, the script will print *** PASSED *** and return zero as its exit value. The script may also print warning messages that point to potential problems in the file but are not considered errors and will not make the file invalid.

Baseline system

The baseline system is based on the multilingual coreference resolution system presented in [7]. The model uses multilingual BERT in the end-to-end setting. In simple words, the model goes through all potential spans and maximizes the probability of gold antecedents for each span. The same system is used for all the languages. More details can be found in [7].

The simplified system adapted to CorefUD 1.0, is publically available on GitHub along with tagged dev data and its dev data results.

Files with coreference predicted by the baseline system can be downloaded directly as zip files dev set and test set, so you do not have to run the baseline system yourself and can only try to improve its outputs. The zip files structure is identical to what will be expected by the CodaLab submission system. The files with baseline predictions were post-processed by Udapi to make them pass the pre-submission validation tests: udapy -s read.Conllu split_docs=1 corefud.MergeSameSpan corefud.IndexClusters < orig.conllu > fixed.conllu

Udapi – Recommended Python API for CorefUD

Udapi is a Python API for reading, writing, querying and editing Universal Dependencies data in the CoNLL-U format (and several other formats). Newly, it has support for the coreference annotations (and it was used for producing CorefUD). Even if you decide not to build your system by extending the baseline system, you can use Udapi for accessing the CorefUD data in a comfortable way. For getting an insight into Udapi, you can use

Daniel Zeman's tutorial (with even basic concepts explained),
and/or Martin Popel's tutorial (a bit more advanced, Jupyter oriented).

Submission of system description papers

All shared task participants are invited to submit their system descriptions papers to the CRAC 2023 Workshop.

System description papers can have the form of long or short research papers, up to 8 pages of content for long papers and up to 4 pages of content for short papers, plus an unlimited number of pages for references in both cases.

Identity of the authors of the participanting systems is known, and thus there is no reason for making the submissions anonymous.

Terms and Conditions

Training, development, and test datasets are subject to license agreements specified individually for each dataset in the public edition of the CorefUD 1.1 collection (which, in turn, are the same as license agreements of the original resources before CorefUD harmonization). In all cases, the licenses are sufficient for using the data for the CRAC 2023 shared task purposes. However, the participants must check the license agreements in case they want to use their trained models also for other purposes; for instance, usage for commercial purposes is prohibited with several CorefUD datasets as they are available under CC BY-NC-SA.

Whenever using the CorefUD 1.1 collection (inside or outside this shared task), please cite it as follows:

@misc{11234/1-4698,
title = {Coreference in Universal Dependencies 1.1 ({CorefUD} 1.1)},
author = { Nov{\' a}k, Michal and Popel, Martin and {\v Z}abokrtsk{\'y}, Zden{\v e}k and Zeman, Daniel and Nedoluzhko, Anna and
Acar, Kutay and Bourgonje, Peter and Cinkov{\' a}, Silvie and Cebiro{\v{g}}lu Eryi{\v{g}}it, G{\"u}l{\c{s}}en and Haji{\v c}, Jan and
Hardmeier, Christian and Haug, Dag and J{\o}rgensen, Tollef and  K{\aa}sen, Andre and Krielke, Pauline and Landragin, Fr{\'e}d{\'e}ric and
Lapshinova-Koltunski, Ekaterina and M{\ae}hlum, Petter and Mart{\'{\i}}, M.Ant{\`o}nia and  Mikulov{\' a}, Marie and N{\o}klestad, Anders and
Ogrodniczuk, Maciej and {\O}vrelid, Lilja and Pamay Arslan, Tu{\v{g}}ba and Recasens, Marta and Solberg, Per Erik and Stede, Manfred and
Straka, Milan and Toldova, Svetlana and Vadász, No{\' e}mi and Velldal, Erik and Vincze, Veronika and Zeldes, Amir and {\v Z}itkus, Voldemaras},
url = {http://hdl.handle.net/11234/1-5053},
note = {{LINDAT}/{CLARIAH}-{CZ} digital library at the Institute of Formal and Applied Linguistics ({{\'U}FAL}), Faculty of Mathematics and Physics, Charles University},
year = {2022}
}

For a more general reference to CorefUD harmonization efforts, please cite the following LREC paper:

@inproceedings{biblio8283899234757555533,
  author    = {Anna Nedoluzhko and Michal Novák and Martin Popel and Zdeněk Žabokrtský and Amir Zeldes and Daniel Zeman},
  year      = 2022,
  title     = {CorefUD 1.0: Coreference Meets Universal Dependencies},
  booktitle = {Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022)},
  pages     = {4859--4872},
  publisher = {European Language Resources Association},
  address   = {Marseille, France},
  isbn      = {979-10-95546-72-6},
}

By submitting results to this competition, the participants consent to the public release of their scores at the CRAC 2023 workshop and in the associated proceedings, at the task organizers' discretion. Participants further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgment that the submission was erroneous or deceptive.

Search form