Coreference resolution is the task of identifying different mentions of the same real-world entity in text. We make available a set of news stories from the Croatian newspaper "Vjesnik", manually annotated with coreference relations between entity mentions. The published dataset is the test set that we used to evaluate the constrained mention-pair model for coreference resolution in Croatian, described in the following paper:
Goran Glavaš, Jan Šnajder (2015). Resolving Entity Coreference in Croatian with a Constrained Mention-Pair Model. Proceedings of the 5th Workshop on Balto-Slavic Natural Language Processing, at the Conference on Recent Advances in Natural Language Processing (RANLP 2015), Hissar. [pdf]
If you use the dataset, please cite the paper. The BibTeX entry is:
@InProceedings{glavavs2015resolving,
title={Resolving Entity Coreference in Croatian with a Constrained Mention-Pair Model},
author={Glava\v{s}, Goran and {\v S}najder, Jan},
booktitle={The 5th Workshop on Balto-Slavic Natural Language Processing},
year={2015},
pages={17--23}
}
CroCoref: all files made available by TakeLab, FER, are licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.