Cro36WSD is a medium-scale lexical sample for Croatian word sense disambiguation (WSD) that adopts a multi-label annotation scheme. It is provided in both multi-label and single-label variants.
The construction of the dataset is described in: Domagoj Alagić and Jan Šnajder (2016). Cro36WSD: A Lexical Sample for Croatian Word Sense Disambiguation. Proceedings of the 10th edition of the Language Resources and Evaluation Conference (LREC 2016). In press.
If you use this dataset for your own work, please cite the above paper. The BibTeX citation is:
@inproceedings{alagic2016cro36wsd,
  title={Cro36WSD: A Lexical Sample for Croatian Word Sense Disambiguation},
  author={Alagi{\'c}, Domagoj and {\v{S}}najder, Jan},
  booktitle={Proceedings of the 10th edition of the Language Resources and Evaluation Conference, LREC 2016},
  year={2016},
  organization={ELRA},
  address={Portoro\v{z}, Slovenia}
}
The dataset is available from here: TakeLab-Cro36WSD.zip.
The archive contains three folders: