Cro36WSD -- 36 Words Lexical Sample for Croatian Word Sense Disambiguation

Version: 1.1
Release date: March 17th, 2016
Last modified: April 5th, 2016

1 Description

Cro36WSD is a medium-scale lexical sample for Croatian word sense disambiguation (WSD) that adopts a multi-label annotation scheme. The dataset comes in two variants: multi-label and single-label.

The construction of the dataset is described in:

Domagoj Alagić and Jan Šnajder (2016). Cro36WSD: A Lexical Sample for Croatian Word Sense Disambiguation. Proceedings of the 10th edition of the Language Resources and Evaluation Conference (LREC 2016). In press.

If you use this dataset for your own work, please cite the above paper. The BibTeX citation is:

@inproceedings{alagic2016cro36wsd,
    title={Cro36WSD: A Lexical Sample for Croatian Word Sense Disambiguation},
    author={Alagi{\'c}, Domagoj and {\v{S}}najder, Jan},
    booktitle={Proceedings of the 10th edition of the Language Resources and Evaluation Conference, LREC 2016},
    year={2016},
    organization={ELRA},
    address={Portoro\v{z}, Slovenia}
}

2 Dataset

The dataset is available for download as TakeLab-Cro36WSD.zip.

The archive contains three folders:

3 License

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.