Cro6WSD - Six Words Croatian Word Sense Disambiguation Dataset

Version: 1.0
Release date: September 10, 2015

1 Description

Cro6WSD is a small-scale WSD dataset for Croatian. The construction of the dataset is described in:

Domagoj Alagić and Jan Šnajder (2015). Experiments on Active Learning for Croatian Word Sense Disambiguation. Proceedings of the 5th Workshop on Balto-Slavic Natural Language Processing (BSNLP 2015), Hissar, Bulgaria, pp. 49-58.

If you use this dataset for your own work, please cite the above paper. The BibTeX citation is:

@inproceedings{alagic2015experiments,
  title={Experiments on Active Learning for {C}roatian Word Sense Disambiguation},
  author={Alagi{\'c}, Domagoj and {\v{S}}najder, Jan},
  booktitle={Proceedings of the 5th Workshop on Balto-Slavic Natural Language Processing, BSNLP 2015},
  pages={49--58},
  year={2015},
  address={Hissar, Bulgaria},
  organization={ACL}
}

2 Dataset

The dataset is available for download as a single archive: TakeLab-Cro6WSD.tar.gz.
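Before unpacking, the archive contents can be inspected from Python. The following is a minimal sketch using only the standard library; the helper name top_level_entries and the assumption that the archive sits in the current working directory are illustrative and not part of the dataset distribution. If the archive wraps everything in a single root directory, the dataset folders would appear one level below what this helper reports.

```python
import tarfile
from pathlib import Path, PurePosixPath


def top_level_entries(archive_path):
    """Return the sorted names of the top-level entries in a .tar.gz archive.

    Hypothetical helper for illustration; not shipped with the dataset.
    """
    names = set()
    with tarfile.open(archive_path, "r:gz") as archive:
        for member_name in archive.getnames():
            # PurePosixPath drops "." components, so "./a/b" yields ("a", "b").
            parts = [p for p in PurePosixPath(member_name).parts if p != "/"]
            if parts:
                names.add(parts[0])
    return sorted(names)


if __name__ == "__main__":
    # Assumed location: the downloaded archive in the current directory.
    archive_path = Path("TakeLab-Cro6WSD.tar.gz")
    if archive_path.exists():
        for name in top_level_entries(archive_path):
            print(name)
    else:
        print(f"Archive not found: {archive_path}")
```

Listing the entries first avoids extracting files from an archive whose layout has not been checked yet.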

The archive contains two folders:

3 License

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.