The Opinion Mining from Croatian User Reviews Dataset
is a dataset of user reviews annotated with linguistic data.
Reviews were downloaded from the pauza.hr website.
Spelling errors were corrected with GNU Aspell before annotation.
The language is simple, informal, and domain-specific.
The dataset accompanies the paper:
Goran Glavaš, Damir Korenčić, Jan Šnajder (2013). Aspect-Oriented Opinion Mining from User Reviews in Croatian.
Proceedings of the 51st Annual Meeting of the Association for
Computational Linguistics (Volume 2: Short Papers),
Sofia: Association for Computational Linguistics, 2013.
The paper describes a method for aspect-based opinion mining.
First, a lexicon of aspects (product features) and opinion clues (aspect attributes) is constructed.
Then, the pairing of aspects and clues is solved as a supervised classification problem.
Finally, prediction of overall review scores is performed using supervised classification and regression.
Among other features, the extracted aspect-clue pairs are used for score prediction.
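To make the first two steps more concrete, the following is a minimal sketch of how candidate aspect-clue pairs might be generated from a lemmatized sentence before a classifier decides which pairs are valid. It is not the method from the paper: the toy lexicon entries, the function name candidate_pairs, the exhaustive pairing within a sentence, and the token-distance feature are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy lexicon; the real lexicon is built as described in the paper.
ASPECTS = {"pizza", "dostava"}   # assumed aspect lemmas (product features)
CLUES = {"dobar", "spor"}        # assumed opinion-clue lemmas (aspect attributes)


def candidate_pairs(lemmas):
    """Return every (aspect, clue) combination found in one sentence.

    Each candidate carries a token distance, an assumed example of a
    feature that a supervised classifier could use to accept or reject it.
    """
    aspects = [(i, lemma) for i, lemma in enumerate(lemmas) if lemma in ASPECTS]
    clues = [(i, lemma) for i, lemma in enumerate(lemmas) if lemma in CLUES]
    return [
        {"aspect": aspect, "clue": clue, "distance": abs(a_pos - c_pos)}
        for (a_pos, aspect), (c_pos, clue) in product(aspects, clues)
    ]


# Illustrative lemmatized sentence (not taken from the dataset).
pairs = candidate_pairs(["pizza", "biti", "dobar", "ali", "dostava", "biti", "spor"])
for pair in pairs:
    print(pair)
```

In this sketch every aspect is paired with every clue in the sentence; deciding which candidates are correct is the part the paper treats as supervised classification.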
Should you decide to use the dataset, please cite the paper. The BibTeX format is:
@InProceedings{glavas2013cropinion,
title={Aspect-Oriented Opinion Mining from User Reviews in Croatian},
author={Glava{\v s}, Goran and Koren{\v c}i{\' c}, Damir and {\v S}najder, Jan},
booktitle={51st Annual Meeting of the Association for Computational Linguistics},
year={2013},
pages={in press}
}
<Word>token</Word>
<Lemma>word lemma</Lemma>
<MolexLemmas> - list of (possible) lemmas constructed by MOLEX
    <string>lemma1</string>
    ...
</MolexLemmas>
<POSTag>part-of-speech tag</POSTag>
<BasicStem>word stem</BasicStem>
<MSDs> - list of morphosyntactic descriptors
    <string>descriptor1</string>
    ...
</MSDs>

After the sequence of tagged words, a sequence of dependency relations follows. Each relation contains the following linguistic data:
<DependencyRelation>
    <Governor>
        <Word>governor word</Word>
        [same data as for sentence tokens]
    </Governor>
    <Dependent>
        <Word>dependent word</Word>
        [same data as for sentence tokens]
    </Dependent>
    <Relation>type of dependency relation</Relation>
</DependencyRelation>
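As a rough illustration of how the fields listed above could be read, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element names come from the description above; the sample values (words, tags, relation type), the helper names read_token and read_relation, and the assumption that a single DependencyRelation element can be parsed on its own are hypothetical. The elements that wrap sentences and reviews in the actual files are not described here, so the code would need adjusting to the real file layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names follow the README, values are invented.
SAMPLE = """
<DependencyRelation>
  <Governor>
    <Word>pizza</Word>
    <Lemma>pizza</Lemma>
    <MolexLemmas><string>pizza</string></MolexLemmas>
    <POSTag>N</POSTag>
    <BasicStem>pizz</BasicStem>
    <MSDs><string>Ncfsn</string></MSDs>
  </Governor>
  <Dependent>
    <Word>dobra</Word>
    <Lemma>dobar</Lemma>
    <MolexLemmas><string>dobar</string><string>dobro</string></MolexLemmas>
    <POSTag>A</POSTag>
    <BasicStem>dobr</BasicStem>
    <MSDs><string>Agpfsny</string></MSDs>
  </Dependent>
  <Relation>amod</Relation>
</DependencyRelation>
"""


def read_token(elem):
    """Collect the per-token fields from an element that contains them."""

    def strings(tag):
        # <MolexLemmas> and <MSDs> hold a list of <string> children.
        parent = elem.find(tag)
        return [] if parent is None else [s.text for s in parent.findall("string")]

    return {
        "word": elem.findtext("Word"),
        "lemma": elem.findtext("Lemma"),
        "molex_lemmas": strings("MolexLemmas"),
        "pos": elem.findtext("POSTag"),
        "stem": elem.findtext("BasicStem"),
        "msds": strings("MSDs"),
    }


def read_relation(rel):
    """Read governor, dependent and relation type from <DependencyRelation>."""
    return {
        "governor": read_token(rel.find("Governor")),
        "dependent": read_token(rel.find("Dependent")),
        "relation": rel.findtext("Relation"),
    }


relation = read_relation(ET.fromstring(SAMPLE.strip()))
print(relation["relation"], relation["governor"]["word"], relation["dependent"]["lemma"])
```

Because the governor and dependent carry the same data as sentence tokens, the same read_token helper can be reused for both.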
For more details, please see the references provided in the paper at the beginning of Section 3.
Licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License