HeidelTime is a rule-based, multilingual, cross-domain, open-source temporal expression tagger. This page provides version 1.6 of HeidelTime with resources prepared for tagging Croatian texts, together with the pre-processing resources that Croatian requires.
For details, please check the following paper:
Skukan, L., Glavaš, G., Šnajder, J. (2014). HeidelTime.Hr: Extracting and Normalizing Temporal Expressions in Croatian. In Proceedings of the Ninth Language Technologies Conference, Ljubljana. Information Society, 99-103. [paper]
If you use these resources in your own work, please cite the above paper. The BibTeX citation is:
@inproceedings{skukan2014heideltimehr,
title={HeidelTime.Hr: Extracting and Normalizing Temporal Expressions in Croatian},
author={Skukan, Luka and Glava\v{s}, Goran and {\v{S}}najder, Jan},
booktitle={Proceedings of the Ninth Language Technologies Conference},
pages={99--103},
year={2014},
organization={Information Society}
}
The archive is available here: TakeLab-HeidelTimeHr.tar.gz.
The archive contains two files and two directories. Details on setting up the tool for tagging of Croatian texts are given in the file USER-GUIDE.txt. The resources required for pre-processing are given in the src/ directory, while the HeidelTimeExecutable/ directory contains an executable[1] version of HeidelTime with added resources for Croatian.
[1] - Executable once the steps detailed in USER-GUIDE.txt have been carried out.
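As a convenience, the archive can be unpacked and its top-level contents checked with a few lines of Python. This is only a minimal sketch using the standard tarfile module. The helper name unpack_and_list and the destination directory heideltime-hr are hypothetical, and the page does not state whether the archive unpacks into a single enclosing folder, so the listing may show that folder instead of the individual files and directories. The setup steps themselves are still those in USER-GUIDE.txt.

```python
import tarfile
from pathlib import Path


def unpack_and_list(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the sorted
    names of the entries found directly inside dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest)
    return sorted(p.name for p in dest.iterdir())


if __name__ == "__main__":
    # Archive name as given on this page; the destination name is an assumption.
    archive = Path("TakeLab-HeidelTimeHr.tar.gz")
    if archive.exists():
        for name in unpack_and_list(archive, "heideltime-hr"):
            print(name)
```

If the listing includes USER-GUIDE.txt, src and HeidelTimeExecutable, the layout matches the description above. Continue with the instructions in USER-GUIDE.txt from there.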
