Version: 1.0
Release date: July 29, 2017
The Claim Microstructure dataset contains posts split into claim segments, translated into claim microstructures.
The dataset was created to explore how claims can be structured using a restricted language and grammar.
It was also used to help solve the stance classification task,
by using claim microstructure information when determining the stance of a claim.
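The task and the dataset are described in:
Filip Boltužić and Jan Šnajder (2017). Toward Stance Classification Based on Claim Microstructures. In Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis, Copenhagen. Association for Computational Linguistics.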
If you use the Claim Microstructure dataset in your own work, please cite the above paper. The BibTeX citation is:
@InProceedings{boltuzic2017back,
  author    = {Boltu\v{z}i\'{c}, Filip and \v{S}najder, Jan},
  title     = {Toward Stance Classification Based on Claim Microstructures},
  booktitle = {Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment \& Social Media Analysis},
  month     = {September},
  year      = {2017},
  address   = {Copenhagen},
  publisher = {Association for Computational Linguistics}
}
The dataset is available from here: TakeLab-claim-microstructure.tar.gz. There are two files in the archive:
The JSON schema of the user posts, claim segments, paraphrases, and microstructures is as follows:
[
  {
    "segment_id": "..",
    "post_id": "..",
    "post_text": "..",
    "segment_text": "..",
    "segment_paraphrase": "..",
    "a1_log_claim": "..",
    "a1_log_claim_quality_score": "..",
    "a1_stance": "..",
    "a2_log_claim": "..",
    "a2_log_claim_quality_score": "..",
    "a2_stance": ".."
  }
]
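As a rough illustration of how records with this schema might be consumed, the following Python sketch loads the JSON array, groups segments by post, and measures how often the two stance fields coincide. It rests on several assumptions not stated above: that the a1_ and a2_ prefixes refer to two separate annotations of the same segment, that stance values can be compared as plain strings, and that the JSON file is read from a caller-supplied path. The function names are hypothetical helpers, not part of the dataset.

```python
import json
from collections import defaultdict


def load_records(path):
    """Read the JSON array of segment records from `path` (path is supplied by the caller)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def segments_by_post(records):
    """Group segment records by post_id, keeping the order in which they appear."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["post_id"]].append(rec)
    return dict(grouped)


def stance_agreement(records):
    """Fraction of segments whose a1_stance equals a2_stance.

    Assumes a1_* and a2_* are two annotations of the same segment and that
    stance labels are directly comparable values. Returns 0.0 for no records.
    """
    if not records:
        return 0.0
    same = sum(1 for rec in records if rec["a1_stance"] == rec["a2_stance"])
    return same / len(records)
```

A typical use would be to call load_records on the extracted JSON file, then pass the result to segments_by_post to recover all claim segments of a given post, or to stance_agreement for a quick consistency check between the two stance fields.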
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.