Version: 1.0
Release date: July 29, 2017
The Claim Microstructure dataset contains user posts split into claim segments, each translated into a claim microstructure.
The dataset was created to explore how claims can be structured using a restricted language and grammar.
It was also used for the stance classification task,
where claim microstructure information is used when determining the stance of a claim.
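The task and the dataset are described in:
Filip Boltužić and Jan Šnajder (2017). Toward Stance Classification Based on Claim Microstructures. In Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis. Copenhagen: Association for Computational Linguistics.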
If you use the Claim Microstructure dataset in your own work, please cite the above paper. The BibTeX citation is:
@InProceedings{boltuzic2017back,
author = {Boltu\v{z}i\'{c}, Filip and \v{S}najder, Jan},
title = {Toward Stance Classification Based on Claim Microstructures},
booktitle = {Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment \& Social Media Analysis},
month = {September},
year = {2017},
address = {Copenhagen},
publisher = {Association for Computational Linguistics}
}
The dataset is available from here: TakeLab-claim-microstructure.tar.gz. There are two files in the archive:
The JSON schema of the user posts, claim segments, paraphrases, and microstructures is as follows:
[
{
"segment_id": "..",
"post_id": "..",
"post_text": "..",
"segment_text": "..",
"segment_paraphrase": "..",
"a1_log_claim": "..",
"a1_log_claim_quality_score": "..",
"a1_stance": "..",
"a2_log_claim": "..",
"a2_log_claim_quality_score": "..",
"a2_stance": ".."
}
]
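As an illustration of working with this schema, the following Python sketch loads the records, groups claim segments by post, and measures how often the two stance fields agree. It rests on several assumptions that are not stated above: that the file is a single JSON array of such records, that the a1_ and a2_ prefixes denote two separate annotations of the same segment, and that stance values can be compared as plain strings. The function names are hypothetical helpers, not part of the dataset.

```python
import json
from collections import defaultdict


def load_segments(path):
    """Load the list of segment records from a JSON file (assumed to be one array)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def group_by_post(records):
    """Map each post_id to the list of claim segments that belong to it."""
    posts = defaultdict(list)
    for rec in records:
        posts[rec["post_id"]].append(rec)
    return dict(posts)


def stance_agreement(records):
    """Fraction of segments whose a1_stance and a2_stance are identical.

    Assumes a1_/a2_ are two annotations of the same segment; returns 0.0
    for an empty list.
    """
    if not records:
        return 0.0
    same = sum(1 for r in records if r["a1_stance"] == r["a2_stance"])
    return same / len(records)
```

A typical call would be records = load_segments(path), where path points to the JSON file extracted from the archive, followed by group_by_post(records) or stance_agreement(records).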

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.