FAQIR - a Frequently Asked Questions Retrieval Test Collection

Version: 1.0
Release date: November 2, 2015

1 Description

FAQ retrieval is an interesting area at the intersection of question answering, semantic search and information retrieval. We provide here an English-language data set for the FAQ retrieval task.

2 Dataset

The dataset is available here: FAQIRv1.0.tar.gz. The archive contains two files:
FAQIRv1.0.xml
readme.txt
All the data is organised in the XML file, while readme.txt explains how to interpret the specific XML tags. The data set contains a total of 4133 FAQ-pairs and 1233 queries with corresponding manually annotated relevance judgements.
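
As a first look at the archive, the following minimal Python sketch lists its members and counts how often each XML tag occurs. It assumes the archive has already been downloaded locally. The function names list_members and count_tags are hypothetical helpers introduced here for illustration. Because the tag names are documented only in readme.txt, the sketch deliberately makes no assumption about them and simply reports whatever tags it finds.

```python
import tarfile
import xml.etree.ElementTree as ET
from collections import Counter


def list_members(archive_path):
    """Return the names of all entries in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()


def count_tags(archive_path, xml_suffix=".xml"):
    """Count occurrences of each XML tag in the first matching member.

    Hypothetical helper: it does not rely on any particular tag names,
    it only tallies the tags that are actually present.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(xml_suffix):
                stream = archive.extractfile(member)
                counts = Counter()
                # Parse incrementally and clear each element once counted,
                # which keeps memory use modest for a large file.
                for _event, element in ET.iterparse(stream, events=("end",)):
                    counts[element.tag] += 1
                    element.clear()
                return counts
    raise FileNotFoundError(f"no {xml_suffix} member in {archive_path}")
```

Calling count_tags("FAQIRv1.0.tar.gz") on the downloaded archive should give a per-tag tally that can be checked against the tag descriptions in readme.txt and against the totals stated above.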

3 License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.