Datasets for Event-Centered Information Retrieval

Version: 1.0
Release date: July 29, 2013

1 Description

Traditional information retrieval models assume keyword-based queries and use unstructured document representations. However, there is an abundance of event-oriented texts (e.g., breaking news) and event-oriented information needs, which often involve structure that cannot be expressed using keywords. Here we make publicly available the datasets used for event-centered IR with graph kernels on event graphs. We provide pooled documents and relevance judgments for two collections: (1) a general collection of news stories and (2) a topic-specific collection of news stories (topic: Syria).

2 Datasets

The raw collections (not processed for event graphs) can be downloaded here. Collections processed for event graphs are available upon request; because of their size (1.2 GB), no direct download link is provided.

Each collection consists of 50 queries. Each query has its own folder in which you will find:

3 License

Datasets for Event-Centered Information Retrieval by TakeLab is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.