Datasets for the Construction and Evaluation of Event Graphs
Release date: June 30, 2013
Structuring information about events from text can be useful for many NLP and IR applications, such as topic detection and tracking (TDT), event-oriented information retrieval (IR), question answering (QA), and text simplification or summarization.
Here we make publicly available the datasets used for constructing and evaluating event graphs -- structured, event-based representations of event-oriented documents such as news stories. We provide two datasets:
- The EvExtra corpus is a 330K-token corpus annotated with factual event mentions and their semantic types. It is about three times larger than the standard TimeBank corpus commonly used in event extraction tasks.
- A collection of 105 manually constructed event graphs, annotated with factual event anchors, their arguments, and the temporal and coreference relations between pairs of event mentions.
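To make the notion of an event graph more concrete, the following is a minimal Python sketch of how such a graph might be held in memory: event mentions (an anchor plus its arguments) as nodes, and temporal or coreference relations as labeled edges between pairs of mentions. It does not reflect the file format of the released datasets. The class names, the argument roles (agent, target), and the relation labels ("before", "coreference") are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EventMention:
    """One factual event mention: an anchor word plus its arguments."""
    anchor: str
    # Hypothetical role -> text mapping; the roles are illustrative only.
    arguments: dict = field(default_factory=dict)


@dataclass
class EventGraph:
    """Event mentions as nodes, labeled relations between pairs as edges."""
    mentions: list = field(default_factory=list)
    # Each relation is a (source index, target index, label) triple.
    relations: list = field(default_factory=list)

    def add_mention(self, anchor, **arguments):
        """Add a mention and return its index in the graph."""
        self.mentions.append(EventMention(anchor, dict(arguments)))
        return len(self.mentions) - 1

    def add_relation(self, source, target, label):
        """Link two existing mentions with a temporal or coreference label."""
        count = len(self.mentions)
        if not (0 <= source < count and 0 <= target < count):
            raise IndexError("relation refers to an unknown mention")
        self.relations.append((source, target, label))

    def relations_of(self, label):
        """Return all (source, target) pairs carrying the given label."""
        return [(s, t) for s, t, l in self.relations if l == label]


# Toy example built from an invented news snippet.
graph = EventGraph()
attack = graph.add_mention("attacked", agent="rebels", target="convoy")
strike = graph.add_mention("strike", agent="rebels")
retreat = graph.add_mention("retreated", agent="rebels")
graph.add_relation(attack, strike, "coreference")  # same real-world event
graph.add_relation(attack, retreat, "before")      # temporal ordering
```

Storing relations as index triples keeps the sketch small. A graph library could be substituted if traversal or visualization is needed.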
The corpus can be downloaded from here.
Manually Constructed Event Graphs
The collection can be downloaded from here.
Datasets for Construction and Evaluation of Event Graphs by TakeLab is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.