GPKEX - Genetically Programmed Keyphrase Extraction

Version: 1.0
Release date: July 28, 2013

1 Description

GPKEX is a keyphrase extraction method based on genetic programming. GPKEX represents keyphrase scoring measures as syntax trees and evolves them to produce rankings of keyphrase candidates extracted from text. It can evolve simple, interpretable keyphrase scoring measures that perform comparably to more complex machine learning methods. For details, please refer to the paper:

Marko Bekavac and Jan Šnajder (2013). GPKEX: Genetically Programmed Keyphrase Extraction from Croatian Texts. Proceedings of the 4th Biennial International Workshop on Balto-Slavic Natural Language Processing (BSNLP 2013), Sofia, ACL, 2013, 43-47. [pdf]

If you use GPKEX or the associated datasets, please cite the paper. The BibTeX citation is:

@InProceedings{bekavac2013gpkex,
title={GPKEX: Genetically Programmed Keyphrase Extraction from Croatian Texts},
author={Bekavac, Marko and {\v S}najder, Jan},
booktitle={4th Biennial International Workshop on Balto-Slavic Natural Language Processing},
year={2013},
pages={43--47}
}
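
To make the idea of a scoring measure represented as a syntax tree more concrete, here is a minimal Python sketch. It is not taken from the GPKEX source code. The feature names (tf, first, length), the candidate values, the operator set, and the example tree are all hypothetical and chosen only for illustration. The sketch covers evaluating one tree and ranking candidates with it; the evolutionary part (selection, crossover, mutation) is left out.

```python
import operator


def protected_div(a, b):
    """Division that returns 1.0 for a near-zero denominator (a common GP convention)."""
    return a / b if abs(b) > 1e-9 else 1.0


# Binary operators allowed in inner nodes (an assumed, illustrative set).
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": protected_div,
}

# Hypothetical keyphrase candidates with made-up feature values.
CANDIDATES = {
    "genetic programming": {"tf": 5.0, "first": 0.1, "length": 2.0},
    "syntax tree": {"tf": 3.0, "first": 0.4, "length": 2.0},
    "text": {"tf": 4.0, "first": 0.8, "length": 1.0},
}


def evaluate(tree, features):
    """Recursively evaluate a syntax tree for one candidate.

    A tree is a feature name (str), a numeric constant,
    or a tuple (operator, left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return features[tree]
    if isinstance(tree, (int, float)):
        return float(tree)
    op, left, right = tree
    return OPS[op](evaluate(left, features), evaluate(right, features))


def rank(tree, candidates):
    """Return candidate phrases sorted by descending score under the given tree."""
    return sorted(candidates, key=lambda c: evaluate(tree, candidates[c]), reverse=True)


# Example measure: (tf * length) / (first + 1)
MEASURE = ("/", ("*", "tf", "length"), ("+", "first", 1.0))
RANKING = rank(MEASURE, CANDIDATES)
print(RANKING)
```

In a genetic programming setup, trees like MEASURE would form the population, and a fitness function would compare each tree's ranking against manually annotated keyphrases.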

2 Data

The GPKEX source code is available from GitHub under the BSD 3-Clause license.
The manually annotated dataset used for the evaluation experiments is available from here.