News agencies can benefit greatly from text analytics. In this project, we developed a series of semantic text analysis tools for the Croatian News Agency (HINA) to improve the quality and reduce the costs of its document processing services.
Description
A news agency depends heavily on text analytics. For the Croatian News Agency (HINA), TakeLab developed a package of text analytics tools to make text analysis at HINA more cost-efficient. The package includes:
- KTN – a system for automatic document classification
- A package for training the classifiers – the automatic classifiers are derived via active learning, which further reduces human effort
- KEX – a system for extracting keywords from documents, which can be used for indexing and clustering documents
- An interface to existing HINA systems
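To illustrate how active learning can reduce labeling effort, here is a minimal sketch of pool-based active learning with uncertainty sampling. It is not the HINA or TakeLab implementation: the small Naive Bayes classifier, the function names (`train_nb`, `predict_proba`, `active_learning`), the example documents, and the `oracle` callback standing in for a human annotator are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(labeled):
    """Train a multinomial Naive Bayes model from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> token counts
    class_counts = Counter()            # label -> number of documents
    vocab = set()
    for text, label in labeled:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        class_counts[label] += 1
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict_proba(model, text):
    """Return a dict mapping each label to its posterior probability."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for tok in text.lower().split():
            # Laplace (add-one) smoothing for unseen tokens.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        log_scores[label] = score
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    exp = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(exp.values())
    return {label: v / z for label, v in exp.items()}


def active_learning(seed, pool, oracle, rounds):
    """Repeatedly ask the oracle to label the least confident pool document."""
    labeled = list(seed)
    pool = list(pool)
    for _ in range(rounds):
        if not pool:
            break
        model = train_nb(labeled)
        # Uncertainty sampling: pick the document whose top class
        # probability is lowest under the current model.
        query = min(pool, key=lambda d: max(predict_proba(model, d).values()))
        pool.remove(query)
        labeled.append((query, oracle(query)))
    return train_nb(labeled), labeled


# Hypothetical toy data; the dict lookup stands in for a human annotator.
seed = [
    ("parliament passed the budget", "politics"),
    ("the team won the match", "sports"),
]
answers = {
    "minister proposed a new law": "politics",
    "striker scored a late goal": "sports",
    "parliament debated the law": "politics",
    "the match ended in a draw": "sports",
}
model, labeled = active_learning(seed, list(answers), answers.get, rounds=4)
probs = predict_proba(model, "late goal in the match")
print(max(probs, key=probs.get))
```

In each round the annotator labels only the document the current model is least sure about, so the labeling budget is spent where it is expected to help the classifier most.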
Project fact sheet
Participants: HINA, TakeLab FER
Duration: 2 years