Who is the most mentioned person in a set of documents? Does Pero Perić have anything to do with Korupcija d.o.o.? CroNER recognizes named entities in Croatian texts and can help you answer these questions.
Does a search for "Ana" fail to find the document containing "s Anom"? MOLEX morphologically normalizes Croatian words ("Anom" -> "Ana"), thus improving the performance of your search engine.
What topics were popular in the late 90s? Is Gorbachev more associated with war-related topics or with culture? Let CatViz visualize a large news corpus and answer these questions in an instant.
Need to index a large number of documents with a set of descriptors (keywords)? Let eCADIS use machine learning to do that work for you. Automatically.
A semantic search engine for quick and easy full-text and structured retrieval. Now used for indexing and retrieving the entire body of Croatian legislation.
If you need quick and efficient multi-label text classification with a huge number of potentially hierarchically related labels, KTN Indexer is the tool for you.
Tools and Utilities
Turn "cevapcici" into "ćevapčići" with DIACRO, a robust system for automatic diacritics restoration in Croatian texts.
Looking to create parallel corpora for machine translation or other purposes? CORAL (CORpus ALigner) can facilitate the task for you.
If you are looking for a tool to extract domain-specific terminology from a domain-specific document collection, TermeX might be the solution.