Need to index a large number of documents with a set of descriptors (keywords)? Let eCADIS use machine learning to do that work for you. Automatically.
eCADIS can assign keywords to documents fully automatically, choosing those that best summarize each document's content, or it can support human indexers so that they index documents more efficiently and more consistently. Keyword assignment is based on a controlled vocabulary, that is, a thesaurus. In its present version, eCADIS uses the Eurovoc thesaurus*, but it can be adapted to any other thesaurus. Keywords (also known as “descriptors”) facilitate document retrieval and can also enable cross-language retrieval, as is the case with the Eurovoc thesaurus.
The eCADIS workstation system uses two parallel windows: the Document window (Figure 1) and the Eurovoc browser window (Figure 2). The Document window displays the document being indexed together with the results of its computational linguistic processing: alphabetical and frequency lists of types, lemmas, literal descriptors appearing in the text, word bigrams, word trigrams, and word tetragrams. The Eurovoc browser window lets the user browse freely through the hierarchy of index terms of the Eurovoc thesaurus and select the descriptors to be assigned to the document.
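To make the lists shown in the Document window more concrete, here is a minimal Python sketch of how alphabetical and frequency lists of word n-grams (types, bigrams, trigrams, tetragrams) could be computed. It is an illustration only, not the eCADIS implementation: the function names, the regex tokenizer, and the English sample sentence are assumptions, and lemmatization and descriptor matching are left out.

```python
import re
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase, split on runs of word characters.
    # A production system would use a language-specific tokenizer and lemmatizer.
    return re.findall(r"\w+", text.lower())


def ngram_counts(tokens, n):
    # Count every contiguous sequence of n tokens; n=1 gives the word types.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def alphabetical_list(counts):
    # Entries sorted alphabetically by n-gram.
    return sorted(counts.items())


def frequency_list(counts):
    # Entries sorted by descending frequency, ties broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample text, used only for illustration.
sample = "The council adopted the regulation and the council published the regulation"
tokens = tokenize(sample)

types = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)
trigrams = ngram_counts(tokens, 3)
tetragrams = ngram_counts(tokens, 4)

for ngram, count in frequency_list(bigrams)[:3]:
    print(" ".join(ngram), count)
```

Using tuples as keys keeps n-grams of any length in the same data structure, so the same two sorting helpers serve all four lists.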
eCADIS automatic indexing is the result of applying machine learning techniques to a substantial number of manually indexed Croatian official documents. eCADIS suggests the descriptors that best describe the meaning of a document, even when they do not appear literally in the document itself. In its off-line version, eCADIS assigns descriptors fully automatically.
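The idea of learning from manually indexed documents can be illustrated with a toy sketch. The text above does not say which machine learning technique eCADIS uses, so the word-descriptor co-occurrence scorer below is a swapped-in stand-in chosen for brevity, not the eCADIS method. The training texts, the descriptor labels, and the names train and suggest are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Naive tokenizer (assumption); real input would be lemmatized Croatian text.
    return re.findall(r"\w+", text.lower())


def train(indexed_docs):
    # indexed_docs: iterable of (text, set of manually assigned descriptors).
    profiles = defaultdict(Counter)  # descriptor -> words seen in its documents
    doc_freq = Counter()             # word -> number of documents containing it
    n_docs = 0
    for text, descriptors in indexed_docs:
        words = set(tokenize(text))
        n_docs += 1
        doc_freq.update(words)
        for descriptor in descriptors:
            profiles[descriptor].update(words)
    return profiles, doc_freq, n_docs


def suggest(text, model, top_k=3):
    # Score each descriptor by the words it shares with the new document,
    # giving more weight to words that are rare in the training set.
    profiles, doc_freq, n_docs = model
    words = set(tokenize(text))
    scores = {}
    for descriptor, profile in profiles.items():
        score = sum(
            profile[w] / doc_freq[w] * math.log(1 + n_docs / doc_freq[w])
            for w in words
            if w in profile
        )
        if score > 0:
            scores[descriptor] = score
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_k]


# Hypothetical training data; the labels are illustrative, not checked against Eurovoc.
training = [
    ("fishing vessels must report cod and tuna catches", {"fisheries policy"}),
    ("annual catch quota for tuna fishing in the adriatic",
     {"fisheries policy", "marine environment"}),
    ("import duty rates for steel and aluminium products", {"customs tariff"}),
    ("protection of the adriatic coast from oil pollution", {"marine environment"}),
]
model = train(training)

suggestions = suggest("new quota limits cod catches", model)
print(suggestions)  # -> ['fisheries policy', 'marine environment']
```

Note that the top suggestion, "fisheries policy", does not occur literally in the new document; it is inferred from words that co-occurred with that descriptor in the training set.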
eCADIS is primarily a commercial tool. If you are interested in using eCADIS either for research purposes or commercially, please drop us an email at firstname.lastname@example.org.