- Title:
- Evaluation of automatic updates of Roget’s Thesaurus
- Authors:
-
Kennedy, A.
Szpakowicz, S.
- Links:
- https://bibliotekanauki.pl/articles/103921.pdf
- Publication date:
- 2014
- Publisher:
- Polska Akademia Nauk. Instytut Podstaw Informatyki PAN
- Topics:
-
lexical resources
Roget’s Thesaurus
WordNet
semantic relatedness
synonym selection
pseudo-word-sense disambiguation
analogy
- Description:
- Thesauri and similarly organised resources attract increasing interest from Natural Language Processing researchers. Thesauri age fast, so there is a constant need to update their vocabulary. Since a manual update cycle takes considerable time, automated methods are required. This work presents a tuneable method of measuring semantic relatedness, trained on Roget’s Thesaurus, which generates lists of terms related to words not yet in the Thesaurus. Using these lists of terms, we experiment with three methods of adding words to the Thesaurus. We add, with high confidence, over 5500 and 9600 new word senses to versions of Roget’s Thesaurus from 1911 and 1987 respectively. We evaluate our work both manually and by applying the updated thesauri in three NLP tasks: selection of the best synonym from a set of candidates, pseudo-word-sense disambiguation and SAT-style analogy problems. We find that the newly added words are of high quality. The additions significantly improve the performance of Roget’s-based methods in these NLP tasks. The performance of our system compares favourably with that of WordNet-based methods. Our methods are general enough to work with different versions of Roget’s Thesaurus.
- Source:
-
Journal of Language Modelling; 2014, 2, 1; 1-49
2299-856X
2299-8470
- Appears in:
- Journal of Language Modelling
- Content provider:
- Biblioteka Nauki