Extraction of Polish noun senses from large corpora by means of clustering

Title: Extraction of Polish noun senses from large corpora by means of clustering
Authors: Broda, B.
Piasecki, M.
Szpakowicz, S.
Links: https://bibliotekanauki.pl/articles/969804.pdf
Publication date: 2010
Publisher: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects: corpus linguistics
semantic similarity
Polish nouns
word clustering
Clustering by Committee
co-occurrence retrieval models
rank weight function
Polish WordNet
WordNet-based synonymy test
document clustering
keywords extraction
Source: Control and Cybernetics; 2010, 39, 2; 401-420
0324-8569
Language: English
Rights: All rights reserved. User freedom is limited to the statutory scope of permitted use
Content provider: Biblioteka Nauki
: Article

We investigate two methods of identifying noun senses, one based on clustering lemmas and the other on clustering documents. We have adapted the well-known Clustering by Committee algorithm to Polish and tested it on very large Polish corpora. The evaluation, by means of a WordNet-based synonymy test, used the Polish wordnet (plWordNet 1.0). Various clustering algorithms were analysed for extracting document clusters as indicators of the senses of the words that occur in them. The two approaches to word-sense identification are compared and conclusions are drawn.
