- Title:
- Extraction of Polish noun senses from large corpora by means of clustering
- Authors:
-
Broda, B.
Piasecki, M.
Szpakowicz, S.
- Links:
- https://bibliotekanauki.pl/articles/969804.pdf
- Publication date:
- 2010
- Publisher:
- Polska Akademia Nauk. Instytut Badań Systemowych PAN
- Subjects:
-
corpus linguistics
semantic similarity
Polish nouns
word clustering
Clustering by Committee
co-occurrence retrieval models
rank weight function
Polish WordNet
WordNet-based synonymy test
document clustering
keywords extraction
- Description:
- We investigate two methods of identifying noun senses, one based on clustering of lemmas and the other on clustering of documents. We have adapted the well-known Clustering by Committee algorithm to Polish and tested it on very large Polish corpora. The evaluation, by means of a WordNet-based synonymy test, used the Polish wordnet (plWordNet 1.0). Various clustering algorithms were analysed for extracting document clusters as indicators of the senses of the words that occur in them. The two approaches to word-sense identification have been compared, and conclusions drawn.
- Source:
-
Control and Cybernetics; 2010, 39, 2; 401-420
0324-8569
- Appears in:
- Control and Cybernetics
- Content provider:
- Biblioteka Nauki