Document Clustering : Concepts, Metrics and Algorithms

Title:: Document Clustering : Concepts, Metrics and Algorithms
Authors:: Tarczynski, T.
Links:: https://bibliotekanauki.pl/articles/226231.pdf
Publication date:: 2011
Publisher:: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:: document clustering
text mining
k-means
hierarchical clustering
vector space model
Source:: International Journal of Electronics and Telecommunications; 2011, 57, 3; 271-277
2300-1933
Language:: English
Rights:: CC BY-NC-ND: Creative Commons Attribution - NonCommercial - NoDerivs 3.0 PL
Content provider:: Biblioteka Nauki
: Article

Document clustering, also referred to as text clustering, is a technique of unsupervised document organisation. Text clustering is used to group documents into subsets of texts that are similar to each other. These subsets are called clusters. Document clustering algorithms are widely used in web search engines to produce results relevant to a query. An example of the practical use of these techniques is the Yahoo! hierarchies of documents [1]. Another application of document clustering is browsing, which is defined as a search session without a well-specified goal. Browsing techniques rely heavily on document clustering. In this article we examine the most important concepts related to document clustering. Besides the algorithms, we present a comprehensive discussion of the representation of documents, the calculation of similarity between documents, and the evaluation of cluster quality.
