- Title:
- Evaluation of Selected Approaches to Clustering Categorical Variables
- Authors:
- Šulc, Zdeněk
- Links:
- https://bibliotekanauki.pl/articles/465958.pdf
- Publication date:
- 2014
- Publisher:
- Główny Urząd Statystyczny
- Topics:
variable clustering
nominal variables
association measures
similarity measures
- Description:
- This paper focuses on recently proposed similarity measures and their performance in categorical variable clustering. It compares clustering results obtained with three recently developed similarity measures (the IOF, OF and Lin measures) against results obtained with two association measures for nominal variables (Cramér’s V and the uncertainty coefficient) and with the simple matching coefficient (the overlap measure). To eliminate the influence of a particular linkage method on the structure of the final clusters, three linkage methods are examined (complete, single and average). The resulting groups (clusters) of variables can serve as a basis for dimensionality reduction, e.g. by choosing one variable from a given group as a representative of the whole group. The quality of the resulting clusters is evaluated by the within-cluster variability, expressed by the WCM coefficient, and by dendrogram analysis. The examined similarity measures are compared and evaluated using two real data sets from a social survey.
- Source:
Statistics in Transition new series; 2014, 15, 4; 591-610
1234-7655
- Appears in:
- Statistics in Transition new series
- Content provider:
- Biblioteka Nauki
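
The description in this record mentions clustering variables with Cramér’s V under complete, single and average linkage. The following is a minimal illustrative sketch of that general idea, not the paper's implementation. It assumes the dissimilarity between two variables is taken as 1 - V, uses a naive agglomerative loop, and runs on a made-up toy data set; the function names `cramers_v` and `cluster_variables` are hypothetical. The IOF, OF and Lin measures, the uncertainty coefficient and the WCM coefficient are not shown.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cramers_v(x, y):
    """Cramér's V between two equally long sequences of category labels."""
    n = len(x)
    joint = Counter(zip(x, y))   # observed cell counts
    row = Counter(x)             # marginal counts of x
    col = Counter(y)             # marginal counts of y
    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0               # a constant variable carries no association
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


def cluster_variables(data, n_clusters, linkage="average"):
    """Agglomerative clustering of variables (assumed dissimilarity: 1 - V).

    data: dict mapping a variable name to its list of category labels.
    linkage: "single", "complete" or "average".
    """
    dist = {frozenset(p): 1.0 - cramers_v(data[p[0]], data[p[1]])
            for p in combinations(data, 2)}
    combine = {"single": min, "complete": max,
               "average": lambda d: sum(d) / len(d)}[linkage]

    def between(c1, c2):
        # Cluster-to-cluster distance under the chosen linkage.
        return combine([dist[frozenset((a, b))] for a in c1 for b in c2])

    clusters = [(name,) for name in data]
    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: between(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


# Made-up toy data: "b" is a relabelling of "a"; "c" is independent of "a".
toy = {
    "a": ["x", "x", "y", "y", "z", "z"],
    "b": ["p", "p", "q", "q", "r", "r"],
    "c": ["m", "n", "m", "n", "m", "n"],
}

for method in ("single", "complete", "average"):
    print(method, cluster_variables(toy, 2, method))
```

With real survey data one would more likely rely on an established library for the linkage step; the pure-Python loop is used here only to keep the sketch self-contained.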