Searching for the phrase "nominal variable" by criterion: Subject


Displaying 1-3 of 3
Title:
The effect of binary data transformation in categorical data clustering
Authors:
Cibulková, Jana
Šulc, Zdeněk
Sirota, Sergej
Rezanková, Hana
Links:
https://bibliotekanauki.pl/articles/1194463.pdf
Publication date:
2019-07-02
Publisher:
Główny Urząd Statystyczny
Subjects:
hierarchical cluster analysis
nominal variable
binary variable
categorical data
similarity measures
evaluation criteria
generated data
Description:
This paper focuses on hierarchical clustering of categorical data and compares two approaches that can be used for this task. The first, an extremely common approach, is to transform the categorical variables into sets of binary dummy variables and then apply similarity measures suited to binary data. These similarity measures are well examined and are available in both commercial and non-commercial software. However, a binary transformation can cause a loss of information in the data or slow down the computations. The second approach uses similarity measures developed for categorical data. These measures are less well examined than the binary ones, and they are not implemented in commercial software. The two approaches are compared on generated data sets with categorical variables, and the evaluation uses both internal and external evaluation criteria. The purpose of this paper is to show that the binary transformation is not necessary when clustering categorical data, since the second approach leads to clustering results at least comparable to those of the first approach.
Source:
Statistics in Transition new series; 2019, 20, 2; 33-47
1234-7655
Appears in:
Statistics in Transition new series
Content provider:
Biblioteka Nauki
Article
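
The two approaches contrasted in the description above can be illustrated with a minimal sketch. The abstract does not name the specific measures compared, so the choices here are assumptions made only for illustration: the overlap (simple matching) measure stands in for a similarity defined directly on categories, and the Jaccard coefficient stands in for a binary measure applied after dummy coding. The toy records and all function names are hypothetical.

```python
def overlap_similarity(x, y):
    """Share of variables on which two categorical records agree."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


def dummy_encode(rows):
    """Turn each categorical variable into one 0/1 column per category."""
    n_vars = len(rows[0])
    categories = [sorted({row[j] for row in rows}) for j in range(n_vars)]
    return [
        [int(row[j] == c) for j in range(n_vars) for c in categories[j]]
        for row in rows
    ]


def jaccard_similarity(u, v):
    """Jaccard coefficient on binary vectors: joint 1s over positions with any 1."""
    both = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(u, v) if a == 1 or b == 1)
    return both / either if either else 1.0


# Hypothetical toy data: three objects described by three nominal variables.
rows = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]
dummies = dummy_encode(rows)

for i in range(len(rows)):
    for j in range(i + 1, len(rows)):
        print(
            i, j,
            "overlap:", overlap_similarity(rows[i], rows[j]),
            "jaccard on dummies:", jaccard_similarity(dummies[i], dummies[j]),
        )
```

The dummy coding doubles the number of columns in this toy case (three variables become six indicators), which hints at the computational cost the description mentions. For complete data with p variables and m matching categories, the Jaccard coefficient on the dummies equals m / (2p - m), so for this particular pair of measures the two codings rank object pairs in the same order.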
Title:
Evaluation of Selected Approaches to Clustering Categorical Variables
Authors:
Šulc, Zdeněk
Links:
https://bibliotekanauki.pl/articles/465958.pdf
Publication date:
2014
Publisher:
Główny Urząd Statystyczny
Subjects:
variable clustering
nominal variables
association measures
similarity measures
Description:
This paper focuses on recently proposed similarity measures and their performance in categorical variable clustering. It compares clustering results using three recently developed similarity measures (IOF, OF and Lin measures) with results obtained using two association measures for nominal variables (Cramér’s V and the uncertainty coefficient) and with the simple matching coefficient (the overlap measure). To eliminate the influence of a particular linkage method on the structure of final clusters, three linkage methods are examined (complete, single, average). The created groups (clusters) of variables can be considered as the basis for dimensionality reduction, e.g. by choosing one of the variables from a given group as a representative for the whole group. The quality of resulting clusters is evaluated by the within-cluster variability, expressed by the WCM coefficient, and by dendrogram analysis. The examined similarity measures are compared and evaluated using two real data sets from a social survey.
Source:
Statistics in Transition new series; 2014, 15, 4; 591-610
1234-7655
Appears in:
Statistics in Transition new series
Content provider:
Biblioteka Nauki
Article
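
Of the measures named in the description above, Cramér's V has a compact textbook definition, V = sqrt(chi² / (n · (min(r, c) − 1))), and can serve as a small illustration of how an association measure becomes an input to variable clustering. The sketch below is an assumption-laden example, not the paper's procedure: the survey variables are invented, and turning V into a dissimilarity as 1 − V is one simple choice among several.

```python
from collections import Counter
from math import sqrt


def cramers_v(x, y):
    """Cramér's V for two nominal variables given as equal-length sequences."""
    n = len(x)
    joint = Counter(zip(x, y))   # observed cell counts
    row = Counter(x)             # marginal counts of x
    col = Counter(y)             # marginal counts of y
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical survey variables observed on the same four respondents.
variables = {
    "region": ["a", "a", "b", "b"],
    "dialect": ["u", "u", "v", "v"],   # fully determined by region
    "gender": ["u", "v", "u", "v"],    # unrelated to region
}

names = list(variables)
# Pairwise dissimilarities between variables (assumed form: 1 - V).
dissimilarity = {
    (p, q): 1.0 - cramers_v(variables[p], variables[q])
    for i, p in enumerate(names)
    for q in names[i + 1:]
}
for pair, d in dissimilarity.items():
    print(pair, round(d, 3))
```

A dissimilarity table of this kind is what a linkage method (complete, single, or average, as listed in the description) would then merge step by step; variables that end up in the same cluster are candidates for being represented by a single one.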
Title:
On the Analysis of Correlation Between Nominal Data and Numerical Data
Authors:
Gniazdowski, Zenon
Links:
https://bibliotekanauki.pl/articles/2163406.pdf
Publication date:
2022-12
Publisher:
Warszawska Wyższa Szkoła Informatyki
Subjects:
nominal data
numerical data
numerical coding of nominal data
complex random variable
correlation coefficient
complex correlation
complex least squares method
Description:
The article investigates the possibility of measuring the strength of a linear correlation between nominal data and numerical data. Correlation coefficients were studied for variables coded with real numbers as well as for variables coded with complex numbers. For variables coded with real numbers, unambiguous measures of real linear correlation were obtained. In the case of complex coding, it was observed that the obtained complex correlation coefficients change with the permutation of the phases in the complex numbers used to code classes of elements with equal cardinalities. It was found that a necessary condition for linear correlation is the possibility of linearly ordering the data set. Since a linear order is not possible in the set of complex numbers, complex correlation coefficients cannot be used as a measure of linear correlation. For such situations, a substitute action was suggested that would prevent equal cardinality of the classes of identical elements in the nominal data set. This action would consist in correcting the data, analogous to the correction performed during preprocessing or cleaning of data containing missing or outlier values.
Source:
Zeszyty Naukowe Warszawskiej Wyższej Szkoły Informatyki; 2022, 16, 27; 57-82
1896-396X
2082-8349
Appears in:
Zeszyty Naukowe Warszawskiej Wyższej Szkoły Informatyki
Content provider:
Biblioteka Nauki
Article
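
The observation in the description above, that a complex correlation coefficient changes when the phases assigned to equally sized classes are permuted, can be reproduced with a small sketch. Everything specific here is an assumption, since the abstract gives neither the coding nor the formula: classes are coded as evenly spaced points on the unit circle, and the coefficient is the usual Pearson-type ratio with a conjugate in the covariance term. The data and function names are hypothetical.

```python
import cmath
from math import pi, sqrt


def complex_correlation(z, w):
    """Pearson-type coefficient for (possibly complex) sequences z and w."""
    n = len(z)
    mean_z = sum(z) / n
    mean_w = sum(w) / n
    dz = [a - mean_z for a in z]
    dw = [b - mean_w for b in w]
    # Covariance term uses the complex conjugate of the second variable.
    num = sum(a * complex(b).conjugate() for a, b in zip(dz, dw))
    den = sqrt(sum(abs(a) ** 2 for a in dz) * sum(abs(b) ** 2 for b in dw))
    return num / den


def phase_code(labels, order):
    """Assumed coding: the i-th class in `order` gets phase 2*pi*i/k on the unit circle."""
    k = len(order)
    code = {c: cmath.exp(2j * pi * i / k) for i, c in enumerate(order)}
    return [code[c] for c in labels]


# Hypothetical data: four classes with equal cardinalities (two elements each).
labels = ["a", "a", "b", "b", "c", "c", "d", "d"]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

r_first = complex_correlation(phase_code(labels, ["a", "b", "c", "d"]), y)
r_second = complex_correlation(phase_code(labels, ["a", "c", "b", "d"]), y)

print("phases assigned in order a, b, c, d:", r_first, "modulus", abs(r_first))
print("phases assigned in order a, c, b, d:", r_second, "modulus", abs(r_second))
```

Both codings describe the same nominal variable, yet the two coefficients differ in value and in modulus, so under these assumptions the number obtained depends on an arbitrary labeling choice rather than on the data alone.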