- Title:
- A comparative analysis of the principal component method and parallel analysis in working with official statistical data
- Authors:
- Holubova, Halyna
- Links:
- https://bibliotekanauki.pl/articles/10559806.pdf
- Publication date:
- 2023-02-24
- Publisher:
- Główny Urząd Statystyczny
- Subjects:
-
principal components
principal component analysis
factor analysis
Kaiser criterion
parallel analysis
simulation
- Description:
- The dynamic development of the digitized society generates large-scale data flows. Data therefore need to be compressed in a way that keeps their content complete and informative. To achieve this, it is advisable to use the principal component method, whose main task is to reduce the dimension of a multidimensional space with minimal loss of information. The article describes the basic conceptual approaches to the definition of principal components and presents the methodological principles of selecting them. Among the many ways to select principal components, the easiest is to keep the first k components with the largest eigenvalues or to determine the percentage of the total variance explained by each component. Many statistical packages use the Kaiser criterion for this purpose. However, this criterion fails to take into account that even random data (noise) can produce components with eigenvalues greater than one, so redundant components may be selected. We conclude that the classical mechanisms for selecting principal components should be used with caution. The parallel analysis method uses multiple data simulations to overcome the problem of random errors. It assumes that the components of real data must have greater eigenvalues than the parallel components derived from simulated data with the same sample size and design, variance and number of variables. A comparative analysis of the eigenvalues was performed using two methods, the Kaiser criterion and Horn's parallel analysis, on several example data sets. The study shows that parallel analysis produces more valid results with actual data sets.
We believe that the main advantage of parallel analysis is its ability to model the process of selecting the required number of principal components by determining the point at which they can no longer be distinguished from components generated by simulated noise.
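The contrast between the two selection rules described above can be sketched in a few lines of Python with NumPy. This is an illustrative sketch, not the article's own code. The function names, the use of standard normal noise for the simulated data, the 95th-percentile threshold and the number of simulations are all assumptions made for the example.

```python
import numpy as np


def correlation_eigenvalues(data):
    """Eigenvalues of the correlation matrix, sorted from largest to smallest."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def kaiser_count(eigenvalues):
    """Kaiser criterion: keep every component whose eigenvalue exceeds one."""
    return int(np.sum(eigenvalues > 1.0))


def parallel_analysis_count(data, n_simulations=200, percentile=95, seed=0):
    """Parallel analysis sketch (assumed settings, see the text above).

    Keeps the leading components whose eigenvalues exceed the chosen
    percentile of eigenvalues obtained from simulated noise with the same
    number of observations and variables.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_vars = data.shape
    observed = correlation_eigenvalues(data)

    simulated = np.empty((n_simulations, n_vars))
    for i in range(n_simulations):
        noise = rng.standard_normal((n_samples, n_vars))
        simulated[i] = correlation_eigenvalues(noise)
    threshold = np.percentile(simulated, percentile, axis=0)

    # Stop at the first component that is indistinguishable from noise.
    count = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        count += 1
    return count


# Hypothetical demo data: pure noise with no underlying structure.
demo_rng = np.random.default_rng(42)
demo_noise = demo_rng.standard_normal((100, 10))
print("Kaiser on noise:", kaiser_count(correlation_eigenvalues(demo_noise)))
print("Parallel analysis on noise:", parallel_analysis_count(demo_noise))
```

On pure noise, the eigenvalues of a correlation matrix sum to the number of variables, so the largest one almost always exceeds one and the Kaiser criterion retains at least one component. Parallel analysis instead compares each eigenvalue with what noise of the same shape typically produces.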
- Source:
-
Statistics in Transition new series; 2023, 24, 1; 199-212
1234-7655
- Appears in:
- Statistics in Transition new series
- Content provider:
- Biblioteka Nauki