- Title:
- A new method for identifying outlying subsets of data
- Authors:
-
Zalewska, M.
Grzanka, A.
Niemiro, W.
Samoliński, B.
- Links:
- https://bibliotekanauki.pl/articles/970610.pdf
- Publication date:
- 2008
- Publisher:
- Polska Akademia Nauk. Instytut Badań Systemowych PAN
- Subjects:
-
misclassification error
discriminant analysis
multidimensional homogeneity test
medical data
- Description:
- In various branches of science, such as medicine, economics, and sociology, it is necessary to identify or detect outlying subsets of data. Suppose that a data set is partitioned into many relatively small subsets and we have some reason to suspect that one or several of these subsets may be atypical or aberrant. We propose applying a new measure of separability, based on ideas borrowed from discriminant analysis. In our paper we define two versions of this measure, both using a jackknife (leave-one-out) estimator of classification error. If a suspected subset is significantly well separated from the main bulk of the data, then we regard it as outlying. The usefulness of our algorithm is illustrated on a set of medical data collected in a large survey, "Epidemiology of Allergic Diseases in Poland" (ECAP). We also tested our method on artificial data sets and on the classical IRIS data set. For comparison, we report the results of the homogeneity test of Bartoszyński, Pearl and Lawrence, applied to the same data sets.
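The description above does not spell out the two versions of the separability measure, so the following is only a rough illustration of the general idea, not the authors' method. It assumes a simple nearest-centroid classifier with Euclidean distance (a stand-in for whatever discriminant rule the paper uses) and estimates its leave-one-out error when separating a suspected subset from the rest of the data. The names `centroid` and `loo_error` are hypothetical, and the significance assessment mentioned in the description is not sketched here.

```python
from math import dist  # available in Python 3.8+


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def loo_error(subset, rest):
    """Leave-one-out error of a nearest-centroid rule separating two groups.

    Assumption: this is an illustrative stand-in, not the measure defined
    in the paper. Each point is held out of its own group, both centroids
    are computed, and the point is assigned to the nearer one (ties count
    as errors). The result is the mean of the two per-group error rates,
    so a small subset is not swamped by a large remainder.
    Both groups must contain at least two points.
    """
    groups = (subset, rest)
    rates = []
    for g, own in enumerate(groups):
        other_centre = centroid(groups[1 - g])
        errors = 0
        for i, x in enumerate(own):
            # Recompute the own-group centroid without the held-out point.
            own_centre = centroid(own[:i] + own[i + 1:])
            if dist(x, own_centre) >= dist(x, other_centre):
                errors += 1
        rates.append(errors / len(own))
    return sum(rates) / 2


if __name__ == "__main__":
    suspected = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
    bulk = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    print(loo_error(suspected, bulk))
```

Under these assumptions, an error near 0 suggests that the suspected subset is well separated from the bulk, while an error near 0.5 or above suggests it is not distinguishable by this rule.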
- Source:
-
Control and Cybernetics; 2008, 37, 3; 693-709
0324-8569
- Appears in:
- Control and Cybernetics
- Content provider:
- Biblioteka Nauki