- Title:
- Classifiers accuracy improvement based on missing data imputation
- Authors:
-
Jordanov, I.
Petrov, N.
Petrozziello, A.
- Links:
- https://bibliotekanauki.pl/articles/91626.pdf
- Publication date:
- 2018
- Publisher:
- Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
- Subjects:
-
machine learning
missing data
model-based imputation
neural networks
random forests
support vector machine
radar signal classification
- Description:
- In this paper we extend our previous work on radar signal identification and classification. The data set comprises continuous, discrete and categorical data that represent radar pulse train characteristics such as signal frequencies, pulse repetition, type of modulation, intervals, scan period and scanning type. Like most real-world datasets, it contains a high percentage of missing values. To deal with this problem we investigate three imputation techniques: Multiple Imputation (MI), K-Nearest Neighbour Imputation (KNNI) and Bagged Tree Imputation (BTI). We apply these methods to data samples with up to 60% missingness, thereby doubling the number of instances with complete values in the resulting dataset. The performance of the imputation models is assessed with Wilcoxon’s test for statistical significance and Cohen’s effect size metrics. To solve the classification task, we employ three intelligent approaches: Neural Networks (NN), Support Vector Machines (SVM) and Random Forests (RF). Subsequently, we critically analyse which imputation method most influences the classifiers’ performance, using a multiclass classification accuracy metric based on the area under the ROC curves. We consider two superclasses (‘military’ and ‘civil’), each containing several subclasses, and propose two new metrics, inner class accuracy (IA) and outer class accuracy (OA), in addition to the overall classification accuracy (OCA) metric. We conclude that they can be used as a complement to the OCA when choosing the best classifier for the problem at hand.
- Source:
-
Journal of Artificial Intelligence and Soft Computing Research; 2018, 8, 1; 31-48
2083-2567
2449-6499
- Appears in:
- Journal of Artificial Intelligence and Soft Computing Research
- Content provider:
- Biblioteka Nauki