Subject: imbalanced data - OPAC collection catalog


Title:: Analiza danych niezrównoważonych we wstępnej diagnostyce raka pęcherza moczowego
Analysis of imbalanced data using morphometric parameters in diagnosis of bladder cancer
Authors:: Piotrowska, E.
Stanisławski, W.
Links:: https://bibliotekanauki.pl/articles/157289.pdf
Publication date:: 2012
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: dane niezrównoważone
uczenie nadzorowane
imbalanced data
supervised learning
Description:: The article presents the results of a study on classifying imbalanced data in microscope images of cytological specimens. Classification used supervised learning algorithms: the naive Bayes classifier, discriminant analysis, decision trees, and a classification algorithm proposed by the authors that combines rough sets with the k-nearest neighbors method. The analysis used the Rough Sets Analysis Toolbox (RSA Toolbox), a toolbox for the MATLAB environment developed by the authors. The microscope images were obtained during bladder cancer diagnosis by examining suitably prepared urine specimens with the FISH method.
The paper describes the results of imbalanced data classification based on microscope images. The images were acquired in the process of bladder cancer diagnosis using the FISH method. The research focused on the effectiveness of initial cancer diagnosis using specimen radiation in a DAPI channel and supervised learning methods. The analyzed data set contains about 23,000 objects described by 212 morphometric features. Each object was assigned to one of two classes: normal cells or cancer cells. The class assignments were made by an expert. Only 640 cancer cells were identified in the analyzed data. Most learning algorithms assume balanced classes. Class imbalance causes difficulties at the learning stage and reduces predictive ability. Therefore, classifier evaluation was performed using the G-mean and F-value measures. The authors defined an additional measure, FMaxSen = sen² · spe, which is the product of the squared sensitivity and the specificity. Squaring the sensitivity emphasizes its importance and allows searching for the classifier with maximum specificity at maximum sensitivity. The analysis presented in the paper was performed with the Rough Sets Analysis Toolbox (RSA Toolbox) for MATLAB, implemented by the authors. The main part of the RSA Toolbox is a module that supports rough set theory processing. Another part (the RSAm module) is a wrapper for the proposed rough classification functions and for others implemented in MATLAB, such as NaiveBayes, Discriminant Analysis, and Decision Tree. RSAm makes it possible to use cross-validation to measure classification accuracy. RSAm also contains feature reduction algorithms (correlation-based feature selection, sequential feature selection, principal component analysis) as well as discretization algorithms (EWD, CAIM, CACC).
An important part of the RSA Toolbox is the implementation of distributed computations using the MATLAB Parallel Computing Toolbox and Distributed Computing Server.
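The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' RSA Toolbox code (which is written for MATLAB). The function name `imbalance_metrics` and all counts are hypothetical. The G-mean is taken as the square root of sensitivity times specificity and the F-value as the harmonic mean of precision and sensitivity; both are common definitions assumed here, and the paper may use variants. FMaxSen follows the abstract: squared sensitivity times specificity.

```python
import math


def imbalance_metrics(tp, fn, tn, fp):
    """Compute evaluation measures from binary confusion-matrix counts.

    Cancer cells are treated as the positive class. Ratios with an
    empty denominator are reported as 0.0.
    """
    sen = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity (recall)
    spe = tn / (tn + fp) if (tn + fp) else 0.0        # specificity
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Assumed definition: F-value as harmonic mean of precision and sensitivity.
    f_value = (2 * precision * sen / (precision + sen)
               if (precision + sen) else 0.0)
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": sen,
        "specificity": spe,
        "g_mean": math.sqrt(sen * spe),   # assumed definition of G-mean
        "f_value": f_value,
        "f_max_sen": sen ** 2 * spe,      # FMaxSen as given in the abstract
    }


# Hypothetical scenario: 23,000 objects, 640 of them cancer cells, and a
# trivial classifier that labels every object as a normal cell.
always_normal = imbalance_metrics(tp=0, fn=640, tn=22360, fp=0)
for name, value in always_normal.items():
    print(f"{name}: {value:.4f}")
```

With these hypothetical counts the trivial classifier reaches high accuracy while its sensitivity, G-mean, F-value, and FMaxSen are all zero, which is why measures of this kind are preferred over plain accuracy on imbalanced data.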
Source:: Pomiary Automatyka Kontrola; 2012, Vol. 58, No. 8, 8; 737-740
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article