Search results for the phrase "Random Forest" by criterion: Subject


Title:
Lasy losowe - ocena jakości prognostycznej cech
Random forests - evaluation of predictive accuracy
Authors:
Krętowska, M.
Links:
https://bibliotekanauki.pl/articles/341027.pdf
Publication date:
2007
Publisher:
Politechnika Białostocka. Oficyna Wydawnicza Politechniki Białostockiej
Subjects:
random forest
survival analysis
absolute prediction error
predictive accuracy
explained variation
Description:
In the paper, predictive accuracy, measured as the absolute predictive error, is used to evaluate the prognostic quality of individual covariates. The prognostic tool, a random forest, is built to obtain an aggregated estimator of the survival function. This estimator is then compared with the Kaplan-Meier estimator of the survival function, which is computed under the assumption that the population is homogeneous. The forest is composed of dipolar survival trees; each tree is induced by minimizing a piecewise-linear function, the dipolar criterion. This criterion makes it possible to use the incomplete information carried by censored observations, for which the exact survival time is unknown.
Source:
Zeszyty Naukowe Politechniki Białostockiej. Informatyka; 2007, 2; 67-77
1644-0331
Appears in:
Zeszyty Naukowe Politechniki Białostockiej. Informatyka
Content provider:
Biblioteka Nauki
Article
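The record above uses the Kaplan-Meier estimator as the baseline against which the forest's survival function is compared. As a rough illustration of how that baseline handles censored observations, the sketch below computes the standard product-limit estimate, S(t) = product over event times t_i <= t of (1 - d_i / n_i). The function name `kaplan_meier` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  -- observed times (event time or censoring time)
    events -- 1 if the event was observed, 0 if the observation is censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every subject leaves the risk set at its time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Censored subjects still count in the denominator up to time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data: six subjects, two of them censored (event = 0).
times = [2, 3, 3, 5, 8, 8]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.4f}")
```

A censored subject never triggers a drop in the curve, but it remains in the risk set until its censoring time, which is how the estimator uses the partial information such observations carry.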
Title:
Imitation learning of car driving skills with decision trees and random forests
Authors:
Cichosz, P.
Pawełczak, Ł.
Links:
https://bibliotekanauki.pl/articles/329901.pdf
Publication date:
2014
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:
imitation learning
behavioral cloning
model ensemble
random forest
control
autonomous driving
car racing
decision tree
Description:
Machine learning is an appealing and useful approach to creating vehicle control algorithms, both for simulated and real vehicles. One common and widely applicable learning scenario is learning by imitation, in which the behavior of an exemplary driver provides training instances for a supervised learning algorithm. This article follows this approach in the domain of simulated car racing, using the TORCS simulator. In contrast to most prior work on imitation learning, a symbolic decision tree knowledge representation is adopted, which combines potentially high accuracy with human readability, an advantage that can be important in many applications. Decision trees are demonstrated to be capable of representing high-quality control models, reaching the performance level of sophisticated pre-designed algorithms. This is achieved by enhancing the basic imitation learning scenario to include active retraining, automatically triggered on control failures. It is also demonstrated how better stability and generalization can be achieved by sacrificing human readability and using decision tree model ensembles. The methodology for learning control models contributed by this article can hopefully be applied to solve real-world control tasks, as well as to develop video game bots.
Source:
International Journal of Applied Mathematics and Computer Science; 2014, 24, 3; 579-597
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
Title:
Maximising accuracy and efficiency of traffic accident prediction combining information mining with computational intelligence approaches and decision trees
Authors:
Tambouratzis, T.
Souliou, D.
Chalikias, M.
Gregoriades, A.
Links:
https://bibliotekanauki.pl/articles/91652.pdf
Publication date:
2014
Publisher:
Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Subjects:
traffic accident
location
prediction
probabilistic neural networks
random forest
accuracy
efficiency
decision tree
Description:
The development of universal methodologies for the accurate, efficient, and timely prediction of traffic accident location and severity constitutes a crucial endeavour. In this piece of research, the best combinations of salient accident-related parameters and accurate accident severity prediction models are determined for the 2005 accident dataset brought together by the Republic of Cyprus Police. The optimal methodology involves: (a) information mining in the form of feature selection of the accident parameters that maximise prediction accuracy (implemented via scatter search), followed by feature extraction (implemented via principal component analysis) and selection of the minimal number of components that contain the salient information of the original parameters, which combined bring about an overall 74.42% reduction in the dataset dimensionality; (b) accident severity prediction via probabilistic neural networks and random forests, both of which independently accomplish over 96% correct prediction and a balanced proportion of under- and over-estimations of accident severity. An explanation of the superiority of the optimal combinations of parameters and models is given, as is a comparison with existing accident classification/prediction approaches.
Source:
Journal of Artificial Intelligence and Soft Computing Research; 2014, 4, 1; 31-42
2083-2567
2449-6499
Appears in:
Journal of Artificial Intelligence and Soft Computing Research
Content provider:
Biblioteka Nauki
Article
Title:
Dimensionality Reduction for Probabilistic Neural Network in Medical Data Classification Problems
Authors:
Kusy, M.
Links:
https://bibliotekanauki.pl/articles/226697.pdf
Publication date:
2015
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
probabilistic neural network
dimensionality reduction
feature selection
feature extraction
single decision tree
random forest
principal component analysis
prediction ability
Description:
This article presents the study regarding the problem of dimensionality reduction in training data sets used for classification tasks performed by the probabilistic neural network (PNN). Two methods for this purpose are proposed. The first solution is based on the feature selection approach where a single decision tree and a random forest algorithm are adopted to select data features. The second solution relies on applying the feature extraction procedure which utilizes the principal component analysis algorithm. Depending on the form of the smoothing parameter, different types of PNN models are explored. The prediction ability of PNNs trained on original and reduced data sets is determined with the use of a 10-fold cross validation procedure.
Source:
International Journal of Electronics and Telecommunications; 2015, 61, 3; 289-300
2300-1933
Appears in:
International Journal of Electronics and Telecommunications
Content provider:
Biblioteka Nauki
Article
Title:
The use of data mining models in solving the problem of imbalanced classes based on the example of an online marketing campaign
Wykorzystanie modeli data mining w rozwiązywaniu problemu niezrównoważonych klas na przykładzie kampanii marketingowych w Internecie
Authors:
Łapczyński, Mariusz
Surma, Jerzy
Links:
https://bibliotekanauki.pl/articles/424980.pdf
Publication date:
2015
Publisher:
Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu
Subjects:
C&RT
Random Forest
imbalanced class problem
online social network
banner ad campaign
Description:
When building predictive models in analytical CRM, researchers often encounter the problem of imbalanced classes (skewed distributions of the dependent variable): the number of observations belonging to one category of the dependent variable is much lower than the number belonging to the other category. The problem arises in areas such as churn analysis, customer acquisition models, and cross-selling and up-selling models. The purpose of the paper is to present a predictive model built to predict the response of Internet users to banner advertising. The dataset used in the study came from an online social network that offers advertisers banner campaigns targeting its users. The advertising campaign of a cosmetics company was carried out in the autumn of 2010 and was mainly targeted at young women. Each user of the service was described by 115 independent variables, of which 3 were demographic (sex, age, education) and the remaining 112 referred to the user's online activity. The problem of imbalanced classes appeared during model building because few users clicked on the banner ad: of 81,000 cases, only 207 were positive reactions to the banner, approximately 0.25% of all cases. Two popular data mining tools were used in the study: C&RT decision trees and Random Forest. The second goal of the paper is to compare the performance of the predictive models based on these two tools.
Source:
Econometrics. Ekonometria. Advances in Applied Data Analytics; 2015, 3 (49); 9-19
1507-3866
Appears in:
Econometrics. Ekonometria. Advances in Applied Data Analytics
Content provider:
Biblioteka Nauki
Article
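The class imbalance reported in the record above (207 positive responses among 81,000 cases) can be made concrete with a few lines of arithmetic. The sketch below computes the positive rate and a pair of inverse-frequency class weights, one common way to compensate for imbalance when training tree-based models. The weighting scheme is an assumption made for illustration; the abstract does not say how the authors handled the imbalance.

```python
# Counts taken from the abstract.
n_cases = 81_000
n_positive = 207
n_negative = n_cases - n_positive

positive_rate = n_positive / n_cases

# Inverse-frequency ("balanced") weights: n_cases / (n_classes * class_count).
# This scheme is an illustrative assumption, not the paper's stated method.
n_classes = 2
class_weight = {
    0: n_cases / (n_classes * n_negative),
    1: n_cases / (n_classes * n_positive),
}
weight_ratio = class_weight[1] / class_weight[0]

print(f"positive rate: {positive_rate:.4%}")
print(f"weight of a positive case relative to a negative one: {weight_ratio:.1f}")
```

With weights like these, each rare positive case counts as much as several hundred negative cases, so a classifier can no longer reach a low weighted error simply by predicting "no click" for everyone.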
Title:
Data mining methods for prediction of air pollution
Authors:
Siwek, K.
Osowski, S.
Links:
https://bibliotekanauki.pl/articles/330775.pdf
Publication date:
2016
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:
computational intelligence
feature selection
neural network
random forest
air pollution forecasting
Description:
The paper discusses methods of data mining for prediction of air pollution. Two tasks are important in this problem: generation and selection of the prognostic features, and construction of the final prognostic system that forecasts the pollution for the next day. An advanced set of features, created on the basis of the atmospheric parameters, is proposed. This set is analyzed in order to select the features that are most important from the prediction point of view. Two methods of feature selection are compared: one applies a genetic algorithm (a global approach), and the other a linear stepwise-fit method (a locally optimized approach). On the basis of this analysis, two sets of the most predictive features are selected. These sets are used in the prediction of the atmospheric pollutants PM10, SO2, NO2 and O3. Two approaches to prediction are compared. In the first, the selected features are applied directly to the random forest (RF), which forms an ensemble of decision trees. In the second, intermediate predictors built on the basis of neural networks (the multilayer perceptron, the radial basis function and the support vector machine) are used; they form an ensemble whose outputs are integrated into the final prognosis. The paper shows that preselecting the most important features, combined with an ensemble of predictors, significantly increases the accuracy of atmospheric pollution forecasts.
Source:
International Journal of Applied Mathematics and Computer Science; 2016, 26, 2; 467-478
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
Title:
EQUITY ISSUANCE AND CORPORATE DIVIDEND POLICY IN EMERGING ECONOMY CONTEXT
Authors:
Rohov, Heorhiy
Solesvik, Marina Z.
Links:
https://bibliotekanauki.pl/articles/453403.pdf
Publication date:
2016
Publisher:
Szkoła Główna Gospodarstwa Wiejskiego w Warszawie. Katedra Ekonometrii i Statystyki
Subjects:
dividend policy
emission policy
random forest algorithm
Ukraine
Description:
This article explores links between the size of a company, the industrial sector in which it operates, concentration of capital, size of business, and emission and dividend policy in the Ukrainian corporate sector. Guided by insights from the bird-in-hand theory, clientele theory, signaling theory, and agency theory, we justify the factors that determine how Ukrainian public joint stock companies (PJSCs) choose to place their shares and form their dividend policy under the current operating conditions of the Ukrainian corporate sector. Using tree-based classification in the form of the random forest algorithm, we found that maximizing the value of the share capital raised through share issuance by Ukrainian PJSCs is not a priority for owners of corporate rights. Private placements of shares were selected by 86.1 per cent of companies; in the non-financial sector, 87.5 per cent of companies opted for private placements. The study also revealed that only a small share (3.5%) of Ukrainian joint stock companies paid dividends to shareholders. However, the dividend policy of Ukrainian joint stock companies changed when they listed their shares on foreign stock markets: in this case, two thirds of the firms studied paid dividends.
Source:
Metody Ilościowe w Badaniach Ekonomicznych; 2016, 17, 4; 114-137
2082-792X
Appears in:
Metody Ilościowe w Badaniach Ekonomicznych
Content provider:
Biblioteka Nauki
Article
Title:
Zastosowanie metod czarnej skrzynki do prognozowania wartości wybranych wskaźników jakości ścieków dopływających do oczyszczalni komunalnej
Black-box forecasting of selected indicator values for influent wastewater quality in municipal treatment plant
Authors:
Szeląg, B.
Bartkiewicz, L.
Studziński, J.
Links:
https://bibliotekanauki.pl/articles/236740.pdf
Publication date:
2016
Publisher:
Polskie Zrzeszenie Inżynierów i Techników Sanitarnych
Subjects:
sewage
modeling
sewage quality forecasting
MARS (multivariate adaptive regression spline)
random forest (RF)
self-organizing map (SOM)
boosted trees (BT)
principal component analysis (PCA)
Description:
Forecasting the amount and quality of wastewater flowing into a municipal treatment plant sufficiently in advance enables effective control of numerous treatment process parameters. Therefore, mathematical models (physical deterministic and time series statistical) that forecast both the amount and the quality of wastewater inflow into a sewage treatment plant are under development. This paper investigates the possibility of applying simpler time series models to forecast the values of selected sewage quality indicators in the inflow into a treatment plant: biochemical oxygen demand (BOD5), total suspended solids (TSS), total nitrogen (TN), ammonium (NH4+) and total phosphorus (TP). The forecasts were based solely on sewage flow rate data and, for the purpose of comparison, on the measured indicator values. For this purpose, black-box methods of the MARS type and random forests (RF) were used. The possibility of combining the RF method with a classification model (RF+SOM) was also investigated. Boosted trees (BT) and principal component analysis (PCA) were applied to identify the data that determine the variability of the selected sewage quality indicators. The models were developed on the basis of continuous daily measurements performed in the period 2013–2015 at the municipal sewage treatment plant in Rzeszów.
Source:
Ochrona Środowiska; 2016, 38, 4; 39-46
1230-6169
Appears in:
Ochrona Środowiska
Content provider:
Biblioteka Nauki
Article
Title:
Evaluation of the impact of explanatory variables on the accuracy of prediction of daily inflow to the sewage treatment plant by selected models nonlinear
Ocena wpływu zmiennych objaśniających na dokładność predykcji dobowego dopływu do oczyszczalni ścieków wybranymi modelami nieliniowymi
Authors:
Szeląg, B.
Bartkiewicz, L.
Studziński, J.
Barbusiński, K.
Links:
https://bibliotekanauki.pl/articles/205349.pdf
Publication date:
2017
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
wastewater treatment plant
data mining
random forest
forecasting inflow
modelling
k-nearest neighbour
Kernel regression
Description:
The aim of the study was to evaluate the possibility of applying different data mining methods to model the inflow of sewage into a municipal sewage treatment plant. Prediction models were developed using support vector machines (SVM), random forests (RF), k-nearest neighbour (k-NN) and Kernel regression (K). The data consisted of time series of daily rainfall, water level measurements in the receiver of the clarified sewage, and wastewater inflow into the Rzeszów city plant. The results indicate that, among the models with one input delayed by 1 day, the best were obtained with the k-NN method (an autoregressive model) and the worst with the K method. For the models with two input variables and one explanatory variable, the smallest errors were obtained when the model inputs were sewage inflow and rainfall data delayed by 1 day; the best fit was provided by the RF method and the worst by the K method. For the models with three inputs and two explanatory variables, the best results were reported for the SVM method and the worst for the K method. In most of the modelling runs, the smallest prediction errors were obtained with the SVM method and the largest with the K method. For the simplest model with one input delayed by 1 day, the best results were provided by the k-NN method, and in two modelling runs of the models with two inputs the RF method proved best.
Source:
Archives of Environmental Protection; 2017, 43, 3; 74-81
2083-4772
2083-4810
Appears in:
Archives of Environmental Protection
Content provider:
Biblioteka Nauki
Article
Title:
Mining Data of Noisy Signal Patterns in Recognition of Gasoline Bio-Based Additives using Electronic Nose
Authors:
Osowski, S.
Siwek, K.
Links:
https://bibliotekanauki.pl/articles/220792.pdf
Publication date:
2017
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
data mining
electronic nose
gasoline blends
random forest
support vector machine
wavelet denoising
Description:
The paper analyses the distorted data of an electronic nose used to recognize bio-based gasoline additives. Different data mining tools are applied, including data clustering, principal component analysis, wavelet transformation, the support vector machine and the random forest of decision trees. Special emphasis is placed on the robustness of the signal processing systems to the noise that distorts the registered sensor signals. A special denoising procedure based on the discrete wavelet transformation is proposed. This procedure makes it possible to reduce the recognition error rate significantly. The numerical results of experiments on the recognition of different gasoline blends show the superiority of the support vector machine in a noisy measurement environment.
Source:
Metrology and Measurement Systems; 2017, 24, 1; 27-44
0860-8229
Appears in:
Metrology and Measurement Systems
Content provider:
Biblioteka Nauki
Article
Title:
Predictive Business Process Monitoring with Tree-based Classification Algorithms
Authors:
Owczarek, Tomasz
Janke, Piotr
Links:
https://bibliotekanauki.pl/articles/503954.pdf
Publication date:
2018
Publisher:
Międzynarodowa Wyższa Szkoła Logistyki i Transportu
Subjects:
business process
prediction
classification
random forest
gradient boosting
Description:
Predictive business process monitoring is a current research area whose purpose is to predict the outcome of a whole process (or of an element of a process, i.e. a single event or task) based on available data. In the article we explore the possibility of using tree-based machine learning classification algorithms (CART, C5.0, random forest and extreme gradient boosting) to anticipate the result of a process. We test these algorithms on real-world event-log data and compare the results with known approaches. Our results show that.
Source:
Logistics and Transport; 2018, 40, 4; 73-82
1734-2015
Appears in:
Logistics and Transport
Content provider:
Biblioteka Nauki
Article
Title:
Using a GEOBIA framework for integrating different data sources and classification methods in context of land use/land cover mapping
Authors:
Osmólska, A.
Hawryło, P.
Links:
https://bibliotekanauki.pl/articles/145304.pdf
Publication date:
2018
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
land use map
land cover map
forest map
data fusion
random forest
supervised classification
Sentinel-2
Description:
Land use/land cover (LULC) maps are important datasets in various environmental projects. Our aim was to demonstrate how the GEOBIA framework can be used to integrate different data sources and classification methods in the context of LULC mapping. We present a multi-stage, semi-automated GEOBIA classification workflow created for LULC mapping of the Tuszyma Forestry Management area and based on multi-source, multi-temporal and multi-resolution input data: a 4-band aerial orthophoto, a LiDAR-derived nDSM, Sentinel-2 multispectral satellite images and ancillary vector data. Various classification methods were applied, namely rule-based and Random Forest supervised classification. This approach allowed us to focus on the classification of each class individually by taking advantage of all useful information from the various input data, expert knowledge, and advanced machine-learning tools. In the first step, twelve classes were assigned in a two-step rule-based classification approach that was either vector-based, ortho- and vector-based, or ortho- and LiDAR-based. Then, supervised classification was performed with the Random Forest algorithm. Three agriculture-related LULC classes with alternating vegetation conditions were assigned based on aerial orthophoto and Sentinel-2 information. For the classification of 15 LULC classes we obtained an overall accuracy of 81.3% and a kappa coefficient of 0.78. The visual evaluation and class coverage comparison showed that the generated LULC layer differs from the existing land cover maps, especially in the relative cover of agriculture-related classes. Generally, the created map can be considered superior to the existing data in terms of level of detail and correspondence to the actual environmental and vegetation conditions that can be observed in RS images.
Source:
Geodesy and Cartography; 2018, 67, 1; 99-116
2080-6736
2300-2581
Appears in:
Geodesy and Cartography
Content provider:
Biblioteka Nauki
Article
Title:
Vibroacoustic Real Time Fuel Classification in Diesel Engine
Authors:
Bąkowski, A.
Kekez, M.
Radziszewski, L.
Sapietova, A.
Links:
https://bibliotekanauki.pl/articles/177686.pdf
Publication date:
2018
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
fuel recognition
classification trees
particle swarm optimization (PSO)
random forest
Description:
This paper discusses five models and a methodology for constructing classifiers capable of recognizing, in real time, the type of fuel injected into a diesel engine cylinder with an accuracy acceptable in practical technical applications. Experimental research was carried out on a dynamic engine test facility. The in-cylinder and injection-line pressure signals of an internal combustion engine powered by mineral fuel, biodiesel or blends of these two fuel types were evaluated using the vibro-acoustic method. Computational intelligence methods such as classification trees, particle swarm optimization and random forest were applied.
Source:
Archives of Acoustics; 2018, 43, 3; 385-395
0137-5075
Appears in:
Archives of Acoustics
Content provider:
Biblioteka Nauki
Article
Title:
Attribute selection for stroke prediction
Authors:
Zdrodowska, Małgorzata
Links:
https://bibliotekanauki.pl/articles/386466.pdf
Publication date:
2019
Publisher:
Politechnika Białostocka. Oficyna Wydawnicza Politechniki Białostockiej
Subjects:
data mining
classifier
J48 (C4.5)
CART
PART
naive Bayes classifier
random forest
support vector machine
multilayer perceptron
haemorrhagic stroke
ischemic stroke
Description:
Stroke is the third most common cause of death and the most common cause of long-term disability among adults around the world. Therefore, stroke prediction and diagnosis is a very important issue. Data mining techniques help determine the correlations between individual patient characteristics, that is, extract from the medical information system the knowledge necessary to predict and treat various diseases. The study analysed the data of patients with stroke using eight known classification algorithms (J48 (C4.5), CART, PART, naive Bayes classifier, Random Forest, Support Vector Machine and Multilayer Perceptron neural networks), which made it possible to build an exploration model with an accuracy of over 88%. The potential features of patients that may be factors increasing the risk of stroke were also indicated.
Source:
Acta Mechanica et Automatica; 2019, 13, 3; 200-204
1898-4088
2300-5319
Appears in:
Acta Mechanica et Automatica
Content provider:
Biblioteka Nauki
Article
Title:
Ensemble-based Method of Fraud Detection at Self-checkouts in Retail
Authors:
Vitynskyi, P.
Tkachenko, R.
Izonin, I.
Links:
https://bibliotekanauki.pl/articles/410756.pdf
Publication date:
2019
Publisher:
Polska Akademia Nauk. Oddział w Lublinie PAN
Subjects:
classification
Ensemble-based method
Random Forest
fraud detection
retail
Ito decomposition
imbalanced dataset
Description:
The authors consider the problem of fraud detection at self-checkouts in retail under the condition of an imbalanced data set. A new ensemble-based method is proposed as an effective solution. The developed method involves two main steps: application of preprocessing procedures and the Random Forest algorithm. The preprocessing stage executes the following procedures sequentially over the input data: scaling by the maximal element in a column, row-wise scaling by the Euclidean norm, weighting by correlation, and polynomial extension. For the polynomial extension, the Ito decomposition of the second degree is used. The method was simulated on real data. Performance was evaluated using a cost matrix. An experimental comparison with a number of existing methods (single models and ensembles) demonstrates the best performance of the developed method. Experimental studies in which the parameters of the Random Forest were varied, both for the basic algorithm and for the developed method, demonstrate a significant improvement of the investigated efficiency measures for the latter. This improvement results from applying all steps of the preprocessing stage of the developed method.
Source:
ECONTECHMOD : An International Quarterly Journal on Economics of Technology and Modelling Processes; 2019, 8, 2; 3-8
2084-5715
Appears in:
ECONTECHMOD : An International Quarterly Journal on Economics of Technology and Modelling Processes
Content provider:
Biblioteka Nauki
Article
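The preprocessing stage named in the last record (column-wise scaling by the maximal element, row-wise scaling by the Euclidean norm, weighting by correlation, second-degree polynomial extension) can be sketched in plain Python as below. This is only one possible reading of the abstract. Weighting each feature by the absolute Pearson correlation with the target, and using all monomials up to degree two in place of the Ito decomposition, are assumptions made here, and every function name is hypothetical. The Random Forest step itself is omitted.

```python
import math
from itertools import combinations_with_replacement


def column_max_scale(X):
    """Divide every column by its largest absolute value."""
    maxima = [max(abs(v) for v in col) or 1.0 for col in zip(*X)]
    return [[v / m for v, m in zip(row, maxima)] for row in X]


def row_l2_normalise(X):
    """Divide every row by its Euclidean norm."""
    out = []
    for row in X:
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        out.append([v / norm for v in row])
    return out


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va and vb else 0.0


def correlation_weight(X, y):
    """Assumed scheme: multiply each feature by |corr(feature, target)|."""
    weights = [abs(pearson(col, y)) for col in zip(*X)]
    return [[v * w for v, w in zip(row, weights)] for row in X]


def quadratic_expand(X):
    """Append all pairwise products x_i * x_j (i <= j) to the linear terms."""
    out = []
    for row in X:
        pairs = combinations_with_replacement(range(len(row)), 2)
        out.append(list(row) + [row[i] * row[j] for i, j in pairs])
    return out


def preprocess(X, y):
    X = column_max_scale(X)
    X = row_l2_normalise(X)
    X = correlation_weight(X, y)
    return quadratic_expand(X)


# Hypothetical toy data: three transactions, two features, binary fraud label.
X = [[2.0, 10.0], [4.0, 5.0], [1.0, 20.0]]
y = [0, 1, 0]
features = preprocess(X, y)
print(len(features), "rows,", len(features[0]), "columns")
```

The expanded rows would then be passed to any Random Forest implementation. With n input features the expansion yields n + n(n+1)/2 columns, so the second-degree step grows quickly as features are added.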