Subject: data mining - OPAC collection catalog

Item 1.

Title: Contextual probability
Authors: Wang, H.
Links: https://bibliotekanauki.pl/articles/307791.pdf
Publication date: 2003
Publisher: Instytut Łączności - Państwowy Instytut Badawczy
Subjects: mathematical foundations
knowledge representation
machine learning
uncertainty
data mining
Description: In this paper we present a new probability function G that generalizes the classical probability function. A mass function is an assignment of basic probability to some contexts (events, propositions). It represents the strength of support for those contexts in a domain. A context is a subset of the frame of discernment, the set of basic elements of interest in a domain. It is a medium that carries the "probabilistic" knowledge about a domain. The G function is defined in terms of a mass function under various contexts. G is shown to be a probability function satisfying the axioms of probability, so it has all the properties attributed to a probability function. If the mass function is obtained from a probability function by normalization, then G is shown to be a linear function of the probability distribution and a linear function of probability. With this relationship we can estimate the probability distribution from probabilistic knowledge carried in some contexts without any model assumption.
Source: Journal of Telecommunications and Information Technology; 2003, 3; 92-97
1509-4553
1899-8852
Appears in: Journal of Telecommunications and Information Technology
Content provider: Biblioteka Nauki

Article

Item 2.

Title: A survey of big data classification strategies
Authors: Banchhor, Chitrakant
Srinivasu, N.
Links: https://bibliotekanauki.pl/articles/2050171.pdf
Publication date: 2020
Publisher: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects: big data
data mining
MapReduce
classification
machine learning
evolutionary intelligence
deep learning
Description: Big data now plays a major role in finance, industry, medicine, and various other fields. This survey reviews 50 research papers on the big data classification techniques presented or used in the respective studies. The techniques are categorized into machine learning, evolutionary intelligence, fuzzy-based approaches, deep learning and so on. The research gaps and challenges that existing big data classification techniques face are also listed and described, which should help researchers enhance the effectiveness of their future work. The papers are analyzed with respect to software tools, datasets used, publication year, classification techniques, and performance metrics. The survey concludes that the most frequently used big data classification methods are based on machine learning techniques, and that the most commonly used datasets apparently come from the UCI repository. The most frequently used performance metrics are accuracy and execution time.
Source: Control and Cybernetics; 2020, 49, 4; 447-469
0324-8569
Appears in: Control and Cybernetics
Content provider: Biblioteka Nauki

Article

Item 3.

Title: The use of machine learning technique for short-term forecasting of demand for electricity
Wykorzystanie technik uczenia maszynowego do krótkoterminowego prognozowania zapotrzebowania na energię elektryczną
Authors: Nęcka, K.
Links: https://bibliotekanauki.pl/articles/337635.pdf
Publication date: 2014
Publisher: Sieć Badawcza Łukasiewicz - Przemysłowy Instytut Maszyn Rolniczych
Subjects: data mining
electricity
machine learning
short-term forecasts
Description: The study verifies the usefulness of selected machine learning techniques for predicting hourly electricity demand over a short time horizon. The analyses show that both the lowest MAPE forecast error on the test set (17%) and the lowest share of balancing energy in total consumption (not exceeding 15%) were obtained for models whose inputs included the averaged electricity consumption profile for characteristic days of the week, the forecast number of pure production pieces, and the encoded day of the week and time of day. Among the tested models, forecasts based on artificial neural networks and standard CRT trees showed the best prediction quality.
Source: Journal of Research and Applications in Agricultural Engineering; 2014, 59, 2; 71-74
1642-686X
2719-423X
Appears in: Journal of Research and Applications in Agricultural Engineering
Content provider: Biblioteka Nauki

Article
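
The description in this record reports forecast quality as MAPE (mean absolute percentage error). The record does not give the paper's exact formula, so the following is only a minimal sketch under the standard definition. The function name `mape` and the demand values are hypothetical, chosen for illustration.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned in percent.

    Assumes the standard definition: the mean of |actual - forecast| / |actual|.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical hourly demand values (not data from the paper).
actual_demand = [100.0, 200.0, 400.0]
forecast_demand = [110.0, 180.0, 400.0]
error_percent = mape(actual_demand, forecast_demand)
print(f"MAPE: {error_percent:.2f}%")
```

Because each error is divided by the actual value, MAPE penalizes misses at low-demand hours more heavily than the same absolute miss at peak hours.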

Item 4.

Title: A survey on prediction of diabetes using classification algorithms
Authors: Khanwalkar, A.
Soni, R.
Links: https://bibliotekanauki.pl/articles/1818807.pdf
Publication date: 2021
Publisher: Stowarzyszenie Komputerowej Nauki o Materiałach i Inżynierii Powierzchni w Gliwicach
Subjects: diabetes
diabetes prediction
algorithm
data mining
machine learning
Description: Purpose: Diabetes is a chronic disease that accounts for a large proportion of the nation's healthcare expenses, because people with diabetes need continuous medical care. Several complications occur if the polygenic disorder goes unrecognized and untreated. The condition then leads the patient to a diagnostic center and a doctor's attention. One essential real-world problem is detecting the disorder at an early stage. This work is a survey that analyzes several parameters in the diagnosis of the polygenic disorder. It indicates that classification algorithms applied to the collected data, as well as other machine learning algorithms, play an important role in automating polygenic disorder analysis. Design/methodology/approach: This paper provides an extensive survey of different approaches that have been used to analyze medical data for the early detection of the polygenic disorder. It considers methods such as J48, CART, SVMs and KNN, formally surveys the relevant studies, and provides a conclusion at the end. Findings: The survey analyzes several parameters in the diagnosis of the polygenic disorder. It indicates that classification algorithms applied to the collected data, as well as other machine learning algorithms, play an important role in automating polygenic disorder analysis. Practical implications: This paper will help future researchers in healthcare, specifically in the domain of diabetes, to understand the differences between classification algorithms. Originality/value: This paper will help in comparing machine learning algorithms by going through their results and selecting the appropriate approach based on requirements.
Source: Journal of Achievements in Materials and Manufacturing Engineering; 2021, 104, 2; 77-84
1734-8412
Appears in: Journal of Achievements in Materials and Manufacturing Engineering
Content provider: Biblioteka Nauki

Article

Item 5.

Title: Wybrane metody eksploracji danych i uczenia maszynowego w analizie stanu uszkodzeń oraz zużycia technicznego zabudowy terenów górniczych
Selected methods of data mining and machine learning in risk analysis for developments located in mining areas
Authors: Firek, K.
Rusek, J.
Wodyński, A.
Links: https://bibliotekanauki.pl/articles/164216.pdf
Publication date: 2016
Publisher: Stowarzyszenie Inżynierów i Techników Górnictwa
Subjects: machine learning
data mining
technical wear of building
damage of building
mining effects
Description: This paper presents the methodology and results of studies on the influence of mining impacts on surface developments in mining areas, carried out in recent years at the Department of Engineering Surveying and Civil Engineering of AGH University of Science and Technology. The studies covered modeling the course of technical wear of buildings with machine learning methods, and analyzing the scope and intensity of building damage with data mining methods. The results confirm the usefulness of these methods for solving problems related to construction in mining areas.
Source: Przegląd Górniczy; 2016, 72, 1; 50-55
0033-216X
Appears in: Przegląd Górniczy
Content provider: Biblioteka Nauki

Article

Item 6.

Title: A Deep-Learning-Based Bug Priority Prediction Using RNN-LSTM Neural Networks
Authors: Bani-Salameh, Hani
Sallam, Mohammed
Al shboul, Bashar
Links: https://bibliotekanauki.pl/articles/1818480.pdf
Publication date: 2021
Publisher: Politechnika Wrocławska. Oficyna Wydawnicza Politechniki Wrocławskiej
Subjects: assigning
priority
bug tracking systems
bug priority
bug severity
closed-source
data mining
machine learning
ML
deep learning
RNN-LSTM
SVM
KNN
Description: Context: Predicting the priority of bug reports is an important activity in software maintenance. Bug priority refers to the order in which a bug or defect should be resolved. A huge number of bug reports are submitted every day. Manually filtering bug reports and assigning a priority to each one is a heavy process that requires time, resources, and expertise. In many cases mistakes happen when priority is assigned manually, which prevents developers from finishing their tasks, fixing bugs, and improving quality. Objective: Bugs are widespread, and the number of bug reports submitted by users and team members is growing noticeably while resources remain limited. This creates a need for a model that detects the priority of bug reports and lets developers find the highest-priority ones. This paper presents a model that predicts and assigns a priority level (high or low) for each bug report. Method: The model considers a set of factors (indicators), such as component name, summary, assignee, and reporter, that may affect the priority level of a bug report. The factors are extracted as features from a dataset built from bug reports taken from closed-source projects stored in the JIRA bug tracking system; the dataset is then used to train and test the framework. This work also presents a tool that helps developers assign a priority level to a bug report automatically, based on the LSTM model's prediction. Results: Our experiments applied a 5-layer deep learning RNN-LSTM neural network to predict the priority of bug reports and compared the results with Support Vector Machine (SVM) and K-nearest neighbors (KNN). The performance of the proposed RNN-LSTM model was analyzed on the JIRA dataset of more than 2000 bug reports. The proposed model was found to be 90% accurate, compared with 74% for KNN and 87% for SVM. On average, RNN-LSTM improves the F-measure by 3% compared to SVM and 15.2% compared to KNN. Conclusion: The study concludes that LSTM predicts and assigns bug priority more accurately and effectively than the other ML algorithms (KNN and SVM). LSTM significantly improves the average F-measure in comparison to the other classifiers, and it reported the best results on all performance measures (Accuracy = 0.908, AUC = 0.95, F-measure = 0.892).
Source: e-Informatica Software Engineering Journal; 2021, 15, 1; 29-45
1897-7979
Appears in: e-Informatica Software Engineering Journal
Content provider: Biblioteka Nauki

Article
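
The description in this record compares classifiers by accuracy and F-measure. As a rough sketch of how these metrics are derived from binary confusion-matrix counts, the code below assumes that "F-measure" means the balanced F1 score, which the record does not state. The function name `binary_metrics` and the counts are hypothetical and are not taken from the paper's JIRA dataset.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) for a binary classifier.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one prediction is required")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical counts for a "high priority" class.
accuracy, precision, recall, f1 = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Accuracy counts all correct predictions, while F1 looks only at the positive class, so the two can diverge sharply when high-priority and low-priority reports are imbalanced.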

Item 7.

Title: Zastosowanie algorytmów przeszukiwania grafów do analizy obrazów medycznych
Analysis of medical images based on graph search algorithms
Authors: Dimitrova-Grekow, T.
Dąbkowski, A.
Links: https://bibliotekanauki.pl/articles/156629.pdf
Publication date: 2012
Publisher: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects: medical image analysis
image analysis
graph search algorithm
machine learning
data mining
disease recognition
Description: The paper presents test results for an unconventional use of graph search methods to analyze magnetic resonance images of the head. A GUI for automatic processing of image series is presented. The classifiers that were built showed that BFS-based analysis of DICOM files, after appropriate feature selection, recognizes 100% of patients with hydrocephalus and over 90% of healthy persons. This encourages further research and observation, for example into whether persons wrongly classified as sick did in fact develop the disease later.
There are many methods for image segmentation [1, 2]: threshold, area, edge and hybrid methods. Area methods identify groups of similar pixels that form local regions [3, 4]. Edge methods detect boundaries between homogeneous segments [5, 6, 7]. In this paper we present the results of tests of an unconventional implementation of graph search methods for the analysis of images generated from magnetic resonance imaging [8]. We explored the effectiveness of different approaches for dividing areas within a similar gray scale, using adapted graph search algorithms (DFS, BFS) after appropriate modification (Fig. 1). For this purpose, the Weka package (a tool for pre-processing, classification, regression, clustering and data visualization) was used [9]. A training set was generated after analyzing all the series of images from the database. First, we evaluated models created using certain algorithms and compared their efficacy (Tab. 1). This was followed by a selection of attributes (Tab. 2) and a re-evaluation of the models (Tab. 3). Comparison of the results of both evaluations showed that, after selecting the relevant attributes, up to 100% detection of patients with hydrocephalus and over 90% proper recognition of healthy persons can be achieved. This encourages further research and observation, such as whether persons wrongly classified as sick actually developed the disease in time. We designed a web application for the study, built on Windows Azure, as well as a GUI for automatic processing of a series of images (Fig. 2).
Source: Pomiary Automatyka Kontrola; 2012, vol. 58, no. 7, 7; 578-580
0032-4140
Appears in: Pomiary Automatyka Kontrola
Content provider: Biblioteka Nauki

Article
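
The description in this record mentions dividing areas of similar gray level with an adapted BFS. The authors' modification is not described in the record, so the code below is only a generic illustration of breadth-first region labeling and not their method. The function name `label_regions`, the `tolerance` parameter, the 4-connectivity rule and the comparison against the seed pixel are all assumptions.

```python
from collections import deque


def label_regions(image, tolerance):
    """Label 4-connected regions whose gray levels stay within
    `tolerance` of the region's seed pixel, using breadth-first search.

    `image` is a non-empty list of equal-length rows of gray values.
    Returns (labels, region_count); labels start at 1.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    region_count = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue  # pixel already belongs to a region
            region_count += 1
            seed = image[r][c]
            labels[r][c] = region_count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and abs(image[ny][nx] - seed) <= tolerance):
                        labels[ny][nx] = region_count
                        queue.append((ny, nx))
    return labels, region_count


# Tiny hypothetical gray-scale image (not MRI data).
sample = [
    [10, 12, 200],
    [11, 13, 205],
    [90, 95, 210],
]
labels, count = label_regions(sample, tolerance=10)
print(count)
for row in labels:
    print(row)
```

Features such as the number of regions or their sizes could then be passed to a classifier, but which features the study actually extracted is not stated in the record.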

Item 8.

Title: Wpływ pandemii COVID-19 na stan zdrowia psychicznego społeczeństwa
Influence of the COVID-19 pandemic on the mental health of society
Authors: Ptak-Chmielewska, Aneta
Baszniak, Karolina
Kurpanik, Jarosław
Links: https://bibliotekanauki.pl/articles/2124987.pdf
Publication date: 2022-09-30
Publisher: Główny Urząd Statystyczny
Subjects: machine learning
COVID-19 pandemic
data mining
mental health
household
USA
Description: The COVID-19 pandemic changed the lives of people all around the world; among other things, it affected the mental health and functioning of many families. The main goal of the research presented in this paper is to assess the influence of the COVID-19 pandemic on the mental health of household members. The research used a data set from the COVID Impact Survey, carried out by the Data Foundation think tank among adults in the USA in 2020, during the first wave of the pandemic. A total of 6,768 observations were analyzed. The authors estimated a logistic regression model and models based on data mining methods: decision trees, gradient boosting (XG Boost), the k-nearest neighbours method, artificial neural networks and a support vector machine. Cluster analysis made it possible to divide respondents into groups that reveal the characteristic features and problems of household members, and the constructed model took into account health and mental disorder issues and their relationship with the financial situation of households. The results indicate that isolation, remote education and work, and reduced physical activity contribute to worsening mental health.
Source: Wiadomości Statystyczne. The Polish Statistician; 2022, 67, 9; 24-52
0043-518X
Appears in: Wiadomości Statystyczne. The Polish Statistician
Content provider: Biblioteka Nauki

Article

Item 9.

Title: Machine learning-based business rule engine data transformation over high-speed networks
Authors: Neelima, Kenpi
Vasundra, S.
Links: https://bibliotekanauki.pl/articles/38700094.pdf
Publication date: 2023
Publisher: Instytut Podstawowych Problemów Techniki PAN
Subjects: CRISP-DM
data mining algorithms
business rule
prediction
classification
machine learning
deep learning
AI design
Description: Raw data processing is a key business operation. Business-specific rules determine how the raw data should be transformed into business-required formats. When source data continuously changes its formats and has keying errors and invalid data, then the effectiveness of the data transformation is a big challenge. The conventional data extraction and transformation technique produces a delay in handling such data because of continuous fluctuations in data formats and requires continuous development of a business rule engine. The best business rule engines require near real-time detection of business rule and data transformation mechanisms utilizing machine learning classification models. Since data is combined from numerous sources and older systems, it is challenging to categorize and cluster the data and apply suitable business rules to turn raw data into the business-required format. This paper proposes a methodology for designing ensemble machine learning techniques and approaches for classifying and segmenting registered numbers of registered title records to choose the most suitable business rule that can convert the registered number into the format the business expects, allowing businesses to provide customers with the most recent data in less time. This study evaluates the suggested model by gathering sample data and analyzing classification machine learning (ML) models to determine the relevant business rule. Experimentation employed Python, R, SQL stored procedures, Impala scripts, and Datameer tools.
Source: Computer Assisted Methods in Engineering and Science; 2023, 30, 1; 55-71
2299-3649
Appears in: Computer Assisted Methods in Engineering and Science
Content provider: Biblioteka Nauki

Article

Item 10.

Title: Machine Learning Algorithms for Data Enrichment: A Promising Solution for Enhancing Accuracy in Predicting Blast-Induced Ground Vibration in Open-Pit Mines
Authors: Nguyen, Hoang
Bui, Xuan-Nam
Drebenstedt, Carsten
Links: https://bibliotekanauki.pl/articles/25212182.pdf
Publication date: 2023
Publisher: Polskie Towarzystwo Przeróbki Kopalin
Subjects: blast-induced ground vibration
data enrichment
sustainable and responsible mining
machine learning
open-pit mining
performance improvement
artificial intelligence
machines
Description: The issue of blast-induced ground vibration poses a significant environmental challenge in open-pit mines, necessitating precise prediction and control measures. While artificial intelligence and machine learning models hold promise in addressing this concern, their accuracy remains a notable issue due to constrained input variables, dataset size, and potential environmental impact. To mitigate these challenges, data enrichment emerges as a potential solution to enhance the efficacy of machine learning models, not only in blast-induced ground vibration prediction but also across various domains within the mining industry. This study explores the viability of utilizing machine learning for data enrichment, with the objective of generating an augmented dataset that offers enhanced insights based on existing data points for the prediction of blast-induced ground vibration. Leveraging the support vector machine (SVM), we uncover intrinsic relationships among input variables and subsequently integrate them as supplementary inputs. The enriched dataset is then used to construct multiple machine learning models, including k-nearest neighbors (KNN), classification and regression trees (CART), and random forest (RF), all designed to predict blast-induced ground vibration. Comparative analysis between the enriched models and their original counterparts, built on the initial dataset, provides a foundation for extracting insights into optimizing the performance of machine learning models, not only in the context of predicting blast-induced ground vibration but also in addressing broader challenges within the mining industry.
Source: Inżynieria Mineralna; 2023, 2; 79-88
1640-4920
Appears in: Inżynieria Mineralna
Content provider: Biblioteka Nauki

Article
