Searching for the phrase "concept drift" by criterion: Subject


Showing 1-5 of 5
Title:
Unsupervised labeling of data for supervised learning and its application to medical claims prediction
Authors:
Ngufor, C.
Wojtusiak, A.
Links:
https://bibliotekanauki.pl/articles/305284.pdf
Publication date:
2013
Publisher:
Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects:
unsupervised learning
concept drift
medical claims
Description:
Identifying changes and irregularities in medical insurance claim payments is a difficult task. Traditional practice involves querying historical claims databases and flagging claims as normal or abnormal. Because what counts as a normal payment is usually unknown and may change over time, abnormal payments often pass undetected, only to be discovered after the payment period has passed. This paper presents the problem of on-line unsupervised learning from data streams when the distribution that generates the data changes or drifts over time. Automated algorithms for detecting drifting concepts in the probability distribution of the data are presented. The idea behind the presented drift detection methods is to transform the distribution of the data within a sliding window into a more convenient distribution. Then, the p-value of a test statistic at a given significance level can be used to infer the drift rate, adjust the window size, and decide on the status of the drift. The detected concept drifts are used to label the data for subsequent learning of classification models by a supervised learner. The algorithms were tested on several synthetic and real medical claims data sets.
Source:
Computer Science; 2013, 14 (2); 191-214
1508-2806
2300-7036
Appears in:
Computer Science
Content provider:
Biblioteka Nauki
Article
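The abstract above describes testing the data inside a sliding window and using a p-value at a chosen significance level to decide whether drift has occurred. The following is a minimal sketch of that general idea only; it is not the authors' algorithm. It assumes a fixed reference sample, a two-sided z-test on the window mean, and a fixed window size. The names `z_test_p_value`, `detect_drift`, `ref_size`, `window_size`, and `alpha` are hypothetical.

```python
import math
import random
from collections import deque
from statistics import mean, stdev


def z_test_p_value(reference, window):
    """Two-sided p-value for 'window mean equals reference mean' (normal approximation)."""
    se = stdev(reference) * math.sqrt(1.0 / len(reference) + 1.0 / len(window))
    z = (mean(window) - mean(reference)) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def detect_drift(stream, ref_size=200, window_size=50, alpha=1e-6):
    """Return the index of the first item at which drift is flagged, or None.

    alpha is deliberately strict because the test is repeated at every step.
    """
    reference = []
    window = deque(maxlen=window_size)   # sliding window of the newest items
    for i, x in enumerate(stream):
        if len(reference) < ref_size:
            reference.append(x)          # the first ref_size items define "normal"
            continue
        window.append(x)
        if len(window) == window_size and z_test_p_value(reference, window) < alpha:
            return i
    return None


# Synthetic stream (an assumption for illustration): the mean jumps from 0 to 3 at index 600.
rng = random.Random(0)
stable = [rng.gauss(0.0, 1.0) for _ in range(600)]
shifted = [rng.gauss(3.0, 1.0) for _ in range(200)]
drift_index = detect_drift(stable + shifted)
```

The paper additionally adapts the window size and estimates a drift rate; this sketch keeps the window fixed to stay short.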
Title:
Semi-supervised approach to handle sudden concept drift in Enron data
Authors:
Kmieciak, M. R.
Stefanowski, J.
Links:
https://bibliotekanauki.pl/articles/206052.pdf
Publication date:
2011
Publisher:
Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:
concept drift
incremental learning of classifiers
email foldering
Enron data
Description:
Detection of concept changes in incremental learning from data streams and classifier adaptation is studied in this paper. It is often assumed that all processed learning examples are always labeled, i.e., the class label is available for each example. As it may be difficult to satisfy this assumption in practice, in particular in the case of data streams, we introduce an approach that detects concept drift in unlabeled data and retrains the classifier using a limited number of additionally labeled examples. The usefulness of this partly supervised approach is evaluated in an experimental study with the Enron data. This real-life data set concerns classification of users' emails into multiple folders. First, we show that the Enron data are characterized by frequent sudden changes of concepts. We also demonstrate that our approach can precisely detect these changes. Results of a further comparative study demonstrate that our approach leads to classification accuracy comparable to that of two fully supervised methods: periodic retraining of the classifier based on windowing, and the trigger approach with the DDM supervised drift detection. However, our approach reduces the number of examples to be labeled. Furthermore, it requires fewer classifier retraining updates than windowing.
Source:
Control and Cybernetics; 2011, 40, 3; 667-695
0324-8569
Appears in:
Control and Cybernetics
Content provider:
Biblioteka Nauki
Article
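The abstract above names DDM supervised drift detection as one of the baselines. As a rough illustration of that baseline (not of the paper's own semi-supervised detector), the sketch below follows the commonly described DDM rule: track the running error rate p and its standard deviation s, remember the point where p + s was lowest, and signal a warning at p_min + 2·s_min and a drift at p_min + 3·s_min. The class name, the `min_samples` parameter, and the synthetic error stream are assumptions made for this example.

```python
import math


class DDM:
    """Sketch of the DDM rule: monitor the running error rate of a classifier."""

    def __init__(self, min_samples=30):
        self.min_samples = min_samples   # do not test before this many examples
        self.reset()

    def reset(self):
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, error):
        """Feed 1 for a misclassified example, 0 otherwise; return the detector state."""
        self.n += 1
        self.errors += error
        p = self.errors / self.n
        s = math.sqrt(p * (1.0 - p) / self.n)
        if self.n < self.min_samples:
            return "stable"
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s        # new best operating point
        if p + s >= self.p_min + 3.0 * self.s_min:
            self.reset()                         # start over after a drift
            return "drift"
        if p + s >= self.p_min + 2.0 * self.s_min:
            return "warning"
        return "stable"


# Synthetic error stream (an assumption): 10% errors, then 50% errors from position 501.
errors = [1 if i % 10 == 0 else 0 for i in range(1, 501)]
errors += [1 if i % 2 == 0 else 0 for i in range(501, 701)]
detector = DDM()
states = [detector.update(e) for e in errors]
first_drift = states.index("drift") + 1          # 1-based position in the stream
```

In this sketch, a stream with no errors at all makes `s_min` zero and the thresholds collapse, so practical use would need a guard for that case. DDM also needs the true label of every example to compute `error`, which is the labeling cost the paper's approach aims to reduce.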
Title:
Solving Support Vector Machine with Many Examples
Authors:
Białoń, P.
Links:
https://bibliotekanauki.pl/articles/308497.pdf
Publication date:
2010
Publisher:
Instytut Łączności - Państwowy Instytut Badawczy
Subjects:
concept drift
convex optimization
data mining
network failure detection
stream processing
support vector machines
Description:
Various methods of dealing with linear support vector machine (SVM) problems with a large number of examples are presented and compared. The author believes that some interesting conclusions from this critical analysis apply to many new optimization problems and indicate the direction in which the science of optimization will branch in the future. This direction is driven by the automatic collection of large data sets to be analyzed, and is most visible in telecommunications. A stream SVM approach is proposed for the case in which the data substantially exceeds the available fast random access memory (RAM) due to a large number of examples. Formally, the use of RAM is constant in the number of examples (though it usually depends on the dimensionality of the example space). It builds an inexact polynomial model of the problem. Another approach by the author is exact. It also uses a constant amount of RAM, together with auxiliary disk files that can be long but are smartly accessed. This approach is based on the cutting plane method, similarly to Joachims' method (which, however, relies on terminating the optimization early).
Source:
Journal of Telecommunications and Information Technology; 2010, 3; 65-70
1509-4553
1899-8852
Appears in:
Journal of Telecommunications and Information Technology
Content provider:
Biblioteka Nauki
Article
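The abstract above stresses that a stream SVM can use memory that is constant in the number of examples. The sketch below illustrates only that property, and it swaps in a different, well-known technique: a single pass of Pegasos-style stochastic subgradient descent on the linear SVM objective. It is neither the author's polynomial-model approach nor the cutting-plane approach. The function names, the regularization value `lam`, and the toy data generator are assumptions.

```python
import random


def train_stream_svm(stream, dim, lam=0.01):
    """Single pass over (x, y) pairs with y in {-1, +1}.

    Only the weight vector is kept, so memory is O(dim) regardless of stream length.
    """
    w = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        eta = 1.0 / (lam * t)                              # Pegasos step size
        margin = y * sum(wi * xi for wi, xi in zip(w, x))  # computed before the update
        w = [(1.0 - eta * lam) * wi for wi in w]           # shrink (regularization term)
        if margin < 1.0:                                   # hinge loss is active
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def make_stream(n, seed):
    """Hypothetical toy data: the label is the sign of the first coordinate, with a margin."""
    rng = random.Random(seed)
    for _ in range(n):
        y = rng.choice((-1, 1))
        yield (y * rng.uniform(1.0, 3.0), rng.uniform(-1.0, 1.0)), y


# The training stream is a generator, so examples are never stored.
w = train_stream_svm(make_stream(5000, seed=0), dim=2)
holdout = list(make_stream(1000, seed=1))
correct = sum(1 for x, y in holdout if y * sum(wi * xi for wi, xi in zip(w, x)) > 0)
accuracy = correct / len(holdout)
```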
Title:
Regression function and noise variance tracking methods for data streams with concept drift
Authors:
Jaworski, M.
Links:
https://bibliotekanauki.pl/articles/329716.pdf
Publication date:
2018
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:
data stream
concept drift
Parzen kernel
regression function
variance estimation
Description:
Two types of heuristic estimators based on Parzen kernels are presented. They are able to estimate the regression function in an incremental manner. The estimators apply two techniques commonly used for concept-drifting data streams, i.e., the forgetting factor and the sliding window. The methods are applicable to models in which both the function and the noise variance change over time. Although nonparametric methods based on Parzen kernels have previously been applied successfully to online regression function estimation, the problem of estimating the variance of the noise has generally been neglected. It is sometimes of profound interest to know the variance of the signal considered, e.g., in economics, but it can also be used for determining confidence intervals in the estimation of the regression function, as well as for evaluating the goodness of fit and for controlling the amount of smoothing. The present paper addresses this issue. Specifically, variance estimators are proposed which are able to deal with concept-drifting data by applying a sliding window and a forgetting factor, respectively. Numerical experiments showed that the proposed methods perform satisfactorily in estimating both the regression function and the variance of the noise.
Source:
International Journal of Applied Mathematics and Computer Science; 2018, 28, 3; 559-567
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
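The abstract above mentions incremental Parzen-kernel regression estimators that use a forgetting factor. A minimal sketch of the forgetting-factor idea follows. It assumes a Nadaraya-Watson ratio form, a Gaussian kernel, and evaluation on a fixed grid; these choices, and the names `ForgettingKernelRegressor`, `bandwidth`, and `lam`, are illustrative assumptions rather than the paper's estimators. The noise variance estimation that the paper focuses on is omitted.

```python
import math
import random


class ForgettingKernelRegressor:
    """Nadaraya-Watson style estimate on a fixed grid, with exponential forgetting."""

    def __init__(self, grid, bandwidth=0.1, lam=0.99):
        self.grid = list(grid)
        self.h = bandwidth
        self.lam = lam                       # forgetting factor in (0, 1]
        self.num = [0.0] * len(self.grid)    # decayed sum of K * y
        self.den = [0.0] * len(self.grid)    # decayed sum of K

    def update(self, x, y):
        """Incorporate one observation; older observations are down-weighted by lam."""
        for i, g in enumerate(self.grid):
            k = math.exp(-0.5 * ((g - x) / self.h) ** 2)   # Gaussian Parzen kernel
            self.num[i] = self.lam * self.num[i] + k * y
            self.den[i] = self.lam * self.den[i] + k

    def predict(self):
        """Current regression estimate at each grid point."""
        return [n / d if d > 0 else float("nan") for n, d in zip(self.num, self.den)]


# Synthetic stream (an assumption): y = x + noise, switching abruptly to y = -x + noise.
rng = random.Random(0)
grid = [0.25, 0.5, 0.75]
model = ForgettingKernelRegressor(grid)
for t in range(4000):
    x = rng.random()
    slope = 1.0 if t < 2000 else -1.0
    model.update(x, slope * x + rng.gauss(0.0, 0.1))
estimates = model.predict()
```

After the switch, the decayed sums are dominated by recent observations, so the estimates follow the new function. A sliding-window variant would instead keep only the last W observations and recompute the sums over them.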
Title:
Continuous update of business process trees using Continuous Inductive Miner
Authors:
Pawlak, Tomasz P.
Górka, Bartosz
Links:
https://bibliotekanauki.pl/articles/2204514.pdf
Publication date:
2023
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
business process
process mining
business intelligence
event log
concept drift
Description:
Business processes are omnipresent in today's economy: companies operate repetitively to achieve their goals, e.g., deliver goods, complete orders. The business process model is the key to understanding, managing, controlling, and verifying the operations of a company. Modeling of business processes may be a legal requirement in some market segments, e.g., the financial sector in the European Union, and a prerequisite for certification, e.g., to the ISO-9001 standard. However, business processes naturally evolve, and continuous model adaptation is essential for rapidly spotting and reacting to changes in the process. The main contribution of this work is the Continuous Inductive Miner (CIM) algorithm, which discovers and continuously adapts the process tree, an established representation of the process model, using batches of event logs of the business process. CIM joins the exclusive guarantees of its two batch predecessors, the Inductive Miner (IM) and the Inductive Miner – directly-follows based (IMd): perfectly fit and sound models, and single-pass event log processing, respectively. CIM offers much shorter computation times in the update scenario than IM and IMd. CIM employs statistical information to work around the need to remember event logs, as IM does, while ensuring the perfect fit, contrary to IMd.
Source:
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2023, 71, 1; art. no. e143551
0239-7528
Appears in:
Bulletin of the Polish Academy of Sciences. Technical Sciences
Content provider:
Biblioteka Nauki
Article
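The abstract above says that event logs arrive in batches and that statistical information is kept instead of the logs themselves. The sketch below illustrates that idea with one statistic associated with the directly-follows variant of the Inductive Miner: counts of which activity directly follows which. It is not the CIM algorithm and does not build a process tree; the class name, its fields, and the example traces are assumptions.

```python
from collections import Counter


class DirectlyFollowsCounts:
    """Running directly-follows statistics; traces are discarded after each update."""

    def __init__(self):
        self.follows = Counter()   # (a, b) -> number of times b came directly after a
        self.starts = Counter()    # first activity of each trace
        self.ends = Counter()      # last activity of each trace

    def update(self, batch):
        """batch: iterable of traces, each a list of activity names."""
        for trace in batch:
            if not trace:
                continue           # skip empty traces
            self.starts[trace[0]] += 1
            self.ends[trace[-1]] += 1
            self.follows.update(zip(trace, trace[1:]))


dfg = DirectlyFollowsCounts()
dfg.update([["order", "pay", "ship"], ["order", "pay", "ship"]])
dfg.update([["order", "ship", "pay"]])   # later batch: the order of steps has changed
```

Each batch is processed once and then dropped, so memory grows with the number of distinct activity pairs rather than with the length of the log.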