Subject: mining data - OPAC collections catalogue


Title:: Integration of candidate hash trees in concurrent processing of frequent itemset queries using Apriori
Authors:: Grudziński, P.
Wojciechowski, M.
Links:: https://bibliotekanauki.pl/articles/970835.pdf
Publication date:: 2009
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: data mining
frequent itemset mining
data mining queries
Description:: Frequent itemset mining is often regarded as advanced querying where a user specifies the source dataset and pattern constraints using a given constraint model. In this paper we address the problem of processing batches of frequent itemset queries using the Apriori algorithm. The best solution to this problem proposed so far is Common Counting, which consists in concurrent execution of the queries using Apriori with the integration of scans of the parts of the database shared among the queries. In this paper we propose a new method, Common Candidate Tree, offering a tighter integration of the concurrently processed queries by sharing memory data structures, i.e., candidate hash trees. The experiments show that Common Candidate Tree outperforms Common Counting in terms of execution time. Moreover, thanks to smaller memory consumption, Common Candidate Tree can be applied to larger batches of queries.
Source:: Control and Cybernetics; 2009, 38, 1; 47-65
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article


Title:: Ant colony metaphor in a new clustering algorithm
Authors:: Boryczka, U.
Links:: https://bibliotekanauki.pl/articles/969824.pdf
Publication date:: 2010
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: data mining
cluster analysis
ant clustering algorithm
Description:: Among the many bio-inspired techniques, ant clustering algorithms have received special attention, especially because they still require much investigation to improve performance, stability and other key features that would make such algorithms mature tools for data mining. Clustering with swarm-based algorithms is emerging as an alternative to more conventional clustering methods, such as the k-means algorithm. The proposed approach mimics the clustering behavior observed in real ant colonies. As a case study, this paper focuses on the behavior of clustering procedures in this new approach. The proposed algorithm is evaluated on a number of well-known benchmark data sets. Empirical results clearly show that the ant clustering algorithm (ACA) performs well when compared to other techniques.
Source:: Control and Cybernetics; 2010, 39, 2; 343-358
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article


Title:: Properties and pre-processing strategies to enhance the discovery of functional dependency with degree of satisfaction
Authors:: Wei, Q.
Chen, G.
Zhou, X.
Links:: https://bibliotekanauki.pl/articles/970954.pdf
Publication date:: 2009
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: functional dependency
incomplete data
degree of distinctness
data mining
Description:: Functional dependency with degree of satisfaction (FDd) is an extended notion in data modeling, and reflects a type of integrity constraint and business rule on attributes, mainly for massive databases, in which incomplete data such as noise, null and imprecision may exist. While existing approaches are considered effective in general, attempts at further improvement in efficiency are deemed meaningful and desirable as far as knowledge discovery is concerned. This paper focuses on discovering (FDd)s as a form of useful semantic knowledge, aiming to make the FDd mining process more efficient. In doing so, properties of FDd are investigated in depth along with a measure for degree of distinctness. Subsequently, a number of optimization strategies are developed for pre-processing, which are then incorporated into the mining process, giving rise to an enhanced approach for mining functional dependency with degree of satisfaction, namely e-MFDD. Finally, data experiments revealed that e-MFDD significantly outperformed the original approach without pre-processing.
Source:: Control and Cybernetics; 2009, 38, 2; 367-394
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article


Title:: Creating a knowledge database on system dependability and resilience
Authors:: Kubacki, M.
Sosnowski, J.
Links:: https://bibliotekanauki.pl/articles/206779.pdf
Publication date:: 2013
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: dependability
data mining
event and performance logs
resilience
Description:: The paper deals with the problem of creating a knowledge database on system dependability and resilience, built on the basis of available system and application logs. Special tools to collect and analyse these data from many systems have been developed. Taking into account a wide spectrum of various logs, we explore them locally and globally. This allowed for identification of characteristics of normal operation and anomalous behaviour. A lot of attention is paid to the problem of selecting measures to identify symptoms characterising system operation and their usefulness in dependability and resilience evaluation or prediction. The concepts presented are illustrated with experience gained during monitoring of real systems.
Source:: Control and Cybernetics; 2013, 42, 1; 287-307
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article


Title:: A survey of big data classification strategies
Authors:: Banchhor, Chitrakant
Srinivasu, N.
Links:: https://bibliotekanauki.pl/articles/2050171.pdf
Publication date:: 2020
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: big data
data mining
MapReduce
classification
machine learning
evolutionary intelligence
deep learning
Description:: Big data nowadays plays a major role in finance, industry, medicine, and various other fields. In this survey, 50 research papers are reviewed regarding different big data classification techniques presented and/or used in the respective studies. The classification techniques are categorized into machine learning, evolutionary intelligence, fuzzy-based approaches, deep learning and so on. The research gaps and the challenges of big data classification faced by the existing techniques are also listed and described, which should help researchers enhance the effectiveness of their future work. The research papers are analyzed for different techniques with respect to software tools, datasets used, publication year, classification techniques, and performance metrics. It can be concluded from the survey presented here that the most frequently used big data classification methods are based on machine learning techniques and that the apparently most commonly used dataset for big data classification is the UCI repository dataset. The most frequently used performance metrics are accuracy and execution time.
Source:: Control and Cybernetics; 2020, 49, 4; 447-469
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article


Title:: Hybrid approach to supporting decision making processes in companies
Authors:: Pietruszkiewicz, W.
Twardochleb, M.
Roszkowski, M.
Links:: https://bibliotekanauki.pl/articles/205657.pdf
Publication date:: 2011
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: data mining
cascade optimization hybrid
parallel classification hybrid
hybrid multicomponent attribute selection
Description:: This article presents the advantages of a hybrid approach to supporting decision making by analyzing three areas of business decision problems, solved by combining well-known algorithms into new hybrid constructions: cascade optimization hybrid, parallel classification hybrid and hybrid multicomponent attribute selection. Each of them solved a different problem: the cascade optimization hybrid allowed for finding an extreme of a composite objective function, the parallel classification hybrid was used to choose a proper class through voting, and the multicomponent attribute selection robustly chose significant decision variables. A hybrid approach to the problem of supporting decision making processes is more effective than using each of the component methods alone, even the sophisticated ones. A combination of several methods with different characteristics and performance makes it possible to take advantage of their strengths and simultaneously eliminate their weaknesses, resulting in better computational support of decision making.
Source:: Control and Cybernetics; 2011, 40, 1; 125-143
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article


Title:: An improved comparison of three rough set approaches to missing attribute values
Authors:: Grzymala-Busse, J. W.
Grzymala-Busse, W. J.
Hippe, Z. S.
Rząsa, W.
Links:: https://bibliotekanauki.pl/articles/969797.pdf
Publication date:: 2010
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: incomplete data sets
missing attribute values
approximations for incomplete data
LERS data mining system
MLEM2 algorithm
Description:: In a previous paper three types of missing attribute values: lost values, attribute-concept values and "do not care" conditions were compared using six data sets. Since previous experimental results were affected by large variances due to conducting experiments on different versions of a given data set, we conducted new experiments, using the same pattern of missing attribute values for all three types of missing attribute values and for both certain and possible rules. Additionally, in our new experiments, the process of incrementally replacing specified values by missing attribute values was terminated when entire rows of the data sets were full of missing attribute values. Finally, we created new, incomplete data sets by replacing the specified values starting from 5% of all attribute values, instead of 10% as in the previous experiments, with an increment of 5% instead of the previous increment of 10%. As a result, it is becoming clearer that the best approach to missing attribute values is based on lost values, with a small difference between certain and possible rules, and that the worst approach is based on "do not care" conditions, certain rules. With our improved experimental setup it is also clearer that for a given data set the type of the missing attribute values should be selected individually.
Source:: Control and Cybernetics; 2010, 39, 2; 469-486
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article


Title:: Neural methods of knowledge extraction
Authors:: Duch, W.
Adamczak, R.
Grąbczewski, K.
Jankowski, N.
Links:: https://bibliotekanauki.pl/articles/206250.pdf
Publication date:: 2000
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: data mining
decision support
fuzzy rules
logical rules
medical diagnosis
optimization
Description:: Contrary to common opinion, neural networks may be used for knowledge extraction. Recently, a new methodology of logical rule extraction, optimization and application of rule-based systems has been described. The C-MLP2LN algorithm, based on a constrained multilayer perceptron network, is described here in detail, and the dynamics of the transition from a neural to a logical system are illustrated. The algorithm handles real-valued features, determining appropriate linguistic variables or membership functions as a part of the rule extraction process. Initial rules are optimized by exploring the accuracy/simplicity tradeoff at the rule extraction stage and the tradeoff between reliability of rules and rejection rate at the optimization stage. Gaussian uncertainties of measurements are assumed during application of crisp logical rules, leading to "soft trapezoidal" membership functions and allowing the linguistic variables to be optimized using gradient procedures. Comments are made on the application of neural networks to knowledge discovery in benchmark and real-life problems.
Source:: Control and Cybernetics; 2000, 29, 4; 997-1017
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article


Information

Searching for the phrase "mining data" by criterion: Subject
