Subject: mining data - OPAC collection catalogue

Title: An alternative extension of the k-means algorithm for clustering categorical data
Authors: San, O. M.; Huynh, V. N.; Nakamori, Y.
Links: https://bibliotekanauki.pl/articles/907406.pdf
Publication date: 2004
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: cluster analysis; categorical data; data mining
Abstract: Most of the earlier work on clustering has mainly been focused on numerical data whose inherent geometric properties can be exploited to naturally define distance functions between data points. Recently, the problem of clustering categorical data has started drawing interest. However, the computational cost makes most of the previous algorithms unacceptable for clustering very large databases. The k-means algorithm is well known for its efficiency in this respect. At the same time, working only on numerical data prohibits it from being used for clustering categorical data. The main contribution of this paper is to show how to apply the notion of "cluster centers" on a dataset of categorical objects and how to use this notion for formulating the clustering problem of categorical objects as a partitioning problem. Finally, a k-means-like algorithm for clustering categorical data is introduced. The clustering performance of the algorithm is demonstrated with two well-known data sets, namely, soybean disease and nursery databases.
Source: International Journal of Applied Mathematics and Computer Science; 2004, 14, 2; 241-247
ISSN: 1641-876X; 2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Document type: Article
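
Illustration (not part of the catalogue record): the abstract above describes a k-means-like partitioning of categorical objects around "cluster centers". The sketch below is not the algorithm from the paper. It swaps in a generic k-modes-style variant, assuming that a center is the per-attribute mode and that dissimilarity is the number of mismatching attributes. The function names, the toy data, and the fixed initial centers are all hypothetical.

```python
from collections import Counter


def matching_distance(a, b):
    """Number of attributes on which two categorical tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category of each attribute in a non-empty list of rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def kmodes(rows, init_centers, max_iter=20):
    """k-means-like loop: assign rows to the nearest center, then recompute modes."""
    centers = list(init_centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: matching_distance(r, centers[j]))
            for r in rows
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = column_modes(members)
    return labels, centers


# Hypothetical toy data: (colour, shape, size)
rows = [
    ("red", "round", "small"),
    ("red", "round", "large"),
    ("red", "oval", "small"),
    ("blue", "square", "large"),
    ("blue", "square", "small"),
    ("green", "square", "large"),
]
labels, centers = kmodes(rows, init_centers=[rows[0], rows[3]])
print(labels, centers)
```

Passing the initial centers explicitly keeps the run deterministic; in practice they would usually be sampled from the data.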

Title: Inferring graph grammars by detecting overlap in frequent subgraphs
Authors: Kukluk, J. P.; Holder, L. B.; Cook, D. J.
Links: https://bibliotekanauki.pl/articles/907941.pdf
Publication date: 2008
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: grammar induction; graph grammars; graph mining; multi-relational data mining; data mining
Abstract: In this paper we study the inference of node and edge replacement graph grammars. We search for frequent subgraphs and then check for an overlap among the instances of the subgraphs in the input graph. If the subgraphs overlap by one node, we propose a node replacement graph grammar production. If the subgraphs overlap by two nodes or two nodes and an edge, we propose an edge replacement graph grammar production. We can also infer a hierarchy of productions by compressing portions of a graph described by a production and then inferring new productions on the compressed graph. We validate the approach in experiments where we generate graphs from known grammars and measure how well the approach infers the original grammar from the generated graph. We show graph grammars found in biological molecules and biological networks, and analyze learning curves of the algorithm.
Source: International Journal of Applied Mathematics and Computer Science; 2008, 18, 2; 241-250
ISSN: 1641-876X; 2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Document type: Article

Title: Methods for mining co-location patterns with extended spatial objects
Authors: Bembenik, R.; Jóźwicki, W.; Protaziuk, G.
Links: https://bibliotekanauki.pl/articles/330860.pdf
Publication date: 2017
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: spatial data mining; colocation patterns; extended objects; spatial data; data mining
Abstract: The paper discusses various approaches to mining co-location patterns with extended spatial objects. We focus on the properties of transaction-free approaches EXCOM and DEOSP, and discuss the differences between the method using a buffer and that employing clustering and triangulation. These theoretical differences between the two methods are verified experimentally. In the tests performed, three different implementations of EXCOM are compared with DEOSP, highlighting the advantages and downsides of both approaches.
Source: International Journal of Applied Mathematics and Computer Science; 2017, 27, 4; 681-695
ISSN: 1641-876X; 2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Document type: Article

Title: Center-based l₁-clustering method
Authors: Sabo, K.
Links: https://bibliotekanauki.pl/articles/330910.pdf
Publication date: 2014
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: l1 clustering; data mining; optimization; weighted median problem; clustering method
Abstract: In this paper, we consider the l₁-clustering problem for a finite data-point set which should be partitioned into k disjoint nonempty subsets. In that case, the objective function does not have to be either convex or differentiable, and generally it may have many local or global minima. Therefore, it becomes a complex global optimization problem. A method of searching for a locally optimal solution is proposed in the paper, the convergence of the corresponding iterative process is proved and the corresponding algorithm is given. The method is illustrated and compared with some other clustering methods, especially with the l₂-clustering method, which is also known in the literature as a smooth k-means method, in a few typical situations, such as the presence of outliers among the data and the clustering of incomplete data. Numerical experiments show that, in this case, the proposed l₁-clustering algorithm is faster and gives significantly better results than the l₂-clustering algorithm.
Source: International Journal of Applied Mathematics and Computer Science; 2014, 24, 1; 151-163
ISSN: 1641-876X; 2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Document type: Article
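
Illustration (not part of the catalogue record): the abstract above contrasts l₁- and l₂-clustering in the presence of outliers. The sketch below does not reproduce the paper's algorithm. It only shows, for a single unweighted cluster, the standard fact that the component-wise median minimizes the sum of l₁ distances, whereas the mean (the l₂ center) is pulled toward an outlier. The function names and the toy points are hypothetical.

```python
from statistics import mean, median


def l1_center(points):
    """Component-wise median: minimizes the sum of l1 distances to the points."""
    return tuple(median(coord) for coord in zip(*points))


def l2_center(points):
    """Component-wise mean: minimizes the sum of squared l2 distances."""
    return tuple(mean(coord) for coord in zip(*points))


def l1_cost(points, center):
    """Sum of l1 (Manhattan) distances from all points to a center."""
    return sum(sum(abs(x - c) for x, c in zip(p, center)) for p in points)


# Four points near the origin plus one outlier (hypothetical data)
points = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0), (100.0, 100.0)]

c1 = l1_center(points)  # stays with the bulk of the data
c2 = l2_center(points)  # dragged toward the outlier
print(c1, l1_cost(points, c1))
print(c2, l1_cost(points, c2))
```

In a full clustering loop, this center update would alternate with reassigning each point to its nearest center in the l₁ sense.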

Title: Applications of rough sets in big data analysis: An overview
Authors: Pięta, Piotr; Szmuc, Tomasz
Links: https://bibliotekanauki.pl/articles/2055175.pdf
Publication date: 2021
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: rough sets theory; big data analysis; deep learning; data mining; large data set
Abstract: Big data, artificial intelligence and the Internet of things (IoT) are still very popular areas in current research and industrial applications. Processing massive amounts of data generated by the IoT and stored in distributed space is not a straightforward task and may cause many problems. During the last few decades, scientists have proposed many interesting approaches to extract information and discover knowledge from data collected in database systems or other sources. We observe a permanent development of machine learning algorithms that support each phase of the data mining process, ensuring achievement of better results than before. Rough set theory (RST) delivers a formal insight into information, knowledge, data reduction, uncertainty, and missing values. This formalism, formulated in the 1980s and developed by several researchers, can serve as a theoretical basis and practical background for dealing with ambiguities, data reduction, building ontologies, etc. Moreover, as a mature theory, it has evolved into numerous extensions and has been transformed through various incarnations, which have enriched expressiveness and applicability of the related tools. The main aim of this article is to present an overview of selected applications of RST in big data analysis and processing. Thousands of publications on rough sets have been contributed; therefore, we focus on papers published in the last few years. The applications of RST are considered from two main perspectives: direct use of the RST concepts and tools, and jointly with other approaches, i.e., fuzzy sets, probabilistic concepts, and deep learning. The latter hybrid idea seems to be very promising for developing new methods and related tools as well as extensions of the application area.
Source: International Journal of Applied Mathematics and Computer Science; 2021, 31, 4; 659-683
ISSN: 1641-876X; 2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Document type: Article

Title: Exploiting multi-core and many-core parallelism for subspace clustering
Authors: Datta, Amitava; Kaur, Amardeep; Lauer, Tobias; Chabbouh, Sami
Links: https://bibliotekanauki.pl/articles/331126.pdf
Publication date: 2019
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: data mining; subspace clustering; multicore processor; many core processor; GPU computing
Abstract: Finding clusters in high-dimensional data is a challenging research problem. Subspace clustering algorithms aim to find clusters in all possible subspaces of the dataset, where a subspace is a subset of dimensions of the data. But the exponential increase in the number of subspaces with the dimensionality of data renders most of the algorithms inefficient as well as ineffective. Moreover, these algorithms have ingrained data dependency in the clustering process, which means that parallelization becomes difficult and inefficient. SUBSCALE is a recent subspace clustering algorithm which is scalable with the dimensions and contains independent processing steps which can be exploited through parallelism. In this paper, we aim to leverage the computational power of widely available multi-core processors to improve the runtime performance of the SUBSCALE algorithm. The experimental evaluation shows linear speedup. Moreover, we develop an approach using graphics processing units (GPUs) for fine-grained data parallelism to accelerate the computation further. First tests of the GPU implementation show very promising results.
Source: International Journal of Applied Mathematics and Computer Science; 2019, 29, 1; 81-91
ISSN: 1641-876X; 2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Document type: Article

Title: An algorithm for reducing the dimension and size of a sample for data exploration procedures
Authors: Kulczycki, P.; Łukasik, S.
Links: https://bibliotekanauki.pl/articles/330110.pdf
Publication date: 2014
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: dimension reduction; sample size reduction; linear transformation; simulated annealing; data mining
Abstract: The paper deals with the issue of reducing the dimension and size of a data set (random sample) for exploratory data analysis procedures. The concept of the algorithm investigated here is based on linear transformation to a space of a smaller dimension, while retaining as much as possible the same distances between particular elements. Elements of the transformation matrix are computed using the metaheuristics of parallel fast simulated annealing. Moreover, those data set elements which have undergone a significant change in location in relation to the others are eliminated or reduced in importance. The presented method can have universal application in a wide range of data exploration problems, offering flexible customization, possibility of use in a dynamic data environment, and comparable or better performance compared with principal component analysis. Its positive features were verified in detail for the domain's fundamental tasks of clustering, classification and detection of atypical elements (outliers).
Source: International Journal of Applied Mathematics and Computer Science; 2014, 24, 1; 133-149
ISSN: 1641-876X; 2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Document type: Article

Title: A Fuzzy Logic Based Approach to Linguistic Summaries of Databases
Authors: Kacprzyk, J.; Yager, R. R.; Zadrożny, S.
Links: https://bibliotekanauki.pl/articles/911154.pdf
Publication date: 2000
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: fuzzy logic; database; linguistic summary; computing with words; data mining; fuzzy querying
Abstract: In this paper, we present basic ideas and perspectives related to the use of fuzzy logic for the derivation of linguistic summaries of data (databases). We concentrate on the issue of how to measure the goodness of a linguistic summary, and on how to embed data summarization within the fuzzy querying environment, for an effective and efficient implementation. In particular, we propose how to efficiently implement Kacprzyk and Yager's (2000) new quality indicators of linguistic summaries to derive summaries via Kacprzyk and Zadrożny's (1994; 1995a; 1995b; 1996) fuzzy querying add-on. Finally, we present an implementation for deriving linguistic summaries of a sales database at a computer retailer, and show how the linguistic summaries obtained can be useful for supporting decisions of the business owner.
Source: International Journal of Applied Mathematics and Computer Science; 2000, 10, 4; 813-834
ISSN: 1641-876X; 2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Document type: Article
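
Illustration (not part of the catalogue record): the abstract above concerns measuring the goodness of a linguistic summary such as "most sales are cheap". The sketch below shows only the basic degree of truth commonly attributed to Yager, namely the fuzzy quantifier applied to the mean membership of the records in the summarizer. It does not implement the new quality indicators or the fuzzy querying add-on mentioned in the abstract. The membership functions `mu_cheap` and `mu_most`, their breakpoints, and the price list are hypothetical.

```python
def mu_cheap(price):
    """Hypothetical membership of a price in the fuzzy set 'cheap'."""
    if price <= 200:
        return 1.0
    if price >= 600:
        return 0.0
    return (600 - price) / 400  # linear descent between 200 and 600


def mu_most(share):
    """Hypothetical piecewise-linear fuzzy quantifier 'most' on a share in [0, 1]."""
    if share >= 0.8:
        return 1.0
    if share <= 0.3:
        return 0.0
    return 2.0 * share - 0.6


def truth_of_summary(values, summarizer, quantifier):
    """Degree of truth of 'Q records are S': quantifier of the mean membership."""
    share = sum(summarizer(v) for v in values) / len(values)
    return quantifier(share)


# Hypothetical sale prices
prices = [100, 200, 300, 400, 900]
truth = truth_of_summary(prices, mu_cheap, mu_most)
print(f"Truth of 'most sales are cheap': {truth:.2f}")
```

A summary generator would evaluate this degree for many candidate quantifier and summarizer pairs and keep the best-scoring ones.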

Title: A complete gradient clustering algorithm formed with kernel estimators
Authors: Kulczycki, P.; Charytanowicz, M.
Links: https://bibliotekanauki.pl/articles/907781.pdf
Publication date: 2010
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: data analysis; data mining; clustering; gradient procedures; nonparametric statistical methods; statistical method; kernel estimators; kernel estimation; numerical calculations
Abstract: The aim of this paper is to provide a gradient clustering algorithm in its complete form, suitable for direct use without requiring a deeper statistical knowledge. The values of all parameters are effectively calculated using optimizing procedures. Moreover, an illustrative analysis of the meaning of particular parameters is shown, followed by the effects resulting from possible modifications with respect to their primarily assigned optimal values. The proposed algorithm does not demand strict assumptions regarding the desired number of clusters, which allows the obtained number to be better suited to a real data structure. Moreover, a feature specific to it is the possibility to influence the proportion between the number of clusters in areas where data elements are dense as opposed to their sparse regions. Finally, the algorithm, by detecting one-element clusters, allows atypical elements to be identified, which enables their elimination or possible assignment to bigger clusters, thus increasing the homogeneity of the data set.
Source: International Journal of Applied Mathematics and Computer Science; 2010, 20, 1; 123-134
ISSN: 1641-876X; 2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Document type: Article

Title: Graph-based generation of a meta-learning search space
Authors: Jankowski, N.
Links: https://bibliotekanauki.pl/articles/330964.pdf
Publication date: 2012
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: meta learning; data mining; learning machines; complexity of learning; complexity of learning machines; computational intelligence
Abstract: Meta-learning is becoming more and more important in current and future research concentrated around broadly defined data mining or computational intelligence. It can solve problems that cannot be solved by any single, specialized algorithm. The overall characteristic of each meta-learning algorithm mainly depends on two elements: the learning machine space and the supervisory procedure. The former restricts the space of all possible learning machines to a subspace to be browsed by a meta-learning algorithm. The latter determines the order of selected learning machines with a module responsible for machine complexity evaluation, organizes tests and performs analysis of results. In this article we present a framework for meta-learning search that can be seen as a method of sophisticated description and evaluation of functional search spaces of learning machine configurations used in meta-learning. Machine spaces are defined by special graphs whose vertices are specialized machine configuration generators. By using such graphs the learning machine space may be modeled in a much more flexible way, depending on the characteristics of the problem considered and a priori knowledge. The presented method of search space description is used together with an advanced algorithm which orders test tasks according to their complexities.
Source: International Journal of Applied Mathematics and Computer Science; 2012, 22, 3; 647-667
ISSN: 1641-876X; 2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Document type: Article

Information

You are searching for the phrase "mining data" by the criterion: Subject
