Author: Stefanowski, J. - OPAC collections catalog

Title:: An experimental evaluation of two approaches to mining context based sequential patterns
Authors:: Stefanowski, J.
Ziembiński, R.
Links:: https://bibliotekanauki.pl/articles/970837.pdf
Publication date:: 2009
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: knowledge discovery
sequential patterns mining
context patterns
similarity of patterns
Description:: The paper discusses the results of experiments with a new context extension of a sequential pattern mining problem. In this extension, two kinds of context attributes are introduced for describing the source of a sequence and for each element inside this sequence. Such context based sequential patterns may be discovered by a new algorithm, called Context Mapping Improved, specific for handling attributes with similarity functions. For numerical attributes, an alternative approach could include their pre-discretization, transforming discrete values into artificial items, and then using an adaptation of an algorithm for mining sequential patterns from nominal items. The aim of this paper is to experimentally compare these two approaches for mining artificially generated sequence databases with numerical context attributes where several reference patterns are hidden. The results of experiments show that the Context Mapping Improved algorithm has led to better re-discovery of reference patterns. Moreover, a new measure for comparing two sets of context based patterns is introduced.
Source:: Control and Cybernetics; 2009, 38, 1; 27-45
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article

Title:: Extending k-means with the description comes first approach
Authors:: Stefanowski, J.
Weiss, D.
Links:: https://bibliotekanauki.pl/articles/970926.pdf
Publication date:: 2007
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: document clustering
cluster labels
k-means algorithm
information retrieval
Description:: This paper describes a technique for clustering large collections of short and medium length text documents such as press articles, news stories and the like. The technique, called description comes first (DCF), consists of identification of related document clusters, selection of salient phrases relevant to these clusters and reallocation of documents matching the selected phrases to form final document groups. The advantages of this technique include more comprehensive cluster labels and a clearer (more transparent) relationship between cluster labels and their content. We demonstrate the DCF by taking a standard k-means algorithm as a baseline and weaving DCF elements into it; the outcome is the descriptive k-means (DKM) algorithm. The paper goes through technical background explaining how to implement DKM efficiently and ends with the description of an experiment measuring clustering quality on a benchmark document collection 20-newsgroups. Short fragments of this paper appeared at the poster session of the RIAO 2007 conference, Pittsburgh, PA, USA (electronic proceedings only).
Source:: Control and Cybernetics; 2007, 36, 4; 1009-1035
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article

Title:: Semi-supervised approach to handle sudden concept drift in Enron data
Authors:: Kmieciak, M. R.
Stefanowski, J.
Links:: https://bibliotekanauki.pl/articles/206052.pdf
Publication date:: 2011
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: concept drift
incremental learning of classifiers
email foldering
Enron data
Description:: Detection of concept changes in incremental learning from data streams and classifier adaptation is studied in this paper. It is often assumed that all processed learning examples are always labeled, i.e. the class label is available for each example. As it may be difficult to satisfy this assumption in practice, in particular in the case of data streams, we introduce an approach that detects concept drift in unlabeled data and retrains the classifier using a limited number of additionally labeled examples. The usefulness of this partly supervised approach is evaluated in the experimental study with the Enron data. This real life data set concerns classification of users' emails to multiple folders. Firstly, we show that the Enron data are characterized by frequent sudden changes of concepts. We also demonstrate that our approach can precisely detect these changes. Results of the next comparative study demonstrate that our approach leads to classification accuracy comparable to two fully supervised methods: the periodic retraining of the classifier based on windowing and the trigger approach with the DDM supervised drift detection. However, our approach reduces the number of examples to be labeled. Furthermore, it requires fewer classifier retraining updates than windowing.
Source:: Control and Cybernetics; 2011, 40, 3; 667-695
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article

Title:: Exploring complex and big data
Authors:: Stefanowski, J.
Krawiec, K.
Wrembel, R.
Links:: https://bibliotekanauki.pl/articles/330152.pdf
Publication date:: 2017
Publisher:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:: big data
complex data
data integration
data provenance
data streams
deep learning
Description:: This paper shows how big data analysis opens a range of research and technological problems and calls for new approaches. We start with defining the essential properties of big data and discussing the main types of data involved. We then survey the dedicated solutions for storing and processing big data, including a data lake, virtual integration, and a polystore architecture. Difficulties in managing data quality and provenance are also highlighted. The characteristics of big data also imply specific requirements and challenges for data mining algorithms, which we address as well. The links with related areas, including data streams and deep learning, are discussed. The common theme that naturally emerges from this characterization is complexity. All in all, we consider it to be the truly defining feature of big data (posing particular research and technological challenges), which ultimately seems to be of greater importance than the sheer data volume.
Source:: International Journal of Applied Mathematics and Computer Science; 2017, 27, 4; 669-679
1641-876X
2083-8492
Appears in:: International Journal of Applied Mathematics and Computer Science
Content provider:: Biblioteka Nauki

Article

Title:: Rough set based processing of inconsistent information in decision analysis
Authors:: Słowiński, R.
Stefanowski, J.
Greco, S.
Matarazzo, B.
Links:: https://bibliotekanauki.pl/articles/206765.pdf
Publication date:: 2000
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: combinatorics
decision theory
game theory
classification
decision analysis
knowledge based systems
multi-criteria decision analysis
rough sets
rule induction
Description:: Inconsistent information is one of the main difficulties in the explanation and recommendation tasks of decision analysis. We distinguish two kinds of such information inconsistencies: the first is related to indiscernibility of objects described by attributes defined on nominal or ordinal scales, and the other follows from violation of the dominance principle among attributes defined on preference-ordered ordinal or cardinal scales, i.e. among criteria. In this paper we discuss how these two kinds of inconsistencies are handled by a new approach based on rough set theory. Combination of this theory with inductive learning techniques leads to generation of decision rules from rough approximations of decision classes. Particular attention is paid to numerical attribute scales and preference-ordered scales of criteria, and their influence on the syntax of induced decision rules.
Source:: Control and Cybernetics; 2000, 29, 1; 379-404
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article

Information

Searching for the phrase "Stefanowski, J." by criterion: Author
