Subject: selekcja danych (data selection) - OPAC collections catalogue


Title: On the predictive power of meta-features in OpenML
Authors: Bilalli, B.
Abelló, A.
Aluja-Banet, T.
Related: https://bibliotekanauki.pl/articles/331086.pdf
Publication date: 2017
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: feature extraction
feature selection
meta learning
ekstrakcja danych
selekcja danych
uczenie maszynowe
Description: The demand for performing data analysis is steadily rising. As a consequence, people of different profiles (i.e., non-experienced users) have started to analyze their data. However, this is challenging for them. A key step that poses difficulties and determines the success of the analysis is data mining (the model/algorithm selection problem). Meta-learning is a technique used for assisting non-expert users in this step. The effectiveness of meta-learning is, however, largely dependent on the description/characterization of datasets (i.e., the meta-features used for meta-learning). There is a need for improving the effectiveness of meta-learning by identifying and designing more predictive meta-features. In this work, we use a method from exploratory factor analysis to study the predictive power of different meta-features collected in OpenML, which is a collaborative machine learning platform designed to store and organize meta-data about datasets, data mining algorithms, models and their evaluations. We first use the method to extract latent features, which are abstract concepts that group together meta-features with common characteristics. Then, we study and visualize the relationship of the latent features with three different performance measures of four classification algorithms on hundreds of datasets available in OpenML, and we select the latent features with the highest predictive power. Finally, we use the selected latent features to perform meta-learning and we show that our method improves the meta-learning process. Furthermore, we design an easy-to-use application for retrieving different meta-data from OpenML as the biggest source of data in this domain.
Source: International Journal of Applied Mathematics and Computer Science; 2017, 27, 4; 697-712
1641-876X
2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Article


Title: Zastosowanie sztucznych sieci neuronowych do wyznaczania przepuszczalności skał na podstawie danych otworowych z rejonu Dzików–Wola Obszańska w północno-wschodniej części zapadliska przedkarpackiego
Artificial Neural Networks applying for determining the absolute rock permeability on the basis of data from boreholes situated on the Dzików–Wola Obszańska area (northeastern part of the Carpathian Foredeep Basin)
Authors: Jarzyna, J.
Prętka, J.
Related: https://bibliotekanauki.pl/articles/2063104.pdf
Publication date: 2010
Publisher: Państwowy Instytut Geologiczny – Państwowy Instytut Badawczy
Subjects: przepuszczalność skał
sztuczne sieci neuronowe SSN
selekcja danych wejściowych
profilowania geofizyki otworowej
rock permeability
artificial neural networks
input data selection
well logging
Description: Zbadano zdolność sztucznych sieci neuronowych SSN do oceny przepuszczalności absolutnej skał. Do tego celu wykorzystano dane z pięciu otworów wiertniczych, zlokalizowanych w północno-wschodniej części zapadliska przedkarpackiego: Dzików 16, 17, 20 oraz Wola Obszańska 10 i 15. Modele neuronowe stworzono na podstawie wyników badań laboratoryjnych próbek skał pobranych w wymienionych otworach, profilowań geofizyki otworowej oraz wyników kompleksowej interpretacji tych profilowań. Otrzymano SSN, służącą do odtwarzania wartości przepuszczalności całkowitej, określonej w laboratorium. Następnie model neuronowy wdrożono do estymowania przepuszczalności w otworze wiertniczym DZ17, przenosząc tym samym rozpoznane wcześniej zależności na nowy zbiór danych. Sieci neuronowe mogą stanowić alternatywę dla klasycznych metod wyznaczania przepuszczalności skał.
The absolute rock permeability was determined with the use of artificial neural networks (ANN). The authors tested the ability of ANNs to determine permeability using data from five boreholes located in the northeastern part of the Carpathian Foredeep: Dzików 16, 17, 20 and Wola Obszańska 10 and 15. Neural models were built on the basis of results from laboratory tests, well-log data and the results of the comprehensive interpretation of those logs. The ANN provided good results in estimating laboratory permeability. The best neural network was applied to a similar data set from the DZ17 borehole to show that the complicated links between the input variables and absolute permeability can be used to predict permeability from other data. Because it is hard to find a suitable deterministic model for permeability estimation, a neural model obtained through training is an alternative method.
Source: Biuletyn Państwowego Instytutu Geologicznego; 2010, 439 (2); 399-402
0867-6143
Appears in: Biuletyn Państwowego Instytutu Geologicznego
Content provider: Biblioteka Nauki

Article


Title: Intelligent Retail Forecasting System for New Clothing Products Considering Stock-out
Inteligentny system przewidywania sprzedaży detalicznej nowych produktów odzieżowych uwzględniający wyprzedaż
Authors: Huang, H.
Liu, Q.
Related: https://bibliotekanauki.pl/articles/232823.pdf
Publication date: 2017
Publisher: Sieć Badawcza Łukasiewicz - Instytut Biopolimerów i Włókien Chemicznych
Subjects: intelligent forecasting system
demand estimation
stock out
adaptive neuro fuzzy inference system
new clothing product
inteligentny system prognozowania
prognozowanie popytu
system adaptacyjno neuronowy
dane rozproszone
selekcja danych
Description: Improving the accuracy of forecasting is crucial but complex in the clothing industry, especially for new products, given the lack of historical data and the wide range of factors affecting demand. Previous studies concentrate more on sales forecasting than on demand forecasting, and the variables affecting demand remain to be optimized. In this study, a two-stage intelligent retail forecasting system is designed for new clothing products. In the first stage, demand is estimated from the original sales data, taking stock-out into account. The adaptive neuro fuzzy inference system (ANFIS) is introduced in the second stage to forecast demand. In addition, a data selection process is presented because of the limited data available for new products. The empirical data are from a Canadian fast-fashion company. The results reveal the relationship between demand and sales, demonstrate the necessity of integrating the demand estimation process into a forecasting system, and show that the ANFIS-based forecasting system outperforms the traditional ANN technique.
Poprawa dokładności prognozowania jest bardzo istotna, ale skomplikowana w przypadku przemysłu odzieżowego, zwłaszcza dla nowych produktów oraz szerokiego zakresu czynników wpływających na popyt. Wcześniejsze badania bardziej koncentrowały się na prognozowaniu sprzedaży, niż prognozowaniu popytu. Zmienne wpływające na popyt powinny zostać zoptymalizowane. W tym badaniu opracowano dwustopniowy inteligentny system prognozowania sprzedaży detalicznej przeznaczony dla nowych produktów odzieżowych. W pierwszym etapie, popyt jest określony za pośrednictwem oryginalnych danych dotyczących sprzedaży. Adaptacyjny neuronowy system danych rozproszonych (ANFIS) jest wprowadzony w drugim etapie do prognozowania popytu. Jednocześnie prezentowany jest proces selekcji danych. Dane empiryczne pochodzą z kanadyjskiej firmy.
Source: Fibres & Textiles in Eastern Europe; 2017, 1 (121); 10-16
1230-3666
2300-7354
Appears in: Fibres & Textiles in Eastern Europe
Content provider: Biblioteka Nauki

Article


Title: Informacja w organizacji -- gromadzenie i przetwarzanie
Authors: Sirko, Stanisław.
Related: Przegląd Informacyjno-Dokumentacyjny Centralny Ośrodek Naukowej Informacji Wojskowej, 2003, no. 3, pp. 58-70
Publication date: 2003
Subjects: Katalogowanie rzeczowe
Informacja naukowa wojsko gromadzenie i selekcja
Przetwarzanie danych
Description: Illustrations.
Content provider: Bibliografia CBW

Article


Title: Feature Selection for Prognostic Models by Linear Separation of Survival Genetic Data Sets
Selekcja cech na potrzeby modeli prognostycznych poprzez liniową separację zbiorów danych genetycznych dotyczących analizy przeżycia
Authors: Bobrowski, L.
Łukaszuk, T.
Related: https://bibliotekanauki.pl/articles/88380.pdf
Publication date: 2018
Publisher: Politechnika Białostocka. Oficyna Wydawnicza Politechniki Białostockiej
Subjects: eksploracja danych
regresja interwałowa
selekcja modelu
relaksacja separacji liniowej
data mining
interval regression
model selection
relaxed linear separability
Description: The paper considers the design of regression models based on high-dimensional (e.g. genetic) data sets by exploring the linear separability problem. The design of the linear regression model is reformulated here as a linear separability problem. The exploration of the linear separability problem is based on the minimization of convex and piecewise linear (CPL) criterion functions. The minimization of the CPL criterion functions was used not only for estimating the prognostic model parameters, but also for the most effective selection of feature subsets (model selection) in accordance with the relaxed linear separability (RLS) method. This approach to designing prognostic models has been used in experiments both with synthetic multivariate data and with genetic data sets containing censored values of the dependent variable. The quality of the prognostic models resulting from the linear separability postulate has been evaluated by using the measure of model discrepancy and the estimated classification error rate. In order to reduce the bias of the evaluation, the values of the model discrepancy and the classification error have been computed in different feature subspaces, in accordance with the cross-validation procedure. A series of new experiments described in this paper shows that the design of regression models can be based on the linear separability principle. More specifically, high-dimensional genetic sets with a censored dependent variable can be used in the design procedure. The proposed measure of prognostic model discrepancy can be effectively used in the search for the optimal feature subspace and for selecting the linear regression model.
W artykule rozważane jest projektowanie modeli regresji opartych na wysokowymiarowych (np. genetycznych) zbiorach danych poprzez badanie problemu separacji liniowej. Projektowanie modelu regresji liniowej zostało tu przeformułowane jako problem separacji liniowej. Eksploracja problemu separacji liniowej opiera się na minimalizacji wypukłej i odcinkowo-liniowej (CPL) funkcji kryterialnej. Minimalizacja funkcji kryterialnej typu CPL została wykorzystana nie tylko do oszacowania parametrów modelu prognostycznego, ale również do skutecznego wyboru podzbioru cech (selekcji modelu) zgodnie z metodą relaksacji separacji liniowej (RLS). Takie podejście do projektowania modeli prognostycznych zostało wykorzystane w eksperymentach zarówno z syntetycznymi danymi wielowymiarowymi, jak i do zbiorów danych genetycznych zawierających cenzurowane wartości zmiennej zależnej. Jakość modeli prognostycznych otrzymywanych w oparciu o postulat liniowej separacji została oceniona przy użyciu miary rozbieżności modelu i szacowanego wskaźnika błędu klasyfikacji. W celu zmniejszenia obciążenia oceny, obliczono wartości rozbieżności modelu i błędu klasyfikacji w różnych podprzestrzeniach cech, zgodnie z procedurą walidacji krzyżowej. Seria nowych eksperymentów opisanych w niniejszym opracowaniu pokazuje, że projektowanie modeli regresji może być oparte na zasadzie separacji liniowej. W szczególności, w procedurze projektowania można użyć wysokowymiarowych zbiorów genetycznych o cenzurowanej zmiennej zależnej. Proponowana miara rozbieżności modelu prognostycznego może być skutecznie wykorzystana w poszukiwaniu optymalnej podprzestrzeni cech i selekcji modelu regresji liniowej.
Source: Advances in Computer Science Research; 2018, 14; 31-54
2300-715X
Appears in: Advances in Computer Science Research
Content provider: Biblioteka Nauki

Article


Title: Optimization on the complementation procedure towards efficient implementation of the index generation function
Authors: Borowik, G.
Related: https://bibliotekanauki.pl/articles/330597.pdf
Publication date: 2018
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: data reduction
feature selection
indiscernibility matrix
logic synthesis
index generation function
redukcja danych
selekcja cech
synteza logiczna
funkcja generowania indeksów
Description: In the era of big data, solutions are desired that would be capable of efficient data reduction. This paper presents a summary of research on an algorithm for complementation of a Boolean function which is fundamental for logic synthesis and data mining. Successively, the existing problems and their proposed solutions are examined, including the analysis of current implementations of the algorithm. Then, methods to speed up the computation process and an efficient parallel implementation of the algorithm are shown; they include optimization of data representation, recursive decomposition, merging, and removal of redundant data. Besides the discussion of computational complexity, the paper compares the processing times of the proposed solution with those of well-known analysis and data mining systems. Although the presented idea is focused on searching for all possible solutions, it can be restricted to finding just those of the smallest size. Both approaches are of great application potential, including proving mathematical theorems, logic synthesis, especially index generation functions, or data processing and mining such as feature selection, data discretization, rule generation, etc. The problem considered is NP-hard, and it is easy to point to examples that are not solvable within the expected amount of time. However, the solution allows the barrier of computations to be moved one step further. For example, the unique algorithm can calculate, as the only one at the moment, all minimal sets of features for a few standard benchmarks. Unlike many existing methods, the algorithm additionally works with undetermined values. The result of this research is easily extendable experimental software that is the fastest among the tested solutions and the data mining systems.
Source: International Journal of Applied Mathematics and Computer Science; 2018, 28, 4; 803-815
1641-876X
2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Article


Title: Identification of submersible pump temperature changes model using KDD methods
Identyfikacja modelu zmian temperatury pompy głębinowej z zastosowaniem metod odkrywania wiedzy w bazach danych
Authors: Wachla, D.
Related: https://bibliotekanauki.pl/articles/327824.pdf
Publication date: 2006
Publisher: Polska Akademia Nauk. Polskie Towarzystwo Diagnostyki Technicznej PAN
Subjects: baza danych
wiedza
identyfikacja systemów
algorytm genetyczny
metoda wektorów wspomagających
selekcja atrybutów
system SCADA
database
knowledge
system identification
genetic algorithm
support vector machines
attributes selection
SCADA systems
Description: This paper deals with the problem of identifying autoregressive models using KDD methods. In the considered problem, the autoregressive models are applied to describe the dynamic processes of various technical systems. In particular, a method for discovering functional dependencies is presented. The method was designed for exploring data sets gathered by industrial SCADA systems. It was verified on the problem of identifying a model of pump temperature changes. For this purpose, a data set gathered by the SCADA system of a submersible pumping station was used. The assumptions, exemplary results of the conducted research and conclusions are presented as well.
W artykule poruszono problem identyfikacji modeli autoregresyjnych opisujących dynamikę obserwowanych procesów. W szczególności przedstawiono metodę odkrywania zależności funkcyjnych w zbiorach danych gromadzonych przez przemysłowe systemy SCADA. Opracowaną metodę zweryfikowano dla problemu identyfikacji modelu zmian temperatury pompy głębinowej. W tym celu zastosowano fragment danych zgromadzony przez system rejestracji danych współpracujący z pompownią głębinową. Przedstawiono przyjęte założenia, fragmenty uzyskanych wyników oraz wnioski z przeprowadzonych badań.
Source: Diagnostyka; 2006, 2(38); 41-44
1641-6414
2449-5220
Appears in: Diagnostyka
Content provider: Biblioteka Nauki

Article


Information

Searching for the phrase "selekcja danych" by criterion: Subject
