
You are searching for the phrase "BIG" by the criterion: Topic


Displaying 1-8 of 8
Title:
A hybrid scheduler for many task computing in big data systems
Authors:
Vasiliu, L.
Pop, F.
Negru, C.
Mocanu, M.
Cristea, V.
Kolodziej, J.
Links:
https://bibliotekanauki.pl/articles/907647.pdf
Publication date:
2017
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Topics:
many task computing
scheduling heuristics
QoS
big data system
simulation
many-task computing
task scheduling
large data set
Description:
With the rapid evolution of distributed computing in the last few years, the amount of data created and processed has grown quickly to the petabyte or even exabyte scale. Such huge data sets require data-intensive computing applications and impose performance requirements on the infrastructures that support them, such as high scalability, ample storage and fault tolerance, as well as efficient scheduling algorithms. This paper provides a hybrid scheduling algorithm for many task computing that addresses big data environments with few penalties, taking deadlines into consideration and satisfying a data-dependent task model. The hybrid solution combines several heuristics and algorithms (min-min, min-max and earliest deadline first) into a scheduling algorithm that matches our problem. The experiments are conducted by simulation and show that the proposed hybrid algorithm behaves very well in terms of meeting deadlines. (A short illustrative scheduling sketch follows this record.)
Source:
International Journal of Applied Mathematics and Computer Science; 2017, 27, 2; 385-399
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
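A minimal, hypothetical Python sketch of a min-min assignment loop with an earliest-deadline tie-break, included only to make the general idea behind the hybrid described above concrete; the Task and Machine structures, the linear speed model and the tie-break rule are assumptions, not the authors' algorithm.

# Illustrative sketch (not the paper's algorithm): min-min task-to-machine
# assignment with an earliest-deadline tie-break.
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    length: float     # abstract work units (assumption)
    deadline: float   # absolute deadline (assumption)

@dataclass
class Machine:
    name: str
    speed: float            # work units per time unit (assumption)
    ready_time: float = 0.0

def min_min_with_deadlines(tasks, machines):
    """Repeatedly schedule the task whose best achievable completion time
    is smallest (min-min); ties are broken by the earlier deadline (EDF)."""
    schedule, pending = [], list(tasks)
    while pending:
        best = None
        for task in pending:
            machine = min(machines,
                          key=lambda m: m.ready_time + task.length / m.speed)
            completion = machine.ready_time + task.length / machine.speed
            key = (completion, task.deadline)
            if best is None or key < best[0]:
                best = (key, task, machine)
        (completion, _), task, machine = best
        machine.ready_time = completion
        schedule.append((task.name, machine.name, completion,
                         completion <= task.deadline))
        pending.remove(task)
    return schedule

# Made-up example: three tasks, two machines of different speeds.
tasks = [Task("t1", 4, 10), Task("t2", 2, 3), Task("t3", 6, 12)]
machines = [Machine("m1", 1.0), Machine("m2", 2.0)]
for entry in min_min_with_deadlines(tasks, machines):
    print(entry)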
Title:
Efficient storage, retrieval and analysis of poker hands: An adaptive data framework
Authors:
Gorawski, M.
Lorek, M.
Links:
https://bibliotekanauki.pl/articles/330018.pdf
Publication date:
2017
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Topics:
big data
storage model design
data architecture
data access
path optimization
data set
data architecture
data sharing
area optimization
Description:
In online gambling, poker hands are one of the most popular and fundamental units of the game state and can be considered objects comprising all the events that pertain to a single hand played. In a situation where tens of millions of poker hands are produced daily and need to be stored and analysed quickly, the use of relational databases no longer provides high scalability and performance stability. The purpose of this paper is to present an efficient way of storing and retrieving poker hands in a big data environment. We propose a new, read-optimised storage model that offers significant data access improvements over traditional database systems as well as existing Hadoop file formats such as ORC, RCFile or SequenceFile. Through index-oriented partition elimination, our file format reduces the number of file splits that need to be accessed and improves query response time by up to three orders of magnitude in comparison with other approaches. In addition, our file format supports a range of new indexing structures to facilitate fast row retrieval at the split level. Both index types operate independently of the Hive execution context and allow other big data computational frameworks, such as MapReduce or Spark, to benefit from the optimised data access path to the hand information. Moreover, we present a detailed analysis of our storage model and its supporting index structures, and of how they are organised in the overall data framework. We also describe in detail how predicate-based expression trees are used to build effective file-level execution plans. Our experimental tests, conducted on a production cluster holding nearly 40 billion hands spanning over 4000 partitions, show that multi-way partition pruning outperforms other existing file formats, resulting in faster query execution times and better cluster utilisation. (A short illustrative partition-pruning sketch follows this record.)
Source:
International Journal of Applied Mathematics and Computer Science; 2017, 27, 4; 713-726
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
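The index structures and file format above are specific to the paper; the short Python sketch below only illustrates the general mechanism of index-oriented partition elimination, using hypothetical per-partition min/max statistics for a hand-identifier column and a simple range predicate. All names and the statistics layout are assumptions.

# Illustrative sketch of partition pruning with per-partition min/max
# statistics (hypothetical layout, not the authors' file format).
partitions = [
    {"path": "part-0000", "min_hand_id": 0,          "max_hand_id": 9_999_999},
    {"path": "part-0001", "min_hand_id": 10_000_000, "max_hand_id": 19_999_999},
    {"path": "part-0002", "min_hand_id": 20_000_000, "max_hand_id": 29_999_999},
]

def prune(partitions, lo, hi):
    """Keep only partitions whose [min, max] range can contain rows with
    hand_id in [lo, hi]; all other partitions are never read."""
    return [p for p in partitions
            if p["max_hand_id"] >= lo and p["min_hand_id"] <= hi]

# A query for hand_id BETWEEN 12_000_000 AND 12_500_000 touches a single
# partition instead of all of them.
print([p["path"] for p in prune(partitions, 12_000_000, 12_500_000)])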
Title:
Interpretable decision-tree induction in a big data parallel framework
Authors:
Weinberg, A. I.
Last, M.
Links:
https://bibliotekanauki.pl/articles/330635.pdf
Publication date:
2017
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Topics:
big data
parallel computing
mapreduce
decision trees
editing distance
tree similarity
data set
parallel computing
decision trees
edit distance
Description:
When running data-mining algorithms on big data platforms, a parallel, distributed framework such as MAPREDUCE may be used. However, in a parallel framework each individual model fits the data allocated to its own computing node without necessarily fitting the entire dataset. In order to induce a single consistent model, ensemble algorithms, such as majority voting, aggregate the local models rather than analyzing the entire dataset directly. Our goal is to develop an efficient algorithm for choosing one representative model from multiple, locally induced decision-tree models. The proposed SySM (syntactic similarity method) algorithm computes the similarity between the models produced by parallel nodes and chooses the model which is most similar to the others as the best representative of the entire dataset. In 18.75% of 48 experiments on four big datasets, SySM accuracy is significantly higher than that of the ensemble; in 43.75% of the experiments it is significantly lower; in one case the results are identical; and in the remaining 35.41% of cases the difference is not statistically significant. Compared with ensemble methods, the representative tree models selected by the proposed methodology are more compact and interpretable, their induction consumes less memory, and, as confirmed by the empirical results, they allow faster classification of new records. (A short illustrative tree-selection sketch follows this record.)
Source:
International Journal of Applied Mathematics and Computer Science; 2017, 27, 4; 737-748
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
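A minimal Python sketch of the selection step described above: compute pairwise distances between locally induced trees and pick the tree closest to all the others. The distance used here is a naive, position-wise comparison of ordered labelled trees, chosen only to keep the example short; the paper relies on a proper tree edit distance, and all class and label names below are made up.

# Illustrative sketch of picking a representative decision tree by total
# pairwise distance; the naive distance is a stand-in for tree edit distance.
from itertools import zip_longest

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def size(tree):
    return 0 if tree is None else 1 + sum(size(c) for c in tree.children)

def naive_distance(a, b):
    if a is None:
        return size(b)
    if b is None:
        return size(a)
    cost = 0 if a.label == b.label else 1
    for ca, cb in zip_longest(a.children, b.children):
        cost += naive_distance(ca, cb)
    return cost

def most_representative(trees):
    """Return the tree with the smallest total distance to all others."""
    totals = [sum(naive_distance(t, u) for u in trees) for t in trees]
    return trees[totals.index(min(totals))]

# Made-up example: three small trees whose node labels are split conditions.
t1 = Node("x<5", [Node("A"), Node("B")])
t2 = Node("x<5", [Node("A"), Node("C")])
t3 = Node("y<2", [Node("B"), Node("A")])
print(most_representative([t1, t2, t3]).label)   # picks t1 here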
Title:
Applications of rough sets in big data analysis: An overview
Authors:
Pięta, Piotr
Szmuc, Tomasz
Links:
https://bibliotekanauki.pl/articles/2055175.pdf
Publication date:
2021
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Topics:
rough sets theory
big data analysis
deep learning
data mining
rough set theory
large data set
deep learning
data mining
Description:
Big data, artificial intelligence and the Internet of things (IoT) remain very popular areas in current research and industrial applications. Processing the massive amounts of data generated by the IoT and stored in distributed space is not a straightforward task and may cause many problems. During the last few decades, scientists have proposed many interesting approaches to extract information and discover knowledge from data collected in database systems or other sources. We observe a permanent development of machine learning algorithms that support each phase of the data mining process, ensuring better results than before. Rough set theory (RST) delivers a formal insight into information, knowledge, data reduction, uncertainty, and missing values. This formalism, formulated in the 1980s and developed by several researchers, can serve as a theoretical basis and practical background for dealing with ambiguities, data reduction, building ontologies, etc. Moreover, as a mature theory, it has evolved into numerous extensions and has been transformed through various incarnations, which have enriched the expressiveness and applicability of the related tools. The main aim of this article is to present an overview of selected applications of RST in big data analysis and processing. Thousands of publications on rough sets have appeared; therefore, we focus on papers published in the last few years. The applications of RST are considered from two main perspectives: direct use of the RST concepts and tools, and use jointly with other approaches, i.e., fuzzy sets, probabilistic concepts, and deep learning. The latter, hybrid idea seems very promising for developing new methods and related tools as well as extensions of the application area. (A short illustrative rough-set approximation sketch follows this record.)
Source:
International Journal of Applied Mathematics and Computer Science; 2021, 31, 4; 659-683
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
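The survey above covers a broad range of rough-set applications; the Python sketch below only illustrates the two basic RST notions it builds on, the lower and upper approximations of a concept, computed on a toy decision table. The attributes, values and decision classes are made up for the example.

# Illustrative sketch of rough-set lower/upper approximations on a toy
# decision table (made-up data, independent of the surveyed systems).
from collections import defaultdict

# Each object is a pair: condition attributes and a decision class.
table = [
    ({"fever": "yes", "cough": "yes"}, "flu"),
    ({"fever": "yes", "cough": "yes"}, "cold"),   # indiscernible from object 0
    ({"fever": "no",  "cough": "yes"}, "cold"),
    ({"fever": "no",  "cough": "no"},  "healthy"),
]

def approximations(table, target_class):
    # Group objects into indiscernibility classes by their condition attributes.
    blocks = defaultdict(set)
    for i, (conditions, _) in enumerate(table):
        blocks[tuple(sorted(conditions.items()))].add(i)
    target = {i for i, (_, decision) in enumerate(table) if decision == target_class}
    lower = set().union(*[b for b in blocks.values() if b <= target])
    upper = set().union(*[b for b in blocks.values() if b & target])
    return lower, upper

lower, upper = approximations(table, "cold")
print("lower approximation:", lower)   # objects certainly 'cold' given the attributes
print("upper approximation:", upper)   # objects possibly 'cold'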
Title:
Recommendation systems with the quantum k-NN and Grover algorithms for data processing
Authors:
Sawerwain, Marek
Wróblewski, Marek
Links:
https://bibliotekanauki.pl/articles/330538.pdf
Publication date:
2019
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Topics:
quantum k-NN algorithm
recommendation system
Grover algorithm
big data
quantum k-NN algorithm
recommendation system
Grover algorithm
large data set
Description:
In this article, we discuss the implementation of a quantum recommendation system that uses a quantum variant of the k-nearest neighbours algorithm and the Grover algorithm to search for a specific element in an unstructured database. In addition to presenting the recommendation system as an algorithm, the article shows the main steps in the construction of a suitable quantum circuit for the realisation of the recommendation system. The computational complexity of the individual calculation steps is also indicated. The correctness of the proposed system is analysed as well, and an algebraic equation describing the probability of a successful recommendation is given. The article also provides numerical examples presenting the behaviour of the recommendation system for two selected cases. (A short illustrative Grover-search sketch follows this record.)
Source:
International Journal of Applied Mathematics and Computer Science; 2019, 29, 1; 139-150
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
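The quantum circuit in the paper is its own construction; the sketch below is only a small NumPy state-vector simulation of the Grover step it relies on (phase oracle followed by inversion about the mean) for a single marked index, to show how the amplitude of the searched element is amplified. The problem size and the marked index are arbitrary choices for the example.

# Minimal NumPy simulation of Grover search for one marked index;
# illustrative only, not the recommendation circuit from the paper.
import numpy as np

def grover(n_qubits, marked):
    n = 2 ** n_qubits
    state = np.full(n, 1.0 / np.sqrt(n))         # uniform superposition
    iterations = int(np.floor(np.pi / 4 * np.sqrt(n)))
    for _ in range(iterations):
        state[marked] *= -1.0                    # phase oracle
        state = 2 * state.mean() - state         # inversion about the mean
    return state

state = grover(n_qubits=4, marked=11)
probabilities = state ** 2
print("p(marked index) =", probabilities[11])    # close to 1 after 3 iterations
print("most likely index:", int(np.argmax(probabilities)))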
Title:
Exploring complex and big data
Authors:
Stefanowski, J.
Krawiec, K.
Wrembel, R.
Links:
https://bibliotekanauki.pl/articles/330152.pdf
Publication date:
2017
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Topics:
big data
complex data
data integration
data provenance
data streams
deep learning
complex data
data integration
data provenance
data stream
deep learning
Description:
This paper shows how big data analysis opens a range of research and technological problems and calls for new approaches. We start with defining the essential properties of big data and discussing the main types of data involved. We then survey the dedicated solutions for storing and processing big data, including a data lake, virtual integration, and a polystore architecture. Difficulties in managing data quality and provenance are also highlighted. The characteristics of big data imply also specific requirements and challenges for data mining algorithms, which we address as well. The links with related areas, including data streams and deep learning, are discussed. The common theme that naturally emerges from this characterization is complexity. All in all, we consider it to be the truly defining feature of big data (posing particular research and technological challenges), which ultimately seems to be of greater importance than the sheer data volume.
Source:
International Journal of Applied Mathematics and Computer Science; 2017, 27, 4; 669-679
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
Title:
Efficient astronomical data condensation using approximate nearest neighbors
Authors:
Łukasik, Szymon
Lalik, Konrad
Sarna, Piotr
Kowalski, Piotr A.
Charytanowicz, Małgorzata
Kulczycki, Piotr
Links:
https://bibliotekanauki.pl/articles/907932.pdf
Publication date:
2019
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Topics:
big data
astronomical observation
data reduction
nearest neighbor search
kd-trees
large data set
astronomical observation
data reduction
nearest neighbor search
kd-tree
Description:
Extracting useful information from astronomical observations represents one of the most challenging tasks of data exploration. This is largely due to the volume of the data acquired using advanced observational tools. While other challenges typical of the class of big data problems (like data variety) are also present, the size of the datasets represents the most significant obstacle to visualization and subsequent analysis. This paper studies an efficient data condensation algorithm aimed at providing a compact representation of such data. It is based on fast nearest neighbor calculation using tree structures and parallel processing. In addition, the possibility of using approximate identification of neighbors to further improve the algorithm's time performance is evaluated. The properties of the proposed approach, both in terms of performance and condensation quality, are experimentally assessed on astronomical datasets related to the GAIA mission. It is concluded that the introduced technique might serve as a scalable method of alleviating the problem of dataset size. (A short illustrative kd-tree condensation sketch follows this record.)
Source:
International Journal of Applied Mathematics and Computer Science; 2019, 29, 3; 467-476
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
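The condensation algorithm studied above uses its own neighbourhood criterion; the Python sketch below shows only the general mechanism of thinning a large point set with a kd-tree, greedily keeping a point and discarding everything within a fixed radius of it. SciPy's cKDTree is used, and the radius, the greedy rule and the synthetic data are assumptions, not the paper's method.

# Illustrative kd-tree based thinning of a point cloud; a greedy stand-in
# for the condensation algorithm, not its implementation.
import numpy as np
from scipy.spatial import cKDTree

def condense(points, radius):
    """Greedily keep a point and drop all not-yet-kept points within
    'radius' of it, so the survivors form a coarse cover of the data."""
    tree = cKDTree(points)
    removed = np.zeros(len(points), dtype=bool)
    kept = []
    for i in range(len(points)):
        if removed[i]:
            continue
        kept.append(i)
        for j in tree.query_ball_point(points[i], r=radius):
            if j != i:
                removed[j] = True
    return points[kept]

rng = np.random.default_rng(0)
points = rng.normal(size=(20_000, 3))     # synthetic stand-in for catalogue rows
reduced = condense(points, radius=0.2)
print(len(points), "->", len(reduced))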
Title:
An effective data reduction model for machine emergency state detection from big data tree topology structures
Authors:
Iaremko, Iaroslav
Senkerik, Roman
Jasek, Roman
Lukastik, Petr
Links:
https://bibliotekanauki.pl/articles/2055178.pdf
Publication date:
2021
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Topics:
OPC UA
OPC tree
principal component analysis
PCA
big data analysis
data reduction
machine tool
anomaly detection
emergency states
principal component analysis
large data set
data reduction
anomaly detection
emergency state
Description:
This work presents an original model for detecting machine tool anomalies and emergency states through the processing of operation data. The paper focuses on an elastic hierarchical system for effective data reduction and classification, which encompasses several modules. First, principal component analysis (PCA) is used to reduce the many input signals from big data tree topology structures into two signals representing all of them. Then a technique for segmenting operating machine data, based on dynamic time distortion and hierarchical clustering, is used to calculate signal accident characteristics using classifiers such as the maximum level change, the signal trend, the variance of residuals, and others. The data segmentation and analysis techniques enable effective and robust detection of operating machine tool anomalies and emergency states thanks to near-real-time data collection from strategically placed sensors and results collected from previous production cycles. The emergency state detection model described in this paper could be beneficial for improving the production process, increasing production efficiency by detecting and minimizing machine tool error conditions, as well as improving product quality and overall equipment productivity. The proposed model was tested on H-630 and H-50 machine tools in a real production environment of the Tajmac-ZPS company. (A short illustrative PCA reduction sketch follows this record.)
Source:
International Journal of Applied Mathematics and Computer Science; 2021, 31, 4; 601-611
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
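The full model above couples PCA-based reduction with segmentation and several signal classifiers; the Python sketch below shows only the first stage in a generic form, projecting many sensor channels onto two principal components and flagging samples with an unusually large reconstruction residual. The synthetic signals, the 3-sigma threshold and the use of scikit-learn are assumptions, not the authors' pipeline.

# Illustrative PCA reduction of many sensor channels to two components,
# with a simple residual-based anomaly flag; a generic sketch only.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
normal = rng.normal(size=(1000, 12))       # 12 synthetic sensor channels
signals = normal.copy()
signals[::100] += 8.0                      # inject a few gross outliers

pca = PCA(n_components=2).fit(normal)      # fit on normal operation only
reduced = pca.transform(signals)           # the two summary signals
reconstructed = pca.inverse_transform(reduced)
residual = np.linalg.norm(signals - reconstructed, axis=1)

threshold = residual.mean() + 3 * residual.std()   # crude 3-sigma rule
anomalies = np.flatnonzero(residual > threshold)
print("flagged sample indices:", anomalies)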