Author: Pietroń, M. - OPAC catalogue of holdings


Title: Elementary functions in HLL on example of CORDIC algorithm implemented in Mitrion-C language
Implementacja funkcji elementarnych w FPGA na przykładzie algorytmu CORDIC w języku wysokiego poziomu Mitrion-C
Authors: Pietroń, M.
Wiatr, K.
Links: https://bibliotekanauki.pl/articles/156559.pdf
Publication date: 2012
Publisher: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects: HLL
FPGA
elementary functions
HPRC
Description: Elementary functions are used very often in scientific computations. Quantum chemistry, physics, and financial computing are only some of the fields where elementary functions such as the exponential and the logarithm are computed intensively. This paper presents an implementation of an exp(x) core based on the CORDIC algorithm and written in the Mitrion-C language. Mitrion-C is a new high-level language that enables pipelined and highly parallel algorithms to be implemented on FPGA platforms and makes the implementation process faster. From gravitational forces to quantum chemistry and financial mathematics, computational scientists very often use exp(x) in computer simulations. The implemented core generates exponential values in the IEEE 754 single-precision format. The CORDIC algorithm can be used to compute a wide spectrum of elementary functions, such as sine, cosine, and tangent. In our solution, the values of the exponential for the integer part of the input argument are stored in a table allocated in internal memory, and the value for the fractional part is computed by the CORDIC algorithm. The final result is obtained by multiplying the values for the fractional and integer parts. The implementation was made on the SGI Altix 4700 hardware platform, an SGI multiprocessor distributed shared-memory computer system with Virtex-4 LX 200 FPGAs.
Elementary functions are used very often in scientific computations. Quantum chemistry, financial mathematics, and physics are among the many fields in which functions such as the exponential and the logarithm are computed intensively. This paper presents an implementation of the exponential function using the CORDIC algorithm in the Mitrion-C language. Mitrion-C is a new high-level language for programming FPGA devices. It provides instructions and built-in data types that allow algorithms to be programmed both in a pipelined and in a fully parallel manner. In our solution, the input argument is split into an integer part and a fractional part. The values of the exponential for the integer part are stored in a table in internal memory, while the value for the fractional part is computed by the CORDIC algorithm. The final result is obtained by multiplying the values for the fractional and integer parts. The implementation was made on the SGI Altix 4700 hardware platform, a multiprocessor shared-memory platform with Virtex-4 LX 200 FPGAs.
Source: Pomiary Automatyka Kontrola; 2012, R. 58, nr 7, 7; 671-673
0032-4140
Appears in: Pomiary Automatyka Kontrola
Content provider: Biblioteka Nauki

Article
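
Illustrative sketch (not taken from the paper): the decomposition described in the abstract above, a table lookup for the integer part multiplied by a CORDIC result for the fractional part, can be tried out in Python. Everything named here is an assumption made for illustration: the table range, the iteration count, the function names, and the use of double-precision floats instead of the fixed-point arithmetic an FPGA core would likely use. The fractional part is evaluated with textbook hyperbolic CORDIC in rotation mode, using exp(f) = cosh(f) + sinh(f).

```python
import math

# Hypothetical table of exp(n) for the integer part; the range roughly
# covers arguments whose result fits in IEEE 754 single precision.
N_MIN, N_MAX = -87, 88
EXP_TABLE = [math.exp(n) for n in range(N_MIN, N_MAX + 1)]


def shift_sequence(count):
    """Shifts 1, 2, 3, 4, 4, 5, ... (4, 13, 40, ... are repeated,
    which hyperbolic CORDIC needs in order to converge)."""
    seq, i, repeat_at = [], 1, 4
    while len(seq) < count:
        seq.append(i)
        if i == repeat_at:
            seq.append(i)
            repeat_at = 3 * repeat_at + 1
        i += 1
    return seq[:count]


SHIFTS = shift_sequence(30)
# Hyperbolic CORDIC gain: each step scales the vector by sqrt(1 - 2^(-2i)).
GAIN = math.prod(math.sqrt(1.0 - 2.0 ** (-2 * i)) for i in SHIFTS)


def cordic_exp_fraction(f):
    """Approximate exp(f) for 0 <= f < 1 with hyperbolic CORDIC."""
    x, y, z = 1.0 / GAIN, 0.0, f
    for i in SHIFTS:
        d = 1.0 if z >= 0.0 else -1.0
        t = 2.0 ** -i
        x, y = x + d * y * t, y + d * x * t
        z -= d * math.atanh(t)
    return x + y  # cosh(f) + sinh(f) = exp(f)


def cordic_exp(value):
    """exp(value) = table[integer part] * CORDIC(fractional part)."""
    n = math.floor(value)
    if not N_MIN <= n <= N_MAX:
        raise ValueError("argument outside the table range")
    return EXP_TABLE[n - N_MIN] * cordic_exp_fraction(value - n)
```

Taking the floor of the argument keeps the fractional part in [0, 1) for negative inputs as well, which stays inside the convergence range of hyperbolic CORDIC (about 1.118).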


Title: Implementacja oraz porównanie algorytmów tekstowych w środowiskach przetwarzania równoległego na przykładzie procesorów wielordzeniowych i kart graficznych
Multicore and GPGPU implementation of chosen text algorithms
Authors: Pietroń, M.
Wielgosz, M.
Wiatr, K.
Links: https://bibliotekanauki.pl/articles/155953.pdf
Publication date: 2014
Publisher: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects: text algorithms
GPGPU
parallel computing
text mining
Description: The article presents implementations of text algorithms on selected parallel processing platforms. The availability of multicore processors and general-purpose graphics cards makes research on parallel implementations of algorithms, aimed at accelerating them, increasingly important. Text algorithms are an extremely important and often indispensable element of advanced text analysis algorithms, and they are also components of the pattern-search functions of many programming languages. The paper analyzes the most popular text algorithms with respect to their parallelization for implementation on a multicore processor and a general-purpose graphics card. The algorithms analyzed are Boyer-Moore, the naive algorithm, and Knuth-Morris-Pratt. The efficiency of their implementations on these hardware platforms is then compared.
This paper presents implementations of text algorithms on a multicore CPU and a GPGPU. Text algorithms are commonly used in text analysis, and they are part of the functions used for text pattern recognition. Library functions for text searching implemented in many languages very often use the most popular text algorithms. The paper analyzes these algorithms with respect to parallel implementation on multicore processors and general-purpose graphics cards. The research presented in this paper shows that text algorithms can be partially parallelized. Acceleration can be achieved by appropriately dividing the input text between parallel threads (data parallelism). Comparative studies were performed for the following algorithms: Boyer-Moore (Horspool), naive, and Knuth-Morris-Pratt. The presented results show the efficiency of these algorithms for different types and sizes of patterns. The GPU implementation was made in the CUDA framework, and OpenMP was used for the multicore version.
Source: Pomiary Automatyka Kontrola; 2014, R. 60, nr 5, 5; 301-304
0032-4140
Appears in: Pomiary Automatyka Kontrola
Content provider: Biblioteka Nauki

Article
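
Illustrative sketch (not taken from the paper): the data-parallel idea mentioned in the abstract above, dividing the input text between threads, can be outlined in Python. The paper used CUDA and OpenMP; this sketch only shows the decomposition, and the function names, the chunking scheme, and the worker count are assumptions. Each worker runs a Boyer-Moore-Horspool search over its own chunk and is allowed to read up to len(pattern) - 1 characters past the chunk end, so matches that straddle a boundary are reported exactly once.

```python
from concurrent.futures import ThreadPoolExecutor


def horspool_chunk(text, pattern, start, end):
    """Return every match position p with start <= p < end."""
    m = len(pattern)
    # Bad-character shifts: distance from the last occurrence of each
    # character (excluding the final pattern position) to the pattern end.
    shift = {ch: m - 1 - k for k, ch in enumerate(pattern[:-1])}
    # Let matches that begin in this chunk run into the next one.
    limit = min(end + m - 1, len(text))
    hits, pos = [], start
    while pos + m <= limit:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        pos += shift.get(text[pos + m - 1], m)
    return hits


def parallel_search(text, pattern, workers=4):
    """Split the text into chunks and search each chunk in its own thread."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    chunk = max(1, -(-len(text) // workers))  # ceiling division
    bounds = [(s, min(s + chunk, len(text)))
              for s in range(0, len(text), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda b: horspool_chunk(text, pattern, *b), bounds)
        return [p for part in parts for p in part]
```

In standard CPython the global interpreter lock keeps these threads from running the search loops truly in parallel, so the sketch demonstrates how the work is partitioned rather than any speedup.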


Title: Loop profiling tool for HPC code inspection as an efficient method of FPGA based acceleration
Authors: Pietroń, M.
Russek, P.
Wiatr, K.
Links: https://bibliotekanauki.pl/articles/929582.pdf
Publication date: 2010
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: HPC
HPRC
high-performance computing
reconfigurable computing
data processing
loop profiling
Mitrion-C
DFG (data flow graph)
Description: This paper presents research on FPGA-based acceleration of HPC applications. The most important goal is to extract the code that can be sped up. A major drawback is the lack of a tool that could do this. HPC applications usually consist of a huge amount of complex source code, which is one of the reasons why the acceleration process should be as automated as possible. Another reason is to make use of HLLs (high-level languages) such as Mitrion-C (Mohl, 2006). HLLs were invented to make the development of HPRC applications faster. Loop profiling is one of the steps in checking whether inserting HLL code into existing HPC source code can accelerate these applications. Hence, the most important step toward acceleration is to extract the most time-consuming code and its data dependencies, which makes the code easier to pipeline and parallelize. Data dependency also gives information on how to implement algorithms in an FPGA circuit with minimal initialization of the circuit during execution.
Source: International Journal of Applied Mathematics and Computer Science; 2010, 20, 3; 581-589
1641-876X
2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Article


Title: The Java profiler based on byte code analysis and instrumentation for many-core hardware accelerators
Authors: Pietroń, M.
Karwatowski, M.
Wiatr, K.
Links: https://bibliotekanauki.pl/articles/114614.pdf
Publication date: 2015
Publisher: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects: virtual machine
CUDA
GPU
profiling
parallel computing
Description: One of the most challenging issues with many-core and multi-core architectures is how to exploit their potential computing power in legacy systems without a deep knowledge of their architecture. Analysis of the static and dynamic data dependences of a program run can help identify independent paths that could be computed by individual parallel threads. Statistics on data reuse and data size are also crucial when adapting an application to the many-core GPU hardware architecture because of its specific memory hierarchy. The proposed profiling system performs static data analysis and computes dynamic dependencies for Java programs, and it recommends the parts of the source code with the highest potential for parallelization on a GPU. Such an analysis can also provide a starting point for automatic parallelization.
Source: Measurement Automation Monitoring; 2015, 61, 7; 385-387
2450-2855
Appears in: Measurement Automation Monitoring
Content provider: Biblioteka Nauki

Article


Title: Accelerating SELECT WHERE and SELECT JOIN queries on a GPU
Authors: Pietroń, M.
Russek, P.
Wiatr, K.
Links: https://bibliotekanauki.pl/articles/305797.pdf
Publication date: 2013
Publisher: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects: SQL
CUDA
relational databases
GPU
Description: This paper presents implementations of a few selected SQL operations using the CUDA programming framework on the GPU platform. The parallel architecture of modern GPUs gives a high speed-up on certain problems, so the number of non-graphical problems that can be run and sped up on the GPU keeps increasing. In particular, there has been a lot of research on data mining on GPUs, which in many cases demonstrates the advantage of offloading processing from the CPU to the GPU. At the beginning of our project, we chose the SELECT WHERE and SELECT JOIN instructions as the most common operations used in databases. We parallelized these SQL operations using three main CUDA mechanisms: the thread group hierarchy, shared memory, and barrier synchronization. Our results show that the highly parallel SELECT WHERE and SELECT JOIN operations implemented on the GPU platform can be significantly faster than the sequential ones in a database system run on the CPU.
Source: Computer Science; 2013, 14 (2); 243-252
1508-2806
2300-7036
Appears in: Computer Science
Content provider: Biblioteka Nauki

Article


Title: A study of parallel techniques for dimensionality reduction and its impact on the quality of text processing algorithms
Authors: Pietroń, M.
Wielgosz, M.
Karwatowski, M.
Wiatr, K.
Links: https://bibliotekanauki.pl/articles/114190.pdf
Publication date: 2015
Publisher: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects: singular value decomposition
vector space model
TFIDF
Description: The presented algorithms employ the Vector Space Model (VSM) and its enhancements, such as TFIDF (Term Frequency Inverse Document Frequency) with Singular Value Decomposition (SVD). TFIDF was applied to emphasize the important features of documents, and SVD was used to reduce the analysis space. A series of experiments was then conducted. They revealed important properties of the algorithms and their accuracy. The accuracy of the algorithms was estimated in terms of their ability to match the human classification of the subject. For unsupervised algorithms, entropy was used as the quality evaluation measure. The combination of VSM, TFIDF, and SVD turned out to be the best-performing unsupervised algorithm, with an entropy of 0.16.
Source: Measurement Automation Monitoring; 2015, 61, 7; 352-353
2450-2855
Appears in: Measurement Automation Monitoring
Content provider: Biblioteka Nauki

Article


Title: FPGA implementation of procedures for video quality assessment
Authors: Wielgosz, M.
Karwatowski, M.
Pietron, M.
Wiatr, K.
Links: https://bibliotekanauki.pl/articles/305403.pdf
Publication date: 2018
Publisher: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects: video quality
video metrics
image processing
FPGA
Impulse C
Description: The video resolutions used in a variety of media are constantly rising. While manufacturers struggle to perfect their screens, it is also important to ensure the high quality of the displayed image. Overall quality can be measured using the Mean Opinion Score (MOS). Video quality can be affected by miscellaneous artifacts appearing at every stage of video creation and transmission. In this paper, we present a solution for calculating four distinct video quality metrics that can be applied in a real-time video quality assessment system. Our assessment module is capable of processing 8K resolution in real time at 30 frames per second. Its throughput of 2.19 GB/s surpasses the performance of pure software solutions. The module was created using a high-level language in order to concentrate on architectural optimization.
Source: Computer Science; 2018, 19 (3); 279-305
1508-2806
2300-7036
Appears in: Computer Science
Content provider: Biblioteka Nauki

Article


Title: Real time 8K video quality assessment using FPGA
Authors: Wielgosz, M.
Pietroń, M.
Karwatowski, M.
Wiatr, K.
Links: https://bibliotekanauki.pl/articles/114403.pdf
Publication date: 2016
Publisher: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects: video quality
video metrics
image processing
FPGA
Description: This paper presents a hardware architecture of a video quality assessment module. Two different metrics were implemented on an FPGA using Impulse C, a modern high-level language for digital system design. The FPGA resource consumption of the presented module is low, which enables module-level parallelization. Tests conducted for four modules working concurrently show that a throughput of 1.96 GB/s can be achieved. The module is capable of processing an 8K video stream in real time, i.e. at 30 frames per second. This high performance was achieved thanks to a series of architectural optimizations introduced to the module, such as reduced data precision and the reuse of various module components.
Source: Measurement Automation Monitoring; 2016, 62, 6; 187-189
2450-2855
Appears in: Measurement Automation Monitoring
Content provider: Biblioteka Nauki

Article


Title: The comparison of parallel sorting algorithms implemented on different hardware platforms
Authors: Żurek, D.
Pietroń, M.
Wielgosz, M.
Wiatr, K.
Links: https://bibliotekanauki.pl/articles/305317.pdf
Publication date: 2013
Publisher: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects: parallel algorithms
GPU
OpenMP
CUDA
sorting networks
merge-sort
Description: Sorting is a common problem in computer science. There are many well-known sorting algorithms created for sequential execution on a single processor. Recently, many-core and multi-core platforms have enabled the creation of highly parallel algorithms. Standard processors now consist of multiple cores and are accompanied by hardware accelerators such as the GPU. Graphics cards, with their parallel architecture, provide new opportunities to speed up many algorithms. In this paper, we describe the results of implementing a few different parallel sorting algorithms on GPU cards and multi-core processors. A hybrid algorithm is then presented, consisting of parts executed on both platforms (a standard CPU and a GPU). In the recent literature on implementing sorting algorithms on the GPU, a fair comparison between many-core and multi-core platforms is lacking. In most cases, the literature reports the execution times of sorting algorithms on the GPU platform and on a single CPU core.
Source: Computer Science; 2013, 14 (4); 679-691
1508-2806
2300-7036
Appears in: Computer Science
Content provider: Biblioteka Nauki

Article


Title: Równoległa implementacja algorytmu winnowing dla operacji strumieniowej analizy tekstu
Parallel Winnowing Implementation for text stream analysis
Authors: Wielgosz, M.
Żurek, D.
Pietroń, M.
Dąbrowska-Boruch, A.
Wiatr, K.
Links: https://bibliotekanauki.pl/articles/154404.pdf
Publication date: 2014
Publisher: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects: n-gram-based model
data mining
stream processing
GPGPU
document comparison
GPU
information retrieval
Description: This work analyzes the possibility of using the winnowing algorithm for stream processing of textual information. Particular emphasis is placed on the generation of a fingerprint as a reduced representation of a text message. The authors carried out a series of experiments to determine the efficiency of the algorithm and the achievable computational speedup, using a node with Intel Xeon E5645 2.40 GHz processors and an Nvidia Tesla M2090 GPU card.
Several models are available for information retrieval and text analysis, but two are considered dominant: the Boolean model and the vector space model (VSM). A model maps the existing words or text into a new representation space. This paper presents winnowing, a Boolean n-gram-based algorithm for fast text search and document comparison, with the main focus on its implementation and performance analysis. The algorithm is used to generate fingerprints (i.e. a set of hashes) of the analyzed documents. A dedicated test framework, which uses the PAN test corpus and programming environment, was designed and implemented to evaluate the algorithm. Several tests were conducted to determine the comparison quality of the winnowing algorithm for obfuscated and non-obfuscated text and for different window and n-gram sizes. The tests revealed interesting properties of the algorithms with respect to document comparison and defined the limits of their applicability. Because of their simplicity, n-gram-based algorithms are well suited for hardware implementation. The authors therefore implemented the computationally demanding part of fingerprint generation on both the CPU and the GPU. Performance measurements of the n-gram-based algorithm on an Intel Xeon E5645 2.40 GHz and an Nvidia Tesla M2090 show an approximately 14x computational speedup.
Source: Pomiary Automatyka Kontrola; 2014, R. 60, nr 5, 5; 309-312
0032-4140
Appears in: Pomiary Automatyka Kontrola
Content provider: Biblioteka Nauki

Article
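
Illustrative sketch (not taken from the paper): fingerprint generation as described in the abstract above, where a document is reduced to a set of hashes chosen from its n-grams, can be shown with textbook winnowing in Python. The hash function (CRC32), the default n-gram and window sizes, the function names, and the Jaccard-style similarity measure are all assumptions made for illustration; the abstract does not state which ones the authors used.

```python
import zlib


def winnow(text, n=5, window=4):
    """Return a set of (hash, position) fingerprints for the text.

    Every n-gram is hashed, and from each window of consecutive hashes
    the minimum is kept (the rightmost one if there is a tie).
    """
    hashes = [zlib.crc32(text[i:i + n].encode("utf-8"))
              for i in range(len(text) - n + 1)]
    if not hashes:
        return set()
    fingerprints = set()
    # If the text has fewer hashes than one window, use a single short window.
    for start in range(max(1, len(hashes) - window + 1)):
        win = hashes[start:start + window]
        smallest = min(win)
        offset = len(win) - 1 - win[::-1].index(smallest)
        fingerprints.add((smallest, start + offset))
    return fingerprints


def similarity(a, b, n=5, window=4):
    """Jaccard similarity of the fingerprint hash sets of two texts."""
    ha = {h for h, _ in winnow(a, n, window)}
    hb = {h for h, _ in winnow(b, n, window)}
    union = ha | hb
    return len(ha & hb) / len(union) if union else 0.0
```

Because each window contributes at least one hash, any shared substring of at least window + n - 1 characters yields a shared fingerprint, which is what makes the reduced representation usable for document comparison.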


Title: Assessment of various GPU acceleration strategies in text categorization processing flow
Authors: Korduła, Ł.
Wielgosz, M.
Karwatowski, M.
Pietroń, M.
Żurek, D.
Wiatr, K.
Links: https://bibliotekanauki.pl/articles/114132.pdf
Publication date: 2017
Publisher: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects: GPU
NLP
text categorization
OpenCL
Description: Automatic text categorization presents many difficulties. Modern algorithms are getting better at extracting meaningful information from human language. However, they often significantly increase the complexity of computations. This increased demand for computational capabilities can be met by using hardware accelerators such as general-purpose graphics cards. In this paper we present a full processing flow for a document categorization system. Gram-Schmidt process signatures calculation up to 12 fold decrease in computing time of system components.
Source: Measurement Automation Monitoring; 2017, 63, 6; 203-205
2450-2855
Appears in: Measurement Automation Monitoring
Content provider: Biblioteka Nauki

Article


Information

Searching for the phrase "Pietroń, M." by criterion: Author
