Autor: Russek, K. - Katalog OPAC zbiorów

Skocz do pozycji: 1.

Tytuł:: Loop profiling tool for HPC code inspection as an efficient method of FPGA based acceleration
Autorzy:: Pietroń, M.
Russek, P.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/929582.pdf
Data publikacji:: 2010
Wydawca:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Tematy:: HPC
HPRC
obliczanie wysokowartościowe
obliczanie rekonfigurowalne
przetwarzanie danych
loop profiling
Mitrion-C
DFG (data flow graph)
Opis:: This paper presents research on FPGA based acceleration of HPC applications. The most important goal is to extract a code that can be sped up. A major drawback is the lack of a tool which could do it. HPC applications usually consist of a huge amount of a complex source code. This is one of the reasons why the process of acceleration should be as automated as possible. Another reason is to make use of HLLs (High Level Languages) such as Mitrion-C (Mohl, 2006). HLLs were invented to make the development of HPRC applications faster. Loop profiling is one of the steps to check if the insertion of an HLL to an existing HPC source code is possible to gain acceleration of these applications. Hence the most important step to achieve acceleration is to extract the most time consuming code and data dependency, which makes the code easier to be pipelined and parallelized. Data dependency also gives information on how to implement algorithms in an FPGA circuit with minimal initialization of it during the execution of algorithms.
Źródło:: International Journal of Applied Mathematics and Computer Science; 2010, 20, 3; 581-589
1641-876X
2083-8492
Pojawia się w:: International Journal of Applied Mathematics and Computer Science
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Skocz do pozycji: 2.

Tytuł:: Accelerating SELECT WHERE and SELECT JOIN queries on a GPU
Autorzy:: Pietroń, M.
Russek, P.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/305797.pdf
Data publikacji:: 2013
Wydawca:: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Tematy:: SQL
CUDA
relational databases
GPU
Opis:: This paper presents implementations of a few selected SQL operations using the CUDA programming framework on the GPU platform. Nowadays, the GPU’s parallel architectures give a high speed-up on certain problems. Therefore, the number of non-graphical problems that can be run and sped-up on the GPU still increases. Especially, there has been a lot of research in data mining on GPUs. In many cases it proves the advantage of offloading processing from the CPU to the GPU. At the beginning of our project we chose the set of SELECT WHERE and SELECT JOIN instructions as the most common operations used in databases. We parallelized these SQL operations using three main mechanisms in CUDA: thread group hierarchy, shared memories, and barrier synchronization. Our results show that the implemented highly parallel SELECT WHERE and SELECT JOIN operations on the GPU platform can be significantly faster than the sequential one in a database system run on the CPU.
Źródło:: Computer Science; 2013, 14 (2); 243-252
1508-2806
2300-7036
Pojawia się w:: Computer Science
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Skocz do pozycji: 3.

Tytuł:: Using standard hardware accelerators to decrease computation times in scientific applications
Użycie standardowych akceleratorów sprzętowych do skrócenia czasu obliczeń naukowych
Autorzy:: Kuna, D.
Jamro, E.
Russek, P.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/305599.pdf
Data publikacji:: 2009
Wydawca:: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Tematy:: procesory ogólnego przeznaczenia
standardowe akceleratory
akceleratory obliczeń
architektury dedykowane
GPGPU
Cell
ClearSpeed
general-purpose processors
standard accelerators
computation accelerators
dedicated architectures
custom computing
CPGPU
Opis:: Nowadays, general-purpose processors are being used in scientific computing. However, when high computational throughput is needed, it’s worth to think it over if dedicated hardware solutions would be more efficient, either in terms of performance (or performance to price ratio), or in terms of power efficiency, or both. This paper describes them briefly and compares to contemporary general-purpose processors’ architecture.
Współczesnie w obliczeniach naukowych stosuje sie procesory ogólnego przeznaczenia. Gdy potrzebna jest duża przepustowość obliczeniowa, warto zastanowić się, czy dedykowane rozwiązania sprzętowe nie okazałyby się efektywniejsze pod względem wydajności (lub stosunku wydajności do ceny), zużycia energii bądź obu czynników jednocześnie. Artykuł opisuje pobieżnie dedykowane rozwiązania sprzętowe i porównuje ze współczesnymi architekturami procesorów ogólnego przeznaczenia.
Źródło:: Computer Science; 2009, 10; 65-74
1508-2806
2300-7036
Pojawia się w:: Computer Science
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Skocz do pozycji: 4.

Tytuł:: A custom co-processor for the discovery of low autocorrelation binary sequences
Autorzy:: Russek, P.
Karwatowski, M.
Jamro, E.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/114571.pdf
Data publikacji:: 2016
Wydawca:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Tematy:: LABS
SDLS algorithm
custom processors
HLS
FPGA
Opis:: We present a custom processor that was designed to enhance algorithms of finding Low Autocorrelation Binary Sequences (LABS). Finding LABS is very computationally exhaustive, but no custom computing solutions have been reported in the literature so far. A computational kernel which allowed creating an effective single-purpose processor was determined and an appropriate architecture was proposed. The selected elements of the architecture were coded in High-Level Synthesis (HLS) language to speed up the design process. Afterwards, the processor was verified and tested in Xilinx’s Virtex7 FPGA. At the beginning of the paper, we briefly present the finding LABS problem and its importance. Later, we deliver the algorithm, its custom processor structure, and implementation results in terms of the processor performance, size and power.
Źródło:: Measurement Automation Monitoring; 2016, 62, 5; 154-156
2450-2855
Pojawia się w:: Measurement Automation Monitoring
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Skocz do pozycji: 5.

Tytuł:: Computation acceleration on SGI RASC: FPGA based reconfigurable computing hardware
Akceleracja obliczeń na platformie SGI RASC: module obliczeń za pomocą logiki rekonfigurowalnej
Autorzy:: Jamro, E.
Janiszewski, M.
Machaczek, K.
Russek, P.
Wiatr, K.
Wielgosz, M.
Powiązania:: https://bibliotekanauki.pl/articles/305339.pdf
Data publikacji:: 2008
Wydawca:: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Tematy:: sprzętowa akceleracja obliczeń
procesory dedykowane
FPGA
obliczenia wielkiej skali
SGI RASC
custom computing
single-purpose processors
high performance computing
Opis:: In this paper a novel method of computation using FPGA technology is presented. In several cases this method provides a calculations speedup with respcct to the General Purpose Processors (GPP). The main concept of this approach is based on such a design of computing hardware architecture to fit algorithm dataflow and best utilize well known computing techniques as pipelining and parallelism. Configurable hardware is used as a implementation platform for custom designed hardware. Paper will present implementation results of algorithms those are used in such areas as cryptography, data analysis and scientific computation. The other promising areas of new technology utilization will also be mentioned, bioinformatics for instance. Mentioned algorithms were designed, tested and implemented on SGI RASC platform. RASC module is a part of Cyfronet's SGI Altix 4700 SMP system. We will also present RASC modern architecture. In principle it consists of FPGA chips and very fast, 128-bit wide local memory. Design tools avaliable for designers will also be presented.
Autorzy prezentują nową metodę prowadzenia obliczeń wielkiej skali, opartą na układach FPGA. W szczególnych przypadkach jej zastosowanie prowadzi do skrócenia czasu obliczeń. Podstawą metody jest prowadzenie obliczeń za pomocą architektur obliczeniowych projektowanych dla danego algorytmu. Ponieważ architektura stworzona została specjalnie dla zadanego algorytmu, lepiej wykorzystuje możliwości równoległej i potokowej realizacji obliczeń. Jako platformę realizacji architektur dedykowanych zastosowano układy rekonfigurowalne. Artykuł prezentuje także wyniki zastosowania wspomnianej techniki w takich obszarach, jak kryptografia, analiza danych i obliczenia naukowe podwójnej precyzji. Wskazano również na inne dziedziny nauki, gdzie opisywana technika jest z powodzeniem stosowana (np.: bioinformatyka). Zrealizowane algorytmy były uruchomione i przetestowane na zainstalowanym w ACK Cyfronet AGH module SGI RASC, będącym częścią systemu SMP Al-tix 4700. Przedstawiono architekturę zastosowanego modułu RASC oraz narzędzia i metody projektowania dostępne dla programistów.
Źródło:: Computer Science; 2008, 9; 21-34
1508-2806
2300-7036
Pojawia się w:: Computer Science
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Skocz do pozycji: 6.

Tytuł:: Novel architecture for floating point accumulator with cancelation error detection
Autorzy:: Jamro, E.
Dąbrowska-Boruch, A.
Russek, P.
Wielgosz, M.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/201228.pdf
Data publikacji:: 2018
Wydawca:: Polska Akademia Nauk. Czytelnia Czasopism PAN
Tematy:: floating point arithmetic
computing error
approximate computing
arytmetyka zmiennoprzecinkowa
błąd obliczeniowy
obliczenia przybliżone
Opis:: A floating point accumulator cannot be obtained straightforwardly due to its pipeline architecture and feedback loop. Therefore, an essential part of the proposed floating point accumulator is a critical accumulation loop which is limited to an integer adder and 16-bit shifter only. The proposed accumulator detects a catastrophic cancellation which occurs e.g. when two similar numbers are subtracted. Additionally, modules with reduced hardware resources for rough error evaluation are proposed. The proposed architecture does not comply with the IEEE-754 floating point standard but it guarantees that a correct result, with an arbitrarily defined number of significant bits, is obtained. The proposed calculation philosophy focuses on the desired result error rather than on calculation precision as such.
Źródło:: Bulletin of the Polish Academy of Sciences. Technical Sciences; 2018, 66, 5; 579-587
0239-7528
Pojawia się w:: Bulletin of the Polish Academy of Sciences. Technical Sciences
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Informacja

Wyszukujesz frazę "Russek, K." wg kryterium: Autor

Źródło danych

Dostawca treści

Kolekcja

Rok wydania

Wydawca

Temat

Autor

Typ dokumentu

Język