Author: Wiatr, K. - OPAC collection catalog


Title:: Elementary functions in HLL on example of CORDIC algorithm implemented in Mitrion-C language
Implementacja funkcji elementarnych w FPGA na przykładzie algorytmu CORDIC w języku wysokiego poziomu Mitrion-C
Authors:: Pietroń, M.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/156559.pdf
Publication date:: 2012
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: HLL
FPGA
HPRC
elementary functions
Description:: Elementary functions are very often used in scientific computations. Quantum chemistry, physics and financial computing are only some examples where elementary functions such as the exponential and the logarithm are computed intensively. This paper presents an implementation of an exp(x) core based on the CORDIC algorithm, written in the Mitrion-C language. Mitrion-C is a new high-level language. It enables implementing pipelined and widely parallel algorithms on FPGA platforms, and it makes implementing algorithms on FPGAs faster. From gravitational forces to quantum chemistry or financial mathematics, computational scientists very often use exp(x) in computer simulations. The implemented core generates IEEE 754 single-precision exponential values. The CORDIC algorithm can be used to compute a wide spectrum of elementary functions such as sine, cosine and tangent. In our solution, the values of the exponential for the integer part of the input argument are stored in a table allocated in internal memory. The fractional part is computed by the CORDIC algorithm. The final result is obtained by multiplying the values for the fractional and integer parts. Our implementation targets the SGI Altix 4700 hardware platform, an SGI multiprocessor distributed shared memory computer system with Virtex-4 LX 200 FPGAs.
Elementary functions are very often used in scientific computations. Quantum chemistry, financial mathematics and physics are among the many fields where functions such as the exponential and the logarithm are computed intensively. This work presents an implementation of the exponential function using the CORDIC algorithm in the Mitrion-C language. Mitrion-C is a new high-level language for programming FPGAs. The language has suitable instructions and built-in data types that allow algorithms to be programmed in a pipelined as well as a fully parallel manner. In our solution the input argument is split into an integer part and a fractional part. The values of the exponential for the integer part are stored in a table in internal memory, whereas the value for the fractional part is computed by the CORDIC algorithm. The final result is computed by multiplying the fractional and integer parts. The implementation runs on the SGI ALTIX 4700 hardware platform, a multiprocessor platform with shared memory and Virtex-4 LX 200 FPGAs.
Source:: Pomiary Automatyka Kontrola; 2012, Vol. 58, No. 7, 7; 671-673
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article
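
The table-plus-CORDIC scheme described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' Mitrion-C code: the table range, the iteration count `n`, the use of floating-point arithmetic instead of fixed-point shifts, and all function names are assumptions made for illustration.

```python
import math

# Table of e^k for integer parts; stands in for the on-chip table (range is an assumption).
EXP_INT_TABLE = {k: math.exp(k) for k in range(-20, 21)}


def cordic_schedule(n):
    """Iteration indices 1..n; hyperbolic CORDIC repeats 4, 13 and 40 to converge."""
    schedule = []
    for i in range(1, n + 1):
        schedule.append(i)
        if i in (4, 13, 40):
            schedule.append(i)
    return schedule


def cordic_exp_fraction(f, n=24):
    """Approximate e^f for 0 <= f < 1 with hyperbolic CORDIC rotations."""
    schedule = cordic_schedule(n)
    gain = 1.0
    for i in schedule:
        gain *= math.sqrt(1.0 - 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, f  # pre-scaling by 1/gain cancels the CORDIC gain
    for i in schedule:
        d = 1.0 if z >= 0.0 else -1.0
        t = 2.0 ** -i  # a right shift by i bits in hardware
        x, y = x + d * y * t, y + d * x * t
        z -= d * math.atanh(t)  # atanh(2^-i) would come from a small constant table
    return x + y  # cosh(f) + sinh(f) = e^f


def cordic_exp(value):
    """e^value = table[integer part] * CORDIC(fractional part)."""
    k = math.floor(value)
    return EXP_INT_TABLE[k] * cordic_exp_fraction(value - k)
```

Splitting the argument keeps the CORDIC input inside the convergence range of the hyperbolic iteration (roughly |z| < 1.118), which is why only the fractional part is passed to it.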


Title:: The versatile hardware accelerator framework for sparse vector calculations
Authors:: Karwatowski, R.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/114705.pdf
Publication date:: 2015
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: FPGA
sparse vectors
cosine similarity
Zynq
hardware accelerator
Description:: In this paper, we exploit the ability of FPGAs to perform various computationally complex calculations using deep pipelining and parallelism. We propose an architecture that consists of many small stream processing blocks. The designed framework maintains proper data movement and synchronization. The architecture can be easily adapted to FPGA devices of various sizes and costs, from small SoC devices to high-end PCIe accelerator cards. It can perform a selected operation on sparse data loaded as a stream of vectors. As an example application, we have implemented the cosine similarity measure for text similarity calculations using the TF-IDF weighting scheme. The example application calculates the similarity of texts from a set of input documents to documents from a large database. The scheme is used to find the most similar documents. The proposed design can decrease the service time of search queries in computer centers while reducing power consumption.
Source:: Measurement Automation Monitoring; 2015, 61, 7; 327-329
2450-2855
Appears in:: Measurement Automation Monitoring
Content provider:: Biblioteka Nauki

Article
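
As a software illustration of the example application named in the abstract above, the following sketch computes TF-IDF weights and the cosine similarity of sparse vectors stored as dictionaries. The whitespace tokenization, the plain `tf * log(N/df)` weighting variant and the function names are assumptions; the paper's hardware data format is not described in this record.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc.split()))
    vectors = []
    for doc in docs:
        tf = Counter(doc.split())
        # Terms present in every document get weight 0 and are dropped to stay sparse.
        vectors.append({t: c * math.log(n / df[t]) for t, c in tf.items() if df[t] < n})
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Only terms present in both vectors contribute to the dot product, which is what makes a streaming, sparse formulation attractive.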


Title:: Implementacja elementów bezstratnej kompresji standardu MPEG-2 w układach FPGA
Implementation of lossless compression elements of MPEG-2 standard in FPGA
Authors:: Dąbrowska, A.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/152616.pdf
Publication date:: 2007
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: Zig-Zag
RLC
VLC
FPGA
Description:: Entropy coding is one of the main elements of moving and still image compression. A typical coder consists of a pixel ordering block, a run-length coding block and a variable-length coding block. The article presents the parameters of the implemented lossless compression methods: the entropy encoder/decoder and its component parts.
Entropy coding is one of the basic elements of video and still image compression. A typical entropy coder consists of a pixel ordering block, a run-length coding block and a variable-length coding block. The paper presents the parameters of the implemented lossless compression methods: the encoder/decoder and their component parts.
Source:: Pomiary Automatyka Kontrola; 2007, Vol. 53, No. 5, 5; 36-38
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article
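
The first two stages of the entropy coder named in the abstract above (zig-zag pixel ordering and run-length coding) can be sketched as follows. This is an illustrative sketch only: the 8x8 block size, the `(zero_run, value)` pair format and the `"EOB"` marker are assumptions, and the variable-length coding stage is omitted.

```python
def zigzag_order(n=8):
    """Coordinates of an n x n block in zig-zag scan order."""
    def key(rc):
        r, c = rc
        s = r + c
        # Odd anti-diagonals run top-right to bottom-left, even ones the other way.
        return (s, r if s % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block):
    """Flatten a square block into a list following the zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def run_length_encode(coefficients):
    """Encode a sequence as (zero_run, value) pairs followed by an end-of-block marker."""
    pairs, run = [], 0
    for value in coefficients:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the marker
    return pairs
```

The zig-zag order groups the low-frequency coefficients first, so the long tail of zeros collapses into the single end-of-block marker.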


Title:: Implementacja kodeka MPEG-2 w układach FPGA
Implementation of MPEG-2 codec in FPGA chips
Authors:: Dąbrowska, A.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/155721.pdf
Publication date:: 2007
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: MPEG-2
FPGA
codec
Description:: The compression method used in the MPEG-2 standard is a combination of other standards, namely JPEG and H.261. Because the video signal is in this case a sequence of still images, compression techniques similar to those of the JPEG standard can be applied. The article presents the results of implementing a video signal processing path compliant with the ISO/IEC 13818 standard specification in the Xilinx XC2VP100(-6)FF1704 device.
The compression method applied in the MPEG-2 standard is a combination of different standards, namely JPEG and H.261. Because the video signal is a sequence of still pictures, compression techniques similar to those of the JPEG standard can be used. The paper presents implementation results for a video signal processing path compatible with the ISO/IEC 13818 standard specification in the Xilinx XC2VP100(-6)FF1704 chip.
Source:: Pomiary Automatyka Kontrola; 2007, Vol. 53, No. 7, 7; 42-44
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article


Title:: Potokowa realizacja operacji pomnóż i dodaj dla argumentów zmiennoprzecinkowych podwójnej precyzji
Pipeline implementation of multiply and accumulate double precision floating point operation
Authors:: Russek, P.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/155725.pdf
Publication date:: 2007
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: high-complexity computations
dedicated architectures
FPGA
supercomputing
custom computing machines
Description:: The multiply-and-add operation is the foundation of numerical computation in contemporary science and technology. The ability to execute this operation quickly is of fundamental importance for the efficiency of a computing system. Alongside accelerating computations by executing them in parallel, pipelined processing is also important and widely applied. It increases the throughput of computing modules at the cost of longer latency. In the case of the multiply-and-add operator, applying pipelining runs into certain problems because of the feedback loops in the data path. The paper presents a method of pipelined implementation of the multiply-and-add operation and the results of its FPGA implementation for double-precision floating-point arguments.
The multiply-and-accumulate operation is a foundation of contemporary numerical computation in science and technology. The ability to execute it quickly is crucial for the performance of a computing system. In computing acceleration, pipelining plays an important role alongside parallel processing as a way to increase system throughput. In the case of the multiply-and-accumulate (MAC) operation, a problem arises from the feedback loop that the MAC architecture requires. In this paper a double-precision MAC pipeline architecture is proposed and FPGA implementation results are presented.
Source:: Pomiary Automatyka Kontrola; 2007, Vol. 53, No. 7, 7; 36-38
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article


Title:: Implementacja w układach FPGA dekompresji danych zgodnie ze standardem Deflate
FPGA Implementation of Deflate standard data decompression
Authors:: Jamro, E.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/156208.pdf
Publication date:: 2013
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: FPGA
data compression
Deflate
Huffman
Description:: The open data compression standard Deflate is widely used in .gz / .zip files and combines LZ77 / LZSS compression with Huffman coding. This article describes an FPGA implementation of data decompression according to this standard. The module can decompress at least 1 B per clock cycle, which at a 100 MHz clock gives 100 MB/s. To increase speed, several modules can operate in parallel on different input data streams.
This paper describes an FPGA implementation of a Deflate standard decoder. Deflate [1] is a commonly used compression standard employed e.g. in zip and gz files. It is based on dictionary compression (LZ77 / LZSS) [4] and Huffman coding [5]. The proposed Huffman decoder is similar to [9]; nevertheless, several improvements are proposed. Instead of employing a barrel shifter, a different translation function is proposed (see Tab. 1). This is a very important modification, as the barrel shifter is part of the time-critical feedback loop (see Fig. 1). In addition, the Deflate standard specifies extra bits, so a single input word may be up to 15+13=28 bits wide, although this width is very rare. Consequently, as the input buffer might not feed the decoder with such wide input data, conditional decoding is proposed, in which the validity of the input data is checked after decoding the input symbol, that is, once the actual bit width of the input symbol is known. The implementation results (Tab. 2) show that the occupied hardware resources are mostly determined by the number of BRAM modules, which are mostly required by the 32 kB dictionary memory. For example, the AXI DMA module that transfers data to and from the decoder requires logic (LUT / FF) resources comparable to those of the Deflate decoder itself.
Source:: Pomiary Automatyka Kontrola; 2013, Vol. 59, No. 8, 8; 739-741
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article
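
The dictionary (LZ77) half of Deflate decoding and the throughput figure quoted in the abstract above can be illustrated with a short sketch. It assumes the Huffman stage has already produced a token stream; the token format (an `int` literal or a `(length, distance)` pair) and the function names are hypothetical and not taken from the paper.

```python
WINDOW_BYTES = 32 * 1024  # dictionary size used by Deflate


def lz77_decode(tokens):
    """Expand literals and (length, distance) back-references into bytes."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, int):
            out.append(token)  # literal byte
            continue
        length, distance = token
        if not 1 <= distance <= min(len(out), WINDOW_BYTES):
            raise ValueError("back-reference points outside the dictionary")
        for _ in range(length):
            # One byte per step; source and destination may overlap.
            out.append(out[-distance])
    return bytes(out)


def throughput_bytes_per_second(bytes_per_clock, clock_hz):
    """Sustained output rate of a decoder producing a fixed number of bytes per cycle."""
    return bytes_per_clock * clock_hz
```

With one byte per clock at 100 MHz, `throughput_bytes_per_second(1, 100_000_000)` gives 100,000,000 B/s, matching the 100 MB/s stated in the abstract.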


Title:: Realizacja kompresji danych metodą Huffmana z ograniczeniem długości słów kodowych
Implementation of Huffman compression with limited codeword length
Authors:: Rybak, K.
Jamro, E.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/156575.pdf
Publication date:: 2012
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: FPGA
data compression
Huffman coding
Description:: The paper describes a modified way of building the Huffman code book. The code book has been optimized for hardware implementation of the Huffman encoder and decoder in FPGA programmable devices. A dynamic coding method is described: the code book can change depending on the varying format of the compressed data and, moreover, must be transmitted from the encoder to the decoder. Hardware implementation of the Huffman codec forces a limit on the maximum codeword length, assumed here to be 12 bits, which entails modifying the Huffman tree construction algorithm.
This paper presents a modified algorithm for constructing the Huffman codeword book. The Huffman coder, decoder and histogram calculations are implemented in an FPGA similarly to [2, 3]. In order to reduce the hardware resources, the maximum codeword length is limited to 12 bits. This reduces the compression ratio only insignificantly [2, 3]. The key problem solved in this paper is how to reduce the maximum codeword length while constructing the Huffman tree [1]. A standard solution is to use prefix coding, as in the JPEG standard. In this paper alternative solutions are presented: modification of the histogram or modification of the Huffman tree. Modification of the histogram is based on incrementing (disrupting) the histogram values for each input codeword whose codeword length is greater than 12 bits and then constructing the Huffman tree from the very beginning. Unfortunately, this algorithm is not deterministic, i.e. it is not known how much the histogram should be disrupted in order to limit the maximum codeword length to 12 bits. Therefore several iterations might be required. Another solution is to modify the Huffman tree (see Fig. 2). This algorithm is more complicated to design, but its execution time is more deterministic. Implementation results (see Tab. 1) show that modification of the Huffman tree results in a slightly better compression ratio.
Source:: Pomiary Automatyka Kontrola; 2012, Vol. 58, No. 7, 7; 662-664
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article
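
The histogram-modification approach outlined in the abstract above (disrupt the counts of symbols whose codewords are too long, then rebuild the tree from scratch) can be sketched as below. The abstract does not say how much the histogram is disrupted per iteration; doubling the offending counts and the `max_rounds` guard are assumptions of this sketch, as are the function names.

```python
import heapq
from itertools import count


def huffman_lengths(histogram):
    """Codeword length per symbol for a standard Huffman tree."""
    tie = count()  # unique tie-breaker so symbol lists are never compared
    heap = [(weight, next(tie), [sym]) for sym, weight in histogram.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(histogram, 0)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:
            lengths[sym] += 1  # every merge pushes these symbols one level deeper
        heapq.heappush(heap, (w1 + w2, next(tie), syms1 + syms2))
    return lengths


def limit_by_histogram(histogram, max_len, max_rounds=64):
    """Rebuild the tree with disrupted counts until no codeword exceeds max_len."""
    hist = dict(histogram)
    for rounds in range(max_rounds + 1):
        lengths = huffman_lengths(hist)
        too_long = [sym for sym, n in lengths.items() if n > max_len]
        if not too_long:
            return lengths, rounds
        for sym in too_long:
            hist[sym] = 2 * max(hist[sym], 1)  # assumed disruption step
    raise RuntimeError("length limit not reached within max_rounds")
```

The number of rebuilds is data dependent, which reflects the non-determinism the abstract points out for this variant.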


Title:: Realizacja operacji mnożenia o skróconej szerokości w układach FPGA
FPGA implementation of reduce-width multiplier
Authors:: Jamro, E.
Wielgosz, M.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/154019.pdf
Publication date:: 2009
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: multiplier
FPGA
fixed-width multiplier
Description:: Full multiplication of two n-bit arguments gives a result 2×n bits wide. In most cases reduced-width multiplication is used, where e.g. the additional n least significant bits of the result are discarded. This article presents a new method of compensating the computation error for reduced-width multiplication, particularly efficient when FPGAs are used. The basis of the proposed architecture is feeding selected bits of the multiplier's input arguments to the carry input, which has so far been unused.
The paper presents a novel method of error compensation for a reduced-width multiplier implemented in FPGAs. For a standard multiplier with a bit width of n for both inputs, the output width is 2×n. In order to obtain a fixed-width multiplier, the n LSBs of the output are truncated. Lan-Da Van et al. [1, 2] presented an error compensation method appropriate for ASICs; however, this method cannot be directly employed in FPGAs because of its relatively high hardware resource requirements and a different multiplier structure (compare Fig. 1 and Fig. 2). The main idea of the proposed error compensation method is to feed the carry input directly with selected bits of the multiplier input (see Fig. 4). The implementation results shown in Fig. 5 confirm a significant reduction of the truncation error, especially of the mean error, which is close to zero. It should be noted that the error compensation circuit employs the normally unused carry-in input, so the proposed method requires no additional FPGA resources.
Source:: Pomiary Automatyka Kontrola; 2009, Vol. 55, No. 8, 8; 669-671
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article


Title:: Akceleracja obliczeń zmiennoprzecinkowych na platformie RASC
Accelerating calculations on the RASC platform
Authors:: Wielgosz, M.
Jamro, E.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/154331.pdf
Publication date:: 2009
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: hardware acceleration
high-performance computing (HPC)
FPGA
floating-point computations
exp() function
HPRC (High Performance Reconfigurable Computing)
elementary functions
exponential function
Description:: The article presents the results of tests carried out to determine the maximum speed of floating-point operations on the RASC reconfigurable platform. The various available configuration modes of the Host and RASC units were implemented in order to identify the operating mode of the computing unit that is most effective in terms of performance. The measurement results revealed that the combination of Direct I/O and DMA provides the highest throughput between the Host and RASC nodes. Nevertheless, for some applications the multi-buffering mode may prove more suitable, because data can be transferred and operations performed at the same time. The exp() function in the double-precision floating-point standard was used as an example application, which made it possible to estimate the computational acceleration achievable on the RASC platform.
This paper presents the results of tests performed to determine the high-speed calculation capabilities of the SGI RASC platform. Different data transfer modes and memory management approaches were examined to choose the most effective combination of Host and RASC memory settings. This work may be regarded as a case study of a contemporary FPGA-based accelerator which nevertheless characterizes the whole class of such devices. The paper focuses strongly on the floating-point calculation potential of the FPGA accelerator. The RASC algorithm execution procedure, from the processor perspective, is composed of several functions which reserve resources, queue commands and perform other preparation steps. It is noteworthy (Fig. 3) that the time consumed by these functions remains roughly the same, independent of the algorithm being executed. The resource reservation procedure, once conducted, allows many executions of the algorithm. That amounts to huge time savings, since the procedure takes approximately 7.5 ms, which is roughly 99% of the overall execution time of the algorithm. The rasclib algorithm commit and rasclib algorithm wait calls are considered to be the key part (Fig. 3) of the RASC software execution routine. The first one activates the FPGA; the time between these two commands is the transfer and algorithm execution time. All curves (Fig. 4) reflect the overall processing time of the same amount of data, but differ in the size of a single data chunk, which varies from 1024×64 bit = 8 kB to 1048576×64 bit = 8 MB. It has been observed that bigger chunks give much better results in terms of the effective execution time. However, above 1 MB the decrease of the effective execution time seems to indicate saturation, so sending data in bigger portions may not improve the performance of the system much further. The best effective execution time of a single exp() function in SRAM buffering mode is 12 ns, so 9.5 ns is transport overhead due to bus delays. The theoretical calculation time of a single exp() function (data transfer not taken into account) is 2.5 ns, because two exp() units are implemented on the RASC and clocked at 200 MHz. The measurement results show that Direct I/O mode together with DMA transfer provides the highest data throughput between the Host and the RASC slice. Nevertheless, for some applications multi-buffering may be more suitable, since it allows data transfer and FPGA algorithm execution to run concurrently. An exponential function is used as a hardware acceleration example, which allows the maximum achievable data processing speed to be estimated.
Source:: Pomiary Automatyka Kontrola; 2009, Vol. 55, No. 7, 7; 485-487
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article
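
The timing figures quoted in the abstract above follow from simple arithmetic. The worked form below assumes that each of the two exp() units delivers one result per clock cycle, which the abstract implies but does not state explicitly.

```latex
\begin{align*}
t_{\mathrm{clk}} &= \frac{1}{200\ \mathrm{MHz}} = 5\ \mathrm{ns}, \\
t_{\exp,\ \mathrm{theoretical}} &= \frac{t_{\mathrm{clk}}}{2} = 2.5\ \mathrm{ns}, \\
t_{\mathrm{overhead}} &= 12\ \mathrm{ns} - 2.5\ \mathrm{ns} = 9.5\ \mathrm{ns}, \\
1024 \times 64\ \mathrm{bit} &= 8192\ \mathrm{B} = 8\ \mathrm{kB}.
\end{align*}
```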


Title:: Moduł obliczający funkcję eksponenty implementowanej w układach FPGA
FPGA Implementation of Exponent Function
Authors:: Wielgosz, M.
Jamro, E.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/155683.pdf
Publication date:: 2007
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: HPC
exp()
FPGA
Description:: This article presents an FPGA implementation of the double-precision exponential function. A table-and-approximation method is proposed, which uses 3 independent 512x64-bit tables to compute the 27 most significant bits of the mantissa and the polynomial approximation e^x ≈ 1+x for the remaining mantissa bits. Implementation results show that the proposed module occupies about 7.5% of a Virtex-4 LX200 device.
This paper presents an FPGA implementation of the exponential function in double-precision format. A mixture of Look-Up Table (LUT) and approximation methods is employed. The twenty-seven most significant bits of the input mantissa are processed by 3 independent LUTs; the remaining input bits are processed by the approximation e^x ≈ 1+x. The implementation occupies roughly 7.5% of a Virtex-4 LX200.
Source:: Pomiary Automatyka Kontrola; 2007, Vol. 53, No. 7, 7; 27-29
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article
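
The table-and-approximation idea in the abstract above can be illustrated in software. The sketch below assumes the argument has already been reduced to 0 <= x < 1 and that the three tables are indexed by consecutive 9-bit fields of x; the range reduction, the exact bit layout and the names are assumptions, not details taken from the paper.

```python
import math

FIELD_BITS = 9  # 2**9 = 512 entries per table
# TABLES[k][i] = exp(i * 2^-(9*(k+1))): one table per 9-bit field of the argument.
TABLES = [
    [math.exp(i * 2.0 ** (-FIELD_BITS * (k + 1))) for i in range(1 << FIELD_BITS)]
    for k in range(3)
]


def exp_lut(x):
    """e^x for 0 <= x < 1 from three table lookups and the approximation e^r ~ 1 + r."""
    if not 0.0 <= x < 1.0:
        raise ValueError("this sketch only handles 0 <= x < 1")
    scaled = x * 2 ** 27
    top = int(scaled)  # the 27 most significant fractional bits
    r = (scaled - top) * 2.0 ** -27  # remainder, smaller than 2^-27
    i1 = top >> 18
    i2 = (top >> 9) & 0x1FF
    i3 = top & 0x1FF
    # e^(a+b+c+r) = e^a * e^b * e^c * e^r, with e^r replaced by 1 + r.
    return TABLES[0][i1] * TABLES[1][i2] * TABLES[2][i3] * (1.0 + r)
```

Because the remainder is below 2^-27, the neglected term r^2/2 is below 2^-55, which is why a first-order approximation is enough at double precision.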


Title:: Implementacja oraz porównanie algorytmów tekstowych w środowiskach przetwarzania równoległego na przykładzie procesorów wielordzeniowych i kart graficznych
Multicore and GPGPU implementation of chosen text algorithms
Authors:: Pietroń, M.
Wielgosz, M.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/155953.pdf
Publication date:: 2014
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: GPGPU
text mining
text algorithms
parallel computing
Description:: The article presents an implementation of text algorithms on selected parallel processing platforms. The availability of multicore processors and general-purpose graphics cards makes research on parallel implementation of algorithms for their acceleration increasingly important. Text algorithms are an extremely important and often indispensable element of advanced text analysis algorithms, and are also components of the pattern-searching functions of many programming languages. The paper analyses the most popular text algorithms with regard to their parallelization for implementation on a multicore processor and a general-purpose graphics card. The analysed algorithms are Boyer-Moore, the naive algorithm and Knuth-Morris-Pratt. The efficiency of their implementation on these hardware platforms is then compared.
This paper presents an implementation of text algorithms on multicore CPUs and GPGPUs. Text algorithms are very commonly used in text analysis and are part of the functions used for text pattern recognition. The library functions for text searching implemented in many languages very often use the most popular text algorithms. The paper analyses these algorithms with a view to parallel implementation on multicore processors and general-purpose graphics cards. The research presented in this paper shows that text algorithms can be partially parallelized. Acceleration can be achieved by appropriately dividing the input text between parallel threads (data parallelism). Comparative studies were performed for the following algorithms: Boyer-Moore (Horspool), naive and Knuth-Morris-Pratt. The presented results show the efficiency of these algorithms for different types and sizes of patterns. The GPU implementation was made in the CUDA framework. The OpenMP library was used for the multicore version.
Source:: Pomiary Automatyka Kontrola; 2014, Vol. 60, No. 5, 5; 301-304
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article
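
The data-parallel partitioning mentioned in the abstract above (dividing the input text between threads) can be sketched with a Boyer-Moore-Horspool search. The paper used CUDA and OpenMP; this Python version only illustrates how chunks must overlap by `len(pattern) - 1` characters so that matches straddling a boundary are not lost, and Python threads will not give a real speedup here. Function names and the chunking policy are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def horspool(text, pattern, start=0, end=None):
    """Positions in [start, end) where pattern begins, using the Horspool shift rule."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    m = len(pattern)
    end = len(text) if end is None else end
    # Shift for each character: distance from its last occurrence to the pattern end.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    hits, pos = [], start
    while pos + m <= end:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        pos += shift.get(text[pos + m - 1], m)
    return hits


def parallel_search(text, pattern, workers=4):
    """Split the text into overlapping chunks and search each chunk in its own task."""
    m = len(pattern)
    chunk = max(m, -(-len(text) // workers))  # ceiling division
    # Each chunk owns the match starts in [s, s + chunk) and reads m - 1 extra characters.
    ranges = [(s, min(len(text), s + chunk + m - 1)) for s in range(0, len(text), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda r: horspool(text, pattern, r[0], r[1]), ranges)
        return sorted(p for part in parts for p in part)
```

Because every chunk reports only matches that start inside its own range, no match is reported twice despite the overlap.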


Title:: Realizacja w układach FPGA mnożenia Montgomery dla akceleracji operacji kryptograficznych
Implementation of Montgomery multiplication for cryptographic algorithm acceleration in FPGA
Authors:: Janiszewski, M.
Russek, P.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/156268.pdf
Publication date:: 2008
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: FPGA
RSA
Montgomery multiplication
modular multiplication
Description:: This work addresses the implementation of a hardware module that can effectively speed up software implementations of cryptographic operations. The algorithm considered is RSA asymmetric encryption. The module was implemented in a Xilinx Virtex 4 LX200 device. The speed of the module was compared with the most popular software solutions. The results show that solutions based on reconfigurable devices can compete with implementations based on general-purpose processors (GPP).
Modular exponentiation is a key operation of the RSA cryptographic algorithm. There are many algorithms for computing modular exponentiation (equation 1). The most basic are the right-to-left and left-to-right binary algorithms. For a key length of k=1024 bits, 1024 modular squarings and, on average, 512 modular multiplications must be performed. There are many optimizations that minimize the number of multiplications; however, they are better suited to software implementations. Therefore the key factor for faster modular exponentiation is a fast multiplier module. This work presents an example implementation of a modular multiplier using the Montgomery multiplication algorithm [1]. Montgomery multiplication is the most efficient algorithm when a large number of multiplications must be performed with respect to the same modulus n. Our results show that timings comparable with modern processors can be achieved (table 2). This work also presents optimizations of the proposed module, which allow greater speedup and application of FPGA bas
Source:: Pomiary Automatyka Kontrola; 2008, Vol. 54, No. 8, 8; 550-552
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article
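
The combination described in the abstract above, left-to-right binary exponentiation built on Montgomery multiplication, can be sketched in a few lines. This is a textbook formulation under stated assumptions (odd modulus, R = 2^k with k the bit length of n, Python 3.8+ for the modular inverse via `pow`); it is not the hardware architecture of the paper, and the function names are illustrative.

```python
def montgomery_setup(n):
    """Return (k, n_prime) with R = 2^k > n and n * n_prime = -1 (mod R)."""
    if n % 2 == 0:
        raise ValueError("Montgomery multiplication needs an odd modulus")
    k = n.bit_length()
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r
    return k, n_prime


def mont_mul(a, b, n, k, n_prime):
    """Return a * b * R^-1 mod n for a, b < n, using only shifts and masks for R."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> k  # exact division: the low k bits are zero by construction
    return u - n if u >= n else u


def mod_exp(base, exponent, n):
    """Left-to-right binary exponentiation carried out in the Montgomery domain."""
    k, n_prime = montgomery_setup(n)
    r = 1 << k
    x = (base % n) * r % n  # base in Montgomery form
    acc = r % n  # 1 in Montgomery form
    for bit in bin(exponent)[2:]:
        acc = mont_mul(acc, acc, n, k, n_prime)  # square for every exponent bit
        if bit == "1":
            acc = mont_mul(acc, x, n, k, n_prime)  # multiply for every set bit
    return mont_mul(acc, 1, n, k, n_prime)  # convert back from Montgomery form
```

The loop performs one squaring per exponent bit and one multiplication per set bit, which is where the figure of 1024 squarings and about 512 multiplications for a 1024-bit exponent comes from.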


Title:: Oscilloscope based on small-size FPGA with VGA display
Authors:: Rzeszut, P.
Jamro, E.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/114072.pdf
Publication date:: 2018
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: FPGA
oscilloscope
Description:: A simple single-chip, FPGA-based oscilloscope was designed both to supply users with a low-budget oscilloscope and to teach the operation of such devices. The device implements the basic functions of a real oscilloscope, providing clear insight into the processes of signal acquisition (employing the FPGA's built-in analog-to-digital converter with an aggregate sampling rate of 1 MS/s), processing and display of the acquired signals. Some effort was also made to fit the design into the limited resources of the selected FPGA device. The project is open source [1].
Source:: Measurement Automation Monitoring; 2018, 64, 1; 2-4
2450-2855
Appears in:: Measurement Automation Monitoring
Content provider:: Biblioteka Nauki

Article


Title:: Implementacja szybkiej transformacji Fouriera o parametryzowanym rozmiarze w układach FPGA
Implementation of fast Fourier transform of configurable size in FPGA circuits
Authors:: Rzepka, D.
Jamro, E.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/151898.pdf
Publication date:: 2009
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: FFT
FPGA
CORDIC
Description:: The article presents an example implementation of the fast Fourier transform in FPGAs. The rotation of a complex number by a given angle, performed during FFT computation, is carried out by a CORDIC module. Rounding errors are analysed for the CORDIC algorithm and for complex multiplication, both used for rotating complex vectors. The main motivation for this implementation was sharing BRAM memory resources between different tasks (not only the FFT) within a complete system built in the Xilinx EDK package.
The paper presents a hardware implementation of the Fast Fourier Transform (FFT) in FPGAs. The FFT module is based on CORDIC [4], so there is no need to store sine coefficients. The main idea behind designing this FFT module was to share FPGA internal memory resources between different modules, e.g. FFT and the Procedure of Linear Decimation [8]. This is a very important issue, as the FFT is one of many computation tasks performed by the embedded system [8], and internal memory resources are critical. In addition, for a large FFT size (2^16), external memory must be used. Therefore special control logic and address counters were designed to allow internal and external memory transfers. The proposed FFT module calculates one butterfly operation per clock cycle (assuming internal memory transfers), so it is not speed optimized; nevertheless, it is still much quicker than a MicroBlaze-only implementation and it satisfies the system requirements. The paper also presents a computation error analysis.
Source:: Pomiary Automatyka Kontrola; 2009, Vol. 55, No. 8, 8; 600-602
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article
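
The reason a CORDIC-based FFT needs no stored sine coefficients, as the abstract above notes, is that multiplying a sample by a twiddle factor is a plane rotation, which circular CORDIC performs with shifts and adds. The sketch below is a floating-point illustration under assumptions: the iteration count, the restriction to angles within about ±π/2 (larger angles need a quadrant pre-rotation) and the function name are not taken from the paper.

```python
import math


def cordic_rotate(x, y, angle, n=24):
    """Rotate the vector (x, y) by `angle` radians using n CORDIC micro-rotations."""
    if abs(angle) > math.pi / 2:
        raise ValueError("this sketch expects |angle| <= pi/2")
    gain = 1.0
    for i in range(n):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = angle
    for i in range(n):
        d = 1.0 if z >= 0.0 else -1.0
        t = 2.0 ** -i  # a right shift by i bits in hardware
        x, y = x - d * y * t, y + d * x * t
        z -= d * math.atan(t)  # atan(2^-i) would come from a small constant table
    return x / gain, y / gain  # remove the accumulated CORDIC gain
```

For a twiddle factor exp(-2πik/N), the real and imaginary parts of a sample would be passed as `x` and `y` with `angle = -2 * math.pi * k / N`.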


Title:: Implementacja standardu szyfrowania AES w układzie FPGA dla potrzeb sprzętowej akceleracji obliczeń
The AES ciper standard implementation on FPGA for hardware accelerated computing
Authors:: Gielata, A.
Russek, P.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/152602.pdf
Publication date:: 2007
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: Rijndael
AES
FPGA
hardware implementation
Description:: The subject of the article is the implementation of the AES-128 data encryption standard in reprogrammable FPGA devices. In systems that require high-speed encryption of information, software implementations prove too slow. Hardware acceleration of the computations is therefore necessary, and the capabilities of reprogrammable FPGA devices make them an ideal solution for this purpose. The basic version of the algorithm specified in the AES standard was chosen for implementation in VHDL. To achieve the maximum encryption speed, a pipelined architecture of the module was used.
In this paper we investigate a hardware implementation of the AES-128 cipher standard in FPGA technology. In many network applications, software implementations of cryptographic algorithms are slow and inefficient. To solve this problem, a custom architecture in reconfigurable hardware was used to improve the performance and flexibility of the Rijndael algorithm implementation. We aimed at achieving the maximum speed and efficiency of the cipher process, so a pipelined architecture of the AES module was proposed. The investigation involved simulation and synthesis of VHDL code targeting the Xilinx Virtex-4 series.
Source:: Pomiary Automatyka Kontrola; 2007, Vol. 53, No. 5, 5; 48-50
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article


Information

Search phrase: "Wiatr, K."; criterion: Author
