Temat: danych - Katalog OPAC zbiorów

Skocz do pozycji: 1.

Tytuł:: Optymalizacja kompresji Huffmana pod kątem podziału na bloki
Optimization of Huffman compression employing different block sizes
Autorzy:: Rybak, K.
Jamro, E.
Wielgosz, M.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/154957.pdf
Data publikacji:: 2014
Wydawca:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Tematy:: kompresja danych
kodowanie Huffmana
deflate
data compression
Huffman coding
Opis:: Prezentowane w pracy badania dotyczą bezstratnej kompresji danych opartej o metodę Huffmana i zgodnej ze standardem deflate stosowanym w plikach .zip / .gz. Zaproponowana jest optymalizacja kodera Huffmana polegająca na podziale na bloki, w których stosuje się różne książki kodowe. Wprowadzenie dodatkowego bloku z reguły poprawia stopień kompresji kosztem narzutu spowodowanego koniecznością przesłania dodatkowej książki kodowej. Dlatego w artykule zaproponowano nowy algorytm podziału na bloki.
According to deflate [2] standard (used e.g. in .zip / .gz files), an input file can be divided into different blocks, which are compressed employing different Huffman [1] codewords. Usually the smaller the block size, the better the compression ratio. Nevertheless each block requires additional header (codewords) overhead. Consequently, introduction of a new block is a compromise between pure data compression ratio and headers size. This paper introduces a novel algorithm for block Huffman compression, which compares sub-block data statistics (histograms) based on current sub-block entropy E(x) (1) and entropy-based estimated average word bitlength Emod(x) for which codewords are obtained for the previous sub-block (2). When Emod(x) - E(x) > T (T - a threshold), then a new block is inserted. Otherwise, the current sub-block is merged into the previous block. The typical header size is 50 B, therefore theoretical threshold T for different sub-block sizes S is as in (3) and is given in Tab. 2. Nevertheless, the results presented in Tab. 1 indicate that optimal T should be slightly different - smaller for small sub-block size S and larger for big S. The deflate standard was selected due to its optimal compression size to compression speed ratio [3]. This standard was selected for hardware implementation in FPGA [4, 5, 6, 7].
Źródło:: Pomiary Automatyka Kontrola; 2014, R. 60, nr 7, 7; 519-521
0032-4140
Pojawia się w:: Pomiary Automatyka Kontrola
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Skocz do pozycji: 2.

Tytuł:: Implementacja w układach FPGA dekompresji danych zgodnie ze standardem Deflate
FPGA Implementation of Deflate standard data decompression
Autorzy:: Jamro, E.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/156208.pdf
Data publikacji:: 2013
Wydawca:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Tematy:: kompresja danych
FPGA
kodowanie Huffmana
data compression
Deflate
Huffman
Opis:: Otwarty standard kompresji danych, Deflate, jest szeroko stosowanym standardem w plikach .gz / .zip i stanowi kombinację kompresji metodą LZ77 / LZSS oraz kodowania Huffmana. Niniejszy artykuł opisuje implementację w układach FPGA dekompresji danych według tego standardu. Niniejszy moduł jest w stanie dokonać dekompresji co najmniej 1B na takt zegara, co przy zegarze 100MHz daje 100MB/s. Aby zwiększyć szybkość, możliwa jest praca wielu równoległych modułów dla różnych strumieni danych wejściowych.
This paper describes FPGA implementation of the Deflate standard decoder. Deflate [1] is a commonly used compression standard employed e.g. in zip and gz files. It is based on dictionary compression (LZ77 / LZSS) [4] and Huffman coding [5]. The proposed Huffman decoded is similar to [9], nevertheless several improvements are proposed. Instead of employing barrel shifter a different translation function is proposed (see Tab. 1). This is a very important modification as the barrel shifter is a part of the time-critical feedback loop (see Fig. 1). Besides, the Deflate standard specifies extra bits, which causes that a single input word might be up to 15+13=28 bits wide, but this width is very rare. Consequently, as the input buffer might not feed the decoder width such wide input date, a conditional decoding is proposed, for which the validity of the input data is checked after decoding the input symbol, thus when the actual input symbol bit widths is known. The implementation results (Tab. 2) show that the occupied hardware resources are mostly defined by the number of BRAM modules, which are mostly required by the 32kB dictionary memory. For example, comparable logic (LUT / FF) resources to the Deflate standard decoder are required by the AXI DMA module which transfers data to / from the decoder.
Źródło:: Pomiary Automatyka Kontrola; 2013, R. 59, nr 8, 8; 739-741
0032-4140
Pojawia się w:: Pomiary Automatyka Kontrola
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Skocz do pozycji: 3.

Tytuł:: Realizacja kompresji danych metodą Huffmana z ograniczeniem długości słów kodowych
Implementation of Huffman compression with limited codeword length
Autorzy:: Rybak, K.
Jamro, E.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/156575.pdf
Data publikacji:: 2012
Wydawca:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Tematy:: kompresja danych
FPGA
kodowanie Huffmana
data compression
Huffman coding
Opis:: Praca opisuje zmodyfikowany sposób budowania książki kodowej kodu Huffmana. Książka kodowa została zoptymalizowana pod kątem implementacji sprzętowej kodera i dekodera Huffmana w układach programowalnych FPGA. Opisano dynamiczną metodę kodowania - książka kodowa może się zmieniać w zależności od zmiennego formatu kompresowanych danych, ponadto musi być przesłana z kodera do dekodera. Sprzętowa implementacja kodeka Huffmana wymusza ograniczenie maksymalnej długości słowa, w przyjętym założeniu do 12 bitów, co pociąga za sobą konieczność modyfikacji algorytmu budowy drzewa Huffmana.
This paper presents a modified algorithm for constructing Huffman codeword book. Huffman coder, decoder and histogram calculations are implemented in FPGA similarly like in [2, 3]. In order to reduce the hardware resources the maximum codeword is limited to 12 bit. It reduces insignificantly the compression ratio [2, 3]. The key problem solved in this paper is how to reduce the maximum codeword length while constructing the Huffman tree [1]. A standard solution is to use a prefix coding, like in the JPEG standard. In this paper alternative solutions are presented: modification of the histogram or modification of the Huffman tree. Modification of the histogram is based on incrementing (disrupting) the histogram values for an input codeword for which the codeword length is greater than 12 bit and then constructing the Huffman tree from the very beginning. Unfortunately, this algorithm is not deterministic, i.e. it is not known how much the histogram should be disrupted in order to obtain the maximum codeword length limited by 12 bit. Therefore several iterations might be required. Another solution is to modify the Huffman tree (see Fig. 2). This algorithm is more complicated (when designing), but its execution time is more deterministic. Implementation results (see Tab. 1) show that modifi-cation of the Huffman tree results in a slightly better compression ratio.
Źródło:: Pomiary Automatyka Kontrola; 2012, R. 58, nr 7, 7; 662-664
0032-4140
Pojawia się w:: Pomiary Automatyka Kontrola
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Skocz do pozycji: 4.

Tytuł:: Optymalizacja sprzętowej architektury kompresji danych metodą słownikową
FPGA implementation of Deflate standard data decompression
Autorzy:: Gwiazdoń, M.
Jamro, E.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/152652.pdf
Data publikacji:: 2013
Wydawca:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Tematy:: kompresja danych
FPGA
kodowanie Huffmana
data compression
Deflate
LZ77
Opis:: Niniejszy artykuł opisuje nową architekturę sprzętową kompresji słownikowej, np. LZ77, LZSS czy też Deflate. Zaproponowana architektura oparta jest na funkcji haszującej. Poprzednie publikacje były oparte na sekwencyjnym odczycie adresu wskazywanego przez pamięć hasz, niniejszy artykuł opisuje układ, w którym możliwe jest równoległe odczytywanie tego adresu z wielu pamięci hasz, w konsekwencji możliwa jest kompresja słownikowa z szybkością na poziomie 1B ciągu wejściowego na takt zegara. Duża szybkość kompresji jest okupiona nieznacznym spadkiem stopnia kompresji.
This paper describes a novel parallel architecture for hardware (ASIC or FPGA) implementation of dictionary compressor, e.g. LZ77 [1], LZSS [2] or Deflate [4]. The proposed architecture allows for very fast compression – 1B of input data per clock cycle. A standard compression architecture [8, 9] is based on sequential hash address reading (see Fig. 2) and requires M clock cycles per 1B of input data, where M is the number of candidates for string matching, i.e. hashes look ups (M varies for different input data). In this paper every hash address is looked up in parallel (see Fig. 3). The drawback of the presented method is that the number of M is defined (limited), therefore the compression ratio is slightly degraded (see Fig. 4). To improve compression ratio, a different sting length may be searched independently, i.e. not only 3B, but also 4B, … N B hashes (see results in Fig. 5, 6). Every hash memory (M(N-2)) usually requires a direct look-up in the dictionary to eliminate hash false positive cases or to check whether a larger length sting was found. In order to reduce the number of dictionary reads, an additional pre-elimination algorithm is proposed, thus the number of dictionary reads does not increase rapidly with growing N (see Fig. 7).
Źródło:: Pomiary Automatyka Kontrola; 2013, R. 59, nr 8, 8; 827-829
0032-4140
Pojawia się w:: Pomiary Automatyka Kontrola
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Skocz do pozycji: 5.

Tytuł:: Statistics in cyphertext detection
Statystyka w wykrywaniu informacji szyfrowanej
Autorzy:: Gancarczyk, G.
Dąbrowska-Boruch, A.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/159309.pdf
Data publikacji:: 2011
Wydawca:: Sieć Badawcza Łukasiewicz - Instytut Elektrotechniki
Tematy:: szyfr
kryptografia
kryptoanaliza
szyfrogram
szyfrowanie
analizator danych
dystrybucja danych
szum biały
statystyka
cipher
cryptography
cryptanalysis
ciphertext
encryption
data analyzer
data distribution
white noise
statistics
Opis:: Mostly when word encrypted occurs in an article text, another word decryption comes along. However not always knowledge about the plaintext is the most significant one. An example could be a network data analysis where only information, that cipher data were sent from one user to another or what was the amount of all cipher data in the observed path, is needed. Also before data may be even tried being decrypted, they must be somehow distinguished from non-encrypted messages. In this paper it will be shown, that using only simple Digital Data Processing, encrypted information can be detected with high probability. That knowledge can be very helpful in preventing cyberattacks, ensuring safety and detecting security breaches in local networks, or even fighting against software piracy in the Internet. Similar solutions are successfully used in steganalysis and network anomaly detections.
Nowoczesna kryptografia wykorzystuje wyszukane i skomplikowane obliczeniowo przekształcenia matematyczno-logiczne w celu ukrycia ważnej informacji jawnej przez osobami niepowołanymi. Przeważająca większość z nich nadal odwołuje się do postawionego w roku 1949 przez Claude'a E. Shannona postulatu, że idealnie utajniona informacja charakteryzuje się tym, że żaden z pojawiających się w niej symboli nie jest bardziej prawdopodobny niż inne spośród używanego alfabetu znaków. Zgodnie z tą definicją dane idealnie zaszyfrowane w swej naturze przypominają dane losowe o rozkładzie równomiernym, czyli przypomina swoim rozkładem szum biały. Koncepcja detektora opiera się o algorytm analizujący podawane na wejściu dane pod względem ich podobieństwa do szumu białego. Wielkości odniesienia są bardzo dobrze znane, a ich ewentualne wyprowadzenie nie przysparza żadnych trudności. Wyznaczając w sposób doświadczalny granice tolerancji dla każdego z parametrów uzyskuje się w pełni działający algorytm, dokonujący w sposób zero-jedynkowy klasyfikacji na jawny/tajny. W grupie przedstawionych 14 Parametrów Statystycznych pojawiają się takie jak: energia, wartość średnia czy też momenty centralne. Na ich podstawie można stworzyć klasyfikator pierwszego poziomu. Efektywność poprawnego rozróżnienia danych przez klasyfikator pierwszego rzędu waha się w granicach od 80% do 90% (w zależności od użytej w algorytmie wielkości). W celu zwiększenia wykrywalności danych proponuje się, a następnie przedstawia, klasyfikator drugiego rzędu, bazujący na dwóch lub więcej, wzajemnie nieskorelowanych Parametrach Statystycznych. Rozwiązanie takie powoduje wzrost sprawności do około 95%. Zaproponowany w artykule algorytm może być wykorzystany na potrzeby kryptoanalizy, statystycznej analizy danych, analizy danych sieciowych. W artykule przedstawiona jest także koncepcja klasyfikatora trzeciego rzędu, wykorzystującego dodatkowo informacje o charakterze innym niż statystyczny, na potrzeby prawidłowej detekcji danych zaszyfrowanych.
Źródło:: Prace Instytutu Elektrotechniki; 2011, 251; 67-85
0032-6216
Pojawia się w:: Prace Instytutu Elektrotechniki
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Skocz do pozycji: 6.

Tytuł:: Loop profiling tool for HPC code inspection as an efficient method of FPGA based acceleration
Autorzy:: Pietroń, M.
Russek, P.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/929582.pdf
Data publikacji:: 2010
Wydawca:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Tematy:: HPC
HPRC
obliczanie wysokowartościowe
obliczanie rekonfigurowalne
przetwarzanie danych
loop profiling
Mitrion-C
DFG (data flow graph)
Opis:: This paper presents research on FPGA based acceleration of HPC applications. The most important goal is to extract a code that can be sped up. A major drawback is the lack of a tool which could do it. HPC applications usually consist of a huge amount of a complex source code. This is one of the reasons why the process of acceleration should be as automated as possible. Another reason is to make use of HLLs (High Level Languages) such as Mitrion-C (Mohl, 2006). HLLs were invented to make the development of HPRC applications faster. Loop profiling is one of the steps to check if the insertion of an HLL to an existing HPC source code is possible to gain acceleration of these applications. Hence the most important step to achieve acceleration is to extract the most time consuming code and data dependency, which makes the code easier to be pipelined and parallelized. Data dependency also gives information on how to implement algorithms in an FPGA circuit with minimal initialization of it during the execution of algorithms.
Źródło:: International Journal of Applied Mathematics and Computer Science; 2010, 20, 3; 581-589
1641-876X
2083-8492
Pojawia się w:: International Journal of Applied Mathematics and Computer Science
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Skocz do pozycji: 7.

Tytuł:: Efektywność parametrów statystycznych w detekcji informacji szyfrowanej
Effectiveness of statistic parameters in cipher data detection
Autorzy:: Gancarczyk, G.
Dąbrowska-Boruch, A.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/158109.pdf
Data publikacji:: 2010
Wydawca:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Tematy:: szyfrowanie
parametry statystyczne
analiza danych
rozkład statystyczny
cipher
cryptography
cryptanalysis
statistic parameters
data analysis
probability distribution
noise
Opis:: Informacja szyfrowana, podobnie jak wszystkie inne typy danych, może zostać poddana analizie statystycznej. Wyznaczenie dla niej parametrów takich jak wartość średnia, wariancja czy też entropia nie nastręcza większych trudności. Wykorzystać do tego można nowoczesne narzędzia numeryczne jak np. MATLAB, Mathcad czy też Microsoft Exel. Pytanie, na które ma dać odpowiedź niniejsze opracowanie brzmi - "czy parametry te niosą ze sobą wiedzę, którą można wykorzystać w użyteczny sposób?" Przykładowym zastosowaniem może być np. określenie czy informacja jest zaszyfrowana (ang. cipher text), czy też jest ona jawna (ang. plain text).
A cipher text, like any other data, can be analysed with use of parameters typical for statistics. Values such as the mean value, variance or entropy are easy to be calculated, especially if one can use numerical tools like e.g. MATLAB, Mathcad or simply Microsoft Exel. The question, to which this paper should give an answer is - "do those parameters provide any information that could be used in any useful way?" For example, the information, whether the analysed data is a cipher or plain text. The available publications about distinguishing the cipher from plain text use only methods typical for testing the randomness of cipher text and random number generator or immunity for cipher breaking. They are presented in the paper by the National Institute of Standards and Technology [1]. The other common method, used for distinguishing the data, is the analysis based on entropy [2]. Lack of published results about the efficiency of methods based on e.g. entropy, is additional motivation for this paper. (see Paragraph 1.) The proposed algorithms use parameters and transformations typical for Statistic and Signal Processing to classify the analysed data as cipher/plain. The authors assume that cipher data are very similar to random numbers due to Shannon's Perfect Secrecy theorem [3]. Six types of plain and cipher data (text, music, image, video, archives and others), seven types of cipher cores (3DES, AES, Blowfish, CAST - 128, RC4, Serpent, Twofish) and various length (1 B to 2323 B) data were examined and group of the so called Statistic Parameters was formed (see Table 1). Definitions of all of them (and a few more) are given by equations (1) to (12). The efficiency of Statistic Parameters after 1417 test samples is shown in Table 2. The most interesting results are also shown in Figs. 1 to 9. (see Paragraphs 2 - 4.) The results show that using simple values like e.g. energy one can built a data distinguisher of the efficiency equal to 90% and low numerical complexity. The lower bound for usability of this method was found to be 200 B. The upper bound was not found. The presented algorithm can be used for creating a network data analyser or cipher text detector. (see Paragraph 5.)
Źródło:: Pomiary Automatyka Kontrola; 2010, R. 56, nr 10, 10; 1137-1143
0032-4140
Pojawia się w:: Pomiary Automatyka Kontrola
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Skocz do pozycji: 8.

Tytuł:: Równoległa implementacja algorytmu winnowing dla operacji strumieniowej analizy tekstu
Parallel Winnowing Implementation for text stream analysis
Autorzy:: Wielgosz, M.
Żurek, D.
Pietroń, M.
Dąbrowska-Boruch, A.
Wiatr, K.
Powiązania:: https://bibliotekanauki.pl/articles/154404.pdf
Data publikacji:: 2014
Wydawca:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Tematy:: n-gramowy model
eksploracja danych
przetwarzanie strumieniowe
GPGPU
n-gram-based model
document comparison
GPU
information retrieval
Opis:: W ramach praca przeprowadzona została analiza możliwości wykorzystania algorytmu winnowing do strumieniowego przetwarzania informacji tekstowej. W szczególności nacisk został położony na operacje generacji odcisku jako jej zredukowanej reprezentacji wiadomości tekstowej. Autorzy przeprowadzili szereg eksperymentów, w celu określenia efektywności działania algorytmu oraz możliwego do uzyskania przyspieszenia obliczeń, z wykorzy-staniem węzła procesorów Intel Xeon E5645 2.40GHz oraz karty GPU Nvidia Tesla m2090.
There are several models available for information retrieval and text analysis but the two are considered to be the dominant ones, namely Boolean and the vector space model (VSM). A model maps the existing words or text into a new representation space. This paper presents a boolean n-gram-based algorithm - winnowing for fast text search and comparison of documents with main focus on its implementation and performance analysis. The algorithm is used to generate fingerprints (i.e. a set of hashes) of the analyzed documents. A dedicated test framework was designed and implemented to handle the task of the algorithm evaluation which utilizes PAN test corpus and programming environment. Several tests were conducted in order to determine the comparison quality of the obfuscated and not obfuscated text for the winnowing algorithm and different window and n-gram size. The tests revealed interesting properties of the algorithms with respect to comparison of documents as well as defied the limits of their applicability. The n-gram-based algorithms due to their simplicity are well suited for hardware implementation. Thus, the authors implemented compu-tationally demanding part of both fingerprint generation both on CPU and GPU. Performance measurements for Intel Xeon E5645, 2.40GHz and Nvidia Tesla m2090 implementation of Ngram-based algorithm show approximately 14x computational speedup.
Źródło:: Pomiary Automatyka Kontrola; 2014, R. 60, nr 5, 5; 309-312
0032-4140
Pojawia się w:: Pomiary Automatyka Kontrola
Dostawca treści:: Biblioteka Nauki

Artykuł

Zmień widok

na półce

Informacja

Wyszukujesz frazę "danych" wg kryterium: Temat

Źródło danych

Dostawca treści

Kolekcja

Rok wydania

Wydawca

Temat

Autor

Typ dokumentu

Język