You are searching for the phrase "procesor" by the criterion: Topic


Displaying 1-14 of 14
Title:
Porting of finite element integration algorithm to Xeon Phi coprocessor-based HPC architectures
Authors:
Krużel, Filip
Banaś, Krzysztof
Iacomo, Mauro
Links:
https://bibliotekanauki.pl/articles/38704636.pdf
Publication date:
2023
Publisher:
Instytut Podstawowych Problemów Techniki PAN
Topics:
CPU
optimization
parallelization
vectorization
Intel Xeon Phi
procesor
optymalizacja
równoległość
wektoryzacja
Description:
In the present article, we describe the implementation of the finite element numerical integration algorithm for the Xeon Phi coprocessor. The coprocessor was a many-core specialized unit for calculations, with performance comparable to that of the corresponding GPUs. Its main advantages were the built-in 512-bit vector registers and the ease of porting existing codes from traditional x86 architectures. In the article, we move the code developed for a standard CPU to the coprocessor. We compare its performance with our OpenCL implementation of the numerical integration algorithm, previously developed for GPUs. The GPU code is tuned to fit the coprocessor by our auto-tuning mechanism. Tests included two types of tasks to solve, using two types of approximation and two types of elements. The obtained timing results allow comparing the performance of highly optimized CPU and GPU codes with the Xeon Phi coprocessor performance. This article answers whether such massively parallel architectures perform better using the CPU or GPU programming method. Furthermore, we have compared the Xeon Phi architecture with Intel's i9 13900K CPU, the latest available when writing this article. This comparison determines whether the old Xeon Phi architecture remains competitive in today's computing landscape. Our findings provide valuable insights for selecting the most suitable hardware for numerical computations and the appropriate algorithmic design.
Source:
Computer Assisted Methods in Engineering and Science; 2023, 30, 4; 427-459
2299-3649
Appears in:
Computer Assisted Methods in Engineering and Science
Content provider:
Biblioteka Nauki
Article
Title:
The use of computer modeling in teaching the economic and mathematical disciplines to future economists
Zastosowanie modelowania komputerowego w nauczaniu przedmiotów ekonomicznych i matematycznych na kierunkach ekonomicznych
Authors:
RUMYANTSEVA, Kateryna
POGRISCHUK, Borys
LYSYUK, Olena
Links:
https://bibliotekanauki.pl/articles/456462.pdf
Publication date:
2012
Publisher:
Uniwersytet Rzeszowski
Topics:
computer modeling
tabular processor MS Excel
future economists
modelowanie komputerowe
tabelaryczny procesor MS Excel
ekonomiści
Description:
The article surveys the use of information technologies, in particular the MS Excel spreadsheet processor, in the process of studying economic and mathematical subjects at faculties of economics. The advantages of the program in question over other similar software are analyzed.
The article describes the results of considerations on the use of information technologies, and of MS Excel in particular, in teaching economic and mathematical subjects in economics degree programmes. The advantages and disadvantages of other programs of this kind are the subject of further analysis.
Source:
Edukacja-Technika-Informatyka; 2012, 3, 2; 286-290
2080-9069
Appears in:
Edukacja-Technika-Informatyka
Content provider:
Biblioteka Nauki
Article
Title:
Cooling of a processor with the use of a heat pump
Chłodzenie procesora za pomocą pompy ciepła
Authors:
Lipnicki, Z.
Lechów, H.
Pantoł, K.
Links:
https://bibliotekanauki.pl/articles/396528.pdf
Publication date:
2018
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Topics:
heat pump
cooling
processor
vapour-compression cycle
evaporation
pompa ciepła
chłodzenie
procesor
cykl sprężania pary
odparowywanie
Description:
In this paper, the problem of cooling a component in whose interior heat is generated due to its work is solved analytically. The problem of cooling a processor with the use of a heat pump is solved on the basis of the authors' earlier theoretical analysis of cooling the external surface of the component by the phenomenon of liquid evaporation. Cases of stationary and non-stationary cooling are solved as well. The authors created a simplified non-stationary analytical model describing the phenomenon, from which the heat distribution within the component, the contact temperature between the component and the liquid layer, and the thickness of the evaporating substance layer as functions of time were determined. Numerical calculations were performed and appropriate charts were drawn. The earlier analytical solutions allowed conclusions to be drawn which might be of help to electronics engineers when designing similar cooling systems. Model calculations were performed for a cooling system using a compressor heat pump as an effective method of cooling.
An analytical solution of the cooling equation for a unit in which heat is generated is presented. To this end, a simplified, non-stationary model was developed for determining the temperature distribution in the unit, the contact temperature between the unit and the liquid layer, and the thickness of the evaporating layer as a function of time. A theoretical analysis of the external cooling of the unit is given, taking into account the phenomenon of liquid evaporation, using the Fourier and Poisson equations. Both stationary and non-stationary descriptions of cooling are shown. The obtained simulation results seem useful in the design of similar cooling systems. A model calculation is also performed for cooling systems equipped with a compressor heat pump as an effective method of cooling.
Source:
Civil and Environmental Engineering Reports; 2018, No. 28(1); 16-25
2080-5187
2450-8594
Appears in:
Civil and Environmental Engineering Reports
Content provider:
Biblioteka Nauki
Article
Title:
Reconfigurable General-purpose Processor Idea Overview
Authors:
Zarzycki, I
Links:
https://bibliotekanauki.pl/articles/397875.pdf
Publication date:
2013
Publisher:
Politechnika Łódzka. Wydział Mikroelektroniki i Informatyki
Topics:
procesor rekonfigurowalny
FPGA
bezpośrednio programowalna macierz bramek
rekonfiguracja dynamiczna
processor
dynamic reconfiguration
reconfigurable computing paradigm
Description:
This paper presents the idea of a reconfigurable general-purpose processor implemented as a dynamically reconfigurable FPGA (called a "reconfigurable processor" in the rest of this document). The proposed solution is compared with currently available general-purpose processors that execute instructions sequentially (called "sequential processors" in the rest of this paper). This document presents the idea of such a reconfigurable processor and its operation without going into implementation details and technological limitations. The main novelty of the reconfigurable processor lies in the absence of the sequential instruction execution typical of other processors. All operations (wherever possible) are executed in parallel, in hardware, also at the sub-instruction level. The solution proposed in this paper should give a speed-up and lower power consumption in comparison with other currently available processors. Additionally, the proposed architecture requires neither modifications to the source code of already existing, portable programs nor changes in the development process. All of the changes can be performed by the compiler at the compilation stage.
Source:
International Journal of Microelectronics and Computer Science; 2013, 4, 1; 37-42
2080-8755
2353-9607
Appears in:
International Journal of Microelectronics and Computer Science
Content provider:
Biblioteka Nauki
Article
Title:
Exploiting multi-core and many-core parallelism for subspace clustering
Authors:
Datta, Amitava
Kaur, Amardeep
Lauer, Tobias
Chabbouh, Sami
Links:
https://bibliotekanauki.pl/articles/331126.pdf
Publication date:
2019
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Topics:
data mining
subspace clustering
multicore processor
many-core processor
GPU computing
eksploracja danych
procesor wielordzeniowy
obliczenia GPU
Description:
Finding clusters in high dimensional data is a challenging research problem. Subspace clustering algorithms aim to find clusters in all possible subspaces of the dataset, where a subspace is a subset of dimensions of the data. But the exponential increase in the number of subspaces with the dimensionality of data renders most of the algorithms inefficient as well as ineffective. Moreover, these algorithms have ingrained data dependency in the clustering process, which means that parallelization becomes difficult and inefficient. SUBSCALE is a recent subspace clustering algorithm which is scalable with the dimensions and contains independent processing steps which can be exploited through parallelism. In this paper, we aim to leverage the computational power of widely available multi-core processors to improve the runtime performance of the SUBSCALE algorithm. The experimental evaluation shows linear speedup. Moreover, we develop an approach using graphics processing units (GPUs) for fine-grained data parallelism to accelerate the computation further. First tests of the GPU implementation show very promising results.
Source:
International Journal of Applied Mathematics and Computer Science; 2019, 29, 1; 81-91
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
Title:
Analysis of the Use of Undervolting to Reduce Electricity Consumption and Environmental Impact of Computers
Analiza wykorzystania undervoltingu do redukcji zużycia energii elektrycznej w urządzeniach komputerowych i oddziaływania na środowisko
Authors:
Muc, Adam
Muchowski, Tomasz
Kluczyk, Marcin
Szeleziński, Adam
Links:
https://bibliotekanauki.pl/articles/1811735.pdf
Publication date:
2020
Publisher:
Politechnika Koszalińska. Wydawnictwo Uczelniane
Topics:
undervolting
energy saving
electric energy
power consumption
generated heat
processor
oszczędzanie energii
energia elektryczna
pobór prądu
generowane ciepło
procesor
Description:
This paper presents a method of lowering the processor's voltage and the temperature at which the computer operates by performing an operation called undervolting. By using undervolting it is possible to reduce the electricity consumption and the amount of heat generated by computer workstations by up to 30%. This problem is particularly relevant for institutions that use a large number of computers. The higher the computational load on the computers, the more effective the undervolting mechanism is. Undervolting the processor does not reduce its performance, but lowers its operating temperature and has a positive impact on its life span and power consumption. Maintaining a low operating temperature of computer hardware is essential to reduce operating and repair costs. The paper also presents the results of environmental research aimed at assessing the validity and effectiveness of undervolting.
The paper presents a method of lowering the processor voltage and the computer's operating temperature by performing an operation called undervolting. By applying undervolting, the electricity consumption and the amount of heat emitted by computer workstations can be reduced by up to 30%. This problem is particularly important for institutions that use a large number of computers. The effectiveness of the mechanism grows with the computational load placed on the undervolted computers. Using undervolting in the processor configuration does not reduce its performance, but lowers its operating temperature and has a positive effect on its life span and electricity consumption. Maintaining good working conditions for computer hardware is crucial to reducing operating and repair costs. The paper also presents the results of environmental studies whose aim was to assess the validity and effectiveness of undervolting.
Source:
Rocznik Ochrona Środowiska; 2020, Tom 22, cz. 2; 791-808
1506-218X
Appears in:
Rocznik Ochrona Środowiska
Content provider:
Biblioteka Nauki
Article
Title:
Research and development of an electric traction drive based on a switched reluctance motor
Authors:
Buriakovsky, S.
Maslii, A.
Pasko, O.
Denys, I.
Links:
https://bibliotekanauki.pl/articles/375320.pdf
Publication date:
2018
Publisher:
Politechnika Śląska. Wydawnictwo Politechniki Śląskiej
Topics:
switched reluctance motor
torque
microcontroller
quadrature encoder
electronic switches
silnik reluktancyjny przełączalny
moment obrotowy
mikrokontroler
procesor sygnałowy
przełączniki elektroniczne
Description:
In the most developed countries, intensive studies are being carried out on the use of various types of electric machines, such as synchronous motors with permanent magnets and traction motors with non-traditional magnetic systems, in traction electric drives. The switched reluctance motor (SRM) is one of the simplest, most reliable, and most cost-efficient technologies in manufacture and operation. Its convenient traction performance, combined with its high overload capacity, makes its use promising for both freight and passenger rolling stock. Our research is directed at developing a control system for a four-phase SRM. The procedure of fuzzy-regulator synthesis is presented. A physical model of a switched reluctance drive is created, namely a system of a wheel set and a motor. The efficiency of the control system with different types of speed regulators was checked and their main quality indicators were determined. According to the results of the analysis, it was found that the fuzzy regulator controls the regulated value more precisely.
Source:
Transport Problems; 2018, 13, 2; 69-79
1896-0596
2300-861X
Appears in:
Transport Problems
Content provider:
Biblioteka Nauki
Article
Title:
Graphics processing units in acceleration of bandwidth selection for kernel density estimation
Authors:
Andrzejewski, W.
Gramacki, A.
Gramacki, J.
Links:
https://bibliotekanauki.pl/articles/330819.pdf
Publication date:
2013
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Topics:
bandwidth selection
graphics processing unit
probability density function
nonparametric estimation
kernel estimation
szerokość pasmowa
programowalny procesor graficzny
funkcja gęstości prawdopodobieństwa
estymacja nieparametryczna
estymacja jądrowa
Description:
The Probability Density Function (PDF) is a key concept in statistics. Constructing the most adequate PDF from the observed data is still an important and interesting scientific problem, especially for large datasets. PDFs are often estimated using nonparametric data-driven methods. One of the most popular nonparametric methods is the Kernel Density Estimator (KDE). However, a very serious drawback of using KDEs is the large number of calculations required to compute them, especially to find the optimal bandwidth parameter. In this paper we investigate the possibility of utilizing Graphics Processing Units (GPUs) to accelerate the finding of the bandwidth. The contribution of this paper is threefold: (a) we propose an algorithmic optimization to one of the bandwidth finding algorithms, (b) we propose efficient GPU versions of three bandwidth finding algorithms and (c) we experimentally compare three of our GPU implementations with the ones which utilize only CPUs. Our experiments show orders of magnitude improvements over CPU implementations of classical algorithms.
Source:
International Journal of Applied Mathematics and Computer Science; 2013, 23, 4; 869-885
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
Title:
Implementation of a Web-based remote control system for qZS DAB application using low-cost ARM platform
Authors:
Korzeniewski, M.
Kulikowski, K.
Zakis, J.
Jasiński, M.
Malinowski, A.
Links:
https://bibliotekanauki.pl/articles/201276.pdf
Publication date:
2016
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Topics:
ARM processor
DAB converter controller
network controller
web-based user interface
energy control
procesor ARM
kontroler
konwerter
kontroler sieciowy
interfejs użytkownika oparty na sieci Web
kontrola energii
DAB
Description:
Continuous development of intelligent network applications drives the demand for deployment-ready hardware and software solutions. Such solutions are highly valued not only by distributed producers of energy but by energy consumers as well. The use of intelligent network applications enables the development and improvement of the quality of services. It also increases self-sufficiency and efficiency. This paper describes an example of such a device, which allows for the control of a dual active bridge (DAB) converter and enables its remote control in real time over an IP-based network. The details of both the hardware and software components of the proposed implementation are provided. The DAB converter makes it possible to control and manage the energy between two DC power systems with very different voltage levels. Not only information, but also the quality of energy, the direction of power flow, and energy storage systems can be easily controlled through an IP-based network and power electronics converters. Information technology, together with intelligent control of power electronics technology, provides a flexible solution, especially for sustainable smart grids.
Source:
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2016, 64, 4; 887-896
0239-7528
Appears in:
Bulletin of the Polish Academy of Sciences. Technical Sciences
Content provider:
Biblioteka Nauki
Article
Title:
Effective real-time computer graphics processing based on depth-of-field effect
Wspomaganie sprzętowe do efektywnego przetwarzania grafiki w czasie rzeczywistym na przykładzie efektu głębi ostrości
Authors:
Tomaszewska, A.
Bazyluk, B.
Links:
https://bibliotekanauki.pl/articles/154559.pdf
Publication date:
2010
Publisher:
Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Topics:
efekt głębi widzenia
zakres widzenia
programowalny procesor graficzny
przetwarzanie grafiki 3D
depth of field effect
circle of confusion
graphics processing unit
3D computer graphics processing
Description:
In this paper we present a GPU-based effect of an artificial depth of field, which varies with the distance from the camera of the point that the user is looking at. Depth of field greatly enhances the scene's realism. The goal of our technique is a 3D approach with user interaction that relies on the simulation of the gaze point. Most of the computations are efficiently performed on the GPU with the use of vertex and pixel shaders.
The paper presents a hardware implementation of the depth-of-field effect. The developed approach was optimized using information about the user's gaze point. The paper focuses on preserving sharpness only for the fragment of the scene on which the observer's gaze is currently fixed. The algorithm was designed for hardware implementation using programmable vertex and fragment shading units. The CPU was used to synchronize the shaders (programs executed on the graphics card) and to transfer data between main memory and the GPU, and all data were stored as 32-bit textures. In the implementation, the algorithm modules performing matrix operations used a framebuffer object, which makes it possible to render results to a texture instead of the standard window buffer. To demonstrate the depth-of-field effect, an application was created to test the performance of the algorithm exploiting gaze-point information, achieving a performance gain of up to 40% compared with the non-optimized approach [2]. Section 2 of the paper reviews existing algorithms simulating the depth-of-field effect. The proposed approach and its hardware implementation are presented in Section 3. The results of the method are presented in Section 4 and the conclusions in Section 5.
Source:
Pomiary Automatyka Kontrola; 2010, R. 56, nr 7, 7; 675-677
0032-4140
Appears in:
Pomiary Automatyka Kontrola
Content provider:
Biblioteka Nauki
Article
Title:
Execution time prediction model for parallel GPU realizations of discrete transforms computation algorithms
Authors:
Puchala, Dariusz
Stokfiszewski, Kamil
Wieloch, Kamil
Links:
https://bibliotekanauki.pl/articles/2173537.pdf
Publication date:
2022
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Topics:
graphics processing unit
GPU
execution time prediction model
discrete wavelet transform
DWT
lattice structure
convolution-based approach
orthogonal transform
orthogonal filter banks
time effectiveness
prediction accuracy
procesor graficzny
model przewidywania czasu wykonania
dyskretna transformata falkowa
struktura sieciowa
podejście oparte na splotach
przekształcenia ortogonalne
ortogonalne banki filtrów
efektywność czasowa
dokładność przewidywania
Description:
Parallel realizations of discrete transform (DT) computation algorithms (DTCAs) performed on graphics processing units (GPUs) play a significant role in many modern data processing methods utilized in numerous areas of human activity. In this paper the authors propose a novel execution time prediction model, which allows for accurate and rapid estimation of execution times of various kinds of structurally different DTCAs performed on GPUs of distinct architectures, without the necessity of conducting the actual experiments on physical hardware. The model can serve as a guide for the system analyst in making the optimal choice of the GPU hardware solution for a given computational task involving a particular DT calculation, or can help in choosing the most appropriate parallel implementation of the selected DT, given the limitations imposed by available hardware. Restricting the model to adhere only to the key common features of DTCAs enables the authors to significantly simplify its structure, leading consequently to its design as a hybrid, analytical-simulational method, jointly exploiting the main advantages of both of the mentioned techniques, namely time-effectiveness and high prediction accuracy, while at the same time mutually eliminating the major weaknesses of both approaches within the proposed solution. The model is validated experimentally on two structurally different parallel methods of discrete wavelet transform (DWT) computation, i.e. the direct convolution-based and lattice structure-based schemes, by comparing its prediction results with the actual measurements taken for 6 different graphics cards, representing a fairly broad spectrum of GPU compute architectures.
Experimental results reveal the overall average execution time prediction accuracy of the model to be at the level of 97.2%, with a global maximum prediction error of 14.5% recorded throughout all the conducted experiments, while maintaining a high average evaluation speed of 3.5 ms per single simulation. The results support the model's generality and the possibility of extrapolation to other DTCAs and different GPU architectures, which, along with the proposed model's straightforwardness, time-effectiveness and ease of practical application, makes it, in the authors' opinion, a very interesting alternative to the related existing solutions.
Source:
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2022, 70, 1; e139393, 1--30
0239-7528
Appears in:
Bulletin of the Polish Academy of Sciences. Technical Sciences
Content provider:
Biblioteka Nauki
Article
Title:
Execution time prediction model for parallel GPU realizations of discrete transforms computation algorithms
Authors:
Puchala, Dariusz
Stokfiszewski, Kamil
Wieloch, Kamil
Links:
https://bibliotekanauki.pl/articles/2173635.pdf
Publication date:
2022
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Topics:
graphics processing unit
GPU
execution time prediction model
discrete wavelet transform
DWT
lattice structure
convolution-based approach
orthogonal transform
orthogonal filter banks
time effectiveness
prediction accuracy
procesor graficzny
model przewidywania czasu wykonania
dyskretna transformata falkowa
struktura sieciowa
podejście oparte na splotach
przekształcenia ortogonalne
ortogonalne banki filtrów
efektywność czasowa
dokładność przewidywania
Description:
Parallel realizations of discrete transform (DT) computation algorithms (DTCAs) performed on graphics processing units (GPUs) play a significant role in many modern data processing methods utilized in numerous areas of human activity. In this paper the authors propose a novel execution time prediction model, which allows for accurate and rapid estimation of execution times of various kinds of structurally different DTCAs performed on GPUs of distinct architectures, without the necessity of conducting the actual experiments on physical hardware. The model can serve as a guide for the system analyst in making the optimal choice of the GPU hardware solution for a given computational task involving a particular DT calculation, or can help in choosing the most appropriate parallel implementation of the selected DT, given the limitations imposed by available hardware. Restricting the model to adhere only to the key common features of DTCAs enables the authors to significantly simplify its structure, leading consequently to its design as a hybrid, analytical-simulational method, jointly exploiting the main advantages of both of the mentioned techniques, namely time-effectiveness and high prediction accuracy, while at the same time mutually eliminating the major weaknesses of both approaches within the proposed solution. The model is validated experimentally on two structurally different parallel methods of discrete wavelet transform (DWT) computation, i.e. the direct convolution-based and lattice structure-based schemes, by comparing its prediction results with the actual measurements taken for 6 different graphics cards, representing a fairly broad spectrum of GPU compute architectures.
Experimental results reveal the overall average execution time prediction accuracy of the model to be at the level of 97.2%, with a global maximum prediction error of 14.5% recorded throughout all the conducted experiments, while maintaining a high average evaluation speed of 3.5 ms per single simulation. The results support the model's generality and the possibility of extrapolation to other DTCAs and different GPU architectures, which, along with the proposed model's straightforwardness, time-effectiveness and ease of practical application, makes it, in the authors' opinion, a very interesting alternative to the related existing solutions.
Source:
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2022, 70, 1; art. no. e139393
0239-7528
Appears in:
Bulletin of the Polish Academy of Sciences. Technical Sciences
Content provider:
Biblioteka Nauki
Article
Title:
Execution time prediction model for parallel GPU realizations of discrete transforms computation algorithms
Authors:
Puchala, Dariusz
Stokfiszewski, Kamil
Wieloch, Kamil
Links:
https://bibliotekanauki.pl/articles/2173636.pdf
Publication date:
2022
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Topics:
graphics processing unit
GPU
execution time prediction model
discrete wavelet transform
DWT
lattice structure
convolution-based approach
orthogonal transform
orthogonal filter banks
time effectiveness
prediction accuracy
procesor graficzny
model przewidywania czasu wykonania
dyskretna transformata falkowa
struktura sieciowa
podejście oparte na splotach
przekształcenia ortogonalne
ortogonalne banki filtrów
efektywność czasowa
dokładność przewidywania
Description:
Parallel realizations of discrete transform (DT) computation algorithms (DTCAs) performed on graphics processing units (GPUs) play a significant role in many modern data processing methods utilized in numerous areas of human activity. In this paper the authors propose a novel execution time prediction model, which allows for accurate and rapid estimation of execution times of various kinds of structurally different DTCAs performed on GPUs of distinct architectures, without the necessity of conducting the actual experiments on physical hardware. The model can serve as a guide for the system analyst in making the optimal choice of the GPU hardware solution for a given computational task involving a particular DT calculation, or can help in choosing the most appropriate parallel implementation of the selected DT, given the limitations imposed by available hardware. Restricting the model to adhere only to the key common features of DTCAs enables the authors to significantly simplify its structure, leading consequently to its design as a hybrid, analytical-simulational method, jointly exploiting the main advantages of both of the mentioned techniques, namely time-effectiveness and high prediction accuracy, while at the same time mutually eliminating the major weaknesses of both approaches within the proposed solution. The model is validated experimentally on two structurally different parallel methods of discrete wavelet transform (DWT) computation, i.e. the direct convolution-based and lattice structure-based schemes, by comparing its prediction results with the actual measurements taken for 6 different graphics cards, representing a fairly broad spectrum of GPU compute architectures.
Experimental results reveal the overall average execution time prediction accuracy of the model to be at the level of 97.2%, with a global maximum prediction error of 14.5% recorded throughout all the conducted experiments, while maintaining a high average evaluation speed of 3.5 ms per single simulation. The results support the model's generality and the possibility of extrapolation to other DTCAs and different GPU architectures, which, along with the proposed model's straightforwardness, time-effectiveness and ease of practical application, makes it, in the authors' opinion, a very interesting alternative to the related existing solutions.
Source:
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2022, 70, 1; art. no. e139393
0239-7528
Appears in:
Bulletin of the Polish Academy of Sciences. Technical Sciences
Content provider:
Biblioteka Nauki
Article