Subject: parallelization - OPAC collections catalog

Item: 1.

Title:: MIDACO parallelization scalability on 200 MINLP benchmarks
Authors:: Schlueter, M.
Munetomo, M.
Links:: https://bibliotekanauki.pl/articles/91640.pdf
Publication date:: 2017
Publisher:: Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Subjects:: MINLP
optimization
MIDACO
parallelization
Description:: This contribution presents a numerical evaluation of the impact of parallelization on the performance of an evolutionary algorithm for mixed-integer nonlinear programming (MINLP). On a set of 200 MINLP benchmarks, the performance of the MIDACO solver is assessed with a parallelization factor that is gradually increased from one to three hundred. The results demonstrate that the efficiency of the algorithm can be significantly improved by parallelized function evaluation. Furthermore, the results indicate that the scale-up in efficiency is approximately linear, which implies that this approach remains promising even for very large parallelization factors. The presented research is especially relevant to CPU-time-consuming real-world applications, where only a small number of serially processed function evaluations can be calculated in reasonable time.
Source:: Journal of Artificial Intelligence and Soft Computing Research; 2017, 7, 3; 171-181
2083-2567
2449-6499
Appears in:: Journal of Artificial Intelligence and Soft Computing Research
Content provider:: Biblioteka Nauki

Article

Item: 2.

Title:: Hybrid-parallel formulation of fundamental quantum-chemical algorithms
Hybrydowe zrównoleglenie podstawowych algorytmów kwantowo-chemicznych
Authors:: Mazur, G.
Makowski, M.
Kuna, D.
Links:: https://bibliotekanauki.pl/articles/305411.pdf
Publication date:: 2011
Publisher:: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects:: computational chemistry
parallelization
Description:: Hybrid-parallel variants of the Hartree-Fock, Kohn-Sham, and second-order Møller-Plesset perturbation theory methods are described. Their efficiency relative to the serial implementation and to the parallel implementation based on message passing (MPI) is measured and briefly analyzed. It is shown that while hybrid parallelization provides increased efficiency in all analyzed cases, the magnitude of the speedup strongly depends on the features of the particular algorithm.
Source:: Computer Science; 2011, 12; 163-168
1508-2806
2300-7036
Appears in:: Computer Science
Content provider:: Biblioteka Nauki

Article

Item: 3.

Title:: Parallelized algorithms for finding similar images and object recognition
Authors:: Frączek, R.
Cyganek, B.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/305784.pdf
Publication date:: 2013
Publisher:: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects:: color descriptors
code optimization
parallelization
OpenMP
Description:: The paper addresses the issue of searching for similar images and objects in a repository of information. The contained images are annotated using sparse descriptors. In the presented research, different color and edge histogram descriptors were used. To measure similarities among images, various color descriptors are compared, and different distance measures were employed for this purpose. In order to decrease execution time, several code optimization and parallelization methods are proposed. The results of these experiments are presented, together with a discussion of the advantages and limitations of different combinations of methods.
Source:: Computer Science; 2013, 14 (1); 113-127
1508-2806
2300-7036
Appears in:: Computer Science
Content provider:: Biblioteka Nauki

Article

Item: 4.

Title:: Concerning decomposition of a system of linear algebraic equations
Authors:: Moszyński, Krzysztof
Links:: https://bibliotekanauki.pl/articles/1340315.pdf
Publication date:: 1995
Publisher:: Polska Akademia Nauk. Instytut Matematyczny PAN
Subjects:: decomposition of a matrix
nearly commuting matrices
parallelization
Source:: Applicationes Mathematicae; 1995-1996, 23, 2; 191-198
1233-7234
Appears in:: Applicationes Mathematicae
Content provider:: Biblioteka Nauki

Article

Item: 5.

Title:: The efficiency of parallel OpenMP loop code produced by the hyperplane method
Efektywność kodu pętli automatycznie zrównoleglonego metodą hiperpłaszczyzn
Authors:: Poliwoda, M.
Links:: https://bibliotekanauki.pl/articles/156060.pdf
Publication date:: 2009
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: loop parallelization
OpenMP
hyperplane method
Description:: The paper examines the efficiency of loop code parallelized by the hyperplane method, compared with loop code parallelized by other methods and taking into account the efficiency resulting from the use of different compilers. It also investigates how the efficiency of parallelization can be improved by increasing the locality of computations. The main goal of the research is to determine whether, and in what range, code parallelized by the hyperplane method can be efficient, and to what degree improved data locality raises the efficiency of the parallel loop code.
Source:: Pomiary Automatyka Kontrola; 2009, R. 55, nr 10, 10; 811-814
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article

Item: 6.

Title:: Automatyczne zrównoleglenie pętli, efektywność zrównoleglonego kodu
Automatically loops parallelized, efficiency of parallelized code
Authors:: Paliwoda, M.
Links:: https://bibliotekanauki.pl/articles/156253.pdf
Publication date:: 2008
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: loop parallelization
OpenMP
hyperplane method
Description:: The paper presents the results of a study of the efficiency of loop code parallelized by the hyperplane method, compared with loop code parallelized by other methods. The goal of the study was to determine the efficiency of code parallelized by different methods and the range in which the parallelized code makes effective use of the resources of a multiprocessor system, including multi-core processors.
Source:: Pomiary Automatyka Kontrola; 2008, R. 54, nr 8, 8; 575-578
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article

Item: 7.

Title:: Dependency between Tiles’ Sizes and Program Execution Time
Authors:: Sushko, S.
Chemerys, O.
Links:: https://bibliotekanauki.pl/articles/114282.pdf
Publication date:: 2018
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: software optimization
tiling
parallelization
tile size
code transformation
fast computation
Description:: The paper is dedicated to aspects of software optimization. The optimization problem is described. Tiling and parallelization methods were applied to the test applications. Several tests were performed to estimate the influence of tile sizes on computation time. The obtained results show a complicated dependency between tile sizes and processing time. Numerical characteristics of the obtained results and the corresponding figures are presented.
Source:: Measurement Automation Monitoring; 2018, 64, 2; 28-30
2450-2855
Appears in:: Measurement Automation Monitoring
Content provider:: Biblioteka Nauki

Article

Item: 8.

Title:: The potential for real-time testing of high-frequency trading strategies through a developed tool during volatile market conditions
Authors:: Vaitonis, Mantas
Korovkinas, Konstantinas
Links:: https://bibliotekanauki.pl/articles/30148245.pdf
Publication date:: 2023
Publisher:: Polskie Towarzystwo Promocji Wiedzy
Subjects:: high frequency trading
cryptocurrencies
algorithmic trading
multidimensional matrices
parallelization
simulation
Description:: This study presents a method for testing high-frequency trading (HFT) algorithms on GPUs using kernel parallelization, code vectorization, and multidimensional matrices. The research evaluates HFT strategies within algorithmic cryptocurrency trading in volatile market conditions, particularly during the COVID-19 pandemic. The study's objective is to provide an efficient and comprehensive approach to assessing the efficiency and profitability of HFT strategies. The results show that the method effectively evaluates the efficiency and profitability of HFT strategies, as demonstrated by a Sharpe ratio of 2.29 and a Sortino ratio of 2.88. The authors suggest that further study of HFT testing methods could be conducted using a tool that connects directly to electronic marketplaces, enabling real-time receipt of high-frequency trading data and simulation of trade decisions. Finally, the study introduces a novel method for testing HFT algorithms on GPUs, offering promising results in assessing the efficiency and profitability of HFT strategies during volatile market conditions.
Source:: Applied Computer Science; 2023, 19, 2; 63-81
1895-3735
2353-6977
Appears in:: Applied Computer Science
Content provider:: Biblioteka Nauki

Article

Item: 9.

Title:: Parallel Algorithms for Forward and Back Substitution in Linear Algebraic Equations of Finite Element Method
Authors:: Fialko, Sergiy
Links:: https://bibliotekanauki.pl/articles/308616.pdf
Publication date:: 2019
Publisher:: Instytut Łączności - Państwowy Instytut Badawczy
Subjects:: finite element method
multithreaded parallelization
sparse symmetric matrices
triangular solution
Description:: This paper considers several algorithms for parallelizing the procedure of forward and back substitution for high-order symmetric sparse matrices on multi-core computers with shared memory. It compares the proposed approaches for various finite-element problems of structural mechanics which generate sparse matrices of different structures.
Source:: Journal of Telecommunications and Information Technology; 2019, 4; 20-29
1509-4553
1899-8852
Appears in:: Journal of Telecommunications and Information Technology
Content provider:: Biblioteka Nauki

Article

Item: 10.

Title:: Basic Aspects of Designing a High-performance Processor Structure for Calculating a "true" Discrete Fractional Fourier Transform
Authors:: Cariow, A.
Majorkowska-Mech, D.
Links:: https://bibliotekanauki.pl/articles/114579.pdf
Publication date:: 2018
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: discrete fractional Fourier transform
parallelization of computations
hardware implementation
complexity reduction
Description:: This paper presents basic aspects of the structural design of a high-performance processor for implementing the discrete fractional Fourier transform (DFrFT). The general idea of parallelizing the calculation of the so-called “true” discrete fractional Fourier transform, on the basis of our previously developed algorithmic approach, is presented. We focus only on the general aspects of the organization of the structure of such a processor, since the details of a particular implementation always depend on the implementation platform used, while the general idea of constructing the structure of the processor remains unchanged.
Source:: Measurement Automation Monitoring; 2018, 64, 2; 43-45
2450-2855
Appears in:: Measurement Automation Monitoring
Content provider:: Biblioteka Nauki

Article

Item: 11.

Title:: Porting of finite element integration algorithm to Xeon Phi coprocessor-based HPC architectures
Authors:: Krużel, Filip
Banaś, Krzysztof
Iacomo, Mauro
Links:: https://bibliotekanauki.pl/articles/38704636.pdf
Publication date:: 2023
Publisher:: Instytut Podstawowych Problemów Techniki PAN
Subjects:: CPU
optimization
parallelization
vectorization
Intel Xeon Phi
Description:: In the present article, we describe the implementation of the finite element numerical integration algorithm for the Xeon Phi coprocessor. The coprocessor was an extension of the many-core specialized unit for calculations, and its performance was comparable with the corresponding GPUs. Its main advantages were the built-in 512-bit vector registers and the ease of transferring existing codes from traditional x86 architectures. In the article, we move the code developed for a standard CPU to the coprocessor. We compare its performance with our OpenCL implementation of the numerical integration algorithm, previously developed for GPUs. The GPU code is tuned to fit the coprocessor by our auto-tuning mechanism. Tests included two types of tasks to solve, using two types of approximation and two types of elements. The obtained timing results allow comparing the performance of highly optimized CPU and GPU codes with the performance of a Xeon Phi coprocessor. This article answers whether such massively parallel architectures perform better using the CPU or the GPU programming method. Furthermore, we have compared the Xeon Phi architecture with Intel’s i9 13900K CPU, the latest available at the time of writing. This comparison determines whether the old Xeon Phi architecture remains competitive in today’s computing landscape. Our findings provide valuable insights for selecting the most suitable hardware for numerical computations and the appropriate algorithmic design.
Source:: Computer Assisted Methods in Engineering and Science; 2023, 30, 4; 427-459
2299-3649
Appears in:: Computer Assisted Methods in Engineering and Science
Content provider:: Biblioteka Nauki

Article

Item: 12.

Title:: Optimization of the Multi-Threaded Interval Algorithm for the Pareto-Set Computation
Authors:: Kubica, B. J.
Wodniak, A.
Links:: https://bibliotekanauki.pl/articles/308050.pdf
Publication date:: 2010
Publisher:: Instytut Łączności - Państwowy Instytut Badawczy
Subjects:: interval computations
multicriterial analysis
multithreaded programming
Pareto set
POSIX threads
shared-memory parallelization
Description:: Previous investigations by the authors surveyed the possibility of applying interval methods to seek the Pareto front of a multicriterial nonlinear problem. An efficient algorithm was proposed, and it was implemented and tested in a multicore environment. This paper has two goals. The first is to tune the developed algorithm to increase the speedup of the multi-threaded variant. The second is to extend the algorithm to compute not only the Pareto front (in the criteria space), but also the Pareto set (in the decision space). Numerical results for suitable test problems are presented.
Source:: Journal of Telecommunications and Information Technology; 2010, 1; 70-75
1509-4553
1899-8852
Appears in:: Journal of Telecommunications and Information Technology
Content provider:: Biblioteka Nauki

Article

Item: 13.

Title:: Wyznaczanie równoległości pętli programowych w aplikacjach dedykowanych dla procesorów graficznych
Parallelizing program loops for graphics processing in general purpose computing
Authors:: Bielecki, W.
Pałkowski, M.
Links:: https://bibliotekanauki.pl/articles/155271.pdf
Publication date:: 2011
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: automatic loop parallelization
slices
GPU
CUDA
OpenCL
high-performance computing
loop parallelization
Description:: Extracting synchronization-free slices, that is, independent code fragments, allows parallel loops to be generated automatically. The resulting code can be executed on multiprocessor machines in a reduced period of time. Slicing techniques also enable generating parallel code for general-purpose computing on graphics processors. Nowadays, graphics cards support the execution of multi-threaded applications, and GPU systems consist of tens or hundreds of processors. CUDA (an acronym for Compute Unified Device Architecture) is a parallel computing architecture developed by NVIDIA. Graphics processing units (GPUs) are accessible to software developers through variants of industry-standard programming languages. Using CUDA, the latest NVIDIA GPUs become accessible for computation like CPUs. The model for GPU computing is to use a CPU and a GPU together in a heterogeneous co-processing computing model. The sequential part of the application runs on the CPU and the computationally intensive part is accelerated by the GPU. From the user's perspective, the application simply runs faster because it uses the high performance of the GPU. In this paper, slicing algorithms for generating parallel code for graphics cards are examined. A short example of the code is presented. CUDA statements and techniques are explained. Memory cost and data transfer are considered. The speed-up, efficiency, and scalability of the generated parallel code are analyzed.
Source:: Pomiary Automatyka Kontrola; 2011, R. 57, nr 8, 8; 963-965
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article

Item: 14.

Title:: An approach to form affine time partitioning for statement instances of arbitrarily nested loops
Wyznaczenie harmonogramu instancji instrukcji dla pętli dowolnie zagnieżdżonych
Authors:: Bielecki, W.
Siedlecki, K.
Wernikowski, S.
Links:: https://bibliotekanauki.pl/articles/158284.pdf
Publication date:: 2010
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: loop parallelization
free schedule
affine time partitioning
Description:: A novel approach to forming affine time partitioning for statement instances of arbitrarily nested loops is presented. It is based on extracting a free schedule, which is then used to form a system of equations that produces a legal time partitioning. The approach requires an exact dependence analysis. To carry out experiments, the dependence analysis proposed by Pugh and Wonnacott was chosen. In this analysis, dependences are represented by dependence relations and the iteration space by sets; Presburger arithmetic is used to build the sets and dependence relations. Examples illustrating the approach are presented for both a perfectly nested loop and an imperfectly nested loop. The experiments were carried out on nVidia graphics processors using CUDA technology in version 1.1 compatibility mode, and the results are presented in tabular form. Related work and directions for further research are also presented.
Source:: Pomiary Automatyka Kontrola; 2010, R. 56, nr 10, 10; 1186-1189
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article

Item: 15.

Title:: Improving the TSAB algorithm through parallel computing
Authors:: Rudy, Jarosław
Pempera, Jaroslaw
Smutnicki, Czesław
Links:: https://bibliotekanauki.pl/articles/229535.pdf
Publication date:: 2020
Publisher:: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:: job shop scheduling
parallel computing
operations research
taboo search
TSAB algorithm
coarse-grained parallelization
Description:: In this paper, a parallel multi-path variant of the well-known TSAB algorithm for the job shop scheduling problem is proposed. A coarse-grained parallelization method is employed, which allows for great scalability of the algorithm in accordance with Gustafson’s law. The resulting P-TSAB algorithm is tested using 162 well-known literature benchmarks. Results indicate that the P-TSAB algorithm, with a running time of one minute on a modern PC, provides solutions comparable to the ones provided by the newest literature approaches to the job shop scheduling problem. Moreover, on average P-TSAB achieves a percentage relative deviation from the best known solutions that is two times smaller than that of the standard variant of TSAB. The use of parallelization also relieves the user from having to fine-tune the algorithm. The P-TSAB algorithm can thus be used as a module in real-life production planning systems or as a local search procedure in other algorithms. It can also provide the upper bound of minimal cycle time for certain problems of cyclic scheduling.
Source:: Archives of Control Sciences; 2020, 30, 3; 411-435
1230-2384
Appears in:: Archives of Control Sciences
Content provider:: Biblioteka Nauki

Article

Information

Searching for the phrase "parallelization" by criterion: Subject
