Performance enhancement of CUDA applications by overlapping data transfer and Kernel execution

Authors: Raju, K.; Chiplunkar, Niranjan N
Link: https://bibliotekanauki.pl/articles/1956064.pdf
Publication date: 2021
Publisher: Polskie Towarzystwo Promocji Wiedzy
Subjects: CPU-GPU; high-performance computing; kernel; data transfer; CUDA streams
Source: Applied Computer Science; 2021, 17, 3; 5-18; ISSN 1895-3735
Language: English
Rights: CC BY: Creative Commons Attribution 4.0
Content provider: Biblioteka Nauki
Type: Article

The CPU-GPU combination is a widely used heterogeneous computing system in which the CPU and the GPU have separate address spaces. Because the GPU cannot directly access CPU memory, the input data must be available in GPU memory before a GPU function is invoked. When the GPU function completes, the results of the computation are transferred back to CPU memory. CPU-GPU data transfer takes place over the PCI Express (PCI-E) bus, whose bandwidth is much lower than that of GPU memory. The transfer speed is therefore limited by the PCI-E bandwidth, which makes the PCI-E bus a performance bottleneck. This paper discusses two approaches to minimizing the data transfer overhead: performing the data transfer while the GPU function is executing, and reducing the amount of data transferred to the GPU. Both approaches are implemented using CUDA streams, and their effect on the execution time of a set of CUDA applications is evaluated. The experimental results show that the proposed approaches reduce the execution time of the applications.
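
As a rough illustration of why overlapping transfer with kernel execution helps, the sketch below models the input as split into equal chunks and compares a fully serial schedule with a three-stage pipeline (copy in, compute, copy out). It is a simplified timing model, not the paper's implementation: the function names, the per-chunk timings, and the assumption that host-to-device copies, kernels, and device-to-host copies can each proceed independently of the other two stages are all hypothetical choices made for the example.

```python
def serial_time(chunks, h2d, kernel, d2h):
    """Total time when every copy and every kernel runs back to back."""
    return chunks * (h2d + kernel + d2h)


def overlapped_time(chunks, h2d, kernel, d2h):
    """Total time for a three-stage pipeline over equal-sized chunks.

    Assumption: the three stages use independent resources, so chunk i
    can be copied in while chunk i-1 computes and chunk i-2 is copied out.
    Within a stage, chunks are processed strictly in order.
    """
    h2d_done = kernel_done = d2h_done = 0.0
    for _ in range(chunks):
        # The inbound copy only waits for the previous inbound copy.
        h2d_done = h2d_done + h2d
        # The kernel waits for its own input and for the previous kernel.
        kernel_done = max(h2d_done, kernel_done) + kernel
        # The outbound copy waits for its kernel and the previous outbound copy.
        d2h_done = max(kernel_done, d2h_done) + d2h
    return d2h_done


if __name__ == "__main__":
    # Hypothetical per-chunk timings in milliseconds, chosen for illustration.
    n, h, k, d = 4, 2.0, 3.0, 1.0
    print(serial_time(n, h, k, d))      # 24.0
    print(overlapped_time(n, h, k, d))  # 15.0
```

In this model the overlapped total approaches the number of chunks times the slowest stage, so the benefit is largest when transfer and kernel times are comparable, and with a single chunk there is nothing to overlap.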
