Subject: CUDA - OPAC Collections Catalog

Title:: Wykorzystanie procesorów graficznych do szybkiego renderingu krajobrazu sferycznego
Efficient GPU-based approach to a spherical terrain rendering
Authors:: Tomaszewska, A.
Osobniak, O.
Links:: https://bibliotekanauki.pl/articles/154799.pdf
Publication date:: 2010
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: terrain generation
programmable graphics processor
CUDA technology
geometry clipmaps
graphics processing unit
Description:: The article presents a method for real-time generation of a planet with a large surface area and a high level of detail. The algorithm builds on the geometry clipmaps technique and generates an arbitrary section of terrain on the fly from the camera parameters. It was designed for hardware implementation using a programmable graphics processor and CUDA technology.
The paper presents a fast method for rendering large, detailed spherical terrain. Rendering terrain with a high degree of realism is an ongoing need in real-time computer graphics applications. To render scenes of increasing size and complexity, several terrain rendering algorithms have been proposed in the literature. One recent technique, geometry clipmaps, uses the position of the viewpoint to create a multi-resolution representation of the terrain from nested meshes. A very efficient GPU-based implementation of this technique for large terrain models is proposed in [1]. The paper presents techniques that combine a procedural approach with geometry clipmaps, which enables rendering an arbitrary piece of terrain on the fly based on the camera parameters. To improve the algorithm's efficiency, most computations were performed on the GPU using vertex and pixel shaders and CUDA technology. The paper is organized as follows: Section 2 discusses previous work, Section 3 presents the application of procedural terrain generation based on clipmaps and its hardware implementation, and Section 4 gives the results obtained. The conclusions are presented at the end of the paper.
Source:: Pomiary Automatyka Kontrola; 2010, vol. 56, no. 7, 7; 790-792
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article

Title:: The Java profiler based on byte code analysis and instrumentation for many-core hardware accelerators
Authors:: Pietroń, M.
Karwatowski, M.
Wiatr, K.
Links:: https://bibliotekanauki.pl/articles/114614.pdf
Publication date:: 2015
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: virtual machine
CUDA
GPU
profiling
parallel computing
Description:: One of the most challenging issues with many-core and multi-core architectures is how to exploit their potential computing power in legacy systems without deep knowledge of their architecture. Analysis of the static dependences and dynamic data dependences of a program run can help to identify independent paths that could be computed by individual parallel threads. Statistics on data reuse and data size are also crucial when adapting an application to a GPU many-core hardware architecture because of its specific memory hierarchy. The proposed profiling system performs static data analysis and computes dynamic dependencies for Java programs, and recommends the parts of the source code with the highest potential for parallelization on a GPU. Such an analysis can also provide a starting point for automatic parallelization.
Source:: Measurement Automation Monitoring; 2015, 61, 7; 385-387
2450-2855
Appears in:: Measurement Automation Monitoring
Content provider:: Biblioteka Nauki

Article

Title:: Wyznaczanie równoległości pętli programowych w aplikacjach dedykowanych dla procesorów graficznych
Parallelizing program loops for graphics processing in general purpose computing
Authors:: Bielecki, W.
Pałkowski, M.
Links:: https://bibliotekanauki.pl/articles/155271.pdf
Publication date:: 2011
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: automatic loop parallelization
code fragments
GPU
CUDA
OpenCL
high-performance computing
loop parallelization
slices
Description:: Extracting parallelism in the form of independent code fragments makes it possible to generate parallel program loops automatically. Such code can exploit the computing power of parallel machines, including multiprocessor graphics cards. This article analyzes the use of algorithms for extracting code fragments in applications targeting graphics processors. The speed-up and efficiency of the computations and the scalability of the generated parallel code were examined.
Extracting synchronization-free slices makes it possible to generate parallel loops automatically. The resulting code can be executed on multiprocessor machines in less time. Slicing techniques also enable generating parallel code for general-purpose computing on graphics processors. Today's graphics cards support multi-threaded applications, and GPU systems consist of tens or hundreds of processors. CUDA (Compute Unified Device Architecture) is a parallel computing architecture developed by NVIDIA. Graphics processing units (GPUs) are accessible to software developers through variants of industry-standard programming languages. With CUDA, the latest NVIDIA GPUs become accessible for computation in the same way as CPUs. The GPU computing model uses a CPU and a GPU together in a heterogeneous co-processing arrangement: the sequential part of the application runs on the CPU, and the computationally intensive part is accelerated by the GPU. From the user's perspective, the application simply runs faster because it uses the high performance of the GPU. In this paper, slicing algorithms for generating parallel code for graphics cards are examined. A short code example is presented, and CUDA statements and techniques are explained. Memory cost and data transfer are considered. The speed-up, efficiency and scalability of the code are analyzed.
Source:: Pomiary Automatyka Kontrola; 2011, vol. 57, no. 8, 8; 963-965
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article

Title:: Akceleracja obliczeń komputerowych za pomocą układów graficznych z wykorzystaniem technologii CUDA
Computing acceleration based on application of the CUDA technology
Authors:: Stefanowicz, Ł.
Wiśniewski, R.
Wiśniewska, M.
Links:: https://bibliotekanauki.pl/articles/155246.pdf
Publication date:: 2011
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: processor
computation
parallelism
CPU
GPU
CUDA
multimedia
iteration
multithreading
computing acceleration
Description:: The article presents the possibility of using graphics processors to accelerate computation. It describes NVIDIA's CUDA technology and architecture, as well as the basic extensions relative to the C language standard. The paper discusses the authors' own test algorithms and the research methodology used to determine how effectively GPUs accelerate computation compared with traditional CPU-based solutions.
The paper deals with the application of graphics processing units (GPUs) to the acceleration of computer operations and computations. Traditional computation methods are based on the central processing unit (CPU), which has to handle all computer operations and tasks. Such a solution is particularly ineffective in distributed systems, where some sub-tasks can be performed in parallel. Many parallel threads can accelerate computing, which results in a shorter execution time. The paper presents the new CUDA technology and architecture. The idea of CUDA is to apply GPUs to computation in order to achieve better performance than traditional CPU-based methods. GPUs can perform multi-threaded calculations; therefore, especially for tasks where concurrency can be applied, CUDA may considerably speed up the computation. The effectiveness of CUDA technology was verified experimentally using the authors' own test modules. The benchmark library consists of various algorithms, from simple iteration scripts to video processing methods. The results of calculations performed on the CPU and on the GPU are compared and discussed.
Source:: Pomiary Automatyka Kontrola; 2011, vol. 57, no. 8, 8; 954-956
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article

Title:: Robust and efficient finite-difference-time-domain modelling of the propagation of nonlinear elastic waves
Niezawodne i wydajne modelowanie propagacji nieliniowych fal sprężystych metodą różnic skończonych w dziedzinie czasu
Authors:: Pandala, A.
Shivaprasad, S.
Krishnamurthy, C. V.
Balasubramaniam, K.
Links:: https://bibliotekanauki.pl/articles/107732.pdf
Publication date:: 2018
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: Finite Difference Time Domain
Rotated Staggered Grid
Parsimonious Scheme
Nonlinear elastic waves
CUDA
GPU
Description:: A robust finite-difference-time-domain (FDTD) scheme for modelling nonlinear elastic wave propagation in a homogeneous isotropic material is presented. A formulation based on a rotated staggered grid scheme in a displacement-velocity-stress configuration, incorporating both geometric and material nonlinearities, is proposed. By adopting a Parsimonious algorithm, the computational memory requirement is reduced by 50%. Simulations are accelerated by exploiting the massive data parallelism inherent in the FDTD approach, using parallel computation on graphics processing units (GPUs) with NVIDIA's CUDA API. For the proposed numerical scheme, the grid convergence criterion and the accuracy over propagation distance are investigated. The study is also extended to determine the contribution of the geometric and material models at various input amplitude levels. The time-domain and frequency-domain signals obtained from the proposed scheme are verified against a commercial finite element solver. The simulation runtime for an aluminium sample of dimensions 20 mm x 10 mm using a 5 MHz pulse is of the order of one minute, which makes the proposed numerical scheme attractive for modelling nonlinear elastic waves in large domains.
Source:: Badania Nieniszczące i Diagnostyka; 2018, 2; 11-21
2451-4462
2543-7755
Appears in:: Badania Nieniszczące i Diagnostyka
Content provider:: Biblioteka Nauki

Article

Information

You are searching for the phrase "CUDA" by criterion: Subject
