460-4118/02 – Parallel Algorithms II (PA II)
Guarantor department | Department of Computer Science | Credits | 4 |
Subject guarantor | doc. Ing. Petr Gajdoš, Ph.D. | Subject version guarantor | doc. Ing. Petr Gajdoš, Ph.D. |
Study level | undergraduate or graduate | Requirement | Optional |
Year | 1 | Semester | summer |
| | Study language | English |
Year of introduction | 2015/2016 | Year of cancellation | |
Intended for the faculties | FEI | Intended for study types | Follow-up Master |
Subject aims expressed by acquired skills and competences
The main goal is to extend students' knowledge of parallel application programming. The lessons build on an existing subject (Parallel Algorithms I). All topics focus on the use of graphics processing units (GPUs). Students will become familiar with existing GPU architectures and frameworks for parallel programming. The CUDA architecture will be explained in more detail, given that an nVidia Research Center has been established at VŠB-TU Ostrava. Students will gain the knowledge needed to solve practical tasks using GPUs, which they can apply in their diploma theses or in several grant projects running at VŠB-TU Ostrava.
Knowledge and skills:
- orientation in the basic concepts of GPU architecture
- knowledge of the software architecture of a parallel program and of problem decomposition into grids, blocks and threads
- knowledge of a selected framework for parallel programming on GPUs
- understanding of how to convert an algorithm from serial to parallel form
- distribution of tasks over several GPUs and clusters
- the ability to solve practical tasks in the area of data processing
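As an illustration of the decomposition into grids, blocks and threads, the following is a minimal sketch in plain C++ that emulates a one-dimensional launch on the CPU. It is not CUDA code and is not taken from the course materials: the names `LaunchIndex`, `blocksFor`, `addKernel` and `vectorAdd` are hypothetical, and the two nested loops stand in for threads that a GPU would run in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUDA's built-in blockIdx.x and threadIdx.x.
struct LaunchIndex {
    std::size_t block;   // index of the block within the grid
    std::size_t thread;  // index of the thread within its block
};

// Number of blocks needed so that blocks * blockSize >= n (ceiling division).
std::size_t blocksFor(std::size_t n, std::size_t blockSize) {
    return (n + blockSize - 1) / blockSize;
}

// "Kernel" body: one logical thread adds one pair of elements.
void addKernel(const std::vector<float>& a, const std::vector<float>& b,
               std::vector<float>& out, std::size_t blockSize, LaunchIndex idx) {
    const std::size_t i = idx.block * blockSize + idx.thread;  // global index
    if (i < out.size()) {  // guard: the last block may be only partly used
        out[i] = a[i] + b[i];
    }
}

// Serial emulation of a 1D launch: visit every (block, thread) pair once.
std::vector<float> vectorAdd(const std::vector<float>& a,
                             const std::vector<float>& b,
                             std::size_t blockSize) {
    assert(a.size() == b.size() && blockSize > 0);
    std::vector<float> out(a.size());
    const std::size_t gridSize = blocksFor(a.size(), blockSize);
    for (std::size_t block = 0; block < gridSize; ++block) {
        for (std::size_t thread = 0; thread < blockSize; ++thread) {
            addKernel(a, b, out, blockSize, LaunchIndex{block, thread});
        }
    }
    return out;
}
```

With five elements and a block size of 2, `blocksFor` yields three blocks, and the bounds guard keeps the sixth logical thread from writing out of range.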
Teaching methods
Lectures
Individual consultations
Tutorials
Summary
The subject follows on from Parallel Algorithms I; the knowledge acquired there is assumed for understanding the new topics. Selected lecture notes provide the basis for practical exercises. The nVidia CUDA architecture will be presented in more detail, together with related tools for parallel programming on GPUs. Mastering parallel programming techniques and applying them to practical tasks are the most important conditions for passing the final exam.
Compulsory literature:
[1] Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley Professional, 4th edition, May 2013.
[2] Graham Sellers, Richard S. Wright, and Nicholas Haemel. OpenGL SuperBible: Comprehensive Tutorial and Reference. Addison-Wesley Professional, 6th edition, July 2013.
[3] John Cheng, Max Grossman, and Ty McKercher. Professional CUDA C Programming. Wrox, 1st edition, September 2014.
[4] Tolga Soyata. GPU Parallel Program Development Using CUDA. CRC Press, 2018.
Recommended literature:
[1] Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley Professional, 4th edition, May 2013.
[2] John Cheng, Max Grossman, and Ty McKercher. Professional CUDA C Programming. Wrox, 1st edition, September 2014.
[3] Brian Tuomanen. Hands-On GPU Programming with Python and CUDA: Explore high-performance parallel computing with CUDA. Packt Publishing Ltd, 2018.
[4] Volodymyr Kindratenko, editor. Numerical Computations with GPUs. Springer, July 2014.
[5] Bhaumik Vaidya. Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA: Effective techniques for processing complex image data in real time using GPUs. Packt Publishing Ltd, 2018.
[6] Jung W. Suh and Youngmin Kim. Accelerating MATLAB with GPU Computing: A Primer with Examples. Morgan Kaufmann, 1st edition, December 2013.
Additional study materials
Method of continuous assessment during the semester
The student will work independently on an assigned task/project. The project solution will be checked during the semester. The final evaluation of all results achieved will take place at the end of the semester.
E-learning
Other requirements
It is assumed that the student has a good knowledge of programming in C/C++.
Prerequisites
Subject has no prerequisites.
Co-requisites
Subject has no co-requisites.
Subject syllabus:
The lecture notes are designed to serve as a basis for practical exercises in computer labs.
The outline of lessons:
1. Introduction to parallel programming on GPU, a brief history, CUDA
2. CUDA architecture and its integration into a standard C++ project
3. Threads and kernel functions
4. CUDA memories, patterns and usage
5. Memory bank conflicts
6. Program execution control, distribution of an algorithm
7. Algorithm performance with respect to its parallelization on GPU
9. Optimization at the data level, efficient data structures
10. Optimization of programs with respect to maximum GPU performance
11. The cuBLAS support library
12. Case study
The outline of exercises (exercises take place in computer labs):
1. The first application in CUDA
2. Data transfers to/from GPU
3. Thread hierarchy, basic thread life cycle, limits, calling of kernel functions, parameters and restrictions
4. CUDA memories, patterns and usage
5. Memory bank conflicts, access optimization, suitable data structures
6. Streams, parallel calling of kernel functions, synchronization on several levels
7. The case study, experiments with several variants of the same program
8. Vectors and matrices, the case study, large data processing, parallel reduction
9. Introduction to several support libraries for linear algebra
10. The case study, image manipulation, double buffering, optimization at the level of blocks, registers, etc.
11. The case study, interesting research topics, outline of possible solutions, experiments
12. Program tuning, debugging with nVidia Nsight
Conditions for subject completion
Occurrence in study plans
Occurrence in special blocks
Assessment of instruction