460-4056/01 – Programming of Parallel Applications II (PPA II)

Guarantor department	Department of Computer Science	Credits	5
Subject guarantor	doc. Ing. Petr Gajdoš, Ph.D.	Subject version guarantor	doc. Ing. Petr Gajdoš, Ph.D.
Study level	undergraduate or graduate	Requirement	Optional
Year	1	Semester	summer
		Study language	Czech
Year of introduction	2012/2013	Year of cancellation	2019/2020
Intended for the faculties	FEI	Intended for study types	Follow-up Master

Instruction secured by
Login	Name	Tutor	Teacher giving lectures
GAJ03	doc. Ing. Petr Gajdoš, Ph.D.

Extent of instruction for forms of study
Form of study	Way of compl.	Extent
Full-time	Credit and Examination	2+2
Part-time	Credit and Examination	10+2

Subject aims expressed by acquired skills and competences

The main goal is to extend students' knowledge of programming parallel applications. The lessons build on an existing subject (Programming of Parallel Applications). All topics focus on the use of graphics processing units (GPUs). Students will become familiar with existing GPU architectures and frameworks for parallel programming. The CUDA architecture will be explained in more detail, given that an nVidia Research Center has been established at VŠB-TU Ostrava. Students will gain the knowledge needed to solve practical tasks using GPUs. They can apply it in their diploma theses or in several grant projects running at VŠB-TU Ostrava.

Knowledge and skills:
- orientation in the basic concepts of graphics processor architecture
- knowledge of the software architecture of a parallel program, and of problem decomposition into grids, blocks and threads
- knowledge of a selected framework for parallel programming on GPU
- understanding of the conversion of an algorithm from serial to parallel form
- task distribution over several GPUs and clusters
- ability to solve practical tasks in the area of data processing

Teaching methods

Lectures
Individual consultations
Tutorials
Project work

Summary

The subject follows on from an existing one called Programming of Parallel Applications I; the knowledge acquired there is a prerequisite for understanding the new topics. Selected lecture notes provide the basis for the practical exercises. The nVidia CUDA architecture will be presented in more detail, together with related tools for parallel programming on GPU. Mastering parallel programming techniques and applying them to practical tasks is the most important prerequisite for passing the final exam.

Compulsory literature:

1. Nir Shavit, Maurice Herlihy: The Art of Multiprocessor Programming, Morgan Kaufmann (March 14, 2008), ISBN-13: 978-0123705914
2. Edward Kandrot, Jason Sanders: CUDA by Example: An Introduction to General-Purpose GPU Programming, Addison-Wesley Professional; 1 edition (July 29, 2010), ISBN-13: 978-0131387683
3. David Kirk: Programming Massively Parallel Processors: A Hands-on Approach (Applications of GPU Computing Series), Morgan Kaufmann; 1 edition (February 5, 2010), ISBN-13: 978-0123814722

Recommended literature:

1. Timothy G. Mattson: Patterns for Parallel Programming, Addison-Wesley Professional; 1 edition (September 25, 2004), ISBN-13: 978-0321228116

Additional study materials

Study supports in the EduDocs system

Way of continuous check of knowledge in the course of semester

Each student will work alone or in a group of at most two people on an assigned project. Progress on the project will be checked during the semester. The final evaluation of the results will take place at the end of the semester.

E-learning

Other requirements

Additional requirements are placed on the student.

Prerequisites

Subject code	Abbreviation	Title	Requirement
460-4040	PPA I	Parallel Programming I	Recommended

Co-requisites

Subject has no co-requisites.

Subject syllabus:

The lecture notes are designed to serve as the basis for practical exercises in the computer labs.

1. Introduction to parallel programming on GPU, a brief history, CUDA
- evolution of graphic processors
- the beginning of programming on GPU, GPGPU
- programmable pipeline
- current architectures
2. CUDA architecture and its integration within a standard C++ project
- key features of the selected architecture
- decomposition of an algorithm
- hardware vs. software decomposition
3. Threads and kernel functions
- a thread and its meaning on GPU
- thread hierarchy, basic thread life cycle, limits
- calling of kernel functions, parameters and restrictions
4. CUDA memories, patterns and usage
- global, shared, and constant memory, registers and texture memory
- allocation and deallocation of memory
- memory alignment
- copying data from RAM to VRAM and vice versa
5. Memory bank conflicts
- access optimization
- suitable data structures
6. Program execution control, distribution of an algorithm
- streams, parallel calling of kernel functions
- synchronization on several levels: threads, blocks, GPU vs. CPU
- program distribution across multiple GPUs
7. Algorithm performance with respect to its parallelization on GPU
- case study, experiments with several variants of the same program
8. Vectors and matrices
- case study, large data processing
- parallel reduction
9. Support library CUBLAS
- introduction to several support libraries for linear algebra
10. Performance optimization
- case study, image manipulation
- double buffering
- optimization at the level of blocks, registers, etc.
11. Case study
- interesting research topics
- outline of possible solutions
- experiments
- program tuning, debugging with nVidia nSight

Conditions for subject completion

Full-time form (validity from: 2012/2013 Summer semester, validity until: 2012/2013 Summer semester)

Task name	Type of task	Max. number of points (act. for subtasks)	Min. number of points	Max. number of attempts
Exercises evaluation and Examination	Credit and Examination	100 (100)	51
Exercises evaluation	Credit	40 (40)	21
Project	Project	40	21
Examination	Examination	60 (60)	11	3
Knowledge examination	Oral examination	60	30

Mandatory attendance participation:

Conditions for subject completion and attendance at the exercises within ISP:

Occurrence in study plans

Academic year	Programme	Branch/spec.	Form	Study language	Tut. centre	Year	Type of duty
2014/2015	(N2647) Information and Communication Technology	(2612T025) Computer Science and Technology	P	Czech	Ostrava	1	Optional	study plan
2014/2015	(N2647) Information and Communication Technology	(2612T025) Computer Science and Technology	K	Czech	Ostrava	1	Optional	study plan
2013/2014	(N2647) Information and Communication Technology	(2612T025) Computer Science and Technology	P	Czech	Ostrava	1	Optional	study plan
2013/2014	(N2647) Information and Communication Technology	(2612T025) Computer Science and Technology	K	Czech	Ostrava	1	Optional	study plan
2012/2013	(N2647) Information and Communication Technology	(2612T025) Computer Science and Technology	P	Czech	Ostrava	1	Optional	study plan
2012/2013	(N2647) Information and Communication Technology	(2612T025) Computer Science and Technology	K	Czech	Ostrava	1	Optional	study plan

Occurrence in special blocks

Block name	Academic year	Form of study	Study language	Year	W	S	Type of block	Block owner

Assessment of instruction

2014/2015 Summer

2013/2014 Summer

2012/2013 Summer