9600-1020/02 – Libraries for parallel data processing (KPZD)

Guarantor department	IT4Innovations	Credits	4
Subject guarantor	Ing. Jan Martinovič, Ph.D.	Subject version guarantor	Ing. Jan Martinovič, Ph.D.
Study level	undergraduate or graduate	Requirement	Compulsory
Year	1	Semester	winter
		Study language	English
Year of introduction	2019/2020	Year of cancellation
Intended for the faculties	FEI, FMT	Intended for study types	Follow-up Master

Instruction secured by
Login	Name	Tutor	Teacher giving lectures
MAR23	Ing. Jan Martinovič, Ph.D.
MAR0486	Ing. Tomáš Martinovič, Ph.D.

Extent of instruction for forms of study
Form of study	Way of compl.	Extent
Full-time	Graded credit	2+2

Subject aims expressed by acquired skills and competences

Students get an overview of libraries and frameworks for parallel processing of large data and gain basic experience with the best-known libraries. The course introduces the basic concepts of manipulating big data and the basic paradigms and programming models for processing it. Exercises use Python, a programming language in which libraries for all the well-known frameworks are available.

Teaching methods

Lectures
Tutorials
Project work

Summary

Compulsory literature:

• Pandas documentation: http://pandas.pydata.org/
• Spark documentation: https://spark.apache.org/docs/latest/
• Tensorflow documentation: https://www.tensorflow.org/
• Keras documentation: https://keras.io/
• HENDL, J., Big data - Věda o datech, základy a aplikace, Cosmopolis, 2021.

Recommended literature:

• Nathan Marz and James Warren: Big Data - Principles and best practices of scalable realtime data systems, Manning, April 2015, ISBN 9781617290343.

Additional study materials

Study supports in the EduDocs system

Way of continuous check of knowledge in the course of semester

project development

E-learning

Other requirements

No other requirements.

Prerequisites

The subject has no prerequisites.

Co-requisites

The subject has no co-requisites.

Subject syllabus:

After completing the course, students will have an overview of libraries for parallel processing of large data and basic experience with the best-known libraries. The course introduces the basic concepts of manipulating big data and the basic paradigms and programming models for processing it. Exercises are held in Python, in which libraries exist for all the well-known frameworks.

Course outline:
1. Introduction to big data processing
2. Basic data manipulation (Pandas, Numpy)
3. Map & Reduce model (Hadoop, Spark, Flink)
4. Parallel processing of numerical data in Python (Dask)
5. Libraries for neural networks I (Tensorflow, Theano)
6. Libraries for neural networks II (Keras)
7. Parallelization of general tasks (HyperLoom)
8. Workflow systems (Luigi, Airflow)

Conditions for subject completion

Full-time form (validity from: 2019/2020 Winter semester)

Task name	Type of task	Max. number of points (act. for subtasks)	Min. number of points	Max. number of attempts
Graded credit	Graded credit	100	51	3

Valid from	Valid until	Mandatory attendance participation
May 7, 2019 9:38:05 AM	May 7, 2019 9:38:24 AM	Attendance at exercises.

Mandatory attendance participation: Attendance at exercises.

Conditions for subject completion and attendance at the exercises within ISP: Completion of all mandatory tasks within individually agreed deadlines.

Occurrence in study plans

Academic year	Programme	Branch/spec.	Form	Study language	Tut. centre	Year	Type of duty
2025/2026	(N0688A270002) Information Technology in Material Science		P	English	Ostrava	1	Compulsory	study plan
2024/2025	(N0541A170008) Computational and Applied Mathematics	(S02) Computational Methods and HPC	P	English	Ostrava	1	Compulsory	study plan
2023/2024	(N0541A170008) Computational and Applied Mathematics	(S02) Computational Methods and HPC	P	English	Ostrava	1	Compulsory	study plan
2022/2023	(N0541A170008) Computational and Applied Mathematics	(S02) Computational Methods and HPC	P	English	Ostrava	1	Compulsory	study plan
2021/2022	(N0541A170008) Computational and Applied Mathematics	(S02) Computational Methods and HPC	P	English	Ostrava	1	Compulsory	study plan
2020/2021	(N0541A170008) Computational and Applied Mathematics	(S02) Computational Methods and HPC	P	English	Ostrava	1	Compulsory	study plan
2019/2020	(N0541A170008) Computational and Applied Mathematics	(S02) Computational Methods and HPC	P	English	Ostrava	1	Compulsory	study plan

Occurrence in special blocks

Block name	Academic year	Form of study	Study language	Year	W	S	Type of block	Block owner

Assessment of instruction

The subject does not contain any assessment.