9600-1020/01 – Libraries for parallel data processing (KPZD)

Guarantor department: IT4Innovations
Credits: 4
Subject guarantor: Ing. Jan Martinovič, Ph.D.
Subject version guarantor: Ing. Jan Martinovič, Ph.D.
Study level: undergraduate or graduate
Requirement: Compulsory
Year: 1
Semester: winter
Study language: Czech
Year of introduction: 2019/2020
Year of cancellation:
Intended for the faculties: FEI
Intended for study types: Follow-up Master
Instruction secured by
Login | Name | Tutor | Teacher giving lectures
BER0134 | Ing. Jakub Beránek | |
BOH126 | Ing. Ada Böhm, Ph.D. | |
Extent of instruction for forms of study
Form of study | Way of completion | Extent
Full-time | Graded credit | 2+2
Part-time | Graded credit | 9+9

Subject aims expressed by acquired skills and competences

Students gain an overview of libraries and frameworks for parallel processing of large data and basic hands-on experience with the best-known ones. The course introduces basic concepts of big data manipulation and the basic paradigms and programming models for processing such data. Exercises use Python, a language in which libraries for all the well-known frameworks are available.

Teaching methods

Lectures
Tutorials
Project work

Summary

Compulsory literature:

• Pandas documentation: http://pandas.pydata.org/
• Spark documentation: https://spark.apache.org/docs/latest/
• Tensorflow documentation: https://www.tensorflow.org/
• Keras documentation: https://keras.io/

Recommended literature:

• Nathan Marz and James Warren: Big Data - Principles and best practices of scalable realtime data systems, Manning, April 2015 ISBN 9781617290343.

Continuous assessment of knowledge during the semester

Student project

E-learning

Other requirements

No other requirements.

Prerequisites

Subject has no prerequisites.

Co-requisites

Subject has no co-requisites.

Subject syllabus:

After completing the course, students will have an overview of libraries for parallel processing of large data and basic experience with using the best-known ones. The course presents basic concepts for manipulating large data and the basic paradigms and programming models for processing it. Exercises are held in Python, which has libraries for all the well-known frameworks.

Course outline:
1. Introduction to big data processing
2. Basic data manipulation (Pandas, Numpy)
3. Map & Reduce model (Hadoop, Spark, Flink)
4. Parallel processing of numerical data in Python (Dask)
5. Libraries for neural networks I (Tensorflow, Theano)
6. Libraries for neural networks II (Keras)
7. Parallelization of general tasks (HyperLoom)
8. Workflow systems (Luigi, Airflow)
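
The Map & Reduce model named in item 3 of the outline can be illustrated without any of the listed frameworks. The following is a minimal sketch that uses only the Python standard library. It is not course material: the function names (map_chunk, merge_counts, word_count) and the sample input are hypothetical, and threads from multiprocessing.dummy merely stand in for the cluster workers that a framework such as Hadoop, Spark, or Flink would provide.

```python
from collections import Counter
from functools import reduce
from multiprocessing.dummy import Pool  # thread pool exposing the multiprocessing Pool API


def map_chunk(lines):
    """Map step (hypothetical helper): count the words in one chunk of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step (hypothetical helper): merge two partial word counts."""
    return left + right


def word_count(lines, n_chunks=2):
    """Split the input, run the map step on each chunk in parallel, then reduce."""
    # Round-robin split so that every chunk gets a similar number of lines.
    chunks = [lines[i::n_chunks] for i in range(n_chunks)]
    with Pool(n_chunks) as pool:
        partial_counts = pool.map(map_chunk, chunks)
    # Fold the partial results into a single Counter.
    return reduce(merge_counts, partial_counts, Counter())


if __name__ == "__main__":
    sample = ["big data big ideas", "data in parallel", "parallel big data"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

Because the map step works on independent chunks and the reduce step is associative, the same two functions could in principle be handed to a distributed framework; only the scheduling in word_count would change.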

Conditions for subject completion

Full-time form (validity from: 2019/2020 Winter semester)
Task name | Type of task | Max. number of points (act. for subtasks) | Min. number of points | Max. number of attempts
Graded credit | Graded credit | 100 | 51 | 3
Mandatory attendance: attendance at exercises.

Conditions for subject completion and attendance at the exercises within ISP: Completion of all mandatory tasks within individually agreed deadlines.

Occurrence in study plans

Academic year | Programme | Branch/spec. | Spec. | Focus | Form | Study language | Tut. centre | Year | W | S | Type of duty
2024/2025 | (N0541A170007) Computational and Applied Mathematics | | (S02) Computational Methods and HPC | | P | Czech | Ostrava | 1 | | | Compulsory
2024/2025 | (N0541A170007) Computational and Applied Mathematics | | (S02) Computational Methods and HPC | | K | Czech | Ostrava | 1 | | | Compulsory
2023/2024 | (N0541A170007) Computational and Applied Mathematics | | (S02) Computational Methods and HPC | | P | Czech | Ostrava | 1 | | | Compulsory
2023/2024 | (N0541A170007) Computational and Applied Mathematics | | (S02) Computational Methods and HPC | | K | Czech | Ostrava | 1 | | | Compulsory
2022/2023 | (N0541A170007) Computational and Applied Mathematics | | (S02) Computational Methods and HPC | | K | Czech | Ostrava | 1 | | | Compulsory
2022/2023 | (N0541A170007) Computational and Applied Mathematics | | (S02) Computational Methods and HPC | | P | Czech | Ostrava | 1 | | | Compulsory
2021/2022 | (N0541A170007) Computational and Applied Mathematics | | (S02) Computational Methods and HPC | | K | Czech | Ostrava | 1 | | | Compulsory
2021/2022 | (N0541A170007) Computational and Applied Mathematics | | (S02) Computational Methods and HPC | | P | Czech | Ostrava | 1 | | | Compulsory
2020/2021 | (N0541A170007) Computational and Applied Mathematics | | (S02) Computational Methods and HPC | | P | Czech | Ostrava | 1 | | | Compulsory
2020/2021 | (N0541A170007) Computational and Applied Mathematics | | (S02) Computational Methods and HPC | | K | Czech | Ostrava | 1 | | | Compulsory
2019/2020 | (N0541A170007) Computational and Applied Mathematics | | (S02) Computational Methods and HPC | | P | Czech | Ostrava | 1 | | | Compulsory
2019/2020 | (N0541A170007) Computational and Applied Mathematics | | (S02) Computational Methods and HPC | | K | Czech | Ostrava | 1 | | | Compulsory

Occurrence in special blocks

Block name | Academic year | Form of study | Study language | Year | W | S | Type of block | Block owner

Assessment of instruction



2023/2024 Winter
2022/2023 Winter
2020/2021 Winter