9360-0193/02 – Advanced methods for data manipulation (PMZD)

Guarantor department: CNT - Nanotechnology Centre
Credits: 3
Subject guarantor: Ing. Dominik Legut, Ph.D.
Subject version guarantor: Ing. Dominik Legut, Ph.D.
Study level: undergraduate or graduate
Study language: English
Year of introduction: 2019/2020
Year of cancellation:
Intended for the faculties: FMT
Intended for study types: Follow-up Master, Bachelor
Instruction secured by
Login | Name | Tutor | Teacher giving lectures
LEG0015 | Ing. Dominik Legut, Ph.D.
Extent of instruction for forms of study
Form of study | Way of completion | Extent
Full-time | Credit and Examination | 1+2

Subject aims expressed by acquired skills and competences

Students learn to process, manipulate, and analyse large data sets ranging from gigabytes to terabytes (GiB to TiB) in size.

Teaching methods

Lectures
Tutorials
Project work

Summary

This subject prepares participants for processing and manipulating large data files. This concerns not only work on HPC supercomputers but also the handling of data from daily life. Participants will learn how to work with files of a million lines or a million columns, or files as large as several GiB.

Compulsory literature:

http://becksteinlab.physics.asu.edu/pages/unix/IntroUnix/vim_basics.html for Unix, vi, sed, etc.
http://cs.lmu.edu/~ray/notes/bash/ for Bash
https://www.tutorialspoint.com/awk/index.htm for Awk

Recommended literature:

http://www.well.ox.ac.uk/~johnb/comp/perl/intro.html

Continuous assessment of knowledge during the semester

Acquired knowledge is assessed through completion of the project and a final test with oral and written parts.

E-learning

Other requirements

Successful completion of the data processing project and passing the exam.

Prerequisites

Subject has no prerequisites.

Co-requisites

Subject has no co-requisites.

Subject syllabus:

This subject prepares participants for processing and manipulating large data files and for working with HPC supercomputers. Participants will learn to work with files of a million lines or columns, or files as large as several GiB.

1. Unix (Linux) commands for data manipulation at the command-line prompt
2. Handling and editing text data in Unix: vi editor, Nano, Midnight Commander, etc.
3. Introduction to scripting in Bash: for and while loops, etc.
4. Introduction to Awk, manipulation of data
5. How to use simple mathematics on the command line
6. Awk, data I/O formats (formatted input and output)
7. Basics of Ed and Sed, replacing strings, more complex constructions
8. Advanced methods: introduction to Perl
9. Perl II
10. Regular syntax I
11. Regular syntax II
12. Data transfer to and from HPC systems, display forwarding, use of the scheduler and batch jobs
13.-14. Practical sessions

Conditions for subject completion

Full-time form (validity from: 2019/2020 Winter semester)
Task name | Type of task | Max. number of points (act. for subtasks) | Min. number of points
Credit and Examination | Credit and Examination | 100 (100) | 51
        Credit | Credit | 40 | 20
        Examination | Examination | 60 | 35
Mandatory attendance and participation: Successful completion of the data processing project and passing the exam.

Occurrence in study plans

Academic year | Programme | Field of study | Spec. | Focus | Form | Study language | Tut. centre | Year | W | S | Type of duty
2020/2021 | (N0719A270003) Nanotechnology | | | | P | English | Ostrava | 1 | | | Compulsory
2019/2020 | (N0719A270003) Nanotechnology | | | | P | English | Ostrava | 1 | | | Compulsory

Occurrence in special blocks

Block name | Academic year | Form of study | Study language | Year | W | S | Type of block | Block owner
FMT+9360 | 2020/2021 | Full-time | English | | | | Optional | 600 - Faculty of Materials Science and Technology - Dean's Office
FMT-new subjects | 2019/2020 | Full-time | English | | | | Optional | 600 - Faculty of Materials Science and Technology - Dean's Office