9360-0193/02 – Advanced methods for data manipulation (PMZD)
Guarantor department | CNT - Nanotechnology Centre | Credits | 3 |
Subject guarantor | Ing. Dominik Legut, Ph.D. | Subject version guarantor | Ing. Dominik Legut, Ph.D. |
Study level | undergraduate or graduate | Requirement | Compulsory |
Year | 1 | Semester | summer |
| | Study language | English |
Year of introduction | 2019/2020 | Year of cancellation | |
Intended for the faculties | FMT | Intended for study types | Follow-up Master, Bachelor |
Subject aims expressed by acquired skills and competences
Students will be able to process, manipulate, and analyse large data sets of GiB to TiB size.
Teaching methods
Lectures
Tutorials
Project work
Summary
This subject prepares participants for processing and manipulating large data files. This concerns not only work on HPC supercomputers but also the handling of everyday data. Participants will learn how to work with files of a million lines or a million columns, or files as large as several GiB.
Compulsory literature:
http://becksteinlab.physics.asu.edu/pages/unix/IntroUnix/vim_basics.html for Unix, vi, sed, etc.
http://cs.lmu.edu/~ray/notes/bash/ for bash
https://www.tutorialspoint.com/awk/index.htm for awk
Recommended literature:
http://www.well.ox.ac.uk/~johnb/comp/perl/intro.html
Additional study materials
Continuous assessment during the semester
Acquired knowledge is assessed through completion of the project and a final oral and written test.
E-learning
Other requirements
Successful completion of the data processing project and of the exam.
Prerequisites
Subject has no prerequisites.
Co-requisites
Subject has no co-requisites.
Subject syllabus:
This subject prepares participants for processing and manipulating large data files and for working with HPC supercomputers. Participants will learn to work with files of a million lines or columns, or files as large as several GiB.
1. Unix (Linux) commands for data manipulation at the command-line prompt
2. Handling and editing text data in Unix: the vi editor, nano, Midnight Commander, etc.
3. Introduction to scripting in Bash: for and while loops, etc.
4. Introduction to Awk, manipulation of data
5. How to do simple mathematics on the command line
6. Awk, data I/O formats (formatted input and output)
7. Basics of ed and sed: replacing strings, more complex constructions
8. Advanced methods - Introduction to Perl
9. Perl II
10. Regular syntax I
11. Regular syntax II
12. Data manipulation to and from HPC systems, display forwarding, use of the scheduler and batch jobs
13. - 14. Practical sessions
Conditions for subject completion
Occurrence in study plans
Occurrence in special blocks
Assessment of instruction
The subject contains no evaluation.