440-4224/01 – Speech Processing (ZŘS)

Guarantor department: Department of Telecommunications
Credits: 4
Subject guarantor: Ing. Jan Skapa, Ph.D.
Subject version guarantor: Ing. Jan Skapa, Ph.D.
Study level: undergraduate or graduate
Requirement: Optional
Year: 2
Semester: winter
Study language: Czech
Year of introduction: 2016/2017
Year of cancellation:
Intended for the faculties: FEI
Intended for study types: Follow-up Master
Instruction secured by
Login | Name | Tutor | Teacher giving lectures
PAR0038 | Ing. Pavol Partila, Ph.D.
SKA109 | Ing. Jan Skapa, Ph.D.
TOV020 | Ing. Jaromír Továrek, Ph.D.
Extent of instruction for forms of study
Form of study | Way of completion | Extent
Full-time | Credit and Examination | 2+2
Combined | Credit and Examination | 2+8

Subject aims expressed by acquired skills and competences

After completing the course, students will be able to solve problems in the field of speech processing. They will learn the basic approaches and methods of speech signal processing, such as feature extraction and classification with neural networks or hidden Markov models. They will be able to implement a simple system for speaker identification or for emotion recognition from a speech signal.

Teaching methods

Lectures
Tutorials
Experimental work in labs

Summary

Speech processing is an important part of information and communication technology. The goal of the course is to understand the basic tasks of speech processing: speaker identification (SI), automatic speech recognition (ASR), text to speech (TTS) and speech emotion recognition (SER). The acquired skills can be used to design complex systems that rely on speech processing.

Compulsory literature:

MCLOUGHLIN, Ian. Speech and audio processing: a Matlab-based approach. Cambridge: Cambridge University Press, 2016. ISBN 978-1-107-08546-6.

Recommended literature:

BAILLY, Gérard, Pascal PERRIER and Eric VATIKIOTIS-BATESON, ed. Audiovisual speech processing. Cambridge: Cambridge University Press, 2012. ISBN 978-1-107-00682-9.
OGUNFUNMI, Tokunbo, Roberto TOGNERI and Madihally NARASIMHA, ed. Speech and audio processing for coding, enhancement and recognition. New York: Springer, 2015. ISBN 978-1-4939-1455-5.

Continuous assessment during the semester

Test: 0-15 points
Project: 0-25 points

E-learning

http://lms.vsb.cz/

Additional requirements for the student

No additional requirements are placed on the student.

Prerequisites

The subject has no prerequisites.

Co-requisites

The subject has no co-requisites.

Subject syllabus:

1. Introduction to the subject and to speech processing, practical applications and uses.
2. Speech production, basic concepts, speech preprocessing (DC offset, pre-emphasis, segmentation, windowing).
3. Basic features: energy, zero-crossing rate (ZCR), jitter, shimmer, autocorrelation.
4. Speech signal analysis: extraction of the fundamental frequency F0 and its use, recognition of voiced and unvoiced consonants.
5. Spectrum, spectrogram, spectral analysis of vowels and consonants.
6. Cepstrum, cepstral analysis, Mel-frequency cepstral coefficients (MFCC) and other speech parameters.
7. Introduction to classification: SOM, k-NN, GMM, ANN and classifier fusion.
8. Speaker identification (SI) and possible approaches.
9. Speech emotion recognition (SER), stress recognition.
10. Automatic speech recognition (ASR) and possible approaches.
11. Hidden Markov model (HMM): structure, training and use for speech recognition (Viterbi algorithm and token passing).
12. Speech synthesis and vocoders.
13. Text to speech (TTS), speech corpora and open-source projects.
14. Current trends in speech processing.

Exercise syllabus:

1. Introduction, safety, conditions for subject completion.
2. Practical exercises: speech preprocessing (DC offset, pre-emphasis, segmentation, windowing).
3. Practical exercises: feature extraction (energy, zero-crossing rate, fundamental frequency).
4. Practical exercises: spectral analysis of the speech signal.
5. Practical exercises: feature extraction (MFCC, LPC).
6. Test and assignment of the project.
7. Design of a speaker identification system (GMM, ANN).
8. Example of a project proposal.
9. Design of a speech emotion recognition system (GMM, ANN).
10. Design of an automatic speech recognition system (DTW, HMM).
11. Speech synthesis.
12. Classifier fusion.
13. Presentation of projects.
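The preprocessing and basic-feature topics named in the syllabus (DC offset, pre-emphasis, segmentation, windowing, energy, zero-crossing rate) could be sketched roughly as follows. This is only an illustrative sketch in plain Python, not course material: all function names are hypothetical, and the pre-emphasis coefficient 0.97 and the Hamming window are common choices assumed here, not values stated by the course.

```python
import math


def remove_dc_offset(signal):
    """Subtract the mean so the signal is centred on zero."""
    mean = sum(signal) / len(signal)
    return [s - mean for s in signal]


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1].

    alpha = 0.97 is an assumed, commonly used value.
    """
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def split_into_frames(signal, frame_len, hop):
    """Segment into overlapping frames; an incomplete tail is dropped."""
    return [
        signal[i:i + frame_len]
        for i in range(0, len(signal) - frame_len + 1, hop)
    ]


def hamming_window(frame):
    """Weight the frame with a Hamming window (frame needs at least 2 samples)."""
    n = len(frame)
    return [
        x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def extract_features(signal, frame_len, hop, alpha=0.97):
    """Return one (energy, zero-crossing rate) pair per frame."""
    cleaned = pre_emphasis(remove_dc_offset(signal), alpha)
    features = []
    for frame in split_into_frames(cleaned, frame_len, hop):
        windowed = hamming_window(frame)
        features.append(
            (short_time_energy(windowed), zero_crossing_rate(windowed))
        )
    return features
```

For a signal sampled at 8 kHz, `extract_features(signal, frame_len=200, hop=80)` would correspond to 25 ms frames with a 10 ms step; these frame settings are likewise an assumption chosen for illustration.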

Conditions for subject completion

Full-time form (validity from: 2017/2018 Winter semester)
Task name | Type of task | Max. number of points (act. for subtasks) | Min. number of points
Credit and Examination | Credit and Examination | 100 (100) | 51
Credit | Credit | 36 | 20
Examination | Examination | 64 | 15
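The point limits above can be read as three separate conditions. The sketch below assumes that all three minimums (credit, examination and overall total) must be met at the same time; that interpretation is an assumption, and the function name is hypothetical. Note that the two partial minimums add up to only 35 points, so meeting them alone would not reach the overall minimum of 51.

```python
# Limits taken from the table above.
CREDIT_MAX, CREDIT_MIN = 36, 20
EXAM_MAX, EXAM_MIN = 64, 15
TOTAL_MIN = 51


def passes_course(credit_points, exam_points):
    """Check the three minimums, assuming all must hold simultaneously."""
    if not (0 <= credit_points <= CREDIT_MAX and 0 <= exam_points <= EXAM_MAX):
        raise ValueError("points outside the range allowed by the table")
    return (
        credit_points >= CREDIT_MIN
        and exam_points >= EXAM_MIN
        and credit_points + exam_points >= TOTAL_MIN
    )
```

Under this reading, a student with 20 credit points would need at least 31 examination points, while a student with the full 36 credit points would need only the examination minimum of 15.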
Mandatory attendance participation:


Occurrence in study plans

Academic year | Programme | Field of study | Spec. | Form | Study language | Tut. centre | Year | W | S | Type of duty
2019/2020 | (N2647) Information and Communication Technology | (2601T013) Telecommunication Technology | | P | Czech | Ostrava | 2 | | | Optional
2019/2020 | (N2647) Information and Communication Technology | (2612T059) Mobile Technology | | P | Czech | Ostrava | 2 | | | Optional
2019/2020 | (N2647) Information and Communication Technology | (2601T013) Telecommunication Technology | | K | Czech | Ostrava | 2 | | | Optional
2019/2020 | (N2647) Information and Communication Technology | (2612T059) Mobile Technology | | K | Czech | Ostrava | 2 | | | Optional
2018/2019 | (N2647) Information and Communication Technology | (2601T013) Telecommunication Technology | | P | Czech | Ostrava | 2 | | | Optional
2018/2019 | (N2647) Information and Communication Technology | (2612T059) Mobile Technology | | P | Czech | Ostrava | 2 | | | Optional
2018/2019 | (N2647) Information and Communication Technology | (2601T013) Telecommunication Technology | | K | Czech | Ostrava | 2 | | | Optional
2018/2019 | (N2647) Information and Communication Technology | (2612T059) Mobile Technology | | K | Czech | Ostrava | 2 | | | Optional
2017/2018 | (N2647) Information and Communication Technology | (2601T013) Telecommunication Technology | | P | Czech | Ostrava | 2 | | | Optional
2017/2018 | (N2647) Information and Communication Technology | (2601T013) Telecommunication Technology | | K | Czech | Ostrava | 2 | | | Optional
2017/2018 | (N2647) Information and Communication Technology | (2612T059) Mobile Technology | | P | Czech | Ostrava | 2 | | | Optional
2017/2018 | (N2647) Information and Communication Technology | (2612T059) Mobile Technology | | K | Czech | Ostrava | 2 | | | Optional

Occurrence in special blocks

Block name | Academic year | Form of study | Study language | Year | W | S | Type of block | Block owner