460-4070/02 – Business Intelligence and Data Warehouses I (BI I)
Gurantor department | Department of Computer Science | Credits | 4 |
Subject guarantor | prof. Ing. Michal Krátký, Ph.D. | Subject version guarantor | prof. Ing. Michal Krátký, Ph.D. |
Study level | undergraduate or graduate | Requirement | Optional |
Year | 1 | Semester | summer |
| | Study language | English |
Year of introduction | 2015/2016 | Year of cancellation | 2022/2023 |
Intended for the faculties | FEI | Intended for study types | Follow-up Master |
Subject aims expressed by acquired skills and competences
A student is able to orient in the domain of Business Intelligence and Data Warehousing (DWH), in particular practical knowledge of DWH data modeling methodology, ETL processes and data integration to data warehouses. Moreover, the student knows the methodology and he/she is able to create a reporting layer - data marts for analytics and reporting over data. The student is capable to describe the core basis of necessary data processing and operations with data in DWH database.
Teaching methods
Lectures
Tutorials
Summary
The course is a follow-up to the Database and Information Systems 2 course with focus on applying the knowledge to the domain of Business Intelligence and Data Warehouse. The content of lectures is based on getting familiar with principles of Data Warehousing, data modelling specifics, design of respective layers of a data warehouse, data integration using SQL scripts and ETL tools, data transformations within respective layers of a data warehouse including final aggregations of data in order to be presented as business information in a graphical form and layout, or in a form of data extracts for further processing, where the second part of the course carries on in the next term. Another part of the course is the methodology for the solution design of a DWH and data integration projects. During practical sessions students will make use of the methodology in a practical example of data warehouse design and development in an SQL database environment within scope of their final work.
Compulsory literature:
1. L. T. Moss, Shaku Atre: Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications. 576p, Addison-Wesley Professional, 2003.
Recommended literature:
1. R. Kimball, M. Ross: The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling. 600p, Wiley, 2013.
2. R. Kimball, J. Caserta: The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data. 528p, Wiley, 2004.
3. C. Batini, M. Scannapieco: Data Quality: Concepts, Methodologies and Techniques (Data-Centric Systems and Applications). Springer, 2010.
Way of continuous check of knowledge in the course of semester
Partial tasks in topics of lectures are checked on practices.
E-learning
Other requirements
Knowledge of the following topics: SQL, procedural extensions of SQL, physical database design, distributed and NoSQL DBMS.
Prerequisities
Subject has no prerequisities.
Co-requisities
Subject has no co-requisities.
Subject syllabus:
1. Introduction to BI and Data Warehousing
2. Data Warehouses (DWH) - Data modelling principles
3. Data Warehouses - multi-dimensional modelling, data layers - staging, main model, presentation layer (views, indexes).
4. Data Warehouses (DWH) - historization and SCD, using surrogate keys
5. Data marts
6. Operational Data Stores
7. Data Integration - ETL.
8. ETL Framework
9. Data Integration - ETL, transformation definitions - ETL development
10. ETL and Data Integration tools
11. Business Dictionaries and transformation rules
12. Data Quality
13. Analytics over DWH
Computer practices:
1. Practicals plan, platform introduction, sample databases.
2. Data modeling of tables and views in DWH.
3. Creation of DWH layers
4. Data historization, data indexing, surrogate keys creation
5. Fine tuning - querying (query definition for aggregations), analytical functions
6. Simple data pumps
7. Definition of ETL framework
8. Mappings definition, data workflow definition
9. ETL mappings - coding
10. ETL tool - Informatica
11. ETL tool - IBM DataStage/MS SSIS
12. Data profiling, Data cleansing - business rules definition
13. Test
Conditions for subject completion
Occurrence in study plans
Occurrence in special blocks
Assessment of instruction
Předmět neobsahuje žádné hodnocení.