Bachelor's programme "Applied Data Analysis"

Data Engineering

Academic year: 2022/2023
Language of instruction: English
Credits: 5

Instructors


Шумский Леонид Дмитриевич

Course Syllabus

Abstract

The course "Data Engineering" covers the work of data engineers, who provide the foundation of the analytical process: delivering data to the analyst's desk. For this to happen, someone must ensure that the data is located, loaded on the required schedule or on request, validated, converted into a usable form, and protected according to user roles. In class, we will cover the basics of the data engineering profession: how and where data is stored, what to do when data is not fit for use, and how to reduce the cost of analytics. The course includes extensive practice, in which students solve hands-on engineering problems drawn from the IT departments of Russian businesses.

Learning Objectives

  • Understand the specifics of data management tasks and their application in business, and gain practical skills with data engineering tools.

Expected Learning Outcomes

  • Learn the basics of data management as a discipline: data structures and sources, and data manipulation methods.
  • Be able to work with data engineering tools and modern databases such as PostgreSQL or ClickHouse.
  • Understand why data quality matters and how to keep it high.
  • Understand what data marts are and how to build them.
  • Identify situations where data must be handled with care (sensitive business and personal data) and know which protection mechanisms are available.
  • Learn about the modern data processing stack and how it may change in the future.

Course Contents

  • Data structures
  • Data manipulation
  • Cleaning and validating data
  • Industrial practice
  • Data marts
  • Protecting and masking data

Assessment Elements

  • non-blocking Homeworks
  • non-blocking Exam
    Written remote exam (online), held at the end of the course.

Interim Assessment

  • 2022/2023 3rd module
    0.5 * Exam + 0.5 * Homeworks

Bibliography

Recommended Core Bibliography

  • Basic concepts in data structures, Klein, S. T., 2016
  • Large scale and big data: processing and management, 2016

Recommended Additional Bibliography

  • Artificial intelligence and big data for financial risk management: intelligent applications, 2023
  • Integrating deep learning algorithms to overcome challenges in big data analytics, 2022