Master's Programme 2022/2023

Machine Learning I

When taught: 1st year, modules 3 and 4
Mode of study: with an online course
Online hours: 72
Audience: students of the home campus
Language: English
Credits: 6
Contact hours: 32

Course Syllabus

Abstract

Using data to make predictions, test hypotheses, and estimate models is an important skill in the current job market. Many companies collect large amounts of data and make data-driven decisions. Machine learning is disrupting many fields and promises to achieve superhuman performance in the coming decades. Statistical analysis makes it possible to test hypotheses and determine which model fits the data best. In this course we will cover different methods for supervised and unsupervised learning to build the toolkit a successful data scientist needs. For some of the methods we will go into detail to learn why and how they work. We will also touch on the ethical implications of data science in the age of big data and apply the methods learned to real business data sets.
Learning Objectives

  • Students will feel comfortable navigating the different methods of machine learning and will develop an understanding of why these methods work and how to extend them
Expected Learning Outcomes

  • Understand the concept of a data-generating process and how it differs from the concept of a model
  • Learn hypothesis testing in more detail
  • Understand different methods for supervised learning, such as regression, random forests, and gradient boosting
  • Be able to design and implement efficient feature engineering processes, including handling missing values, categorical variables, time series, and text
  • Be able to create reproducible scripts and pipelines for data processing and model training in Python
  • Know how to prepare data for training models, including scaling, normalization, and sampling
Course Contents

  • Python for data science
  • Supervised machine learning
  • Unsupervised machine learning
  • Machine learning principles: cross-validation, feature selection, metrics
  • Tools for data science
Assessment Elements

  • non-blocking Kaggle competition
  • non-blocking Exam
  • non-blocking Assignment
Interim Assessment

  • 2022/2023 4th module
    0.2 * Kaggle competition + 0.5 * Exam + 0.3 * Assignment
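
As a worked illustration of the formula above, here is a minimal Python sketch. The weights come from the syllabus. The function name, dictionary keys, and sample scores are hypothetical. The syllabus does not state a grading scale or a rounding rule, so none is applied to the result.

```python
# Weights copied from the syllabus formula; the key names are illustrative.
WEIGHTS = {"kaggle": 0.2, "exam": 0.5, "assignment": 0.3}


def interim_grade(scores):
    """Return the weighted sum of the three assessment components."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, chosen only to show the arithmetic:
# 0.2 * 8 + 0.5 * 6 + 0.3 * 9 = 1.6 + 3.0 + 2.7
example = {"kaggle": 8, "exam": 6, "assignment": 9}
print(round(interim_grade(example), 2))  # rounded for display only
```

Because the weights sum to 1, the result stays on the same scale as the component scores, and the exam alone accounts for half of it.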
Bibliography

Recommended Core Bibliography

  • Silver, N. (2012). The Signal and the Noise: Why So Many Predictions Fail-but Some Don’t. New York: Penguin Books. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1122593

Recommended Additional Bibliography

  • Hansen, B. E. (2013). Econometrics. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.C0DB9E1E