
Bachelor's Programme "Applied Data Analysis"

08
February

Applied Statistics for Machine Learning

2025/2026
Academic year
ENG
Taught in English
4
Credits
Status:
Compulsory course
When taught:
Year 3, modules 3 and 4

Instructor

Course Syllabus

Abstract

The course "Applied Statistics for Machine Learning" is designed for students in the Data Analysis and Business Analytics program who want a thorough understanding of the aspects of applied statistics needed to work with machine learning algorithms. It covers fundamental topics in statistical analysis, such as constructing multivariate confidence intervals, hypothesis testing in the presence of nuisance parameters, and regression and analysis of variance on nonstandard data, as well as modern methods that significantly improve data analysis, such as Markov chain sampling and variational inference. It also covers several modern regression methods, including Gaussian process regression, Bayesian regression, and generalized linear models.
Learning Objectives

  • Developing skills for inference on multivariate data
  • Understanding of basic approaches to interpreting probability
  • Acquiring skills in basic multivariate data analysis
  • Understanding basic concepts of working with probability density distributions
  • Learning advanced algorithms for sampling multivariate data
Expected Learning Outcomes

  • Provides rationale for the choice of estimating method for nuisance parameters.
  • Applies Bayesian inference to construct confidence intervals.
  • Applies frequentist inference to construct confidence intervals.
  • Analyzes experimental results.
  • Evaluates the robustness of solutions.
  • Understands multiple testing issues.
  • Applies post hoc correction to results.
  • Calculates correction for multiple testing.
  • Analyzes the appropriateness of cross-validation.
  • Provides rationale for the choice of metric for probability distributions.
  • Proves inequalities between metrics.
  • Proposes corrections for estimating values using cross-validation.
  • Identifies the optimal sampling method.
  • Applies Markov chains for multivariate Bayesian inference.
  • Uses Gaussian processes to construct surrogate models.
Course Contents

  • Bayesian and Frequentist Inference
  • Analysis of variance and regression
  • Machine learning statistics
  • Sampling
  • Gaussian processes
Assessment Elements

  • non-blocking Homework 1
  • non-blocking Homework 2
  • non-blocking Homework 3
  • non-blocking Homework 4
  • non-blocking Colloquium
  • non-blocking Written Exam
Interim Assessment

  • 2025/2026 4th module
    0.1 * Colloquium + 0.2 * Homework 1 + 0.2 * Homework 2 + 0.2 * Homework 3 + 0.2 * Homework 4 + 0.1 * Written Exam
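
The weighted sum above can be checked with a short calculation. The following is a minimal sketch: the weights are copied from the formula, while the function name `interim_grade`, the example scores, and the 10-point scale are assumptions made for illustration. The syllabus does not state a rounding rule, so none is applied.

```python
# Weights copied from the interim assessment formula for the 4th module.
WEIGHTS = {
    "Colloquium": 0.1,
    "Homework 1": 0.2,
    "Homework 2": 0.2,
    "Homework 3": 0.2,
    "Homework 4": 0.2,
    "Written Exam": 0.1,
}


def interim_grade(scores):
    """Return the weighted sum of the assessment scores (hypothetical helper).

    `scores` maps each assessment element name to its score.
    No rounding is applied, since the syllabus does not specify one.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores on an assumed 10-point scale.
example_scores = {
    "Colloquium": 8,
    "Homework 1": 10,
    "Homework 2": 9,
    "Homework 3": 7,
    "Homework 4": 6,
    "Written Exam": 5,
}
print(interim_grade(example_scores))
```

Because the four homework assignments together carry 0.8 of the total weight, they dominate the result; the colloquium and the written exam contribute 0.1 each.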
Bibliography

Recommended Core Bibliography

  • A first course in Bayesian statistical methods, Hoff, P. D., 2009
  • Categorical data analysis, Agresti, A., 2002
  • Computer age statistical inference : algorithms, evidence, and data science, Efron, B., 2017
  • Data analysis using regression and multilevel/hierarchical models, Gelman, A., 2009
  • The elements of statistical learning : data mining, inference, and prediction, Hastie, T., 2017
  • Глубокое обучение [Deep Learning], Goodfellow, I., 2017
  • Наглядная математическая статистика : учеб. пособие для вузов [Visual Mathematical Statistics: a textbook for universities], Lagutin, M. B., 2019

Recommended Additional Bibliography

  • All of statistics : a concise course in statistical inference, Wasserman, L., 2004
  • The Bayesian way : introductory statistics for economists and engineers, Nyberg, S. O., 2019
  • Гауссовские случайные процессы [Gaussian Random Processes], Ibragimov, I. A., 1970
  • Двадцать лекций о гауссовских процессах [Twenty Lectures on Gaussian Processes], Piterbarg, V. I., 2015

Authors

  • DERKACH DENIS ALEKSANDROVICH