• A
  • A
  • A
  • ABC
  • ABC
  • ABC
  • А
  • А
  • А
  • А
  • А
Regular version of the site
Master 2021/2022

Statistical Learning Theory

Type: Elective course (Data Science)
Area of studies: Applied Mathematics and Informatics
When: 2 year, 1, 2 module
Mode of studies: offline
Open to: students of all HSE University campuses
Master’s programme: Data Science
Language: English
ECTS credits: 8
Contact hours: 54

Course Syllabus

Abstract

We study a theory that inspired the development of two important families of machine learning algorithms: support vector machines and boosting. More generally, in a typical classification task we are given (1) a dataset for training and (2) a familyof classifiers, indexed by some parameters that need to be tuned. A learning algorithm uses the training set to select one of the classifiers, in other words, tune the parameters. For example, given a neural network, every choice of the weights specifies a classifier and the learning algorithm assigns values to the weights. On one side, we want to have a large set of classifiers to model all structure in the training data. This might lead to a large number of parameters to train. On the other hand a too large set of classifiers might lead to a decrease of accuracy on unseen examples. This is called overfitting. We study a theory that quantifies overfitting theoretically. Moreover, support vector machines and boosting algorithms can be seen as algorithms that optimize the trade-off discussed above under some additional assumptions. Moreover, the theory can determine good values for meta-parameters in machine learning algorithms that otherwise need to be tuned using cross-validation. We also study some recent deep boosting algorithms that were developed using the theory. These algorithms are currently among the best for classification tasks when the number of classes is high. Finally, we study the online mistake model. This model is more general but its mathematical analysis has many similarities with the theories above.
Learning Objectives

Learning Objectives

  • Understanding basic concepts from statistical learning theory.
  • Being able to understand the connection between these models and many machine learning algorithms.
  • Training of mathematical skills such as abstract thinking, formal thinking and problem solving; with emphasis on statistics.
Expected Learning Outcomes

Expected Learning Outcomes

  • Calculate sizes of trainingsets for several machinelearning tasks in the context of PAC-learning (and hence calculate VC-dimensions)
  • Deeper understanding of boosting algorithms and support vector machines.
  • Knowledge of several paradigms in statistical learning theory to select models (Structural risk minimization, Maximal likelihood, Minimal Description Length, etc.)
  • Theoretical understanding of several online learning algorithms and learning with expert advice.
  • Understand the link between cryptography and computational limitations of statistical learning.
Course Contents

Course Contents

  • Probably approximately correct learning
  • VC-dimensions
  • Structural risk minimization and variants
  • The time complexity of learning and cryptography
  • Boosting
  • Rademacher complexities
  • Support vector machines and margin theory
  • Mutliclass classification and DeepBoost
  • Online learning
  • Reinforcement learning
Assessment Elements

Assessment Elements

  • non-blocking Homework
  • non-blocking Colloqium
  • non-blocking Exam
Interim Assessment

Interim Assessment

  • 2021/2022 2nd module
    0.35 * Homework + 0.3 * Colloqium + 0.35 * Exam
Bibliography

Bibliography

Recommended Core Bibliography

  • Mohri, M., Talwalkar, A., & Rostamizadeh, A. (2012). Foundations of Machine Learning. Cambridge, MA: The MIT Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=478737

Recommended Additional Bibliography

  • Vitaly Kuznetsov, Mehryar Mohri, & Umar Syed. (n.d.). Multi-Class Deep Boosting.