Delivered at: International Laboratory for Applied Network Research

Course type: Elective course

When: Year 1, Module 3

Instructor

Vashchenko, Vasilisa

Abstract

Machine learning is grounded in statistical learning theory, which draws on statistics and functional analysis. The goal of the course is to study the properties of learning algorithms in a statistical framework. This study serves a two-fold purpose: it provides strong guarantees for existing algorithms, and it suggests new algorithmic approaches that are potentially more powerful. The course covers the theory and methods of statistical learning in detail, in particular complexity regularization (i.e., how to choose the complexity of a model when it has to be learned from data). This issue is at the heart of the most successful and popular machine learning algorithms today and is critical to their success. The course is an elective and uses both R and Python.

Learning Objectives

The course gives students a foundation for developing and conducting their own research as well as for evaluating the research of others.

Expected Learning Outcomes

Be able to critically review published empirical research that uses applied statistical methods.
Be able to criticize constructively and identify issues with applied linear models in published work.
Be able to calculate sizes of training sets for several machine learning tasks in the context of PAC-learning (and hence calculate VC-dimensions).
Have well-trained mathematical skills such as abstract thinking, formal thinking, and problem solving.
Have an in-depth understanding of boosting algorithms and a few other machine learning algorithms.
Have a theoretical understanding of several online learning algorithms and of learning with expert advice.
Know several paradigms for model selection in statistical learning theory (structural risk minimization, maximum likelihood, minimum description length, etc.).
Know the basic concepts from statistical learning theory.
Know the link between cryptography and computational limitations of statistical learning.
Know the theoretical foundations of why some machine learning algorithms are successful in a wide range of applications, with special emphasis on statistics.
Be able to apply the basic concepts from machine learning theory.
Be able to identify the type of machine learning problem at hand, e.g. classification, regression, or clustering.
Be able to differentiate between supervised and unsupervised learning methods and understand their benefits and limitations.
Be able to apply key supervised learning methods with an understanding of their theory: decision trees, linear regression, logistic regression, quantile regression, and variations of regression for non-Gaussian distributions of the target variable.
Be able to differentiate and correctly apply the most common approaches to ensemble learning (random forests, gradient boosting, stacking, blending, etc.) and to explain their benefits and limitations.
Be able to identify and tackle issues related to overfitting and model instability.
Be able to apply basic tools and approaches to automated text processing and to incorporate text data into machine learning solutions.
Be able to systematize and prioritize best practices in experiment tracking and sustainable ML development.
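
As an illustration of the outcome on calculating training set sizes in the PAC-learning setting, here is a minimal sketch. It uses two standard textbook bounds for a finite hypothesis class H: the realizable-case bound m >= (ln|H| + ln(1/delta)) / epsilon and the agnostic bound m >= (ln|H| + ln(2/delta)) / (2 * epsilon^2), which follows from Hoeffding's inequality and a union bound. The function names and the example hypothesis class are assumptions made for this illustration and are not taken from the course materials.

```python
import math


def _check(hypothesis_count, epsilon, delta):
    # Validate inputs shared by both bounds.
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")


def pac_sample_size_realizable(hypothesis_count, epsilon, delta):
    """Samples sufficient for a consistent learner over a finite class
    (hypothetical helper): m >= (ln|H| + ln(1/delta)) / epsilon."""
    _check(hypothesis_count, epsilon, delta)
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / epsilon)


def pac_sample_size_agnostic(hypothesis_count, epsilon, delta):
    """Samples sufficient for uniform convergence over a finite class
    (hypothetical helper): m >= (ln|H| + ln(2/delta)) / (2 * epsilon**2)."""
    _check(hypothesis_count, epsilon, delta)
    return math.ceil(
        (math.log(hypothesis_count) + math.log(2 / delta)) / (2 * epsilon ** 2)
    )


# Assumed example: conjunctions over 10 Boolean variables, where each variable
# appears positively, negatively, or not at all, so |H| = 3**10.
n_hypotheses = 3 ** 10
m_realizable = pac_sample_size_realizable(n_hypotheses, epsilon=0.1, delta=0.05)
m_agnostic = pac_sample_size_agnostic(n_hypotheses, epsilon=0.1, delta=0.05)
print(m_realizable, m_agnostic)
```

The agnostic bound grows with 1/epsilon^2 rather than 1/epsilon, so it asks for noticeably more data at the same accuracy. For infinite classes the ln|H| term is replaced by a term depending on the VC-dimension, with constants that differ between textbooks.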

Assessment Elements

Homework Assignments
Quizzes
Final In-Class or Take-home exam
In-Class Labs

Interim Assessment

2023/2024 3rd module
0.5 * Final In-Class or Take-home exam + 0.2 * Homework Assignments + 0.2 * In-Class Labs + 0.1 * Quizzes
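
The weighting above can be checked with a short calculation. The sketch below applies exactly the stated weights; the function name, the dictionary keys, and the example component grades are hypothetical, and no rounding rule is applied because the syllabus does not state one.

```python
# Weights taken from the interim assessment formula above.
WEIGHTS = {
    "exam": 0.5,       # Final In-Class or Take-home exam
    "homework": 0.2,   # Homework Assignments
    "labs": 0.2,       # In-Class Labs
    "quizzes": 0.1,    # Quizzes
}


def interim_grade(grades):
    """Return the weighted sum of component grades (hypothetical helper).

    `grades` maps each key of WEIGHTS to a numeric grade on a common scale.
    """
    missing = set(WEIGHTS) - set(grades)
    if missing:
        raise KeyError(f"missing component grades: {sorted(missing)}")
    return sum(WEIGHTS[name] * grades[name] for name in WEIGHTS)


# Assumed example grades, all on the same scale.
example = {"exam": 8, "homework": 6, "labs": 10, "quizzes": 7}
print(interim_grade(example))
```

Because the weights sum to 1, the result stays on the same scale as the component grades.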

Bibliography

Recommended Core Bibliography

Alpaydin, E. (2014). Introduction to Machine Learning (3rd ed.). Cambridge, MA: The MIT Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=836612
Harman, G., & Kulkarni, S. (2007). Reliable Reasoning : Induction and Statistical Learning Theory. Cambridge, Mass: A Bradford Book. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=189264
Haroon, D. (2017). Python Machine Learning Case Studies : Five Case Studies for the Data Scientist. [Berkeley, CA]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1623520
Kulkarni, S., & Harman, G. (2011). An Elementary Introduction to Statistical Learning Theory. Hoboken, N.J.: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=391376
Mohri, M., Talwalkar, A., & Rostamizadeh, A. (2012). Foundations of Machine Learning. Cambridge, MA: The MIT Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=478737

Recommended Additional Bibliography

Lantz, B. (2019). Machine Learning with R : Expert Techniques for Predictive Modeling (3rd ed.). Birmingham, UK: Packt Publishing. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=2106304
Murphy, K. P. (2012). Machine Learning : A Probabilistic Perspective. Cambridge, Mass: The MIT Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=480968
Ramasubramanian, K., & Singh, A. (2017). Machine Learning Using R. [Place of publication not identified]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1402990
Sarkar, D., Bali, R., & Sharma, T. (2018). Practical Machine Learning with Python : A Problem-Solver’s Guide to Building Real-World Intelligent Systems. [United States]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1667293

Authors

BOLDYREVA LYUBOV VLADIMIROVNA
KHVATSKIY GRIGORIY SERGEEVICH

Master’s Programme 'Data Analytics and Social Statistics'

Machine Learning