Machine Learning and Data Mining

Master 2019/2020

Type: Elective course (Data Science)

Area of studies: Applied Mathematics and Informatics

Delivered by: School of Data Analysis and Artificial Intelligence

Where: Faculty of Computer Science

When: 2nd year, modules 1 and 2

Mode of studies: offline

Instructors: Artem Maevskiy

Master’s programme: Data Science

Language: English

ECTS credits: 8

Contact hours: 56


Abstract

The course "Machine Learning and Data Mining" introduces students to a new and actively evolving interdisciplinary field of modern data analysis. Having started as a branch of Artificial Intelligence, it has attracted the attention of physicists, computer scientists, economists, computational biologists, linguists and others, and has become a truly interdisciplinary field of study. Despite the variety of data sources that can be analyzed, the objects and attributes that form a particular dataset possess common statistical and structural properties. The interplay between known and unknown data gives rise to complex pattern structures and to the machine learning methods that are the focus of the study. The course covers methods of Machine Learning and Data Mining. Special attention is given to hands-on practical analysis of real-world datasets using available software tools and modern programming languages and libraries.

Learning Objectives

To familiarize students with the new, rapidly evolving field of machine learning and data mining, and to provide practical experience in the analysis of real-world data.

Expected Learning Outcomes

Students know the statements of the No-Free-Lunch theorems and explain the role of prior knowledge in solving machine learning problems.
Students derive the bias-variance decomposition for MSE and “0-1” losses, and show how regularization affects the tradeoff.
Students explain the concepts of bootstrapping, bagging and boosting, and justify the choice of a particular weak learner for a given aggregating algorithm.
Students explain the relation between linear models and deep neural networks, describe how neural networks are trained, and understand the role of a data scientist in designing a deep learning solution to a machine learning problem.
Students understand the principles of Generative Adversarial Networks, know which metrics they can optimize and how to regularize them.
Students explain and utilize the black-box optimization techniques.
Students use the techniques for working with imbalanced datasets.
Students explain the main approaches to probabilistic graphical models and how to train them.
Students understand the principles behind Variational AutoEncoders and implement them.
Students know meta-learning approaches.

Course Contents

Introduction to Machine Learning and Data Mining, No-Free-Lunch theorems
Introduction to No-Free-Lunch theorems, discussion about the role of prior knowledge in Machine Learning. Discussion of the general Machine Learning workflow. Assumptions behind the most popular Machine Learning methods.
Bias-variance decomposition, regularization techniques
Model complexity through bias-variance decomposition, methods to control complexity of the models, the most common regularization techniques.
Introduction to meta-algorithms, bootstrap, boosting
Meta-algorithms as a tool for regulating bias/variance of the model. Introduction to bootstrap: Random Forest. Stacking. Introduction to boosting: AdaBoost, Gradient Boosting Machine, XGBoost.
Introduction and overview of deep learning methods
Introduction to Deep Learning through the lens of the No-Free-Lunch theorems. Popularity of Deep Learning methods. Correspondence between the most common Deep Learning methods and prior assumptions.
Deep generative models: Generative Adversarial Networks (GANs)
Jensen-Shannon divergence and Wasserstein distance as minimization problems. Adversarial Neural Networks: classical GAN, WGAN, energy-based GAN. Difficulties in adversarial training, gradient penalty for WGAN. Practical applications beyond generative problems. Adversarial AutoEncoder, BiGAN, CycleGAN, Adversarial Variational Bayes.
Optimization techniques: black-box methods, first order methods
Brief overview of first-order optimization methods: stochastic gradient descent, momentum, Adam/Adamax. Detailed discussion of black-box optimization methods: Bayesian optimization, Variational Optimization. Examples of black-box optimization: hyper-parameter tuning.
Miscellaneous topics: imbalanced datasets, importance sampling, one-class classification methods
Discussion of the problems caused by imbalanced datasets, particularly for gradient-based methods: change of priors, importance sampling. One-class classification: one-class SVM, density-based methods, popular heuristics: dimensionality reduction (e.g. through AutoEncoders), Radial Basis Networks.
Deep generative models: energy-based models, Boltzmann machines and deep belief networks
Definition of a generative problem, types of generative problems. Energy-based models and contrastive divergence: Boltzmann machines, Deep Belief Networks, Restricted Boltzmann Machines and their connection to the AutoEncoders.
Deep generative models: Variational AutoEncoders
Variational bounds on likelihood, Variational AutoEncoder, Conditional Variational AutoEncoder.
Meta-learning: concept learning, learning how to learn
Concept learning: Neural Statistician, Generative Matching Networks. Learning how to learn: optimization procedure as a learning problem, gradient-based optimization algorithms.
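
As an illustration of the bias-variance topic listed in the course contents, the following is a minimal sketch of how model complexity shifts error between squared bias and variance. It is not taken from the course materials: the target function sin(2πx), the polynomial model family, the noise level, and the function name `bias_variance` are all assumptions chosen for the example. The returned error is measured against the noise-free target, so it equals squared bias plus variance; adding the noise variance would give the expected MSE against noisy labels.

```python
import numpy as np

def bias_variance(degree, n_train=30, n_sets=200, noise=0.3, seed=0):
    """Estimate squared bias, variance, and error to the true function
    for polynomial least-squares fits of a given degree (illustrative setup)."""
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.0, 1.0, 50)
    truth = np.sin(2.0 * np.pi * x_test)  # assumed noise-free target on the test grid
    preds = np.empty((n_sets, x_test.size))
    for i in range(n_sets):
        # Draw a fresh noisy training set and fit a polynomial to it.
        x = rng.uniform(0.0, 1.0, n_train)
        y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, noise, n_train)
        preds[i] = np.polyval(np.polyfit(x, y, degree), x_test)
    mean_pred = preds.mean(axis=0)
    bias2 = np.mean((mean_pred - truth) ** 2)   # squared bias, averaged over test points
    variance = np.mean(preds.var(axis=0))       # variance across training sets
    error = np.mean((preds - truth) ** 2)       # mean squared error to the true function
    return bias2, variance, error

if __name__ == "__main__":
    for degree in (1, 5):
        b, v, e = bias_variance(degree)
        print(f"degree={degree}: bias^2={b:.4f} variance={v:.4f} error={e:.4f}")
```

In this setup a straight line cannot follow the sine target, so the degree-1 fit carries a much larger squared bias than the degree-5 fit; the decomposition itself holds for any degree.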

Assessment Elements

Homeworks
Exam

Interim Assessment

Interim assessment (2 module)
Final score for the homework: homework score = min[1, ∑_i x_i] − penalty, where x_i is the score for each homework.
(Final grade) = 50% × (homework score) + 50% × (exam score), where:
since each homework has a max score of 1 and there are 3 assignments, it will be scaled by 5/3 in this formula;
max exam score is 10, so it will be scaled by 1/2.
Final grade = [5/3 ⋅ homework score + 1/2 ⋅ exam score]
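
The weighting in the final formula can be sketched as a small function. This is only an illustration under stated assumptions: the homework score is passed in already penalized and on a 0 to 3 scale (three assignments worth up to 1 each, as the scaling note suggests), the exam score is on a 0 to 10 scale, and the square brackets in the formula are not interpreted because the source does not state a rounding rule. The function name `final_grade` is hypothetical.

```python
def final_grade(homework_score: float, exam_score: float) -> float:
    """Unrounded final grade: 5/3 * homework score + 1/2 * exam score.

    Assumes homework_score is in [0, 3] (after any penalty) and
    exam_score is in [0, 10]; rounding is left to the course rules.
    """
    if not 0.0 <= homework_score <= 3.0:
        raise ValueError("homework_score is assumed to lie in [0, 3]")
    if not 0.0 <= exam_score <= 10.0:
        raise ValueError("exam_score is assumed to lie in [0, 10]")
    return 5.0 * homework_score / 3.0 + exam_score / 2.0

if __name__ == "__main__":
    print(final_grade(1.5, 8.0))  # 2.5 from homework + 4.0 from the exam
```

Under these assumptions each component contributes at most 5 points, which matches the stated 50%/50% split on a 10-point scale.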

Bibliography

Recommended Core Bibliography

Hall, M., Witten, Ian H., Frank, E. Data Mining: practical machine learning tools and techniques. – 2011. – 664 pp.
Han, J., Kamber, M., Pei, J. Data Mining: Concepts and Techniques, Third Edition. – Morgan Kaufmann Publishers, 2011. – 740 pp.
Hastie, T., Tibshirani, R., Friedman, J. The elements of statistical learning: Data Mining, Inference, and Prediction. – Springer, 2009. – 745 pp.

Recommended Additional Bibliography

Mirkin, B. Core concepts in data analysis: summarization, correlation and visualization. – Springer Science & Business Media, 2011. – 388 pp.
