Master
2019/2020

## Machine Learning and Data Mining

Type:
Elective course (Data Science)

Area of studies:
Applied Mathematics and Informatics

Delivered by:
School of Data Analysis and Artificial Intelligence

When:
1st year, modules 1 and 2

Mode of studies:
Full time

Instructors:
Artem Maevskiy

Master’s programme:
Data Science

Language:
English

ECTS credits:
4

### Course Syllabus

#### Abstract

The course "Machine Learning and Data Mining" introduces students to the new and actively evolving interdisciplinary field of modern data analysis. Having started as a branch of Artificial Intelligence, it has attracted the attention of physicists, computer scientists, economists, computational biologists, linguists and others, and has become a truly interdisciplinary field of study. Despite the variety of data sources that can be analyzed, the objects and attributes that form a particular dataset possess common statistical and structural properties. The interplay between known and unknown data gives rise to complex pattern structures and to the machine learning methods that are the focus of this course. We will consider methods of Machine Learning and Data Mining, with special attention to hands-on practical analysis of real-world datasets using available software tools and modern programming languages and libraries.

#### Learning Objectives

- To familiarize students with the new, rapidly evolving field of machine learning and data mining, and to provide practical experience in the analysis of real-world data.

#### Expected Learning Outcomes

- Students know the statement of No-Free-Lunch theorems and explain the role of prior knowledge for solving machine learning problems.
- Students derive the bias-variance decomposition for MSE and “0-1” losses, and show how regularization affects the tradeoff.
- Students explain the concepts of bootstrapping, bagging and boosting, and justify the choice of a particular weak learner for a given aggregating algorithm.
- Students explain the relation between linear models and deep neural networks, describe how neural networks are trained, and understand the role of the data scientist in designing a deep learning solution to a machine learning problem.
- Students understand the principles of Generative Adversarial Networks, know which metrics they can optimize and how to regularize them.
- Students explain and apply black-box optimization techniques.
- Students use techniques for working with imbalanced datasets.
- Students explain the main approaches to probabilistic graphical models and how such models are trained.
- Students understand the principles behind Variational AutoEncoders and implement them.
- Students know meta-learning approaches.
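
For reference, the MSE decomposition named in the outcomes above takes the following standard form. The notation is an assumption and is not taken from the course materials: targets are generated as $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, and $\hat f_D$ denotes a model trained on a random dataset $D$.

$$
\mathbb{E}_{D,\varepsilon}\big[(y - \hat f_D(x))^2\big]
= \underbrace{\big(f(x) - \mathbb{E}_D[\hat f_D(x)]\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_D\big[\big(\hat f_D(x) - \mathbb{E}_D[\hat f_D(x)]\big)^2\big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{noise}}
$$

Regularization typically raises the bias term while lowering the variance term, which is the tradeoff the outcome refers to.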

#### Course Contents

- Introduction to Machine Learning and Data Mining, No-Free-Lunch theorems. Introduction to No-Free-Lunch theorems, discussion of the role of prior knowledge in Machine Learning. Discussion of the general Machine Learning workflow. Assumptions behind the most popular Machine Learning methods.
- Bias-variance decomposition, regularization techniques. Model complexity through bias-variance decomposition, methods to control the complexity of models, the most common regularization techniques.
- Introduction to meta-algorithms, bootstrap, boosting. Meta-algorithms as a tool for regulating the bias/variance of a model. Introduction to bootstrap: Random Forest. Stacking. Introduction to boosting: AdaBoost, Gradient Boosting Machine, XGBoost.
- Introduction and overview of deep learning methods. Introduction to Deep Learning through the lens of the No-Free-Lunch theorem. Popularity of Deep Learning methods. Correspondence between the most common Deep Learning methods and prior assumptions.
- Deep generative models: Generative Adversarial Networks (GANs). Jensen-Shannon divergence and Wasserstein distance as minimization problems. Adversarial Neural Networks: classical GAN, WGAN, energy-based GAN. Difficulties in adversarial training, gradient penalty for WGAN. Practical applications beyond generative problems. Adversarial AutoEncoder, BiGAN, CycleGAN, Adversarial Variational Bayes.
- Optimization techniques: black-box methods, first-order methods. Brief overview of first-order optimization methods: stochastic gradient descent, momentum, Adam/Adamax. Detailed discussion of black-box optimization methods: Bayesian optimization, Variational Optimization. Examples of black-box optimization: hyper-parameter tuning.
- Miscellaneous topics: imbalanced datasets, importance sampling, one-class classification methods. Discussion of the problems caused by imbalanced datasets, particularly for gradient-based methods: change of priors, importance sampling. One-class classification: one-class SVM, density-based methods, popular heuristics: dimensionality reduction (e.g. through AutoEncoders), Radial Basis Networks.
- Deep generative models: energy-based models, Boltzmann machines and deep belief networks. Definition of a generative problem, types of generative problems. Energy-based models and contrastive divergence: Boltzmann machines, Deep Belief Networks, Restricted Boltzmann Machines and their connection to AutoEncoders.
- Deep generative models: Variational AutoEncoders. Variational bounds on likelihood, Variational AutoEncoder, Conditional Variational AutoEncoder.
- Meta-learning: concept learning, learning how to learn. Concept learning: Neural Statistician, Generative Matching Networks. Learning how to learn: optimization procedure as a learning problem, gradient-based optimization algorithms.
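
As an illustration of the first-order methods listed above, here is a minimal sketch of gradient descent with momentum on a toy quadratic objective. The function name `gradient_descent_momentum`, the matrix `A`, the vector `b`, and the step-size values are all hypothetical choices made for this example and are not taken from the course materials.

```python
import numpy as np


def gradient_descent_momentum(grad, x0, lr=0.1, beta=0.9, n_steps=500):
    """Minimize a function from its gradient using heavy-ball momentum.

    grad    : callable returning the gradient at a point
    x0      : starting point
    lr      : learning rate (step size)
    beta    : momentum coefficient in [0, 1)
    n_steps : number of update steps
    """
    x = np.asarray(x0, dtype=float)
    velocity = np.zeros_like(x)
    for _ in range(n_steps):
        # Accumulate a decaying sum of past gradients, then step along it.
        velocity = beta * velocity - lr * grad(x)
        x = x + velocity
    return x


# Toy objective (an assumption for this sketch):
# f(x) = 0.5 * x^T A x - b^T x, whose minimizer solves A x = b.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])


def quadratic_grad(x):
    return A @ x - b


x_min = gradient_descent_momentum(quadratic_grad, x0=[0.0, 0.0])
x_exact = np.linalg.solve(A, b)  # closed-form minimizer for comparison
```

Setting `beta=0` reduces the update to plain gradient descent, which makes it easy to compare the two on the same objective.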

#### Interim Assessment

- Interim assessment (module 2). Final score for the homework: <br /><i>homework score</i> = min [1, ∑<sub>i</sub>x<sub>i</sub>] - penalty, where x<sub>i</sub> is the score for homework i. <br /><br /><i>Final grade</i> = 50% × (<i>homework score</i>) + 50% × (<i>exam score</i>), where:<ul><li>each homework has a maximum score of 1 and there are 3 assignments, so the homework score is scaled by 5/3 in this formula;</li><li>the maximum exam score is 10, so it is scaled by 1/2.</li></ul><br /><i>Final grade</i> = [5/3 ⋅ <i>homework score</i> + 1/2 ⋅ <i>exam score</i>]
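
The weighting in the last formula can be checked with a short sketch. It makes two assumptions: the homework score passed in is the already-penalized total over the three assignments (so at most 3, per the scaling note), and the rounding implied by the square brackets is not applied, because the syllabus does not state the rounding rule. The function name `raw_final_grade` and the example scores are hypothetical.

```python
import math


def raw_final_grade(homework_score: float, exam_score: float) -> float:
    """Weighted sum inside the brackets of the final-grade formula.

    No rounding is applied here; the rounding rule is not specified
    in the syllabus.
    """
    return 5 / 3 * homework_score + 1 / 2 * exam_score


# Hypothetical student: homework total 2.4 (out of 3), exam 7 (out of 10).
# 5/3 * 2.4 + 1/2 * 7 = 4.0 + 3.5
example = raw_final_grade(2.4, 7)

# Maximum scores on both components give the maximum grade of 10.
best = raw_final_grade(3, 10)
```

With these weights, homework and exam each contribute up to 5 points, which matches the 50%/50% split stated above.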

#### Bibliography

#### Recommended Core Bibliography

- Hall, M., Witten, I. H., Frank, E. Data Mining: Practical Machine Learning Tools and Techniques. – 2011. – 664 pp.
- Han, J., Kamber, M., Pei, J. Data Mining: Concepts and Techniques, Third Edition. – Morgan Kaufmann Publishers, 2011. – 740 pp.
- Hastie, T., Tibshirani, R., Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. – Springer, 2009. – 745 pp.

#### Recommended Additional Bibliography

- Mirkin, B. Core Concepts in Data Analysis: Summarization, Correlation and Visualization. – Springer Science & Business Media, 2011. – 388 pp.