Machine Learning and Data Mining Methods
- To familiarize students with the new, rapidly evolving field of machine learning and data mining, and to provide practical experience in the analysis of real-world data.
- Students derive the bias-variance decomposition for MSE and “0-1” losses, and show how regularization affects the tradeoff.
- Students explain and apply black-box optimization techniques.
- Students explain the concepts of bootstrapping, bagging and boosting, and justify the choice of a particular weak learner for a given aggregating algorithm.
- Students explain the main approaches to probabilistic graphical models and their training.
- Students explain the relation between linear models and deep neural networks, describe how neural networks are trained, and understand the role of the data scientist in designing a deep learning solution to a machine learning problem.
- Students know meta-learning approaches.
- Students know the statement of No-Free-Lunch theorems and explain the role of prior knowledge for solving machine learning problems.
- Students understand the principles behind Variational AutoEncoders and implement them.
- Students understand the principles of Generative Adversarial Networks, know which metrics they can optimize and how to regularize them.
- Students apply techniques for working with imbalanced datasets.
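For reference, the bias-variance decomposition for the MSE loss mentioned above can be written as follows (assuming the standard setup $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, and an estimator $\hat{f}$ trained on a random sample):

```latex
\mathbb{E}\left[\left(y - \hat{f}(x)\right)^2\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Regularization typically increases the bias term while shrinking the variance term, which is the tradeoff the course outcome refers to.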
- Introduction to Machine Learning and Data Mining, No-Free-Lunch theorems
- Bias-variance decomposition, regularization techniques
- Introduction to meta-algorithms, bootstrap, boosting
- Introduction and overview of deep learning methods
- Deep generative models: Generative Adversarial Networks (GANs)
- Optimization techniques: black-box methods, first order methods
- Miscellaneous topics: imbalanced datasets, importance sampling, one-class classification methods
- Deep generative models: energy-based models, Boltzmann machines and deep belief networks
- Deep generative models: Variational AutoEncoders
- Meta-learning: concept learning, learning how to learn
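As a minimal illustration of the bootstrap topic listed above, the sketch below estimates the sampling variance of a statistic by resampling with replacement (the function name `bootstrap_variance` and the toy data are my own, not course material):

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_variance(data, estimator, n_boot=1000, rng=rng):
    """Estimate the sampling variance of `estimator` via the bootstrap:
    draw n_boot resamples of the data with replacement, apply the
    estimator to each, and take the empirical variance of the results."""
    n = len(data)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        sample = rng.choice(data, size=n, replace=True)  # bootstrap resample
        stats[b] = estimator(sample)
    return stats.var(ddof=1)

# Toy data: 200 draws from N(5, 2^2); the true variance of the sample
# mean is sigma^2 / n = 4 / 200 = 0.02, which the bootstrap should recover.
data = rng.normal(loc=5.0, scale=2.0, size=200)
var_mean = bootstrap_variance(data, np.mean)
print(var_mean)
```

The same resampling step is the building block of bagging: each bootstrap sample trains one member of the ensemble, and the aggregated prediction has lower variance than any single member.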
- 2021/2022, 2nd module. Final score for the homework: *homework score* = min [1, Σᵢ xᵢ] − penalty, where xᵢ is the score for each homework. (Final grade) = 50% × (*homework score*) + 50% × (*exam score*):
  - since each homework has a max score of 1 and there are 3 assignments, the homework score is scaled by 5/3 in this formula;
  - the max exam score is 10, so it is scaled by 1/2.

  *Final grade* = [5/3 ⋅ *homework score* + 1/2 ⋅ *exam score*]
- Hall, M., Witten, I. H., Frank, E. Data Mining: practical machine learning tools and techniques. – 2011. – 664 pp.
- Han, J., Kamber, M., Pei, J. Data Mining: Concepts and Techniques, Third Edition. – Morgan Kaufmann Publishers, 2011. – 740 pp.
- Hastie, T., Tibshirani, R., Friedman, J. The elements of statistical learning: Data Mining, Inference, and Prediction. – Springer, 2009. – 745 pp.
- Mirkin, B. Core concepts in data analysis: summarization, correlation and visualization. – Springer Science & Business Media, 2011. – 388 pp.