Master's programme
2024/2025




Machine Learning Methods in Bioinformatics
Best course by the criterion "Usefulness of the course for your future career"
Best course by the criterion "Usefulness of the course for broadening horizons and well-rounded development"
Status: Elective course (Data Analysis in Biology and Medicine)
Field of study: 01.04.02. Applied Mathematics and Informatics
Delivered by: Faculty of Computer Science
When: 1st year, modules 1 and 2
Mode of study: with an online course
Online hours: 14
Open to: students of the home campus
Programme: Data Analysis in Biology and Medicine
Language: English
Credits: 6
Course Syllabus
Abstract
The course introduces students to the theory and practice of applying machine learning algorithms to problems in bioinformatics. Its main goal is to give students a thorough understanding of modern methods of data analysis and predictive modeling. Students learn the key stages of working with data, from preprocessing and dimensionality reduction to building, tuning, and validating models. The program covers a wide range of algorithms: linear regression with regularization (ridge regression, lasso, elastic net), support vector machines (SVM), neural networks, k-nearest neighbors (k-NN), classification and regression trees, and ensemble methods such as random forests and gradient boosting. Special attention is paid to practical work: the seminars develop skills in using specialized software tools and libraries for predictive modeling. Classes work through real-world cases and applied problems based on bioinformatics datasets.
Learning Objectives
- Master the theory, process, and components of machine learning implementation
- Learn to distinguish between types of predictive models and understand the key stages of building them: data preprocessing, model construction, and performance evaluation
- Try out practical applications of predictive modeling by applying machine learning algorithms to molecular biology datasets
- Learn how to use functions from various Python libraries to apply different types of models: linear and nonlinear regression and classification models, decision trees, and rule-based models
- Perform input data preprocessing using Python: calculate statistics, evaluate skewness, apply appropriate transformations, perform principal component analysis (PCA), find correlations between predictors, and create dummy variables
- Apply Python functions to measure the importance of predictors and model performance, use feature filtering methods, and estimate prediction error
- Comprehensively apply the acquired knowledge and predictive analytics tools to solve applied problems in the field of bioinformatics
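The preprocessing steps named in the objectives above (summary statistics, skewness, transformations, correlations between predictors, dummy variables) can be sketched as follows. This is a minimal illustration only: it uses the Python standard library to stay dependency-free, whereas the course itself refers to Python libraries without naming them. The function names, the toy data, and the choice of a log transform are assumptions made for the example, and PCA is left out because it needs a linear algebra library.

```python
import math
import statistics


def skewness(values):
    """Skewness as the third standardized moment (population form)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def dummy_variables(categories):
    """Build one 0/1 indicator column per distinct category level."""
    levels = sorted(set(categories))
    return {level: [int(c == level) for c in categories] for level in levels}


# Hypothetical toy data: expression levels of one gene and a tissue label.
expression = [1.0, 2.0, 2.5, 3.0, 4.0, 50.0]
tissue = ["liver", "brain", "liver", "kidney", "brain", "liver"]

# A log transform is one common way to reduce right skew in positive data.
log_expression = [math.log(v) for v in expression]
raw_skew = skewness(expression)
log_skew = skewness(log_expression)

# Indicator columns for the categorical predictor.
dummies = dummy_variables(tissue)
```

On this toy sample the single large value makes the raw data strongly right-skewed, and the log transform reduces the skewness without removing it. With regression models that include an intercept, one of the indicator columns is usually dropped to avoid perfect collinearity.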
Expected Learning Outcomes
- Apply the knowledge and tools of predictive analytics to real-life applications
- Acquire the skills to implement machine learning algorithms in Python
- Know the theory of machine learning algorithms
Course Contents
- ML paradigm thinking and project anatomy
- Data preprocessing
- Linear regression models
- Multivariate adaptive regression splines
- Neural networks
- Support vector machines. K-nearest neighbors
- Measuring performance in classification models
- Linear classification models
- Nonlinear classification models
- Decision trees
- Machine learning in bioinformatics
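As an illustration of the topic "Measuring performance in classification models" listed above, the sketch below derives common metrics from a binary confusion matrix. It is not taken from the course materials: the function names, the label encoding (1 for the positive class), and the toy labels are assumptions made for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity, and precision.

    Assumes both classes occur in y_true and that at least one
    positive prediction is made; otherwise a ratio is undefined.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),    # positive predictive value
    }


# Hypothetical labels: 1 = variant is pathogenic, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy matters when classes are imbalanced, which is common in biological datasets.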
Interim Assessment
- 2024/2025, 2nd module: 0.4 * Exam + 0.15 * Home assignment 1 + 0.15 * Home assignment 2 + 0.15 * Home assignment 3 + 0.15 * Home assignment 4
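The weighting above can be checked with a short worked example. The weights come from the formula; the dictionary keys and the scores are hypothetical, and the syllabus states neither the grading scale nor a rounding rule, so the result is left unrounded.

```python
# Weights taken from the interim assessment formula.
WEIGHTS = {
    "exam": 0.4,
    "hw1": 0.15,
    "hw2": 0.15,
    "hw3": 0.15,
    "hw4": 0.15,
}


def interim_grade(scores):
    """Weighted sum of the exam and the four home assignments."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {"exam": 8, "hw1": 10, "hw2": 6, "hw3": 8, "hw4": 8}
grade = interim_grade(example_scores)  # 0.4 * 8 + 0.15 * (10 + 6 + 8 + 8)
```

The four home assignments together carry 60% of the grade, so the exam alone cannot compensate for missed coursework.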
Bibliography
Recommended Core Bibliography
- Murphy, K. P. (2012). Machine Learning: A Probabilistic Perspective.
Recommended Additional Bibliography
- Witten, I. H. (2011). Data Mining: Practical Machine Learning Tools and Techniques.
- Flach, P. (2014). Machine Learning: The Art and Science of Algorithms that Make Sense of Data.
- Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J. (2017). Data Mining: Practical Machine Learning Tools and Techniques (4th ed.). Cambridge, MA: Morgan Kaufmann. 654 pp. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1214611