Bachelor 2020/2021

Applied Machine Learning in Python

Type: Elective course (Marketing and Market Analytics)
Area of studies: Management
When: 3rd year, 4th module
Mode of studies: distance learning
Instructors: Gleb Karpushkin
Language: English
ECTS credits: 3
Contact hours: 2

Course Syllabus

Abstract

This course introduces learners to applied machine learning, focusing more on techniques and methods than on the statistics behind them. It starts with a discussion of how machine learning differs from descriptive statistics and introduces the scikit-learn toolkit through a tutorial. The issue of data dimensionality is discussed, and the task of clustering data, as well as evaluating those clusters, is tackled. Supervised approaches for creating predictive models are described, and learners will be able to apply the scikit-learn predictive modelling methods while understanding process issues related to data generalizability (e.g. cross-validation, overfitting). The course ends with a look at more advanced techniques, such as building ensembles, and at the practical limitations of predictive models. By the end of this course, students will be able to identify the difference between a supervised (classification) and an unsupervised (clustering) technique, identify which technique they need to apply for a particular dataset and need, engineer features to meet that need, and write Python code to carry out an analysis.
Learning Objectives

  • The purpose of this course is to give students a solid introduction to the modern applied Machine Learning (ML) methods and pipelines available to practitioners in machine learning and statistical learning. The course develops skills in clustering data and creating predictive models. At the end of the course, students should be able to write short scripts to import, prepare, and analyze data.
Expected Learning Outcomes

  • Describe how machine learning is different from descriptive statistics
  • Explain different approaches for creating predictive models
  • Create and evaluate data clusters
  • Build features that meet analysis needs
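
As an illustration of the "create and evaluate data clusters" outcome, the following is a minimal sketch rather than course material. It assumes scikit-learn is installed, uses synthetic data from `make_blobs` with hand-picked centers, and picks k-means with the silhouette score as one possible clustering method and evaluation metric; the syllabus does not name a specific algorithm or metric.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic data with three well-separated groups (illustrative assumption).
centers = [[-5, -5], [0, 5], [5, -5]]
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=0.8, random_state=0)

# Put both features on a comparable scale before computing distances.
X_scaled = StandardScaler().fit_transform(X)

# Fit k-means for several candidate cluster counts and score each result.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)

# A higher silhouette score suggests tighter, better-separated clusters.
best_k = max(scores, key=scores.get)
for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("best k by silhouette:", best_k)
```

On real data the silhouette score is only one of several signals; the best-scoring cluster count should still be checked against what the clusters mean for the analysis at hand.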
Course Contents

  • Fundamentals of Machine Learning - Intro to scikit-learn
    This module introduces basic machine learning concepts, tasks, and workflow using an example classification problem based on the K-nearest neighbors method, and implemented using the scikit-learn library.
  • Supervised Machine Learning - Part 1
    This module delves into a wider variety of supervised learning methods for both classification and regression, covering the connection between model complexity and generalization performance, the importance of proper feature scaling, and how to control model complexity by applying techniques like regularization to avoid overfitting. In addition to k-nearest neighbors, this module covers linear regression (least-squares, ridge, lasso, and polynomial regression), logistic regression, support vector machines, the use of cross-validation for model evaluation, and decision trees.
  • Evaluation
    This module covers evaluation and model selection methods that you can use to help understand and optimize the performance of your machine learning models.
  • Supervised Machine Learning - Part 2
    This module covers more advanced supervised learning methods that include ensembles of trees (random forests, gradient boosted trees), and neural networks (with an optional summary on deep learning). You will also learn about the critical problem of data leakage in machine learning and how to detect and avoid it.
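
The modules above can be tied together in one short workflow. The sketch below is an assumption-laden illustration, not the course's own code: it assumes scikit-learn is installed, uses the breast cancer dataset bundled with the library, and compares a scaled k-nearest neighbors classifier with a random forest. The dataset, the hyperparameters, and the choice of accuracy and F1 as metrics are all illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Hold out a test set first so it never influences model selection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline, so each cross-validation fold fits the
# scaler on its own training portion only (one way to avoid data leakage).
models = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Compare the candidates with 5-fold cross-validation on the training data.
cv_accuracy = {}
for name, model in models.items():
    fold_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
    cv_accuracy[name] = fold_scores.mean()
    print(f"{name}: mean CV accuracy = {cv_accuracy[name]:.3f}")

# Refit the best cross-validated model and evaluate it once on the test set.
best_name = max(cv_accuracy, key=cv_accuracy.get)
best_model = models[best_name].fit(X_train, y_train)
y_pred = best_model.predict(X_test)

test_accuracy = best_model.score(X_test, y_test)
test_f1 = f1_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred)
print(f"{best_name}: test accuracy = {test_accuracy:.3f}, F1 = {test_f1:.3f}")
print(matrix)
```

Fitting the scaler on the full dataset before splitting would let information from the held-out data leak into training; keeping preprocessing inside the pipeline is a common way to prevent that.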
Assessment Elements

  • non-blocking Group project
    Project completed individually or in a group of at most 2 students
  • non-blocking Course assessments
    Provided by the Coursera course: embedded surveys, tests, and quizzes
  • non-blocking Exam
Interim Assessment

  • Interim assessment (4 module)
    0.3 * Course assessments + 0.4 * Exam + 0.3 * Group project
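
As a worked example of the formula above, the sketch below computes the weighted sum for one set of component scores. The function name `interim_grade`, the dictionary keys, and the example scores on a 10-point scale are hypothetical; the syllabus states only the weights and gives no rounding rule, so none is applied beyond display formatting.

```python
# Weights taken from the interim assessment formula.
WEIGHTS = {"course_assessments": 0.3, "exam": 0.4, "group_project": 0.3}


def interim_grade(scores):
    """Return the weighted sum of the three component scores."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores, assumed to be on a 10-point scale.
example = {"course_assessments": 8, "exam": 7, "group_project": 9}

# 0.3 * 8 + 0.4 * 7 + 0.3 * 9 = 2.4 + 2.8 + 2.7 = 7.9
print(f"{interim_grade(example):.1f}")
```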
Bibliography

Recommended Core Bibliography

  • Derivatives analytics with Python : data analysis, models, simulation, calibration and hedging, Hilpisch, Y. J., 2015
  • Python 3 (in Russian), Prokhorenok, N. A., 2016
  • Python for data analysis : data wrangling with pandas, NumPy, and IPython, McKinney, W., 2017
  • Python for complex problems : data science and machine learning (in Russian), VanderPlas, J., 2018
  • Introduction to machine learning with Python : a guide for data scientists (in Russian), Müller, A., 2018
  • Deep learning with Python (in Russian), Chollet, F., 2019
  • Learning to program in Python (in Russian), Barry, P., 2017
  • Fundamentals of Data Science and Big Data : Python and data science (in Russian), Cielen, D., 2017

Recommended Additional Bibliography

  • Algorithms. A reference : with examples in C, C++, Java and Python (in Russian), Heineman, G., 2017
  • Elegant SciPy : the art of scientific programming in Python (in Russian), Nunez-Iglesias, J., 2018