Master 2022/2023

Large Scale Machine Learning 2

Type: Elective course
Area of studies: Applied Mathematics and Informatics
When: 2nd year, 2nd module
Mode of studies: distance learning
Online hours: 82
Open to: students of one campus
Instructors: Anatoly Bardukov
Master’s programme: Master of Data Science
Language: English
ECTS credits: 4
Contact hours: 8

Course Syllabus

Abstract

This course focuses on the future of ML engineering. It starts with big data problems and the application of classic models, introduces approaches for text (NLP) and other data types (images, etc.), and ends with the field of ML operations. The final project requires you to demonstrate the full ML workflow cycle, including data collection, training, and deployment. To complete the course, students are expected to have skills in classical algorithms and data structures, the main concepts of machine learning, and Python programming.
Learning Objectives

  • After taking this course, students should be able to:
    ● work with large and high-dimensional datasets,
    ● work with text data,
    ● use strategies for parallelizing neural network training,
    ● use different approaches for model optimization,
    ● plan model deployment under different scenarios.
Expected Learning Outcomes

  • Understand how to prepare big data for training classic models
  • Prepare big text data and understand word2vec models
  • Understand distributed training of neural networks and transfer learning
  • Understand knowledge distillation and neural network pruning and quantization
  • Dockerize models
  • Get familiar with MLflow
Course Contents

  • Big Data Problems and Classic Models
  • Text Models for Big Data
  • Neural Network Models
  • Model Optimization
  • Machine Learning Models Deployment
  • End-to-End Production Pipeline for Machine Learning Models
Assessment Elements

  • non-blocking Programming Assignments
    Weekly programming assignments.
  • non-blocking Final Project
Interim Assessment

  • 2022/2023 2nd module
    0.4 * Final Project + 0.6 * Programming Assignments
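
    As an illustration of the weighting above, the following minimal Python sketch computes the interim grade. The function name `interim_grade`, the example scores, and the assumption that each component has already been aggregated into a single score on a common scale are illustrative and not stated in the syllabus. No rounding rule is applied because none is given.

```python
def interim_grade(final_project: float, programming_assignments: float) -> float:
    """Weighted interim grade: 0.4 * Final Project + 0.6 * Programming Assignments.

    Both inputs are assumed (hypothetically) to be single aggregated scores
    on the same grading scale; the syllabus does not specify the aggregation.
    """
    return 0.4 * final_project + 0.6 * programming_assignments


# Hypothetical example scores, chosen only to show the arithmetic.
example = interim_grade(final_project=8.0, programming_assignments=6.0)
print(round(example, 2))
```

    Because the programming assignments carry the larger weight (0.6), a one-point change in that component moves the interim grade more than the same change in the final project.
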
Bibliography

Recommended Core Bibliography

  • Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed., corrected 7th printing). New York: Springer. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=277008

Recommended Additional Bibliography

  • Murphy, K. P. (2012). Machine Learning: A Probabilistic Perspective. Cambridge, Mass.: The MIT Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=480968