2021/2022

Machine Learning in Python

Best by the criterion "Usefulness of the course for your future career"
Best by the criterion "Usefulness of the course for broadening your horizons and well-rounded development"
Best by the criterion "Novelty of the knowledge gained"
Status: University-wide elective
When taught: Modules 1, 2
Audience: open to all
Instructors: Макаров Михаил Сергеевич, Мельников Олег, Тихонова Мария Ивановна
Language: English
Credits: 4
Contact hours: 56

Course Syllabus

Abstract

This course introduces students to the elements of machine learning and deep learning, including supervised and unsupervised methods such as linear and logistic regression, splines, decision trees, support vector machines, bootstrapping, random forests, boosting, regularized methods, and topics in neural networks. Students use the Python programming language and popular packages such as pandas, scikit-learn and TensorFlow to investigate and visualize datasets and to develop machine learning models for data-driven regression, classification and unsupervised problems. ❕PREREQUISITES❕ include prior coding experience in a higher-level programming language (preferably Python), prior coursework in statistics, linear algebra and calculus, reading and writing fluency in English, and basic familiarity with machine learning. ❗STUDY LOAD❗ is 10 hours per week for well-prepared students, but could be more for students lacking the prerequisites. ⚠️IMPORTANT⚠️: this is an active, heavily hands-on course. There are weekly machine learning assignments, including in-class Kaggle competitions (as group activities), quizzes on material from the course textbook, and occasional DataCamp courses covering the necessary prerequisite concepts. There are also weekly seminars and lectures, 80 minutes each.
Learning Objectives

  • The course aims to help students understand the process of learning from data, to familiarize them with a wide variety of algorithmic and model-based methods for extracting information from data, and to teach them to apply suitable methods to various datasets and evaluate them through model selection and predictive performance evaluation.
Course Contents

  • Academic Integrity, Honor, Ethics
  • Review of Calculus, Linear Algebra, Probability, Stats, Python, Colab
  • Introduction to Statistical Learning
  • Linear Regression and K Nearest Neighbor (KNN)
  • Classification: Logistic Regression, Linear Discriminant Analysis, Quadratic Discriminant Analysis, KNN
  • Resampling Methods: Cross-Validation (CV), Bootstrap
  • Linear Model Selection and Regularization
  • Non-linear Regression
  • Decision Trees, Bagging, Random Forest, Boosting
  • Support Vector Machines (SVM)
  • Clustering and Dimension Reduction: k-Means, Hierarchical Clustering (HC), DBSCAN, PCA
  • Artificial Neural Networks (ANN) and Introduction to Deep Learning
  • Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM)
  • Convolutional Neural Networks (CNN)
  • Deep Generative Models and Autoencoders
Assessment Elements

  • non-blocking Quizzes
    All questions and answers are in English. They closely follow the textbook, lectures, seminars and material posted in the LMS, and include questions about the syllabus and the ethics/integrity/honor code.
  • non-blocking homework assignments
    Students will likely be placed in groups of about two. Collaboration outside the group is allowed only at a high level. See the grading rubric and syllabus for further instructions.
  • non-blocking participation
    See syllabus for more info.
Interim Assessment

  • 2021/2022 2nd module
    0.4 * homework assignments + 0.2 * participation + 0.4 * Quizzes
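
The weighting above can be checked with a short calculation. The sketch below is only an illustration: the function name `interim_grade` and the sample scores are hypothetical, and it assumes all three components are reported on the same scale. Any rounding rule is left out because the formula does not state one.

```python
def interim_grade(homework: float, participation: float, quizzes: float) -> float:
    """Weighted interim grade using the weights from the formula above.

    Assumption: all three inputs are on the same grading scale.
    """
    return 0.4 * homework + 0.2 * participation + 0.4 * quizzes


if __name__ == "__main__":
    # Hypothetical scores, chosen only to illustrate the arithmetic.
    grade = interim_grade(homework=8, participation=10, quizzes=7)
    print(f"Interim grade: {grade:.2f}")
```

Because homework and quizzes each carry a weight of 0.4, a one-point change in either moves the interim grade twice as much as a one-point change in participation.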
Bibliography

Recommended Core Bibliography

  • Gareth James, Daniela Witten, Trevor Hastie, & Robert Tibshirani. (2013). An Introduction to Statistical Learning: With Applications in R. Springer.

Recommended Additional Bibliography

  • Trevor Hastie, Robert Tibshirani, et al. (2017). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edition. Free from the publisher: https://web.stanford.edu/~hastie/ElemStatLearn/printings/ESLII_print12.pdf