Bachelor's Programme "Applied Data Analysis"

Computer Vision

2025/2026 Academic Year
ENG: Taught in English
4 Credits
Status:
Elective course
When taught:
4th year, 3rd module

Course Syllabus

Abstract

This course provides a comprehensive, engineering-oriented introduction to modern computer vision. Students will learn the complete pipeline from classical image processing to state-of-the-art transformer-based architectures. The course covers CNNs, Vision Transformers, object detection (YOLO, DETR), segmentation (U-Net, Mask R-CNN, SAM), multimodal models (CLIP), self-supervised learning, generative models (VAE, GAN), and video understanding. Practical sessions include hands-on implementation, training, and deployment of CV models.
Learning Objectives

  • The main purpose of the course is to give students both theoretical understanding and practical skills in modern computer vision. Students will trace the evolution from classical filters to deep learning architectures, understand the mathematical foundations of CNNs and Transformers, and gain engineering competence in deploying CV systems.
Expected Learning Outcomes

  • Implement and train CNN and Vision Transformer architectures
  • Build object detection and segmentation pipelines using modern frameworks
  • Apply multimodal models (CLIP) for zero-shot tasks (see the sketch after this list)
  • Understand and implement self-supervised learning methods
  • Design and train generative models (VAE, GAN)
  • Process video data with spatio-temporal models
  • Deploy CV models with ONNX/TensorRT for production
  • Evaluate and debug CV systems systematically
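
To make the zero-shot outcome above concrete, the following is a minimal sketch of zero-shot image classification with CLIP. It assumes the Hugging Face transformers library and the public "openai/clip-vit-base-patch32" checkpoint; the blank image and the label prompts are placeholders rather than course materials.

    # Minimal sketch: zero-shot image classification with a pretrained CLIP model.
    # Assumes the Hugging Face `transformers` CLIP classes and the public
    # "openai/clip-vit-base-patch32" checkpoint; image and labels are placeholders.
    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    image = Image.new("RGB", (224, 224))  # stand-in for a real photo
    labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

    inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)

    # logits_per_image has shape (num_images, num_labels); softmax gives label probabilities
    probs = outputs.logits_per_image.softmax(dim=-1)
    print({label: round(prob, 3) for label, prob in zip(labels, probs[0].tolist())})

No fine-tuning is involved: the label set can be changed freely at inference time, which is what makes the approach zero-shot.
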
Course Contents

  • Image as a Signal and Spatial Structure
  • Classification: CNNs and Vision Transformers (see the sketch after this list)
  • Object Detection: Anchor-Based vs Transformer-Based
  • Segmentation: U-Net, Mask R-CNN, Mask2Former, SAM
  • Multimodal Vision and Open-Vocabulary Models
  • Representation Learning and Self-Supervision
  • Generative Models I: Autoencoders, VAE, VQ-VAE, VQ-GAN
  • Generative Models II: GAN Evolution
  • Overview of Diffusion Models + Video Representations
  • Production Engineering and Final Integration
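
As an illustration of the classification topic above, here is a minimal sketch of a convolutional classifier in PyTorch; the layer sizes, the 32x32 input resolution, and the 10-class output are illustrative assumptions, not the course's reference implementation.

    # Minimal sketch of a CNN image classifier (illustrative sizes, not course code).
    import torch
    import torch.nn as nn

    class TinyCNN(nn.Module):
        def __init__(self, num_classes: int = 10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),  # 32x32 -> 16x16
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),  # 16x16 -> 8x8
            )
            self.classifier = nn.Linear(32 * 8 * 8, num_classes)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.classifier(self.features(x).flatten(1))

    model = TinyCNN()
    logits = model(torch.randn(4, 3, 32, 32))  # dummy batch of four 32x32 RGB images
    print(logits.shape)  # torch.Size([4, 10])

Training such a model reduces to minimizing cross-entropy over the logits with a standard optimizer; Vision Transformers, covered in the same topic, replace the convolutional feature extractor with patch embeddings and self-attention.
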
Assessment Elements

  • non-blocking HW_1
  • non-blocking HW_2
  • non-blocking Project
  • non-blocking Exam
Interim Assessment

  • 2025/2026 3rd module
    0.4 * Exam + 0.1 * HW_1 + 0.1 * HW_2 + 0.4 * Project
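    For example, with hypothetical 10-point scores of 8 for the Exam, 7 for HW_1, 9 for HW_2, and 8 for the Project, the final grade is 0.4 * 8 + 0.1 * 7 + 0.1 * 9 + 0.4 * 8 = 8.0.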
Bibliography

Recommended Core Bibliography

  • Prince, S. J. D. (2012). Computer Vision: Models, Learning, and Inference. Cambridge University Press.
  • Huang, K., Hussain, A., Wang, Q.-F., & Zhang, R. (2019). Deep Learning: Fundamentals, Theory and Applications. Cham, Switzerland: Springer. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=2029631
  • Szeliski, R. (2010). Computer Vision: Algorithms and Applications. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.E8FCD1BD

Recommended Additional Bibliography

  • Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. Cambridge, MA: MIT Press.

Authors

  • Kopylov Ivan Stanislavovich
  • Kononova Elizaveta Dmitrievna