Bachelor's programme
2025/2026

Computer Vision
Status: Elective course (Applied Data Analysis)
Delivered by: Faculty of Computer Science
When: 4th year, 3rd module
Open to: students of the home campus
Instructors: Ivan Stanislavovich Kopylov (Копылов Иван Станиславович)
Language: English
Credits: 4
Contact hours: 40
Course Syllabus
Abstract
This course provides a comprehensive, engineering-oriented introduction to modern computer vision. Students will learn the complete pipeline from classical image processing to state-of-the-art transformer-based architectures. The course covers CNNs, Vision Transformers, object detection (YOLO, DETR), segmentation (U-Net, Mask R-CNN, SAM), multimodal models (CLIP), self-supervised learning, generative models (VAE, GAN), and video understanding. Practical sessions include hands-on implementation, training, and deployment of CV models.
Learning Objectives
- Give students both a theoretical understanding of and practical skills in modern computer vision: the evolution from classical filters to deep learning architectures, the mathematical foundations of CNNs and Transformers, and the engineering competence needed to deploy CV systems.
Expected Learning Outcomes
- Implement and train CNN and Vision Transformer architectures
- Build object detection and segmentation pipelines using modern frameworks
- Apply multimodal models (CLIP) for zero-shot tasks
- Understand and implement self-supervised learning methods
- Design and train generative models (VAE, GAN)
- Process video data with spatio-temporal models
- Deploy CV models with ONNX/TensorRT for production
- Systematically evaluate and debug CV systems
Course Contents
- Image as a Signal and Spatial Structure
- Classification: CNNs and Vision Transformers
- Object Detection: Anchor-Based vs Transformer-Based
- Segmentation: U-Net, Mask R-CNN, Mask2Former, SAM
- Multimodal Vision and Open-Vocabulary Models
- Representation Learning and Self-Supervision
- Generative Models I: Autoencoders, VAE, VQ-VAE, VQ-GAN
- Generative Models II: GAN Evolution
- Overview of Diffusion Models and Video Representations
- Production Engineering and Final Integration
Bibliography
Recommended Core Bibliography
- Prince, S. J. D. (2012). Computer Vision: Models, Learning, and Inference.
- Huang, K., Hussain, A., Wang, Q.-F., & Zhang, R. (2019). Deep Learning: Fundamentals, Theory and Applications. Cham, Switzerland: Springer. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=2029631
- Szeliski, R. (2010). Computer Vision: Algorithms and Applications. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.E8FCD1BD
Recommended Additional Bibliography
- Goodfellow, I. (2016). Deep Learning.