Master 2019/2020

Neurobayesian Models

Category 'Best Course for Career Development'
Category 'Best Course for Broadening Horizons and Diversity of Knowledge and Skills'
Category 'Best Course for New Knowledge and Skills'
Type: Elective course (Statistical Learning Theory)
Area of studies: Applied Mathematics and Informatics
When: 2nd year, 3rd module
Mode of studies: offline
Master’s programme: Statistical Learning Theory
Language: English
ECTS credits: 6

Course Syllabus


This course is devoted to Bayesian reasoning applied to deep learning models. Attendees will learn how to use probabilistic modeling to construct neural generative and discriminative models, how to use the paradigm of generative adversarial networks to perform approximate Bayesian inference, and how to model uncertainty about the weights of neural networks. Selected open problems in deep learning will also be discussed. The practical assignments cover the implementation of several modern Bayesian deep learning models.

Learning Objectives

  • The learning objective of the course is to give students basic and advanced tools for inference and learning in complex probabilistic models involving deep neural networks, such as probabilistic deep generative models and Bayesian neural networks.

Expected Learning Outcomes

  • Knowledge about different approximate inference and learning techniques for probabilistic models
  • Hands-on experience with modern probabilistic modifications of deep learning models
  • Knowledge of the building blocks needed to construct new probabilistic models suited to a given problem

Course Contents

  • Stochastic Variational Inference (SVI) and Doubly SVI (DSVI)
    SVI as a scalable alternative to variational inference for tasks with large data. Application of SVI to the latent Dirichlet allocation model.
  • Bayesian neural networks and Bayesian compression of neural networks
    Variational inference of the posterior distribution over the weights of discriminative neural networks. Local reparameterization trick for gradient variance reduction. Variational Dropout sparsifies deep neural networks: a different parameterization yields a totally different model. Soft Weight Sharing: how to save memory by quantizing the weights of a neural network.
  • Variational autoencoders (VAE) and normalizing flows (NF)
    Probabilistic PCA; VAE as a non-linear generalization of probabilistic PCA. Reparameterization trick for doubly stochastic variational inference. Extending variational approximations with normalizing flows. Examples of normalizing flows.
  • Discrete Latent Variables and Variance Reduction
    The idea of Stochastic Computation Graphs, discrete and continuous stochastic nodes, and gradient estimation: Gumbel-Softmax and REINFORCE with control variates.
  • Implicit Variational Inference using Adversarial Training
    Adversarial Variational Bayes for training a VAE with an implicit inference distribution. f-GANs as a generalization of vanilla GANs for optimizing an arbitrary f-divergence.
  • Inference in implicit probabilistic models
    Implicit and semi-implicit distributions are flexible parametric families that can be constructed with neural networks in a general way. Such distributions can be used as building blocks for probabilistic models. How to construct such distributions and how to perform inference with such models.
  • Deep MCMC
    How neural networks help MCMC methods sample from an analytical distribution, and how MCMC methods help neural networks sample from an empirical distribution.
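
Two of the building blocks named in the contents above, the reparameterization trick and Gumbel-Softmax sampling, can be sketched in a few lines. This is only an illustrative sketch in standard-library Python, not the course's assignment code: the function names, the `rng` argument, and the example values are assumptions made here, and a real implementation would use an automatic differentiation framework so that gradients can flow through the samples.

```python
import math
import random


def reparameterized_gaussian(mu, log_sigma, rng):
    """Sample z ~ N(mu, sigma^2) as z = mu + sigma * eps, eps ~ N(0, 1).

    The noise eps does not depend on (mu, log_sigma), so z is a
    deterministic, differentiable function of the parameters.
    """
    eps = rng.gauss(0.0, 1.0)
    return mu + math.exp(log_sigma) * eps


def gumbel_softmax(logits, temperature, rng):
    """Draw a relaxed one-hot sample from a categorical distribution.

    Adds Gumbel(0, 1) noise to the logits and applies a softmax with
    the given temperature; lower temperatures give samples closer to
    one-hot vectors.
    """
    # Gumbel(0, 1) noise via the inverse CDF: g = -log(-log(u)).
    # u is kept away from 0 and 1 to avoid log(0).
    gumbels = [
        -math.log(-math.log(rng.uniform(1e-12, 1.0 - 1e-12)))
        for _ in logits
    ]
    scores = [(l + g) / temperature for l, g in zip(logits, gumbels)]
    # Numerically stable softmax: subtract the maximum before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example values, chosen only for illustration.
rng = random.Random(0)
z = reparameterized_gaussian(mu=1.0, log_sigma=-1.0, rng=rng)
y = gumbel_softmax([1.0, 0.5, -0.5], temperature=0.5, rng=rng)
print(z, y)
```

In both functions the randomness is drawn from a fixed, parameter-free distribution and then transformed, which is what makes pathwise gradient estimates possible for continuous and relaxed discrete latent variables.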

Assessment Elements

  • Practical assignments (partially blocking for the final grade)
    Practical assignments consist of implementing models and methods from the course in Python and analysing their behavior. Topics: Sparse Variational Dropout (SVDO), NF, VAE, Discrete Latent Variables (DLV).
  • Exam (blocking)
    2nd year. The exam took place in the 3rd module.

Interim Assessment

  • Interim assessment (3rd module)
    0.3 * Exam + 0.7 * Practical assignments
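
As a worked example of the weighting above, the following sketch computes the interim grade from two component scores. The function name and the sample scores are hypothetical; the syllabus does not state the grading scale or the rounding rule, and the blocking conditions on the exam and assignments are not modeled here.

```python
def interim_grade(exam, practical, w_exam=0.3, w_practical=0.7):
    """Weighted interim grade: 0.3 * Exam + 0.7 * Practical assignments."""
    return w_exam * exam + w_practical * practical


# Hypothetical scores, for illustration only.
print(round(interim_grade(exam=6, practical=8), 2))
```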


Recommended Core Bibliography

  • Bishop, C. M. (n.d.). Pattern Recognition and Machine Learning. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.EBA0C705
  • Murphy, K. P. (2012). Machine Learning : A Probabilistic Perspective. Cambridge, Mass: The MIT Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=480968
  • Goodfellow, I., Bengio, Y., & Courville, A. (2018). Deep Learning (Russian edition: Глубокое обучение). DMK Press. 652 p. ISBN: 978-5-97060-618-6. Electronic text. Lan Electronic Library System. URL: https://e.lanbook.com/book/107901

Recommended Additional Bibliography

  • Blundell, C., Cornebise, J., Kavukcuoglu, K., & Wierstra, D. (2015). Weight Uncertainty in Neural Networks. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1505.05424
  • Grathwohl, W., Choi, D., Wu, Y., Roeder, G., & Duvenaud, D. (2017). Backpropagation through the Void: Optimizing control variates for black-box gradient estimation. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1711.00123
  • Jang, E., Gu, S., & Poole, B. (2016). Categorical Reparameterization with Gumbel-Softmax. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1611.01144
  • Kingma, D. P., & Welling, M. (2013). Auto-Encoding Variational Bayes. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1312.6114
  • Kingma, D. P., Salimans, T., & Welling, M. (2015). Variational Dropout and the Local Reparameterization Trick. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1506.02557
  • Levy, D., Hoffman, M. D., & Sohl-Dickstein, J. (2017). Generalizing Hamiltonian Monte Carlo with Neural Networks. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1711.09268
  • Louizos, C., & Welling, M. (2017). Multiplicative Normalizing Flows for Variational Bayesian Neural Networks. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1703.01961
  • Maddison, C. J., Mnih, A., & Teh, Y. W. (2016). The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1611.00712
  • Hoffman, M., Blei, D. M., Wang, C., & Paisley, J. (2013). Stochastic Variational Inference. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.C4CCD6D4
  • Mescheder, L., Nowozin, S., & Geiger, A. (2017). Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1701.04722
  • Molchanov, D., Ashukha, A., & Vetrov, D. (2017). Variational Dropout Sparsifies Deep Neural Networks. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1701.05369
  • Nowozin, S., Cseke, B., & Tomioka, R. (2016). f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1606.00709
  • Rezende, D. J., & Mohamed, S. (2015). Variational Inference with Normalizing Flows. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1505.05770
  • Wang, S. I., & Manning, C. D. (2013). Fast dropout training. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.C2036E9B
  • Song, J., Zhao, S., & Ermon, S. (2017). A-NICE-MC: Adversarial Training for MCMC. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1706.07561
  • Tucker, G., Mnih, A., Maddison, C. J., Lawson, D., & Sohl-Dickstein, J. (2017). REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1703.07370
  • Ullrich, K., Meeds, E., & Welling, M. (2017). Soft Weight-Sharing for Neural Network Compression. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1702.04008