
Master's Programme «Data Science»

Stochastic Modelling

2016/2017
Academic year
ENG
Taught in English
5
Credits
Status:
Elective course
When taught:
2nd year, modules 1 and 2

Course Syllabus

Abstract

The first part of the course deals with unsupervised learning techniques for independent observations (including clustering and probabilistic principal component analysis), before moving on to models for sequential data (including the Poisson process and Kalman filtering). Finally, we introduce basic sampling techniques. The second part of the course is devoted to the study of stochastic modeling and the simulation of random processes. It covers discrete- and continuous-time Markov chains, methods for finding their stationary states, and processes described by Markov chains. The course proceeds with practical algorithms based on Markov chains: hidden Markov models (HMM), Markov random fields (MRF), and Markov chain Monte Carlo methods (MCMC).
Learning Objectives

  • The main objective of the course is to learn the fundamental principles used in the modeling of random systems, as well as advanced algorithms based on them. These algorithms are widely used in modern technologies for information retrieval, speech and language processing and recognition, bioinformatics, and many other fields. An understanding of these basic algorithms is required for the training of professionals in the field of mathematical modeling.
  • The learning objective of the course «Stochastic Modeling» is to provide students with essential tools, including:
    • Clustering (K-means and Gaussian mixture models);
    • EM algorithm;
    • Principal Component Analysis (PCA), probabilistic PCA, kernel PCA;
    • Kalman filtering;
    • Basic sampling methods;
    • Random processes and Markov chains;
    • Hidden Markov models;
    • Monte Carlo methods.
Expected Learning Outcomes

  • Student knows essential techniques for clustering.
  • Student knows principal component analysis and its generalisations.
  • Student forecasts time series using Kalman filtering.
  • Student knows basic sampling methods.
  • Student knows basic notions in the theory of random processes and Markov chains.
  • Student applies Markov chains to simulate various random processes in real-world problems (speech recognition, authorship attribution, etc.).
  • Student understands the capabilities and limitations of existing algorithms.
Course Contents

  • Clustering
    K-means algorithm, initialization of the K-means algorithm using the k-means++ procedure, Gaussian mixture models, general form of the EM algorithm, the EM algorithm applied to Gaussian mixtures, link between K-means and the Gaussian mixture model (a K-means sketch follows this list).
  • PCA
    Review of Principal Component Analysis, probabilistic PCA (pPCA) and factor analysis, learning pPCA using the EM algorithm and by direct maximization of the likelihood, kernel PCA (a classical PCA sketch follows this list).
  • Kalman filtering
    Derivation of the Kalman filtering equations, definition of the Kalman gain, learning the model using the EM algorithm (a one-dimensional filter sketch follows this list).
  • Sampling methods
    Rejection sampling, importance sampling (an importance-sampling sketch follows this list).
  • Introduction to Markov processes
    Stochastic processes. Markov processes. Discrete and continuous time processes. Probabilities.
  • Discrete time Markov chains
    Discrete time Markov chains (DTMC). Applications of DTMC. Stationary DTMC. Transition matrix. N-step transition matrix. The Chapman-Kolmogorov equations. Classification of states. Reachable and communicating states. Equivalence classes. Communicating classes. Irreducibility. Periodicity. Recurrence and transience. First passage time. Positive recurrence. Recurrence time. Ergodic states. Fundamental theorem of Markov chains. Steady state. Steady-state equations. Absorbing states. Finite absorbing chains. Transition matrix structure. Absorption probabilities. Limiting distributions. (A steady-state computation sketch follows this list.)
  • Continuous time Markov chains
    Continuous time Markov chains (CTMC). Examples of CTMC. Transition probability functions. Holding times and the exponential distribution. Embedded DTMC. Transition rates. Absorbing states. Transition probability diagram. Classification of states. Long-run behavior of CTMC. Positive recurrence. Fundamental theorem for CTMC. Steady-state equations. Example of birth and death processes. (A CTMC steady-state sketch follows this list.)
  • Hidden Markov models (HMM)
    Introduction to hidden Markov models. Prediction of states. Evaluation of model parameters. Transition and signal matrices. Three main problems of HMM. Evaluation of a signal sequence. Forward-backward procedure. Finding the most probable sequence. Decoding. Viterbi algorithm. Finding model parameters: training HMMs. Baum-Welch algorithm. HMM applications in speech recognition. (A Viterbi sketch follows this list.)
  • Markov chain Monte Carlo (MCMC)
    Random walks. Sampling from a distribution. Monte Carlo integration. Construction of Markov chains. Metropolis-Hastings algorithm. (A Metropolis-Hastings sketch follows this list.)
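
The minimal Python/NumPy sketches below illustrate several of the topics above. All data, parameters, and modeling choices in them are toy assumptions chosen for demonstration; they are not course materials. First, a K-means sketch (naive random initialization; the k-means++ procedure is not shown):

    import numpy as np

    def kmeans(X, k, n_iter=100, seed=0):
        """Plain K-means: alternate assignment and centroid-update steps."""
        rng = np.random.default_rng(seed)
        # Naive initialization: k distinct points from X (k-means++ is smarter).
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            # Assignment step: each point joins its nearest centroid.
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = d.argmin(axis=1)
            # Update step: each centroid becomes the mean of its points
            # (assumes no cluster goes empty, which holds for this toy data).
            new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        return centers, labels

    # Toy usage: two well-separated Gaussian blobs.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
    centers, labels = kmeans(X, k=2)
    print(centers)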
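
A classical PCA sketch via the eigendecomposition of the sample covariance (probabilistic and kernel PCA are not shown; the projection dimension is an assumption):

    import numpy as np

    def pca(X, n_components):
        """Classical PCA: project onto the top eigenvectors of the covariance."""
        Xc = X - X.mean(axis=0)                 # center the data
        cov = Xc.T @ Xc / (len(X) - 1)          # sample covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: covariance is symmetric
        order = np.argsort(eigvals)[::-1]       # sort eigenvalues descending
        W = eigvecs[:, order[:n_components]]    # principal directions
        return Xc @ W, eigvals[order]           # scores and explained variances

    # Toy usage: reduce 5-dimensional data to 2 components.
    X = np.random.default_rng(0).normal(size=(200, 5))
    scores, variances = pca(X, 2)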
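
A Kalman filter sketch for a one-dimensional linear-Gaussian state-space model (all model parameters below are assumed for illustration):

    import numpy as np

    def kalman_filter_1d(y, a=1.0, q=0.1, c=1.0, r=1.0, m0=0.0, p0=1.0):
        """Filter for x_t = a*x_{t-1} + N(0, q), y_t = c*x_t + N(0, r)."""
        m, p = m0, p0
        means = []
        for yt in y:
            # Predict step: propagate the estimate through the dynamics.
            m_pred, p_pred = a * m, a * p * a + q
            # Update step: the Kalman gain weighs prediction vs. observation.
            k = p_pred * c / (c * p_pred * c + r)
            m = m_pred + k * (yt - c * m_pred)
            p = (1 - k * c) * p_pred
            means.append(m)
        return np.array(means)

    # Toy usage: noisy observations of a slowly drifting state.
    rng = np.random.default_rng(0)
    y = np.cumsum(rng.normal(0, 0.3, 100)) + rng.normal(0, 1.0, 100)
    print(kalman_filter_1d(y)[-5:])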
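
An importance sampling sketch, estimating an expectation under a target density using draws from a wider proposal (the target, proposal, and test function are assumptions):

    import numpy as np

    def normal_pdf(x, mu, sigma):
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

    # Estimate E[X^2] for X ~ N(0, 1) using samples from the proposal N(0, 2).
    rng = np.random.default_rng(0)
    xs = rng.normal(0.0, 2.0, size=100_000)          # draws from the proposal q
    w = normal_pdf(xs, 0, 1) / normal_pdf(xs, 0, 2)  # weights p(x) / q(x)
    print((w * xs ** 2).mean())                      # close to the true value 1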
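
A steady-state computation sketch for a discrete-time Markov chain (the transition matrix is a toy assumption; the chain is irreducible and aperiodic, so the stationary distribution is unique):

    import numpy as np

    # Toy 3-state transition matrix; each row sums to one.
    P = np.array([[0.9, 0.1, 0.0],
                  [0.2, 0.7, 0.1],
                  [0.1, 0.0, 0.9]])

    # The steady state solves pi P = pi with sum(pi) = 1:
    # take the left eigenvector of P for eigenvalue 1 and normalize it.
    eigvals, eigvecs = np.linalg.eig(P.T)
    pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
    pi /= pi.sum()
    print(pi)       # stationary distribution
    print(pi @ P)   # equals pi, confirming steadiness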
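
A CTMC steady-state sketch using the generator (rate) matrix of a small birth-and-death-like chain (the rates are toy assumptions):

    import numpy as np

    # Generator matrix Q of a toy 3-state CTMC; each row sums to zero.
    Q = np.array([[-2.0,  2.0,  0.0],
                  [ 1.0, -3.0,  2.0],
                  [ 0.0,  1.0, -1.0]])

    # The steady state solves pi Q = 0 with sum(pi) = 1;
    # append the normalization constraint and solve by least squares.
    A = np.vstack([Q.T, np.ones(3)])
    b = np.array([0.0, 0.0, 0.0, 1.0])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(pi)  # long-run fraction of time spent in each state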
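
A Viterbi algorithm sketch recovering the most probable hidden state sequence of an HMM (the initial, transition, and signal matrices and the observations are toy assumptions):

    import numpy as np

    def viterbi(obs, pi, A, B):
        """Most probable hidden path for `obs`, computed in the log domain."""
        T, N = len(obs), len(pi)
        logpi, logA, logB = np.log(pi), np.log(A), np.log(B)
        delta = np.zeros((T, N))     # best log-probability ending in each state
        psi = np.zeros((T, N), int)  # back-pointers to the best predecessor
        delta[0] = logpi + logB[:, obs[0]]
        for t in range(1, T):
            scores = delta[t - 1][:, None] + logA  # scores[i, j]: i -> j
            psi[t] = scores.argmax(axis=0)
            delta[t] = scores.max(axis=0) + logB[:, obs[t]]
        # Backtrack from the best final state.
        path = [int(delta[-1].argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(psi[t][path[-1]]))
        return path[::-1]

    # Toy 2-state model emitting 2 possible signals.
    pi = np.array([0.6, 0.4])
    A = np.array([[0.7, 0.3], [0.4, 0.6]])
    B = np.array([[0.9, 0.1], [0.2, 0.8]])
    print(viterbi([0, 0, 1, 1, 0], pi, A, B))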
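
A Metropolis-Hastings sketch with a symmetric random-walk proposal (the unnormalized target density is an assumption):

    import numpy as np

    def metropolis_hastings(log_target, n_samples, step=1.0, x0=0.0, seed=0):
        """Random-walk Metropolis: accept a move with probability min(1, ratio)."""
        rng = np.random.default_rng(seed)
        x, samples = x0, []
        for _ in range(n_samples):
            x_new = x + rng.normal(0.0, step)  # symmetric proposal
            # Proposal terms cancel in the ratio because it is symmetric.
            if np.log(rng.uniform()) < log_target(x_new) - log_target(x):
                x = x_new
            samples.append(x)
        return np.array(samples)

    # Target: standard normal density, known only up to a constant.
    samples = metropolis_hastings(lambda x: -0.5 * x ** 2, 50_000)
    print(samples[10_000:].mean(), samples[10_000:].std())  # ~0, ~1 after burn-in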
Assessment Elements

  • non-blocking Homework 1
    One written homework.
  • non-blocking Homework 2
    Cumulative grade for written and coding assignments completed during the 2nd module.
  • non-blocking MidTerm Exam
  • non-blocking Exam
Interim Assessment

  • Interim assessment (2 module)
    0.3 * Exam + 0.15 * Homework 1 + 0.3 * Homework 2 + 0.25 * MidTerm Exam
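    For illustration only (hypothetical scores on the assumed 10-point scale): Exam = 8, Homework 1 = 6, Homework 2 = 7, and MidTerm Exam = 9 give 0.3 * 8 + 0.15 * 6 + 0.3 * 7 + 0.25 * 9 = 7.65.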
Bibliography

Recommended Core Bibliography

  • Bishop, C. M. (2006). Pattern Recognition and Machine Learning. New York: Springer. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.EBA0C705

Recommended Additional Bibliography

  • Ibe, O. C. (2009). Markov Processes for Stochastic Modeling. Amsterdam: Elsevier Ltd. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=319668