Master 2020/2021

Contemporary Data Analysis: Methodology and Methods of Interdisciplinary Research

Area of studies: Applied Mathematics and Informatics
When: Year 1, modules 1 and 2
Mode of studies: offline
Open to: students of all HSE University campuses
Instructors: Valentina Kuskova
Master’s programme: Applied Statistics with Network Analysis
Language: English
ECTS credits: 4

Course Syllabus


This is a required foundational course for master's students in the “Applied Statistics with Network Analysis” program, designed to familiarize them with the most recent developments in interdisciplinary statistical methods. Students will get an overview of data and of approaches to analyzing them (remember, “data” is always plural!), including complex models. The course also emphasizes problem formulation at the intersection of mathematics and the social sciences and integrates the most important concepts from probability theory. Overall, it is designed as a "gateway" to graduate work in statistics, where mathematical concepts are bridged with applied concepts and research design, depending on the discipline.
Learning Objectives

  • The course gives students an important foundation for developing and conducting their own research, as well as for evaluating the research of others.
Expected Learning Outcomes

  • Know the four major areas that the contemporary field of statistics is based on: data management, statistical inference, statistical prediction, and statistical reporting.
  • Know the most recent advances in network science and applied statistical methods, complex statistical modeling, analysis, and forecasting.
  • Be able to build and estimate formalized mathematical models that describe real-life situations.
  • Have a working knowledge of the mathematics of data analysis.
  • Be able to offer constructive criticism and identify issues with the use of statistical methods in published work.
  • Be able to apply data analysis tools to real-life problems.
  • Be able to evaluate data, find appropriate functions that describe the data, and visualize data.
  • Know contemporary software programs used to analyze data.
  • Have a working knowledge of different ways of using software programs for data analysis.
Course Contents

  • Introduction
    The first session will introduce the main concepts of contemporary data analysis, with an overview of the field and everything that needs to be taken into account when working with data.
  • Social data
    This session sets up the framework for collecting social data, that is, for translating real life into numbers. We will also talk about scaling procedures, related issues, and applications.
  • Summated rating scale overview
    This session provides the theoretical basis and the general approach to constructing semantic-differential scales. This topic will also provide an overview of the covariance structure analysis necessary for CFA and of the software used to analyze summated scales.
  • Scaling procedures: issues and applications
    This session continues with summated rating scales, different areas of their application, issues with validity and reliability, and additional software analysis.
  • Missing data and other data issues
    This session covers the basics of fake data, missing data, bootstrapping techniques, and other related questions that could arise when working with real-life data.
  • Introduction to causal inference
    This session will cover the various approaches to establishing causality, starting with experimental design and ending with adjustments to modeling when setting up experiments is not possible. It will also discuss multiple instruments used to establish causal inference.
  • Causal inference – special topics I (RDD)
    This session will focus on regression discontinuity design, a non-parametric method that is widely used in social sciences for establishing causality.
  • Causal inference – special topics II (IV)
    This session will continue with special tools in the causality domain, this time looking at instrumental variables as a way to correct for some of the issues that preclude establishing causality. We will also look at other ways to use IVs.
  • Spatial data analysis
    This session will look into incorporating spatial data into analysis, including geographic mapping and other related methods.
  • Prediction
    This session will look into the basics of predictive modeling, starting with the foundations of building predictions with dynamic modeling.
  • Predictive modeling
    This session will look at advanced techniques of predictive modeling, including neural networks and other advanced methods.
  • Conclusion: overview of the field
    This session is designed to give a final look at the vast field of data analysis, with the most up-to-date methods reviewed and put together into one coherent whole.
Assessment Elements

  • non-blocking Final In-Class or Take-home exam (at the discretion of the instructor)
  • non-blocking Homework Assignments (5 x Varied points)
  • non-blocking In-Class Labs (9-10 x Varied points)
  • non-blocking Quizzes (Best 9 of 10, Varied points)
Interim Assessment

  • Interim assessment (module 2)
    0.5 * Final In-Class or Take-home exam (at the discretion of the instructor) + 0.2 * Homework Assignments (5 x Varied points) + 0.2 * In-Class Labs (9-10 x Varied points) + 0.1 * Quizzes (Best 9 of 10, Varied points)
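
The weighting above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not an official grade calculator: the function names, the assumption that every component is already expressed on the same scale, and the assumption that quizzes are equally weighted when the best 9 of 10 are kept are all hypothetical, since the syllabus only states "Varied points".

```python
# Weights taken from the interim assessment formula in the syllabus.
WEIGHTS = {"exam": 0.5, "homework": 0.2, "labs": 0.2, "quizzes": 0.1}


def best_of(scores, keep):
    """Average of the `keep` highest scores.

    Assumption (not in the syllabus): all scores are on the same scale
    and count equally.
    """
    top = sorted(scores, reverse=True)[:keep]
    return sum(top) / len(top)


def interim_grade(exam, homework, labs, quizzes):
    """Weighted sum of the four component scores.

    Assumption: each argument is a single component score, all on one
    common scale.
    """
    components = {
        "exam": exam,
        "homework": homework,
        "labs": labs,
        "quizzes": quizzes,
    }
    return sum(WEIGHTS[name] * score for name, score in components.items())


if __name__ == "__main__":
    # Hypothetical scores on a common scale; one missed quiz is dropped.
    quiz_component = best_of([9, 8, 10, 0, 9, 10, 8, 9, 10, 9], keep=9)
    print(interim_grade(exam=8, homework=7, labs=9, quizzes=quiz_component))
```

Because the weights sum to 1, the result stays on the same scale as the component scores.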


Recommended Core Bibliography

  • Brown, T. A. (2015). Confirmatory Factor Analysis for Applied Research (2nd ed.). New York: The Guilford Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=831411
  • Denis, D. J. (2016). Applied Univariate, Bivariate, and Multivariate Statistics. Hoboken, New Jersey: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1091881
  • Raykov, T., & Marcoulides, G. A. (2006). A First Course in Structural Equation Modeling (2nd ed.). Mahwah, NJ: Routledge. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=188193
  • Wiedermann, W., & von Eye, A. (Eds.). (2016). Statistics and Causality: Methods for Applied Empirical Research. John Wiley & Sons. Retrieved from https://ebookcentral.proquest.com/lib/hselibrary-ebooks/detail.action?docID=4530803

Recommended Additional Bibliography

  • Khandker, S. R., Koolwal, G. B., & Samad, H. A. (2010). Handbook on Impact Evaluation: Quantitative Methods and Practices. Washington, D.C.: World Bank Publications. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=305052
  • Little, R. J. A., & Rubin, D. B. (2002). Statistical Analysis with Missing Data (2nd ed.). Hoboken: Wiley-Interscience. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=838162
  • Pearl, J., Glymour, M., & Jewell, N. P. (2016). Causal Inference in Statistics: A Primer. Chichester, West Sussex, UK: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1161971