The course gives students a basic knowledge of statistics and data analysis techniques. It consists of three parts. The first part covers the general ideas of statistics and data analysis, focusing on descriptive statistics and basic data manipulation. The second part moves on to inferential statistics and hypothesis testing. The third part applies machine learning techniques to data analysis. All practical work in the course is done in Python. The course is worth 3 credits.
Learning Objectives
In this course, students will acquire a solid basis in data manipulation and visualization.
Expected Learning Outcomes
After this session, students should be able to:
- Apply numerical techniques for describing and summarizing data
- Identify, compute, and interpret descriptive statistical summary measures
- Differentiate between the measures of central tendency, dispersion, and relative standing
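The three families of summary measures named in the outcomes can be illustrated with a minimal Python sketch. The data set below is hypothetical and chosen only for illustration, and the standard-library `statistics` module is one possible tool among several; the course itself may use other libraries.

```python
import statistics

# Hypothetical sample of scores (illustrative data, not taken from the course).
scores = [4, 5, 5, 6, 7, 8, 10]

# Measures of central tendency: where the data are centred.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion: how spread out the data are.
value_range = max(scores) - min(scores)
sample_sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)

# Measures of relative standing: where one value sits relative to the rest.
quartiles = statistics.quantiles(scores, n=4)  # three cut points: Q1, Q2, Q3
z_of_max = (max(scores) - mean) / sample_sd    # z-score of the largest value

print(f"mean={mean:.2f}, median={median}, mode={mode}")
print(f"range={value_range}, sd={sample_sd:.2f}")
print(f"quartiles={quartiles}, z(max)={z_of_max:.2f}")
```

Central tendency and dispersion describe the sample as a whole, while a z-score or quartile describes the position of a single observation within it.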
Course Contents
Introduction
Data Basics
Graphical Descriptive Techniques
Numerical Descriptive Techniques
Data Collection and Sampling Theory
Probability
Discrete Probability Distributions
Continuous Probability Distributions
Sampling Distributions
Descriptive statistics: System of variables
Descriptive statistics: Qualitative and Quantitative Data.
Measures of Central Location
Estimation
Hypothesis Testing Framework
Inference for Numerical Data
Analysis of Variance
Regression Analysis
Assessment Elements
In-class Assignments
Final Test
The test will be conducted on the Smart LMS course page using Safe Exam Browser. A mock test will be provided a few weeks in advance. The test consists of a quiz and a problems section. In the problems section, the student calculates statistics and enters short answers, plots a graph and selects the most appropriate interpretation from a list of answers, and performs other calculations on a dataset.
Home Assignments
To get full marks for the home assignments, students need to solve practical tasks in Smart LMS, which are checked automatically. Home assignments will be distributed every 1-2 weeks, with the deadline before the next seminar.
Attendance
Attendance is not graded. However, unexcused absences can lead to a grade deduction or even disqualification. Two absences from lectures and two absences from seminars (counted separately) are excused per semester. If a student is absent for a valid reason, the student must provide a valid Certificate of Illness/Medical Note to the Students' Office within 1 (one) working day of the end of their sick leave; otherwise the absence will be graded as 0 (zero). Each additional absence beyond the allowed number lowers the final grade for the course by 0.3 points, without exception. For example, if by the end of the course a student has collected 7.5 points but missed three lectures and two seminars without a valid reason, one lecture absence exceeds the allowance, so 0.3 points are deducted from the final grade: 7.5 - 0.3 = 7.2.
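The deduction rule can be sketched in Python as follows. This is one reading of the policy, assuming that lectures and seminars each have their own allowance of two excused absences (which is what the worked example implies); the function and parameter names are hypothetical.

```python
def attendance_penalty(missed_lectures, missed_seminars,
                       excused_per_type=2, penalty_per_absence=0.3):
    """Return the points deducted for unexcused absences.

    Assumption: lectures and seminars are counted separately, each with
    its own allowance of `excused_per_type` absences.
    """
    extra_lectures = max(0, missed_lectures - excused_per_type)
    extra_seminars = max(0, missed_seminars - excused_per_type)
    return penalty_per_absence * (extra_lectures + extra_seminars)


# The example from the policy: 7.5 points, three lectures and two seminars missed.
grade_before = 7.5
grade_after = round(grade_before - attendance_penalty(3, 2), 1)
print(grade_after)
```

Only the third missed lecture exceeds the allowance here, so a single 0.3 deduction applies.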
Group Project (Exam)
The group project is carried out in teams of 2-3 students (individual projects and teams of four are not allowed). The project consists of three parts: data reconciliation, an intermediate stage, and a final stage. Missing an intermediate deadline incurs a penalty. Work submitted after the deadline is not checked. The project defence is mandatory for the project to be evaluated and takes place during the exam session.
Seminar Participation
To get full marks for participation, students need to take an active part in class discussions and demonstrate familiarity with the assigned readings and lecture material, including being prepared to answer questions about the home assignment that the class teacher may pose.
Before each class, students will receive recommendations on how to prepare for the in-class activities: discussing theoretical points from the lectures and practicing with Python frameworks for data analysis.
Midterm Test
The test will be conducted on the Smart LMS course page using Safe Exam Browser. A mock test will be provided a few weeks in advance. The test consists of a quiz and a problems section. In the problems section, the student calculates statistics and enters short answers, plots a graph and selects the most appropriate interpretation from a list of answers, and performs other calculations on a dataset.
Interim Assessment
2025/2026 2nd module
min(0.1 * Home Assignments + 0.2 * In-class Assignments + 0.1 * Seminar Participation + 0.15 * Midterm Test + 0.15 * Final Test + 0.3 * Group Project (Exam) + 0 * Attendance, 8). Remark: In accordance with the Regulations for Interim and Ongoing Assessments of Students at National Research University Higher School of Economics, grades awarded based on interim assessment outcomes of the discipline-prerequisites for the independent exam on digital competency may not exceed 8 points.
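As a sketch, the formula above can be written in Python. The weights and the cap of 8 come from the formula itself; the dictionary keys, the function name, and the sample scores are hypothetical, and no rounding rule is applied because none is stated here.

```python
# Weights taken from the interim assessment formula.
WEIGHTS = {
    "home_assignments": 0.1,
    "in_class_assignments": 0.2,
    "seminar_participation": 0.1,
    "midterm_test": 0.15,
    "final_test": 0.15,
    "group_project": 0.3,
    "attendance": 0.0,
}
GRADE_CAP = 8  # the interim grade may not exceed 8 points


def interim_grade(scores):
    """Weighted sum of the assessment elements, capped at GRADE_CAP."""
    weighted = sum(weight * scores[name] for name, weight in WEIGHTS.items())
    return min(weighted, GRADE_CAP)


# Hypothetical student with 10 points on every element.
perfect = {name: 10 for name in WEIGHTS}
print(interim_grade(perfect))  # the weighted sum is about 10, so the cap applies

# Hypothetical student with 7 points on every element: the cap has no effect.
steady = {name: 7 for name in WEIGHTS}
print(round(interim_grade(steady), 2))
```

Because the non-zero weights sum to 1, the cap only matters when the weighted sum would exceed 8.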
Bibliography
Recommended Core Bibliography
Bluman, A. G. (2018). Elementary Statistics: A Step by Step Approach.
Frederick J Gravetter, Larry B. Wallnau, Lori-Ann B. Forzano, & James E. Witnauer. (2020). Essentials of Statistics for the Behavioral Sciences, Edition 10. Cengage Learning.
James, G., et al. (2013). An Introduction to Statistical Learning. Springer. 426 pp.
Recommended Additional Bibliography
Boris Mirkin. (2011). Core Concepts in Data Analysis: Summarization, Correlation and Visualization (Vol. 2011). Springer.
Döbler, M., & Grössmann, T. (2019). Data Visualization with Python : Create an Impact with Meaningful Data Insights Using Interactive and Engaging Visuals. Packt Publishing.
Frederick J Gravetter, Lori-Ann B. Forzano, & Tim Rakow. (2021). Research Methods For The Behavioural Sciences, Edition 1. Cengage Learning.
Instructors
Карпов Максим Евгеньевич
Сохраби Маджид