Bachelor's programme 2020/2021

## Introduction to Data Analysis

Status:
Field of study: 41.03.06. Public Policy and Social Sciences
When taught: 1st year, module 2
Mode of study: with an online course
Audience: own campus only
Instructors: Бизяев Антон Игоревич, Жуков Павел Владимирович, Шилова Надежда Викторовна
Language: English
Credits: 3

### Course Syllabus

#### Abstract

This course introduces modern data science methods that are useful in both research and industry careers. Its main focus is teaching students to find data on the Internet, process it, and perform simple data analysis. Students practise critical thinking and apply the scientific approach to problem solving. The course starts with the basics of working with data, and students learn to perform basic data analysis in MS Excel. In the first part of the course, students learn how to sort and filter data, calculate various characteristics of distributions, and create graphs and charts that follow accepted design standards. Part of the course also covers the main methods of storing and using data. The second part covers the main methods used to obtain scientific results in the humanities, from time series and linear regression analysis to the simplest predictive modelling. Students learn to apply all of these techniques in MS Excel.

#### Learning Objectives

• To provide an introduction to modern data science techniques
• To introduce the main concepts of scientific data analysis
• To show the best practices of working with data
• To train basic skills in MS Excel

#### Expected Learning Outcomes

• Demonstrate knowledge of basic concepts of data science
• Perform exploratory data analysis in MS Excel 2016
• Formulate and solve simple scientific problems
• Understand the notions of a continuous random variable and a probability distribution. Know how to apply the central limit theorem.
• Know the difference between sample and population analysis. Know different sampling methods.
• Know the methods of interval estimation and t-statistics. Be able to work with different kinds of data.
• Understand the notion of null hypothesis. Be able to test hypotheses with real-world applications.

#### Course Contents

• Introduction to Data Analysis
Applied data science in international relations. Examples of applications, and examples of misuse and mistakes.
• The basic types of probability distributions
Main characteristics of the distributions: mean, mode, median. Correlations and causality. Calculation of maximum and minimum. Sorting and filtering data. Construction of tables, charts and graphs.
• Data mining, storage and processing
Main methods of data mining, storage and processing. Types of variables. Prospects for the development and application of data analysis. Processing polls and ratings.
• The simplest text analysis
The simplest text analysis: concatenation, finding string length, formatting. Conditional functions in Excel: IF, COUNTIF, SUMIF. Pivot tables.
• Basic Data Analysis. Time series
Basic methods for forecasting time series. Calculation of quality metrics of predictive models.
• Trend and seasonality
Trend and seasonality. Analysis of time series in Excel. Coefficient of determination (R²).
• Linear regression analysis
Linear regression analysis. Construction and visualization of regression lines. Obtaining predictions using a linear regression model. The concept of splitting data into training and test sets. Model quality evaluation.
• Continuous random variables and probability distributions. The normal distribution. The central limit theorem
The notion of a continuous random variable. The notion of a probability distribution. The central limit theorem.
• Sampling and sampling distributions
Samples. Sampling methods. The difference between sample and population analysis.
• Interval estimation
Methods of interval estimation. t-statistics. Working with different kinds of data.
• Hypothesis testing
The null hypothesis. Hypothesis testing. A real-world example of hypothesis testing.
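
The linear regression topics listed above (fitting a line, splitting data into training and test parts, and evaluating quality with the coefficient of determination) can be illustrated with a minimal sketch. The course itself works in MS Excel; Python is used here only for illustration. The data values and the function names `fit_line` and `r_squared` are hypothetical and do not come from the course materials.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical observations (invented for illustration only).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

# Keep the first six points for training and hold out the last two for testing.
train_x, test_x = xs[:6], xs[6:]
train_y, test_y = ys[:6], ys[6:]

slope, intercept = fit_line(train_x, train_y)
test_predictions = [slope * x + intercept for x in test_x]
r2_test = r_squared(test_y, test_predictions)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, test R^2={r2_test:.2f}")
```

The split here is sequential rather than random, which is one common choice when the observations are ordered in time. For unordered data a random split is more usual.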

#### Assessment Elements

• Homework 1 (Data Culture)
• Homework 2 (Data Culture)
• Exam (Data analysis and Data Culture)
• Homework 3 (Data Culture)

#### Interim Assessment

• Interim assessment (2 module)
0.3 * Exam (Data analysis and Data Culture) + 0.2 * Homework 1 (Data Culture) + 0.2 * Homework 2 (Data Culture) + 0.2 * Homework 3 (Data Culture) + 0.1 * Midterm (Data Culture)
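
As a worked example of the formula above, the weighted sum can be sketched as follows. The weights are taken from the formula. The scores, the 10-point scale, and the name `interim_grade` are assumptions made for illustration, and the syllabus does not state a rounding rule, so none is applied.

```python
# Weights copied from the interim assessment formula.
WEIGHTS = {
    "Exam (Data analysis and Data Culture)": 0.3,
    "Homework 1 (Data Culture)": 0.2,
    "Homework 2 (Data Culture)": 0.2,
    "Homework 3 (Data Culture)": 0.2,
    "Midterm (Data Culture)": 0.1,
}


def interim_grade(scores):
    """Weighted sum of element scores; no rounding rule is assumed."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


# Hypothetical scores on an assumed 10-point scale.
example_scores = {
    "Exam (Data analysis and Data Culture)": 8,
    "Homework 1 (Data Culture)": 7,
    "Homework 2 (Data Culture)": 9,
    "Homework 3 (Data Culture)": 6,
    "Midterm (Data Culture)": 10,
}

# 0.3*8 + 0.2*7 + 0.2*9 + 0.2*6 + 0.1*10 = 7.8
print(round(interim_grade(example_scores), 2))
```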

#### Recommended Core Bibliography

• Introductory statistics for business and economics, Wonnacott T. H., Wonnacott R. J., 1990
• Statistics for business and economics, Newbold P., Carlson W. L., 2013