Bachelor's programme 2020/2021

Introduction to Data Analysis

Field of study: 41.03.06. Public Policy and Social Sciences
When taught: 1st year, module 2
Mode of study: with an online course
Audience: own campus only
Instructors: Бизяев Антон Игоревич, Жуков Павел Владимирович, Шилова Надежда Викторовна
Language: English
Credits: 3

Course Syllabus

Abstract

This course introduces modern data science methods that are useful in both research and industry careers. Its main focus is teaching students to find data on the Internet, process it, and perform simple data analysis. Students are trained to think critically and to apply the scientific approach to problem solving. The course starts with the basics of working with data, and students learn to perform basic data analysis in MS Excel. In the first part of the course, students learn to sort and filter data, calculate various characteristics of distributions, and create graphs and charts that follow accepted design standards. This part also covers the main methods of storing and using data. The second part covers the main methods used to obtain scientific results in the humanities, from time series and linear regression analysis to the simplest predictive modelling. Students learn to apply all of these techniques in MS Excel.

Learning Objectives

  • To provide an introduction to modern data science techniques
  • To introduce the main concepts of scientific data analysis
  • To show the best practices of working with data
  • To train basic skills in MS Excel

Expected Learning Outcomes

  • Demonstrate knowledge of basic concepts of data science
  • Perform exploratory data analysis in MS Excel 2016
  • Formulate and solve simple scientific problems
  • Understand the notions of a continuous random variable and a probability distribution. Know how to apply the central limit theorem.
  • Know the difference between sample and population analysis. Know different sampling methods.
  • Know the methods of interval estimation and t-statistics. Be able to work with different kinds of data.
  • Understand the notion of a null hypothesis. Be able to test hypotheses in real-world applications.

Course Contents

  • Introduction to Data Analysis
    Applied data science in international relations. Examples of applications, and examples of their misuse and of common mistakes.
  • The basic types of probability distributions
    Main characteristics of the distributions: mean, mode, median. Correlations and causality. Calculation of maximum and minimum. Sorting and filtering data. Construction of tables, charts and graphs.
  • Data mining, storage and processing
    Main methods of data mining, storage and processing. Types of variables. Prospects for the development and application of data analysis. Processing polls and ratings.
  • The simplest text analysis
    The simplest text analysis: concatenation, string length, search, formatting. Conditional functions in Excel: IF, COUNTIF, SUMIF. Pivot tables.
  • Basic Data Analysis. Time series
    Basic methods for forecasting time series. Calculation of quality metrics of predictive models.
  • Trend and seasonality
    Trend and seasonality. Analysis of time series in Excel. Coefficient of determination (R²).
  • Linear regression analysis
    Linear regression analysis. Construction and visualization of regression lines. Obtaining predictions from a linear regression model. The concept of splitting data into training and test sets. Model quality evaluation.
  • Continuous random variables and probability distributions. The normal distribution. The central limit theorem
    Notion of a continuous random variable. Notion of a probability distribution. The central limit theorem.
  • Sampling and sampling distributions
    Samples. Sampling methods. Difference between sample and population analysis.
  • Interval estimation
    Methods of interval estimation. t-statistics. Working with different kinds of data.
  • Hypothesis testing
    Null hypothesis. Hypothesis testing. A real-world example of hypothesis testing.

Assessment Elements

  • non-blocking Homework 1 (Data Culture)
  • non-blocking Homework 2 (Data Culture)
  • non-blocking Exam (Data analysis and Data Culture)
  • non-blocking Homework 3 (Data Culture)

Interim Assessment

  • Interim assessment (module 2)
    0.3 * Exam (Data analysis and Data Culture) + 0.2 * Homework 1 (Data Culture) + 0.2 * Homework 2 (Data Culture) + 0.2 * Homework 3 (Data Culture) + 0.1 * Midterm (Data Culture)
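
The weighted sum above can be checked with a short calculation. The following Python sketch is illustrative only: the weights are copied from the formula, while the dictionary keys, the function name `interim_grade`, the example scores, and the 10-point scale are assumptions made for this example. The syllabus does not state a rounding rule, so none is applied.

```python
# Weights copied from the interim assessment formula; key names are hypothetical.
WEIGHTS = {
    "exam": 0.3,
    "homework_1": 0.2,
    "homework_2": 0.2,
    "homework_3": 0.2,
    "midterm": 0.1,
}


def interim_grade(scores):
    """Return the weighted sum of the element scores (no rounding applied)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up scores on an assumed 10-point scale.
example_scores = {
    "exam": 8,
    "homework_1": 6,
    "homework_2": 10,
    "homework_3": 6,
    "midterm": 10,
}

grade = interim_grade(example_scores)
print(round(grade, 2))
```

Since the weights sum to 1, the result stays on the same scale as the individual scores.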

Bibliography

Recommended Core Bibliography

  • Wonnacott, T. H., & Wonnacott, R. J. (1990). Introductory statistics for business and economics.
  • Newbold, P., & Carlson, W. L. (2013). Statistics for business and economics.

Recommended Additional Bibliography

  • Bąska, M., Pondel, M., & Dudycz, H. (2019). Identification of advanced data analysis in marketing: A systematic literature review. Journal of Economics & Management, 35(1), 18–39. https://doi.org/10.22367/jem.2019.35.02
  • Springston, M., Ernst, J. V., Clark, A. C., Kelly, D. P., & DeLuca, V. W. (2019). Data analysis. Technology & Engineering Teacher, 79(4), 26–29. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=asn&AN=139712968