Master
2020/2021

## Statistical Analysis and Statistical Packages

Type:
Compulsory course (Population and Development)

Area of studies:
Public Administration

Delivered by:
Department of Applied Economics

When:
Year 1, modules 1-4

Mode of studies:
offline

Open to:
students of one campus

Instructors:
Dmitry Malakhov

Master’s programme:
Population and Development

Language:
English

ECTS credits:
6

### Course Syllabus

#### Abstract

This course is a gentle introduction to modern applied statistics and econometrics. It follows one principle: first, the idea and formal description of a mathematical concept are given; second, the concept is applied to real-world problems. The course has three main parts: probability theory, statistics, and econometrics. Programming in R runs as a common thread through all topics, since R makes it possible to apply statistical techniques to real data. The probability theory part is devoted to the most fundamental aspects of statistical analysis. R programming is also covered during this part, so the first part of the course lays the foundations for the later topics. The statistics part explains the principles of basic applied statistical analysis and serves as a bridge between probability theory and the most applied part of the course, econometrics. Econometrics is a collection of mathematical tools that help to forecast variables, find new dependencies, and test theories.

#### Learning Objectives

- The goal of this course is to refresh, broaden, and systematize students’ knowledge of statistics and econometrics and to practice its application. The course is elementary and presents concepts and techniques in a way that benefits students of all mathematical backgrounds. Fundamental concepts and methods of statistics and econometrics are introduced with an emphasis on the interpretation of arguments and on application to real-world problems. Every topic will be backed up with an applied exercise.

#### Expected Learning Outcomes

- 1. be able to present the basic concepts and methods of statistical reasoning and data analysis in the context of decision-making;
- 2. develop computational skills in fundamental statistical analysis;
- 3. acquire a basic working knowledge of data analysis using R (Stata is optional);
- 4. demonstrate the appropriate level of competence regarding the fundamentals of statistics and econometrics;
- 5. demonstrate the appropriate level of competence in written expression.

#### Course Contents

- A.1 Recapitulation: Data description and numerical measures.
  - know the difference between a population and a sample
  - understand how to categorize data, construct frequency distributions and a histogram
  - construct and interpret various types of charts and diagrams
  - create a line chart and interpret the trend in the data
  - distinguish between descriptive statistics and inferential statistics, between a population and a sample, and among the types of measurement scales
  - describe the properties of a data set presented as a histogram or a frequency polygon
  - calculate and interpret relative frequencies and cumulative relative frequencies, given a frequency distribution
  - calculate and interpret measures of central tendency, including the population mean, sample mean, arithmetic mean, weighted average or mean (including a portfolio return viewed as a weighted mean), geometric mean, harmonic mean, median, and mode
  - construct and interpret a box and whisker graph
  - compute and interpret the range, interquartile range, variance, standard deviation, and coefficient of variation, and know what these values mean
  - explain measures of sample skewness and kurtosis
  - compute a z-score and the coefficient of variation and understand how they are applied in decision-making situations
  - calculate and interpret quartiles, quintiles, deciles, and percentiles
  - calculate and interpret the proportion of observations falling within a specified number of standard deviations of the mean using Chebyshev’s inequality
- A.2 Recapitulation: Probabilities, probability distributions, sampling and estimation, and hypothesis testing.
  - understand the three approaches to assessing probabilities
  - be able to apply the Addition Rule and the Multiplication Rule
  - know how to use Bayes' Theorem for applications involving conditional probabilities
  - define an event, mutually exclusive events, and exhaustive events
  - distinguish between unconditional and conditional probabilities
  - random variable, the expected value of a discrete random variable
  - Bernoulli random variable
  - binomial, Poisson and hypergeometric distributions and their application to decision-making situations
  - interpret a cumulative distribution function
  - convert a normal distribution to a standard normal distribution
  - determine the probability that a normally distributed random variable lies inside a given interval
  - explain the key properties of the normal distribution
  - distinguish between a univariate and a multivariate distribution, and explain the role of correlation in the multivariate normal distribution
  - calculate and interpret the expected value, variance, and standard deviation of a random variable
  - understand the concept of sampling error
  - determine the mean and standard deviation for the sampling distribution of the sample mean x
  - understand the importance of the Central Limit Theorem
  - determine the mean and standard deviation for the sampling distribution of the sample proportion, p
  - distinguish between simple random and stratified random sampling
  - distinguish between a point estimate and a confidence interval estimate
  - construct and interpret a confidence interval estimate for a single population mean using both the standard normal and t distributions
  - determine the required sample size for estimating a single population mean
  - formulate null and alternative hypotheses for applications involving a single population mean or proportion
  - explain a test statistic, Type I and Type II errors
  - correctly formulate a decision rule for testing a hypothesis
  - know how to use the test statistic, critical value, and p-value approaches to test a hypothesis
  - compute the probability of a Type II error
  - discuss the logic behind, and demonstrate the techniques for, using independent samples to test hypotheses and develop interval estimates for the difference between two population means
  - develop confidence interval estimates and conduct hypothesis tests for the difference between two population means for paired samples
  - carry out hypothesis tests and establish interval estimates, using sample data, for the difference between two population proportions
  - identify the appropriate test statistic and interpret the results for a hypothesis test concerning the mean difference of two normally distributed populations
  - understand the basic logic of analysis of variance
  - perform a hypothesis test for a single-factor design using analysis of variance
  - odds and risk ratios
- B.1 OLS, the assumptions and the properties of OLS estimators.
  - Derivation and interpretation of Ordinary Least Squares
  - Assumptions in OLS regression models
  - Properties of OLS estimators
  - Marginal effect and its interpretation
- B.2 Hypothesis Testing after OLS estimation.
  - Single population parameter
  - Linear combination of parameters
- B.3 Multiple linear restrictions, R-squared.
  - Multiple linear restrictions
  - Goodness of fit (R-squared)
  - Marginal effects and their interpretation
- C.1 Nonlinear and discrete independent variables.
  - Nonlinear specifications
  - Dummy variables
- C.2 Departures from OLS assumptions.
  - Heteroskedasticity (consequences and tests)
  - Autocorrelation (consequences and tests)
  - Generalized Least Squares
- C.3 Misspecifications.
  - Omitted variable bias
  - Irrelevant variables
  - Testing for endogeneity
  - Instrumental variables
  - 2-stage least squares
- C.4 Introduction to qualitative dependent variables I.
  - Logit model and practical examples
  - Marginal effects after logit models and their interpretation
- C.5 Introduction to qualitative dependent variables II.
  - Probit model and practical examples
  - Marginal effects after probit models and their interpretation
- C.6 Introduction to Time Series.
  - Trend, seasonality
  - Autocorrelation
  - Durbin-Watson test
- C.7 Endogeneity.
  - Typical cases: omitted variable bias, selection bias, simultaneity, measurement error
  - Instrumental Variables
  - 2-stage least squares

#### Assessment Elements

- Homework Assignments: Late homework submissions WILL NOT BE ACCEPTED. Students must work on homework assignments individually.
- Project: Students should collect datasets themselves, set a research question, and analyse the data.
- Calculus Exam: A retake is allowed if the student wants to try to improve the grade. ONLY BASIC CALCULATORS will be allowed in the exams. Closed-book format.
- Winter midterm exam: No retake is allowed. ONLY BASIC CALCULATORS will be allowed in the exams. The winter midterm is a closed-book exam. The exam will be held remotely.
- Spring midterm: No retake is allowed. ONLY BASIC CALCULATORS will be allowed in the exams. The winter midterm is a closed-book exam.
- Final exam: No retake is allowed. ONLY BASIC CALCULATORS will be allowed in the exams. The winter midterm is a closed-book exam.

#### Interim Assessment

- Interim assessment (module 4): Final grade = 0.1 * Calculus exam + 0.15 * Homework assignments + 0.15 * Winter midterm + 0.1 * Spring midterm + 0.1 * Project + 0.3 * Final exam
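
As an illustration only, the weighted sum listed above can be sketched in Python. The weights are copied from the formula as written; the dictionary keys, the function name `weighted_grade`, the example scores, and the 0-10 grading scale are assumptions made for this sketch and are not stated in the syllabus. Note that the listed weights add up to 0.9, so the sketch reports that total rather than rescaling anything.

```python
# Weights copied from the interim assessment formula as listed.
# The key names are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "calculus_exam": 0.10,
    "homework": 0.15,
    "winter_midterm": 0.15,
    "spring_midterm": 0.10,
    "project": 0.10,
    "final_exam": 0.30,
}


def weighted_grade(scores):
    """Return the weighted sum of component scores using WEIGHTS."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores on an assumed 0-10 scale.
example_scores = {
    "calculus_exam": 8,
    "homework": 9,
    "winter_midterm": 7,
    "spring_midterm": 6,
    "project": 10,
    "final_exam": 8,
}

total_weight = sum(WEIGHTS.values())
grade = weighted_grade(example_scores)

print(f"Sum of listed weights: {total_weight:.2f}")
print(f"Weighted grade for the example scores: {grade:.2f}")
```

Because the final exam carries the largest single weight in the formula, a one-point change there moves the result three times as much as a one-point change in the calculus exam, spring midterm, or project.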

#### Bibliography

#### Recommended Core Bibliography

- Wooldridge, J. M. (2009). Introductory econometrics: A modern approach.

#### Recommended Additional Bibliography

- David Card. (n.d.). The impact of the Mariel boatlift on the Miami labor market. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.ED3A80FA
- Steven Berry, James Levinsohn, & Ariel Pakes. (1995). Automobile prices in market equilibrium. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.6F437C07