Master 2019/2020

## Statistical Analysis and Statistical Packages

Category 'Best Course for Career Development'
Category 'Best Course for Broadening Horizons and Diversity of Knowledge and Skills'
Type: Compulsory course (Population and Development)
When: 1 year, 1-3 module
Mode of studies: offline
Instructors: Pyotr Evdokimov
Master’s programme: Population and Development
Language: English
ECTS credits: 6

### Course Syllabus

#### Abstract

The course covers the principles of mathematical statistics that underlie statistical inference in economics and sociology. It focuses on methods of statistical inference, including parameter estimation and hypothesis testing. The results of the course can be used in specialization courses in statistics and econometrics.

#### Learning Objectives

• The goal of this course is to broaden and systematize students’ knowledge of econometrics and to practice its application. During the course we will go through the essentials of econometrics: from the statistical background, through the theory and intuition behind regression analysis, to practical applications. The course is elementary and presents concepts and techniques in a way that benefits students of all mathematical backgrounds. Fundamental concepts and methods of statistics and econometrics are introduced with emphasis on interpretation of arguments and application to real-world problems. Every topic will be backed up with an applied exercise.

#### Expected Learning Outcomes

• 1. be able to present the basic concepts and methods of statistical reasoning and data analysis in the context of decision-making;
• 2. develop computational skills in fundamental statistical analysis;
• 3. acquire a basic/working knowledge of data analysis using R package (STATA is optional);
• 4. demonstrate the appropriate level of competence regarding the fundamentals of statistics and econometrics;
• 5. demonstrate the appropriate level of competence in written expression.

#### Course Contents

• A.1 Recapitulation: Data description and numerical measures.
  - know the difference between a population and a sample
  - understand how to categorize data, construct frequency distributions and a histogram
  - construct and interpret various types of charts and diagrams
  - create a line chart and interpret the trend in the data
  - distinguish between descriptive statistics and inferential statistics, between a population and a sample, and among the types of measurement scales
  - describe the properties of a data set presented as a histogram or a frequency polygon
  - calculate and interpret relative frequencies and cumulative relative frequencies, given a frequency distribution
  - calculate and interpret measures of central tendency, including the population mean, sample mean, arithmetic mean, weighted average or mean (including a portfolio return viewed as a weighted mean), geometric mean, harmonic mean, median, and mode; construct and interpret a box and whisker graph
  - compute and interpret the range, interquartile range, variance, standard deviation, and coefficient of variation, and know what these values mean; explain measures of sample skewness and kurtosis
  - compute a z-score and the coefficient of variation and understand how they are applied in decision-making situations
  - calculate and interpret quartiles, quintiles, deciles, and percentiles
  - calculate and interpret the proportion of observations falling within a specified number of standard deviations of the mean using Chebyshev’s inequality
• A.2 Recapitulation: Probabilities, probability distributions, sampling and estimation, and hypothesis testing.
  - understand the three approaches to assessing probabilities
  - be able to apply the Addition Rule and the Multiplication Rule
  - know how to use Bayes' Theorem for applications involving conditional probabilities
  - define an event, mutually exclusive events, and exhaustive events
  - distinguish between unconditional and conditional probabilities
  - random variable, the expected value of a discrete random variable
  - Bernoulli random variable
  - binomial, Poisson and hypergeometric distributions and their application to decision-making situations
  - interpret a cumulative distribution function
  - convert a normal distribution to a standard normal distribution
  - determine the probability that a normally distributed random variable lies inside a given interval
  - explain the key properties of the normal distribution
  - distinguish between a univariate and a multivariate distribution, and explain the role of correlation in the multivariate normal distribution
  - calculate and interpret the expected value, variance, and standard deviation of a random variable
  - understand the concept of sampling error
  - determine the mean and standard deviation for the sampling distribution of the sample mean, x̄
  - understand the importance of the Central Limit Theorem
  - determine the mean and standard deviation for the sampling distribution of the sample proportion, p
  - distinguish between simple random and stratified random sampling
  - distinguish between a point estimate and a confidence interval estimate
  - construct and interpret a confidence interval estimate for a single population mean using both the standard normal and t distributions
  - determine the required sample size for estimating a single population mean
  - formulate null and alternative hypotheses for applications involving a single population mean or proportion
  - explain a test statistic, Type I and Type II errors
  - correctly formulate a decision rule for testing a hypothesis
  - know how to use the test statistic, critical value, and p-value approaches to test a hypothesis
  - compute the probability of a Type II error
  - discuss the logic behind, and demonstrate the techniques for, using independent samples to test hypotheses and develop interval estimates for the difference between two population means
  - develop confidence interval estimates and conduct hypothesis tests for the difference between two population means for paired samples
  - carry out hypothesis tests and establish interval estimates, using sample data, for the difference between two population proportions
  - identify the appropriate test statistic and interpret the results for a hypothesis test concerning the mean difference of two normally distributed populations
  - understand the basic logic of analysis of variance
  - perform a hypothesis test for a single-factor design using analysis of variance
  - odds and risk ratios
• B.1 OLS, the assumptions and the properties of OLS estimators.
  - Derivation and interpretation of Ordinary Least Squares
  - Assumptions in OLS regression models
  - Properties of OLS estimators
  - Marginal effect and its interpretation
• B.2 Hypothesis Testing after OLS estimation.
  - Single population parameter
  - Linear combination of parameters
• B.3 Multiple linear restrictions, R-squared.
  - Multiple linear restrictions
  - Goodness of fit (R-squared)
  - Marginal effects and their interpretation
• B.4 Recapitulation, practical training, GRETL and STATA training.
  - Solving practical examples
  - GRETL and STATA training: working with the project dataset and getting used to commands
• C.1 Nonlinear and discrete independent variables.
  - Nonlinear specifications
  - Dummy variables
• C.2 Departures from OLS assumptions.
  - Heteroskedasticity (consequences and tests)
  - Autocorrelation (consequences and tests)
  - Generalized Least Squares
• C.3 Misspecifications.
  - Omitted variable bias
  - Irrelevant variables
  - Testing for endogeneity
  - Instrumental variables
  - 2-stage least squares
• C.4 Introduction to qualitative dependent variables I.
  - Logit model and practical examples
  - Marginal effects after logit models and their interpretation
• C.5 Introduction to qualitative dependent variables II.
  - Probit model and practical examples
  - Marginal effects after probit models and their interpretation
• C.6 Introduction to Time Series.
  - Trend, seasonality
  - Autocorrelation
  - Durbin-Watson test
• C.7 Endogeneity.
  - Typical cases: omitted variable bias, selection bias, simultaneity, measurement error
  - Instrumental Variables
  - 2-stage least squares

#### Assessment Elements

• Homework Assignments
  - Late homework submissions WILL NOT BE ACCEPTED.
  - Students are allowed to work on homework assignments individually or in groups of two.
  - Homework outputs must be handed in as a printed copy at the beginning of a predefined class (unless agreed otherwise).
  - If students work in pairs, the role of each student needs to be described and both students need to sign the submission (there will be a 50% point deduction if this is missing).
• Project
  - At the beginning of the course, each student will be given a dataset to use throughout the course. The dataset will also be used in the final project, whose main goal is to give students practical exercise in what is taught during the course and to apply that knowledge using statistical software (MS Office and GRETL/SPSS/STATA).
  - Students will work on the project individually.
  - The project accounts for 15 per cent of the final grade and needs to be handed in at the end of the semester (the date will be agreed during class).
  - Projects will be presented during the last lecture.
• Quizzes
  - No retakes are allowed for quizzes.
  - Quizzes will NOT be announced in advance.
• Exam 1
  - A retake of Exam 1 is allowed only if the student was sick and has confirmation from a doctor.
  - ONLY BASIC CALCULATORS will be allowed in the exams.
  - EXAM 1 is an open-book exam, i.e., you can use all your HANDWRITTEN materials (exception: printed lecture notes).
• Exam 2
  - Retake policy: a student can retake Exam 2 if her/his score was below 50%.
  - ONLY BASIC CALCULATORS will be allowed in the exams.
  - EXAM 2 is a closed-book exam.

#### Interim Assessment

• Interim assessment (3 module)
0.15 * Exam 1 + 0.4 * Exam 2 + 0.15 * Homework Assignments + 0.15 * Project + 0.15 * Quizzes

#### Recommended Core Bibliography

• Introductory econometrics : a modern approach, Wooldridge, J. M., 2009
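
#### Worked Example: Interim Grade

The interim assessment formula in this syllabus is a weighted sum of the five assessment elements. A minimal Python sketch of that arithmetic follows. The weights are taken from the formula; the function name `interim_grade`, the 0-10 grading scale, and the sample scores are assumptions made for illustration only, since the syllabus does not state the scale.

```python
# Weights copied from the interim assessment formula in this syllabus.
WEIGHTS = {
    "Exam 1": 0.15,
    "Exam 2": 0.40,
    "Homework Assignments": 0.15,
    "Project": 0.15,
    "Quizzes": 0.15,
}


def interim_grade(scores):
    """Return the weighted sum of the scores for all assessment elements.

    `scores` maps each element name in WEIGHTS to a numeric score.
    All scores are assumed to be on the same scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores on an assumed 0-10 scale (not stated in the syllabus).
example_scores = {
    "Exam 1": 8,
    "Exam 2": 7,
    "Homework Assignments": 9,
    "Project": 10,
    "Quizzes": 6,
}

print(round(interim_grade(example_scores), 2))
```

Because Exam 2 carries a weight of 0.4, a one-point change in Exam 2 moves the interim grade by 0.4 points, compared with 0.15 points for any other element.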