Master 2020/2021

Programming in R and Python

Category 'Best Course for Career Development'
Category 'Best Course for Broadening Horizons and Diversity of Knowledge and Skills'
Area of studies: Applied Mathematics and Informatics
When: 1 year, 1 module
Mode of studies: offline
Instructors: Gregory Khvatsky
Master’s programme: Applied Statistics with Network Analysis
Language: English
ECTS credits: 3
Contact hours: 20

Course Syllabus

Abstract

Students who have never programmed often expect it to be difficult. This course introduces them to the basics of two programming languages, R and Python. It discusses the differences between the languages and the strengths of each. Students will learn basic programming concepts and how to work in both languages.
Learning Objectives

  • to provide students with the basic R and Python skills that will be required in other courses in the programme
Expected Learning Outcomes

  • be able to create and work with vectors, matrices and lists
  • be able to load data files into the R workspace
  • be able to compute descriptive statistics and carry out exploratory data analysis
  • be able to visualize data
  • know how to build simple models
Course Contents

  • Data formats
    Vectors, matrices and lists. Operations on them. Functions for converting and working with them. Matrices and dataframes
  • Getting started with data
    Loading data files into R. Introduction to the data.table library: loading data, adding calculated variables, deleting columns, subsetting, merging dataframes, renaming columns. Loading Stata, SPSS and Excel files into R
  • Exploratory data analysis
    Descriptive statistics and exploratory data analysis. Grouping data in data.table and computing descriptive statistics by group. Simple visualizations (bar chart, histogram, box / violin plot, scatterplot, correlations and their visualizations)
  • Visualization
    More complex visualizations with ggplot2. Chart facets, heatmaps, palettes. Design of visualizations.
  • Basic linear regression
    Linear regression using lm. Presentation of analysis results using the stargazer package.
  • R Basics
    Installing R and RStudio. Getting started with RMarkdown. Getting started with R: installing libraries, variables and data types, logical and arithmetic operations, functions and methods, loops, the %>% operator.
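As a rough illustration of the descriptive statistics and simple regression topics listed above, here is a minimal sketch in Python, the course's second language. The data and the helper `fit_line` are hypothetical and are not taken from the course materials; in R the same steps would typically use `mean()`, `sd()` and `lm()`.

```python
import statistics


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar = statistics.mean(x)
    y_bar = statistics.mean(y)
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data: hours studied and exam scores for five students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]

# Descriptive statistics for the outcome variable.
print("mean score:", statistics.mean(scores))
print("sample standard deviation:", statistics.stdev(scores))

# Simple linear regression of score on hours.
intercept, slope = fit_line(hours, scores)
print("intercept:", intercept, "slope:", slope)
```

The sketch uses only the standard library so that it runs without extra packages; for real data a dedicated library would be the more usual choice.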
Assessment Elements

  • non-blocking Project
    The main goal of this project is to pick a dataset and prepare it for further analysis in R. The steps include choosing a dataset, loading it into R, preparing it for further analysis, and applying basic exploratory data analysis methods.
  • non-blocking Final project
    In this project students should use the clean dataset prepared in the first project to explore relationships between variables. The exploration covers descriptive statistics, correlations, and simple regression models.
Interim Assessment

  • Interim assessment (1 module)
    0.6 * Final project + 0.4 * Project
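The weighting above can be worked through with a short Python sketch. The function name `interim_grade` and the example scores are hypothetical; the syllabus does not state the grading scale or any rounding rule, so the sketch assumes both scores are on the same scale and applies no rounding.

```python
def interim_grade(final_project, project):
    """Weighted interim grade: 0.6 * Final project + 0.4 * Project."""
    return 0.6 * final_project + 0.4 * project


# Hypothetical scores, both assumed to be on the same scale.
grade = interim_grade(final_project=8, project=6)
print(round(grade, 2))
```

Because the final project carries the larger weight, a one-point change in its score moves the interim grade by 0.6 points, against 0.4 for the first project.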
Bibliography

Recommended Core Bibliography

  • W. N. Venables, & D. M. Smith. (2012). An Introduction to R. Notes on R: A Programming Environment for Data Analysis and Graphics, Version 2.15.0. R-project.org.

Recommended Additional Bibliography

  • Simon N. Wood. (2017). Generalized Additive Models: An Introduction with R, Second Edition. Chapman and Hall/CRC.