Master
2020/2021
Programming in R and Python
Category 'Best Course for Career Development'
Category 'Best Course for Broadening Horizons and Diversity of Knowledge and Skills'
Type:
Bridging course (Applied Statistics with Network Analysis)
Area of studies:
Applied Mathematics and Informatics
Delivered by:
International laboratory for Applied Network Research
When:
1 year, 1 module
Mode of studies:
offline
Instructors:
Gregory Khvatsky
Master’s programme:
Applied Statistics with Network Analysis
Language:
English
ECTS credits:
3
Contact hours:
20
Course Syllabus
Abstract
Students who have never programmed often worry that programming is difficult. This course introduces them to the basics of two programming languages, R and Python. It discusses the differences between the two languages and the strengths of each, and teaches the fundamentals of programming in both.
Learning Objectives
- to provide students with the basic R and Python skills that will be required in other courses in the programme
Expected Learning Outcomes
- be able to create and work with vectors, matrices and lists
- be able to load data files into the R workspace
- be able to compute descriptive statistics and perform exploratory data analysis
- be able to visualize data
- know how to build simple models
Course Contents
- Data formats: Vectors, matrices and lists. Operations on them. Functions for converting and working with them. Matrices and dataframes.
- Starting to work with data: Loading data files in R. Introduction to the data.table library: loading data, adding calculated variables, deleting columns, subsetting, merging dataframes, renaming columns. Loading Stata, SPSS and Excel files into R.
- Exploratory data analysis: Descriptive statistics and exploratory data analysis. Grouping data in data.table and computing descriptive statistics for groups. Simple visualizations (bar chart, histogram, box/violin plot, scatterplot, correlations and their visualizations).
- Visualization: More complex visualizations with ggplot2. Chart facets, heatmaps, palettes. Design of visualizations.
- Basic linear regression: Linear regression using lm. Presentation of analysis results using the stargazer package.
- R Basics: Installing R and RStudio. Getting started with RMarkdown. Getting started with R: installing libraries, variables and data types, logical and arithmetic operations, functions and methods, loops, the %>% operator.
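The topics above are phrased in R terms, but the course covers Python as well. As an illustration only, here is a minimal Python sketch of two of them: descriptive statistics by group and a simple linear regression. It uses only the standard library. The toy data and the names `records`, `group_means`, and `fit_line` are invented for this example and are not part of the course materials; in R the same steps would typically go through data.table and `lm`.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (group, hours studied, exam score).
records = [
    ("a", 1.0, 52.0),
    ("a", 2.0, 54.0),
    ("a", 3.0, 56.0),
    ("b", 4.0, 58.0),
    ("b", 5.0, 60.0),
]


def group_means(rows):
    """Mean score per group, similar in spirit to a grouped summary."""
    by_group = defaultdict(list)
    for group, _hours, score in rows:
        by_group[group].append(score)
    return {group: mean(scores) for group, scores in by_group.items()}


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


means = group_means(records)
intercept, slope = fit_line(
    [hours for _group, hours, _score in records],
    [score for _group, _hours, score in records],
)
print(means)
print(intercept, slope)
```

The regression is written out by hand so that the arithmetic behind a one-predictor least squares fit stays visible; a real analysis would normally rely on a statistics library instead.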
Assessment Elements
- Project: The main goal of this project is to pick a dataset and prepare it for further analysis using R. The steps include choosing a dataset, loading it into R, preparing it for further analysis, and applying basic exploratory data analysis methods.
- Final project: In this project students use the clean dataset prepared during the first project to explore relationships between variables. The exploration covers descriptive statistics, correlations, and simple regression models.
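Taken together, the two projects describe a load, clean, and explore workflow. The projects themselves are done in R; the sketch below only illustrates the same flow in Python, the course's other language, using the standard library. The inline CSV, the column names `height` and `weight`, and the helpers `load_clean` and `pearson` are all hypothetical.

```python
import csv
import io
from math import sqrt
from statistics import mean

# Hypothetical raw data; the empty cell and "NA" stand in for missing values.
RAW_CSV = """height,weight
150,50
160,
170,70
NA,80
180,75
"""


def load_clean(text):
    """Parse CSV text and keep only rows where both columns are numeric."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            rows.append((float(row["height"]), float(row["weight"])))
        except ValueError:
            continue  # drop rows with missing or non-numeric values
    return rows


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


clean = load_clean(RAW_CSV)
heights = [h for h, _w in clean]
weights = [w for _h, w in clean]

# Descriptive statistics, then the relationship between the two variables.
print(len(clean), mean(heights), mean(weights))
r = pearson(heights, weights)
print(r)
```

Dropping incomplete rows is only one way to handle missing values; whether it is appropriate depends on the dataset chosen for the project.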
Bibliography
Recommended Core Bibliography
- W. N. Venables, & D. M. Smith. (2012). An Introduction to R. Notes on R: A Programming Environment for Data Analysis and Graphics, Version 2.15.0. R-project.org.
Recommended Additional Bibliography
- Simon N. Wood. (2017). Generalized Additive Models: An Introduction with R, Second Edition. Chapman and Hall/CRC.