Getting and Cleaning Data

Academic year: 2020/2021
Language of instruction: English
ECTS credits: 3
Course type: Elective course
When: year 2, module 3

Instructor

Course Syllabus

Abstract

Before you can work with data, you have to get some. This course covers the basic ways of obtaining data: from the web, from APIs, from databases, and from colleagues in various formats. It also covers the basics of data cleaning and how to make data “tidy”; tidy data dramatically speed up downstream data analysis tasks. The course further covers the components of a complete data set, including raw data, processing instructions, codebooks, and processed data. Together, these are the basics needed for collecting, cleaning, and sharing data. Online course by Johns Hopkins University: https://www.coursera.org/learn/data-cleaning
Learning Objectives

  • to introduce students to the basic ways that data can be obtained
  • to introduce students to the basics of data cleaning and how to make data “tidy”
Expected Learning Outcomes

  • applies data cleaning basics to make data "tidy"
  • understands common data storage systems
  • obtains usable data from the web, APIs, and databases
  • uses R for text and date manipulation
Course Contents

  • Finding data and reading different file types
  • The most common data storage systems
  • Organizing, merging and managing the data you have
  • Text and date manipulation in R
Assessment Elements

  • non-blocking online course
  • non-blocking discussion with an HSE instructor
Interim Assessment

  • Interim assessment (module 3)
    0.3 * discussion with an HSE instructor + 0.7 * online course
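
As an illustration of the weighting above, here is a minimal sketch of the calculation. The function name and the example scores are hypothetical, and the 10-point scale is an assumption, since the syllabus does not state the grading scale or any rounding rule.

```python
# Weights taken from the interim assessment formula in this syllabus.
DISCUSSION_WEIGHT = 0.3
ONLINE_COURSE_WEIGHT = 0.7


def interim_grade(discussion: float, online_course: float) -> float:
    """Return the weighted interim grade, rounded to two decimals.

    Rounding to two decimals is an illustrative choice, not a rule
    stated in the syllabus.
    """
    total = DISCUSSION_WEIGHT * discussion + ONLINE_COURSE_WEIGHT * online_course
    return round(total, 2)


# Hypothetical scores on an assumed 10-point scale.
example = interim_grade(discussion=8, online_course=6)
print(example)
```

With these hypothetical scores the weighted sum is 0.3 * 8 + 0.7 * 6 = 2.4 + 4.2 = 6.6, so the online course result dominates the interim grade.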
Bibliography

Recommended Core Bibliography

  • Mailund, T. (2017). Beginning Data Science in R: Data Analysis, Visualization, and Modelling for the Data Scientist. New York: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1484645

Recommended Additional Bibliography

  • Wickham, H., & Grolemund, G. (2016). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (1st ed.). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1440131