Bachelor 2021/2022

Getting and Cleaning Data

Area of studies: Fundamental and Applied Linguistics
Delivered by: School of Linguistics
When: 4th year, 3rd module
Mode of studies: distance learning
Open to: students of one campus
Language: English
ECTS credits: 3
Contact hours: 2

Course Syllabus

Abstract

Before you can work with data, you have to get some. This course covers the basic ways of obtaining data: from the web, from APIs, from databases, and from colleagues, in various formats. It also covers the basics of data cleaning and how to make data “tidy”; tidy data dramatically speed up downstream data analysis tasks. Finally, the course covers the components of a complete data set, including raw data, processing instructions, codebooks, and processed data. Together these are the basics needed for collecting, cleaning, and sharing data. Online course by Johns Hopkins University: https://www.coursera.org/learn/data-cleaning
Learning Objectives

  • to introduce students to the basic ways that data can be obtained
  • to introduce students to the basics of data cleaning and how to make data “tidy”
Expected Learning Outcomes

  • applies data cleaning basics to make data "tidy"
  • obtains usable data from the web, APIs, and databases
  • understands common data storage systems
  • uses R for text and date manipulation
Course Contents

  • Finding data and reading different file types
  • The most common data storage systems
  • Organizing, merging and managing the data you have
  • Text and date manipulation in R
Assessment Elements

  • non-blocking online course
  • non-blocking discussion with an HSE instructor
Interim Assessment

  • 2021/2022 3rd module
    0.7 * online course + 0.3 * discussion with an HSE instructor
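
The weighting above can be checked with a short calculation. This is a minimal sketch, assuming both elements are graded on the same numeric scale; the function name and the sample scores are hypothetical, and the syllabus states no rounding rule, so none is applied here.

```python
def interim_grade(online_course: float, discussion: float) -> float:
    """Weighted interim grade: 0.7 * online course + 0.3 * discussion."""
    return 0.7 * online_course + 0.3 * discussion


# Hypothetical scores, chosen only to illustrate the arithmetic.
example = interim_grade(online_course=8, discussion=6)
print(f"{example:.1f}")
```

With these sample scores the online course contributes 5.6 and the discussion 1.8, so the interim grade is about 7.4.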
Bibliography

Recommended Core Bibliography

  • Mailund, T. (2017). Beginning Data Science in R: Data Analysis, Visualization, and Modelling for the Data Scientist. New York: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1484645

Recommended Additional Bibliography

  • Wickham, H., & Grolemund, G. (2016). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (1st ed.). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1440131