Master 2021/2022

Introduction to Data Science in Python

Type: Elective course (Comparative Social Research)
Area of studies: Sociology
Delivered by: School of Sociology
When: 2nd year, 1st module
Mode of studies: distance learning
Open to: students of one campus
Instructors: Christian Fröhlich
Master’s programme: Comparative Social Research
Language: English
ECTS credits: 4
Contact hours: 2

Course Syllabus

Abstract

This course introduces the learner to the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library. It then covers data manipulation and cleaning with the pandas data science library and introduces the Series and DataFrame as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively. By the end of the course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses.

The course is a Massive Open Online Course delivered on the Coursera platform (https://www.coursera.org/learn/python-data-analysis), where the full syllabus is published. To complete the course, students are required to attend the online course and take an oral examination at HSE. The examination takes place during the examination weeks, after the online course has been completed. The course is open only to students of the Comparative Social Research programme.

Learning Objectives

  • to introduce the learner to the basics of the Python programming environment
  • to introduce the Series and DataFrame as the central data structures for data analysis
  • to enable students to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses

Expected Learning Outcomes

  • Explain distributions, sampling, and t-tests
  • Query DataFrame structures for cleaning and processing
  • Describe common Python functionality and features used for data science
  • Understand techniques such as lambdas and reading and manipulating CSV files
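
As an illustration of the last outcome, the following is a minimal sketch of reading a CSV file, cleaning it, and using a lambda as a sort key. It uses only the Python standard library rather than pandas, and the sample data, column names, and function names are hypothetical; they are not taken from the course materials.

```python
import csv
import io

# Hypothetical sample data standing in for a real CSV file.
SAMPLE_CSV = """country,year,trust
Germany,2018,5.4
Russia,2018,4.1
Germany,2020,
Russia,2020,4.3
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Drop rows with a missing 'trust' value and convert column types."""
    cleaned = []
    for row in rows:
        if row["trust"].strip() == "":
            continue  # skip incomplete observations
        cleaned.append(
            {
                "country": row["country"],
                "year": int(row["year"]),
                "trust": float(row["trust"]),
            }
        )
    return cleaned


rows = clean_rows(load_rows(SAMPLE_CSV))

# A lambda used as a sort key: highest 'trust' value first.
ranked = sorted(rows, key=lambda r: r["trust"], reverse=True)
print([(r["country"], r["year"]) for r in ranked])
```

With a real file, `io.StringIO(text)` would be replaced by a file object opened with `open(path, newline="")`.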

Course Contents

  • Week 1
  • Week 2
  • Week 3
  • Week 4

Assessment Elements

  • MOOC Certificate (partially blocks the final grade/grade calculation)
  • Oral exam (non-blocking)

Interim Assessment

  • 2021/2022 1st module
    0.7 * MOOC Certificate + 0.3 * Oral exam
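
The weighting above can be worked through in a few lines of Python. This is only a sketch of the arithmetic: the weights come from the formula, while the function name, the example scores, and the assumption that both components are graded on the same scale are illustrative. No rounding rule is applied, since the syllabus does not state one.

```python
# Weights taken from the syllabus formula; all names here are illustrative.
MOOC_WEIGHT = 0.7
ORAL_WEIGHT = 0.3


def interim_grade(mooc_certificate, oral_exam):
    """Return the weighted interim grade (no rounding applied)."""
    return MOOC_WEIGHT * mooc_certificate + ORAL_WEIGHT * oral_exam


# Hypothetical scores, both assumed to be on the same grading scale.
example = interim_grade(mooc_certificate=8, oral_exam=6)
print(f"{example:.2f}")  # prints 7.40
```

In this example the MOOC component contributes 0.7 * 8 = 5.6 and the oral exam contributes 0.3 * 6 = 1.8, so the MOOC result dominates the interim grade.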

Bibliography

Recommended Core Bibliography

  • McKinney, W. (2017). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython.
  • VanderPlas, J. T. (2016). Python Data Science Handbook: Essential Tools for Working with Data (1st ed.). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=nlebk&AN=1425081
  • Lutz, M. (2014). Learning Python (Russian edition: Изучаем Python).

Recommended Additional Bibliography

  • Sarkar, D. (2016). Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from your Data [Electronic resource]. Chicago: Apress. 412 p. ISBN 978-1-4842-2387-1. Available via the Books24x7 database.