Bachelor's programme 2020/2021

Introduction to Data Science in Python

Status: Elective course (Marketing and Market Analytics)
Field of study: 38.03.02. Management
Delivered in: 3rd year, module 4
Mode of study: with online course
Instructors: Gleb Aleksandrovich Karpushkin
Language: English
Credits: 3
Contact hours: 2

Course Syllabus

Abstract

This course introduces the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library. It covers data manipulation and cleaning techniques using the popular pandas data science library and introduces the Series and DataFrame abstractions as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively. By the end of the course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses.
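
As a rough illustration of the kind of script the abstract describes, the sketch below parses a small CSV table with the standard library and uses lambdas to filter and sort it. The inline data and the column names (`name`, `city`, `spend`) are invented for illustration and are not course material.

```python
import csv
import io

# Hypothetical inline data standing in for a CSV file on disk.
RAW = """name,city,spend
Anna,Moscow,120.5
Boris,Kazan,80.0
Clara,Moscow,200.0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting spend to float."""
    reader = csv.DictReader(io.StringIO(text))
    return [{**row, "spend": float(row["spend"])} for row in reader]


rows = load_rows(RAW)

# Lambdas serve as short inline functions for filtering and sorting.
moscow = list(filter(lambda r: r["city"] == "Moscow", rows))
by_spend = sorted(rows, key=lambda r: r["spend"], reverse=True)

total_moscow = sum(r["spend"] for r in moscow)
top_name = by_spend[0]["name"]
print(total_moscow, top_name)
```

For a real file, `io.StringIO(text)` would typically be replaced by a file object opened with `open(path, newline="")`.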
Learning Objectives

  • The main objective of the course is to give students the basic concepts of Python, its syntax, functions, and packages, so that they can write scripts for data manipulation and analysis. The course develops skills in writing and running Python code. It covers the main variable types and their features, basic operators and statements, loops, and the main packages for data science: NumPy and pandas. At the end of the course, students should be able to write short scripts to import, prepare, and analyze data.
Expected Learning Outcomes

  • Understand techniques such as using lambdas and manipulating CSV files
  • Describe common Python functionality and features used for data science
  • Query DataFrame structures for cleaning and processing
  • Explain distributions, sampling, and t-tests
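
To make the last outcome concrete, here is a minimal sketch of a two-sample comparison using Welch's t statistic, computed by hand with the standard library. The two samples are made-up numbers, and the helper name `welch_t` is hypothetical. In practice a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` would normally be used, since it also returns a p-value.

```python
import math
import statistics


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    # Squared standard error of each sample mean.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Illustrative samples only; not data from the course.
group_a = [5.1, 4.9, 5.6, 5.8, 5.0]
group_b = [4.2, 4.0, 4.5, 4.8, 4.1]

t_stat, dof = welch_t(group_a, group_b)
print(t_stat, dof)
```

A large absolute value of the statistic relative to the t distribution with `dof` degrees of freedom suggests that the two sample means differ by more than sampling noise alone would explain.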
Course Contents

  • Fundamentals of Data Manipulation with Python
    In this week you'll get an introduction to the field of data science, review common Python functionality and features that data scientists use, and be introduced to the Coursera Jupyter Notebook for the lectures. All of the course information on grading, prerequisites, and expectations is in the course syllabus, and you can find more information about the Jupyter Notebooks on our Course Resources page.
  • Basic Data Processing with Pandas
    In this week of the course you'll learn the fundamentals of one of the most important toolkits Python has for data cleaning and processing: pandas. You'll learn how to read data into DataFrame structures, how to query these structures, and the details of how such structures are indexed.
  • More Data Processing with Pandas
    In this week you'll deepen your understanding of the Python pandas library by learning how to merge DataFrames, generate summary tables, group data into logical pieces, and manipulate dates. We'll also refresh your understanding of scales of data and discuss issues with creating metrics for analysis. The week ends with a more significant programming assignment.
  • Answering Questions with Messy Data
    In this week of the course you'll be introduced to a variety of statistical techniques such as distributions, sampling, and t-tests. The week ends with two discussions of science and the rise of the fourth paradigm: data-driven discovery.
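
The pandas operations named in the weekly topics above (querying, merging, grouping, summary tables, and date handling) can be sketched roughly as follows. The two tables, their column names, and all values are hypothetical and chosen only to keep the example small.

```python
import pandas as pd

# Hypothetical tables; names and values are illustrative only.
orders = pd.DataFrame({
    "customer_id": [1, 2, 1, 3],
    "date": ["2021-04-01", "2021-04-03", "2021-05-10", "2021-05-12"],
    "amount": [100.0, 50.0, 70.0, 30.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "segment": ["retail", "retail", "corporate"],
})

# Dates: parse strings, then derive a month number for grouping.
orders["date"] = pd.to_datetime(orders["date"])
orders["month"] = orders["date"].dt.month

# Query: keep only the rows that match a boolean mask.
large = orders[orders["amount"] >= 50.0]

# Merge: attach each order's customer segment.
merged = orders.merge(customers, on="customer_id", how="left")

# Group: total amount per segment.
per_segment = merged.groupby("segment")["amount"].sum()

# Pivot table: segments as rows, months as columns.
summary = merged.pivot_table(
    index="segment", columns="month", values="amount",
    aggfunc="sum", fill_value=0,
)
print(summary)
```

A left merge keeps every order even if its customer is missing from the second table, which makes unmatched rows visible as missing values instead of silently dropping them.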
Assessment Elements

  • non-blocking Group project
  • non-blocking Course assessments
  • non-blocking Exam
Interim Assessment

  • Interim assessment (module 4)
    0.4 * Exam + 0.3 * Group project + 0.3 * Course assessments
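
The weighting formula above can be checked with a few lines of Python. This is a minimal sketch: the function name `interim_grade` is hypothetical, and the 10-point scale used in the sample call is an assumption, since the syllabus states neither the grading scale nor a rounding rule.

```python
def interim_grade(exam, group_project, course_assessments):
    """Weighted interim grade using the weights stated in the syllabus."""
    return 0.4 * exam + 0.3 * group_project + 0.3 * course_assessments


# Assumed 10-point scale; the component scores are made-up examples.
grade = interim_grade(exam=8, group_project=6, course_assessments=10)
print(round(grade, 2))
```

With these sample scores the components contribute 3.2, 1.8, and 3.0 points respectively, so the exam carries the largest single share of the result.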
Bibliography

Recommended Core Bibliography

  • Derivatives analytics with Python : data analysis, models, simulation, calibration and hedging, Hilpisch, Y. J., 2015
  • Python for data analysis : data wrangling with pandas, NumPy, and IPython, McKinney, W., 2017
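  • Python для сложных задач : наука о данных и машинное обучение [Python for Complex Problems: Data Science and Machine Learning], Плас, Дж. В., 2018
  • Введение в анализ данных : учебник и практикум для бакалавриата и магистратуры [Introduction to Data Analysis: Textbook and Workbook for Bachelor's and Master's Programmes], Миркин, Б. Г., НИУ ВШЭ, 2017
  • Введение в анализ данных : учебник и практикум для вузов [Introduction to Data Analysis: Textbook and Workbook for Universities], Миркин, Б. Г., 2015
  • Введение в машинное обучение с помощью Python : руководство для специалистов по работе с данными [Introduction to Machine Learning with Python: A Guide for Data Scientists], Мюллер, А., 2018
  • Основы Data Science и Big data : Python и наука о данных [Fundamentals of Data Science and Big Data: Python and Data Science], Силен, Д., 2017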

Recommended Additional Bibliography

  • Python 3, Прохоренок, Н. А., 2016
  • Алгоритмы. Справочник : с примерами на C, C++, Java и Python [Algorithms. A Reference: With Examples in C, C++, Java, and Python], Хайнеман, Дж., 2017