Bachelor 2021/2022

Data Analysis in Python

Type: Elective course (Marketing and Market Analytics)
Area of studies: Management
When: 3rd year, 4th module
Mode of studies: distance learning
Open to: students of all HSE University campuses
Instructors: Aleksandr Rozhkov
Language: English
ECTS credits: 3
Contact hours: 20

Course Syllabus

Abstract

The course focuses on data analysis tools and methods in Python. It includes a set of applied tasks to be solved with the Python toolkit using methods and algorithms of data preprocessing, data visualization, descriptive and inferential statistics, regression, factor analysis, and cluster analysis. The emphasis is on applying these tools in the Python coding environment. Additional reading and exercises familiarize students with current trends in data analysis for business (marketing, product management). The course is supplemented by an online module built on DataCamp platform courses on NLP and sentiment analysis. DataCamp’s learn-by-doing methodology combines short expert videos and hands-on-the-keyboard exercises to help learners retain knowledge. Particular attention is paid to interpreting and analyzing data output. Upon completing the course, students will be able to import data into Python, clean and process it, and select and implement analytical methods relevant to the identified business task.
Learning Objectives

  • Students know and understand key data analysis principles, have command of data processing and analysis tools in Python, and are able to import, process, and analyze data and deliver a structured analytic report in the context of a company’s business goals.
  • Students implement relevant tools (Python) for the data collection, processing, and analysis required for particular managerial tasks.
Expected Learning Outcomes

  • Understand and implement data import and preparation procedure
  • Select and implement Python visualization tools for data analysis and reporting
  • Identify, select and implement Python frameworks to complete tasks of descriptive and inferential statistics in data analysis process
  • Understand and implement factor analysis, cluster analysis and regression analysis for a business goal
  • Understand and implement NLP frameworks for business tasks including sentiment analysis.
  • Students are able to analyze data output, visualize it, and interpret key insights from data analysis for business tasks
  • Understand and implement factor analysis, cluster analysis and regression analysis in business tasks
  • Students know the ethical issues of data analysis and can identify them in a business setting
Course Contents

  • Introduction to Data Analysis in Python
  • Data preprocessing in Python
  • Analytic tools and algorithms in Python
  • Data visualization
  • Natural Language Processing in Python. DataCamp online module
  • Ethical issues of data analysis
Assessment Elements

  • non-blocking Seminar tests 1-4
    Each seminar (8 in total) includes a quick test based on previously discussed topics and reading assignments.
  • non-blocking Python setup and survey
    Python environment setup and course survey
  • non-blocking Online class
    Students are required to complete 2 courses on the DataCamp online platform. Invitations and course links are sent to students’ @edu.hse.ru email addresses.
  • non-blocking Project
    Students will be given a dataset to which they apply data analysis skills, including data preprocessing, exploratory analysis, model specification, and reporting of the results.
  • blocking Exam
    The exam includes several sections. Coding / analytic assignment: you are required to import, process, and analyze the data provided and interpret the results. This part is 70% of the exam points. Test: multiple-choice and open questions based on the course materials and the assigned additional reading. This part is 30% of the exam points.
  • non-blocking Seminar tests 5-8
Interim Assessment

  • 2021/2022 4th module
    0.2 * Seminar tests 5-8 + 0.2 * Seminar tests 1-4 + 0.3 * Exam + 0.1 * Online class + 0.15 * Project + 0.05 * Python setup and survey
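
The weighted sum above can be sketched in Python as follows. This is only an illustration of the arithmetic: the weights are copied from the formula, while the function name `interim_grade`, the example scores, and the assumption that every element is graded on the same scale are hypothetical and not stated in the syllabus. No rounding rule is applied because the syllabus does not give one.

```python
# Weights copied from the 2021/2022 4th module formula.
WEIGHTS = {
    "Seminar tests 1-4": 0.2,
    "Seminar tests 5-8": 0.2,
    "Exam": 0.3,
    "Online class": 0.1,
    "Project": 0.15,
    "Python setup and survey": 0.05,
}


def interim_grade(scores):
    """Return the weighted sum of the assessment element scores.

    `scores` maps each element name in WEIGHTS to a numeric score.
    Assumption: all elements are graded on the same scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"Missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {
    "Seminar tests 1-4": 8,
    "Seminar tests 5-8": 7,
    "Exam": 6,
    "Online class": 10,
    "Project": 9,
    "Python setup and survey": 10,
}

if __name__ == "__main__":
    print(f"Interim grade: {interim_grade(example_scores):.2f}")
```

Because the weights sum to 1, the result stays on the same scale as the individual scores. The exam carries the largest single weight (0.3), and the two seminar test blocks together account for 0.4.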
Bibliography

Recommended Core Bibliography

  • Ivan Idris - Python Data Analysis - Packt Publishing, Limited, 2014-430 - Electronic text - https://ebookcentral.proquest.com/lib/hselibrary-ebooks/detail.action?docID=1826990
  • Bengfort, B., Bilbro, R., & Ojeda, T. (2018). Applied Text Analysis with Python : Enabling Language-Aware Data Products with Machine Learning. Beijing: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=nlebk&AN=1827695
  • Beysolow, T. (2018). Applied Natural Language Processing with Python : Implementing Machine Learning and Deep Learning Algorithms for Natural Language Processing. [Berkeley, CA]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1892182
  • Dr. Ossama Embarak. (2018). Data Analysis and Visualization Using Python : Analyze Data to Create Visualizations for BI Systems. Apress.
  • Idris, I. (2016). Python Data Analysis Cookbook. Birmingham, UK: Packt Publishing. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1290098
  • McKinney, W. (2018). Python for Data Analysis : Data Wrangling with Pandas, NumPy, and IPython (Vol. Second edition). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1605925
  • Vanderplas, J. T. (2016). Python Data Science Handbook : Essential Tools for Working with Data (Vol. First edition). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=nlebk&AN=1425081

Recommended Additional Bibliography

  • Ben Stephenson. (2019). The Python Workbook : A Brief Introduction with Exercises and Solutions (Vol. 2nd ed. 2019). Springer.
  • Keith McNulty. (2021). Handbook of Regression Modeling in People Analytics : With Examples in R and Python. Chapman and Hall/CRC.
  • Shmueli, G., Bruce, P. C., Gedeck, P., & Patel, N. R. (2020). Data Mining for Business Analytics : Concepts, Techniques and Applications in Python. Newark: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=2273611