Data analysis in Python
Сильчев Виталий Артемович
- to provide an overview of the data analysis tools available in the Python ecosystem
- to explain the data analysis pipeline
- to practice using analytical tools on a variety of tasks
- understand the steps of the analytical process
- use basic Python modules for data analysis (numpy, pandas, matplotlib)
- perform exploratory data analysis
- select appropriate visualizations for data
- build predictive models for clustering, regression and classification tasks
- prepare a dataset before training a model
- select an appropriate metric for model evaluation
- tune the hyper-parameters of a model
- Introduction to Data Analytics in Python: This introductory section describes the analytical process, how data is created, stored and accessed, and how an organization works with data. It also covers the data analysis tools available in the Python ecosystem.
- Descriptive Analytics: Descriptive analytics is a preliminary stage of data processing that includes exploratory data analysis and data visualization.
- Predictive Analytics: This section covers basic machine learning tasks such as clustering, regression and classification. It also covers the steps of the machine learning pipeline, from feature engineering to metric selection and hyper-parameter optimization.
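The pipeline steps named in the Predictive Analytics section can be sketched as follows. This is a minimal illustration and not course material. It assumes scikit-learn is installed and uses its bundled iris dataset. A scaling step stands in for feature engineering, logistic regression stands in for the classifier, and accuracy stands in for the chosen metric. All of these choices are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative dataset (an assumption; the syllabus does not name one).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preparation and model live in one pipeline, so during cross-validation
# the scaler is fitted on the training folds only.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Hyper-parameter optimization: cross-validated grid search over the
# regularization strength C, scored with the selected metric.
search = GridSearchCV(
    pipeline,
    param_grid={"model__C": [0.01, 0.1, 1.0, 10.0]},
    scoring="accuracy",
    cv=5,
)
search.fit(X_train, y_train)

# Final evaluation on data that was never used for fitting or tuning.
test_accuracy = accuracy_score(y_test, search.predict(X_test))
print("best C:", search.best_params_["model__C"])
print("test accuracy:", round(test_accuracy, 3))
```

Keeping the held-out test split outside the grid search means the reported accuracy is not influenced by the tuning step.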
- Interim assessment (module 4): 0.3 * Final Test + 0.3 * Intermediate Tests + 0.4 * Programming Assignments
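As a worked example of the assessment formula, the short sketch below computes the weighted grade. The function name `final_grade` and the component scores are hypothetical. It assumes a 10-point scale and that each component has already been reduced to a single score, which the syllabus does not specify.

```python
def final_grade(final_test, intermediate_tests, programming_assignments):
    """Weighted course grade using the syllabus weights 0.3 / 0.3 / 0.4."""
    return (
        0.3 * final_test
        + 0.3 * intermediate_tests
        + 0.4 * programming_assignments
    )

# Hypothetical component scores on an assumed 10-point scale.
grade = final_grade(final_test=8, intermediate_tests=6, programming_assignments=9)
print(round(grade, 1))  # 0.3*8 + 0.3*6 + 0.4*9 = 2.4 + 1.8 + 3.6 = 7.8
```

Programming assignments carry the largest weight, so a one-point change there moves the grade by 0.4, compared with 0.3 for either test component.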
- Nelli, F. (2018). Python Data Analytics: With Pandas, NumPy, and Matplotlib (2nd ed.). New York, NY: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1905344
- McKinney, W. (2017). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly Media.
- Sarkar, D., Bali, R., & Sharma, T. (2018). Practical Machine Learning with Python: A Problem-Solver’s Guide to Building Real-World Intelligent Systems. [United States]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1667293
- Rajaraman, A., & Ullman, J. D. (2012). Mining of Massive Datasets. New York, N.Y.: Cambridge University Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=408850