15
June

Big Data and Machine Learning with Applications to Economics and Finance

Academic Year: 2019/2020
Language: ENG (instruction in English)
ECTS credits: 4
Course type: Elective course
When: 2nd year, 1st semester

Instructors


Zimin, Stepan

Course Syllabus

Abstract

Big Data and Machine Learning (M.Sc. level) is an advanced elective course designed for master's students at ICEF. The course is open to all second-year M.Sc. students. Basic knowledge of the Python programming language is strongly advised but not required; students without it will be expected to put in additional effort during the first few weeks of the course to catch up. The course is taught in English. The course has three broad sections: I. Building skills using Python libraries to solve common problems in the analysis of financial data. II. Learning how to mine the web, social media, and other big data sources in search of data. III. Designing and implementing machine learning models.
Learning Objectives

  • The main objective of the course is to give students fundamental skills in data mining and analytics, and in designing and implementing predictive machine learning models.
Expected Learning Outcomes

  • Be able to code simple algorithms using Python
  • Use data structures to store and transform data
  • Use Python to solve simple analytical tasks
  • Find solutions to optimization problems using Python
  • Analyze multiple data sources
  • Apply clustering and anomaly detection methods
  • Use web application APIs to obtain data
  • Convert text into input for machine learning algorithms
  • Train an ML classifier and make predictions
  • Train an ML regression model and make predictions
  • Be able to set up a neural network
  • Present data graphically
Course Contents

  • Introduction to Python
    Introductory examples. Basic data types and structures. Functions and control flow. Classes and objects. Functional programming and Object-oriented programming.
  • Python’s Scientific Stack: NumPy, Pandas, and SciPy
    NumPy arrays and code vectorization. Indexing and slicing. Series and DataFrame objects. Basic data wrangling: handling missing data, transforming, reshaping, and merging data. Data summarization.
  • Data Visualization
    Plots in two and three dimensions. The matplotlib and seaborn libraries.
  • Financial and Other Applications
    The value of money. A financial calculator. Bond and stock valuation. The CAPM.
  • Mathematical tools and numerical calculus
    Finding roots. Convex optimization. Monte Carlo simulations. Portfolio optimization.
  • Big Data
    Accessing regular data with Python. How is big data different? Apache Hadoop and the HDFS file system. SQL and NoSQL databases.
  • Introduction to Data Mining
    Data mining methodologies.
  • Mining the Social Web
    RESTful APIs. Mining Twitter, Facebook, Instagram, and GitHub to learn what’s trending.
  • Textual Analysis
    Mining Text Files: Computing Document Similarity, Extracting Collocations.
  • Machine Learning Classification Methods
    Gradient descent. Cross Validation. K-nearest neighbor. Naive Bayesian Classification. Decision Trees.
  • Machine Learning Regression Methods
    OLS, LASSO, Ridge regression.
  • Neural Networks. Forecasting Stock and Commodity Prices
    Comparing traditional time series analysis and recurrent neural networks.
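
Two of the topics listed above, bond valuation and finding roots, fit together naturally. The following is a minimal sketch in plain Python, not course material: it prices an annual-coupon bond as the present value of its cash flows and then recovers the yield to maturity by bisection. The function names `bond_price` and `yield_to_maturity`, the annual-coupon convention, and the search bounds of 0% to 100% are assumptions made for illustration.

```python
def bond_price(face, coupon_rate, ytm, years):
    """Present value of an annual-coupon bond discounted at yield `ytm`."""
    coupon = face * coupon_rate
    # Discount each annual coupon, then the face value repaid at maturity.
    pv_coupons = sum(coupon / (1 + ytm) ** t for t in range(1, years + 1))
    pv_face = face / (1 + ytm) ** years
    return pv_coupons + pv_face


def yield_to_maturity(price, face, coupon_rate, years,
                      lo=0.0, hi=1.0, tol=1e-10):
    """Find the yield at which bond_price equals `price`, by bisection.

    Assumes the true yield lies between `lo` and `hi`. The bond price
    falls as the yield rises, so the bracket can be halved each step.
    """
    for _ in range(200):
        mid = (lo + hi) / 2
        if bond_price(face, coupon_rate, mid, years) > price:
            lo = mid  # model price too high, so the yield must be higher
        else:
            hi = mid  # model price too low (or equal), so the yield is lower
        if hi - lo < tol:
            break
    return (lo + hi) / 2
```

For example, `yield_to_maturity(1000, 1000, 0.05, 10)` should come out very close to 0.05, because a bond trading at par yields its coupon rate. In practice a library root finder such as `scipy.optimize.brentq` would usually replace the hand-written loop.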
Assessment Elements

  • Project proposal (non-blocking)
  • Intermediate report (non-blocking)
  • Final report and presentation (non-blocking)
    Students who cannot hand in the project and make a presentation for a valid reason are assigned an additional date to do so.
  • Attendance and participation (non-blocking)
  • Home assignments (non-blocking)
Interim Assessment

  • Interim assessment (1 semester)
    0.3 * final report and presentation + 0.1 * attendance and participation + 0.3 * home assignments + 0.2 * intermediate report + 0.1 * project proposal
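
The weighting formula above can be checked with a short sketch. The weights are taken from the formula; the function name `interim_grade`, the dictionary layout, and the example scores on a 10-point scale are assumptions for illustration only, and the syllabus does not specify any rounding rule.

```python
# Weights copied from the interim assessment formula.
WEIGHTS = {
    "final report and presentation": 0.3,
    "attendance and participation": 0.1,
    "home assignments": 0.3,
    "intermediate report": 0.2,
    "project proposal": 0.1,
}


def interim_grade(scores):
    """Weighted sum of element scores; `scores` maps element name to score."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, assuming a 10-point grading scale.
example_scores = {
    "final report and presentation": 8,
    "attendance and participation": 10,
    "home assignments": 7,
    "intermediate report": 6,
    "project proposal": 9,
}
example_grade = interim_grade(example_scores)
```

With these hypothetical scores the weighted sum is 2.4 + 1.0 + 2.1 + 1.2 + 0.9, so the final report and the home assignments together account for 60% of the result.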
Bibliography

Recommended Core Bibliography

  • Géron, A. (2017). Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (1st ed.). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=nlebk&AN=1486117
  • Han, J., Kamber, M., & Pei, J. (2011). Data Mining: Concepts and Techniques (3rd ed.). Morgan Kaufmann Publishers. 740 pp.
  • Hilpisch, Y. (2014). Python for Finance: Analyze Big Financial Data (1st ed.). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=925360
  • Kirk, M. (2015). Thoughtful Machine Learning with Python: A Test-Driven Approach. Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1455642
  • McKinney, W. (2018). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython (2nd ed.). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1605925

Recommended Additional Bibliography

  • Chatterjee, S., & Krystyanczuk, M. (2017). Python Social Media Analytics. Birmingham: Packt Publishing. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1565635
  • Russell, M. A., & Klassen, M. (2018). Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Instagram, GitHub, and More. Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1951213
  • Sarkar, D., Bali, R., & Sharma, T. (2018). Practical Machine Learning with Python: A Problem-Solver’s Guide to Building Real-World Intelligent Systems. [United States]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1667293
  • Squire, M. (2016). Mastering Data Mining with Python – Find Patterns Hidden in Your Data. Birmingham: Packt Publishing. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1343887
  • Weiming, J. M. (2019). Mastering Python for Finance: Implement Advanced State-of-the-art Financial Statistical Applications Using Python (2nd ed.). Birmingham, UK: Packt Publishing. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=2116431
  • Yan, Y. (2017). Python for Finance (2nd ed.). Birmingham, UK: Packt Publishing. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1547029