Data and Analytics in Finance
- Work comfortably in R: import data, perform basic manipulations to prepare it for calculations, and export the results.
- Apply methods of data analysis and understand their objectives.
- Understand the limitations and relevance of the methods.
- Apply skills in data cleaning.
- Demonstrate the ability to work in different software environments for data analysis and to explain the choice of software.
- Understand the basic theory of financial data analysis; design and write code for a specific task in financial data analysis.
- Master the ability to make decisions based on data analysis and to justify them.
- Make decisions in finance based on data analysis and justify them.
- Data wrangling with R
  1. Introduction to R: Data Structures; Subsetting; Functions; Vectorization.
  2. Data Wrangling: Tidy Data; Reshape; Summarize.
  3. Data Visualization: Base Graphics; Grammar of Graphics; Interactive Graphics.
- Optimization problems on financial data
  4. Principal component analysis and clustering. Main objectives of principal component analysis (PCA). Mathematical model of component discovery. Algorithms for implementing PCA. Latent variables; criteria for choosing the number of components. Rotation and interpretation of the results. Main objectives of clustering; geometrical interpretation. Measures of distance between objects and between clusters. k-means and k-median clustering: objective, algorithm, interpretation of results. Criteria for choosing the number of clusters and assessing clustering quality. Method implementation for the case study “Customer analytics in banks”.
  5. Curve fitting. Main objective of curve fitting and the financial problems it can help solve. Interpolation and extrapolation. Types of curve fitting: polynomial and spline interpolation (local polynomial fitting). Procedure for estimating a fitted curve. Method implementation for the case study “Fitting yield curve”.
  6. Portfolio optimization on data. Optimal portfolio of two risky assets: theoretical model. Model solution as the solution of a quadratic programming problem. Sensitivity to model inputs. The optimal portfolio problem in p dimensions and the LASSO technique for dealing with it. Method implementation for the case study “Constructing a portfolio on trading data of a stock”.
- Fraud detection using machine learning
  7. Introduction to fraud detection and data preprocessing. Importance of fraud detection. Definition and types of fraud. Types of variables. Data exploration and visualization. Dealing with missing values. Standardizing and transforming data.
  8. Featurization, social network analysis, and dealing with imbalanced datasets. Traditional features for fraud detection. Social network analysis. Random oversampling (ROS) and random undersampling (RUS). Synthetic Minority Over-sampling Technique (SMOTE).
  9. Supervised and unsupervised techniques for fraud detection. Linear and logistic regression. Decision trees and ensemble methods. Evaluating fraud detection models. Digit analysis using Benford’s Law. Multivariate outlier detection using robust statistics.
- Interim assessment (module 4): 0.4 * Exam + 0.15 * Students’ self-study work + 0.15 * Seminar activities + 0.15 * Test 1 + 0.15 * Test 2
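As a quick check of the assessment formula above, the following R sketch computes an interim grade as a weighted sum. The weights are copied from the formula. The component scores are hypothetical, and the 10-point scale is an assumption; neither comes from the syllabus.

```r
# Weights copied from the interim assessment formula
weights <- c(exam = 0.40, self_study = 0.15, seminars = 0.15,
             test1 = 0.15, test2 = 0.15)

# Hypothetical component scores (assumed 10-point scale; illustration only)
scores <- c(exam = 7, self_study = 8, seminars = 9, test1 = 6, test2 = 8)

# The weights should sum to 1, so the grade stays on the same scale as its components
stopifnot(isTRUE(all.equal(sum(weights), 1)))

# Weighted sum; indexing scores by name keeps them aligned with the weights
interim_grade <- sum(weights * scores[names(weights)])
print(interim_grade)
```

Because the exam carries a weight of 0.4, one point on the exam moves the interim grade by 0.4, while one point on any other component moves it by 0.15.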
- Provost, F., & Fawcett, T. (2013). Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking (1st ed.). Beijing: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=619895
- Tsay, R. S. (2013). An Introduction to Analysis of Financial Data with R. Wiley.