Bachelor 2020/2021

Business Analytics

Category 'Best Course for Career Development'
Type: Elective course (Sociology and Social Informatics)
Area of studies: Sociology
When: 3rd year, modules 1 and 2
Mode of studies: offline
Instructors: Anna Shirokanova
Language: English
ECTS credits: 5

Course Syllabus


The course is aimed at undergraduate social science students pursuing business-oriented careers in marketing, sales, and service analytics. The course consists of lectures and seminars. The lecture part provides a gentle introduction to several fundamental concepts in business analytics (lifetime value, churn, segmentation, what-if analysis, business processes) and analytical techniques (choice models, complexity reduction), while guiding students through tailored cases relevant to the data economy. The seminar part is blended with MOOCs that introduce specific data analysis techniques as tools for solving typical problems in business analytics. We will discuss the business context of each case, apply analytical techniques that solve the tasks at hand, and discuss the effective delivery of results. This is a rather intense course that requires motivation, genuine interest in business analytics, teamwork, and a large amount of independent study. To succeed in the course, participants are expected to be familiar with the fundamentals of linear regression and with data management techniques in R. Familiarity with machine learning techniques is not required but will be an asset.
Learning Objectives

  • To encourage students to apply the methods and concepts they learned in courses on data analysis and research methods to practical tasks in business and marketing analytics.
Expected Learning Outcomes

  • Students solve analytical problems using data analysis techniques suitable for the task; develop policies and scenarios using several methods of analysis
  • Students perform various statistical analyses, use them appropriately, and develop suggestions (recommendations, policies, scenarios) for the task (churn prevention, segmentation, etc.)
  • Students work individually and in teams to interpret the results and develop policies and scenarios for the company
  • Students collect information on the business context of a given case and evaluate possible solutions to a given task
  • Students select data features to be used in segmentation procedures; as a team, students develop analytical pipelines covering necessary tasks and combining individual results into a summary report
  • Students can plan the analytical data cycle, from formulating requests for data collection, to data cleaning and dimension reduction, to data analysis and reporting the recommendations
Course Contents

  • The trade of business analytics
    Jobs in data science for business. What do business analysts do? Responsibilities of business analysts. Domains of applied business analytics.
  • Introduction to Data Analysis for Consumer Behavior and Client Analytics. Customer Lifetime Value
    Basic concepts of consumer behavior and client analytics. Differences and similarities between typical academic research and business analytics pipelines. Customer acquisition, conversion, churn, segmentation, consumer behavior. Market basket analysis. Association rules. Classification of consumer behavior models. Generic marketing strategies. Types of business models. Statistical methods for client analytics. Customer lifetime value (CLV, LTV). Net profit. Predicting future margin with current sales data. Predicting customer lifetime value with linear regression in R. Omitted variable problem. Multicollinearity. Model validation. Risk of overfitting: use of statistics (AIC), automatic model selection, out-of-sample validation. Adjusted R-squared.
  • Customer Segmentation and Cohort Analysis
    Factors of customer segmentation: demographics, technology, geography, lifestyles, behavior, new/returning contract, time from last purchase, frequency and value of spending, etc. Reducing the complexity of extensive correlated data. Differences between the goals of LTV models and segmentation techniques. Business-related criteria for segmentation: RFM (recency, frequency, monetary) analysis. Analytical techniques for customer segmentation: PCA, cluster analysis (k-means, DBSCAN, agglomerative algorithms). Applications of PCA for exploration in customer analytics. Reducing multicollinearity, building an index, visualizing multidimensional data. Visualizing correlations. Standardizing variances (scaling). Loadings of principal components. Interpretation of principal components. PCA model specification. Kaiser-Guttman criterion. Scree plot. Biplot of variables and components. Further analysis: fitting loadings to linear regression. Clustering algorithms. Distances between data points. Linkage criteria. Dendrogram plot. Applications of cluster analysis for customer analytics in R.
  • Customer Churn. Churn Prevention
    How to predict customer churn? How to detect and prevent customer churn? Factors of churn: expectations, performance, disconfirmation (disappointment based on perceived quality), satisfaction, churn intention/switching decisions. Push-pull-mooring paradigm for churn and service switching. Measurement of latent variables: satisfaction and expectation disconfirmation. Models of satisfaction, expectation disconfirmation, performance. Sources of data: experts, logs, surveys. Case: Yandex Music vs. Spotify. Predicting client churn with logistic regression in R. The meaning of p-value. Interpretation of logistic regression coefficients. Model selection based on significance vs. theory. Inspecting the results of automatic model selection. In-sample model fit for logistic regression: Pseudo-R-squared (interpretation of reasonable, good, and very good fit); accuracy calculation. The rule of “garbage in, garbage out”. Accuracy. Confusion matrix. Finding the optimal threshold: a table of potential payoffs. Composing a payoff matrix. Dealing with overfitting: out-of-sample validation and cross-validation. Splitting the sample in R. Fitting the model on the training subsample and predicting on the test subsample. K-fold methods of cross-validation. Accuracy for out-of-sample validation vs. cross-validation. Addressing churn using segmentation and advertisement. Naive Bayes in predicting churn. Description of task and data for the project.
  • Predicting Customer’s Time to Churn
    Predicting time until next purchase with survival analysis. Addressing churn using segmentation and advertisement. Survival function. Censored data problem. Survival analysis models: pros and cons. Applications of survival models in customer analytics. Types of data censoring (left, interval, right, type I, type II, random). Assumptions of survival analysis. Survival curve analysis by Kaplan-Meier. Survival function and cumulative hazard function. Cumulative risk. Hazard rate. Kaplan-Meier estimation with a categorical covariate. Cox proportional hazards (CPH) model for multiple covariates. Assumptions of CPH. Interpretation of coefficients for categorical and continuous predictors. Survival plot. Visualization of CPH estimates. When assumptions are violated: stratified Cox model, modeling time-dependent coefficients. Prediction of survival curve for new customers. CPH model interpretation, calculation of customer lifetime value.
  • What-If Analysis
    The analytical pipeline: database, model, dashboard, what-if analysis. Use of simulations in business for decision making. Scenarios as a way to construct predictions from data. From scenario, to simulation model, to prediction. What-if analysis vs. Extraction, Transformation and Loading (ETL) approach. Source variables and scenario parameters. Seven stages of what-if analysis: goal analysis, business modeling, data source analysis, multidimensional modeling, simulation modeling, data design and implementation, and validation (if validation fails, repeat stages 4-7). Activity diagram (scenario diagram). Case: productivity of branches. Stating the assumptions required to perform what-if analysis of models. Grouping assumptions into scenarios describing different ways customers may react to the policies. Building what-if models for each policy for each scenario. Comparing and reflecting on the results of scenario models. Reactive programming. Functions in R.
  • Consumer Preferences
    Introduction to consumer preference theory. Utility analysis. Cardinal utility, ordinal utility. Indifference curves show combinations that give equal utility. Marginal rate of substitution (MRS). Constraints: income, price, time. Uses of choice models in marketing and business analytics. Modeling customers' choice by product features. Multinomial logit models for choice vs. conjoint analysis: when and where. Choice-based and metric conjoint. Sample size for a conjoint survey. Preparing the data for choice modeling. Managing and summarizing choice data. Selecting the features for modeling. Building choice models. Modeling different preferences for different groups of customers with hierarchical models (mixed logit models). Reporting choice models: choice share predictions, willingness-to-pay metric. A/B testing and preference testing. Checklists and common pitfalls in A/B testing. Common metrics and reporting the results.
  • Customer Satisfaction
    Customer feedback surveys. Net promoter score (NPS) for measuring loyalty. Promoters, passives and detractors. Customer satisfaction survey (CSAT) for meeting expectations. Post-purchase surveys, product/service development survey, usability surveys. Expectation disconfirmation theory of post-purchase satisfaction. Key constructs: expectations, perceived performance, disconfirmation of beliefs and satisfaction. Inputs to expectations of value. Measuring the perceived performance: overall quality, interaction, service experience, value for money, social status. Problems of customer satisfaction surveys. Self-selection, overdelivering, expectation adjustment. Combining survey and behavior data. Discovering patterns with Bayesian networks (Bayes nets). Trimming groups of variables, defining the importance of predictors.
  • Introduction to Business Process Analytics
    Process mining. Business process data: extraction, processing and analysis. Process as a control flow, process as performance, process as the organizational background. Association rule mining. Markov chain models and sequential association rules. Identifying the process and process stakeholders. Collecting process information. Event data processing: event log objects, exploratory and descriptive analysis, conditional process analysis, process visualizations and process dashboards. Case study: order-to-cash process.
Assessment Elements

  • project Churn (non-blocking)
    The deadline will be announced two weeks before submission. Late submissions are not accepted.
  • project Retention (non-blocking)
  • project Full report (non-blocking)
  • A/B Test project (non-blocking)
  • Small homework tasks (non-blocking)
  • BA Jobs Essay (non-blocking)
Interim Assessment

  • Interim assessment (module 2)
    0.1 * A/B Test project + 0.1 * BA Jobs Essay + 0.2 * project Churn + 0.2 * project Full report + 0.2 * project Retention + 0.2 * Small homework tasks
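
As an illustration of how the weighted formula above combines the assessment elements, here is a minimal sketch in Python. The weights are copied from the formula. The example scores, the 10-point scale, and the function name `interim_grade` are assumptions made for illustration only, and no rounding rule is implied because the syllabus does not state one.

```python
# Weights copied from the interim assessment formula; they sum to 1.0.
WEIGHTS = {
    "A/B Test project": 0.1,
    "BA Jobs Essay": 0.1,
    "project Churn": 0.2,
    "project Full report": 0.2,
    "project Retention": 0.2,
    "Small homework tasks": 0.2,
}


def interim_grade(scores):
    """Return the weighted sum of the element scores (no rounding applied)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores on an assumed 10-point scale.
example_scores = {
    "A/B Test project": 8,
    "BA Jobs Essay": 9,
    "project Churn": 7,
    "project Full report": 8,
    "project Retention": 6,
    "Small homework tasks": 10,
}

print(round(interim_grade(example_scores), 2))
```

Because the four 0.2-weighted elements together carry 80% of the grade, a weak result in any one of them moves the total twice as much as the same change in the essay or the A/B test project.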


Recommended Core Bibliography

  • Chapman, C., & Feit, E. M. (2015). R for Marketing Research and Analytics. Cham: Springer. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=964737
  • Ledolter, J. (2013). Data Mining and Business Analytics with R. Hoboken, New Jersey: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=587979
  • Provost, F., & Fawcett, T. (2013). Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking (1st ed.). Beijing: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=619895

Recommended Additional Bibliography

  • Saxena, R. N., & Srinivasan, A. (2013). Business Analytics: A Practitioner’s Guide. New York: Springer. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=528361