Advanced Data Analysis & Big Data for Business Intelligence

Academic Year: 2019/2020
Language: Instruction in English (ENG)
ECTS credits: 5
Delivered at: Department of Information Systems and Digital Infrastructure Management
Course type: Compulsory course
When: Year 1, modules 1-3

Course Syllabus

Abstract

Advanced Data Analysis and Big Data for Business Intelligence covers techniques for analyzing big data and the technologies used to process it. Big data is the term for a collection of data sets so large and complex that it becomes difficult to process with on-hand database management tools or traditional data processing applications. The challenges include capture, cleaning, storage, search, sharing, transfer, analysis, and visualization. The course focuses on understanding the role of big data analysis in business intelligence.
Learning Objectives

  • Build theoretical knowledge and basic practical skills in the collection, storage, processing, and analysis of large data sets. Develop practical skills in analyzing large data sets for a wide range of applications, including analysis of corporate data, analysis of financial data from data warehouses covering world markets, modeling of data storage and processing, and prediction of complex indicators.
Expected Learning Outcomes

  • Theoretical knowledge and basic practical skills in the collection, storage, processing, and analysis of large data sets.
  • Practical skills in analyzing large data sets for a wide range of applications, including analysis of corporate data, analysis of financial data from data warehouses covering world markets, modeling of data storage and processing, and prediction of complex indicators.
Course Contents

  • Python basics
  • Python advanced level: comprehensions and generators
  • Introduction to functional programming: lambdas, map, reduce, filter, zip
  • Probability Theory and Mathematical Statistics
  • Data analysis with Python: numpy, pandas
  • Visualizing Big Data: Matplotlib
  • Machine learning models and their applications
  • Analysis of social media: Exploring Twitter API
  • Dealing with Twitter Big Data: multithreading in Python
  • Apache Spark Machine Learning on Big Data
  • Business intelligence systems: traditional business analytics vs. Big Data analytics
  • Preprocessing Big Data: Tableau Prep
  • Introduction to Tableau
  • GIS: geographic information systems
  • Big Data and Qlik Sense
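
Several of the Python topics in this list (comprehensions, generators, lambdas, map, filter, reduce, zip) can be illustrated with a minimal sketch. The sample data and all variable and function names below are hypothetical and are not taken from the course materials.

```python
from functools import reduce

# Hypothetical sample data: (customer, order amount) pairs.
orders = [("alice", 120.0), ("bob", 35.5), ("alice", 60.0), ("carol", 210.0)]

# List comprehension: keep only the amounts above 50.
large_amounts = [amount for _, amount in orders if amount > 50]


def running_total(values):
    """Generator: lazily yield the cumulative sum after each value."""
    total = 0.0
    for value in values:
        total += value
        yield total


# Materialize the generator into a list to inspect every step.
totals = list(running_total(large_amounts))

# map with a lambda: transform every element.
doubled = list(map(lambda amount: amount * 2, large_amounts))

# filter with a lambda: keep the elements that satisfy a predicate.
under_100 = list(filter(lambda amount: amount < 100, large_amounts))

# reduce: fold the sequence into a single value.
grand_total = reduce(lambda acc, amount: acc + amount, large_amounts, 0.0)

# zip: pair up two parallel sequences element by element.
customers = [name for name, _ in orders]
amounts = [amount for _, amount in orders]
pairs = list(zip(customers, amounts))

print(large_amounts, totals, doubled, under_100, grand_total)
```

The generator computes each running total only when it is requested, which is the property that makes generators useful for data too large to hold in memory at once.
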
Assessment Elements

  • non-blocking Control work module 1
  • non-blocking Control work module 2
  • non-blocking Seminars activity module 3
  • non-blocking Online - test
    Instructions for students in the LMS.
    1. Midterm exams with asynchronous proctoring.
    Examination format: The exam is written (multiple-choice questions) and is taken with asynchronous proctoring. Asynchronous proctoring means that all of the student's actions during the exam are “watched” by the computer: the exam process is recorded and then analyzed by artificial intelligence and by a human proctor. Please be careful and follow the instructions exactly.
    Platform: The exam is conducted on the StartExam platform, an online platform for conducting tests of various levels of complexity. The link to the exam will be available to the student in the RUZ. Students are required to join the session 15 minutes before it begins. Computers must meet the following technical requirements: https://eduhseru-my.sharepoint.com/:b:/g/personal/vsukhomlinov_hse_ru/EUhZkYaRxQRLh9bSkXKptkUBjy7gGBj39W_pwqgqqNo_aA?e=fn0t9N
    Students must meet the following requirements:
      • Prepare an identification document (a passport opened to the page with name and photo) for identification before the exam begins.
      • Check the microphone, speakers or headphones, webcam, and Internet connection (connecting the computer to the network with a cable is recommended, if possible).
      • Prepare the necessary writing materials, such as pens, pencils, and paper.
      • Close all applications on the computer other than the browser used to log in to StartExam.
    If any of these requirements cannot be met, the student must inform the professor and the program manager 2 weeks before the exam date so that a decision can be made on the student's participation in the exam.
    Students are not allowed to:
      • Turn off the video camera.
      • Use notes, textbooks, or other educational materials.
      • Leave the place where the exam is taken (move outside the camera's field of view).
      • Look away from the computer screen or desktop.
      • Use smart gadgets (smartphone, tablet, etc.).
      • Involve outsiders for help or talk to outsiders during the exam.
      • Read tasks out loud.
    Students are allowed to:
      • Write on a piece of paper and use a pen for notes and calculations.
      • Use a calculator.
    Connection failures: A short-term communication failure is the loss of the student's network connection with the StartExam platform for no longer than 1 minute. A long-term communication failure is the loss of that connection for longer than 1 minute. A long-term communication failure is grounds for terminating the exam and assigning the grade “unsatisfactory” (0 on a ten-point scale). In case of a long-term communication failure during the exam, the student must notify the teacher and record the loss of connection with the platform (a screenshot or a response from the Internet provider), then contact the program manager with an explanatory note about the incident so that a decision on retaking the exam can be made.
Interim Assessment

  • Interim assessment (module 3)
    0.2 * Control work module 1 + 0.2 * Control work module 2 + 0.4 * Online - test + 0.2 * Seminars activity module 3
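
The weighted sum above can be checked with a short calculation. The sketch below is only an illustration: the weights are copied from the formula, while the function name, the sample scores, and the absence of any rounding step are assumptions, since the syllabus does not state a rounding rule.

```python
# Weights copied from the interim assessment formula in the syllabus.
WEIGHTS = {
    "Control work module 1": 0.2,
    "Control work module 2": 0.2,
    "Online - test": 0.4,
    "Seminars activity module 3": 0.2,
}


def interim_grade(scores):
    """Return the weighted sum of the four assessment elements.

    `scores` maps each element name to a grade on the ten-point scale.
    No rounding is applied (assumption: the rounding rule is not specified).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {
    "Control work module 1": 8,
    "Control work module 2": 6,
    "Online - test": 7,
    "Seminars activity module 3": 9,
}

# 0.2*8 + 0.2*6 + 0.4*7 + 0.2*9 is approximately 7.4
print(round(interim_grade(example_scores), 2))
```

Because the online test carries a weight of 0.4, a one-point change in that score moves the interim grade twice as much as a one-point change in any other element.
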
Bibliography

Recommended Core Bibliography

  • Idris, I. (2016). Python Data Analysis Cookbook. Birmingham, UK: Packt Publishing. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1290098

Recommended Additional Bibliography

  • Kirk, M. (2015). Thoughtful Machine Learning with Python: A Test-Driven Approach. Sebastopol: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1455642
  • Weiming, J. M. (2019). Mastering Python for Finance: Implement Advanced State-of-the-art Financial Statistical Applications Using Python (2nd ed.). Birmingham, UK: Packt Publishing. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=2116431