Master 2020/2021

Big Data Systems Research Seminar “Latest Trends in Data Governance, Big Data Analytics & Data Architecture”

Type: Compulsory course
Area of studies: Business Informatics
When: Year 1, modules 2 and 3
Mode of studies: offline
Open to: students of one campus
Instructors: Natalya Khapayeva
Master’s programme: Big Data Systems
Language: English
ECTS credits: 4
Contact hours: 48

Course Syllabus

Abstract

The key objective of this course is to familiarize students with the most important big data concepts and to introduce modern approaches to creating data products. We will examine what a data product is and the technology that makes it possible to build one. The course covers four topics: Big Data Ecosystem, (Big) Data Management, Data Products & Economics, and Data Culture & Ethics. Students will gain the ability to initiate and design data products and to understand the business, ethics, governance, and sustainability challenges relating to Big Data. Most lectures use Python and SQL examples; some use Java and/or Scala.
Learning Objectives

  • This course gives you insight into how big data technologies affect business.
Expected Learning Outcomes

  • Define key concepts and identify technologies in the field of Big Data
  • Explain the challenges of creating and maintaining Big Data products
  • Describe the ethics, governance, and sustainability challenges relating to Big Data
  • Design and evaluate an infrastructure architecture for Big Data products based on particular needs, including selecting an appropriate set of technologies and a governance strategy for storing and processing data
  • Discuss the impact of digitization and the adoption of Big Data in business and overall society
Course Contents

  • Big Data Ecosystem
    We will learn about the past and current state of Big Data technologies.
  • (Big) Data Management
    We will learn about different aspects of data governance.
  • Data Products & Economics
    We will learn about creating data products and evaluating them from the business and architecture points of view.
  • Data Culture & Ethics
    We will learn about data-informed approaches to making business and life decisions.
Assessment Elements

  • non-blocking Homeworks
  • non-blocking Course project
  • non-blocking Activity during classes
  • non-blocking Exam
    Examination format: The exam is written.
    Platform: The exam is taken on the Google Forms and MS Teams platforms. Students are required to join the session 15 minutes before it begins.
    Students must meet the following requirements:
      • Check your computer for compliance with the technical requirements no later than 7 days before the exam.
      • Use your corporate account (@edu.hse.ru) to sign in to the test form.
      • Check your microphone, speakers or headphones, webcam, and Internet connection (we recommend connecting your computer to the network with a cable, if possible).
      • Prepare the necessary writing materials, such as pens, pencils, and paper.
    If a student cannot meet one of the requirements for participation in the exam, the student must inform the professor and the programme manager 2 weeks before the exam date so that a decision can be made on the student's participation.
    Students are not allowed to:
      • Turn off the video camera.
      • Leave the place where the exam is taken (move outside the camera's field of view).
      • Involve outsiders for help or talk to outsiders during the exam.
      • Read tasks out loud.
      • Interact with other students.
    Students are allowed to:
      • Write on paper and use a pen for notes and calculations.
      • Use notes and textbooks, including in digital form.
      • Turn on the microphone to answer the teacher’s questions.
      • Ask the teacher for additional information needed to understand the exam task.
    Connection failures: A short-term communication failure is the loss of a student's network connection with the MS Teams and Google Forms platforms for no longer than 1 minute. A long-term communication failure is the loss of that connection for longer than 1 minute. A student cannot continue to participate in the exam after a long-term communication failure.
    The retake procedure is similar to the exam procedure. In case of a long-term communication failure with the MS Teams and Google Forms platforms during the exam, the student must notify the teacher and record the loss of connection with the platform (for example, with a screenshot or a response from the Internet provider). The student must then contact the programme manager with an explanatory note about the incident so that a decision can be made on retaking the exam.
Interim Assessment

  • Interim assessment (3 module)
    0.16 * Activity during classes + 0.4 * Course project + 0.2 * Exam + 0.24 * Homeworks
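
The weighted sum above can be checked with a short calculation. The following is a minimal Python sketch, not an official grading tool: the weights are copied from the formula, while the function name `interim_grade`, the example scores, and the 0 to 10 scale they are on are assumptions made only for illustration.

```python
# Weights copied from the interim assessment formula above.
WEIGHTS = {
    "Activity during classes": 0.16,
    "Course project": 0.4,
    "Exam": 0.2,
    "Homeworks": 0.24,
}


def interim_grade(scores):
    """Return the weighted sum of the four assessment elements.

    `scores` maps each element name in WEIGHTS to a numeric score.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores on an assumed 0-10 scale (not taken from the syllabus).
example_scores = {
    "Activity during classes": 8,
    "Course project": 9,
    "Exam": 7,
    "Homeworks": 10,
}

# 0.16*8 + 0.4*9 + 0.2*7 + 0.24*10 = 1.28 + 3.6 + 1.4 + 2.4
print(round(interim_grade(example_scores), 2))  # prints 8.68
```

The weights sum to 1, so the result stays on the same scale as the individual scores. The syllabus does not state a rounding rule, so the rounding in the last line is only for display.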
Bibliography

Recommended Core Bibliography

  • Malaska, T., & Seidman, J. (2018). Foundations for Architecting Data Solutions: Managing Successful Data Projects (1st ed.). O’Reilly Media.
  • Erl, T., Khattak, W., & Buhler, P. (2016). Big Data Fundamentals: Concepts, Drivers & Techniques. Prentice Hall.

Recommended Additional Bibliography

  • Damji, J. S., Wenig, B., Das, T., & Lee, D. (2020). Learning Spark. O’Reilly Media.
  • Kleppmann, M. (2017). Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1487643
  • Richards, M., & Ford, N. (2019). Fundamentals of Software Architecture: An Engineering Approach. O’Reilly Media.