Master's Programme 2020/2021

Introduction to Big Data

Field of study: 38.04.02. Management
When taught: Year 1, Module 4
Mode of study: with an online course
Instructor: Сахнюк Павел Анатольевич
Programme: International Management
Language: English
Credits: 3
Contact hours: 2

Course Syllabus

Abstract

Program: International Management
Links:
  • https://www.coursera.org/learn/big-data-introduction?specialization=big-data
  • https://www.coursera.org/learn/big-data-management?specialization=big-data
Semester: 2
Level: Graduate
Year: 1
Study mode: MOOC
Type of course: Elective
ECTS: 3

Prerequisites: This course is for those new to data science. No prior programming experience is needed, although the ability to install applications and use a virtual machine is necessary to complete the hands-on assignments.

Learning outcomes:
  • to be able to describe the Big Data landscape, with examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors
  • to be able to explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis and reporting
  • to be able to get value out of Big Data by using a 5-step process to structure your analysis
  • to be able to identify the frequent data operations required for various types of data
  • to be able to select a data model to suit the characteristics of your data
  • to be able to apply techniques to handle streaming data
  • to be able to differentiate between a traditional Database Management System and a Big Data Management System

Contents: This course provides an introduction to one of the common frameworks, Hadoop. It also provides guided hands-on tutorials that introduce students to systems and tools such as AsterixDB, HP Vertica, Impala, Neo4j, Redis, and SparkSQL. The course covers techniques to extract value from existing untapped data sources and to discover new data sources. It covers the following topics:
  • Big Data Introduction
  • Big Data Modeling and Management Systems
Learning Objectives

  • to be able to describe the Big Data landscape, with examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors
  • to be able to explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis and reporting
  • to be able to get value out of Big Data by using a 5-step process to structure your analysis
  • to be able to identify the frequent data operations required for various types of data
  • to be able to select a data model to suit the characteristics of your data
  • to be able to apply techniques to handle streaming data
  • to be able to differentiate between a traditional Database Management System and a Big Data Management System
Expected Learning Outcomes

At the end of this course, you will be able to:
  • Describe the Big Data landscape, with examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors.
  • Explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis and reporting.
  • Get value out of Big Data by using a 5-step process to structure your analysis.
  • Identify what are and what are not big data problems, and recast big data problems as data science questions.
  • Explain the architectural components and programming models used for scalable big data analysis.
  • Summarize the features and value of core Hadoop stack components, including the YARN resource and job management system, the HDFS file system and the MapReduce programming model.
  • Install and run a program using Hadoop!
Course Contents

  • Big Data: Why and Where
    Data -- it's been around (even digitally) for a while. What makes data "big" and where does this big data come from?
  • Characteristics of Big Data and Dimensions of Scalability
    You may have heard of the "Big Vs". We'll give examples and descriptions of the five most commonly discussed. But we also want to propose a sixth V, and we'll ask you to practice writing Big Data questions targeting this V -- value.
  • Data Science: Getting Value out of Big Data
    We love science and we love computing, don't get us wrong. But the reality is we care about Big Data because it can bring value to our companies, our lives, and the world. In this module we'll introduce a 5-step process for approaching data science problems.
  • Foundations for Big Data Systems and Programming
    Big Data requires new programming frameworks and systems. For this course, we don't require programming knowledge or experience -- but we do want to give you a grounding in some of the key concepts.
  • Systems: Getting Started with Hadoop
    Let's look at some details of Hadoop and MapReduce. Then we'll go "hands on" and actually perform a simple MapReduce task in the Cloudera VM. Pay attention, as we'll guide you in "learning by doing" by diagramming a MapReduce task as a Peer Review.
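
To give a feel for what a simple MapReduce task looks like, here is a minimal sketch in plain Python. It does not use Hadoop or the Cloudera VM; it only imitates the three stages (map, shuffle, reduce) in a single process. Word counting is assumed here as the example task, and the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical, chosen for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key, values):
    """Reduce step: collapse the list of counts for one key into a total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big value"]))
```

In a real Hadoop job the map and reduce steps run in parallel on different machines and the framework performs the shuffle; the sketch only shows how the data flows between the stages.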
Assessment Elements

  • non-blocking exam upon finishing this online course
  • non-blocking online tests on Coursera platform
Interim Assessment

  • Interim assessment (module 4)
    0.7 * exam upon finishing this online course + 0.3 * online tests on Coursera platform
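
As a worked illustration of the weighting above, the following sketch computes the interim grade from the two components. The function name `interim_grade` and the sample scores are hypothetical; the syllabus states only the weights, not the grading scale or any rounding rule, so none is applied here.

```python
def interim_grade(exam, online_tests):
    """Weighted sum from the syllabus: 0.7 * exam + 0.3 * online tests."""
    return 0.7 * exam + 0.3 * online_tests


if __name__ == "__main__":
    # Hypothetical scores, assuming both components use the same scale.
    grade = interim_grade(exam=8, online_tests=6)
    print(round(grade, 2))
```

Because the exam carries more than twice the weight of the online tests, a one-point change in the exam score moves the result by 0.7 points, against 0.3 points for the tests.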
Bibliography

Recommended Core Bibliography

  • Grable, J. E., & Lyons, A. C. (2018). An Introduction to Big Data. Journal of Financial Service Professionals, 72(5), 17–20. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=bsu&AN=131378067
  • Sitalakshmi Venkatraman, & Ramanathan Venkatraman. (2019). Big data security challenges and strategies. https://doi.org/10.3934/math.2019.3.860
  • Valentine, C. (2014). Hadoop: 94 Most Asked Questions - What You Need to Know. Emereo Publishing.
  • Wenbing Zhao, Longxiang Gao, & Anfeng Liu. (2018). Programming Foundations for Scientific Big Data Analytics. https://doi.org/10.1155/2018/2707604
  • White, T. (2011). Hadoop: The Definitive Guide (2nd ed., updated). Yahoo Press.

Recommended Additional Bibliography

  • Brajesh Mishra. (2020). Big Data Analysis Using Hadoop Map Reduce. https://doi.org/10.26562/irjcs.2020.v0705.005
  • Laurent Thiry, Heng Zhao, & Michel Hassenforder. (2018). Categories for (Big) Data models and optimization. https://doi.org/10.1186/s40537-018-0132-9
  • UI AHSAAN, S., & MOURYA, A. K. (2019). Big Data Analytics: Challenges and Technologies. Annals of the Faculty of Engineering Hunedoara - International Journal of Engineering, 17(4), 75–79.
  • Wu, C. (2019). CS 644-101: Introduction to Big Data.
  • Zhenlong Li, Wenwu Tang, Qunying Huang, Eric Shook, & Qingfeng Guan. (2020). Introduction to Big Data Computing for Geospatial Applications. ISPRS International Journal of Geo-Information, 9(487), 487. https://doi.org/10.3390/ijgi9080487