
Federative Storage Design and Implementation in Big Data Driven Analytical Environment for a Large e-Commerce Enterprise

Student: Rodionova Svetlana

Supervisor: Evgeny Koucheryavy

Faculty: Graduate School of Business

Educational Programme: Big Data Systems (Master)

Year of Graduation: 2021

Keywords: data infrastructure, data processing, data integration, open source software, testing, Docker, DBMS, ETL, data warehouse, application programming interface, dataflows, routine optimization

Object of research: standardization of analytical integration routines as a key step toward building an effective big data environment

Subject of research: the analytical environment of a “start-up” department in a large e-commerce enterprise, Ozon.

The master's thesis consists of an introduction, three chapters, a conclusion, a list of references, and six annexes. The introduction highlights the relevance of the chosen topic and states the goal, object, and subject of the research.

Chapter 1 introduces the company's initial big data environment. It states the main problems that data analysts face, and it describes and visualizes the standard business processes for working with data. The chapter also presents a new concept for organizing daily routines and describes the scientific and commercial value of the work. In addition, it briefly profiles the company to show its current structure and the scope for developing processes within it.

Chapter 2 combines a theoretical analysis of existing techniques and solutions to the problems stated in Chapter 1 with an explanation of why they are not fully suitable for this case. It also describes the state-of-the-art concept of data integration and discusses its applicability to the initial problem.

Chapter 3 describes the proposed solution to the stated problems: a federative data storage prototype. It presents the architecture and explains the prototype components in detail. The chapter also introduces a test case with a testing environment to demonstrate the solution's capabilities, states the key benefits of the final design, and outlines the possibilities and basic steps for developing the concept further.

The conclusion contains the main research findings and briefly describes the work performed. The annexes contain long code listings for key system objects, as well as repetitive figures and other project listings.

