
Student: Dinara Fayzullina
Title: Development of Services for Big Data Analytics with Everest Platform
Year of Graduation: 2016
Everest is a web-based platform for the distributed execution of scientific applications, and data transfer is key to the functionality of such platforms. Everest imposes some constraints on data management: in particular, the supported mechanisms require all data to be uploaded to the Everest server before being transferred to the computing resources. Finding possible solutions to this data transfer problem therefore became one of the tasks of this research.

To address the data transfer problem, the Everest platform was integrated with Dropbox, allowing users to work with their data through the Dropbox file hosting service. In addition, new functionality enables users to run applications on data downloaded from the Dataverse public repository using a Digital Object Identifier (DOI).
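As an illustration of how a dataset can be addressed by DOI, the sketch below builds a lookup URL in the style of the Dataverse native API (`/api/datasets/:persistentId/`). It is a minimal example, not the thesis implementation: the helper name `dataverse_dataset_url`, the server address, and the sample DOI are assumptions made for illustration, and no network request is performed.

```python
from urllib.parse import urlencode


def dataverse_dataset_url(server: str, doi: str) -> str:
    """Build a dataset lookup URL from a DOI (hypothetical helper).

    Accepts "doi:10.x/y", "https://doi.org/10.x/y", or a bare "10.x/y".
    """
    doi = doi.strip()
    # Normalise the common ways a DOI is written down to its bare form.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # The persistent identifier is passed as a URL-encoded query parameter.
    query = urlencode({"persistentId": f"doi:{doi}"})
    return f"{server.rstrip('/')}/api/datasets/:persistentId/?{query}"


if __name__ == "__main__":
    # Server and DOI are placeholders, not values taken from the thesis.
    print(dataverse_dataset_url("https://demo.dataverse.org", "doi:10.5072/FK2/J8SJZB"))
```

The response of such a lookup would list the files of the dataset, which an application could then fetch before starting a job.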

The second problem addressed in the research is the development of Everest applications for distributed data processing, intended to demonstrate the value of publishing such applications as services. The computationally intensive task of mapping reads to a reference genome was chosen as an example. As a result, an Everest application that uses Hadoop MapReduce to solve the mapping task was implemented. The application itself handles downloading data from an FTP server, which can be regarded as a solution to the data transfer problem in this particular case, and it can be integrated with Everest on a permanent basis.
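The map/reduce decomposition of read mapping can be illustrated with a toy, in-memory sketch. It does not use Hadoop and is not the thesis application: it assumes exact substring matching in place of a real aligner, and the function names (`map_read`, `reduce_hits`, `map_reads`) are hypothetical. The map step emits a (position, read id) pair for each place a read occurs in the reference, and the reduce step groups read ids by position.

```python
from collections import defaultdict


def map_read(read_id, read, reference):
    """Map step: emit (position, read_id) for every exact occurrence of the read."""
    start = reference.find(read)
    while start != -1:
        yield start, read_id
        # Continue searching one base further to catch overlapping matches.
        start = reference.find(read, start + 1)


def reduce_hits(pairs):
    """Reduce step: group read ids by the reference position they map to."""
    grouped = defaultdict(list)
    for position, read_id in pairs:
        grouped[position].append(read_id)
    return dict(sorted(grouped.items()))


def map_reads(reads, reference):
    """Run the map step over all reads, then reduce the emitted pairs."""
    pairs = [hit for rid, seq in reads.items() for hit in map_read(rid, seq, reference)]
    return reduce_hits(pairs)


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"
    reads = {"r1": "ACGT", "r2": "TAGC", "r3": "GGGG"}
    print(map_reads(reads, reference))
```

Because each read is processed independently in the map step, the reads can be split across workers, which is what makes the task a natural fit for a MapReduce-style framework.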

In the near future, the new data transfer functionality is planned to be made available to all users; first, however, production status approval must be obtained from Dropbox support. Integration of the Everest platform with the Globus Online service is also under consideration, since it may provide reliable and fast tools for transferring big data between users and computing resources.

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
