Alexey Kosmachev
Optimization of Workflow Execution in Distributed Cloud Infrastructure
Cloud computing can provide almost unlimited resources for any purpose. This
technology can be used to build high-performance, scalable, dynamic clusters. However,
there is no universal method for effectively organizing computation for every type of task.

A significant number of algorithms are compositions of small tasks that are organized
into a directed acyclic graph and used to process large amounts of data. Unfortunately,
existing solutions for big data processing are designed for different computational models;
as a result, running graph-based algorithms on them leads to non-optimal resource
utilisation and long execution times. Such computational tasks often occur in various
fields of science: biology, medicine, physics, chemistry, astronomy, etc.

In this paper, we present an algorithm that considers the technical constraints and
structural features of scientific workflows in order to build an effective schedule that
takes large amounts of data into account. As a result, we obtain a flexible and tunable
cloud cluster optimized for data-intensive applications.

