Bachelor 2019/2020

Distributed Computing

Type: Elective course (Software Engineering)
Area of studies: Software Engineering
When: 3rd year, modules 3 and 4
Mode of studies: offline
Instructors: Petr Panfilov
Language: English
ECTS credits: 5
Contact hours: 64

Course Syllabus

Abstract

Distributed computing has become a central concept in how computers are used, from web applications to e-commerce to content distribution. It helps programmers aggregate the resources of many networked computers to construct highly available and scalable services. This course teaches the abstractions, design, and implementation techniques that enable the building of fast, scalable, fault-tolerant distributed computing systems, including client-server computing, the web, cloud computing, peer-to-peer systems, and distributed storage systems. Topics include remote procedure call, preventing and finding errors in distributed programs, maintaining consistency of distributed state, fault tolerance, and high availability. Multithreading, network programming, and several case studies of distributed computing systems are also considered.
Learning Objectives

  • To introduce students to the fundamental problems, concepts, and approaches in the design and analysis of distributed computing systems.
  • To familiarize students with the stages of the distributed system design cycle, including system architecture, the arrangement of data and processes, naming, communication and coordination issues, existing distributed computing paradigms, techniques, and tools, and the evaluation of distributed application systems for specific data, task, and user types.
Expected Learning Outcomes

  • understand the evolution of distributed computing from its early beginnings as multi-processor and multi-computer systems, to computer networks, to the emerging cloud, edge (fog, dew, mist) and heterogeneous computing environments
  • know the design goals of distributed computing systems
  • understand the distinction between distributed computing systems, distributed information systems and pervasive systems
  • know various types of distributed systems
  • understand the existing distributed computing paradigms and systematic issues
  • understand the distinction and relation between logical organization of the collection of software components and the actual physical realization of the distributed system
  • understand some commonly applied architectural styles toward organizing distributed computing systems
  • know the role of middleware layer in separating applications from underlying platforms
  • understand the practical issues and choices that can be made to instantiate and place software components on the real machines
  • understand the difference between centralized and decentralized architectures
  • explain and discuss basic principles and typical examples of real-world distributed systems such as NFS file-sharing system and the web
  • understand the concept of processes and how the different types of processes play a crucial role in distributed systems
  • understand threads and their role in obtaining performance in multicore and multiprocessor environments and in structuring clients and servers
  • know basic principles of virtualization, which lets applications run concurrently and independently of the underlying hardware and platforms
  • understand client-server organizations in distributed systems
  • understand typical organizations of both clients and servers
  • know the design issues for servers including those used in object-based distributed systems
  • understand process migration, or more specifically code migration, and its role in achieving scalability of a distributed system
  • understand the ways that processes on different machines in a distributed system can exchange information
  • understand protocols or rules that communicating processes must adhere to
  • know the widely used models of communication: Remote Procedure Call (RPC), and Message-Oriented Middleware (MOM)
  • know basic principles of the RPC model and problems with achieving distribution transparency
  • understand the peculiarities of the high-level message-queuing model of process communication
  • know what an application-level routing means for the message-oriented communication
  • know how to set up multicast facilities for data dissemination in distributed systems
  • understand traditional deterministic means of multicasting as well as probabilistic approaches
  • understand the usage of names in resource sharing, identifying entities, referring to locations, and other uses in distributed systems
  • understand the difference in implementing naming system in distributed systems and nondistributed systems
  • know what a flat-naming system is, and what mechanisms are needed to trace the location of entities in distributed system
  • know naming approaches ranging from chains of forwarding links, to distributed hash tables, to hierarchical location services
  • understand general principles and scalability issues of structured name systems
  • know the use of Domain Name System (DNS)
  • know the way of using attributes assigned to an entity to resolve a description of an entity in distributed system
  • understand the importance of cooperation and synchronization of actions between processes
  • understand difference between process synchronization and data synchronization
  • understand the goal of process coordination, coordination problems and solutions in distributed systems
  • know basic principles of process synchronization based on actual time
  • understand coordination of a group of processes by means of election algorithms
  • know election algorithms for coordinating mutual exclusion to a shared resource
  • discuss the use of publish-subscribe systems for coordination in distributed event matching
  • understand the importance of data replication in distributed systems
  • know consistency models for shared data and their implementation
  • discuss and explain difference between data-centric and client-centric consistency models
  • know basic principles and key issues of actual implementation of consistency models
  • understand the issue of managing replica servers
  • know the alternatives for implementing strong consistency for replicas
  • understand how caching protocols can be used as a special case of consistency protocols
  • explain caching and replication in Web-based systems
  • understand the notion of partial failure of the distributed system and issue of recovery from partial failures
  • understand the process resilience through process groups
  • know the Paxos algorithm for reaching consensus among the group members
  • understand relation between fault tolerance and reliable communication
  • know basic principles of recovery from a failure in distributed systems
  • understand various mechanisms that are generally incorporated in distributed systems to support security
  • know about the security policy that is to be enforced and the design issues for mechanisms that help enforce such policies
  • know how to ensure secure communication between users or processes, possibly residing on different machines
  • know how to ensure secure access control through authorization mechanisms
  • know the basics of security management, including mechanisms to distribute cryptographic keys, add and remove users from a system, prove ownership to access specified resources, etc.
Course Contents

  • Introduction: Design goals
    Distributed systems consist of autonomous computers that work together to give the appearance of a single coherent system. Design goals for distributed systems include sharing resources and ensuring openness. In addition, designers aim at hiding many of the intricacies related to the distribution of processes, data, and control.
  • Introduction: Types of systems
    Different types of distributed systems exist, which can be classified as being oriented towards supporting computations, information processing, and pervasiveness. Distributed computing systems are typically deployed for high-performance applications, often originating from parallel computing. Cloud computing goes beyond high-performance computing and also supports distributed systems found in traditional office environments. An emerging class of distributed systems is represented by pervasive computing environments, including mobile-computing systems as well as sensor-rich environments.
  • Architectures: Architectural styles. Middleware
    We can make a distinction between software architecture and system architecture. An architectural style reflects the basic principle that is followed in organizing the interaction between the software components comprising a distributed system. Important styles include layering, object-based styles, resource-based styles, and styles in which handling events is prominent.
  • Architectures: System architecture. Example
    There are many different organizations of distributed systems. Client-server architectures are often highly centralized. In peer-to-peer systems, the processes are organized into an overlay network, which is a logical network that can be either structured, using deterministic schemes for routing messages between processes, or unstructured. In hybrid architectures, elements from centralized and decentralized organizations are combined, as is the case in BitTorrent-based systems.
  • Processes: Threads. Virtualization
    Processes play a fundamental role in distributed systems as they form a basis for communication between different machines. Threads in distributed systems are particularly useful for continuing to use the CPU while a blocking I/O operation is performed. In general, threads are preferred over processes when performance is at stake. Virtualization has long been an important field of computer science. Popular virtualization schemes allow users to run a suite of applications on top of their favourite operating system and to configure complete virtual distributed systems in the cloud.
  • Processes: Clients. Servers
    Organizing a distributed application in terms of clients and servers has proven to be useful. Client processes generally implement user interfaces, which may range from very simple displays to advanced interfaces. Client software is furthermore aimed at achieving distribution transparency by hiding details concerning the communication with servers. Servers are often more intricate than clients. They can either be iterative or concurrent, implement one or more services, and can be stateless or stateful.
  • Communication: Foundations. RPC
    Communication between processes is essential for any distributed system. In traditional network applications, communication is often based on the low-level message-passing primitives offered by the transport layer. One of the most widely used abstractions is the Remote Procedure Call (RPC), which offers synchronous communication facilities by which a client is blocked until the server has sent a reply.
  • Communication: Message-oriented & Multicast communication
    Message-oriented middleware models generally offer persistent asynchronous communication, and are used where RPCs are not appropriate. An important class of communication protocols in distributed systems is multicasting.
  • Naming: Names, IDs. Flat naming
    Names are used to refer to entities. There are three types of names: addresses, identifiers, and human-friendly names. Given these types, we make a distinction between flat naming, structured naming, and attribute-based naming. Systems for flat naming essentially need to resolve an identifier to the address of its associated entity. This can be done in different ways.
  • Naming: Structured naming. Attribute-based naming
    Structured names are easily organized in a name space that can be represented by a naming graph in which a node represents a named entity and the label on an edge represents the name of the entity. Naming graphs are convenient to organize human-friendly names in a structured way. More problematic are attribute-based naming schemes in which entities are described by a collection of (attribute, value) pairs.
  • Coordination: Clock synchronization
    There are various ways to synchronize clocks in a distributed system. All methods are based on exchanging clock values, while taking into account the time it takes to send and receive messages.
  • Coordination: Mutual exclusion. Election algorithms
    An important class of synchronization algorithms is that of distributed mutual exclusion. These algorithms ensure that in a distributed collection of processes, at most one process at a time has access to a shared resource. Synchronization between processes often requires that one process acts as a coordinator. To decide on who is going to be that coordinator an election algorithm is applied.
  • Consistency and replication: Data-centric & Client-centric models
    Data is replicated to improve the reliability of a distributed system and to improve performance. Replication introduces a consistency problem: whenever a replica is updated, that replica becomes different from the others. To keep replicas consistent, we need to propagate updates in such a way that temporary inconsistencies are not noticed. There are different consistency models. Consistent ordering of operations has long formed the basis for many consistency models. As opposed to data-centric models, researchers in the field of distributed databases for mobile users have defined a number of client-centric consistency models.
  • Consistency and replication: Replica management. Consistency protocols
    Consistency protocols describe specific implementations of consistency models. With respect to sequential consistency and its variants, a distinction can be made between primary-based protocols and replicated-write protocols. We pay separate attention to caching and replication in the Web and, relatedly, in content delivery networks.
  • Fault tolerance
    Fault tolerance is defined as the characteristic by which a distributed computing system can mask the occurrence of failures and recover from them. Several types of failures exist. Redundancy is the key technique needed to achieve fault tolerance. When applied to processes, the notion of process groups becomes important. The real problem is that members of a process group need to reach consensus in the presence of various failures. Paxos is by now a well-established and highly robust consensus algorithm.
  • Security
    A distributed system should provide the mechanisms that allow a variety of different security policies to be enforced. Three important issues can be distinguished: secure channels between processes, access control or authorization, and security management. Special attention is also required for handling secure names.
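As an illustration of the clock synchronization topic listed above (exchanging clock values while accounting for message travel time), the following is a minimal Python sketch of the offset estimate used in request/reply schemes such as NTP. It is not taken from the course materials. The function name `estimate_offset` and all timestamp values are hypothetical, and the estimate assumes the network delay is roughly the same in both directions.

```python
def estimate_offset(t1, t2, t3, t4):
    """Estimate how far the server clock is ahead of the client clock.

    t1: client clock when the request is sent
    t2: server clock when the request is received
    t3: server clock when the reply is sent
    t4: client clock when the reply is received

    Assumption: the one-way delay is the same in both directions.
    """
    # Average of the two apparent clock differences; the symmetric
    # network delay cancels out of the sum.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    # Total time spent on the network, excluding server processing time.
    round_trip_delay = (t4 - t1) - (t3 - t2)
    return offset, round_trip_delay


# Hypothetical scenario: the server clock is 5.0 s ahead, each message
# takes 0.1 s, and the server spends 0.02 s handling the request.
t1 = 100.00   # client sends request
t2 = 105.10   # server receives it (client time 100.10)
t3 = 105.12   # server sends reply
t4 = 100.22   # client receives reply

offset, delay = estimate_offset(t1, t2, t3, t4)
print(f"offset ~ {offset:.2f} s, round-trip delay ~ {delay:.2f} s")
```

If the two one-way delays differ, the estimate is off by half of that difference, which is why such schemes typically favour samples with a small round-trip delay.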
Assessment Elements

  • non-blocking InClass Activity
  • non-blocking Homeworks
  • Partially blocks (final) grade/grade calculation Referate (Individual Study)
    Study material is based on an analysis of one or two recent papers on the topic
  • Partially blocks (final) grade/grade calculation Home Assignment (Group Project)
  • Partially blocks (final) grade/grade calculation Final Examination
    Written examination in MS Teams. No proctoring. Technical requirements: webcam, microphone, headphones / speakers
Interim Assessment

  • Interim assessment (4 module)
    0.2 * Final Examination + 0.3 * Home Assignment (Group Project) + 0.2 * Homeworks + 0.1 * InClass Activity + 0.2 * Referate (Individual Study)
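The weighted sum above can be checked with a short Python sketch. The weights are copied from the formula. The example scores, the 10-point scale, and the name `interim_grade` are assumptions for illustration only. The sketch applies no rounding rule and does not model the "partially blocks" conditions, since the syllabus does not spell them out.

```python
# Weights copied from the interim assessment formula.
WEIGHTS = {
    "Final Examination": 0.2,
    "Home Assignment (Group Project)": 0.3,
    "Homeworks": 0.2,
    "InClass Activity": 0.1,
    "Referate (Individual Study)": 0.2,
}


def interim_grade(scores):
    """Return the weighted sum of the element scores (no rounding applied)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores on an assumed 10-point scale.
example_scores = {
    "Final Examination": 8,
    "Home Assignment (Group Project)": 7,
    "Homeworks": 9,
    "InClass Activity": 10,
    "Referate (Individual Study)": 6,
}

grade = interim_grade(example_scores)
print(round(grade, 2))  # 7.7
```

The weights sum to 1.0, so the result stays on the same scale as the individual element scores.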
Bibliography

Recommended Core Bibliography

  • Distributed Systems. (2017). Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsnar&AN=edsnar.oai.ris.utwente.nl.publications.db6a761f.b353.419e.b65a.81e3740bbe53
  • Tanenbaum, A. S., & Steen, M. van. (2014). Distributed Systems: Pearson New International Edition : Principles and Paradigms (Vol. 2nd ed). Harlow, Essex: Pearson. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1418515

Recommended Additional Bibliography

  • Steen, M., & Tanenbaum, A. (2016). A brief introduction to distributed systems. Computing, 98(10), 967–1009. https://doi.org/10.1007/s00607-016-0508-7