1. What is a Distributed System?

A distributed system is a collection of autonomous computers linked by a computer network that appears to its users as
a single computer.

Inter-process communication
In computing, inter-process communication (IPC) is a set of methods for the exchange of data among
multiple threads in one or more processes. The processes may be running on one or more computers connected by a network.
IPC methods are divided into methods for message passing, synchronization, shared memory, and remote procedure
calls (RPC). The method of IPC used may vary based on the bandwidth and latency of communication between the
threads, and the type of data being communicated.
There are several reasons for providing an environment that allows process cooperation:

   • Information sharing
   • Speedup
   • Modularity
   • Convenience
   • Privilege separation
IPC may also be referred to as inter-thread communication and inter-application communication.
The combination of IPC with the address space concept is the foundation for address space independence and isolation.

MARSHALLING

In a distributed system, different modules can use different representations for the same data. To exchange such data
between modules, it is necessary to reformat it. This operation, called marshalling, takes CPU time and is sometimes the
most expensive part of network communication.
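As a rough illustration, the following Python sketch (the record layout and names are invented for the example) marshals a small record into network byte order with the standard struct module, so both ends agree on the representation:

    import struct

    # Pack a (user_id, balance) record in network byte order ("!"), so the
    # bytes mean the same thing regardless of each machine's endianness.
    def marshal(user_id, balance):
        return struct.pack("!id", user_id, balance)   # 4-byte int + 8-byte double

    def unmarshal(data):
        return struct.unpack("!id", data)

    wire = marshal(7, 99.5)
    print(len(wire), unmarshal(wire))   # 12 (7, 99.5)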

Remote procedure call

In computer science, a remote procedure call (RPC) is an inter-process communication technique that allows a computer
program to cause a subroutine or procedure to execute in another address space (commonly on another computer on a
shared network) without the programmer explicitly coding the details of this remote interaction. That is, the
programmer writes essentially the same code whether the subroutine is local to the executing program or remote. When
the software in question uses object-oriented principles, RPC is called remote invocation or remote method
invocation (RMI).
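As a concrete sketch, Python's standard xmlrpc module implements RPC; the add procedure and port below are invented for the example, and the client-side call reads exactly like a local one:

    # Server side: expose an ordinary function as a remote procedure.
    from xmlrpc.server import SimpleXMLRPCServer

    def add(a, b):
        return a + b

    server = SimpleXMLRPCServer(("localhost", 8000))
    server.register_function(add, "add")
    # server.serve_forever()            # run this in a separate process

    # Client side: the remote call looks like a local call.
    from xmlrpc.client import ServerProxy

    proxy = ServerProxy("http://localhost:8000/")
    # print(proxy.add(2, 3))            # marshalling and transport are hidden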
Client-Server Communication Model

The client–server model is a distributed computing model that partitions tasks or workloads
between the providers of a resource or service, called servers, and service requesters, called clients.[1] Often clients and
servers communicate over a computer network on separate hardware, but both client and server may reside in the same
system. A server machine is a host running one or more server programs that share their resources with clients.
A client does not share any of its resources, but requests a server's content or service function. Clients therefore initiate
communication sessions with servers, which await incoming requests.




• Structure: group of servers offering service to clients

• Based on a request/response paradigm

• Techniques:

– Socket, remote procedure calls (RPC), Remote Method Invocation (RMI)
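A bare-bones version of this request/response exchange over sockets might look like the following Python sketch (port and payload are illustrative); RPC and RMI layer procedure-call semantics on top of this kind of transport:

    import socket

    # Server: accept one connection, read a request, send a response.
    def serve_once(port=9000):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
            srv.bind(("localhost", port))
            srv.listen(1)
            conn, _ = srv.accept()
            with conn:
                request = conn.recv(1024)
                conn.sendall(b"echo: " + request)   # respond to the request

    # Client: initiate the session and await the server's response.
    def request(port=9000):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
            cli.connect(("localhost", port))
            cli.sendall(b"hello")
            print(cli.recv(1024))                   # b'echo: hello'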




Issues in Client-Server Communication

• Addressing

• Blocking versus non-blocking

• Buffered versus unbuffered

• Reliable versus unreliable

• Server architecture: concurrent versus sequential

• Scalability
Stub (distributed computing)

• The stubs in RPC are responsible for packing and unpacking the call parameters, and the call results

- this is called marshalling/unmarshalling

• Stubs must allow for the fact that client and server may be machines of different types

- for example, integers may be represented differently (byte-ordering)




A stub in distributed computing is a piece of code used for converting the parameters passed during a Remote Procedure
Call (RPC).
The main idea of an RPC is to allow a local computer (client) to remotely call procedures on a remote computer (server).
The client and server use different address spaces, so the parameters used in a function call have to be converted;
otherwise their values could not be used, because pointers into memory refer to different data on each machine. The
client and server may also use different data representations even for simple parameters (e.g., big-endian versus
little-endian integers). Stubs perform the conversion of the parameters, so that a remote function call looks like a
local function call.
Stub libraries must be installed on both the client and server side. A client stub is responsible for converting
(marshalling) the parameters used in a function call and deconverting (unmarshalling) the results passed back from the
server after execution of the function. A server skeleton, the stub on the server side, is responsible for unmarshalling
the parameters passed by the client and marshalling the results after execution of the function.
Stubs can be generated in one of two ways:

    1. Manually: the RPC implementer provides a set of translation functions from which a user can construct his or
       her own stubs. This method is simple to implement and can handle very complex parameter types.
    2. Automatically: the more commonly used method. An interface description language (IDL) is used to define the
       interface between client and server. For example, an interface definition has information to indicate whether
       each argument is input, output, or both: only input arguments need to be copied from client to server, and
       only output arguments need to be copied from server to client.
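As a hand-written illustration of the stub/skeleton split described above, the sketch below marshals a call as JSON over a socket; the add procedure and wire format are invented for the example:

    import json, socket

    # Client stub: marshals the arguments, ships them to the server, and
    # unmarshals the result, so the caller sees a plain local function.
    def add(a, b, host="localhost", port=9001):
        with socket.create_connection((host, port)) as s:
            s.sendall(json.dumps({"proc": "add", "args": [a, b]}).encode())
            return json.loads(s.recv(4096))["result"]

    # Server skeleton: unmarshals the request, invokes the real procedure,
    # and marshals the result back to the client.
    PROCEDURES = {"add": lambda a, b: a + b}

    def handle(conn):
        call = json.loads(conn.recv(4096))
        result = PROCEDURES[call["proc"]](*call["args"])
        conn.sendall(json.dumps({"result": result}).encode())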
Mutual exclusion

Mutual exclusion (often abbreviated to mutex) algorithms are used in concurrent programming to avoid the
simultaneous use of a common resource, such as a global variable, by pieces of computer code called critical sections. A
critical section is a piece of code in which a process or thread accesses a common resource.

A critical section by itself is not a mechanism or algorithm for mutual exclusion. A program, process, or thread can
have a critical section in it without any mechanism or algorithm that implements mutual exclusion.
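For instance, in Python a critical section can be guarded with a lock; without it, the read-modify-write on the shared counter can interleave across threads and lose updates (thread and iteration counts are illustrative):

    import threading

    counter = 0
    lock = threading.Lock()

    def increment(n):
        global counter
        for _ in range(n):
            with lock:              # critical section: one thread at a time
                counter += 1

    threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(counter)                  # always 400000 with the lock held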

Election Algorithms

The coordinator election problem is to choose a process from among a group of processes on different
processors in a distributed system to act as the central coordinator. An election algorithm is an algorithm
for solving the coordinator election problem. By the nature of the problem, any election algorithm must be
a distributed algorithm.

- a group of processes on different machines needs to choose a coordinator

- peer-to-peer communication: every process can send messages to every other process

- assume that processes have unique IDs, such that one is highest

- assume that the priority of process Pi is i

(a) Bully Algorithm
Background: any process Pi sends a message to the current coordinator; if there is no response in T time units, Pi tries
to elect itself as leader. Details follow:

Algorithm for process Pi that detected the lack of a coordinator

    1. Process Pi sends an "Election" message to every process with higher priority.
    2. If no other process responds, process Pi starts the coordinator code running and sends a message to all processes
       with lower priorities saying "Elected Pi".
    3. Else, Pi waits for T' time units to hear from the new coordinator, and if there is no response, it starts from
       step (1) again.


Algorithm for other processes (also called Pi)

       If Pi is not the coordinator, then Pi may receive either of these messages from Pj:

       If Pj sends "Elected Pj" [this message is only received if i < j]:

               Pi updates its records to say that Pj is the coordinator.

       Else if Pj sends an "Election" message (i > j):

               Pi sends a response to Pj saying it is alive.

               Pi starts an election.
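A compressed, single-threaded sketch of the bully outcome (the message exchange and timeouts are collapsed into an alive set; the IDs are illustrative):

    # Toy simulation: who wins a bully election started by `initiator`,
    # given the set of processes currently alive among IDs 1..n?
    def bully_election(initiator, alive, n):
        higher = [p for p in range(initiator + 1, n + 1) if p in alive]
        if not higher:
            return initiator        # nobody higher answered: initiator wins
        # some higher process answered; it bullies the rest in turn
        return bully_election(max(higher), alive, n)

    print(bully_election(2, alive={2, 3, 5}, n=7))   # 5, the highest alive ID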
(b) Election in a Ring => Ring Algorithm

- assume that processes form a ring: each process only sends messages to the next process in the ring

- active list: each process keeps a list of all the active processes it knows about

- assumption: a message continues around the ring even if a process along the way has crashed

Background: any process Pi sends a message to the current coordinator; if there is no response in T time units, Pi
initiates an election:

    1. Initialize the active list to empty.
    2. Send an "Elect(i)" message to the right, and add i to the active list.

If a process receives an "Elect(j)" message:

        (a) if this is the first message it has sent or seen:

                 initialize its active list to [i, j]; send "Elect(i)" and then forward "Elect(j)"

        (b) if i != j, add j to its active list and forward the "Elect(j)" message around the ring

        (c) otherwise (i = j): process i has the complete set of active processes in its active list

                 => choose the highest process ID and send an "Elected(x)" message to its neighbor

If a process receives an "Elected(x)" message:

        set coordinator to x


Example:

Suppose that we have four processes arranged in a ring: P1 → P2 → P3 → P4 → P1 → …

P4 is the coordinator.

Suppose P1 and P4 crash.

Suppose P2 detects that coordinator P4 is not responding:

P2 sets its active list to [ ]

P2 sends an "Elect(2)" message to P3; P2 sets its active list to [2]

P3 receives "Elect(2)"

This message is the first message seen, so P3 sets its active list to [2, 3]

P3 sends "Elect(3)" towards P4 and then forwards "Elect(2)" towards P4

The messages pass P4 and P1 (both crashed) and then reach P2

P2 adds 3 to its active list [2, 3]

P2 forwards "Elect(3)" to P3

P2 receives the "Elect(2)" message

       P2 chooses P3 as the highest process in its list [2, 3] and sends an "Elected(P3)" message

P3 receives the "Elect(3)" message

       P3 chooses P3 as the highest process in its list [2, 3] and sends an "Elected(P3)" message
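The example can be replayed with a compressed simulation in which a single Elect message accumulates the active list as it circles the ring, skipping crashed processes per the assumption above:

    # Toy simulation of the ring election; `ring` lists process IDs in order.
    def ring_election(ring, alive, initiator):
        n = len(ring)
        pos = ring.index(initiator)
        active = [initiator]                 # active list starts with self
        i = (pos + 1) % n
        while ring[i] != initiator:          # message travels around the ring
            if ring[i] in alive:             # crashed processes are skipped
                active.append(ring[i])
            i = (i + 1) % n
        return max(active)                   # highest ID becomes coordinator

    print(ring_election([1, 2, 3, 4], alive={2, 3}, initiator=2))   # 3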

Distributed Scheduling

Load balancing (computing)

Load balancing is a computer networking methodology for distributing workload across multiple computers or a computer
cluster, network links, central processing units, disk drives, or other resources, to achieve optimal resource utilization,
maximize throughput, minimize response time, and avoid overload. Using multiple components with load balancing,
instead of a single component, may increase reliability through redundancy. The load balancing service is usually
provided by dedicated software or hardware, such as a multilayer switch or a Domain Name System server.

Load Balancing




In a distributed system, some nodes may be lightly loaded, some moderately loaded, and some heavily loaded.

• The difference between one "N-times-faster" processor and a pool of N processors is interesting:
• while the arrival rate is N times the individual rate, the service rate is not, unless the pool is constantly
  maximally busy;
• but one N-times-faster processor can be much more expensive (or non-existent) than N slow processors.
• To see the problem of underutilization in the absence of load balancing, analyze N isolated systems: consider a
  system of N identical and independent M/M/1 servers, as in the sketch below.
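A quick check of that comparison, using the standard M/M/1 mean response time T = 1/(μ − λ) and illustrative rates:

    # Mean response time of a stable M/M/1 queue.
    def mm1_response(lam, mu):
        assert lam < mu, "queue must be stable"
        return 1.0 / (mu - lam)

    lam, mu, N = 4.0, 5.0, 10                  # per-node rates (illustrative)
    isolated = mm1_response(lam, mu)           # each of N isolated slow servers
    pooled   = mm1_response(N * lam, N * mu)   # one N-times-faster server
    print(isolated, pooled)   # 1.0 vs 0.1: the fast server is N times better,
    # because no job waits at one node while another node sits idle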
Issues in Load Distribution

Load

• Resource and CPU queue lengths are good indicators of load.
• Artificially increment the CPU queue length for transferred jobs on their way.
• Set timeouts for such jobs to safeguard against transfer failures.
• There is little correlation between queue length and CPU utilization for interactive jobs: use utilization instead.
• Monitoring CPU utilization is expensive.
• Modeling: Poisson process, Markov process, M/M/1 queue, M/M/N.

Classification of Algorithms

• Static -- decisions hard-wired into the algorithm using prior knowledge of the system
• Dynamic -- use state information to make decisions
• Adaptive -- a special case of dynamic algorithms; dynamically change the parameters of the algorithm

Load Sharing vs. Load Balancing

• Load sharing -- reduce the likelihood of an unshared state by transferring tasks to lightly loaded nodes
• Load balancing -- try to give each node approximately the same load

Preemptive vs. Nonpreemptive

• Preemptive transfers -- transfer of a task that is partially executed; expensive due to the collection of the task's state
• Nonpreemptive transfers -- only transfer tasks that have not begun execution

Components of Load Distribution

• Transfer policy -- threshold-based; determines whether a process should be executed remotely or locally
• Selection policy -- which task should be picked; the overhead in transferring the selected task should be offset by
  the reduction in its response time
• Location policy -- to which node the task should be sent; possibly use polling to find a suitable node
• Information policy -- when information about other nodes should be collected: demand-driven, periodic, or
  state-change-driven

       Demand-driven:
               nodes gather information about other nodes

               sender initiated
               receiver initiated
               symmetrically initiated

       Periodic:
               nodes exchange information periodically
       State-change-driven:
               nodes disseminate information when their state changes
• Stability -- queueing-theoretic or algorithmic perspective

Sender Initiated Algorithms

• overloaded node -- when a new task makes the queue length ≥ a threshold T
• underloaded node -- if accepting a task still maintains queue length < threshold T
• an overloaded node attempts to send a task to an underloaded node
• only newly arrived tasks are considered for transfer
• location policies (see the sketch after this list):

       random -- no information about other nodes
       threshold -- polling to determine whether a node is a receiver (underloaded)
       shortest -- a number of nodes are polled at random to determine their queue lengths

• information policy: demand-driven
• stability: polling increases activity, rendering the system unstable at high load
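A toy sketch of this threshold policy (queues, threshold, and poll limit are illustrative; a real system would poll peers over the network rather than inspect their queues directly):

    import random

    T = 5   # queue-length threshold (illustrative)

    def on_task_arrival(local_queue, peers, poll_limit=3):
        # Sender-initiated transfer: if accepting the new task would make
        # this node overloaded, poll a few random peers and hand the task
        # to the first one that would remain underloaded.
        if len(local_queue) + 1 < T:
            local_queue.append("task")         # accept locally
            return "local"
        for peer in random.sample(peers, min(poll_limit, len(peers))):
            if len(peer) + 1 < T:              # peer stays below threshold
                peer.append("task")
                return "transferred"
        local_queue.append("task")             # no receiver found: keep it
        return "local"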

Receiver Initiated Algorithms

• initiated by underloaded nodes
• an underloaded node tries to obtain a task from an overloaded node
• initiate the search for a sender either on a task departure or after a predetermined period
• information policy: demand-driven
• stability: remains stable at high and low loads
• disadvantage: most task transfers are preemptive

Symmetrically Initiated Algorithms

• senders search for receivers -- good in low-load situations, but high polling overhead in high-load situations
• receivers search for senders -- useful in high-load situations; a preemptive task transfer facility is necessary

Stable Symmetrically Initiated Algorithms

• use the information gathered during polling to classify nodes in the system as Sender/overloaded,
  Receiver/underloaded, or OK
Distributed Deadlock

A deadlock is a condition in a system where a process cannot proceed because it needs to obtain a resource held by
another process while it itself is holding a resource that the other process needs. More formally, four conditions have to
be met for a deadlock to occur in a system:

1. mutual exclusion - A resource can be held by at most one process.

2. hold and wait - Processes that already hold resources can wait for another resource.

3. non-preemption - A resource, once granted, cannot be taken away.

4. circular wait - Two or more processes are waiting for resources held by one of the other processes.

Resource allocation can be represented by directed graphs:

• R1 → P1 means that resource R1 is allocated to process P1.

• P1 → R1 means that resource R1 is requested by process P1.

Deadlock is present when the graph has cycles. An example is shown in Figure 1.
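Under the edge convention just given, detection reduces to finding a cycle in the directed graph; a depth-first search sketch (the example graph encodes P1 holding R1 and requesting R2, while P2 holds R2 and requests R1):

    # Detect a cycle in a resource-allocation graph given as adjacency lists.
    def has_cycle(graph):
        visited, on_stack = set(), set()
        def dfs(node):
            visited.add(node); on_stack.add(node)
            for nxt in graph.get(node, []):
                if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                    return True
            on_stack.discard(node)
            return False
        return any(dfs(n) for n in graph if n not in visited)

    g = {"R1": ["P1"], "P1": ["R2"], "R2": ["P2"], "P2": ["R1"]}
    print(has_cycle(g))   # True: the cycle means deadlock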




Deadlocks in distributed systems

The same conditions for deadlock in uniprocessor systems apply to distributed systems. Unfortunately, as with many other
aspects of distributed systems, deadlocks are harder to detect, avoid, and prevent. Tanenbaum proposes four strategies
for dealing with distributed deadlocks:

1. ignorance: ignore the problem (this is the most common approach).

2. detection: let deadlocks occur, detect them, and then deal with them.

3. prevention: make deadlocks impossible.

4. avoidance: choose resource allocation carefully so that deadlocks will not occur.

The last of these, deadlock avoidance through careful resource allocation, requires the ability to predict precisely the
resources that will be needed and the times at which they will be needed; this is difficult and not practical in real
systems. The first is trivially simple. We will focus on the middle two approaches.



DDBMS (distributed database management system)

A DDBMS (distributed database management system) is a centralized application that manages a distributed database as
if it were all stored on the same computer. The DDBMS synchronizes all the data periodically and, in cases where
multiple users must access the same data, ensures that updates and deletes performed on the data at one location are
automatically reflected in the data stored elsewhere.

DDBMS Advantages

• Data are located near the "greatest demand" site

• Faster data access

• Faster data processing

• Growth facilitation

• Improved communications

• Reduced operating costs

• User-friendly interface

• Less danger of a single-point failure

• Processor independence

DDBMS Disadvantages

• Complexity of management and control

• Security

• Lack of standards

• Increased storage requirements

• Greater difficulty in managing the data environment

• Increased training cost

Distributed Multimedia Systems

1. Introduction
  o     Definition: "A distributed multimedia system (DMS) is an integrated communication, computing, and
        information system that enables the processing, management, delivery, and presentation of synchronized
        multimedia information with quality-of-service guarantees."
  o     http://encyclopedia.jrank.org/articles/pages/6729/Distributed-Multimedia-Systems.html
2. Characteristics
     o       Delivering streams of multimedia data
              Audio samples, video frames
     o       Meeting the timing requirements
              QoS (Quality of Service)
     o       Flexibility (adapting to user needs)
     o       Availability
     o       Scalability
3. Factors that affect a system
     o       Server bandwidth
     o       Cache space
     o       Number of copies
     o       The number of clients
4. Basic Schema (diagram): wide-area gateway, video server, digital TV/radio server, video camera and mic, local networks
5. Typical infrastructure components for multimedia applications (diagram)
7. Different Designs and Architectures
     o       Database
     o       Proxy/information servers
     o       Clients
     o       Wired or wireless networks
8. Approaches
     o       Proxy-based approach
     o       Parallel or clustered servers approach
              Varies based on clip duration, number of clients, available bandwidth, etc.
     o       Caching

9. Quality of Service (QoS)
     o       DMMSs are real-time systems, as data must be delivered on time
     o       Not critical – some flexibility exists
     o       Loss is acceptable when resynchronization is possible
     o       "Acceptable" service is measured by:
              Bandwidth (throughput)
              Latency (access time)
              Data loss rate (acceptable loss ratio)

10. QoS Management
     o       "QoS management": the process of managing resources to meet the acceptable-service criteria.
     o       Resources include:
              CPU / processing power
              Network bandwidth
              Buffer memory (on both ends)
              Disk bandwidth
              Other factors affecting communication
11. Why do we need QoS?
  o       As multimedia becomes more widespread, the strain on the network increases!
  o       Networks provide insufficient QoS for the distribution of multimedia.
           Ethernet (wired or wireless) is best-effort
           Collisions, data loss, congestion, etc.
  o       For some multimedia applications, synchronization is vital.
12. QoS Managers
  o       Software that runs on network nodes and has two main functions:
           QoS negotiation: gets requirements from applications and checks feasibility against available resources.
           Admission control: if negotiation succeeds, provides a "resource contract" that guarantees reservation
            of resources for a certain time.
13. Ways to achieve QoS
  o       Buffering (on both ends)
  o       Compression
           More load on the nodes, but that is okay
  o       Bandwidth Reservation
  o       Resource Scheduling
  o       Traffic Shaping
  o       Flow Specifications
  o       Stream Adaptation
14. Traffic Shaping
  o       Output buffering at the source to keep data flowing smoothly.
  o       Two main algorithms (a token-bucket sketch follows below):
           Leaky bucket: guarantees that data flows at a constant rate without bursts -- completely eliminates
            bursty traffic.
           Token bucket: a variation of the leaky bucket in which tokens are generated, allowing some bursty
            traffic when bandwidth has gone unused for a period of time.
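A minimal token-bucket shaper sketch in Python (rate, capacity, and packet cost are illustrative); shrinking the capacity to a single packet's worth of tokens removes the burst allowance and approximates the leaky bucket's constant-rate behavior:

    import time

    class TokenBucket:
        # Tokens accrue at `rate` per second up to `capacity`; a packet may
        # be sent only by spending tokens, so short bursts up to the bucket
        # size are allowed while the long-run rate stays bounded by `rate`.
        def __init__(self, rate, capacity):
            self.rate, self.capacity = rate, capacity
            self.tokens, self.last = capacity, time.monotonic()

        def allow(self, n=1):
            now = time.monotonic()
            elapsed = now - self.last
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self.last = now
            if self.tokens >= n:
                self.tokens -= n
                return True       # send the packet now
            return False          # over the rate: buffer or drop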
15. Traffic Shaping (diagram)
16. Flow specifications
  o       RFC 1363 defines QoS parameters:
           Bandwidth
           Latency and jitter constraints
           Data loss limits
           Token bucket size
17. Stream Adaptation
  o       Adjust the data flow based on resource availability.
  o       Scaling
              Scale down content at the source to reduce the bandwidth required:
                Audio: reduce the audio sampling rate or drop channels
                Video: reduce the resolution or number of pixels; change the compression algorithm, color depth,
                 or color space; or use combinations of these
  o       Filtering
              One target asks the source to reduce quality for all the clients, even if some can handle higher quality.
              Suitable for more than one simultaneous target; guarantees the same QoS for all targets
18. Applications of DMMS
  o       Digital Libraries
  o       Distance learning
  o       Teleconferencing
  o       Video on Demand (VoD) & Video on Reservation (VoR)
  o       Pay Per View
  o       Audio Streaming
  o       Video Streaming
  o       E-commerce
  o       P2PTV
19. Voddler
  o       Video on Demand and Pay Per View
  o       Long movies
  o       Requires high bandwidth
  o       Hybrid P2P distribution network
20. Voddler (image: http://en.wikipedia.org/wiki/File:P2ptv.PNG)
21. YouTube, Platform
  o   Apache
  o   Python
  o   Linux
  o   MySQL
  o   Psyco
  o   lighttpd for video instead of Apache, because of overheads
22. YouTube, Serving Video
  o   Each video is hosted by a mini-cluster; each video is served by more than one machine.
  o   The most popular content is moved to a CDN (content delivery network).
  o   Less popular content (1-20 views per day) uses YouTube servers at various sites.
23. YouTube, Data Center Strategy
  o   Used managed hosting providers at first; living off credit cards, it was the only way.
  o   Managed hosting can't scale with you: you can't control the hardware or make favorable networking agreements.
  o   So they went to a colocation arrangement; now they can customize everything and negotiate their own contracts.
  o   Videos come out of any data center, not the closest match or anything. If a video is popular enough, it will move
      into the CDN.
