Unit-I Notes
UNIT I
INTRODUCTION
Introduction: Definition – Relation to computer system components – Motivation – Relation to parallel systems – Message-passing systems versus shared memory systems – Primitives for distributed communication – Synchronous versus asynchronous executions – Design issues and challenges. A model of distributed computations: A distributed program – A model of distributed executions – Models of communication networks – Global state – Cuts – Past and future cones of an event – Models of process communications.
1.1. Introduction:
1.1.1. Definition:
A distributed system is one in which components located at networked computers
communicate and coordinate their actions only by passing messages.
A distributed system consists of a collection of autonomous computers, connected through a
network and distribution middleware, which enables computers to coordinate their activities and to
share the resources of the system, so that users perceive the system as a single, integrated
computing facility.
Centralised System Characteristics
One component with non-autonomous parts.
Component shared by users all the time.
All resources accessible.
Software runs in a single process.
Single point of control.
Single point of failure.
Distributed System Characteristics
Multiple autonomous components.
Components are shared by all the users.
Some resources may not always be accessible.
Multiple points of control and failure.
Software runs in concurrent processes on different processors.
Common Characteristics
Certain common characteristics can be used to assess distributed systems:
Resource Sharing.
Openness.
Concurrency.
Scalability.
Fault Tolerance.
Transparency
Issues in distributed systems
Concurrency.
Adaptability: distributed systems function in a heterogeneous environment, so adaptability is a major issue.
Latency.
Memory considerations: the distributed system works on both local and shared memory.
Synchronization issues.
Security: since the components are widespread, security is a major issue.
Limits imposed on scalability.
They are less transparent.
QoS parameters:
Performance
Reliability
Availability
Security.
Features and Consequences
⚫ No Common Physical Clock:
When programs need to cooperate they coordinate their actions by exchanging
messages. Close coordination often depends on a shared idea of the time at which the
programs’ actions occur. But it turns out that there are limits to the accuracy with which the
computers in a network can synchronize their clocks – there is no single global notion of the
correct time. This is a direct consequence of the fact that the only communication is by
sending messages through a network.
⚫ No Shared Memory:
⚫ Distributed systems provide the abstraction of a common address space via the distributed shared memory abstraction.
⚫ Autonomy and Heterogeneity
⚫ Processors are loosely coupled.
⚫ They have different speeds and each can be running a different OS.
⚫ They are not part of a dedicated system.
⚫ They cooperate with one another by offering services or solving a problem jointly.
⚫ Concurrency
In a network of computers, concurrent program execution is the norm. The capacity of the system to handle shared resources can be increased by adding more resources (for example, computers) to the network. The coordination of concurrently executing programs that share resources is also an important challenge.
⚫ Independent Failure
All computer systems can fail, and it is the responsibility of system designers to plan
for the consequences of possible failures. Distributed systems can fail in new ways. Faults in
the network result in the isolation of the computers that are connected to it, but that doesn’t
mean that they stop running. In fact, the programs on them may not be able to detect
whether the network has failed or has become unusually slow. Similarly, the failure of a
computer, or the unexpected termination of a program somewhere in the system (a crash), is
not immediately made known to the other components with which it communicates. Each
component of the system can fail independently, leaving the others still running.
⚫ Geographical Separation:
⚫ The entities are geographically distributed.
⚫ It is not necessary for the processors to be connected by a wide-area network (WAN).
⚫ A Network of Workstations (NOW) or a Cluster of Workstations is also considered to be a distributed system.
⚫ The NOW configuration is becoming popular because low-cost, high-speed off-the-shelf processors are available.
⚫ The Google search engine is based on the NOW architecture.
1.1.2. Relation to computer system components
Figure 1 shows the typical distributed system model. Each computer has a processor (CPU), local memory, and an interface. All computers are connected by a communication network. Communication between two or more nodes is only by passing messages; no common memory is available. The distributed system uses a layered architecture to break down the complexity of system design. Each computer has a memory and processing unit, and the computers are connected by a communication network, typically a LAN or WAN. A distributed system is an information-processing system that contains a number of independent computers that cooperate with one another over a communication network in order to achieve a specific objective.
Usually, distributed systems are asynchronous, i.e., they do not use a common clock and do
not impose any bounds on relative processor speeds or message transfer times. Differences between the various computers and the ways in which they communicate are mostly hidden from users.
Figure 2 shows the relationship of the software components running on each computer, and the use of the local operating system and network protocol stack for functioning. Distributed software is also known as middleware. Users and applications can interact with a distributed system in a consistent and uniform way, regardless of where and when the interaction takes place. Each host executes components and operates a distribution middleware. Distributed execution is the execution of processes across the distributed system to collaboratively achieve a common goal. Middleware enables the components to coordinate their activities, so that users perceive the system as a single, integrated computing facility. A distributed system can consist of any number of possible configurations, such as mainframes, personal computers, workstations, minicomputers, and so on.
Figure 2: Relationship between software components, local OS and Network
protocol stack
Several libraries exist to choose from to invoke primitives for the most common functions of the middleware layer, such as reliable and ordered multicasting. Several standard middleware systems exist:
⚫ Object Management Group’s (OMG) Common Object Request Broker Architecture (CORBA)
⚫ Remote Procedure Call (RPC)
⚫ Some commercial standard middleware:
⚫ CORBA
⚫ Distributed Component Object Model (DCOM)
⚫ RMI (Remote Method Invocation)
⚫ Message Passing Interface (MPI)
1.1.3. Motivation
⚫ Inherently distributed computation
⚫ In many applications the parties are geographically distant, so the computation is inherently distributed.
⚫ Examples
⚫ Money transfer in banking
⚫ Reaching consensus among parties.
⚫ Resource sharing
⚫ Resources cannot be fully replicated at all sites, because full replication is often neither practical nor cost-effective.
⚫ Further, they cannot be placed at a single site, because access to that site might become a bottleneck.
⚫ Hence, such resources are distributed across the system.
⚫ Examples of such resources: peripherals, the complete data set in a database, special libraries, etc.
⚫ Access to geographically remote data and resources
⚫ In many scenarios, the data cannot be replicated at every site, because it may be too large or too sensitive to be replicated.
⚫ For example,
⚫ Payroll data within a multinational corporation is both too large and too sensitive to be replicated at every branch office/site.
⚫ Hence, such data is stored in a central database.
⚫ Similarly, special resources such as supercomputers exist only at certain locations.
⚫ Enhanced reliability
⚫ DS has inherent potential to provide increased reliability
⚫ Because of the possibility of replicating resources and execution
⚫ In reality, geographically distributed resources are not likely to crash/malfunction at
the same time under normal circumstances.
⚫ It has several aspects:
⚫ Availability – resources should be accessible at all time
⚫ Integrity – the value/ state of the resource should be correct
⚫ Fault tolerance – ability to recover from system failure.
⚫ Increased Performance/Cost ratio
⚫ The performance/cost ratio is increased.
⚫ Even though higher throughput has not necessarily been the main objective behind using a distributed system, any task can be partitioned across the various computers in the DS.
⚫ This provides a better performance/cost ratio than using a special parallel machine.
⚫ This is particularly true of the NOW configuration.
⚫ Scalability
⚫ Adding more processors does not pose a direct bottleneck for the communication network.
⚫ Distributed system can be extended through the addition of components, thereby
providing better scalability compared to centralized systems.
⚫ Speed
⚫ A distributed system may have more total computing power than a mainframe.
⚫ Reliability
⚫ If one machine crashes, the systems as a whole can still survive. It gives higher
availability and improved reliability.
⚫ Economics
⚫ A collection of microprocessors offers better price/performance than a mainframe.
⚫ A low price/performance ratio makes this a cost-effective way to increase computing power.
⚫ Incremental growth
⚫ Computing power can be added in small increments.
⚫ Modularity and incremental expandability
⚫ Heterogeneous processors may be easily added to the system without affecting performance, with the constraint that those processors run the same middleware algorithms.
⚫ Similarly, existing processors may be easily replaced by other processors.
1.1.4. Relation to Parallel Systems
Characteristics of Parallel Systems
Multiprocessor Systems:
A multiprocessor system is a parallel system in which the processors have direct access to shared memory, which forms a common address space. The processors do not have a common clock. Figure 3 shows the architecture of a multiprocessor system (UMA: Uniform Memory Access). Multiprocessor systems usually correspond to the UMA architecture: the access latency, i.e., the waiting time to complete an access to any memory location from any processor, is the same. The processors are in very close proximity and are connected by an interconnection network. Interprocess communication across processors occurs through shared memory (read and write operations), although message passing is also possible. Usually a bus interconnection is used to access the memory; though simple, a more efficient interconnection is usually a multistage switch with a symmetric and regular design. Two popular interconnection networks are:
⚫ Omega network
⚫ Butterfly network
Omega Networks
Figure 4 shows a 3-stage omega network. Omega networks are multi-stage networks formed of 2 × 2 switching elements. Each 2 × 2 switch allows data on either of the two input wires to be switched to the upper or the lower output wire. Only one data unit can be sent on an output wire in a single step; a collision occurs if data from both input wires is to be routed to the same output wire in a single step. Collisions can be addressed by buffering. Each 2 × 2 switch is represented as a rectangle. An n-input, n-output omega network uses log₂ n stages and log₂ n bits for addressing. Routing in a 2 × 2 switch at stage k uses only the kth bit of the destination address. Multi-stage networks can be expressed using an iterative or a recursive generating function.
Figure 4. Three Stage Omega Network
For the example omega network above, n = 8. In any stage, for outputs i where 0 ≤ i ≤ 3, output i is connected to input 2i of the next stage. For 4 ≤ i ≤ 7, output i of any stage is connected to input 2i + 1 − n of the next stage.
Routing Function:
In stage s, if the (s+1)th MSB (most significant bit) of the destination address j is 0, route the data on the upper output wire; else (the (s+1)th MSB of j is 1), route it on the lower output wire.
Butterfly Networks
Figure 5. Three Stage Butterfly Network
Butterfly Interconnection Function
In a butterfly network, the interconnection pattern between a pair of adjacent stages depends not only on n but also on the stage number s, whereas in an omega network it depends only on n. The routing function is recursive: if the (s+1)th MSB of the destination address j is 0, the data is routed to the upper output wire; otherwise it is routed to the lower output wire.
Figure 6 shows a multiprocessor system with Non-Uniform Memory Access (NUMA). It is a parallel system in which the processors do not have direct access to shared memory, and the memories may or may not form a common address space. These systems also do not have a common clock. The processors are in close physical proximity and are usually tightly coupled. The processors communicate either via the common address space or by message passing.
Figure 6 .Architecture of NUMA Multiprocessor System
Figure 7 shows a wrap-around 4×4 mesh. A k×k wrap-around mesh contains k² processors. The maximum path length between any two processors is 2(k/2 − 1). Routing is done along the Manhattan grid.
Figure 8. 4D Hypercube Interconnection
The Hamming distance between two processor labels is defined as the number of bit positions in which the two equal-sized bit strings differ; the path length between two processors equals their Hamming distance, which is clearly bounded by k, the dimension of the hypercube. For example, nodes 0101 and 1100 have a Hamming distance of 2, and the shortest path between them has length 2. Routing in the hypercube is done hop-by-hop: at any hop, the message can be sent along any dimension corresponding to a bit position in which the current node's address and the destination address differ.
Array Processors:
Array processors belong to a class of parallel computers that are physically co-located, are very tightly coupled, and have a common system clock (but may not share memory); they communicate by passing data as messages. Array processors and systolic arrays that perform tightly synchronized processing and data exchange in lock-step, for applications such as DSP and image processing, belong to this category. These applications usually involve a large number of iterations on the data. This class of parallel systems has a very niche market.
Flynn’s Taxonomy
Based on the number of instruction streams and data streams used, Flynn classified the architectures into four classes:
SIMD (Single Instruction stream, Multiple Data stream):
This mode corresponds to processing by multiple homogeneous processors that execute in lock-step on different data items. Applications that involve operations on large arrays and matrices, such as scientific applications, can exploit the SIMD mode of operation because the data sets can be partitioned easily. Several of the earliest parallel computers, such as vector processors, array processors and systolic arrays, belong to the SIMD class of processing.
MIMD (Multiple Instruction stream, Multiple Data stream):
In this mode, the various processors execute different code on different data. This is the mode of operation in distributed systems, as well as in the vast majority of parallel systems. There is no common clock among the system processors. Example: IBM SP machines.
Comparison: SIMD stands for Single Instruction stream, Multiple Data stream; MIMD stands for Multiple Instruction stream, Multiple Data stream.
Coupling:
The degree of coupling among a set of modules, whether hardware or software, is measured in terms of the interdependency, binding, and homogeneity among the modules; when the degree of coupling is high (low), the modules are said to be tightly (loosely) coupled. SIMD machines are tightly coupled, whereas distributed systems are loosely coupled.
Concurrency of a Program:
This is a broader term that means roughly the same as parallelism of a program, but it is used in the context of distributed programs. The parallelism/concurrency in a parallel/distributed program can be measured by the ratio of the number of local (non-communication and non-shared-memory access) operations to the total number of operations, including the communication or shared memory access operations.
Granularity:
Granularity is the ratio of the amount of computation to the amount of communication within the parallel/distributed program. The latency of waiting to synchronize with other processes is called synchronization latency. Computational granularity and communication latency are closely related.
⚫ A Pi–Pj message-passing can be emulated by a write by Pi to the mailbox and then a
read by Pj from the mailbox.
⚫ In the simplest case, these mailboxes can be assumed to have unbounded size.
⚫ The write and read operations need to be controlled using synchronization primitives
to inform the receiver/sender after the data has been sent/received.
o Both the sender and the receiver are blocked until the message is delivered.
o This is called a rendezvous.
o It allows for tight synchronization between processes.
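A hedged sketch of a rendezvous in Python (names are illustrative): the sender stays blocked until the receiver has actually taken the message, emulated here with an acknowledgement event:

import threading, queue

channel = queue.Queue(maxsize=1)   # the single message slot
delivered = threading.Event()      # receiver's acknowledgement

def sender():
    channel.put("request")         # hand over the message
    delivered.wait()               # block until the receiver confirms delivery
    print("sender unblocked: rendezvous complete")

def receiver():
    msg = channel.get()            # block until a message arrives
    print("receiver got:", msg)
    delivered.set()                # unblock the sender

ts, tr = threading.Thread(target=sender), threading.Thread(target=receiver)
ts.start(); tr.start(); ts.join(); tr.join()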
1.1.7. Synchronous versus asynchronous executions
Synchronous execution – features:
Lower and upper bounds on the execution time of processes can be set.
Transmitted messages are received within a known bounded time.
Drift rates between local clocks have a known bound.
Synchronous execution – Important Consequences:
⚫ There is a notion of global physical time with a known relative precision depending
on the drift rate.
⚫ The system has predictable behaviour in terms of timing.
⚫ Only such systems can be used for hard real-time applications.
⚫ It is possible and safe to use timeouts in order to detect failures of a process or a communication link.
Asynchronous execution – features:
No bounds are assumed on process execution speeds, message transmission delays, or clock drift rates; consequently, timeouts cannot reliably distinguish a failed component from a slow one.
1.1.8. Design issues and challenges
Challenges from System Perspectives:
⚫ Communication mechanisms
⚫ Processes
⚫ Naming
⚫ Synchronization
⚫ Data storage and access
⚫ Consistency and replication
⚫ Distributed systems security
⚫ Communication mechanisms
⚫ It involves designing appropriate mechanisms for communication among the processes in the network.
⚫ Example: RPC, Remote Object Invocation (ROI), Message-oriented vs stream-
oriented communication
⚫ Processes
⚫ The issues involved are code migration, process/thread management at clients and servers, and the design of software and mobile agents.
⚫ Naming:
⚫ Easy-to-use identifiers are needed to locate resources and processes transparently and scalably.
⚫ Synchronization :
⚫ Mechanisms for synchronization or coordination among the processes are essential.
⚫ Mutual exclusion is the classical example of synchronization.
⚫ Data storage and access:
⚫ Various schemes for data storage, searching and lookup should be fast and scalable across the network.
⚫ Consistency and replication:
⚫ To avoid bottlenecks and to provide fast access to data, replication is used; this also improves scalability.
⚫ It requires consistency management among the replicas.
⚫ Distributed system security:
⚫ Secure channels, access control, key management, authorization, and secure group management are the various methods used to provide security.
⚫ Designing a distributed system does not come for free; some challenges need to be overcome:
⚫ Transparency
⚫ Openness
⚫ Heterogeneity
⚫ Scalability
⚫ Security
⚫ Failure handling
⚫ Concurrency
Transparency
Transparency is defined as the hiding of the separation of components in a distributed system from the user and the application programmer.
With transparency the system is perceived as a whole rather than a collection of
independent components.
Transparency is an important goal.
Transparency: Description
Access: Hide differences in data representation and how a resource is accessed.
Location: Hide where a resource is located.
Migration: Hide that a resource may move to another location.
Relocation: Hide that a resource may be moved to another location while in use.
Replication: Hide that a resource is replicated.
Concurrency: Hide that a resource may be shared by several competitive users.
Failure: Hide the failure and recovery of a resource.
Mobility: Movement of resources and clients within a system without affecting the operation of users and programs, e.g., mobile phones.
Performance: Allows the system to be reconfigured to improve performance as loads vary.
Scaling: Allows the system and applications to expand in scale without change to the system structure or the application algorithms.
Goal of a distributed system:
o To connect users and resources in a transparent, open and scalable way.
Advantages of Transparency:
o Easier for the user
o Doesn’t have to bother with system topography
o Doesn’t have to know about changes
o Easier to understand
o Easier for the programmer
Disadvantages of Transparency
o Optimization cannot be done by programmer or user
o Strange behavior when the underlying system fails
o Underlying system can be very complex.
Openness
Openness is the characteristic that determines whether the system can be
extended.
It refers to the ability of plug and play.
Open DS: offers services according to standard rules that describe the syntax and
semantics of those services.
In DSs, services are generally specified through interfaces, which are often
described in an Interface Definition Language (IDL).
Here, the interfaces should be open,
i.e., they should be standardized.
E.g., the Internet protocols are documented in RFCs (Requests for Comments).
Heterogeneity
Heterogeneous components that must be able to interoperate, apply to all of the
following:
o Networks
o Hardware architectures
o Operating systems
o Programming language
Examples of mechanisms that mask differences in networks, operating systems, hardware and software, and thereby handle heterogeneity, are:
o Middleware
o Mobile code
o Virtual Machine
Middleware
Middleware applies to a software layer.
Middleware provides a programming abstraction.
Middleware masks the heterogeneity of the underlying networks, hardware,
operating systems and programming languages.
The Common Object Request Broker Architecture (CORBA) is a middleware example.
Mobile code
Mobile code is the code that can be sent from one computer to another and
run at the destination.
Java applets are an example of mobile code.
Virtual machine
Virtual machine provides a way of making code executable on any hardware.
Scalability
A system is described as scalable if it will remain effective when there is a
significant increase (or decrease) in the number of resources and the number of
users.
Scalability presents a number of challenges such as
o Controlling the cost of physical resources.
o Controlling the performance loss, e.g., when managing a set of data whose size is proportional to the number of users or resources in the system.
o Preventing software resources from running out, e.g., IPv4 addresses.
o Avoiding performance bottlenecks, e.g., DNS.
Caching and replication in the Web are examples of techniques for providing scalability.
Security
Security of a computer system is the characteristic that the resources are
accessible to authorized users and used in the way they are intended.
Security for information resources has three components:
Confidentiality
o Protection against disclosure to unauthorized individuals.
Integrity
o Protection against alteration or corruption.
Availability
o Protection against interference with the means to access the resources.
Failure Handling
Failures in distributed systems are partial, that is some components fail while
others continue to function.
Techniques for dealing with failures:
Detecting failures
o E.g. Checksums
Masking failures
o E.g. Retransmission of corrupt messages
o E.g. File redundancy
Tolerating failures
o E.g. Exception handling
o E.g. Timeouts
Recovery from Failure
o E.g. Rollback mechanisms
Redundancy
o E.g. Redundant components
Concurrency
Concurrency is the ability of different parts or units of a program, algorithm, or
problem to be executed out-of-order or in partial order, without affecting the
final outcome.
With concurrency, services and applications can be shared by clients in a
distributed system.
For an object to be safe in a concurrent environment, its operations must be
synchronized in such a way that its data remains consistent.
Concurrency can be achieved by standard techniques such as semaphores, which
are used in most operating systems.
1.2. A Model of Distributed Computations
Figure: a distributed program with processes P1 to P5 communicating by message passing.
⚫ The execution of a process consists of three types of events: internal events, message send events, and message receive events.
⚫ Let e_i^x denote the xth event at process pi.
⚫ The occurrence of events changes the states of the respective processes and channels.
⚫ An internal event changes the state of the process at which it occurs.
⚫ A send event changes the state of the process that sends the message and the state of the
channel on which the message is sent.
⚫ A receive event changes the state of the process that receives the message and the state of the
channel on which the message is received.
⚫ The events at a process are linearly ordered by their order of occurrence.
⚫ The execution of process pi produces a sequence of events e_i^1, e_i^2, ..., e_i^x, e_i^(x+1), ... and is denoted by Hi, where
⚫ Hi = (hi , →i )
⚫ hi is the set of events produced by pi and
⚫ binary relation →i defines a linear order on these events.
⚫ Relation →i expresses causal dependencies among the events of pi .
⚫ The send and the receive events signify the flow of information between processes and establish
causal dependency from the sender process to the receiver process.
⚫ A relation →msg that captures the causal dependency due to message exchange, is defined as
follows.
⚫ For every message m that is exchanged between two processes, we have,
⚫ send (m) →msg rec (m)
⚫ Relation →msg defines causal dependencies between the pairs of corresponding send and
receive events.
⚫ The evolution of a distributed execution is depicted by a space-time diagram.
⚫ Horizontal line represents progress of the process
⚫ Dot represents event
⚫ Slant arrow represents message transfer
⚫ Since we assume that an event execution is atomic (hence, indivisible and instantaneous), it is
justified to denote it as a dot on a process line.
⚫ In the Figure, for process p1, the second event is a message send event, the third event is an
internal event, and the fourth event is a message receive event.
Figure: space-time diagram of the distributed execution (horizontal lines show the progress of processes p1, p2, p3; dots are events e_i^x; slant arrows are message transfers).
⚫ The causal precedence relation induces an irreflexive partial order on the events of a distributed computation that is denoted as H = (H, →).
⚫ Note that the relation → is nothing but Lamport’s “happens before” relation.
⚫ For any two events ei and ej, if ei → ej, then event ej is directly or transitively dependent on event ei.
⚫ The relation → denotes the flow of information in a distributed computation:
⚫ ei → ej: all the information available at ei is potentially accessible at ej.
⚫ Example: e_1^1 → e_3^3 and e_3^3 → e_2^6.
⚫ Event e_2^6 has the knowledge of all other events in the figure.
⚫ ei ↛ ej: event ej is not directly or transitively dependent on event ei; i.e., event ei does not causally affect event ej.
⚫ If neither ei → ej nor ej → ei, the events are concurrent, denoted ei ǁ ej.
⚫ Example: e_1^3 ǁ e_3^3 and e_2^4 ǁ e_3^1.
⚫ In this case, event ej is not aware of the execution of ei or of any event executed after ei on the same process.
⚫ Two Rules
⚫ The relation ǁ is not transitive; that is, (ei ǁ ej) ∧ (ej ǁ ek) does not imply ei ǁ ek.
⚫ For any two events ei and ej in a distributed execution, exactly one of ei → ej, ej → ei, or ei ǁ ej holds.
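To make these rules concrete, the following hedged Python sketch (event names and the edge list are illustrative) computes happens-before as the transitive closure of the process-order and message edges, then classifies a pair of events as →, ←, or ǁ:

# Direct causality edges: process order (->i) plus send->receive (->msg).
edges = {("e1_1", "e1_2"), ("e1_2", "e1_3"),   # events on p1, in order
         ("e2_1", "e2_2"),                     # events on p2, in order
         ("e1_2", "e2_2")}                     # message: send e1_2, receive e2_2

events = sorted({e for pair in edges for e in pair})

# happens-before = transitive closure of the direct edges (Floyd-Warshall style).
hb = set(edges)
for k in events:
    for a in events:
        for b in events:
            if (a, k) in hb and (k, b) in hb:
                hb.add((a, b))

def relation(x, y):
    if (x, y) in hb: return "x -> y"
    if (y, x) in hb: return "y -> x"
    return "x || y"    # rule 2: not related either way => concurrent

print(relation("e1_1", "e2_2"))   # x -> y (via the message edge)
print(relation("e1_3", "e2_1"))   # x || y (concurrent)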
Logical vs. Physical Concurrency
⚫ In a distributed computation, two events are logically concurrent if and only if they do not
causally affect each other.
⚫ Physical concurrency, on the other hand, has a connotation that the events occur at the same
instant in physical time.
⚫ Two or more events may be logically concurrent even though they do not occur at the same
instant in physical time.
⚫ However, if the processor speeds and message delays had been different, the execution of these events could very well have coincided in physical time.
⚫ Whether a set of logically concurrent events coincide in the physical time
or not, does not change the outcome of the computation.
⚫ Therefore, even though a set of logically concurrent events may not have occurred at the same
instant in physical time, we can assume that these events occurred at the same instant in
physical time.
⚫ In the causal ordering model, messages are delivered in an order that is consistent with their causality relation. Causally ordered delivery of messages implies FIFO message delivery. (Note that CO ⊂ FIFO ⊂ Non-FIFO.)
⚫ Causal ordering model considerably simplifies the design of distributed algorithms because it
provides a built-in synchronization.
⚫ A receive event changes the state of the process that receives the message and the state of the channel on which the message is received.
⚫ The global state of a distributed computation is the set of local states of all the individual processes involved in the computation, plus the states of the communication channels.
Requirements of Global States:
Distributed Garbage Collection
Distributed Deadlock Detection
Distributed termination Detection
Distributed Debugging
Requirements of Global States - Distributed Garbage Collection:
⚫ An object is considered to be garbage if there are no longer any references to it anywhere in the
distributed system.
⚫ The memory taken up by that object can be reclaimed once it is known to be garbage.
⚫ To check that an object is garbage, we must verify that there are no references to it anywhere in
the system.
⚫ Consider the figure:
⚫ Process p1 has two objects that both have references – one has a reference within p1
itself, and p2 has a reference to the other.
⚫ Process p2 has one garbage object, with no
references to it anywhere in the system.
⚫ It also has an object for which neither p1 nor p2 has a reference, but there is a reference to it
in a message that is in transit between the processes.
⚫ This shows that when we consider properties of a system, we must include the state of
communication channels as well as the state of the processes.
Requirements of Global States - Distributed termination detection:
⚫ Detecting whether a distributed computation has terminated requires observing a consistent global state: all processes must be idle and no messages may be in transit.
⚫ The channel state SC_ij denotes all messages that pi sent up to event e_i^x and which process pj had not received until event e_j^y.
⚫ The global state of a distributed system is a collection of the local states of the processes and the channels.
⚫ Notationally, the global state GS is defined as GS = { ∪i LS_i , ∪i,j SC_ij }, the union of the local states and the channel states.
⚫ For a global state to be meaningful, the states of all the components of the distributed system must be recorded at the same instant.
⚫ This will be possible if the local clocks at processes were perfectly synchronized or if there
were a global system clock that can be instantaneously read by the processes. (However,
both are impossible.)
⚫ If, at the time of balance determination, an amount transferred from branch A is in transit to branch B,
⚫ the result is a false reading.
⚫ All messages in transit must be examined at the time of observation:
⚫ the total consists of the balances at both branches plus the amount in the message.
⚫ Each amount sent and just received must be added only one time.
⚫ If the clocks at the two branches are not perfectly synchronized, an amount transferred at 3:01 from branch A
⚫ may arrive at branch B at 2:59 according to B's clock;
⚫ at 3:00 the amount is then counted twice.
⚫ The global history of the system is the union of the individual process histories:
⚫ H = h0 ∪ h1 ∪ … ∪ hn−1
⚫ A cut of a system's execution is a subset of its global history, formed as a prefix union:
⚫ C = h_1^(c1) ∪ h_2^(c2) ∪ … ∪ h_n^(cn), where h_i^(ci) is the prefix of history hi up to and including event e_i^(ci).
⚫ The set of events {e_i^(ci) : i = 1, 2, ..., N} is called the frontier of the cut.
⚫ Inconsistent cut
⚫ An inconsistent cut violates temporal causality.
⚫ Example: at p2 the cut includes the receipt of message m1, but at p1 it does not include the sending of that message.
⚫ Consistent cut
⚫ A consistent cut cannot violate temporal causality: for every receive event it contains, it also contains the corresponding send event.
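As a small illustration, the following hedged Python sketch (process names, event indices, and the message list are illustrative) checks whether a cut, given by its frontier indices, is consistent, i.e., contains no receive event whose matching send lies outside the cut:

# Each message is (sender, send_index, receiver, recv_index); the k-th
# event of process p is "in the cut" iff k <= frontier[p].
messages = [("p1", 2, "p2", 3)]   # m1: p1's 2nd event sends, p2's 3rd receives

def is_consistent(frontier: dict, msgs) -> bool:
    """A cut is consistent iff every receive inside the cut has its
    matching send inside the cut as well (no 'orphan' messages)."""
    for snd, s_idx, rcv, r_idx in msgs:
        if r_idx <= frontier[rcv] and s_idx > frontier[snd]:
            return False          # receive included, send excluded
    return True

print(is_consistent({"p1": 1, "p2": 3}, messages))  # False: inconsistent cut
print(is_consistent({"p1": 2, "p2": 3}, messages))  # True: consistent cut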
Steps (marker-based snapshot recording, as in the Chandy–Lamport algorithm):
1. The initiator Pi records its own local state and sends a marker along each of its outgoing channels.
2. On receiving a marker along channel Chji, if Pi has not yet recorded its state, then Pi records its local state and records the state of channel Chji as the empty set.
3. Pi then sends a marker along each of its outgoing channels (done only once, when it first records its state).
4. Else (Pi has already recorded its state):
a. Pi records the state of Chji as the set of all messages it has received over Chji since it saved its state.
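A minimal single-process sketch of these marker rules in Python (class and callback names are illustrative; message transport is abstracted behind a send callback, so this shows the rules rather than a complete runtime):

class SnapshotProcess:
    """One process's marker rules for snapshot recording (sketch)."""

    MARKER = "MARKER"

    def __init__(self, pid, in_channels, out_channels, send):
        self.pid = pid
        self.in_channels = set(in_channels)     # ids of incoming channels
        self.out_channels = list(out_channels)  # ids of outgoing channels
        self.send = send                        # send(channel_id, msg) callback
        self.local_state = None                 # None => state not yet recorded
        self.channel_state = {}                 # in-channel -> recorded messages
        self.marker_seen = set()                # in-channels done recording

    def _record_state(self):
        # Save the local state, start recording every incoming channel,
        # and flood a marker on every outgoing channel (steps 1-3).
        self.local_state = {"pid": self.pid}    # placeholder local state
        for ch in self.in_channels:
            self.channel_state[ch] = []
        for ch in self.out_channels:
            self.send(ch, self.MARKER)

    def initiate_snapshot(self):                # step 1 (initiator)
        self._record_state()

    def on_receive(self, ch, msg):
        if msg == self.MARKER:
            if self.local_state is None:        # step 2: first marker seen
                self._record_state()
                self.channel_state[ch] = []     # this channel recorded empty
            self.marker_seen.add(ch)            # step 4: ch's recording is done
        elif self.local_state is not None and ch not in self.marker_seen:
            # In-flight message belongs to ch's recorded state: received
            # after we saved our own state but before ch's marker arrived.
            self.channel_state[ch].append(msg)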
1.2.6. Past and Future Cones of an Event
⚫ An event ej could have been affected only by events ei such that ei → ej.
⚫ In this situation, all the information available at ei could be made accessible at ej.
⚫ All such events ei belong to the past of ej.
⚫ Let Past(ej) denote all events in the past of ej in a computation (H, →). Then,
⚫ Past(ej) = {ei | ∀ei ∈ H, ei → ej}.
⚫ The figure shows the past of an event ej; symmetrically, the events that can be causally affected by ej form its future, Future(ej) = {ei | ∀ei ∈ H, ej → ei}.
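Reusing the happens-before idea from earlier, a hedged sketch of computing Past(ej) from the direct causality edges (the edge list is illustrative):

def past_cone(target, edges):
    """Past(target): all events with a directed causality path to target.
    Computed by walking the reversed edges backward from target."""
    preds = {}
    for a, b in edges:
        preds.setdefault(b, set()).add(a)
    past, stack = set(), [target]
    while stack:
        for p in preds.get(stack.pop(), ()):
            if p not in past:
                past.add(p)
                stack.append(p)
    return past

edges = [("e1_1", "e1_2"), ("e1_2", "e2_2"), ("e2_1", "e2_2")]
print(past_cone("e2_2", edges))   # {'e1_1', 'e1_2', 'e2_1'}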
1.3. Logical Time
1.3.1. Introduction
We require computers around the world to timestamp electronic commerce transactions
consistently. Time is also an important theoretical construct in understanding how distributed
executions unfold. But time is problematic in distributed systems. Each computer may have its own
physical clock, but the clocks typically deviate, and we cannot synchronize them perfectly. The
absence of global physical time makes it difficult to find out the state of our distributed programs as
they execute. We often need to know what state process A is in when process B is in a certain state,
but we cannot rely on physical clocks to know what is true at the same time.
Time is an important and interesting issue in distributed systems, for several reasons. First,
time is a quantity we often want to measure accurately. In order to know at what time of day a
particular event occurred at a particular computer it is necessary to synchronize its clock with an
authoritative, external source of time.
Algorithms that depend upon clock synchronization have been developed for several problems in distributed systems. These include maintaining the consistency of distributed data and checking the authenticity of a request sent to a server.
Measuring time can be problematic due to the existence of multiple frames of reference: the relative order of two events can even be reversed for two different observers. But this cannot happen if one event causes the other to occur; the physical effect follows the physical cause for all observers, although the time elapsed between cause and effect can vary. The timing of physical events was thus proved to be relative to the observer.
Clock drift: clocks count time at different rates and so diverge (their frequencies of oscillation differ).
Clock drift rate: the difference per unit of time from some ideal reference clock.
– Ordinary quartz clocks drift by about 1 sec in 11–12 days (10^−6 secs/sec).
– High-precision quartz clocks have a drift rate of about 10^−7 or 10^−8 secs/sec.
⚫ Definition
⚫ A system of logical clocks consists of:
⚫ a time domain T, and
⚫ a logical clock C.
⚫ The elements of T form a partially ordered set over a relation <.
⚫ The relation < is called the happened before or causal precedence relation.
⚫ This relation is analogous to the “earlier than” relation in physical time.
⚫ The logical clock C is a function.
⚫ It maps an event e in a distributed system to an element in the time domain T.
⚫ It is denoted as C(e), called the timestamp of e, and is defined as follows:
⚫ C : H → T
⚫ such that the following property is satisfied:
⚫ for two events ei and ej , ei → ej ⇒ C(ei) < C(ej).
⚫ This monotonicity property is called the clock consistency condition.
⚫ When T and C satisfy the following condition, the system of clocks is said to be strongly consistent:
⚫ for two events ei and ej , ei → ej ⇔ C(ei) < C(ej).
⚫ Implementing Logical Clocks
⚫ Implementation of logical clocks requires addressing two issues:
⚫ local data structures to represent logical time in every process, and
⚫ a protocol to update the data structures to ensure the consistency condition.
⚫ Each process pi maintains data structures that allow it the following two
capabilities:
⚫ A local logical clock(lci )
⚫ It helps process pi measure its own progress.
⚫ A logical global clock(gci )
⚫ It is a representation of process pi ’s local view of the logical global
time.
⚫ Typically, lci is a part of gci .
⚫ The protocol ensures that a process’s logical clock, and thereby its view of the global time, is managed consistently.
⚫ The protocol consists of the following two rules:
⚫ R1: This rule governs how the local logical clock is updated by a process when it executes an event.
⚫ R2: This rule governs how a process updates its global logical clock to update its view of the global time and global progress; it is applied when a process receives a message.
⚫ Systems of logical clocks differ in their representation of logical time and also in the protocol to update the logical clocks.
Scalar time (Lamport’s logical clock):
o R1: before executing an event (send, receive, or internal), process pi updates its clock as Ci := Ci + d (d > 0, typically d = 1).
o R2: when process pi receives a message with timestamp Cmsg, it executes:
Ci := max (Ci , Cmsg );
execute R1;
deliver the message.
o The figure shows the evolution of scalar time.
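A compact hedged sketch of these rules in Python, assuming d = 1 (class and method names are illustrative); the closing comment anticipates the (t, i) tie-break used below for total ordering:

class ScalarClock:
    """Lamport scalar clock: rules R1 and R2 with increment d = 1."""

    def __init__(self, pid: int):
        self.pid = pid
        self.c = 0

    def tick(self) -> int:              # R1: applied before any event
        self.c += 1
        return self.c

    def send(self):                     # R1, then piggyback timestamp
        return (self.tick(), self.pid)

    def receive(self, c_msg: int) -> int:
        self.c = max(self.c, c_msg)     # R2: max with message timestamp...
        return self.tick()              # ...then execute R1 and deliver

p1, p2 = ScalarClock(1), ScalarClock(2)
ts, sender = p1.send()                  # p1 sends at time 1
print(p2.receive(ts))                   # p2's clock jumps to max(0, 1) + 1 = 2
# Total-order tie-break: event (t, i) precedes (t', j) iff t < t',
# or t == t' and i < j.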
Lamport’s Algorithm
Basic Properties
Consistency Property
⚫ Scalar clocks satisfy the monotonicity and hence the consistency property: for two
events ei and ej , ei → ej =⇒ C(ei ) < C(ej ).
Total Ordering
⚫ Scalar clocks can be used to totally order events.
⚫ The main problem in totally ordering events is that two or more events at different
processes may have identical timestamp.
⚫ For example, in the previous figure, the third event of process P1 and the second event of process P2 have identical scalar timestamps.
⚫ A tie-breaking mechanism is needed to order such events.
⚫ A tie is broken as follows:
⚫ Process identifiers are linearly ordered.
⚫ Tie among events with identical scalar timestamp is broken on the basis of
their process identifiers.
⚫ The lower the process identifier in the ranking, the higher the priority.
⚫ The timestamp of an event is denoted by a tuple (t, i), where
⚫ t is the time of occurrence, and
⚫ i is the identity of the process where it occurred.
⚫ The total order relation ≺ on two events x and y with timestamps (h,i) and (k,j) is defined as: x ≺ y ⇔ (h < k) ∨ (h = k ∧ i < j).
⚫ Event counting
⚫ If the increment value d is always 1, the scalar time has the following interesting
property:
⚫ if event e has a timestamp h, then h-1 represents the minimum logical
duration, counted in units of events, required before producing the event e;
⚫ We call it the height of the event e.
⚫ In other words, h-1 events have been produced sequentially before the event e
regardless of the processes that produced these events.
⚫ For example, in previous figure, five events precede event b on the longest causal
path ending at b.
⚫ No Strong Consistency
⚫ Scalar clocks are not strongly consistent,
⚫ i.e., for two events ei and ej , C(ei) < C(ej) does not imply ei → ej.
⚫ For example, in above figure, the third event of process P1 has smaller scalar
timestamp than the third event of process P2.
⚫ However, the former did not happen before the latter.
⚫ The reason that scalar clocks are not strongly consistent is that the logical local clock and the logical global clock of a process are squashed into one, resulting in the loss of causal dependency information among events at different processes.
⚫ For example, in the above Figure, when process P2 receives the first message from
process P1, it updates its clock to 3, forgetting that the timestamp of the latest event
at P1 on which it depends is 2.
⚫ Comparing vector timestamps: if event x occurs at process pi with timestamp vh and event y occurs at process pj with timestamp vk, then
⚫ x → y ⇔ vh[i] ≤ vk[i]
⚫ x ǁ y ⇔ vh[i] > vk[i] ∧ vh[j] < vk[j]
⚫ Isomorphism
⚫ If events in a distributed system are timestamped using a system of vector clocks, we have the following property: if two events x and y have timestamps vh and vk, respectively, then
⚫ x → y ⇔ vh < vk and x ǁ y ⇔ vh ǁ vk.
⚫ Thus, there is an isomorphism between the set of partially ordered events produced by a distributed computation and their vector timestamps (see the sketch after this list).
⚫ Strong Consistency
⚫ The system of vector clocks is strongly consistent; thus, by examining the vector
timestamp of two events, we can determine if the events are causally related.
⚫ However, Charron-Bost showed that the dimension of vector clocks cannot be less
than n, the total number of processes in the distributed computation, for this
property to hold.
⚫ Event Counting
⚫ If d=1 (in rule R1), then the i th component of vector clock at process pi , vti [i ],
denotes the number of events that have occurred at pi until that instant.
⚫ So, if an event e has timestamp vh, then vh[j] denotes the number of events executed by process pj that causally precede e. Clearly, Σj vh[j] − 1 represents the total number of events that causally precede e in the distributed computation.
⚫ Efficient Implementations of Vector Clocks
⚫ If the number of processes in a distributed computation is large, then vector clocks
will require piggybacking of huge amount of information in messages.
⚫ The message overhead grows linearly with the number of processors in the system
and when there are thousands of processors in the system, the message size becomes
huge even if there are only a few events occurring in few processors.
⚫ We discuss an efficient way to maintain vector clocks.
⚫ Charron-Bost showed that if vector clocks have to satisfy the strong consistency
property, then in general vector timestamps must be at least of size n, the total
number of processes.
⚫ However, optimizations are possible, and next we discuss a technique to implement vector clocks efficiently.
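A hedged sketch of a vector clock with d = 1 (names are illustrative), together with the causality test that the strong consistency property makes possible:

class VectorClock:
    """Vector clock for n processes; process pid owns component pid."""

    def __init__(self, pid: int, n: int):
        self.pid = pid
        self.v = [0] * n

    def tick(self):                     # R1: local event increments own entry
        self.v[self.pid] += 1
        return list(self.v)             # copy serves as the event's timestamp

    def send(self):                     # piggyback the entire vector
        return self.tick()

    def receive(self, v_msg):           # R2: componentwise max, then R1
        self.v = [max(a, b) for a, b in zip(self.v, v_msg)]
        return self.tick()

def happened_before(vh, vk):
    return all(a <= b for a, b in zip(vh, vk)) and vh != vk

def concurrent(vh, vk):
    return not happened_before(vh, vk) and not happened_before(vk, vh)

p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
x = p0.send()                  # x has timestamp [1, 0]
y = p1.receive(x)              # y has timestamp [1, 1]
print(happened_before(x, y))   # True: causality read off the timestamps alone
print(concurrent(x, y))        # False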
Clock synchronization has a significant effect on many problems:
o Secure systems
o Fault diagnosis and recovery
o Scheduled operations
o Database systems
o Real-world clock values.
Clock synchronization is the process of ensuring that physically distributed processors have
a common notion of time.
Because the clocks at various sites may diverge with time (their clock rates differ), clock synchronization must be performed periodically to correct the resulting clock skew.
Clocks are synchronized to an accurate real-time standard like UTC (Universal Coordinated
Time).
Clocks that must not only be synchronized with each other but also have to adhere to
physical time are termed physical clocks.
Coordinated Universal Time (UTC):
o UTC is an international standard for time keeping
o It is based on atomic time, but occasionally adjusted to astronomical time
o International Atomic Time is based on very accurate physical clocks (drift rate about 10^−13).
o It is broadcast from radio stations on land and satellite (e.g. GPS).
o Computers with receivers can synchronize their clocks with these timing signals (by
requesting time from GPS/UTC source).
o Signals from land-based stations are accurate to about 0.1–10 milliseconds; signals from GPS are accurate to about 1 microsecond.
⚫ Definitions and Terminology
⚫ Let Ca and Cb be any two clocks.
⚫ Time: The time of a clock in a machine p is given by the function Cp (t),
where Cp (t) = t for a perfect clock.
⚫ Frequency: Frequency is the rate at which a clock progresses. The frequency at time t of clock Ca is C′a(t).
⚫ Offset: Clock offset is the difference between the time reported by a clock
and the real time. The offset of the clock Ca is given by Ca (t) − t. The offset
of clock Ca relative to Cb at time t ≥ 0 is given by Ca(t) − Cb (t).
⚫ Skew: The skew of a clock is the difference in the frequencies of the clock and the perfect clock. The skew of a clock Ca relative to clock Cb at time t ≥ 0 is C′a(t) − C′b(t).
⚫ Clock Inaccuracies
⚫ Physical clocks are synchronized to an accurate real-time standard like UTC
(Universal Coordinated Time).
⚫ However, due to the clock inaccuracies discussed above, a timer (clock) is said to be working within its specification if
1 − ρ ≤ dC/dt ≤ 1 + ρ,
where the constant ρ is the maximum skew rate specified by the manufacturer.
⚫ Offset delay estimation method
⚫ The Network Time Protocol (NTP) which is widely used for clock synchronization
on the Internet uses the Offset Delay Estimation method.
⚫ The design of NTP involves a hierarchical tree of time servers.
⚫ The primary server at the root synchronizes with the UTC.
⚫ The next level contains secondary servers, which act as a backup to the
primary server.
⚫ At the lowest level is the synchronization subnet which has the clients.
⚫ Strata: the hierarchy levels of the time servers.
⚫ In symmetric mode, the offset and delay are estimated as θ = (a + b)/2 and δ = a − b, where a = T(i−2) − T(i−3) and b = T(i−1) − T(i).
⚫ Thus, both peers A and B can independently calculate the delay and offset using a single bidirectional message stream, as shown in the figure.
⚫ A pair of servers in symmetric mode exchange pairs of timing messages.
⚫ A store of data is then built up about the relationship between the two servers (pairs of
offset and delay).
⚫ Specifically, assume that each peer maintains pairs (Oi ,Di ), where
⚫ Oi - measure of offset (θ)
⚫ Di - transmission delay of two messages (δ).
⚫ The offset corresponding to the minimum delay is chosen.
⚫ Specifically, the delay and offset are calculated as follows.
⚫ Assume that message m takes time t to transfer and m′ takes t′ to transfer.
⚫ The offset between A’s clock and B’s clock is O. If A’s local clock time is A(t) and B’s local clock time is B(t), we have A(t) = B(t) + O.
⚫ Then,
⚫ T(i−2) = T(i−3) + t + O
⚫ T(i) = T(i−1) − O + t′
⚫ Assuming t = t′, the offset Oi can be estimated as:
⚫ Oi = (T(i−2) − T(i−3) + T(i−1) − T(i))/2
⚫ The round-trip delay is estimated as:
⚫ Di = (T(i) − T(i−3)) − (T(i−1) − T(i−2))
⚫ The eight most recent pairs of (Oi , Di ) are retained.
⚫ The value of Oi that corresponds to minimum Di is chosen to estimate O.
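A hedged Python sketch of these two estimates (the function name, parameter names, and sample timestamp values are illustrative, not from the source):

def ntp_offset_delay(t_im3, t_im2, t_im1, t_i):
    """Estimate offset O_i and round-trip delay D_i from one exchange.
    t_im3 = T(i-3): A sends request (A's clock)
    t_im2 = T(i-2): B receives request (B's clock)
    t_im1 = T(i-1): B sends reply (B's clock)
    t_i   = T(i)  : A receives reply (A's clock)"""
    offset = ((t_im2 - t_im3) + (t_im1 - t_i)) / 2
    delay = (t_i - t_im3) - (t_im1 - t_im2)
    return offset, delay

# Retain the eight most recent (O_i, D_i) pairs and choose the offset
# that corresponds to the minimum delay, as the text prescribes.
samples = [ntp_offset_delay(0.000, 0.510, 0.520, 0.040)]   # illustrative times
best_offset = min(samples, key=lambda od: od[1])[0]
print(best_offset)   # 0.495: B's clock is estimated ~0.495 s ahead of A's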