Mod 3 Part 1
Synchronization in a Distributed System
▪ Clock synchronization
▪ Event Ordering
▪ Mutual Exclusion
▪ Election algorithm
▪ Every computer needs a timer (computer clock) to keep track of the current time.
▪ In a distributed system an application may have
processes that concurrently run on multiple nodes
of the system.
▪ For correct results it is required that the clocks of
the nodes are synchronized with each other.
▪ Eg : to find the time taken for a message to
transmit from one node to another.
▪ Correct results are impossible if the clocks are not synchronized.
▪ Assume case 1 : Clocks are synchronized
▪ Problem: find the time taken to transmit a message from the sender to the receiver.
▪ Let the current time on the sender be 10:15.
▪ Since the clocks are perfectly synchronized, the current time on the receiver is also 10:15.
▪ The message is transmitted from the sender with timestamp 10:15.
▪ The message is received at the receiver's node at 10:17 (according to the receiver's clock).
▪ Time taken = 2 min.
▪ Assume case 2 : Clocks are not synchronized
▪ Also assume the same sender and the same receiver.
▪ Let the current time on the sender be 10:15.
▪ Since the clocks are not synchronized, the current time on the receiver is 10:16 (the receiver is 1 min ahead of the sender).
▪ The message is transmitted from the sender with timestamp 10:15.
▪ The message is received at the receiver's node (after 2 min) at 10:18 (according to the receiver's clock).
▪ Time taken is incorrectly computed to be 3 min.
Reason : CLOCKS ARE NOT SYNCHRONIZED
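The two cases can be reproduced in a few lines of Python; the datetimes below are hypothetical values reusing the slide's example times.

```python
from datetime import datetime

def measured_latency(sender_timestamp, receiver_arrival_time):
    # The receiver can only subtract the sender's timestamp from its OWN clock reading.
    return receiver_arrival_time - sender_timestamp

send_ts = datetime(2024, 1, 1, 10, 15)  # hypothetical date; only the clock time matters

# Case 1: clocks synchronized -- the receiver's clock reads 10:17 on arrival.
print(measured_latency(send_ts, datetime(2024, 1, 1, 10, 17)))  # 0:02:00 (correct)

# Case 2: the receiver's clock is 1 min ahead -- it reads 10:18 on arrival.
print(measured_latency(send_ts, datetime(2024, 1, 1, 10, 18)))  # 0:03:00 (wrong by the 1 min skew)
```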
How are computer clocks implemented?
A computer clock consists of 3 components :
1. A quartz crystal that oscillates at a well-defined frequency. The frequency depends on the kind of crystal, how it is cut, etc.
2. A constant register – used to store a constant value.
3. A counter register – used to keep track of the oscillations of the quartz crystal.
▪ Each oscillation of the crystal decrements the counter register by one.
▪ When the counter reaches zero, 2 events occur:
1. An interrupt is generated and one time unit is added to the current time.
2. The counter register is reloaded with the value in the constant register.
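A toy Python model of this counter/constant-register mechanism may make the tick generation clearer; the reload value below is an arbitrary illustrative number, not a real crystal frequency.

```python
# Toy software model of the hardware mechanism described above (not real hardware).

CONSTANT_REGISTER = 1000        # value stored in the constant register (illustrative)
counter_register = CONSTANT_REGISTER
current_time = 0                # time units counted since start

def on_oscillation():
    """Called once per oscillation of the quartz crystal."""
    global counter_register, current_time
    counter_register -= 1                     # each oscillation decrements the counter
    if counter_register == 0:
        current_time += 1                     # "interrupt": add one time unit to current time
        counter_register = CONSTANT_REGISTER  # reload the counter from the constant register

for _ in range(3 * CONSTANT_REGISTER):        # simulate 3 ticks' worth of oscillations
    on_oscillation()
print(current_time)                           # 3
```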
▪ Within a single-CPU system, it does not matter much if this clock is off by a small amount. Since all processes use the same clock, they will be internally consistent.
▪ In a distributed system with n CPUs there are n crystals, and each one may oscillate at a different frequency.
▪ Crystals running at different rates cause the clocks to gradually drift out of sync and give different values when read.
▪ This difference in the time values of two clocks is called clock skew.
▪ External synchronization: clocks are synchronized with an authoritative external time source, Coordinated Universal Time (UTC).
▪ UTC is an international standard.
▪ Time zones around the world are expressed as positive or negative offsets from UTC.
▪ To provide UTC, the National Institute of Standards and Technology (NIST) operates a radio station, WWV, which broadcasts the time on several short-wave frequencies.
▪ If one machine has a WWV receiver, all machines can be synchronized to it.
▪ Hence a computer clock must be periodically resynchronized with the real time to keep it non-faulty.
• Assume that when the UTC time is t, the time value of a clock on machine p is Cp(t).
• If all clocks in the world were perfectly synchronized, we would have Cp(t) = t for all p and all t.
• In the ideal case dC/dt = 1, where dC/dt is the drift rate.
• If the maximum allowable drift rate is ρ, then a clock is said to be non-faulty if the following condition holds:
1 − ρ ≤ dC/dt ≤ 1 + ρ
Figure: the relation between clock time and UTC when clocks tick at different rates.
Let the maximum allowable drift rate ρ be 0.5, i.e. in 1 min a drift of 30 sec is allowed; i.e. after 1 min of UTC time a slow clock may show only 30 sec of elapsed time, while a fast clock may show 1 min 30 sec.
If the current UTC time is 10:15, assume that at this point all clocks in the distributed system are synchronized.
After synchronization, all clocks drift away from UTC; some become slower than UTC and some become faster than UTC.
In a distributed system, if one clock is slow and one is fast, then at a time Δt after they were synchronized the maximum deviation between the time values of the two clocks will be 2ρΔt.
Justification :
Let the current UTC time be 10:15, and assume all clocks in the system are synchronized to it.
Let ρ = 0.5.
After 2 min, UTC = 10:17.
But a slow clock will show 10:16 and a fast clock will show 10:18.
Hence the difference in time values is 2 min (2 × 0.5 × 2).
Hence, to guarantee that no two clocks in a set of clocks ever differ by more than δ, the clocks in the set must be resynchronized periodically, with the time interval Δt between two synchronizations satisfying
Δt ≤ δ / 2ρ
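Since the maximum deviation after an interval Δt is 2ρΔt, the bound follows in one line; δ here denotes the largest clock difference the application tolerates, and the worked numbers reuse the slide's ρ = 0.5.

```latex
\[
  2\rho\,\Delta t \;\le\; \delta
  \quad\Longrightarrow\quad
  \Delta t \;\le\; \frac{\delta}{2\rho}
\]
% Example: with \rho = 0.5 and \delta = 2 min,
% \Delta t \le 2 / (2 \times 0.5) = 2 min, i.e. resynchronize at least every 2 minutes.
```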
1) Physical clocks
Centralized algorithms (UTC-based)
◦ Cristian's Algorithm
◦ Berkeley Algorithm
Decentralized algorithms
◦ Averaging algorithms
2) Logical clocks
◦ Lamport's clock
◦ Vector clocks
Cristian's Algorithm
Assume one machine (the time server) has a WWV receiver and all other machines are to stay synchronized with it.
Every δ/2ρ seconds, each machine sends a message to the time server asking for the current time.
The time server responds with a message containing its current time, CUTC.
When the sender gets a reply, it could simply set its clock to CUTC.
Drawback of Cristian's algorithm
Propagation delay
• It takes a non-zero amount of time for the server's reply (CUTC) to get back.
• By the time the client receives the reply, the time at the server has moved ahead.
Solution – measure the propagation delay.
• Accurately record the interval between sending the request to the time server and the arrival of the reply.
• Let T0 be the time at which the request is sent and T1 the time at which the reply arrives. The best estimate of the one-way propagation delay is (T1 − T0)/2, so the client sets its clock to CUTC + (T1 − T0)/2.
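A minimal sketch of this client-side logic in Python. The server exchange is simulated with a hypothetical ask_time_server() helper and an assumed fixed offset SERVER_OFFSET; in a real system the call would be a network request.

```python
import time

SERVER_OFFSET = 1.5  # assumption for the simulation: the server's clock is 1.5 s ahead of ours

def ask_time_server():
    """Stand-in for the network exchange with the time server; returns CUTC in seconds."""
    time.sleep(0.025)                          # request travels to the server
    c_utc = time.monotonic() + SERVER_OFFSET   # server reads its (simulated) clock
    time.sleep(0.025)                          # reply travels back
    return c_utc

def cristian_sync():
    t0 = time.monotonic()        # T0: request sent
    c_utc = ask_time_server()    # CUTC: the server's time when it answered
    t1 = time.monotonic()        # T1: reply received
    # Best estimate of the one-way propagation delay is half the round trip,
    # assuming the delay is roughly symmetric in both directions.
    return c_utc + (t1 - t0) / 2   # value the client should set its clock to

print(cristian_sync() - time.monotonic())  # close to SERVER_OFFSET (1.5 s)
```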
Figure: the time server and machines a, b, c, d exchange clock values; the snapshot is repeated after 5 min.
Skew computed for b = (−10 − 5 + 0 + 5)/4 = −10/4 = −2.5
Skew computed for d = (0 + 5 + 10 + 15)/4 = 30/4 = +7.5
a is currently 10:25, skew computed = +2.5, i.e. a is ahead by 2.5 min.
Hence a decreases its clock by 2.5 min to get --→ 10:22:30.
b is currently 10:20, skew computed = −2.5, i.e. b is behind by 2.5 min.
Hence b increases its clock by 2.5 min to get --→ 10:22:30.
Similarly
c increases its clock by 7.5 min, so 10:15 becomes 10:22:30,
and
d decreases its clock by 7.5 min, so 10:30 becomes 10:22:30.
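A minimal sketch of the averaging step behind this example (the Berkeley-style adjustment): the clock readings are averaged and each machine is told how much to adjust by. The function name and the representation of times as minutes past 10:00 are assumptions for illustration.

```python
def averaging_round(clock_values):
    """One adjustment round: average the clock readings and tell each machine
    how much to add to (positive) or subtract from (negative) its own clock."""
    average = sum(clock_values.values()) / len(clock_values)
    return {name: average - value for name, value in clock_values.items()}

# The slide's example, with clocks expressed as minutes past 10:00.
adjustments = averaging_round({'a': 25, 'b': 20, 'c': 15, 'd': 30})
print(adjustments)  # {'a': -2.5, 'b': 2.5, 'c': 7.5, 'd': -7.5} -> every clock becomes 10:22:30
```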
The last three algorithms rely on absolute (physical) time to find out which event occurred before which event.
Alternative----→
Lamport showed that clock synchronization need not be absolute. What is important is that all processes agree on the order in which events occur.
Eg :
In Unix, large programs are split into multiple source files, so that a change to one source file only requires that file to be recompiled, not all of them.
If a program consists of 100 files, not having to recompile all of them because of a change to one leads to considerable time savings.
Consider the Unix command called make.
When the programmer has finished changing the source files, he runs the make command. make examines the times at which the source and object files were last modified. If the source file input.c has time 2152 and the corresponding object file input.o has time 2150, make knows that input.c has been changed since input.o was created and it must be recompiled.
On the other hand, if output.c has time 2144 and output.o has time 2145, no recompilation is needed.
Thus make goes through all the source files to find out which ones need recompilation and which do not.
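make's timestamp rule in miniature, as a hedged Python sketch; needs_recompile is a hypothetical helper for illustration, not make's actual implementation.

```python
import os

def needs_recompile(source, obj):
    """make's rule in miniature: recompile if the object file is missing
    or the source file was modified after the object file was produced."""
    if not os.path.exists(obj):
        return True
    return os.path.getmtime(source) > os.path.getmtime(obj)

# With the slide's numbers: input.c at time 2152 vs input.o at 2150 -> recompile;
# output.c at time 2144 vs output.o at 2145 -> no recompilation needed.
```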
In a distributed system, the editor runs on one machine and the compiler runs on another machine.
Assume there is no synchronization of time.
Suppose output.o has time 2144.
Shortly thereafter output.c is modified but is assigned time 2143, since the clock on its machine is slightly behind.
make will not call the compiler although it should have.
In this example, what counts is whether output.c is older than output.o (no recompilation) or output.c is newer than output.o (recompilation is needed), i.e. which event took place first.
Use logical clocks.
To implement logical clocks, Lamport defined a relation called happens-before, written a → b, read 'a happens before b'.
Happens-before is observable in two situations:
1. If a and b are events in the same process and a occurs before b, then a → b.
2. If a is the sending of a message by one process and b is the receipt of that message by another process, then a → b.
Eg :
Figure: space-time diagram of three processes P1 (events e11–e15), P2 (events e21–e24) and P3 (events e31, e32) with their Lamport timestamps, plotted against global time.
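A minimal sketch of a Lamport logical clock, assuming the standard update rules (tick before every local or send event; on receive, jump to max(local, message) + 1). The class and method names are illustrative only.

```python
class LamportClock:
    """One process's Lamport logical clock (sketch)."""
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1                  # tick before every local event
        return self.time

    def send(self):
        self.time += 1                  # sending is an event; the value travels on the message
        return self.time

    def receive(self, msg_timestamp):
        # Jump ahead of the message's timestamp so that send(m) -> receive(m)
        # always yields a strictly larger clock value at the receiver.
        self.time = max(self.time, msg_timestamp) + 1
        return self.time

p1, p2 = LamportClock(), LamportClock()
ts = p1.send()           # p1's clock becomes 1; 1 is carried on the message
print(p2.receive(ts))    # p2's clock becomes max(0, 1) + 1 = 2
```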
Figure: the same space-time diagram with vector timestamps – P1: e11 (1,0,0), e12 (2,0,0), e13 (3,4,1); P2: e21 (0,1,0), e22 (2,2,0), e23 (2,3,1), e24 (2,4,1); P3: e31 (0,0,1), e32 (0,0,2).
Vector clocks capture causality.
If v and w are the vector timestamps of two events --→
▪ If each element of timestamp v is less than or equal to the corresponding element of timestamp w, then v causally precedes w.
▪ If each element of timestamp v is greater than or equal to the corresponding element of timestamp w, then w causally precedes v.
▪ If neither holds (some elements are greater and some are less), then v and w are concurrent.
Hence a system of vector clocks allows us to order events and decide whether two events are causally related or not simply by looking at the timestamps of the events.
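A minimal vector clock sketch, with a compare helper implementing exactly the three cases above. Class and function names are illustrative; each process is assumed to know its index pid and the total number of processes n.

```python
class VectorClock:
    """Vector clock for process number `pid` in a system of `n` processes (sketch)."""
    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n

    def local_event(self):
        self.v[self.pid] += 1
        return list(self.v)

    def send(self):
        self.v[self.pid] += 1
        return list(self.v)             # the whole vector is carried on the message

    def receive(self, msg_vector):
        self.v = [max(a, b) for a, b in zip(self.v, msg_vector)]
        self.v[self.pid] += 1
        return list(self.v)

def compare(v, w):
    """The three cases from the slide: precedes, follows, or concurrent."""
    if v != w and all(a <= b for a, b in zip(v, w)):
        return "v causally precedes w"
    if v != w and all(a >= b for a, b in zip(v, w)):
        return "w causally precedes v"
    return "v and w are concurrent"

print(compare([2, 0, 0], [2, 4, 1]))   # v causally precedes w
print(compare([3, 4, 1], [0, 0, 2]))   # v and w are concurrent
```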
Many distributed systems require a single process to act as a coordinator process,
e.g. the time server in the Berkeley algorithm.
If the coordinator fails due to failure of the site on which it is located, a new coordinator process must be elected to take over the job of the failed coordinator.
Election algorithms are meant for electing a coordinator process from among the currently running processes.
Assumptions :
1. Every process has a unique number, e.g. its network address (IP address).
2. Every process knows the process number of every other process; what a process does not know is which ones are currently active and which are dead.
3. Election algorithms locate the process with the highest number and designate it as the coordinator.
1. Bully Algorithm
Bully: “the biggest guy in town wins”.
• When a process P sends a request message to the coordinator and does not receive a reply within a fixed timeout period, it assumes that the coordinator has failed.
• P holds an election--→
P sends an ELECTION message to all processes with higher id numbers.
If no one responds, P wins the election and becomes coordinator.
If a higher-numbered process responds, it takes over; process P's job is done.
At any moment, a process can receive an ELECTION message from one of its lower-numbered colleagues.
The receiver sends an OK back to the sender (to indicate that it is alive) and conducts its own election.
Eventually only the bully process (the highest-numbered process) remains. The bully announces victory to all processes in the distributed group.
Figure: (a) Process 4 holds an election. (b) Processes 5 and 6 respond, telling 4 to stop. (c) Now 5 and 6 each hold an election. (d) Process 6 tells 5 to stop. (e) Process 6 wins and tells everyone.
If a process that was previously down comes back up:
It holds an election.
If it happens to be the highest-numbered process currently running, it will win the election and take over the coordinator's job.
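A hedged sketch of the bully logic for a single process. The helpers alive(pid) and send(pid, msg) are hypothetical stand-ins for real timeouts and message passing; receiving an OK within the timeout is approximated here by alive() returning True.

```python
def hold_election(my_id, all_ids, alive, send):
    """Bully election as run by one process.
    alive(pid) -> bool and send(pid, msg) are assumed communication primitives."""
    higher = [p for p in all_ids if p > my_id]
    responders = []
    for p in higher:
        send(p, "ELECTION")
        if alive(p):                    # stands in for "an OK reply arrived before the timeout"
            responders.append(p)
    if not responders:
        # No higher-numbered process answered: this process is the bully.
        for p in all_ids:
            if p != my_id:
                send(p, "COORDINATOR")
        return my_id
    return None   # a higher-numbered process answered; it takes over the election

# Toy run mirroring the figure: process 7 (the old coordinator) is down, 4 notices.
up = {1, 2, 3, 4, 5, 6}
no_op_send = lambda pid, msg: None
print(hold_election(4, [1, 2, 3, 4, 5, 6, 7], lambda p: p in up, no_op_send))  # None (5 and 6 answered)
print(hold_election(6, [1, 2, 3, 4, 5, 6, 7], lambda p: p in up, no_op_send))  # 6 (the bully wins)
```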
2. Ring Algorithm
Assumptions :
1. The processes in the system are organized in a logical ring.
2. The ring is unidirectional, i.e. all messages are passed in one direction only (clockwise or anticlockwise).
3. Every process in the system knows the structure of the ring, so that while circulating a message over the ring, if the successor of the sender process is down, the sender can skip over it until an active member is located.
In the ring algorithm, if any process Pi notices that the current coordinator has failed, it starts an election by sending an ELECTION message to its first neighbour (first active successor) on the ring, as sketched below.
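A minimal sketch of one full ring election under the assumptions above. The slide only describes how the election starts; the usual completion, in which the ELECTION message circulates once collecting ids and the highest active id becomes coordinator, is assumed here. is_up(pid) is a hypothetical liveness test.

```python
def ring_election(start_index, processes, is_up):
    """Election on a unidirectional ring, started by processes[start_index].
    `processes` lists the ids in ring order; is_up(pid) is an assumed liveness test."""
    n = len(processes)
    collected = []                              # active ids gathered by the ELECTION message
    i = start_index
    while True:
        pid = processes[i]
        if is_up(pid):
            if collected and pid == processes[start_index]:
                break                           # the message has gone all the way around
            collected.append(pid)
        i = (i + 1) % n                         # move to the successor, skipping dead sites
    coordinator = max(collected)                # highest active id wins
    return coordinator                          # announced by circulating a COORDINATOR message

# Ring 1 -> 2 -> 3 -> 4 -> 5 -> 1, old coordinator 5 is down, process 2 starts the election.
print(ring_election(1, [1, 2, 3, 4, 5], lambda p: p != 5))   # 4
```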
Mutual Exclusion
Figure: synchronization delay – the time between one site exiting the CS and the next site entering the CS.
Figure: response time – the time between a site sending out its request message and the site exiting the CS after executing it.
▪ Here, a site communicates with a set of other sites to arbitrate which site should execute the CS next.
▪ For a site Si, the request set Ri contains the ids of all those sites from which Si must acquire permission before entering the CS.
▪ These algorithms use timestamps to order requests for the CS and to resolve conflicts between simultaneous requests for the CS.
▪ Logical clocks are maintained and updated according to Lamport's scheme.
▪ Requests with smaller timestamps have higher priority than requests with larger timestamps.
▪ Lamport proposed a distributed mutual exclusion algorithm based on his clock synchronization scheme.
▪ In Lamport's algorithm
• Every site Si keeps a queue, request_queuei, which contains mutual exclusion requests ordered by their timestamps.
• The algorithm requires messages to be delivered in FIFO order between every pair of sites.
▪ Requesting the CS
1. When a site Si wants to enter the CS, it sends a REQUEST(tsi, i) message to all the sites in its request set Ri and places the request on request_queuei. ((tsi, i) is the timestamp of the request.)
2. When a site Sj receives the REQUEST(tsi, i) message from site Si, it returns a timestamped REPLY message to Si and places site Si's request on request_queuej.
▪ Executing the CS
Site Si enters the CS when the following two conditions hold:
L1: Si has received a message with a timestamp larger than (tsi, i) from every other site.
L2: Si's own request is at the top of request_queuei.
Figure: Lamport's algorithm example – sites S1 and S2 request the CS with timestamps (2,1) and (1,2) respectively; every site places both requests on its request_queue in timestamp order, giving (1,2)(2,1). Since (1,2) is the smaller timestamp and sits at the head of every queue, S2 enters the CS first.
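A hedged Python sketch of one site in Lamport's algorithm, covering only the requesting and entry-condition steps shown above. The RELEASE step (removing a request from every queue when the site exits the CS) is omitted, and FIFO delivery is provided trivially by wiring the two demo sites together directly; all names (LamportMutex, deliver, ...) are illustrative.

```python
import heapq

class LamportMutex:
    """One site Si. `send(dst, msg)` is an assumed primitive delivering messages
    in FIFO order between every pair of sites."""
    def __init__(self, site_id, all_sites, send):
        self.id = site_id
        self.others = [s for s in all_sites if s != site_id]
        self.send = send
        self.clock = 0
        self.queue = []          # request_queue_i, kept ordered by (timestamp, site id)
        self.replies = set()
        self.my_request = None

    def request_cs(self):
        self.clock += 1
        self.my_request = (self.clock, self.id)
        heapq.heappush(self.queue, self.my_request)
        self.replies.clear()
        for s in self.others:
            self.send(s, ("REQUEST", self.my_request))

    def on_request(self, ts, sender):
        self.clock = max(self.clock, ts) + 1
        heapq.heappush(self.queue, (ts, sender))
        self.send(sender, ("REPLY", self.clock, self.id))

    def on_reply(self, ts, sender):
        self.clock = max(self.clock, ts) + 1
        self.replies.add(sender)

    def can_enter_cs(self):
        # L1 (approximated): a REPLY, which carries a larger timestamp by construction,
        # has arrived from every other site.
        # L2: this site's own request is at the head of its request queue.
        return (set(self.others) <= self.replies
                and bool(self.queue) and self.queue[0] == self.my_request)

# Two-site demo, wired together directly so delivery is immediate and FIFO.
sites = {}
def deliver(dst, msg):
    if msg[0] == "REQUEST":
        ts, sender = msg[1]
        sites[dst].on_request(ts, sender)
    else:                                   # ("REPLY", ts, sender)
        sites[dst].on_reply(msg[1], msg[2])

sites[1] = LamportMutex(1, [1, 2], deliver)
sites[2] = LamportMutex(2, [1, 2], deliver)
sites[2].request_cs()                       # S2 asks first  -> timestamp (1, 2)
sites[1].request_cs()                       # S1 asks second -> larger timestamp
print(sites[2].can_enter_cs(), sites[1].can_enter_cs())   # True False: S2 enters the CS first
```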