Introduction To Distributed Systems
CSE 380 Computer Operating Systems
University of Pennsylvania
Fall 2003
Lecture Note: Distributed Systems
Software Concepts
o rlogin machine
o rcp machine1:file1 machine2:file2
o Sun's NFS (Network File System) for shared file systems (Fig. 9-11)
NFS Architecture
Server exports directories
Clients mount exported directories
NFS Protocols
For handling mounting
For read/write: no open/close, stateless
NFS Implementation
1. Transparency
How to achieve the single-system image, i.e., how to make a collection of computers appear as a single computer.
Hiding all of the distribution from users as well as from application programs can be achieved at two levels:
1) hide the distribution from users
2) at a lower level, make the system look transparent to programs.
Both 1) and 2) require uniform interfaces, e.g., for file access and communication.
Transparency
Flexibility
Reliability
Performance
Scalability
2. Flexibility
3. Reliability
4. Performance
5. Scalability
Systems grow with time or become obsolete.
Techniques whose resource demands grow linearly with the size of the system are not scalable (e.g., a broadcast-based query won't work for a large distributed system).
Examples of bottlenecks
Distributed Coordination
[Figure: two computers with skewed clocks, one used for compiling (clock ticks 2144-2147) and one for editing (ticks 2142-2145). foo.c is modified on the editing computer, but because that machine's clock lags, the new source can carry an earlier timestamp than the existing object file, so make fails to recompile it.]
Event Ordering
Since there is no common memory or clock, it is sometimes impossible
to say which of two events occurred first.
The happened-before relation is a partial ordering of events in
distributed systems such that
1. If A and B are events in the same process, and A was executed before B, then A → B.
2. If A is the event of sending a message by one process and B is the event of receiving that message by another process, then A → B.
3. If A → B and B → C, then A → C.
If two events A and B are not related by the → relation, then they are executed concurrently (no causal relationship).
To obtain a global ordering of all the events, each event can be time-stamped satisfying the requirement: for every pair of events A and B, if A → B then the timestamp of A is less than the timestamp of B. (Note that the converse need not be true.)
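The timestamp rule above can be sketched with Lamport logical clocks. The `Process` class below is illustrative (the note does not give code); (counter, process id) pairs break ties to yield a total order:

```python
# Minimal sketch of Lamport logical clocks (class and method names are illustrative).
class Process:
    def __init__(self, pid):
        self.pid = pid
        self.clock = 0

    def local_event(self):
        self.clock += 1
        return (self.clock, self.pid)  # (counter, pid): pid breaks counter ties

    def send(self):
        self.clock += 1
        return self.clock  # timestamp piggybacked on the outgoing message

    def receive(self, msg_clock):
        # Advance past the sender's timestamp so A -> B implies TS(A) < TS(B)
        self.clock = max(self.clock, msg_clock) + 1
        return (self.clock, self.pid)

p, q = Process(0), Process(1)
t_send = p.send()           # event A on p
t_recv = q.receive(t_send)  # event B on q, so A -> B
assert t_send < t_recv[0]   # timestamps respect the happened-before relation
```

Note that the converse still fails here: two concurrent events also get ordered timestamps, which is exactly why the pairs give only a consistent total order, not causality.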
Global ordering
[Figure: events labeled with timestamp pairs (1,1), (1,2), (2,0), (2,1), (2,2), (3,1), (4,1), (4,2), (5,0), (5,1), (6,0), (6,1), (7,0), (7,1); pairs of the form (counter, process id) break counter ties by process id to produce a global total order.]
Cristian's algorithm
need to change time gradually (never jump a clock backwards)
need to account for message delays: with the request sent at T0, the reply received at T1, and server handling time I, the one-way delay is estimated as (T1 - T0 - I)/2
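A sketch of the client-side computation in one common formulation of Cristian's algorithm; `cristian_estimate` and `request_server` are illustrative names, and the estimated one-way delay is added to the server's returned reading:

```python
import time

def cristian_estimate(request_server, I=0.0):
    """One round of Cristian's algorithm (sketch).
    request_server() returns the server's clock reading;
    I is the server's known request-handling time, if any."""
    T0 = time.monotonic()              # send time of the request
    server_time = request_server()
    T1 = time.monotonic()              # arrival time of the reply
    delay = (T1 - T0 - I) / 2          # estimated one-way message delay
    return server_time + delay         # best estimate of the server's clock now

# Simulated server whose clock runs 5 seconds ahead of ours
fake_server = lambda: time.monotonic() + 5.0
est = cristian_estimate(fake_server)
assert abs(est - (time.monotonic() + 5.0)) < 0.5
```

In practice the client then slews its clock toward `est` gradually rather than setting it outright, per the note above.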
Unreliable communication
Reaching Agreement
How can processes reach consensus in a distributed system?
Messages can be delayed
Messages can be lost
Processes can fail (or even be malicious)
Messages can be corrupted
Each process starts with a bit (0 or 1), and non-faulty processes should eventually agree on a common value.
No solution is possible under these conditions.
Note: solutions such as computing the majority do not work. Why?
Two generals problem (unreliable communications)
Byzantine generals problem (faulty processes)
Impossibility Proof
Theorem. If any message can be lost, it is not possible for two
processes to agree on non-trivial outcome using only messages
for communication.
Proof. Suppose it is possible. Let m[1], ..., m[k] be a finite sequence of messages that allowed the two processes to decide. Furthermore, let us assume that it is a minimal sequence, that is, it has the least number of messages among all such sequences. However, since any message can be lost, the last message m[k] could have been lost. So, the sender of m[k] must be able to decide without having to send it (since the sender knows that it may not be delivered), and the receiver of m[k] must be able to decide without receiving it. That is, m[k] is not necessary for reaching agreement: m[1], ..., m[k-1] should have been enough. This contradicts the assumption that the sequence m[1], ..., m[k] was minimal.
A Centralized Algorithm
Use a coordinator which enforces mutual exclusion.
Two operations: request and release.
Process 1 asks the coordinator for permission to enter a critical region. Permission is granted.
Process 2 then asks permission to enter the same critical region. The coordinator does not reply.
When process 1 exits the critical region, it tells the coordinator, which then replies to process 2.
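The request/release protocol above can be sketched as a coordinator holding a FIFO queue of deferred requests (a minimal single-threaded simulation; class and message names are illustrative):

```python
from collections import deque

class Coordinator:
    """Sketch of the centralized mutual-exclusion coordinator."""
    def __init__(self):
        self.holder = None
        self.queue = deque()  # FIFO order gives first-come-first-served fairness

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "GRANTED"
        self.queue.append(pid)
        return None  # no reply: the requester blocks until the grant arrives

    def release(self, pid):
        assert self.holder == pid
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder  # the next process now receives the deferred grant

c = Coordinator()
assert c.request(1) == "GRANTED"
assert c.request(2) is None      # coordinator stays silent; process 2 waits
assert c.release(1) == 2         # deferred reply now goes to process 2
```

The single `Coordinator` object is also the algorithm's weakness: if it crashes, a silent non-reply is indistinguishable from a denied request, as noted below.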
Algorithm properties
guarantees mutual exclusion
fair (First Come First Served)
a single point of failure (Coordinator)
if no explicit DENIED message, then cannot distinguish
permission denied from a dead coordinator
A Decentralized Algorithm
Decision making is distributed across the entire
system
a) Two processes want to enter the same critical region
at the same moment.
b) Process 0 has the lowest timestamp, so it wins.
c) When process 0 is done, it sends an OK as well, so process 2 can now enter the critical region.
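The timestamp-and-OK scheme above can be sketched as follows (a minimal simulation in the style of Ricart and Agrawala's algorithm; class and message names are illustrative, and the network layer is elided):

```python
class RANode:
    """Sketch of a node in the timestamp-based distributed mutex."""
    def __init__(self, pid):
        self.pid = pid
        self.clock = 0
        self.requesting = False
        self.my_ts = None
        self.deferred = []  # OK replies withheld while we want/hold the CS

    def want_cs(self):
        self.clock += 1
        self.requesting = True
        self.my_ts = (self.clock, self.pid)
        return self.my_ts  # broadcast REQUEST(my_ts) to every other node

    def on_request(self, ts):
        # Reply OK unless we are also requesting with an earlier timestamp
        if self.requesting and self.my_ts < ts:
            self.deferred.append(ts)
            return None  # loser's request is deferred
        return "OK"

    def exit_cs(self):
        self.requesting = False
        released, self.deferred = self.deferred, []
        return released  # send the deferred OKs now

a, b = RANode(0), RANode(1)
ts_a, ts_b = a.want_cs(), b.want_cs()  # both request at the same moment
assert a.on_request(ts_b) is None      # a has the lower timestamp: defers b
assert b.on_request(ts_a) == "OK"      # b yields to a
assert a.exit_cs() == [ts_b]           # on exit, a sends the deferred OK to b
```

A node enters the critical region once it has collected an OK from every other node, which is why the deferred reply in `exit_cs` is what finally lets process 2 in.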
Correctness
Properties
Comparison
Leader Election
In many distributed applications, particularly the centralized solutions, some process needs to be declared the central coordinator.
Electing a new leader may also be necessary when the central coordinator crashes.
Election algorithms allow processes to elect a unique leader in a decentralized manner.
Bully Algorithm
[Figure: five processes with identifiers ID1-ID5. When a process notices the coordinator has failed, it holds an election among the processes with higher identifiers; the highest-numbered live process wins and announces itself as the new coordinator.]
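The election logic can be sketched compactly, assuming a simplified view in which the set of live process IDs is known (function and parameter names are illustrative; real implementations exchange ELECTION/OK/COORDINATOR messages with timeouts):

```python
def bully_election(initiator, alive):
    """Sketch of the bully algorithm over a known set of live process IDs.
    The initiator challenges all higher IDs; any live higher process takes
    over the election, and the highest live ID ends up as coordinator."""
    candidate = initiator
    while True:
        higher = [p for p in alive if p > candidate]
        if not higher:
            return candidate      # no one can bully us: we are the coordinator
        candidate = max(higher)   # a higher live process answers and takes over

assert bully_election(2, alive={1, 2, 3, 5}) == 5  # highest live ID wins
```

The name comes from the takeover step: any higher-numbered live process "bullies" the current candidate out of the election.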
Distributed Deadlock
Definitions
Resource Model
1. Reusable resources
2. Exclusive access
Three Request Models
1. Single-unit request model: deadlock corresponds to a cycle in the wait-for graph (WFG)
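In the single-unit request model each process waits for at most one other process, so deadlock detection reduces to finding a cycle in the WFG. A minimal sketch (the dict representation is illustrative):

```python
def has_cycle(wfg):
    """Detect a cycle in a wait-for graph given as {process: process_it_waits_for}.
    Under the single-unit request model, a deadlock is exactly such a cycle."""
    for start in wfg:
        seen = set()
        node = start
        # Follow the single outgoing wait-for edge until it ends or repeats
        while node in wfg:
            if node in seen:
                return True  # revisited a process: a wait-for cycle exists
            seen.add(node)
            node = wfg[node]
    return False

assert has_cycle({1: 2, 2: 3, 3: 1})   # 1 -> 2 -> 3 -> 1: deadlock
assert not has_cycle({1: 2, 2: 3})     # a simple chain, no cycle
```

The AND/OR request models (where a process may wait on several resources at once) need more than a single outgoing edge per process, so this simple walk no longer suffices there.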
Deadlock Prevention
The basic idea is to assign a unique priority to each process and use these priorities to decide whether process P should wait for process Q: let P wait for Q if P has a higher priority than Q; otherwise, P is rolled back.
This prevents deadlocks, since for every edge (P, Q) in the wait-for graph, P has a higher priority than Q; thus, a cycle cannot exist.
Note:
Both schemes favor old jobs, (1) to avoid starvation, and (2) because older jobs may have done more work and are expensive to roll back.
Unnecessary rollbacks may occur.
Sample Scenario
WD versus WW
Example
Wait-Die (WD):
(1) P1 requests the resource held by P2. P1 waits.
(2) P3 requests the resource held by P2. P3 rolls back.
Wound-Wait (WW):
(1) P1 requests the resource held by P2. P1 gets the resource
and P2 is rolled back.
(2) P3 requests the resource held by P2. P3 waits.
When more than one process is waiting for a resource held by P, which process should be given the resource when P finishes? In WD, the youngest among the waiting processes; in WW, the oldest.
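The two rules can be sketched as pure decision functions over timestamps, where an older process has a smaller timestamp (function names and return values are illustrative):

```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: an older (smaller-timestamp) requester waits; a younger one dies."""
    return "WAIT" if requester_ts < holder_ts else "ROLLBACK"

def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester wounds (preempts) the holder; a younger waits."""
    return "PREEMPT_HOLDER" if requester_ts < holder_ts else "WAIT"

# Timestamps: P1=5 (oldest), P2=10, P3=15 (youngest); P2 holds the resource
assert wait_die(5, 10) == "WAIT"              # (1) P1 waits for P2
assert wait_die(15, 10) == "ROLLBACK"         # (2) P3 rolls back
assert wound_wait(5, 10) == "PREEMPT_HOLDER"  # (1) P1 wounds P2, takes the resource
assert wound_wait(15, 10) == "WAIT"           # (2) P3 waits
```

In both schemes the wait-for edge always points from a higher-priority (older) process to a lower-priority one, which is what rules out cycles.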
Computer networks
Local area networks such as Ethernet
Wide area networks such as Internet
Network services
Connection-oriented services
Connectionless services
Datagrams
Network protocols
Internet Protocol (IP)
Transmission Control Protocol (TCP)
Middleware
Summary
Distributed coordination problems
Event ordering
Agreement
Mutual exclusion
Leader election
Deadlock detection
Middleware for distributed application support
Starting next week: Chapter 9 (Security)