
Lect7 - Deadlock Modified by Sau

The document discusses deadlock in systems where multiple processes compete for finite resources, outlining the sequence of resource requests, allocations, and releases. It details the necessary conditions for deadlock, including mutual exclusion, hold-and-wait, no-preemption, and circular wait, and explains how deadlocks can be modeled using directed graphs and resource allocation graphs. Additionally, it covers strategies for deadlock avoidance, prevention, and detection, emphasizing the importance of maintaining safe states to prevent deadlock situations.

Uploaded by

uday19102022

Deadlock

Deadlock
 A system consists of finite number of
resources.
 Multiple concurrent processes may
have to compete to use a resource.
 Sequence of events to use a resource by a
process:
1. Request
2. Allocate
3. Release
Cont…
 Request
 Process makes a request for the resource.
 If the requested resource is used by another
process the process must wait.
 Number of units requested may not exceed the
total no. of available units of the resource.

 Allocate
 System allocates the resource to requesting
process as soon as possible.
 Maintain a table of resources
allocated or available.
Cont…
 Release
 Release the resources after the process has
finished using the allocated resource.

 Request and release are system
calls, initiated by a process.
Cont…
 Deadlock is the state of permanent blocking of a
set of processes, each of which is waiting
for an event that only another process
in the set can cause.
Cont… : Example
 Resources : two tape drives T1, T2.

 Allocation strategy:
 Allocate the resource to the requester if free.

 Concurrent Processes : P1 and P2.


Cont… : Example
Cont…
 Process waits until another process releases
the allocated tape drive.

 Processes are in the state of deadlock.

 Requests made by the processes are legal because each
is requesting only two tape drives, which is the total
no. of tape drives available.
Cont…
 Why Deadlock occurs:
 Because the total requests of both processes
exceed the total number of available
resources.

 The resource allocation policy allocates a
resource on request if the resource is
free, without considering the safe states.
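The tape-drive example can be replayed as a short sketch. The interleaving of requests below is an assumption (the slide with the original trace is a figure), and `allocate_if_free`, `owner`, and `waiting` are illustrative names, not part of the lecture.

```python
def allocate_if_free(owner, process, drive):
    """Grant the drive if it is free; return False if the process must wait."""
    if owner[drive] is None:
        owner[drive] = process
        return True
    return False

owner = {"T1": None, "T2": None}   # which process holds each tape drive
waiting = {}                       # process -> drive it is blocked on

# Assumed interleaving: each process gets one drive, then asks for the other.
trace = [("P1", "T1"), ("P2", "T2"), ("P1", "T2"), ("P2", "T1")]
for process, drive in trace:
    if not allocate_if_free(owner, process, drive):
        waiting[process] = drive

# Deadlock: every waiting process waits for a drive held by another waiting process.
deadlocked = len(waiting) == 2 and all(owner[d] in waiting for d in waiting.values())
print(waiting, deadlocked)
```

Each individual request is legal, yet granting every free drive on request leaves both processes blocked on each other.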
Cont…
 This is also true for logical objects (like
files, tables, records in a database, etc..).

 Non-Preemptable resource:
 One that cannot be taken away from a process to which
it was allocated until the process voluntarily releases it.
Resource Types
 Two general categories of resources:
 Reusable &
 Consumable

 A reusable resource is one that can be safely
used by only one process at a time and is not
depleted by that use.
 Eg.: Processors, I/O channels, main and
secondary memory, devices, &
 Data structures such as files, databases, and
semaphores.
Consumable Resources
 The resources that are depleted (or
destroyed) after use by a process.

 Example: interrupts, signals, messages, and
information in I/O buffers.
Necessary conditions for
deadlock
1. Mutual-exclusion condition
2. Hold-and-wait condition
3. No-preemption condition
4. Circular-wait condition

 All four conditions must hold simultaneously in a
system for a deadlock to occur.

 If any one of them is absent, no deadlock can occur.
Cont…
 Mutual-exclusion condition
 At least one resource must be held in a non-sharable
mode; that is, only one process at a time can use the
resource. If another process requests that resource, the
requesting process must be delayed until the resource
has been released

 Hold-and-wait condition
 Processes are allowed to request for new resources
without releasing the resources that are currently held.
Cont…
 No-preemption condition
 Resources cannot be preempted; that is, a resource can
be released only voluntarily by the process holding it,
after that process has completed its task.

 Circular-wait condition
 Two or more processes must form a circular chain in
which each process is waiting for a resource
that is held by the next process of the chain.

 It implies the hold-and-wait condition.


Cont…

A set {P1, P2, ..., Pn} of waiting processes must exist such
that P1 is waiting for a resource held by P2, P2 is waiting for
a resource held by P3, ..., Pn−1 is waiting for a resource held
by Pn, and Pn is waiting for a resource held by P1
Deadlock modeling
 Deadlocks can be modeled using
directed graphs.

 Directed graph:
 A pair (N,E), where N is a nonempty set of
nodes and E is a set of directed edges.

 Path:
 A sequence of nodes (a,b,c,….i,j) of a directed
graph such that (a,b), (b,c),….. (i,j)
are directed edges.
Cont…
 Cycle:
 A path whose first and last nodes
are the same.
 Reachable set:
 The reachable set of a node ‘a’ is the set of all
nodes ‘b’ such that a path exists from ‘a’
to ‘b’.
 Knot: A nonempty set ‘K’ of nodes such
that the reachable set of each node in ‘K’
is exactly the set ‘K’.
 It always contains one or more cycles.
A directed graph
[Figure: a directed graph on the nodes a, b, c, d, e, f]

Cycles:
1. (a,b,c,d,e,f,a)
2. (b,c,d,e,b)

Knot:
{a,b,c,d,e,f}
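The definitions of reachable set, cycle, and knot can be checked with a small sketch. The edge set below is an assumption reconstructed from the two cycles listed for the example graph (the original figure is not available), and `reachable`, `on_cycle`, and `is_knot` are illustrative names.

```python
def reachable(graph, start):
    """Set of nodes reachable from start along one or more directed edges."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen

def on_cycle(graph, node):
    """A node lies on a cycle exactly when it can reach itself."""
    return node in reachable(graph, node)

def is_knot(graph, nodes):
    """K is a knot if the reachable set of every node in K is exactly K."""
    nodes = set(nodes)
    return bool(nodes) and all(reachable(graph, n) == nodes for n in nodes)

# Assumed edges: a->b->c->d->e->f->a plus e->b, which yield both listed cycles.
graph = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"], "e": ["f", "b"], "f": ["a"]}

print(is_knot(graph, "abcdef"))   # the whole node set
print(is_knot(graph, "bcde"))     # a cycle that can still reach f and a
```

The second call illustrates why a cycle alone is not a knot: nodes on the inner cycle can still reach nodes outside it.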
Resource allocation graph (RAG)
 Both the set of nodes and the set of edges are
partitioned into two types, resulting in the
following graph elements.

1. Process nodes: shown as circle


2. Resource nodes: shown as rectangle
3. Assignment edges: directed edges from a
resource node to a process node.
4. Request edges: directed edges from a process
node to a resource node.
Cont…
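[Figure legend: notation used in a resource allocation graph]
 Circle labeled Pi: a process named Pi.
 Rectangle labeled Rj: a resource Rj having 3 units in the system.
 Edge from Rj to Pi: process Pi holding a unit of resource Rj.
 Edge from Pi to Rj: process Pi requesting a unit of resource Rj.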


Necessary and sufficient conditions for
deadlock
 Presence of a cycle in a RAG is a necessary
but not a sufficient condition for deadlock.
 E.g., consider the RAG of the previous slide:

 It contains a cycle (P1, R2, P2, R1, P1) but
does not represent a deadlock state.
A RAG contains cycle but no deadlock

[Figure: RAG with processes P1, P2, P3 and resources R1, R2, R3]
Necessary and sufficient conditions for
deadlock
1. A cycle in the graph is both a necessary
and a sufficient condition for deadlock
 if all the resource types requested by
processes forming the cycle have only a
single unit each.
A cycle representing a deadlock
(P1 & P2 are deadlocked)

[Figure: RAG with processes P1, P2 and resources R1, R2, R3]
Cont…
2. A cycle in the graph is a necessary
but not a sufficient condition for deadlock
 if one or more of the resource types requested
by the processes forming the cycle have
more than one unit.

 In this situation, a knot is the sufficient
condition for deadlock.
A Knot representing a deadlock

[Figure: RAG with processes P1, P2, P3 and resources R1, R2, R3]
Wait-for graph
 A simplified graph, obtained from the original
resource allocation graph by removing the
resource nodes and collapsing the appropriate
edges.

 Since a WFG is constructed only when each resource type has
only a single unit, a cycle is both a necessary and
sufficient condition for deadlock in a WFG.

 Appropriate for modeling Communication
Deadlocks.
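Collapsing a single-unit RAG into a WFG and searching it for a cycle can be sketched as follows. The input encoding (a dict of assignment edges and a list of request edges) and the names `rag_to_wfg` and `has_cycle` are assumptions made for illustration.

```python
def rag_to_wfg(assignment, requests):
    """Collapse a single-unit RAG into a wait-for graph.

    assignment: resource -> process holding it (assignment edges)
    requests:   list of (process, resource) request edges
    """
    wfg = {}
    for process, resource in requests:
        holder = assignment.get(resource)
        if holder is not None and holder != process:
            wfg.setdefault(process, set()).add(holder)
    return wfg

def has_cycle(graph):
    """Depth-first search; a node met again while still on the stack closes a cycle."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))

# Hypothetical state: P1 holds R1 and wants R2, P2 holds R2 and wants R1.
wfg = rag_to_wfg({"R1": "P1", "R2": "P2"}, [("P1", "R2"), ("P2", "R1")])
print(wfg, has_cycle(wfg))
```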
Cont…
[Figure: (a) a resource allocation graph with processes P1, P2, P3 and resources R1, R2, simplified to (b) the corresponding WFG over P1, P2, P3]

Communication deadlocks
 Occurs among a set of processes when they are
blocked waiting for messages from other
processes in the set in order to start execution,
but there are no messages in transit between
them.

 All processes in the set are deadlocked.


Resource deadlocks
 Occurs when two or more processes wait
permanently for resources held by each
other.
 Handling deadlocks in Distributed Systems
 Strategies:
 Avoidance
 Prevention
 Detection & recovery
Basic Deadlock Avoidance
 Resources are carefully allocated to
avoid deadlocks.

 These methods use some advance knowledge of
the resource usage of processes.

 Considers safe / unsafe states.


Cont…
 Safe state:
 A state is safe if the system can allocate
resources to each process (up to its maximum)
in some order and still avoid a deadlock.

 More formally, a system is in a safe state only
if there exists a safe sequence.
Cont…
 Safe sequence:
 Any ordering of the processes that
can guarantee the completion of all the
processes.
 (By Galvin et al.): A sequence of processes <P1, P2, ...,
Pn> is a safe sequence for the current allocation state if,
for each Pi , the resource requests that Pi can still make,
can be satisfied by the currently available resources plus
the resources held by all Pj, with j < i.
 If the resources that Pi needs are not immediately
available, then Pi can wait until all Pj have finished.
 Unsafe state:
 A system state is unsafe if no safe sequence exists.
Example
 There are a total of 8 units of a particular
resource type for which three processes P1,P2,
and P3 are competing.

 Suppose the maximum units of the resource
required by P1, P2, and P3 are 4, 5, and 6,
respectively.

 Currently each of the three processes is holding 2
units of the resource.

 Therefore, in the current state of the system, 2
units of the resource are free.
Example
Example
 The state of Figure 6.13(a) is safe.

 Two safe sequences, (P1,P2,P3) and (P1,P3,P2 ).

 For a particular state there may be more than one


safe sequence.
Example
 If resource allocation is not done cautiously, system
may move from a safe state to an unsafe state.
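The example state (8 units in total, maximum claims 4, 5, 6, each process holding 2) can be checked by brute force. This is a sketch for a single resource type; `is_safe_sequence` and `safe_sequences` are illustrative names, and the "one extra unit to P2" step at the end is a hypothetical allocation chosen to show a move into an unsafe state.

```python
from itertools import permutations

def is_safe_sequence(order, free, holding, maximum):
    """Each process in turn must be able to reach its maximum, then releases everything."""
    for p in order:
        if maximum[p] - holding[p] > free:
            return False
        free += holding[p]
    return True

def safe_sequences(free, holding, maximum):
    """All orderings of the processes that are safe sequences for this state."""
    return [order for order in permutations(sorted(holding))
            if is_safe_sequence(order, free, holding, maximum)]

maximum = {"P1": 4, "P2": 5, "P3": 6}
holding = {"P1": 2, "P2": 2, "P3": 2}
free = 8 - sum(holding.values())

print(safe_sequences(free, holding, maximum))

# Hypothetical careless allocation: give one of the two free units to P2.
after = dict(holding, P2=3)
print(safe_sequences(8 - sum(after.values()), after, maximum))
```

The first call yields the two safe sequences named on the slide; the second yields an empty list, so the state after the careless allocation is unsafe.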
Cont…
 Some assumptions of the algorithm:
 The advance knowledge of the resource
requirements of the various processes is
available.
 The number of processes that compete for a
particular resource and the number of units
of that resource are fixed and known in
advance.

 In practice, the number of processes is not fixed but
varies dynamically as new users log in and log out.
 actual number of resource units available may
change dynamically due to the sudden breakdown
Cont…
 The practical limitations of deadlock avoidance
algorithms become more severe in a distributed
system.

 Because the collection of information needed
for making resource allocation decisions at
one point is difficult and inefficient.

 Deadlock avoidance strategy is never used in
distributed operating systems.
Deadlock Avoidance
 Approaches to deadlock avoidance:

1. Process Initiation Denial


2. Resource Allocation Denial
Process Initiation
Denial
 Do not start a process if its demands
might lead to deadlock.
 Define a deadlock avoidance policy that
refuses to start a new process if its
resource requirements might lead to
deadlock.
 A process is only started if the maximum
claim of all current processes plus
those of the new process can be met.
 Strategy is hardly optimal.
 Assumes the worst: that all processes will
make their maximum claims together.
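Process initiation denial reduces to one inequality per resource type: the claims of all current processes plus the claim of the new process must not exceed the total. A minimal sketch, assuming claims are given as lists indexed by resource type (`can_start` is an illustrative name):

```python
def can_start(resource, claims, new_claim):
    """Admit the new process only if all maximum claims together fit in the totals."""
    return all(
        sum(claim[j] for claim in claims) + new_claim[j] <= resource[j]
        for j in range(len(resource))
    )

resource = [8]        # one resource type with 8 units (hypothetical numbers)
claims = [[4]]        # one running process that may claim up to 4 units

print(can_start(resource, claims, [4]))   # 4 + 4 <= 8
print(can_start(resource, claims, [5]))   # 4 + 5 > 8
```

The test is pessimistic by construction: it refuses a process even when the running processes would never hold their maximum at the same time.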
Resource Allocation
Denial
 Do not grant an incremental resource request
to a process if this allocation might lead to
deadlock.

 Also referred to as the banker’s algorithm.


Cont…
 Consider a system with a fixed number of
processes and a fixed number of
resources.

 At any time, a process may have zero or


more resources allocated to it.
Cont…
 State of the system
 The current allocation of resources to
processes consists of:
 two vectors: Resource and Available &
 two matrices: Claim and Allocation.
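A compact sketch of resource allocation denial using the Resource vector and the Claim and Allocation matrices. The Available vector is derived from Resource and Allocation here; the function names `is_safe` and `grant` and the sample numbers are assumptions for illustration, not the lecture's own code.

```python
def is_safe(available, claim, allocation):
    """True if some order lets every process reach its claim and finish."""
    work = list(available)
    unfinished = set(range(len(claim)))
    progressed = True
    while unfinished and progressed:
        progressed = False
        for i in sorted(unfinished):
            need = [c - a for c, a in zip(claim[i], allocation[i])]
            if all(n <= w for n, w in zip(need, work)):
                # Process i can finish and return everything it holds.
                work = [w + a for w, a in zip(work, allocation[i])]
                unfinished.remove(i)
                progressed = True
                break
    return not unfinished

def grant(request, i, resource, claim, allocation):
    """Return the new Allocation matrix if the request keeps the state safe, else None."""
    available = [r - sum(row[j] for row in allocation) for j, r in enumerate(resource)]
    new_row = [a + q for a, q in zip(allocation[i], request)]
    if any(q > v for q, v in zip(request, available)):
        return None                      # not enough free units right now
    if any(n > c for n, c in zip(new_row, claim[i])):
        return None                      # request exceeds the declared claim
    trial = [row[:] for row in allocation]
    trial[i] = new_row
    trial_available = [v - q for v, q in zip(available, request)]
    return trial if is_safe(trial_available, claim, trial) else None

resource = [8]                      # Resource vector (one type, 8 units)
claim = [[4], [5], [6]]             # Claim matrix
allocation = [[2], [2], [2]]        # Allocation matrix

print(grant([1], 0, resource, claim, allocation))   # granted
print(grant([1], 1, resource, claim, allocation))   # denied: would be unsafe
```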
Deadlock Prevention
 Based on the idea of designing the system in
such a way that deadlocks become impossible.

 Constraints are imposed on the ways in which
processes request resources, in order to prevent
deadlocks.

 Tries to ensure that at least one of the necessary
conditions for a deadlock is never satisfied.
Cont…
 Deadlock prevention methods:
 Collective requests
 Ordered requests
 Preemption
Collective Requests
 This method denies the hold-and-wait
condition.

 Ensures that whenever a process requests
a resource, it does not hold any other
resources.
Cont…
 Two resource allocation policies:
1. A process must request all of its resources
before it begins execution. If all the needed
resources are available, they are allocated to the
process so that the process can run to
completion. If one or more of the requested
resources are not available, none will be
allocated and the process would just wait.
Cont…
2. Instead of requesting all its resources before
its execution starts:
 a process may request resources during its
execution if it obeys the rule that it requests
resources only when it holds no other resources

 If the process is holding some resources, it can
adhere to this rule by first releasing all of them
and then re-requesting all the necessary resources.
Advantage of second policy over the first
I. In practice, many processes do not know how
many resources they will need until they have
started running

II. A long process may require some resources only
toward the end of its execution. In the first policy,
the process will unnecessarily hold these
resources for the entire duration of its execution.
In the second policy, however, the process can
request these resources only when it needs
them.
Cont…
 Problems :
 It generally has low resource utilization because
a process may hold many resources but may not
actually use several of them for fairly long
periods.

 It may cause starvation of a process that needs
many resources, but whenever it makes a
request for the needed resources, one or more
of the resources is not available.
Ordered Requests
 Each resource type is assigned a unique global
number to impose a total ordering of all resource
types.

 The process should not request a resource with a
number lower than the number of any of the
resources that it is already holding.

 If a process requires several units of the same resource
type, it must issue a single request for all units.
Cont…
 A process need not acquire all its resources
in strictly increasing sequence.
 E.g., a process holding resources 7 and 10 can release 10 and
then request 9.
 The ordering of resources is decided according to the natural usage
pattern of the resources. For example, since the tape drive is
usually needed before the printer, it would be reasonable to
assign a lower number to the tape drive than to the printer.

 Problem :
 Reordering may become inevitable when
new resources are added to the
system.
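The ordering rule can be enforced per process with a few lines. In this sketch the class `OrderedHolder` is a hypothetical helper, and it treats a request for a number equal to one already held as a violation too, on the assumption that all units of one type are requested together.

```python
class OrderedHolder:
    """Tracks the resource numbers one process holds and enforces ordered requests."""

    def __init__(self):
        self.held = set()

    def request(self, number):
        # Reject any request that is not above every resource currently held.
        if self.held and number <= max(self.held):
            raise ValueError(
                f"resource {number} requested while holding {max(self.held)}"
            )
        self.held.add(number)

    def release(self, number):
        self.held.remove(number)

holder = OrderedHolder()
holder.request(7)
holder.request(10)
holder.release(10)
holder.request(9)          # allowed: only 7 is still held
print(sorted(holder.held))
```

Because every process climbs the same numbering, no chain of processes can wait on each other in a circle.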
Preemption
 A preemptable resource is one whose
state can be easily saved and restored
later.

 Examples : CPU registers, main memory.


Two Resource allocation policies
1. Preempt all the resources of a process when it
requests a resource that is not currently
available.
 The process is blocked.

2. (a) If a resource is held by a waiting process
(waiting for another resource) and another
process requests it,
 Preempt the requested resource from the
waiting process.
 Allot it to the requesting process.

(b) Otherwise, the requesting process is blocked and waits for
the requested resource to become available.
Cont…
 Method works only for preemptable
resources.
 Availability of atomic transactions and global timestamps
makes this method an attractive approach for deadlock
prevention in distributed and database transaction processing
systems.
Cont…
 Transaction-based deadlock prevention
method
 Use of a unique priority number for each
transaction.
 Priority numbers are used to break the tie.

 A global unique timestamp may serve as the priority
number.

 The lower the value of the timestamp, the
higher the priority.
Cont…
 Schemes based on the above idea:
 Wait-die scheme
 Wound-wait scheme (called "wait-wound" in these slides)
Wait-die scheme
 Three transactions: Ti (oldest), Tj, and
Tk (youngest).
 If a transaction Ti requests a resource that
is currently held by another transaction Tj,
 Ti is blocked (waits) because its timestamp is lower
than that of Tj.
 If a transaction Tk requests a resource
that is currently held by another
transaction Tj,
 Tk is aborted (dies) because its timestamp is higher
than that of Tj.
Wait-wound scheme (usually called wound-wait)
 Three transactions: Ti (oldest), Tj, and
Tk (youngest).
 If a transaction Ti requests a resource that is
currently held by another transaction Tj,
 Tj is preempted (wounded) because its timestamp is
higher than that of Ti.
 If a transaction Tk requests a resource that is
currently held by another transaction Tj,
 Tk is blocked (waits) because its timestamp is higher than
that of Tj, i.e., Tk has the lower priority.
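Both schemes come down to one timestamp comparison between the requester and the holder. A minimal sketch, assuming integer timestamps where a smaller value means an older transaction; the function names and returned strings are illustrative.

```python
def wait_die(requester_ts, holder_ts):
    """Older requester waits; younger requester is aborted (dies)."""
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    """Older requester preempts (wounds) the holder; younger requester waits."""
    return "wound holder" if requester_ts < holder_ts else "wait"

Ti, Tj, Tk = 1, 2, 3   # hypothetical timestamps: Ti oldest, Tk youngest

print(wait_die(Ti, Tj), "/", wait_die(Tk, Tj))
print(wound_wait(Ti, Tj), "/", wound_wait(Tk, Tj))
```

In both schemes a transaction only ever waits for one direction of the age ordering (older for younger in wait-die, younger for older in wound-wait), so a circular wait cannot form.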
Deadlock Detection and Recovery
 Deadlocks are allowed to occur.
 A detection algorithm is used to
detect these.
 After a deadlock is detected, it is resolved
by certain means.
 Detection algorithms are the same in both
centralized and distributed systems.
 Based on resource allocation graph
(RAG) and possibility of deadlock.
How to construct WFG for a DS?
For simplicity, we consider only the case of a single unit of each resource type. Therefore, the
deadlock detection algorithms get simplified to maintaining WFG and searching for cycles in
the WFG.

 Steps to construct WFG for a
distributed system:
 Construct a separate RAG for each site of
the system. In the RAG of a site,
 a resource node exists for all local resources and
 a process node exists for all processes that are
either holding or waiting for a resource of this site,
irrespective of whether the process is local or
nonlocal.
How to construct WFG for a DS?
 Convert the RAG into a separate WFG for each
site.

 Take the union of the WFGs of all sites
and construct a single global WFG.
Cont… : Example
[Figure: (a) resource allocation graph for each site (sites S1 and S2), (b) WFGs corresponding to the graphs in (a), (c) global WFG obtained by taking the union of the two local WFGs of (b)]
Cont…
 Although the local WFGs of the two sites do not
contain any cycle, the global WFG contains a
cycle, implying that the system is in a deadlock
state.

 Local WFGs are not sufficient to characterize
all deadlocks in a distributed system.

 So, construction of a global WFG is required.
 Taking the union of all local WFGs.
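The point that acyclic local WFGs can still combine into a cyclic global WFG can be shown in a few lines. The two local graphs below are hypothetical (the edges of the original figure are not recoverable), and `union_wfg` and `has_cycle` are illustrative names.

```python
def union_wfg(*local_wfgs):
    """Global WFG: the union of the edges of all local WFGs."""
    merged = {}
    for wfg in local_wfgs:
        for waiter, holders in wfg.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged

def has_cycle(graph):
    """Depth-first search; meeting a node that is still on the stack means a cycle."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))

# Assumed local wait-for edges (waiter -> holders) at two sites.
wfg_s1 = {"P3": {"P2"}, "P2": {"P1"}}
wfg_s2 = {"P1": {"P3"}}

global_wfg = union_wfg(wfg_s1, wfg_s2)
print(has_cycle(wfg_s1), has_cycle(wfg_s2), has_cycle(global_wfg))
```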
Cont…
 Properties for the correctness of deadlock
detection algorithms:

 Progress property
 Deadlock must be detected in a finite
amount of time.

 Safety property
 If a deadlock is detected, it must indeed exist.
Message delays and out-of-date WFGs sometimes
cause false cycles to be detected, resulting in the
detection of deadlocks that do not actually exist.
Such deadlocks are called phantom deadlocks.
Cont…
 Problem:
 How to maintain WFG in a distributed system?

 Techniques are:
 Centralized
 Hierarchical
 Distributed
Centralized approach for deadlock detection
 Use of local coordinator for each site
 Maintains a WFG for its local resources

 Use of a central coordinator/
centralized deadlock detector
 Receives information from the
local coordinators of the sites.

 Constructs the union of all the individual WFGs.
Cont…
 Approach:
 Local deadlocks are detected and
handled by the local coordinator.
 Considers the cycles that exist in the local WFG of a
site.

 Deadlocks involving two or more sites are
detected and resolved by the central
coordinator.
 Considers the cycles in the global WFG.
Methods to transfer information
 Methods to transfer information from local
coordinators to the central coordinator:
 Continuous transfer
 Transfer of message whenever a new edge is
added to or deleted from the local WFG.

 Periodic transfer
 A local coordinator periodically (or when a number of
changes have occurred in its local WFG) sends a list
of edges added to or deleted from its WFG.

 Transfer-on-request
 A local coordinator sends a list of edges added to
or deleted from its WFG only when the central
coordinator makes a request for it.
Cont…
 Drawbacks of the centralized deadlock
detection approach:
 Vulnerable to failures of the central
coordinator.
 Performance bottleneck in large
systems having too many sites.

 Detection of false deadlocks.
Example

 Steps:
 1: P1 requests for R1 and R1 is allocated to it.
 2: P2 requests for R2 and R2 is allocated to it.
 3: P3 requests for R3 and R3 is allocated to it.
 4: P2 requests for R1 and waits for it.
 5: P3 requests for R2 and waits for it.
 6: P1 releases R1 and R1 is allocated to P2.
 7: P1 requests for R3 and waits for it.
Cont…
 Sequences of messages sent to the central
coordinator (using continuous transfer):
 M1: from site S1 to add the edge (R1, P1).
 M2: from site S1 to add the edge (R2, P2).
 M3: from site S2 to add the edge (R3, P3).
 M4: from site S1 to add the edge (P2, R1).
 M5: from site S1 to add the edge (P3, R2).
 M6: from site S1 to delete edges (R1,P1) and (P2,R1),
and add edge (R1, P2).
 M7: from site S2 to add the edge (P1, R3).
Cont…

[Figure: Resource allocation graphs after step 5. Three panels: the resource allocation graph of the local coordinator of site S1 (P1, P2, P3, R1, R2), the resource allocation graph of the local coordinator of site S2 (R3, P3), and the resource allocation graph maintained by the central coordinator (P1, P2, P3, R1, R2, R3)]
Cont…

[Figure: Resource allocation graphs after step 7. Three panels: site S1, site S2, and the central coordinator's graph after message m7 has been received by the central coordinator]

Cont…
 The central coordinator's graph has no cycles.

 The system is free from deadlocks.

 Suppose m7 from site S2 is received before
message m6 from site S1 by the central coordinator.

 In this case, the central coordinator's view of the
system will be as shown in the following RAG.

 Here, the central coordinator will incorrectly conclude
that there is a deadlock and initiate deadlock recovery action.
Cont…

[Figure: RAG of the central coordinator (P1, P2, P3, R1, R2, R3) showing a false deadlock if message m7 is received before m6 by the central coordinator]
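The phantom deadlock can be reproduced by replaying the seven messages at the central coordinator in two arrival orders. The message contents follow the example; the encoding as `("add" | "del", edge)` pairs and the names `coordinator_view` and `has_cycle` are assumptions of this sketch.

```python
MESSAGES = {
    "m1": [("add", ("R1", "P1"))],
    "m2": [("add", ("R2", "P2"))],
    "m3": [("add", ("R3", "P3"))],
    "m4": [("add", ("P2", "R1"))],
    "m5": [("add", ("P3", "R2"))],
    "m6": [("del", ("R1", "P1")), ("del", ("P2", "R1")), ("add", ("R1", "P2"))],
    "m7": [("add", ("P1", "R3"))],
}

def coordinator_view(arrival_order):
    """Edge set of the coordinator's RAG after applying messages in arrival order."""
    edges = set()
    for name in arrival_order:
        for op, edge in MESSAGES[name]:
            if op == "add":
                edges.add(edge)
            else:
                edges.discard(edge)
    return edges

def has_cycle(edges):
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))

in_order = ["m1", "m2", "m3", "m4", "m5", "m6", "m7"]
m7_before_m6 = ["m1", "m2", "m3", "m4", "m5", "m7"]   # m6 still in transit

print(has_cycle(coordinator_view(in_order)))        # real state: no deadlock
print(has_cycle(coordinator_view(m7_before_m6)))    # stale view: phantom cycle
```

In the stale view the coordinator sees the cycle P1, R3, P3, R2, P2, R1, P1, which disappears as soon as m6 is applied.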
Cont…
 Result :
 Possibility of detection of phantom deadlocks
 when the method of continuous
transfer of information is used.

 Phantom deadlocks may even get detected in the
other two methods of information transfer due to
incomplete or delayed information.
Cont…
 Solution:
 Use a unique global timestamp with each
message.
Hierarchical approach for deadlock
detection
 Experimental measurements have shown that 90% of all
deadlock cycles involve only two processes.

 The centralized approach seems less attractive for most
real applications because of significant time and message
overhead.

 To minimize communication cost in geographically
distributed systems:
 A deadlock should be detected by a site located as close as
possible to the sites involved in the cycle.

 But this is not possible in the centralized approach.

 The hierarchical approach overcomes these drawbacks.

Hierarchical approach for deadlock
detection
 Uses a logical hierarchy (tree) of deadlock
detectors, known as controllers.

 Each controller is responsible for detecting only
those deadlocks that involve the sites falling
within its range in the hierarchy.

 Detection is distributed over a number of
different controllers.
Cont…
 Rules :
 Each controller that forms a leaf of the
hierarchy tree maintains the local WFG
of a single site.

 Each nonleaf controller maintains a WFG that
is the union of the WFGs of its
immediate children in the hierarchy tree.
Cont…
 The lowest level controller that finds a
cycle in its WFG
 detects a deadlock and
 takes necessary action to resolve it.

 A WFG that contains a cycle will never be
passed as it is to a higher level controller.
Example : Hierarchical deadlock
detection approach
[Figure: hierarchy of controllers. Leaf controllers A, B, C, D maintain the local graphs of sites S1, S2, S3, S4 respectively; controllers E and F are nonleaf controllers, and controller G is the root. Processes P1 to P7 and resources R1 to R7 appear in the site graphs]
Cont…
 Deadlock cycle
 (p1,p3,p2,p1) of sites S1 and S2 gets reflected
in the WFG of controller E.

 (p4,p5,p6,p7,p4) of sites S2, S3 and S4
gets reflected in the WFG of
controller G.
Fully distributed approaches for deadlock
detection
 Each site of the system shares
equal responsibility for deadlock
detection.

 Algorithms are:
 WFG-based distributed algorithm for
deadlock detection.
 Probe-based distributed algorithm for deadlock
detection.
WFG-based fully distributed deadlock
detection algorithms
 Each site maintains its own local WFG.
 To model waiting situations of external processes
 An extra node Pex is added to the local
WFG of each site in the following manner:

1. An edge (Pi, Pex) is added if process Pi is waiting for a
resource in another site being held by any process.

2. An edge (Pex, Pj) is added if Pj is a process of
another site that is waiting for a resource
currently being held by a process of this site.
 In the WFG of site S1, edge (P1, Pex) is added
because P1 is waiting for a resource in the site S2
that is held by P3 &
 Edge (Pex, P3) is added because P3 is a process
of site S2 that is waiting to acquire a resource
held by P2 of S1.
 In the WFG of S2, edge (P3, Pex) is added because P3 is
waiting for a resource in S1 held by P2 &
 Edge (Pex, P1) is added because P1 is a process of S1 that is
waiting to acquire a resource held by P3 of S2.
Example
[Figure: three panels for sites S1 and S2 with processes P1 to P5: the local WFGs, the local WFGs after addition of node Pex, and the updated local WFG of site S2 after receiving the deadlock detection message from site S1]
Cont…
 If a local WFG contains a cycle that:
 does not involve node Pex, a deadlock that involves only
local processes of that site has occurred – Resolved
Locally

 does involve Pex, there is a possibility of a distributed
deadlock that involves processes of multiple sites
– By Distributed Deadlock Detection Algorithm.
Distributed Deadlock Detection
Algorithm
Cont…
 Problem:
 Two sites may initiate the deadlock detection algorithm
independently for a deadlock that involves the same
processes.
 More processes than necessary may be killed.

 Solution:
 Assign a unique identifier to each process.
Distributed Deadlock Detection
Algorithm
Probe-based distributed algorithm for
deadlock detection
 Proposed by Chandy et al. in 1983.

 Also known as the Chandy-Misra-Haas or CMH
algorithm.

 When a process that requests a resource(s)
fails to get the requested resource(s) and times
out, it generates a special probe message and
sends it to the process(es) holding the requested
resource(s).
Cont…
 Fields of probe message:
 The identifier of the process just blocked.
 The identifier of the sender process.
 The identifier of the receiver process.
Cont…
 If the recipient is using the resource, it ignores the probe
message.

 If the recipient itself is waiting for any resource(s), it
forwards the probe message to the process(es)
holding the resource(s) for which it is waiting.

 Before the probe message is forwarded, the recipient
modifies its fields in the following manner:

 1. The first field is left unchanged.

 2. The recipient changes the second field to its own process
identifier.

 3. The third field is changed to the identifier of the process
that will be the new recipient of this message.
Cont…
 A cycle exists if the probe message returns to
the original sender.
 The system is deadlocked.
 A probe packet: (p1,p1,p3)
p1 generates a probe message and sends it to
p3.
Cont…: CMH distributed deadlock
detection algorithm

[Figure: processes P1, P2, P3, P4, P5 spread over sites S1 and S2, with probe messages (P1,P1,P3), (P1,P3,P5), (P1,P3,P2), (P1,P2,P4), and (P1,P2,P1)]
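The probe forwarding rules can be simulated on a wait-for relation. The edges below are an assumption inferred from the probe labels of the example (P1 waits for P3; P3 waits for P5 and P2; P2 waits for P4 and P1), and the `forwarded` set, which stops a process from forwarding the same initiator's probe twice, is an addition of this sketch rather than part of the algorithm as stated.

```python
from collections import deque

def cmh_detect(waits_for, initiator):
    """Simulate probes (blocked, sender, receiver); return (deadlocked, probes sent)."""
    queue = deque((initiator, initiator, r) for r in waits_for.get(initiator, ()))
    forwarded = set()
    trace = []
    while queue:
        probe = queue.popleft()
        trace.append(probe)
        origin, _sender, receiver = probe
        if receiver == origin:
            return True, trace          # probe came back: a cycle exists
        if receiver in forwarded:
            continue
        forwarded.add(receiver)
        # A receiver that is not waiting has no entry and forwards nothing.
        for nxt in waits_for.get(receiver, ()):
            queue.append((origin, receiver, nxt))
    return False, trace

waits_for = {"P1": ["P3"], "P3": ["P5", "P2"], "P2": ["P4", "P1"]}

deadlocked, probes = cmh_detect(waits_for, "P1")
print(deadlocked)
for probe in probes:
    print(probe)
```

Each probe carries only three identifiers, which is why the messages are of fixed length and no site has to build a graph.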
Cont…
 Features of the CMH algorithm:
 Easy to implement.
 Each message is of fixed length.
 Few computational steps.

 Low overhead of the algorithm.
 No graph construction and information
collecting involved.
 Doesn’t detect false deadlocks.
 Does not require any particular
structure among the processes.
Methods for recovery from
deadlock
 Asking for operator intervention.
 Termination of process (es).
 Rollback of process (es).

Operator intervention refers to a manual approach where a
system administrator or operator is notified of a deadlock and
can manually intervene to resolve it by terminating processes
or preempting resources.
Asking for operator intervention
 Inform the operator about the deadlock.

 Let the operator deal with it manually.

 Works for a centralized system.

 Not suitable for dealing with deadlocks
involving processes of multiple sites.
Asking for operator intervention
 When a deadlock involving processes of multiple sites is
detected, it is not clear which site should be informed.

 If all the sites whose processes are involved in the deadlock are
informed, each site's operator may independently take
some action for recovery.

 If the operator of only a single site is informed, the operator
may favor the process (or processes) of its own site while
taking a recovery action.

 The operator of one site may not have the right to interfere
with a process of another site for taking recovery action.
Termination of process
 Terminate (kill) one or more processes
and reclaim the resources held by them
for reallocation.

 How does the recovery algorithm work?
 Analyze the resource requirements and
interdependencies of the processes
involved in a deadlock cycle, and
 then select a set of processes, which, if killed,
can break the cycle.
Rollback of process
 Killing a process requires its restart from the very
beginning, which is very expensive.

 It is sufficient to reclaim the needed resources from
the processes that were selected for being
killed.

 Roll back the process to a point where the resource
was not allocated.

 Processes are checkpointed periodically, i.e., a
process's state (its memory image and list of resources
held by it) is written to a file at regular intervals.
Rollback of process
 May appear as less expensive than
the process termination approach.
 Though it experiences extra overhead
involved in periodic checkpointing of
all the processes.
Issues in recovery from deadlock
 Selection of victim (s)

 Victim: the process which is killed or rolled
back to break the deadlock.

 Factors for victim selection:
 Minimization of recovery cost
 Prevention of starvation
Issues in recovery from deadlock
Minimization of recovery cost

 Those processes should be selected as victims whose
termination/rollback will incur the minimum recovery cost.

 Some of the factors that may be considered for this
purpose:
(a) the priority of the processes;
(b) the nature of the processes, such as interactive or batch
and possibility of rerun with no ill effects;
(c) the number and types of resources held by the processes;
(d) the length of service already received and the expected
length of service further needed by the processes; and
(e) total number of processes that will be affected.
Issues in recovery from deadlock
Prevention of starvation

 If a system only aims at minimization of recovery cost, it
may happen that the same process is repeatedly selected
as a victim and may never complete (starvation).

 One approach to handle this problem is to raise the priority
of the process every time it is victimized.

 Another approach is to include the number of times a
process is victimized as a parameter in the cost function.
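A victim-selection cost function that also counts past victimizations might look like the sketch below. The fields, the additive form of the cost, and the weight `starvation_weight` are all hypothetical choices made for illustration; the lecture only names the factors.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    priority: int              # assumption: larger means more important
    resources_held: int
    service_received: float    # work that would be lost on termination/rollback
    times_victimized: int = 0

def recovery_cost(c, starvation_weight=5.0):
    """Assumed additive cost; every past victimization makes the process costlier."""
    return (c.priority + c.resources_held + c.service_received
            + starvation_weight * c.times_victimized)

def select_victim(candidates):
    """Pick the process whose termination/rollback is cheapest."""
    return min(candidates, key=recovery_cost)

a = Candidate("A", priority=1, resources_held=1, service_received=2.0)
b = Candidate("B", priority=3, resources_held=2, service_received=5.0)

first = select_victim([a, b])
first.times_victimized += 2        # suppose A has now been chosen twice
second = select_victim([a, b])
print(first.name, second.name)
```

Without the `times_victimized` term the cheapest process would be chosen every time; with it, the choice eventually moves to another process.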
Issues in recovery from deadlock
Use of Transaction Mechanism
 Rerunning a process may not always be safe, especially
when the operations performed by the process are non-idempotent.
 For example, if a process has updated the amount of a bank
account by adding a certain amount to it, re-execution of
the process will result in adding the same amount once
again, leaving the balance in the account in an incorrect state.
 Use of a transaction mechanism (which ensures all or no
effect) becomes almost inevitable when the system chooses
the method of deadlock detection and recovery.
Issues in recovery from deadlock
Use of Transaction Mechanism
 The system should ensure that the updates of a partially executed
transaction are not reflected in the database.
 However, notice that the transaction mechanism need not
be used for those processes that can be rerun with no ill
effects.
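The all-or-no-effect property can be illustrated with a small sketch based on the bank account example. The helper names (`run_transaction`, `deposit`, `victimized`) and the use of a working copy are assumptions for illustration only; real transaction mechanisms typically rely on logging and locking.

```python
import copy


class Aborted(Exception):
    """Raised when the process is chosen as a deadlock victim."""


def run_transaction(database, operations):
    # Apply all operations to a private copy; commit only if none fails.
    working_copy = copy.deepcopy(database)
    for op in operations:
        op(working_copy)              # may raise Aborted
    database.clear()
    database.update(working_copy)     # commit: all updates become visible


def deposit(amount):
    def op(db):
        db["balance"] += amount       # non-idempotent update
    return op


def victimized(db):
    raise Aborted("selected as deadlock victim")


accounts = {"balance": 100}
try:
    # The process is killed after the deposit but before completion.
    run_transaction(accounts, [deposit(50), victimized])
except Aborted:
    pass
balance_after_abort = accounts["balance"]

run_transaction(accounts, [deposit(50)])   # safe rerun of the process
balance_after_rerun = accounts["balance"]
```

Because the partial update was never committed, rerunning the process adds the amount exactly once.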
Election algorithms
 Several algorithms require a coordinator process
to perform some type of coordination activity.
 Election algorithms are used for electing a
coordinator process from among the currently
running processes.
 At any instance of time there is a single
coordinator for all processes in the system.
Cont…
 Assumptions:
 Each process in the system has a unique
priority number.
 The process with the highest priority will be elected as
the coordinator.
 On recovery, a failed process can take
appropriate actions to rejoin the set of
active processes.
Bully algorithm by Garcia-Molina [1982]
 Assumption:
 Every process knows the priority number of every other
process in the system.
 When a process Pi sends a request to the
coordinator and does not receive a reply within a
fixed timeout period, it starts an election,
assuming that the coordinator has failed.
Cont…
 Pi sends an election message to all processes with a higher
priority number than itself.
 If Pi does not receive any response to its election message
within a fixed timeout period, it assumes that among the
currently active processes it has the highest priority number.
 Pi then sends a coordinator message to all processes having lower
priority numbers than itself, informing them that from now on it is
the new coordinator.
 If Pi receives a response to its election message, this means
some other process having a higher priority number is alive.
Therefore Pi does not take any further action and just waits to
receive a coordinator message from the new coordinator.
Cont…
 When a process receives an election message (from a lower
priority process), it sends an alive message to the sender,
informing it that it is alive and will take over the election activity.
 In this way, the election activity gradually moves on to the
process that has the highest priority number.
 As part of the recovery action:
 A failed process must initiate an election on recovery.
 If the current coordinator's priority number is higher than that of
the recovered process, the current coordinator will continue.
 If the recovered process's priority number is higher than the current
coordinator's, it will not receive any response to its election message.
 It then wins the election and takes over the coordinator's job.
Cont…
 If the process having the highest priority number recovers
after a failure, it does not initiate an election,
because it knows from its list of priority numbers that all other
processes in the system have lower priority numbers.
 On recovery, it simply sends a coordinator message to all
other processes and bullies the current coordinator into
submission.
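The election steps described above can be sketched as a small simulation. It is a simplification under stated assumptions: there is no real messaging or timeout, a failed process simply never responds, and the function name `bully_election` and its arguments are hypothetical.

```python
def bully_election(priorities, alive, initiator):
    """Simulate one bully election.

    priorities: unique priority numbers of all processes in the system
    alive:      set of priority numbers of processes that are up
    initiator:  priority number of the (alive) process that detected
                the coordinator's failure
    Returns (coordinator, started), where `started` lists the processes
    that took over the election activity, in order.
    """
    started = []
    pending = [initiator]
    coordinator = None
    while pending:
        p = pending.pop(0)
        if p in started:
            continue              # this process is already running an election
        started.append(p)
        # p sends an election message to every higher-priority process;
        # only the alive ones answer with an alive message.
        responders = sorted(q for q in priorities if q > p and q in alive)
        if not responders:
            coordinator = p       # nobody higher answered: p wins
        else:
            pending.extend(responders)   # they take over the election
    return coordinator, started


# Five processes, the coordinator (priority 5) has failed.
worst = bully_election([1, 2, 3, 4, 5], alive={1, 2, 3, 4}, initiator=1)
best = bully_election([1, 2, 3, 4, 5], alive={1, 2, 3, 4}, initiator=4)
```

When the lowest-priority process starts the election, the activity moves step by step up to process 4; when process 4 itself detects the failure, it wins immediately.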
Cont…
 When the process having the lowest priority number detects
the coordinator's failure and initiates an election, in a system
having a total of n processes:
 altogether n-2 elections are performed, one after another, as a
result of the one initiated.
 When the process having the priority number just below the
failed coordinator detects that the coordinator has failed, it
immediately elects itself as the coordinator and sends n-2
coordinator messages.
 The algorithm requires
 O(n^2) messages in the worst case.
 n-2 messages in the best case.
Ring Algorithm
 All the processes in the system are organized in a
logical unidirectional ring.
 Every process in the system knows the structure
of the ring.
 When a process Pi sends a request message to the current
coordinator and does not receive a reply within a fixed
timeout period, it assumes that the coordinator has crashed.
 It initiates an election by sending an election message to
its successor (the first successor that is currently active).
 This message contains the priority number of process Pi.
Ring Algorithm
 On receiving the election message, the successor appends
its own priority number to the message and passes it on to
the next active member.
 This member appends its own priority number to the
message and forwards it to its own successor.
 In this manner, the election message circulates over the
ring from one active process to another and eventually
returns to process Pi.
 Process Pi recognizes the message as its own election
message by seeing that, in the list of priority numbers held
within the message, the first priority number is its own
priority number.
Cont…
 Pi recognizes its own election message.
 It elects the process having the highest priority
as the new coordinator.
 It sends a coordinator message over the ring
about the new coordinator.
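The ring election can be sketched as a simulation that also counts messages. The function name `ring_election`, the list-based ring, and the counting of one message per hop are assumptions for illustration; failures during the election are not modelled.

```python
def ring_election(ring, alive, initiator):
    """Simulate one ring election.

    ring:      priority numbers in ring order (logical unidirectional ring)
    alive:     set of priority numbers of processes that are up
    initiator: priority number of the alive process starting the election
    Returns (coordinator, messages).
    """
    def next_active(p):
        # First successor of p on the ring that is currently active.
        i = ring.index(p)
        for step in range(1, len(ring) + 1):
            q = ring[(i + step) % len(ring)]
            if q in alive:
                return q

    messages = 0
    collected = [initiator]           # the message starts with Pi's number
    current = next_active(initiator)
    messages += 1
    while current != initiator:
        collected.append(current)     # each member appends its own number
        current = next_active(current)
        messages += 1

    coordinator = max(collected)      # highest priority number wins

    # The coordinator message makes one more trip around the ring.
    current = next_active(initiator)
    messages += 1
    while current != initiator:
        current = next_active(current)
        messages += 1
    return coordinator, messages


# Five processes on the ring, the coordinator (priority 5) has crashed.
result = ring_election([3, 1, 5, 2, 4], alive={1, 2, 3, 4}, initiator=2)
```

In this run with n = 5, each of the two trips around the ring takes n-1 = 4 messages, for 2(n-1) = 8 in total.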
Cont…
 If the previous coordinator (Pj) recovers after a
failure:
 It sends an inquiry message to its successor.
 The current coordinator replies to Pj that it is
the current coordinator.
Cont…
 Drawback:
 Two or more processes may circulate an
election message over the ring on
discovering that the coordinator has crashed.
 An election always requires 2(n-1) messages.
 n/2 messages are required on average for
the recovery action.
 More efficient and easier to implement
than the bully algorithm.
