DC unit 1 - notes
Regulation : 2021
UNIT I INTRODUCTION
1.1 INTRODUCTION
The process of computation started with programs running on a single processor; this uniprocessor computing can be termed centralized computing. As the demand for increased processing capability grew, multiprocessor systems came into existence. The advent of multiprocessor systems led to the development of distributed systems with a high degree of scalability and resource sharing. Modern-day parallel computing is a subset of distributed computing.
A distributed system is a collection of independent computers, interconnected via
a network, capable of collaborating on a task. Distributed computing is computing
performed in a distributed system.
QoS parameters
A distributed system must offer the following quality of service (QoS) parameters:
Performance
Reliability
Availability
Security
Differences between centralized and distributed systems
In centralized systems, several jobs are done on a particular central processing unit (CPU). In distributed systems, jobs are distributed among several processors, and the processors are interconnected by a computer network.
Centralized systems have shared memory and shared variables. Distributed systems have no global state, i.e., no shared memory and no shared variables.
In centralized systems, clocking is present. In distributed systems, there is no global clock.
The interaction of the layers of the network with the operating system and middleware is shown in Fig 1.2. The middleware contains important library functions for facilitating the operations of the DS.
The distributed system uses a layered architecture to break down the complexity of system design. The middleware is the distributed software that drives the distributed system while providing transparency of heterogeneity at the platform level.
1.3 Motivation
The following are the key points that act as driving forces behind DS:
The main objective of parallel systems is to improve processing speed. They are sometimes known as multiprocessors, multicomputers or tightly coupled systems. They refer to the simultaneous use of multiple computing resources, which can include a single computer with multiple processors, a number of computers connected by a network to form a parallel processing cluster, or a combination of both.
1. A multiprocessor system
A multiprocessor system is a parallel system in which the multiple processors have direct
access to shared memory which forms a common address space. The architecture is shown
in Figure 1.3(a). Such processors usually do not have a common clock.
Figure 1.3 Two standard architectures for parallel systems. (a) Uniform memory
access (UMA) multiprocessor system. (b) Non-uniform memory access (NUMA)
multiprocessor.
Figure 1.4 shows two popular interconnection networks – the Omega network and the
Butterfly network, each of which is a multi-stage network formed of 2×2 switching
elements. Each 2×2 switch allows data on either of the two input wires to be switched to the
upper or the lower output wire. In a single step, however, only one data unit can be sent on
an output wire. So if the data from both the input wires is to be routed to the same output
wire in a single step, there is a collision.
The routing function from input line i to output line j considers only j and the stage number s, where s ∈ [0, log n − 1]. In a stage s switch, if the (s + 1)th most significant bit of j is 0, the data is routed to the upper output wire; otherwise it is routed to the lower output wire.
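The stage-wise routing rule above can be written as a short routine. The following is a minimal illustrative sketch (not from the source text), assuming n is a power of two and lines are numbered from 0:

```python
# Illustrative sketch of the stage-s routing rule: examine the (s+1)-th most
# significant bit of the destination line j; 0 means upper output, 1 means lower.
# Assumes n (number of input/output lines) is a power of two.

def route(j: int, s: int, n: int) -> str:
    """Routing decision at a stage-s switch for destination line j in an
    n-line multi-stage network (stages numbered 0 .. log2(n) - 1)."""
    num_bits = n.bit_length() - 1        # log2(n) address bits of j
    msb_index = num_bits - 1 - s         # bit position of the (s+1)-th MSB
    return "upper" if (j >> msb_index) & 1 == 0 else "lower"

# Routing towards output line j = 5 (binary 101) in an 8-line network:
print([route(5, s, 8) for s in range(3)])   # ['lower', 'upper', 'lower']
```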
Butterfly network
A butterfly network links multiple computers into a high-speed network. For a butterfly network with n processor nodes, there need to be n(log n + 1) switching nodes. The generation of the interconnection pattern between a pair of adjacent stages depends not only on n but also on the stage number s. In a stage s switch, if the (s + 1)th MSB of j is 0, the data is routed to the upper output wire; otherwise it is routed to the lower output wire.
A k × k mesh will contain k^2 processors, with a maximum path length of 2(k/2 − 1). Every unit in the torus topology is identified using a unique label, with dimensions distinguished as bit positions.
3. Array processors
Array processors are a class of processors that execute a single instruction on an entire array or table of data at the same time, rather than on individual data elements, driven by a common clock. They are also known as vector processors. An array processor executes each instruction on all of its associated data items before moving on to the next instruction. Array elements are incapable of operating autonomously and must be driven by the control unit.
Flynn's taxonomy classifies architectures based on the number of instruction streams and data streams into the following:
1. (SISD) single instruction, single data
2. (MISD) multiple instruction, single data
3. (SIMD) single instruction, multiple data
4. (MIMD) multiple instruction, multiple data
Multiprocessor systems are classified into two types based on coupling:
1. Loosely coupled systems
2. Tightly coupled systems
Concurrency
Concurrent programming refers to techniques for decomposing a task into subtasks that can execute in parallel, and for managing the risks that arise when the program executes more than one task at the same time.
The parallelism or concurrency in a parallel or distributed program can be measured by the ratio of the number of local non-communication and non-shared memory access operations to the total number of operations, including the communication or shared memory access operations.
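As a small worked illustration of this measure (the operation counts below are invented for the example):

```python
# Worked illustration of the concurrency measure: the ratio of local
# non-communication, non-shared-memory operations to the total operations.
# The counts are invented for the example.
local_ops = 800     # local operations (no communication, no shared memory)
comm_ops = 200      # communication / shared-memory access operations
concurrency = local_ops / (local_ops + comm_ops)
print(concurrency)  # 0.8, i.e. 80% of the operations need no coordination
```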
Granularity
Granularity or grain size is a measure of the amount of work or computation that is performed by a task.
Message passing versus distributed shared memory (DSM):
In message passing, processes can communicate with other processes and can be protected from one another by having private address spaces. In DSM, a process does not have a private address space, so one process can alter the execution of another.
Message passing can be used between heterogeneous computers; DSM cannot be used between heterogeneous computers.
In message passing, synchronization between processes is through message-passing primitives; in DSM, synchronization is through locks and semaphores.
Processes communicating via message passing must execute at the same time; processes communicating through DSM may execute with non-overlapping lifetimes.
Efficiency: in message passing, all remote data accesses are explicit, so the programmer is always aware of whether a particular operation is in-process or involves the expense of communication; in DSM, any particular read or update may or may not involve communication by the underlying runtime support.
Buffered: The standard option copies the data from the user buffer to the kernel
buffer. The data later gets copied from the kernel buffer onto the network. For the
Receive primitive, the buffered option is usually required because the data may
already have arrived when the primitive is invoked, and needs a storage place in
the kernel.
Unbuffered: The data gets copied directly from the user buffer onto the network.
Blocking primitives
The primitive commands wait for the message to be delivered. The execution of
the processes is blocked.
The sending process must wait after a send until an acknowledgement is made by the receiver.
The receiving process must wait for the expected message from the sending process.
The receipt is determined by polling a common buffer or by an interrupt.
This is a form of synchronization or synchronous communication.
A primitive is blocking if control returns to the invoking process only after the processing for the primitive completes.
Non Blocking primitives
If the send is non-blocking, it returns control to the caller immediately, before the message is sent.
The advantage of this scheme is that the sending process can continue computing in parallel with the message transmission, instead of having the CPU go idle.
This is a form of asynchronous communication.
A primitive is non-blocking if control returns to the invoking process immediately after invocation, even though the operation has not completed.
For a non-blocking Send, control returns to the process even before the data is copied out of the user buffer.
For a non-blocking Receive, control returns to the process even before the data may have arrived from the sender.
Synchronous
A Send or a Receive primitive is synchronous if both the Send() and Receive() handshake with each other.
The processing for the Send primitive completes only after the invoking processor learns that the corresponding Receive primitive has also been invoked and that the receive operation has been completed.
The processing for the Receive primitive completes when the data to be received is copied into the receiver's user buffer.
Asynchronous
A Send primitive is said to be asynchronous if control returns to the invoking process after the data item to be sent has been copied out of the user-specified buffer.
It does not make sense to define asynchronous Receive primitives.
Implementing non-blocking operations is tricky.
For non-blocking primitives, a return parameter on the primitive call returns a system-generated handle which can later be used to check the status of completion of the call.
The process can check for the completion:
o checking if the handle has been flagged or posted
o issuing a Wait with a list of handles as parameters; the Wait usually blocks until one of the parameter handles is posted.
The send and receive primitives can be implemented in four modes:
Blocking synchronous
Non- blocking synchronous
Blocking asynchronous
Non- blocking asynchronous
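The following is a minimal sketch, using Python threads as a stand-in for a real messaging layer, of how a non-blocking send can return a handle that is later waited on; all names are illustrative and this is not a real message-passing API:

```python
# Sketch only: a non-blocking send that returns a handle, modelled with Python
# threads and a queue standing in for the network. Control returns to the
# caller immediately; the handle is posted when the send completes and can be
# polled (is_set) or waited on (wait).
import threading
import queue
import time

network = queue.Queue()              # stands in for the communication channel

def nonblocking_send(data):
    done = threading.Event()         # the system-generated "handle"
    def do_send():
        time.sleep(0.1)              # pretend copying onto the network takes time
        network.put(data)
        done.set()                   # post the handle on completion
    threading.Thread(target=do_send).start()
    return done                      # caller continues computing immediately

handle = nonblocking_send("m1")
# ... the caller overlaps computation with the transmission here ...
handle.wait()                        # Wait(handle): blocks until the send completes
print(network.get())                 # m1
```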
Fig 1.12 a) Blocking synchronous send and blocking receive; b) Non-blocking synchronous send and blocking receive; c) Blocking asynchronous send; d) Non-blocking asynchronous send
Checking for completion may be necessary if the user wants to reuse the buffer from which the data was sent.
Processor synchrony indicates that all the processors execute in lock-step with their clocks
synchronized.
Since distributed systems do not follow a common clock, this abstraction is implemented
using some form of barrier synchronization to ensure that no processor begins executing the
next step of code until all the processors have completed executing the previous steps of
code assigned to each of the processors.
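A minimal sketch of this barrier idea, using Python's threading.Barrier with threads standing in for processors (illustrative only):

```python
# Sketch of barrier synchronization: no thread (standing in for a processor)
# starts step k+1 until every thread has finished step k.
import threading

NUM_PROCS = 3
barrier = threading.Barrier(NUM_PROCS)

def processor(pid):
    for step in range(2):
        print(f"p{pid} finished step {step}")
        barrier.wait()               # block here until all processors arrive

threads = [threading.Thread(target=processor, args=(i,)) for i in range(NUM_PROCS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```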
RMI versus RPC:
RMI uses an object-oriented paradigm, where the user needs to know the object and the method of the object to be invoked; RPC is not object oriented and does not deal with objects; rather, it calls specific subroutines that are already established.
With RPC, a remote call looks like a local call, and RPC handles the complexities involved with passing the call from the local to the remote computer. RMI likewise handles the complexities of passing along the invocation from the local to the remote computer, but instead of passing a procedural call, RMI passes a reference to the object and the method that is being called.
Asynchronous Execution:
A communication among processes is considered asynchronous, when every
communicating process can have a different observation of the order of the messages being
exchanged. In an asynchronous execution:
there is no processor synchrony and there is no bound on the drift rate of processor clocks
message delays are finite but unbounded
there is no upper bound on the time taken by a process to execute a step
Synchronous Execution:
A communication among processes is considered synchronous when every process observes the same order of messages within the system. In the same manner, the execution is considered synchronous when every individual process in the system observes the same total order of all the events which happen within it. In a synchronous execution:
processors are synchronized and the clock drift rate between any two processors is bounded
message delivery times are such that they occur in one logical step or round
there is a known upper bound on the time taken by a process to execute a step
Fig 1.14: Synchronous execution
Emulating an asynchronous system by a synchronous system (A → S)
An asynchronous program can be emulated on a synchronous system fairly trivially as the
synchronous system is a special case of an asynchronous system – all communication
finishes within the same round in which it is initiated.
Performance
User perceived latency in distributed systems must be reduced. The common issues in
performance:
Metrics: Appropriate metrics must be defined for measuring the performance of theoretical distributed algorithms as well as their implementations.
Measurement methods/tools: The distributed system is a complex entity, so appropriate methodologies and tools must be developed for measuring the performance metrics.
1.8.3 Applications of distributed computing and newer challenges
The deployment environment of distributed systems ranges from mobile systems to
cloud storage. All the environments have their own challenges:
Mobile systems
o Mobile systems which use wireless communication in a shared broadcast medium have issues related to the physical layer such as transmission range, power, battery power consumption, interfacing with the wired internet, signal processing and interference.
o The issues pertaining to the higher layers include routing, location management, channel allocation, localization and position estimation, and mobility management.
o Apart from the above-mentioned common challenges, the architectural differences of mobile networks demand varied treatment. The two architectures are:
Base-station approach (cellular approach): The geographical region is divided into hexagonal physical locations called cells. A powerful base station transmits signals to all other nodes in its range.
Ad-hoc network approach: This is an infrastructure-less approach which does not have any base station to transmit signals. Instead, all the responsibility is distributed among the mobile nodes.
It is evident that the two approaches work in different environments with different principles of communication. Designing a distributed system to cater to these varied needs is a great challenge.
Sensor networks
o A sensor is a processor with an electro-mechanical interface that is capable of sensing physical parameters.
o Sensors are low-cost equipment with limited computational power and battery life. They are designed to handle streaming data and route it to external computer networks and processes.
o They are susceptible to faults and have to reconfigure themselves.
o These features introduce a whole new set of challenges, such as position estimation and time estimation, when designing a distributed system.
Ubiquitous or pervasive computing
o In ubiquitous systems, the processors are embedded in the environment to perform application functions in the background.
o Examples: intelligent devices, smart homes, etc.
o They are distributed systems with recent advancements operating in wireless environments through actuator mechanisms.
o They can be self-organizing and network-centric, with limited resources.
Peer-to-peer computing
o Peer-to-peer (P2P) computing is computing over an application-layer network where all interactions among the processors are at the same level.
o This is a form of symmetric computation, in contrast to the client-server paradigm.
o P2P networks are self-organizing, with or without a regular structure to the network.
o Some of the key challenges include: object storage mechanisms, efficient object lookup and retrieval in a scalable manner; dynamic reconfiguration with nodes as well as objects joining and leaving the network randomly; replication strategies to expedite object search; tradeoffs between object size latency and table sizes; and anonymity, privacy, and security.
Publish-subscribe, content distribution, and multimedia
o Present-day users require only the information of interest to them.
o In a dynamic environment where the information constantly fluctuates, there is great demand for:
o Publish: an efficient mechanism for distributing this information
o Subscribe: an efficient mechanism to allow end users to indicate interest in receiving specific kinds of information
o An efficient mechanism for aggregating large volumes of published information and filtering it as per the user's subscription filter
o Content distribution refers to a mechanism that categorizes the information based on parameters.
o Publish-subscribe and content distribution overlap each other.
o Multimedia data introduces special issues because of its large size.
Distributed agents
o Agents are software processes, or sometimes robots, that move around the system to do specific tasks for which they are programmed.
o Agents collect and process information and can exchange such information with other agents.
o Challenges in distributed agent systems include coordination mechanisms among the agents, controlling the mobility of the agents, and their software design and interfaces.
Distributed data mining
o Data mining algorithms process large amounts of data to detect patterns and trends in the data, in order to mine or extract useful information.
o The mining can be done by applying database and artificial intelligence techniques to a data repository.
Grid computing
Grid computing is deployed to manage resources. For instance, idle CPU cycles of machines connected to the network are made available to others.
The challenges include: scheduling jobs, a framework for implementing quality of service, real-time guarantees, and security.
Security in distributed systems
The challenges of security in a distributed setting include confidentiality, authentication and availability. These must be addressed using efficient and scalable solutions.
1.9 A MODEL OF DISTRIBUTED COMPUTATIONS: DISTRIBUTED PROGRAM
A distributed program is composed of a set of asynchronous processes that communicate by message passing over the communication network. Each process may run on a different processor.
The processes do not share a global memory and communicate solely by passing messages. These processes also do not share a global clock that is instantaneously accessible to them.
Process execution and message transfer are asynchronous – a process may execute an action spontaneously, and a process sending a message does not wait for the delivery of the message to be complete.
The global state of a distributed computation is composed of the states of the processes and the communication channels. The state of a process is characterized by the state of its local memory and depends upon the context.
The state of a channel is characterized by the set of messages in transit in the channel.
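A small sketch of this model, with queues standing in for channels and threads standing in for processes that share no memory (names are illustrative):

```python
# Sketch of the asynchronous message-passing model: processes share no memory
# and interact only over channels (modelled here as queues); the sender does
# not wait for the message to be delivered.
import threading
import queue

c12 = queue.Queue()                      # channel from p1 to p2

def p1():
    c12.put("request")                   # asynchronous send: no wait for delivery
    print("p1 continues its local computation")

def p2():
    msg = c12.get()                      # blocks until the message arrives
    print("p2 received:", msg)

t2 = threading.Thread(target=p2)
t1 = threading.Thread(target=p1)
t2.start(); t1.start()
t1.join(); t2.join()
```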
1.9.1 A MODEL OF DISTRIBUTED EXECUTIONS
When all the above conditions are satisfied, it can be concluded that a → b, i.e., the events are causally related. Consider two events c and d; if both c → d and d → c are false, they are not causally related, and c and d are said to be concurrent events, denoted c || d.
Fig 1.17: Communication between processes
Fig 1.17 shows the communication of messages m1 and m2 between three processes p1, p2 and p3; a, b, c, d, e and f are events. It can be inferred from the diagram that a → b, c → d, e → f, b → c, d → f, a → d, a → f, b → d and b → f. Also a || e and c || e.
A system that supports the causal ordering model satisfies the following property: for any two messages mij and mkj sent to the same process pj, if send(mij) → send(mkj), then rec(mij) → rec(mkj).
Distributed Snapshot represents a state in which the distributed system might have been in. A
snapshot of the system is a single configuration of the system.
The global state of a distributed system is a collection of the local states of its
components, namely, the processes and the communication channels.
The state of a process at any time is defined by the contents of processor registers,
stacks, local memory, etc. and depends on the local context of the distributed
application.
The state of a channel is given by the set of messages in transit in the channel.
The state of a channel is difficult to state formally because a channel is a distributed entity and its state depends upon the states of the processes it connects. Let SCij denote the state of a channel Cij, defined as the set of messages in transit in Cij, i.e., the messages that have been sent along Cij but not yet received.
A distributed snapshot should reflect a consistent state. A global state is consistent if it could
have been observed by an external observer. For a successful Global State, all states must be
consistent:
If we have recorded that a process P has received a message from a process Q, then we should have also recorded that process Q actually sent that message. Otherwise, a snapshot will contain the recording of messages that have been received but never sent.
The reverse condition (Q has sent a message that P has not yet received) is allowed.
The notion of a global state can be graphically represented by what is called a cut. A cut
represents the last event that has been recorded for each process.
The history of each process pi is given by hi = ⟨ei^0, ei^1, ei^2, ...⟩, the sequence of events it executes. Each event is either an internal action of the process or the sending or receiving of a message. We denote by si^k the state of process pi immediately before the kth event occurs. The state si in the global state S corresponding to the cut C is that of pi immediately after the last event processed by pi in the cut, ei^ci. The set of events ei^ci, one per process, is called the frontier of the cut.
A cut is a set of cut events, one per node, each of which captures the state of the node on which it
occurs.
A cut is pictorially a line that slices the space-time diagram, and thus the set of events in the distributed computation, into a PAST and a FUTURE. The PAST contains all the events to the left of the cut and the FUTURE contains all the events to the right of the cut. For a cut C, let PAST(C) and FUTURE(C) denote the set of events in the PAST and FUTURE of C, respectively.
Consistent cut: A consistent global state corresponds to a cut in which every message
received in the PAST of the cut was sent in the PAST of that cut.
Inconsistent cut: A cut is inconsistent if a message crosses the cut from the FUTURE to the
PAST.
The term max(Pasti(ej)) denotes the latest event of process pi that has affected ej. This will always be a message send event.
A cut in a space-time diagram is a line joining an arbitrary point on each process line that
slices the space-time diagram into a PAST and a FUTURE. A consistent global state
corresponds to a cut in which every message received in the PAST of the cut was sent in the
PAST of that cut.
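The consistency condition above can be checked mechanically. The sketch below uses an invented representation in which a message is a pair of (send event, receive event) identifiers and a cut is described by the set of events in its PAST:

```python
# Sketch of the consistency test: a cut is consistent iff every message whose
# receive event lies in the PAST of the cut also has its send event in the
# PAST of the cut. The event/message representation below is invented for
# illustration: a message is a (send_event, receive_event) pair of identifiers.

def is_consistent(past_of_cut, messages):
    return all(send in past_of_cut
               for send, recv in messages
               if recv in past_of_cut)

messages = [("s1", "r1"), ("s2", "r2")]
print(is_consistent({"s1", "r1", "s2"}, messages))  # True: r2 is still in the FUTURE
print(is_consistent({"r1", "s2"}, messages))        # False: r1 recorded but s1 is not
```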
The future of an event ej, denoted by Future(ej), contains all the events ei that are causally affected by ej.
Futurei(ej) is the set of those events of Future(ej) that are on process pi, and min(Futurei(ej)) is the first event on process pi that is affected by ej. All events at a process pi that occurred after max(Pasti(ej)) but before min(Futurei(ej)) are concurrent with ej.
Logical clocks are based on capturing chronological and causal relationships of processes and
ordering events based on these relationships.
In a system of logical clocks, every process has a logical clock that is advanced using a set
of rules. Every event is assigned a timestamp and the causality relation between events can
be generally inferred from their timestamps.
The timestamps assigned to events obey the fundamental monotonicity property; that is, if
an event a causally affects an event b, then the timestamp of a is smaller than the timestamp
of b.
A system of logical clocks consists of a time domain T and a logical clock C. Elements of T form
a partially ordered set over a relation <. This relation is usually called the happened before or
causal precedence.
The logical clock C is a function that maps an event e in a distributed system to an element in the time domain T, denoted C(e), such that for any two events ei and ej, ei → ej implies C(ei) < C(ej).
This monotonicity property is called the clock consistency condition. When T and C satisfy the condition that ei → ej if and only if C(ei) < C(ej), then the system of clocks is said to be strongly consistent.
Data structures:
Each process pi maintains data structures with the following capabilities:
• A local logical clock (lci), that helps process pi measure its own progress.
• A logical global clock (gci), that is a representation of process pi's local view of the logical global time. It allows this process to assign consistent timestamps to its local events.
Protocol:
The protocol ensures that a process’s logical clock, and thus its view of theglobal time, is
managed consistently with the following rules:
Rule 1: Decides the updates of the logical clock by a process. It controls send, receive and
other operations.
Rule 2: Decides how a process updates its global logical clock to update its view of the
global time and global progress. It dictates what information about the logical time is
piggybacked in a message and how this information is used by the receiving process to
update its view of the global time.
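A minimal sketch of these two rules for scalar (Lamport) clocks, with an illustrative class and an increment value d = 1:

```python
# Sketch of scalar (Lamport) clocks. Before any event, a process applies R1
# and increments its clock by d (= 1 here). On a receive, it applies R2 first:
# it takes the maximum of its own clock and the piggybacked timestamp, then
# applies R1 for the receive event. The class name is illustrative.

class ScalarClock:
    def __init__(self, d=1):
        self.c = 0
        self.d = d

    def local_or_send_event(self):
        self.c += self.d                 # R1
        return self.c                    # value piggybacked on an outgoing message

    def receive_event(self, c_msg):
        self.c = max(self.c, c_msg)      # R2: merge the piggybacked timestamp
        self.c += self.d                 # R1 for the receive event itself
        return self.c

p1, p2 = ScalarClock(), ScalarClock()
ts = p1.local_or_send_event()            # p1 sends a message stamped ts = 1
print(p2.receive_event(ts))              # p2's clock becomes 2 (> ts)
```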
2. Total Ordering: Scalar clocks can be used to totally order events in a distributed system. However, events at different processes may carry identical timestamps, so a tie-breaking mechanism is essential to order such events. The tie breaking is done through:
Linearly ordering the process identifiers.
Giving higher priority to the process with the lower identifier value.
The term (t, i) denotes the timestamp of an event, where t is its time of occurrence and i is the identity of the process where it occurred.
The total order relation ≺ over two events x and y with timestamps (h, i) and (k, j), respectively, is given by: x ≺ y if and only if (h < k) or (h = k and i < j).
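A small sketch of this tie-breaking comparison (the values used are illustrative):

```python
# Sketch of the tie-breaking comparison on (timestamp, process id) pairs:
# lower timestamp first; on equal timestamps, the lower process id wins.
def precedes(x, y):
    (h, i), (k, j) = x, y
    return h < k or (h == k and i < j)

print(precedes((3, 2), (3, 5)))   # True: same timestamp, lower process id
print(precedes((4, 1), (3, 5)))   # False: 4 > 3
```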
3. Event Counting
If an event e has a timestamp h, then h − 1 represents the minimum logical duration, counted in units of events, required before producing the event e; this is called the height of the event e. In other words, h − 1 events have been produced sequentially before the event e, regardless of the processes that produced these events.
4. No strong consistency
The reason that scalar clocks are not strongly consistent is that the logical local clock and the logical global clock of a process are squashed into one, resulting in the loss of causal dependency information among events at different processes.
In vector time, the time domain is represented by a set of n-dimensional non-negative integer vectors.
There is an isomorphism between the set of partially ordered events produced by a
distributed computation and their vector timestamps.
If the process at which an event occurred is known, the test to compare two timestamps can be simplified as follows: if events x and y occurred at processes pi and pj and are assigned timestamps vh and vk respectively, then
x → y if and only if vh[i] ≤ vk[i]
x || y if and only if vh[i] > vk[i] and vh[j] < vk[j]
2. Strong consistency
The system of vector clocks is strongly consistent; thus, by examining the vector timestamp
of two events, we can determine if the events are causally related.
3. Event counting
If an event e has timestamp vh, vh[j] denotes the number of events executed by process pj
that causally precede e.
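A minimal sketch of vector clocks consistent with the description above: each process keeps an n-entry vector, increments its own entry on local and send events, and takes a component-wise maximum on receive; a component-wise comparison then decides causality. Function names are illustrative:

```python
# Sketch of vector clocks for n processes. On a local or send event, process i
# increments entry i of its vector; on a receive, it takes the component-wise
# maximum with the piggybacked vector and then increments its own entry.
# happened_before implements the usual component-wise comparison.

def tick(v, i):
    v = list(v)
    v[i] += 1
    return v

def merge_on_receive(v_own, v_msg, i):
    merged = [max(a, b) for a, b in zip(v_own, v_msg)]
    return tick(merged, i)

def happened_before(v, w):
    return all(a <= b for a, b in zip(v, w)) and v != w

n = 3
va = tick([0] * n, 0)                    # event a at p1 -> [1, 0, 0]
vb = merge_on_receive([0] * n, va, 1)    # p2 receives p1's message -> [1, 1, 0]
print(happened_before(va, vb))           # True: a causally precedes the receive
print(happened_before(vb, va))           # False
```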
Clock synchronization is the process of ensuring that physically distributed processors have a
common notion of time.
Due to differing clock rates, the clocks at various sites may diverge with time, and periodically a clock synchronization must be performed to correct this clock skew in distributed systems. Clocks are synchronized to an accurate real-time standard like UTC (Coordinated Universal Time). Clocks that must not only be synchronized with each other but also have to adhere to physical time are termed physical clocks. This degree of synchronization additionally makes it possible to coordinate and schedule actions between multiple computers connected to a common network.
Fig 1.30 a) Offset and delay estimation between processes from the same server; b) Offset and delay estimation between processes from different servers
Let T1, T2, T3, T4 be the values of the four most recent timestamps. Assume that the clocks A and B are stable and running at the same speed. Let a = T1 − T3 and b = T2 − T4. If the network delay difference from A to B and from B to A, called the differential delay, is small, the clock offset θ and roundtrip delay δ of B relative to A at time T4 are approximately given by:
θ = (a + b)/2,  δ = a − b
Each NTP message includes the latest three timestamps T1, T2 and T3, while T4 is determined upon arrival.
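A direct transcription of the offset and delay formulas above into code (the timestamp values in the example call are arbitrary and purely illustrative):

```python
# Direct transcription of the formulas above: a = T1 - T3, b = T2 - T4,
# offset theta = (a + b) / 2 and roundtrip delay delta = a - b.
# The timestamps passed in the example call are arbitrary illustrative values.

def ntp_offset_and_delay(t1, t2, t3, t4):
    a = t1 - t3
    b = t2 - t4
    theta = (a + b) / 2      # clock offset of B relative to A at time T4
    delta = a - b            # roundtrip delay
    return theta, delta

theta, delta = ntp_offset_and_delay(20.0, 12.0, 10.0, 15.0)
print(theta, delta)
```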