CS3551 Unit 1 Part1

Chapter 1: Introduction

Ajay Kshemkalyani and Mukesh Singhal
Distributed Computing: Principles, Algorithms, and Systems
Cambridge University Press


Definition
A distributed system is a collection of independent computers interconnected through a communication network (autonomous processors communicating over a communication network), capable of collaborating on a task.

"A collection of independent computers that appears to its users as a single coherent system." (Tanenbaum)

Distributed computing is computing done on a distributed system.

Some features/characteristics:
- No common physical clock – the clock of each system drifts independently
- No shared memory
- Geographical separation – NoW, CoW; the Google search engine is based on a NoW
  (NoW – Network of Workstations; CoW – Cluster of Workstations)
- Autonomy and heterogeneity

Distributed System Model

A typical distributed system is shown in Figure 1.1. Each computer has a memory-processing unit, and the computers are connected by a communication network (WAN/LAN).

Figure 1.1: A distributed system connects processors (P) with memory banks (M) by a communication network.

Relation between Software Components

Figure 1.2: Interaction of the software components at each process. The distributed application runs on top of the distributed software (middleware libraries), which in turn uses the operating system and the network protocol stack (application, transport, network, and data link layers). The extent of the distributed protocols spans the middleware and the protocol stack.


Figure 1.2 shows the relationships of the software components that run on each
of the computers and use the local operating system and network protocol
stack for functioning.

The distributed software is also termed middleware. The middleware is the distributed software that drives the distributed system, while providing transparency of heterogeneity.

Figure 1.2 schematically shows the interaction of the middleware software with
the system components. Various primitives and function calls defined in various
libraries of the middleware layer are used in the user program code.

There exist several libraries to invoke primitives for the more common functions
of the middleware layer such as reliable and ordered multicasting. Currently
deployed commercial versions of middleware use CORBA, DCOM (distributed
component object model), Java, and RMI (remote method invocation)
technologies. The message-passing interface (MPI) is an example of an
interface for various communication functions.

A distributed execution is the execution of processes across the distributed system to collaboratively achieve a common goal. An execution is also sometimes termed a computation or a run.

The distributed system uses a layered architecture to break down the complexity of system design.

Motivation for Distributed Systems

- Inherently distributed computation – e.g., a banking application
- Resource sharing – sharing of hardware and software resources
  - Hardware resources: CPU, printers, scanners, cameras, etc.
  - Software resources: data sources, files, objects, etc.
- Access to remote resources
- Increased performance/cost ratio
- Reliability
  - Availability: resources are accessible at all times
  - Integrity: trustworthiness of data; changes made in one replica must be propagated to all replicas
  - Fault tolerance: the ability to recover from failures automatically

Message-Passing vs. Shared Memory

- Shared memory systems are those in which there is a shared address space.
- Communication among processors takes place via shared data variables.
- Semaphores and monitors were designed for shared-memory uniprocessors and multiprocessors.
- Message-passing systems are multicomputer systems that do not have a shared address space and communicate by passing messages.

Emulating MP over SM:
- Partition the shared address space
- Send/Receive emulated by writing to/reading from a special mailbox per pair of processes

Emulating SM over MP:
- Model each shared object as a process
- A write to a shared object is emulated by sending a message to the owner process of the object
- A read from a shared object is emulated by sending a query to the owner of the object
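The SM-over-MP emulation can be sketched in Python, with threads and queues standing in for processes and the network (a toy model under those assumptions; the names `owner`, `read`, and `write` are illustrative, not a real API):

```python
import threading, queue

def owner(init, requests):
    # the owner process serializes every access to one shared object
    value = init
    while True:
        op, arg, reply = requests.get()
        if op == "write":
            value = arg
            reply.put("ack")               # acknowledge the write
        else:                              # "read"
            reply.put(value)

requests = queue.Queue()                   # stands in for the network
threading.Thread(target=owner, args=(0, requests), daemon=True).start()

def write(v):
    reply = queue.Queue()
    requests.put(("write", v, reply))      # Write = message to owner
    reply.get()                            # wait for the ack

def read():
    reply = queue.Queue()
    requests.put(("read", None, reply))    # Read = query to owner
    return reply.get()

write(42)
assert read() == 42                        # owner returns the last write
```

Because the owner handles one request at a time, all reads and writes to the object are totally ordered, which is exactly what the shared-memory abstraction requires.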



Message Communication Primitives
Message send and message receive communication primitives are
denoted Send() and Receive(), respectively.

A Send primitive has at least two parameters– the destination, and the
buffer in the user space, containing the data to be sent. Similarly, a
Receive primitive has at least two parameters – the source from which
the data is to be received and the user buffer into which the data is to
be received.

There are two ways of sending data when the Send primitive is invoked – the buffered option and the unbuffered option. The buffered option, which is the standard option, copies the data from the user buffer to the kernel buffer; the data is later copied from the kernel buffer onto the network. In the unbuffered option, the data is copied directly from the user buffer onto the network.

For the Receive primitive, the buffered option is usually required


because the data may already have arrived when the primitive is
invoked, and needs a storage place in the kernel.
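The difference in copy semantics can be illustrated with a small Python sketch, where a `queue.Queue` stands in for the kernel buffer and network (the function names `send_buffered`/`send_unbuffered` are illustrative, not a real API):

```python
import queue

channel = queue.Queue()    # stands in for the kernel buffer + network

def send_buffered(buf):
    # buffered option: the data is copied out of the user buffer at
    # send time, so the sender may immediately reuse the buffer
    channel.put(bytes(buf))

def send_unbuffered(buf):
    # unbuffered option (toy version): no copy is made, so the
    # receiver sees whatever the buffer holds at receive time
    channel.put(buf)

buf = bytearray(b"hello")
send_buffered(buf)
buf[:] = b"world"          # overwrite the user buffer after Send
first = channel.get()      # the snapshot taken at send time

send_unbuffered(buf)
buf[:] = b"again"          # overwrite again, before the receive
second = channel.get()     # aliases the user buffer

assert first == b"hello"
assert bytes(second) == b"again"
```

This is why, with the unbuffered option, the user must not touch the buffer until the transfer has actually completed.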

Classification of Primitives

Synchronous (Send/Receive)
- Handshake between sender and receiver
- Send completes when Receive completes
- Receive completes when data is copied into the buffer

Asynchronous (Send)
- Control returns to the process when the data has been copied out of the user-specified buffer

Blocking (Send/Receive)
- Control returns to the invoking process after processing of the primitive (whether sync or async) completes

Nonblocking (Send/Receive)
- Control returns to the process immediately after invocation
- Send: even before the data has been copied out of the user buffer
- Receive: even before the data may have arrived from the sender


Non-blocking Primitive

Send(X, destination, handlek)    // handlek is a return parameter
...
Wait(handle1, handle2, ..., handlek, ..., handlem)    // Wait always blocks

Figure 1.7: A non-blocking send primitive. When the Wait call returns, at least one of its parameters is posted.

The return parameter returns a system-generated handle:
- Used later to check the status of completion of the call
- Keep checking (in a loop, or periodically) whether the handle has been posted
- Issue a Wait(handle1, handle2, ...) call with a list of handles
- The Wait call blocks until one of the stipulated handles is posted
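The handle-and-Wait pattern of Figure 1.7 can be sketched in Python, with a `threading.Event` playing the role of the system-generated handle (a toy model; `nb_send` and `wait_any` are illustrative names, not a real API):

```python
import threading, queue

channel = queue.Queue()    # stands in for the kernel buffer

def nb_send(data, handle):
    # non-blocking Send: control returns to the caller at once;
    # a helper thread posts the handle when the data has been
    # copied out of the user buffer
    def do_copy():
        channel.put(bytes(data))
        handle.set()                    # post the handle
    threading.Thread(target=do_copy).start()

def wait_any(handles):
    # Wait always blocks until at least one stipulated handle
    # is posted, then returns the posted handles
    while True:
        posted = [h for h in handles if h.is_set()]
        if posted:
            return posted
        handles[0].wait(timeout=0.01)   # poll (toy implementation)

handle = threading.Event()
nb_send(bytearray(b"payload"), handle)  # returns immediately
posted = wait_any([handle])
assert posted == [handle]
assert channel.get() == b"payload"
```

The caller is free to do other work between `nb_send` and `wait_any`, which is the whole point of the non-blocking variant.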

Blocking/Non-blocking and Synchronous/Asynchronous Send/Receive Primitives

Figure 1.8: Blocking/non-blocking and synchronous/asynchronous primitives. Process Pi is sending and process Pj is receiving. (a) Blocking synchronous Send and blocking (synchronous) Receive. (b) Non-blocking synchronous Send and non-blocking (synchronous) Receive. (c) Blocking asynchronous Send. (d) Non-blocking asynchronous Send.
Blocking synchronous Send: The data gets copied from user buffer to kernel
buffer and is then sent over the network. After the data is copied to the receiver’s
system buffer and a Receive call has been issued, an acknowledgement is sent
back to the sender. This causes control to return to the process that invoked the
Send operation and Send completes.

Blocking Receive: It blocks until the data expected arrives and is written in the
specified user buffer. Then control is returned to the user process.

Non-blocking synchronous Send: Control returns to the invoking process as soon as the copying of data from the user buffer to the kernel buffer is initiated. A parameter in the non-blocking call gets set with the handle of a location that the user process can check later for the completion of the synchronous Send operation.

Non-blocking Receive: It will cause the kernel to register the call and return the
handle of a location that the user process can later check for the completion of
the non-blocking Receive operation. The user process can check for the
completion of the non-blocking Receive by invoking the Wait operation on the
returned handle.
Blocking asynchronous Send: The user process that invokes the Send is blocked until the data is copied from the user's buffer to the kernel buffer. For the unbuffered option, it is blocked until the data is copied from the user's buffer onto the network.

Non-blocking asynchronous Send: The Send is blocked only until the transfer of data from the user's buffer to the kernel buffer is initiated. For the unbuffered option, it is blocked until the transfer from the user's buffer onto the network is initiated. Control returns to the user process as soon as this transfer is initiated, and a parameter in the non-blocking call gets set with a handle that the user process can later check, using the Wait operation, for completion of the asynchronous Send operation.
Asynchronous Executions: Message-passing Systems
An asynchronous execution is an execution in which
(i) there is no processor synchrony and there is no bound on the drift rate of
processor clocks
(ii) message delays (transmission + propagation times) are finite but
unbounded
(iii) there is no upper bound on the time taken by a process to execute a step.
An example asynchronous execution with four processes P0 to P3 is shown in Figure
1.9. The arrows denote the messages; the tail and head of an arrow mark the send
and receive event for that message, denoted by a circle and vertical line, respectively.
Non-communication events, also termed as internal events, are shown by shaded
circles.

Figure 1.9: Asynchronous execution in a message-passing system.

Synchronous Executions: Message-passing Systems
A synchronous execution is an execution in which
(i) processors are synchronized and the clock drift rate between any two
processors is bounded
(ii) message delivery (transmission + delivery) times are such that they occur
in one logical step or round
(iii) there is a known upper bound on the time taken by a process to execute
a step.

An example of a synchronous execution with four processes P0 to P3 is shown in Figure 1.10. The arrows denote the messages.

Figure 1.10: An example of a synchronous execution in a message-passing system. All the messages sent in a round are received within that same round.
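The round structure can be sketched in Python, with a `threading.Barrier` standing in for the end-of-round synchronization (a toy model: threads play the processes, shared lists play the channels; every message sent in round r is delivered within round r):

```python
import threading

N, ROUNDS = 3, 2
# inbox[r][i] holds the messages delivered to process i in round r;
# every message sent in a round is received within that same round
inbox = [[[] for _ in range(N)] for _ in range(ROUNDS)]
barrier = threading.Barrier(N)             # end-of-phase synchronization
received = [[] for _ in range(N)]

def process(i):
    for r in range(ROUNDS):
        inbox[r][(i + 1) % N].append((i, r))   # send to right neighbour
        barrier.wait()                         # all sends of round r done
        received[i].extend(inbox[r][i])        # deliver round-r messages
        barrier.wait()                         # round r complete everywhere

threads = [threading.Thread(target=process, args=(i,)) for i in range(N)]
for t in threads: t.start()
for t in threads: t.join()

# each process received exactly one message per round, from its left neighbour
assert received[0] == [(2, 0), (2, 1)]
```

The two barriers per round are what make the execution synchronous: no process starts round r+1 before every message of round r has been delivered.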


Synchronous vs. Asynchronous Executions (1)

Asynchronous execution:
- No processor synchrony, no bound on the drift rate of clocks
- Message delays finite but unbounded
- No bound on the time for a step at a process

Synchronous execution:
- Processors are synchronized; the clock drift rate is bounded
- Message delivery occurs in one logical step/round
- Known upper bound on the time to execute a step at a process


Synchronous vs. Asynchronous Executions (2)

It is difficult to build a truly synchronous system; one can only simulate this abstraction.

Virtual synchrony:
- Asynchronous execution, in which processes synchronize as per application requirements
- Execute in rounds/steps

Emulations:
- An asynchronous program on a synchronous system: trivial (A is a special case of S)
- A synchronous program on an asynchronous system: uses a tool called a synchronizer


System Emulations

Figure 1.11: Sync ↔ async, and shared memory ↔ message-passing emulations. Assumption: a failure-free system.


Challenges: System Perspective (1)

- Communication mechanisms: e.g., Remote Procedure Call (RPC), remote object invocation (ROI), message-oriented vs. stream-oriented communication
- Processes: code migration, process/thread management at clients and servers, design of software and mobile agents
- Naming: easy-to-use identifiers are needed to locate resources and processes transparently and scalably
- Synchronization: e.g., mutual exclusion, leader election, synchronizing physical clocks, devising logical clocks
- Data storage and access
  - Schemes for data storage, search, and lookup should be fast and scalable across the network
  - Revisit file system design
- Consistency and replication
  - Replication for fast access, scalability, and to avoid bottlenecks

Challenges: System Perspective (2)

- Fault tolerance: correct and efficient operation despite link, node, and process failures
- Distributed systems security
  - Secure channels, access control, key management (key generation and key distribution), authorization, secure group management
- Scalability and modularity of algorithms, data, and services
- Some experimental systems: Globe, Globus, Grid

Challenges: System Perspective (3)

- APIs for communications and services: ease of use
- Transparency: hiding implementation policies from the user
  - Access: hide differences in data representation across systems; provide uniform operations to access resources
  - Location: locations of resources are transparent
  - Migration: relocate resources without renaming them
  - Relocation: relocate resources even as they are being accessed
  - Replication: hide replication from the users
  - Concurrency: mask the use of shared resources
  - Failure: reliable and fault-tolerant operation

Challenges: Algorithm/Design (1)

- Designing useful execution models and frameworks to reason with and design correct distributed programs
  - Interleaving model
  - Partial order model
  - Input/output automata
  - Temporal Logic of Actions
- Dynamic distributed graph algorithms and routing algorithms
  - System topology: a distributed graph, with only local neighborhood knowledge
  - Graph algorithms: building blocks for group communication, data dissemination, and object location
  - Algorithms need to deal with dynamically changing graphs
  - Algorithm efficiency also impacts resource consumption, latency, traffic, and congestion

Challenges: Algorithm/Design (2)

- Time and global state
  - Physical time (clock) accuracy: the challenge is to provide accurate physical time
  - Logical time captures inter-process dependencies and tracks relative time progression

Challenges: Algorithm/Design (3)

- Synchronization/coordination mechanisms
  - Physical clock synchronization: hardware drift needs correction
  - Leader election: select a distinguished process, due to inherent symmetry
  - Mutual exclusion: coordinate access to critical resources
  - Distributed deadlock detection and resolution: need to observe global state; avoid duplicate detection and unnecessary aborts
  - Termination detection: a global state of quiescence; no CPU processing and no in-transit messages
  - Garbage collection: reclaim objects no longer pointed to by any process

Challenges: Algorithm/Design (4)

- Group communication, multicast, and ordered message delivery: efficient algorithms for group communication and group management must be designed
  - Group: processes sharing a context and collaborating
  - Multiple joins, leaves, and failures
  - Concurrent sends: the semantics of delivery order need to be specified
- Monitoring distributed events and predicates
  - Predicates must be defined for conditions on the global system state
  - Uses: debugging, environmental sensing, industrial process control, analyzing event streams
- Distributed program design and verification tools
  - Debugging distributed programs is hard, so tools for debugging have to be designed

Challenges: Algorithm/Design (5)

- Data replication, consistency models, and caching
  - Fast, scalable access
  - Coordinate replica updates
  - Optimize replica placement
- World Wide Web design: caching, searching, scheduling
  - A global-scale distributed system with a direct interface to end users
  - Read-intensive operations; caching
  - Object search and navigation are resource-intensive
  - User-perceived latency must be minimized by minimizing response time

Challenges: Algorithm/Design (6)

- Distributed shared memory abstraction
  - Wait-free algorithm design: each process completes its execution irrespective of the actions of other processes, i.e., (n - 1)-fault resilience
  - Mutual exclusion: the Bakery algorithm, semaphores, algorithms based on atomic hardware primitives, fast algorithms for contention-free access
  - Register constructions: revisit assumptions about memory access; what behavior under concurrent unrestricted access to memory? A foundation for future architectures, decoupled from technology (semiconductor, biocomputing, quantum, ...)
  - Consistency models: coherence versus access-cost trade-off; models weaker than the strict consistency of uniprocessors
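Lamport's Bakery algorithm, mentioned above for mutual exclusion, can be sketched in Python (a toy model: threads stand in for processes, and it relies on CPython's atomic list-element access; a real shared-memory implementation would need memory fences):

```python
import threading, time

N = 3                      # number of processes (threads here)
choosing = [False] * N
number = [0] * N
count = 0                  # shared counter, protected by the bakery lock

def lock(i):
    # Bakery: take a ticket larger than any outstanding one
    choosing[i] = True
    number[i] = 1 + max(number)
    choosing[i] = False
    for j in range(N):
        if j == i:
            continue
        while choosing[j]:                 # wait while j picks a ticket
            time.sleep(0)                  # yield while spinning
        # defer to any process with a smaller (ticket, id) pair
        while number[j] != 0 and (number[j], j) < (number[i], i):
            time.sleep(0)

def unlock(i):
    number[i] = 0                          # give up the ticket

def worker(i):
    global count
    for _ in range(100):
        lock(i)
        count += 1                         # critical section
        unlock(i)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads: t.start()
for t in threads: t.join()
assert count == N * 100                    # no lost updates
```

Ties on ticket numbers are broken by process id, which is why the comparison uses the pair (number[j], j) rather than the ticket alone.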


Challenges: Algorithm/Design (7)

- Reliable and fault-tolerant distributed systems
  - Consensus algorithms: processes reach agreement in spite of faults (under various fault models)
  - Replication and replica management
  - Voting and quorum systems
  - Distributed databases and commit protocols: ACID properties
  - Self-stabilizing systems: an "illegal" system state automatically changes to a "legal" state; requires built-in redundancy
  - Checkpointing and recovery algorithms: roll back and restart from an earlier "saved" state
  - Failure detectors:
    - It is difficult to distinguish a "slow" process or message from a failed process or a never-sent message
    - Algorithms that "suspect" a process as having failed and converge on a determination of its up/down status

Challenges: Algorithm/Design (8)

- Load balancing: to reduce latency and increase throughput dynamically, e.g., in server farms
  - Computation migration: relocate processes to redistribute the workload
  - Data migration: move data, based on access patterns
  - Distributed scheduling: across processors
- Real-time scheduling: difficult without a global view; network delays make the task harder
- Performance modeling and analysis: network latency to access resources must be reduced
  - Metrics: theoretical measures for algorithms, practical measures for systems
  - Measurement methodologies and tools

Applications and Emerging Challenges (1)

- Mobile systems
  - Wireless communication: unit disk model; broadcast medium (MAC), power management, etc.
  - CS perspective: routing, location management, channel allocation, localization and position estimation, mobility management
  - Base station model (cellular model)
  - Ad-hoc network model (rich in distributed graph theory problems)
- Sensor networks: processors with an electro-mechanical interface
- Ubiquitous or pervasive computing
  - Processors embedded in and seamlessly pervading the environment
  - Wireless sensor and actuator mechanisms; self-organizing; network-centric; resource-constrained
  - E.g., intelligent home, smart workplace

Applications and Emerging Challenges (2)

- Peer-to-peer computing
  - No hierarchy; symmetric roles; self-organizing; efficient object storage and lookup; scalable; dynamic reconfiguration
- Publish/subscribe and content distribution
  - Filtering information to extract that of interest
- Distributed agents
  - Processes that move and cooperate to perform specific tasks; coordination, controlling mobility, software design and interfaces
- Distributed data mining
  - Extract patterns/trends of interest
  - Data not available in a single repository

Applications and Emerging Challenges (3)

- Grid computing
  - A grid of shared computing resources; use idle CPU cycles
  - Issues: scheduling, QoS guarantees, security of machines and jobs
- Security
  - Confidentiality, authentication, and availability in a distributed setting
  - Managing wireless, peer-to-peer, and grid environments
    - Issues: e.g., lack of trust, broadcast media, resource constraints, lack of structure
