Unit 1 Part 2

The document discusses primitives for distributed computing, including the Send and Receive primitives. It covers synchronous vs asynchronous communication and blocking vs non-blocking operations. It also discusses process synchronization, libraries and standards, and design issues and challenges in distributed systems.

CS 3551 DISTRIBUTED COMPUTING
Primitives for Distributed Computing
Function: Send()
Parameters: 1. the destination; 2. the buffer in the user space containing the data to be sent.
Need: to send data from one system to another.

Function: Receive()
Parameters: 1. the source from which the data is to be received; 2. the user buffer into which the data is to be received.
Need: to receive data from another system.
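As a rough illustration, a minimal in-process sketch of these two primitives, using a Python queue as a stand-in for the network and kernel buffers (the process names and channel keying are assumptions for this example, not part of any real API):

```python
import queue

# Hypothetical stand-in for per-channel kernel buffers; a real Send would
# copy the data across the network into the receiver's system buffer.
kernel_buffers = {}

def send(sender, destination, user_buffer):
    # Send(): copy data out of the sender's user-space buffer toward the destination
    chan = kernel_buffers.setdefault((sender, destination), queue.Queue())
    chan.put(bytes(user_buffer))

def receive(source, receiver, user_buffer):
    # Receive(): block until data from `source` arrives, then copy it
    # into the user buffer supplied by the receiver
    chan = kernel_buffers.setdefault((source, receiver), queue.Queue())
    data = chan.get()                 # blocks until data is available
    user_buffer[:len(data)] = data
    return len(data)

send("P1", "P2", b"hello")
buf = bytearray(16)
n = receive("P1", "P2", buf)          # n == 5, buf starts with b"hello"
```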

Sl   Primitive      Shortcut
1    Synchronous    I care for you, you care for me
2    Asynchronous   I don’t care for you
3    Blocking       I care for my task completion
4    Non-blocking   I don’t even care for my own task


Let’s Understand With Example
Need of Handles
Key Points
Primitives Part 2a: Blocking Synchronous
Send

● The data gets copied from the user buffer to the kernel buffer and is then sent over the network.
● After the data is copied to the receiver’s system buffer and a Receive call has been issued, an acknowledgement back to the sender causes control to return to the process that invoked the Send operation, completing the Send.

Receive

● The Receive call blocks until the expected data arrives and is written into the specified user buffer.
● Then control is returned to the user process.
Primitives Part 2b: Non-blocking Synchronous
Send

● Control returns to the invoking process as soon as the copy of data from the user buffer to the kernel buffer is initiated.
● A handle is returned as a parameter, which can be used to check for completion of the operation.

Receive

● When a Receive call is made, the system returns a special ID (a handle).
● The handle can be used to check whether the non-blocking Receive operation has finished.
● The system marks the handle as "done" when the expected data arrives and is placed in the specified user buffer.
Primitives Part 2c: Blocking Asynchronous
Send

● The user process that invokes the Send is blocked until the data is copied from the user’s buffer to the kernel buffer.
Primitives Part 2d: Non-blocking Asynchronous
Send

● The user process that invokes the Send is blocked until the transfer of the data from the user’s buffer to the kernel buffer is initiated.
● Control returns to the user process as soon as this transfer is initiated, and a handle is given back.
● The asynchronous Send completes when the data has been copied out of the user’s buffer.
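The blocking/non-blocking distinction above can be sketched with a thread pool, where a Future plays the role of the handle (the pool, the sleep duration, and the function names are assumptions for illustration, not any textbook API):

```python
import time
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=1)

def copy_to_kernel_buffer(data):
    time.sleep(0.05)          # stand-in for the copy out of the user buffer
    return len(data)

def blocking_send(data):
    # blocking: control returns only after the copy has completed
    return copy_to_kernel_buffer(data)

def nonblocking_send(data):
    # non-blocking: control returns immediately; the Future is the handle
    return pool.submit(copy_to_kernel_buffer, data)

n = blocking_send(b"abc")             # returns 3 only after the copy finishes
handle = nonblocking_send(b"hello")   # returns at once
# ... the process can execute other instructions here ...
done_n = handle.result()              # use the handle to await/check completion
pool.shutdown()
```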
Key Points
1. A synchronous Send is easier to use from a programmer’s perspective because the
handshake between the Send and the Receive makes the communication appear
instantaneous, thereby simplifying the program logic.
2. The non-blocking asynchronous Send (see Figure 1.8(d)) is useful when a large
data item is being sent because it allows the process to perform other instructions in
parallel with the completion of the Send.
3. The non-blocking synchronous Send (see Figure 1.8(b)) also avoids the potentially
large delays for handshaking, particularly when the receiver has not yet issued the
Receive call.
4. The non-blocking Receive (see Figure 1.8(b)) is useful when a large data item is
being received and/or when the sender has not yet issued the Send call, because it
allows the process to perform other instructions in parallel with the completion of the
Receive.
Process Synchrony
Libraries and Standards

● In computer systems, there are many ways for programs to talk to each other, like
sending messages or making remote calls. Different software products and scientific
tools use their own special ways to do this.
● For example, some big companies use their custom methods, like IBM's CICS
software. Scientists often use libraries called MPI or PVM. Commercial software
often uses a method called RPC, which lets you call functions on a different
computer like you would on your own computer.
● All these methods use something like a hidden network phone line (called "sockets")
to make these remote calls work.
● There are many types of RPC, like Sun RPC and DCE RPC. There are also other
ways to communicate, like "messaging" and "streaming."
● As software evolves, there are new methods like RMI and ROI for object-based
programs, and big standardized systems like CORBA and DCOM.
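For a concrete feel of RPC, Python's standard-library xmlrpc can make a remote function call look like a local one; the port number and the `add` function below are invented for this sketch:

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def add(a, b):
    # an ordinary local function, exposed for remote invocation
    return a + b

# server side: register the function and serve requests in the background
server = SimpleXMLRPCServer(("127.0.0.1", 8099), logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# client side: the proxy call looks just like a local call,
# but travels over a socket underneath
proxy = ServerProxy("http://127.0.0.1:8099")
result = proxy.add(2, 3)              # remote call, returns 5
server.shutdown()
```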
Synchronous vs Asynchronous Execution

Asynchronous:
1. There is no processor synchrony, and there is no bound on the drift rate of processor clocks.
2. Message delays (transmission + propagation times) are finite but unbounded.
3. There is no upper bound on the time taken by a process to execute a step.

Synchronous:
1. Processors are synchronized, and the clock drift rate between any two processors is bounded.
2. Message delivery (transmission + delivery) times are such that they occur in one logical step or round.
3. There is a known upper bound on the time taken by a process to execute a step.
Key Points
● It is easier to design and verify algorithms assuming synchronous executions because of the coordinated nature of the executions at all the processes.
● It is practically difficult to build a completely synchronous system and have the messages delivered within a bounded time; synchrony is therefore usually simulated (emulated).
● Thus, synchronous execution is an abstraction that needs to be provided to the programs. When implementing this abstraction, observe that the fewer the steps or “synchronizations” of the processors, the lower the delays and costs.
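Synchronous execution can be emulated as lockstep rounds in which every message sent in a round is delivered within that same round. A toy sketch, computing the maximum on a unidirectional ring (the function name and topology are assumptions for illustration):

```python
def synchronous_rounds(values, rounds):
    # one process per list entry; each round, every process sends its
    # current value to its right neighbour on the ring, then takes the
    # max of its own value and the value it received -- all in one
    # logical step, as the synchronous model assumes
    n = len(values)
    state = list(values)
    for _ in range(rounds):
        incoming = [state[(i - 1) % n] for i in range(n)]  # simultaneous delivery
        state = [max(state[i], incoming[i]) for i in range(n)]
    return state

# after n - 1 rounds every process knows the global maximum
final = synchronous_rounds([3, 1, 4, 1, 5], 4)   # [5, 5, 5, 5, 5]
```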
Emulating Systems
Design Issue and Challenges

1. From System Perspective


2. Algorithmic Challenges
3. Applications of distributed computing
1. From System Perspective (CSCANS-PDF)

Communication: designing appropriate mechanisms for communication among the processes in the network, e.g. RPC, ROI, and message-oriented vs stream-oriented communication.

Synchronization: mechanisms for synchronization among processes, e.g. mutual exclusion and leader election algorithms; global state recording algorithms and physical clocks also need synchronization.

Consistency and replication: data is replicated for performance, but it is essential that the replicated copies remain consistent.

API and transparency:
1. An API is required for ease of use.
2. Access transparency hides differences in data representation on different systems and provides uniform operations to access system resources.
3. Location transparency makes the locations of resources transparent to the users.
4. Migration transparency allows relocating resources without changing their names.
1. From System Perspective (CSCANS-PDF)

Naming: easy-to-use and robust naming for identifiers and addresses is essential for locating resources and processes in a transparent and scalable manner.

Security & scalability: involves various aspects of cryptography, secure channels, access control, key management (generation and distribution), authorization, and secure group management. The system must also be scalable to handle large loads.

Processes: management of processes and threads at clients/servers; code migration; and the design of software and mobile agents.

Data storage and access: schemes for data storage, and implicitly for accessing the data in a fast and scalable manner across the network, are important for efficiency.

Fault tolerance:
● Fault tolerance requires maintaining correct and efficient operation in spite of any failures of links, nodes, and processes.
● Process resilience, reliable communication, distributed commit, checkpointing and recovery, agreement and consensus, failure detection, and self-stabilization are some of the mechanisms that provide fault tolerance.
2. Algorithmic Challenges (CREPT-MGW)

Group communication, multicast, and ordered message delivery:
1. A group is a collection of processes that share a common context and collaborate on a common task within an application domain.
2. Algorithms need to be designed to enable efficient group communication and group management wherein processes can join and leave groups dynamically.
3. The order of delivery must be specified when multiple processes send messages concurrently.

Replication and consistency:
● Data is replicated for performance.
● Managing such replicas in the face of updates introduces the problems of ensuring consistency among the replicas and cached copies.

Execution models and frameworks:
● The interleaving model and the partial order model are two widely adopted models of distributed system executions.
● The input/output automata model [25] and TLA (temporal logic of actions) are two other examples of models that provide different degrees of infrastructure.

Program design and verification tools:
● Methodically designed and verifiably correct programs can greatly reduce the overhead of software design, debugging, and engineering.
● Designing mechanisms to achieve these design and verification goals is a challenge.
2. Algorithmic Challenges (CREPT-MGW)

Time and global state:
1. The processes in the system are spread across three-dimensional physical space. Another dimension, time, has to be superimposed uniformly across space.
2. The challenges pertain to providing accurate physical time, and to providing a variant of time called logical time.
3. Logical time is relative time, and it eliminates the overheads of providing physical time for applications where physical time is not required.
4. Observing the global state of the system (across space) also involves the time dimension for consistent observation.

Monitoring distributed events and predicates:
● Predicates defined on program variables that are local to different processes are used for specifying conditions on the global system state; on-line algorithms for monitoring such predicates are hence important.
● Event streaming is used, where streams of relevant events reported from different processes are examined collectively to detect predicates.

Graph algorithms and distributed routing algorithms:
● The distributed system is modeled as a distributed graph, and graph algorithms form the building blocks for a large number of higher-level communication, data dissemination, object location, and object search functions.
● The design of efficient distributed graph algorithms is of paramount importance.

World Wide Web design:
● Minimizing response time, and thereby user-perceived latency, is an important challenge.
● Efficient object search and navigation on the web is important.
2. Algorithmic Challenges (Synch Mechanism)
2. Algorithmic Challenges
2. Algorithmic Challenges (Fault Tolerant)
2. Algorithmic Challenges (Load Balancing)
2. Algorithmic Challenges (performance)
3. Applications of distributed computing
Sensor Networks:

1. Sensors, which can measure physical properties like temperature and humidity, have become affordable
and are deployed in large numbers (over a million).
2. They report external events, not internal computer processes. These networks have various applications,
including mobile or static sensors that communicate wirelessly or through wires. Self-configuring ad-hoc
networks introduce challenges like position and time estimation.

Ubiquitous Computing:

1. Ubiquitous systems involve processors integrated into the environment, working in the background, like in
sci-fi scenarios. Examples include smart homes and workplaces.
2. These systems are essentially distributed, use wireless tech, sensors, and actuators, and can self-organize.
They often consist of many small processors in a dynamic network, connecting to more powerful resources
for data processing.
3. Applications of distributed computing
Peer-to-Peer (P2P) Computing:

1. In P2P computing, all processors interact as equals without any hierarchy, unlike client-server systems. P2P
networks are often self-organizing and may lack a regular structure.
2. They don't use central directories for name resolution. Challenges include efficient object storage and
lookup, dynamic reconfiguration, replication strategies, and addressing issues like privacy and security.

Publish-Subscribe, Content Distribution, and Multimedia: (Netflix)

1. As information grows, we need efficient ways to distribute and filter it. Publish-Subscribe involves
distributing information, letting users subscribe to what interests them, and then filtering it based on user
preferences.
2. Content distribution is about sending data with specific characteristics to interested users, often used in web
and P2P settings. When dealing with multimedia, we face challenges like large data, compression, and
synchronization during storage and playback.
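A minimal publish–subscribe broker can be sketched in a few lines; the Broker class and topic names are invented for illustration, and real systems add content filtering, persistence, and distribution across machines:

```python
class Broker:
    """Toy publish-subscribe broker: routes messages by topic."""

    def __init__(self):
        self.subscribers = {}          # topic -> list of callbacks

    def subscribe(self, topic, callback):
        # a user registers interest in a topic
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        # the broker filters: only subscribers of this topic are notified
        for callback in self.subscribers.get(topic, []):
            callback(message)

broker = Broker()
inbox = []
broker.subscribe("new-releases", inbox.append)
broker.publish("new-releases", "Episode 1 is out")   # delivered
broker.publish("sports", "Match tonight")            # filtered out
```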
3. Applications of distributed computing
Data Mining Algorithms:

1. They analyze large data sets to find patterns and useful information. For example, studying
customer buying habits for targeted marketing.
2. This involves applying database and AI techniques to data. When data is distributed, as in private
banking or large-scale weather prediction, efficient distributed data mining algorithms are needed.

Security Challenges in Distributed Systems:

1. Traditional challenges include ensuring confidentiality, authentication, and availability.
2. The goal is efficient and scalable solutions.
3. In newer distributed architectures like wireless, peer-to-peer, grid, and pervasive computing, these
challenges become more complex due to resource constraints, broadcast mediums, lack of
structure, and network trust issues.
