Binder 1
Distributed Computing:
In distributed computing we have multiple autonomous computers which appear to the user as a
single system. In distributed systems there is no shared memory, and computers communicate with
each other through message passing. In distributed computing a single task is divided among
different computers.
Difference between Parallel Computing and Distributed Computing:
In parallel computing, multiple processing elements within a single system work on a problem
simultaneously, typically sharing memory; in distributed computing, autonomous computers with no
shared memory cooperate over a network via message passing, each handling part of a single task.
Parallel Computing –
It is the use of multiple processing elements simultaneously for solving any problem.
Problems are broken down into instructions and solved concurrently, as each resource applied
to the work is operating at the same time.
Advantages of Parallel Computing over Serial Computing are as follows:
1. It saves time and money as many resources working together will reduce the time and
cut potential costs.
2. It can be impractical to solve larger problems on Serial Computing.
3. It can take advantage of non-local resources when the local resources are finite.
4. Serial Computing ‘wastes’ potential computing power; Parallel Computing makes better
use of the hardware.
Types of Parallelism:
1. Bit-level parallelism: It is the form of parallel computing based on increasing the
processor’s word size. It reduces the number of instructions that the system must
execute in order to perform a task on large-sized data.
Example: Consider a scenario where an 8-bit processor must compute the sum of two
16-bit integers. It must first sum up the 8 lower-order bits, then add the 8 higher-order
bits, thus requiring two instructions to perform the operation. A 16-bit processor can
perform the operation with just one instruction.
2. Instruction-level parallelism: A processor can issue more than one instruction per
clock cycle. Instructions can be re-ordered and grouped, then executed concurrently
without affecting the result of the program. This is called instruction-level
parallelism.
3. Task Parallelism: Task parallelism decomposes a task into subtasks and then allocates
each subtask for execution. The processors execute the subtasks concurrently.
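The two most concrete forms above can be sketched in Python. This is an illustrative toy (the bit widths, data, and chunking are assumptions, not from the text): item 1 as a 16-bit addition performed in two 8-bit steps, and item 3 as a sum decomposed into subtasks run on a thread pool.

```python
# Illustrative sketches of two forms of parallelism described above.
from concurrent.futures import ThreadPoolExecutor

# 1. Bit-level: an 8-bit processor adds two 16-bit integers in two
#    instructions (low bytes first, then high bytes plus the carry).
def add16_on_8bit(a, b):
    lo = (a & 0xFF) + (b & 0xFF)                    # instruction 1: low bytes
    hi = ((a >> 8) + (b >> 8) + (lo >> 8)) & 0xFF   # instruction 2: high bytes + carry
    return (hi << 8) | (lo & 0xFF)

# 3. Task parallelism: decompose a sum into subtasks executed concurrently.
data = list(range(1, 101))
chunks = [data[i:i + 25] for i in range(0, 100, 25)]  # four subtasks
with ThreadPoolExecutor(max_workers=4) as pool:
    partial = list(pool.map(sum, chunks))             # each subtask sums one slice

print(add16_on_8bit(0x12FF, 0x0001))  # 4864 (= 0x1300), same as a one-instruction 16-bit add
print(sum(partial))                   # 5050, same result as the serial sum
```

A 16-bit processor would perform the first computation in a single instruction; the thread pool mirrors how subtasks of a decomposed task run at the same time.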
Why parallel computing?
• The real world runs dynamically: many things happen at the same time in different
places concurrently, and this data is extremely large and hard to manage.
• Real world data needs more dynamic simulation and modeling, and for achieving the
same, parallel computing is the key.
• Parallel computing provides concurrency and saves time and money.
• Complex, large datasets and their management can be organized only by using the
parallel computing approach.
• Ensures the effective utilization of resources. The hardware is guaranteed to be used
effectively, whereas in serial computation only part of the hardware is used and the
rest is left idle.
• Also, it is impractical to implement real-time systems using serial computing.
Applications of Parallel Computing:
• Data bases and Data mining.
• Real time simulation of systems.
• Science and Engineering.
• Advanced graphics, augmented reality and virtual reality.
Limitations of Parallel Computing:
• It introduces challenges such as communication and synchronization between multiple
sub-tasks and processes, which are difficult to achieve.
• The algorithms must be structured so that they can be handled by the parallel
mechanism.
• The algorithms or programs must have low coupling and high cohesion, but it is
difficult to create such programs.
• Writing a parallelism-based program well requires more technically skilled and
experienced programmers.
Future of Parallel Computing: The computing landscape has undergone a great
transition from serial computing to parallel computing. Tech giants such as Intel have
already taken a step towards parallel computing by employing multicore processors.
Parallel computation will revolutionize the way computers work in the future, for the
better. With the whole world connecting to each other even more than before, parallel
computing plays a key role in keeping it that way. With faster networks, distributed
systems, and multi-processor computers, it becomes even more necessary.
A distributed system contains multiple nodes that are physically separate but linked together
using the network. All the nodes in this system communicate with each other and handle
processes in tandem. Each of these nodes contains a small part of the distributed operating
system software.
Some advantages of distributed systems are −
• All the nodes in the distributed system are connected to each other. So nodes can easily
share data with other nodes.
• More nodes can easily be added to the distributed system i.e. it can be scaled as required.
• Failure of one node does not lead to the failure of the entire distributed system. Other
nodes can still communicate with each other.
• Resources like printers can be shared with multiple nodes rather than being restricted to
just one.
Cloud Computing Technologies
The following technologies, discussed in the subsequent sections, work behind cloud
computing −
• Virtualization
• Service-Oriented Architecture (SOA)
• Grid Computing
• Utility Computing
Virtualization
Virtualization is a technique which allows sharing a single physical instance of an
application or resource among multiple organizations or tenants (customers). It does this by
assigning a logical name to a physical resource and providing a pointer to that physical
resource on demand.
The multitenant architecture offers virtual isolation among the multiple tenants. Hence, the
organizations can use and customize their application as though they each have their own
instance running.
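As a toy illustration of the logical-name indirection described above (all names here are hypothetical, not a real product's API), a resolver can map each tenant's logical name onto a physical resource, giving tenants virtually isolated views of the same pool:

```python
# Toy sketch: a logical name maps to a physical resource, and each
# tenant sees only its own logical view of the shared pool.
physical_pool = {"vm-host-1": "192.168.0.10", "vm-host-2": "192.168.0.11"}

tenant_views = {
    "tenant-a": {"app-server": "vm-host-1"},
    "tenant-b": {"app-server": "vm-host-2"},  # same logical name, different host
}

def resolve(tenant, logical_name):
    # Return the physical resource behind a tenant's logical name.
    return physical_pool[tenant_views[tenant][logical_name]]

print(resolve("tenant-a", "app-server"))  # 192.168.0.10
print(resolve("tenant-b", "app-server"))  # 192.168.0.11
```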
Types of Transparency in Distributed Systems
1. Access − Hides how a resource is accessed and the differences in data representation
across platforms.
2. Location − Hides where a resource is located.
3. Technology − Hides the different technologies, such as programming language and OS,
from the user.
4. Migration / Relocation − Hides that a resource, while in use, may be moved to another
location.
5. Replication − Hides that a resource may be copied at several locations.
6. Concurrency − Hides that a resource may be shared with other users.
7. Failure − Hides the failure and recovery of a resource from the user.
8. Persistence − Hides whether a (software) resource is in memory or on disk.
Client-Server Architecture
The client-server architecture is the most common distributed system architecture
which decomposes the system into two major subsystems or logical processes −
• Client − This is the first process that issues a request to the second process i.e.
the server.
• Server − This is the second process that receives the request, carries it out, and
sends a reply to the client.
In this architecture, the application is modelled as a set of services provided by
servers and a set of clients that use these services. The servers need not know about
clients, but the clients must know the identity of the servers; the mapping of
processors to processes is not necessarily 1:1.
Client-server Architecture can be classified into two models based on the functionality
of the client −
Thin-client model
In the thin-client model, all the application processing and data management is carried
out by the server. The client is simply responsible for running the presentation software.
• Used when legacy systems are migrated to client server architectures in which
legacy system acts as a server in its own right with a graphical interface
implemented on a client
• A major disadvantage is that it places a heavy processing load on both the
server and the network.
Thick/Fat-client model
In the thick-client model, the server is only in charge of data management. The software
on the client implements the application logic and the interactions with the system user.
• Most appropriate for new C/S systems where the capabilities of the client system
are known in advance
• More complex than the thin-client model, especially for management: new versions of
the application have to be installed on all clients.
Presentation Tier
The presentation layer is the topmost level of the application, which users access
directly, such as a webpage or an Operating System GUI (Graphical User Interface). The
primary function of this layer is to translate tasks and results into something the user
can understand. It communicates with the other tiers and delivers results to the
browser/client tier and other tiers in the network.
Application Tier
Application tier coordinates the application, processes the commands, makes logical
decisions, evaluation, and performs calculations. It controls an application’s
functionality by performing detailed processing. It also moves and processes data
between the two surrounding layers.
Data Tier
In this layer, information is stored and retrieved from the database or file system. The
information is then passed back for processing and then back to the user. It includes
the data persistence mechanisms (database servers, file shares, etc.) and provides
API (Application Programming Interface) to the application tier which provides methods
of managing the stored data.
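A minimal sketch of the three tiers as plain Python functions; the tier boundaries and data here are illustrative assumptions, not a specific framework's API:

```python
# Minimal three-tier sketch: data tier stores state, application tier
# applies logic, presentation tier renders the result for the user.
DATABASE = {"alice": 3, "bob": 7}        # data tier: persistence

def data_tier_get(user):
    # Data tier: store and retrieve information.
    return DATABASE[user]

def application_tier(user):
    # Application tier: logic and calculations between the other layers.
    orders = data_tier_get(user)
    return {"user": user, "orders": orders, "vip": orders > 5}

def presentation_tier(user):
    # Presentation tier: translate results into something readable.
    result = application_tier(user)
    return f"{result['user']}: {result['orders']} orders (VIP: {result['vip']})"

print(presentation_tier("bob"))  # bob: 7 orders (VIP: True)
```

Because each tier only calls the one below it, the data tier could be swapped for a real database server without touching the presentation code.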
Advantages
• Better performance than a thin-client approach and is simpler to manage than a
thick-client approach.
• Enhances the reusability and scalability − as demands increase, extra servers
can be added.
• Provides multi-threading support and also reduces network traffic.
• Provides maintainability and flexibility
Disadvantages
• Unsatisfactory Testability due to lack of testing tools.
• More critical server reliability and availability.
The components of the broker architectural style are discussed below −
Broker
Broker is responsible for coordinating communication, such as forwarding and
dispatching the results and exceptions. It can be either an invocation-oriented service
or a document- or message-oriented broker to which clients send messages.
• It is responsible for brokering the service requests, locating a proper server,
transmitting requests, and sending responses back to clients.
• It retains the servers’ registration information including their functionality and
services as well as location information.
• It provides APIs for clients to request, servers to respond, registering or
unregistering server components, transferring messages, and locating servers.
Stub
Stubs are generated at static compilation time and then deployed to the client side,
where they act as a proxy for the client. The client-side proxy acts as a mediator
between the client and the broker and provides additional transparency between them and
the client: a remote object appears like a local one.
The proxy hides the IPC (inter-process communication) at protocol level and performs
marshaling of parameter values and un-marshaling of results from the server.
Skeleton
Skeleton is generated by compiling the service interface and is then deployed to the
server side, where it acts as a proxy for the server. The server-side proxy encapsulates
low-level, system-specific networking functions and provides high-level APIs to mediate
between the server and the broker.
It receives the requests, unpacks the requests, unmarshals the method arguments,
calls the suitable service, and also marshals the result before sending it back to the
client.
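The stub/skeleton marshalling flow can be sketched conceptually. In this toy (the method names and JSON wire format are assumptions, and a direct function call stands in for the broker's network hop), the stub marshals a call to bytes and the skeleton unmarshals it, invokes the service, and marshals the result back:

```python
# Conceptual stub/skeleton sketch: marshal the request, dispatch it,
# unmarshal the reply, as a broker-based middleware would over IPC.
import json

def service_add(a, b):          # the actual server-side service
    return a + b

SERVICES = {"add": service_add}

def skeleton_dispatch(request_bytes):
    # Server-side skeleton: unmarshal, call the service, marshal result.
    request = json.loads(request_bytes.decode())
    result = SERVICES[request["method"]](*request["args"])
    return json.dumps({"result": result}).encode()

def stub_call(method, *args):
    # Client-side stub: marshal the request as it would go over the wire.
    request = json.dumps({"method": method, "args": args}).encode()
    reply = skeleton_dispatch(request)   # stands in for the network hop
    return json.loads(reply.decode())["result"]

print(stub_call("add", 2, 3))  # 5
```

To the caller, `stub_call("add", 2, 3)` looks like a local call, which is exactly the transparency the stub provides.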
Bridge
A bridge can connect two different networks based on different communication
protocols. It mediates between different brokers, including DCOM, .NET Remoting, and
Java/CORBA brokers.
Bridges are optional components that hide the implementation details when two brokers
interoperate, taking requests and parameters in one format and translating them into
another.
Broker implementation in CORBA
CORBA, defined by the OMG (Object Management Group), is an international standard for
an Object Request Broker: middleware that manages communication among distributed
objects.
(Figure: SOA operation − a service provider publishes an XML service description, and
consumers exchange SOAP messages with the provider.)
Features of SOA
SOA Operation
• SOA allows users to combine a large number of facilities from existing services to form
applications.
• SOA encompasses a set of design principles that structure system development and provide means
for integrating components into a coherent and decentralized system.
• SOA based computing packages functionalities into a set of interoperable services, which can be
integrated into different software systems belonging to separate business domains.
There are two major roles within Service-oriented Architecture:
1. Service provider: The service provider is the maintainer of the service and the organization that
makes available one or more services for others to use. To advertise services, the provider can
publish them in a registry, together with a service contract that specifies the nature of the service,
how to use it, the requirements for the service, and the fees charged.
2. Service consumer: The service consumer can locate the service metadata in the registry and
develop the required client components to bind and use the service.
Services might aggregate information and data retrieved from other services or create workflows of
services to satisfy the request of a given service consumer. This practice is known as service
orchestration. Another important interaction pattern is service choreography, which is the
coordinated interaction of services without a single point of control.
Components of SOA: the service provider, the service consumer, and the service registry.
Guiding Principles of SOA:
1. Standardized service contract: Specified through one or more service description documents.
2. Loose coupling: Services are designed as self-contained components and maintain relationships
that minimize dependencies on other services.
3. Abstraction: A service is completely defined by service contracts and description documents. They
hide their logic, which is encapsulated within their implementation.
4. Reusability: Designed as components, services can be reused more effectively, thus reducing
development time and the associated costs.
5. Autonomy: Services have control over the logic they encapsulate and, from a service consumer
point of view, there is no need to know about their implementation.
6. Discoverability: Services are defined by description documents that constitute supplemental
metadata through which they can be effectively discovered. Service discovery provides an effective
means for utilizing third-party resources.
7. Composability: Using services as building blocks, sophisticated and complex operations can be
implemented. Service orchestration and choreography provide a solid support for composing
services and achieving business goals.
Advantages of SOA:
• Service reusability: In SOA, applications are made from existing services. Thus, services can be
reused to make many applications.
• Easy maintenance: As services are independent of each other, they can be updated and modified
easily without affecting other services.
• Platform independence: SOA allows making a complex application by combining services picked
from different sources, independent of the platform.
• Availability: SOA facilities are easily available to anyone on request.
• Reliability: SOA applications are more reliable because it is easier to debug small services than
huge code bases.
• Scalability: Services can run on different servers within an environment, which increases scalability.
Disadvantages of SOA:
• High overhead: A validation of input parameters is done whenever services interact, which
decreases performance by increasing load and response time.
• High investment: A huge initial investment is required for SOA.
• Complex service management: When services interact, they exchange messages to perform tasks;
the number of messages may run into millions, and handling such a large number of messages
becomes cumbersome.
Practical applications of SOA: SOA is used in many ways around us, whether it is acknowledged or
not.
1. SOA infrastructure is used by many armies and air forces to deploy situational-awareness systems.
2. SOA is used to improve healthcare delivery.
3. Nowadays many apps are games that use built-in functions to run. For example, an app might
need GPS, so it uses the device's built-in GPS functions. This is SOA in mobile solutions.
4. SOA helps museums maintain a virtualized storage pool for their information and content.
WEB SERVICES
Different books and different organizations provide different definitions to Web Services. Some
of them are listed here.
• A web service is any piece of software that makes itself available over the internet and
uses a standardized XML messaging system. XML is used to encode all
communications to a web service. For example, a client invokes a web service by
sending an XML message, then waits for a corresponding XML response. As all
communication is in XML, web services are not tied to any one operating system or
programming language—Java can talk with Perl; Windows applications can talk with
Unix applications.
• Web services are self-contained, modular, distributed, dynamic applications that can be
described, published, located, or invoked over the network to create products, processes,
and supply chains. These applications can be local, distributed, or web-based. Web
services are built on top of open standards such as TCP/IP, HTTP, Java, HTML, and
XML.
• Web services are XML-based information exchange systems that use the Internet for
direct application-to-application interaction. These systems can include programs,
objects, messages, or documents.
• A web service is a collection of open protocols and standards used for exchanging data
between applications or systems. Software applications written in various programming
languages and running on various platforms can use web services to exchange data over
computer networks like the Internet in a manner similar to inter-process communication
on a single computer. This interoperability (e.g., between Java and Python, or Windows
and Linux applications) is due to the use of open standards.
To summarize, a complete web service is, therefore, any service that −
• Is available over the Internet or private (intranet) networks
• Uses a standardized XML messaging system
• Is not tied to any one operating system or programming language
• Is self-describing via a common XML grammar
• Is discoverable via a simple find mechanism
Example
Consider a simple account-management and order processing system. The accounting personnel
use a client application built with Visual Basic or JSP to create new accounts and enter new
customer orders.
The processing logic for this system is written in Java and resides on a Solaris machine, which
also interacts with a database to store information.
The steps to perform this operation are as follows −
• The client program bundles the account registration information into a SOAP message.
• This SOAP message is sent to the web service as the body of an HTTP POST request.
• The web service unpacks the SOAP request and converts it into a command that the
application can understand.
• The application processes the information as required and responds with a new unique
account number for that customer.
• Next, the web service packages the response into another SOAP message, which it sends
back to the client program in response to its HTTP request.
• The client program unpacks the SOAP message to obtain the results of the account
registration process.
GRID COMPUTING
Grid computing is a distributed structure of a large number of computers connected to solve
a complicated problem. In grid computing, servers and computers run independently and
are loosely connected by the Internet. Computers may connect directly or through
scheduling systems.
In other words, grid computing involves a large number of computers connected in
parallel to form a computer cluster.
Grid computing is used in various types of applications such as mathematical, scientific, and
educational tasks via various computing resources.
Grid computing is a processor architecture that integrates computer resources from various
domains to achieve a primary goal. The computers on the network will work together in grid
computing on a project, thus acting as a supercomputer.
Grid systems are mainly designed for large-scale resource sharing through distributed
and cluster computing. A complex task is divided into smaller pieces that are
distributed to the CPUs.
Cloud Computing
Cloud Computing is defined as the on-demand delivery of computing power, database storage,
applications, and other IT resources through the internet. It provides a solution for IT
infrastructure at a low price.
In simple words, cloud computing means storing and accessing the data via the internet
instead of the computer’s hard drive.
Cloud computing is a pay-per-use model.
Grid Computing vs Cloud Computing:
• In grid computing, resources are shared among multiple computing units for processing a
single task; in cloud computing, all the resources are managed centrally and are placed
over different servers in clusters.
• Grid computing is operated within a corporate network; cloud computing can be accessed
via the Internet.
• Grids are mainly owned and managed by an organization within its premises; cloud
servers are owned by infrastructure providers and are placed in physically various
locations.
PROS &CONS
Advantages and Disadvantages of Cloud Computing
Advantages of Cloud Computing
As we all know, cloud computing is a trending technology. Almost every company has switched its
services to the cloud to grow the business.
1) Back-up and restore data
Once the data is stored in the cloud, it is easier to back up and restore that data using the cloud.
2) Improved collaboration
Cloud applications improve collaboration by allowing groups of people to quickly and easily share
information in the cloud via shared storage.
3) Excellent accessibility
Cloud allows us to quickly and easily access stored information anywhere in the world, at any time,
using an internet connection. An internet cloud infrastructure increases organization productivity
and efficiency by ensuring that our data is always accessible.
4) Low maintenance cost
Cloud computing reduces both hardware and software maintenance costs for organizations.
5) Mobility
Cloud computing allows us to easily access all cloud data via mobile.
6) Services in the pay-per-use model
Cloud computing offers Application Programming Interfaces (APIs) to users to access services on
the cloud, and users pay charges as per the usage of the service.
7) Unlimited storage capacity
Cloud offers us a huge amount of storage capacity for storing our important data, such as
documents, images, audio, video, etc., in one place.
8) Data security
Data security is one of the biggest advantages of cloud computing. Cloud offers many advanced
features related to security and ensures that data is securely stored and handled.
Disadvantages of Cloud Computing
1) Internet Connectivity
As you know, in cloud computing, every piece of data (image, audio, video, etc.) is stored on the
cloud, and we access this data through the cloud by using an internet connection. If you do not
have good internet connectivity, you cannot access this data. However, we have no other way to
access data from the cloud.
2) Vendor lock-in
Vendor lock-in is the biggest disadvantage of cloud computing. Organizations may face problems
when transferring their services from one vendor to another. As different vendors provide different
platforms, that can cause difficulty moving from one cloud to another.
3) Limited Control
As we know, cloud infrastructure is completely owned, managed, and monitored by the service
provider, so cloud users have less control over the function and execution of services within a
cloud infrastructure.
4) Security
Although cloud service providers implement the best security standards to store important
information, before adopting cloud technology you should be aware that you will be sending all
your organization's sensitive information to a third party, i.e., a cloud computing service provider.
While sending data to the cloud, there may be a chance that your organization's information is
hacked by hackers.
REAL TIME APPLICATIONS
Cloud computing has its applications in almost all fields, such as business, entertainment, data
storage, social networking, management, education, art, and global positioning systems. Some
famous cloud computing applications are discussed here:
Business Applications
Cloud computing has made businesses more collaborative and easy by incorporating various apps
such as MailChimp, Chatter, Google Apps for Business, and Quickbooks.
SN Application Description
1 MailChimp
It offers an e-mail publishing platform. It is widely employed by the
businesses to design and send their e-mail campaigns.
2 Chatter
The Chatter app helps employees share important information about the organization in
real time. One can get an instant feed regarding any issue.
4 Quickbooks
It offers online accounting solutions for a business. It helps
in monitoring cash flow, creating VAT returns and creating business
reports.
Data Storage and Backup Applications
SN Application Description
1 Box.com
Box.com offers a drag-and-drop service for files. Users simply drop files into Box and
access them from anywhere.
2 Mozy
Mozy offers online backup service for files to prevent data loss.
3 Joukuu
Joukuu is a web-based interface. It allows users to display a single list of contents
for files stored in Google Docs, Box.net, and Dropbox.
Management Applications
There are apps available for management tasks such as time tracking and organizing notes.
Applications performing such tasks are discussed below:
SN Application Description
1 Toggl
It helps in tracking time period assigned to a particular project.
2 Evernote
It organizes sticky notes and can even read text from images, which helps the user
locate notes easily.
3 Outright
It is an accounting app. It helps to track income, expenses, profits and
losses in real time.
Social Applications
There are several social networking services providing websites such as Facebook, Twitter, etc.
SN Application Description
1 Facebook
It offers social networking service. One can share photos, videos, files,
status and much more.
2 Twitter
It helps to interact with the public directly. One can follow any celebrity,
organization and any person, who is on twitter and can have latest updates
regarding the same.
Entertainment Applications
SN Application Description
1 Audiobox.fm
It offers a streaming service. The music files are stored online and can be played from
the cloud using the service's own media player.
Art Applications
SN Application Description
1 Moo
It offers art services such as designing and printing business cards,
postcards and mini cards.
Distributed memory programming with
message passing and MPI.
Chapter 02
The MPI Distributed Memory Model
The MPI Memory Model
The MPI Execution Model
Example “Hello world”
Example “Send N integers”
The Pros and Cons of MPI
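MPI programs run as a set of ranks with separate address spaces that communicate only by message passing. As a rough analogue of the send/receive examples listed above (illustrative only, not real MPI; real code would use MPI_Send/MPI_Recv in C, or mpi4py), the sketch below runs two ranks as threads whose only inter-rank communication goes through message queues, with rank 0 sending N integers to rank 1:

```python
# Rough analogue of MPI point-to-point messaging: each "rank" runs in
# its own thread, and ranks exchange data only via per-rank queues,
# like MPI_Send / MPI_Recv. A shared dict merely collects results.
import threading, queue

NPROCS = 2
inbox = [queue.Queue() for _ in range(NPROCS)]
results = {}

def send(dest, msg):
    inbox[dest].put(msg)          # analogue of MPI_Send

def recv(rank):
    return inbox[rank].get()      # analogue of MPI_Recv: blocks until a message arrives

def worker(rank):
    if rank == 0:
        send(1, list(range(5)))   # rank 0 sends N = 5 integers to rank 1
        results[0] = "sent"
    else:
        data = recv(rank)         # rank 1 receives the integers and sums them
        results[1] = sum(data)

threads = [threading.Thread(target=worker, args=(r,)) for r in range(NPROCS)]
for t in threads: t.start()
for t in threads: t.join()
print(results[1])  # 10
```

The key property mirrored here is the one MPI's distributed-memory model enforces: ranks never read each other's variables, they only exchange messages.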
Speedup
Chapter 04
Outline
• Speedup & Efficiency
• Amdahl’s Law
Speedup
• Speedup: S = Time(most efficient sequential algorithm) / Time(parallel algorithm)
• Efficiency: E = S / N, where N is the number of processors
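A worked numeric example of these definitions, assuming a hypothetical program that takes 100 s with the best sequential algorithm and 25 s in parallel on 8 processors:

```python
# Speedup S and efficiency E from the definitions above.
def speedup(t_serial, t_parallel):
    return t_serial / t_parallel

def efficiency(s, n):
    return s / n          # N is the number of processors

S = speedup(100.0, 25.0)  # hypothetical: 100 s sequential, 25 s parallel
E = efficiency(S, 8)      # on 8 processors
print(S, E)               # 4.0 0.5
```

An efficiency of 0.5 means each processor is, on average, doing useful work only half the time.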
Amdahl’s Law
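Amdahl's law states that if a fraction p of a program's work can be parallelized across N processors, the overall speedup is S(N) = 1 / ((1 - p) + p/N), so the serial fraction (1 - p) caps the achievable speedup at 1/(1 - p) no matter how many processors are used. A small numeric sketch:

```python
# Amdahl's law: S(N) = 1 / ((1 - p) + p / N), where p is the
# parallelizable fraction and N the number of processors.
def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

for n in (1, 4, 16, 1024):
    print(n, round(amdahl_speedup(0.9, n), 2))
# Even with 90% parallel code, speedup never exceeds 1 / (1 - 0.9) = 10.
```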
Distributed Systems
(3rd Edition)
Architectural styles
Basic idea
A style is formulated in terms of:
• (replaceable) components with well-defined interfaces
• the way that components are connected to each other
• the data exchanged between components
• how these components and connectors are jointly configured into a system.
Connector
A mechanism that mediates communication, coordination, or cooperation
among components. Example: facilities for (remote) procedure call,
messaging, or streaming.
Architectures: Architectural styles Layered architectures
Layered architecture
(Figure: the layered-architecture style, in which each layer offers an interface and a
service to the layer above; requests travel down through handles and upcalls travel back
up through layers 1 … N.)
Two-party communication
Server
from socket import *

HOST, PORT = '', 12345             # example address; not given on the slide
s = socket(AF_INET, SOCK_STREAM)
s.bind((HOST, PORT))               # bind and listen (missing from the slide)
s.listen(1)
(conn, addr) = s.accept()          # returns new socket and addr. client
while True:                        # forever
    data = conn.recv(1024)         # receive data from client
    if not data: break             # stop if client stopped
    conn.send(data + b"*")         # return sent data plus an "*"
conn.close()                       # close the connection
Client
from socket import *

HOST, PORT = 'localhost', 12345    # example address; not given on the slide
s = socket(AF_INET, SOCK_STREAM)
s.connect((HOST, PORT))            # connect to server (block until accepted)
s.send(b'Hello, world')            # send some data
data = s.recv(1024)                # receive the response
print(data)                        # print the result
s.close()                          # close the connection
Application Layering
Observation
This layering is found in many distributed information systems, using traditional
database technology and accompanying applications.
(Figure: a simple search engine arranged in three layers: a user-interface level that
turns a keyword expression into an HTML page, a processing level with an HTML generator,
a query generator, and a ranking algorithm, and a data level where database queries
produce a ranked list of page titles.)
Architectures: Architectural styles Object-based and service-oriented architectures
Object-based style
Essence
Components are objects, connected to each other through procedure calls.
Objects may be placed on different machines; calls can thus execute across a
network.
(Figure: the object-based style, in which objects encapsulating state invoke each
other's methods, possibly across machines, through their interfaces.)
Encapsulation
Objects are said to encapsulate data and offer methods on that data without
revealing the internal implementation.
Architectures: Architectural styles Resource-based architectures
RESTful architectures
Essence
View a distributed system as a collection of resources, individually managed by
components. Resources may be added, removed, retrieved, and modified by
(remote) applications.
1 Resources are identified through a single naming scheme
2 All services offer the same interface
3 Messages sent to or from a service are fully self-described
4 After executing an operation at a service, that component forgets
everything about the caller
Basic operations
Operation Description
PUT Create a new resource
GET Retrieve the state of a resource in some representation
DELETE Delete a resource
POST Modify a resource by transferring a new state
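The four operations in the table can be mirrored by a toy in-memory resource store (illustrative only, not a real HTTP server). Note how each request is fully self-contained and the store keeps no per-caller state, matching properties 3 and 4 above:

```python
# Toy RESTful resource store: one naming scheme (the URI), a uniform
# interface (PUT/GET/DELETE/POST), self-contained requests, no sessions.
store = {}

def handle(method, uri, body=None):
    if method == "PUT":           # create a new resource
        store[uri] = body
        return 201
    if method == "GET":           # retrieve the state of the resource
        return store.get(uri)
    if method == "DELETE":        # delete the resource
        store.pop(uri, None)
        return 204
    if method == "POST":          # modify by transferring a new state
        store[uri] = body
        return 200

handle("PUT", "/objects/1", {"color": "red"})
print(handle("GET", "/objects/1"))   # {'color': 'red'}
handle("DELETE", "/objects/1")
print(handle("GET", "/objects/1"))   # None
```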
Example: the Amazon S3 object store
Essence
Objects (i.e., files) are placed into buckets (i.e., directories). Buckets cannot be
placed into buckets. Operations on ObjectName in bucket BucketName require
the following identifier:
http://BucketName.s3.amazonaws.com/ObjectName
Typical operations
All operations are carried out by sending HTTP requests:
Create a bucket/object: PUT, along with the URI
Listing objects: GET on a bucket name
Reading an object: GET on a full URI
On interfaces
Issue
Many people like RESTful approaches because the interface to a service is so
simple. The catch is that much needs to be done in the parameter space.
Architectures: Architectural styles Resource-based architectures
On interfaces
Simplifications
Assume an interface bucket offering an operation create, requiring an input
string such as mybucket, for creating a bucket “mybucket.”
SOAP
import bucket
bucket.create("mybucket")
RESTful
PUT "http://mybucket.s3.amazonsws.com/"
Conclusions
Are there any to draw?
12 / 36
Architectures: Architectural styles Publish-subscribe architectures
[Figure: publish-subscribe styles — components publishing and subscribing via an event bus with notification delivery, versus a shared (persistent) data space with data delivery]
13 / 36
Architectures: Architectural styles Publish-subscribe architectures
More details
Calling out(t) twice in a row leads to storing two copies of tuple t ⇒ a
tuple space is modeled as a multiset.
Both in and rd are blocking operations: the caller is blocked until a
matching tuple is found or becomes available.
14 / 36
Architectures: Architectural styles Publish-subscribe architectures
Alice
blog = linda.universe._rd(("MicroBlog",linda.TupleSpace))[1]

blog._out(("alice","gtcn","This graph theory stuff is not easy"))
blog._out(("alice","distsys","I like systems more than graphs"))
Chuck
blog = linda.universe._rd(("MicroBlog",linda.TupleSpace))[1]

t1 = blog._rd(("bob","distsys",str))
t2 = blog._rd(("alice","gtcn",str))
t3 = blog._rd(("bob","gtcn",str))
15 / 36
Architectures: Middleware organization Wrappers
Problem
The interfaces offered by a legacy component are most likely not suitable for all
applications.
Solution
A wrapper or adapter offers an interface acceptable to a client application. Its
functions are transformed into those available at the component.
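As an illustration, a wrapper can be sketched as an adapter class. LegacyStore and its byte-oriented interface are hypothetical; the adapter offers the string interface a client expects and translates calls into the legacy one:

```python
# Sketch of a wrapper (adapter). All names are illustrative.

class LegacyStore:
    """Legacy component with an interface clients do not want to use."""
    def __init__(self):
        self._data = {}

    def put_record(self, key_bytes: bytes, payload: bytes) -> None:
        self._data[key_bytes] = payload

    def fetch_record(self, key_bytes: bytes) -> bytes:
        return self._data[key_bytes]

class StoreAdapter:
    """Offers an interface acceptable to a client application; its
    functions are transformed into those available at the component."""
    def __init__(self, legacy: LegacyStore):
        self._legacy = legacy

    def save(self, key: str, value: str) -> None:
        self._legacy.put_record(key.encode(), value.encode())

    def load(self, key: str) -> str:
        return self._legacy.fetch_record(key.encode()).decode()
```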
16 / 36
Architectures: Middleware organization Wrappers
Organizing wrappers
[Figure: organizing wrappers — direct 1-on-1 wrappers between each pair of applications versus communication through a central broker]
17 / 36
Architectures: Middleware organization Interceptors
Problem
Middleware contains solutions that are good for most applications ⇒ you may
want to adapt its behavior for specific applications.
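A minimal sketch of a request-level interceptor, with hypothetical names: before the invocation B.doit(val) reaches the object, the interceptor records it and then forwards it unchanged:

```python
# Sketch: an interceptor sits between the client and the object it calls.
# It can adapt middleware behavior (here: logging) before forwarding.

class B:
    """A hypothetical target object."""
    def doit(self, val):
        return val * 2

call_log = []  # what the interceptor observed

def intercepted_call(obj, method, *args):
    call_log.append((method, args))       # interceptor-specific behavior
    return getattr(obj, method)(*args)    # forward to the actual object

result = intercepted_call(B(), "doit", 21)
```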
18 / 36
Architectures: Middleware organization Interceptors
[Figure: a request-level interceptor intercepting the client application's call B.doit(val) between the application stub and the object middleware; a message-level interceptor sitting between the middleware and the local OS, which passes the message on to object B]
19 / 36
Architectures: System architecture Centralized organizations
[Figure: alternative client-server organizations (a)–(e) — from running only (part of) the user interface on the client machine to moving the application and even part of the database to the client, with the remainder on the server machine]
Multitiered Architectures 21 / 36
Architectures: System architecture Centralized organizations
Three-tiered architecture
[Figure: a client sends a request to the application server, which in turn requests data from the database server; each tier waits for the reply of the tier below, after which the data and finally the reply are returned]
Multitiered Architectures 22 / 36
Architectures: System architecture Decentralized organizations: peer-to-peer systems
Alternative organizations
Vertical distribution
Comes from dividing distributed applications into three logical layers, and
running the components from each layer on a different server (machine).
Horizontal distribution
A client or server may be physically split up into logically equivalent parts, but
each part is operating on its own share of the complete data set.
Peer-to-peer architectures
Processes are all equal: the functions that need to be carried out are
represented by every process ⇒ each process will act as a client and a server
at the same time (i.e., acting as a servant).
23 / 36
Architectures: System architecture Decentralized organizations: peer-to-peer systems
Structured P2P
Essence
Make use of a semantic-free index: each data item is uniquely associated with
a key, in turn used as an index. Common practice: use a hash function
key(data item) = hash(data item’s value).
P2P system now responsible for storing (key,value) pairs.
[Figure: a structured peer-to-peer system — nodes with binary identifiers (e.g., 0100, 1101) organized in a hypercube-like overlay]
Example: Chord
Principle
Nodes are logically organized in a ring. Each node has an m-bit identifier.
Each data item is hashed to an m-bit key.
The data item with key k is stored at the node with the smallest identifier
id ≥ k, called the successor of key k.
The ring is extended with various shortcut links to other nodes.
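The placement rule can be sketched as follows; the node identifiers used in the usage example are hypothetical, not taken from the figure:

```python
# Sketch of Chord's placement rule on an m-bit identifier ring: key k is
# stored at its successor, the node with the smallest id >= k (wrapping
# around the ring if no such node exists).

def successor(node_ids, k, m):
    space = 2 ** m
    k %= space
    for nid in sorted(node_ids):
        if nid >= k:
            return nid
    return min(node_ids)  # no id >= k: wrap around to the smallest node
```

With hypothetical nodes {1, 4, 9, 11, 14, 18, 20, 21, 28} on a 5-bit ring, key 3 is stored at node 4, key 22 at node 28, and key 29 wraps around to node 1.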
Example: Chord
[Figure: a Chord ring with identifiers 0–31 (m = 5), distinguishing actual nodes from nonexisting nodes and showing shortcut links; lookup(3)@9 resolves via 28 → 1 → 4]
Structured peer-to-peer systems 26 / 36
Architectures: System architecture Decentralized organizations: peer-to-peer systems
Unstructured P2P
Essence
Each node maintains an ad hoc list of neighbors. The resulting overlay
resembles a random graph: an edge ⟨u, v⟩ exists only with a certain
probability P[⟨u, v⟩].
Searching
Flooding: issuing node u passes the request for d to all of its neighbors.
The request is ignored when the receiving node has seen it before.
Otherwise, v searches locally for d (recursively). May be limited by a
Time-To-Live: a maximum number of hops.
Random walk: issuing node u passes the request for d to a randomly chosen
neighbor, v. If v does not have d, it forwards the request to one of its
randomly chosen neighbors, and so on.
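Both strategies can be sketched over an overlay given as an adjacency list; the graph and the lookup predicate below are illustrative:

```python
import random

# Sketch: TTL-limited flooding and a random walk over an overlay
# represented as an adjacency list {node: [neighbors]}.

def flood(graph, start, has_item, ttl):
    """Forward the request to all neighbors; ignore nodes already seen."""
    if has_item(start):
        return start
    seen = {start}
    frontier = [start]
    for _ in range(ttl):                  # Time-To-Live: max number of hops
        nxt = []
        for u in frontier:
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    if has_item(v):
                        return v
                    nxt.append(v)
        frontier = nxt
    return None

def random_walk(graph, start, has_item, max_hops, rng=random.Random(42)):
    """Forward the request to one randomly chosen neighbor at a time."""
    node = start
    for _ in range(max_hops + 1):
        if has_item(node):
            return node
        node = rng.choice(graph[node])
    return None
```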
Model
Assume N nodes and that each data item is replicated across r randomly
chosen nodes.
Random walk
P[k], the probability that the item is found after exactly k attempts:
P[k] = (r/N) · (1 − r/N)^(k−1)
S ("search size"), the expected number of nodes that need to be probed:
S = Σ_{k=1}^{N} k · P[k] = Σ_{k=1}^{N} k · (r/N) · (1 − r/N)^(k−1) ≈ N/r for 1 ≪ r ≤ N
Flooding
Flood the request to d randomly chosen neighbors.
After k steps, some R(k) = d · (d − 1)^(k−1) nodes will have been reached
(assuming k is small).
With a fraction r/N of the nodes holding the data item, the item will be
found once (r/N) · R(k) ≥ 1.
Comparison
If r/N = 0.001, then S ≈ 1000.
With flooding and d = 10, k = 4, we contact 7290 nodes.
Random walks are more communication efficient, but might take longer
before they find the result.
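These numbers follow directly from the model's formulas and can be checked numerically (N, r, d, and k below are the example values from the comparison):

```python
# Check the comparison numbers against the model.

def exact_search_size(N, r):
    """S = sum_{k=1..N} k * (r/N) * (1 - r/N)^(k-1)."""
    p = r / N
    return sum(k * p * (1 - p) ** (k - 1) for k in range(1, N + 1))

def flooding_reach(d, k):
    """R(k) = d * (d - 1)^(k-1) nodes reached after k steps."""
    return d * (d - 1) ** (k - 1)

# With r/N = 0.001, S comes out close to N/r = 1000;
# with d = 10 and k = 4, flooding reaches 10 * 9^3 = 7290 nodes.
```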
Super-peer networks
Essence
It is sometimes sensible to break the symmetry in pure peer-to-peer networks:
When searching in unstructured P2P systems, having index servers
improves performance
Deciding where to store data can often be done more efficiently through
brokers.
[Figure: a super-peer network — weak peers attached to super peers, which form an overlay network among themselves]
Edge-server architecture
Essence
Systems deployed on the Internet where servers are placed at the edge of the
network: the boundary between enterprise networks and the actual Internet.
[Figure: edge servers placed where enterprise networks and ISPs meet the core Internet]
Edge-server systems 32 / 36
Architectures: System architecture Hybrid Architectures
[Figure: BitTorrent — a client node performs a lookup for file F, obtains references to K out of N nodes, and exchanges blocks with them]
A file is divided into equally sized pieces (typically each being 256 KB)
Peers exchange blocks of pieces, typically some 16 KB.
A can upload a block d of piece D, only if it has piece D.
Neighbor B belongs to the potential set PA of A, if B has a block that A
needs.
If B ∈ PA and A ∈ PB : A and B are in a position that they can trade a block.
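The potential-set condition can be sketched with each peer's holdings represented as a set of pieces (piece names hypothetical):

```python
# Sketch: B is in A's potential set P_A if B holds a piece A still needs;
# A and B can trade only when each is in the other's potential set.

def in_potential_set(pieces_a, pieces_b):
    """True if peer B holds something that peer A does not have yet."""
    return bool(pieces_b - pieces_a)

def can_trade(pieces_a, pieces_b):
    return (in_potential_set(pieces_a, pieces_b)
            and in_potential_set(pieces_b, pieces_a))
```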
BitTorrent phases
Bootstrap phase
A has just received its first piece (through optimistic unchoking: a node from NA
unselfishly provides the blocks of a piece to get a newly arrived node started).
Trading phase
|PA | > 0: there is (in principle) always a peer with whom A can trade.
[Figure: the fraction |P|/|N| of neighbors that are in the potential set over time, plotted for |N| = 5, 10, and 40]
Introduction to threads
Basic idea
We build virtual processors in software, on top of physical processors:
Processor: Provides a set of instructions along with the capability of
automatically executing a series of those instructions.
Thread: A minimal software processor in whose context a series of
instructions can be executed. Saving a thread context implies
stopping the current execution and saving all the data needed to
continue the execution at a later stage.
Process: A software processor in whose context one or more threads may
be executed. Executing a thread means executing a series of
instructions in the context of that thread.
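The distinction can be illustrated with Python's threading module: several threads execute their instruction streams in the context of one process, sharing its address space (here, the shared variable counter):

```python
import threading

# Four threads run in the context of a single process and update shared
# state; the lock serializes access to the shared counter.

counter = 0
lock = threading.Lock()

def work(n):
    global counter
    for _ in range(n):
        with lock:
            counter += 1

threads = [threading.Thread(target=work, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter is now 4000: every thread saw the same address space
```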
2 / 47
Processes: Threads Introduction to threads
Context switching
Contexts
Processor context: The minimal collection of values stored in the registers
of a processor used for the execution of a series of instructions (e.g.,
stack pointer, addressing registers, program counter).
Thread context: The minimal collection of values stored in registers and
memory, used for the execution of a series of instructions (i.e., processor
context, state).
Process context: The minimal collection of values stored in registers and
memory, used for the execution of a thread (i.e., thread context, but now
also at least MMU register values).
3 / 47
Processes: Threads Introduction to threads
Context switching
Observations
1 Threads share the same address space. Thread context switching can be
done entirely independently of the operating system.
2 Process switching is generally (somewhat) more expensive as it involves
getting the OS in the loop, i.e., trapping to the kernel.
3 Creating and destroying threads is much cheaper than doing so for
processes.
4 / 47
Processes: Threads Introduction to threads
[Figure: switching between processes A and B involves the operating system]
Trade-offs
Threads use the same address space: more prone to errors
No support from OS/HW to protect threads against each other's memory
Thread context switching may be faster than process context switching
Thread usage in nondistributed systems 6 / 47
Processes: Threads Introduction to threads
Main issue
Should an OS kernel provide threads, or should they be implemented as
user-level packages?
User-space solution
All operations can be completely handled within a single process ⇒
implementations can be extremely efficient.
All services provided by the kernel are done on behalf of the process in
which a thread resides ⇒ if the kernel decides to block a thread, the
entire process will be blocked.
Threads are used when there are lots of external events: threads block on
a per-event basis ⇒ if the kernel can’t distinguish threads, how can it
support signaling events to them?
Thread implementation 8 / 47
Processes: Threads Introduction to threads
Kernel solution
The whole idea is to have the kernel contain the implementation of a thread
package. This means that all thread operations are implemented as system calls:
Operations that block a thread are no longer a problem: the kernel
schedules another available thread within the same process.
Handling external events is simple: the kernel (which catches all events)
schedules the thread associated with the event.
The problem is (or used to be) the loss of efficiency, because each
thread operation requires a trap to the kernel.
Conclusion – but
Mixing user-level and kernel-level threads into a single concept has been
tried; however, the performance gain has not turned out to outweigh the
increased complexity.
Thread implementation 9 / 47
Processes: Threads Introduction to threads
Lightweight processes
Basic idea
Introduce a two-level threading approach: lightweight processes that can
execute user-level threads.
[Figure: user-level threads (with their thread state) in user space, multiplexed onto lightweight processes in kernel space]
Thread implementation 10 / 47
Processes: Threads Introduction to threads
Lightweight processes
Principle operation
User-level thread does system call ⇒ the LWP that is executing that
thread, blocks. The thread remains bound to the LWP.
The kernel can schedule another LWP having a runnable thread bound to
it. Note: this thread can switch to any other runnable thread currently in
user space.
A thread calls a blocking user-level operation ⇒ do a context switch to a
runnable thread (then bound to the same LWP).
When there are no threads to schedule, an LWP may remain idle, and
may even be removed (destroyed) by the kernel.
Note
This concept has been virtually abandoned – it’s just either user-level or
kernel-level threads.
Thread implementation 11 / 47
Processes: Threads Threads in distributed systems
Practical measurements
A typical Web browser has a TLP (thread-level parallelism) value between
1.5 and 2.5 ⇒ threads are primarily used for logically organizing browsers.
Multithreaded clients 13 / 47
Processes: Threads Threads in distributed systems
Improve performance
Starting a thread is cheaper than starting a new process.
Having a single-threaded server prohibits simple scale-up to a
multiprocessor system.
As with clients: hide network latency by reacting to the next request while
the previous one is being answered.
Better structure
Most servers have high I/O demands. Using simple, well-understood
blocking calls simplifies the overall structure.
Multithreaded programs tend to be smaller and easier to understand due
to simplified flow of control.
Multithreaded servers 14 / 47
Processes: Threads Threads in distributed systems
[Figure: a multithreaded server — a dispatcher thread hands each request coming in from the network (via the operating system) to a worker thread]
Overview
Model Characteristics
Multithreading Parallelism, blocking system calls
Single-threaded process No parallelism, blocking system calls
Finite-state machine Parallelism, nonblocking system calls
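The dispatcher/worker model above can be sketched with a thread pool fed from a queue (names illustrative):

```python
import queue
import threading

# Sketch of the dispatcher/worker model: requests arriving "from the
# network" are placed on a queue by the dispatcher and picked up by a
# pool of worker threads.

requests = queue.Queue()
results = []
results_lock = threading.Lock()

def worker():
    while True:
        req = requests.get()
        if req is None:          # sentinel: no more requests, shut down
            return
        with results_lock:
            results.append(f"handled request {req}")

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()

for i in range(5):               # the dispatcher hands off each request
    requests.put(i)
for _ in workers:                # one shutdown sentinel per worker
    requests.put(None)
for w in workers:
    w.join()
```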
Multithreaded servers 15 / 47
Processes: Virtualization Principle of virtualization
Virtualization
Observation
Virtualization is important:
Hardware changes faster than software
Ease of portability and code migration
Isolation of failing or attacked components
[Figure: the principle of virtualization — a program written for interface A runs either on a system offering interface A directly, or on an implementation mimicking A on top of a system offering interface B]
16 / 47
Processes: Virtualization Principle of virtualization
Ways of virtualization
(a) Process VM, (b) Native VMM, (c) Hosted VMM
Differences
(a) A separate set of instructions, with an interpreter/emulator running atop an OS.
(b) Low-level instructions, along with a bare-bones minimal operating system.
(c) Low-level instructions, but delegating most work to a full-fledged OS.
Types of virtualization 18 / 47
Processes: Virtualization Principle of virtualization
Special instructions
Control-sensitive instruction: may affect configuration of a machine (e.g.,
one affecting relocation register or interrupt table).
Behavior-sensitive instruction: effect is partially determined by context
(e.g., POPF sets an interrupt-enabled flag, but only in system mode).
Types of virtualization 19 / 47
Processes: Virtualization Principle of virtualization
Solutions
Emulate all instructions
Wrap nonprivileged sensitive instructions to divert control to VMM
Paravirtualization: modify guest OS, either by preventing nonprivileged
sensitive instructions, or making them nonsensitive (i.e., changing the
context).
Types of virtualization 20 / 47
Processes: Virtualization Application of virtual machines to distributed systems
IaaS
Instead of renting out a physical machine, a cloud provider will rent out a VM
(or VMM) that may possibly be sharing a physical machine with other
customers ⇒ almost complete isolation between customers (although
performance isolation may not be reached).
21 / 47
Processes: Clients Networked user interfaces
Client-server interaction
22 / 47
Processes: Clients Networked user interfaces
Basic organization
[Figure: the X Window System — applications on application servers use Xlib to talk to the X kernel and device drivers on the user's terminal]
Improving X
Practical observations
There is often no clear separation between application logic and
user-interface commands
Applications tend to operate in a tightly synchronous manner with an X
kernel
Alternative approaches
Let applications control the display completely, up to the pixel level (e.g.,
VNC)
Provide only a few high-level display operations (dependent on local video
drivers), allowing more efficient display operations.
Client-side software
25 / 47
Processes: Servers General design issues
Basic model
A process implementing a specific service on behalf of a collection of clients. It
waits for an incoming request from a client and subsequently ensures that the
request is taken care of, after which it waits for the next incoming request.
26 / 47
Processes: Servers General design issues
Concurrent servers
Observation
Concurrent servers are the norm: they can easily handle multiple requests,
notably in the presence of blocking operations (to disks or other servers).
Contacting a server
Out-of-band communication
Issue
Is it possible to interrupt a server once it has accepted (or is in the process of
accepting) a service request?
Interrupting a server 29 / 47
Processes: Servers General design issues
Consequences
Clients and servers are completely independent
State inconsistencies due to client or server crashes are reduced
Possible loss of performance because, e.g., a server cannot anticipate
client behavior (think of prefetching file blocks)
Question
Does connection-oriented communication fit into a stateless design?
Stateless versus stateful servers 30 / 47
Processes: Servers General design issues
Stateful servers
Keeps track of the status of its clients:
Record that a file has been opened, so that prefetching can be done
Knows which data a client has cached, and allows clients to keep local
copies of shared data
Observation
The performance of stateful servers can be extremely high, provided clients
are allowed to keep local copies. As it turns out, reliability is often not a major
problem.
Common organization
[Figure: client requests arrive at a logical switch (possibly multiple), which dispatches each request to application/compute servers backed by a distributed file/database system]
Crucial element
The first tier is generally responsible for passing requests to an appropriate
server: request dispatching
Local-area clusters 32 / 47
Processes: Servers Server clusters
Request Handling
Observation
Having the first tier handle all communication from/to the cluster may lead to a
bottleneck.
[Figure: the switch hands the client's request off to a server, which responds to the client directly]
Local-area clusters 33 / 47
Processes: Servers Server clusters
Server clusters
The front end may easily get overloaded: special measures may be needed
Transport-layer switching: Front end simply passes the TCP request to
one of the servers, taking some performance metric into account.
Content-aware distribution: Front end reads the content of the request
and then selects the best server.
[Figure: content-aware request distribution — 1. the client's setup request is passed by the switch to a distributor; 2. the dispatcher selects an application server; 4. the switch is informed, so that other messages go directly to that server]
Local-area clusters 34 / 47
Processes: Servers Server clusters
Client transparency
To keep the client unaware of distribution, let the DNS resolver act on
behalf of the client. The problem is that the resolver may actually be far
from local to the actual client.
Wide-area clusters 35 / 47
Processes: Servers Server clusters
Example: PlanetLab
Essence
Different organizations contribute machines, which they subsequently share for
various experiments.
Problem
We need to ensure that different distributed applications do not get into each
other’s way ⇒ virtualization
[Figure: a PlanetLab node — several Vservers running on shared hardware, each hosting its own processes and its own file-system tree (/usr, /dev, /home, /proc)]
Vserver
Independent and protected environment with its own libraries, server versions,
and so on. Distributed applications are assigned a collection of vservers
distributed across multiple machines
Case study: PlanetLab 39 / 47
Processes: Code migration Reasons for migrating code
[Figure: mobility models, client side on the left, server side on the right; exec* marks where execution continues.
CS (client–server): code and execution stay at the client; the resource stays at the server.
REV (remote evaluation): the client's code is shipped to the server and executed there, next to the resource.]
42 / 47
Processes: Code migration Reasons for migrating code
[Figure: CoD (code-on-demand): code is fetched from the server and executed at the client.
MA (mobile agent): code and execution state move together, with resources available on both sides.]
43 / 47
Processes: Code migration Reasons for migrating code
Object components
Code segment: contains the actual code
Data segment: contains the state
Execution state: contains context of thread executing the object’s code
Weak mobility: Move only code and data segment (and reboot execution)
Relatively simple, especially if code is portable
Distinguish code shipping (push) from code fetching (pull)
44 / 47
Processes: Code migration Migration in heterogeneous systems
Main problem
The target machine may not be suitable to execute the migrated code
The definition of process/thread/processor context is highly dependent on
local hardware, operating system and runtime system
45 / 47
Processes: Code migration Migration in heterogeneous systems
46 / 47
Processes: Code migration Migration in heterogeneous systems
Problem
A complete migration may actually take tens of seconds. We also need to
realize that during the migration, a service will be completely unavailable for
multiple seconds.
[Figure: response time during live migration of a service, showing a period of complete downtime followed by degraded response time]
47 / 47