Chapter 2
Architectures
Distributed systems are complex.
In order to manage their intrinsic complexity, distributed
systems should be organized properly.
This organization is mostly expressed in terms of the software
components that make up the system.
There are different ways to look at the organization of distributed
systems; two obvious ones are:
Software architecture: the logical organization (of software
components and their interconnections)
System architecture: the physical realization (the instantiation
of software components on real machines)
Architectural style
An architectural style is formulated in terms of
Components,
The way that components are connected to each other,
The data exchanged between components, and finally
How these elements are jointly configured into a system.
A component is a modular unit with well-defined
interfaces that is replaceable within its environment.
A connector is a mechanism that mediates
communication, coordination, or cooperation among
components.
It allows control and data to flow between components,
e.g., through facilities for remote procedure call, message passing, or
streaming data.
Types of Architectural Styles
Common architectural styles of distributed systems
• Layered architectures
• Object-based architectures
• Resource-centered architectures
• Event-based architectures
Layered architectural style
It is a hierarchical organization
Components are organized in a layered fashion
A component at layer Lj can make a down-call to a component at a lower-
level layer Li (with i < j) and generally expects a response.
Only in exceptional cases is an up-call made to a higher-level component
Each layer exposes an interface to be used by the layers above it
“Multi-level client-server”
Each layer acts as a
Server: service provider to the layers “above”
Client: service consumer of the layer(s) “below”
Communication protocol stacks are a typical example
OSI Reference model
TCP/IP
[Figure: the three common cases of layered organization]
Essentially, the layered architectural style contains three
logical levels, commonly known as application layers
The application(user)-interface level
The processing level
The data level
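The three levels and the down-calls between them can be sketched as follows. This is a minimal illustration, not a prescribed design: the class names, the `fetch`/`describe`/`show` methods, and the sample data are all hypothetical.

```python
class DataLevel:
    """Data level: holds the persistent state (here just a dict)."""

    def __init__(self):
        self._rows = {"alice": 30, "bob": 25}   # illustrative sample data

    def fetch(self, name):
        return self._rows.get(name)


class ProcessingLevel:
    """Processing level: application logic; a client of the data level."""

    def __init__(self, data):
        self._data = data

    def describe(self, name):
        age = self._data.fetch(name)            # down-call, expects a response
        if age is None:
            return f"{name}: unknown"
        return f"{name} is {age} years old"


class UserInterfaceLevel:
    """User-interface level: a client of the processing level."""

    def __init__(self, processing):
        self._processing = processing

    def show(self, name):
        return "> " + self._processing.describe(name)   # down-call


ui = UserInterfaceLevel(ProcessingLevel(DataLevel()))
print(ui.show("alice"))   # > alice is 30 years old
```

Each level only talks to the interface of the level directly below it, so a level can be replaced without touching the ones above as long as its interface stays the same.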
Object-Based Architectures
Components are objects
Objects are easy to replace as long as their interface is unchanged
It is less structured and hence a relatively loose organization
The calling object might not run on the same machine as the
called object
Connectors are remote procedure call (RPC) and remote method invocation (RMI) mechanisms
Note:
Object-based architectures are attractive because they provide a
natural way of encapsulating data (called an object’s state) and the
operations that can be performed on that data (which are referred
to as an object’s methods) into a single entity.
Resource-centered architectures
A distributed system is viewed as a huge collection of resources that are
individually managed by components.
Resources may be added or removed by (remote) applications, and
likewise can be retrieved or modified.
This approach has now been widely adopted for the Web and is
known as Representational State Transfer (REST) [Fielding,
2000].
There are four key characteristics of what are known as RESTful
architectures [Pautasso et al., 2008]:
1. Resources are identified through a single naming scheme
2. All services offer the same interface, consisting of at most four
operations, as shown in the following figure.
3. Messages sent to or from a service are fully self-described
4. After executing an operation at a service, that component forgets
everything about the caller (stateless execution)
[Figure: the four operations available in RESTful architectures]
Components: resources, and the components that interact with those resources
Connectors: queries
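A uniform four-operation interface can be sketched with a small in-memory service. Everything here is an assumption made for illustration: the operations are taken to be PUT (create), GET (retrieve), POST (modify), and DELETE (delete), which is one common textbook mapping (real HTTP APIs often assign PUT and POST differently); the `ResourceStore` class and the numeric status codes are hypothetical.

```python
class ResourceStore:
    """Hypothetical in-memory service; each resource is named by a URI string."""

    def __init__(self):
        self._resources = {}

    def put(self, uri, state):
        """PUT: create a new resource."""
        if uri in self._resources:
            return 409, None                 # resource already exists
        self._resources[uri] = state
        return 201, None

    def get(self, uri):
        """GET: retrieve the state of a resource."""
        if uri not in self._resources:
            return 404, None
        return 200, self._resources[uri]

    def post(self, uri, new_state):
        """POST: modify a resource by transferring a new state."""
        if uri not in self._resources:
            return 404, None
        self._resources[uri] = new_state
        return 200, None

    def delete(self, uri):
        """DELETE: remove a resource."""
        if uri not in self._resources:
            return 404, None
        del self._resources[uri]
        return 204, None


store = ResourceStore()
store.put("/notes/1", {"text": "draft"})
store.post("/notes/1", {"text": "final"})
print(store.get("/notes/1"))    # (200, {'text': 'final'})
store.delete("/notes/1")
print(store.get("/notes/1"))    # (404, None)
```

The store keeps the state of resources but nothing about who called it: every request names its resource and carries the full new state, which is what stateless execution asks for.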
Event–based Architecture
An event-based architecture supports publish-subscribe communication
Publishers: components that announce data to be shared
Subscribers: components that register their interest in published data
Decouples sender and receiver (asynchronous communication)
The two parties need not both be running at the time of communication
An event can be considered “a significant change in state”
Components:
Can be an instance of a class or simply a module.
Connectors:
Event buses
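An event bus as a connector can be sketched in a few lines. This is an in-process illustration under stated assumptions: the `EventBus` class, its method names, and the topic strings are hypothetical, and a queue that is drained by an explicit `deliver()` call stands in for asynchronous delivery.

```python
from collections import defaultdict, deque


class EventBus:
    """Hypothetical event bus: publishers and subscribers only know the bus."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> list of callbacks
        self._pending = deque()                  # events not yet delivered

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, data):
        # The publisher returns immediately; delivery happens later.
        self._pending.append((topic, data))

    def deliver(self):
        while self._pending:
            topic, data = self._pending.popleft()
            for callback in self._subscribers.get(topic, []):
                callback(data)


received = []
bus = EventBus()
bus.subscribe("temperature", received.append)
bus.publish("temperature", 21.5)
bus.publish("humidity", 40)       # no subscriber: dropped at delivery time
bus.deliver()
print(received)                   # [21.5]
```

The publisher never names a receiver, and the subscriber never names a sender; both refer only to a topic on the bus.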
System Architectures
The software components, their interactions, and their
placement lead to an instance of a software architecture,
also called a system architecture.
System architectures are of three types:
Centralized - most components located on a single machine
Decentralized - most machines have approximately the same
functionality
Hybrid - some combination
Centralized Architecture
In the basic client-server model, processes in a distributed system are
divided into two (possibly overlapping) groups.
Server: a process implementing a specific service, e.g., a file server
Client: a process that requests a service from a server
Clients and servers can be on different machines
Clients follow a request/reply model when using services
Communication between a client and a server can be implemented by:
A connectionless protocol (e.g., UDP) when the underlying network is fairly reliable,
as in local-area networks
A connection-oriented protocol (e.g., TCP) in wide-area networks (WANs)
Possible failures:
Request message was lost
Reply message was lost
Server failed before, during, or after performing the service
Common approach to a lost request in connectionless communication:
Retransmission (resending the request)
Good for idempotent operations, i.e., operations that can be repeated more
than once without harm, e.g., “return the current value of x”
Not good for non-idempotent operations such as “increase the value of x by 100”,
because retransmission may result in performing the operation twice
In this case, reporting an error is more appropriate than resending
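The difference between the two kinds of operations shows up when a reply, rather than the request, is lost: the client cannot tell the two cases apart, so it retransmits and the server executes the operation a second time. The sketch below simulates exactly that; the `Server` class and the `send_with_lost_reply` helper are hypothetical names introduced for illustration.

```python
class Server:
    """Hypothetical server holding a single value x."""

    def __init__(self, x):
        self.x = x

    def read_x(self):              # idempotent: repeating it changes nothing
        return self.x

    def add_to_x(self, amount):    # not idempotent: each execution changes x
        self.x += amount
        return self.x


def send_with_lost_reply(operation, *args):
    """Simulate a lost reply followed by one retransmission."""
    operation(*args)               # server executes, but the reply is lost
    return operation(*args)        # client resends; this reply arrives


server = Server(x=500)
print(send_with_lost_reply(server.read_x))          # 500, as expected
print(send_with_lost_reply(server.add_to_x, 100))   # 700, although 600 was intended
```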
Logical Architecture vs. Physical Architecture
Layer and tier are roughly equivalent terms, but
Layer typically implies software and
Tier is more likely to refer to hardware.
Logical organization is not physical organization.
Physical architecture may or may not match the logical architecture.
That is, logically separate components might reside on a single machine or
on different machines
Clients and servers could be placed on the same node, or be
distributed according to several different topologies.
Single-Tier Architecture: dumb terminal/mainframe configuration
Two-Tier Architecture: client/single server configuration
Three-Tier Architecture: each layer on separate machine
Two-tier and three-tier are the most common
Two-Tiered Architecture
Where are the three application-layers placed?
On the client machines, or on the server machines?
A range of possible solutions:
Thin client: the client machine implements only (part of) the user-
interface level
The server machine implements the rest, i.e., the processing and data
levels
Pros: easier to manage, more reliable, client machines don’t need to be so
large and powerful
Con: perceived performance loss at the client
Fat client: all of the user interface, the application processing, and some data
reside at the client
Pros: reduces the workload at the server;
more scalable
Cons: harder for system administrators to manage;
less secure
Other solutions lie in between thin client and fat client
[Figure: two-tiered architectures]
Three-Tiered Architecture
The server tier in a two-tiered architecture becomes more and more
distributed
A single server is no longer adequate for modern information systems
This leads to a three-tiered architecture
A server may itself act as a client of another server
Three-tiered: each of the three layers is placed on its own
machine.
Decentralized Architectures
Placing logically different components on different machines is
called vertical distribution (VD)
The user-interface, processing, and data levels are placed on
different machines
It is similar to the concept of vertical fragmentation in distributed
databases, where
Tables are split column-wise and the columns are distributed over different machines
The advantage of VD is that each machine can be tailored to a
specific type of function
An alternative to VD is horizontal distribution (HD)
A client or server may be physically split up into logically equivalent
parts
Each part operates on its own share of the complete data set,
which balances the workload
This is similar to horizontal fragmentation in
distributed databases, where
Tables are split row-wise, and subsets of rows are distributed over
different machines
Peer-to-peer systems are a class of modern architectures
that support horizontal distribution.
The functions that need to be carried out are represented by every
process that constitutes the distributed system
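Horizontal distribution of a data set can be sketched by hashing each row's key to one of several logically equivalent servers. This is only an illustration under assumptions: the three dictionaries stand in for separate machines, and the `server_for`, `store`, and `lookup` names and the CRC32-modulo placement rule are hypothetical choices. Hashing tends to spread rows roughly evenly but does not guarantee a perfectly balanced load.

```python
import zlib

NUM_SERVERS = 3
servers = [{} for _ in range(NUM_SERVERS)]   # each dict plays one machine


def server_for(key):
    """Pick the server responsible for a key (deterministic hash, then modulo)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SERVERS


def store(key, row):
    servers[server_for(key)][key] = row


def lookup(key):
    return servers[server_for(key)].get(key)


for i in range(12):
    store(f"customer-{i}", {"id": i})

print([len(s) for s in servers])   # number of rows held by each server
print(lookup("customer-7"))        # {'id': 7}
```

Every server runs the same code and differs only in which share of the rows it holds, which is what distinguishes this from vertical distribution.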
Peer-to-peer systems
P2P systems partition tasks or workloads among
peers
Often, the processes that constitute the system are all equal
Nodes act as both client and server;
Much of the interaction is symmetric.
Overlay network
Nodes of a P2P distributed system are connected using an
overlay network
It is a network that is built on top of another network
Its nodes are formed by the processes of the system.
Overlay networks in the P2P system:
Define the structure between nodes in the system.
Allow nodes to route requests to locations that may not be known at
time of request.
The main question for a peer-to-peer system is
how to organize the processes in an overlay network
Their organization can be:
Unstructured P2P
Structured P2P
Hybrid P2P
Unstructured P2P architecture
Relies largely on randomized algorithms to construct the
overlay network
Each node has a list of neighbours, which is constructed in a more or less
random way
One challenge is how to efficiently locate a needed data
item
The two common approaches are
Flooding
Random walk
Flooding:
Issuing node u passes a request for data item d to all its neighbors.
A receiving node v ignores the request if it has seen it
before. Otherwise, v searches locally for d.
If v finds d, it returns it; otherwise, it forwards the request to its own
neighbors (recursively)
However, this approach causes high signalling traffic
over the network
The flood may be limited by a Time-To-Live (TTL): a maximum number of hops.
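Flooding with a TTL can be sketched as a breadth-first spread of request messages over the overlay. The sketch makes several assumptions: the overlay is given as a hypothetical adjacency dictionary, `holdings` maps nodes to the items they store, the search stops at the first node that has the item, and the message count only covers requests processed up to that point.

```python
from collections import deque


def flood(neighbors, holdings, start, item, ttl):
    """Return (node holding item or None, request messages processed)."""
    seen = {start}
    # Each queue entry is a request in flight: (receiver, hops it may still travel).
    queue = deque((v, ttl) for v in neighbors[start]) if ttl > 0 else deque()
    messages = 0
    while queue:
        node, hops_left = queue.popleft()
        messages += 1
        if node in seen:
            continue                      # request seen before: ignore it
        seen.add(node)
        if item in holdings.get(node, ()):
            return node, messages         # found locally
        if hops_left > 1:                 # TTL not yet exhausted: forward
            queue.extend((v, hops_left - 1) for v in neighbors[node])
    return None, messages


# Illustrative overlay: u is the issuing node, d holds the item.
neighbors = {
    "u": ["a", "b"],
    "a": ["u", "c"],
    "b": ["u", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
holdings = {"d": {"song.mp3"}}

print(flood(neighbors, holdings, "u", "song.mp3", ttl=3))
print(flood(neighbors, holdings, "u", "song.mp3", ttl=2))
```

With a TTL of 3 the request reaches `d`; with a TTL of 2 it dies out one hop short. In both runs several messages arrive at nodes that have already seen the request, which is the signalling overhead mentioned above.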
Random walk:
Issuing node u passes request for d to randomly chosen
neighbor, v.
If v does not have d, it forwards request to one of its
randomly chosen neighbors, and so on.
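A random walk can be sketched as a loop that forwards a single request to one randomly chosen neighbor at a time. The step limit `max_steps` is an assumption added so the walk terminates; the overlay, the `holdings` map, and the function name are likewise hypothetical.

```python
import random


def random_walk(neighbors, holdings, start, item, max_steps, rng=random):
    """Return (node holding item or None, steps taken)."""
    node = start
    for step in range(1, max_steps + 1):
        node = rng.choice(neighbors[node])     # forward to one random neighbor
        if item in holdings.get(node, ()):
            return node, step
    return None, max_steps


# Illustrative overlay: u is the issuing node, d holds the item.
neighbors = {
    "u": ["a", "b"],
    "a": ["u", "c"],
    "b": ["u", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
holdings = {"d": {"song.mp3"}}

print(random_walk(neighbors, holdings, "u", "song.mp3", max_steps=50))
```

Only one message is in flight at any time, so the traffic is far lower than with flooding, but the walk may take many steps or give up before it reaches a node holding the item.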
Structured P2P
Nodes are organized following a specific distributed data
structure.
The most common one is the distributed hash table (DHT)
In such systems, each data item is uniquely associated with a key,
which is in turn used as an index.
Each node is responsible for storing the data associated with a
subset of these keys
The P2P system is thus responsible for storing (key, value) pairs
Looking up data item d with key k means routing the request to the node
responsible for k.
Example: Chord
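The mapping from keys to responsible nodes can be sketched with a small identifier ring. The sketch assumes a Chord-like rule in which a key is stored at the first node whose identifier is at least the key's identifier, wrapping around at the end of the ring. It uses global knowledge of all node identifiers and therefore does not model hop-by-hop routing; the 5-bit identifier space, the node identifiers, and the `Ring` class are illustrative assumptions.

```python
import bisect
import hashlib

ID_BITS = 5                                   # identifiers 0..31 (illustrative)


def key_id(key):
    """Hash a key name into the identifier space."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


class Ring:
    def __init__(self, node_ids):
        self.node_ids = sorted(node_ids)
        self.storage = {n: {} for n in self.node_ids}   # per-node (key, value) store

    def responsible_node(self, ident):
        """First node with identifier >= ident, wrapping around the ring."""
        i = bisect.bisect_left(self.node_ids, ident)
        return self.node_ids[i % len(self.node_ids)]

    def put(self, key, value):
        self.storage[self.responsible_node(key_id(key))][key] = value

    def get(self, key):
        return self.storage[self.responsible_node(key_id(key))].get(key)


ring = Ring([1, 4, 9, 11, 14, 18, 20, 21, 28])
print(ring.responsible_node(12))   # 14
print(ring.responsible_node(29))   # 1 (wraps around)
ring.put("report.pdf", "contents")
print(ring.get("report.pdf"))      # contents
```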
Hybrid Architectures
Many distributed systems require properties from both client-
server and peer-to-peer architectures.
So, they put together features from both centralized and
decentralized architectures, resulting in hybrid architectures.
Some nodes are assigned special functions in a well-
organized fashion
Examples
Edge-server systems: servers are placed at the edge of an enterprise network
E.g., ISPs, whose edge servers serve their own clients but cooperate with
other edge servers to host shared content
Collaborative distributed systems:
E.g., BitTorrent, which supports parallel downloading and uploading
of chunks of a file.
A node first interacts with a client-server system to download the torrent file,
and then operates in a decentralized manner.
End of Chapter 2