Chapter 1-Introduction
Distributed Systems
Department of Software Engineering
WCU
Lecture by: Mesay A. M.
Introduction
Before the mid-80s, computers were
• very expensive (hundreds of thousands or even millions of
dollars)
• very slow (a few thousand instructions per second)
• not connected among themselves
After the mid-80s: two major developments
• cheap and powerful microprocessor-based computers appeared
• computer networks
• LANs at speeds ranging from 10 to 1000 Mbps
• WANs at speeds ranging from 64 Kbps to gigabits/sec
Consequence
• feasibility of using a large network of computers to work for the
same application; this is in contrast to the old centralized systems
where there was a single computer with its peripherals.
Definition of a Distributed System
Distributed System
A distributed system is a collection of independent computers that
appears to its users as a single coherent system (a single computer)
(Tanenbaum & Van Steen)
This definition has two aspects:
1. hardware: autonomous machines
2. software: a single system view for the users
Other Definitions
• A distributed system is a system designed to support the development of
applications and services which can exploit a physical architecture consisting
of multiple, autonomous processing elements that do not share primary
memory but cooperate by sending asynchronous messages over a
communication network (Blair & Stefani)
Why Distributed?
Resource and Data Sharing
• printers, databases, multimedia servers, ...
Availability, Reliability
• the loss of some instances can be hidden
Scalability, Extensibility
• the system grows with demand (e.g., extra servers)
Performance
• huge power (CPU, memory, ...) available
Inherent distribution, communication
• organizational distribution, e-mail, video
Characteristics of Distributed Systems
Differences between the computers and the ways they
communicate are hidden from users
Users and applications can interact with a distributed
system in a consistent and uniform way regardless of
location
Distributed systems should be easy to expand and scale
A distributed system is normally continuously available,
even if there may be partial failures
Users and applications should not notice that parts are being
replaced or fixed, or that new parts are added to serve more
users or applications
Organization of a Distributed System
To support heterogeneous computers and networks with a single-
system view, a distributed system is often organized by means of a
layer of software called middleware that extends over multiple
machines.
Different forms of transparency in a distributed system
Transparency   Description
Access         Hide differences in data representation and how a resource
               is accessed
Location       Hide where a resource is physically located; where is
               http://www.prenhall.com/index.html? (naming)
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while
               in use; e.g., mobile users using their wireless laptops
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competitive
               users; a resource must be left in a consistent state
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory or on disk
Openness in a Distributed System
A distributed system should be open; for this we need well-defined interfaces
Interoperability
components of different origin can communicate
Portability
components work on different platforms
Another goal of an open distributed system is that it should be flexible
and extensible; easy to configure the system out of different
components; easy to add new components, replace existing ones
An Open Distributed System is a system that offers services according
to standard rules that describe the syntax and semantics of those
services; e.g., protocols in networks
In distributed systems, such services are often specified through
interfaces, which are often described using an Interface Definition
Language (IDL)
an IDL specifies only syntax: the names of the functions, types of
parameters, return values, possible exceptions, ...
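As a rough illustration of what an interface definition captures, the sketch below writes an interface as a Python abstract class. It is not a real IDL; the names FileService, NoSuchFile and InMemoryFileService are hypothetical and chosen only for this example. The interface fixes function names, parameter types, return types and exceptions, while the behaviour lives entirely in the implementation.

```python
from abc import ABC, abstractmethod


class NoSuchFile(Exception):
    """Hypothetical exception that the interface declares as possible."""


class FileService(ABC):
    """Interface only: names, parameter types, return type, exceptions."""

    @abstractmethod
    def read(self, name: str, offset: int, length: int) -> bytes:
        """Return up to `length` bytes starting at `offset`; may raise NoSuchFile."""


class InMemoryFileService(FileService):
    """One possible implementation; any other one could replace it."""

    def __init__(self, files: dict):
        self._files = dict(files)

    def read(self, name: str, offset: int, length: int) -> bytes:
        if name not in self._files:
            raise NoSuchFile(name)
        return self._files[name][offset:offset + length]
```

A client written against FileService keeps working if InMemoryFileService is swapped for a different implementation, which is the kind of interoperability and portability an open system aims for.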
Scalability in Distributed Systems
A distributed system should be scalable
size: adding more users and resources to the system
geographically: users and resources may be far apart
administratively: should be easy to manage even if it spans
many administrative organizations
scalability problems: performance problems caused by limited
capacity of servers and networks
(a) a server checking the correctness of field entries
(b) a client doing the job
e.g., shipping code is now supported in Web applications using Java
applets
Distribution
e.g. DNS - Domain Name System
divide the name space into zones
for details, see later in Chapter 4 - Naming
Pros and Cons of Distributed Systems
Advantages of Distributed Systems
Performance: Very often a collection of processors can provide higher
performance (and better price/performance ratio) than a centralized
computer.
Distribution: many applications involve, by their nature, spatially
separated machines (banking, commercial, automotive system).
Reliability (fault tolerance): if some of the machines crash, the system
can survive.
Incremental growth: as requirements on processing power grow, new
machines can be added incrementally.
Sharing of data/resources: shared data is essential to many applications
(banking, computer supported cooperative work, reservation systems);
other resources can be also shared (e.g. expensive printers).
Communication: facilitates human-to-human communication.
Pros and Cons of Distributed Systems (cont'd)
Disadvantages of Distributed Systems
Hardware and Software Concepts in Distributed System
Hardware Concepts of Distributed System
different classification schemes exist
multiprocessors - with shared memory
multicomputers - that do not share memory
can be homogeneous or heterogeneous
Figure: a bus-based multiprocessor (a single backbone)
bus-based multiprocessors are difficult to scale even with caches
two possible solutions: crossbar switch and omega network
Crossbar switch
divide memory into modules and connect them to the processors with
a crossbar switch
at every intersection, a cross point switch is opened and closed to
establish connection
problem: expensive; with n CPUs and n memories, n^2 switches are
required
Omega network
use switches with multiple input and output lines
drawback: high latency because of several switching stages between
the CPU and memory
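The cost and latency trade-off between the two interconnects can be made concrete with a small calculation. The sketch below assumes the omega network is built from 2x2 switches and that n is a power of two; under those assumptions (a standard textbook result, not stated on the slide) it has log2(n) stages of n/2 switches each. The function names are illustrative only.

```python
def crossbar_switches(n: int) -> int:
    """Cross point switches needed to connect n CPUs to n memory modules."""
    return n * n


def omega_stages(n: int) -> int:
    """Switching stages a request crosses in an omega network of 2x2 switches."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    return n.bit_length() - 1  # log2(n) for a power of two


def omega_switches(n: int) -> int:
    """Total 2x2 switches: n/2 switches in each of the log2(n) stages."""
    return (n // 2) * omega_stages(n)


if __name__ == "__main__":
    for n in (8, 64, 1024):
        print(n, crossbar_switches(n), omega_switches(n), omega_stages(n))
```

The crossbar needs far more switches but connects CPU and memory in a single hop, while the omega network is cheaper and pays for that with one switching delay per stage.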
Homogeneous Multicomputer Systems
Also referred to as System Area Networks (SANs)
the nodes are mounted on a big rack and connected through a high-
performance network
could be bus-based or switch-based
bus-based
a shared multiaccess network such as Fast Ethernet can be used and
messages are broadcast
performance drops sharply with more than 25-100 nodes (contention)
switch-based
messages are routed through an interconnection network
two popular topologies: meshes (or grids) and hypercubes
Figure: a hypercube and a grid
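To make the two topologies concrete, the sketch below computes the direct neighbours of a node in each. It assumes hypercube nodes are numbered 0 to 2^d - 1 so that neighbours differ in exactly one bit, and that grid nodes are addressed by (row, column) without wrap-around links. Both conventions and the function names are assumptions made for this example.

```python
def hypercube_neighbors(node: int, dim: int) -> list:
    """Neighbours of `node` in a dim-dimensional hypercube (flip one bit each)."""
    if not 0 <= node < 2 ** dim:
        raise ValueError("node id out of range for this dimension")
    return [node ^ (1 << bit) for bit in range(dim)]


def grid_neighbors(row: int, col: int, rows: int, cols: int) -> list:
    """Neighbours of (row, col) in a rows x cols mesh without wrap-around."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


if __name__ == "__main__":
    print(hypercube_neighbors(5, 3))
    print(grid_neighbors(0, 0, 3, 3))
```

In a hypercube every node has exactly d links, whereas in a grid a corner node has only two and an interior node four.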
Heterogeneous Multicomputer Systems
most distributed systems are built on heterogeneous multicomputer
systems
the computers could be different in processor type, memory size,
architecture, power, operating system, etc. and the interconnection
network may be highly heterogeneous as well
the distributed system provides a software layer to hide the
heterogeneity at the hardware level; i.e., provides transparency
Software Concepts of Distributed System
OSs in relation to distributed systems
tightly-coupled systems, referred to as distributed OSs (DOS)
the OS tries to maintain a single, global view of the resources it
manages
used for multiprocessors and homogeneous multicomputers
loosely-coupled systems, referred to as network OSs (NOS)
a collection of computers each running its own OS; they work
together to make their services and resources available to others
used for heterogeneous multicomputers
Middleware: to enhance the services of NOSs so that a better
support for distribution transparency is provided
Project Work and Presentation for Next
Week, on August 19, 2022
Mini project using Java and JDBC that implements the
functionality of your last-semester Requirements Engineering
project title.
Total mark: 15%
Presentation day: next week, during class time
Submission should be a setup (installable) application
Submission date: August 17-18, 2022
Tools:
NetBeans or Eclipse
MySQL, Access, MSSQL, Oracle, and Apache
Summary of main issues
System      Description                                 Main Goal
DOS         Tightly-coupled operating system for        Hide and manage
            multiprocessors and homogeneous             hardware resources
            multicomputers
NOS         Loosely-coupled operating system for        Offer local services
            heterogeneous multicomputers (LAN and WAN)  to remote clients
Middleware  Additional layer atop of NOS implementing   Provide distribution
            general-purpose services                    transparency
Distributed Operating Systems
Two types
multiprocessor operating system: to manage the resources of a
multiprocessor
multicomputer operating system: for homogeneous
multicomputers
Uniprocessor Operating Systems
separating applications from operating system code through a
microkernel
Multiprocessor Operating Systems
extended uniprocessor operating systems to support multiple
processors having access to a shared memory
a protection mechanism is required for concurrent access to
guarantee consistency
two synchronization mechanisms: semaphores and monitors
semaphore: an integer with two atomic operations down (if s=0
then sleep; s := s-1) and up (s := s+1; wakeup a sleeping process
if any)
monitor: a programming language construct consisting of
procedures and variables that can be accessed only by the
procedures of the monitor; only a single process at a time is
allowed to execute a procedure
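The two mechanisms can be sketched together in a few lines. The class below is an illustrative counting semaphore written in a monitor-like style: its state is touched only inside methods that hold the same lock, so only one thread at a time runs a down or up. The class and helper names are assumptions for this example; in practice Python's standard library already offers threading.Semaphore.

```python
import threading


class CountingSemaphore:
    """Integer with atomic down/up, guarded monitor-style by one condition lock."""

    def __init__(self, value: int = 1):
        if value < 0:
            raise ValueError("initial value must be non-negative")
        self._value = value
        self._cond = threading.Condition()

    def down(self) -> None:
        with self._cond:
            while self._value == 0:   # if s = 0 then sleep
                self._cond.wait()
            self._value -= 1          # s := s - 1

    def up(self) -> None:
        with self._cond:
            self._value += 1          # s := s + 1
            self._cond.notify()       # wake up one sleeping thread, if any


def run_workers(num_threads: int = 4, increments: int = 1000) -> int:
    """Use the semaphore as a mutex around a shared counter."""
    sem = CountingSemaphore(1)
    total = {"count": 0}

    def worker() -> None:
        for _ in range(increments):
            sem.down()
            total["count"] += 1       # critical section
            sem.up()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["count"]
```

With the semaphore initialised to 1, the counter always ends at num_threads * increments, because at most one thread is inside the critical section at any time.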
Multicomputer Operating Systems
processors can not share memory; instead communication is through
message passing
each node has its own
kernel for managing local resources
separate module for handling inter-processor communication
Middleware
a distributed operating system is not intended to handle a collection of
independent computers but provides transparency and ease of use
a network operating system does not provide a view of a single
coherent system but is scalable and open
combine the scalability and openness of network operating systems
and the transparency and ease of use of distributed operating systems
this is achieved through a middleware, another layer of software
general structure of a distributed system as middleware
Different middleware models exist
treat every resource as a file; just as in UNIX
through Remote Procedure Calls (RPCs) - calling a procedure on a
remote machine
distributed object invocation
(details later in Chapter 2 - Communication)
middleware services
access transparency: by hiding the low-level message passing
naming: such as a URL in the WWW
distributed transactions: by allowing multiple read and write
operations to occur atomically
security
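The way an RPC layer gives access transparency can be sketched without a real network. In the example below the client stub turns an ordinary method call into a request message, and the server side unpacks it and calls the registered procedure. The JSON message format, the class names and the in-process "transport" are all assumptions made for illustration; a real middleware would send the bytes over a network connection.

```python
import json


class RpcServer:
    """Server side: unmarshals a request and dispatches to a registered procedure."""

    def __init__(self):
        self._procedures = {}

    def register(self, name, func) -> None:
        self._procedures[name] = func

    def handle(self, request_bytes: bytes) -> bytes:
        request = json.loads(request_bytes.decode("utf-8"))
        func = self._procedures.get(request["proc"])
        if func is None:
            reply = {"ok": False, "error": "unknown procedure: " + request["proc"]}
        else:
            reply = {"ok": True, "result": func(*request["args"])}
        return json.dumps(reply).encode("utf-8")


class ClientStub:
    """Client side: makes a remote procedure look like a local method call."""

    def __init__(self, transport):
        # transport: any callable taking request bytes and returning reply bytes
        self._transport = transport

    def __getattr__(self, name):
        def remote_call(*args):
            request = json.dumps({"proc": name, "args": list(args)}).encode("utf-8")
            reply = json.loads(self._transport(request).decode("utf-8"))
            if not reply["ok"]:
                raise RuntimeError(reply["error"])
            return reply["result"]
        return remote_call


if __name__ == "__main__":
    server = RpcServer()
    server.register("add", lambda a, b: a + b)
    stub = ClientStub(server.handle)   # stand-in for a network connection
    print(stub.add(2, 3))
```

The caller writes stub.add(2, 3) and never sees the message passing underneath, which is the hiding that the middleware service list refers to.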
Middleware and Openness
In an open middleware-based distributed system, the protocols used
by each middleware layer should be the same, as well as the interfaces
they offer to applications
A comparison between multiprocessor operating systems,
multicomputer operating systems, network operating systems, and
middleware-based distributed systems
Table columns: Item; Distributed OS (Multiproc., Multicomp.); Network OS;
Middleware-based OS
The Client-Server Model
how are processes organized in a system
thinking in terms of clients requesting services from servers
Application Layering
no clear distinction between a client and a server; for instance a
server for a distributed database may act as a client when it forwards
requests to different file servers
three levels exist
the user-interface level: implemented by clients and contains
all that is required by a client; usually through GUIs, but not
necessarily
the processing level: contains the applications
the data level: contains the programs that maintain the actual
data dealt with
the general organization of an Internet search engine into three different
layers
Client-Server Architectures
how to physically distribute a client-server application across several
machines
Multitiered Architectures
Two-tiered architecture: alternative client-server organizations
a) put only terminal-dependent part of the user interface on the client
machine and let the applications remotely control the presentation
b) put the entire user-interface software on the client side
c) move part of the application to the client, e.g. checking correctness in
filling forms
d) and e) are for powerful client machines
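Organization (c) can be illustrated by a form check that runs on the client before anything is sent to the server. The sketch below is only an example: the field names (name, email, quantity) and the validation rules are hypothetical and not taken from the slides.

```python
import re

# Deliberately simple pattern, for illustration only.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_form(fields: dict) -> dict:
    """Return a dict of field name -> error message; empty means the form is valid."""
    errors = {}
    if not fields.get("name", "").strip():
        errors["name"] = "name is required"
    if not EMAIL_PATTERN.fullmatch(fields.get("email", "")):
        errors["email"] = "email address looks malformed"
    quantity = fields.get("quantity", "")
    if not (quantity.isdigit() and int(quantity) > 0):
        errors["quantity"] = "quantity must be a positive whole number"
    return errors


def submit(fields: dict, send_to_server) -> bool:
    """Contact the server only when the client-side check passes."""
    if validate_form(fields):
        return False
    send_to_server(fields)
    return True
```

Rejecting bad input on the client saves a round trip to the server for every mistake, although a server would normally still re-check what it receives.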
three-tiered architecture: an example of a server acting as a client
Modern Architectures
vertical distribution: when the different tiers correspond directly with
the logical organization of applications
horizontal distribution: physically split up the client or the server into
logically equivalent parts, e.g., a Web server
Distributed Computing Systems
Cluster Computing
Essentially a group of systems connected through a LAN.
Homogeneous: same OS, near-identical hardware
Single managing node
Tightly coupled systems
Centralized job management & scheduling system
Grid Computing
Lots of nodes (including clusters across multiple subnets) from
everywhere.
Heterogeneous
Diversity and dynamism (it can handle nodes dropping in and out at any
point of time)
Dispersed across several organizations
Can easily span a wide-area network
To allow for collaborations, grids generally use virtual organizations
(grouping of users that will allow for authorization on resource
allocation).
Loosely coupled (decentralization)
Distributed job management & scheduling
Distributed Computing Systems
Cloud Computing
Web-based tools or applications that users can access and use through a
web browser as if it were a program installed locally on their own
computer.
Internet-based computing
offers dynamically scalable and virtualized resources that make up
services for users to use over the internet
The only thing the user's computer needs to be able to run is the cloud
computing system's interface software
Distributed Computing Systems-Distributed Information Systems
Transaction processing systems
A transaction is a collection of operations on the state of an object (database,
object composition, etc.) that satisfies the following properties (ACID):
Atomicity: All operations either succeed, or all of them fail.
- When the transaction fails, the state of the object will remain unaffected by
the transaction.
Consistency: A transaction establishes a valid state transition.
- This does not exclude the possibility of invalid,
intermediate states during the transaction’s execution.
Isolation: Concurrent transactions do not interfere with each other.
- It appears to each transaction T that other transactions occur either
before T, or after T, but never both.
Durability: After the execution of a transaction, its effects are
made permanent:
- Changes to the state survive failures.
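Atomicity and consistency can be demonstrated with a small money transfer. The sketch below uses Python's built-in sqlite3 module; the account table, the names in it and the rule that a balance may not go negative are assumptions made for this example. Because the database is in memory, durability is not shown here.

```python
import sqlite3


def make_bank() -> sqlite3.Connection:
    """Create an in-memory database with two example accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move `amount` from src to dst as one transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            # An overdraft violates the CHECK constraint and aborts everything,
            # including the credit that was already applied above.
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))
```

Between the two UPDATE statements the total amount of money is temporarily wrong, which is the kind of invalid intermediate state the consistency property permits inside a transaction; a failed transfer leaves both balances exactly as they were.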
Distributed Pervasive Systems
A next generation of distributed systems is emerging in which the nodes
are small, wireless, battery-powered, mobile (e.g. PDAs, smart phones,
wireless surveillance cameras, portable ECG monitors, etc.), and often
embedded as part of a larger system.
Some requirements:
Contextual change: The system is part of an environment in
which changes should be immediately accounted for.
Ad hoc composition: Each node may be used in very different ways by different
users.
- Requires ease-of-configuration.
Sharing is the default: Nodes come and go, providing sharable services and
information.
- Calls again for simplicity.
Distributed Pervasive Systems: Examples
Electronic Health Systems
Devices are physically close to a person
Where and how should monitored data be stored?
How can we prevent loss of crucial data?
What infrastructure is needed to generate and propagate alerts?
How can security be enforced?
How can physicians provide online feedback?
Sensor Networks
Consists of spatially distributed autonomous sensors to
cooperatively monitor physical or environmental conditions,
such as temperature, sound, vibration, pressure, motion or
pollutants, etc.
The nodes to which sensors are attached are:
• Many (10s-1000s)
• Simple (i.e., hardly any memory, CPU power, or communication
facilities)
• Often battery-powered (or even battery-less)
THANK YOU