Chapter 1
Introduction
1. Introduction
1.1 Introduction to Distributed Systems
1.2 Characteristics of Distributed Systems
1.3 Goals of Distributed Systems
1.4 Hardware Concepts
1.5 Software Concepts
1.6 Multiprocessor Systems, Multicomputer Systems
1.7 Distributed Programming
Computer and Network Evolution
Computer Systems
– Then: 10 million dollars for a machine executing 1 instruction/sec
– Now: 1000 dollars for a machine executing 1 billion instructions/sec
Computer Networks
– Local-area networks (LANs)
• Small amounts of information transferred in a few microseconds
• Large amounts of information moved at rates of 100 million to 10 billion bits/sec
– Wide-area networks (WANs)
• 64 Kbps to gigabits per second
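The two computer-system data points on this slide imply a price/performance gain that is easy to compute. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative and not part of the original slides.

```python
# Figures taken from the slide: cost in dollars and speed in instructions/sec.
old_cost_dollars, old_ips = 10_000_000, 1          # early machine
new_cost_dollars, new_ips = 1_000, 1_000_000_000   # current machine

def dollars_per_ips(cost, ips):
    """Cost of one instruction-per-second of computing capacity."""
    return cost / ips

# Ratio of the old cost per unit of speed to the new cost per unit of speed.
gain = dollars_per_ips(old_cost_dollars, old_ips) / dollars_per_ips(new_cost_dollars, new_ips)
print(f"price/performance gain: about {gain:.0e}")  # about 1e+13
```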
Definition of a Distributed System
A distributed system is:
A collection of independent computers
that appears to its users as a single
coherent system
Definition of a Distributed System
Figure 1.1: A distributed system organized as middleware.
Note that the middleware layer extends over multiple machines.
Characteristics of Distributed Systems
• Fault-Tolerant: Distributed systems consist of a large number
of hardware and software modules that are bound to fail in the
long run. Such component failures can lead to service
unavailability, so the system must keep operating despite them.
• Scalable: A distributed system can operate correctly even as
some aspect of the system is scaled to a larger size.
• Secure: Distributed systems should allow communication
between programs, users, and resources on different computers
while enforcing the necessary security arrangements.
• Transparency: Distributed systems should be perceived by
users and application developers as a whole rather than as a
collection of cooperating components.
1.3 Goals of Distributed Systems
Making resources available
Distribution transparency
Openness
Scalability
Make Resources Accessible
Access resources and share them in a controlled
and efficient way.
Printers, computers, storage facilities, data, files, Web
pages, and networks, …
Connecting users and resources also makes it
easier to collaborate and exchange information.
Internet for exchanging files, mail, documents, audio,
and video
Openness
Goal: Open distributed system -- able to interact with
services from other open systems, irrespective of the
underlying environment:
System should conform to well-defined interfaces
Systems should support portability of applications
Systems should easily interoperate
Achieving openness: At least make the distributed system
independent from heterogeneity of the underlying
environment:
Hardware
Platforms
Languages
Scale in Distributed Systems
Scalability: At least three components:
Number of users and/or processes (size scalability)
Maximum distance between nodes (geographical scalability)
Number of administrative domains (administrative scalability)
Techniques for Scaling
Distribution: Partition data and
computations across multiple machines:
Move computations to clients (Java applets)
Decentralized naming services (DNS)
Decentralized information systems (WWW)
Scaling Techniques (cont.)
Figure 1.4: The difference between letting
a) a server or
b) a client check forms as they are being filled in
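The difference between the two cases can be sketched by counting network messages. This is a simplified model, not the figure itself: the validation rule, the form fields, and the assumption that server-side checking costs one request and one reply per field are all hypothetical.

```python
def validate(value):
    """Hypothetical rule: a field is valid if it is not blank."""
    return bool(value.strip())

def server_side_checking(form):
    """Each field travels to the server for checking as it is filled in."""
    messages = 0
    for value in form.values():
        messages += 2      # one request carrying the field, one reply with the verdict
        validate(value)    # the check itself runs on the server
    return messages

def client_side_checking(form):
    """The client checks every field locally and ships the whole form once."""
    if all(validate(value) for value in form.values()):
        return 1           # a single message carrying the completed form
    return 0               # nothing is sent until the form is valid

form = {"first_name": "Ada", "last_name": "Lovelace", "email": "ada@example.org"}
print(server_side_checking(form))  # 6
print(client_side_checking(form))  # 1
```

Moving the check to the client removes the per-field round trips, which is what makes this a scaling technique over long-latency links.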
Techniques for Scaling
Replication/caching: Make copies of data
available at different machines:
Replicated file servers and databases
Mirrored Web sites
Transparency in a Distributed System
Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is located
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use
Replication    Hide that a resource may be replicated on multiple servers
Concurrency    Hide that a resource may be shared by several competitive users
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory or on disk
Advantages of distributed systems
over centralized systems
Advantages of distributed systems
over Independent PCs
Disadvantages of distributed
systems
Security and privacy
• Security: The protection of (computer) assets
from unauthorized access.
• Privacy: The right of the individual to be protected against
intrusion into personal affairs by direct physical means or by
publication of information.
“The most secure computers are those not connected to the
Internet and shielded from any interference”
Developing Distributed Systems: Pitfalls
There are many false assumptions:
The network is reliable
The network is secure
The network is homogeneous
The topology does not change
Latency is zero
Bandwidth is infinite
Transport cost is zero
There is one administrator
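As an illustration of the first assumption on this list, code that does not assume a reliable network typically bounds and retries its sends. The sketch below is one common approach, not something prescribed by the slides; `FlakyLink`, `send_with_retry`, and their parameters are hypothetical names introduced for the example.

```python
import time

class FlakyLink:
    """Hypothetical link that loses the first `failures` messages."""
    def __init__(self, failures):
        self.failures = failures
        self.attempts = 0

    def send(self, msg):
        self.attempts += 1
        if self.attempts <= self.failures:
            raise ConnectionError("message lost")
        return f"ack:{msg}"

def send_with_retry(link, msg, max_attempts=5, base_delay=0.01):
    """Retry a send with exponential backoff; give up after max_attempts."""
    for attempt in range(max_attempts):
        try:
            return link.send(msg)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise                               # report the failure to the caller
            time.sleep(base_delay * 2 ** attempt)   # wait longer after each loss

link = FlakyLink(failures=2)
print(send_with_retry(link, "hello"), "after", link.attempts, "attempts")
```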
Hardware and Software concepts
A distributed system may be a
Multiprocessor system (tightly coupled system)
Multicomputer system (loosely coupled system)
Major issues
How are they interconnected?
How do they communicate?
Hardware Concepts
Tightly Coupled versus Loosely Coupled
A tightly-coupled system is one where the components tend to be
reliably connected in close proximity.
A loosely-coupled system is one where the components tend to be
distributed.
Tightly coupled systems (multiprocessors)
o shared memory
o intermachine delay short, data rate high
Loosely coupled systems (multicomputers)
o private memory
o intermachine delay long, data rate low
Models of Computing
[Figure: The Von Neumann architecture. A CPU exchanges instructions and data with a memory holding code and data, and performs I/O with the external world.]
Shared Memory in Multiprocessors
[Figure: Four CPUs, each with its own instruction (I) and data (D) paths, connected to one shared memory holding code and data.]
Parallel Random Access Machine model:
• N processors connected to shared memory
• All memory addresses reachable in unit time by
any CPU
Shared Memory & Private Memory
Multiprocessors (not multicomputers)
Single physical address space shared by all CPUs
CPU A writes 37 to address 1000
CPU B then reads from address 1000 and gets 37
e.g., multiple processors on a board with shared
memory
Multicomputers
Every machine has its own private memory
CPU A writes 37 to its address 1000
CPU B reads from its address 1000 and gets whatever
happens to be there; not affected by the other write
e.g., PCs connected by a network
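The write-37-to-address-1000 example can be mimicked with a toy model in which memory is a dictionary. This is only a sketch: the class names are made up for illustration, and an unwritten location returns `None` to stand in for "whatever happens to be there".

```python
class Multiprocessor:
    """All CPUs read and write one shared address space."""
    def __init__(self):
        self.memory = {}

    def write(self, cpu, address, value):
        self.memory[address] = value

    def read(self, cpu, address):
        return self.memory.get(address)

class Multicomputer:
    """Every machine owns a private address space."""
    def __init__(self, cpus):
        self.memory = {cpu: {} for cpu in cpus}

    def write(self, cpu, address, value):
        self.memory[cpu][address] = value

    def read(self, cpu, address):
        return self.memory[cpu].get(address)

shared = Multiprocessor()
shared.write("A", 1000, 37)
print(shared.read("B", 1000))    # 37: B sees A's write

private = Multicomputer(["A", "B"])
private.write("A", 1000, 37)
print(private.read("B", 1000))   # None: B's address 1000 is unaffected
```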
Bus-based & Switch-based
Bus architecture of the interconnection network
single network, backplane, bus, cable or other medium
that connects all the machines
e.g., cable television
Switched architecture
Messages move along wires with an explicit switching
decision made at each step to route the message
along one of the outgoing wires.
e.g., worldwide public telephone system
Multicomputers
Bus-Based multicomputers
easy to build
communication volume much smaller than in a bus-based
multiprocessor (CPU-to-CPU traffic only, no CPU-to-memory traffic)
Switched multicomputers
interconnection networks
• A bus can get overloaded rather quickly with each CPU accessing
the bus for all data and instructions.
• A solution to this is to add cache memory between the
CPU and the bus.
• The cache holds the most recently accessed regions of memory.
• This way, the CPU has to go out to the bus to access main
memory only when the regions are not in its cache.
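The effect of a cache on bus traffic can be sketched with a small simulation. The model below is an assumption-laden simplification: it caches single addresses rather than regions, handles reads only, uses least-recently-used eviction, and ignores cache coherence between CPUs. `CachedCPU` and its fields are illustrative names.

```python
from collections import OrderedDict

class CachedCPU:
    """CPU with a small cache in front of the bus (reads only)."""
    def __init__(self, capacity):
        self.cache = OrderedDict()   # address -> value, least recently used first
        self.capacity = capacity
        self.bus_accesses = 0

    def read(self, memory, address):
        if address in self.cache:
            self.cache.move_to_end(address)   # hit: served without using the bus
            return self.cache[address]
        self.bus_accesses += 1                # miss: fetch from main memory over the bus
        value = memory[address]
        self.cache[address] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict the least recently used entry
        return value

memory = {address: address * 2 for address in range(16)}
trace = [0, 1, 2, 3] * 25                     # 100 reads over 4 hot addresses

for capacity in (0, 4):
    cpu = CachedCPU(capacity)
    for address in trace:
        cpu.read(memory, address)
    print(f"cache size {capacity}: {cpu.bus_accesses} bus accesses")
# cache size 0: 100 bus accesses
# cache size 4: 4 bus accesses
```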
Switched Multiprocessors
Using switches enables us to achieve a far greater CPU
density in multiprocessor systems.
An m ×n crossbar switch is a switch that allows any of m
elements to be switched to any of n elements.
To use a crossbar switch, we place the CPUs on one axis
(e.g. m) and break the memory into a number of chunks,
which are placed on the second axis (e.g. n memory
chunks).
There will be a delay only when multiple CPUs try to access
the same memory chunk.
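The contention rule can be sketched as one arbitration cycle of the crossbar. The function below is a toy model: granting a contested chunk to the lowest-numbered CPU is an assumed policy, and `crossbar_cycle` is a hypothetical name.

```python
def crossbar_cycle(requests):
    """Resolve one cycle of memory requests on a crossbar.

    requests maps a CPU id to the memory chunk it wants this cycle.
    Returns (granted, delayed): at most one CPU is granted per chunk,
    and any other CPU asking for the same chunk has to wait.
    """
    granted, delayed, busy_chunks = {}, [], set()
    for cpu in sorted(requests):          # assumed policy: lowest CPU id wins
        chunk = requests[cpu]
        if chunk in busy_chunks:
            delayed.append(cpu)
        else:
            busy_chunks.add(chunk)
            granted[cpu] = chunk
    return granted, delayed

# Four CPUs hitting four different chunks: nobody waits.
print(crossbar_cycle({0: 0, 1: 1, 2: 2, 3: 3}))  # ({0: 0, 1: 1, 2: 2, 3: 3}, [])
# CPUs 0 and 1 both want chunk 2: CPU 1 is delayed.
print(crossbar_cycle({0: 2, 1: 2, 2: 0, 3: 1}))  # ({0: 2, 2: 0, 3: 1}, [1])
```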
Software Concept
The software design goal in building a distributed system is to
create a Single System Image - have a collection of
independent computers appear as a single system to the user(s).
By single system, we refer to creating a system in which
the user is not aware of the presence of multiple
computers or of distribution.
Two types
Loosely-coupled software
Tightly-coupled software
Software Concept
Loosely-coupled - software in which the systems interact with
each other to a limited extent as needed. For the most part, they
operate as fully-functioning stand-alone machines.
If the network goes down, each machine remains largely functional.
Loosely coupled systems may be ones in which there are shared
services (parts of file service, web service).
With tightly-coupled software, there is a strong dependence on
other machines for all aspects of the system.
Essentially, both the interconnect and functioning of the remote
systems are necessary for the local system's operation.
Software-Hardware Combination
Three types:
Network Operating Systems
(True) Distributed Systems
Multiprocessor Time Sharing
Network Operating Systems
loosely-coupled software on loosely-coupled
hardware
e.g., a network of workstations connected by a LAN
Each machine has a high degree of autonomy
File servers: client and server model
(True) Distributed Systems
tightly-coupled software on loosely-coupled hardware
provide a single-system image or a virtual machine
(uniprocessor)
To accomplish this, we need certain capabilities:
Uniform naming from anywhere; the file system
should look the same.
Same system call interface everywhere
Time Sharing Systems
Multiple jobs are executed by switching the CPU
between them.
The CPU time is shared among the different
processes, hence the name “time-sharing
system”.
The OS defines a time slice that bounds how long
each process holds the CPU before the next one runs.
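Switching the CPU between jobs with a fixed time slice can be sketched as a round-robin schedule. Round-robin is one common policy and is assumed here; the slides do not name a specific one. The function name, job names, and time units are illustrative.

```python
from collections import deque

def round_robin(jobs, time_slice):
    """Simulate time sharing.

    jobs maps a job name to the CPU time it still needs.
    Returns the order in which jobs held the CPU, as (name, time_run) pairs.
    """
    queue = deque(jobs.items())
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(time_slice, remaining)      # a job never runs past its slice
        schedule.append((name, ran))
        if remaining > ran:
            queue.append((name, remaining - ran))   # unfinished: back of the queue
    return schedule

print(round_robin({"A": 5, "B": 2, "C": 4}, time_slice=2))
# [('A', 2), ('B', 2), ('C', 2), ('A', 2), ('C', 2), ('A', 1)]
```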
Comparison between Systems
Item                     Distributed OS,   Distributed OS,      Network OS
                         Multiproc         Multicomp
                         (Time Sharing)    (true)
Degree of transparency   Very High         High                 Low
Same OS on all nodes     Yes               Yes                  No
Number of copies of OS   1                 N                    N
Resource management      Global, central   Global, distributed  Per node
Scalability              No                Moderately           Yes
A comparison between multiprocessor operating systems,
multicomputer operating systems, and network operating systems