Chapter 1 - Introduction
Yihenew Gebru
February 2020
Course Outline
1. Introduction to DS: Introduction and Definition, Characteristics, Goals of a
Distributed System, Types of Distributed Systems.
2. Architectures: Architectural Styles, System Architectures.
3. Processes: Threads and their Implementation, Anatomy of Clients, Servers and
Design Issues, Code Migration.
4. Communication: Network Protocols and Standards, Remote Procedure Call,
Message-Oriented Communication, Stream-Oriented Communication, Multicast Communication.
5. Naming: Names, Identifiers, and Addresses, Flat Naming, Structured Naming,
Attribute-Based Naming.
6. Synchronization: Clock Synchronization, Logical Clocks, Mutual Exclusion, Election
Algorithms.
7. Consistency and Replication: Reasons for Replication, Data-Centric Consistency
Models, Client-Centric Consistency Models, Replica Management, Consistency
Protocols.
8. Fault Tolerance: Introduction to Fault Tolerance, Process Resilience, Reliable
Client-Server Communication, Reliable Group Communication, Distributed Commit,
Recovery.
Course Outline Continued
Textbook: Andrew S. Tanenbaum and Maarten van Steen,
“Distributed Systems: Principles and Paradigms”, 2nd
Edition, Prentice Hall, 2007.
Method of Evaluation: Lab exam & Project (30%), Test,
assignment, and quiz (30%), and final exam (40%).
Course Policies:
Late Policy: Assignments must be submitted by the due date
given for each assignment. If not, the student will be
penalized 10% of the total mark of the assignment.
Testing Policy: there will NOT be any makeup
exam/assessment unless there is reliable documented
evidence (supported by a department stamp).
Course Outline Continued
The main objectives of this course are to:
Explain what a distributed system is
Introduce the current DS development techniques
Introduce their construction issues
Introduce the issues that are involved in building reliable
distributed systems
Introduce possible applications of distributed systems
Design a distributed system that fulfills
requirements with regard to key distributed
systems properties
Course Tentative Schedule
1. Chapter 1 - Introduction - February – 4th Week
2. Chapter 2 - Architectures - March – 1st Week
3. Chapter 3 - Processes - March – 2nd Week
4. Chapter 4 - Communication - March – 3rd Week
Test - March 4th Week
Assignment I (Chapter 6-8) - April – 1st Week
5. Chapter 5 - Naming - April – 1st Week
Chapter 1: Introduction
Outline
Introduction
Definition
Goals of a Distributed System
Types of Distributed Systems
Introduction
Before the mid-80s, computers were:
Very expensive (hundreds of thousands or even millions of dollars).
Very slow (a few thousand instructions per second).
Not connected to one another.
After the mid-80s, two major developments:
Cheap and powerful microprocessor-based computers appeared.
Computer networks:
LANs at speeds ranging from 10 to 1000 Mbps (now even 10 Gbps)
WANs at speeds ranging from 64 Kbps to gigabits per second
Consequence:
It became feasible to use a large network of computers to work on the same
application, in contrast to the old centralized systems in which there was a
single computer with its peripherals.
Definition of a Distributed System
A distributed system is a collection of independent computers
that appears to its users as a single coherent system
(Tanenbaum & Van Steen)
This definition has two aspects:
hardware: autonomous machines
software: a single system view for the users
A distributed system is one that stops you getting any work
done when a machine you have never even heard of crashes
(Leslie Lamport)
A distributed system is one in which components located at
networked computers communicate and coordinate their
actions only by passing messages. (George Coulouris)
Middleware
The middleware layer extends over multiple machines,
and offers each application the same interface.
The goal is to hide the heterogeneity of the underlying operating
systems and hardware.
What does it contain?
Commonly used components and functions that need
not be implemented by applications separately.
Middleware
Middleware is software that is often referred to as the OS of
distributed systems.
It manages resources for its applications.
Moreover, it offers services that can also be found in most operating systems, including:
Facilities for inter-application communication.
Security services.
Accounting services.
Masking of and recovery from failures.
Example middleware service:
Remote Procedure Call (RPC) - allows an application to invoke
a function that is implemented and executed on a remote
computer as if it were locally available.
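The RPC idea can be sketched without a real network. In the minimal Python sketch below, a client stub marshals a call into bytes, a stand-in transport hands the bytes to a server-side dispatcher, and the result is marshaled back. All names (`RpcServer`, `ClientStub`, `add`) are hypothetical, and the "network" is an ordinary function call; real RPC middleware adds an actual transport, failure handling, and more.

```python
import json


class RpcServer:
    """Holds the procedures that are actually executed on the 'remote' side."""

    def __init__(self):
        self._procedures = {}

    def register(self, func):
        self._procedures[func.__name__] = func

    def handle(self, request_bytes):
        # Unmarshal the request, run the procedure, marshal the reply.
        request = json.loads(request_bytes.decode("utf-8"))
        result = self._procedures[request["method"]](*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class ClientStub:
    """Makes a remote procedure look like a local method call."""

    def __init__(self, transport):
        self._transport = transport  # callable taking bytes, returning bytes

    def __getattr__(self, method):
        def remote_call(*args):
            request = json.dumps({"method": method, "args": list(args)})
            reply = self._transport(request.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]
        return remote_call


def add(a, b):
    return a + b


server = RpcServer()
server.register(add)

# Assumption: the transport is a direct function call standing in for a network.
stub = ClientStub(transport=server.handle)
total = stub.add(2, 3)  # reads like a local call, but runs in the "server"
print(total)
```

The caller never sees the marshaling or the transport, which is the sense in which the remote function appears to be locally available.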
Examples of Distributed Systems
Local Area Network
Database Management System
Automatic Teller Machine Network
Internet/World-Wide Web
Mobile Computing/Mobile Communication
Local Area Network
[Figure: a local area network connecting desktop computers, an email server, a Web server, a file server, and print and other servers, with a router/firewall linking it to the rest of the Internet]
Database Management System
Automatic Teller Machine
Mobile Computing/Mobile Communication
Characteristics of Distributed Systems
Differences between the computers and the ways they
communicate are hidden from users
Users and applications can interact with a distributed
system in a consistent and uniform way regardless of location
Distributed systems should be easy to expand and scale
A distributed system is normally continuously available,
even if there may be partial failures (Fault tolerance)
Advantages of Distributed Systems
Performance: Very often a collection of processors can provide higher
performance (and better price/performance ratio) than a centralized
computer.
Distribution: many applications involve, by their nature, spatially
separated machines (banking, commercial, automotive system).
Reliability (fault tolerance): if some of the machines crash, the system
can survive.
Incremental growth: as requirements on processing power grow, new
machines can be added incrementally.
Sharing of data/resources: shared data is essential to many applications
(banking, computer-supported cooperative work, reservation systems);
other resources can also be shared (e.g. expensive printers).
Communication: facilitates human-to-human communication.
Disadvantages of Distributed Systems
Difficulties of developing distributed software: what should
operating systems, programming languages and applications look
like?
Networking problems: several problems are created by the
network infrastructure, which have to be dealt with: loss of
messages, overloading, ...
Security problems: sharing generates the problem of data security.
Goals of a Distributed System - Making Resources Accessible
Connect users and resources.
Make it easy to access and share resources (printers, computers, storage facilities, data, files,
Web pages, ...). Some of the reasons:
Economics: sharing resources such as printers and high-speed
computers.
Collaboration: to collaborate and exchange information.
Groupware: software for collaborative editing, teleconferencing,
etc.
E-commerce: buying and selling goods.
Goals of a Distributed System - Transparency
A distributed system that is able to present itself to users and
applications as if it were only a single computer system is
said to be transparent.
Different forms of transparency in a distributed system
Goals of a Distributed System - Openness
Openness is concerned with extensions and improvements
of distributed systems according to standard rules that
describe their syntax and semantics.
Detailed interfaces of components need to be published.
New components have to be integrated with existing
components.
Differences in data representation of interface types on
different processors (of different vendors) have to be
resolved.
Interoperability: components of different origin can
communicate
Portability: components work on different platforms
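One common way to resolve differences in data representation is for all components to agree on a single wire format. The sketch below uses Python's standard `struct` module with network (big-endian) byte order, so that the same bytes are produced regardless of the sender's native byte order. The record layout (a 32-bit identifier followed by a 64-bit float) and the function names are assumptions chosen for illustration.

```python
import struct

# Assumed wire format: unsigned 32-bit integer followed by a 64-bit float,
# both in network (big-endian) byte order, selected by the "!" prefix.
WIRE_FORMAT = "!Id"


def encode(sensor_id, reading):
    """Convert native values into the agreed byte layout."""
    return struct.pack(WIRE_FORMAT, sensor_id, reading)


def decode(payload):
    """Convert the agreed byte layout back into native values."""
    return struct.unpack(WIRE_FORMAT, payload)


payload = encode(7, 21.5)
print(len(payload))     # 12 bytes: 4 for the integer, 8 for the float
print(decode(payload))  # (7, 21.5)
```

Publishing such a layout as part of a component's interface is what lets components of different origin interoperate.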
Goals of a Distributed System - Scalability
A distributed system should be scalable along three dimensions:
size: adding more users and resources to the system
geographically: users and resources may be far apart
administratively: should be easy to manage even if it spans many
administrative organizations
Scalability allows the system and applications to expand in scale
without change to the system structure or the application algorithms.
But a scalable system may exhibit performance problems
Scalability Problems
Concept                  Example
Centralized services     A single server for all users
Centralized data         A single on-line telephone book
Centralized algorithms   Doing routing based on complete information
Scaling Techniques: How to solve scaling problems
The problem is mainly one of performance, and arises from
limitations in the capacity of servers and networks (for
geographical scalability, from high latency and mostly unreliable
links)
Three possible solutions:
Hiding communication latencies
Distribution
Replication
A. Hide Communication Latencies
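A common way to hide communication latency is asynchronous communication: issue requests and do other useful work instead of blocking on each reply. The sketch below illustrates this with Python's standard `concurrent.futures` module. The `fetch` function, which simulates a remote request with `time.sleep`, and the delay value are assumptions made for illustration only.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_LATENCY = 0.05  # seconds; an assumed round-trip delay


def fetch(item):
    """Stand-in for a remote request: waits, then returns a reply."""
    time.sleep(SIMULATED_LATENCY)
    return item * 2


items = [1, 2, 3, 4]

with ThreadPoolExecutor(max_workers=len(items)) as pool:
    # All requests are issued before any reply is awaited, so the waits overlap.
    futures = [pool.submit(fetch, item) for item in items]
    local_work = sum(items)  # useful local work done while replies are in flight
    replies = [future.result() for future in futures]

print(local_work, replies)
```

Issued one at a time, the four requests would wait roughly four times the latency; issued together, the waits overlap.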
B. Distribution
Means splitting a component into smaller parts and spreading those parts
across the system
e.g., the Domain Name System (DNS) divides the name space into
non-overlapping zones
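The partitioning idea can be sketched as follows: each zone is handled by its own server, and a lookup is routed to the single server responsible for the name. This is an illustration of the principle only, not the DNS protocol. The zone names, host names, and addresses are made up, and each "server" is just a dictionary.

```python
# Hypothetical zones: each one is served by a separate name server (a dict here).
ZONES = {
    "example.org": {"www.example.org": "192.0.2.10"},
    "cs.example.org": {"lab.cs.example.org": "192.0.2.20"},
    "example.net": {"mail.example.net": "192.0.2.30"},
}


def find_zone(name):
    """Return the most specific zone whose suffix matches the name."""
    matches = [z for z in ZONES if name == z or name.endswith("." + z)]
    if not matches:
        raise KeyError(f"no zone is responsible for {name}")
    return max(matches, key=len)


def resolve(name):
    zone = find_zone(name)    # pick the one server responsible for the name
    return ZONES[zone][name]  # only that server's table is consulted


print(find_zone("lab.cs.example.org"))
print(resolve("www.example.org"))
```

Because no single server holds the whole name space, load and data are spread across the system.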
C. Replication
Replicate components across a distributed system to increase
availability and for load balancing, leading to better performance
Replication is decided by the owner of a resource
Caching (a special form of replication) also reduces
communication latency; it is decided by the user
However, caching and replication may lead to consistency problems
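The consistency problem can be seen in a few lines. In the sketch below, a client caches a value, the owner then updates the original, and the client keeps reading the stale copy until the cache entry is dropped. The class names are hypothetical, and explicit invalidation is shown only as one possible remedy.

```python
class Origin:
    """The authoritative copy of the data, held by the resource owner."""

    def __init__(self):
        self.data = {"price": 100}


class CachingClient:
    """A client that keeps local copies to avoid repeated remote reads."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def read(self, key):
        if key not in self.cache:      # miss: one "remote" read
            self.cache[key] = self.origin.data[key]
        return self.cache[key]         # hit: no communication at all

    def invalidate(self, key):
        self.cache.pop(key, None)      # drop the local copy


origin = Origin()
client = CachingClient(origin)

first = client.read("price")    # fetched from the origin and cached
origin.data["price"] = 120      # the owner updates the original
stale = client.read("price")    # the cached copy is now out of date
client.invalidate("price")
fresh = client.read("price")    # re-fetched after invalidation

print(first, stale, fresh)
```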
Pitfalls when Developing Distributed Systems
Pitfalls arise from false assumptions made by first-time developers of
distributed systems. These assumptions relate to properties of
distributed systems and do not arise in non-distributed
applications:
The network is reliable (in reality it is not, which makes failure
transparency difficult to achieve)
The network is secure
The network is homogeneous
The topology does not change
Latency is zero
Bandwidth is infinite
Transport cost is zero
There is one administrator
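As an illustration of the first assumption, code that treats the network as reliable simply sends once and expects a reply. A more defensive sketch retries a bounded number of times and reports failure explicitly. The `LossyChannel` class below is hypothetical and drops messages deterministically so that the behavior is repeatable.

```python
class LossyChannel:
    """Hypothetical channel that drops the first `drops` messages it is given."""

    def __init__(self, drops):
        self.drops = drops
        self.attempts = 0

    def send(self, message):
        self.attempts += 1
        if self.attempts <= self.drops:
            raise TimeoutError("no reply received")
        return f"ack:{message}"


def send_with_retry(channel, message, max_attempts=3):
    """Retry a bounded number of times, then give up with an explicit error."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return channel.send(message)
        except TimeoutError as error:
            last_error = error
    raise ConnectionError(f"gave up after {max_attempts} attempts") from last_error


channel = LossyChannel(drops=2)
reply = send_with_retry(channel, "hello")
print(reply, channel.attempts)
```

Note that retrying is only safe when repeating the request does no harm, for example when the operation is idempotent.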
Types of Distributed System
Three main types:
Distributed computing systems
Focus on computation
Goal: High performance computing tasks
Distributed information systems
Focus on interoperability (the ability to exchange and use
information)
Goal: Distribute information across several servers
Distributed pervasive systems
Focus on mobile, embedded, communicating systems
Goal: Populate a real-life environment with a large variety of smart
devices
Distributed Computing Systems: Cluster Computing
Essentially a group of systems connected through a high speed LAN.
Homogeneous: Same OS, near-identical hardware
Single managing node
Tightly coupled systems
A master node runs middleware (containing libraries for parallel
programs) and controls the other compute nodes
Centralized job management & scheduling system
Distributed Computing Systems: Grid Computing
Lots of nodes (including clusters across multiple subnets) from everywhere.
Heterogeneous: no assumptions are made concerning hardware, operating
systems, networks, administrative domains, security policies, etc.
Diversity and dynamism (it can handle nodes dropping in and out at any
point in time)
Dispersed across several organizations and can easily span a wide-area
network
To allow for collaboration, grids generally use virtual organizations
(groupings of users that allow authorization for resource allocation).
Loosely coupled (decentralization)
Distributed job management & scheduling
Distributed Computing Systems: Cloud Computing
Web-based tools or applications that users can access and use through a
web browser as if they were programs installed locally on their own
computer.
Internet-based computing
Offers dynamically scalable, virtualized resources as
services that users access over the Internet
The only thing the user's computer needs to be able to run is the cloud
computing system's interface software
Distributed Information Systems
Evolved in organizations that were confronted with a wealth of networked
applications, but for which interoperability turned out to be problematic.
Many of the existing middleware solutions are the result of working with an
infrastructure in which it was easier to integrate applications into an
enterprise-wide information system. Integration took place at several
levels:
A networked application simply consisted of a server running that
application (often including a database) and making it available to remote
programs, called clients.
Such clients could send a request to the server for executing a specific
operation, after which a response would be sent back. Integration at the
lowest level would allow clients to wrap a number of requests, possibly for
different servers, into a single larger request and have it executed as a
distributed transaction.
The key idea was that either all or none of the requests would be executed.
… Cont’d
As applications became more sophisticated and were gradually separated into independent
components, it became clear that integration should also take place by letting applications
communicate directly with each other.
The vast majority of distributed systems in use today are traditional information systems.
Example: Transaction processing systems
BEGIN_TRANSACTION(server, transaction);
READ(transaction, file-1, data);
WRITE(transaction, file-2, data);
newData := MODIFIED(data);
IF WRONG(newData) THEN
    ABORT_TRANSACTION(transaction);    // undo all effects of the transaction
ELSE
    WRITE(transaction, file-2, newData);
    END_TRANSACTION(transaction);      // commit: make the effects permanent
END IF;
Note:
All READ and WRITE operations are executed, i.e. their effects are made permanent, only at the
execution of END_TRANSACTION.
A transaction forms an atomic operation.
Distributed Information Systems: Transaction processing systems
Focus on database applications - operations on a database are usually carried
out in the form of transactions.
Special primitives are required to program transactions, supplied either by the
underlying distributed system or by the language runtime system.
The exact list of primitives depends on the type of application; procedure calls,
ordinary statements, etc. can also be included.
… Cont’d
Transaction between processes:
e.g., assume the following banking operation:
Withdraw an amount x from account 1.
Deposit the amount x to account 2.
What happens if there is a problem after the first activity is carried out?
Group the two operations into one transaction; either both are carried
out or neither is.
We need a way to roll back when a transaction is not completed.
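The banking example can be tried out with Python's standard `sqlite3` module, which stands in here for a transactional store. The table layout, starting balances, and the `transfer` function are assumptions made for illustration. The second transfer fails after the withdrawal step, and the rollback undoes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move `amount` between accounts; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (source,)
            ).fetchone()
            if balance < 0:
                # Problem detected after the first activity was carried out.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


ok = transfer(conn, 1, 2, 30)        # succeeds: both updates are committed
after_ok = balances(conn)
failed = transfer(conn, 1, 2, 500)   # fails after the withdrawal: rolled back
after_failed = balances(conn)

print(ok, after_ok, failed, after_failed)
```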
… Cont’d
A transaction is a collection of operations on the state of an object (database,
object composition, etc.) that satisfies the following properties (ACID):
Atomicity: All operations either succeed, or all of them fail.
When the transaction fails, the state of the object will remain unaffected
by the transaction.
Consistency: A transaction establishes a valid state transition.
This does not exclude the possibility of invalid, intermediate states during
the transaction’s execution.
Isolation: Concurrent transactions do not interfere with each other.
It appears to each transaction T that other transactions occur either before
T, or after T, but never both.
Durability: After the execution of a transaction, its effects are made
permanent:
Changes to the state survive failures.
Distributed Pervasive Systems
The distributed systems discussed so far are characterized by their
stability: fixed nodes with high-quality connections to a network.
A next generation of distributed systems is emerging in which the nodes are
small, wireless, battery-powered, mobile (e.g. PDAs, smart phones,
wireless surveillance cameras, portable ECG monitors, etc.), and often
embedded as part of a larger system.
Some requirements:
Contextual change: The system is part of an environment in which
changes should be immediately accounted for.
Ad hoc composition: Each node may be used in very different ways
by different users.
This requires ease of configuration.
Sharing is the default: Nodes come and go, providing sharable
services and information.
Distributed Pervasive Systems: Examples
Home Systems
built around home networks
consist of one or more personal computers
integrate consumer electronics such as TVs, audio and video equipment,
gaming devices, (smart) phones, PDAs, and other personal wearables into a
single system.
Now/Soon: all kinds of devices such as kitchen appliances, surveillance
cameras, clocks, controllers for lighting, and so on, will all be hooked up into a
single distributed system.
Several challenges:
System should be completely self-configuring and self-managing
e.g. Universal Plug and Play (UPnP) standards by which devices
automatically obtain IP addresses, can discover each other, etc.
It is unclear how software and firmware in devices can be easily updated without
manual intervention, or, when updates do take place, how compatibility with
other devices can be preserved.
Distributed Pervasive Systems: Examples
Electronic Health Systems
New devices are being developed to monitor the well-being of individuals and
to automatically contact physicians when needed.
Major goal is to prevent people from being hospitalized.
Questions to be raised:
Where and how should monitored data be stored?
How can we prevent loss of crucial data?
What infrastructure is needed to generate and propagate alerts?
How can security be enforced?
How can physicians provide online feedback?
Distributed Pervasive Systems: Examples
Sensor Networks
Consist of spatially distributed autonomous sensors that cooperatively
monitor physical or environmental conditions, such as temperature, sound,
vibration, pressure, motion or pollutants.
The nodes to which sensors are attached are:
Many (10s-1000s)
Simple (i.e., hardly any memory, CPU power, or communication facilities)
Often battery-powered
Related Concepts: Parallel Systems
Parallel Computing is a form of computation in which many calculations are
carried out simultaneously, operating on the principle that large problems can
often be divided into smaller ones, which are then solved concurrently (in
parallel).
In parallel computing, all processors have access to a shared memory. Shared
memory can be used to exchange information between processors.
In distributed computing, each processor has its own private memory
(distributed memory). Information is exchanged by passing messages between
the processors.
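The two styles of exchanging information can be contrasted in a small sketch. Threads within one Python process stand in for processors here, which is an assumption made so the example is self-contained; in a real distributed system the workers would run on separate machines and the messages would travel over a network. The names `add_shared`, `add_private`, and `mailbox` are hypothetical.

```python
import queue
import threading

# Shared-memory style: workers update one shared value and must synchronize.
shared = {"total": 0}
lock = threading.Lock()


def add_shared(values):
    for value in values:
        with lock:
            shared["total"] += value


# Message-passing style: each worker keeps private state and sends only its result.
mailbox = queue.Queue()


def add_private(values):
    private_total = sum(values)  # never visible to other workers
    mailbox.put(private_total)   # the message is the only thing exchanged


chunks = [[1, 2, 3], [4, 5], [6]]

for target in (add_shared, add_private):
    workers = [threading.Thread(target=target, args=(chunk,)) for chunk in chunks]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

message_total = sum(mailbox.get() for _ in chunks)
print(shared["total"], message_total)
```

Both approaches compute the same sum; they differ in whether workers coordinate through a common memory location or only through the messages they send.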
Related Concepts: Centralized System
Centralized System Characteristics
One component with non-autonomous parts.
Component shared by users all the time.
All resources accessible.
Software runs in a single process.
Single point of control.
Single point of failure.
Distributed System Characteristics
Multiple autonomous components.
Components are not shared by all users.
Resources may not be accessible.
Software runs concurrently on different processors.
Multiple points of control.
Multiple points of failure.