Distributed Systems Chapter 3-Processes
Introduction
3.1 Threads and their Implementation
threads can be used in both distributed and nondistributed
systems
Threads in Nondistributed Systems
a process has an address space (containing program text
and data) and a single thread of control, as well as other
resources such as open files, child processes, accounting
information, etc.
(a) three processes, each with one thread; (b) one process with three threads
each thread has its own program counter, registers, stack,
and state; but all threads of a process share address space,
global variables and other resources such as open files, etc.
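The sharing described above can be illustrated with a minimal sketch. It assumes Python's standard threading module; the names counter, worker, and run are illustrative only. Each thread keeps local_total on its own stack, while counter is a global that all threads of the process see.

```python
import threading

counter = 0                # global variable: shared by all threads of the process
lock = threading.Lock()    # protects the shared counter against lost updates


def worker(n_increments, results, index):
    """Count locally (private stack variable) and globally (shared variable)."""
    global counter
    local_total = 0        # local variable: lives on this thread's own stack
    for _ in range(n_increments):
        local_total += 1
        with lock:
            counter += 1
    results[index] = local_total


def run(n_threads=3, n_increments=1000):
    """Start n_threads workers and return (shared total, per-thread totals)."""
    global counter
    counter = 0
    results = [0] * n_threads
    threads = [
        threading.Thread(target=worker, args=(n_increments, results, i))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter, results
```

Calling run(3, 1000) returns a shared total equal to the sum of the per-thread totals, because every thread updated the same global variable.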
Threads take turns in running
Threads allow multiple executions to take place in the same
process environment, called multithreading
Thread Usage – Why do we need threads?
e.g., a word processor has different parts; parts for
interacting with the user
formatting the page as soon as changes are made
timed savings (for auto recovery)
spelling and grammar checking, etc.
1. Simplifying the programming model: since many activities
are going on at once
2. They are easier to create and destroy than processes
since they do not have any resources attached to them
3. Performance improves by overlapping activities if there is
too much I/O; i.e., to avoid blocking when waiting for
input or doing calculations, say in a spreadsheet
4. Real parallelism is possible in a multiprocessor system
having finer granularity in terms of multiple threads per
process rather than processes provides better performance
and makes it easier to build distributed applications
in nondistributed systems, threads can be used with shared
data instead of processes to avoid context switching
overhead in interprocess communication (IPC)
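The performance argument (overlapping activities while some of them block) can be sketched as follows. This is an illustration under stated assumptions, not a benchmark: time.sleep stands in for a blocking I/O call, and the function names are made up for the example.

```python
import threading
import time


def blocking_io(delay):
    """Stand-in for a blocking operation such as a disk or network read."""
    time.sleep(delay)


def run_sequential(n, delay):
    """Perform n blocking operations one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        blocking_io(delay)
    return time.perf_counter() - start


def run_threaded(n, delay):
    """Perform n blocking operations in n threads; return elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=blocking_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

With several operations, run_threaded should take roughly one delay, while run_sequential takes about n delays, since the waits overlap instead of adding up.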
b. threads
threads are more important for implementing servers
e.g., a file server
the dispatcher thread reads incoming requests for a file
operation from clients and passes it to an idle worker
thread
the worker thread performs a blocking disk read; in which
case another thread may continue, say the dispatcher or
another worker thread
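The dispatcher/worker organization described above can be sketched with a shared request queue. This is a minimal in-process model, not a real file server: the files dictionary stands in for the disk, and serve, worker, and the None shutdown sentinel are assumptions made for the example.

```python
import queue
import threading


def worker(requests, results, files):
    """Worker thread: take requests off the queue until told to stop."""
    while True:
        request = requests.get()
        if request is None:            # sentinel from the dispatcher: shut down
            break
        request_id, filename = request
        # A real worker would perform a blocking disk read here; while it
        # blocks, the dispatcher or another worker can continue running.
        results[request_id] = files.get(filename, b"")


def serve(incoming, files, n_workers=3):
    """Dispatcher: hand each incoming request to an idle worker via the queue."""
    requests = queue.Queue()
    results = {}
    workers = [
        threading.Thread(target=worker, args=(requests, results, files))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for request_id, filename in enumerate(incoming):
        requests.put((request_id, filename))
    for _ in workers:
        requests.put(None)             # one sentinel per worker
    for w in workers:
        w.join()
    return [results[i] for i in range(len(incoming))]
```

The queue decouples the dispatcher from the workers: whichever worker is idle picks up the next request, so the dispatcher never has to track which one is free.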
Model                     Characteristics
Single-threaded process   No parallelism, blocking system calls
Threads (thread only)     Parallelism, blocking system calls
Finite-state machine      Parallelism, nonblocking system calls
three ways to construct a server
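The finite-state machine model is the least intuitive of the three, so a small simulation may help. In this sketch a single thread never blocks: for each request it records its state in a table and returns to the event loop, and it finishes the request when the matching completion event arrives. The event tuples and the function name are hypothetical; a real server would get such events from nonblocking system calls.

```python
def run_fsm_server(events):
    """Process events one at a time in a single thread, without blocking.

    events: iterable of ("request", req_id, filename) or
            ("disk_done", req_id, data) tuples (a made-up event format).
    Returns (replies in completion order, requests still pending).
    """
    pending = {}    # req_id -> state saved between events
    replies = []
    for event in events:
        kind = event[0]
        if kind == "request":
            _, req_id, filename = event
            # Start a nonblocking disk read, then remember what we were
            # doing instead of waiting for the result.
            pending[req_id] = {"state": "WAIT_DISK", "filename": filename}
        elif kind == "disk_done":
            _, req_id, data = event
            saved = pending.pop(req_id)    # restore the saved state
            replies.append((req_id, saved["filename"], data))
    return replies, pending
```

Because the state of every request is stored explicitly in the table, requests can complete out of order even though only one thread exists.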
3.2 Anatomy of Clients
the basic organization of the X Window System
b. Client-Side Software for Distribution Transparency
in addition to the user interface, parts of the processing
and data level in a client-server application are executed at
the client side
an example is embedded client software for ATMs, cash
registers, etc.
moreover, client software can also include components to
achieve distribution transparency
e.g., replication transparency
assume a distributed system with replicated servers; the
client proxy can send each request to every replica, and
the client-side software can transparently collect all
responses and pass a single return value to the client
application
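A client-side proxy of this kind can be sketched as follows. The replicas here are plain callables standing in for remote servers, and collapsing the replies by majority vote is an assumption of this example; the slides only say that a single value is returned.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class ReplicatedProxy:
    """Hides a set of replicas behind a single call() interface."""

    def __init__(self, replicas):
        # Each replica is a callable taking a request and returning a reply;
        # in a real system it would be a stub for a remote server.
        self.replicas = list(replicas)

    def call(self, request):
        """Send the request to every replica and return one reply."""
        with ThreadPoolExecutor(max_workers=len(self.replicas)) as pool:
            responses = list(
                pool.map(lambda replica: replica(request), self.replicas)
            )
        # Assumption: the most common reply wins.
        value, _count = Counter(responses).most_common(1)[0]
        return value
```

The client application calls proxy.call(request) exactly as it would call a single server, which is what makes the replication transparent.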
transparent replication of a server using a client-side solution
3.3 Servers and design issues
3.3.1 General Design Issues
How to organize servers?
Where do clients contact a server?
Whether and how a server can be interrupted
Whether or not the server is stateless
Client-to-server binding using a daemon
ii. use a superserver (as in UNIX) that listens to all endpoints
and then forks a process to take care of each request; this
avoids having many servers running simultaneously, most of
them idle
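The superserver idea can be sketched with one loop that watches several listening sockets. This is a simplified illustration: it starts a thread per request instead of forking a process (to stay portable), binds to ephemeral ports on 127.0.0.1, and the two services and the class name SuperServer are made up for the example.

```python
import selectors
import socket
import threading


def echo_service(conn):
    """Hypothetical service: send back whatever the client sent."""
    with conn:
        conn.sendall(conn.recv(1024))


def daytime_service(conn):
    """Hypothetical service: send a fixed placeholder line."""
    with conn:
        conn.sendall(b"daytime placeholder\n")


class SuperServer:
    """Listens on one endpoint per service; starts a handler only on demand."""

    def __init__(self, services):
        self.selector = selectors.DefaultSelector()
        self.listeners = []
        self.ports = {}                          # service name -> port number
        for name, handler in services.items():
            listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            listener.bind(("127.0.0.1", 0))      # port 0: the OS picks a free port
            listener.listen()
            self.selector.register(listener, selectors.EVENT_READ, handler)
            self.listeners.append(listener)
            self.ports[name] = listener.getsockname()[1]

    def serve(self, n_requests):
        """Handle n_requests connections, then return."""
        handled, threads = 0, []
        while handled < n_requests:
            for key, _events in self.selector.select():
                conn, _addr = key.fileobj.accept()
                # A UNIX superserver would fork a process here.
                t = threading.Thread(target=key.data, args=(conn,))
                t.start()
                threads.append(t)
                handled += 1
        for t in threads:
            t.join()

    def close(self):
        for listener in self.listeners:
            self.selector.unregister(listener)
            listener.close()
        self.selector.close()
```

Only the single superserver loop is active while no requests arrive; the per-service handlers exist only for the duration of a request.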
c. Whether and how a server can be interrupted
for instance, a user may want to interrupt a file transfer;
maybe it was the wrong file
let the client exit the client application; this will break the
connection to the server; the server will tear down the
connection assuming that the client had crashed
or
let the client send out-of-band data, i.e., data to be
processed by the server before any other data from the
client; the server may listen on a separate control
endpoint, or the data may be sent on the same connection
as urgent data, as in TCP
d. Whether or not the server is stateless
a stateless server does not keep information on the state
of its clients; for instance a Web server
soft state: a server promises to maintain state for a
limited time; e.g., to keep a client informed about
updates; after the time expires, the client has to poll
a stateful server maintains information about its clients;
for instance a file server that allows a client to keep a
local copy of a file and to perform update operations on it
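Soft state sits between the stateless and stateful extremes, and a small sketch may make the difference concrete. The server below remembers a client only for a limited lease; once the lease expires the entry is dropped and the client would have to poll. The class name, the lease length, and the injectable clock are assumptions made for the example.

```python
import time


class SoftStateServer:
    """Keeps per-client state only until its lease expires."""

    def __init__(self, lease_seconds, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock       # injectable so the behaviour can be tested
        self.leases = {}         # client id -> time at which the lease expires

    def subscribe(self, client_id):
        """Promise to keep this client informed for lease_seconds."""
        self.leases[client_id] = self.clock() + self.lease_seconds

    def clients_to_notify(self):
        """Drop expired leases and return the clients still being tracked."""
        now = self.clock()
        self.leases = {c: exp for c, exp in self.leases.items() if exp > now}
        return sorted(self.leases)
```

A stateless server would keep no leases table at all, and a stateful one would keep the entries until explicitly told to delete them.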
the general organization of a three-tiered server cluster
Distributed Servers
the problem with a server cluster is that when the logical
switch (the single access point) fails, the whole cluster
becomes unavailable
hence, several access points can be provided whose
addresses are publicly available, leading to a distributed
server
e.g., the DNS can return several addresses for the same host
name
3.4 Code Migration
so far, communication has been concerned with passing data
we may also pass programs, even while they are running and
in heterogeneous systems
code migration also involves moving data as well: when a
program migrates while running, its status, pending signals,
and other environment variables such as the stack and the
program counter also have to be moved
Reasons for Migrating Code
to improve performance; move processes from heavily-
loaded to lightly-loaded machines (load balancing)
to reduce communication: move a client application that
performs many database operations to a server if the
database resides on the server; then send only results to the
client
to exploit parallelism (for nonparallel programs): e.g., copies
of a mobile program (a crawler, as it is called in search
engines) moving from site to site searching the Web
to gain flexibility by dynamically configuring distributed
systems, instead of deciding in advance which parts of a
multitiered client-server application are to be run where
Strong Mobility
transfer code and execution segments; helps to migrate a
process in execution
can also be supported by remote cloning: an exact copy of
the original process is created on a different machine
and executes in parallel with the original process; UNIX
does this by forking a child process and letting the
child continue on the remote machine
migration can be
sender-initiated: started by the machine where the code
resides or is currently running; e.g., uploading programs
to a server; may require authentication or that the
client be a registered one
receiver-initiated: started by the target machine; e.g.,
Java Applets; easier to implement
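Strong mobility requires moving the execution state along with the code. The sketch below only emulates this at the application level: the task stores its own "program counter" (position) and data (total) explicitly, serializes them as JSON, and is resumed from that snapshot as if on another machine. Real strong mobility captures the actual stack and registers; the class and method names here are illustrative assumptions.

```python
import json


class SummingTask:
    """Adds up a list of numbers; position plays the role of a program counter."""

    def __init__(self, numbers, position=0, total=0):
        self.numbers = list(numbers)
        self.position = position
        self.total = total

    def run(self, max_steps=None):
        """Run up to max_steps steps; return True once the task has finished."""
        steps = 0
        while self.position < len(self.numbers):
            if max_steps is not None and steps >= max_steps:
                return False                 # paused in mid-execution
            self.total += self.numbers[self.position]
            self.position += 1
            steps += 1
        return True

    def capture(self):
        """Sender side: serialize the execution state for transfer."""
        return json.dumps(
            {"numbers": self.numbers, "position": self.position, "total": self.total}
        )

    @classmethod
    def resume(cls, blob):
        """Receiver side: rebuild the task exactly where it was paused."""
        state = json.loads(blob)
        return cls(state["numbers"], state["position"], state["total"])
```

A task paused after a few steps, captured, and resumed elsewhere continues from the saved position rather than starting over, which is the property that separates strong from weak mobility.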
Summary of models of code migration
Migration and Local Resources
how to migrate the resource segment
not always possible to move a resource; e.g., a reference to
a TCP port held by a process to communicate with other
processes
Types of Process-to-Resource Bindings
Binding by identifier (the strongest): a resource is referred
to by its identifier; e.g., a URL that refers to a Web page,
or an FTP server referred to by its Internet (IP) address
Binding by value (weaker): when only the value of a resource
is needed; in this case another resource can provide the
same value; e.g., standard libraries of programming
languages such as C or Java which are normally locally
available, but their location in the file system may vary from
site to site
Binding by type (weakest): a process needs a resource of a
specific type; reference to local devices, such as monitors,
printers, ...
in migrating code, the above bindings cannot change, but the
references to resources can
how can a reference be changed? it depends on whether the
resource can be moved along with the code, i.e., on the
resource-to-machine binding
Types of Resource-to-Machine Bindings
Unattached Resources: can be easily moved with the
migrating program (such as data files associated with the
program)
Fastened Resources: such as local databases and complete
Web sites; moving or copying may be possible, but very
costly
Fixed Resources: intimately bound to a specific machine or
environment such as local devices and cannot be moved
we have nine combinations to consider
                               Resource-to-machine binding
Process-to-resource binding    Unattached        Fastened          Fixed
By identifier                  MV (or GR)        GR (or MV)        GR
By value                       CP (or MV, GR)    GR (or CP)        GR
By type                        RB (or GR, CP)    RB (or GR, CP)    RB (or GR)
GR: establish a global systemwide reference
MV: move the resource
CP: copy the value of the resource
RB: rebind the process to a locally available resource
actions to be taken with respect to the references to local resources when migrating
code to another machine
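The nine combinations can be read as a lookup table, sketched below. The key names and the function preferred_action are made up for the example. The abbreviations are taken to mean the usual textbook actions (MV: move the resource, CP: copy its value, GR: establish a global reference, RB: rebind to a locally available resource), which is an assumption if the slides intend something else.

```python
# (process-to-resource binding, resource-to-machine binding) -> actions,
# with the preferred action first and the alternatives after it.
ACTIONS = {
    ("identifier", "unattached"): ["MV", "GR"],
    ("identifier", "fastened"):   ["GR", "MV"],
    ("identifier", "fixed"):      ["GR"],
    ("value", "unattached"):      ["CP", "MV", "GR"],
    ("value", "fastened"):        ["GR", "CP"],
    ("value", "fixed"):           ["GR"],
    ("type", "unattached"):       ["RB", "GR", "CP"],
    ("type", "fastened"):         ["RB", "GR", "CP"],
    ("type", "fixed"):            ["RB", "GR"],
}


def preferred_action(process_binding, machine_binding):
    """Return the first-choice action for a given pair of bindings."""
    return ACTIONS[(process_binding, machine_binding)][0]


def alternatives(process_binding, machine_binding):
    """Return the fallback actions listed in parentheses in the table."""
    return ACTIONS[(process_binding, machine_binding)][1:]
```

For example, preferred_action("value", "unattached") gives "CP": when only the value matters and the resource is easy to move, copying it is the first choice.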
Migration in Heterogeneous Systems
distributed systems are constructed on a heterogeneous
collection of platforms, each with its own OS and machine
architecture
heterogeneity problems are similar to those of portability
easier in some languages
for scripting languages the source code is interpreted
for Java an intermediary code is generated by the
compiler for a virtual machine
in weak mobility
since there is no runtime information, compile the source
code for each potential platform
in strong mobility
difficult to transfer the execution segment since it
may contain platform-dependent information such as
register values; see the book for possible solutions
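The scripting-language case of weak mobility can be sketched in a few lines: only the source code is shipped, and the receiver interprets it from the beginning, so no platform-dependent runtime state has to be transferred. The function receive_and_run and the shipped word_count code are hypothetical; the sketch does no sandboxing, which a real system accepting foreign code would need.

```python
def receive_and_run(source_code, entry_point, *args):
    """Interpret shipped source code and call one of its functions.

    Execution always starts from scratch at the receiver (weak mobility).
    Warning: exec runs arbitrary code; this sketch offers no protection.
    """
    namespace = {}
    exec(source_code, namespace)
    return namespace[entry_point](*args)


# Code as it might arrive over the network, as plain text.
MOBILE_CODE = """
def word_count(text):
    return len(text.split())
"""
```

Because the code travels as source text, the same MOBILE_CODE string runs unchanged on any machine that has an interpreter, which is why heterogeneity is easier to handle in this setting.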