Lecture 4.0 - Distributed File Systems
A file system is a subsystem of the operating system that performs file management
activities such as the organization, storage, retrieval, naming, sharing, and protection of files.
A file system frees the programmer from concerns about the details of space allocation and
layout of the secondary storage device.
The design and implementation of a distributed file system is more complex than that of a
conventional file system because the users and storage devices are physically dispersed.
In addition to the functions of the file system of a single-processor system, a distributed
file system supports the following:
1. User mobility
Users should be permitted to work on different nodes.
2. Availability
For better fault-tolerance, files should be available for use even in the event of
temporary failure of one or more nodes of the system. Thus the system should
maintain multiple copies of the files, the existence of which should be transparent to
the user.
3. Diskless workstations
A distributed file system, with its transparent remote-file accessing capability, allows
the use of diskless workstations in a system.
A file system typically provides the following services:
1. Storage service
Allocation and management of space on a secondary storage device, thus providing a
logical view of the storage system.
2. True file service
Concerned with operations on individual files, such as accessing and modifying file
data and creating and deleting files.
3. Name/Directory service
Responsible for directory related activities such as creation and deletion of
directories, adding a new file to a directory, deleting a file from a directory, changing
the name of a file, moving a file from one directory to another etc.
(This chapter deals with the design and implementation issues of the true file service
component of distributed file systems.)
Desirable Features of a Good Distributed File System
1. Transparency
- Structure transparency
Clients should not know the number or locations of file servers and the
storage devices. (Note: multiple file servers are provided for performance,
scalability, and reliability.)
- Access transparency
Both local and remote files should be accessible in the same way. The file
system should automatically locate an accessed file and transport it to the
client's site.
- Naming transparency
The name of the file should give no hint as to the location of the file. The
name of the file must not be changed when moving from one node to another.
- Replication transparency
If a file is replicated on multiple nodes, both the existence of multiple copies
and their locations should be hidden from the clients.
2. User mobility
Automatically bring the user's environment (e.g. the user's home directory) to the node
where the user logs in.
3. Performance
Performance is measured as the average amount of time needed to satisfy client
requests. This time includes CPU time + time for accessing secondary storage +
network access time. It is desirable that the performance of a distributed file system
be comparable to that of a centralized file system.
4. Scalability
Growth of nodes and users should not seriously disrupt service.
5. High availability
A distributed file system should continue to function in the face of partial failures
such as a link failure, a node failure, or a storage device crash.
A highly reliable and scalable distributed file system should have multiple and
independent file servers controlling multiple and independent storage devices.
6. High reliability
The probability of loss of stored data should be minimized. The system should
automatically generate backup copies of critical files.
7. Data integrity
Concurrent access requests from multiple users who are competing to access the file
must be properly synchronized by the use of some form of concurrency control
mechanism. Atomic transactions can also be provided.
8. Security
Users should be confident of the privacy of their data.
9. Heterogeneity
There should be easy access to shared data on diverse platforms (e.g. a Unix
workstation or a Wintel platform).
File Models
1. Unstructured and structured files
Based on the structure criteria, files are of two types: unstructured and structured.
In the unstructured model, used by most modern operating systems, a file is an
uninterpreted sequence of bytes whose meaning is left to the applications that use it.
In structured files (rarely used now) a file appears to the file server as an ordered
sequence of records. Records of different files of the same file system can be of
different sizes.
2. Mutable and immutable files
Based on the modifiability criteria, files are of two types, mutable and immutable.
Most existing operating systems use the mutable file model. An update performed on
a file overwrites its old contents to produce the new contents.
In the immutable model, rather than updating the same file, a new version of the file
is created each time a change is made to the file contents and the old version is
retained unchanged. The problems in this model are increased use of disk space and
increased disk activity.
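To make the contrast concrete, here is a minimal sketch, in Python, of how an
immutable, versioned file store might behave: every write produces a new version and
old versions stay readable. The class and method names (VersionedFile etc.) are
invented for illustration, not taken from any real system.

import time

class VersionedFile:
    """Toy immutable-file model: each write creates a new version and
    every old version is retained unchanged (illustrative sketch)."""

    def __init__(self, name, data=b""):
        self.name = name
        self.versions = [(time.time(), data)]   # (timestamp, contents)

    def write(self, data):
        # Instead of overwriting (mutable model), append a new version.
        self.versions.append((time.time(), data))
        return len(self.versions) - 1           # new version number

    def read(self, version=-1):
        # Default: latest version; old versions remain available.
        return self.versions[version][1]

f = VersionedFile("report.txt", b"draft 1")
f.write(b"draft 2")
assert f.read() == b"draft 2"    # latest contents
assert f.read(0) == b"draft 1"   # old version retained unchanged
# Note the cost mentioned above: disk space grows with every update.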
File-Accessing Models
a. Remote service model
Every access request is forwarded to the server node that holds the file; the
processing is done at the server and only the result is returned, so every request
generates network traffic.
b. Data-caching model
This model attempts to reduce the network traffic of the previous model by caching
the data obtained from the server node. This takes advantage of the locality of
reference found in file accesses. A replacement policy such as LRU is used to keep
the cache size bounded.
While this model reduces network traffic, it has to deal with the cache coherency
problem during writes: the local cached copy of the data needs to be updated, the
original file at the server node needs to be updated, and copies in any other caches
need to be updated.
The data-caching model offers the possibility of increased performance and greater
system scalability because it reduces network traffic, contention for the network, and
contention for the file servers. Hence almost all distributed file systems implement
some form of caching.
For example, NFS uses the remote service model but adds caching for better
performance.
Disadvantage: caching requires sufficient storage space on the client machine. When
entire files are cached, the approach fails for very large files, especially when the
client runs on a diskless workstation; and if only a small fraction of a file is
needed, moving the entire file is wasteful. (A minimal sketch of a client-side cache
follows.)
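The sketch below shows a client-side data cache with LRU replacement, as described
above. The server is a plain dict standing in for a file server, and the names
(ClientFileCache, fetch_from_server, CAPACITY) are illustrative only.

from collections import OrderedDict

class ClientFileCache:
    """Sketch of a client-side data cache with LRU replacement."""

    CAPACITY = 4   # keep the cache size bounded

    def __init__(self, fetch_from_server):
        self.fetch = fetch_from_server     # function: name -> bytes
        self.cache = OrderedDict()         # name -> data, in LRU order

    def read(self, name):
        if name in self.cache:             # cache hit: no network traffic
            self.cache.move_to_end(name)   # mark as most recently used
            return self.cache[name]
        data = self.fetch(name)            # cache miss: contact the server
        self.cache[name] = data
        if len(self.cache) > self.CAPACITY:
            self.cache.popitem(last=False) # evict the least recently used
        return data

# Usage with a stand-in server:
server = {"a.txt": b"alpha", "b.txt": b"beta"}
cache = ClientFileCache(lambda n: server[n])
cache.read("a.txt")   # miss: fetched from the server
cache.read("a.txt")   # hit: serviced locally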
File-Sharing Semantics
Multiple users may access a shared file simultaneously. An important design issue
for any file system is to define when modifications of file data made by a user are
observable by other users.
UNIX semantics:
This enforces an absolute time ordering on all operations and ensures that every read
operation on a file sees the effects of all previous write operations performed on that
file.
UNIX semantics is implemented in file systems for single-CPU systems because it is the
most desirable semantics and because it is easy there to serialize all read/write requests.
Implementing UNIX semantics in a distributed file system is not easy. One may think
that this can be achieved in a distributed system by disallowing files to be cached at client
nodes and allowing a shared file to be managed by only one file server that processes all
read and write requests for the file strictly in the order in which it receives them. However,
even with this approach, there is a possibility that, due to network delays, client requests
from different nodes may arrive and get processed at the server node in an order different
from the actual order in which the requests were made.
Also, having all file access requests processed by a single server and disallowing caching on
client nodes is not desirable in practice due to poor performance, poor scalability, and poor
reliability of the distributed file system.
Hence distributed file systems implement a more relaxed semantics of file sharing.
Applications that need to guarantee UNIX semantics should provide synchronization
mechanisms themselves (e.g. a mutex lock, as sketched below) and not rely on the
underlying sharing semantics provided by the file system.
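The following minimal sketch illustrates the idea of application-level locking: a
single mutex serializes all reads and writes, so every read sees the effects of all
prior writes. It runs among threads on one node; a real distributed setting would
need a distributed lock service, which is only assumed here.

import threading

file_lock = threading.Lock()
shared = {"data": 0}   # stand-in for a shared file's contents

def update(delta):
    with file_lock:                      # acquire before any access
        value = shared["data"]           # read sees all prior writes
        shared["data"] = value + delta   # write is atomic w.r.t. readers

threads = [threading.Thread(target=update, args=(1,)) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
assert shared["data"] == 10              # every write was observed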
File Caching Schemes
File caching improves performance by servicing repeated accesses locally, and it also
contributes to the scalability and reliability of the distributed file system since data
can be cached on the client node. The key decisions in a file-caching scheme are:
1. Cache location
2. Modification propagation
3. Cache validation
Cache Location
This refers to the place where the cached data is stored. Assuming that the original location
of a file is on its server's disk, there are three possible cache locations in a distributed file
system:
1. Server's main memory
Advantages:
a. Easy to implement.
b. Totally transparent to clients.
c. Easy to keep the original file and the cached data consistent.
2. Client's disk
In this case a cache hit costs one disk access. This is somewhat slower than having
the cache in the server's main memory, which is also simpler.
Advantages:
a. Provides reliability against crashes, since modifications to cached data are not
lost in a crash, as they would be if the cache were kept in main memory.
b. Large storage capacity.
c. Contributes to scalability and reliability because on a cache hit the access
request can be serviced locally without the need to contact the server.
3. Client's main memory
Advantages:
a. Maximum performance gain.
b. Permits workstations to be diskless.
c. Contributes to reliability and scalability.
Modification Propagation
When the cache is located on client nodes, a file's data may simultaneously be cached on
multiple nodes. The caches can become inconsistent when the file data is changed by one
of the clients and the corresponding data cached at other nodes is not changed or
discarded.
The modification propagation scheme used has a critical effect on the system's performance
and reliability. Techniques used include the following (both are sketched after this list):
a. Write-through scheme.
When a cache entry is modified, the new value is immediately sent to the server for
updating the master copy of the file.
Advantage:
High degree of reliability and suitability for UNIX-like semantics.
This is because the risk of updated data being lost in the event of a client crash is
very low, since every modification is immediately propagated to the server holding
the master copy.
Disadvantage:
This scheme is suitable only where the ratio of read-to-write accesses is fairly large.
It does not reduce network traffic for writes.
This is because every write access has to wait until the data is written to the master
copy on the server. Hence the advantage of data caching is realized only for read
accesses, since the server is involved in all write accesses.
b. Delayed-write scheme.
To reduce network traffic for writes the delayed-write scheme is used. In this case, the new
data value is only written to the cache and all updated cache entries are sent to the server at
a later time.
Advantages:
1. Write accesses complete much more quickly, because the new value is written only
to the cache of the client performing the write.
2. Modified data may be deleted before it is time to send it to the server
(e.g. temporary data). Since such modifications need not be propagated to the server,
this results in a major performance gain.
3. Gathering all file updates and sending them together to the server is more
efficient than sending each update separately.
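The following minimal sketch contrasts the two propagation policies. The server is a
plain dict standing in for the master copies, and the names (WriteCache, flush) are
illustrative rather than taken from any real system.

class WriteCache:
    """Client cache supporting write-through or delayed-write propagation."""

    def __init__(self, server, policy="write-through"):
        self.server = server       # name -> master copy (stand-in server)
        self.cache = {}            # local cached copies
        self.dirty = set()         # entries not yet propagated
        self.policy = policy

    def write(self, name, data):
        self.cache[name] = data
        if self.policy == "write-through":
            self.server[name] = data   # immediate: reliable, but every
                                       # write crosses the network
        else:
            self.dirty.add(name)       # delayed: fast writes, but updates
                                       # are lost if the client crashes

    def flush(self):
        # For delayed-write: called periodically, on file close, or on
        # cache eviction (variants of the scheme differ only in when).
        for name in self.dirty:        # batched: more efficient than one
            self.server[name] = self.cache[name]   # message per update
        self.dirty.clear()

server = {}
c = WriteCache(server, policy="delayed-write")
c.write("a.txt", b"v1")
assert "a.txt" not in server   # not yet propagated to the master copy
c.flush()
assert server["a.txt"] == b"v1"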
Cache Validation Schemes
The modification propagation policy only specifies when the master copy of a file on the
server node is updated upon modification of a cache entry. It says nothing about when the
file data residing in the caches of other nodes is updated.
A file's data may simultaneously reside in the caches of multiple nodes. A client's cache
entry becomes stale as soon as some other client modifies the data corresponding to that
cache entry in the master copy of the file on the server.
It becomes necessary to verify if the data cached at a client node is consistent with the
master copy. If not, the cached data must be invalidated and the updated version of the data
must be fetched again from the server.
There are two approaches to verify the validity of cached data: the client-initiated approach
and the server-initiated approach.
Client-initiated approach
The client contacts the server and checks whether its locally cached data is consistent with
the master copy. Two approaches may be used (a periodic check is sketched below):
1. Checking before every access.
This defeats the main purpose of caching, because the server has to be contacted on
every access.
2. Periodic checking.
A check is initiated at every fixed interval of time.
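A minimal sketch of periodic client-initiated checking, assuming the server exposes a
way to query a file's current version number; get_version and fetch are invented names
for those server calls, not a real API.

class ValidatingCache:
    """Client-initiated validation using per-file version numbers."""

    def __init__(self, get_version, fetch):
        self.get_version = get_version   # name -> int (server's version)
        self.fetch = fetch               # name -> (version, data)
        self.cache = {}                  # name -> (version, data)

    def validate_all(self):
        # Run at every fixed interval of time (e.g. from a timer thread).
        for name, (version, _) in list(self.cache.items()):
            if self.get_version(name) != version:
                del self.cache[name]     # stale: invalidate, refetch on demand

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.fetch(name)   # fetch updated version
        return self.cache[name][1]

server_data = {"a": (1, b"one")}
vc = ValidatingCache(lambda n: server_data[n][0], lambda n: server_data[n])
vc.read("a")
server_data["a"] = (2, b"two")   # another client updates the master copy
vc.validate_all()                # stale entry is invalidated
assert vc.read("a") == b"two"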
Server-initiated approach
The server keeps track of which clients have which files open and in which mode. When a
client closes a file, it sends an intimation to the server along with any modifications made
to the file, and the server updates its records accordingly.
When a new client makes a request to open an already open file and the server finds that
the new open mode conflicts with the already open mode, the server can deny the request,
queue the request, or disable caching by asking all clients having the file open to remove
that file from their caches.
Note: On the web, the cache is used in read-only mode so cache validation is not an issue.
Disadvantage: It requires that file servers be stateful. Stateful file servers are at a
distinct disadvantage compared with stateless file servers in the event of a failure,
because the state (which client has which file open in which mode) is lost in a crash
and must be recovered. (A sketch of the open-mode bookkeeping follows.)
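A minimal sketch of the server-side bookkeeping, assuming just two open modes and an
invented notify_invalidate callback for the cache-disabling case:

class StatefulServer:
    """Server-initiated validation: track open files and modes, and react
    when a new open conflicts with an existing one."""

    def __init__(self, notify_invalidate):
        self.open_files = {}                 # name -> list of (client, mode)
        self.notify_invalidate = notify_invalidate

    def open(self, client, name, mode):      # mode: "read" or "write"
        holders = self.open_files.setdefault(name, [])
        conflict = any(m == "write" or mode == "write" for _, m in holders)
        if conflict:
            # Policy choice: deny, queue, or disable caching. Here we
            # disable caching by telling every holder to drop its copy.
            for c, _ in holders:
                self.notify_invalidate(c, name)
        holders.append((client, mode))

    def close(self, client, name):
        # The client sends intimation on close; the server updates records.
        self.open_files[name] = [(c, m) for c, m in self.open_files[name]
                                 if c != client]

invalidated = []
srv = StatefulServer(lambda c, n: invalidated.append((c, n)))
srv.open("client1", "f.txt", "read")
srv.open("client2", "f.txt", "write")   # conflict: client1 drops f.txt
assert invalidated == [("client1", "f.txt")]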
File Replication
High availability is a desirable feature of a good distributed file system and file replication is
the primary mechanism for improving file availability.
A replicated file is a file that has multiple copies, with each copy stored on a separate file server.
Advantages of Replication
1. Increased Availability.
Alternate copies of the replicated data can be used when the primary copy is
unavailable.
2. Increased Reliability.
Due to the presence of redundant data files in the system, recovery from catastrophic
failures (e.g. hard drive crash) becomes possible.
3. Better scalability.
Due to file replication, multiple file servers are available to service client requests.
This improves scalability.
Replication Transparency
Replication of files should be transparent to the users so that multiple copies of a replicated
file appear as a single logical file to its users. This calls for the assignment of a single
identifier/name to all replicas of a file.
In addition, replication control should be transparent, i.e., the number and locations of
replicas of a replicated file should be hidden from the user. Thus replication control must be
handled automatically in a user-transparent manner.
Multicopy Update Problem
Maintaining consistency among copies when a replicated file is updated is a major design
issue of a distributed file system that supports file replication. Approaches for handling
the multicopy update problem include the following (protocols 2 and 4 are sketched after
this list):
1. Read-only replication
In this case the update problem does not arise. However, this method is too
restrictive, since it applies only to files that are never modified.
2. Read-Any-Write-All Protocol
A read operation on a replicated file is performed by reading any copy of the file and
a write operation by writing to all copies of the file. Before updating any copy, all
copies need to be locked, then they are updated, and finally the locks are released to
complete the write.
3. Available-Copies Protocol
A read operation on a replicated file is performed by reading any copy of the file and
a write operation by writing to all available copies of the file. Thus if a file server
with a replica is down, its copy is not updated. When the server recovers after a
failure, it brings itself up to date by copying from other servers before accepting any
user request.
4. Primary-Copy Protocol
For each replicated file, one copy is designated as the primary copy and all the others
are secondary copies. Read operations can be performed using any copy, primary or
secondary. But write operations are performed only on the primary copy. Each
server having a secondary copy updates its copy either by receiving notification of
changes from the server having the primary copy or by requesting the updated copy
from it.
E.g. for UNIX-like semantics, when the primary-copy server receives an update
request, it immediately orders all the secondary-copy servers to update their copies.
Some form of locking is used and the write operation completes only when all the
copies have been updated. In this case, the primary-copy protocol is simply another
method of implementing the read-any-write-all protocol.
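To make protocols 2 and 4 concrete, here is a minimal single-process sketch. Each
replica store is a plain dict; in a real system each would live on a different server
and the lock would be a distributed lock, so this is only an illustration.

import threading

class ReadAnyWriteAll:
    """Protocol 2: read any copy; write all copies under a lock."""

    def __init__(self, replicas):
        self.replicas = replicas        # list of dicts (stand-in servers)
        self.lock = threading.Lock()    # stand-in for locking all copies

    def read(self, name):
        return self.replicas[0][name]   # any copy will do

    def write(self, name, data):
        with self.lock:                 # lock all copies, update, release
            for replica in self.replicas:
                replica[name] = data

class PrimaryCopy:
    """Protocol 4: writes go only to the primary, which propagates the
    change to the secondaries (here immediately, giving UNIX-like
    semantics as described above)."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = secondaries

    def read(self, name):
        return self.primary[name]       # any copy may be read

    def write(self, name, data):
        self.primary[name] = data       # update the primary first
        for s in self.secondaries:      # then order secondaries to update
            s[name] = data

r = ReadAnyWriteAll([{}, {}, {}])
r.write("f", b"v1")
assert r.read("f") == b"v1"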
Exercise
Discuss the following types of file systems: