ICS 431-Ch14-File System Implementation
Weeks 13-14
Dr. Tarek Helmy, KFUPM-ICS
Ch14: File System Implementation
• What kinds of on-disk and in-memory data structures are used to implement a file
system?
• How are disk blocks allocated to files so that disk space is used effectively and
files can be accessed quickly?
• Contiguous Allocation
• Linked Allocation
• Indexed Allocation
• How can the efficiency, performance, and recovery of the file system be improved?
File System Modules Organization: Layered Approach
• File-Organization Module:
• Knows about files and their logical blocks,
and how they map to physical blocks on the
disk.
• Translates logical block addresses to physical
block addresses.
• Maintains the list of free blocks and
allocates free blocks to files as needed.
File System Modules Organization: Layered Approach
• Basic file system module:
• Works directly with the device drivers for
retrieving and storing raw blocks of data.
• Issues high-level commands to a specific device
driver to read and write physical blocks, e.g. (drive
1, cylinder 60, track 4, sector 10).
• I/O control:
• Consists of device drivers, which communicate
with the devices by reading and writing special
codes directly to and from memory addresses
corresponding to the controller card's registers.
• Physical devices:
• Consist of the magnetic media, motors,
controls, and the electronics that drive
them. Modern disks put more and more of
the electronic controls directly on the disk drive
itself, leaving relatively little work for the disk
controller card to perform.
File System Implementation: On-Disk Structure
• Several on-disk and in-memory data structures are used to implement a file
system. These structures vary depending on the OS and the file system.
• The on-disk data structures include:
• Boot Control Block: It contains:
• Information needed by the OS to boot. If the disk does not contain an
OS, this block can be empty. It is typically block zero of the partition. It is
called the boot block in the Unix File System (UFS) and the partition boot sector in
NTFS.
• NTFS stands for New Technology File System. It improves on
FAT/FAT32 by supporting Unicode filenames, file-level security
(access control lists), compression and encryption.
• Volume Control Block: it contains:
• Detailed information about the partition, such as the number of blocks in the
partition, the block size, the free-block count and the free-block pointers.
• e.g., super block in UFS and Master File Table in NTFS.
• Directory Structure Table: Organizes the files within a directory by
mapping file names to pointers to the corresponding FCBs.
• File Control Block: It contains:
• File attributes such as the file’s owner, size and the location of its data blocks.
• It is called an inode in UFS.
• In NTFS this information is stored within the Master File Table, which uses a
relational-database-like structure with one row per file.
A Typical File Control Block
File Control Block (FCB): a storage structure that contains information about a file.
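As a rough illustration, an FCB can be pictured as a record of per-file metadata. The field names below are assumptions modelled on the attributes listed earlier (owner, size, location of data blocks); real file systems such as UFS lay out their inodes differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileControlBlock:
    """Illustrative FCB; field names are assumptions, not a real on-disk layout."""
    owner: str                      # user who owns the file
    permissions: int                # access bits, e.g. 0o644
    size_bytes: int = 0             # current file size
    created_at: float = 0.0         # creation timestamp
    modified_at: float = 0.0        # last-write timestamp
    data_blocks: List[int] = field(default_factory=list)  # blocks holding the data


fcb = FileControlBlock(owner="alice", permissions=0o644)
fcb.data_blocks.extend([17, 18, 25])   # the file's data lives in these blocks
fcb.size_bytes = 3 * 4096              # assuming 4 KB blocks
```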
File System Implementation: In-Memory Structure
• There are also several data structures stored in memory:
• In-memory Partition Table: Contains information about each mounted
partition. Mounting a file system associates it with a directory in the existing
file system tree. Once mounted, the file system becomes accessible.
• In-memory Directory Structure: Holds information about recently accessed
directories.
• In-memory System-wide Open-file Table: Contains a copy of the FCB for
every currently open file in the system, as well as some other related
information.
• In-memory Per-process Open-file Table: Contains, for each file the process has
opened, a pointer to the corresponding entry in the system-wide open-file table.
• Buffers: Hold file blocks in memory while they are being read or written.
• The in-memory information is used for both file-system management and
performance improvement.
• Caching on-disk information in memory speeds up searches of the data
structures used to implement the file system.
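The relationship between the two open-file tables can be sketched as follows. This is a simplified model, not how any particular OS implements it: the class names, the dictionary-based FCB, and the open count (one example of the "other related information") are all assumptions made for illustration.

```python
class SystemWideOpenFileTable:
    """One entry per open file: a copy of the FCB plus an open count (illustrative)."""

    def __init__(self):
        self.entries = {}  # file name -> {"fcb": ..., "open_count": int}

    def open(self, name, fcb_on_disk):
        entry = self.entries.get(name)
        if entry is None:
            # First open anywhere in the system: cache a copy of the FCB.
            entry = {"fcb": dict(fcb_on_disk), "open_count": 0}
            self.entries[name] = entry
        entry["open_count"] += 1
        return entry

    def close(self, name):
        entry = self.entries[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            # Last close: the cached FCB can be dropped.
            del self.entries[name]


class Process:
    """Per-process table: file descriptor -> entry in the system-wide table."""

    def __init__(self, system_table):
        self.system_table = system_table
        self.fd_table = {}   # fd -> (name, system-wide entry)
        self.next_fd = 0

    def open(self, name, fcb_on_disk):
        entry = self.system_table.open(name, fcb_on_disk)
        fd = self.next_fd
        self.next_fd += 1
        self.fd_table[fd] = (name, entry)
        return fd

    def close(self, fd):
        name, _ = self.fd_table.pop(fd)
        self.system_table.close(name)


table = SystemWideOpenFileTable()
p1, p2 = Process(table), Process(table)
fd1 = p1.open("notes.txt", {"owner": "alice", "size": 100})
fd2 = p2.open("notes.txt", {"owner": "alice", "size": 100})
shared = table.entries["notes.txt"]["open_count"]   # both processes share one entry
p1.close(fd1)
p2.close(fd2)
```

Keeping a single system-wide entry per file means the FCB is read from disk once, however many processes open the file.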
Directory Structure Implementation
• Contiguous allocation
• Linked allocation
• Indexed allocation
Contiguous Allocation of Disk Space
• A file occupies contiguous blocks on
disk.
– This is similar to the contiguous
allocation of main memory to
a process.
• Efficient because it offers random
access to any location in a file.
– Block i of a file is located at b+i
where b is the starting location of
the file.
• Access is fast because consecutive blocks
are read one after the other, and
converting a logical address to a
physical one is a simple addition.
• When a new file is to be written, the
file system determines where to put it.
– Algorithms include best-fit and
first-fit (which is most common).
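The address calculation and the two placement policies can be sketched as below. The list of free holes is hypothetical, and the function names are chosen here for illustration only.

```python
def first_fit(holes, size):
    """Return the start of the first hole with at least `size` free blocks, or None."""
    for start, length in holes:
        if length >= size:
            return start
    return None


def best_fit(holes, size):
    """Return the start of the smallest hole that still fits `size` blocks, or None."""
    fitting = [(length, start) for start, length in holes if length >= size]
    return min(fitting)[1] if fitting else None


def physical_block(b, i):
    """Logical block i of a file that starts at block b lives at physical block b + i."""
    return b + i


holes = [(0, 10), (14, 3), (30, 5)]   # hypothetical free runs: (start block, length)
start_ff = first_fit(holes, 3)        # first hole large enough
start_bf = best_fit(holes, 3)         # tightest hole that fits
third_block = physical_block(start_bf, 2)
```

First-fit stops at the first adequate hole, so it is cheaper to run; best-fit scans every hole but leaves the smallest leftover fragment.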
Drawbacks of Contiguous Allocation
• Fragmentation
– The contiguous region chosen for a file may be larger than the file
needs, so a small fragment is left over.
• When the file is first created, its size must be provided or estimated.
– Program sizes can be determined in advance, but data file sizes cannot.
– If the estimate is too low, the file cannot be extended in place
later (especially if best fit was used); if it is too high,
internal fragmentation occurs.
Linked Allocation
• File blocks can be scattered anywhere
on the disk (non-contiguously); each
block points to the next block in the
file.
• Each block contains a pointer to the next
block, and the last block contains a NIL
(-1) pointer.
– Files can grow or shrink without
fragmentation and without the need
to know the file size in advance.
– No waste of space except for
pointers.
• Pointers take up space in every block.
For example, a 4-byte pointer in a
512-byte block uses about 0.78%
(roughly 1%) of the storage.
• This method does not support efficient
random access to a file's blocks.
– Instead, sequential access must be
performed from the first block,
following pointers.
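Why random access is slow under linked allocation can be seen in a small sketch. The disk contents and block numbers below are made up for illustration; NIL is -1 as on the slide.

```python
NIL = -1  # end-of-file marker

# Hypothetical disk: block number -> (data, pointer to next block).
disk = {
    9:  ("A", 16),
    16: ("B", 1),
    1:  ("C", 10),
    10: ("D", 25),
    25: ("E", NIL),
}


def read_logical_block(disk, first_block, i):
    """Return the data of logical block i by following i pointers from the start."""
    block = first_block
    for _ in range(i):
        block = disk[block][1]
        if block == NIL:
            raise IndexError("logical block beyond end of file")
    return disk[block][0]


def file_blocks(disk, first_block):
    """List the physical blocks of the file in logical order."""
    chain, block = [], first_block
    while block != NIL:
        chain.append(block)
        block = disk[block][1]
    return chain
```

Reading logical block i costs i pointer dereferences, each of which may be a separate disk read, whereas contiguous allocation needs only the addition b + i.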
File-Allocation Table (FAT)
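A FAT is a variation of linked allocation in which the next-block pointers are kept in one table indexed by block number instead of inside the data blocks. The sketch below uses a hypothetical 12-entry table and a made-up file name; the end-of-chain value -1 is an assumption (real FAT variants use reserved values).

```python
EOF = -1   # end-of-chain marker (assumed; real FAT variants use reserved values)

# Hypothetical 12-entry FAT: fat[b] is the block that follows block b in its file.
fat = [None] * 12
fat[4], fat[7], fat[2] = 7, 2, EOF     # file starting at block 4: 4 -> 7 -> 2

directory = {"report.txt": 4}          # the directory entry stores the first block


def chain(fat, first_block):
    """Follow the table entries from the first block to the end-of-chain marker."""
    blocks, b = [], first_block
    while b != EOF:
        blocks.append(b)
        b = fat[b]
    return blocks


report_blocks = chain(fat, directory["report.txt"])
```

Because the whole table can be cached in memory, the chain can be followed without reading any data block from disk.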
Linked Allocation Advantages and Disadvantages
• Advantages
• This method does not suffer from external fragmentation: every free block,
wherever it is, can be linked and used.
• This makes it relatively better in terms of disk space utilization.
• Any free block can be used to satisfy a request.
• There is no need to declare the size of a file when that file is created.
• A file can continue to grow as long as there are free blocks
• Disadvantages
– Long seek time is needed to access every block individually, because the
file blocks are distributed randomly on the disk.
– This makes linked allocation slower (unless FAT is used and cached).
– Space is used for the pointers stored in the blocks.
• One way of reducing this overhead is to group blocks into clusters and to use
pointers to clusters rather than to blocks.
– Does not support direct access.
– It is not reliable, since pointers (links) may be lost or damaged, which will
cause a trap or an error.
Indexed Allocation
• Can we get the benefits of both
contiguous and linked allocation?
• Allocation of blocks is still scattered
across the disk like linked, but access
to each block is provided by an index
where we can support random access.
• Each file has its own index of pointers.
– This allows random access to a
given block without external
fragmentation.
• Each file’s index is stored in one block
on disk and pointed to by the directory.
– If a file's data fits in n blocks,
the file consumes n+1 blocks in total:
n data blocks plus 1 block for the index.
• For very small files, say files that span only 2-3 blocks,
indexed allocation still dedicates one entire block (the index block) to
the pointers, which is inefficient in terms of disk-space utilization.
• For files that are very large, single index block may not be able
to hold all the pointers.
– One proposed solution is to use two or more index blocks
together for holding the pointers.
– Every index block would then contain a pointer or the
address to the next index block.
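The linked-index-block scheme just described can be sketched as follows. The capacity of four pointers per index block is deliberately tiny and is an assumption for illustration, as are the class and function names.

```python
POINTERS_PER_INDEX_BLOCK = 4   # deliberately tiny; an assumption for illustration


class IndexBlock:
    def __init__(self):
        self.pointers = []      # data-block numbers
        self.next_index = None  # link to the next index block, if any


def build_index(data_blocks):
    """Spread data-block pointers over as many linked index blocks as needed."""
    head = current = IndexBlock()
    for block in data_blocks:
        if len(current.pointers) == POINTERS_PER_INDEX_BLOCK:
            current.next_index = IndexBlock()
            current = current.next_index
        current.pointers.append(block)
    return head


def lookup(head, i):
    """Physical block of logical block i: skip whole index blocks, then index."""
    index_block = head
    for _ in range(i // POINTERS_PER_INDEX_BLOCK):
        index_block = index_block.next_index
    return index_block.pointers[i % POINTERS_PER_INDEX_BLOCK]


head = build_index([9, 16, 1, 10, 25, 3])
```

A lookup follows one link per index block skipped rather than one per data block, so random access stays much cheaper than in plain linked allocation.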
Summary of Allocation Methods
• Contiguous Allocation:
– Efficient because it offers random access to any location in a file
– This method suffers from both internal and external fragmentation. The
space reserved for a file may be larger than needed, leaving a small internal
fragment, and free runs too small for any file are left as external fragments.
– When the file is first created, its size must be estimated. Growing a file
is difficult because it depends on contiguous free disk space being available
at that moment.
• Linked Allocation:
– Files can grow or shrink without external fragmentation and without the
need to know the file size in advance.
– It does not support random access.
– Pointers take up part of the space in every block.
– Not reliable, as pointers may be lost or damaged, which causes a trap or an error.
• Indexed Allocation:
– Allocation of blocks is still scattered across the disk like linked, but access
to each block is provided by an index rather than linked pointers.
– Each file has its own index of pointers; this allows random access to a
given block without external fragmentation.
– If a file's data fits in n blocks, the file consumes n+1 blocks in total:
1 extra block for the index, regardless of the size of the file.
Selection of Allocation and Access Methods
• The file system must keep track of the free disk space.
• The operating system maintains a free-space list.
• The free-space list records all free disk blocks, i.e. those not allocated to
files or directories.
• To create a file, the OS searches the free-space list for the required
amount of space and allocates that space to the new file.
• The allocated space is then removed from the free-space list.
• When the file is deleted, its disk space is added back to the free-space list.
• How does the file system know where a free block of disk space is
located? Two common implementations are the bit vector and the linked list.
Bit Vector
• A bit vector (bit map) uses one bit to represent every block in the file system.
• 1 indicates a free block and 0 indicates a used block.
• Example, a disk of 32 blocks where blocks 2, 3, 4, 5, 8, 9, 10, 11, 12, 13,
17, 18, 25, 26, 27 are free and the rest are allocated. The bit vector
map is:
00111100111111000110000001110000
• To allocate a new block, the OS looks for the first 1 and changes it to a 0.
• This can be done by scanning for the first word that is not 0 and then
finding the first 1 bit in that word.
• Block number = (number of bits per word) * (number of 0-value words)
+ offset of first 1 bit.
• The Macintosh OS uses this technique for managing the free blocks.
• Unfortunately, bit vectors are inefficient unless the entire vector is kept in
memory, which consumes considerable memory, especially for large disks.
• A bit map requires extra space. Example:
– Disk size = 2^40 bytes (1 terabyte)
– Block size = 2^15 bytes = 32 KB
– Bit vector size = 2^40 / 2^15 = 2^25 bits (or 2^2 * 2^3 * 2^10 * 2^10 bits = 4 MB)
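The word-scanning search and the size calculation above can be checked with a short sketch. It uses the 32-block example bit vector; the 8-bit word size is an assumption chosen for readability, and offsets are counted from the most significant bit so that they match the left-to-right order of the string.

```python
BITS_PER_WORD = 8   # small word size chosen for readability (an assumption)

bitmap = "00111100111111000110000001110000"   # 1 = free, 0 = allocated
words = [int(bitmap[i:i + BITS_PER_WORD], 2)
         for i in range(0, len(bitmap), BITS_PER_WORD)]


def first_free_block(words):
    """Skip all-zero words, then locate the first 1 bit in the first non-zero word."""
    for zero_words, word in enumerate(words):
        if word != 0:
            # Offset counted from the most significant bit, matching the string order.
            offset = BITS_PER_WORD - word.bit_length()
            return BITS_PER_WORD * zero_words + offset
    return None   # no free block


def allocate(words):
    """Allocate the first free block by flipping its bit from 1 to 0."""
    block = first_free_block(words)
    if block is not None:
        word, offset = divmod(block, BITS_PER_WORD)
        words[word] &= ~(1 << (BITS_PER_WORD - 1 - offset))
    return block


# Size check for the 1 TB example: one bit per 32 KB block, 8 bits per byte.
bitmap_bytes = (2**40 // 2**15) // 8
```

Comparing a whole word against zero lets the search skip many allocated blocks per step, which is why the method is only fast when the vector is in memory.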
Bit Vector Variations
Linked List
• Another way to manage the free blocks is to link all of them together,
keeping a pointer to the first free block in a special location
on the disk and caching it in memory.
– The first free block contains a pointer to the next free block, and
so on.
– For the example disk with bit vector 00111100111111000110000001110000, the
head pointer refers to block 2, block 2 points to block 3, and so on.
• Allocating one block is simple: take the block at the head of the list.
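A linked free-space list can be modelled with a dictionary standing in for the pointers stored inside the free blocks. The free blocks are those of the 32-block bit-vector example; the helper names and the NIL value of -1 are assumptions for illustration.

```python
NIL = -1

free_blocks = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]

# next_free[b] plays the role of the pointer stored inside free block b.
next_free = {b: nxt for b, nxt in zip(free_blocks, free_blocks[1:] + [NIL])}
head = free_blocks[0]   # cached pointer to the first free block


def allocate(head, next_free):
    """Take the first free block; the new head is whatever it pointed to."""
    if head == NIL:
        raise RuntimeError("disk full")
    block = head
    return block, next_free.pop(block)


def release(block, head, next_free):
    """Return a block to the free list by pushing it on the front."""
    next_free[block] = head
    return block


b1, head = allocate(head, next_free)
b2, head = allocate(head, next_free)
head = release(b1, head, next_free)
```

Allocation and release touch only the head of the list, but finding several contiguous free blocks would require walking the chain.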
Linked Free Space List on Disk
Efficiency and Performance
Recovery of Lost Data
• In order to recover lost data in the event of a disk crash, it is important to
conduct backups regularly.
• Files should be copied to some removable medium, such as CDs, DVDs, or
external removable hard drives.
• A full backup copies every file on a file system.
• Incremental backups copy only files which have changed since some
previous time.
• For example, one strategy might be:
– At the beginning of the month do a full backup.
– At the end of the first and again at the end of the second week, backup all
files which have changed since the beginning of the month.
– At the end of the third week, backup all files that have changed since the
end of the second week.
– Every day of the month not listed above, do an incremental backup of all
files that have changed since the most recent of the weekly backups
described above
• Recover a lost file or disk by restoring the data from backup.
• A useful backup strategy is required!
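The difference between a full and an incremental backup comes down to which files are selected. The sketch below is a simplified model: the file names are made up and modification times are expressed as a day of the month rather than real timestamps.

```python
def files_to_back_up(mtimes, since=None):
    """Full backup when `since` is None; otherwise only files modified after `since`."""
    if since is None:
        return sorted(mtimes)
    return sorted(path for path, mtime in mtimes.items() if mtime > since)


# Hypothetical modification times, expressed as day-of-month for simplicity.
mtimes = {"a.txt": 0, "b.txt": 5, "c.txt": 12}
full = files_to_back_up(mtimes)               # start of month: everything
week2 = files_to_back_up(mtimes, since=1)     # changed since the start of the month
week3 = files_to_back_up(mtimes, since=14)    # changed since the end of week 2
```

Restoring then means applying the full backup first and the relevant incremental backups after it, in order.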
Recovery
The End!!
Thank you
Any Questions?