4.4.persistence-fs-impl

The document discusses the implementation of file systems, focusing on data structures and access methods used to organize data and metadata. It explains the organization of on-disk structures, including inodes, allocation structures, superblocks, and multi-level indexes to manage file sizes. Additionally, it covers directory organization, free space management, and the process of reading files from disk through system calls.


Operating Systems

Topic: Persistence – File System Implementation

Endadul Hoque
Acknowledgement

• Youjip Won (Hanyang University)

• OSTEP book – by Remzi and Andrea Arpaci-Dusseau

(University of Wisconsin)
The Way To Think
• There are two aspects to implementing a file system:
– Data structures
• What types of on-disk structures does the file system use to organize its data and metadata?

– Access methods
• How does it map the calls made by a process, such as open(), read(), and write(), onto its structures?
• Which structures are read during the execution of a particular system call?
Overall Organization
• Let’s develop the overall on-disk organization of the file
system data structure.
• vsfs (very simple file system)
• Divide the disk into blocks.
– Block size is 4 KB.
– The blocks are addressed from 0 to N -1.

[Figure: a disk of 64 blocks, numbered 0 to 63 and drawn in groups of eight.]
Data region in file system
• Reserve a data region to store user data (i.e., file content).

[Figure: the data region (D) occupies blocks 8 to 63 of the 64-block disk.]

– The file system has to track which data blocks comprise a file, along with the file’s metadata (its size, its owner, etc.).
• It uses an inode (index node) structure for this.

How can we store the inodes in the file system?
Inode table in file system
• Reserve some space for the inode table.
– This holds an array of on-disk inodes.
– Ex) inode table: 5 blocks (blocks 3 to 7), inode size: 256 bytes
• Each 4-KB block can hold 16 inodes (4096 / 256).
• The file system supports 80 inodes in total (5 × 16), which is the maximum number of files.

[Figure: the inode table (I) occupies blocks 3 to 7; the data region (D) occupies blocks 8 to 63.]

How does the file system keep track of whether inodes or data blocks are free or allocated?
Allocation structures
• These structures track whether each inode or data block is free or allocated.
• Use a bitmap: each bit indicates free (0) or in use (1).
– data bitmap: for the data region
– inode bitmap: for the inode table
[Figure: block 1 holds the inode bitmap (i), block 2 the data bitmap (d), blocks 3 to 7 the inode table (I), and blocks 8 to 63 the data region (D).]
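The bitmap idea can be sketched in a few lines. This is only an illustration: the bitmap is modeled as a Python list of 0/1 values rather than packed bits, and the helper names `allocate` and `release` are made up for this example.

```python
def allocate(bitmap):
    """Return the index of the first free (0) entry and mark it in use (1)."""
    for index, bit in enumerate(bitmap):
        if bit == 0:
            bitmap[index] = 1
            return index
    return None  # no free entry left


def release(bitmap, index):
    """Mark entry `index` as free again."""
    bitmap[index] = 0


inode_bitmap = [0] * 80           # one entry per inode in an 80-inode table
first = allocate(inode_bitmap)    # takes entry 0
second = allocate(inode_bitmap)   # takes entry 1
release(inode_bitmap, first)      # entry 0 becomes free again
third = allocate(inode_bitmap)    # entry 0 is reused
```

A real file system would pack eight entries per byte and write the modified bitmap block back to disk after each change.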
Superblock
• The superblock (S) contains information about this particular file system.
– Ex) the number of inodes, the starting location of the inode table, etc.

[Figure: block 0 holds the superblock (S), followed by the inode bitmap (i), the data bitmap (d), the inode table (I, blocks 3 to 7), and the data region (D, blocks 8 to 63).]

– Thus, when mounting a file system, the OS reads the superblock first to initialize various parameters.
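As a rough sketch, the vsfs layout built up so far can be written down as a lookup from block number to region. The constants come from the slides (64 blocks of 4 KB, 256-byte inodes, inode table in blocks 3 to 7); the function name `region_of` is hypothetical.

```python
BLOCK_SIZE = 4096                  # bytes per block
INODE_SIZE = 256                   # bytes per on-disk inode
TOTAL_BLOCKS = 64                  # blocks 0..63
INODE_TABLE_BLOCKS = range(3, 8)   # blocks 3..7


def region_of(block):
    """Name the on-disk region that a block number belongs to."""
    if not 0 <= block < TOTAL_BLOCKS:
        raise ValueError("block number out of range")
    if block == 0:
        return "superblock"
    if block == 1:
        return "inode bitmap"
    if block == 2:
        return "data bitmap"
    if block in INODE_TABLE_BLOCKS:
        return "inode table"
    return "data region"


inodes_per_block = BLOCK_SIZE // INODE_SIZE                # 16
max_inodes = inodes_per_block * len(INODE_TABLE_BLOCKS)    # 80
```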
File Organization: The inode
• Each inode is referred to by its inode number.
– Given an inode number, the file system can calculate the location of the inode on the disk.
– Ex) inode number: 32
• Offset into the inode region = inode number × sizeof(inode) = 32 × 256 bytes = 8192 bytes = 8 KB
• Actual location = start address of the inode table + the offset = 12 KB + 8 KB = 20 KB
The Inode table

Region   | Start | Inodes held
Super    | 0KB   | -
i-bmap   | 4KB   | -
d-bmap   | 8KB   | -
iblock 0 | 12KB  | 0 to 15
iblock 1 | 16KB  | 16 to 31
iblock 2 | 20KB  | 32 to 47
iblock 3 | 24KB  | 48 to 63
iblock 4 | 28KB  | 64 to 79 (the table ends at 32KB)

File Organization: The inode
• Disks are not byte addressable; they are sector addressable.
• A disk consists of a large number of addressable sectors (e.g., 512 bytes each).
– Ex) Fetch the desired inode block for inode number 32.
The sector of the inode block can be calculated as follows:
sector = ((inumber * sizeof(inode)) + inodeStartAddr) / sectorsize
= ((32 × 256) + 12288) / 512 = 20480 / 512
= 40

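The two calculations above can be checked with a short sketch. The constants are the ones on the slides; the function names are made up for illustration. Note that the slide's formula, evaluated with integer division, gives the sector that holds the inode itself. The last function, an assumption about what "the inode block" means, rounds down to the first sector of the enclosing 4-KB block; for inode 32 the two agree.

```python
BLOCK_SIZE = 4096
SECTOR_SIZE = 512
INODE_SIZE = 256
INODE_START_ADDR = 12 * 1024   # the inode table starts at 12 KB


def inode_byte_address(inumber):
    """Byte address of an inode: table start plus offset into the table."""
    return INODE_START_ADDR + inumber * INODE_SIZE


def inode_sector(inumber):
    """The slide's formula: the sector that contains the inode."""
    return inode_byte_address(inumber) // SECTOR_SIZE


def inode_block_first_sector(inumber):
    """First sector of the 4-KB inode block that contains the inode."""
    block_in_table = (inumber * INODE_SIZE) // BLOCK_SIZE
    return (block_in_table * BLOCK_SIZE + INODE_START_ADDR) // SECTOR_SIZE


address = inode_byte_address(32)   # 20480 bytes = 20 KB
sector = inode_sector(32)          # 40
```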

File Organization: The inode
• An inode holds all of the information (metadata) about a file:
– File type (regular file, directory, etc.)
– Size and the number of blocks allocated to it
– Protection information (who owns the file, who can access it, etc.)
– Time information
– Etc.
• The inode must also record where the file’s data blocks are.
• One simple approach: use direct pointers (say, 12 pointers).
– Each pointer refers to one disk block that belongs to the file.
– Limitation: a file can grow to at most 48 KB (= 12 × 4 KB).
File Organization: The inode
Size | Name        | What is this inode field for?
2    | mode        | can this file be read/written/executed?
2    | uid         | who owns this file?
4    | size        | how many bytes are in this file?
4    | time        | what time was this file last accessed?
4    | ctime       | what time was this file created?
4    | mtime       | what time was this file last modified?
4    | dtime       | what time was this inode deleted?
2    | gid         | which group does this file belong to?
2    | links_count | how many hard links are there to this file?
4    | blocks      | how many blocks have been allocated to this file?
4    | flags       | how should ext2 use this inode?
4    | osd1        | an OS-dependent field
60   | block       | a set of disk pointers (15 total)
4    | generation  | file version (used by NFS)
4    | file_acl    | a new permissions model beyond mode bits
4    | dir_acl     | called access control lists
4    | faddr       | an unsupported field
12   | i_osd2      | another OS-dependent field

The EXT2 Inode
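As a sanity check on the table, the field widths can be fed to Python's `struct` module. This is a sketch, not the kernel's definition: it assumes little-endian packing with no padding, and it uses the ext2 on-disk widths of 2 bytes for gid and 4 bytes for blocks. The field sizes add up to 128 bytes, with the 15 block pointers starting at byte 40.

```python
import struct

# (field name, struct code): H = 2 bytes, I = 4 bytes,
# 15I = fifteen 4-byte pointers, 12s = 12 raw bytes
FIELDS = [
    ("mode", "H"), ("uid", "H"), ("size", "I"), ("time", "I"),
    ("ctime", "I"), ("mtime", "I"), ("dtime", "I"), ("gid", "H"),
    ("links_count", "H"), ("blocks", "I"), ("flags", "I"), ("osd1", "I"),
    ("block", "15I"), ("generation", "I"), ("file_acl", "I"),
    ("dir_acl", "I"), ("faddr", "I"), ("i_osd2", "12s"),
]


def layout_format(fields):
    """Build a struct format string; "<" means little-endian, no padding."""
    return "<" + "".join(code for _, code in fields)


inode_size = struct.calcsize(layout_format(FIELDS))         # whole inode
block_offset = struct.calcsize(layout_format(FIELDS[:12]))  # bytes before `block`
```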

The Multi-Level Index
• To support bigger files, we use multi-level index.
• Indirect pointer points to a disk block that contains
more pointers.
The Multi-Level Index
[Figure: multi-level index diagram. src: “The Design of the UNIX Operating System” by Maurice J. Bach, 1986.]
The Multi-Level Index
• An indirect pointer points to a disk block that contains more pointers.
– Ex) an inode has a fixed number of direct pointers (12) and a single indirect pointer.
– If a file grows large enough, an indirect block is allocated (from the data-block region), and the inode’s slot for the indirect pointer is set to point to it.
• With 4-byte block pointers, a 4-KB indirect block holds 1024 pointers, so a file can grow to (12 + 1024) × 4 KB = 4144 KB.
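A small sketch of how a logical block index within a file could be mapped onto this scheme, assuming 12 direct pointers, one single indirect pointer, and 4-byte pointers in a 4-KB block (1024 per indirect block). The function name `locate` is hypothetical.

```python
NUM_DIRECT = 12
POINTERS_PER_BLOCK = 4096 // 4   # assumes 4-byte block pointers


def locate(logical_block):
    """Say which pointer covers the given block index of a file."""
    if logical_block < 0:
        raise ValueError("negative block index")
    if logical_block < NUM_DIRECT:
        return ("direct", logical_block)       # slot in the inode itself
    index = logical_block - NUM_DIRECT
    if index < POINTERS_PER_BLOCK:
        return ("indirect", index)             # slot inside the indirect block
    raise ValueError("beyond 12 direct + 1 single indirect pointer")


last_direct = locate(11)         # ("direct", 11)
first_indirect = locate(12)      # ("indirect", 0)
last_indirect = locate(1035)     # ("indirect", 1023)
```

Blocks 0 to 11 cost no extra disk read beyond the inode; every later block needs one additional read to fetch the indirect block.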
The Multi-Level Index
• A double indirect pointer points to a block that contains pointers to indirect blocks.
– This allows a file to grow by an additional 1024 × 1024, or about 1 million, 4-KB blocks.
• A triple indirect pointer points to a block that contains pointers to double indirect blocks.
• The multi-level index approach to pointing to file blocks:
– Ex) 12 direct pointers, a single indirect block, and a double indirect block
• The file can be over 4 GB in size, because (12 + 1024 + 1024²) × 4 KB > 4 GB.
• Many file systems use a multi-level index.
– Linux EXT2, EXT3, NetApp’s WAFL, and the Unix file system.
– Linux EXT4 uses extents instead of simple pointers. (Extent = a pointer + a length in blocks, similar to variable-length memory segments.)
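The size limits quoted on these slides follow from one formula, sketched here under the same assumptions (4-KB blocks, 4-byte pointers, 12 direct pointers). The function name `max_file_bytes` is made up for this example.

```python
BLOCK_SIZE = 4096
POINTER_SIZE = 4                                   # assumed pointer width
POINTERS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE    # 1024


def max_file_bytes(direct=12, levels=0):
    """Largest file size with `direct` pointers plus one pointer each of
    indirection depth 1..levels (1 = single, 2 = double, 3 = triple)."""
    blocks = direct + sum(POINTERS_PER_BLOCK ** depth
                          for depth in range(1, levels + 1))
    return blocks * BLOCK_SIZE


direct_only = max_file_bytes(levels=0)    # 48 KB
with_single = max_file_bytes(levels=1)    # 4144 KB
with_double = max_file_bytes(levels=2)    # a little over 4 GB
```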
The Multi-Level Index

Most files are small | Roughly 2K is the most common size
Average file size is growing | Almost 200K is the average
Most bytes are stored in large files | A few big files use most of the space
File systems contain lots of files | Almost 100K on average
File systems are roughly half full | Even as disks grow, file systems remain ~50% full
Directories are typically small | Many have few entries; most have 20 or fewer

File System Measurement Summary

Directory Organization
• A directory contains a list of (entry name, inode number) pairs.
• Each directory has two extra entries: . (dot) for the current directory and .. (dot-dot) for the parent directory.
– For example, dir has three files (foo, bar, foobar):

inum | reclen | strlen | name
5    | 4      | 2      | .
2    | 4      | 3      | ..
12   | 4      | 4      | foo
13   | 4      | 4      | bar
24   | 8      | 7      | foobar

on-disk dir content
Directory Organization
• In the on-disk dir content shown above, reclen is the total number of bytes reserved for the name plus any leftover space.
– Leftover space can occur because a new entry may reuse an old, bigger entry.
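A minimal sketch of how such records could be packed and searched. The header widths (4-byte inum, 2-byte reclen, 2-byte strlen) and the rule of padding names to a multiple of 4 bytes are assumptions chosen to reproduce the reclen and strlen values in the table; they are not taken from any particular file system.

```python
import struct

# Assumed header: 4-byte inum, 2-byte reclen, 2-byte strlen (little-endian)
HEADER = struct.Struct("<IHH")


def pack_entry(inum, name):
    """Encode one directory record: header, then the padded name."""
    raw = name.encode() + b"\0"            # strlen counts the terminator
    reclen = -(-len(raw) // 4) * 4         # round up to a multiple of 4
    return HEADER.pack(inum, reclen, len(raw)) + raw.ljust(reclen, b"\0")


def lookup(directory, wanted):
    """Scan the records linearly; return the inode number or None."""
    pos = 0
    while pos < len(directory):
        inum, reclen, strlen = HEADER.unpack_from(directory, pos)
        name_start = pos + HEADER.size
        name = directory[name_start:name_start + strlen - 1].decode()
        if name == wanted:
            return inum
        pos = name_start + reclen          # skip the name and its padding
    return None


entries = [(5, "."), (2, ".."), (12, "foo"), (13, "bar"), (24, "foobar")]
directory = b"".join(pack_entry(inum, name) for inum, name in entries)
found = lookup(directory, "foobar")        # 24
```

The linear scan is the simplest possible lookup; it is adequate for the small directories that the measurement summary says are typical.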
Free Space Management
• The file system tracks which inodes and data blocks are free and which are in use.
• To manage free space, we have two simple bitmaps.
– When a file is newly created, the file system allocates an inode by searching the inode bitmap for a free entry, then updates the on-disk bitmap.
– A pre-allocation policy is commonly used to allocate contiguous data blocks.
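One way to picture pre-allocation is a search for a run of consecutive free entries in the data bitmap. The sketch below is illustrative only: the bitmap is a list of 0/1 values, the run length is a parameter, and the names `find_free_run` and `preallocate` are hypothetical.

```python
def find_free_run(bitmap, length):
    """Return the start index of the first run of `length` free (0) entries."""
    run_start, run_len = 0, 0
    for index, bit in enumerate(bitmap):
        if bit == 0:
            if run_len == 0:
                run_start = index
            run_len += 1
            if run_len == length:
                return run_start
        else:
            run_len = 0                    # run broken by an in-use block
    return None


def preallocate(bitmap, length):
    """Reserve a contiguous run of blocks; return its start or None."""
    start = find_free_run(bitmap, length)
    if start is not None:
        for index in range(start, start + length):
            bitmap[index] = 1
    return start


data_bitmap = [1, 0, 0, 1, 0, 0, 0, 0, 1]
start = preallocate(data_bitmap, 3)        # blocks 4, 5, 6
```

Handing a new file a contiguous run means its blocks sit next to each other on disk, which helps sequential reads later.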
Access Paths: Reading a File From Disk
• A process issues an open(“/foo/bar”, O_RDONLY):
– Goal: traverse the pathname and thus locate the desired inode.
– Begin at the root of the file system (/).
• In most Unix file systems, the root inode number is 2.
– The file system reads in the block that contains inode number 2.
– It looks inside the inode to find pointers to data blocks (the contents of the root directory).
– By reading in one or more directory data blocks, it finds the entry for the “foo” directory.
– It traverses the pathname recursively until it reaches the desired inode (“bar”).
– It checks final permissions, allocates a file descriptor for this process, and returns the file descriptor to the user.
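The traversal can be sketched with a toy in-memory "disk". Everything here is illustrative: inodes are Python dicts, the inode numbers 12 and 13 for foo and bar are made up, and permission checks and file descriptor allocation are left out. The `trace` list records which structures would be read.

```python
ROOT_INUM = 2

# Toy disk: inode number -> inode. A directory's data maps names to inode numbers.
inodes = {
    2:  {"type": "dir",  "data": {"foo": 12}},
    12: {"type": "dir",  "data": {"bar": 13}},
    13: {"type": "file", "data": b"hello"},
}


def resolve(path, trace):
    """Walk the path from the root; return the final inode number."""
    inum = ROOT_INUM
    trace.append(("read inode", inum))
    for name in [part for part in path.split("/") if part]:
        inode = inodes[inum]
        if inode["type"] != "dir":
            raise NotADirectoryError(name)
        trace.append(("read data", inum))      # scan the directory contents
        if name not in inode["data"]:
            raise FileNotFoundError(path)
        inum = inode["data"][name]
        trace.append(("read inode", inum))     # fetch the next inode
    return inum


trace = []
bar_inum = resolve("/foo/bar", trace)
```

For /foo/bar the trace contains five reads: root inode, root data, foo inode, foo data, bar inode. Each extra path component adds two more.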
Access Paths: Reading a File From Disk
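Call | Disk I/Os, in order
open(/foo/bar, O_RDONLY) | read root inode; read root data; read foo inode; read foo data; read bar inode
read() | read bar inode; read bar data[0]; write bar inode
read() | read bar inode; read bar data[1]; write bar inode
read() | read bar inode; read bar data[2]; write bar inode

The data bitmap and inode bitmap are not accessed, since nothing is allocated.

File Open() Timeline (Time Increasing Downward)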

Access Paths: Reading a File From Disk
• Issue read() to read from the file.
– Read in the desired block of the file, consulting the inode to find the location of that block.
• Update the inode with a new last-accessed time.
• Update the in-memory open file table entry for the file descriptor, advancing the file offset.

• When the file is closed using close():

– The file descriptor should be deallocated; for now, that is all the file system really needs to do.
– No disk I/Os take place.
Access Paths: Reading a File From Disk
[Figure: File Read Timeline (Time Increasing Downward). After open(), each read() reads bar’s inode, reads the next data block (bar data[0], then data[1], then data[2]), and writes bar’s inode to record the new access time.]

Access Paths: Writing to Disk
• Ex: issue write() to update a file with new contents.
• The write may need to allocate a new data block (unless an existing data block is being overwritten).
– The file system must then update both the data block and the data bitmap.
– This generates five disk I/Os:
• one to read the data bitmap
• one to write the bitmap (to reflect its new state to disk)
• two more to read and then write the inode
• one to write the actual data block itself

• Creating a file additionally allocates an inode and space in the directory for the new entry, causing high I/O traffic.
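The five I/Os of an allocating write can be traced with a toy model. This is a sketch under simplifying assumptions: the file system state is a Python dict, each call writes exactly one new block, and the strings in `trace` merely name the disk operations in the order used by the write timeline.

```python
def allocating_write(state, payload, trace):
    """Append one new block to the file 'bar', recording each disk I/O."""
    trace.append("read bar inode")             # learn where the file ends
    trace.append("read data bitmap")
    block = state["data_bitmap"].index(0)      # first free block; ValueError if full
    state["data_bitmap"][block] = 1
    trace.append("write data bitmap")
    state["blocks"][block] = payload
    trace.append("write bar data")
    state["inode"]["pointers"].append(block)   # point the inode at the new block
    state["inode"]["size"] += len(payload)
    trace.append("write bar inode")
    return block


state = {
    "data_bitmap": [1, 1, 0, 0],
    "blocks": {},
    "inode": {"pointers": [], "size": 0},
}
trace = []
block = allocating_write(state, b"x" * 4096, trace)
```

Two reads and three writes for a single 4-KB block is why file systems work hard to batch and cache this traffic.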
Access Paths: Writing to Disk
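Call | Disk I/Os, in order
create(/foo/bar) [assume /foo/bar doesn’t exist in /foo/] | read root inode; read root data; read foo inode; read foo data; read inode bitmap; write inode bitmap; write foo data; read bar inode; write bar inode; write foo inode
write() | read bar inode; read data bitmap; write data bitmap; write bar data[0]; write bar inode
write() | read bar inode; read data bitmap; write data bitmap; write bar data[1]; write bar inode
write() | read bar inode; read data bitmap; write data bitmap; write bar data[2]; write bar inode

File Creation and write Timeline (Time Increasing Downward)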

Caching and Buffering
• Reading and writing files is expensive, incurring many I/Os.
– For example, consider a long pathname (/1/2/3/.../100/file.txt):
• Each directory on the path costs one read for its inode and at least one read for its data.
• The file system literally performs hundreds of reads just to open the file.

• To reduce I/O traffic, file systems aggressively use system memory (DRAM) as a cache.
– Early file systems used a fixed-size cache to hold popular blocks.
• Static partitioning of memory can be wasteful.

– Modern systems use a dynamic partitioning approach: a unified page cache.
• With a large enough cache, most read I/O can be avoided.
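A fixed-size block cache can be sketched as follows. The slides do not say which replacement policy is used; least-recently-used (LRU) eviction is an assumption here, and the class name `BlockCache` and the fake `read_from_disk` callback are made up for illustration.

```python
from collections import OrderedDict


class BlockCache:
    """Fixed-size cache of disk blocks with LRU eviction (assumed policy)."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk
        self.blocks = OrderedDict()        # block number -> data, oldest first
        self.disk_reads = 0

    def read(self, block_number):
        if block_number in self.blocks:
            self.blocks.move_to_end(block_number)   # hit: mark as recently used
            return self.blocks[block_number]
        self.disk_reads += 1                        # miss: go to disk
        data = self.read_from_disk(block_number)
        self.blocks[block_number] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)         # evict least recently used
        return data


cache = BlockCache(capacity=2, read_from_disk=lambda n: f"block {n}")
for n in [3, 3, 4, 3, 5, 4]:
    cache.read(n)
```

In this run, six requests cause four disk reads; the two repeat reads of block 3 are served from memory.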
Caching and Buffering
• Write traffic has to go to disk to become persistent. Thus, a cache cannot filter write I/Os the way it filters reads.
• File systems use write buffering for write performance benefits.
– Delaying writes lets the file system batch some updates into a smaller set of I/Os.
– By buffering a number of writes in memory, the file system can then schedule the subsequent I/Os.
– In some cases, it can avoid some writes completely.

• Some applications force data to disk by calling fsync(), or bypass the cache by using direct I/O.
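From an application's point of view, forcing buffered data out looks roughly like this. The sketch uses Python's standard `os.fsync`, which wraps the fsync() system call; the helper name `durable_write` and the file name are made up for this example.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data and ask the OS to push it to the storage device."""
    with open(path, "wb") as f:
        f.write(data)            # lands in Python's user-space buffer
        f.flush()                # hand it to the OS (page cache)
        os.fsync(f.fileno())     # request that the OS force it to disk


path = os.path.join(tempfile.mkdtemp(), "log.bin")
durable_write(path, b"committed")
```

Without the fsync() call, the data may sit in memory for some time before the file system writes it out, which is faster but risks loss on a crash.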
Reading Material

• Chapter 40 of the OSTEP book, by Remzi and Andrea Arpaci-Dusseau (University of Wisconsin)
http://pages.cs.wisc.edu/~remzi/OSTEP/file-implementation.pdf
Questions?
