UNIX Module 2 Chapter 4 and 5
22CSE351
File
A file is a container for storing information.
A sequence of characters.
All file attributes are kept in a separate area of hard disk accessible only to kernel.
All physical devices (hard disk, memory, printer, CD-ROM etc.) are treated as files.
The shell and the kernel are also files.
The File system
UNIX looks at everything as files.
•A filename can consist of any ASCII character except the / and the NULL character
List.
^V^B^D-++bcd
{}[]ac
@#$%*abcd
a.b.c.d.e
•A filename can begin with a dot and also end with a dot
Never use a - (hyphen) at the beginning of a filename. It may be treated as an option to a command.
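The hyphen problem can be seen, and worked around, in a scratch directory. This is a minimal sketch: the filename -report is hypothetical, and it assumes the common convention that -- ends a command's option list.

```shell
# Work in a scratch directory so nothing real is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# "--" marks the end of options, so -report is taken as a filename.
touch -- -report

# Prefixing the name with ./ also stops it from looking like an option.
ls ./-report
rm ./-report

# Count what is left (nothing), then clean up.
remaining=$(( $(ls -A | wc -l) ))
echo "files left: $remaining"
cd "$OLDPWD" && rm -rf "$scratch"
```

Without the -- or the ./ prefix, most commands would try to read -report as a set of options and report an error.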
Basic File Types/Categories
Files can be divided into 3 categories
1) Ordinary file
2) Directory file
3) Device file
Ordinary file : Also known as a regular file; contains data as a stream of characters.
Directory file : Contains files and other directories, stored as a name and a number associated with each name
Device file : All devices are represented by files. To read from or write to a device, one has to perform these operations on the associated file.
Ordinary (Regular) File
•Most common file type.
•All the programs you write belong to this type.
•An Ordinary file can be divided into:
– Text file
– Binary file
Text file
– A text file contains only printable characters, and its contents can be viewed.
– All C and Java program sources, and shell and perl scripts, are text files.
– A text file contains lines of characters with each line terminated by a newline character ,
also known as linefeed(LF).
– LF is appended to every line when [Enter] is pressed.
Binary file
– Contains both printable and unprintable characters that cover the entire range of byte values (0-255)
– Object code and executables that are produced by compiling C programs are binary files.
• Commands used to access an ordinary file also work with device files.
• A device file is indeed special; it's not really a stream of characters. In fact, it doesn't contain anything at all.
• The operation of a device is entirely governed by the attributes of its associated file.
Organization of Files
• All the directories, folders, devices and files are considered as files in UNIX operating
system.
• Each directory can contain files and subdirectories (which are also directories). The
top-level directory is called the root directory (denoted by /), and all other directories
branch out from there.
• Basic commands to list and manipulate files: UNIX provides essential commands for
working with files and directories
• Independent of physical file system organization: The UNIX file system provides a
logical view of files and directories. This view is independent of the physical storage
devices (such as hard drives or SSDs) where the data resides. Users interact with files
and directories through logical paths (e.g., /home/user/documents/file.txt) without
needing to know the physical location.
• Always single tree: Unlike some other file systems, UNIX maintains a single tree
structure. There’s no concept of multiple drives (like C:, D:, etc.). All directories and
files are part of this unified tree, starting from the root directory.
Hidden Files
• A hidden folder (sometimes hidden directory) or hidden file is a folder or file which file
system utilities do not display by default when showing a directory listing.
• They are commonly used for storing user preferences or preserving the state of a utility,
and are frequently created implicitly by using various utilities.
• They are not a security mechanism because access is not restricted - usually the intent
is simply to not "clutter" the display of the contents of a directory listing with files the user
did not directly create.
• In Unix-like operating systems, any file or folder that starts with a dot character (for
example, .profile and .config), commonly called a dot file or dotfile, is to be treated as
hidden.
• Hidden files are often found in the home directory and normally don't show up in the listing.
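The effect of a leading dot can be checked in a scratch directory. This is a small sketch; the names .settings and notes.txt are hypothetical.

```shell
# Create one dot file and one ordinary file in a scratch directory.
scratch=$(mktemp -d)
touch "$scratch/.settings" "$scratch/notes.txt"

plain=$(ls "$scratch")      # default listing skips names starting with a dot
all=$(ls -a "$scratch")     # -a also shows ., .. and .settings

echo "ls:"
echo "$plain"
echo "ls -a:"
echo "$all"
rm -rf "$scratch"
```

The dot file is still readable by anyone with permission to the directory; it is only left out of the default listing.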
• /tmp – users are allowed to create temporary files. These files are wiped away regularly.
• /var – variable part of file system. Contains all your print jobs and outgoing and incoming mails.
• First group contains files that are made available during system installation
• Second group contains directories that would change as more software and utilities are
added to the system
/bin and /usr/bin:
These directories contain essential binary files (executable programs) that are
commonly used by all users. Commands like ls, cp, and mv reside here.
/etc:
The /etc directory holds configuration files for various system components. These files
control system behavior, services, and applications.
/dev:
Device files reside in /dev. These files represent hardware devices (e.g., disks, printers,
terminals) and allow communication between software and hardware. They don’t
consume disk space.
/usr/include:
It contains standard header files used in C and C++ programming. These headers define
functions, data types, and macros for developers.
/usr/share/man:
Man pages (manual pages) are stored here. They provide detailed documentation for
commands, system calls, and library functions. Use man <command> to access them.
/tmp:
Users can create temporary files in /tmp. These files are automatically cleaned up
periodically.
/var:
The /var directory holds variable data, including logs, spool files (e.g., print jobs), and
mail. It’s a dynamic part of the file system.
The Parent - Child relationship
• File system in UNIX is a collection of all related files organized in a hierarchical structure.
• The top that serves as the reference point for all files is called root.
• Ex: home directory is the parent of kumar, while / is the parent of home and
grandparent of kumar.
• Ex: home and kumar are both directories as they are parents of at least one file or
directory
The home Directory and HOME variable
• On login, UNIX places the user in a directory called the home directory.
• It is created when the user account is opened
• The shell variable HOME holds the pathname of the home directory
Ex: echo $HOME
/home/kumar (Absolute pathname)
• The HOME variable gives the absolute pathname: a sequence of directory names separated by slashes.
• An absolute pathname shows a file location with reference to the top, i.e., root.
• To locate the file ‘foo’ in the home directory, use $HOME/foo
• ‘~’ can also be used to refer to a home directory. This symbol can refer to any user's home directory and not just your own.
• ~/foo - own user (Kumar) home - A tilde (~) followed by a / refers to one's own home directory
• ~Sharma/foo - user Sharma home - A tilde (~) followed by a string refers to the home directory of the user with that name
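The two ways of naming a file in one's own home directory can be compared directly. This is a minimal sketch that assumes HOME is set as usual; foo is a placeholder name and need not exist.

```shell
# HOME holds the absolute pathname of the home directory.
echo "$HOME"

# An unquoted tilde followed by / expands to the same directory.
via_home="$HOME/foo"
via_tilde=~/foo

echo "$via_home"
echo "$via_tilde"
```

Both echo commands print the same absolute pathname. The tilde must be left unquoted, because the shell does not expand it inside quotes.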
Reaching required files – the PATH variable, Manipulating PATH
• To execute a UNIX command, the shell consults the PATH variable to determine the sequence of directories it has to search.
$echo $PATH
/bin:/usr/bin:/usr/local/bin
• When a directory is appended to PATH, echo $PATH shows the default value followed by the appended directory.
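Appending a directory to the search path can be sketched as follows. The directory /opt/tools/bin is a hypothetical example and does not need to exist; the change lasts only for the current shell session.

```shell
# Remember the current search path so the change is easy to see.
old_path=$PATH

# Append one more directory; entries are separated by colons.
PATH=$PATH:/opt/tools/bin

echo "before: $old_path"
echo "after:  $PATH"
```

Because the new directory is placed last, it is searched only after every default directory has been tried.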
Absolute pathname
• Many UNIX commands use file and directory names as arguments, which are presumed to exist in the current directory
Ex:$cat login.sql
• If you are placed in /usr and want to access login.sql in /home/kumar, the pathname of the file has to be used
$cat /home/kumar/login.sql
• If the first character of the pathname is /, the file's location is determined with respect to the root; this is an absolute pathname
• When there is more than one / in the pathname, for each / you descend one level in the file system
Using absolute pathname for a command
• When the date command is issued, the system has to locate the file date in the list of directories specified in the PATH variable
• If the location of a particular command is known, the complete path can precede the command
Eg: /bin/date
• For any command that resides in a directory specified in the PATH variable, the absolute pathname need not be used
UNIX places a user in a specific directory of the file system when logged in.
The user can navigate from one directory to another, but at any point of time the user is located in only one directory, the current directory.
Ex: $pwd
Reasons why a directory may fail to be created:
The directory already exists
There may be an ordinary file by that name in the current directory
Permissions set for the current directory don't permit the creation of files and directories by the user.
rmdir – Removing Directory
rmdir command removes directories
$rmdir pis
rmdir can remove more than one directory at once
$rmdir pis/data pis/progs pis
When a directory and subdirectories need to be deleted a reverse logic is applied
Sequence followed is subdirectories first then the parent directory
$rmdir pis pis/progs pis/data
<error>
The error message leads to 2 rules to be remembered when deleting directories
Cannot delete a directory using rmdir command unless it is empty – pis could not be removed
because of progs and data under it.
Cannot remove a subdirectory, unless you are placed in a directory that is hierarchically above the
one you have chosen to remove.
Try removing progs while placed in progs
$cd progs
$pwd
/home/kumar/pis/progs
$rmdir /home/kumar/pis/progs
<error>
To remove this directory, place yourself in a directory above progs
$cd /home/kumar/pis
$pwd
/home/kumar/pis
$rmdir progs
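The "subdirectories first, parent last" rule can be tried safely in a scratch directory. This is a sketch that reuses the pis, progs and data names from the examples above; the scratch location comes from mktemp.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/pis/progs" "$scratch/pis/data"

# Removing the parent first fails because pis is not empty.
if rmdir "$scratch/pis" 2>/dev/null; then
    parent_first="removed"
else
    parent_first="refused"
fi

# Subdirectories first, parent last: all three are removed.
rmdir "$scratch/pis/data" "$scratch/pis/progs" "$scratch/pis"
if [ -d "$scratch/pis" ]; then leftover="yes"; else leftover="no"; fi

echo "parent first: $parent_first, pis left over: $leftover"
rmdir "$scratch"
```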
How files/directories are created and removed?
A file (ordinary or directory) is associated with a name and a number called the inode number.
When a directory is created, an entry comprising these 2 parameters is made in the file's parent directory.
The entry is removed when the directory (or ordinary file) is removed.
ls: listing files in current directory
The ls command in UNIX is used to list files and directories within the file system
1. Basic Listing
Command:
$ ls
$ ls /home/user
Description: Lists files and directories in the /home/user directory
The files are arranged alphabetically with uppercase having precedence over lower.
Common Options:
-l: Long listing format. Displays detailed information about each file, including
permissions, number of links, owner, group, size, and timestamp.
-a: Lists all files, including hidden files (those starting with a dot).
-h: Human-readable format. Makes file sizes easier to read (e.g., 1K, 234M, 2G).
Command:
ls -a
Description: Lists all files, including hidden files (those starting with a dot).
Example Output:
.bashrc .profile file.txt directory
Human-Readable Sizes
•Command:
ls -lh
•Description: Shows file sizes in a human-readable format (e.g., KB, MB).
•Example Output:
drwxr-xr-x 2 user group 4.0K Aug 12 20:00 directory
-rw-r--r-- 1 user group 1.2K Aug 12 20:00 file.txt
Combine Options
Command:
ls -alh
Description: Combines multiple options to list all files (including hidden ones) in long
format with human-readable sizes.
Example Output:
drwxr-xr-x 5 user group 4.0K Aug 12 20:00 .
drwxr-xr-x 3 user group 4.0K Aug 12 20:00 ..
-rw-r--r-- 1 user group 1.2K Aug 12 20:00 .bashrc
drwxr-xr-x 2 user group 4.0K Aug 12 20:00 directory
-rw-r--r-- 1 user group 1.2K Aug 12 20:00 file.txt
The -R option lists a directory and, recursively, all of its subdirectories:
ls -R /home/user
Output:
/home/user:
dir1 file1.txt file4.txt
/home/user/dir1:
dir2 file2.txt
/home/user/dir1/dir2:
file3.txt
The ls -x command in Linux lists directory contents in columns, sorted horizontally.
This means that the files and directories are displayed in rows across the terminal window,
rather than in a single column or sorted vertically.
The ls -F command in Linux is used to append a character to each file name indicating the
file type. This makes it easier to distinguish between different types of files and directories
at a glance.
/ for directories
* for executable files
@ for symbolic links
| for FIFOs (named pipes)
= for sockets
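The markers can be observed by creating one file of each common type in a scratch directory. This is a sketch; the names docs, notes.txt, run.sh and latest are hypothetical.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/docs"                          # a directory
touch "$scratch/notes.txt" "$scratch/run.sh"   # two ordinary files
chmod +x "$scratch/run.sh"                     # make one of them executable
ln -s notes.txt "$scratch/latest"              # a symbolic link

marked=$(ls -F "$scratch")
echo "$marked"
rm -rf "$scratch"
```

The listing shows docs/, latest@, notes.txt and run.sh*; an ordinary non-executable file gets no marker.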
#!/bin/bash
# This script displays the current user, date and time, and the contents of the current directory
whoami
date
ls
• ls
• LS
• ls chap*
• ls -l
• ls -l chap*
ls options
• ls -d list the directory itself rather than its contents
• ls -l long listing format, showing permissions
• ls -la long listing format, including hidden files
• ls -s list file sizes in blocks
• ls -S sort by file size
• ls -t sort by modification time
cp - Copying a file
When both are ordinary files, the first one is copied to the second
cp chap01 unit1
If the destination file doesn't exist, it is created first. Otherwise, cp simply overwrites the file without any warning. So check with ls whether the destination file exists before you use cp.
Even though we used simple filenames here, both source and destination can also be pathnames.
If there is only one file to be copied, the destination can be either an ordinary file or a directory
cp chap01 progs/unit1 //chap01 copied to unit1 under progs
cp chap01 progs // chap01 retains its name under progs
cp is often used with the shorthand notation . (dot) to signify the current directory as destination
cp /home/sharma/.profile .profile //Destination is a file
cp /home/sharma/.profile . // destination is the current directory
cp can also copy more than one file. The last filename must then be a directory
cp chap01 chap02 chap03 progs // progs directory must exist; cp won't create it
cp chap* progs // copies all files beginning with chap
cp overwrites the destination file without warning if it exists! Run ls before you use cp unless you are sure that the destination file doesn't exist or deserves to be overwritten.
cp - Options
Interactive copying (-i)
The -i option warns the user before overwriting the destination file.
If unit1 exists cp prompts for a response
cp -i chap01 unit1
cp: overwrite unit1 (yes/no)? y
A y at this prompt overwrites the file and any other response leaves it uncopied
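Copying directory structures (-R)
The -R option copies a directory together with all its files and subdirectories
cp -R progs newprogs
If newprogs doesn't exist, it is created as a copy of progs:
progs/
├── file1.txt
├── file2.txt
└── subdir/
    └── file3.txt
newprogs/
├── file1.txt
├── file2.txt
└── subdir/
    └── file3.txt
Assuming newprogs already exists and has some files:
newprogs/
├── existingfile.txt
progs is then copied as a subdirectory of newprogs:
newprogs/
├── existingfile.txt
└── progs/
    ├── file1.txt
    ├── file2.txt
    └── subdir/
        └── file3.txt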
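The two outcomes of a recursive copy can be reproduced in a scratch directory. This is a sketch under the assumption that the trees above come from cp -R progs newprogs run twice; the file names mirror the diagrams.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/progs/subdir"
touch "$scratch/progs/file1.txt" "$scratch/progs/file2.txt" \
      "$scratch/progs/subdir/file3.txt"

# newprogs does not exist yet: it becomes a copy of progs.
cp -R "$scratch/progs" "$scratch/newprogs"
first=$(ls "$scratch/newprogs")

# newprogs exists now: progs is copied inside it as newprogs/progs.
cp -R "$scratch/progs" "$scratch/newprogs"
second=$(ls "$scratch/newprogs")

echo "after first copy:";  echo "$first"
echo "after second copy:"; echo "$second"
rm -rf "$scratch"
```

The second listing contains an extra progs entry, which is why it matters whether the destination directory already exists.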
rm - Deleting files
rm command deletes one or more files
It should be used with caution. A file once deleted can't be recovered
rm chap01 chap02 chap03 //removes three files
rm won't remove a directory, but it can remove files from one. For example, to remove two chapters from the progs directory without having to "cd" to it:
rm progs/chap01 progs/chap02
With the interactive option (-i), rm asks for confirmation before removing each file. A y removes the file; any other response leaves the file undeleted.
rm Options
Recursive deletion (-r or -R)
rm performs a tree walk, a thorough recursive search for all directories and files within the subdirectories. At each stage it deletes everything it finds.
rm won't normally remove directories, but when used with this option it will.
$rm -r * //it behaves partially like rmdir
This will delete all files in the current directory and all its subdirectories. If you don't have a backup, then these files will be lost forever.
Forcing removal (-f)
rm prompts for removal if the file is write-protected. The -f option overrides this minor protection and forces removal.
When you combine the -r option with it, it could be the most dangerous thing that you've ever done:
$rm -rf * //Deletes everything in the current directory and below
If you don't have a backup, then these files will be lost forever. Note that this command will delete hidden files in all directories except the current directory.
Make sure you are doing the right thing before you use rm *
rm * removes only ordinary files in the current directory.
If the root user (the superuser) invokes rm -rf * in the / directory, the entire UNIX system will be wiped out from the hard disk!
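The difference between plain rm and rm -r is best tried on throwaway files. This is a sketch confined to a scratch directory created by mktemp; the names progs, chap01 and chap02 are only illustrative.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/progs/subdir"
touch "$scratch/progs/chap01" "$scratch/progs/subdir/chap02"

# Plain rm refuses to remove a directory.
if rm "$scratch/progs" 2>/dev/null; then plain="removed"; else plain="refused"; fi

# rm -r walks the tree and removes the directory with everything in it.
rm -r "$scratch/progs"
if [ -e "$scratch/progs" ]; then after="still there"; else after="gone"; fi

echo "rm: $plain, rm -r: $after"
rmdir "$scratch"
```

Giving rm -r an explicit directory name, as here, is safer than combining it with *.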
mv - Renaming files
mv command renames (moves) files. It has two distinct functions
It renames a file (or directory)
It moves a group of files to a different Directory
It does not create a copy of the file it merely renames it. If destination file doesn’t exist, it will
be created.
mv chap01 man01 // renames the file chap01 to man01
cp adds an entry into the directory with the name of the destination file and inode number that is
allotted by kernel.
mv replaces the name of an existing directory entry without disturbing its inode number.
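The inode behaviour of cp and mv can be checked with ls -i, which prints the inode number before each name. This is a sketch in a scratch directory; inode_of is a hypothetical helper defined here, and the rename stays within one directory.

```shell
scratch=$(mktemp -d)
echo "sample" > "$scratch/chap01"

# Helper: print only the inode number of a file.
inode_of() { ls -i "$1" | awk '{print $1}'; }

before=$(inode_of "$scratch/chap01")

cp "$scratch/chap01" "$scratch/unit1"     # new directory entry, new inode
copy=$(inode_of "$scratch/unit1")

mv "$scratch/chap01" "$scratch/man01"     # same inode, new name
after=$(inode_of "$scratch/man01")

echo "original: $before  copy: $copy  renamed: $after"
rm -rf "$scratch"
```

The renamed file keeps the original inode number, while the copy is given a different one.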
wc -w infile
wc -c infile
When used with multiple filenames wc produces a line for each file as well as total count
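The three counts can be checked on a small file whose contents are known. This is a sketch; infile here holds two hypothetical lines, and input redirection is used so that wc prints only the number.

```shell
scratch=$(mktemp -d)
printf 'one two\nthree\n' > "$scratch/infile"

# $(( ... )) strips the padding some wc versions put before the number.
lines=$(( $(wc -l < "$scratch/infile") ))
words=$(( $(wc -w < "$scratch/infile") ))
chars=$(( $(wc -c < "$scratch/infile") ))

echo "$lines $words $chars"   # prints: 2 3 14
rm -rf "$scratch"
```

The character count of 14 includes the two newline characters.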
od: Displaying data in octal
The od command is particularly useful for debugging scripts, examining binary files, or visualizing non-human-readable data.
The tab character [Ctrl-i] is shown as \t and its octal value is 011
The linefeed character [Ctrl-j] is shown as \n and its octal value is 012. od thus makes the newline character visible too.
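A short input containing a tab and a newline shows how od exposes them. This is a sketch that assumes the -b (octal bytes) and -c (characters) options are used together; the exact column spacing varies between od versions.

```shell
# Feed od four bytes: a, tab, b, newline.
dump=$(printf 'a\tb\n' | od -bc)
echo "$dump"
```

The dump pairs each octal value with its character, so 011 appears with \t and 012 with \n.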
Comparing Files
You’ll often need to compare two files. They could be identical, in which case you may
want to delete one of them.
Two configuration files may have small differences, and knowledge of these differences
could help you understand why one system behaves differently from another.
UNIX supports three commands—cmp, diff, and comm—that compare two files and
present their differences.
cmp: Byte-by-Byte Comparison
cmp makes a comparison of each byte of two files and terminates the moment it encounters a
difference
cmp file1 file2
If the files are identical, cmp returns no output and exits with a status of 0.
If the files differ, it reports the byte and line number where the first difference occurs and
exits with a status of 1
Let’s say we have two files, file1.txt and file2.txt.
file1.txt:
Hello, World!
This is a test file.
file2.txt:
Hello, World!
This is a test file with a difference.
To compare these files, you can use the cmp command as follows:
cmp file1.txt file2.txt
Output
file1.txt file2.txt differ: byte 34, line 2
This output indicates that the first difference between the two files occurs at byte 34 on line 2.
Verbose Mode:
cmp -l file1.txt file2.txt
This will list all differing bytes.
If the two files are identical, cmp displays no message but simply returns the prompt.
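The exit status is what makes cmp convenient in scripts. The sketch below recreates the two sample files, assuming the greeting and the sentence sit on separate lines (which is what a difference reported on line 2 implies), and records the status for a differing pair and for an identical pair.

```shell
scratch=$(mktemp -d)
printf 'Hello, World!\nThis is a test file.\n' > "$scratch/file1.txt"
printf 'Hello, World!\nThis is a test file with a difference.\n' > "$scratch/file2.txt"

# Differing files: cmp reports the first mismatch and exits with 1.
cmp "$scratch/file1.txt" "$scratch/file2.txt"
differ_status=$?

# Identical files: no output, exit status 0.
cmp "$scratch/file1.txt" "$scratch/file1.txt"
same_status=$?

echo "differ: $differ_status, same: $same_status"
rm -rf "$scratch"
```

The first mismatch is at byte 34, where file1.txt has a full stop and file2.txt has a space.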
comm: What Is Common?
While cmp compares two files character by character, comm compares them line by line and
displays the common and differing lines.
To drop a particular column, simply use its column number as an option prefix. You can
also combine options and display only those lines that are common:
comm -3 foo1 foo2 #Selects lines not common to both files
comm -13 foo1 foo2 #Selects lines present only in second file
cat > a.txt
Line1
Line2
Line4
Line7
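Column suppression can be tried on two small sorted files; comm expects its input to be sorted. This is a sketch: foo1 reuses the four lines shown above, while the contents of foo2 are hypothetical.

```shell
scratch=$(mktemp -d)
printf 'Line1\nLine2\nLine4\nLine7\n' > "$scratch/foo1"
printf 'Line2\nLine3\nLine4\nLine8\n' > "$scratch/foo2"   # hypothetical contents

# -12 drops columns 1 and 2, leaving only lines common to both files.
common=$(comm -12 "$scratch/foo1" "$scratch/foo2")

# -13 drops columns 1 and 3, leaving lines found only in the second file.
only_second=$(comm -13 "$scratch/foo1" "$scratch/foo2")

echo "common:";      echo "$common"
echo "only in foo2:"; echo "$only_second"
rm -rf "$scratch"
```

Here the common lines are Line2 and Line4, and the lines unique to foo2 are Line3 and Line8.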
diff is the third command that can be used to display file differences.
Unlike its fellow members, cmp and comm, it also tells you which lines in one file have to
be changed to make the two files identical.
Line Numbers:
The numbers before the letters (e.g., 2d1, 4c3) refer to the line numbers in the files being
compared.
The first number refers to the line number in the first file (f1.txt), and the second number refers to
the line number in the second file (f2.txt).
Operation Codes:
d (delete): A line in the first file needs to be deleted to match the second file.
c (change): A line in the first file needs to be changed to match the second file.
a (add): A line from the second file needs to be added to the first file to match the second file.
Summary
d: Delete a line from the first file.
c: Change a line in the first file to match the second file.
a: Add a line from the second file to the first file.
diff: Converting One File to Another
cat > a.txt
This is file number 1
This is file number 2
1d0: This means that line 1 in the first file should be deleted to match the second file.
< This is file number 1: The < symbol indicates that the line “This is file number 1” is present in the first file but not in the second file. The 0 after d shows that the line has no counterpart in f2.txt; it would have come before the first line there.
2a2: This means that after line 2 in the first file, you should append line 2 of the second file.
> This is file number 3: The > symbol indicates that the line “This is file number 3” is present in the second file but not in the first file.
If you are simply interested in knowing whether two files are identical or not, use cmp without
any options.
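The 1d0 and 2a2 instructions can be reproduced with two tiny files. This is a sketch that assumes f1.txt holds lines "1" and "2" and f2.txt holds lines "2" and "3", which is what the explanation above implies.

```shell
scratch=$(mktemp -d)
printf 'This is file number 1\nThis is file number 2\n' > "$scratch/f1.txt"
printf 'This is file number 2\nThis is file number 3\n' > "$scratch/f2.txt"

# diff exits with 1 when the files differ; the output is the edit script.
result=$(diff "$scratch/f1.txt" "$scratch/f2.txt")
echo "$result"
rm -rf "$scratch"
```

The output is 1d0 followed by the deleted line, then 2a2 followed by the appended line.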
diff file1.txt file2.txt
2d1
< custard apple
4c3
< guava
---
> grapes
5a5
> kiwi
2d1
2: Line number in file1.txt.
d: Delete operation.
Explanation: The second line in file1.txt (custard apple) is deleted to match file2.txt.
5a5
5: Line number in file1.txt.
a: Add operation.
5: Line number in file2.txt.
Explanation: After the fifth line in file1.txt, a new line (kiwi) is added to match file2.txt.
diff file2.txt file1.txt
1a2
> custard apple
3c4
< grapes
---
> guava
5d5
< kiwi
1a2:
This means that after line 1 in file2.txt, there is an additional line (line 2) in file1.txt that is not present in file2.txt.
The line to be added to file2.txt is custard apple.
3c4:
This indicates a change between line 3 in file2.txt and line 4 in file1.txt.
The line grapes in file2.txt is changed to guava as in file1.txt.
5d5:
This means that line 5 in file2.txt is deleted.
The line deleted from file2.txt is kiwi.
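Both directions of the fruit comparison can be reproduced with a pair of five-line files. This is a sketch: only custard apple, guava, grapes and kiwi come from the output above, while apple, banana and mango are hypothetical stand-ins for the lines the two files share.

```shell
scratch=$(mktemp -d)
printf 'apple\ncustard apple\nbanana\nguava\nmango\n' > "$scratch/file1.txt"
printf 'apple\nbanana\ngrapes\nmango\nkiwi\n'         > "$scratch/file2.txt"

# Instructions for turning file1.txt into file2.txt.
forward=$(diff "$scratch/file1.txt" "$scratch/file2.txt")

# Swapping the arguments gives the instructions for the opposite conversion.
backward=$(diff "$scratch/file2.txt" "$scratch/file1.txt")

echo "$forward"
echo "====="
echo "$backward"
rm -rf "$scratch"
```

Every d in one direction turns into an a in the other, and the c stays a c with its two line numbers swapped.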
End of Module 2