
UNIX Module 2 Chapter 4 and 5

unix shell programming notes

Uploaded by

Sayed Shanwaz
Copyright
© © All Rights Reserved

UNIX SHELL PROGRAMMING

22CSE351

Chapter 4 and 5
File
A file is a container for storing information.

A sequence of characters.

A file's size is not stored in the file.

All file attributes are kept in a separate area of hard disk accessible only to kernel.

UNIX treats directories and devices as files.

A directory is a folder that stores filenames and other directories.

All physical devices (hard disk, memory, printer, CD-ROM, etc.) are treated as files.
The shell and the kernel are also files.
The File system
UNIX looks at everything as files.

Any UNIX system has thousands of files.


UNIX organizes files in directories.
Naming Files

•A filename can consist of up to 255 characters

•Files may or may not have extensions

•Can consist of any ASCII character except the / and the NULL character

•Permitted to use control characters or other unprintable characters

^@, ^A, ^B, ^C etc


valid filenames
.last-time

List.

^V^B^D-++bcd
{}[]ac

@#$%*abcd

a.b.c.d.e

It is recommended only the following characters be used in filenames

Alphabetic characters and numerals

The period (.) , hyphen(-) and underscore(_)

No rules for filename extensions


Applications impose restrictions

– Shell script: *.sh
– C program: *.c
– SQL scripts: *.sql

•A file can have as many dots embedded in its name as you like, e.g. a.b.c.d.e

•A filename can begin with a dot and also end with a dot

•UNIX is sensitive to case

chap01, Chap01 and CHAP01 can exist in the same directory

Never use a - at the beginning of a filename. It may be treated as an option for a command.
Basic File Types/Categories
Files can be divided into 3 categories
1) Ordinary file
2) Directory file
3) Device file

Ordinary file : Also known as regular file contains data as stream of characters.

Directory file : Contains files and other directories – name and a number associated with
each name

Device file : All devices are represented by files. To read from or write to a device, one has to
perform these operations on the associated file.
Ordinary (Regular) File
•Most common file type.
•All the programs you write belong to this type.
•An Ordinary file can be divided into:
– Text file
– Binary file

Text file
– A text file contains only printable characters and can be viewed directly.
– All C and java program sources, shell and perl scripts
– A text file contains lines of characters with each line terminated by a newline character ,
also known as linefeed(LF).
– LF is appended to every line when [Enter] is pressed.
Binary file

– Contains both printable and unprintable characters that cover the entire byte range (0-255)

– Most UNIX commands are binary files.

– Object code and executables that are produced by compiling C programs are binary files.

– Picture, sound and video files are binary files.

– Binary files cannot be displayed meaningfully using the cat command.


Directory file
Directory contains no data – contains some details of files and subdirectories that it
contains.
•UNIX file system is organized with a number of directories and subdirectories.
•Group a set of files pertaining to a specific application within a directory.
•Can two files have same filename?
It is not possible to create files with same name in same directory.
•A directory file contains an entry of every file and subdirectory
•Each entry has 2 components
– The Filename
– A unique identification number for the file or directory(inode number)
•"A directory contains a file" is to be interpreted as "a directory contains the filename and not the file's
content"
Device file
• Devices are treated as files.
• Printing files, installing software – these activities are performed by reading or writing
the file representing the device.

• Commands used to access an ordinary file also work with device files.

• Device filenames are found under /dev

• A device file is indeed special; it's not really a stream of characters. In fact, it
doesn't contain anything at all.

• The operation of a device is entirely governed by the attributes of its associated file.
Organization of Files
• All the directories, folders, devices and files are considered as files in UNIX operating
system.

• Hierarchical organization of files: UNIX organizes files and directories in a
hierarchical tree structure.

• Each directory can contain files and subdirectories (which are also directories). The
top-level directory is called the root directory (denoted by /), and all other directories
branch out from there.
• Basic commands to list and manipulate files: UNIX provides essential commands for
working with files and directories

• Independent of physical file system organization: The UNIX file system provides a
logical view of files and directories. This view is independent of the physical storage
devices (such as hard drives or SSDs) where the data resides. Users interact with files
and directories through logical paths (e.g., /home/user/documents/file.txt) without
needing to know the physical location.

• Always single tree: Unlike some other file systems, UNIX maintains a single tree
structure. There’s no concept of multiple drives (like C:, D:, etc.). All directories and
files are part of this unified tree, starting from the root directory.
Hidden Files
• A hidden folder (sometimes hidden directory) or hidden file is a folder or file which file
system utilities do not display by default when showing a directory listing.

• They are commonly used for storing user preferences or preserving the state of a utility,
and are frequently created implicitly by using various utilities.

• They are not a security mechanism because access is not restricted - usually the intent
is simply to not "clutter" the display of the contents of a directory listing with files the user
did not directly create.
• In Unix-like operating systems, any file or folder whose name starts with a dot character (for
example, .profile and .config), commonly called a dot file or dotfile, is treated as
hidden.

• Hidden files are often found in the home directory and normally don't show up in the
listing.

• The ls command with the -a option lists all files, including hidden ones.

$ ls -a
. .. .profile .exrc
• The file .profile contains a set of instructions that are performed when a user logs in.
• The .exrc file contains a sequence of startup instructions for the vi editor.
• . and .. are special directories: . is the current directory and .. is its parent.
Standard Directories
• /bin and /usr/bin – Commonly used UNIX commands are found (binary files).

• /sbin and /usr/sbin – commands which can be executed by system administrator.


First group
• /etc – configuration files of system.

• /dev – all device files; they don't occupy space on disk.

• /lib and /usr/lib – library files in binary form.

• /usr/include – standard header files used in C program.


Second
group
• /usr/share/man – man (manual) pages are stored.

• /tmp – users are allowed to create temporary files. These files are wiped away regularly.

• /var – variable part of file system. Contains all your print jobs and outgoing and incoming mails.
• First group contains files that are made available during system installation
• Second group contains directories that would change as more software and utilities are
added to the system
/bin and /usr/bin:
These directories contain essential binary files (executable programs) that are
commonly used by all users. Commands like ls, cp, and mv reside here.

/sbin and /usr/sbin:


Similar to /bin, these directories store binary files, but they are typically meant for
system administrators. Commands like ifconfig and fdisk are found here.

/etc:
The /etc directory holds configuration files for various system components. These files
control system behavior, services, and applications.
/dev:
Device files reside in /dev. These files represent hardware devices (e.g., disks, printers,
terminals) and allow communication between software and hardware. They don’t
consume disk space.

/lib and /usr/lib:


These directories contain shared library files needed by programs. Libraries provide
common functions and routines for software development.

/usr/include:
It contains standard header files used in C and C++ programming. These headers define
functions, data types, and macros for developers.
/usr/share/man:
Man pages (manual pages) are stored here. They provide detailed documentation for
commands, system calls, and library functions. Use man <command> to access them.

/tmp:
Users can create temporary files in /tmp. These files are automatically cleaned up
periodically.

/var:
The /var directory holds variable data, including logs, spool files (e.g., print jobs), and
mail. It’s a dynamic part of the file system.
The Parent - Child relationship
• File system in UNIX is a collection of all related files organized in a hierarchical structure.

• The top that serves as the reference point for all files is called root.

• Root is represented by /, where Root is a directory.


• Root has a number of subdirectories under it, Subdirectories in turn have more
subdirectories and files under them.
• Every file apart from root must have a parent and it should be possible to
trace the ultimate parentage of a file to root.

• Ex: home directory is the parent of kumar, while / is the parent of home and
grandparent of kumar.

• In these parent-child relationships the parent is always a directory

• Ex: home and kumar are both directories as they are parents of at least one file or
directory
The home Directory and HOME variable
• When a user logs in, UNIX places the user in a directory called the home directory.
• It is created when the user account is opened.
• The shell variable HOME holds the pathname of the home directory.
Ex: $ echo $HOME
/home/kumar (Absolute pathname)
• The HOME variable gives an absolute pathname: a sequence of directory names separated by
slashes.
• An absolute pathname shows a file location with reference to the top i.e., root.
• To locate the file 'foo' in the home directory, use $HOME/foo
• The tilde (~) can also be used to refer to a home directory. This symbol can refer to any user's home
directory and not just your own.
• ~/foo - own user (kumar) home - A tilde (~) followed by a / refers to one's own home directory.
• ~sharma/foo - user sharma's home - A tilde (~) followed by a user name refers to the home directory of that user.
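The equivalence of $HOME/foo and ~/foo can be checked with a short sketch. It assumes a POSIX-style shell; the scratch directory stands in for /home/kumar, and foo is only a name (the file need not exist).

```shell
# Point HOME at a scratch directory (a stand-in for /home/kumar) so the
# demonstration does not depend on the real account.
demo_home=$(mktemp -d) || exit 1
HOME=$demo_home

abs_path="$HOME/foo"   # absolute pathname built from the HOME variable
tilde_path=~/foo       # an unquoted ~/ is expanded by the shell to $HOME/

echo "$abs_path"
echo "$tilde_path"     # prints the same pathname as the line above

rmdir "$demo_home"     # remove the scratch directory again
```

Because the tilde is expanded from HOME, both echo commands print the same absolute pathname.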
Reaching required files – the PATH variable, Manipulating
PATH
• To execute a UNIX command, the shell consults the PATH variable to determine the sequence of
directories it has to search.

• To see the value of PATH:
$ echo $PATH
/bin:/usr/bin:/usr/local/bin

• To add a user directory to PATH, reassign the variable:
$ PATH=$PATH:/usr/abc/xyz
$ echo $PATH
/bin:/usr/bin:/usr/local/bin:/usr/abc/xyz

• The output is the default value of PATH with the new directory appended.
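A minimal sketch of the same reassignment in a script, assuming a POSIX-style shell. /usr/abc/xyz is the example directory from the notes and need not exist; the variable names old_path and in_path are illustrative.

```shell
old_path=$PATH                 # remember the original value
PATH=$PATH:/usr/abc/xyz        # append, so the standard directories are searched first
echo "$PATH"

# Check whether the directory is now one of the colon-separated components.
case ":$PATH:" in
  *:/usr/abc/xyz:*) in_path=yes ;;
  *)                in_path=no  ;;
esac
echo "in PATH: $in_path"

PATH=$old_path                 # restore the original value
```

The change lasts only for the current shell session; placing the assignment in a startup file such as .profile would make it apply at every login.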
Absolute pathname
• Many UNIX commands use file and directory names as arguments, which are presumed to exist in the current
directory

Ex: $ cat login.sql

works only if login.sql is in the current directory

• If you are placed in /usr and want to access login.sql in /home/kumar, the full pathname of the file has to be used

$ cat /home/kumar/login.sql
• If the first character of the pathname is /, the file's location is determined with respect to the root;
this is an absolute pathname

• When there is more than one / in the pathname, you descend one level in the file system for each /
Using absolute pathname for a command

• When date command is specified the system has to locate the file date from a list of directories
specified in PATH variable

Eg: /bin/date

Fri Aug 13 12:00:00 IST 2010

• If the location of a particular command is known the complete path can precede the command

• For any command that resides in a directory specified in the PATH variable, the absolute
pathname need not be used

• If the command is in some other directory, the absolute pathname must be specified: /usr/local/bin/less


Relative pathname
• Relative pathname – uses either the current or parent directory as reference and specifies path
relative to it
$cd progs // progs is in current directory
$cd progs/scripts
• A relative pathname using dot (.) and double dots (..) notation:
• (.) single dot: this represents the current directory
• (..) double dots: this represents the parent directory
• (..) Used with cd can be used to move to parent directory
Ex: $pwd
/home/kumar/progs/data/text
$cd ..
$pwd
/home/kumar/progs/data
• cd .. changes your directory to the parent of the current directory.
• Any number of .. can be combined with cd, separated by /
$ pwd
/home/kumar/pics
$ cd ../..
$ pwd
/home

• (.) refers to the current directory.
• Any command that takes the current directory as an argument works with a single dot.
$ cp ../sharma/.profile .
copies the file to the current directory
$ cd progs and $ cd ./progs are equivalent.
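The dot and double-dot notation can be tried safely in a scratch directory. This is a sketch under assumptions: the directory from mktemp stands in for /home/kumar, and mkdir -p (which creates missing parent directories) is used only to set up the tree.

```shell
base=$(mktemp -d) || exit 1          # scratch directory standing in for /home/kumar
cd "$base" || exit 1
base=$(pwd)                          # normalised absolute pathname of the scratch directory
mkdir -p progs/data/text             # build progs/data/text in one step

cd progs/data/text                   # relative to the current directory
cd ../..                             # up two levels: now in progs
two_up=$(pwd)

cd ./data                            # same as: cd data
in_data=$(pwd)

echo "$two_up"
echo "$in_data"

cd / && rm -r "$base"                # clean up the scratch tree
```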
Directory Commands
pwd
cd
mkdir
rmdir

pwd- checking current directory

UNIX places a user in a specific directory of the file system when logged in
The user can navigate from one directory to another, but at any point of time the user is located in only one
directory, known as the current directory

Command to know the current directory – pwd (print working directory)

Ex: $pwd

pwd displays absolute pathname.


cd : Changing the current directory
Navigating around in the file system can be done with the cd command.
When used with an argument, cd changes the current directory to the directory specified as the argument
cd used with a relative pathname
$ pwd
/home/kumar
$ cd progs
$ pwd
/home/kumar/progs
cd used with absolute pathname
$pwd
/home/kumar/progs
$cd /bin
$pwd
/bin
cd can be used without arguments, in which case it returns the user immediately to the home
directory
$cd /home/sharma
$pwd
/home/sharma
$cd
$pwd
/home/kumar
cd simply changes and does not show the current directory.
mkdir – Making Directories
Directories are created with mkdir command
The command is followed by name of the directories to be created
$mkdir pkt
a directory pkt is created under the current directory
Can create a number of directories with one mkdir command
$mkdir pdk dbs doc
Three directories created.
UNIX system lets us create a directory tree with just one invocation of the
command
$mkdir pis pis/progs pis/data
Creates 1 directory pis and 2 subdirectories under pis
The order of the arguments is important
You can't create a subdirectory before its parent directory exists
Create the parent directory first, then the subdirectories.

An error occurs if the order is not followed
$ mkdir pis/data pis/progs
<error: pis has not been created yet>
Sometimes the system refuses to create a directory

Reasons:
The directory already exists
There may be an ordinary file by that name in the current directory
Permissions set for the current directory don't permit the creation of files and directories by
the user.
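The ordering rule can be observed in a scratch directory. A minimal sketch, reusing the names pis, progs and data from the notes; the variables wrong_order and created are illustrative.

```shell
work=$(mktemp -d) || exit 1
cd "$work" || exit 1

# Subdirectories first: fails because the parent pis does not exist yet.
if mkdir pis/data pis/progs 2>/dev/null; then
  wrong_order=succeeded
else
  wrong_order=failed
fi

# Parent first, then its subdirectories: all three are created.
mkdir pis pis/progs pis/data
created=$(ls pis | tr '\n' ' ')

echo "subdirectories before parent: $wrong_order"
echo "contents of pis: $created"

cd / && rm -r "$work"       # clean up
```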
rmdir – Removing Directory
rmdir command removes directories
$rmdir pis
rmdir can remove more than one directory at once
$rmdir pis/data pis/progs pis
When a directory and subdirectories need to be deleted a reverse logic is applied
Sequence followed is subdirectories first then the parent directory
$rmdir pis pis/progs pis/data
<error>
The error message leads to 2 rules to be remembered when deleting directories
You cannot delete a directory using the rmdir command unless it is empty. pis could not be removed
because of progs and data under it.
You cannot remove a subdirectory unless you are placed in a directory that is hierarchically above the
one you have chosen to remove. (Some modern systems, Linux among them, relax this second rule and allow an empty current directory to be removed.)
Try removing progs while being in progs
$ cd progs
$ pwd
/home/kumar/pis/progs
$ rmdir /home/kumar/pis/progs
<error>
To remove this directory, place yourself in a directory above progs
$ cd /home/kumar/pis
$ pwd
/home/kumar/pis
$ rmdir progs
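The first rule (a directory must be empty before rmdir removes it) can be demonstrated in a scratch directory. This sketch reuses the names from the notes; parent_first and after are illustrative variables.

```shell
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
mkdir pis pis/progs pis/data

# Parent first: refused, because pis still contains progs and data.
if rmdir pis 2>/dev/null; then
  parent_first=removed
else
  parent_first=refused
fi

# Subdirectories first, then the parent: every directory is empty when its turn comes.
rmdir pis/data pis/progs pis
if [ -d pis ]; then after=present; else after=gone; fi

echo "rmdir pis on a non-empty directory: $parent_first"
echo "pis after removing in the right order: $after"

cd / && rmdir "$work"       # the scratch directory is empty again
```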
How files/directories are created and removed?
A file (ordinary or directory) is associated with a name and a number called the inode
number.
When a directory is created, an entry comprising these 2 parameters is made in the file's
parent directory
The entry is removed when the directory (or ordinary file) is removed.
ls: listing files in current directory

The ls command in UNIX is used to list files and directories within the file system

1. Basic Listing
Command:
$ ls

Description: Lists files and directories in the current directory.

$ ls /home/user
Description: Lists files and directories in the /home/user directory

The files are arranged alphabetically with uppercase having precedence over lower.
Common Options:
-l: Long listing format. Displays detailed information about each file, including
permissions, number of links, owner, group, size, and timestamp.

-a: Lists all files, including hidden files (those starting with a dot).

-h: Human-readable format. Makes file sizes easier to read (e.g., 1K, 234M, 2G).

-R: Recursively lists subdirectories.

-t: Sorts files by modification time, newest first.

-S: Sorts files by size, largest first.


Long Listing Format
Command:
ls -l

Description: Provides detailed information about each file and directory.


Example Output:
-rw-r--r-- 1 user group 1234 Aug 12 20:00 file.txt
drwxr-xr-x 2 user group 4096 Aug 12 20:00 directory

Show Hidden Files


Command:
ls -a

Description: Lists all files, including hidden files (those starting with a dot).
Example Output:
.bashrc .profile file.txt directory
Human-Readable Sizes
•Command:

ls -lh
•Description: Shows file sizes in a human-readable format (e.g., KB, MB).

•Example Output:
drwxr-xr-x 2 user group 4.0K Aug 12 20:00 directory
-rw-r--r-- 1 user group 1.2K Aug 12 20:00 file.txt
Combine Options
Command:
ls -alh

Description: Combines multiple options to list all files (including hidden ones) in long
format with human-readable sizes.

Example Output:
drwxr-xr-x 5 user group 4.0K Aug 12 20:00 .
drwxr-xr-x 3 user group 4.0K Aug 12 20:00 ..
-rw-r--r-- 1 user group 1.2K Aug 12 20:00 .bashrc
drwxr-xr-x 2 user group 4.0K Aug 12 20:00 directory
-rw-r--r-- 1 user group 1.2K Aug 12 20:00 file.txt
Consider the following directory structure:

ls -R /home/user

Output:
/home/user:
dir1 file1.txt file4.txt

/home/user/dir1:
dir2 file2.txt

/home/user/dir1/dir2:
file3.txt
The ls -x command in Linux lists directory contents in columns, sorted horizontally.

This means that the files and directories are displayed in rows across the terminal window,
rather than in a single column or sorted vertically.
The ls -F command in Linux is used to append a character to each file name indicating the
file type. This makes it easier to distinguish between different types of files and directories
at a glance.

Here’s what the appended characters mean:

/ for directories
* for executable files
@ for symbolic links
| for FIFOs (named pipes)
= for sockets
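A short sketch of ls -F on a scratch directory containing one entry of each common type. The names docs, run.sh, notes.txt and link are made up for the illustration.

```shell
work=$(mktemp -d) || exit 1
cd "$work" || exit 1

mkdir docs                                  # a directory
printf '#!/bin/sh\necho hi\n' > run.sh
chmod +x run.sh                             # an executable file
: > notes.txt                               # an ordinary (empty) file
ln -s notes.txt link                        # a symbolic link

listing=$(ls -F)                            # one name per line when output is captured
echo "$listing"

cd / && rm -r "$work"                       # clean up
```

The listing marks docs with /, run.sh with * and link with @, while notes.txt carries no suffix.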
#!/bin/bash
# This script displays the current user, date and time, and the contents of the
# current directory

echo "Current user: $(whoami)"
echo "Current date and time: $(date)"
echo "Contents of the current directory:"
ls -l
ls

• ls
• LS
• ls chap*
• ls -l
• ls -l chap*

• ls -d list the directory itself, not its contents
• ls -l list in long format, showing permissions
• ls -la list in long format, including hidden files
• ls -s list file sizes in blocks
• ls -S sort by file size
• ls -t sort by modification time
ls options

• ls -x output in multiple columns
• ls -F identify directories and executables
• ls -a list all files, including hidden files starting with .
• ls -x dir_name list directory contents
• ls -xR recursive listing
File Related Commands
Some of the common file handling commands are:
cat
mv
rm
cp
wc
od
cat - Displaying and creating files
cat is used to display the contents of small file on the terminal
$cat dept.lst
01|accounts|4532
02|progss|3532
The name "cat" is short for "concatenate": the command can concatenate multiple files.
The contents of second file are shown immediately after the first
$cat chap01 chap02

cat options (-v and -n)

Displaying nonprinting characters (-v)
cat is normally used for displaying text files only
Executables, when seen with cat, simply display junk
To display nonprinting ASCII characters, use cat with the -v option
Numbering lines (-n)
The -n option numbers each line of output, which helps in debugging programs
The vi editor can also show line numbers

Using cat to create a file
cat is also useful for creating files
To create a file, enter the cat command followed by the > (right chevron) character and the
filename. The cat command then waits for user input. When the user presses [Ctrl-d], it signifies
end of input to the system.
$ cat > foo
[Ctrl-d]
$ _
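In a script there is no keyboard to press [Ctrl-d] on, so a here-document can play the role of the typed input. This is a sketch with made-up text; the file name foo follows the notes.

```shell
work=$(mktemp -d) || exit 1
cd "$work" || exit 1

# Everything up to the EOF marker is fed to cat as if it had been typed,
# and > sends cat's output to the file foo.
cat > foo <<'EOF'
first line typed by the user
second line typed by the user
EOF

cat foo                                   # display the new file
numbered=$(cat -n foo)                    # -n prefixes each line with its number
echo "$numbered"
line_count=$(wc -l < foo | tr -d ' ')

cd / && rm -r "$work"                     # clean up
```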
cp - Copying a file
cp(copy) copies a file or group of files. It creates an exact image of the file on disk with different
name
The syntax requires at least two filenames (source and destination) to be specified:
cp fork.c fork.c.bak

Even though we used simple filenames here, both source and destination can also be pathnames.

If the destination file (fork.c.bak) doesn’t exist, cp first creates it.

Otherwise, it simply overwrites the file without any warning. So check with ls whether the
destination file exists before you use cp.
cp - Copying a file

When both are ordinary files, the first one is copied to the second
cp chap01 unit1

If there is only one file to be copied the destination can be either an ordinary or directory
cp chap01 progs/unit1 //chap01 copied to unit1 under progs
cp chap01 progs // chap01 retains its name under progs
cp - Copying a file

cp is often used with the shorthand notation . (dot) to signify the current directory as destination
cp /home/sharma/.profile .profile // destination is a file
cp /home/sharma/.profile . // destination is the current directory

cp can also copy more than one file. The last filename must then be a directory
cp chap01 chap02 chap03 progs // progs directory must exist; cp won't create it
cp chap* progs // copies all files beginning with chap

cp overwrites the destination file without warning if it exists! Run ls before you use cp unless you
are sure that the destination file doesn't exist or deserves to be overwritten.
cp - Options
Interactive copying (-i)
The -i option warns the user before overwriting the destination file.
If unit1 exists, cp prompts for a response
$ cp -i chap01 unit1
cp: overwrite unit1 (yes/no)? y
A y at this prompt overwrites the file and any other response leaves it uncopied

Copying directory structures (-R): The -R (recursive) option can be used to copy an
entire directory tree. This command copies all files and subdirectories in progs to
newprogs:
$ cp -R progs newprogs // newprogs must not exist
If you have a directory structure like this:

progs/
├── file1.txt
├── file2.txt
└── subdir/
    └── file3.txt

After running cp -R progs newprogs, you will get:

newprogs/
├── file1.txt
├── file2.txt
└── subdir/
    └── file3.txt

Assuming newprogs already exists and has some files:

newprogs/
└── existingfile.txt

After running cp -R progs newprogs, the structure will be:

newprogs/
├── existingfile.txt
└── progs/
    ├── file1.txt
    ├── file2.txt
    └── subdir/
        └── file3.txt
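Both cases can be reproduced by running the same cp -R command twice in a scratch directory. A minimal sketch using the file names from the trees above; first_run and second_run are illustrative variables.

```shell
work=$(mktemp -d) || exit 1
cd "$work" || exit 1

mkdir -p progs/subdir                 # -p creates progs and progs/subdir together
: > progs/file1.txt
: > progs/file2.txt
: > progs/subdir/file3.txt

# First run: newprogs does not exist, so it becomes a copy of progs.
cp -R progs newprogs
if [ -f newprogs/subdir/file3.txt ]; then first_run=copied; else first_run=missing; fi

# Second run: newprogs now exists, so progs is copied inside it.
cp -R progs newprogs
if [ -d newprogs/progs/subdir ]; then second_run=nested; else second_run=flat; fi

echo "first run: $first_run, second run: $second_run"

cd / && rm -r "$work"                 # clean up
```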
rm - Deleting files
The rm command deletes one or more files
It should be used with caution: a file once deleted can't be recovered
rm chap01 chap02 chap03 // removes three files

rm won't remove a directory, but it can remove files from one. For example, to remove two
chapters from the progs directory without having to "cd" to it:

rm progs/chap01 progs/chap02 or rm progs/chap0[12]

To delete all files in the current directory use

$ rm * // All files gone!
$ _
When you delete files in this manner, the system won't warn you that all files in
the directory will be deleted before removing them. The $ prompt will return silently.
rm Options
Interactive deletion (-i)
Like cp, the -i option makes the command ask the user for confirmation before removing
each file
$ rm -i chap01 chap02 chap03
rm: remove chap01 (yes/no)? y
rm: remove chap02 (yes/no)? n
rm: remove chap03 (yes/no)? [Enter] // no response, file not deleted

A y removes the file; any other response leaves the file undeleted.
Recursive deletion (-r or -R)
With this option rm performs a tree walk, a thorough recursive search for all directories and files within the
subdirectories. At each stage it deletes everything it finds.

rm normally won't remove directories, but with this option it will.
$ rm -r * // behaves partially like rmdir
This deletes all files in the current directory and all its subdirectories. If you don't have a
backup, these files will be lost forever.

Forcing removal (-f)
rm prompts for removal if a file is write-protected. The -f option overrides this minor
protection and forces removal.
When you combine it with the -r option, it could be the most dangerous thing that you've ever
done:
$ rm -rf * // Deletes everything in the current directory and below

If you don't have a backup, these files will be lost forever. Note that this command deletes
hidden files in all directories except the current directory, because * does not match names beginning with a dot.
Make sure you are doing the right thing before you use rm *

Be doubly sure before you use rm -rf *

The first command removes only ordinary files in the current directory.

The second one removes everything—files and directories alike.

If the root user (the superuser) invokes rm -rf * in the / directory, the entire
UNIX system will be wiped out from the hard disk!
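The effect of rm -rf * on hidden files can be observed safely inside a throwaway directory. This is a sketch only: the guards on mktemp and cd make sure the command never runs anywhere else, and all file names are made up.

```shell
work=$(mktemp -d) || exit 1      # stop if no scratch directory could be created
cd "$work" || exit 1             # stop rather than delete files somewhere else

mkdir sub
: > chap01
: > .hidden                      # hidden file in the current directory
: > sub/chap02
: > sub/.hidden2                 # hidden file one level down

rm -rf *                         # * expands to chap01 and sub only

remaining=$(ls -A)               # -A lists hidden entries except . and ..
echo "left behind: $remaining"

cd / && rm -r "$work"            # clean up
```

Only .hidden survives: sub/.hidden2 is removed along with its directory, while the dot file in the current directory was never matched by *.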
mv - Renaming files
mv command renames (moves) files. It has two distinct functions
It renames a file (or directory)
It moves a group of files to a different Directory
It does not create a copy of the file it merely renames it. If destination file doesn’t exist, it will
be created.
mv chap01 man01 // renames the file chap01 to man01

Group of files can be moved to a directory


mv chap01 chap02 chap03 progs // moves three files to progs directory
mv can also be used to rename a directory
mv pis perdir // renames directory pis to perdir

mv also supports a -i option which makes it behave interactively


What is Directory status after cp, mv and rm?
cp, mv and rm work by modifying the directory entries of the files they access.

cp adds an entry into the directory with the name of the destination file and inode number that is
allotted by kernel.

mv replaces the name of an existing directory entry without disturbing its inode number.

rm removes the file's entry (the filename and its inode number) from the directory.
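The directory entries can be inspected with ls -i, which prints the inode number in front of each filename. A minimal sketch in a scratch directory; inode_of is a hypothetical helper defined here, and the file names follow the earlier examples.

```shell
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
echo data > chap01

# Hypothetical helper: print only the inode number of the named file.
inode_of() { ls -i "$1" | awk '{print $1}'; }

orig=$(inode_of chap01)

cp chap01 unit1                 # new directory entry with a newly allotted inode
copy=$(inode_of unit1)

mv chap01 man01                 # the entry gets a new name, the inode stays the same
moved=$(inode_of man01)

rm unit1                        # the entry for unit1 disappears from the directory
if [ -e unit1 ]; then unit1_entry=present; else unit1_entry=gone; fi

echo "original: $orig  copy: $copy  after mv: $moved  unit1: $unit1_entry"

cd / && rm -r "$work"           # clean up
```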
wc - Counting lines, words & characters
It takes one or more filenames as arguments and display a four-columnar output
$ cat infile
I am the wc command
I count characters, words and lines
$wc infile
2 11 56 infile
Line: group of characters not containing a new line
Word: group of characters not containing space tab or newline
Character: smallest unit of information and includes space , tab, and newline
wc offers three options to make a specific count
-l : counts lines
-w : counts words
-c : counts characters
$ wc -l infile
$ wc -w infile
$ wc -c infile

When used with multiple filenames, wc produces a line for each file as well as a total count

$ wc chap01 chap02
305 4058 23456 chap01
550 1234 13452 chap02
855 5292 36908 total
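The counts for infile can be reproduced with a short sketch. The file is recreated with printf from the two lines shown in the notes; redirecting the file into wc (wc -l < infile) suppresses the filename so that only the number is captured.

```shell
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
printf 'I am the wc command\nI count characters, words and lines\n' > infile

wc infile                                   # lines, words, characters, filename

lines=$(wc -l < infile | tr -d ' ')         # tr strips the padding some systems add
words=$(wc -w < infile | tr -d ' ')
chars=$(wc -c < infile | tr -d ' ')
echo "lines=$lines words=$words chars=$chars"

cd / && rm -r "$work"                       # clean up
```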


od: Displaying data in octal
• Many files contain nonprinting characters, and most UNIX commands don't display them
properly
$ more odfile
White space includes a
The ^G character rings a bell
The ^L character skips a page
$ cat -v odfile
White space includes a
The ^G character rings a bell
The ^L character skips a page

The od command is particularly useful for debugging scripts, examining binary files, or
visualizing non-human-readable data.
visualizing non-human-readable data.
The od command displays the octal value of each byte of its input.
The -b option displays this value for each character separately
$ od -b odfile
0000000 127 150 151 164 145 040 163 160
When the -b and -c options are combined the output is friendlier: each character is printed below its octal value
$ od -bc odfile
0000000 127 150 151 164 145 040 163 160 141 143 145 040 151
          W   h   i   t   e       s   p   a   c   e       i

The tab character [Ctrl-i] is shown as \t and its octal value is 011

The bell character [Ctrl-g] is shown as 007, some systems show it as \a

The formfeed character [Ctrl-l] is shown as \f and 014.

The linefeed character [Ctrl-j] is shown as \n and 012. Now od makes the
newline character visible too.
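A small sketch that feeds a known string to od, so the tab and newline can be seen among ordinary characters. The input a, tab, b, newline is made up for the illustration.

```shell
# printf turns \t and \n into a real tab and a real newline before od sees them.
dump=$(printf 'a\tb\n' | od -bc)
echo "$dump"
```

The octal row of the dump contains 011 for the tab and 012 for the newline, and the character row beneath it shows them as \t and \n.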
Comparing Files

You’ll often need to compare two files. They could be identical, in which case you may
want to delete one of them.
Two configuration files may have small differences, and knowledge of these differences
could help you understand why one system behaves differently from another.
UNIX supports three commands—cmp, diff, and comm—that compare two files and
present their differences.
cmp: Byte-by-Byte Comparison

cmp makes a comparison of each byte of two files and terminates the moment it encounters a
difference
cmp file1 file2

If the files are identical, cmp returns no output and exits with a status of 0.
If the files differ, it reports the byte and line number where the first difference occurs and
exits with a status of 1
Let's say we have two files, file1.txt and file2.txt.
file1.txt:
Hello, World! This is a test file.

file2.txt:
Hello, World! This is a test file with a difference.

To compare these files, you can use the cmp command as follows:
$ cmp file1.txt file2.txt
Output
file1.txt file2.txt differ: byte 34, line 1

This output indicates that the first difference between the two files occurs at byte 34, on line 1: file1.txt has a period there, while file2.txt has a space.
Print Differing Bytes:
$ cmp -b file1.txt file2.txt
This shows the differing bytes in the output.
file1.txt file2.txt differ: byte 34, line 1 is 56 . 40

56: The octal value of the byte in file1.txt at position 34.
40: The octal value of the byte in file2.txt at position 34.
In this case, the difference is at byte 34, where file1.txt has a period (.) and file2.txt has a
space.
cmp: Byte-by-Byte Comparison
echo $?
0 means the files are identical.
1 means the files differ.

Verbose Mode:
cmp -l file1.txt file2.txt
This will list all differing bytes.

cmp -lb a.txt b.txt


34 56 . 40
35 40 167 w
36 12 ^J 151 i
cmp: EOF on a.txt after byte 36

If the two files are identical, cmp displays no message but simply returns the prompt.
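The exit status makes cmp convenient in scripts. The sketch below recreates the two example files with printf (each as a single line) and adds an identical copy, copy.txt, which is a made-up name.

```shell
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
printf 'Hello, World! This is a test file.\n' > file1.txt
printf 'Hello, World! This is a test file with a difference.\n' > file2.txt
cp file1.txt copy.txt                      # an identical copy

msg=$(cmp file1.txt file2.txt)             # message about the first differing byte
differ_status=$?                           # 1: the files differ

cmp file1.txt copy.txt                     # prints nothing
same_status=$?                             # 0: the files are identical

echo "$msg"
echo "different files: $differ_status, identical files: $same_status"

cd / && rm -r "$work"                      # clean up
```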
comm: What Is Common?
While cmp compares two files character by character, comm compares them line by line and
displays the common and differing lines.

Also, comm requires both files to be sorted.

By default, it displays in three columns:


Column 1 Lines unique to the first file.
Column 2 Lines unique to the second file.
Column 3 Lines common (hence its name) to both files.
cat > a.txt
Line1
Line2
Line4
Line7

cat > b.txt
Line1
Line3
Line4
Line9

comm a.txt b.txt
                Line1
Line2
        Line3
                Line4
Line7
        Line9
This output provides a good summary to look at but is not of much use to other commands
that work on single-column input.
comm can produce single-column output using the options -1, -2, or -3.

To drop a particular column, simply use its column number as an option prefix. You can
also combine options; for instance, -12 displays only those lines that are common to both files.
comm -3 foo1 foo2 # Selects lines not common to both files
comm -13 foo1 foo2 # Selects lines present only in second file
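The column-dropping options can be tried on the two sorted example files. A minimal sketch that recreates a.txt and b.txt with printf in a scratch directory; the variable names are illustrative.

```shell
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
printf 'Line1\nLine2\nLine4\nLine7\n' > a.txt
printf 'Line1\nLine3\nLine4\nLine9\n' > b.txt

common=$(comm -12 a.txt b.txt)     # drop columns 1 and 2: lines in both files
only_a=$(comm -23 a.txt b.txt)     # drop columns 2 and 3: lines only in a.txt
only_b=$(comm -13 a.txt b.txt)     # drop columns 1 and 3: lines only in b.txt

echo "common:";      echo "$common"
echo "only in a.txt:"; echo "$only_a"
echo "only in b.txt:"; echo "$only_b"

cd / && rm -r "$work"              # clean up
```

Each result is plain single-column text, so it can be piped into other commands.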
diff: Converting One File to Another

diff is the third command that can be used to display file differences.

Unlike its fellow members, cmp and comm, it also tells you which lines in one file have to
be changed to make the two files identical.

When used with two files, it produces a detailed output:

diff a.txt b.txt


1d0
< This is file number 1
2a2
> This is file number 3
Line Numbers:
The numbers before and after the letters (e.g., 1d0, 2a2) refer to line numbers in the files being
compared.
The first number refers to the line number in the first file (a.txt), and the second number refers to
the line number in the second file (b.txt).
Operation Codes:
d (delete): A line in the first file needs to be deleted to match the second file.
c (change): A line in the first file needs to be changed to match the second file.
a (add): A line from the second file needs to be added to the first file to match the second file.
Summary
d: Delete a line from the first file.
c: Change a line in the first file to match the second file.
a: Add a line from the second file to the first file.
cat > a.txt
This is file number 1
This is file number 2

cat > b.txt
This is file number 2
This is file number 3

1d0: This means that line 1 in the first file should be deleted to match the second file.
< This is file number 1: The < symbol indicates that the line "This is file number 1" is present in the
first file but not in the second file. The 0 after the d means that the deleted line would have come before line 1 of b.txt,
that is, it does not exist in b.txt at all.
2a2: This means that after line 2 in the first file, you should append line 2 of the second file.
> This is file number 3: The > symbol indicates that the line "This is file number 3" is
present in the second file but not in the first file.

If you are simply interested in knowing whether two files are identical or not, use cmp without
any options.
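The a.txt and b.txt example can be reproduced in a scratch directory. This sketch recreates both files with printf and captures diff's output together with its exit status, which is 1 when the files differ.

```shell
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
printf 'This is file number 1\nThis is file number 2\n' > a.txt
printf 'This is file number 2\nThis is file number 3\n' > b.txt

out=$(diff a.txt b.txt)        # instructions for turning a.txt into b.txt
status=$?                      # 0: identical, 1: different

echo "$out"
echo "exit status: $status"

cd / && rm -r "$work"          # clean up
```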
diff file1.txt file2.txt
2d1
< custard apple
4c3
< guava
---
> grapes
5a5
> kiwi

2d1
2: Line number in file1.txt.
d: Delete operation.
Explanation: The second line in file1.txt (custard apple) is deleted to match file2.txt.

4c3
4: Line number in file1.txt.
c: Change operation.
3: Line number in file2.txt.
Explanation: The fourth line in file1.txt (guava) is changed to the third line in file2.txt (grapes).

5a5
5: Line number in file1.txt.
a: Add operation.
5: Line number in file2.txt.
Explanation: After the fifth line in file1.txt, a new line (kiwi) is added to match file2.txt.
diff file2.txt file1.txt
1a2
> custard apple
3c4
< grapes
---
> guava
5d5
< kiwi

1a2:
This means that after line 1 in file2.txt, there is an additional line (line 2) in file1.txt that is not
present in file2.txt.
The line to be added to file2.txt is custard apple.

3c4:
This indicates a change between line 3 in file2.txt and line 4 in file1.txt.
The line grapes in file2.txt is changed to guava, as in file1.txt.

5d5:
This means that line 5 in file2.txt is deleted.
The line deleted from file2.txt is kiwi.
End of Module 2
