Data Structure
Structure:
1. Introduction
2. Objective
3. Presentation of Contents
3.1 Elementary Data Organization
3.2 Data Structures
3.2.1 Why Data Structure?
3.2.2 Types of Data Structures
3.3 Abstract Data Type
3.4 Data Types, Data Structures and Abstract Data Types
3.5 Operations on Data Structures
3.6 Algorithm
3.6.1 Analysis of Algorithm
3.6.2 Complexity
3.6.2.1 Asymptotic Analysis
3.6.2.2 Tradeoff between space and time complexity
3.6.3 Measuring the Running Time of a Program
3.6.4 Time Complexity
3.6.5 Space Complexity
3.6.6 Comparison of Complexities
4. Summary
5. Suggested Readings/Reference Material
6. Self Assessment Questions (SAQ)
1. Introduction
Data structures are the building blocks of a program, much like the pillars of a large building. If a
program is built on unsuitable data structures, it may not always work as expected, so it is
important to choose the right data structures for a program.
When software is developed, its time and space requirements are essential parameters that it
must meet. A program that takes too long to produce its output may never be used, and the same
holds for space: a program should not occupy more than a specified amount of memory. These two
parameters are technically termed time complexity and space complexity. A program/algorithm must
Data structures enable a programmer to structure a program in such a way that the data are
2. Objectives
At the end of this chapter the reader should understand the basic concept of a data
structure and know the various primitive and composite data
structures. The common operations performed on different data structures
are also explained in this chapter. Because data structures are closely tied to algorithms, it
is important to understand how to design an efficient algorithm. This chapter discusses time and space
complexity in detail, which helps in creating efficient algorithms. Overall, the reader
3. Presentation of Contents
Data are simply values or sets of values. A data
item refers to a single unit of data. Data can be either raw or processed. Raw data items can be divided
into two kinds, i.e. group items and elementary items. For example, the name of a person may be
treated as a group item, as it is a combination of first name, middle name and surname, while an
employee number in an employee record cannot be divided further, which is why it is known as
If we arrange data in an appropriate sequence, they form a structure and acquire a
meaning. This meaning is called information. Information is also known as processed data.
Data can be represented as a hierarchy in computer science. This hierarchy comprises fields,
records and files. To understand this hierarchy, let us introduce some more concepts.
An entity is something that has certain properties or attributes, each of which may be assigned a
value that is either numeric or non-numeric. For example, an employee may be regarded
as an entity whose attributes are name, age, sex, etc. Entities with similar attributes such as all
The way the data are organized into the hierarchy above reflects the relationship between
attributes, entities and entity sets. Thus we can say that a field is a single elementary unit of
This organization of data into fields, records and files is not sufficient to maintain and process all
types of data collections efficiently. For this reason, data are also organized into more
complex types of structures. The study of these complex types of structures forms the basis for
(3) Quantitative analysis of the structure, which determines time and space complexity
Data structure can be defined in a number of ways, some of the definitions of data
(3) When we define a data structure we are in fact creating a new data type of our
own.
b. Such new types are then used to reference variables type within a program
But the most useful definition of data structures is: “the logical or mathematical model of a
The choice of a particular data model depends on two considerations. First, it must be rich
enough to represent the actual relationships of the data in the real world. Second, the structure should
be simple enough that the data can be processed effectively whenever required.
3.2.1 Why Data Structure?
Following are some of the reasons for using the data structures:
Data structures study how data are stored in a computer so that operations can be
implemented efficiently.
Data structures are especially important when you have a large amount of information.
Data structures provide conceptual and concrete ways to organize data for efficient storage and manipulation.
Data structures can be categorized in two different ways viz. Primitive and Non-
Primitive Data Structure: Strictly speaking, primitive data structures are not data structures at all; they
are the primitive data types. A primitive data structure is a basic structure and it is directly operated upon the
In computer science, primitive data type can refer to either of the following concepts:
A built-in type is a data type for which the programming language provides built-in
support.
Classic basic primitive types may include Character, Integer, Floating-point number, Boolean,
Pointer, etc.
INTEGERS: Quantities that are discrete in nature can be represented by
integers. For example, 2, 4, 6, 0, -23 and -56 are integers, but 2.5 and -34.56 are not.
FLOAT: Float is a simple data type which typically takes 4 bytes in memory and to which we can assign decimal
CHARACTERS: A character is the literal representation of an element selected from an
alphabet, and is written in single quotes. There is a wide variety of character
sets; two widely used ones are EBCDIC and ASCII. For example: '*', 'r',
POINTERS: A pointer is a reference to a data structure: a simple type of variable which
stores the address of another variable. A pointer is a single fixed-size data item.
Here x is an integer, p is a pointer of integer type, and p points to x.
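The relationship between x and p described above can be sketched in C as follows. This is only a minimal illustration: the function name pointer_demo and the values 10 and 5 are assumptions chosen for the example and do not come from the original text.

```c
#include <stdio.h>

/* Hypothetical demo: x is an integer, p is a pointer of integer type,
   and p points to x. */
int pointer_demo(void)
{
    int x = 10;      /* an ordinary integer variable          */
    int *p = &x;     /* p stores the address of x             */

    *p = *p + 5;     /* writing through p changes x itself    */
    printf("x = %d, *p = %d\n", x, *p);
    return x;        /* x is now 15 */
}
```

Because p holds the address of x, the expressions x and *p always name the same storage location.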
BOOLEAN: Boolean is a data type which has only two possible values, i.e. true or false.
Composite (Non-Primitive) Data Structure: A composite data structure is built from primitive
data structures. While primitive data structures are the basic building blocks, composite data
Linear Data Structure: A data structure is said to be linear if its elements or items form a sequence,
one after the other. Linear data structures include various data structures such as string, array,
data type storing a sequence of data values, usually bytes, in which elements usually stand for
characters according to a character encoding, which differentiates it from the more general array
data type.
known as an array, one of the simplest data structures. Arrays hold a series of data elements,
usually of the same size and data type. Individual elements are accessed by their position in the
array. The position is given by an index, which is also called a subscript. The index usually uses
a consecutive range of integers, but the index can have any ordinal set of values.
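Access by index can be illustrated with a short C sketch. It assumes C's convention that the index range starts at 0; the function name sum_array is hypothetical and chosen only for this example.

```c
#include <stddef.h>

/* Returns the sum of the n elements of a, visiting positions 0 .. n-1. */
long sum_array(const int a[], size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += a[i];   /* a[i] is the element at index (subscript) i */
    }
    return total;
}
```

For an array declared as int marks[5], the valid subscripts are 0 through 4, and marks[2] is the third element.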
Some arrays are multi-dimensional, meaning they are indexed by a fixed number of integers.
Generally, one- and two-dimensional arrays are the most common. Most programming languages
Linked List: In computer science, a linked list is one of the fundamental data structures used in
computer programming. It consists of a sequence of nodes, each containing arbitrary data fields
and one or two references pointing to the next and/or previous nodes. A linked list is a
self-referential data type because it contains a link to another datum of the same type. Linked lists
permit insertion and removal of nodes at any point in the list in constant time, provided the node at
that point has already been located, but do not allow random access.
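A minimal sketch of a singly linked list in C is given here, assuming integer data. The names node, insert_after, remove_after and nth are illustrative assumptions, not taken from the text. Insertion and removal touch only a fixed number of links, while reaching the k-th node requires walking the list from its head.

```c
#include <stdlib.h>

/* A node of a singly linked list: one data field and one link. */
struct node {
    int data;
    struct node *next;   /* link to another node of the same type */
};

/* Inserts a new node holding value right after pos. Returns the new
   node, or NULL if memory could not be allocated. */
struct node *insert_after(struct node *pos, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->data = value;
    n->next = pos->next;
    pos->next = n;
    return n;
}

/* Removes and frees the node that follows pos, if there is one. */
void remove_after(struct node *pos)
{
    struct node *victim = pos->next;
    if (victim != NULL) {
        pos->next = victim->next;
        free(victim);
    }
}

/* No random access: the k-th node (0-based) is found by walking from
   head. Returns NULL if the list has fewer than k + 1 nodes. */
struct node *nth(struct node *head, int k)
{
    while (head != NULL && k > 0) {
        head = head->next;
        k--;
    }
    return head;
}
```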
Fig. 1.2: Linked-List Representation
There are various types of linked lists such as Singly-linked List, Doubly-linked List,
Stack: A stack is a linear structure in which items may be added or removed only at one end.
There are frequent situations in computer science when one wants to restrict insertions
and deletions so that they can take place only at the beginning or at the end of the list, not in the
middle. Two of the data structures that are useful in such situations are stacks and queues. A
stack is a list of elements in which an element may be inserted or deleted only at one end, called
the top. This means, in particular, that elements are removed from a stack in the reverse order of
that in which they were inserted into the stack. A stack is also called a "last-in first-out" (LIFO) list.
Special terminology is used for the two basic operations associated with a stack:
1. "Push" is the term used to insert an element into a stack.
2. "Pop" is the term used to delete an element from a stack.
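The two stack operations can be sketched in C with an array-based stack. This is one possible representation, not the only one; the capacity STACK_MAX and the names stack_init, push and pop are assumptions made for the example.

```c
#include <stdbool.h>

#define STACK_MAX 100   /* capacity chosen arbitrarily for the example */

struct stack {
    int items[STACK_MAX];
    int top;            /* index of the top element, -1 when empty */
};

void stack_init(struct stack *s)
{
    s->top = -1;
}

/* Push: inserts value at the top. Returns false on overflow. */
bool push(struct stack *s, int value)
{
    if (s->top == STACK_MAX - 1)
        return false;
    s->items[++s->top] = value;
    return true;
}

/* Pop: deletes the top element and stores it in *out.
   Returns false if the stack is empty (underflow). */
bool pop(struct stack *s, int *out)
{
    if (s->top == -1)
        return false;
    *out = s->items[s->top--];
    return true;
}
```

Pushing 1, 2 and 3 and then popping three times yields 3, 2 and 1, which is the last-in first-out order.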
Queue: A queue is a linear list of elements in which deletions can take place only at one end,
called the front, and insertions can take place only at the other end, called the rear. The terms front and
rear are used in describing a linear list only when it is implemented as a queue.
Queues are also called "first-in first-out" (FIFO) lists, since the first element in a queue will be
the first element out of the queue. In other words, the order in which elements enter a queue is
the order in which they leave. A real-life example: the people waiting in a line at a railway
ticket counter form a queue, where the first person in the line is the first person to be waited on.
programs with the same priority form a queue while waiting to be executed.
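The first-in first-out behaviour of a queue can be sketched in C with a circular array. The capacity QUEUE_MAX and the names queue_init, enqueue and dequeue are assumptions made for this illustration; a linked representation would work equally well.

```c
#include <stdbool.h>

#define QUEUE_MAX 100   /* capacity chosen arbitrarily for the example */

struct queue {
    int items[QUEUE_MAX];
    int front;   /* index of the element that leaves next */
    int count;   /* number of elements currently stored   */
};

void queue_init(struct queue *q)
{
    q->front = 0;
    q->count = 0;
}

/* Insertion takes place at the rear. Returns false if the queue is full. */
bool enqueue(struct queue *q, int value)
{
    if (q->count == QUEUE_MAX)
        return false;
    int rear = (q->front + q->count) % QUEUE_MAX;
    q->items[rear] = value;
    q->count++;
    return true;
}

/* Deletion takes place at the front. Returns false if the queue is empty. */
bool dequeue(struct queue *q, int *out)
{
    if (q->count == 0)
        return false;
    *out = q->items[q->front];
    q->front = (q->front + 1) % QUEUE_MAX;
    q->count--;
    return true;
}
```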
Non-Linear Data Structure: A non-linear data structure represents a hierarchical relationship between
individual data items. Every data item may be attached to several other data items in a way that
reflects their relationships. The data items are not arranged in a sequential structure. Examples:
trees and graphs.
Trees: Data frequently contain a hierarchical relationship between various elements. The
non-linear data structure which reflects this relationship is called a rooted tree graph or, simply, a tree. This
A tree consists of a distinguished node r, called the root, and zero or more (sub)trees t1, t2, ..., tn,
In the tree of figure 1.5, the root is r. Node t2 has r as a parent and t2.1, t2.2 and t2.3 as children.
Each node may have an arbitrary number of children, possibly zero. Nodes with no children are
known as leaves.
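Because a node may have any number of children, one common way to store a tree in C is the first-child/next-sibling representation. The sketch below uses it to count leaves; the structure and function names are assumptions made for the example, and this representation is one choice among several.

```c
#include <stddef.h>

/* Each node points to its leftmost child and to its next sibling, so a
   node can have an arbitrary number of children. */
struct tree_node {
    const char *label;
    struct tree_node *first_child;    /* NULL for a leaf            */
    struct tree_node *next_sibling;   /* NULL for the last child    */
};

/* Counts the leaves (nodes with no children) in the tree rooted at t. */
int count_leaves(const struct tree_node *t)
{
    if (t == NULL)
        return 0;
    if (t->first_child == NULL)
        return 1;

    int leaves = 0;
    const struct tree_node *c;
    for (c = t->first_child; c != NULL; c = c->next_sibling)
        leaves += count_leaves(c);
    return leaves;
}
```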
Graph: A graph consists of a set of vertices (nodes) and a set of edges. Each edge in a graph is
specified by a pair of vertices. A vertex n is incident to an edge x if n is one of the two nodes in
the ordered pair of nodes that constitute x. The degree of a node is the number of arcs incident to
it. The indegree of a node n is the number of arcs that have n as the head, and the outdegree of n
A graph is a non-linear data structure. The graph shown in figure 1.6 has 7 vertices
and 12 edges. The vertices are {1, 2, 3, 4, 5, 6, 7} and the arcs are {(1,2), (1,3), (1,4), (2,4),
(2,5), (3,4), (3,6), (4,5), (4,6), (4,7), (5,7), (6,7)}. Node 4 in figure 1.6 has indegree 3,
Fig. 1.6: Graph Representation
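The degrees of the nodes in this graph can be checked with a short C sketch that stores the 12 arcs listed above as an edge list. It assumes that each ordered pair (u, v) is an arc leaving u (the tail) and entering v (the head); the names arc, ARCS, indegree and outdegree are chosen for the example.

```c
#include <stddef.h>

/* An arc (u, v) leaves its tail u and enters its head v. */
struct arc {
    int tail;
    int head;
};

/* The 12 arcs listed in the text for the graph of figure 1.6. */
static const struct arc ARCS[] = {
    {1, 2}, {1, 3}, {1, 4}, {2, 4}, {2, 5}, {3, 4},
    {3, 6}, {4, 5}, {4, 6}, {4, 7}, {5, 7}, {6, 7}
};
static const size_t ARC_COUNT = sizeof ARCS / sizeof ARCS[0];

/* Number of arcs that have n as the head. */
int indegree(int n)
{
    int d = 0;
    for (size_t i = 0; i < ARC_COUNT; i++)
        if (ARCS[i].head == n)
            d++;
    return d;
}

/* Number of arcs that have n as the tail. */
int outdegree(int n)
{
    int d = 0;
    for (size_t i = 0; i < ARC_COUNT; i++)
        if (ARCS[i].tail == n)
            d++;
    return d;
}
```

With this arc list, node 4 is the head of (1,4), (2,4) and (3,4) and the tail of (4,5), (4,6) and (4,7).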
Abstract data types (ADTs) are a model used to understand the design of a data
structure. An ADT specifies the type of data stored and the operations supported on that data. Viewing
a data structure as an ADT allows a programmer to focus on an idealized model of the data and
its operations.
We can think of an abstract data type (ADT) as a mathematical model with a collection of
operations defined on that model. Sets of integers, together with the operations of union,
Although the terms "data type", "data structure" and "abstract data type" sound alike,
they have different meanings. In a programming language, the data type of a variable is the set of
values that the variable may assume. For example, a variable of type boolean can assume either
the value true or the value false, but no other value. The basic data types vary from language to
language; in C they include int, float and char. The rules for constructing composite data types
An abstract data type is a mathematical model, together with various operations defined on the
model. We shall design algorithms in terms of ADTs, but to implement an algorithm in a given
programming language we must find some way of representing the ADTs in terms of the data
types and operators supported by the programming language itself. To represent the
mathematical model underlying an ADT we use data structures, which are collections of
The data appearing in our data structure is processed by means of certain operations. In
fact, the particular data structure that one chooses for a given situation depends largely on the
frequency with which specific operations are performed. The following four operations play a
major role:
Traversing: Accessing each record exactly once so that certain items in the record may
Searching: Finding the location of the record with a given key value, or finding the
There are two more operations, which can be used in special situations and are discussed below:
Sorting: Arranging the records in some logical order (e.g. alphabetically in ascending or
key, etc.)
Merging: Combining the records in two different sorted files into a single sorted file.
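Merging can be sketched in C as follows. For brevity the example uses two sorted integer arrays in place of sorted files of records; that substitution and the name merge_sorted are assumptions made for the illustration.

```c
#include <stddef.h>

/* Merges the sorted arrays a (na items) and b (nb items) into out,
   which must have room for na + nb items. The result is sorted. */
void merge_sorted(const int a[], size_t na,
                  const int b[], size_t nb, int out[])
{
    size_t i = 0, j = 0, k = 0;

    /* Repeatedly copy the smaller of the two current items. */
    while (i < na && j < nb)
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];

    /* One input is exhausted; copy what remains of the other. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Each item is copied exactly once, so the number of steps is proportional to na + nb.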
Some other operations such as copying and concatenation may also be performed on some data
structures.
3.6 Algorithm
terminates with the production of correct output from the given input. Algorithms may be
written in pseudo code that resembles programming languages like C and Pascal.
An algorithm should have five basic characteristic features such as Input, Output, Definiteness,
Once we have a suitable mathematical model for our problem, we can attempt to find a solution
in terms of that model. Our initial goal is to find a solution in the form of an algorithm, which is
a finite sequence of instructions, each of which has a clear meaning and can be performed with a
finite amount of effort in a finite length of time. An integer assignment statement such as
algorithm instructions can be executed any number of times, provided the instructions
themselves indicate the repetition. However, we require that, no matter what the input values
may be, an algorithm terminates after executing a finite number of instructions. Thus, a program
Step 1: FACT = 1
Step 2: for i = 1 to n do
Specification: Computes n!.
Pre-condition: n >= 0
Post-condition: FACT = n!
For better understanding conditions can also be defined after any statement, to specify
Pre-condition and post-condition can also be defined for loop, to define conditions
A condition that remains true before as well as after the execution of the ith iteration of a loop is called a
"loop invariant".
Moreover, these conditions can also be used to give a proof of correctness.
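The factorial algorithm and its conditions can be sketched in C as shown here. The multiplication inside the loop is an assumption, since that step is not shown in the algorithm text above; the function name factorial and the use of assert to check the pre-condition are likewise choices made for this illustration.

```c
#include <assert.h>

/* Computes n!. The result overflows unsigned long for large n
   (beyond n = 12 where unsigned long is 32 bits wide). */
unsigned long factorial(int n)
{
    assert(n >= 0);                  /* pre-condition: n >= 0        */

    unsigned long fact = 1;          /* Step 1: FACT = 1             */
    for (int i = 1; i <= n; i++) {   /* Step 2: for i = 1 to n do    */
        /* loop invariant before the ith iteration: fact == (i-1)!  */
        fact = fact * (unsigned long)i;
        /* loop invariant after the ith iteration: fact == i!       */
    }
    return fact;                     /* post-condition: fact == n!   */
}
```

When the loop ends after the nth iteration, the invariant gives fact == n!, which is exactly the post-condition; this is the kind of argument a correctness proof uses.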
resources (such as time and storage) that are utilized by computer to execute. Most algorithms
When solving a problem, a frequent situation is to choose among various algorithms. On what
2. We would like an algorithm that makes efficient use of the computer's resources, especially
one that runs as fast as possible and uses minimum space as well.
3.6.2 Complexity
Complexity refers to the rate at which the required storage or consumed time
grows as a function of the problem size. The absolute growth depends on the machine used to
execute the program, the compiler used to construct the program, and many other factors. We
would like to have a way of describing the inherent complexity of a program (or piece of a
program), independent of machine/compiler considerations. This means that we must not try to
describe the absolute time or storage needed. We must instead concentrate on a “proportionality”
approach, expressing the complexity in terms of its relationship to some known function. This
type of analysis is known as asymptotic analysis. It may be noted that we are dealing with
complexity of an algorithm not that of a problem. For example, the simple problem could have
Asymptotic analysis is based on the idea that as the problem size grows,
the complexity can be described as a simple proportionality to some known function. This idea is
incorporated in the Big O, Omega and Theta notation for asymptotic performance.
For example, we may have to choose a data structure that requires a lot of storage in order to
reduce the computation time. Therefore, the programmer must make a judicious choice from an
informed point of view. The programmer must have some verifiable basis based on which a data
We will learn about various techniques to bound the complexity function. In fact, our aim is not to
count the exact number of steps of a program or the exact amount of time required to execute
an algorithm. In the theoretical analysis of algorithms, it is common to estimate complexity in the
asymptotic sense, i.e., to estimate the complexity function for reasonably large input length n.
To measure the performance of the algorithm underlying a computer program, our approach is
therefore based on the asymptotic measure of complexity. Big-O notation, omega notation (Ω) and
theta notation (Θ) are used for this purpose, the most common being big-O notation. The
asymptotic analysis of algorithms is often used because the time
taken to execute an algorithm varies with the input size n and with other factors which may differ from
computer to computer and from run to run. The essence of these asymptotic notations is to
bound the growth of the time complexity by a function for sufficiently large input.
To talk about growth rates of functions we use what is known as big-oh notation. For example,
when we say the running time T(n) of some program is O(n²), read as "big oh of n squared", we
mean that there are positive constants c and n0 such that for n equal to or greater than n0, we
have T(n) ≤ cn².
Example: Suppose T(0) = 1, T(1) = 4, and in general T(n) = (n + 1)². Then we see that T(n) is
O(n²), as we may let n0 = 1 and c = 4. That is, for n ≥ 1, we have (n + 1)² ≤ 4n², as the reader
In what follows, we assume all running-time functions are defined on the nonnegative integers,
and their values are always nonnegative, although not necessarily integers. We say that T(n) is
O(f(n)) if there are constants c and n0 such that T(n) ≤ cf(n) whenever n ≥ n0. A program whose
When we say T(n) is O(f(n)), we know that f(n) is an upper bound on the growth rate of T(n). To
specify a lower bound on the growth rate of T(n) we can use the notation T(n) is Ω(g(n)), read
"big omega of g(n)", to mean that there exists a positive constant c such that T(n) ≥ cg(n)
Example: To verify that the function T(n) = n³ + 2n² is Ω(n³), let c = 1. Then T(n) ≥ cn³ for n =
0, 1, . . ..
For another example, let T(n) = n for all odd n ≥ 1 and T(n) = n²/100 for all even n ≥ 0. To verify
that T(n) is Ω(n²), let c = 1/100 and consider the infinite set of n's: n = 0, 2, 4, 6, . . ..
Asymptotic notation
= O(n³)
Example: f(n) = n² + 3n + 4 is O(n²), since n² + 3n + 4 < 2n² for all n > 10.
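The inequality used in this example can be checked in one line. The derivation below is a sketch that takes c = 2 and n0 = 11 as the constants in the definition of big-O; any larger constants would serve equally well.

```latex
\[
n > 10 \;\Longrightarrow\; 3n + 4 \;<\; 3n + n \;=\; 4n \;<\; n \cdot n \;=\; n^2
\;\Longrightarrow\; n^2 + 3n + 4 \;<\; n^2 + n^2 \;=\; 2n^2 .
\]
```

The first step uses 4 < n and the second uses 4 < n again, both of which hold whenever n > 10.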
By definition of big-O, 3n + 4 is also O(n²), O(n³) and above, too, but as a convention, we use the
1. f(n) = O(f(n)) for any function f. In other words, every function is bounded by itself.
2. a_k n^k + a_{k−1} n^{k−1} + · · · + a_1 n + a_0 = O(n^k) for all k ≥ 0 and for all a_0, a_1, . . . , a_k ∈ R. In other
words, every polynomial of degree k can be bounded by the function n^k. Smaller order terms can
3. The base of a logarithm can be ignored in big-O notation, i.e. log_a n = O(log_b n) for any bases a, b > 1.
4. Any logarithmic function can be bounded by a polynomial, i.e. log_b n = O(n^c) for any b (base of
5. Any polynomial function can be bounded by an exponential function, i.e. n^k = O(b^n) for any
constant b > 1.
6. Any exponential function can be bounded by the factorial function. For example, a^n = O(n!) for
any constant a.
2. The quality of code generated by the compiler used to create the object program,
3. The nature and speed of the instructions on the machine used to execute the program,
The fact that running time depends on the input tells us that the running time of a program should
be defined as a function of the input. Often, the running time depends not on the exact input but
only on the size of the input. A good example is the process known as sorting. In a sorting
problem, we are given as input a list of items to be sorted, and we are to produce as output the
same items, but smallest (or largest) first. For example, given 2, 1, 3, 1, 5, and 8 as input we
might wish to produce 1, 1, 2, 3, 5, and 8 as output. The latter list is said to be sorted smallest
first. The natural size measure for inputs to a sorting program is the number of items to be sorted,
or in other words, the length of the input list. In general, the length of the input is an appropriate
size measure, and we shall assume that measure of size unless we specifically state otherwise.
It is customary, then, to talk of T(n), the running time of a program on inputs of size n.
For example, some program may have a running time T(n) = cn², where c is a constant. The units
of T(n) will be left unspecified, but we can think of T(n) as being the number of instructions
For many programs, the running time is really a function of the particular input, and not just of
the input size. In that case we define T(n) to be the worst case running time, that is, the
maximum, over all inputs of size n, of the running time on that input. We also consider Tavg(n),
the average, over all inputs of size n, of the running time on that input. While Tavg(n) appears a
fairer measure, it is often fallacious to assume that all inputs are equally likely. In practice, the
average running time is often much harder to determine than the worst-case running time, both
because the analysis becomes mathematically intractable and because the notion of average input
frequently has no obvious meaning. Thus, we shall use worst-case running time as the principal
can do so meaningfully.
The RAM (random access machine) model, devised by John von Neumann, is used to analyze
algorithms, and we will analyze algorithms on the basis of this model. According to this model
• Loops and subroutine calls are not simple operations, but depend upon the size of the data and
The complexity of algorithms using big-O notation can be defined in the following way for a
problem of size n:
• Constant-time method is order 1 : O(1). The time required is constant independent of the input
size.
• Linear-time method is order n: O(n). The time required is proportional to the input size. If the
input size doubles, then, the time to run the algorithm also doubles.
• Quadratic-time method is order n squared: O(n²). The time required is proportional to the
square of the input size. If the input size doubles, then the time required increases
fourfold.
• Logarithmic-time method is order log n: O(log n). The time required is proportional to the log
• Linear Logarithmic-time method is order nlogn: O(n log n). The time required is proportional
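The orders listed above can be made concrete by counting how often the innermost statement of a simple loop runs. The C functions below are illustrative sketches with hypothetical names; each returns its step count for an input of size n, and halving n at every step is assumed as the typical source of the log n factor.

```c
/* Each function returns how many times its innermost statement runs. */

unsigned long steps_constant(unsigned long n)       /* O(1) */
{
    (void)n;            /* the count does not depend on n */
    return 1;
}

unsigned long steps_logarithmic(unsigned long n)    /* O(log n) */
{
    unsigned long count = 0;
    while (n > 1) {     /* n is halved on every pass */
        n /= 2;
        count++;
    }
    return count;
}

unsigned long steps_linear(unsigned long n)         /* O(n) */
{
    unsigned long count = 0;
    for (unsigned long i = 0; i < n; i++)
        count++;
    return count;
}

unsigned long steps_linearithmic(unsigned long n)   /* O(n log n) */
{
    unsigned long count = 0;
    for (unsigned long i = 0; i < n; i++)
        for (unsigned long m = n; m > 1; m /= 2)
            count++;
    return count;
}

unsigned long steps_quadratic(unsigned long n)      /* O(n^2) */
{
    unsigned long count = 0;
    for (unsigned long i = 0; i < n; i++)
        for (unsigned long j = 0; j < n; j++)
            count++;
    return count;
}
```

Doubling n from 8 to 16 leaves the constant count unchanged, adds one step to the logarithmic count, doubles the linear count and quadruples the quadratic count.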
The process of analysis of algorithm (program) involves analyzing each step of the algorithm. It
Statement 1;
Statement 2;
...
...
Statement k;
The total time can be found out by adding the times for all statements:
It may be noted that time required by each statement will greatly vary depending on whether
each statement is simple (involves only basic operations) or otherwise. Assuming that each of the
above statements involve only basic operation, the time for each simple statement is constant and
In this example, assume the statements are simple unless noted otherwise.
if-then-else statements
if (cond) {
    sequence of statements 1
}
else {
    sequence of statements 2
}
In this if-else statement, either sequence 1 or sequence 2 will execute, depending on
the boolean condition. The worst-case time is therefore the slower of the two possibilities. For
example, if sequence 1 is O(n²) and sequence 2 is O(1), then the worst-case time for the whole
if-then-else statement is O(n²).
for (i = 0; i < n; i++) {
    sequence of statements
}
Here, the loop executes n times, so the sequence of statements also executes n times. Since we
assume the time complexity of the statements is O(1), the total time for the loop is n * O(1),
which is O(n). The number of statements in the loop body does not matter, as it only changes the running
time by a constant factor and the overall complexity remains O(n).
for (i = 0; i < n; i++) {
    for (j = 0; j < m; j++) {
        sequence of statements
    }
}
Here, we observe that the outer loop executes n times. Every time the outer loop executes, the
inner loop executes m times. As a result, the statements in the inner loop execute a total of
n * m times. Thus, the time complexity is O(n * m). If we modify the loop condition so that
the condition of the inner loop is j < n instead of j < m (i.e., the inner loop also executes n times),
then the total complexity of the two loops is O(n²).
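The n * m count for nested loops can be confirmed with a small C sketch. The function name nested_count is hypothetical, and a counter increment stands in for the sequence of statements.

```c
/* Returns how many times the body of the inner loop executes. */
unsigned long nested_count(unsigned long n, unsigned long m)
{
    unsigned long count = 0;
    for (unsigned long i = 0; i < n; i++) {       /* runs n times        */
        for (unsigned long j = 0; j < m; j++) {   /* runs m times each   */
            count++;                              /* an O(1) statement   */
        }
    }
    return count;                                 /* n * m in total      */
}
```

Calling nested_count(n, n) gives n * n, the quadratic case in which both loops run n times.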
somehow memory becomes a constraint in a program, the program will not execute at all.
Therefore, in that case it becomes a more critical issue than time. Also, it is always assumed to be
For an iterative program, it is usually just a matter of looking at the variable declarations and
The analysis of recursive program with respect to space complexity is more complicated as the
space used at any time is the total space used by all recursive calls active at that time.
Example: Find the greatest common divisor (GCD) of two integers, m and n. The algorithm for
Subtract n from m.
n is the GCD
The space-complexity of the above algorithm is a constant. It just requires space for three
The time complexity depends on the loop and on whether m > n or not. The real
issue is how many iterations take place. The answer depends on both m and n.
Best case: if m = n, then there is just one iteration: O(1). Worst case: if n = 1, then there are m
iterations (equivalently, if m = 1 there are n iterations): O(n).
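A subtraction-based GCD can be sketched in C as follows. The exact loop structure is an assumption, because the algorithm text above is incomplete; the function name gcd_by_subtraction and the iterations parameter are hypothetical and exist only to make the iteration count observable.

```c
#include <stddef.h>

/* GCD by repeated subtraction. Both m and n must be positive.
   If iterations is not NULL, the number of subtractions performed
   is stored there. Only a fixed number of variables is used, so the
   space requirement is constant. */
unsigned gcd_by_subtraction(unsigned m, unsigned n,
                            unsigned long *iterations)
{
    unsigned long count = 0;

    while (m != n) {
        if (m > n)
            m -= n;     /* subtract n from m */
        else
            n -= m;     /* subtract m from n */
        count++;
    }
    if (iterations != NULL)
        *iterations = count;
    return n;           /* n is the GCD */
}
```

In this sketch the case m = n needs no subtraction at all, while n = 1 needs m - 1 subtractions, so the running time ranges from constant to linear in the larger argument.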
The space complexity of a computer program is the amount of memory required for its proper
execution. The important concept behind space required is that unlike time, space can be reused
during the execution of the program. As discussed, there is often a trade-off between the time
O(n²)    Quadratic      Insertion Sort
O(c^n)   Exponential
O(n!)    Factorial
The next table represents comparison of typical running time of algorithm of different order
For n = 5: log n ≈ 3, n = 5, n² = 25, n log n = 15, n³ = 125, 2^n = 32
4. Summary
In this chapter we first discussed the basic concepts of data, records and files, as they
are the building blocks for understanding the concept of a data structure. A data structure is the logical or
mathematical model of a particular organization of data. Data structure is used to represent data
There are two basic types of data structures, i.e. primitive and non-primitive. Primitive data
structures include integer, real, Boolean, character, pointer, etc., while non-primitive data
structures fall into two sub-categories, viz. linear and non-linear data structures. Linear
data structures include array, linked list, stack, queue, etc. and non-linear data structure
We can perform some basic operations on data structures, such as traversing, insertion,
deletion and searching. Some special operations can also be performed on certain data
structures. The concept of abstract data types is also introduced in this chapter.
In the second part of the chapter, the concepts of an algorithm and the analysis of algorithms are explained.
There may be a number of algorithms for solving the same problem, and the question then arises of
how to choose the best one. This is where the analysis of algorithms comes into the picture. A
few considerations apply when picking an algorithm: it should be simple, easy to debug, fast, and efficient
in its use of storage and other computer resources. Complexity is one of the measures that help
us find which algorithm is best. There are two types of complexity, i.e. time and space.
Tradeoffs between time and space complexity are also discussed in this chapter.
The asymptotic notations Big-O, Omega and Theta are also discussed in this chapter. These
notations, especially Big-O, help in expressing the complexities of algorithms. The time & space
Addison-Wesley.
Given an array of n integers, write an algorithm to find the smallest element. Find the
number of instructions executed by your algorithm. What are the time and space
complexities?
Write an algorithm to find the median of n numbers. Find number of instruction executed
in the usual matrix multiplication algorithm. Find the optimal order in which to multiply
the matrices so as to minimize the total number of scalar operations. How would you find
structures.
A professor keeps a class list containing the following data for each student:
(a) State the entities, attributes and entity set of the list.
Suppose a data set S contains n elements. Compare the running time T1 of the linear
search algorithm with the running time T2 of the binary search algorithm when (i) n =
Write an algorithm which finds the location of the largest and second largest element in
an array.
Suppose P(n) = a_0 + a_1 n + a_2 n^2 + … + a_m n^m; that is, suppose degree P(n) = m. Prove that P(n) =
O(n^m).
(a) 30n² (b) 10n³ + 6n² (c) 5n log n + 30n (d) log n + 3n (e) log n + 32
printMultiplicationTable(int max){
for (i = 1; i <= n; i *= 2)
j = 1;
What is the running time of the above program segment in big O notation?
Describe the various asymptotic notations and explain them with the help of
examples.
Lesson : 2 Writer : Dr. Pardeep Kumar Mittal
Structure:
1. Introduction
2. Objective
3. Presentation of Contents
3.7 Implementation of Strings in C
4. Summary
1. Introduction
We have all used text processing programs such as Microsoft Word, WordPerfect, or
PageMaker to prepare documents and to search existing documents for prespecified words or phrases.
However, we often fail to recognize that this type of effort involves the use of a very specific
constant or as some kind of variable. A string is generally understood as a data type and is often
Depending on the programming language and the precise data type used, a variable declared to be a
string may either cause storage in memory to be statically allocated for a predetermined
maximum length or employ dynamic allocation to allow it to hold a variable number of elements.
2. Objectives
At the end of this chapter the reader should understand the concept of strings
and know the various operations that can be applied to strings. The
various storage mechanisms for strings are also discussed in this chapter. One of the
important concepts in string processing, pattern matching, is discussed in detail. C
programs for handling various operations on strings are also explained in this chapter.
3. Presentation of Contents
alphabets, digits and special characters. The representation of strings may vary from language to
language. The basic terms that are associated with strings are length, substring, concatenation,
etc.
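These three basic terms can be illustrated in C, where a string is an array of characters ending in a null character. The sketch below relies only on the standard functions strlen, strcpy, strcat and strstr; the wrapper names string_length, concatenate and find_substring are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Length: the number of characters in s, blank spaces included. */
size_t string_length(const char *s)
{
    return strlen(s);
}

/* Concatenation: writes first followed by second into out, which has
   room for cap characters including the terminating null character.
   Returns 0 on success and -1 if the result would not fit. */
int concatenate(char *out, size_t cap,
                const char *first, const char *second)
{
    size_t need = strlen(first) + strlen(second) + 1;
    if (need > cap)
        return -1;
    strcpy(out, first);
    strcat(out, second);
    return 0;
}

/* Substring: returns the 0-based position at which pattern first
   occurs in text, or -1 if pattern is not a substring of text. */
long find_substring(const char *text, const char *pattern)
{
    const char *hit = strstr(text, pattern);
    return hit != NULL ? (long)(hit - text) : -1;
}
```

Note that concatenation does not add a blank on its own: to obtain a space between two words, the space must be part of one of the operands.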
A string consists of a finite sequence of zero or more characters. The number of characters
that a string contains is known as its length. A string is said to be the empty string if its length is
zero. Strings may be denoted by single quotes (') or double quotes (") depending on a particular
Here the length of the string is 5. The string '' has length 0 and the string 'India is a great
country' has length 24. One must be careful while calculating the length of the string as
Another basic operation applied on strings is string concatenation. Although there is no standard
symbol for string concatenation, we will use // to indicate the concatenation operation.
Concatenation is simply the joining of two strings in the specified order. For example,
As is clearly visible from the result, if we want to insert a blank space in between, we have to
One more basic operation that should be understood before going into the details of string
processing is substring. A substring, as its name indicates, is a contiguous part of the original
string. In the example given above, 'Hello', ' ' and 'India' are all substrings of the concatenated
string. 'Hello' is also known as an initial substring as no characters come before it, while 'India' is
The maximum length of a substring can be the length of the string itself, while the minimum
Strings in some programming languages such as C are stored as arrays of characters,
while in some other languages such as Java they are stored as classes and in some other
languages such as VB they are defined as a data type. Whatever may be the way to store the
Record-Oriented, Fixed-Length Storage: In this type of storage, the length of the string is
assumed to be fixed and each string is treated as a record. Each record therefore
has the same length. An example of such storage is a program in which the maximum
size of a line is 80 characters; each line in the program is treated as a record.
This type of storage is the oldest and simplest method to store strings. The main advantage of
this method is its simplicity. But this method has a number of disadvantages, such as
(I) Entire record is to be read even if most of the storage consists of blank spaces in the end.
(II) Some of the records may need more space than the fixed-length.
(III) When a modification is required which consists of more or fewer characters than
Variable-Length, Fixed-Maximum Storage: One of the major problems of the previous method
was that the complete record has to be read even if there are blank spaces in the last part of the
record. This method removes that problem. The storage in this method can be
done in two ways: (I) using a marker such as @ to signal the end of the string, or (II) using a pointer
array which stores the length of each string and points to each string.
1 India is a great country@
2 I like my country@
3 Hello@
.
.
.
n Bye@
Fig. 2.1(a) Records with markers
Although this type of storage solves the problem of reading unnecessary blank spaces,
the maximum length is still fixed. Therefore any string longer than this fixed maximum
cannot be stored here. Contiguous memory space is also required to store such
strings.
Linked Storage: The major problem with the earlier two methods is the fixed maximum length. Nowadays
computers are used very frequently for word processing, and editing is one of the basic
necessities of a word processor. The fixed length in the above two methods makes editing
difficult. A better method for storing strings is therefore linked storage.
In linked storage, data can be placed anywhere in memory and is connected through links. The major
advantage of such storage is that consecutive space is not required for storing large strings.
Also, if any modification has to be made to the original string, it can be easily implemented. The
only disadvantage is the space required for storing pointers, but this is far less
problematic than the editing problem of the earlier two methods. Overall, linked storage is a
flexible method for storing strings that must be edited.
I N D
Fig. 2.2 (a) Linked Storage using one character per node
IND IA X
Fig. 2.2 (b) Linked Storage using three characters per node ( X indicates null value)
Character data can be handled by various programming languages in various ways.
Constants: Some programming languages such as C and C++ write a character constant in
single quotes, such as 'A', 'B', etc., and a string constant in double quotes, such as “India is a great
country”. In Java character constants are handled in a similar manner to C, but strings are
handled as a class. In Visual Basic the string is itself a data type and the characters are stored in
str1=”India”
char1=”a”
Variables: Although each programming language follows different rules for forming strings,
string variables can be classified into three categories: static, semistatic and dynamic.
The length of a static character variable is defined before the program is executed and
cannot change during execution. A semistatic character variable is a variable whose length may
vary during the execution of the program, subject to a fixed maximum. A dynamic
character variable is a variable whose length may change freely during the execution of the
program.
str1=”India”
char *str2;
Concatenation: Given two strings s and t, the concatenation operation, written as s // t or s + t,
appends t to s, such that the last character of s is followed by the first character of t and the
Given two strings s and t of length M and N, respectively, the concatenation operation s // t can
1. i = 0; j = 0;
2. while (STR1[i] != '\0')
       i++;
3. while (STR2[j] != '\0')
       STR1[i] = STR2[j];
       i = i+1;
       j = j+1;
4. STR1[i] = '\0';
5. Return STR1;
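The concatenation steps can be sketched in C roughly as follows. The function name str_concat is an assumption made for this illustration, and the sketch assumes that the destination array is large enough to hold both strings and the terminating '\0'.

```c
#include <stddef.h>

/* Appends src to the end of dst and returns dst.
   Assumption: dst has room for both strings plus the terminating '\0'. */
char *str_concat(char *dst, const char *src)
{
    size_t i = 0, j = 0;

    while (dst[i] != '\0')       /* move to the end of the first string */
        i++;
    while (src[j] != '\0') {     /* copy the second string character by character */
        dst[i] = src[j];
        i++;
        j++;
    }
    dst[i] = '\0';               /* terminate the combined string */
    return dst;
}
```

Calling str_concat twice on a buffer that holds "Hello", first with " " and then with "India", leaves "Hello India" in the buffer.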
Length: Given a string s, the length operation returns the number of characters in s. Given s =
"hello", length(s) = 5. Given s = "hello world", length(s) = 11, since leading blank symbols and
1. length = 0, i = 0;
2. while (s[i] != '\0')
       i++;
   length = i;
3. return length;
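A minimal C sketch of the length operation is given below. The name str_length is hypothetical; the standard library function strlen performs the same task.

```c
#include <stddef.h>

/* Counts the characters of s up to, but not including, the terminating '\0'. */
size_t str_length(const char *s)
{
    size_t i = 0;

    while (s[i] != '\0')
        i++;
    return i;
}
```

Blank spaces are ordinary characters here, so str_length("hello world") counts the space as well.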
length less than or equal to N. Accessing a substring from a given string requires three pieces of
information i.e. name of the string, position of the first character of the substring in the given
string and the length of the substring. The substring operation can be written as:
For example, Let String=”India is a great country” then SUBSTRING(string, 12, 5) = “great”.
{ for k = i to j do
output(s[k])
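The SUBSTRING operation can be sketched in C as follows. The function name substring, the caller-supplied output buffer and the NULL return for an invalid range are assumptions made for this illustration; positions are counted from 1, as in the text.

```c
#include <stddef.h>
#include <string.h>

/* Copies len characters of s, starting at 1-based position pos, into out.
   Assumption: out can hold len + 1 characters.
   Returns out, or NULL when the requested range does not lie inside s. */
char *substring(const char *s, size_t pos, size_t len, char *out)
{
    size_t n = strlen(s);

    if (pos < 1 || pos > n + 1 || len > n - (pos - 1))
        return NULL;
    memcpy(out, s + (pos - 1), len);
    out[len] = '\0';
    return out;
}
```

With s = "India is a great country", pos = 12 and len = 5 the buffer receives "great".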
Indexing: Indexing is finding the position of a pattern in a given string. Indexing is also known
INDEX(text, pattern), which returns the first position of the pattern in a given string. If the
For example, Let Text=”India is a great country”, then INDEX(text, “great”) = 12 and
INDEX(text, “greatest”)=0.
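In C, the INDEX operation can be sketched on top of the standard library function strstr. The name index_of is an assumption for this example; the 1-based result and the value 0 for "not found" follow the convention used in the text.

```c
#include <stddef.h>
#include <string.h>

/* Returns the 1-based position of the first occurrence of pattern in text,
   or 0 when the pattern does not occur. */
size_t index_of(const char *text, const char *pattern)
{
    const char *hit = strstr(text, pattern);

    return hit != NULL ? (size_t)(hit - text) + 1 : 0;
}
```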
The string operations discussed above are the basic operations in string
processing, but they are insufficient for word processing. Hence some additional
operations are needed for this purpose. We will be discussing three such operations in this
Insertion: Insertion means we wish to insert a string within given text. Suppose the given text is
denoted by T, the string we wish to insert is S and the position where we wish to insert the string
INSERT(T, K, S)
For example, Let T=”India a great country”, S=”is “ and K=7, then INSERT(T, K, S) = “India is
a great country”.
The string operation discussed in the above section remains the basic building blocks even for
the word processing operations. Insertion operation can be implemented using these basic
operations as follows:
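One possible way to compose INSERT in C from a "first K-1 characters", the string S and "the rest of T" is sketched below. The function name insert_at, the separate output buffer and the NULL return for an invalid position are assumptions of this sketch, not part of the original notation.

```c
#include <stddef.h>
#include <string.h>

/* Builds INSERT(T, K, S) in out: the first K-1 characters of t, then s,
   then the remainder of t. K is 1-based.
   Assumption: out can hold strlen(t) + strlen(s) + 1 characters.
   Returns out, or NULL when K lies outside 1..LENGTH(T)+1. */
char *insert_at(const char *t, size_t k, const char *s, char *out)
{
    size_t n = strlen(t), m = strlen(s);

    if (k < 1 || k > n + 1)
        return NULL;
    memcpy(out, t, k - 1);                                 /* part of T before position K */
    memcpy(out + (k - 1), s, m);                           /* the inserted string S */
    memcpy(out + (k - 1) + m, t + (k - 1), n - (k - 1) + 1); /* rest of T with its '\0' */
    return out;
}
```

For T = "India a great country", K = 7 and S = "is " the buffer receives "India is a great country".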
Deletion: Deletion means we wish to delete a string within given text. Suppose the given text is
denoted by T, the string we wish to delete begins in position K and has length L, then deletion
DELETE(T, K, L)
For example, Let T=”India is a great great country”, DELETE(T, 12, 6) = “India is a great
country”.
+ 1)
Sometimes we are given the text and the pattern to be deleted from the text. In that case, the
first position of the pattern in the text has to be identified and then the above operation can
be applied. The position of the pattern can be identified with the help of the INDEX operation.
Suppose the text is denoted by T and the pattern by P; then the deletion operation can be performed as:
For example, Let T=”India is a great great country”, P = “great ”, then DELETE(T, INDEX(T,
As quite common in word processing, suppose we wish to delete every occurrence of the pattern
1. K = INDEX(T, P)
2. Repeat while K ≠ 0
(b) K = INDEX(T, P)
3. Write T
4. Exit
For example, if T = “India is a great great country” and P = “great ”, then executing the above
As another interesting example, suppose T = “XAAABBBY” and P = “AB”.
Although it appears that the pattern “AB” occurs just once, when we apply the above
algorithm the text T becomes “XAABBY” after the first pass through the loop, which shows
that the pattern “AB” appears again. Continued execution of the algorithm produces the final
output T = “XY”.
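The deletion of every occurrence of a pattern can be sketched in C as below. The names delete_at and delete_all are hypothetical, the text is modified in place, and the library function strstr stands in for INDEX.

```c
#include <stddef.h>
#include <string.h>

/* DELETE(T, K, L): removes L characters of t starting at 1-based position K,
   in place. A request that runs past the end of t is cut off at the end. */
void delete_at(char *t, size_t k, size_t len)
{
    size_t n = strlen(t);

    if (k < 1 || k > n)
        return;
    if (len > n - (k - 1))
        len = n - (k - 1);
    /* shift the tail (including '\0') left over the deleted part */
    memmove(t + (k - 1), t + (k - 1) + len, n - (k - 1) - len + 1);
}

/* Repeatedly deletes the first occurrence of p until p no longer occurs in t. */
void delete_all(char *t, const char *p)
{
    size_t m = strlen(p);
    char *hit;

    if (m == 0)                  /* an empty pattern would never stop matching */
        return;
    while ((hit = strstr(t, p)) != NULL)
        delete_at(t, (size_t)(hit - t) + 1, m);
}
```

Because the search restarts from the beginning after each deletion, occurrences that are created by a deletion are removed as well, which is how “XAAABBBY” shrinks to “XY”.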
Replacement: Replacement means we wish to replace one string with another string within a given
text. Suppose the given text is denoted by T and the string P1 is to be replaced by the string P2,
For example, if T = “India was a great country”, P1 = “was”, and P2 = “is”, then REPLACE(T,
If T = “India is a great country”, P1 = “was”, and P2 = “is”, then REPLACE(T, P1, P2) = “India
is a great country”. As “was” is not existing in the text, there is no change in the original text.
The replace operation cannot be implemented by a single-line operation using the basic string
operations. It can, however, be implemented in three steps:
K = INDEX(T, P1)
T = DELETE(T, K, LENGTH(P1))
T = INSERT(T, K, P2)
This implementation will replace the first occurrence of the pattern P1 with pattern P2. If we
want to replace every occurrence of pattern P1 with pattern P2 in the text, we have to follow the
following algorithm:
1. K = INDEX(T, P1)
2. Repeat while K ≠ 0
3. Write T
4. Exit
For example, if T = “India is a greatest greatest country”, P1 = “greatest”, and P2 = “great”, then
Special care is needed when using the above algorithm with data of the following type:
suppose T = “XAY”, P1 = “A” and P2 = “AB”. Then T = “XABY” after the first pass through the
loop, T = “XABBY” after the second, and so on. Because P2 contains P1, the pattern is found again every time and the algorithm never terminates.
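One common way to avoid this non-termination is to continue searching after the text that has just been inserted instead of searching from the beginning again. The sketch below uses that variant, so it is not the algorithm given above; the name replace_all, the output buffer and its capacity parameter are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Writes to out a copy of t in which every occurrence of p1 is replaced by p2.
   The search resumes after each replaced occurrence, so the function also
   terminates when p2 contains p1. At most cap bytes of out are used; if the
   result does not fit, replacing stops and the remaining text is truncated.
   Returns the number of replacements made. */
size_t replace_all(const char *t, const char *p1, const char *p2,
                   char *out, size_t cap)
{
    size_t m1 = strlen(p1), m2 = strlen(p2);
    size_t used = 0, count = 0, tail;
    const char *hit;

    if (cap == 0)
        return 0;
    while (m1 > 0 && (hit = strstr(t, p1)) != NULL) {
        size_t head = (size_t)(hit - t);      /* text before the occurrence */

        if (used + head + m2 + 1 > cap)
            break;
        memcpy(out + used, t, head);
        memcpy(out + used + head, p2, m2);
        used += head + m2;
        t = hit + m1;                         /* skip the replaced occurrence */
        count++;
    }
    tail = strlen(t);                         /* copy whatever is left */
    if (used + tail + 1 > cap)
        tail = cap - used - 1;
    memcpy(out + used, t, tail);
    out[used + tail] = '\0';
    return count;
}
```

With T = “XAY”, P1 = “A” and P2 = “AB” this sketch stops after one replacement and produces “XABY”.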
query string P in a given text T. Generally the pattern to be searched is smaller than
the given text. There may be more than one occurrence of the pattern P in the text T, and sometimes
we have to find all the occurrences of the pattern in the text. String matching has several
applications, such as text editors and search engines.
Since string-matching algorithms are used extensively, they should be efficient in terms of time
and space. Let P[1..m] be the pattern to be searched, of size m, and let T[1..n] be the given text,
of size n. Assume that the pattern occurs in T at position (or shift) i. Then the output of the
matching algorithm will be the integer i, where 1 <= i <= n-m+1. If there are multiple occurrences of
the pattern in the text, then sometimes it is required to output all the shifts where the pattern
occurs.
Let Si be the substring of T that begins at the i-th position and has the same length as the
pattern P.
We compare P, character by character, with the first substring S1. If all the corresponding
characters are the same, then the pattern P appears in T at shift 1. If some of the characters of
S1 do not match the corresponding characters of P, then we try the next
substring S2. This procedure continues until the input text is exhausted.
1. i = 1;
3. for j= 1 to m
goto step 5
5. i= i + 1
6. exit
The complexity of the brute-force string-matching algorithm is O(nm). On average the inner loop
runs fewer than m times before a mismatch is found. The worst case arises when the
first m-1 characters match for every substring Si, for example when the pattern has the form a^(m-1)b and the text has the form
a^(n-1)b, where a^(n-1) denotes a repeated n-1 times. In this case the inner loop runs exactly m
times for each substring before the mismatch (or, at the last shift, the match) is known, so there are exactly m*(n-m+1)
comparisons.
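A C sketch of the brute-force method, with a counter for the character comparisons, is shown below. The function name brute_force_index and the optional comparisons parameter are assumptions added for illustration; shifts are reported 1-based and 0 means "not found".

```c
#include <stddef.h>
#include <string.h>

/* Returns the 1-based shift of the first occurrence of p in t, or 0.
   If comparisons is not NULL, it receives the number of character
   comparisons that were made. */
size_t brute_force_index(const char *t, const char *p, size_t *comparisons)
{
    size_t n = strlen(t), m = strlen(p);
    size_t count = 0, result = 0;

    if (m >= 1 && m <= n) {
        for (size_t i = 0; i + m <= n; i++) {   /* one pass per substring Si */
            size_t j = 0;

            while (j < m) {
                count++;
                if (t[i + j] != p[j])
                    break;                      /* mismatch: try the next shift */
                j++;
            }
            if (j == m) {                       /* all m characters matched */
                result = i + 1;
                break;
            }
        }
    }
    if (comparisons != NULL)
        *comparisons = count;
    return result;
}
```

For the worst-case shape with P = "aab" (m = 3) and T = "aaaaab" (n = 6), the counter reaches 3*(6-3+1) = 12 comparisons.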
Another Pattern Matching Algorithm: The second pattern matching algorithm uses a table which
is derived from the given pattern P and is independent of the text T. Suppose P = “aaba” and T =
T1T2T3 …, where Ti denotes the i-th character of T and Wi denotes the substring of T that begins at Ti
and has the same length as P. Suppose that the first two characters of T match
those of P, i.e. T = aa.... Then T has one of the following three forms: (i) T = aab..., (ii) T
= aaa..., (iii) T = aax..., where x is any character different from a or b. Suppose we read T3 and
find that T3 = b. Then we next read T4 to see if T4 = a, which would give P = W1. If instead T3 = a,
then obviously P ≠ W1, but we also know that W2 = aa..., i.e., the first two characters of substring
W2 match those of P. Hence we next read T4 to see if T4 = b. Lastly, suppose that T3 = x. Then we
know that P ≠ W1, and we also know that P ≠ W2 and P ≠ W3, since x does not appear in P.
The important point in the above discussion is that, after a mismatch, the comparison can in some cases
resume well ahead in the text (by up to the size of the pattern) rather than restarting from the second character of the text.
Overall we can make a table as in Fig. 2.3, showing all the possibilities. The various
f(Qi, t)
        a     b     x
Q0      Q1    Q0    Q0
Q1      Q2    Q0    Q0
Q2      Q2    Q3    Q0
Q3      P     Q0    Q0
Fig. 2.3 Pattern Matching Table
Algorithm (Pattern Matching) : The pattern matching table F(Q1, T) of a pattern P is in memory,
1. K = 1 and S1 = Q0
3. Read TK.
5. K = K + 1
6. If SK = P, then
INDEX = K – LENGTH(P)
else
INDEX = 0
7. Exit
The complexity of the above algorithm depends on the number of times step 2 is executed. In the worst
case the whole of text T is read and the loop is executed LENGTH(T) times. On this basis, the
complexity of this algorithm is O(n), which is better than the O(nm) of the brute-force
string-matching algorithm.
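The table-driven method can be sketched in C for the pattern P = “aaba”, using the entries of Fig. 2.3. The names table_index, column and MATCHED are assumptions of this sketch; MATCHED plays the role of the entry P in the table, and every character other than a or b is treated as x.

```c
#include <stddef.h>
#include <string.h>

/* States of the table; MATCHED corresponds to the entry P in Fig. 2.3. */
enum { Q0, Q1, Q2, Q3, MATCHED };

/* Column of the table for character c: 0 for a, 1 for b, 2 for any other (x). */
static int column(char c)
{
    return c == 'a' ? 0 : (c == 'b' ? 1 : 2);
}

/* Pattern matching table of Fig. 2.3 for P = "aaba". */
static const int F[4][3] = {
    /*  a        b   x  */
    { Q1,      Q0, Q0 },   /* Q0 */
    { Q2,      Q0, Q0 },   /* Q1 */
    { Q2,      Q3, Q0 },   /* Q2 */
    { MATCHED, Q0, Q0 },   /* Q3 */
};

/* Returns the 1-based position of the first occurrence of "aaba" in t, or 0.
   Each character of t is read at most once. */
size_t table_index(const char *t)
{
    size_t n = strlen(t), k = 0;
    int state = Q0;

    while (state != MATCHED && k < n) {
        state = F[state][column(t[k])];
        k++;
    }
    /* k characters were read and the match ends at the last one read, so it
       starts LENGTH(P) - 1 = 3 characters earlier: 1-based position k - 3. */
    return state == MATCHED ? k - 3 : 0;
}
```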
Some of the most commonly used functions in the string library in C language are:
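#include <stdio.h>
#include <string.h>
int main(void)
{
    /* Example text assumed for illustration; the declaration of s1 is
       missing from the source. It must fit, with the appended text, in s2. */
    char s1[] = "I catch the bus to ";
    char s2[50];
    char *p1;
    /* strcpy copies s1, including the terminating '\0', into s2. */
    strcpy(s2, s1);
    printf("%s %s\n", s1, s2);
    /* strcat appends its second argument to the end of s2. */
    strcat(s2, "Los Angeles,California.");
    printf("%s\n", s2);
    /* The strlen function tells how many characters there are in the string. */
    printf("%lu\n", (unsigned long)strlen(s2));
    /* strchr returns a pointer to the first 'c' in s1, or NULL if there is none. */
    p1 = strchr(s1, 'c');
    if (p1 != NULL)
        printf("[%c][%s]\n", *p1, p1);
    /* strstr returns a pointer to the first "the" in s1, or NULL if there is none. */
    p1 = strstr(s1, "the");
    if (p1 != NULL)
        printf("[%c][%s]\n", *p1, p1);
    /* strcmp returns 0 when the two strings are equal. */
    strcpy(s2, s1);
    if (strcmp(s1, s2) == 0)
        printf("Same.\n");
    else
        printf("Different.\n");
    return 0;
}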
4. Summary
In this chapter we have discussed the basic concept of strings. Various basic operations,
along with the operations required in word processing, are also discussed in this chapter. The
mechanisms of storing strings in computer memory are also discussed. The important
concept of pattern matching has been explained along with its algorithms. In the end the
Addison-Wesley.
Let T be the string “ABCD”. (a) Find the length of T. (b) List all substrings of T. (c) List
Describe various ways of storing strings in computer memory along with advantages and
disadvantages of each.
Suppose T = “India is a great country” and S = “Yes”. Then using the string operations
find (a) LENGTH(T) (b) SUBSTRING(T, 12, 5) (c) INDEX(T, “great”) (d) T // ” “ // S.
What will be the result in following cases:
Consider the pattern P = “aaabb”. Construct the table used in the second pattern matching
algorithm.
For each of the following cases find the number of comparisons to find the index (first
a) P =cat, T = bcbcbcbc
b) P= bbb, T= aabbaabbaabbaabbb
c) P= xxx, T= xyxxyxxxyxxxyxxxxy
What is the complexity of the brute force string-matching algorithm in the best case?
Write a procedure to count the number of times the word 'the' appears in a given text.
Lesson : 3 Writer : Dr. Pardeep Kumar Mittal
Structure:
1. Introduction
2. Objective
3. Presentation of Contents
3.5 Matrices
4. Summary
1. Introduction
One of the basic and important data structures of a program is Array.
Array is a data structure which can represent a collection of elements of same
be stored.
The simplest form of array is a one-dimensional array that may be defined as a
memory locations. For example, an array may contain all integers or all
characters or any other data type, but may not contain a mix of different data
types. Various operations such as traversal, insertion, deletion, searching and
2. Objectives
understand arrays along with its types after reading this chapter. Then, data
computers is also explained. The reader can appreciate the various operations
such as traversal, insertions and deletions along with examples after reading this
chapter. The concept of multidimensional arrays is also discussed at length in
this chapter. One important topic from the point of view of saving storage
space is the sparse matrix, which is discussed in detail. Overall the reader will be able
3. Presentation of Contents
which are all of the same type; it is therefore called a homogeneous structure.
The array is a random-access structure, because all components can be selected
at random and are equally quickly accessible. In order to denote an individual
component, the name of the entire structure is augmented by the index selecting
the component. This index is generally an integer between 0 and n-1, where n is
An individual component of an array can be selected by an index. Given an
array variable x, we can denote an array selector by the array name followed by
the respective component's index i, and can be written as xi or x[i]. Because of
Length = UB – LB + 1
where, UB is the largest index, called the upper bound, and LB is the smallest
that:
A[1]=247, A[2]=56, A[3]=429, A[4]=135, A[5]=87, A[6]=156. Here UB = 6 and
Figure 3.1
Figure 3.2
Declaring an array: The general form for declaring a single dimensional array
is:
where data_type represents data type of the array i.e., integer, char, float etc.,
array_name is the name of array and expression which indicates the maximum
Memory required (in bytes) = size of (data type) X length of array
The first array index value is referred to as its lower bound and in C it is always
We conclude the following facts from these examples while using C:
array is optional.
(ii) Until the array elements are given specific values, they contain
garbage values.
1000
1001
1002
1003
1004
1005
Figure 3.3
The computer calculates the address of any element of an array A using the
float in C the value of W is 4.
For example, consider an array A, which records the sales of a company in each
LOC(A[1832]) = 100, LOC(A[1833]) = 104, LOC(A[1834]) = 108, …….
Let us find the address of the array element for the year K = 1875. It can be
Again, it is clear that the contents of this element can be obtained without
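The address calculation for a linear array is commonly written as LOC(A[K]) = Base(A) + W(K – LB), where W is the number of words per element. A small C sketch of this arithmetic follows; the function name loc is hypothetical, and the values Base(A) = 100, W = 4 and LB = 1832 are read off from the addresses listed above.

```c
/* LOC(A[K]) = Base(A) + W * (K - LB)
   base: address of the first element, w: words per element,
   lb: lower bound of the index, k: index of the wanted element. */
long loc(long base, long w, long lb, long k)
{
    return base + w * (k - lb);
}
```

Under these values, loc(100, 4, 1832, 1875) gives 100 + 4*(1875 – 1832) = 272.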
(e) Sorting: Arranging the elements in some type of order i.e. ascending
or descending.
merging in detail. As the operations of sorting, searching and merging will be
3.3.1 Traversing Linear Arrays
of elements of A with a given property such as finding the values greater than
60. This can be accomplished by traversing A, i.e., by visiting each element of A
exactly once.
The following algorithm can be used to traverse a linear array A having lower
bound LB and upper bound UB. This algorithm traverses A applying an
1. K = LB.
4. K = K + 1
5. Exit.
The above algorithm can also be written using for loop instead of repeat-while
loop as follows:
1. Repeat for K = LB to UB:
Apply PROCESS to A[K].
2. Exit.
indexes from 1 to 10. We have to find the number of students who have got
more than or equal to 60. The following algorithm carries out the given operation
Find the number NUM of students who got more than or equal to 60 marks:
1. NUM = 0.
2. Repeat for K = 1 to 10
3. Return
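This counting traversal can be sketched in C as below. The function name count_at_least and its parameters are assumptions for the example, and C arrays are indexed from 0 rather than from 1.

```c
#include <stddef.h>

/* Visits each of the n elements of a exactly once and counts those
   that are greater than or equal to threshold. */
size_t count_at_least(const int *a, size_t n, int threshold)
{
    size_t num = 0;

    for (size_t k = 0; k < n; k++)
        if (a[k] >= threshold)
            num++;
    return num;
}
```

For an array of 10 marks, count_at_least(marks, 10, 60) returns the number NUM of students with 60 marks or more.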
means the operation of adding another element to the array A, and deleting
discuss the procedure of inserting and deleting an element when A is a linear
array.
Inserting an element at the end of a linear array can be easily done provided the
memory space allocated for the array is large enough to accommodate the
additional element. On the other hand, suppose we need to insert an element in
moved downward to new locations to accommodate the new element and keep
Similarly, deleting an element at the end of an array presents no difficulties, but
For example consider an array A has been declared as a 5-element array but data
have been recorded only for A[1], A[2], and A[3]. Then the array can be
10 20 22
fig.3.4
may conclude that we cannot add any new element to this Linear Array due to
Now consider another example, suppose MARKS is an 8-element linear array,
and suppose five marks are in the array, as in Figure 3.5(a). Assume that the
marks are stored in ascending order, and suppose we want to keep the array
sorted at all times. Then, if we want to insert a new element, we may have to
10 20 30 40
10 20 25 30 40
Now assume that 20 needs to be deleted from the above array, then the new
array would be
10 25 30 40
It can be observed clearly that such movement of data would be very expensive
Insertion Algorithm: The following algorithm inserts a data element ITEM into
K<N.
INSERT(A,N,K,ITEM):
1. I = N
3. A[I+1] = A[I]
4. I = I-1
5. A[K] = ITEM
6. N = N+1
7. Exit
Here the first step is used to store the value of N to another variable. The next
linear array A and assigns it to a variable ITEM with the assumption that K<N.
DELETE(A, N, K, ITEM):
1. ITEM = A[K]
2. Repeat for I = K to N-1
A[I] = A[I+1]
3. N = N – 1.
4. Exit.
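The insertion and deletion procedures for a linear array can be sketched in C as follows. The names array_insert and array_delete, the capacity parameter and the 0/-1 return codes are assumptions of this sketch; positions are 0-based, as is usual in C.

```c
#include <stddef.h>

/* Inserts item at 0-based position k of a[0..*n-1], moving the elements
   from position k onwards one place towards the end.
   Returns 0 on success, -1 if the array is full or k is out of range. */
int array_insert(int *a, size_t *n, size_t capacity, size_t k, int item)
{
    if (*n >= capacity || k > *n)
        return -1;
    for (size_t i = *n; i > k; i--)   /* start from the last element */
        a[i] = a[i - 1];
    a[k] = item;
    (*n)++;
    return 0;
}

/* Removes the element at 0-based position k, stores it in *item and moves
   the following elements one place towards the front.
   Returns 0 on success, -1 if k is out of range. */
int array_delete(int *a, size_t *n, size_t k, int *item)
{
    if (k >= *n)
        return -1;
    *item = a[k];
    for (size_t i = k; i + 1 < *n; i++)
        a[i] = a[i + 1];
    (*n)--;
    return 0;
}
```

Inserting 25 into the sorted marks 10 20 30 40 at position 2 gives 10 20 25 30 40, and then deleting the element at position 1 (the value 20) gives 10 25 30 40.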
way of storing the data.
The linear arrays discussed so far are also called one dimensional array,
since each element in the array is referenced by a single subscript. Now we will
3.4.1 Two-Dimensional Arrays
elements such that each element is specified by a pair of integers (such as I, J),
The element of A with first subscript i and second subscript j will be denoted
by AI,J or A[I, J]
Two-dimensional arrays are generally called matrices in mathematics and tables
in business applications; hence two-dimensional arrays are called matrix arrays.
The schematic of a two-dimensional array of size 3 × 5 is shown in Figure 3.6.
ROW – 1
ROW – 2
ROW – 3
Figure 3.6
For example the scores of five players in five matches can be shown by the fig.
3.7.
Player Match1 Match2 Match3 Match4 Match5
A 29 78 55 0 4
B 65 101 88 9 76
C 0 0 76 199 88
D 9 8 76 44 65
E 44 5 3 9 100
The above fig. 3.7 shows a 5*5 array consisting of 25 elements.
rectangular array of elements with m rows and n columns, the array will be
question arises that how these elements are sequenced i.e. whether the elements
row-major order.
The figures 3.8 (a) & (b) show these two ways when A is a two-dimensional 3 x
4 array.
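The two storage orders lead to two standard address formulas for an M x N array with subscripts starting at 1: LOC(A[J, K]) = Base(A) + W[N(J – 1) + (K – 1)] in row-major order and LOC(A[J, K]) = Base(A) + W[M(K – 1) + (J – 1)] in column-major order. The C sketch below only illustrates this arithmetic; the function names are hypothetical.

```c
/* Row-major order: the rows are stored one after another.
   n_cols is the number of columns N; j and k are 1-based subscripts. */
long loc_row_major(long base, long w, long n_cols, long j, long k)
{
    return base + w * (n_cols * (j - 1) + (k - 1));
}

/* Column-major order: the columns are stored one after another.
   m_rows is the number of rows M; j and k are 1-based subscripts. */
long loc_col_major(long base, long w, long m_rows, long j, long k)
{
    return base + w * (m_rows * (k - 1) + (j - 1));
}
```

For a 3 x 4 array with base 0 and W = 1, the element A[2, 3] lies at offset 4*1 + 2 = 6 in row-major order and at offset 3*2 + 1 = 7 in column-major order.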
Figure 3.8(a) Figure 3.8 (b)
array, a similar situation also holds for any two-dimensional M x N array A i.e.,
SCORE. Suppose Base(SCORE) = 200 and there are W = 4 words per memory
cell and let the programming language store two-dimensional arrays using row-
Two-dimensional arrays clearly show the difference between the logical and
n integers such as K1, K2, …, Kn called subscripts, with the property that
AK1,K2,…,Kn or A[K1, K2, …, Kn]
The array will be stored in memory in a sequence of memory locations. Again,
where data_type is the type of array such as int, char etc., array_name is the
name of array and expr1, expr2, …, exprn are positive valued integer
expressions.
3.5 Matrices
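\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\]
is called an m*n matrix.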
e.g. $\begin{bmatrix} 2 & 3 & 4 \\ 1 & -8 & 5 \end{bmatrix}$ is a 2*3 matrix and $\begin{bmatrix} 2 \\ 7 \\ -3 \end{bmatrix}$ is a 3*1 matrix.
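(b) A matrix having only one column,
\[
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix},
\]
is called a column matrix or column vector.
e.g. $\begin{bmatrix} 2 \\ 7 \\ -3 \end{bmatrix}$ is a column vector of order 3*1 and $\begin{bmatrix} -2 & -3 & -4 \end{bmatrix}$ is a row vector of order 1*3.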
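(c)
\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots &        & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\]
is called a square matrix of order n.
e.g. $\begin{bmatrix} 3 & 9 \\ 0 & -2 \end{bmatrix}$ is a square matrix of order 2.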
(d) If all the elements are zero, the matrix is called a zero matrix or null
matrix, denoted by $O_{m*n}$.
e.g. $\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ is a 2*2 zero matrix, denoted by $O_2$.
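(e) Let $A = [a_{ij}]_{n*n}$ be a square matrix.
(ii) If $a_{ij} = 0$ for all i<j, then A is called a lower triangular matrix.
(iii) If $a_{ij} = 0$ for all i>j, then A is called an upper triangular matrix.
\[
\begin{bmatrix}
a_{11} & 0      & \cdots & 0 \\
a_{21} & a_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\qquad
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0      & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0      & 0      & \cdots & a_{nn}
\end{bmatrix}
\]
e.g. $\begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & 0 & 4 \end{bmatrix}$ is a lower triangular matrix and $\begin{bmatrix} 2 & -3 \\ 0 & 5 \end{bmatrix}$ is an upper triangular
matrix.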
(f) Let $A = [a_{ij}]_{n*n}$ be a square matrix. If $a_{ij} = 0$ for all $i \neq j$, then A is called a
diagonal matrix.
e.g. $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 4 \end{bmatrix}$ is a diagonal matrix.
(a) Two matrices A and B are equal iff they are of the same order and their corresponding
elements are equal, i.e., $[a_{ij}]_{m*n} = [b_{ij}]_{m*n}$ means $a_{ij} = b_{ij}$ for all i, j.
e.g. $\begin{bmatrix} a & 2 \\ 4 & b \end{bmatrix} = \begin{bmatrix} -1 & c \\ d & 1 \end{bmatrix}$ means a = -1, b = 1, c = 2, d = 4.
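defined by
\[
A^T =
\begin{bmatrix}
a_{11} & a_{21} & \cdots & a_{m1} \\
a_{12} & a_{22} & \cdots & a_{m2} \\
\vdots & \vdots &        & \vdots \\
a_{1n} & a_{2n} & \cdots & a_{mn}
\end{bmatrix}_{n*m}
\]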
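Thus, if
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix},
\]
the transpose of A is
\[
A^T = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}^T
    = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix}
\]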
(c) A square matrix A is called a symmetric matrix iff $A^T = A$.
e.g. $\begin{bmatrix} 1 & 3 & -1 \\ 3 & -3 & 0 \\ -1 & 0 & 6 \end{bmatrix}$ is a symmetric matrix and $\begin{bmatrix} 1 & 3 & -1 \\ 0 & -3 & 0 \\ -1 & 3 & 6 \end{bmatrix}$ is not a symmetric
matrix.
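Thus, if
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 4 & 5 & -6 \\ -7 & 8 & 9 \\ 1 & -2 & 3 \end{bmatrix},
\]
\[
C = A + B =
\begin{bmatrix}
a_{11}+b_{11} & a_{12}+b_{12} & a_{13}+b_{13} \\
a_{21}+b_{21} & a_{22}+b_{22} & a_{23}+b_{23} \\
a_{31}+b_{31} & a_{32}+b_{32} & a_{33}+b_{33}
\end{bmatrix}
=
\begin{bmatrix}
1+4 & 2+5 & 3-6 \\
4-7 & 5+8 & 6+9 \\
7+1 & 8-2 & 9+3
\end{bmatrix}
=
\begin{bmatrix}
5 & 7 & -3 \\
-3 & 13 & 15 \\
8 & 6 & 12
\end{bmatrix}
\]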
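For subtraction, we can simply subtract the corresponding elements:
\[
D = A - B =
\begin{bmatrix}
a_{11}-b_{11} & a_{12}-b_{12} & a_{13}-b_{13} \\
a_{21}-b_{21} & a_{22}-b_{22} & a_{23}-b_{23} \\
a_{31}-b_{31} & a_{32}-b_{32} & a_{33}-b_{33}
\end{bmatrix}
=
\begin{bmatrix}
1-4 & 2-5 & 3-(-6) \\
4-(-7) & 5-8 & 6-9 \\
7-1 & 8-(-2) & 9-3
\end{bmatrix}
=
\begin{bmatrix}
-3 & -3 & 9 \\
11 & -3 & -3 \\
6 & 10 & 6
\end{bmatrix}
\]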
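Element-wise addition and subtraction can be sketched in C as follows. The function names mat_add and mat_sub are hypothetical, and the sketch assumes that each m x n matrix is stored row by row in a one-dimensional array.

```c
#include <stddef.h>

/* c = a + b for m x n matrices stored row by row in one-dimensional arrays. */
void mat_add(size_t m, size_t n, const int *a, const int *b, int *c)
{
    for (size_t i = 0; i < m * n; i++)
        c[i] = a[i] + b[i];
}

/* d = a - b for m x n matrices stored row by row in one-dimensional arrays. */
void mat_sub(size_t m, size_t n, const int *a, const int *b, int *d)
{
    for (size_t i = 0; i < m * n; i++)
        d[i] = a[i] - b[i];
}
```

Applied to the matrices A and B of the example, the two functions reproduce the matrices C and D shown above.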
(e) Matrix Multiplication: Let $A = [a_{ik}]_{m*n}$ and $B = [b_{kj}]_{n*p}$. Then the product AB
is defined as the m*p matrix $C = [c_{ij}]_{m*p}$ where
\[
c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{in} b_{nj} = \sum_{k=1}^{n} a_{ik} b_{kj},
\]
i.e.
\[
AB = \left[ \sum_{k=1}^{n} a_{ik} b_{kj} \right]_{m*p}.
\]
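\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix},
\]
\[
AB =
\begin{bmatrix}
a_{11}b_{11}+a_{12}b_{21}+a_{13}b_{31} & a_{11}b_{12}+a_{12}b_{22}+a_{13}b_{32} & a_{11}b_{13}+a_{12}b_{23}+a_{13}b_{33} \\
a_{21}b_{11}+a_{22}b_{21}+a_{23}b_{31} & a_{21}b_{12}+a_{22}b_{22}+a_{23}b_{32} & a_{21}b_{13}+a_{22}b_{23}+a_{23}b_{33} \\
a_{31}b_{11}+a_{32}b_{21}+a_{33}b_{31} & a_{31}b_{12}+a_{32}b_{22}+a_{33}b_{32} & a_{31}b_{13}+a_{32}b_{23}+a_{33}b_{33}
\end{bmatrix}
\]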
In other words, we multiply each of the elements of a row in the left-hand matrix by the
corresponding elements of a column in the right-hand matrix (that’s why the number of elements
in the row and the column must be equal), and then sum the resulting n products to obtain one
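Let
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 4 & 5 & 6 \\ 7 & 8 & 9 \\ 1 & 2 & 3 \end{bmatrix}.
\]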
We first note that multiplication of A by B is allowed because the number of columns in A is the
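as:
\[
C = AB =
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
\begin{bmatrix} 4 & 5 & 6 \\ 7 & 8 & 9 \\ 1 & 2 & 3 \end{bmatrix}
=
\begin{bmatrix}
1\cdot 4+2\cdot 7+3\cdot 1 & 1\cdot 5+2\cdot 8+3\cdot 2 & 1\cdot 6+2\cdot 9+3\cdot 3 \\
4\cdot 4+5\cdot 7+6\cdot 1 & 4\cdot 5+5\cdot 8+6\cdot 2 & 4\cdot 6+5\cdot 9+6\cdot 3 \\
7\cdot 4+8\cdot 7+9\cdot 1 & 7\cdot 5+8\cdot 8+9\cdot 2 & 7\cdot 6+8\cdot 9+9\cdot 3
\end{bmatrix}
=
\begin{bmatrix}
21 & 27 & 33 \\
57 & 72 & 87 \\
93 & 117 & 141
\end{bmatrix}
\]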
order P*N. This algorithm calculates and stores the multiplication A*B into C of
order M*N.
1. Repeat steps 2 to 4 for I = 1 to M
3. C[I, J] = 0
4. Repeat for K = 1 to P
6. Exit.
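A C sketch of the multiplication is given below for A of order M*P and B of order P*N, as in the algorithm. The function name mat_mul is hypothetical, and the sketch assumes that each matrix is stored row by row in a one-dimensional array, so that element [I, K] of A is found at a[i * p + k] with 0-based i and k.

```c
#include <stddef.h>

/* c = a * b, where a is m x p, b is p x n and c is m x n.
   All three matrices are stored row by row in one-dimensional arrays. */
void mat_mul(size_t m, size_t p, size_t n,
             const int *a, const int *b, int *c)
{
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            int sum = 0;                      /* C[I, J] starts at 0 */

            for (size_t k = 0; k < p; k++)    /* row i of a times column j of b */
                sum += a[i * p + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}
```

The three nested loops perform M*N*P multiplications in total.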
Matrices with a relatively high proportion of zero entries are called sparse matrices.
Figure 3.9
A triangular matrix is a square matrix in which all the elements either above or
below the main diagonal are zero. Triangular matrices are sparse matrices. A
tridiagonal matrix is a square matrix in which all the elements are zero except those on the
main diagonal and on the diagonals immediately above and below it.
Let us consider a sparse matrix from storage point of view. Suppose that the
entire sparse matrix is stored. Then, a considerable amount of memory which
stores the matrix consists of zeros. This is nothing but wastage of memory. In
real life applications, such wastage may amount to megabytes. So, an efficient
Figure 3.10
Suppose we want to store triangular array A shown in fig. 3.11. Clearly a lot of
other way is required to store the data in a triangular matrix. Actually, data can
A11 0 0 0 0 0 0
A21 A22 0 0 0 0 0
… … … … … … …
… … … … … … …
… … … … … … …
Fig. 3.11 (Matrix A)
i.e. we have to generate a formula such that B[L] = aJK.
It can easily be calculated by adding the number of elements in the rows before the row
L = 1 + 2 + 3 + … + (J-1) + K = J(J-1)/2 + K
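The index formula can be sketched in C as below for a lower triangular matrix whose non-zero part is packed row by row into a linear array B. The names tri_index and tri_get are hypothetical; J and K are 1-based as in the formula, while the C array itself is 0-based.

```c
#include <stddef.h>

/* Position L (1-based) in the linear array B of the element aJK of a
   lower triangular matrix, valid for 1 <= K <= J: L = J(J-1)/2 + K. */
size_t tri_index(size_t j, size_t k)
{
    return j * (j - 1) / 2 + k;
}

/* Returns aJK from the packed array b; elements above the main
   diagonal (K > J) are not stored and are known to be zero. */
int tri_get(const int *b, size_t j, size_t k)
{
    if (k > j)
        return 0;
    return b[tri_index(j, k) - 1];   /* -1 because C arrays start at 0 */
}
```

An n x n lower triangular matrix needs only n(n+1)/2 cells in B instead of n*n.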
Arrays are simple and reliable to use in a wide range of situations.
Arrays are used in those problems where the number of items to be handled
is fixed. They are easy to traverse, search and sort. It is easier to manipulate
an array than the data structures discussed subsequently. Arrays are used in those
situations where the size of the array can be established beforehand. Also, they
are used in situations where the insertions and deletions are minimal or not
present. Insertion and deletion operations will lead to wastage of memory or
elements.
4. Summary
Arrays help to solve problems that require keeping track of many pieces
of data. We must also learn about a special kind of arrays called sparse arrays.
are non-zero randomly distributed. Arrays are used in programming languages
for holding group of elements all of the same kind. Vectors, matrices,
5. Suggested Reading/Reference material
“Data Structures Using C and C++”, Yedidyah Langsam, Moshe J.
“Algorithms + Data Structures = Programs” by Niklaus Wirth, PHI
“Fundamentals of Data Structures in C++” by E. Horowitz, Sahni and
“Data Structures and Program Design in C” by Kruse, C.L. Tondo and B.
“Fundamentals of Data Structures in C” by R. B. Patel, PHI Publications.
“Data Structures and Algorithms”, V. Aho, Hopcroft, Ullman, LPE.
“Data Structures”, Seymour Lipschutz, Schaum’s Outline Series.
Explain following type of matrices: (i) Triangular matrix (ii) Tridiagonal
(b) Suppose Base(A)=300 and w=4 words per memory cell for A. Find the
B.
(b) Consider the elements B(3, 3, 3) in B. Find the effective indices E1,
E2 and E3 and the address of the element, assuming Base(B) =400 and
Each student in a class of 30 students takes 6 tests in which scores range
between 0 and 100. Suppose the test scores are stored in a 30 x 6 array.
Write a module which
(b) Finds the final grade for each student where the final grade is the
(c) Finds the number of students who have failed, i.e. whose final grade is
Lesson : 4 Writer : Dr. Pardeep Kumar Mittal
Structure:
1. Introduction
2. Objective
3. Presentation of Contents
3.4.1 Traversal
3.4.2 Search
3.4.3 Insertion
3.4.4 Deletion
3.5 Applications
4. Summary
1. Introduction
In the previous chapter, we have discussed arrays. Arrays are data structures of fixed
size. Insertion and deletion involve reshuffling of array elements, so these array operations
can be time-consuming. In a linked list, by contrast, items can be added or removed easily at
the end, at the beginning or even in the middle. A linked list does not need contiguous memory
space in computer memory, so space can be utilized more flexibly in linked lists than in
arrays. In this chapter, we will discuss linked list, its memory representation,
2. Objectives
The main objective of this chapter is to introduce linked lists to the readers. At the
end of this chapter readers will understand linked lists and be aware of their basic
properties. The reader can appreciate the advantages of linked lists over arrays after
reading this chapter. Another important objective of this chapter is to explore the traversal,
search, insertion and deletion operations on linked lists. The various applications of
linked lists are discussed in detail in this chapter. At the end, the implementation of a
linked list using C is also explained so that readers can discover how to build and
3. Presentation of Contents
while inserting or deleting arbitrary elements stored at fixed distance in a fixed memory. The
linked representation reduces this expense because the elements are not stored at fixed
distances from one another; they may be placed anywhere in memory, and operations such as insertion and deletion
A linked list is a linked representation of an ordered list. It is a linear collection of data
elements, termed nodes, whose linear order is given by means of links or pointers. Every node
consists of two parts. The first part, called INFO, contains the data, and the
second part, called LINK, contains the address of the next node in the list. A variable called
START always points to the first node of the list, and the link part of the last node always
contains the null value. A null value in the START variable denotes that the list is empty. The
START → A → B → C → D → NULL
Figure 4.1 (Representation of One-way Linked List with INFO and LINK field)
Along with the linked list in memory, a special list is maintained which consists of the
unused memory cells or unused nodes. This list is called the list of available space, the availability
list, the free storage list or the free pool. A variable AVAIL is used to store
Sometimes, during insertion, there is no space available for inserting new data into a data
structure; this situation is called OVERFLOW. Programmers generally handle the
situation by checking whether AVAIL is NULL or not. The situation where one wants to
delete data from a data structure that is empty is called UNDERFLOW. The situation is
When a linked list is maintained in computer memory, two lists are actually
maintained: the list itself, which begins at START, and the list of available space. Let LIST be a linked list. Then
LIST will be maintained in memory as follows. First of all, LIST requires two linear arrays,
which we will call INFO and LINK, containing respectively the information part and the next
pointer field of each node of LIST. START contains the location of the beginning of
the list.
The following example of linked lists indicates that more than one list may be maintained in
the same linear arrays i.e. INFO and LINK. Figure 4.2 pictures a linked list in memory where
each node of the list contains a single character. We can obtain the actual list of characters,
START = 4, AVAIL = 2

Location  INFO  LINK
    1      C     6
    2            3
    3            5
    4      A     9
    5            7
    6      D     Null
    7            8
    8            10
    9      B     1
   10            Null
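The array representation can be sketched in C as follows. This is only an illustrative sketch: it reads Figure 4.2 as START = 4 and AVAIL = 2, and it assumes that index 0 stands in for the null pointer. The names NIL, CELLS, read_list and chain_length are introduced here for illustration and do not come from the text.

```c
#include <assert.h>
#include <stddef.h>

#define NIL 0      /* assumption: index 0 plays the role of the null pointer */
#define CELLS 11   /* cells 1..10 are used, as in Figure 4.2 */

/* INFO and LINK are parallel arrays; free cells carry no information (0). */
const char INFO[CELLS] = {0, 'C', 0, 0, 'A', 0, 'D', 0, 0, 'B', 0};
const int  LINK[CELLS] = {0,  6,  3, 5,  9,  7, NIL, 8, 10,  1, NIL};
const int  START = 4;   /* first node of the data list */
const int  AVAIL = 2;   /* first node of the free-storage list */

/* Copy the characters of the list that begins at `start` into `out`
   (NUL-terminated) and return the number of characters copied. */
size_t read_list(int start, char *out, size_t cap)
{
    size_t n = 0;
    assert(cap > 0);
    for (int ptr = start; ptr != NIL && n + 1 < cap; ptr = LINK[ptr])
        out[n++] = INFO[ptr];
    out[n] = '\0';
    return n;
}

/* Count the nodes on a chain by following LINK until NIL is reached. */
size_t chain_length(int start)
{
    size_t n = 0;
    for (int ptr = start; ptr != NIL; ptr = LINK[ptr])
        n++;
    return n;
}
```

Under this reading, following the links from START visits cells 4, 9, 1 and 6, and following them from AVAIL visits the six unused cells, so both lists share the same two arrays without interfering with each other.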
typedef struct list {
    int data;
    struct list *next;
} list;
Once we have a definition for a list node, we can create a list simply by declaring a pointer to
the first element, called the “head”. A pointer is generally used instead of a regular variable.
list *head;
It is as simple as that! You now have a linked list data structure. It is not altogether useful at
the moment. You can see if the list is empty. We will be seeing how to declare and define
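#include <stdio.h>
typedef struct list { int data; struct list *next; } list;
int main()
{
    list *head = NULL;   /* no nodes yet, so the list is empty */
    if (head == NULL)
        printf("The list is empty\n");
    return 0;
}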
(a) Traversal: Processing each element in the Linked List.
(b) Search: Finding the location of the element with a given value
sequence; this is called traversing a list. To traverse a list in order, we must start at the list
head and follow the list pointers. The list can be traversed by using the assignment PTR:=
LINK[PTR], which moves the pointer to the next node in the list. As shown in fig.4.3,
traversal is done by processing each node starting from first node, the traversal moves ahead
via next(link) field and the process is continued until the next field is NULL.
Algorithm (Traversing a Linked List) This algorithm traverses a list applying an operation
PROCESS to each element of the list. The variable PTR points to the node currently being
processed.
1. PTR = START.
2. Repeat Steps 3 and 4 while PTR ≠ NULL.
3. Apply PROCESS to INFO[PTR].
4. PTR = LINK[PTR].
5. Exit
The algorithm starts with initializing PTR. Then process INFO[PTR], the information at the
first node. Update PTR by the assignment PTR:=LINK[PTR], and then process INFO[PTR],
the information at the second node and so on until PTR=NULL, which signals the end of the
list.
Following procedure uses the above algorithm of traversal to count the number of elements in
a linked list.
1. NUM = 0.
2. PTR = START.
3. Repeat Steps 4 and 5 while PTR ≠ NULL.
4. NUM = NUM+1.
5. PTR = LINK[PTR].
6. Return.
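The traversal and counting procedures can be sketched in C with pointers instead of the INFO and LINK arrays. This is a minimal sketch, not the chapter's own implementation: the node type, the callback-based PROCESS and the function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; `info` and `link` mirror INFO and LINK. */
struct node {
    int info;
    struct node *link;
};

/* Apply `process` to every element, following link fields from `start`.
   `ctx` is passed through unchanged so that PROCESS can keep state. */
void traverse(const struct node *start,
              void (*process)(int info, void *ctx), void *ctx)
{
    assert(process != NULL);
    for (const struct node *ptr = start; ptr != NULL; ptr = ptr->link)
        process(ptr->info, ctx);
}

/* One possible PROCESS: add each element to a long accumulator. */
void add_to_sum(int info, void *ctx)
{
    *(long *)ctx += info;
}

/* Count the nodes: NUM starts at 0 and grows by 1 per node visited. */
size_t count_nodes(const struct node *start)
{
    size_t num = 0;
    for (const struct node *ptr = start; ptr != NULL; ptr = ptr->link)
        num++;
    return num;
}
```

Both functions visit every node once, so their running time grows linearly with the length of the list.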
different searching algorithms for finding the location(LOC) of the node where ITEM first
appears in LIST. First algorithm does not assume that the data is in sorted order, while the
second algorithm does assume that data is in sorted order. Both the algorithms will only find
LIST is Unsorted: If the data in the given list are not necessarily sorted, then we can search
for ITEM in the given list by traversing through the list using a pointer variable PTR and
comparing ITEM with the contents INFO[PTR] of each node, one by one by updating the
pointer PTR by PTR := LINK[PTR]. For continuing the loop we have to perform two tests.
First we have to check whether we reached the end of the list; i.e., PTR = NULL. If not, then
we check to see whether INFO[PTR] = ITEM. If the ITEM is found in the list, then its
This algorithm finds the location LOC of the node where ITEM first appears in LIST, or sets
LOC=NULL.
1. PTR = START.
2. Repeat Step 3 while PTR ≠ NULL.
3. If ITEM = INFO[PTR], then LOC = PTR and Exit.
Else
PTR = LINK[PTR].
4. LOC = NULL.
5. Exit
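A pointer-based C sketch of the unsorted search is given here. The node type and the function name search are assumptions for illustration; the function returns the node itself in place of LOC, or NULL when ITEM is absent.

```c
#include <stddef.h>

/* Hypothetical node type; `info` and `link` mirror INFO and LINK. */
struct node {
    int info;
    struct node *link;
};

/* Return the first node whose info equals `item`, or NULL if none does.
   Each pass makes the two tests described above: end of list, then match. */
struct node *search(struct node *start, int item)
{
    for (struct node *ptr = start; ptr != NULL; ptr = ptr->link)
        if (ptr->info == item)
            return ptr;
    return NULL;
}
```

In the worst case, when ITEM is missing or sits in the last node, every node is compared once.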
LIST is Sorted: If the data in the LIST are sorted, then we can search for ITEM in LIST by
traversing the list using a pointer variable PTR and comparing ITEM with the contents
INFO[PTR] of each node, one by one, of LIST. Here we can stop once either when ITEM
This algorithm finds the location LOC of the node where ITEM first appears in LIST, or sets
1. PTR = START.
PTR = LINK[PTR].
Else
4. LOC = NULL.
5. Exit.
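For a sorted list, the search can stop early. The sketch below assumes that the list is sorted in ascending order, which the text does not state explicitly, and uses a hypothetical node type and function name.

```c
#include <stddef.h>

/* Hypothetical node type; `info` and `link` mirror INFO and LINK. */
struct node {
    int info;
    struct node *link;
};

/* Search an ascending list. The loop skips nodes smaller than `item`
   and stops at the first node that is equal to or larger than it, so a
   missing item is detected without scanning the rest of the list. */
struct node *search_sorted(struct node *start, int item)
{
    struct node *ptr = start;
    while (ptr != NULL && ptr->info < item)
        ptr = ptr->link;
    return (ptr != NULL && ptr->info == item) ? ptr : NULL;
}
```

The worst case is still proportional to the length of the list, because a linked list cannot be indexed directly as binary search would require.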
Let a linked list be represented as in Fig. 4.4. The linked list has successive
nodes A and B, as pictured in Figure 4.5. Suppose a node N is to be inserted into the list
between nodes A and B. The schematic diagram of such an insertion appears in Figure 4.5.
Figure 4.5(Linked-List After Insertion)
Insertion Algorithms: We will be discussing three algorithms for insertions, which are:
(a) The first one inserts a node at the beginning of the list.
(b) The second one inserts a node after the node with a given location.
In all the following algorithms it is assumed that the linked list is in memory and that the
variable ITEM contains the new information to be added to the list. Since our insertion
algorithms will use a node in the AVAIL list, all of the algorithms will include the following
steps:
(i) Checking to see if space is available in the AVAIL list. If not, that is, if
(ii) Removing the first node from the AVAIL list. Using the variable NEW to keep
track of the location of the new node, the following step can be implemented by the pair
(iii) Copying new information into the new node i.e. INFO[NEW] := ITEM.
Insertion at the Beginning of a List: The linked list is assumed not to be sorted. The
1. If AVAIL = NULL, then write OVERFLOW and Exit.
2. NEW = AVAIL and AVAIL = LINK [AVAIL].
3. INFO[NEW] = ITEM.
4. LINK[NEW] = START.
5. START = NEW.
6. Exit.
Fig. 4.6 and 4.7 show the insertion in a linked list at the beginning of the list.
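Insertion at the beginning can be sketched in C using the array representation with an AVAIL list. The sketch assumes that index 0 stands for NULL and that the pool has five usable cells; init_pool and ins_first are illustrative names, not names used in the text.

```c
#include <stdbool.h>

#define NIL 0     /* assumption: index 0 plays the role of NULL */
#define CELLS 6   /* cells 1..5 form the storage pool */

int INFO[CELLS];
int LINK[CELLS];
int START = NIL;
int AVAIL = NIL;

/* Chain all cells into the free-storage list and empty the data list. */
void init_pool(void)
{
    for (int i = 1; i < CELLS - 1; i++)
        LINK[i] = i + 1;
    LINK[CELLS - 1] = NIL;
    AVAIL = 1;
    START = NIL;
}

/* Insert `item` as the first node. Returns false on OVERFLOW. */
bool ins_first(int item)
{
    if (AVAIL == NIL)
        return false;            /* no free cell: OVERFLOW */
    int new_node = AVAIL;        /* remove the first node from AVAIL */
    AVAIL = LINK[AVAIL];
    INFO[new_node] = item;       /* copy the new information */
    LINK[new_node] = START;      /* new node points to the old first node */
    START = new_node;            /* START now points to the new node */
    return true;
}
```

No traversal is needed, so the insertion takes the same small number of steps however long the list is.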
Insertion after a Given Node : Suppose we are given the value of LOC, where LOC
is the location of the node after which the new node is to be inserted. The
following algorithm inserts ITEM into the given list so that ITEM follows the node whose
location is LOC, or ITEM becomes the first node when LOC = NULL.
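1. If AVAIL = NULL, then write OVERFLOW and Exit.
2. NEW = AVAIL and AVAIL = LINK[AVAIL].
3. INFO[NEW] = ITEM.
4. If LOC = NULL, then LINK[NEW] = START and START = NEW.
Else
LINK[NEW] = LINK[LOC] and LINK[LOC] = NEW.
5. Exit.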
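The same insertion can be sketched with C pointers. In this sketch malloc takes the place of the AVAIL list, so a failed allocation corresponds to OVERFLOW. The node type and the function name insert_after are assumptions for illustration.

```c
#include <stdlib.h>

/* Hypothetical node type; `info` and `link` mirror INFO and LINK. */
struct node {
    int info;
    struct node *link;
};

/* Insert `item` after node `loc`; when `loc` is NULL the new node
   becomes the first node. Returns the new node, or NULL on OVERFLOW. */
struct node *insert_after(struct node **start, struct node *loc, int item)
{
    struct node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return NULL;
    new_node->info = item;
    if (loc == NULL) {               /* insert as the first node */
        new_node->link = *start;
        *start = new_node;
    } else {                         /* splice in between loc and its successor */
        new_node->link = loc->link;
        loc->link = new_node;
    }
    return new_node;
}
```

The order of the two assignments in the else branch matters: the new node must take over the old successor of loc before loc is redirected to the new node.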
Fig. 4.8: LOC is the location after which new node is to be inserted
Inserting into a Sorted Linked List : Suppose ITEM is to be inserted into a sorted linked
list. This algorithm inserts the node into a Sorted Linked List. The ITEM must be inserted
In this case, we must first find the location LOC of the node after which the new node is to be
inserted. For this we will write a procedure that finds the location LOC of node A.
Traverse the list, using a pointer variable PTR and comparing ITEM with INFO[PTR] at each
node. While traversing, we have to keep track of the location of the preceding node by using
a pointer variable SAVE, as pictured in figure 4.10. Thus SAVE and PTR are updated by the
assignments SAVE = PTR and PTR = LINK[PTR].
Figure 4.10(PTR is the node currently being processed and SAVE is the pointer to the
previous node)
The traversing stops as soon as ITEM < INFO[PTR]. Then PTR points to node B, so SAVE
7. LOC = SAVE.
8. Return
2. Call INSAFTERLOC(INFO, LINK, START, AVAIL, LOC, ITEM).
3. Exit.
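Insertion into a sorted list can be sketched in C in the same two stages: find the node after which ITEM belongs, then insert after it. The sketch assumes ascending order and uses malloc in place of the AVAIL list; find_a and insert_sorted are illustrative names.

```c
#include <stdlib.h>

/* Hypothetical node type; `info` and `link` mirror INFO and LINK. */
struct node {
    int info;
    struct node *link;
};

/* Return the last node whose info does not exceed `item`, or NULL when
   `item` belongs at the front. SAVE trails one node behind PTR, and the
   walk stops as soon as item < ptr->info. */
struct node *find_a(struct node *start, int item)
{
    struct node *save = NULL;
    for (struct node *ptr = start; ptr != NULL && ptr->info <= item; ptr = ptr->link)
        save = ptr;
    return save;
}

/* Insert `item` so that the list stays sorted. Returns 1 on success
   and 0 when no memory is available (OVERFLOW). */
int insert_sorted(struct node **start, int item)
{
    struct node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return 0;
    struct node *loc = find_a(*start, item);
    new_node->info = item;
    if (loc == NULL) {
        new_node->link = *start;
        *start = new_node;
    } else {
        new_node->link = loc->link;
        loc->link = new_node;
    }
    return 1;
}
```

Because the walk continues past nodes equal to ITEM, a duplicate value is placed after the existing equal values.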
Let LIST be a linked list with a node N between nodes A and B, as pictured in
Figure 4.11. Suppose node N is to be deleted from the linked list. The schematic diagram of
such a deletion appears in Figure 4.12. The deletion occurs as soon as the next pointer field
of node A is changed to point to node B. Therefore, when performing deletions, one must keep track of
the address of the node which immediately precedes the node that is to be deleted.
Deletion Algorithms
(b) The second one deletes the node with a given ITEM of information.
All our algorithms assume that the linked list is in memory in the form LIST(INFO, LINK,
START, AVAIL) and, where an ITEM is given, that the variable ITEM contains the information to be
deleted from the list.
All of our deletion algorithms will return the memory space of the deleted node N to the
beginning of the AVAIL list. Accordingly, all of our algorithms will include the following
Let LIST be a linked list in memory. Suppose we are given the location LOC of a
node N in LIST. Furthermore, suppose we are given the location LOCP of the node preceding
N or, when N is the first node, we are given LOCP=NULL. The following algorithm deletes
This algorithm deletes the node N with location LOC. LOCP is the location of the node
1. If LOCP = NULL, then:
Set START:=LINK[START].
Else:
Set LINK[LOCP]:=LINK[LOC].
2. Set LINK[LOC]:=AVAIL and AVAIL:=LOC.
3. Exit.
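Deleting the node at a given location can be sketched in C using the array representation. The three-node list and the free cells used to initialize the arrays are made up for illustration, index 0 is assumed to stand for NULL, and the function name del is hypothetical.

```c
#define NIL 0     /* assumption: index 0 plays the role of NULL */
#define CELLS 6

/* Illustrative contents: data list 1 -> 2 -> 3 holding 10, 20, 30,
   and free-storage list 4 -> 5. */
int INFO[CELLS] = {0, 10, 20, 30, 0, 0};
int LINK[CELLS] = {0, 2, 3, NIL, 5, NIL};
int START = 1;
int AVAIL = 4;

/* Delete the node at `loc`; `locp` is the location of the preceding
   node, or NIL when the node at `loc` is the first node. */
void del(int loc, int locp)
{
    if (locp == NIL)
        START = LINK[START];     /* first node: advance START */
    else
        LINK[locp] = LINK[loc];  /* bypass the deleted node */
    LINK[loc] = AVAIL;           /* return the cell to the front of AVAIL */
    AVAIL = loc;
}
```

The deleted cell becomes the first node of the free-storage list, so it is the next cell to be reused by an insertion.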
Let LIST be a linked list in memory. Suppose we are given an ITEM of information
and we want to delete from the LIST the first node N which contains ITEM (If ITEM is a key
value, then only one node can contain ITEM). First we give a procedure which finds the
location LOC of the node N containing ITEM and the location LOCP of the node preceding
node N. If N is the first node, we set LOCP = NULL, and if ITEM does not appear in LIST,
we set LOC = NULL. (This procedure is similar to the procedure used for insertion into a sorted list.)
Traverse the list, using a pointer variable PTR and comparing ITEM with INFO[PTR] at each
node. While traversing, keep track of the location of the preceding node by using a pointer
variable SAVE as in figure 4.10. Thus SAVE and PTR are updated by the assignments
SAVE = PTR and PTR = LINK[PTR].
The traversing stops as soon as ITEM = INFO[PTR]. Then PTR contains the location LOC of
node N and SAVE contains the location LOCP of the node preceding N.
The formal statement of our procedure follows. The cases where the list is empty or where
INFO[START]=ITEM(i.e., where node N is the first node) are treated separately, since they
This procedure finds the location LOC of the first node N which contains ITEM and the
location LOCP of the node preceding N. If ITEM does not appear in the list, then the
procedure sets LOC=NULL; and if ITEM appears in the first node, then it sets
LOCP=NULL.
4. Repeat Steps 5 and 6 while PTR ≠ NULL.
8. Return.
This algorithm deletes from a linked list the first node N which contains the given ITEM of
information.
Else:
5. Exit.
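Deleting the first node that contains a given ITEM can be sketched in C by combining the search for LOC and LOCP with the deletion itself. The sketch assumes that nodes were allocated with malloc, so free takes the place of returning the node to AVAIL; the node type and the name delete_item are illustrative.

```c
#include <stdlib.h>

/* Hypothetical node type; `info` and `link` mirror INFO and LINK. */
struct node {
    int info;
    struct node *link;
};

/* Delete the first node whose info equals `item`.
   Returns 1 if a node was deleted and 0 if `item` is not in the list. */
int delete_item(struct node **start, int item)
{
    struct node *save = NULL;     /* plays the role of LOCP */
    struct node *ptr = *start;    /* plays the role of LOC */
    while (ptr != NULL && ptr->info != item) {
        save = ptr;
        ptr = ptr->link;
    }
    if (ptr == NULL)
        return 0;                 /* ITEM does not appear in the list */
    if (save == NULL)
        *start = ptr->link;       /* the first node is deleted */
    else
        save->link = ptr->link;   /* bypass the deleted node */
    free(ptr);
    return 1;
}
```

An empty list needs no special case here, because the loop exits at once with ptr equal to NULL.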
3.5. Applications
Lists are used to maintain POLYNOMIALS in the memory. For example, we have a function
f(x) = 7x⁵ + 9x⁴ – 6x³ + 3x². Figure 4.13 depicts the representation of a polynomial using a
singly linked list. 1000, 1050, 1200, 1300 are memory addresses.
Figure 4.13(Representation of Polynomial using Linked-List)
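The polynomial f(x) given above can be sketched in C as a list with one node per term. The node layout (coef, exp, link) and the function names are assumptions for illustration, and integer arithmetic is used for simplicity.

```c
#include <stddef.h>

/* Hypothetical node: one term of the polynomial. */
struct term {
    int coef;            /* coefficient */
    int exp;             /* exponent */
    struct term *link;   /* next term, or NULL after the last term */
};

/* x raised to the non-negative power n. */
long ipow(long x, int n)
{
    long result = 1;
    while (n-- > 0)
        result *= x;
    return result;
}

/* Evaluate the polynomial by traversing its terms and summing them. */
long poly_eval(const struct term *p, long x)
{
    long sum = 0;
    for (; p != NULL; p = p->link)
        sum += p->coef * ipow(x, p->exp);
    return sum;
}
```

Only the non-zero terms occupy nodes, so a polynomial with a few terms of high degree needs far less space than an array indexed by exponent.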
Each term of a polynomial contains two components, a coefficient and an exponent, and ‘x’ is a formal
parameter. The polynomial is a sum of terms, each of which consists of a coefficient and an
exponent.
4. Summary
The advantage of lists over arrays is flexibility. Overflow is not a problem until the
computer memory is exhausted. When the individual records are quite large, it may be
difficult to determine the amount of contiguous storage that might be needed for the required
arrays. With dynamic allocation, there is no need to attempt to allocate in advance. Changes
to a list, such as insertions and deletions, can be made in the middle of the list more quickly than in
contiguous lists.
The drawback of lists is that the links themselves take space which is in addition to the space
that may be needed for data. One more drawback of lists is that they are not suited for
random access. With lists, we need to traverse a long path to reach a desired node.
Write an algorithm which deletes the last node from the linked list.
Write an algorithm which copies the contents of one linked list into another.
Write a procedure which deletes the Kth element from the linked list.
Write a procedure which adds a given item at the end of the linked list.
Write a procedure which interchanges the two elements in the linked list.
Lesson : 5 Writer : Dr. Pardeep Kumar Mittal
Structure:
1. Introduction
2. Objective
3. Presentation of Contents
3.3.3 Differences between One-way linked lists and Two-way linked lists
4. Summary
1. Introduction
In the previous chapter, we have discussed the singly linked lists along with various
operations that can be performed on these. In this chapter, we are going to discuss some
variations of linked list in the form of header linked lists and circular linked lists, which have
their own applications. Doubly linked lists are also discussed; these can be considered an
improvement over singly linked lists when one has to traverse in the forward direction, the backward
direction or both. Operations on these lists, that is, the header linked list, circular linked list and
2. Objectives
In this chapter readers will learn about header and circular linked lists along with
various operations on header and circular linked lists. The implementation of these operations
in C language is also described. Another objective of this chapter is to explore doubly linked
lists and various operations on it. In the end the implementation of various operations on
3. Presentation of Contents
A header linked list is a linked list which always contains a special node,
called the header node, at the beginning of the list. The following are two kinds of widely
1. A grounded header list is a header list where the last node contains the null pointer.
2. A circular header list is a header list where the last node points back to the header
node.
Figure 5.1(a) & 5.1(b) contain schematic diagrams of these header lists. Unless otherwise
stated or implied, our header lists will always be circular. Accordingly, in such a case, the
header node also acts as a sentinel indicating the end of the list.
The header node of a linked list can maintain global properties of the entire list and act as a utility
node. For example, the header node can hold a count variable which gives the number of
nodes in the list. The count in the header node is then updated whenever a node is added or deleted.
Observe that the list pointer START always points to the header node. Hence,
LINK[START] = NULL indicates that a grounded header list is empty, and LINK[START] =
3.1.1 Operations on Header Linked Lists: The following are the various operations
Let us discuss the respective algorithms below.
Algorithm : (Traversing a Grounded Header List) Let LIST be a grounded header list in
memory. This algorithm traverses LIST, applying an operation PROCESS to each node of
LIST.
1. PTR = LINK[START].
2. Repeat Steps 3 and 4 while PTR ≠ NULL.
3. Apply PROCESS to INFO[PTR].
4. PTR = LINK[PTR].
5. Exit.
Explanation : The algorithm for traversing a grounded header linked list is similar to a
simple linked list except that the processing starts from LINK[START] instead of START as
LOC)
LIST is grounded header list in memory. This algorithm finds the location LOC of the node
1. PTR = LINK[START].
PTR = LINK[PTR].
LOC = PTR.
Else
LOC = NULL.
4. Exit.
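Searching and counting in a grounded header list can be sketched in C. The sketch assumes that the header node uses its info field to hold the node count, one of the uses suggested above; the node type and the function names are illustrative.

```c
#include <stddef.h>

/* Hypothetical node type. In the header node, `info` is assumed to hold
   the number of data nodes instead of a data value. */
struct node {
    int info;
    struct node *link;
};

/* Search the data nodes only: the walk starts at header->link, so the
   header's own info field is never compared with `item`. */
struct node *header_search(const struct node *header, int item)
{
    for (struct node *ptr = header->link; ptr != NULL; ptr = ptr->link)
        if (ptr->info == item)
            return ptr;
    return NULL;
}

/* Count the data nodes by traversal, skipping the header node. */
size_t header_count(const struct node *header)
{
    size_t num = 0;
    for (const struct node *ptr = header->link; ptr != NULL; ptr = ptr->link)
        num++;
    return num;
}
```

Since the header is always present, neither function has to test for a null list pointer. An empty list is simply a header whose link field is NULL.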
Deletion from a grounded header list: For performing deletion it is assumed that we are
given an item of information to be deleted. Therefore we first have to identify the
location of the node containing it and the location of the node preceding it. For this
purpose we will be considering a procedure that will find these locations and then the
This procedure finds the location LOC of the first node N which contains ITEM and the
location LOCP of the node preceding N. If ITEM does not appear in the list, then the
7. LOC = NULL.
8. Return.
This algorithm deletes from a grounded linked list the first node N which contains the given
ITEM of information.
3. LINK[LOCP] = LINK[LOC].
4. LINK[LOC] = AVAIL and AVAIL = LOC.
5. Exit.
Insertion into a grounded header list: Suppose we are given the value of LOC where LOC
indicates the location of the node after which new node is to be inserted. The following
algorithm inserts ITEM into given list so that ITEM follows node for which LOC i.e. location
is given. Here we are assuming that LOC cannot be NULL, as the header node is always present
3. INFO[NEW] = ITEM.
5. Exit.
There are two other variations of linked lists which sometimes appear in the literature:
1. A linked list whose last node points back to the first node instead of containing the null
2. A linked list which contains both a special header node at the beginning of the list and a
Figure 5.2(b) Linked List with header & trailer nodes
As discussed, a circular linked list is a list in which the last node of the list points to the
first node instead of NULL. As in the case of a simple linked list, we can perform various
3.2.1 Operations on Circular Lists: Let us now discuss the respective algorithms for
Algorithm : (Traversing a Circular List) Let LIST be a circular list in memory. This
5. PTR = LINK[PTR].
6. Exit.
Explanation : In the above algorithm the first node is processed before the loop, then the PTR
variable is initialized with the second node. The rest of the nodes are processed in the loop until
LOC)
LIST is circular list in memory. This algorithm finds the location LOC of the node where
1. IF INFO[START] = ITEM, then LOC = START and exit.
2. PTR = LINK[START].
PTR = LINK[PTR].
LOC = PTR.
Else
LOC = NULL.
5. Exit.
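Traversal and search in a circular list without a header node can be sketched in C. As in the algorithms above, the first node is handled before the loop and the loop ends when the pointer comes back to START. The node type and the function names are assumptions for illustration.

```c
#include <stddef.h>

/* Hypothetical node type; the last node links back to the first. */
struct node {
    int info;
    struct node *link;
};

/* Count the nodes: the first node is counted before the loop, and the
   loop stops when the pointer returns to `start`. */
size_t circ_count(const struct node *start)
{
    if (start == NULL)
        return 0;
    size_t num = 1;
    for (const struct node *ptr = start->link; ptr != start; ptr = ptr->link)
        num++;
    return num;
}

/* Return the first node containing `item`, or NULL if it is absent. */
struct node *circ_search(struct node *start, int item)
{
    if (start == NULL)
        return NULL;
    if (start->info == item)
        return start;
    for (struct node *ptr = start->link; ptr != start; ptr = ptr->link)
        if (ptr->info == item)
            return ptr;
    return NULL;
}
```

Testing for NULL inside the loop would never terminate here, because no link field in a circular list is null. The comparison with start is what ends the walk.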
Explanation : The above algorithm is similar to that of traversal algorithm except that this
Deletion in a Circular Linked List : For performing deletion it is assumed that we are given
an item of information to be deleted. Therefore we first have to identify the location of
the node containing it and the location of the node preceding it. For this purpose we first
will be considering a procedure that will find these locations and then the algorithm for
The following procedure finds the location LOC of the first node N which contains ITEM
Else
LOC =NULL and LOCP = SAVE.
4. Exit.
The following algorithm deletes the first node N which contains ITEM in a circular header
list.
3. LINK[LOCP] = LINK[LOC].
5. Exit.
Insertion in a Circular Linked List: Now we will be discussing insertion in a circular
linked list. First we will be discussing insertion at the beginning of a list and then the
Insertion at the Beginning of a List: The following algorithm inserts the node at the
3. INFO[NEW] = ITEM.
4. LINK[NEW] = START.
SAVE = PTR
PTR = LINK[PTR]
6. Exit.
Insertion after a given node in a circular linked list: Suppose we are given the value of
LOC, where LOC is the location of the node after which the new node is to be
inserted. The following algorithm inserts ITEM into the given list so that ITEM follows node for
3. INFO[NEW] = ITEM.
Else
5. Exit.
1) Any node can be a starting point. We can traverse the whole list by starting from any point.
We just need to stop when the first visited node is visited again.
2) Useful for implementation of queue. We don’t need to maintain two pointers for front and
rear if we use circular linked list. We can maintain a pointer to the last inserted node and
3) Circular lists are useful in applications to repeatedly go around the list. For example, when
multiple applications are running on a PC, it is common for the operating system to put the
running applications on a list and then to cycle through them, giving each of them a slice of
time to execute, and then making them wait while the CPU is given to another application. It
is convenient for the operating system to use a circular list so that when it reaches the end of
Let us now discuss a two-way list, which can be traversed in two directions i.e. either
(i). in the usual forward direction from the beginning of the list to the end,
(ii). in the backward direction from the end of the list to the beginning.
Furthermore, given the location LOC of a node N in the list, one now has immediate access
to both the next node and the preceding node in the list. This means, in particular, we may be
able to delete N directly from the list without traversing any part of the list.
A two-way list is a linear collection of data elements, called nodes, where each node N is
2. A pointer field FORW which contains the location of the next node in the list
3. A pointer field BACK which contains the location of the preceding node in the list
The list also requires two list pointer variables: FIRST, which points to the first node in the
list, and LAST, which points to the last node in the list. Figure 5.3 contains a schematic
diagram of such a list. Observe that the null pointer appears in the FORW field of the last
node in the list and also in the BACK field of the first node in the list.
Observe that, using the variable FIRST and the pointer field FORW, we can traverse a two-
way list in the forward direction. On the other hand, using the variable LAST and the pointer
field BACK, we can also traverse the list in the backward direction.
Suppose LOCA and LOCB are the locations of nodes A and B, respectively, in a two-way list.
Then the way that the pointers FORW and BACK are defined gives us the Pointer property:
FORW[LOCA] = LOCB if and only if BACK[LOCB] = LOCA.
In other words, the statement that node B follows node A is equivalent to the statement that
The advantages of a two-way list and a circular header list may be combined
into a two way circular header list as pictured in Figure. 5.4. The list is circular because the
two end nodes point back to the header node. Observe that such a two-way list requires only
one list pointer variable START, which points to the header node. This is because the two
pointers in the header node point to the two ends of the list.
exactly once. Then we can traverse the two-way list in either direction i.e. forward or
backward. Here it is of no advantage that the data are organized as a two-way list rather than
as a one-way list. The algorithms for both forward and backward traversal can be written as
follows:
traverses a two-way list applying an operation PROCESS to each element of the list. The
1. PTR = FIRST.
PTR = FORW[PTR].
4. Exit
The algorithm starts with initializing PTR. Then process INFO[PTR], the information at the
first node. Update PTR by the assignment PTR = FORW[PTR], and then process
INFO[PTR], the information at the second node and so on until PTR=NULL, which signals
traverses a two-way list applying an operation PROCESS to each element of the list. The
1. PTR = LAST.
4. PTR = BACK[PTR].
5. Exit
The algorithm is similar to the previous one except that now the PTR variable is initialized
with LAST pointer and the PTR is updated with BACK[PTR] so that the list can be traversed
in backward direction.
Searching: Suppose we are given an ITEM of information - a key value - and we want to
find location LOC of ITEM in LIST. Then we can search for ITEM using a forward search
or backward search as in case of traversal. Here the main advantage is that we can search for
ITEM in the backward direction if we have reason to suspect that ITEM appears near the end
of the list. For example, LIST is list of names sorted alphabetically. If ITEM = Shyam, then
we would search LIST in the backward direction, but if ITEM = Dinesh, then we would
SEARCHFORW(INFO, FORW, FIRST, LOC, ITEM). This algorithm searches for an ITEM
in a two-way list in forward direction and finds the location of the ITEM. The variable PTR
1. PTR = FIRST.
4. PTR = FORW[PTR].
5. LOC = NULL
6. Exit
The algorithm starts with initializing PTR. Then compare INFO[PTR], the information at the
first node with the ITEM to be searched. Update PTR by the assignment PTR =
FORW[PTR], and then process INFO[PTR], the information at the second node and so on
until PTR=NULL or until INFO[PTR] matches with ITEM. If the ITEM is not found then the
a two-way list in backward direction and finds the location of the ITEM. The variable PTR
1. PTR = LAST.
4. PTR = BACK[PTR].
5. LOC = NULL
6. Exit
The algorithm is similar to the previous one except that now the PTR variable is initialized
with LAST pointer and the PTR is updated with BACK[PTR] so that the ITEM can be
Deleting : Suppose we are given the location LOC of a node N in LIST, and suppose we
want to delete N from the list. We assume that LIST is a two-way circular header list.
Note that BACK[LOC] and FORW[LOC] are the locations of the nodes which precede and
follow node N, respectively. Accordingly, as pictured in Fig. 5.5, N is deleted from the list by
The deleted node N is then returned to the AVAIL list by the assignments:
The formal statement of the algorithm is as follows:
1. Set FORW[BACK[LOC]] := FORW[LOC] and
BACK[FORW[LOC]] := BACK[LOC].
2. Set FORW[LOC] := AVAIL and AVAIL := LOC.
3. Exit.
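The pointer changes that delete a node from a two-way circular header list can be sketched in C. The node type and the function name dl_unlink are assumptions for illustration. The sketch only unlinks the node and leaves the step of returning it to the AVAIL list (or freeing it) to the caller.

```c
#include <stddef.h>

/* Hypothetical node type; `forw` and `back` mirror FORW and BACK. */
struct dnode {
    int info;
    struct dnode *forw;
    struct dnode *back;
};

/* Unlink node `loc` from a two-way circular header list. Both
   neighbours are reached directly from `loc`, so no traversal is
   needed, and the header guarantees that both neighbours exist. */
void dl_unlink(struct dnode *loc)
{
    loc->back->forw = loc->forw;   /* FORW[BACK[LOC]] := FORW[LOC] */
    loc->forw->back = loc->back;   /* BACK[FORW[LOC]] := BACK[LOC] */
    loc->forw = NULL;
    loc->back = NULL;
}
```

When the last data node is removed, the header node ends up pointing to itself in both directions, which is the empty state of such a list.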
Here we see one main advantage of a two-way list: If the data were organized as a one way
list, then in order to delete N, we would have to traverse the one-way list to find the location
We can write another algorithm in which we have to delete a node with given ITEM of
information as follows:
2. If LOC = NULL write Item does not exist in the list and Exit.
4. Exit.
Inserting. Suppose we are given the locations LOCA and LOCB of adjacent nodes A and B
in LIST, and suppose we want to insert a given ITEM of information between nodes A and B.
As with a one-way list, first we remove the first node N from the AVAIL list, using the
variable NEW to keep track of its location, and then we copy the data ITEM into the node N;
Algorithm : INSTWL(INFO,FORW,BACK,FIRST,AVAIL,LOCA,LOCB,ITEM)
INFO[NEW] =ITEM.
4. Exit.
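Insertion between two adjacent nodes of a two-way list can be sketched in C. The sketch assumes a circular header list, so that LOCA and LOCB are never NULL, and uses malloc in place of the AVAIL list; the node type and the function name dl_insert_between are illustrative.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type; `forw` and `back` mirror FORW and BACK. */
struct dnode {
    int info;
    struct dnode *forw;
    struct dnode *back;
};

/* Insert `item` between adjacent nodes `loca` and `locb`.
   Returns the new node, or NULL when no memory is available (OVERFLOW). */
struct dnode *dl_insert_between(struct dnode *loca, struct dnode *locb, int item)
{
    assert(loca->forw == locb && locb->back == loca);  /* must be adjacent */
    struct dnode *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return NULL;
    new_node->info = item;
    loca->forw = new_node;       /* A now points forward to the new node */
    new_node->forw = locb;
    locb->back = new_node;       /* B now points back to the new node */
    new_node->back = loca;
    return new_node;
}
```

In an empty circular header list the header is its own neighbour, so passing the header as both loca and locb inserts the first data node.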
If the above algorithm assumes that LIST contains a header node, then LOCA or LOCB may
point to the header node, in which case N will be inserted as the first node or the last node. If
LIST does not contain a header node, then we must consider the case that LOCA = NULL
and N is inserted as the first node in the list, and the case that LOCB = NULL and N is
3.3.3 Difference between One-way linked lists and Two-way linked lists :
A single (one-way) linked list can be traversed in only one direction. This means
that one cannot directly obtain the address of a node's predecessor: whenever earlier
information in the list is needed during an operation, one has to traverse the list
again from the start node, which requires an extra pointer and additional
searching time. In a doubly linked list, by contrast, every node holds the address of both the next and the
previous node. The address of the previous node can therefore be obtained through the backward pointer
of the two-way list, which needs no extra pointer and takes less time than in the
single linked list. So apart from the bi-directional movement facility, the two-way list also
4. Summary
Header linked list is a specialized type of linked list. The use of header node in a
linked list is to store some general purpose information about the list. The circular linked list
is also sometimes useful, as the last node does not contain a null pointer but instead contains
useful information, i.e. the address of the first node. Another variation in the linked list
discussed in this chapter is two-way linked list. Generally speaking, storing data as a two-way
linked list, which requires extra space for the backward pointers and extra time to change the
added pointers, rather than as a one-way list is not worth the expense unless one must
frequently find the location of the node which precedes a given node as in deletion.
Can we use doubly linked list as a circular linked list? If yes, explain.
Write the differences between doubly linked list and Circular linked list.
Discuss the advantages, if any, of a two-way list over a one-way list for each of the
following operations:
Suppose LIST is a header(Circular) list in memory. Write an algorithm which deletes the
Write a procedure HEAD(INFO, LINK, START, AVAIL) which forms a header circular
K), which deletes the Kth element from a two-way circular header list.
Suppose LIST (INFO, LINK, START, AVAIL) is a one-way circular header list in
values to a linear array BACK to form a two-way list from the one-way list.
Lesson : 6 Writer : Dr. Pardeep Kumar Mittal
Structure:
1. Introduction
2. Objectives
3. Presentation of Contents
3.1 Stacks
3.4 Applications
3.4.2 Recursion
3.4.3 Quicksort
4. Summary
1. Introduction
The stack is a very useful concept in Computer Science. In this lesson, we shall examine
this simple data structure and see why it plays such a prominent role in the area of computer
programming. Whenever we are dealing with function subprograms, we are actually using
stacks: the function calls are kept on a stack, and the calling function can resume
execution only when the called function has finished executing. There are certain situations when
we wish to insert or remove an item only at the beginning or the end of the list. A stack is a data
structure which allows elements to be inserted as well as deleted only from one end. A stack is
also known as a LIFO data structure. Some of the common applications of stacks are recursion,
Polish notation and quicksort, which are very useful in the field of computer science.
2. Objectives
After reading this chapter the reader must be able to understand the linear data structure
named as stack. Computer representation of stacks using arrays and linked list will also be
discussed in this chapter. Insertion and deletion operations in stacks using both
representations will also be explained in this chapter. The reader will appreciate the various
applications of stacks, which are recursion, polish notation and quicksort. At the end of this
chapter, the reader must be able to understand and use these applications.
3. Presentation of Contents
3.1 Stack
Stacks and Queues are two data structures that allow insertion and deletion
operations only at the beginning or the end of the list, not in the middle.
A stack is a linear data structure in which items may be added or removed only at one end,
called the top of the stack. Everyday examples of such a structure are very common, viz. a
stack of dishes, a stack of books, a stack of coins and a stack of clothes, as shown in Fig.
6.1.
Fig. 6.1: A Stack of Coins and Books
Stacks are also called last-in first-out (LIFO) lists. This means that the elements which are
inserted last will be removed first. Other names generally used for stacks are "piles" and
"push-down lists". Stacks have many important applications in the field of computer science.
Special terminology is used for the two basic operations associated with stacks:
[Figure: three diagrams of the same stack holding the elements A, B, C, D, E in positions 1 to 5 of N, with TOP at E: drawn growing upward, growing downward, and from left to right.]
Here we will maintain the stack by a linear array STACK; a pointer variable TOP,
which contains the location of the top element of the stack; and a variable MAXSTK
which gives the maximum number of elements that can be held by the stack. The condition
Figure 6.3 shows such an array representation of a stack. Since the stack has three
elements, X, Y and Z, TOP = 3; and since MAXSTK = 8, there is room for 5
[Figure 6.3: array STACK with MAXSTK = 8 and TOP = 3; positions 1, 2 and 3 hold X, Y and Z, and positions 4 to 8 are empty.]
The operation of adding (pushing) an item onto a stack and the operation of
removing (popping) an item from a stack are implemented by the procedures called PUSH and POP.
In the implementation of these operations, TOP and MAXSTK are assumed to be global variables;
hence these are not required as arguments in the algorithms, which in turn may be named as
Insertion: When we are adding a new element, first, we must test whether there is a free
space in the stack for the new item; if not, then we have the condition known as overflow. If
this condition is not there, then the value of TOP is changed before the insertion in PUSH.
1. If TOP = MAXSTK, then Write OVERFLOW and Exit.
2. TOP = TOP + 1
3. STACK[TOP] = ITEM
4. Exit
Deletion: In executing the procedure POP, we must first test whether there is an element in
the stack to be deleted; if not, then we have the condition known as underflow. The item to be
deleted is first stored in some variable, and then the value of TOP is changed after the deletion in
POP.
1. If TOP = 0, then Write UNDERFLOW and Exit.
2. ITEM = STACK[TOP]
3. TOP = TOP - 1
4. Return ITEM
5. Exit
Example: Consider the stack in Figure 6.3. If we perform the operation PUSH (STACK, W):
1. Since TOP = 3, which is less than MAXSTK, i.e. 8, control is transferred to Step 2.
2. TOP = 3 + 1 = 4.
3. STACK[TOP] = STACK[4] = W.
4. Exit
Note that W is now the top element in the stack and the value of TOP is 4.
Example: Consider again the stack in Figure 6.3. This time we perform the operation POP
(STACK, ITEM):
1. Since TOP = 3, the stack is not empty, so control is transferred to Step 2.
2. ITEM = STACK[3] = Z
3. TOP = 3 - 1 = 2.
4. Return ITEM = Z.
5. Exit
Observe that STACK [TOP] = STACK [2] = Y is now the top element in the Stack.
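The PUSH and POP procedures can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the chapter's own code: the class name ArrayStack and its method names are hypothetical, slot 0 of the list is left unused so that positions run from 1 to MAXSTK as in Figure 6.3, and TOP = 0 is assumed to mark the empty stack.

```python
class ArrayStack:
    """Fixed-capacity stack; positions 1..maxstk are used, as in the text."""

    def __init__(self, maxstk):
        self.maxstk = maxstk
        self.stack = [None] * (maxstk + 1)  # slot 0 is deliberately unused
        self.top = 0                        # assumption: top == 0 means empty

    def push(self, item):
        if self.top == self.maxstk:         # no free space: overflow
            raise OverflowError("OVERFLOW")
        self.top += 1                       # TOP is changed before the insertion
        self.stack[self.top] = item

    def pop(self):
        if self.top == 0:                   # nothing to delete: underflow
            raise IndexError("UNDERFLOW")
        item = self.stack[self.top]         # the item is saved first
        self.top -= 1                       # TOP is changed after the deletion
        return item


# Rebuild the stack of Figure 6.3, then push W and pop it again.
s = ArrayStack(8)
for x in ("X", "Y", "Z"):
    s.push(x)
s.push("W")        # TOP becomes 4
popped = s.pop()   # removes W, TOP goes back to 3
```

Raising an exception is one possible way to report overflow and underflow; the procedures in the text simply write a message and exit.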
A stack contains an ordered list of elements, and an array is also used to store an ordered list of
elements. Hence, it is easy to manage a stack using an array. However, the
problem with an array is that we are required to declare its size before using it to hold
the elements of a stack. We can declare the array with a maximum size large enough to
manage the stack.
We can avoid the size limitation of a stack implemented with an array by using
a linked list to hold the stack elements. As in the case of an array, we have to
decide where to insert elements in the list and where to delete them so that push and pop
run as fast as possible. Recall that, to achieve LIFO behaviour with an array, we pushed
and popped elements at the end of the array. Pushing and popping elements at the
beginning of the array would carry the overhead of shifting elements to the right to push
an element at the start and shifting elements to the left to pop an element from the start.
To avoid this overhead of shifting left and right, we decided to
push and pop elements at the end of the array. Now, if we use a linked list to implement the
stack, where will we push the element inside the list and from where will we pop the
element? There are a few facts to consider before we make any decision.
Insertion and removal in a stack should take constant time. A singly linked list can serve the
purpose, because adding or removing a node at the start of the list does not require any traversal.
Hence, the decision is to insert the element at the start in the implementation of the push
operation and remove the element from the start in the pop implementation.
TOP -> 1 -> 7 -> 5 -> 2 -> NULL
The elements present inside this stack are 1, 7, 5 and 2. The most recent element of the stack
is 1. It may be removed if the pop() is called at this point of time. This stack has four nodes
inside it which are linked in such a fashion that the very first node pointed to by the top pointer
contains the value 1. This first node with value 1 is pointing to the node with value 7. The
node with value 7 is pointing to the node with value 5 while the node with value 5 is pointing
to the last node with value 2. To make a stack data structure using a linked list, we have
We are now going to implement stack through linked list. Here are the algorithms for
Insertion
3. INFO[NEW] = ITEM
4. LINK[NEW] = TOP
5. TOP = NEW
6. Exit.
In the above algorithm, first it is checked whether there is sufficient space to insert a
new item and if the space is available, the item is inserted at the top of the stack. The
Deletion
5. Exit.
In the above algorithm, it is first checked whether the list has any elements. If
it has one or more elements, then the element at the top is deleted and correspondingly
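The linked representation of a stack can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names Node and LinkedStack are hypothetical, the attributes info and link mirror the INFO and LINK fields used in the algorithms, and Python allocates nodes itself, so the check for free space before insertion is not reproduced.

```python
class Node:
    """One node of the list: an INFO part and a LINK to the next node."""

    def __init__(self, info, link=None):
        self.info = info
        self.link = link


class LinkedStack:
    def __init__(self):
        self.top = None                  # None plays the role of NULL

    def push(self, item):
        # The new node points at the old top and becomes the new top.
        self.top = Node(item, self.top)

    def pop(self):
        if self.top is None:             # empty list: underflow
            raise IndexError("UNDERFLOW")
        item = self.top.info
        self.top = self.top.link         # the second node becomes the top
        return item


# Pushing 2, 5, 7 and 1 in this order gives TOP -> 1 -> 7 -> 5 -> 2 -> NULL.
stack = LinkedStack()
for value in (2, 5, 7, 1):
    stack.push(value)
```

Both operations touch only the first node, so each takes constant time regardless of the number of elements in the stack.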
3.4 Applications
The stacks are used in numerous applications and some of them are as follows:
Backtracking
Quicksort
section gives an algorithm which finds the value of AE by using Reverse Polish (Postfix)
Notation. We will see that the stack is an essential tool in this algorithm. We will be using the
Highest : Parentheses()
2 ^ 3 + 5 * 2 ^ 2 -12 / 6
8 + 5 * 4 - 12 / 6
Then we evaluate the multiplication and division to obtain 8 + 20 - 2. Last, we evaluate the
addition and subtraction to obtain the final result, 26. Observe that the expression is traversed
2 ^ 3 + (5 * 2) ^ 2 -12 / 6
2 ^ 3 + (10) ^ 2 – 12 / 6
Then we evaluate the exponentiation to obtain 8 + 100 - 2. Last, we evaluate the addition and
subtraction to obtain the final result, 106. It can easily be observed that inserting a pair of
parentheses has changed the result drastically. To avoid such confusion and problems,
postfix expression or prefix expression and then evaluates it. Therefore, now we will be
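The two evaluations above can be checked with a short Python sketch. It is only an illustration: Python writes exponentiation as ** instead of ^, and the variable names are hypothetical. Python applies the same three precedence levels used in the text (exponentiation, then multiplication and division, then addition and subtraction).

```python
# 2 ^ 3 + 5 * 2 ^ 2 - 12 / 6, written with Python's ** operator
without_parentheses = 2 ** 3 + 5 * 2 ** 2 - 12 / 6

# 2 ^ 3 + (5 * 2) ^ 2 - 12 / 6
with_parentheses = 2 ** 3 + (5 * 2) ** 2 - 12 / 6

print(without_parentheses, with_parentheses)
```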
Polish Notation
In mathematics, the operator symbol is placed between its two operands. For example,
This is called infix notation. With this notation, we must distinguish between
(A + B) * C and A + (B * C)
levels discussed above. Accordingly, the order of the operators and operands in an arithmetic
expression does not uniquely determine the order in which the operations are to be
performed.
(A + B) * C = [+AB] * C = * +ABC
A + (B * C) = A + [*BC] = +A*BC
The fundamental property of Polish notation is that the order in which the operations are to be
performed is completely determined by the positions of the operators and operands in the
used. The computer usually evaluates an arithmetic expression written in infix notation in two
steps. First, it converts the expression to postfix notation, and then it evaluates the postfix
expression. The postfix notation is also known as Reverse Polish Notation (RPN). Some
examples of converting an infix expression into a postfix expression are as follows:
(A + B) * C = [AB+] * C = AB+C*
A + (B * C) = A + [BC*] = ABC*+
Now we will be considering the procedure for conversion of an infix notation into postfix
converts an infix expression AE into its equivalent postfix expression PE. The algorithm uses
a stack to temporarily hold operators and left parentheses. The postfix expression PE will be
constructed from left to right using the operands from AE and the operators which are
removed from STACK. We begin by pushing a left parenthesis onto STACK and adding a
right parenthesis at the end of AE. The algorithm is completed when STACK is empty.
1. Push "(" onto STACK, and add ")" to the end of AE.
2. Scan AE from left to right and repeat Steps 3 to 6 for each element of AE until the
STACK is empty
a) Repeatedly pop from STACK and add to PE each operator from the top of STACK,
a) Repeatedly pop from STACK and add to PE each operator from top of STACK,
7. Exit.
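The conversion procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than a definitive implementation: the function name infix_to_postfix and the table PRECEDENCE are hypothetical, operands are assumed to be single characters, and an incoming operator is assumed to pop every operator of the same or higher precedence from the stack, which is the usual form of this algorithm.

```python
PRECEDENCE = {"^": 3, "*": 2, "/": 2, "+": 1, "-": 1}


def infix_to_postfix(ae):
    """Convert an infix string with single-character operands to postfix."""
    stack = ["("]                                    # push "(" onto STACK
    symbols = [ch for ch in ae if not ch.isspace()]
    symbols.append(")")                              # add ")" to the end of AE
    pe = []
    for sym in symbols:
        if sym == "(":
            stack.append(sym)
        elif sym == ")":
            # Pop operators to PE until the matching "(" is reached.
            while stack[-1] != "(":
                pe.append(stack.pop())
            stack.pop()                              # remove the "(" itself
        elif sym in PRECEDENCE:
            # Pop operators of the same or higher precedence, then push.
            while stack[-1] != "(" and PRECEDENCE[stack[-1]] >= PRECEDENCE[sym]:
                pe.append(stack.pop())
            stack.append(sym)
        else:
            pe.append(sym)                           # operand goes straight to PE
    return "".join(pe)


print(infix_to_postfix("(A + B) * C"))
```

The stack is empty exactly when the added right parenthesis has been processed, which is the stopping condition stated in the algorithm.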
Example: Let us discuss the algorithm with the help of an example. Consider the following
arithmetic infix expression:
AE: A + (B - C * (D / E ^ F))
We will convert AE into its equivalent postfix expression PE using above algorithm. First we
push "(" onto STACK, and then we add ")" to the end of AE to obtain:
AE: A + ( B - C * ( D / E ^ F ) ))
Figure 6.5 shows the status of STACK and of the string PE as each element of AE is scanned.
No.  Symbol Scanned  STACK       Expression PE
1    A               (           A
2    +               (+          A
3    (               (+(         A
4    B               (+(         AB
5    -               (+(-        AB
6    C               (+(-        ABC
7    *               (+(-*       ABC
8    (               (+(-*(      ABC
9    D               (+(-*(      ABCD
10   /               (+(-*(/     ABCD
11   E               (+(-*(/     ABCDE
12   ^               (+(-*(/^    ABCDE
13   F               (+(-*(/^    ABCDEF
14   )               (+(-*       ABCDEF^/
15   )               (+          ABCDEF^/*-
16   )                           ABCDEF^/*-+
After converting an infix expression into an equivalent postfix expression, the next thing to be
done is to evaluate the postfix expression. Interestingly when we evaluate the postfix
Suppose PE is an arithmetic expression written in postfix notation. The following algorithm
uses a STACK to hold operands and after evaluating PE the result is stored in a variable
named as RESULT.
2. Scan PE from left to right and repeat Steps 3 and 4 for each element of PE until the
a) Remove the two top elements of STACK, where A is the top element and B is
b) Evaluate B (x) A.
5. RESULT = TOP[STACK]
6. Exit.
Example : Let us consider the following arithmetic expression AE written in infix notation:
AE: 1 + 2 – 3 * (4 /2)
PE: 1, 2, +, 3, 4, 2, /, *, -
We will now evaluate PE using EVAL algorithm. First we add a sentinel right parenthesis at
PE: 1, 2, +, 3, 4, 2, /, *, -, )
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
Figure 6.6 shows the contents of STACK as each element of PE is scanned. The final number
in STACK, -3, which is assigned to RESULT when the sentinel “)” is scanned, is the value of
PE.
No.  Symbol Scanned  STACK
1    1               1
2    2               1, 2
3    +               3
4    3               3, 3
5    4               3, 3, 4
6    2               3, 3, 4, 2
7    /               3, 3, 2
8    *               3, 6
9    -               -3
10   )
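The evaluation of a postfix expression can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name eval_postfix and the table OPS are hypothetical, the expression is supplied as a list of string tokens, and operands are converted to floating-point numbers.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "^": operator.pow,
}


def eval_postfix(pe):
    """Evaluate a postfix expression given as a list of tokens."""
    stack = []
    for sym in pe + [")"]:           # ")" is the sentinel added at the end
        if sym == ")":
            break                    # the sentinel ends the scan
        if sym in OPS:
            a = stack.pop()          # A is the top element
            b = stack.pop()          # B is the next-to-top element
            stack.append(OPS[sym](b, a))   # evaluate B (operator) A
        else:
            stack.append(float(sym))       # operands are pushed
    return stack[-1]                 # RESULT is the top of STACK


result = eval_postfix(["1", "2", "+", "3", "4", "2", "/", "*", "-"])
print(result)
```

The order of the two popped operands matters for subtraction and division, which is why B (the older element) is the left operand.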
3.4.2 Recursion
algorithms can be best described in terms of recursion. Let us discuss how recursion may
prove to be a useful tool in developing algorithms for specific problems. When a procedure
contains either a Call statement to itself or a Call statement to a second procedure that may
eventually call back to the original procedure, that procedure is called a recursive
procedure.
1) There must be certain criteria, called base criteria, for which the procedure does not call
itself.
2) Each time the procedure does call itself (directly or indirectly); it must be closer to the
base criteria.
function is said to be recursively defined if the function definition refers to itself. If a function
calls itself without any base criteria, then the function is recursively
defined but not well-defined. The following examples should help to clarify these ideas.
The product of the positive integers from 1 to n, inclusive, is called "n factorial" and is
This is true for every positive integer n; that is, n! = n . (n - 1)!. As is visible from the definition
of the factorial function, the function calls itself with a different argument, i.e. n - 1 instead of
n. The function is also well-defined, since with every step it moves closer to the
base criteria, which is 1! or 0!, each of which equals 1. Accordingly, the factorial function may also
be defined as follows:
Example: Let us calculate 4! using the recursive definition. The calculation proceeds through the
following steps:
(1) 4! = 4 . 3!
(2) 3! = 3 . 2!
(3) 2! = 2 . 1!
(4) 1! = 1 . 0!
(5) 0! = 1
(6) 1! = 1 . 1 = 1
(7) 2! = 2 . 1 = 2
(8) 3! = 3 . 2 = 6
(9) 4! = 4 . 6 = 24
Let us now write the procedure that calculates n!. This procedure calculates n! and returns the
3. FACT = N * FACT.
4. Return.
The above procedure is a recursive procedure, since it contains a call statement to itself.
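The recursive definition of the factorial function can be sketched in Python as follows. This is a minimal sketch; the function name factorial is hypothetical and n is assumed to be a non-negative integer.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n == 0:                       # base criteria: 0! = 1, no further call
        return 1
    return n * factorial(n - 1)      # each call moves closer to the base criteria


print(factorial(4))
```

Each pending call waits for the result of the next one, which is why the language runtime keeps the pending calls on a stack.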
3.4.3 Quicksort
This is one of the most widely used internal sorting algorithms. In its basic form, it
was invented by C.A.R. Hoare in 1960. Its popularity lies in the ease of implementation,
moderate use of resources and acceptable behaviour for a variety of sorting cases. The basis
of quicksort is the divide and conquer strategy, i.e. divide the problem [list to be sorted] into
sub-problems [sub-lists] until solved sub-problems [sorted sub-lists] are found. This is
implemented as follows:
Choose one item called pivot element A[I] from the list A[ ]. Generally the pivot element is
Rearrange the list so that this item is in the proper position, i.e., all preceding items have a
lesser value and all succeeding items have a greater value than this item.
Repeat steps 1 & 2 for sublist1 & sublist2 till A[ ] is a sorted list.
As can be seen, this algorithm has a recursive structure. This is usually implemented as
follows:
2. From the left end of the list (A[0] onwards) scan till an item A[R] is found whose value is
3. From the right end of list [A[N] backwards] scan till an item A[L] is found whose value is
5. Continue steps 2, 3 & 4 till the scan pointers cross. Stop at this stage.
Example: We will illustrate the quick sort with the help of an example
40, 30, 10, 50, 70, 90, 44, 60, 99, 20, 80, 60
Suppose we choose 40, the first element, as the pivot element. Then, beginning with the last
number, 60, scan the list from right to left, comparing each number with 40 and stopping at
the first number less than 40. The number is 20. Interchange 40 and 20 to obtain the list
20, 30, 10, 50, 70, 90, 44, 60, 99, 40, 80, 60
Now it is clearly visible that each number to the right of 40 is greater than 40. Now, beginning
with 20, scan the list in the opposite direction, from left to right, comparing each number
with 40 and stopping at the first number greater than 40. The number is 50. Interchange 40
and 50 to obtain the list
20, 30, 10, 40, 70, 90, 44, 60, 99, 50, 80, 60
Now again it is visible that all the numbers on the left side of 40 are less than 40 and the
numbers on the right of 40 are greater than 40. Therefore we finally obtain the following list
20, 30, 10, 40, 70, 90, 44, 60, 99, 50, 80, 60
Now 40 is in its final position, dividing the list into two sublists: 20, 30, 10 and 70, 90, 44, 60,
99, 50, 80, 60.
The above procedure is repeated for both the sublists until all the elements are sorted. The
sorting can be accomplished using two stacks LOWER and UPPER with each stack
maintaining the lower and upper bounds of each list or sublist until the stack is empty. The
RIGHT = RIGHT – 1
(iii) Go to Step3
LEFT = LEFT + 1
(iii) Go to Step2
Algorithm: Quicksort(A, N)
1. TOP = NULL
8. Exit
The above algorithm is divided into two parts: first a procedure is written which finds the
final location of pivot element; second is the actual sorting algorithm which sorts the given
The quicksort algorithm uses O(N log2 N) comparisons on average. The performance can
1. Switch to a faster sorting scheme like insertion sort when the sublist size becomes
comparatively small.
In the worst case, the quicksort algorithm uses O(N^2) comparisons. The worst case occurs
when the input list is already sorted and the pivot element is always picked as the first
element of the sublists. But this occurs only as a special case; therefore the complexity
of quicksort is assumed to be O(N log2 N). The chances of the worst case can be minimized if
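The two parts of quicksort described in this section, placing the pivot and then sorting with the stacks LOWER and UPPER, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the chapter's exact procedure: the function names partition and quicksort are hypothetical, the list is indexed from 0, the first element of each sublist is taken as the pivot, and Python lists serve as the two stacks.

```python
def partition(a, beg, end):
    """Move a[beg] to its final position and return that position."""
    left, right, loc = beg, end, beg
    while True:
        # Scan from right to left for an element smaller than the pivot.
        while a[loc] <= a[right] and loc != right:
            right -= 1
        if loc == right:
            return loc
        a[loc], a[right] = a[right], a[loc]
        loc = right
        # Scan from left to right for an element larger than the pivot.
        while a[left] <= a[loc] and left != loc:
            left += 1
        if loc == left:
            return loc
        a[left], a[loc] = a[loc], a[left]
        loc = left


def quicksort(a):
    """Sort list a in place using two stacks of sublist bounds."""
    lower, upper = [], []
    if len(a) > 1:
        lower.append(0)
        upper.append(len(a) - 1)
    while lower:                          # until the stacks are empty
        beg, end = lower.pop(), upper.pop()
        loc = partition(a, beg, end)
        if beg < loc - 1:                 # left sublist has 2 or more elements
            lower.append(beg)
            upper.append(loc - 1)
        if loc + 1 < end:                 # right sublist has 2 or more elements
            lower.append(loc + 1)
            upper.append(end)
    return a


data = [40, 30, 10, 50, 70, 90, 44, 60, 99, 20, 80, 60]
print(quicksort(list(data)))
```

Applied to the example list, the first call to partition performs the same two interchanges as the worked example and leaves 40 in the fourth position.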
4. Summary
A stack is a list in which retrieval, insertion and deletion take place at the same position.
It follows the last-in first-out (LIFO) mechanism. In this chapter, we have studied how
stacks are implemented using arrays and using linked lists. The advantages and
disadvantages of these two schemes were also discussed. For example, when a stack is
implemented using an array, it suffers from the basic limitation of an array (fixed memory). To
overcome this problem, stacks are implemented using linked lists. Various applications of
stacks such as recursion, Polish notation and quicksort were also discussed. All of these
Translate, by inspection and hand, each infix expression into its equivalent postfix
H/K)
P: 12, 7, 3, -, /, 2, 1, 5, +, *, +
DATASTRUCTURES
Suppose the characters in S are to be sorted alphabetically. Use the quicksort algorithm
ABCDE
Find the number of comparisons to sort S using quicksort. What general conclusions can
Suppose the Fibonacci numbers F11 = 89 and F12 = 144 are given,
Write an iterative procedure to obtain the first N Fibonacci numbers F[1], F[2], …,
Write a procedure to obtain the capacity of a linked stack represented by its top pointer
TOP. The capacity of a linked stack is the number of elements in the list forming the
stack.
6+2^3+9/3–4*5
Explain the quicksort algorithm and program with the help of an example.
Lesson : 7 Writer : Dr. Pardeep Kumar Mittal
Structure:
1. Introduction
2. Objective
3. Presentation of Contents
3.1 Queue
3.3.1 Insertion
3.3.2 Deletion
3.4 Dequeue
4. Summary
1. Introduction
Queues are a very useful data structure in computer science. A queue is a data
structure which allows elements to be inserted at one end, called the rear, and deleted at the other end,
called the front. A queue is also known as a FIFO data structure. In a FIFO data structure, the first element
added to the queue will be the first one to be removed. This is equivalent to the requirement that once
a new element is added, all elements that were added before have to be removed before the new
element can be removed. A queue is an example of a linear data structure, or more abstractly a
sequential collection.
2. Objectives
At the end of this chapter, the reader must be able to understand linear data structure queue.
In this chapter we will study the computer representation of queue. The basic operations on
queue i.e. insertion & deletion operation in queue will also be discussed in this chapter. Then the
various types of queues are discussed in detail. In the end, various applications of queue are
described.
3. Presentation of Contents
3.1 Queue
Queues are data structures that allow insertion and deletion operations only at
A queue is a linear structure in which elements may be inserted at one end, called the rear, and
deleted at the other end, called the front. Figure 7.1 pictures a queue of people waiting for their turn.
Queues are also called First-In First-Out (FIFO) lists. An important example of a queue in computer
science occurs in a timesharing system, in which programs with the same priority form a queue while
waiting to be executed.
Figure 7.1: People waiting for their turn
Computer science also has common examples of queues. Our computer laboratory has 30 computers
networked with a single printer. When students want to print, their print tasks “get in line” with all
the other printing tasks that are waiting. The first task in is the next to be completed. If you are last in
line, you must wait for all the other tasks to print ahead of you.
In addition to printing queues, operating systems use a number of different queues to control
processes within a computer. The scheduling of what gets done next is typically based on a queuing
algorithm that tries to execute programs as quickly as possible and serve as many users as it can.
Also, as we type, sometimes keystrokes get ahead of the characters that appear on the screen. This is
due to the computer doing other work at that moment. The keystrokes are being placed in a queue-
like buffer so that they can eventually be displayed on the screen in the proper order.
Queues may be represented in the computer in two ways i.e. by means of linear
arrays and one-way linked lists. First we will be considering representation of queues with the
help of arrays.
Queues will be maintained by a linear array QUEUE and two pointer variables: FRONT,
containing the location of the front (first) element of the queue; and REAR, containing the
location of the rear (last) element of the queue. The condition FRONT = NULL will indicate that the queue
is empty.
Operation      FRONT  REAR  QUEUE (positions 1 to N)
Empty queue    NULL   NULL  (empty)
Insert A       1      1     A
Insert B       1      2     A B
Insert C       1      3     A B C
Delete         2      3     _ B C
As indicated in fig. 7.2, when there is no element, front and rear are both equal to NULL.
If an element is inserted, then front and rear both become 1. If another item is inserted, then
rear changes to 2, and after another insertion rear becomes 3. Now if an element is deleted, then
front changes to 2.
Now we will be considering representation of queues with the help of linked lists.
In this case the representation is very much similar to the one-way linked list, except that the
address of the starting node is saved as FRONT and the address of the last node is termed
REAR. Also, this linked list is a restricted linked list in the sense that insertions can take place only
at the REAR and deletions can take place only at the FRONT. The linked representation is shown
in fig. 7.3.
FRONT -> A -> B -> C -> D -> NULL (REAR points to the last node, D)
The memory representation of queue via linked-list can be shown in the fig. 7.4.
FRONT = 5, REAR = 3, AVAIL = 1
Location  INFO  LINK
1               4
2         B     3
3         C     NULL
4               6
5         A     2
6               7
7               8
8               NULL
3.3.1 Insertion
Figure 7.2 indicated some of the ways in which elements are
inserted in the queue and deleted from the queue. Whenever an
element is added to the queue, the value of REAR is increased by 1; this can be implemented by
the assignment
REAR = REAR + 1
Generally, QUEUE is maintained as a circular array, that is, QUEUE[1] comes
after QUEUE[N] in the array. With this assumption, if we insert ITEM into the queue by
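[Figure: a circular array with positions 1 to N. Before the insertion, FRONT = 3 and REAR = N, with C, D, E, F, G, H stored from position 3 onwards. After I is inserted, FRONT = 3 and REAR = 1, with I stored in position 1.]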
The operation of adding an item into a queue is implemented by the following algorithm, called
QINSERT
Else if REAR = N, then
REAR = 1.
Else
REAR = REAR + 1.
4. Exit.
The above algorithm inserts an element into a queue implemented using an array. In
this algorithm, the first step checks for available space for the new element. If no
space is available, then the overflow condition occurs. Otherwise control flows to the second
step, in which the location where the new element is to be inserted is found. In the third step, the actual
insertion is made.
The insertion in a queue can also be implemented using linked list. The following algorithm
Else
5. Exit
The above algorithm works in a similar manner to that of previous algorithm. The only
3.3.2 Deletion
As shown in fig 7.2, whenever an element is deleted from the queue, the
FRONT = FRONT + 1
QUEUE is assumed to be circular, that is, QUEUE[1] comes after QUEUE[N] in the
array. With this assumption, if FRONT = N and an element of QUEUE is deleted, we reset
FRONT = 1 instead of increasing FRONT to N + 1, as shown in fig. 7.6(a) and 7.6(b).
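[Figure 7.6(a): FRONT = N and REAR = 2, with I and J in positions 1 and 2 and H in position N. Figure 7.6(b): after H is deleted, FRONT = 1 and REAR = 2, with I and J in positions 1 and 2.]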
Suppose that our queue contains only one element, i.e., suppose that
FRONT = REAR = 1
to indicate that the queue is empty and this operation can be depicted by fig. 7.7(a) and 7.7(b).
[Figure 7.7(a): FRONT = REAR = 1, with A in position 1 of an array with positions 1 to N. Figure 7.7(b): the same array, empty after A is deleted.]
The operation of removing an item from a queue is implemented by the following algorithm,
called QDELETE
2. ITEM = QUEUE[FRONT].
FRONT = 1.
Else
FRONT = FRONT+1.
4. Return ITEM
5. Exit
The above algorithm deletes an element from a queue implemented using an array. In
the first step, it is checked whether any element exists in the queue. If no element is
available in the array, then the underflow condition occurs. In the second step, the element at the
FRONT is saved in a variable. In the next step, the element is actually deleted.
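Insertion into and deletion from a circular array queue can be sketched together in Python as follows. This is a minimal sketch under stated assumptions: the class name CircularQueue and its method names are hypothetical, slot 0 of the list is left unused so that positions run from 1 to N, None plays the role of NULL, and the overflow test (FRONT = 1 and REAR = N, or FRONT = REAR + 1) is the usual one for this representation rather than one quoted from the text.

```python
class CircularQueue:
    def __init__(self, n):
        self.n = n
        self.queue = [None] * (n + 1)    # slot 0 unused, positions 1..n
        self.front = None                # None plays the role of NULL
        self.rear = None

    def insert(self, item):
        full = (self.front == 1 and self.rear == self.n) or (
            self.front is not None and self.front == self.rear + 1
        )
        if full:
            raise OverflowError("OVERFLOW")
        if self.front is None:           # queue was empty
            self.front = self.rear = 1
        elif self.rear == self.n:        # wrap around to position 1
            self.rear = 1
        else:
            self.rear += 1
        self.queue[self.rear] = item

    def delete(self):
        if self.front is None:
            raise IndexError("UNDERFLOW")
        item = self.queue[self.front]
        if self.front == self.rear:      # the queue had only one element
            self.front = self.rear = None
        elif self.front == self.n:       # wrap around to position 1
            self.front = 1
        else:
            self.front += 1
        return item


q = CircularQueue(3)
for x in ("A", "B", "C"):
    q.insert(x)
first = q.delete()    # A leaves from the front
q.insert("D")         # REAR wraps around from 3 to 1
```

After these operations FRONT is 2 and REAR is 1, so the queue is full again even though REAR is numerically smaller than FRONT.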
The deletion in a queue can also be implemented using linked list. The following algorithm
2. TEMP = FRONT
3. ITEM = INFO[TEMP]
4. FRONT = LINK[TEMP]
6. Exit
The above algorithm works in a similar manner to that of previous algorithm. The only
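The linked representation of a queue can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names Node and LinkedQueue are hypothetical, info and link mirror the INFO and LINK fields, and Python allocates nodes itself, so no list of free nodes is maintained.

```python
class Node:
    def __init__(self, info):
        self.info = info
        self.link = None


class LinkedQueue:
    def __init__(self):
        self.front = None                # None plays the role of NULL
        self.rear = None

    def insert(self, item):
        new = Node(item)
        if self.front is None:           # empty queue: new node is also the front
            self.front = new
        else:
            self.rear.link = new         # attach after the current last node
        self.rear = new

    def delete(self):
        if self.front is None:
            raise IndexError("UNDERFLOW")
        temp = self.front
        item = temp.info
        self.front = temp.link           # the second node becomes the front
        if self.front is None:           # the queue is now empty
            self.rear = None
        return item


lq = LinkedQueue()
for x in ("A", "B", "C"):
    lq.insert(x)
```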
3.4 Dequeue
A deque is a linear list in which elements can be added or removed at either end
but not in the middle. The term deque refers to the name double-ended queue.
There are two variations of a deque, namely an input-restricted deque and an output-restricted
deque, which are intermediate between a deque and a queue. An input-restricted deque is a
deque which allows insertions at only one end of the list but allows deletions at both ends of the
list; and an output-restricted deque is a deque which allows deletions at only one end of the list
but allows insertions at both ends of the list.
Figure 7.8 pictures a deque with 4 elements maintained in an array with N = 8 memory locations. The
condition FRONT = NULL will be used to indicate that a deque is empty. In a deque, FRONT and REAR
are maintained as in a normal queue. The only difference is that insertions and deletions can be done at
both ends.
FRONT -> A B C D <- REAR (insertion and deletion are possible at both ends)
Figure 7.8 (Representation of a deque)
A deque can also be represented using linked lists as in fig. 7.3 and 7.4, again the difference
simplicity that array is not maintained as a circular array and therefore the algorithm for
Else
REAR = REAR + 1.
4. Exit.
The second algorithm i.e. insertion at the front can be written as:
Algorithm : INSERTATFRONT (DEQUE, N, FRONT, REAR, ITEM)
1. If FRONT = 1 then Write “Can not insert at front end” and Exit.
Else
FRONT = FRONT - 1.
4. Exit.
If the deque is represented as a linked-list, then the algorithm for insertion at rear remains same
as it was written earlier in the section 3.3.1 for a simple queue. But when the insertion is to be
Else
FRONT = NEW
5. Exit
Deletion: With the assumption that array is not maintained as a circular array, the algorithm for
1. If FRONT = NULL then Write UNDERFLOW and Exit.
2. ITEM = DEQUE[FRONT].
Else
FRONT = FRONT+1.
4. Exit
The second algorithm i.e. deletion at the rear can be written as:
2. ITEM = DEQUE[REAR].
Else
REAR = REAR – 1.
Return ITEM
Exit
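The four array operations on a deque can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the class name ArrayDeque and its method names are hypothetical, the array is not treated as circular (as in the text), slot 0 is unused so that positions run from 1 to N, and an insertion into an empty deque is assumed to set FRONT = REAR = 1.

```python
class ArrayDeque:
    def __init__(self, n):
        self.n = n
        self.deque = [None] * (n + 1)    # slot 0 unused, positions 1..n
        self.front = None                # None plays the role of NULL
        self.rear = None

    def insert_at_rear(self, item):
        if self.rear == self.n:
            raise OverflowError("cannot insert at rear end")
        if self.front is None:           # empty deque (assumed to start at 1)
            self.front = self.rear = 1
        else:
            self.rear += 1
        self.deque[self.rear] = item

    def insert_at_front(self, item):
        if self.front == 1:
            raise OverflowError("cannot insert at front end")
        if self.front is None:
            self.front = self.rear = 1
        else:
            self.front -= 1
        self.deque[self.front] = item

    def delete_at_front(self):
        if self.front is None:
            raise IndexError("UNDERFLOW")
        item = self.deque[self.front]
        if self.front == self.rear:      # last remaining element
            self.front = self.rear = None
        else:
            self.front += 1
        return item

    def delete_at_rear(self):
        if self.front is None:
            raise IndexError("UNDERFLOW")
        item = self.deque[self.rear]
        if self.front == self.rear:
            self.front = self.rear = None
        else:
            self.rear -= 1
        return item


d = ArrayDeque(8)
for x in ("A", "B", "C", "D"):
    d.insert_at_rear(x)
```

Because the array is not circular, an insertion at the front is refused whenever FRONT = 1, even if there is free space at the other end.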
Now again if the deque is represented as a linked-list, then the algorithm for deletion at front
remains same as it was written earlier in the section 3.3.2 for a simple queue. But when the
2. TEMP = REAR
3. ITEM = INFO[TEMP]
SAVE = PTR
PTR = LINK[PTR]
9. Return ITEM
10. Exit
In the above algorithm, the main task is to find the address of the node previous to the rear node, if
it exists. For this purpose we move from FRONT to REAR, as we cannot move backward
from REAR. We also maintain a variable SAVE for the purpose of getting a node
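Deletion at the rear of a linked deque can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names Node and delete_at_rear are hypothetical, and the function returns the deleted item together with the updated FRONT and REAR instead of changing global variables. The variables ptr and save correspond to PTR and SAVE in the algorithm.

```python
class Node:
    def __init__(self, info, link=None):
        self.info = info
        self.link = link


def delete_at_rear(front, rear):
    """Return (item, new_front, new_rear) after removing the rear node."""
    if front is None:
        raise IndexError("UNDERFLOW")
    item = rear.info
    if front is rear:                # only one node: the deque becomes empty
        return item, None, None
    ptr = front
    save = None
    while ptr is not rear:           # walk forward from FRONT to REAR
        save = ptr                   # save trails one node behind ptr
        ptr = ptr.link
    save.link = None                 # the node before REAR is the new last node
    return item, front, save


# Build FRONT -> A -> B -> C -> NULL and delete at the rear.
c = Node("C")
b = Node("B", c)
a = Node("A", b)
item, front, rear = delete_at_rear(a, c)
```

The walk from FRONT makes this operation take time proportional to the length of the list, unlike deletion at the front.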
A priority queue is a collection of elements such that each element has been assigned
a priority and such that the order in which elements are deleted and processed comes from the
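following rules:
1) An element of higher priority is processed before any element of lower priority.
2) Two elements with the same priority are processed according to the order in which
they were added to the queue.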
There are various ways for maintaining a priority queue in computer memory. The two main
ways are using one-way list and array representation. We will be discussing each of them in
brief.
One way to maintain a priority queue in computer memory is by using one-way list, as follows:
Each node in the list will consist of three parts: INFO, which contains the
information field; PRN, which stores the priority number of the
node; and LINK, the link field, which stores the address of the next node.
A node X will precede a node Y in the list (1) when X has higher priority than Y or (2)
when both have the same priority but X was added to the list before Y.
The representation of priority queues using linked-list can be shown as in fig. 7.9
FRONT -> (A, 1) -> (B, 2) -> (C, 2) -> (D, 5) -> NULL (each node shows INFO and PRN; REAR points to the last node)
One can insert or delete an element in a priority queue using above representation by the
following algorithms:
Algorithm: Delete_Priority
ITEM = INFO[FRONT]
Process ITEM
Exit
The above algorithm can also be written as similar to LINK_DELETE algorithm in section 3.3.2
Algorithm: Insert_Priority
1. Traverse the one-way list until finding a node X whose priority number exceeds N. Insert
2. If no such node is found, insert ITEM as the last element in the list.
Here PRN indicates the priority number of the various nodes and PRT indicates the priority
Else
10. Exit
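The one-way list representation of a priority queue can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names PNode and LinkedPriorityQueue are hypothetical, a smaller priority number is assumed to mean a higher priority (as in fig. 7.9), and REAR is not maintained because only FRONT is needed for these two operations.

```python
class PNode:
    def __init__(self, info, prn):
        self.info = info             # INFO: the element itself
        self.prn = prn               # PRN: the priority number
        self.link = None             # LINK: the next node


class LinkedPriorityQueue:
    def __init__(self):
        self.front = None

    def insert(self, item, n):
        new = PNode(item, n)
        # Insert at the front if the list is empty or the first node's
        # priority number already exceeds n.
        if self.front is None or self.front.prn > n:
            new.link = self.front
            self.front = new
            return
        # Otherwise stop just before the first node whose priority number
        # exceeds n, so that equal priorities keep their arrival order.
        ptr = self.front
        while ptr.link is not None and ptr.link.prn <= n:
            ptr = ptr.link
        new.link = ptr.link
        ptr.link = new

    def delete(self):
        if self.front is None:
            raise IndexError("UNDERFLOW")
        item = self.front.info       # the first node has the highest priority
        self.front = self.front.link
        return item


pq = LinkedPriorityQueue()
for info, prn in (("B", 2), ("D", 5), ("A", 1), ("C", 2)):
    pq.insert(info, prn)
```

Deletion takes constant time in this representation, while insertion may have to traverse the whole list.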
Array Representation
Another way to maintain a priority queue is to use a separate queue for each level of
priority. Each such queue will appear in its own circular array and will have its own pair of
pointers, i.e. FRONT and REAR. Each array will be allocated same amount of space. Actually a
Priority  FRONT  REAR  Elements in the queue
1         1      2     A B
2         2      2     C
3         2      2     D
4         3      4     E F
5         NULL   NULL  (empty)
One can insert or delete an element in a priority queue using above representation by the
following algorithms:
Algorithm: Delete_ArrPrior
3. Exit
Algorithm: Insert_ArrPrior
2. Exit.
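The idea of keeping a separate queue for each priority level can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the class name LevelPriorityQueue is hypothetical, priority 1 is assumed to be the highest, and Python's collections.deque stands in for each circular array with its FRONT and REAR pointers, so the rows are not limited in size as real arrays would be.

```python
from collections import deque


class LevelPriorityQueue:
    def __init__(self, levels):
        # rows[k - 1] is the queue holding the elements of priority k
        self.rows = [deque() for _ in range(levels)]

    def insert(self, item, m):
        """Insert item with priority number m at the rear of row m."""
        self.rows[m - 1].append(item)

    def delete(self):
        """Delete the front element of the first non-empty row."""
        for row in self.rows:
            if row:
                return row.popleft()
        raise IndexError("UNDERFLOW")


apq = LevelPriorityQueue(5)
for info, prn in (("E", 4), ("A", 1), ("C", 2), ("B", 1), ("D", 3), ("F", 4)):
    apq.insert(info, prn)
```

Insertion needs no searching in this representation, at the cost of reserving space for every priority level.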
Many applications involving queues require priority queues rather than the simple
FIFO strategy. For elements of the same priority, the FIFO order is used. For example, in a multiuser
system, there will be several programs competing for use of the central processor at one time.
The programs have a priority value associated with them and are held in a priority queue. The
program with the highest priority is given first use of the central processor.
Scheduling of jobs within a time-sharing system is another application of queues. In such a system,
many users may request processing at a time, and computer time is divided among these requests.
The simplest approach sets up one queue that stores all requests for processing. The computer
processes the request at the front of the queue and finishes it before starting on the next. The same
approach is also used when several users want to use the same output device, say a printer. In a
time-sharing system, another common approach is to process a job only for a specified
maximum length of time. If the program is fully processed within that time, then the computer
goes on to the next process. If the program is not completely processed within the specified time,
the intermediate values are stored and the remaining part of the program is put back on the
queue. This approach is useful in handling a mix of long and short jobs.
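The time-slice approach described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name round_robin, the job names and the processing times are all hypothetical, and collections.deque is used as the queue of waiting jobs.

```python
from collections import deque


def round_robin(jobs, time_slice):
    """jobs is a list of (name, required_time); return the completion order."""
    queue = deque(jobs)
    finished = []
    while queue:
        name, remaining = queue.popleft()      # job at the front of the queue
        if remaining <= time_slice:
            finished.append(name)              # fully processed within the slice
        else:
            # Not finished: the remaining part goes back on the queue.
            queue.append((name, remaining - time_slice))
    return finished


order = round_robin([("P1", 5), ("P2", 2), ("P3", 4)], time_slice=2)
print(order)
```

The short job P2 finishes first even though the long job P1 arrived before it, which is the benefit for a mix of long and short jobs.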
1) Serving requests of a single shared resource (printer, disk, CPU), transferring data
asynchronously (data not necessarily received at same rate as sent) between two
2) Call center phone systems will use a queue to hold people in line until a service
representative is free.
3) Buffers on MP3 players and portable CD players, iPod playlist. Playlist for jukebox - add
4) When programming a real-time system that can be interrupted (e.g., by a mouse click or
proceeding with the current activity. If the interrupts should be handled in the same order
6) When data is transferred asynchronously (data not necessarily received at same rate as
sent) between two processes. Examples include IO Buffers, pipes, file IO, etc.
4. Summary
In this chapter, we discussed the queue data structure. It has two ends: the front, from
which elements are deleted, and the rear, at which elements are added. It
follows first-in first-out (FIFO) order. A queue can be implemented using arrays or linked lists.
Each representation has its own advantages and disadvantages. The problem with arrays
is that they are limited in space; hence, the queue has a limited capacity. If queues are
implemented using linked lists, then this problem is solved, and there is no fixed limit on the
capacity of the queue. The only overhead is the memory occupied by the pointers.
There are a number of variants of queues. Normally, queues mean circular queues. A special
type of queue called the deque was also discussed in this chapter. Deques permit elements to be
added or deleted at either the rear or the front. We also discussed the array and linked list
implementations of the deque. Priority queues were also discussed in detail in this chapter.
Queues are employed in many situations. The items in a queue may be vehicles waiting at a
crossing, cars waiting at a service station, customers in a check-out line at a departmental store,
etc. In computer science, queues are generally used in operating systems, networking
simulation, etc.
Suppose each data structure is stored in a circular array with N memory cells.
Find the number of elements in a queue in terms of FRONT and REAR.
(a) Suppose an element is added to the deque. How is LEFT or RIGHT changed?
Describe the queue, including FRONT and REAR as the following operations take place:
(a) X is added, (b) two elements are deleted (c) Y is added (d) Z is added (e) three elements are
Suppose a queue is maintained by a circular queue QUEUE with N = 12 memory cells. Find
the number of elements in QUEUE if (a) FRONT = 4, REAR = 8, (b) FRONT = 10, REAR
= 3 and (c) FRONT = 5, REAR = 6 and then two elements are deleted.
Compare the array and linked representation of a queue. Explain your answer.
Write the pseudocode for the various possible operations in an input restricted deque.
Write the pseudocode for the various possible operations in an output restricted deque.