Data Structure

This document provides an introduction to data structures. It discusses elementary data organization including fields, records, and files. It then defines data structures and explains why they are important. The document outlines different types of data structures including primitive structures like integers, floats, characters, pointers, and booleans. It also discusses linear and non-linear composite data structures. Finally, it covers abstract data types, the relationship between data types and structures, and analyzing algorithms based on time and space complexity.

Lesson: 1
Writer: Dr. Pardeep Kumar Mittal
Title: Data Structure: An Introduction
Vetter: Prof. Rakesh Kumar

Structure:
1. Introduction
2. Objective
3. Presentation of Contents
3.1 Elementary Data Organization
3.2 Data Structures
3.2.1 Why Data Structure?
3.2.2 Types of Data Structures
3.3 Abstract Data Type
3.4 Data Types, Data Structures and Abstract Data Types
3.5 Operations on Data Structures
3.6 Algorithm
3.6.1 Analysis of Algorithm
3.6.2 Complexity
3.6.2.1 Asymptotic Analysis
3.6.2.2 Tradeoff between space and time complexity
3.6.3 Measuring the Running Time of a Program
3.6.4 Time Complexity
3.6.5 Space Complexity
3.6.6 Comparison of Complexities
4. Summary
5. Suggested Readings/Reference Material
6. Self Assessment Questions (SAQ)

1. Introduction

It is important for every Computer Science student to understand the concept of

information and how it is organized or how it can be utilized.

Data structures are the building blocks of a program, like the pillars of a large structure. If a program is built on unsuitable data structures, it may not always work as expected, so it is important to choose the right data structures for a program.

When software is developed, its time and space requirements must be treated as essential parameters. A program that works but takes too long to produce its output may never be used. The same holds for space: a program should not occupy more than a given amount of memory. These two parameters are technically termed time complexity and space complexity, and every program or algorithm must be analyzed for both.

Data structures enable a programmer to structure a program in such a way that the data are

represented in the same way as they are represented in real life.

2. Objectives

At the end of this chapter the reader should understand the basic concept of a data structure and know the various primitive and composite data structures. The common operations performed on different data structures are also explained. Because data structures are closely tied to the algorithms that operate on them, it is important to understand how to design an efficient algorithm. This chapter therefore discusses time and space complexity in detail, which helps in creating efficient algorithms.

3. Presentation of Contents

3.1 Elementary Data Organization

Data are values or sets of values, such as statistics. A data item refers to a single unit of data. Data can be either raw or processed. Data items are of two kinds: group items and elementary items. For example, the name of a person may be treated as a group item, since it is a combination of first name, middle name and surname, whereas an employee number in an employee record cannot be divided further and is therefore an elementary item.

If we arrange data in an appropriate sequence, it forms a structure and acquires meaning. This meaning is called information. Information is also known as processed data.

In computer science, data can be organized into a hierarchy of fields, records and files. To understand this hierarchy, let us introduce some more concepts.

An entity is something that has certain properties or attributes, each of which may be assigned a value that is either numeric or non-numeric. For example, an employee may be treated as an entity whose attributes are name, age, sex, etc. Entities with similar attributes, such as all the employees in a university, form an entity set.

The way data are organized into the hierarchy above reflects the relationship between attributes, entities and entity sets. A field is a single elementary unit of information representing an attribute of an entity, a record is the collection of field values of a given entity, and a file is the collection of records of the entities in a given entity set.
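
The field, record and file hierarchy can be sketched in C as follows. This is a minimal illustration, not part of the original lesson: the names employee, employee_file, file_add and FILE_CAPACITY, the field sizes, and the choice of an in-memory array to stand for a file are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One record: the field values of a single employee entity.
   Field names and sizes are illustrative assumptions. */
struct employee {
    int  number;     /* elementary item: cannot be divided further */
    char name[32];   /* field for the "name" attribute */
    int  age;        /* field for the "age" attribute */
};

#define FILE_CAPACITY 100

/* A "file" is modelled here as an array of records of one entity set. */
struct employee_file {
    struct employee records[FILE_CAPACITY];
    size_t count;    /* number of records currently stored */
};

/* Append one record; returns 0 on success, -1 if the file is full. */
int file_add(struct employee_file *f, int number, const char *name, int age)
{
    assert(f != NULL && name != NULL);
    if (f->count == FILE_CAPACITY)
        return -1;
    struct employee *e = &f->records[f->count++];
    e->number = number;
    strncpy(e->name, name, sizeof e->name - 1);
    e->name[sizeof e->name - 1] = '\0';   /* always terminate the string */
    e->age = age;
    return 0;
}
```

Each member of struct employee is a field, one struct employee value is a record, and the array inside struct employee_file plays the role of the file.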

This organization of data into fields, records and files is not sufficient to maintain and efficiently process all types of data collections. For this reason, data are also organized into more complex types of structures. The study of these structures forms the basis of the subject of data structures and comprises the following three steps:

(1) Logical or mathematical representation of structure

(2) Implementation of the structure on a computer.

(3) Quantitative analysis of the structure, which determines time and space complexity

for the processing of the structure.

3.2 Data Structures

A data structure can be defined in a number of ways. Some of the definitions are as follows:

(1) A data structure is a systematic way of organizing and accessing data.

(2) A data structure tries to structure data:

a. Usually more than one piece of data

b. Should define legal operations on the data

c. The data might be grouped together (e.g. in a linked list)

(3) When we define a data structure we are in fact creating a new data type of our own.

a. Using predefined types or previously user-defined types.

b. Such new types are then used to declare variables within a program.

But the most useful definition of data structures is: “the logical or mathematical model of a

particular organization of data is called a data structure”.

The choice of a particular data model depends on two considerations. First, it must be rich enough to represent the actual relationships of the data in the real world. Second, the structure should be simple enough that the data can be processed effectively whenever required.

3.2.1 Why Data Structure?

Following are some of the reasons for using data structures:

• Data structures study how data are stored in a computer so that operations can be implemented efficiently.

• Data structures are especially important when you have a large amount of information.

• Conceptual and concrete ways to organize data for efficient storage and manipulation.

3.2.2 Types of Data Structures

Data structures can be divided into two categories: primitive and non-primitive (composite) data structures.

Primitive Data Structure: Strictly speaking, primitive data structures are not data structures at all; they are better termed primitive data types. A primitive type is a basic type that is operated upon directly by machine instructions. These types have different representations on different computers.

In computer science, primitive data type can refer to either of the following concepts:

• A basic type is a data type provided by a programming language as a basic building block. Most languages allow more complicated composite types to be recursively constructed starting from basic types.

• A built-in type is a data type for which the programming language provides built-in support.

Classic basic primitive types may include Character, Integer, Floating-point number, Boolean,

Pointer, etc.

INTEGERS: The quantity representing objects that are discrete in nature can be represented by

an integer. For example: 2, 4, 6, 0, -23,-56 etc. are integers but 2.5, -34.56 are not integers.

FLOAT: Float is a simple data type that holds numbers with a fractional part. In C it typically occupies 4 bytes of memory. For example: 2.5, 245.67 etc.

CHARACTERS: Characters are the literal representation of elements selected from an alphabet, and character constants are written in single quotes. There is a wide variety of character sets; two widely used ones are EBCDIC and ASCII. For example: '*', 'r', 'A', '2' etc. are character constants.

POINTERS: A pointer is a reference to a data item: a simple type of variable that stores the address of another variable. A pointer is a single fixed-size data item.

For example: int x; int *p; p = &x;

Here x is an integer, p is a pointer to an integer, and p points to x.
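
The effect of p pointing to x can be seen by writing through the pointer. The sketch below is illustrative only; the function name set_through_pointer is an assumption and does not come from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Store `value` into the integer that `p` points to. */
void set_through_pointer(int *p, int value)
{
    assert(p != NULL);
    *p = value;   /* dereference: writes to the pointed-to variable */
}
```

After int x = 1; int *p = &x; a call set_through_pointer(p, 42) changes x itself, because p holds the address of x rather than a copy of its value.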

BOOLEAN: Boolean is a data type that has only two possible values: true and false.

Composite (Non-Primitive) Data Structure: Composite data structure is based on the primitive

data structure. While primitive data structures are the basic building blocks, composite data

structures are created using these primitive data structures.

Composite data structure can again be divided into two categories:

(1) Linear Data Structure

(2) Non-Linear Data Structure

Let us try to understand them one by one.

Linear Data Structure: A data structure is said to be linear if its elements form a sequence, one after another. Linear data structures include strings, arrays, linked lists, stacks, queues, etc.

String: A string is, essentially, a sequence of characters. A string is generally understood as a

data type storing a sequence of data values, usually bytes, in which elements usually stand for

characters according to a character encoding, which differentiates it from the more general array

data type.

Array: In computer programming, a group of homogeneous elements of a specific data type is

known as an array, one of the simplest data structures. Arrays hold a series of data elements,

usually of the same size and data type. Individual elements are accessed by their position in the

array. The position is given by an index, which is also called a subscript. The index usually uses

a consecutive range of integers, but the index can have any ordinal set of values.

Some arrays are multi-dimensional, meaning they are indexed by a fixed number of integers.

Generally, one- and two-dimensional arrays are the most common. Most programming languages

have a built-in array data type.

Fig 1.1: Array Representation
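
Access by subscript can be sketched in C for one-dimensional and two-dimensional arrays. The names sum_1d, element_at, ROWS and COLS are assumptions chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 2
#define COLS 3

/* Sum a one-dimensional array by visiting each index in turn. */
int sum_1d(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];        /* a[i] is the element at subscript i */
    return total;
}

/* Read one element of a two-dimensional array using two subscripts. */
int element_at(int m[ROWS][COLS], size_t row, size_t col)
{
    assert(row < ROWS && col < COLS);   /* subscripts must be in range */
    return m[row][col];
}
```

Note that C subscripts start at 0, so an array of n elements is indexed by the consecutive integers 0 to n - 1.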

Linked List: A linked list is one of the fundamental data structures used in computer programming. It consists of a sequence of nodes, each containing arbitrary data fields and one or two references pointing to the next and/or previous nodes. A linked list is a self-referential data type because each node contains a link to another node of the same type. Linked lists permit insertion and removal of a node at any point in constant time once that point has been reached, but they do not allow random access: reaching a given node takes linear time.

Fig. 1.2: Linked-List Representation

There are various types of linked lists such as Singly-linked List, Doubly-linked List,

Circular-linked List, Header linked-List, etc.
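
A singly linked list can be sketched in C as below. This is a minimal illustration under assumed names (node, insert_after, nth); it shows a self-referential node, insertion next to a known node, and the link-by-link walk needed to reach a position.

```c
#include <assert.h>
#include <stdlib.h>

/* Node of a singly linked list: a data field plus a link to the next node. */
struct node {
    int data;
    struct node *next;
};

/* Insert a new node holding `value` right after `pos`.
   Only two links change, so the work does not depend on the list length.
   Returns the new node, or NULL if memory allocation fails. */
struct node *insert_after(struct node *pos, int value)
{
    assert(pos != NULL);
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->data = value;
    n->next = pos->next;
    pos->next = n;
    return n;
}

/* Return the k-th node (0-based) by following links from `head`,
   or NULL if the list is shorter. There is no random access:
   the time taken is proportional to k. */
struct node *nth(struct node *head, size_t k)
{
    while (head != NULL && k-- > 0)
        head = head->next;
    return head;
}
```

Nodes created by insert_after are allocated with malloc, so the caller is expected to free them when the list is no longer needed.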

Stack: A stack is a linear structure in which items may be added or removed only at one end. There are frequent situations in computer science when one wants to restrict insertions and deletions so that they can take place only at the beginning or at the end of the list, not in the middle. Two of the data structures that are useful in such situations are stacks and queues. A stack is a list of elements in which an element may be inserted or deleted only at one end, called the top. This means, in particular, that elements are removed from a stack in the reverse order of that in which they were inserted. A stack is therefore also called a “last-in first-out” (LIFO) list.

Fig. 1.3: Stack Representation

Special terminology is used for the two basic operations associated with a stack:

1. "Push" is the term used to insert an element into a stack.

2. "Pop" is the term used to delete an element from a stack.
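
Push and pop can be sketched in C with a fixed-size array. The names stack, stack_init, stack_push, stack_pop and STACK_CAPACITY are assumptions for this illustration; an array is only one of several possible representations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAPACITY 16

struct stack {
    int items[STACK_CAPACITY];
    size_t top;   /* number of elements; the top element is items[top - 1] */
};

void stack_init(struct stack *s)
{
    assert(s != NULL);
    s->top = 0;
}

/* Push: insert at the top. Returns false if the stack is full. */
bool stack_push(struct stack *s, int value)
{
    if (s->top == STACK_CAPACITY)
        return false;
    s->items[s->top++] = value;
    return true;
}

/* Pop: delete from the top. Returns false if the stack is empty. */
bool stack_pop(struct stack *s, int *out)
{
    if (s->top == 0)
        return false;
    *out = s->items[--s->top];
    return true;
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1, which is the last-in first-out order.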

Queue: A queue is a linear list of elements in which deletions can take place only at one end, called the front, and insertions can take place only at the other end, called the rear. The terms front and rear are used in describing a linear list only when it is implemented as a queue.

Queues are also called “first-in first-out” (FIFO) lists, since the first element into a queue is the first element out of it. In other words, the order in which elements enter a queue is the order in which they leave. A real-life example: the people waiting in line at a railway ticket counter form a queue, where the first person in line is the first person to be served.

An important example of a queue in computer science occurs in a timesharing system, in which programs with the same priority form a queue while waiting to be executed.

Fig. 1.4: Queue Representation
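
A queue can be sketched in C as a circular array. The names queue, queue_init, enqueue, dequeue and QUEUE_CAPACITY are assumptions for this illustration; the rear position is derived from the front index and the element count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 16

struct queue {
    int items[QUEUE_CAPACITY];
    size_t front;   /* index of the next element to be deleted */
    size_t count;   /* number of elements currently stored */
};

void queue_init(struct queue *q)
{
    assert(q != NULL);
    q->front = 0;
    q->count = 0;
}

/* Insert at the rear. Returns false if the queue is full. */
bool enqueue(struct queue *q, int value)
{
    if (q->count == QUEUE_CAPACITY)
        return false;
    size_t rear = (q->front + q->count) % QUEUE_CAPACITY;
    q->items[rear] = value;
    q->count++;
    return true;
}

/* Delete from the front. Returns false if the queue is empty. */
bool dequeue(struct queue *q, int *out)
{
    if (q->count == 0)
        return false;
    *out = q->items[q->front];
    q->front = (q->front + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

Enqueuing 1, 2, 3 and then dequeuing three times yields 1, 2, 3, which is the first-in first-out order.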

Non-Linear Data Structure: In a non-linear data structure the data items are not arranged in a sequence. Each data item may be attached to several other data items in a way that reflects the relationships among them, such as a hierarchy. Examples: trees and graphs.

Trees: Data frequently contain a hierarchical relationship between various elements. The non-linear data structure that reflects this relationship is called a rooted tree graph or, simply, a tree. It is mainly used to represent data containing a hierarchical relationship between elements, e.g. records, family trees and tables of contents.

A tree consists of a distinguished node r, called the root, and zero or more subtrees t1, t2, ..., tn, each of whose roots is connected by a directed edge to r.

In the tree of figure 1.5, the root is r. Node t2 has r as its parent and t2.1, t2.2 and t2.3 as children. Each node may have an arbitrary number of children, possibly zero. Nodes with no children are known as leaves.

Fig 1.5: Tree Representation
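
A tree whose nodes may have any number of children can be sketched in C using a first-child and next-sibling pair of links. This representation and the names tnode, add_child and count_leaves are assumptions for illustration; the text does not prescribe a representation.

```c
#include <assert.h>
#include <stddef.h>

/* Tree node with any number of children, stored as a
   first-child / next-sibling pair of links. */
struct tnode {
    const char *label;
    struct tnode *first_child;
    struct tnode *next_sibling;
};

/* Attach `child` as the first child of `parent`. */
void add_child(struct tnode *parent, struct tnode *child)
{
    assert(parent != NULL && child != NULL);
    child->next_sibling = parent->first_child;
    parent->first_child = child;
}

/* Count the leaves (nodes with no children) in the tree rooted at `t`. */
size_t count_leaves(const struct tnode *t)
{
    if (t == NULL)
        return 0;
    if (t->first_child == NULL)
        return 1;                       /* no children: t is a leaf */
    size_t leaves = 0;
    for (const struct tnode *c = t->first_child; c != NULL; c = c->next_sibling)
        leaves += count_leaves(c);      /* each child roots a subtree */
    return leaves;
}
```

The recursion in count_leaves mirrors the recursive definition of a tree: a root together with zero or more subtrees.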

Graph: A graph consists of a set of vertices (nodes) and a set of edges (arcs). Each edge is specified by a pair of vertices; in a directed graph the pair is ordered, and an arc (a, b) has tail a and head b. A vertex n is incident to an edge x if n is one of the two nodes in the pair that constitutes x. The degree of a node is the number of arcs incident to it. The indegree of a node n is the number of arcs that have n as the head, and the outdegree of n is the number of arcs that have n as the tail.

A graph is a non-linear data structure. The graph shown in figure 1.6 has 7 vertices and 12 edges. The vertices are {1, 2, 3, 4, 5, 6, 7} and the arcs are {(1,2), (1,3), (1,4), (2,4), (2,5), (3,4), (3,6), (4,5), (4,6), (4,7), (5,7), (6,7)}. Node 4 in figure 1.6 has indegree 3, outdegree 3 and degree 6.

Fig. 1.6: Graph Representation
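
The degree figures quoted for node 4 can be checked with a short C sketch that stores the graph as a list of arcs. The arc list is taken from the text; the names arc, fig_arcs, indegree, outdegree and degree are assumptions, and each pair is read as (tail, head).

```c
#include <assert.h>
#include <stddef.h>

/* A directed arc (tail, head): it leaves `tail` and enters `head`. */
struct arc { int tail; int head; };

/* The 12 arcs listed in the text for the graph of figure 1.6. */
const struct arc fig_arcs[] = {
    {1, 2}, {1, 3}, {1, 4}, {2, 4}, {2, 5}, {3, 4},
    {3, 6}, {4, 5}, {4, 6}, {4, 7}, {5, 7}, {6, 7}
};
const size_t fig_arc_count = sizeof fig_arcs / sizeof fig_arcs[0];

/* Number of arcs that have node n as their head. */
size_t indegree(const struct arc *arcs, size_t count, int n)
{
    assert(arcs != NULL);
    size_t d = 0;
    for (size_t i = 0; i < count; i++)
        if (arcs[i].head == n)
            d++;
    return d;
}

/* Number of arcs that have node n as their tail. */
size_t outdegree(const struct arc *arcs, size_t count, int n)
{
    assert(arcs != NULL);
    size_t d = 0;
    for (size_t i = 0; i < count; i++)
        if (arcs[i].tail == n)
            d++;
    return d;
}

/* Degree: the number of arcs incident to n. */
size_t degree(const struct arc *arcs, size_t count, int n)
{
    return indegree(arcs, count, n) + outdegree(arcs, count, n);
}
```

For node 4 the arcs (1,4), (2,4) and (3,4) enter it and the arcs (4,5), (4,6) and (4,7) leave it, which gives the indegree 3, outdegree 3 and degree 6 stated above.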

3.3 Abstract Data Type

Abstract data types (ADTs) are a model used to understand the design of a data structure. “Abstract” implies that we give an implementation-independent view of the data structure. An ADT specifies the type of data stored and the operations supported on the data. Viewing a data structure as an ADT allows a programmer to focus on an idealized model of the data and its operations.

We can think of an abstract data type (ADT) as a mathematical model with a collection of

operations defined on that model. Sets of integers, together with the operations of union,

intersection, and set difference, form a simple example of an ADT.
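
The set example can be sketched in C. The operations set_union, set_intersection and set_difference correspond to the ADT; everything else is an assumption of this illustration, in particular the restriction to members 0 to 63 and the one-bit-per-member representation, which callers are not meant to rely on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ADT: a set of integers in the range 0..63.
   Representation (an implementation choice): one bit per member. */
typedef struct { uint64_t bits; } intset;

intset set_empty(void)
{
    intset s = { 0 };
    return s;
}

intset set_add(intset s, int x)
{
    assert(x >= 0 && x < 64);
    s.bits |= (uint64_t)1 << x;
    return s;
}

bool set_contains(intset s, int x)
{
    assert(x >= 0 && x < 64);
    return ((s.bits >> x) & 1u) != 0;
}

intset set_union(intset a, intset b)
{
    intset r = { a.bits | b.bits };
    return r;
}

intset set_intersection(intset a, intset b)
{
    intset r = { a.bits & b.bits };
    return r;
}

intset set_difference(intset a, intset b)
{
    intset r = { a.bits & ~b.bits };
    return r;
}
```

A program that uses only these functions would keep working if the bit representation were later replaced by, say, a linked list, which is the implementation independence that the ADT view provides.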

3.4 Data Types, Data Structures and Abstract Data Types

Although the terms "data type", "data structure" and "abstract data type" sound alike, they have different meanings. In a programming language, the data type of a variable is the set of values that the variable may assume. For example, a variable of type boolean can assume either the value true or the value false, but no other value. The basic data types vary from language to language; in C they include int, float, and char. The rules for constructing composite data types out of basic ones also vary from language to language.

An abstract data type is a mathematical model, together with various operations defined on the model. We shall design algorithms in terms of ADTs, but to implement an algorithm in a given programming language we must find some way of representing the ADTs in terms of the data types and operators supported by the programming language itself. To represent the mathematical model underlying an ADT we use data structures, which are collections of variables, possibly of several different data types, connected in various ways.

3.5 Operations on Data Structures

The data appearing in our data structures are processed by means of certain operations. In fact, the particular data structure that one chooses for a given situation depends largely on the frequency with which specific operations are performed. The following four operations play a major role:

• Traversing: Accessing each record exactly once so that certain items in the record may be processed. (This accessing or processing is sometimes called visiting the records.)

• Searching: Finding the location of the record with a given key value, or finding the locations of all records that satisfy one or more conditions.

• Inserting: Adding new records to the structure.

• Deleting: Removing a record from the structure.

Two more operations are used in special situations:

• Sorting: Arranging the records in some logical order (e.g. alphabetically in ascending or descending order according to name, or in numerical order according to some number key, etc.)

• Merging: Combining the records in two different sorted files into a single sorted file.

Some other operations, such as copying and concatenation, may also be performed on some data structures.
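
The four major operations can be sketched in C on a simple array-based list of integer keys. The structure and the names list, list_sum, list_search, list_insert, list_delete and LIST_CAPACITY are assumptions for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define LIST_CAPACITY 32

struct list {
    int keys[LIST_CAPACITY];
    size_t count;
};

/* Traversing: visit each record exactly once (here, summing the keys). */
long list_sum(const struct list *l)
{
    long total = 0;
    for (size_t i = 0; i < l->count; i++)
        total += l->keys[i];
    return total;
}

/* Searching: return the position of `key`, or -1 if it is absent. */
int list_search(const struct list *l, int key)
{
    for (size_t i = 0; i < l->count; i++)
        if (l->keys[i] == key)
            return (int)i;
    return -1;
}

/* Inserting: add `key` at position `pos`, shifting later records right.
   Returns 0 on success, -1 if the list is full. */
int list_insert(struct list *l, size_t pos, int key)
{
    assert(pos <= l->count);
    if (l->count == LIST_CAPACITY)
        return -1;
    for (size_t i = l->count; i > pos; i--)
        l->keys[i] = l->keys[i - 1];
    l->keys[pos] = key;
    l->count++;
    return 0;
}

/* Deleting: remove the record at `pos`, shifting later records left. */
void list_delete(struct list *l, size_t pos)
{
    assert(pos < l->count);
    for (size_t i = pos; i + 1 < l->count; i++)
        l->keys[i] = l->keys[i + 1];
    l->count--;
}
```

In this representation insertion and deletion shift elements, so a program that inserts and deletes frequently might prefer a linked list. This is the sense in which the choice of structure depends on how often each operation is performed.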

3.6 Algorithm

An algorithm can be defined as a sequence of definite and effective instructions that terminates with the production of correct output from the given input. Algorithms may be written in pseudocode that resembles programming languages like C and Pascal.

An algorithm should have five basic characteristics: input, output, definiteness, effectiveness and termination.

Once we have a suitable mathematical model for our problem, we can attempt to find a solution in terms of that model. Our initial goal is to find a solution in the form of an algorithm, which is a finite sequence of instructions, each of which has a clear meaning and can be performed with a finite amount of effort in a finite length of time. An integer assignment statement such as x := y + z is an example of an instruction that can be executed with a finite amount of effort. In an algorithm, instructions can be executed any number of times, provided the instructions themselves indicate the repetition. However, we require that, no matter what the input values may be, an algorithm terminates after executing a finite number of instructions. Thus, a program is an algorithm as long as it never enters an infinite loop on any input.

Consider a simple algorithm for finding the factorial of n.

Algorithm Factorial(n)

Step 1: FACT = 1

Step 2: for i = 1 to n do

Step 3:     FACT = FACT * i

Step 4: print FACT

Specification: Computes n!.

Pre-condition: n >= 0

Post-condition: FACT = n!

• For better understanding, conditions can also be stated after any statement to specify the values held in particular variables.

• A pre-condition and a post-condition can also be defined for a loop, stating the conditions satisfied before the loop starts and after it completes, respectively.

• A condition that remains true both before and after the execution of every iteration of a loop is called a "loop invariant".

• These conditions are useful when debugging an implementation of the algorithm.

• Moreover, these conditions can also be used in proofs of correctness.
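
The factorial algorithm and its conditions can be sketched in C, with the pre-condition checked by an assertion and the loop invariant recorded in comments. The upper limit of 20 is an assumption added here so that the result fits in an unsigned 64-bit integer; it is not part of the original specification.

```c
#include <assert.h>

/* Computes n! for 0 <= n <= 20. The upper limit is an added assumption:
   20! is the largest factorial that fits in 64 unsigned bits. */
unsigned long long factorial(int n)
{
    assert(n >= 0 && n <= 20);            /* pre-condition */
    unsigned long long fact = 1;
    for (int i = 1; i <= n; i++) {
        /* loop invariant: before this iteration, fact == (i - 1)! */
        fact *= (unsigned long long)i;
        /* and after this iteration, fact == i! */
    }
    return fact;                          /* post-condition: fact == n! */
}
```

When the loop ends, i has reached n + 1, so the invariant gives fact == n!, which is exactly the post-condition. This is the pattern a correctness proof based on a loop invariant follows.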

3.6.1 Analysis of Algorithm

Analysis of algorithms is a field of computer science whose overall goal is to understand the complexity of algorithms. To analyze an algorithm is to determine the amount of resources (such as time and storage) that a computer needs to execute it. Most algorithms are designed to work with inputs of arbitrary length.

When solving a problem, we frequently have to choose among various algorithms. On what basis should we choose? There are two, often contradictory, goals.

1. We would like an algorithm that is easy to understand, code, and debug.

2. We would like an algorithm that makes efficient use of the computer's resources, especially one that runs as fast as possible and uses as little space as possible.

When analyzing an algorithm, its complexity is taken into account.

3.6.2 Complexity

Complexity refers to the rate at which the required storage or consumed time

grows as a function of the problem size. The absolute growth depends on the machine used to

execute the program, the compiler used to construct the program, and many other factors. We

would like to have a way of describing the inherent complexity of a program (or piece of a

program), independent of machine/compiler considerations. This means that we must not try to

describe the absolute time or storage needed. We must instead concentrate on a “proportionality”

approach, expressing the complexity in terms of its relationship to some known function. This type of analysis is known as asymptotic analysis. Note that we are dealing with the complexity of an algorithm, not that of a problem: a simple problem may be solved by an algorithm with a high order of time complexity, and vice versa.

3.6.2.1 Asymptotic Analysis

Asymptotic analysis is based on the idea that as the problem size grows,

the complexity can be described as a simple proportionality to some known function. This idea is

incorporated in the Big O, Omega and Theta notation for asymptotic performance.

3.6.2.2 Tradeoff between space and time complexity

We may sometimes seek a tradeoff between space and time complexity. For example, we may have to choose a data structure that requires a lot of storage in order to reduce the computation time. The programmer must therefore make a judicious, informed choice, and needs some verifiable basis on which a data structure or algorithm can be selected. Complexity analysis provides such a basis.
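
As one hypothetical illustration of such a tradeoff, the C sketch below counts the 1 bits in a 32-bit word in two ways: a loop that uses no extra storage, and a lookup that spends 256 bytes on a precomputed table in order to do less work per call. The task and all names (popcount_loop, popcount_table, popcount_table_init) are assumptions chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* No extra storage: examine the bits of x one at a time. */
int popcount_loop(uint32_t x)
{
    int c = 0;
    while (x != 0) {
        c += (int)(x & 1u);
        x >>= 1;
    }
    return c;
}

/* Extra storage: table[b] holds the number of 1 bits in the byte b. */
static uint8_t table[256];
static int table_ready = 0;

void popcount_table_init(void)
{
    for (int b = 0; b < 256; b++)
        table[b] = (uint8_t)popcount_loop((uint32_t)b);
    table_ready = 1;
}

/* Four table lookups replace up to 32 loop iterations. */
int popcount_table(uint32_t x)
{
    assert(table_ready);   /* popcount_table_init must be called first */
    return table[x & 0xFF] + table[(x >> 8) & 0xFF]
         + table[(x >> 16) & 0xFF] + table[x >> 24];
}
```

Both functions return the same answers; which one is preferable depends on whether memory or time is the scarcer resource in the program at hand.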

We will learn about various techniques to bound the complexity function. Our aim is not to count the exact number of steps of a program or the exact amount of time required to execute an algorithm. In the theoretical analysis of algorithms, it is common to estimate complexity in the asymptotic sense, i.e., to estimate the complexity function for reasonably large input length n. Big-O notation, omega notation (Ω) and theta notation (Θ) are used for this purpose, big-O being the most common. Asymptotic analysis is used because the time taken to execute an algorithm varies with the input size n and with other factors that may differ from computer to computer and from run to run. The essence of these asymptotic notations is to bound the growth of the time complexity by a function for sufficiently large input.

Big-Oh and Big-Omega Notation

To talk about growth rates of functions we use what is known as big-oh notation. For example, when we say the running time T(n) of some program is O(n²), read as "big oh of n squared", we mean that there are positive constants c and n₀ such that for n equal to or greater than n₀, we have T(n) ≤ cn².

Example: Suppose T(0) = 1, T(1) = 4, and in general T(n) = (n + 1)². Then we see that T(n) is O(n²), as we may let n₀ = 1 and c = 4. That is, for n ≥ 1, we have (n + 1)² ≤ 4n², as the reader may prove easily.
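
The inequality follows because n + 1 ≤ 2n whenever n ≥ 1, and squaring both sides gives (n + 1)² ≤ 4n². As a sanity check, and not as a proof, the bound can also be tested numerically. The C sketch below uses the assumed names T and bounded_by_cn2 and assumes values small enough that the arithmetic does not overflow.

```c
#include <assert.h>
#include <stdbool.h>

/* T(n) = (n + 1)^2, the running-time function of the example. */
unsigned long long T(unsigned long long n)
{
    return (n + 1) * (n + 1);
}

/* Check T(n) <= c * n^2 for every n in [n0, limit].
   Assumes limit is small enough that c * n * n does not overflow. */
bool bounded_by_cn2(unsigned long long c, unsigned long long n0,
                    unsigned long long limit)
{
    assert(n0 <= limit);
    for (unsigned long long n = n0; n <= limit; n++)
        if (T(n) > c * n * n)
            return false;
    return true;
}
```

With c = 4 and n₀ = 1 the check succeeds over any tested range. It fails if n₀ is lowered to 0, since T(0) = 1 exceeds 4 · 0², and it fails for c = 1, since T(1) = 4 exceeds 1.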

In what follows, we assume all running-time functions are defined on the nonnegative integers, and their values are always nonnegative, although not necessarily integers. We say that T(n) is O(f(n)) if there are positive constants c and n₀ such that T(n) ≤ cf(n) whenever n ≥ n₀. A program whose running time is O(f(n)) is said to have growth rate f(n).

When we say T(n) is O(f(n)), we know that f(n) is an upper bound on the growth rate of T(n). To specify a lower bound on the growth rate of T(n) we can use the notation T(n) is Ω(g(n)), read "big omega of g(n)", to mean that there exists a positive constant c such that T(n) ≥ cg(n) infinitely often (for an infinite number of values of n).

Example: To verify that the function T(n) = n³ + 2n² is Ω(n³), let c = 1. Then T(n) ≥ cn³ for n = 0, 1, ....

For another example, let T(n) = n for all odd n ≥ 1 and T(n) = n²/100 for all even n ≥ 0. To verify that T(n) is Ω(n²), let c = 1/100 and consider the infinite set of n's: n = 0, 2, 4, 6, ....

Asymptotic notation

Let us take some more examples of the above notations.

Example: f(n) = 3n³ + 2n² + 4n + 3

= 3n³ + 2n² + O(n), as 4n + 3 is O(n)

= 3n³ + O(n²), as 2n² + O(n) is O(n²)

= O(n³)

Example: f(n) = n² + 3n + 4 is O(n²), since n² + 3n + 4 < 2n² for all n > 10.

By the definition of big-O, 3n + 4 is also O(n²), O(n³) and above, but by convention we use the tightest bound for the function, i.e., O(n).

Here are some rules about big-O notation:

1. f(n) = O(f(n)) for any function f. In other words, every function is bounded by itself.

2. aₖnᵏ + aₖ₋₁nᵏ⁻¹ + ··· + a₁n + a₀ = O(nᵏ) for all k ≥ 0 and for all a₀, a₁, ..., aₖ ∈ R. In other words, every polynomial of degree k can be bounded by the function nᵏ. Smaller-order terms can be ignored in big-O notation.

3. The base of a logarithm can be ignored in big-O notation, i.e. log_a n = O(log_b n) for any bases a, b > 1. We generally write O(log n) to denote a logarithm of n to any base.

4. Any logarithmic function can be bounded by a polynomial, i.e. log_b n = O(nᶜ) for any base b > 1 and any exponent c > 0.

5. Any polynomial function can be bounded by an exponential function, i.e. nᵏ = O(bⁿ) for any constant b > 1.

6. Any exponential function can be bounded by the factorial function. For example, aⁿ = O(n!) for any constant a.

3.6.3 Measuring the Running Time of a Program

The running time of a program depends on factors such as:

1. The input to the program,

2. The quality of code generated by the compiler used to create the object program,

3. The nature and speed of the instructions on the machine used to execute the program,

4. The time complexity of the algorithm underlying the program.

The fact that running time depends on the input tells us that the running time of a program should

be defined as a function of the input. Often, the running time depends not on the exact input but

only on the size of the input. A good example is the process known as sorting. In a sorting

problem, we are given as input a list of items to be sorted, and we are to produce as output the

same items, but smallest (or largest) first. For example, given 2, 1, 3, 1, 5, and 8 as input we

might wish to produce 1, 1, 2, 3, 5, and 8 as output. The latter list is said to be sorted smallest

first. The natural size measure for inputs to a sorting program is the number of items to be sorted,

or in other words, the length of the input list. In general, the length of the input is an appropriate

size measure, and we shall assume that measure of size unless we specifically state otherwise.

It is customary, then, to talk of T(n), the running time of a program on inputs of size n.

For example, some program may have a running time T(n) = cn2, where c is a constant. The units

of T(n) will be left unspecified, but we can think of T(n) as being the number of instructions

executed on an idealized computer.

For many programs, the running time is really a function of the particular input, and not just of

the input size. In that case we define T(n) to be the worst case running time, that is, the

maximum, over all inputs of size n, of the running time on that input. We also consider Tavg(n),

the average, over all inputs of size n, of the running time on that input. While Tavg(n) appears a

fairer measure, it is often fallacious to assume that all inputs are equally likely. In practice, the

average running time is often much harder to determine than the worst-case running time, both

because the analysis becomes mathematically intractable and because the notion of average input

frequently has no obvious meaning. Thus, we shall use worst-case running time as the principal

measure of time complexity, although we shall mention average-case complexity wherever we

can do so meaningfully.

3.6.4 Time Complexity

The RAM (Random Access Machine) model of computation, which is based on the von Neumann style of machine, is commonly used to analyze algorithms. We will analyze algorithms based on this model. According to this model:

• Each “simple” operation (+, -, =, if, call) takes exactly 1 step.

• Loops and subroutine calls are not simple operations; their cost depends upon the size of the data and the contents of the subroutine.

• Each memory access takes exactly 1 step.

The complexity of algorithms using big-O notation can be defined in the following way for a

problem of size n:

• Constant-time method is order 1: O(1). The time required is constant, independent of the input size.

• Linear-time method is order n: O(n). The time required is proportional to the input size. If the input size doubles, the time to run the algorithm also doubles.

• Quadratic-time method is order n squared: O(n²). The time required is proportional to the square of the input size. If the input size doubles, the time required increases fourfold.

• Logarithmic-time method is order log n: O(log n). The time required is proportional to the logarithm of the input size.

• Linear-logarithmic-time method is order n log n: O(n log n). The time required is proportional to the product of the input size and the logarithm of the input size.

The process of analysis of algorithm (program) involves analyzing each step of the algorithm. It

depends on the kinds of statements used in the program.

Consider the following example:

Example 1: Simple sequence of statements

Statement 1;

Statement 2;

...

...

Statement k;

The total time is found by adding the times for all statements:

Total time = time(statement 1) + time(statement 2) + ... + time(statement k).

The time required by each statement can vary greatly depending on whether the statement is simple (involves only basic operations) or not. Assuming that each of the above statements involves only basic operations, the time for each statement is constant and the total time is also constant: O(1).

Example 2: if-then-else statements

In this example, assume the statements are simple unless noted otherwise.

if (cond) {
    sequence of statements 1
}
else {
    sequence of statements 2
}

In this if-else statement, either sequence 1 or sequence 2 will execute, depending on the boolean condition. The worst-case time is the slower of the two possibilities. For example, if sequence 1 is O(n²) and sequence 2 is O(1), then the worst-case time for the whole if-then-else statement is O(n²).

Example 3: for loop

for (i = 0; i < n; i++) {
    sequence of statements
}

Here, the loop executes n times, so the sequence of statements also executes n times. Since we assume the time complexity of the statements is O(1), the total time for the loop is n * O(1), which is O(n). The number of statements in the body does not matter: it changes the running time only by a constant factor, and the overall complexity remains O(n).

Example 4: nested for loop

for (i = 0; i < n; i++) {
    for (j = 0; j < m; j++) {
        sequence of statements
    }
}

Here the outer loop executes n times. Every time the outer loop executes, the inner loop executes m times. As a result, the statements in the inner loop execute a total of n * m times, so the time complexity is O(n * m). If the condition of the inner loop is j < n instead of j < m (i.e., the inner loop also executes n times), then the total complexity of the nested loop is O(n^2).
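The counts claimed in Examples 3 and 4 can be checked by letting the loops count their own iterations. The following C sketch does that; the function names count_single and count_nested are illustrative assumptions, and the statement steps++ stands in for an O(1) sequence of statements.

```c
/* Counts how often the body of a single loop runs: n times. */
unsigned long count_single(unsigned long n)
{
    unsigned long steps = 0;
    for (unsigned long i = 0; i < n; i++)
        steps++;                /* stands in for an O(1) sequence of statements */
    return steps;
}

/* Counts how often the body of a nested loop runs: n * m times. */
unsigned long count_nested(unsigned long n, unsigned long m)
{
    unsigned long steps = 0;
    for (unsigned long i = 0; i < n; i++)
        for (unsigned long j = 0; j < m; j++)
            steps++;            /* innermost O(1) work */
    return steps;
}
```

Calling count_single(n) should return n, and count_nested(n, m) should return n * m, which becomes n^2 when m equals n.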

3.6.5 Space Complexity

Memory is now quite cheap, so space is often not the limiting factor. However, if a program needs more memory than is available, it cannot execute at all; in that case space becomes a more critical issue than time. A program is also generally considered good if, along with being fast, it is memory efficient.

For an iterative program, finding the space complexity is usually just a matter of looking at the variable declarations and storage allocation calls, e.g., the number of variables, the length of an array, etc.

The analysis of a recursive program with respect to space complexity is more complicated, as the space used at any time is the total space used by all recursive calls active at that time.

Example: Find the greatest common divisor (GCD) of two positive integers, m and n. The algorithm for GCD may be defined as follows:

While m is greater than zero:
    If n is greater than m, swap m and n.
    Subtract n from m.
n is the GCD.

The space complexity of the above algorithm is constant. It requires space for just three integers: m, n and t (a temporary variable used for the swap). So, the space complexity is O(1).

The time complexity depends on the loop and on whether m > n or not. The real issue is how many iterations take place. The answer depends on both m and n.

Best case: If m = n, there is just one iteration: O(1).

Worst case: If n = 1, there are m iterations (equivalently, if m = 1 there are n iterations): O(max(m, n)).
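The subtraction-based GCD algorithm described above can be sketched in C as follows. The function name gcd_subtract and the optional iterations counter are assumptions added for illustration; the counter only makes the best-case and worst-case iteration counts observable.

```c
/* Subtraction-based GCD, following the steps in the text.
   Assumes m > 0 and n > 0; with n == 0 the loop would never terminate.
   If iterations is not a null pointer, it receives the number of loop passes. */
unsigned gcd_subtract(unsigned m, unsigned n, unsigned *iterations)
{
    unsigned count = 0;
    while (m > 0) {
        if (n > m) {            /* swap so that m >= n */
            unsigned t = m;     /* t is the third integer mentioned in the text */
            m = n;
            n = t;
        }
        m -= n;
        count++;
    }
    if (iterations)
        *iterations = count;
    return n;
}
```

With m = n = 5 the loop runs once, while with m = 7 and n = 1 it runs seven times, which matches the best and worst cases discussed above. Only m, n, t and the counter are stored, so the space used does not grow with the input.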

The space complexity of a computer program is the amount of memory required for its proper

execution. The important concept behind space required is that unlike time, space can be reused

during the execution of the program. As discussed, there is often a trade-off between the time

and space required to run a program.

3.6.6 Comparison of Complexities

The following table lists the various types of complexity:

Big-O Notation    Name                       Examples
O(1)              Constant                   Accessing an element of a one-dimensional array
O(log n)          Logarithmic                Binary Search
O(n)              Linear                     Loop over an array
O(n log n)        Linearithmic or n log n    Merge Sort
O(n^2)            Quadratic                  Insertion Sort
O(n^c)            Polynomial
O(c^n)            Exponential
O(n!)             Factorial

The next table compares typical running times of algorithms of different orders:

Array Size    Logarithmic:    Linear:    Quadratic:    N * log2 N    Cubic:    Exponential:
N             log2 N          N          N^2                         N^3       2^N
5             3               5          25            15            125       32
10            4               10         100           40            10^3      10^3
100           7               100        10^4          700           10^6      10^30
1000          10              1000       10^6          10^4          10^9      10^300

4 Summary

In this chapter we first discussed the basic concepts of data, records and files, as they are the building blocks for understanding data structures. A data structure is the logical or mathematical model of a particular organization of data. A data structure is used to represent data in its inherent form.

There are two basic types of data structures: primitive and non-primitive. Primitive data structures include integer, real, Boolean, character, pointer, etc. Non-primitive data structures fall into two sub-categories: linear and non-linear. Linear data structures include arrays, linked lists, stacks, queues, etc., and non-linear data structures include trees and graphs.

We can perform some basic common operations on data structures, such as traversing, insertion, deletion and searching. Some special operations can also be performed on particular data structures. The concept of abstract data types is also introduced in this chapter.

In the second part of the chapter, the concept of an algorithm and the analysis of algorithms are explained. There may be a number of algorithms for solving the same problem. The question then arises of how to choose the best algorithm, and this is where the analysis of algorithms comes into the picture. We have to take a few considerations into account while picking an algorithm, such as simplicity, ease of debugging, speed, and efficient use of storage and other computer resources. Complexity is one of the measures that help us find which algorithm is best. There are two types of complexity: time and space. Tradeoffs between time and space complexity are also discussed in this chapter.

The asymptotic notations Big-O, Omega and Theta are also discussed in this chapter. These notations, especially Big-O, help in expressing the complexities of algorithms. The time and space complexities are discussed in detail.

In the end, various complexities and algorithms are compared.

5 Suggested Reading/Reference material

• Seymour Lipschutz, “Data Structures”, Tata McGraw Hill

• Aho, Hopcroft, Ullman, “Data Structures and Algorithms”, Addison-Wesley

• Knuth, D. E., “The Art of Computer Programming Vol. I: Fundamental Algorithms”, Addison-Wesley

• Wirth, N., “Algorithms + Data Structures = Programs”, Prentice-Hall

6 Self Assessment Questions (SAQ)

• Given an array of n integers, write an algorithm to find the smallest element. Find the number of instructions executed by your algorithm. What are its time and space complexities?

• Write an algorithm to find the median of n numbers. Find the number of instructions executed by your algorithm. What are its time and space complexities?

• Suppose we wish to multiply four matrices of real numbers M1 × M2 × M3 × M4, where M1 is 10 by 20, M2 is 20 by 50, M3 is 50 by 1, and M4 is 1 by 100. Assume that the multiplication of a p × q matrix by a q × r matrix requires pqr scalar operations, as it does in the usual matrix multiplication algorithm. Find the optimal order in which to multiply the matrices so as to minimize the total number of scalar operations. How would you find this optimal ordering if there were an arbitrary number of matrices?

• What do you understand by data structures? Discuss various operations on data structures.

• Discuss various types of data structures in detail.

• What do you mean by an algorithm? Explain with examples.

• Discuss the tradeoffs between the time and space complexity of an algorithm.

• A professor keeps a class list containing the following data for each student:

Name, Major, Student Number, Test Score, Final Grade

(a) State the entities, attributes and entity set of the list.

(b) Describe the field values, records and files.

• Suppose a data set S contains n elements. Compare the running time T1 of the linear search algorithm with the running time T2 of the binary search algorithm when (i) n = 1000 and (ii) n = 10000.

• Write an algorithm which finds the locations of the largest and second largest elements in an array.

• Suppose P(n) = a_0 + a_1 n + a_2 n^2 + … + a_m n^m; that is, suppose degree P(n) = m. Prove that P(n) = O(n^m).

• Given: T1(n) = O(f(n)) and T2(n) = O(g(n)). Find T1(n) · T2(n).

• Give simplified big-O notation for the following growth functions:

(a) 30n^2 (b) 10n^3 + 6n^2 (c) 5n log n + 30n (d) log n + 3n (e) log n + 32

• Find the complexity of the following program in big-O notation:

void printMultiplicationTable(int max) {
    for (int i = 1; i <= max; i++) {
        for (int j = 1; j <= max; j++)
            cout << (i * j) << " ";
        cout << endl;
    }
}

• Consider the following program segment:

for (i = 1; i <= n; i *= 2)
    j = 1;

What is the running time of the above program segment in big-O notation?

• Why can space complexity be more critical than time complexity?

• Write and explain the various asymptotic notations with the help of examples.

• Explain the procedure to compute the complexity of an algorithm.

Lesson : 2 Writer : Dr. Pardeep Kumar Mittal

Title : Strings Vetter : Prof. Rakesh Kumar

Structure:

1. Introduction

2. Objective

3. Presentation of Contents

3.1 Basic Terminology for Strings

3.2 Storing Strings

3.3 Character Data Type

3.4 String Operations

3.5 Word Processing

3.6 Pattern Matching

3.7 Implementation of Strings in C

4. Summary

5. Suggested Readings/Reference material

6. Self Assessment Questions (SAQ)

1. Introduction

We have all used text processing programs such as Microsoft Word, WordPerfect, or PageMaker to prepare documents or to search existing documents for prespecified words or phrases. However, we often fail to recognize that this type of work involves the use of a very specific type of data structure, called a string.

In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. A string is generally understood as a data type and is often implemented as an array that stores a sequence of elements, typically characters.

Depending on the programming language and the precise data type used, a variable declared to be a string may either cause storage in memory to be statically allocated for a predetermined maximum length or employ dynamic allocation to allow it to hold a variable number of elements.

2. Objectives

At the end of this chapter the reader should understand the concept of strings and the various operations that can be applied to them. The various storage mechanisms for strings are also discussed in this chapter. Pattern matching, one of the important concepts in string processing, is discussed in detail. C programs for handling various operations on strings are also explained.

3. Presentation of Contents

3.1 Basic Terminology for Strings

A string is basically a collection of characters. These characters comprise letters, digits and special characters. The representation of strings may vary from language to language. The basic terms associated with strings are length, substring, concatenation, etc.

A string consists of a finite sequence of zero or more characters. The number of characters that a string contains is known as its length. A string is said to be the empty string if its length is zero. Strings may be delimited by single quotes (') or double quotes (") depending on the language. For example, the string can be represented as 'India' or “India”.

Here the length of the string is 5. The string '' has length 0 and the string 'India is a great country' has length 24. One must be careful while calculating the length of a string, as every blank space also counts towards the length.

Another basic operation applied to strings is concatenation. Although there is no standard symbol for string concatenation, we will use // to indicate the concatenation operation. Concatenation is nothing but combining two strings in the specified order. For example, 'Hello' // 'India' gives the result 'HelloIndia'.

As is clearly visible from the result, if we want a blank space in between, we have to concatenate that blank space as well, as shown below:

'Hello' // ' ' // 'India' = 'Hello India'

One more basic notion that should be understood before going into the details of string processing is the substring. A substring, as its name indicates, is a contiguous part of the original string. In the example given above, 'Hello', ' ' and 'India' are all substrings of the concatenated string. 'Hello' is also known as an initial substring, as it starts at the beginning of the string, while 'India' is termed a terminal substring, as it ends at the end of the string.

The maximum length of a substring is the length of the string itself, while the minimum length is zero.

In some programming languages, such as C, strings are stored as arrays of characters; in others, such as Java, they are implemented as classes; and in still others, such as VB, they are defined as a data type. Whatever the way of storing strings, the operations remain the same.

3.2 Storing Strings

Strings can be stored in computer memory using following three ways:

(a) Record-Oriented, Fixed-Length Storage

(b) Variable-Length, Fixed-Maximum Storage

(c) Linked Storage

Record-Oriented, Fixed-Length Storage: In this type of storage, the length of the string is

assumed to be fixed and each string is assumed to be a record. Obviously, each record will be

having the same length. An example of such type of storage can be a program with the maximum

size of the line being 80 characters. Each line in the program will be assumed as a record.

This type of storage is the oldest and simplest method of storing strings. The main advantage of this method is its simplicity. But it has a number of disadvantages, such as:

(I) The entire record has to be read even if most of the storage consists of blank spaces at the end.

(II) Some of the records may need more space than the fixed length.

(III) When a modification requires more or fewer characters than the original text, the entire record needs to be changed.

Variable-Length, Fixed-Maximum Storage: One of the major problems of the previous method was that the complete record has to be read even if there are blank spaces in the last part of the record. This problem is removed by this method. The storage in this method can be done in two ways: (I) using a marker such as @ to signal the end of the string, or (II) using a pointer array which stores the length of each string and points to each string.

The two types of storage are illustrated in Fig. 2.1.

1    India is a great country@
2    I like my country@
3    Hello@
.
.
.
n    Bye@

Fig. 2.1(a) Records with markers

1    24    India is a great country@
2    17    I like my country@
3     5    Hello@
.
.
.
n     3    Bye@

Fig. 2.1(b) Records with specified length

Although the problem of reading unnecessary blank spaces is solved by this type of storage, the maximum length is still fixed. Therefore any string longer than this fixed maximum cannot be stored. Also, contiguous memory space is required to store such strings.

Linked Storage: The major problem with the earlier two methods is the fixed maximum length. Nowadays computers are used very frequently for word processing, and editing is one of the basic necessities of a word processor. Due to the fixed length in the above two methods, editing becomes difficult. Therefore a better method for storing strings is linked storage.

In linked storage, data can be placed anywhere in memory and is connected through links. The major advantage of such storage is that consecutive space is not required for storing large strings. Also, if any modification has to be made to the original string, it can be implemented easily. The only disadvantage is the space required for storing the pointers. But this disadvantage is not as problematic as the problem of editing with the earlier two methods. Overall, linked storage is a flexible method for storing strings that are edited frequently.

The linked storage is illustrated in Fig. 2.2.

[I] -> [N] -> [D] -> ...

Fig. 2.2 (a) Linked storage using one character per node

[IND] -> [IA] -> X

Fig. 2.2 (b) Linked storage using three characters per node (X indicates the null value)

3.3 Character Data Type

Character data is handled by various programming languages in various ways. Characters can be stored as constants or variables depending on the requirements. We will discuss both types of storage in various languages.

Constants: Some programming languages, such as C and C++, write a character constant in single quotes, such as 'A', 'B', etc., and a string constant in double quotes, such as “India is a great country”. In Java, character constants are handled in a manner similar to C, but strings are handled as a class. In Visual Basic, the string is itself a data type and characters are stored in the char data type. For example,

Dim str1 As String
Dim char1 As Char
str1 = "India"
char1 = "a"

Variables: Although each programming language follows different rules for forming strings, character variables can be classified into three categories: static, semistatic and dynamic.

A static character variable is a variable whose length is defined before the program is executed and cannot change during execution. A semistatic character variable is a variable whose length may vary during the execution of the program, subject to a fixed maximum. A dynamic character variable is a variable whose length may change freely during the execution of the program.

In VB, a static variable can be declared as:

Dim str1 As String * 10
str1 = "India"

In C and C++, a semistatic variable can be declared as follows (an array cannot be assigned a string after its declaration, so it is initialized here; later changes use functions such as strcpy):

char str1[10] = "India";

A dynamic variable in VB can be declared as:

Dim str2 As String
str2 = "India is a great country"

In C and C++, a pointer that may refer to strings of different lengths at different times can be declared as follows (const is required in C++ when pointing at a string literal):

const char *str2;
str2 = "India is a great country";
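In C, a character variable whose length truly changes at run time is commonly kept in heap memory that is resized with realloc. The following is a minimal sketch under that assumption; the helper name append_dynamic is hypothetical and is not part of the standard library.

```c
#include <stdlib.h>
#include <string.h>

/* Appends suffix to a heap-allocated string, growing the buffer as needed.
   str may be a null pointer, which is treated as an empty string.
   Returns the (possibly moved) buffer, or NULL if memory cannot be
   allocated, in which case the original buffer is left untouched.
   The caller releases the result with free(). */
char *append_dynamic(char *str, const char *suffix)
{
    size_t old_len = str ? strlen(str) : 0;
    size_t add_len = strlen(suffix);
    char *grown = realloc(str, old_len + add_len + 1);   /* +1 for '\0' */
    if (grown == NULL)
        return NULL;
    memcpy(grown + old_len, suffix, add_len + 1);        /* copies the '\0' too */
    return grown;
}
```

Starting from a null pointer, appending "India" and then " is a great country" yields a 24-character string without any maximum length having been fixed in advance.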

3.4 String Operations

Strings can be manipulated by operations such as concatenation, length, substring and

pattern recognition. We discuss each of these operations as follows.

Concatenation: Given two strings s and t, the concatenation operation, written as s // t or s + t, appends t to s, such that the last character of s is followed by the first character of t and the remainder of t. For example, given s = "hello" and t = "dave", s // t or s + t = "hellodave".

Given two strings STR1 and STR2 of length M and N, respectively, each terminated by the null character '\0', the concatenation operation STR1 // STR2 can be expressed in pseudocode as follows:

Algorithm (String Concatenation)

1. i = 0; j = 0;
2. while STR1[i] != '\0'
       i++;
3. while STR2[j] != '\0'
       STR1[i] = STR2[j];
       i = i + 1;
       j = j + 1;
4. STR1[i] = '\0';
5. Return STR1;
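The concatenation steps above translate almost line by line into C. This is a sketch rather than a replacement for the library function strcat; the name str_concat is an assumption, and the caller is assumed to supply a destination buffer large enough for both strings.

```c
#include <stddef.h>

/* Appends str2 to str1 in place, following the numbered steps above.
   Assumes str1 points to a buffer with room for both strings plus
   the terminating '\0'. Returns str1. */
char *str_concat(char *str1, const char *str2)
{
    size_t i = 0, j = 0;
    while (str1[i] != '\0')     /* step 2: find the end of str1 */
        i++;
    while (str2[j] != '\0') {   /* step 3: copy str2 character by character */
        str1[i] = str2[j];
        i++;
        j++;
    }
    str1[i] = '\0';             /* step 4: terminate the result */
    return str1;
}
```

The first loop takes M steps and the second N steps, so the running time is proportional to M + N.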

Length: Given a string s, the length operation returns the number of characters in s. Given s = "hello", length(s) = 5. Given s = "hello world", length(s) = 11, since blank characters, whether leading, internal or trailing, are included in the length calculation.

Algorithm (String Length)

1. length = 0; i = 0;
2. while STR[i] != '\0'
       i++;
   length = i;
3. return length;
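As a sketch, the length algorithm can be written in C as below. The name str_length is an assumption chosen to avoid clashing with the library function strlen, which does the same job.

```c
#include <stddef.h>

/* Returns the number of characters before the terminating '\0'.
   Blank characters are counted like any other character. */
size_t str_length(const char *str)
{
    size_t i = 0;
    while (str[i] != '\0')
        i++;
    return i;
}
```

The loop visits every character once, so the time required is O(n) for a string of n characters.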

Substring: Given a string s of length N, a substring of s is a contiguous part of s that has length less than or equal to N. Accessing a substring of a given string requires three pieces of information: the name of the string, the position of the first character of the substring in the given string, and the length of the substring. The substring operation can be written as:

SUBSTRING(string, initial, length)

For example, let String = “India is a great country”; then SUBSTRING(String, 12, 5) = “great”.

Given a string s, the substring of s from position i to position j can be output as follows:

substring(s: string, i, j: integer):
{
    for k = i to j do
        output(s[k])
}
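A possible C version of SUBSTRING(string, initial, length) is sketched below. It keeps the convention of the text that positions are counted from 1, whereas C arrays are indexed from 0; the function name substring and the choice to return a newly allocated copy are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* SUBSTRING(s, initial, length) with positions counted from 1, as in the text.
   Returns a newly allocated string that the caller must free, or NULL if the
   requested range does not lie inside s or memory cannot be allocated. */
char *substring(const char *s, size_t initial, size_t length)
{
    size_t n = strlen(s);
    if (initial < 1 || initial > n + 1 || length > n - (initial - 1))
        return NULL;
    char *out = malloc(length + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, s + (initial - 1), length);   /* position 1 is index 0 */
    out[length] = '\0';
    return out;
}
```

For the example above, substring("India is a great country", 12, 5) yields "great".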

Indexing: Indexing is finding the position of a pattern in a given string. Indexing is also known as pattern matching. It can be written as:

INDEX(text, pattern)

which returns the position of the first occurrence of the pattern in the given text. If the pattern does not exist in the text, zero is returned.

For example, let Text = “India is a great country”; then INDEX(Text, “great”) = 12 and INDEX(Text, “greatest”) = 0.

As pattern matching is a very important problem in computer science, it is discussed in detail in a separate section later on.

3.5 Word Processing

Although the string operations discussed above are the basic operations in string processing, they are insufficient for word processing. Hence some additional operations are needed for this purpose. We will discuss three such operations in this section: insertion, deletion and replacement.

Insertion: Insertion means inserting a string within a given text. Suppose the given text is denoted by T, the string we wish to insert is S and the position where we wish to insert the string is K; then the insertion operation can be defined as:

INSERT(T, K, S)

For example, let T = “India a great country”, S = “is ” and K = 7; then INSERT(T, K, S) = “India is a great country”.

The string operations discussed in the previous section remain the basic building blocks even for the word processing operations. The insertion operation can be implemented using these basic operations as follows:

INSERT(T, K, S) = SUBSTRING(T, 1, K – 1) // S // SUBSTRING(T, K, LENGTH(T) – K + 1)
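The three-part formula for INSERT can be sketched in C by copying the three pieces into a new buffer. The function name str_insert is an assumption, positions are counted from 1 as in the text, and the result is returned as a newly allocated string.

```c
#include <stdlib.h>
#include <string.h>

/* INSERT(T, K, S): returns a new string equal to t with s inserted so that
   s starts at position k (positions counted from 1).
   The caller frees the result. Returns NULL if k is out of range
   or memory cannot be allocated. */
char *str_insert(const char *t, size_t k, const char *s)
{
    size_t tlen = strlen(t), slen = strlen(s);
    if (k < 1 || k > tlen + 1)
        return NULL;
    char *out = malloc(tlen + slen + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, t, k - 1);                      /* SUBSTRING(T, 1, K - 1) */
    memcpy(out + (k - 1), s, slen);             /* S */
    memcpy(out + (k - 1) + slen, t + (k - 1),   /* rest of T ... */
           tlen - (k - 1) + 1);                 /* ... including its '\0' */
    return out;
}
```

With t = "India a great country", k = 7 and s = "is ", the result is "India is a great country", as in the example above.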

Deletion: Deletion means deleting a string from a given text. Suppose the given text is denoted by T, and the string we wish to delete begins at position K and has length L; then the deletion operation can be defined as:

DELETE(T, K, L)

For example, let T = “India is a great great country”; then DELETE(T, 12, 6) = “India is a great country”.

Using the basic operations, the deletion operation can be implemented as:

DELETE(T, K, L) = SUBSTRING(T, 1, K – 1) // SUBSTRING(T, K + L, LENGTH(T) – K – L + 1)
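The formula for DELETE can be sketched in C in the same way, by copying the part before position K and the part after the deleted characters. The function name str_delete is an assumption, and positions are again counted from 1.

```c
#include <stdlib.h>
#include <string.h>

/* DELETE(T, K, L): returns a new string equal to t with l characters removed,
   starting at position k (positions counted from 1).
   The caller frees the result. Returns NULL if the range does not lie
   inside t or memory cannot be allocated. */
char *str_delete(const char *t, size_t k, size_t l)
{
    size_t tlen = strlen(t);
    if (k < 1 || k > tlen + 1 || l > tlen - (k - 1))
        return NULL;
    char *out = malloc(tlen - l + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, t, k - 1);                      /* SUBSTRING(T, 1, K - 1) */
    memcpy(out + (k - 1), t + (k - 1) + l,      /* text after the deleted part ... */
           tlen - (k - 1) - l + 1);             /* ... including its '\0' */
    return out;
}
```

With t = "India is a great great country", k = 12 and l = 6, the result is "India is a great country".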

Sometimes we are given the text and the pattern to be deleted from the text. In that case, the first position of the pattern in the text has to be identified, and then the above operation can be applied. The position of the pattern can be identified with the help of the INDEX operation. Suppose the text is denoted by T and the pattern by P; then the deletion operation can be performed as:

DELETE(T, INDEX(T, P), LENGTH(P))

For example, let T = “India is a great great country” and P = “great ”; then DELETE(T, INDEX(T, P), LENGTH(P)) = “India is a great country”, where INDEX(T, P) = 12 and LENGTH(P) = 6.

As is quite common in word processing, suppose we wish to delete every occurrence of the pattern in a given text. Then we can use the following algorithm.

Algorithm for deletion of every occurrence of the given pattern

1. K = INDEX(T, P)
2. Repeat while K ≠ 0
       (a) T = DELETE(T, INDEX(T, P), LENGTH(P))
       (b) K = INDEX(T, P)
3. Write T
4. Exit

For example, if T = “India is a great great country” and P = “great ”, then executing the above algorithm gives the result T = “India is a country”.

Consider another interesting example. Suppose T = “XAAABBBY” and P = “AB”. Although it appears that the pattern “AB” occurs just once, after the first pass of the loop the text T becomes “XAABBY”, in which the pattern “AB” appears again, and continued execution of the algorithm produces the final output T = “XY”.
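The delete-every-occurrence algorithm can be sketched in C using the library functions strstr (which plays the role of INDEX) and memmove (which plays the role of DELETE by closing the gap in place). The function name delete_all is an assumption; working in place rather than building a new string is a design choice of this sketch.

```c
#include <string.h>

/* Deletes every occurrence of pattern p from t in place. As in the algorithm
   above, the search restarts from the beginning after each deletion, so
   occurrences created by a deletion are removed too.
   An empty pattern is ignored, since it would match forever. */
void delete_all(char *t, const char *p)
{
    size_t plen = strlen(p);
    if (plen == 0)
        return;
    char *hit;
    while ((hit = strstr(t, p)) != NULL)                    /* K = INDEX(T, P) */
        memmove(hit, hit + plen, strlen(hit + plen) + 1);   /* T = DELETE(...) */
}
```

Applied to a buffer holding "XAAABBBY" with the pattern "AB", the text shrinks step by step to "XY", as described above.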

Replacement: Replacement means replacing one string by another string within a given text. Suppose the given text is denoted by T, and we wish to replace the string P1 by the string P2; then the replacement operation can be defined as:

REPLACE(T, P1, P2)

For example, if T = “India was a great country”, P1 = “was”, and P2 = “is”, then REPLACE(T, P1, P2) = “India is a great country”.

If T = “India is a great country”, P1 = “was”, and P2 = “is”, then REPLACE(T, P1, P2) = “India is a great country”. As “was” does not occur in the text, the original text is unchanged.

The replace operation cannot be expressed as a single line using the basic string operations. Instead, it can be implemented in three steps (the last two steps are skipped when K = 0, i.e., when P1 does not occur in T):

K = INDEX(T, P1)
T = DELETE(T, K, LENGTH(P1))
T = INSERT(T, K, P2)

This implementation replaces the first occurrence of the pattern P1 by the pattern P2. If we want to replace every occurrence of P1 by P2 in the text, we can use the following algorithm:

Algorithm to replace every occurrence of pattern P1 by pattern P2 in a given text

1. K = INDEX(T, P1)
2. Repeat while K ≠ 0
       (a) T = REPLACE(T, P1, P2)
       (b) K = INDEX(T, P1)
3. Write T
4. Exit

For example, if T = “India is a greatest greatest country”, P1 = “greatest”, and P2 = “great”, then after execution of the above algorithm we get T = “India is a great great country”.

One has to take special care when P2 contains P1. Suppose T = “XAY”, P1 = “A” and P2 = “AB”. Then T = “XABY” after the first pass of the loop, T = “XABBY” after the second, and so on. The algorithm never terminates.
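One common way to avoid the non-termination described above is to continue the search after the text that was just inserted instead of restarting from the beginning. The C sketch below takes that approach, so it is a variant of the algorithm in the text and not the same algorithm; the function name replace_all is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a new string in which every occurrence of p1 in t is replaced by p2.
   Unlike the algorithm in the text, the search resumes after the inserted
   text, so the function terminates even when p2 contains p1.
   The caller frees the result. Returns NULL if p1 is empty or
   memory cannot be allocated. */
char *replace_all(const char *t, const char *p1, const char *p2)
{
    size_t len1 = strlen(p1), len2 = strlen(p2);
    if (len1 == 0)
        return NULL;

    /* First pass: count the occurrences to size the result. */
    size_t count = 0;
    for (const char *s = strstr(t, p1); s != NULL; s = strstr(s + len1, p1))
        count++;

    char *out = malloc(strlen(t) - count * len1 + count * len2 + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy the text between matches, writing p2 for each p1. */
    char *dst = out;
    const char *hit;
    while ((hit = strstr(t, p1)) != NULL) {
        size_t gap = (size_t)(hit - t);
        memcpy(dst, t, gap);
        dst += gap;
        memcpy(dst, p2, len2);
        dst += len2;
        t = hit + len1;         /* resume after the replaced occurrence */
    }
    strcpy(dst, t);             /* remaining tail, including '\0' */
    return out;
}
```

For T = "XAY", P1 = "A" and P2 = "AB" this sketch returns "XABY" and stops, because the "A" inside the inserted "AB" is never searched again.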

3.6 Pattern Matching

String matching is an important problem. It consists of searching for a query string (pattern) P in a given text T. Generally the pattern to be searched for is shorter than the given text. There may be more than one occurrence of the pattern P in the text T, and sometimes we have to find all the occurrences. String matching has several applications, such as text editors and search engines.

Since string-matching algorithms are used extensively, they should be efficient in terms of time and space. Let P[1..m] be the pattern to be searched for, of size m, and let T[1..n] be the given text, of size n. Assume that the pattern occurs in T at position (or shift) i. Then the output of the matching algorithm is the integer i, where 1 <= i <= n – m + 1. If there are multiple occurrences of the pattern in the text, it is sometimes required to output all the shifts where the pattern occurs.

Brute Force String Matching Algorithm

• Let Si be the substring of T beginning at the ith position and having the same length as the pattern P.

• We compare P, character by character, with the first substring S1. If all the corresponding characters are the same, then the pattern P appears in T at shift 1. If some character of S1 does not match the corresponding character of P, we try the next substring S2. This procedure continues until the input text is exhausted.

• In this algorithm we have to compare P with the n – m + 1 substrings of T.

Let P[1..m] be the given pattern and T[1..n] the text.

1. i = 1
2. Repeat steps 3 to 5 while i <= n – m + 1
3. for j = 1 to m
       If P[j] ≠ T[i + j – 1] then
           goto step 5
4. Print "Pattern found at shift i"
5. i = i + 1
6. exit

The complexity of the brute force string matching algorithm is O(nm). On average, the inner loop runs fewer than m times before a mismatch is found. The worst case arises when the first m – 1 characters match for every substring Si, for example when the pattern is of the form a^(m–1)b and the text is of the form a^(n–1)b, where a^(n–1) denotes a repeated n – 1 times. In this case the inner loop runs exactly m times for every substring before the mismatch (or, for the last substring, the match) is known, so there are exactly m * (n – m + 1) comparisons.
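The brute-force procedure can be sketched in C as follows. This version stops at the first occurrence, so it behaves like INDEX; the function name brute_force_index and the optional comparisons counter are assumptions added so that the comparison counts discussed above can be observed.

```c
#include <string.h>

/* Brute-force INDEX(T, P): returns the position (counted from 1) of the first
   occurrence of p in t, or 0 if p does not occur.
   If comparisons is not NULL, it receives the number of character
   comparisons performed. */
size_t brute_force_index(const char *t, const char *p, size_t *comparisons)
{
    size_t n = strlen(t), m = strlen(p);
    size_t count = 0, result = 0;

    for (size_t i = 0; m <= n && i <= n - m; i++) {   /* n - m + 1 shifts */
        size_t j = 0;
        while (j < m) {
            count++;                                  /* one comparison */
            if (p[j] != t[i + j])
                break;                                /* mismatch: next shift */
            j++;
        }
        if (j == m) {                                 /* all m characters matched */
            result = i + 1;
            break;
        }
    }
    if (comparisons != NULL)
        *comparisons = count;
    return result;
}
```

For the worst-case family above with m = 3 and n = 6 (pattern "aab", text "aaaaab"), the counter reports 3 * (6 – 3 + 1) = 12 comparisons.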

Another Pattern Matching Algorithm: The second pattern matching algorithm uses a table which is derived from the given pattern P and is independent of the text T. Suppose P = “aaba” and T = T1T2T3..., where Ti denotes the ith character of T, and let Wi denote the substring of T that begins at position i and has the same length as P. Suppose that the first two characters of T match those of P, i.e., T = aa.... Then T has one of the following three forms: (i) T = aab..., (ii) T = aaa..., (iii) T = aax..., where x is any character different from a and b.

Suppose we read T3 and find that T3 = b. Then we next read T4 to see if T4 = a, which would give P = W1. If instead T3 = a, then P ≠ W1, but we know that W2 = aa..., i.e., the first two characters of the substring W2 match those of P; hence we next read T4 to see if T4 = b. Lastly, suppose that T3 = x. Then P ≠ W1, and also P ≠ W2 and P ≠ W3, since x does not appear in P. Hence we next read T4 to see if T4 = a, i.e., whether W4 begins a match.

The important point in the above discussion is that, after reading a character, we never go back in the text: what has already been read tells us how much of the pattern is currently matched, so in some cases the comparison can skip ahead by up to the length of the pattern instead of restarting at the second character of the text. All the possibilities can be collected in a table, as in Fig. 2.3. The states of the algorithm are the initial substrings Qi of P, shown below, where Λ denotes the empty string:

Q0 = Λ, Q1 = a, Q2 = aa, Q3 = aab, Q4 = aaba = P.

The entry f(Qi, t) is the state reached from state Qi after reading the character t; x stands for any character other than a and b.

f(Qi, t)
        a     b     x
Q0      Q1    Q0    Q0
Q1      Q2    Q0    Q0
Q2      Q2    Q3    Q0
Q3      P     Q0    Q0

Fig. 2.3 Pattern Matching Table

The algorithm can be written as follows:

Algorithm (Pattern Matching): The pattern matching table F(Qi, t) of a pattern P is in memory, and the input is an N-character string T = T1T2...TN.

1. K = 1 and S1 = Q0
2. Repeat steps 3 to 5 while SK ≠ P and K ≤ N
3. Read TK
4. SK+1 = F(SK, TK)
5. K = K + 1
6. If SK = P, then
       INDEX = K – LENGTH(P)
   else
       INDEX = 0
7. Exit

The complexity of the above algorithm is determined by the number of times the loop of step 2 is executed. In the worst case the whole of the text T is read and the loop is executed LENGTH(T) times. On this basis, the complexity of this algorithm is O(n), which is lower than the O(nm) of the brute-force string-matching algorithm.
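For the specific pattern P = “aaba”, the table of Fig. 2.3 and the algorithm above can be sketched in C as follows. The function name index_aaba and the numeric encoding of the states (0 to 3 for Q0 to Q3, and 4 for P) are assumptions made for this illustration; a general implementation would build the table from an arbitrary pattern.

```c
#include <stddef.h>

/* Table-driven INDEX for the fixed pattern P = "aaba" of Fig. 2.3.
   States 0..3 stand for Q0..Q3 and state 4 stands for P (a full match).
   Columns: 0 = 'a', 1 = 'b', 2 = any other character (x).
   Returns the position (counted from 1) of the first occurrence, or 0. */
size_t index_aaba(const char *t)
{
    static const int f[4][3] = {
        /*  a  b  x */
        {   1, 0, 0 },   /* Q0 */
        {   2, 0, 0 },   /* Q1 */
        {   2, 3, 0 },   /* Q2 */
        {   4, 0, 0 },   /* Q3 */
    };
    int state = 0;                       /* S1 = Q0 */
    size_t k = 0;                        /* number of characters read so far */

    while (state != 4 && t[k] != '\0') {
        int col = (t[k] == 'a') ? 0 : (t[k] == 'b') ? 1 : 2;
        state = f[state][col];           /* S(K+1) = F(SK, TK) */
        k++;
    }
    return (state == 4) ? k - 4 + 1 : 0; /* match ends at position k */
}
```

Each character of the text is read exactly once and never revisited, which is why the running time is proportional to the length of the text.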

3.7 Implementation of Strings in C

Some of the most commonly used functions in the C string library are:

• strcat - concatenate two strings

• strchr - find the first occurrence of a character in a string

• strcmp - compare two strings

• strcpy - copy a string

• strlen - get string length

• strncat - concatenate one string with part of another

• strncmp - compare parts of two strings

• strncpy - copy part of a string

• strrchr - find the last occurrence of a character in a string

The use of some of these functions is illustrated by the following program:

/* Using string operations from the standard string library */

#include <stdio.h>
#include <string.h>

int main(void)
{
    char s1[25] = "This is the city.";
    char s2[50];
    char *p1;

    /* strcpy copies the s1 string to the s2 string. */
    strcpy(s2, s1);
    printf("%s %s\n", s1, s2);

    /* Use strcat to add a second string to the s2 string. */
    strcat(s2, "Los Angeles,California.");
    printf("%s\n", s2);

    /* The strlen function tells how many characters there are in the string. */
    printf("\"%s\" has %zu characters, \"%s\" has %zu characters.\n\n", s1, strlen(s1), s2, strlen(s2));

    /* strchr and strstr search for a member character or a substring. */
    p1 = strchr(s1, 'c');
    printf("[%c][%s]\n", *p1, p1);
    p1 = strstr(s1, "the");
    printf("[%c][%s]\n", *p1, p1);

    strcpy(s2, s1);

    /* You may use strcmp to compare strings. */
    if (strcmp(s1, s2) == 0)
        printf("Same.\n");
    else
        printf("Different.\n");

    return 0;
}

4 Summary

In this chapter we have discussed the basic concept of strings. Various basic operations, along with the operations required in word processing, are also discussed. The mechanisms for storing strings in computer memory are described. Pattern matching, one of the important concepts, has been explained along with its algorithms. In the end, the use of various built-in string functions in the C language has also been discussed.

5 Suggested Reading/Reference material

• Seymour Lipschutz, “Data Structures”, Tata McGraw Hill

• Aho, Hopcroft, Ullman, “Data Structures and Algorithms”, Addison-Wesley

• Knuth, D. E., “The Art of Computer Programming Vol. I: Fundamental Algorithms”, Addison-Wesley

• Wirth, N., “Algorithms + Data Structures = Programs”, Prentice-Hall

6 Self Assessment Questions (SAQ)

 Let T be the string “ABCD”. (a) Find the length of T. (b) List all substrings of T. (c) List

all initial substrings of T.

 Describe various ways of storing strings in computer memory along with advantages and

disadvantages of each.

 Discuss the meaning of static, semistatic and dynamic character variables.

 Suppose T = “India is a great country” and S = “Yes”. Then using the string operations

find (a) LENGTH(T) (b) SUBSTRING(T, 12, 5) (c) INDEX(T, “great”) (d) T // ” “ // S.

 What will be the result in following cases:

(a) INSERT(“AAA”, 1, “BBB”) (b) INSERT(“AAA”, 2, “BBB”) (c)

DELETE(“AABB”, 2, 2) (d) REPLACE( “AABB”, “AA”, “BB”).

 Find the number of comparisons in the Brute-Force String-matching algorithm, when P =

“abc”, and T = “ababababab”.

 Consider the pattern P = “aaabb”. Construct the table used in the second pattern matching

algorithm.

 For each of the following cases find the number of comparisons to find the index (first

occurrence) of the pattern P in the text T.

a) P =cat, T = bcbcbcbc

b) P= bbb, T= aabbaabbaabbaabbb

c) P= xxx, T= xyxxyxxxyxxxyxxxxy

 What is the complexity of the brute force string-matching algorithm in the best case?

 Write a procedure to count the number of times the word 'the' appears in a given text.

Lesson : 3 Writer : Dr. Pardeep Kumar Mittal

Title : Arrays Vetter : Prof. Rakesh Kumar

Structure:

1. Introduction
2. Objective
3. Presentation of Contents
3.1 Linear Arrays
3.2 Representation of Linear Arrays in Memory
3.3 Operations on Arrays
3.3.1 Traversal
3.3.2 Insertion & Deletions
3.4 Multidimensional Arrays
3.4.1 Two Dimensional Arrays
3.4.2 Representation of two dimensional arrays in memory
3.4.3 General Multidimensional Arrays
3.5 Matrices
3.6 Sparse Matrices
3.7 Applications of Arrays
4. Summary
5. Suggested Readings/Reference material
6. Self Assessment Questions (SAQ)

1. Introduction

One of the basic and important data structures of a program is the array. An array is a data structure which can represent a collection of elements of the same data type. An array can have any number of dimensions: it can be one-dimensional, two-dimensional or multidimensional. An array is generally used when a large amount of data is to be stored.

The simplest form of array is a one-dimensional array, which may be defined as a finite ordered set of homogeneous elements stored in contiguous memory locations. For example, an array may contain all integers or all characters or any other data type, but may not contain a mix of different data types. Various operations such as traversal, insertion, deletion, searching and sorting can be performed on arrays.

2. Objectives

In this chapter an insight into arrays is taken. The reader should be able to understand arrays along with their types after reading this chapter. Then, data organization inside the array is taken care of. The memory storage of arrays in computers is also explained. The reader can appreciate the various operations such as traversal, insertion and deletion along with examples after reading this chapter. The concept of multidimensional arrays is also discussed at length in this chapter. One of the important topics from the point of view of saving storage space is the sparse matrix, which is discussed in detail. Overall, the reader will be able to understand the concept of arrays in depth at the end of this chapter.
3. Presentation of Contents

3.1 Linear Arrays

The array is probably the most widely used data structure; in some languages it is even the only one available. An array consists of components which are all of the same type; it is therefore called a homogeneous structure. The array is a random-access structure, because all components can be selected at random and are equally quickly accessible. In order to denote an individual component, the name of the entire structure is augmented by the index selecting the component. This index is generally an integer between 0 and n-1, where n is the number of elements, the size, of the array.

An individual component of an array can be selected by an index. Given an array variable x, we can denote an array selector by the array name followed by the respective component's index i, written as x_i or x[i]. Because of the first, conventional notation, a component of an array is also called a subscripted variable.

The general equation that can be used to find the length or number of data elements of the array is:

Length = UB – LB + 1

where UB is the largest index, called the upper bound, and LB is the smallest index, called the lower bound of the array, which is generally zero or one. Therefore, when LB = 0, Length = UB + 1, and when LB = 1, Length = UB.

For example, consider an array A as a 6-element linear array of integers such that:

A[1]=247, A[2]=56, A[3]=429, A[4]=135, A[5]=87, A[6]=156. Here UB = 6 and LB = 1. Therefore, Length(A) = 6 – 1 + 1 = 6.

The following figures can be used to represent the array A.

Figure 3.1

Figure 3.2

Declaring an array: The general form for declaring a single-dimensional array is:

data_type array_name[expression];

where data_type represents the data type of the array, i.e., integer, char, float etc., array_name is the name of the array and expression indicates the maximum number of elements in the array.

For example, consider the following C declaration:

int a[100];

It declares an array of 100 integers.

The amount of storage required to hold an array is directly related to its type and size. For a single-dimensional array, the total size in bytes required for the array is computed as shown below.

Memory required (in bytes) = sizeof(data type) × length of array

The first array index value is referred to as its lower bound, and in C it is always 0; the maximum index value is called its upper bound. The number of elements in the array, called its range, is given by upper bound - lower bound + 1.

We store values in the arrays during program execution. Let us now see the process of initializing an array while declaring it.

int a[4] = {34,60,93,2};

int b[] = {2,3,4,5};

float c[] = {-4,6,81,-60};

We conclude the following facts from these examples while using C:

(i) If the array is initialized at the time of declaration, then the dimension of the array is optional.

(ii) Until the elements of a local (automatic) array are given specific values, they contain garbage values.
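The storage formula above can be checked directly with the sizeof operator. The following is a minimal sketch; the names ARRAY_LENGTH and array_bytes are illustrative helpers introduced here, not part of the C library.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements of a true array (not a pointer):
   total bytes divided by the bytes of one element. */
#define ARRAY_LENGTH(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Memory required (in bytes) = sizeof(data type) x length of array.
   array_bytes is a hypothetical helper used only for illustration. */
size_t array_bytes(size_t element_size, size_t length)
{
    assert(element_size > 0);
    return element_size * length;
}
```

For int b[] = {2,3,4,5}; the macro gives ARRAY_LENGTH(b) equal to 4, which shows that the compiler derives the omitted dimension from the initializer list.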

3.2 Representation of Linear Arrays in Computer Memory

The linear arrays in computer memory are represented using contiguous memory locations. For example, consider A to be a linear array in the computer memory. As we know, the computer memory is simply a sequence of addressed locations, as shown in figure 3.3 below:

1000
1001
1002
1003
1004
1005

Figure 3.3

Then we can use the following notation for calculating the address of any element of a linear array in computer memory,

LOC(A[K]) = address of the element A[K] of the array A.

Here the computer does not need to keep track of the address of every element of array A, but only needs to keep track of the address of the first element of A, which is denoted by Base(A) and termed the base address of array A.

The computer calculates the address of any element of an array A using the following formula:

LOC(A[K]) = Base(A) + W*(K - LB)

where W is the number of words per memory cell for the array A, e.g., in the case of float in C the value of W is typically 4.

For example, consider an array A, which records the sales of a company in each year from 1832 through 1884.

Assume Base(A) = 100 and W = 4 words per memory cell.

Then the addresses of the first few array elements are,

LOC(A[1832]) = 100, LOC(A[1833]) = 104, LOC(A[1834]) = 108, …

Let us find the address of the array element for the year K = 1875. It can be obtained in this way:

LOC(A[1875]) = Base(A) + W*(1875 – LB)

LOC(A[1875]) = 100 + 4*(1875 – 1832) = 272

Again, it is clear that the contents of this element can be obtained without scanning any other element in array A.
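The address formula can be written as a one-line C function. This is only a sketch of the arithmetic: the function name loc_linear is an assumption made for illustration, and the addresses are treated as plain integers rather than real pointers.

```c
#include <assert.h>

/* LOC(A[K]) = Base(A) + W * (K - LB)
   base: address of the first element, w: words per element,
   k: index of the wanted element, lb: lower bound of the array. */
long loc_linear(long base, long w, long k, long lb)
{
    assert(k >= lb);            /* the index may not lie below the lower bound */
    return base + w * (k - lb);
}
```

With base = 100, w = 4 and lb = 1832, the call loc_linear(100, 4, 1875, 1832) reproduces the value 272 computed in the sales example.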

3.3 Operations on Arrays

We can perform the following operations on arrays:

(a) Traversal: Processing each element in the array.
(b) Search: Finding the location of the element with a given value.
(c) Insertion: Adding a new element to the array.
(d) Deletion: Removing an element from the array.
(e) Sorting: Arranging the elements in some type of order, i.e. ascending or descending.
(f) Merging: Combining two arrays into a single array.

Now we will discuss these operations in detail, except sorting, searching and merging, which will be discussed in later chapters.
3.3.1 Traversing Linear Arrays

Let A be an array stored in the memory of the computer. Suppose we want either to print the contents of each element of A or to count the number of elements of A with a given property, such as finding the values greater than 60. This can be accomplished by traversing A, i.e., by visiting each element of A exactly once.

The following algorithm can be used to traverse a linear array A having lower bound LB and upper bound UB. This algorithm traverses A, applying an operation PROCESS to each element of A.

1. K = LB.
2. Repeat Steps 3 and 4 while K <= UB
3. Apply PROCESS to A[K]
4. K = K + 1
5. Exit.

The above algorithm can also be written using a for loop instead of a repeat-while loop as follows:

1. Repeat for K = LB to UB:
      Apply PROCESS to A[K].
2. Exit.

For example, consider an array A storing the marks of the students of a class, with indexes from 1 to 10. We have to find the number of students who have got 60 marks or more. The following algorithm carries out this operation by traversing A.

Find the number NUM of students who got 60 marks or more:

1. NUM = 0.
2. Repeat for K = 1 to 10
      If A[K] >= 60, then NUM = NUM + 1
3. Return
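The traversal and counting algorithms above might be sketched in C as follows. Note that C arrays start at index 0, so the loops run from 0 to n-1 instead of from LB to UB; the function names traverse and count_at_least are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Traverse a[0..n-1], applying the operation process to each element once. */
void traverse(const int *a, size_t n, void (*process)(int))
{
    assert(a != NULL || n == 0);
    for (size_t k = 0; k < n; k++)
        process(a[k]);
}

/* Count the elements of a[0..n-1] that are greater than or equal to threshold. */
size_t count_at_least(const int *a, size_t n, int threshold)
{
    size_t num = 0;                 /* NUM = 0 */
    for (size_t k = 0; k < n; k++)  /* visit each element exactly once */
        if (a[k] >= threshold)
            num = num + 1;
    return num;
}
```

Calling count_at_least(marks, 10, 60) on an array of ten marks corresponds to the NUM algorithm given above.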

3.3.2 Inserting and Deleting

Let A be an array stored in the memory of the computer. Inserting means the operation of adding another element to the array A, and deleting means the operation of removing one of the elements from array A. Let us discuss the procedure of inserting and deleting an element when A is a linear array.

Inserting an element at the end of a linear array can be easily done provided the memory space allocated for the array is large enough to accommodate the additional element. On the other hand, suppose we need to insert an element in the middle of the array. Then, on the average, half of the elements must be moved downward to new locations to accommodate the new element and keep the order of the other elements.

Similarly, deleting an element at the end of an array presents no difficulties, but deleting an element somewhere in the middle of the array would require that each subsequent element be moved one location upward in order to fill up the empty space in the array.
For example, consider an array A that has been declared as a 5-element array, but data have been recorded only for A[1], A[2], and A[3]. Then the array can be represented as shown in fig. 3.4

10    20    22
A[1]  A[2]  A[3]  A[4]  A[5]

fig. 3.4

If X is the value of the next element, then we may simply assign A[4] := X to add X to the linear array. Similarly, if Y is the value of the subsequent element, then we may assign A[5] := Y to add Y to the linear array. Now we may conclude that we cannot add any new element to this linear array, because the upper bound has been reached.

Now consider another example. Suppose MARKS is an 8-element linear array, and suppose four marks are in the array, as in Figure 3.5(a). Assume that the marks are stored in ascending order, and suppose we want to keep the array sorted at all times. Then, if we want to insert a new element, we may have to move the data downwards as shown in fig. 3.5(b), and if we want to delete some data, we may have to move the data upwards.

10 20 30 40

Figure 3.5 (a)

Suppose 25 needs to be inserted in the above array; then the new array would be

10 20 25 30 40

Figure 3.5 (b)

Now assume that 20 needs to be deleted from the above array; then the new array would be

10 25 30 40

Figure 3.5 (c)

It can be clearly observed that such movement of data would be very expensive if thousands of data items are in the array.

Insertion Algorithm: The following algorithm inserts a data element ITEM into the Kth position in a linear array A with N elements, with the assumption that K <= N.

INSERT(A, N, K, ITEM):

1. I = N
2. Repeat Steps 3 and 4 while I >= K.
3. A[I+1] = A[I]
4. I = I-1
5. A[K] = ITEM
6. N = N+1
7. Exit

Here the first step initializes the counter I with the value of N. The next three steps move the data downwards, starting from the last element, so that the necessary space for the new element can be created. In the fifth step the new item is inserted at the Kth location. In the end the number of elements is increased by one.
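A possible C rendering of the insertion algorithm is sketched below. To stay close to the text it keeps the 1-based positions of the algorithm, so slot 0 of the C array is simply left unused; the function name insert_at and this indexing convention are assumptions made for illustration.

```c
#include <assert.h>

/* Insert item at position k (1-based) of a[1..*n].
   The caller must make sure the array has room for index *n + 1. */
void insert_at(int a[], int *n, int k, int item)
{
    assert(k >= 1 && k <= *n + 1);
    for (int i = *n; i >= k; i--)   /* move elements downward, last one first */
        a[i + 1] = a[i];
    a[k] = item;                    /* place the new element in the gap */
    *n = *n + 1;                    /* one more element is now stored */
}
```

For the MARKS example, inserting 25 at position 3 of the array holding 10, 20, 30, 40 moves 30 and 40 one place down and gives 10, 20, 25, 30, 40.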

Deletion Algorithm: The following algorithm deletes the Kth element from a linear array A and assigns it to a variable ITEM, with the assumption that K <= N.

DELETE(A, N, K, ITEM):

1. ITEM = A[K]
2. Repeat for I = K to N-1
      A[I] = A[I+1]
3. N = N – 1.
4. Exit.

Here the first step is used to store the deleted data in a variable. The second step removes the data from the array by moving each subsequent element one location upward. The final step resets the value of N after deletion.
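The deletion algorithm can be sketched in C in the same way. As before, the positions are 1-based and slot 0 of the C array is unused; the function name delete_at is hypothetical, and the deleted value is returned instead of being stored in a parameter ITEM.

```c
#include <assert.h>

/* Delete the element at position k (1-based) of a[1..*n] and return it. */
int delete_at(int a[], int *n, int k)
{
    assert(k >= 1 && k <= *n);
    int item = a[k];                /* ITEM = A[K] */
    for (int i = k; i < *n; i++)    /* I = K to N-1: move elements upward */
        a[i] = a[i + 1];
    *n = *n - 1;                    /* one element fewer is now stored */
    return item;
}
```

Deleting position 2 from the array holding 10, 20, 25, 30, 40 returns 20 and leaves 10, 25, 30, 40, as in Figure 3.5 (c).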

We may conclude that if many deletions and insertions are to be made in a collection of data elements, then a linear array may not be the most efficient way of storing the data.

3.4 Multidimensional Arrays

The linear arrays discussed so far are also called one-dimensional arrays, since each element in the array is referenced by a single subscript. Now we will discuss multidimensional arrays, in which each element of the array is referenced by more than one subscript, depending on the number of dimensions. Most programming languages allow two-dimensional or three-dimensional arrays in general, but some programming languages allow the number of dimensions in an array to be higher.
3.4.1 Two-Dimensional Arrays

A two-dimensional m x n array A is a collection of m·n data elements such that each element is specified by a pair of integers (such as I, J), called subscripts, with the property that

1 <= I <= m and 1 <= J <= n

The element of A with first subscript I and second subscript J will be denoted by $A_{I,J}$ or A[I, J].

Two-dimensional arrays are generally called matrices in mathematics and tables in business applications; hence two-dimensional arrays are also called matrix arrays. The schematic of a two-dimensional array of size 3 × 5 is shown in Figure 3.6.

ROW – 1   A[0][0]  A[0][1]  A[0][2]  A[0][3]  A[0][4]
ROW – 2   A[1][0]  A[1][1]  A[1][2]  A[1][3]  A[1][4]
ROW – 3   A[2][0]  A[2][1]  A[2][2]  A[2][3]  A[2][4]

Figure 3.6

For example, the scores of five players in five matches can be shown as in fig. 3.7.

Player  Match1  Match2  Match3  Match4  Match5
A       29      78      55      0       4
B       65      101     88      9       76
C       0       0       76      199     88
D       9       8       76      44      65
E       44      5       3       9       100

Figure 3.7 (SCORE)

The above fig. 3.7 shows a 5*5 array consisting of 25 elements.

In the case of a two-dimensional array, the following formula yields the number of bytes of memory needed to hold it:

bytes = size of 1st index × size of 2nd index × sizeof(base type)

3.4.2 Representation of Two-Dimensional Arrays in Memory

Let A be a two-dimensional m x n array. Although A is viewed as a rectangular array of elements with m rows and n columns, the array will be represented in memory by a block of m*n sequential memory locations. Now the question arises of how these elements are sequenced, i.e. whether the elements are stored row-wise or column-wise. It actually depends on the programming language. Specifically, the programming language will store the array A either column by column, called column-major order, or row by row, called row-major order.

Figures 3.8 (a) and (b) show these two ways when A is a two-dimensional 3 x 4 array.

Figure 3.8 (a)    Figure 3.8 (b)

As we were able to find the address of any element of the array in the case of a linear array, a similar situation also holds for any two-dimensional M x N array A, i.e., the computer keeps track of Base(A), which is the address of the first element A[1, 1] of A, and computes the address LOC(A[I, J]).

The formula for column-major order is,

LOC(A[I, J]) = Base(A) + W*[M(J-1) + (I-1)]

The formula for row-major order is,

LOC(A[I, J]) = Base(A) + W*[N(I-1) + (J-1)]

For example, consider the previous example of fig. 3.7, the 5 x 5 matrix array SCORE. Suppose Base(SCORE) = 200 and there are W = 4 words per memory cell, and let the programming language store two-dimensional arrays using row-major order. Then the address of SCORE[3,3], the third match of the third player, follows:

LOC(SCORE[3,3]) = 200 + 4*[5*(3-1) + (3-1)] = 200 + 4*[12] = 248.
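Both address formulas can be sketched as small C functions. The names loc_row_major and loc_col_major are illustrative assumptions, subscripts are 1-based as in the text, and the addresses are treated as plain integers.

```c
#include <assert.h>

/* Row-major order: LOC(A[I, J]) = Base(A) + W * [N(I-1) + (J-1)],
   where n_cols is N, the number of columns. */
long loc_row_major(long base, long w, long n_cols, long i, long j)
{
    assert(i >= 1 && j >= 1 && j <= n_cols);
    return base + w * (n_cols * (i - 1) + (j - 1));
}

/* Column-major order: LOC(A[I, J]) = Base(A) + W * [M(J-1) + (I-1)],
   where m_rows is M, the number of rows. */
long loc_col_major(long base, long w, long m_rows, long i, long j)
{
    assert(i >= 1 && j >= 1 && i <= m_rows);
    return base + w * (m_rows * (j - 1) + (i - 1));
}
```

For the SCORE example, loc_row_major(200, 4, 5, 3, 3) gives 248. The two orders differ for an element such as SCORE[2,1]: row-major order places it at 220, while column-major order places it at 204.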

Two-dimensional arrays clearly show the difference between the logical and the physical views of data. Figure 3.7 shows how we could logically view a 5 x 5 matrix array. On the other hand, the data will be physically stored in memory by a linear collection of memory cells.

3.4.3 General Multidimensional Arrays

General multidimensional arrays are defined in a similar manner. More specifically, an n-dimensional $m_1 \times m_2 \times \cdots \times m_n$ array A is a collection of $m_1 \cdot m_2 \cdots m_n$ data elements in which each element is specified by a list of n integers such as $K_1, K_2, \ldots, K_n$, called subscripts, with the property that

$1 \le K_1 \le m_1,\ 1 \le K_2 \le m_2,\ \ldots,\ 1 \le K_n \le m_n$

The element of A with subscripts $K_1, K_2, \ldots, K_n$ will be denoted by

$A_{K_1, K_2, \ldots, K_n}$ or $A[K_1, K_2, \ldots, K_n]$

The array will be stored in memory in a sequence of memory locations. Again, the programming language will store the array A either in row-major order or column-major order.

The format of declaration of a multidimensional array in C is given below:

data_type array_name[expr 1][expr 2] … [expr n];

where data_type is the type of the array such as int, char etc., array_name is the name of the array and expr 1, expr 2, … expr n are positive-valued integer expressions.

3.5 Matrices

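A rectangular array of m * n numbers arranged in the form

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\]

is called an m*n matrix.

e.g. $\begin{bmatrix} 2 & 3 & 4 \\ 1 & -8 & 5 \end{bmatrix}$ is a 2*3 matrix and $\begin{bmatrix} 2 \\ 7 \\ -3 \end{bmatrix}$ is a 3*1 matrix.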

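The various types of matrices are defined below:

(a) A matrix having only one row, $\begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}$, is called a row matrix or row vector.

(b) A matrix having only one column, $\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$, is called a column matrix or column vector.

e.g. $\begin{bmatrix} 2 \\ 7 \\ -3 \end{bmatrix}$ is a column vector of order 3*1 and $\begin{bmatrix} -2 & -3 & -4 \end{bmatrix}$ is a row vector of order 1*3.

(c) A matrix with n rows and n columns,

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots &        & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\]

is called a square matrix of order n.

e.g. $\begin{bmatrix} 3 & 9 \\ 0 & -2 \end{bmatrix}$ is a square matrix of order 2.

(d) If all the elements are zero, the matrix is called a zero matrix or null matrix, denoted by $O_{m \times n}$.

e.g. $\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ is a 2*2 zero matrix, denoted by $O_2$.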

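(e) Let $A = [a_{ij}]_{n \times n}$ be a square matrix.

(i) If $a_{ij} = 0$ for all i, j, then A is called a zero matrix.

(ii) If $a_{ij} = 0$ for all i < j, then A is called a lower triangular matrix.

(iii) If $a_{ij} = 0$ for all i > j, then A is called an upper triangular matrix.

i.e.

\[
\begin{bmatrix}
a_{11} & 0      & \cdots & 0 \\
a_{21} & a_{22} & \ddots & \vdots \\
\vdots &        & \ddots & 0 \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\quad \text{(lower triangular matrix)}
\qquad
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0      & a_{22} &        & \vdots \\
\vdots & \ddots & \ddots & \vdots \\
0      & \cdots & 0      & a_{nn}
\end{bmatrix}
\quad \text{(upper triangular matrix)}
\]

e.g. $\begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & 0 & 4 \end{bmatrix}$ is a lower triangular matrix and $\begin{bmatrix} 2 & -3 \\ 0 & 5 \end{bmatrix}$ is an upper triangular matrix.

(f) Let $A = [a_{ij}]_{n \times n}$ be a square matrix. If $a_{ij} = 0$ for all $i \ne j$, then A is called a diagonal matrix.

e.g. $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 4 \end{bmatrix}$ is a diagonal matrix.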

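Arithmetic of Matrices:

(a) Two matrices A and B are equal iff they are of the same order and their corresponding elements are equal, i.e., $[a_{ij}]_{m \times n} = [b_{ij}]_{m \times n}$ means $a_{ij} = b_{ij}$ for all i, j.

e.g. $\begin{bmatrix} a & 2 \\ 4 & b \end{bmatrix} = \begin{bmatrix} -1 & c \\ d & 1 \end{bmatrix}$ means a = -1, b = 1, c = 2, d = 4.

(b) Let $A = [a_{ij}]_{m \times n}$. The transpose of A, denoted by $A^T$ or $A'$, is defined by

\[
A^T =
\begin{bmatrix}
a_{11} & a_{21} & \cdots & a_{m1} \\
a_{12} & a_{22} & \cdots & a_{m2} \\
\vdots & \vdots &        & \vdots \\
a_{1n} & a_{2n} & \cdots & a_{mn}
\end{bmatrix}_{n \times m}
\]

Thus, if

\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix},
\]

the transpose of A is

\[
A^T = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}^T
    = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix}
\]

(c) A square matrix A is called a symmetric matrix iff $A^T = A$.

e.g. $\begin{bmatrix} 1 & 3 & -1 \\ 3 & -3 & 0 \\ -1 & 0 & 6 \end{bmatrix}$ is a symmetric matrix and $\begin{bmatrix} 1 & 3 & -1 \\ 0 & -3 & 0 \\ -1 & 3 & 6 \end{bmatrix}$ is not a symmetric matrix.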
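The transpose and the symmetry test can be sketched in C as shown below. As an assumption made for illustration, each matrix is kept in a flat one-dimensional array in row-major order with 0-based indexes, so the element in row i and column j of an m x n matrix is a[i * n + j]; the function names transpose and is_symmetric are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Store in t (n x m) the transpose of a (m x n).
   Both matrices are flat arrays in row-major order. */
void transpose(int m, int n, const int *a, int *t)
{
    assert(m > 0 && n > 0);
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++)
            t[j * m + i] = a[i * n + j];   /* element (i, j) becomes (j, i) */
}

/* Return true if the n x n matrix a equals its own transpose. */
bool is_symmetric(int n, const int *a)
{
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)    /* compare each pair across the diagonal */
            if (a[i * n + j] != a[j * n + i])
                return false;
    return true;
}
```

Applied to the 3 x 3 matrix with rows 1 2 3, 4 5 6 and 7 8 9, transpose produces the rows 1 4 7, 2 5 8 and 3 6 9 shown above.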

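(d) Matrix Addition and Subtraction: Let $A = [a_{ij}]_{m \times n}$ and $B = [b_{ij}]_{m \times n}$. We can define A + B as the matrix $C = [c_{ij}]_{m \times n}$ of the same order such that $c_{ij} = a_{ij} + b_{ij}$ for all i = 1, 2, ..., m and j = 1, 2, ..., n.

Thus, if

\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} 4 & 5 & -6 \\ -7 & 8 & 9 \\ 1 & -2 & 3 \end{bmatrix},
\]

summing the two matrices yields

\[
C = A + B =
\begin{bmatrix}
a_{11}+b_{11} & a_{12}+b_{12} & a_{13}+b_{13} \\
a_{21}+b_{21} & a_{22}+b_{22} & a_{23}+b_{23} \\
a_{31}+b_{31} & a_{32}+b_{32} & a_{33}+b_{33}
\end{bmatrix}
=
\begin{bmatrix}
1+4 & 2+5 & 3-6 \\
4-7 & 5+8 & 6+9 \\
7+1 & 8-2 & 9+3
\end{bmatrix}
=
\begin{bmatrix}
5 & 7 & -3 \\
-3 & 13 & 15 \\
8 & 6 & 12
\end{bmatrix}
\]

For subtraction, we can simply subtract the corresponding elements:

\[
D = A - B =
\begin{bmatrix}
a_{11}-b_{11} & a_{12}-b_{12} & a_{13}-b_{13} \\
a_{21}-b_{21} & a_{22}-b_{22} & a_{23}-b_{23} \\
a_{31}-b_{31} & a_{32}-b_{32} & a_{33}-b_{33}
\end{bmatrix}
=
\begin{bmatrix}
1-4 & 2-5 & 3-(-6) \\
4-(-7) & 5-8 & 6-9 \\
7-1 & 8-(-2) & 9-3
\end{bmatrix}
=
\begin{bmatrix}
-3 & -3 & 9 \\
11 & -3 & -3 \\
6 & 10 & 6
\end{bmatrix}
\]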
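Element-wise addition and subtraction might be sketched in C as follows. The matrices are assumed to be stored as flat arrays in row-major order with 0-based indexes, and the function names mat_add and mat_sub are introduced only for illustration.

```c
#include <assert.h>

/* c = a + b for two m x n matrices stored as flat row-major arrays. */
void mat_add(int m, int n, const int *a, const int *b, int *c)
{
    assert(m > 0 && n > 0);
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++)
            c[i * n + j] = a[i * n + j] + b[i * n + j];
}

/* d = a - b for two m x n matrices stored as flat row-major arrays. */
void mat_sub(int m, int n, const int *a, const int *b, int *d)
{
    assert(m > 0 && n > 0);
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++)
            d[i * n + j] = a[i * n + j] - b[i * n + j];
}
```

Both functions require the two operands to be of the same order, which is exactly the condition stated in the definition.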

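(e) Matrix Multiplication: Let $A = [a_{ik}]_{m \times n}$ and $B = [b_{kj}]_{n \times p}$. Then the product AB is defined as the m × p matrix $C = [c_{ij}]_{m \times p}$ where

\[
c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{in} b_{nj} = \sum_{k=1}^{n} a_{ik} b_{kj}
\]

i.e.

\[
AB = \left[ \sum_{k=1}^{n} a_{ik} b_{kj} \right]_{m \times p}.
\]

If we want to multiply two matrices, then the number of columns of the first matrix should be equal to the number of rows of the second matrix.

If we wish to calculate the product of two matrices A and B:

\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix},
\]

then n = 3, and the product C = AB is defined by:

\[
AB =
\begin{bmatrix}
a_{11}b_{11}+a_{12}b_{21}+a_{13}b_{31} & a_{11}b_{12}+a_{12}b_{22}+a_{13}b_{32} & a_{11}b_{13}+a_{12}b_{23}+a_{13}b_{33} \\
a_{21}b_{11}+a_{22}b_{21}+a_{23}b_{31} & a_{21}b_{12}+a_{22}b_{22}+a_{23}b_{32} & a_{21}b_{13}+a_{22}b_{23}+a_{23}b_{33} \\
a_{31}b_{11}+a_{32}b_{21}+a_{33}b_{31} & a_{31}b_{12}+a_{32}b_{22}+a_{33}b_{32} & a_{31}b_{13}+a_{32}b_{23}+a_{33}b_{33}
\end{bmatrix}
\]

In other words, we multiply each of the elements of a row in the left-hand matrix by the corresponding elements of a column in the right-hand matrix (that’s why the number of elements in the row and the column must be equal), and then sum the resulting n products to obtain one element in the product matrix.

For example, suppose we have to find the product of two matrices, A and B.

\[
\text{Let } A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
\text{ and }
B = \begin{bmatrix} 4 & 5 & 6 \\ 7 & 8 & 9 \\ 1 & 2 & 3 \end{bmatrix}.
\]

We first note that multiplication of A by B is allowed because the number of columns in A is the same as the number of rows in B, which allows us to calculate C = AB as:

\[
C = AB =
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
\begin{bmatrix} 4 & 5 & 6 \\ 7 & 8 & 9 \\ 1 & 2 & 3 \end{bmatrix}
=
\begin{bmatrix}
1 \cdot 4 + 2 \cdot 7 + 3 \cdot 1 & 1 \cdot 5 + 2 \cdot 8 + 3 \cdot 2 & 1 \cdot 6 + 2 \cdot 9 + 3 \cdot 3 \\
4 \cdot 4 + 5 \cdot 7 + 6 \cdot 1 & 4 \cdot 5 + 5 \cdot 8 + 6 \cdot 2 & 4 \cdot 6 + 5 \cdot 9 + 6 \cdot 3 \\
7 \cdot 4 + 8 \cdot 7 + 9 \cdot 1 & 7 \cdot 5 + 8 \cdot 8 + 9 \cdot 2 & 7 \cdot 6 + 8 \cdot 9 + 9 \cdot 3
\end{bmatrix}
=
\begin{bmatrix}
21 & 27 & 33 \\
57 & 72 & 87 \\
93 & 117 & 141
\end{bmatrix}
\]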
The algorithm for matrix multiplication can be written as follows:

MATMUL(A, B, C, M, P, N): Here A is a matrix of order M*P and B is a matrix of order P*N. This algorithm calculates the product A*B and stores it in C, which is of order M*N.

1. Repeat Steps 2 to 4 for I = 1 to M
2.    Repeat Steps 3 and 4 for J = 1 to N
3.       C[I, J] = 0
4.       Repeat for K = 1 to P
            C[I, J] = C[I, J] + A[I, K] * B[K, J]
5. Exit.
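The MATMUL algorithm can be sketched in C with three nested loops. As an assumption for illustration, the matrices are stored as flat arrays in row-major order with 0-based indexes, and the function name mat_mul is hypothetical.

```c
#include <assert.h>

/* c (m x n) = a (m x p) * b (p x n); all matrices are flat row-major arrays. */
void mat_mul(int m, int p, int n, const int *a, const int *b, int *c)
{
    assert(m > 0 && p > 0 && n > 0);
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            int sum = 0;                          /* C[I, J] = 0 */
            for (int k = 0; k < p; k++)           /* row i of a times column j of b */
                sum += a[i * p + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}
```

For two n x n matrices the innermost statement is executed n*n*n times, so the running time of this straightforward method grows as the cube of n.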

3.6 Sparse Matrices

Matrices with a relatively high proportion of zero entries are called sparse matrices. Consider the matrices of Figure 3.9.

Figure 3.9

A triangular matrix is a square matrix in which all the elements either above or below the main diagonal are zero. Triangular matrices are sparse matrices. A tridiagonal matrix is a square matrix in which all the elements, except those on the main diagonal and on the diagonals immediately above and below it, are zero. Tridiagonal matrices are also sparse matrices.

Let us consider a sparse matrix from the storage point of view. Suppose that the entire sparse matrix is stored. Then a considerable amount of the memory which stores the matrix consists of zeros. This is nothing but wastage of memory. In real-life applications, such wastage may amount to megabytes. So an efficient method of storing sparse matrices has to be looked into.

Figure 3.10 shows a sparse matrix of order 7 × 6.

Figure 3.10

Suppose we want to store the triangular array A shown in fig. 3.11. Clearly a lot of space would be wasted if the normal matrix representation is used. Hence some other way is required to store the data in a triangular matrix. Actually, the data can be stored using a linear array.

A11  0    0    0    0    0    0
A21  A22  0    0    0    0    0
A31  A32  A33  0    0    0    0
…    …    …    …    …    …    …
…    …    …    …    …    …    …
…    …    …    …    …    …    …
An1  An2  An3  …    …    …    Ann

Fig. 3.11 (Matrix A)

The data can be stored using a linear array B as follows:

B[1] = a11, B[2] = a21, B[3] = a22, B[4] = a31, …

i.e. we have to generate a formula that gives the index L such that B[L] = a_JK.

It can easily be calculated by adding the number of elements in the rows before the row in which the element lies (row I of the triangular matrix holds I elements) and the number of columns up to the column in which the element lies.

L = 1 + 2 + 3 + … + (J-1) + K = J(J-1)/2 + K

provides the index that accesses the value a_JK from the linear array B.
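The index formula can be sketched in C as follows. The function names tri_index and tri_get are illustrative assumptions; the linear array b is used with 1-based indexes as in the text, so b[0] is left unused.

```c
#include <assert.h>

/* Position L in the linear array B of element a_JK of a lower
   triangular matrix (K <= J): L = J(J-1)/2 + K. */
int tri_index(int j, int k)
{
    assert(j >= 1 && k >= 1 && k <= j);
    return j * (j - 1) / 2 + k;
}

/* Value of a_JK: elements above the diagonal are zero and are not stored. */
int tri_get(const int *b, int j, int k)
{
    assert(j >= 1 && k >= 1);
    return (k > j) ? 0 : b[tri_index(j, k)];
}
```

An n x n lower triangular matrix stored this way needs only n(n+1)/2 locations instead of n*n, which is the saving that motivates the linear representation.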

3.7 Applications of Arrays

Arrays are simple but reliable, and can be used in more situations than you can count. Arrays are used in those problems where the number of items to be processed is fixed. They are easy to traverse, search and sort. It is easier to manipulate an array than the other data structures discussed in subsequent chapters. Arrays are used in those situations wherein the size of the array can be established beforehand. Also, they are used in situations where insertions and deletions are minimal or not present. Insertion and deletion operations will lead to wastage of memory or will increase the time complexity of the program due to the reshuffling of elements.

4. Summary

Arrays help to solve problems that require keeping track of many pieces of data. We must also learn about a special kind of arrays called sparse arrays. In a sparse array, a large proportion of the elements are zero, but those which are non-zero are randomly distributed. Arrays are used in programming languages for holding groups of elements all of the same kind. Vectors, matrices, chessboards, networks, polynomials, etc. can be represented as arrays.
5. Suggested Reading/Reference material

 “Data Structures Using C and C++”, Yedidyah Langsam, Moshe J. Augenstein, Aaron M. Tenenbaum, Second Edition, PHI Publications.

 “Algorithms + Data Structures = Programs” by Niklaus Wirth, PHI Publications.

 “Fundamentals of Data Structures in C++” by E. Horowitz, Sahni and D. Mehta; Galgotia Publications.

 “Data Structures and Program Design in C” by Kruse, C. L. Tondo and B. Leung; Pearson Education.

 “Fundamentals of Data Structures in C” by R. B. Patel, PHI Publications.

 “Data Structures and Algorithms”, V. Aho, Hopcroft, Ullman, LPE.

 “Data Structures”, Seymour Lipschutz, Schaum’s Outline Series.

6. Self Assessment Questions (SAQ)

 What are the various operations that could be performed with arrays?

 What is an array? Explain various types of arrays in detail.

 Write and explain the algorithm for inserting an element in an array.

 Write and explain the algorithm for deleting an element in an array.

 What is a sparse matrix? Explain its storage in computer memory.

 Why are sparse matrices studied specially in computer science?

 Explain how arrays are stored in the computer.

 Write an algorithm for multiplication of two matrices.

 Explain the following types of matrices: (i) Triangular matrix (ii) Tridiagonal matrix (iii) Symmetric matrix.

 Describe various applications of arrays.

 Consider the linear arrays A(5:50), B(-5:10) and C(18).

(a) Find the number of elements in each array.

(b) Suppose Base(A) = 300 and w = 4 words per memory cell for A. Find the address of A(15), A(35) and A(55).

 Suppose multidimensional arrays A and B are declared using A(-2:2, 2:22) and B(1:8, -5:5, -10:5)

(a) Find the length of each dimension and the number of elements in A and B.

(b) Consider the element B(3, 3, 3) in B. Find the effective indices E1, E2 and E3 and the address of the element, assuming Base(B) = 400 and there are w = 4 words per memory location.

 Let A be an n*n square matrix array. Write algorithms for the following:

(a) Find the number of nonzero elements in A.

(b) Find the sum of the elements above the diagonal.

(c) Find the product of the diagonal elements.

 Each student in a class of 30 students takes 6 tests in which scores range between 0 and 100. Suppose the test scores are stored in a 30 x 6 array. Write a module which

(a) Finds the average grade for each test.

(b) Finds the final grade for each student, where the final grade is the average of the student’s five highest test scores.

(c) Finds the number of students who have failed, i.e. whose final grade is less than 60.

(d) Finds the average of the final grades.

Lesson : 4 Writer : Dr. Pardeep Kumar Mittal

Title : Linked List - I Vetter : Prof. Rakesh Kumar

Structure:

1. Introduction

2. Objective

3. Presentation of Contents

3.1 Linked List

3.2 Representation of Linked List in Memory

3.3 Declaration of Linked List

3.4 Operations on Linked List

3.4.1 Traversal

3.4.2 Search

3.4.3 Insertion

3.4.4 Deletion

3.5 Applications

4. Summary

5. Suggested Readings/Reference material

6. Self Assessment Questions (SAQ)

1. Introduction

In the previous chapter, we discussed arrays. Arrays are data structures of fixed size, and insertion and deletion involve reshuffling of array elements, which makes array manipulation time-consuming and inefficient. In a linked list, by contrast, items can be added or removed easily at the beginning, at the end, or even in the middle. A linked list does not need contiguous space in computer memory, so memory can be utilized better than with arrays. In this chapter, we will discuss the linked list, its memory representation, operations on linked lists and their applications.

2. Objectives

The main objective of this chapter is to introduce linked lists to the readers. By the end of this chapter readers will have learned about linked lists and their basic properties, and will be able to appreciate the advantages of linked lists over arrays. Another important objective of this chapter is to explore the traversal, search, insertion and deletion operations on linked lists. The various applications of linked lists are also discussed. Finally, the implementation of a linked list in C is explained so that readers can see how to build and manipulate a linked list.

3. Presentation of Contents

3.1 Linked List

We know that the sequential representation of an ordered list is expensive when arbitrary elements are inserted or deleted, because the elements are stored at a fixed distance from one another in a fixed block of memory and must be moved. The linked representation reduces this expense: the elements are not stored at fixed distances and may be placed anywhere in memory, and operations such as insertion and deletion require changes in links only rather than movement of data.

A linked list is a linked representation of the ordered list. It is a linear collection of data elements, termed nodes, whose linear order is given by means of links or pointers. Every node consists of two parts. The first part, called INFO, contains the data, and the second part, called LINK, contains the address of the next node in the list. A variable called START always points to the first node of the list, and the link part of the last node always contains the null value. A null value in the START variable denotes that the list is empty. The linked list is shown in fig. 4.1.

START -> [A] -> [B] -> [C] -> [D] -> NULL

Figure 4.1(Representation of One-way Linked List with INFO and LINK field)

Along with the linked list, a special list is maintained in memory which consists of the unused memory cells or unused nodes. This list is called the list of available space, the availability list, the free storage list or the free pool. A variable AVAIL stores the starting address of the availability list.

Sometimes, during insertion, no space is available for inserting data into a data structure; this situation is called OVERFLOW. Programmers generally handle it by checking whether AVAIL is NULL. The situation where one wants to delete data from a data structure that is empty is called UNDERFLOW. It is encountered when START is NULL.

3.2 Representation of Linked List in Memory

When a linked list is maintained in computer memory, two lists are actually maintained: the list itself, which begins at START, and the list of available space. Let LIST be a linked list. Then LIST will be maintained in memory as follows. First of all, LIST requires two linear arrays, which we will call INFO and LINK, such that INFO[K] and LINK[K] contain the information part and the next-pointer field of a node of LIST respectively. START contains the location of the beginning of the list.

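The following example shows that more than one list may be maintained in the same linear arrays INFO and LINK: here the list that begins at START and the free list that begins at AVAIL share them. Figure 4.2 pictures a linked list in memory where each node of the list contains a single character.

START = 4, AVAIL = 2

        INFO    LINK
   1     C       6
   2             3
   3             5
   4     A       9
   5             7
   6     D       Null
   7             8
   8             10
   9     B       1
  10             Null

We can obtain the actual list of characters, or, in other words, the string, as follows: START = 4, so INFO[4] = A is the first character; LINK[4] = 9, so INFO[9] = B is the second; LINK[9] = 1, so INFO[1] = C is the third; LINK[1] = 6, so INFO[6] = D is the fourth; LINK[6] = Null, so the list has ended. The string is therefore ABCD. In the same way, the free list starting at AVAIL consists of cells 2, 3, 5, 7, 8 and 10.

Fig. 4.2(Memory Representation of One-Way Linked List)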
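The array representation can be tried out directly in C. The following is a minimal sketch that hard-codes the INFO and LINK values of Fig. 4.2; the names read_list and count_free, and the use of index 0 as the Null value (NIL), are our own assumptions and are not part of the figure.

```c
#include <stddef.h>

#define NIL   0    /* assumption: cell 0 is unused, so 0 plays the role of Null */
#define CELLS 11   /* cells 1..10 as in Fig. 4.2; cell 0 is unused */

/* INFO and LINK copied from Fig. 4.2 (0 in INFO marks a cell on the free list). */
static const char INFO[CELLS] = {0, 'C', 0, 0, 'A', 0, 'D', 0, 0, 'B', 0};
static const int  LINK[CELLS] = {0, 6, 3, 5, 9, 7, NIL, 8, 10, 1, NIL};
static const int  START = 4;
static const int  AVAIL = 2;

/* Follow LINK from START and copy each INFO character into out.
   Returns the number of characters written, excluding the terminator. */
size_t read_list(char *out, size_t cap)
{
    size_t n = 0;
    for (int ptr = START; ptr != NIL && n + 1 < cap; ptr = LINK[ptr])
        out[n++] = INFO[ptr];
    if (cap > 0)
        out[n] = '\0';
    return n;
}

/* Count the cells on the free list that starts at AVAIL. */
int count_free(void)
{
    int n = 0;
    for (int ptr = AVAIL; ptr != NIL; ptr = LINK[ptr])
        n++;
    return n;
}
```

Calling read_list with a buffer of at least five characters yields the string ABCD, and count_free reports the six free cells 2, 3, 5, 7, 8 and 10.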

3.3 Declaration of Linked List

Consider the following definition in C:

typedef struct node
{
    int data;
    struct node *next;
} list;

Once we have a definition for a list node, we can create a list simply by declaring a pointer to the first element, called the “head”. A pointer is used instead of a regular variable so that the list can be empty (head is NULL) and can grow or shrink at run time. A list can be defined as

list *head;

It is as simple as that! You now have a linked list data structure. It is not very useful at the moment: all you can do is check whether the list is empty. The following program shows how to declare and define a list using pointers.

#include <stdio.h>

typedef struct node
{
    int data;
    struct node *next;
} list;

int main(void)
{
    list *head = NULL; /* initialize list head to NULL */

    if (head == NULL)
        printf("The list is empty!\n");
    return 0;
}

Program: Creation of a linked list

3.4 Operations on Linked List

We may perform the following operations on Linked List

(a) Traversal: Processing each element in the Linked List.

(b) Search: Finding the location of the element with a given value

(c) Insertion: Adding a new element to the Linked List.

(d) Deletion: Removing an element from the Linked List.

3.4.1 Traversing Linked List

In many list-processing operations, we must process each node in the list in sequence; this is called traversing a list. To traverse a list in order, we must start at the list head and follow the list pointers. The list is traversed by using the assignment PTR := LINK[PTR], which moves the pointer to the next node in the list. As shown in fig. 4.3, traversal starts by processing the first node, moves ahead via the next (link) field, and continues until the next field is NULL.

Figure 4.3(Traversing a Linked-List on a node by node basis)

The algorithm for traversal can be written as follows.

Algorithm (Traversing a Linked List) This algorithm traverses a list applying an operation

PROCESS to each element of the list. The variable PTR points to the node currently being

processed.

1. PTR = START.

2. Repeat Steps 3 and 4 while PTR ≠ NULL.

3. Apply PROCESS to INFO[PTR].

4. PTR =LINK[PTR].

5. Exit

The algorithm starts by initializing PTR to START. It then processes INFO[PTR], the information at the first node, updates PTR by the assignment PTR := LINK[PTR], processes INFO[PTR], the information at the second node, and so on until PTR = NULL, which signals the end of the list.
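The traversal algorithm can be sketched in C using the node type of Section 3.3. This is a minimal sketch under our own assumptions: the names traverse and add_to_sum and the extra ctx argument are not from the text; they are introduced so that PROCESS can be any function supplied by the caller.

```c
#include <stddef.h>

typedef struct node {
    int data;
    struct node *next;
} list;

/* Apply process() to the data of every node, from head to the end of the list.
   ctx is passed through unchanged so that process() can keep its own state. */
void traverse(const list *head, void (*process)(int value, void *ctx), void *ctx)
{
    for (const list *ptr = head; ptr != NULL; ptr = ptr->next)   /* PTR = LINK[PTR] */
        process(ptr->data, ctx);                                  /* PROCESS INFO[PTR] */
}

/* One possible PROCESS: add each element to a running total of type long. */
void add_to_sum(int value, void *ctx)
{
    *(long *)ctx += value;
}
```

For a list holding 1, 2 and 3, calling traverse(head, add_to_sum, &sum) with sum initially 0 leaves sum equal to 6. An empty list (head is NULL) is handled by the loop condition without any special case.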

The following procedure uses the above traversal algorithm to count the number of elements in a linked list.

Procedure: COUNT (INFO, LINK, START, NUM)

1. NUM = 0.

2. PTR = START.

3. Repeat Steps 4 and 5 while PTR ≠ NULL.

4. NUM = NUM+1.

5. PTR = LINK[PTR].

6. Return.

3.4.2 Searching in Linked List

Suppose we have to search for an ITEM in a given list. We will discuss two searching algorithms for finding the location LOC of the node where ITEM first appears in LIST. The first algorithm does not assume that the data is in sorted order, while the second algorithm does. Both algorithms find only the first occurrence of ITEM in the list.

LIST is Unsorted: If the data in the given list are not necessarily sorted, then we search for ITEM by traversing the list using a pointer variable PTR and comparing ITEM with the contents INFO[PTR] of each node, one by one, updating the pointer by PTR := LINK[PTR]. Each pass through the loop requires two tests. First we check whether we have reached the end of the list, i.e., whether PTR = NULL. If not, then we check whether INFO[PTR] = ITEM. If ITEM is found in the list, its location is returned in LOC; otherwise NULL is stored in LOC.

Algorithm: SEARCH (INFO, LINK, START, ITEM, LOC)

This algorithm finds the location LOC of the node where ITEM first appears in LIST, or sets

LOC=NULL.

1. PTR = START.

2. Repeat Step 3 while PTR ≠ NULL

3. If ITEM = INFO[PTR] then

LOC = PTR and Exit.

Else

PTR = LINK[PTR].

4. LOC =NULL.

5. Exit
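Algorithm SEARCH translates almost line for line into C. The sketch below assumes the pointer-based node type of Section 3.3; the function name search is our own choice, and the returned pointer plays the role of LOC.

```c
#include <stddef.h>

typedef struct node {
    int data;
    struct node *next;
} list;

/* Return the first node whose data equals item, or NULL if there is none. */
list *search(list *head, int item)
{
    for (list *ptr = head; ptr != NULL; ptr = ptr->next) {
        if (ptr->data == item)
            return ptr;          /* LOC = PTR and Exit */
    }
    return NULL;                 /* LOC = NULL */
}
```

In the worst case every node is visited, so the running time is proportional to the number of nodes in the list.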

LIST is Sorted: If the data in LIST are sorted in ascending order, then we can search for ITEM in LIST by traversing the list using a pointer variable PTR and comparing ITEM with the contents INFO[PTR] of each node, one by one. Here we can stop as soon as ITEM matches a node of the list or INFO[PTR] exceeds ITEM, since in the latter case ITEM cannot appear further on.

Algorithm: SEARCHSORT(INFO, LINK, START, ITEM, LOC)

This algorithm finds the location LOC of the node where ITEM first appears in LIST, or sets

LOC=NULL. Here the given list is assumed to be sorted in ascending order.

1. PTR = START.

2. Repeat Step 3 while PTR ≠ NULL

3. If ITEM > INFO[PTR], then

PTR = LINK[PTR].

Else if ITEM = INFO[PTR], then

LOC = PTR, and Exit.

Else

LOC = NULL, and Exit.

4. LOC = NULL.

5. Exit.
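A possible C version of the sorted search is sketched below. It assumes that the list is sorted in ascending order and uses the node type of Section 3.3; the name search_sorted is our own.

```c
#include <stddef.h>

typedef struct node {
    int data;
    struct node *next;
} list;

/* Search an ascending sorted list. Stops early once a node larger than item
   is reached, because item cannot occur after that point. */
list *search_sorted(list *head, int item)
{
    list *ptr = head;
    while (ptr != NULL && ptr->data < item)   /* item still larger: keep going */
        ptr = ptr->next;
    if (ptr != NULL && ptr->data == item)
        return ptr;                           /* found: this is LOC */
    return NULL;                              /* passed the place where item would be */
}
```

The early exit saves work only for unsuccessful searches; the worst case is still proportional to the length of the list, because a linked list cannot be indexed directly and binary search does not apply.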

3.4.3 Insertion in Linked List

Let LIST be a linked list with successive nodes A and B, as pictured in Figure 4.4. Suppose a node N is to be inserted into the list between nodes A and B. The schematic diagram of such an insertion appears in Figure 4.5.

Figure 4.4(Linked-List Before Insertion)

Figure 4.5(Linked-List After Insertion)

Insertion Algorithms: We will be discussing three algorithms for insertions, which are:

(a) The first one inserts a node at the beginning of the list.

(b) The second one inserts a node after the node with a given location.

(c) The third one inserts a node into a sorted list.

In all the following algorithms it is assumed that the linked list is in memory and that the variable ITEM contains the new information to be added to the list. Since our insertion algorithms will use a node from the AVAIL list, all of the algorithms will include the following steps:

(i) Checking whether space is available in the AVAIL list. If not, that is, if AVAIL=NULL, then the algorithm will print the message OVERFLOW.

(ii) Removing the first node from the AVAIL list. Using the variable NEW to keep track of the location of the new node, this step can be implemented by the pair of assignments NEW = AVAIL, AVAIL = LINK[AVAIL].

(iii) Copying the new information into the new node, i.e. INFO[NEW] := ITEM.

Insertion at the Beginning of a List: The linked list is assumed not to be sorted. The

following algorithm inserts the node at the beginning of the list.

Algorithm: INSBEG(INFO, LINK, START, AVAIL, ITEM)

This algorithm inserts ITEM as the first node in the list.

1. If AVAIL = NULL, then: Write OVERFLOW and Exit.

2. NEW = AVAIL and AVAIL = LINK [AVAIL].

3. INFO[NEW] = ITEM.

4. LINK[NEW] = START.

5. START := NEW

6. Exit.
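To see how INSBEG uses the AVAIL list, the algorithm can be sketched with the array representation of Section 3.2. Everything about the container is an assumption made for illustration: the struct array_list, the size CAPACITY, the value NIL = -1 standing for NULL, and the function names list_init and insert_begin.

```c
#include <stdbool.h>

#define NIL      (-1)   /* assumption: -1 stands for NULL, since 0 is a valid index */
#define CAPACITY 8      /* assumption: a small fixed pool of nodes */

typedef struct {
    int info[CAPACITY];
    int link[CAPACITY];
    int start;          /* index of the first node, or NIL */
    int avail;          /* index of the first free node, or NIL */
} array_list;

/* Make the list empty and chain every cell into the AVAIL list. */
void list_init(array_list *l)
{
    for (int i = 0; i < CAPACITY - 1; i++)
        l->link[i] = i + 1;
    l->link[CAPACITY - 1] = NIL;
    l->start = NIL;
    l->avail = 0;
}

/* INSBEG: insert item as the first node. Returns false on OVERFLOW. */
bool insert_begin(array_list *l, int item)
{
    if (l->avail == NIL)
        return false;                  /* step 1: OVERFLOW */
    int new_node = l->avail;           /* step 2: NEW = AVAIL */
    l->avail = l->link[l->avail];      /*         AVAIL = LINK[AVAIL] */
    l->info[new_node] = item;          /* step 3 */
    l->link[new_node] = l->start;      /* step 4 */
    l->start = new_node;               /* step 5 */
    return true;
}
```

After CAPACITY successful insertions the AVAIL list is exhausted and the next call reports OVERFLOW by returning false, without changing the list.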

Figs. 4.6 and 4.7 show the insertion of a node at the beginning of a linked list.

Figure 4.6(Representation of Linked-List Before Insertion)

Figure 4.7(Linked-List after Inserting an Element at the Beginning of the List)

Insertion after a Given Node : Suppose we are given LOC, the location of the node after which the new node is to be inserted. The following algorithm inserts ITEM into the given list so that ITEM follows the node at location LOC, or so that ITEM becomes the first node when LOC = NULL.

Algorithm : INSAFTERLOC(INFO, LINK, START, AVAIL, LOC, ITEM)

1. If AVAIL=NULL, then Write OVERFLOW and Exit.

2. NEW = AVAIL and AVAIL = LINK[AVAIL].

3. INFO[NEW] = ITEM.

4. If LOC = NULL, then

LINK[NEW] = START and START = NEW.

Else

LINK[NEW] = LINK [LOC] and LINK[LOC] =NEW.

5. Exit.
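With pointer-based nodes, INSAFTERLOC can be sketched as follows. Here malloc stands in for taking a node from the AVAIL list, and a failed allocation corresponds to OVERFLOW; this substitution, the function name insert_after and the bool return value are our own assumptions.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct node {
    int data;
    struct node *next;
} list;

/* Insert item after the node loc, or as the first node when loc is NULL.
   start is passed by address because the first case changes it. */
bool insert_after(list **start, list *loc, int item)
{
    list *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return false;                  /* OVERFLOW */
    new_node->data = item;
    if (loc == NULL) {
        new_node->next = *start;       /* LINK[NEW] = START */
        *start = new_node;             /* START = NEW */
    } else {
        new_node->next = loc->next;    /* LINK[NEW] = LINK[LOC] */
        loc->next = new_node;          /* LINK[LOC] = NEW */
    }
    return true;
}
```

The order of the two assignments in each branch matters: the new node must be linked to its successor before the predecessor (or START) is redirected, otherwise the rest of the list would be lost.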

Fig. 4.8: LOC is the location after which new node is to be inserted

Fig. 4.9: Insertion after a given node

Inserting into a Sorted Linked List : Suppose ITEM is to be inserted into a sorted linked list. ITEM must be inserted between nodes A and B so that INFO(A) ≤ ITEM < INFO(B).

In this case we first have to find the location LOC of node A, the node after which the new node is to be inserted. For this we will write a procedure that finds LOC.

Traverse the list, using a pointer variable PTR and comparing ITEM with INFO[PTR] at each node. While traversing, we keep track of the location of the preceding node by using a pointer variable SAVE, as pictured in figure 4.10. Thus SAVE and PTR are updated by the assignments

SAVE = PTR and PTR = LINK[PTR]

Figure 4.10(PTR is the node currently being processed and SAVE is the pointer to the previous node)

The traversing stops as soon as ITEM < INFO[PTR]. Then PTR points to node B, so SAVE contains the location of node A. If the list is empty, or ITEM is smaller than the first element, the procedure sets LOC = NULL so that ITEM is inserted as the first node.

Procedure : FINDLOCA(INFO, LINK, START, ITEM, LOC)

1. If START = NULL, then LOC = NULL, and Return.

2. If ITEM <INFO[START], then LOC = NULL, and Return.

3. SAVE = START and PTR = LINK[START].

4. Repeat Steps 5 and 6 while PTR ≠ NULL.

5. If ITEM < INFO[PTR], then

LOC = SAVE and Return.

6. SAVE = PTR and PTR = LINK[PTR].

7. LOC = SAVE.

8. Return

Algorithm : INSSORT(INFO, LINK, START, AVAIL, ITEM)

This algorithm inserts ITEM into a sorted linked list.

1. Call FINDLOCA(INFO, LINK, START, ITEM, LOC).

2. Call INSAFTERLOC(INFO, LINK, START, AVAIL, LOC, ITEM).

3. Exit.
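FINDLOCA and INSSORT can be combined into a short C sketch. It assumes an ascending list of the node type from Section 3.3 and uses malloc in place of the AVAIL list; the names find_loc_a and insert_sorted are our own.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct node {
    int data;
    struct node *next;
} list;

/* FINDLOCA: return the node after which item belongs,
   or NULL when item must become the first node. */
list *find_loc_a(list *start, int item)
{
    if (start == NULL || item < start->data)
        return NULL;
    list *save = start;
    list *ptr = start->next;
    while (ptr != NULL && item >= ptr->data) {   /* stop when item < INFO[PTR] */
        save = ptr;
        ptr = ptr->next;
    }
    return save;
}

/* INSSORT: insert item so that the list stays in ascending order. */
bool insert_sorted(list **start, int item)
{
    list *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return false;                            /* OVERFLOW */
    new_node->data = item;
    list *loc = find_loc_a(*start, item);
    if (loc == NULL) {                           /* new first node */
        new_node->next = *start;
        *start = new_node;
    } else {                                     /* insert after loc */
        new_node->next = loc->next;
        loc->next = new_node;
    }
    return true;
}
```

Inserting 30, 10, 20 and 20 in that order into an empty list produces the list 10, 20, 20, 30; a duplicate is placed after the existing equal elements.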

3.4.4 Deletion from Linked List

Let LIST be a linked list with a node N between nodes A and B, as pictured in Figure 4.11. Suppose node N is to be deleted from the linked list. The schematic diagram of such a deletion appears in Figure 4.12. The deletion occurs as soon as the next-pointer field of node A is changed so that it points to node B. Therefore, when performing deletions, one must keep track of the address of the node which immediately precedes the node that is to be deleted.

Figure 4.11(Representation of Linked-List before deletion)

Figure 4.12(Linked-List after deleting a node N)

Deletion Algorithms

We will discuss two deletion algorithms.

(a) The first one deletes a node following a given node.

(b) The second one deletes the node with a given ITEM of information.

All our algorithms assume that the linked list is in memory in the form LIST(INFO, LINK, START, AVAIL).

All of our deletion algorithms will return the memory space of the deleted node N to the beginning of the AVAIL list. Accordingly, all of our algorithms will include the following pair of assignments, where LOC is the location of the deleted node N:

LINK[LOC] := AVAIL and then AVAIL := LOC

If START=NULL, then the algorithm will print the message UNDERFLOW.

Deleting the Node Following a Given Node

Let LIST be a linked list in memory. Suppose we are given the location LOC of a

node N in LIST. Furthermore, suppose we are given the location LOCP of the node preceding

N or, when N is the first node, we are given LOCP=NULL. The following algorithm deletes

N from the list.

Algorithm : DEL(INFO, LINK, START, AVAIL, LOC, LOCP)

This algorithm deletes the node N with location LOC. LOCP is the location of the node

which precedes N or, when N is the first node, LOCP=NULL.

1. If LOCP = NULL, then:

Set START:=LINK[START].

Else:

Set LINK[LOCP] := LINK[LOC].

2. Set LINK[LOC] := AVAIL and AVAIL := LOC.

3. Exit.
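The way DEL returns a node to the AVAIL list is easiest to see with the array representation. The sketch below is only an illustration: the struct array_list, the size CAPACITY, the value NIL = -1 standing for NULL and the name delete_node are our own assumptions. As in the algorithm, the caller must supply valid values of LOC and LOCP.

```c
#define NIL      (-1)   /* assumption: -1 stands for NULL */
#define CAPACITY 8      /* assumption: a small fixed pool of nodes */

typedef struct {
    int info[CAPACITY];
    int link[CAPACITY];
    int start;          /* index of the first node, or NIL */
    int avail;          /* index of the first free node, or NIL */
} array_list;

/* DEL: unlink the node at loc, whose predecessor is at locp
   (locp is NIL when loc is the first node), and free it. */
void delete_node(array_list *l, int loc, int locp)
{
    if (locp == NIL)
        l->start = l->link[l->start];    /* START = LINK[START] */
    else
        l->link[locp] = l->link[loc];    /* LINK[LOCP] = LINK[LOC] */
    l->link[loc] = l->avail;             /* LINK[LOC] = AVAIL */
    l->avail = loc;                      /* AVAIL = LOC */
}
```

Because the freed cell is placed at the front of the AVAIL list, it is the first cell to be reused by the next insertion.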

Deleting the Node with a Given ITEM of Information

Let LIST be a linked list in memory. Suppose we are given an ITEM of information

and we want to delete from the LIST the first node N which contains ITEM (If ITEM is a key

value, then only one node can contain ITEM). First we give a procedure which finds the

location LOC of the node N containing ITEM and the location LOCP of the node preceding

node N. If N is the first node, we set LOCP = NULL, and if ITEM does not appear in LIST,

we set LOC = NULL. (This procedure is similar to Procedure FINDLOCA used for insertion into a sorted list.)

Traverse the list, using a pointer variable PTR and comparing ITEM with INFO[PTR] at each

node. While traversing, keep track of the location of the preceding node by using a pointer

variable SAVE as in figure 4.10. Thus SAVE and PTR are updated by the assignments

SAVE := PTR and PTR := LINK[PTR]

The traversing stops as soon as ITEM = INFO[PTR]. Then PTR contains the location LOC of

node N and SAVE contains the location LOCP of the node preceding N.

The formal statement of our procedure follows. The cases where the list is empty or where

INFO[START]=ITEM(i.e., where node N is the first node) are treated separately, since they

do not involve the variable SAVE.

Procedure : FINDLOC1(INFO, LINK, START, ITEM, LOC, LOCP)

This procedure finds the location LOC of the first node N which contains ITEM and the

location LOCP of the node preceding N. If ITEM does not appear in the list, then the

procedure sets LOC=NULL; and if ITEM appears in the first node, then it sets

LOCP=NULL.

1. If START = NULL, then:

Set LOC:=NULL and LOCP:=NULL, and Return.

2. If INFO[START] = ITEM, then:

Set LOC:=START and LOCP=NULL, and Return.

3. Set SAVE := START and PTR := LINK[START].

4. Repeat Steps 5 and 6 while PTR ≠ NULL.

5. If INFO[PTR] = ITEM, then:

Set LOC := PTR and LOCP := SAVE, and Return.

6. SAVE := PTR and PTR := LINK[PTR].

7. Set LOC := NULL.

8. Return.

Algorithm : DELETE (INFO, LINK, START, AVAIL, ITEM)

This algorithm deletes from a linked list the first node N which contains the given ITEM of

information.

1. Call FINDLOC1(INFO, LINK, START, ITEM, LOC, LOCP)

2. If LOC = NULL, then:

Write: ITEM not in list, and Exit.

3. If LOCP = NULL, then:

Set START := LINK[START].

Else:

Set LINK[LOCP] := LINK[LOC].

4. LINK[LOC] := AVAIL and AVAIL = LOC.

5. Exit.
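With pointer-based nodes, FINDLOC1 and DELETE fit into a single C function. In this sketch free stands in for returning the node to the AVAIL list, and the nodes are assumed to have been allocated with malloc; the name delete_item and the bool return value are our own.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct node {
    int data;
    struct node *next;
} list;

/* Delete the first node containing item. Returns false if item is not in the list. */
bool delete_item(list **start, int item)
{
    list *save = NULL;                 /* LOCP: node preceding ptr */
    list *ptr = *start;                /* LOC: candidate node */
    while (ptr != NULL && ptr->data != item) {
        save = ptr;
        ptr = ptr->next;
    }
    if (ptr == NULL)
        return false;                  /* ITEM not in list (or list empty) */
    if (save == NULL)
        *start = ptr->next;            /* START = LINK[START] */
    else
        save->next = ptr->next;        /* LINK[LOCP] = LINK[LOC] */
    free(ptr);                         /* return the node to free storage */
    return true;
}
```

Starting the loop with save equal to NULL removes the need to treat the empty list and the first node as separate cases before the loop, which the procedure in the text does explicitly.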

3.5. Applications

Lists are used to maintain POLYNOMIALS in memory. For example, consider the function f(x) = 7x^5 + 9x^4 - 6x^3 + 3x^2. Figure 4.13 depicts the representation of this polynomial using a singly linked list. 1000, 1050, 1200, 1300 are memory addresses.

Figure 4.13(Representation of Polynomial using Linked-List)

A polynomial is a sum of terms, each of which consists of a coefficient and an exponent; ‘x’ is a formal parameter. In a computer, we implement the polynomial as a list of structures, each holding a coefficient and an exponent.
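As an illustration, one node per term can be declared and the polynomial evaluated by a single traversal. This is a minimal sketch: the struct name term, its field names and the function poly_eval are our own assumptions, and exponents are assumed to be non-negative integers.

```c
#include <stddef.h>

typedef struct term {
    int coeff;            /* coefficient of the term */
    int exp;              /* exponent of x, assumed >= 0 */
    struct term *next;    /* next term of the polynomial */
} term;

/* Evaluate the polynomial at x by visiting every term once. */
double poly_eval(const term *head, double x)
{
    double sum = 0.0;
    for (const term *t = head; t != NULL; t = t->next) {
        double power = 1.0;
        for (int i = 0; i < t->exp; i++)   /* power = x raised to t->exp */
            power *= x;
        sum += t->coeff * power;
    }
    return sum;
}
```

For the polynomial 7x^5 + 9x^4 - 6x^3 + 3x^2, stored as four nodes, poly_eval gives 13 at x = 1 and 332 at x = 2. Only nonzero terms need a node, which is why a linked list suits polynomials with many missing powers.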

4. Summary

The advantage of lists over arrays is flexibility. Overflow is not a problem until the computer memory is exhausted. When the individual records are quite large, it may be difficult to determine the amount of contiguous storage that the required arrays will need. With dynamic allocation, there is no need to allocate in advance. Changes to the list, such as insertions and deletions in the middle, can be made more quickly than in contiguous lists.

The drawback of lists is that the links themselves take space in addition to the space needed for data. One more drawback is that lists are not suited for random access: we need to traverse the list from the start to reach a desired node.

5. Suggested Readings/Reference material

 “Data Structures Using C and C++”, Yedidyah Langsam, Moshe J. Augenstein, Aaron M. Tenenbaum, Second Edition, PHI Publications.

 “Algorithms + Data Structures = Programs”, Niklaus Wirth, PHI Publications.

 “Fundamentals of Data Structures in C++”, E. Horowitz, Sahni and D. Mehta, Galgotia Publications.

 “Data Structures and Program Design in C”, Kruse, C. L. Tondo and B. Leung, Pearson Education.

 “Fundamentals of Data Structures in C”, R. B. Patel, PHI Publications.

 “Data Structures and Algorithms”, A. V. Aho, Hopcroft, Ullman, Pearson India.

 “Data Structures”, Seymour Lipschutz, Schaum’s Outline Series, TMH.

6. Self Assessment Questions (SAQ)

 Write procedures for the following:

a) Find the number of times a given item occurs in a linked list.

b) Find the number of nonzero elements in the linked list.

c) Add a given value to each element in the linked list.

 Write an algorithm which deletes the last node from the linked list.

 Write an algorithm which copies the contents of one linked list into another.

 Write procedures for the following:

a) Find the maximum value in the linked list.

b) Find the average of the values in the linked list.

c) Find the product of the elements in the linked list.

 Write a procedure which deletes the Kth element from the linked list.

 Write a procedure which adds a given item at the end of the linked list.

 Differentiate between an array and a linked list.

 Write a procedure which interchanges two elements in the linked list.

Lesson : 5 Writer : Dr. Pardeep Kumar Mittal

Title : Linked List - II Vetter : Prof. Rakesh Kumar

Structure:

1. Introduction

2. Objective

3. Presentation of Contents

3.1 Header Linked List

3.1.1. Operations on Header Linked Lists

3.2 Circular Linked List

3.2.1 Operations on Circular Linked Lists

3.2.2 Advantages of Circular Linked Lists

3.3 Two Way(Doubly) Linked List

3.3.1 Two way Header Lists

3.3.2 Operations on two way lists

3.3.3 Differences between One-way linked lists and Two-way linked lists

4. Summary

5. Suggested Readings/Reference material

6. Self Assessment Questions (SAQ)

1. Introduction

In the previous chapter, we discussed singly linked lists along with the various operations that can be performed on them. In this chapter, we discuss some variations of the linked list, namely header linked lists and circular linked lists, which have their own applications. Doubly linked lists are also discussed; they can be considered an improvement over singly linked lists when one has to traverse the list in the forward direction, the backward direction, or both. Operations on header linked lists, circular linked lists and doubly linked lists are also discussed in this chapter.

2. Objectives

In this chapter readers will learn about header and circular linked lists along with

various operations on header and circular linked lists. The implementation of these operations

in C language is also described. Another objective of this chapter is to explore doubly linked lists and various operations on them. In the end the implementation of various operations on doubly linked lists is also described.

3. Presentation of Contents

3.1 Header Linked List

A header linked list is a linked list which always contains a special node,

called the header node, at the beginning of the list. The following are two kinds of widely

used header lists:

1. A grounded header list is a header list where the last node contains the null pointer.

2. A circular header list is a header list where the last node points back to the header

node.

Figure 5.1(a) & 5.1(b) contains schematic diagrams of these header lists. Unless otherwise

stated or implied, our header lists will always be circular. Accordingly, in such a case, the

header node also acts as a sentinel indicating the end of the list.

The header node of a linked list can hold global properties of the entire list and act as a utility node. For example, the header node can hold a count of the number of nodes in the list, which is updated whenever a node is added or deleted. The length of the list can then be obtained in O(1) time, without traversing the list.
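The idea of keeping a node count in the header can be sketched in C as follows. This is one possible arrangement under our own assumptions: the data field of the header node is reused to hold the count, the list is a grounded header list, malloc and free stand in for the AVAIL list, and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct node {
    int data;             /* in the header node: the number of data nodes */
    struct node *next;
} list;

/* Create an empty grounded header list. Returns NULL if no memory is available. */
list *header_new(void)
{
    list *header = malloc(sizeof *header);
    if (header != NULL) {
        header->data = 0;
        header->next = NULL;
    }
    return header;
}

/* Insert item right after the header and update the count. */
bool header_insert_front(list *header, int item)
{
    list *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return false;                 /* OVERFLOW */
    new_node->data = item;
    new_node->next = header->next;
    header->next = new_node;
    header->data++;
    return true;
}

/* Delete the first data node and update the count. Returns false on UNDERFLOW. */
bool header_delete_front(list *header)
{
    list *first = header->next;
    if (first == NULL)
        return false;
    header->next = first->next;
    free(first);
    header->data--;
    return true;
}

/* O(1): the count is read from the header, no traversal is needed. */
int header_count(const list *header)
{
    return header->data;
}
```

Because the header node is always present, neither insertion nor deletion has to change the list pointer itself, which removes the special case for the first node.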

Figure 5.1 (a): Grounded Header List

Figure 5.1 (b): Circular Header List

Observe that the list pointer START always points to the header node. Hence,

LINK[START] = NULL indicates that a grounded header list is empty, and LINK[START] =

START indicates that a circular header list is empty.

3.1.1 Operations on Header Linked Lists: The following are the various operations

performed on a grounded header list.

1. Traversing a grounded header list

2. Searching in a grounded header list

3. Deleting from a grounded header list

4. Inserting in a grounded header list

Let us discuss the respective algorithms below.

Algorithm : (Traversing a Grounded Header List) Let LIST be a grounded header list in memory. This algorithm traverses LIST, applying an operation PROCESS to each node of LIST.

1. PTR = LINK[ START].

2. Repeat Steps 3 and 4 while PTR ≠ NULL

3. Apply PROCESS to INFO[PTR].

4. PTR = LINK[PTR].

5. Exit.

Explanation : The algorithm for traversing a grounded header linked list is similar to a

simple linked list except that the processing starts from LINK[START] instead of START as

START provides the address of the Header Node.

Algorithm : (Searching a Grounded Header List) SRCHGHL(INFO, LINK, START, ITEM,

LOC)

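LIST is a grounded header list in memory. This algorithm finds the location LOC of the node where ITEM first appears in LIST or sets LOC = NULL.

1. PTR = LINK[START].

2. Repeat while PTR ≠ NULL and INFO[PTR] ≠ ITEM

PTR = LINK[PTR].

3. LOC = PTR.

4. Exit.

The test PTR ≠ NULL is made first so that INFO[PTR] is never examined when PTR is NULL. When the loop ends, PTR is either the location of the first node containing ITEM or NULL, so it can be assigned to LOC directly.
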
Deletion from a grounded header list: For performing deletion it is assumed that we are given the item of information to be deleted. Therefore we first have to find the location of the node containing the item and the location of the node preceding it. For this purpose we first consider a procedure that finds these locations, and then the algorithm for deleting the node is written.

Procedure : FINDLOC(INFO, LINK, START, ITEM, LOC, LOCP)

This procedure finds the location LOC of the first node N which contains ITEM and the

location LOCP of the node preceding N. If ITEM does not appear in the list, then the

procedure sets LOC=NULL.

1. If START = NULL, then:

LOC = NULL and LOCP = NULL, and Return.

2. SAVE = START and PTR = LINK[START].

3. Repeat Steps 4 and 5 while PTR ≠ NULL.

4. If INFO[PTR] = ITEM, then:

LOC = PTR and LOCP = SAVE, and Return.

5. SAVE = PTR and PTR = LINK[PTR].

6. LOC = NULL.

7. Return.

Algorithm : DELETE (INFO, LINK, START, AVAIL, ITEM)

This algorithm deletes from a grounded header list the first node N which contains the given ITEM of information.

1. Call FINDLOC(INFO, LINK, START, ITEM, LOC, LOCP)

2. If LOC = NULL, then:

Write: ITEM not in list, and Exit.

3. LINK[LOCP] = LINK[LOC].
4. LINK[LOC] = AVAIL and AVAIL = LOC.

5. Exit.

Insertion into a grounded header list: Suppose we are given LOC, the location of the node after which the new node is to be inserted. The following algorithm inserts ITEM into the given list so that ITEM follows the node at location LOC. Here we assume that LOC cannot be NULL, since the header node is always present and is always the first node in the list; to insert at the front of the list, LOC is the location of the header node.

Algorithm : INSGHLOC(INFO, LINK, START, AVAIL, LOC, ITEM)

1. If AVAIL=NULL, then Write OVERFLOW and Exit.

2. NEW = AVAIL and AVAIL = LINK[AVAIL].

3. INFO[NEW] = ITEM.

4. LINK[NEW] = LINK [LOC] and LINK[LOC] =NEW.

5. Exit.

There are two other variations of linked lists which sometimes appear in the literature:

1. A linked list whose last node points back to the first node instead of containing the null

pointer, called a circular list

2. A linked list which contains both a special header node at the beginning of the list and a

special trailer node at the end of the list.

Figure 5.2(a) & 5.2(b) contains schematic diagrams of these lists.

Figure 5.2(a) Circular Linked List

Figure 5.2(b) Linked List with header & trailer nodes

3.2 Circular Linked List

As discussed, a circular linked list is a list in which the last node points to the first node instead of containing NULL. As in the case of a simple linked list, we can perform various operations on a circular linked list as well.

3.2.1 Operations on Circular Lists: Let us now discuss the respective algorithms for

various operations in a circular linked list:

Algorithm : (Traversing a Circular List) Let LIST be a non-empty circular list in memory. This algorithm traverses LIST, applying an operation PROCESS to each node of LIST.

1. Apply PROCESS to INFO[START].

2. PTR = LINK[ START].

3. Repeat Steps 4 and 5 while PTR ≠ START

4. Apply PROCESS to INFO[PTR].

5. PTR = LINK[PTR].

6. Exit.

Explanation : In the above algorithm the first node is processed before the loop, and then PTR is initialized to the second node. The rest of the nodes are processed in the loop until PTR reaches START again.
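In C, processing the first node before testing the loop condition is exactly what a do-while loop does. The sketch below uses the node type of the previous chapter; the names traverse_circular and add_to_sum, the ctx argument and the guard for an empty list are our own additions and are not part of the algorithm above.

```c
#include <stddef.h>

typedef struct node {
    int data;
    struct node *next;
} list;

/* Visit every node of a circular list exactly once, starting at start. */
void traverse_circular(const list *start,
                       void (*process)(int value, void *ctx), void *ctx)
{
    if (start == NULL)
        return;                       /* extra guard: empty list */
    const list *ptr = start;
    do {
        process(ptr->data, ctx);      /* PROCESS INFO[PTR] */
        ptr = ptr->next;              /* PTR = LINK[PTR] */
    } while (ptr != start);           /* stop when we are back at START */
}

/* One possible PROCESS: add each element to a running total of type long. */
void add_to_sum(int value, void *ctx)
{
    *(long *)ctx += value;
}
```

A list with a single node, whose next field points to itself, is handled correctly: the node is processed once and the loop ends immediately.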

Algorithm : (Searching a Circular Linked List) SRCHCL(INFO, LINK, START, ITEM,

LOC)

LIST is a circular list in memory. This algorithm finds the location LOC of the node where ITEM first appears in LIST or sets LOC = NULL.

1. If INFO[START] = ITEM, then LOC = START and Exit.

2. PTR = LINK[START].

3. Repeat while INFO[PTR] ≠ ITEM and PTR ≠ START

PTR = LINK[PTR].

4. If INFO[PTR] = ITEM, then

LOC = PTR.

Else

LOC = NULL.

5. Exit.

Explanation : The above algorithm is similar to the traversal algorithm except that it stops as soon as the ITEM being searched for is found.

Deletion in a Circular Linked List : For performing deletion it is assumed that we are given the item of information to be deleted. Therefore we first have to find the location of the node containing the item and the location of the node preceding it. For this purpose we first consider a procedure that finds these locations, and then the algorithm for deleting the node is written.

The following procedure finds the location LOC of the first node N which contains ITEM and also the location LOCP of the node preceding N.

Procedure : FINDLOCS(INFO, LINK, START, ITEM, LOC, LOCP)

1. SAVE = START and PTR = LINK[START].

2. Repeat while INFO[PTR] ≠ ITEM and PTR ≠ START.

SAVE = PTR and PTR = LINK[PTR].

3. If INFO[PTR] = ITEM, then

LOC = PTR and LOCP = SAVE.

Else
LOC =NULL and LOCP = SAVE.

4. Exit.

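The following algorithm deletes the first node N which contains ITEM from a circular list. Since the node at START may itself be the one deleted, START is moved to the next node in that case, and is set to NULL when N is the only node in the list (which FINDLOCS reports as LOC = LOCP).

Algorithm : DELLOCCL(INFO, LINK, START, AVAIL, ITEM)

1. Call FINDLOCS(INFO, LINK, START, ITEM, LOC, LOCP).

2. If LOC = NULL, then

Write: ITEM not in list, and Exit.

3. If LOC = LOCP, then

START = NULL.

Else

LINK[LOCP] = LINK[LOC], and if LOC = START, then START = LINK[LOC].

4. LINK[LOC] = AVAIL and AVAIL = LOC.

5. Exit.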

Insertion in a Circular Linked List: We now discuss insertion in a circular linked list. First we discuss insertion at the beginning of the list, and then insertion after a given node.

Insertion at the Beginning of a List: The following algorithm inserts the node at the

beginning of the list.

Algorithm: INSBEGCL(INFO, LINK, START, AVAIL, ITEM)

This algorithm inserts ITEM as the first node in the list.

1. If AVAIL = NULL, then: Write OVERFLOW and Exit.

2. NEW = AVAIL and AVAIL = LINK [AVAIL].

3. INFO[NEW] = ITEM.

4. LINK[NEW] = START.

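5. PTR = LINK[START] and SAVE = START

6. Repeat while PTR ≠ START

SAVE = PTR

PTR = LINK[PTR]

7. LINK[SAVE] = NEW and START = NEW

8. Exit.

Steps 5 and 6 walk around the list to find the last node, whose location ends up in SAVE, so that its link can be redirected to the new first node. The list is assumed to be non-empty.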

Insertion after a given node in a circular linked list: Suppose we are given LOC, the location of the node after which the new node is to be inserted. The following algorithm inserts ITEM into the given list so that ITEM follows the node at location LOC. When LOC = NULL, the list is taken to be empty and the new node becomes its only node, linked to itself.

Algorithm : INSCLLOC(INFO, LINK, START, AVAIL, LOC, ITEM)

1. If AVAIL=NULL, then Write OVERFLOW and Exit.

2. NEW = AVAIL and AVAIL = LINK[AVAIL].

3. INFO[NEW] = ITEM.

4. If LOC = NULL then

START = NEW and LINK[NEW] = START

Else

LINK[NEW] = LINK [LOC] and LINK[LOC] =NEW.

5. Exit.

3.2.2 Advantages of Circular Linked Lists:

The major advantages of circular linked lists are:

1) Any node can be a starting point. We can traverse the whole list by starting from any point.

We just need to stop when the first visited node is visited again.

2) Useful for implementation of queue. We don’t need to maintain two pointers for front and

rear if we use circular linked list. We can maintain a pointer to the last inserted node and

front can always be obtained as next of last.

3) Circular lists are useful in applications to repeatedly go around the list. For example, when

multiple applications are running on a PC, it is common for the operating system to put the
running applications on a list and then to cycle through them, giving each of them a slice of

time to execute, and then making them wait while the CPU is given to another application. It

is convenient for the operating system to use a circular list so that when it reaches the end of

the list it can cycle around to the front of the list.
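
The second advantage above can be illustrated with a small Python sketch of a queue that keeps only one pointer, to the last inserted node. The class and method names (Node, CircularQueue, enqueue, dequeue) are assumptions made for this illustration and are not part of the text.

```python
class Node:
    def __init__(self, info):
        self.info = info
        self.link = None


class CircularQueue:
    """Queue kept as a circular list; only the last (rear) node is remembered."""
    def __init__(self):
        self.last = None

    def enqueue(self, item):
        new = Node(item)
        if self.last is None:
            new.link = new               # a single node points to itself
        else:
            new.link = self.last.link    # new node points to the front
            self.last.link = new         # old rear points to the new node
        self.last = new                  # new node becomes the rear

    def dequeue(self):
        if self.last is None:
            raise IndexError("queue is empty")
        front = self.last.link           # front is always "next of last"
        if front is self.last:           # removing the only node
            self.last = None
        else:
            self.last.link = front.link
        return front.info


q = CircularQueue()
for job in ("A", "B", "C"):
    q.enqueue(job)
print(q.dequeue(), q.dequeue(), q.dequeue())  # A B C
```

Both operations touch a fixed number of pointers, so they take constant time without a separate front pointer.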

3.3 Two-Way (Doubly) Linked List

Let us now discuss a two-way list, which can be traversed in two directions i.e. either

(i). in the usual forward direction from the beginning of the list to the end,

(ii). in the backward direction from the end of the list to the beginning.

Furthermore, given the location LOC of a node N in the list, one now has immediate access

to both the next node and the preceding node in the list. This means, in particular, we may be

able to delete N directly from the list without traversing any part of the list.

A two-way list is a linear collection of data elements, called nodes, where each node N is

divided into three parts:

1. An information field INFO which contains the data of N

2. A pointer field FORW which contains the location of the next node in the list

3. A pointer field BACK which contains the location of the preceding node in the list

The list also requires two list pointer variables: FIRST, which points to the first node in the

list, and LAST, which points to the last node in the list. Figure 5.3 contains a schematic

diagram of such a list. Observe that the null pointer appears in the FORW field of the last

node in the list and also in the BACK field of the first node in the list.

Figure 5.3 (Representation of Doubly Linked List)

Observe that, using the variable FIRST and the pointer field FORW, we can traverse a two-

way list in the forward direction. On the other hand, using the variable LAST and the pointer

field BACK, we can also traverse the list in the backward direction.

Suppose LOCA and LOCB are the locations of nodes A and B, respectively, in a two-way list. Then the way that the pointers FORW and BACK are defined gives us the following pointer property:

FORW[LOCA] = LOCB if and only if BACK[LOCB] = LOCA

In other words, the statement that node B follows node A is equivalent to the statement that

node A precedes node B.

3.3.1 Two-Way Header Lists

The advantages of a two-way list and a circular header list may be combined

into a two way circular header list as pictured in Figure. 5.4. The list is circular because the

two end nodes point back to the header node. Observe that such a two-way list requires only

one list pointer variable START, which points to the header node. This is because the two

pointers in the header node point to the two ends of the list.

Figure 5.4 (Representation of a Two-Way Circular Header List)

3.3.2 Operations on Two-Way Lists

Traversing: Suppose we want to traverse LIST in order to process each node

exactly once. Then we can traverse the two-way list in either direction i.e. forward or

backward. Here it is of no advantage that the data are organized as a two-way list rather than

as a one-way list. The algorithms for both forward and backward traversal can be written as

follows:

Algorithm (Traversing a Two-way Linked List in Forward Direction) This algorithm

traverses a two-way list applying an operation PROCESS to each element of the list. The

variable PTR points to the node currently being processed.

1. PTR = FIRST.

2. Repeat Steps 3 and 4 while PTR ≠ NULL.

3. Apply PROCESS to INFO[PTR].

4. PTR = FORW[PTR].

5. Exit

The algorithm starts by initializing PTR to FIRST. It then processes INFO[PTR], the information at the first node, updates PTR by the assignment PTR = FORW[PTR], processes INFO[PTR], the information at the second node, and so on until PTR = NULL, which signals the end of the list.

Algorithm (Traversing a Two-way Linked List in Backward Direction) This algorithm

traverses a two-way list applying an operation PROCESS to each element of the list. The

variable PTR points to the node currently being processed.

1. PTR = LAST.

2. Repeat Steps 3 and 4 while PTR ≠ NULL.

3. Apply PROCESS to INFO[PTR].

4. PTR = BACK[PTR].

5. Exit

The algorithm is similar to the previous one except that now the PTR variable is initialized

with LAST pointer and the PTR is updated with BACK[PTR] so that the list can be traversed

in backward direction.
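
The two traversal algorithms can be sketched in Python as follows. This is a minimal illustration that assumes node objects with fields info, forw and back in place of the INFO, FORW and BACK arrays; the helper build and the function names are hypothetical, and collecting the items into a list stands in for PROCESS.

```python
class Node:
    def __init__(self, info):
        self.info = info
        self.forw = None   # next node, None at the end of the list
        self.back = None   # preceding node, None at the beginning


def build(items):
    """Build a two-way list and return (first, last)."""
    first = last = None
    for item in items:
        node = Node(item)
        if last is None:
            first = node
        else:
            last.forw = node
            node.back = last
        last = node
    return first, last


def traverse_forward(first):
    out = []
    ptr = first
    while ptr is not None:
        out.append(ptr.info)   # stands in for PROCESS
        ptr = ptr.forw
    return out


def traverse_backward(last):
    out = []
    ptr = last
    while ptr is not None:
        out.append(ptr.info)
        ptr = ptr.back
    return out


first, last = build(["A", "B", "C", "D"])
print(traverse_forward(first))   # ['A', 'B', 'C', 'D']
print(traverse_backward(last))   # ['D', 'C', 'B', 'A']
```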
Searching: Suppose we are given an ITEM of information (a key value) and we want to find the location LOC of ITEM in LIST. We can search for ITEM using either a forward search or a backward search, as in the case of traversal. The main advantage here is that we can search in the backward direction if we have reason to suspect that ITEM appears near the end of the list. For example, suppose LIST is a list of names sorted alphabetically. If ITEM = Shyam, then we would search LIST in the backward direction, but if ITEM = Dinesh, then we would search LIST in the forward direction.

Algorithm (Searching a Two-way Linked List in Forward Direction)

SEARCHFORW(INFO, FORW, FIRST, LOC, ITEM). This algorithm searches for an ITEM

in a two-way list in forward direction and finds the location of the ITEM. The variable PTR

points to the node currently being processed.

1. PTR = FIRST.

2. Repeat Steps 3 and 4 while PTR ≠ NULL.

3. If INFO[PTR] = ITEM then LOC = PTR and exit.

4. PTR = FORW[PTR].

5. LOC = NULL

6. Exit

The algorithm starts by initializing PTR to FIRST. It then compares INFO[PTR], the information at the first node, with the ITEM being searched for. If they do not match, PTR is updated by the assignment PTR = FORW[PTR] and INFO[PTR], the information at the second node, is compared with ITEM, and so on until PTR = NULL or INFO[PTR] matches ITEM. If ITEM is not found, LOC is set to NULL.

Algorithm (Searching a Two-way Linked List in Backward Direction)

SEARCHBACK(INFO, BACK, LAST, LOC, ITEM). This algorithm searches for an ITEM in

a two-way list in backward direction and finds the location of the ITEM. The variable PTR

points to the node currently being processed.

1. PTR = LAST.

2. Repeat Steps 3 and 4 while PTR ≠ NULL.

3. If INFO[PTR] = ITEM then LOC = PTR and exit.

4. PTR = BACK[PTR].

5. LOC = NULL

6. Exit

The algorithm is similar to the previous one except that now the PTR variable is initialized

with LAST pointer and the PTR is updated with BACK[PTR] so that the ITEM can be

searched in backward direction.
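
The following Python sketch mirrors SEARCHFORW and SEARCHBACK and counts the comparisons each one makes, to show why the direction of the search matters in a sorted list. It assumes node objects instead of arrays; the helper build, the returned comparison count, and the names other than Dinesh and Shyam are made up for this illustration.

```python
class Node:
    def __init__(self, info):
        self.info = info
        self.forw = None
        self.back = None


def build(items):
    """Build a two-way list and return (first, last)."""
    first = last = None
    for item in items:
        node = Node(item)
        if last is None:
            first = node
        else:
            last.forw = node
            node.back = last
        last = node
    return first, last


def search_forward(first, item):
    """Return (node, comparisons); node is None when item is absent."""
    ptr, comparisons = first, 0
    while ptr is not None:
        comparisons += 1
        if ptr.info == item:
            return ptr, comparisons
        ptr = ptr.forw
    return None, comparisons


def search_backward(last, item):
    """Same search, starting at LAST and following the BACK pointers."""
    ptr, comparisons = last, 0
    while ptr is not None:
        comparisons += 1
        if ptr.info == item:
            return ptr, comparisons
        ptr = ptr.back
    return None, comparisons


names = ["Amit", "Dinesh", "Mohan", "Ravi", "Shyam"]   # sorted alphabetically
first, last = build(names)
print(search_forward(first, "Dinesh")[1])   # 2 comparisons
print(search_backward(last, "Shyam")[1])    # 1 comparison
print(search_forward(first, "Shyam")[1])    # 5 comparisons
```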

Deleting : Suppose we are given the location LOC of a node N in LIST, and suppose we want to delete N from the list. We assume that LIST is a two-way circular header list. Note that BACK[LOC] and FORW[LOC] are the locations of the nodes which precede and follow node N, respectively. Accordingly, as pictured in Fig. 5.5, N is deleted from the list by changing the following pair of pointers:

FORW[BACK[LOC]] := FORW[LOC] and

BACK[FORW[LOC]] := BACK [LOC]

Figure 5.5 (Deleting a node from a doubly linked list)

The deleted node N is then returned to the AVAIL list by the assignments:

FORW[LOC] := AVAIL and AVAIL := LOC

The formal statement of the algorithm is as follows:

Algorithm : DELTWL(INFO, FORW, BACK, FIRST, AVAIL, LOC)

1. Set FORW[BACK[LOC]] := FORW[LOC] and

BACK[FORW[LOC]] := BACK[LOC].

2. Set FORW[LOC] := AVAIL and AVAIL := LOC.

3. Exit.

Here we see one main advantage of a two-way list: If the data were organized as a one way

list, then in order to delete N, we would have to traverse the one-way list to find the location

of the node preceding N.
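
A minimal Python sketch of this deletion on a two-way circular header list is given below. It assumes node objects instead of arrays and relies on the language runtime to reclaim the deleted node, so the AVAIL list is not modelled; the names Node, append, delete and contents are hypothetical.

```python
class Node:
    def __init__(self, info=None):
        self.info = info
        self.forw = self    # a lone node is its own successor
        self.back = self    # ... and its own predecessor


def append(header, item):
    """Insert a new node just before the header, i.e. at the end of the list."""
    new = Node(item)
    last = header.back
    last.forw = new
    new.back = last
    new.forw = header
    header.back = new
    return new


def delete(node):
    """Unlink node using only its own FORW and BACK fields (no traversal)."""
    node.back.forw = node.forw    # FORW[BACK[LOC]] := FORW[LOC]
    node.forw.back = node.back    # BACK[FORW[LOC]] := BACK[LOC]


def contents(header):
    out = []
    ptr = header.forw
    while ptr is not header:
        out.append(ptr.info)
        ptr = ptr.forw
    return out


header = Node()                       # the header node carries no data
nodes = [append(header, x) for x in ("A", "B", "C")]
delete(nodes[1])                      # remove B, given only its location
print(contents(header))               # ['A', 'C']
```

Because the list is circular and has a header node, the same two assignments work for the first and the last node as well.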

We can write another algorithm in which we have to delete a node with given ITEM of

information as follows:

Algorithm : DELITEMTWL(INFO, FORW, BACK, FIRST, AVAIL, LOC, ITEM)

1. Call SEARCHFORW(INFO, FORW, FIRST, LOC, ITEM).

2. If LOC = NULL write Item does not exist in the list and Exit.

3. Call DELTWL(INFO, FORW, BACK, FIRST, AVAIL, LOC).

4. Exit.

Inserting. Suppose we are given the locations LOCA and LOCB of adjacent nodes A and B

in LIST, and suppose we want to insert a given ITEM of information between nodes A and B.

As with a one-way list, first we remove the first node N from the AVAIL list, using the

variable NEW to keep track of its location, and then we copy the data ITEM into the node N;

that is, we set:

NEW = AVAIL, AVAIL = FORW[AVAIL], INFO[NEW] = ITEM

The formal statement of our algorithm is given below.

Algorithm : INSTWL(INFO,FORW,BACK,FIRST,AVAIL,LOCA,LOCB,ITEM)

1. If AVAIL = NULL, then,


Write: OVERFLOW, and Exit.

2. NEW = AVAIL, AVAIL = FORW[AVAIL],

INFO[NEW] =ITEM.

3. FORW[LOCA] = NEW, FORW[NEW] = LOCB,

BACK[LOCB] = NEW, BACK[NEW] = LOCA.

4. Exit.

The above algorithm assumes that LIST contains a header node. Hence LOCA or LOCB may point to the header node, in which case N will be inserted as the first node or the last node. If LIST does not contain a header node, then we must consider the case that LOCA = NULL and N is inserted as the first node in the list, and the case that LOCB = NULL and N is inserted as the last node in the list.
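
The four pointer assignments of INSTWL can be sketched in Python as follows, for a two-way circular list with a header node. The sketch assumes node objects rather than arrays and does not model the AVAIL list; Node, insert_between, forward and backward are names chosen only for this illustration.

```python
class Node:
    def __init__(self, info=None):
        self.info = info
        self.forw = self    # a lone header points to itself both ways
        self.back = self


def insert_between(loca, locb, item):
    """Insert item between the adjacent nodes at loca and locb."""
    new = Node(item)
    loca.forw = new     # FORW[LOCA] = NEW
    new.forw = locb     # FORW[NEW] = LOCB
    locb.back = new     # BACK[LOCB] = NEW
    new.back = loca     # BACK[NEW] = LOCA
    return new


def forward(header):
    out, ptr = [], header.forw
    while ptr is not header:
        out.append(ptr.info)
        ptr = ptr.forw
    return out


def backward(header):
    out, ptr = [], header.back
    while ptr is not header:
        out.append(ptr.info)
        ptr = ptr.back
    return out


header = Node()
b = insert_between(header, header.forw, "B")   # empty list: header is both neighbours
a = insert_between(header, b, "A")             # LOCA is the header: new first node
c = insert_between(b, header, "C")             # LOCB is the header: new last node
print(forward(header))    # ['A', 'B', 'C']
print(backward(header))   # ['C', 'B', 'A']
```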

3.3.3 Difference between One-way linked lists and Two-way linked lists :

A singly (one-way) linked list can be traversed in only one direction. A node does not store the address of its predecessor, so whenever an operation needs the preceding node, the list has to be traversed again from the start node, which requires an extra pointer variable and additional searching time. In a doubly (two-way) linked list, each node holds the address of the next node as well as of the previous node. The predecessor of a node is therefore obtained directly from its BACK field, without an extra pointer variable and in less time than in a singly linked list. So, apart from allowing movement in both directions, the two-way list also saves time in operations that need the preceding node, at the cost of the extra space taken by the backward pointers.

4. Summary

Header linked list is a specialized type of linked list. The use of header node in a

linked list is to store some general purpose information about the list. The circular linked list
is also sometimes useful as the last node does not contain a null pointer rather contains some

useful information i.e. the address of the first node. Another variation in the linked list

discussed in this chapter is two-way linked list. Generally speaking, storing data as a two-way

linked list, which requires extra space for the backward pointers and extra time to change the

added pointers, rather than as a one-way list is not worth the expense unless one must

frequently find the location of the node which precedes a given node as in deletion.

5. Suggested Readings/Reference material

• "Data Structures Using C and C++", Yedidyah Langsam, Moshe J. Augenstein, Aaron M. Tenenbaum, Second Edition, PHI Publications.

• "Algorithms + Data Structures = Programs", Niklaus Wirth, PHI Publications.

• "Fundamentals of Data Structures in C++", E. Horowitz, Sahni and D. Mehta, Galgotia Publications.

• "Data Structures and Program Design in C", Kruse, C. L. Tondo and B. Leung, Pearson Education.

• "Fundamentals of Data Structures in C", R. B. Patel, PHI Publications.

• "Data Structures and Algorithms", A. V. Aho, Hopcroft, Ullman, Pearson India.

• "Data Structures", Seymour Lipschutz, Schaum's Outline Series, TMH.

6. Self Assessment Questions (SAQ)

• Can we use a doubly linked list as a circular linked list? If yes, explain.

• Write the differences between a doubly linked list and a circular linked list.

• Discuss the advantages, if any, of a two-way list over a one-way list for each of the following operations:

   (a) Traversing the list to process each node.

   (b) Deleting a node whose location LOC is given.

   (c) Searching an unsorted list for a given element ITEM.

   (d) Searching a sorted list for a given element ITEM.

   (e) Inserting a node before the node with a given location LOC.

   (f) Inserting a node after the node with a given location LOC.

• Suppose LIST is a header (circular) list in memory. Write an algorithm which deletes the last node from the LIST.

• Write a procedure HEAD(INFO, LINK, START, AVAIL) which forms a header circular list from an ordinary one-way list.

• Given an integer K, write a procedure DELK(INFO, FORW, BACK, START, AVAIL, K), which deletes the Kth element from a two-way circular header list.

• Suppose LIST (INFO, LINK, START, AVAIL) is a one-way circular header list in memory. Write a procedure TWOWAY(INFO, LINK, BACK, START) which assigns values to a linear array BACK to form a two-way list from the one-way list.

• Explain the purpose of a header linked list.

• Write and explain applications of two-way linked lists.

Lesson : 6 Writer : Dr. Pardeep Kumar Mittal

Title : Stacks Vetter : Prof. Rakesh Kumar

Structure:

1. Introduction

2. Objectives

3. Presentation of Contents

3.1 Stacks

3.2 Computer Representation of Stack

3.3 Operations on Stacks

3.3.1 Implementation of Stacks using Arrays

3.3.2 Implementation of Stacks using Linked List

3.4 Applications

3.4.1 Arithmetic Expression; Polish Notation

3.4.2 Recursion

3.4.3 Quicksort

4. Summary

5. Suggested Readings/Reference material

6. Self Assessment Questions (SAQ)

1. Introduction

A stack is a very useful concept in computer science. In this lesson, we shall examine this simple data structure and see why it plays such a prominent role in the area of computer programming. Whenever we deal with function subprograms, we are actually using stacks: the function calls are kept in a stack, and the calling function can resume execution only when the called function has finished executing. There are certain situations when we wish to insert or remove an item only at the beginning or the end of a list. A stack is a data structure which allows elements to be inserted as well as deleted only at one end. A stack is also known as a LIFO (last-in first-out) data structure. Some of the common applications of stacks are recursion, Polish notation and quicksort, which are very useful in the field of computer science.

2. Objectives

After reading this chapter, the reader should understand the linear data structure called the stack. The computer representation of stacks using arrays and linked lists is discussed in this chapter, and the insertion and deletion operations on stacks are explained for both representations. The reader will also appreciate the various applications of stacks, namely recursion, Polish notation and quicksort. At the end of this chapter, the reader should be able to understand and use these applications.

3. Presentation of Contents

3.1 Stacks

Stacks and queues are two data structures that allow insertion and deletion operations only at the beginning or the end of the list, not in the middle.

A stack is a linear data structure in which items may be added or removed only at one end, called the top of the stack. Everyday examples of such a structure are very common, viz. a stack of dishes, a stack of books, a stack of coins and a stack of clothes, as shown in Fig. 6.1.
Fig. 6.1: A Stack of Coins and Books

Stacks are also called last-in first-out (LIFO) lists. This means that the elements which are inserted last will be removed first. Other names generally used for stacks are "piles" and "push-down lists". Stacks have many important applications in the field of computer science.

Special terminology is used for the two basic operations associated with stacks:

(a) "Push" is the term used to insert an element into a stack.

(b) "Pop" is the term used to delete an element from a stack.

Example: Suppose that the five elements A, B, C, D and E are pushed, in that order, onto an empty stack.

Figure 6.2 shows three ways of picturing such a stack.

(a) Top drawn at the upper end:     (b) Top drawn at the lower end:

         N                                   1     A
         N-1                                 2     B
         ...                                 3     C
  TOP    5     E                             4     D
         4     D                             5     E     TOP
         3     C                                   ...
         2     B                                   N-1
         1     A                                   N

(c) Drawn horizontally:

  1     2     3     4     5     ...     N-1     N
  A     B     C     D     E
                          TOP

Figure 6.2 (Ways to represent a Stack)

3.2 Computer Representation of Stack

Here we will maintain the stack by a linear array STACK; a pointer variable TOP,

which contains the location of the top element of the stack; and a variable MAXSTK

which gives the maximum number of elements that can be held by the stack. The condition

TOP=0 or TOP=NULL indicates that the stack is empty.

Figure 6.3 shows such an array representation of a stack. Since the stack has three elements, X, Y and Z, TOP = 3; and since MAXSTK = 8, there is room for 5 more items in the stack.

MAXSTK   8
         ...
TOP      3    Z
         2    Y
         1    X

Figure 6.3 (Array Representation of Stack)

3.3 Operations on Stacks

The operation of adding (pushing) an item onto a stack and the operation of removing (popping) an item from a stack are implemented by the procedures called PUSH and POP respectively.

In the implementation of these operations, TOP and MAXSTK are assumed to be global variables; hence they are not required as arguments in the algorithms, which may be named PUSH (STACK, ITEM) and POP (STACK, ITEM) respectively.

3.3.1 Implementation of Stacks using Arrays

Insertion: When we are adding a new element, first, we must test whether there is a free

space in the stack for the new item; if not, then we have the condition known as overflow. If

this condition is not there, then the value of TOP is changed before the insertion in PUSH.

After changing the value of TOP, insertion is done.

Algorithm : PUSH (STACK, ITEM)

1. If TOP=MAXSTK, then Write OVERFLOW and Exit

2. TOP = TOP + 1

3. STACK [TOP] = ITEM

4. Exit

Deletion: In executing the procedure POP, we must first test whether there is an element in

the stack to be deleted; if not; then we have the condition known as underflow. The item to be

deleted is first stored in some variable, then the value of TOP is changed after the deletion in

POP.

Algorithm: POP (STACK, ITEM)

1. If TOP = 0, then Write UNDERFLOW and Exit

2. ITEM = STACK[TOP]

3. TOP = TOP-1

4. Return ITEM

5. Exit

Example: Consider the stack in Figure 6.3. If we perform the operation PUSH (STACK, W):

1. Since TOP=3, which is less than MAXSTK i.e. 8, therefore control is transferred to Step 2.

2. TOP = 3 + 1 = 4.

3. STACK [TOP] = STACK [4] = W

4. Exit

Note that W is now the top element in the stack and value of top is 4.

Example: Consider again the stack in Figure 6.3. This time we perform the operation POP

(STACK, ITEM):

1. Since TOP = 3, which is non-zero, therefore control is transferred to Step 2.

2. ITEM = STACK[3] = Z

3. TOP = 3 - 1 = 2.

4. Return ITEM = Z.

5. Exit

Observe that STACK [TOP] = STACK [2] = Y is now the top element in the Stack.
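
The PUSH and POP algorithms can be sketched in Python as shown below. The sketch is an illustration only: the class name ArrayStack is hypothetical, TOP and MAXSTK are kept as attributes of the object rather than as global variables, index 0 of the array is left unused so that positions run from 1 to MAXSTK as in the text, and overflow and underflow are reported by raising exceptions instead of writing a message.

```python
class ArrayStack:
    def __init__(self, maxstk):
        self.maxstk = maxstk
        self.stack = [None] * (maxstk + 1)   # index 0 unused; positions 1..MAXSTK
        self.top = 0                          # TOP = 0 means the stack is empty

    def push(self, item):
        if self.top == self.maxstk:           # no free space left
            raise OverflowError("OVERFLOW")
        self.top += 1                         # TOP is changed before the insertion
        self.stack[self.top] = item

    def pop(self):
        if self.top == 0:                     # nothing to delete
            raise IndexError("UNDERFLOW")
        item = self.stack[self.top]           # save the item first
        self.top -= 1                         # then change TOP
        return item


s = ArrayStack(8)                             # MAXSTK = 8 as in Figure 6.3
for item in ("X", "Y", "Z"):
    s.push(item)
s.push("W")
print(s.top)     # 4
print(s.pop())   # W
print(s.pop())   # Z
print(s.top)     # 2
```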

A Stack contains an ordered list of elements and an array is also used to store ordered list of

elements. Hence, it would be very easy to manage a stack using an array. However, the

problem with an array is that we are required to declare the size of the array before using it in

a program. Therefore, the size of stack would be fixed.


Though an array and a stack are totally different data structures, an array can be used to store

the elements of a stack. We can declare the array with a maximum size large enough to

manage a stack.

3.3.2 Implementation of Stacks using Linked List

We can avoid the size limitation of a stack implemented with an array by using a linked list to hold the stack elements. As in the case of the array, we have to decide where to insert elements in the list and where to delete them so that push and pop run as fast as possible. Recall that in the array implementation we pushed and popped elements at the end of the array to achieve LIFO behaviour. Pushing and popping at the beginning of the array would carry the overhead of shifting elements to the right to push an element at the start and shifting elements to the left to pop an element from the start, so we avoided it. Now, if we use a linked list to implement the stack, where do we push an element into the list and from where do we pop it? There are a few facts to consider before we make the decision.

Insertion and removal in a stack should take constant time. In a singly linked list, inserting or removing a node at the start of the list takes constant time, so a singly linked list can serve the purpose. Hence, the decision is to insert the element at the start of the list in the push operation and to remove the element from the start of the list in the pop operation.

TOP -> 1 -> 7 -> 5 -> 2 -> NULL

Fig. 6.4 : Stack using linked list.

The elements present inside this stack are 1, 7, 5 and 2. The most recent element of the stack is 1; it will be removed if pop is called at this point. The stack has four nodes, which are linked in such a fashion that the first node, pointed to by the TOP pointer, contains the value 1. This first node points to the node with value 7, the node with value 7 points to the node with value 5, and the node with value 5 points to the last node, with value 2. To make a stack data structure using a linked list, we have inserted new nodes at the start of the linked list.

We are now going to implement stack through linked list. Here are the algorithms for

implementation of stacks using linked lists.

Insertion

Algorithm: PUSHLINK(INFO, LINK, TOP, AVAIL, ITEM)

1. If AVAIL = NULL then Write OVERFLOW and Exit.

2. NEW = AVAIL and AVAIL = LINK[AVAIL].

3. INFO[NEW] = ITEM

4. LINK[NEW] = TOP

5. TOP = NEW

6. Exit.

In the above algorithm, first it is checked whether there is sufficient space to insert a

new item and if the space is available, the item is inserted at the top of the stack. The

insertion actually is made at the beginning of the linked-list.

Deletion

Algorithm: POPLINK(INFO, LINK, TOP, AVAIL, ITEM)

1. If TOP = NULL then Write UNDERFLOW and Exit.


2. ITEM = INFO[TOP].

3. TEMP = TOP and TOP = LINK[TOP]

4. LINK[TEMP] = AVAIL and AVAIL = TEMP.

5. Exit.

In the above algorithm, it is first checked whether the list has any element. If it has one or more elements, then the element at the top is deleted and the pointers are changed correspondingly.
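
PUSHLINK and POPLINK can be sketched in Python as follows. The sketch assumes node objects instead of the INFO and LINK arrays and lets the language runtime allocate and reclaim nodes, so the AVAIL list and the OVERFLOW check are not modelled; the names Node and LinkedStack are hypothetical.

```python
class Node:
    def __init__(self, info, link=None):
        self.info = info
        self.link = link


class LinkedStack:
    def __init__(self):
        self.top = None                    # TOP = NULL means the stack is empty

    def push(self, item):
        # The new node points to the old top and then becomes the top.
        self.top = Node(item, self.top)

    def pop(self):
        if self.top is None:
            raise IndexError("UNDERFLOW")
        item = self.top.info               # ITEM = INFO[TOP]
        self.top = self.top.link           # TOP = LINK[TOP]
        return item


s = LinkedStack()
for value in (2, 5, 7, 1):                 # builds the stack of Fig. 6.4
    s.push(value)
print(s.pop())   # 1
print(s.pop())   # 7
```

Both operations work only at the start of the list, so each takes constant time regardless of the number of elements.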

3.4 Applications

The stacks are used in numerous applications and some of them are as follows:

• Arithmetic expression evaluation

• Undo operation of a document editor or a similar environment

• Implementation of recursive procedures

• Backtracking

• Keeping track of page-visited history of a web user

• Quicksort

We will be discussing some of these applications in detail.

3.4.1 Arithmetic Expression: Polish Notation

Let AE be an arithmetic expression involving constants and operations. This

section gives an algorithm which finds the value of AE by using Reverse Polish (Postfix)

Notation. We will see that the stack is an essential tool in this algorithm. We will be using the

following levels of precedence in our AE.

Highest : Parentheses()

Next Highest : Exponentiation (^)

Next highest : Multiplication (*) and division (/)

Lowest : Addition (+) and subtraction (-)


Example : Let us evaluate the following parenthesis-free arithmetic expression:

2 ^ 3 + 5 * 2 ^ 2 - 12 / 6

First we evaluate the exponentiations to obtain

8 + 5 * 4 - 12 / 6

Then we evaluate the multiplication and division to obtain 8 + 20 - 2. Last, we evaluate the

addition and subtraction to obtain the final result, 26. Observe that the expression is traversed

three times, each time corresponding to a level of precedence of the operations.

Now suppose the expression is written using parentheses:

2 ^ 3 + (5 * 2) ^ 2 - 12 / 6

First we evaluate the parentheses to obtain

2 ^ 3 + (10) ^ 2 - 12 / 6

Then we evaluate the exponentiations to obtain 8 + 100 - 12 / 6, and then the division to obtain 8 + 100 - 2. Last, we evaluate the addition and subtraction to obtain the final result, 106. It can easily be observed that inserting a pair of parentheses has changed the result drastically. To avoid such confusion, the computer first converts the expression into a parenthesis-free expression, which may be a postfix expression or a prefix expression, and then evaluates it. Therefore, we will now study these notations and the use of the stack in them.

Polish Notation

In mathematics, the operator symbol is placed between its two operands. For example,

A+B C-D E*F G/H

This is called infix notation. With this notation, we must distinguish between

(A + B) * C and A + (B * C)

by using either parentheses or operator-precedence convention such as the usual precedence

levels discussed above. Accordingly, the order of the operators and operands in an arithmetic

expression does not uniquely determine the order in which the operations are to be

performed.

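In Polish notation (also called prefix notation), the operator symbol is placed before its two operands. We can convert an infix expression into Polish notation as follows:

(A + B) * C = [+AB] * C = *+ABC

A + (B * C) = A + [*BC] = +A*BC

(A + B) / (C - D) = [+AB] / [-CD] = /+AB-CD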

The fundamental property of Polish notation is that the order in which the operations are to be

performed is completely determined by the positions of the operators and operands in the

expression. There is no need of parentheses when writing expressions in Polish notation.

But when an expression is written in a computer program, it is the infix notation that is used. The computer usually evaluates an arithmetic expression written in infix notation in two steps. First, it converts the expression to postfix notation, and then it evaluates the postfix expression. In postfix notation the operator symbol is placed after its two operands; it is also known as Reverse Polish Notation (RPN). Some examples of converting an infix expression into a postfix expression are as follows:

(A + B) * C = [AB+] * C = AB+C*

A + (B * C) = A + [BC*] = ABC*+

(A + B) / (C - D) = [AB+] / [CD-] = AB+CD-/

Now we will be considering the procedure for conversion of an infix notation into postfix

notation and then evaluating the postfix expression.

Converting an Infix Expressions into Postfix Expressions

Let AE be an arithmetic expression written in infix notation. The following algorithm

converts an infix expression AE into its equivalent postfix expression PE. The algorithm uses

a stack to temporarily hold operators and left parentheses. The postfix expression PE will be

constructed from left to right using the operands from AE and the operators which are

removed from STACK. We begin by pushing a left parenthesis onto STACK and adding a

right parenthesis at the end of AE. The algorithm is completed when STACK is empty.

Algorithm : POLISH(AE, PE)

1. Push "(" onto STACK, and add ")" to the end of AE.

2. Scan AE from left to right and repeat Steps 3 to 6 for each element of AE until the

STACK is empty

3. If an operand is encountered, add it to PE.

4. If a left parenthesis is encountered, push it onto STACK.

5. If an operator (x) is encountered, then

a) Repeatedly pop from STACK and add to PE each operator from the top of STACK,

which has the same precedence as or higher precedence than (x).

b) Push (x) onto STACK.

6. If a right parenthesis is encountered, then

a) Repeatedly pop from STACK and add to PE each operator from top of STACK,

until a left parenthesis is encountered.

b) Remove the left parenthesis.

7. Exit.

Example : Let us discuss the algorithm with the help of an example. Consider the following

arithmetic infix expression

AE: A + (B - C * (D / E ^ F))

We will convert AE into its equivalent postfix expression PE using above algorithm. First we

push "(" onto STACK, and then we add ")" to the end of AE to obtain:

AE: A + ( B - C * ( D / E ^ F ) ))

Figure 6.5 shows the status of STACK and of the string PE as each element of AE is scanned.

      Symbol Scanned    STACK         Expression PE

 1    A                 (             A
 2    +                 (+            A
 3    (                 (+(           A
 4    B                 (+(           AB
 5    -                 (+(-          AB
 6    C                 (+(-          ABC
 7    *                 (+(-*         ABC
 8    (                 (+(-*(        ABC
 9    D                 (+(-*(        ABCD
10    /                 (+(-*(/       ABCD
11    E                 (+(-*(/       ABCDE
12    ^                 (+(-*(/^      ABCDE
13    F                 (+(-*(/^      ABCDEF
14    )                 (+(-*         ABCDEF^/
15    )                 (+            ABCDEF^/*-
16    )                               ABCDEF^/*-+

Figure: 6.5 (Status of STACK and PE as AE is scanned)
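
The POLISH algorithm can be sketched in Python as follows. This is a minimal illustration under stated assumptions: operands are single characters, the input is a well-formed expression without spaces, and the names PRECEDENCE and to_postfix are hypothetical. The sketch follows the rule given above of popping operators of the same or higher precedence, so operators of equal precedence, including ^, are grouped from left to right; many programming languages instead group exponentiation from right to left.

```python
# Precedence levels as listed in the text: ^ highest, then * and /, then + and -.
PRECEDENCE = {"^": 3, "*": 2, "/": 2, "+": 1, "-": 1}


def to_postfix(ae):
    """Convert an infix string of single-character tokens to a postfix list."""
    stack = ["("]                       # step 1: push "(" onto STACK
    pe = []
    for tok in list(ae) + [")"]:        # step 1: add ")" to the end of AE
        if tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":     # pop operators until the matching "("
                pe.append(stack.pop())
            stack.pop()                 # remove the left parenthesis itself
        elif tok in PRECEDENCE:
            # Pop operators of the same or higher precedence, then push tok.
            while stack[-1] != "(" and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]:
                pe.append(stack.pop())
            stack.append(tok)
        else:
            pe.append(tok)              # operand goes straight to PE
    return pe


print("".join(to_postfix("A+(B-C*(D/E^F))")))   # ABCDEF^/*-+
```

The printed string agrees with the last row of Figure 6.5.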

After converting an infix expression into an equivalent postfix expression, the next thing to be done is to evaluate the postfix expression. Interestingly, when we evaluate the postfix expression, the stack is used again.

Evaluation of a Postfix Expression

Suppose PE is an arithmetic expression written in postfix notation. The following algorithm

uses a STACK to hold operands and after evaluating PE the result is stored in a variable

named as RESULT.

Algorithm : EVAL(PE, RESULT).

1. Add a right parenthesis ")" at the end of PE.

2. Scan PE from left to right and repeat Steps 3 and 4 for each element of PE until the

sentinel ")" is encountered.

3. If an operand is encountered, push it on STACK.

4. If an operator (x) is encountered, then

a) Remove the two top elements of STACK, where A is the top element and B is

the next-to-top element.

b) Evaluate B (x) A.

c) Push the result of part (b) back on STACK

5. RESULT = STACK[TOP], the top element of STACK.

6. Exit.

6. Exit.

Example : Let us consider the following arithmetic expression AE written in infix notation:

AE: 1 + 2 – 3 * (4 /2)

The equivalent postfix expression PE becomes:

PE: 1, 2, +, 3, 4, 2, /, *, -

We will now evaluate PE using EVAL algorithm. First we add a sentinel right parenthesis at

the end of PE to obtain

PE:  1,   2,   +,   3,   4,   2,   /,   *,   -,   )

     (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)  (9)  (10)

Figure 6.6 shows the contents of STACK as each element of PE is scanned. The final number

in STACK, -3, which is assigned to RESULT when the sentinel “)” is scanned, is the value of

PE.

      Symbol Scanned    STACK

 1    1                 1
 2    2                 1, 2
 3    +                 3
 4    3                 3, 3
 5    4                 3, 3, 4
 6    2                 3, 3, 4, 2
 7    /                 3, 3, 2
 8    *                 3, 6
 9    -                 -3
10    )

Figure 6.6 (Status of STACK when PE is scanned)
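
The EVAL algorithm can be sketched in Python as shown below. The sketch assumes that the postfix expression is given as a list of tokens whose operands are numbers; the end of the list plays the role of the sentinel ")". The names OPS and eval_postfix are hypothetical, and operands are converted to floating-point numbers so that division behaves uniformly.

```python
import operator

# Each operator symbol is mapped to the function that computes B (x) A.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "^": operator.pow,
}


def eval_postfix(tokens):
    """Evaluate a postfix expression given as a list of string tokens."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            a = stack.pop()                  # A is the top element
            b = stack.pop()                  # B is the next-to-top element
            stack.append(OPS[tok](b, a))     # push the value of B (x) A
        else:
            stack.append(float(tok))         # operand: push it on the stack
    return stack.pop()                       # RESULT is the element left on top


print(eval_postfix("1 2 + 3 4 2 / * -".split()))   # -3.0
```

The order of the two pops matters for subtraction, division and exponentiation, which is why the algorithm evaluates B (x) A rather than A (x) B.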

3.4.2 Recursion

Recursion is an important concept in computer science. Many algorithms can be best described in terms of recursion. Let us discuss how recursion may prove to be a useful tool in developing algorithms for specific problems. A procedure that contains either a Call statement to itself or a Call statement to a second procedure that may eventually result in a Call statement back to the original procedure is called a recursive procedure. A recursive procedure must have the following two properties:

1) There must be certain criteria, called base criteria, for which the procedure does not call itself.

2) Each time the procedure does call itself (directly or indirectly), it must be closer to the base criteria.

A recursive procedure with these two properties is said to be well-defined. Similarly, a function is said to be recursively defined if the function definition refers to itself. If a function calls itself without any base criteria, then the function is recursively defined but not well-defined. The following examples should help us to clarify these ideas.

Example : Factorial Function

The product of the positive integers from 1 to n, inclusive, is called "n factorial" and is usually denoted by n!, i.e. n! = 1 · 2 · 3 ... (n - 2)(n - 1)n

It is also defined that 0! = 1. Thus we can say that

5! = 1 · 2 · 3 · 4 · 5 = 120 and 6! = 1 · 2 · 3 · 4 · 5 · 6 = 720

Observe that 5! = 5 · 4! and 6! = 6 · 5!. This is true for every positive integer n; that is, n! = n · (n - 1)!. As is visible from this relation, the factorial function is defined in terms of itself, with a different argument, n - 1 instead of n. The function is also well-defined, since with every step the argument moves closer to the base value 0! (or 1!), which equals 1. Accordingly, the factorial function may also be defined as follows:

Definition: (Factorial Function)

(a) If n = 0, then n!= 1.

(b) If n > 0, then n! = n . (n-1)!

Example: Let us calculate 4! using the recursive definition. This calculation undergoes the
following steps:

(1) 4! = 4 . 3!
(2) 3! = 3 . 2!
(3) 2! = 2 . 1!
(4) 1! = 1 . 0!
(5) 0! = 1
(6) 1! = 1 . 1 = 1
(7) 2! = 2 . 1 = 2
(8) 3! = 3 . 2 = 6
(9) 4! = 4 . 6 = 24

Let us now write a procedure that calculates n! and returns the
value in the variable FACT.

Procedure : FACTORIAL (FACT, N)

1. If N = 0 then FACT = 1 and return.

2. Call FACTORIAL (FACT, N-1).

3. FACT = N * FACT.

4. Return.

The above procedure is a recursive procedure, since it contains a call statement to itself.
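The FACTORIAL procedure can be sketched in Python as follows. This is a minimal illustration, assuming the result is returned as a value instead of being stored in the output parameter FACT; the function name `factorial` and the check for negative input are additions made for the example.

```python
def factorial(n: int) -> int:
    """Return n! using the recursive definition 0! = 1, n! = n * (n - 1)!."""
    if n < 0:
        # Not part of the original procedure: guards against endless recursion.
        raise ValueError("n must be a non-negative integer")
    if n == 0:
        # Base criteria: the procedure does not call itself.
        return 1
    # Each call uses n - 1, which is closer to the base criteria.
    return n * factorial(n - 1)


if __name__ == "__main__":
    print(factorial(4))
```

The call `factorial(4)` goes down through the same chain of calls as steps (1) to (5) of the worked example and then multiplies on the way back up, as in steps (6) to (9).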

3.4.3 Quicksort

This is one of the most widely used internal sorting algorithms. In its basic form, it
was invented by C.A.R. Hoare in 1960. Its popularity lies in the ease of implementation,
moderate use of resources and acceptable behaviour for a variety of sorting cases. The basis
of quicksort is the divide and conquer strategy, i.e. divide the problem [list to be sorted] into
sub-problems [sub-lists] until solved sub-problems [sorted sub-lists] are found. This is
implemented as follows:

Choose one item called pivot element A[I] from the list A[ ]. Generally the pivot element is

the first element of the list.

Rearrange the list so that this item is in the proper position, i.e., all preceding items have a

lesser value and all succeeding items have a greater value than this item.

1. Place A[0], A[1] .. A[I-1] in sublist 1.

2. Place A[I + 1], A[I + 2] ... A[N] in sublist 2

Repeat steps 1 & 2 for sublist1 & sublist2 till A[ ] is a sorted list.

As can be seen, this algorithm has a recursive structure. This is usually implemented as

follows:

1. Choose A[I] as the dividing element.

2. From the left end of the list (A[0] onwards) scan till an item A[R] is found whose value is

greater than A[I].

3. From the right end of list [A[N] backwards] scan till an item A[L] is found whose value is

less than A[I].

4. Swap A[R] & A[L].

5. Continue steps 2, 3 & 4 till the scan pointers cross. Stop at this stage.

6. At this point, sublist1 & sublist2 are ready.

7. Now do the same for each of sublist1 & sublist2.

Example: We will illustrate the quick sort with the help of an example

Suppose A is the following list of 12 numbers:

40, 30, 10, 50, 70, 90, 44, 60, 99, 20, 80, 60

Suppose we choose 40, the first element, as the pivot element. Then beginning with the last
number 60, scan the list from right to left, comparing each number with 40 and stopping at
the first number less than 40. The number is 20. Interchange 40 and 20 to obtain the list

20, 30, 10, 50, 70, 90, 44, 60, 99, 40, 80, 60

Now it is clearly visible that each number in the right of 40 is greater than 40. Now beginning

with 20, next scan the list in the opposite direction, from left to right, comparing each number

with 40 and stopping at the first number greater than 40. The number is 50. Interchange 40

and 50 to obtain the list

20, 30, 10, 40, 70, 90, 44, 60, 99, 50, 80, 60

Now again it is visible that all the numbers on the left side of 40 are less than 40 and the

numbers on the right of 40 are greater than 40. Therefore we finally obtain the following list

20, 30, 10, 40, 70, 90, 44, 60, 99, 50, 80, 60

Now 40 is in its final position, dividing the list into two sublists 20, 30, 10 and 70, 90, 44, 60,

99, 50, 80, 60.

The above procedure is repeated for both the sublists until all the elements are sorted. The

sorting can be accomplished using two stacks LOWER and UPPER with each stack

maintaining the lower and upper bounds of each list or sublist until the stack is empty. The

algorithm for the quicksort using stacks can be written as follows.

Procedure : QUICK(A, N, FIRST, LAST, LOC)

1. LEFT = FIRST, RIGHT = LAST and LOC = FIRST

2. (a) Repeat while A(LOC) ≤ A(RIGHT) and LOC ≠ RIGHT

RIGHT = RIGHT – 1

(b) If LOC = RIGHT then Return

(c) If A(LOC) > A(RIGHT), then

(i) TEMP = A(LOC), A(LOC) = A(RIGHT), A(RIGHT) = TEMP

(ii) Set LOC = RIGHT

(iii) Go to Step3

3. (a) Repeat while A(LEFT) ≤ A(LOC) and LOC ≠ LEFT

LEFT = LEFT + 1

(b) If LOC = LEFT then Return

(c) If A(LEFT) > A(LOC), then

(i) TEMP = A(LOC), A(LOC) = A(LEFT), A(LEFT) = TEMP

(ii) Set LOC = LEFT

(iii) Go to Step2
Algorithm: Quicksort(A, N)

1. TOP = NULL

2. If N > 1, then TOP = TOP + 1, LOWER(1) = 1, UPPER(1) = N

3. Repeat steps 4 to 7 while TOP ≠ NULL

4. FIRST = LOWER(TOP), LAST = UPPER(TOP), TOP = TOP – 1

5. Call QUICK(A, N, FIRST, LAST, LOC)

6. If FIRST < LOC – 1 then

TOP = TOP + 1, LOWER(TOP) = FIRST, UPPER(TOP) = LOC – 1

7. If LOC + 1 < LAST, then

TOP = TOP + 1, LOWER(TOP) = LOC + 1, UPPER(TOP) = LAST

8. Exit
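The procedure QUICK and the stack-driven Quicksort algorithm above can be sketched in Python as follows. This is an illustrative sketch under a few assumptions that are not in the original: indices are 0-based, Python lists named `lower` and `upper` play the role of the stacks LOWER and UPPER (so TOP is implicit), and `quick` returns LOC instead of using an output parameter.

```python
def quick(a, first, last):
    """Place the pivot a[first] in its final position and return that index (LOC)."""
    left, right, loc = first, last, first
    while True:
        # Step 2: scan from right to left for an element smaller than the pivot.
        while a[loc] <= a[right] and loc != right:
            right -= 1
        if loc == right:
            return loc
        a[loc], a[right] = a[right], a[loc]
        loc = right
        # Step 3: scan from left to right for an element larger than the pivot.
        while a[left] <= a[loc] and left != loc:
            left += 1
        if loc == left:
            return loc
        a[left], a[loc] = a[loc], a[left]
        loc = left


def quicksort(a):
    """Sort the list a in place using explicit stacks of sublist bounds."""
    lower, upper = [], []
    if len(a) > 1:
        lower.append(0)
        upper.append(len(a) - 1)
    while lower:                      # repeat while the stacks are not empty
        first, last = lower.pop(), upper.pop()
        loc = quick(a, first, last)
        if first < loc - 1:           # left sublist has two or more elements
            lower.append(first)
            upper.append(loc - 1)
        if loc + 1 < last:            # right sublist has two or more elements
            lower.append(loc + 1)
            upper.append(last)
    return a


if __name__ == "__main__":
    data = [40, 30, 10, 50, 70, 90, 44, 60, 99, 20, 80, 60]
    print(quicksort(data))
```

Calling `quick` once on the 12-number example list moves 40 to index 3 (the fourth position), which agrees with the hand trace given earlier.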

The above algorithm is divided into two parts: first a procedure is written which finds the

final location of pivot element; second is the actual sorting algorithm which sorts the given

list using the above procedure.

The quicksort algorithm uses O(N log2 N) comparisons on average. The performance can
be improved by keeping in mind the following points.

1. Switch to a simpler sorting scheme such as insertion sort, which is faster on short lists,
when the sublist size becomes comparatively small.

2. Use a better dividing element in the implementations.

In the worst case, the quicksort algorithm uses O(N^2) comparisons. The worst case occurs
when the input list is already sorted and the pivot element is always picked as the first
element of the sublists. Since this happens only as a special case, the complexity
of quicksort is usually quoted as its average-case value, O(N log2 N). The chances of the
worst case can be minimized if the pivot element is picked randomly in every step.

4. Summary

A stack is a list in which retrieval, insertion, and deletion all take place at the same end,
called the top. It follows the last in first out (LIFO) mechanism. In this chapter, we have
studied how stacks are implemented using arrays and using linked lists. Also, the advantages
and disadvantages of these two schemes were discussed. For example, when a stack is
implemented using arrays, it suffers from the basic limitation of an array (fixed memory). To
overcome this problem, stacks are implemented using linked lists. Various applications of
stacks such as recursion, Polish notation, and quicksort were also discussed. All of these
applications are found to be quite useful in computer science.

5. Suggested Readings/Reference material

• Data Structures Using C and C++, Yedidyah Langsam, Moshe J. Augenstein, Aaron M. Tenenbaum, Second Edition, PHI Publications.

• Algorithms + Data Structures = Programs, Niklaus Wirth, PHI Publications.

• Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Galgotia Publications.

• Data Structures and Program Design in C, R. L. Kruse, C. L. Tondo and B. Leung, Pearson Education.

• Fundamentals of Data Structures in C, R. B. Patel, PHI Publications.

• Data Structures and Algorithms, A. V. Aho, J. E. Hopcroft, J. D. Ullman, Pearson India.

• Data Structures, Seymour Lipschutz, Schaum's Outline Series, TMH.

6. Self Assessment Questions (SAQ)

• Consider the following stack of characters, where STACK is allocated N = 8 memory cells:
  STACK: A, C, D, F, K, ___, ___, ___
  Describe the stack as the following operations take place:
  (a) POP(STACK, ITEM) (b) POP(STACK, ITEM) (c) PUSH(STACK, L) (d) PUSH(STACK, P)
  (e) POP(STACK, ITEM) (f) PUSH(STACK, R) (g) PUSH(STACK, S) (h) POP(STACK, ITEM)

• Translate, by inspection and hand, each infix expression into its equivalent postfix expression:
  (a) (A – B) * (D/E) (b) (A + B^D)/(E – F) + G (c) A * (B + D)/E – F * (G + H/K)

• Consider the following arithmetic expression P, written in postfix notation:
  P: 12, 7, 3, -, /, 2, 1, 5, +, *, +
  (a) Translate P into its equivalent infix expression.
  (b) Evaluate the infix expression.

• Suppose S is the following list of 14 alphabetic characters:
  DATASTRUCTURES
  Suppose the characters in S are to be sorted alphabetically. Use the quicksort algorithm
  to find the final position of the first character D.

• Suppose S consists of the following n = 5 letters:
  ABCDE
  Find the number of comparisons to sort S using quicksort. What general conclusions can
  one make, if any?

• Suppose the Fibonacci numbers F11 = 89 and F12 = 144 are given.
  (a) Should one use recursion or iteration to obtain F16? Find F16.
  (b) Write an iterative procedure to obtain the first N Fibonacci numbers F[1], F[2], …, F[N], where N > 2.

• Write a procedure to obtain the capacity of a linked stack represented by its top pointer
  TOP. The capacity of a linked stack is the number of elements in the list forming the stack.

• Evaluate each of the following parenthesis-free arithmetic expressions:
  (a) 5+3^2–8/4*3+6
  (b) 6+2^3+9/3–4*5

• Explain the quicksort algorithm and program with the help of an example.

• What is recursion? If an algorithm can be written in both recursive and non-recursive
  manner, which method would you prefer and why?

• Differentiate between the array and linked representation of stacks.

• List some major applications of stacks.

Lesson : 7 Writer : Dr. Pardeep Kumar Mittal

Title : Queues Vetter : Prof. Rakesh Kumar

Structure:

1. Introduction

2. Objective

3. Presentation of Contents

3.1 Queue

3.2 Computer Representation of Queue

3.3 Operations on Queue

3.3.1 Insertion

3.3.2 Deletion

3.4 Dequeue

3.4.1 Array implementation of Dequeue

3.4.2 Linked List implementation of Dequeue

3.5 Priority Queues

3.5.1 Representation in Computer memory

4. Summary

5. Suggested Readings/Reference material

6. Self Assessment Questions (SAQ)

1. Introduction

Queues are a very useful data structure in Computer Science. A queue is a data
structure which allows elements to be inserted at one end, called the Rear, and deleted at the other
end, called the Front. A queue is also known as a FIFO (First-In First-Out) data structure: the first element
added to the queue will be the first one to be removed. This is equivalent to the requirement that once
a new element is added, all elements that were added before have to be removed before the new
element can be removed. A queue is an example of a linear data structure, or more abstractly a
sequential collection.

2. Objectives

At the end of this chapter, the reader should be able to understand the linear data structure queue.
In this chapter we will study the computer representation of a queue. The basic operations on a
queue, i.e. insertion and deletion, will also be discussed in this chapter. Then the
various types of queues are discussed in detail. In the end, various applications of queues are
described.

3. Presentation of Contents

3.1 Queue

Queues are data structures that allow insertions and deletions only at
the ends of the list, not in the middle.

A queue is a linear structure in which elements may be inserted only at one end, called the rear, and
deleted only at the other end, called the front. Figure 7.1 pictures a queue of people waiting for their turn.
Queues are also called First-In First-Out (FIFO) lists. An important example of a queue in computer
science occurs in a timesharing system, in which programs with the same priority form a queue while
waiting to be executed.

Figure: 7.1(People waiting for their turn)

Computer science also has common examples of queues. Our computer laboratory has 30 computers

networked with a single printer. When students want to print, their print tasks “get in line” with all

the other printing tasks that are waiting. The first task in is the next to be completed. If you are last in

line, you must wait for all the other tasks to print ahead of you.

In addition to printing queues, operating systems use a number of different queues to control

processes within a computer. The scheduling of what gets done next is typically based on a queuing

algorithm that tries to execute programs as quickly as possible and serve as many users as it can.

Also, as we type, sometimes keystrokes get ahead of the characters that appear on the screen. This is

due to the computer doing other work at that moment. The keystrokes are being placed in a queue-

like buffer so that they can eventually be displayed on the screen in the proper order.

3.2 Computer Representation of Queue

Queues may be represented in the computer in two ways i.e. by means of linear

arrays and one-way linked lists. First we will be considering representation of queues with the

help of arrays.

A queue will be maintained by a linear array QUEUE and two pointer variables: FRONT,
containing the location of the front (first) element of the queue; and REAR, containing the
location of the rear (last) element of the queue. The condition FRONT = NULL will indicate that the
queue is empty.

                          1    2    3    4    5    …    N-1   N
Front = Rear = NULL
Front = Rear = 1          A
Front = 1, Rear = 2       A    B
Front = 1, Rear = 3       A    B    C
Front = 2, Rear = 3            B    C

Figure: 7.2(Array Representation of a Queue)

As indicated in fig. 7.2, when there is no element, front and rear are both equal to NULL.
If an element is inserted, then front and rear both become 1. If another item is inserted, then
rear changes to 2, and after another insertion rear becomes 3. Now if an element is deleted, then
front changes to 2.

Now we will be considering representation of queues with the help of linked lists.

In this case the representation is very much similar to the one-way linked list, except that the
address of the starting node is saved as FRONT and the address of the last node is saved as
REAR. Also, this linked list is a restricted linked list in the sense that insertions can take place only
at the REAR and deletions can take place only at the FRONT. The linked representation is shown
diagrammatically in fig. 7.3.

FRONT                         REAR
  |                             |
  v                             v
[ A ] --> [ B ] --> [ C ] --> [ D ] --> NULL

Fig. 7.3 ( Linked Representation of Queue)

The memory representation of queue via linked-list can be shown in the fig. 7.4.

            INFO   LINK
AVAIL -> 1           4
         2    B      3
REAR  -> 3    C      NULL
         4           6
FRONT -> 5    A      2
         6           7
         7           8
         8           NULL

Fig. 7.4 (Memory representation of queue via linked-list)

3.3 Operations on Queue

3.3.1 Insertion

Figure 7.2 indicated some of the ways in which elements are
inserted in the queue and deleted from the queue. Whenever an
element is added to the queue, the value of REAR is increased by 1; this can be implemented by
the assignment

REAR = REAR + 1

Generally QUEUE is maintained as a circular array, that is, QUEUE[1] comes
after QUEUE[N] in the array. With this assumption, if REAR = N and the first cell is free, we
insert ITEM into the queue by assigning ITEM to QUEUE[1]. Specifically, instead of increasing
REAR to N+1, we reset REAR = 1 and then assign,

QUEUE [REAR] = ITEM

This operation can be shown by fig. 7.5(a) and 7.5(b)

Front = 3, Rear = N
            C    D    E    F    G     H
  1    2    3    4    5    …    N-1   N

Fig. 7.5(a) (A queue in which REAR is N)

Front = 3, Rear = 1
  I         C    D    E    F    G     H
  1    2    3    4    5    …    N-1   N

Fig. 7.5(b) (After inserting an element in the queue of fig. 7.5(a))

The operation of adding an item into a queue is implemented by the following algorithm, called
QINSERT.

Algorithm : QINSERT (QUEUE, N, FRONT, REAR, ITEM)

1. If FRONT = 1 and REAR = N, or if FRONT = REAR+1, then

Write OVERFLOW and Exit.

2. If FRONT = NULL, then

FRONT = 1 and REAR = 1.

Else if REAR = N, then

REAR = 1.

Else

REAR = REAR + 1.

3. QUEUE [REAR] = ITEM.

4. Exit.

The above algorithm for inserting an element in a queue has been implemented using arrays. In
this algorithm the first step checks for the available space for inserting a new element. If no
space is available, then the overflow condition occurs. Otherwise control flows to the second
step, in which the location where the new element is to be inserted is found. In the third step, the
actual insertion is made.
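The array-based insertion steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the array keeps cell 0 unused so that positions run from 1 to N as in the text, `None` stands for NULL, and the function name `qinsert` returns the updated FRONT and REAR because Python has no output parameters.

```python
def qinsert(queue, n, front, rear, item):
    """Insert item into a circular queue held in queue[1..n].

    Returns the updated (front, rear); None stands for NULL.
    """
    # Step 1: the queue is full when REAR is immediately "behind" FRONT.
    if front is not None and ((front == 1 and rear == n) or front == rear + 1):
        raise OverflowError("OVERFLOW")
    # Step 2: find the new value of REAR.
    if front is None:          # queue initially empty
        front = rear = 1
    elif rear == n:            # wrap around to the first cell
        rear = 1
    else:
        rear += 1
    # Step 3: the actual insertion.
    queue[rear] = item
    return front, rear


if __name__ == "__main__":
    cells = [None, None, None, "C", "D", "E"]   # N = 5, FRONT = 3, REAR = 5
    print(qinsert(cells, 5, 3, 5, "I"), cells)
```

In the usage example REAR equals N before the call, so the new item is stored in cell 1 and REAR wraps around to 1.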

The insertion in a queue can also be implemented using linked list. The following algorithm

shows the insertion in a queue using linked list.

Algorithm : LINK_INSERT(INFO, LINK, FRONT, REAR, AVAIL, ITEM)

1. If AVAIL = NULL then Write OVERFLOW and Exit

2. NEW = AVAIL and AVAIL = LINK[AVAIL]

3. INFO[NEW] = ITEM and LINK[NEW] = NULL

4. If FRONT = NULL then

FRONT = REAR = NEW

Else

LINK[REAR] = NEW and REAR = NEW

5. Exit

The above algorithm works in a manner similar to that of the previous algorithm. The only
difference is that a linked list is used here instead of an array.
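LINK_INSERT can be sketched in Python with two parallel lists standing in for the INFO and LINK arrays. The sketch assumes cell 0 is unused, `None` stands for NULL, and the function returns the updated FRONT, REAR and AVAIL; the sample data in the usage example is hypothetical (A, B, C stored in cells 5, 2, 3 and the free cells chained as 1, 4, 6, 7, 8).

```python
def link_insert(info, link, front, rear, avail, item):
    """Append item at the rear of a linked queue kept in parallel arrays.

    Returns the updated (front, rear, avail); None stands for NULL.
    """
    if avail is None:                 # no free node left
        raise OverflowError("OVERFLOW")
    new = avail                       # take the first node of the free list
    avail = link[avail]
    info[new] = item
    link[new] = None                  # the new node becomes the last node
    if front is None:                 # queue was empty
        front = rear = new
    else:
        link[rear] = new
        rear = new
    return front, rear, avail


if __name__ == "__main__":
    info = [None, None, "B", "C", None, "A", None, None, None]
    link = [None, 4, 3, None, 6, 2, 7, 8, None]
    print(link_insert(info, link, 5, 3, 1, "D"))
```

In the usage example the new item D is stored in cell 1, the first free cell, and the old rear node (cell 3) is linked to it.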

3.3.2 Deletion

As shown in fig 7.2, whenever an element is deleted from the queue, the
value of FRONT is increased by 1; this can be implemented by the assignment

FRONT = FRONT + 1

QUEUE is again assumed to be circular, that is, QUEUE[1] comes after QUEUE[N] in the
array. With this assumption, if FRONT = N and an element of QUEUE is deleted, we reset
FRONT = 1 instead of increasing FRONT to N+1, as shown in fig. 7.6(a) and 7.6(b).

Front = N, Rear = 2
  I    J                              H
  1    2    3    4    5    …    N-1   N

Fig. 7.6(a) (A queue in which FRONT is N)

Front = 1, Rear = 2
  I    J
  1    2    3    4    5    …    N-1   N

Fig. 7.6(b) (After deleting an element in the queue of fig. 7.6(a))

Suppose that our queue contains only one element, i.e., suppose that

FRONT = REAR = 1

and suppose that the element is deleted. Then we assign

FRONT = NULL and REAR = NULL

to indicate that the queue is empty. This operation is depicted in fig. 7.7(a) and 7.7(b).

Front = Rear = 1
  A
  1    2    3    4    5    …    N-1   N

Fig. 7.7(a) (A queue in which FRONT = REAR = 1)

Front = Rear = NULL

  1    2    3    4    5    …    N-1   N

Fig. 7.7(b) (After deleting an element in the queue of fig. 7.7(a))

The operation of removing an item from a queue is implemented by the following algorithm,

called QDELETE

Algorithm : QDELETE (QUEUE, N, FRONT, REAR, ITEM)

1. If FRONT = NULL then

Write UNDERFLOW and Exit.

2. ITEM = QUEUE[FRONT].

3. If FRONT = REAR then

FRONT = NULL and REAR = NULL.

Else if FRONT = N then

FRONT = 1.

Else

FRONT = FRONT+1.

4. Return ITEM

5. Exit

The above algorithm for deleting an element from a queue has been implemented using arrays. In
the first step, it is checked whether any element exists in the queue. If no element is
available, then the underflow condition occurs. In the second step the element at the
FRONT is saved in a variable. In the next step the element is actually deleted by advancing FRONT.
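A Python sketch of QDELETE is given below, under the same illustrative assumptions as before: positions run from 1 to N with cell 0 unused, `None` stands for NULL, and the function (named `qdelete` here) returns the deleted item together with the updated FRONT and REAR.

```python
def qdelete(queue, n, front, rear):
    """Delete the front item of a circular queue held in queue[1..n].

    Returns (item, front, rear); None stands for NULL.
    """
    # Step 1: nothing to delete.
    if front is None:
        raise IndexError("UNDERFLOW")
    # Step 2: save the front element.
    item = queue[front]
    # Step 3: advance FRONT, wrapping around or emptying the queue if needed.
    if front == rear:          # the queue had only one element
        front = rear = None
    elif front == n:
        front = 1
    else:
        front += 1
    return item, front, rear


if __name__ == "__main__":
    cells = [None, "I", "J", None, None, "H"]   # N = 5, FRONT = 5, REAR = 2
    print(qdelete(cells, 5, 5, 2))
```

In the usage example FRONT equals N before the call, so after H is removed FRONT wraps around to 1.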

The deletion in a queue can also be implemented using linked list. The following algorithm

shows the deletion in a queue using linked list.

Algorithm : LINK_DELETE(INFO, LINK, FRONT, REAR, AVAIL, ITEM)

1. If FRONT = NULL then Write UNDERFLOW and Exit

2. TEMP = FRONT

3. ITEM = INFO[TEMP]

4. FRONT = LINK[TEMP]

5. LINK[TEMP] = AVAIL and AVAIL = TEMP

6. Exit

The above algorithm works in a manner similar to that of the previous algorithm. The only
difference is that a linked list is used here instead of an array.
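LINK_DELETE can be sketched in Python in the same parallel-array style. As before, `None` stands for NULL and cell 0 is unused. One detail is an addition of this sketch and not part of the algorithm as written: REAR is also reset to NULL when the last node is removed, so that FRONT and REAR stay consistent. The sample data is hypothetical.

```python
def link_delete(info, link, front, rear, avail):
    """Delete the front node of a linked queue kept in parallel arrays.

    Returns (item, front, rear, avail); None stands for NULL.
    """
    if front is None:
        raise IndexError("UNDERFLOW")
    temp = front
    item = info[temp]
    front = link[temp]          # the second node becomes the front
    if front is None:
        rear = None             # assumption: keep REAR consistent when empty
    link[temp] = avail          # return the deleted node to the free list
    avail = temp
    return item, front, rear, avail


if __name__ == "__main__":
    info = [None, None, "B", "C", None, "A", None, None, None]
    link = [None, 4, 3, None, 6, 2, 7, 8, None]
    print(link_delete(info, link, 5, 3, 1))
```

The deleted node is pushed onto the head of the free list, so it is the first cell to be reused by a later insertion.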

3.4 Dequeue

A deque is a linear list in which elements can be added or removed at either end
but not in the middle. The term deque is a contraction of the name double-ended queue.

There are two variations of a deque, namely an input-restricted deque and an output-restricted
deque, which are intermediate between a deque and a queue. An input-restricted deque is a
deque which allows insertions at only one end of the list but allows deletions at both ends of the
list; and an output-restricted deque is a deque which allows deletions at only one end of the list
but allows insertions at both ends of the list.

Figure 7.8 pictures a deque with 4 elements maintained in an array with N = 8 memory locations. The
condition FRONT = NULL will be used to indicate that a deque is empty. In a deque, FRONT and REAR
are maintained as in a normal queue. The only difference is that insertions and deletions can be done at
either end, as shown in fig. 7.8.

        FRONT                  REAR
          |                      |
          v                      v
        [ A ]   [ B ]   [ C ]   [ D ]

Insertion/Deletion        Insertion/Deletion
Figure 7.8 (Representation of a deque)

A deque can also be represented using linked lists as in fig. 7.3 and 7.4; again the difference
is that insertions/deletions can be done at either end.

3.4.1 Operations on dequeue

Insertions: If a deque is implemented using arrays, then we will assume for
simplicity that the array is not maintained as a circular array. The algorithm for
insertion at the rear of a deque can then be written as follows:

Algorithm : INSERTATREAR (DEQUE, N, FRONT, REAR, ITEM)

1. If REAR = N then Write OVERFLOW and Exit.

2. If FRONT = NULL then

FRONT = 1 and REAR = 1.

Else

REAR = REAR + 1.

3. DEQUE [REAR] = ITEM.

4. Exit.

The second algorithm i.e. insertion at the front can be written as:

Algorithm : INSERTATFRONT (DEQUE, N, FRONT, REAR, ITEM)

1. If FRONT = 1 then Write “Can not insert at front end” and Exit.

2. If FRONT = NULL then

FRONT = 1 and REAR = 1.

Else

FRONT = FRONT - 1.

3. DEQUE [FRONT] = ITEM.

4. Exit.

If the deque is represented as a linked-list, then the algorithm for insertion at rear remains same

as it was written earlier in the section 3.3.1 for a simple queue. But when the insertion is to be

made at the front, the algorithm can be written as follows:

Algorithm : LK_INSERT_FRONT_DEQUE(INFO, LINK, FRONT, REAR, AVAIL, ITEM)

1. If AVAIL = NULL then Write OVERFLOW and Exit

2. NEW = AVAIL and AVAIL = LINK[AVAIL]

3. INFO[NEW] = ITEM and LINK[NEW] = FRONT

4. If FRONT = NULL then

FRONT = REAR = NEW

Else

FRONT = NEW

5. Exit

Deletion: With the assumption that array is not maintained as a circular array, the algorithm for

deletion in a deque i.e. deletion at the front can be written as follows:

Algorithm : DQDELFRONT (DEQUE, N, FRONT, REAR, ITEM)

1. If FRONT = NULL then Write UNDERFLOW and Exit.

2. ITEM = DEQUE[FRONT].

3. If FRONT = REAR then

FRONT = NULL and REAR = NULL.

Else

FRONT = FRONT+1.

4. Exit

The second algorithm i.e. deletion at the rear can be written as:

Algorithm : DQDELREAR (DEQUE, N, FRONT, REAR, ITEM)

1. If FRONT = REAR = NULL then Write UNDERFLOW and Exit.

2. ITEM = DEQUE[REAR].

3. If FRONT = REAR then

FRONT = NULL and REAR = NULL.

Else

REAR = REAR – 1.

4. Return ITEM

5. Exit
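The four array-based deque algorithms (INSERTATREAR, INSERTATFRONT, DQDELFRONT and DQDELREAR) can be collected in one small Python class. This is a sketch under the same simplifying assumption as the text, namely that the array is not circular; the class name `ArrayDeque`, its method names, and the use of `None` for NULL are illustrative choices, not part of the original algorithms.

```python
class ArrayDeque:
    """Non-circular deque stored in cells 1..n; None stands for NULL."""

    def __init__(self, n):
        self.n = n
        self.cells = [None] * (n + 1)      # cell 0 is unused
        self.front = self.rear = None

    def insert_rear(self, item):
        if self.rear == self.n:
            raise OverflowError("OVERFLOW")
        if self.front is None:
            self.front = self.rear = 1
        else:
            self.rear += 1
        self.cells[self.rear] = item

    def insert_front(self, item):
        if self.front == 1:
            raise OverflowError("Cannot insert at front end")
        if self.front is None:
            self.front = self.rear = 1
        else:
            self.front -= 1
        self.cells[self.front] = item

    def delete_front(self):
        if self.front is None:
            raise IndexError("UNDERFLOW")
        item = self.cells[self.front]
        if self.front == self.rear:        # last element removed
            self.front = self.rear = None
        else:
            self.front += 1
        return item

    def delete_rear(self):
        if self.front is None:
            raise IndexError("UNDERFLOW")
        item = self.cells[self.rear]
        if self.front == self.rear:        # last element removed
            self.front = self.rear = None
        else:
            self.rear -= 1
        return item
```

Because the array is not circular, an insertion at the front is possible only after at least one deletion at the front has freed a cell to the left of FRONT.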

Now again if the deque is represented as a linked-list, then the algorithm for deletion at front

remains same as it was written earlier in the section 3.3.2 for a simple queue. But when the

deletion is to be made at the rear, the algorithm can be written as follows:

Algorithm : LK_DELETE_DEQUE_REAR(INFO, LINK, FRONT, REAR, AVAIL, ITEM)

1. If FRONT = REAR = NULL then Write UNDERFLOW and Exit

2. TEMP = REAR

3. ITEM = INFO[TEMP]

4. If FRONT = REAR then

FRONT = REAR = NULL and goto 8

5. SAVE = FRONT and PTR = LINK[FRONT]

6. Repeat while PTR ≠ REAR

SAVE = PTR

PTR = LINK[PTR]

7. REAR = SAVE and LINK[REAR] = NULL

8. LINK[TEMP] = AVAIL and AVAIL = TEMP

9. Return ITEM

10. Exit

In the above algorithm, the main task is to find the address of the node previous to the rear node,
if it exists. For this purpose we move from FRONT towards REAR, as we cannot move backward
from REAR in a one-way list. We also maintain a variable SAVE, which always holds the address of
the node previous to the one currently being examined.
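The rear deletion for a linked deque can be sketched in Python as follows, again with parallel lists for INFO and LINK, cell 0 unused and `None` for NULL. The function name `link_delete_rear` and the sample data are hypothetical; the traversal with `save` and `ptr` mirrors steps 5 to 7 of the algorithm.

```python
def link_delete_rear(info, link, front, rear, avail):
    """Delete the rear node of a linked deque kept in parallel arrays.

    Returns (item, front, rear, avail); None stands for NULL.
    """
    if front is None:
        raise IndexError("UNDERFLOW")
    temp = rear
    item = info[temp]
    if front == rear:                  # only one node in the deque
        front = rear = None
    else:
        # Walk from FRONT until ptr reaches REAR; save trails one node behind.
        save, ptr = front, link[front]
        while ptr != rear:
            save, ptr = ptr, link[ptr]
        rear = save
        link[rear] = None
    link[temp] = avail                 # return the deleted node to the free list
    avail = temp
    return item, front, rear, avail


if __name__ == "__main__":
    info = [None, None, "B", "C", None, "A", None, None, None]
    link = [None, 4, 3, None, 6, 2, 7, 8, None]
    print(link_delete_rear(info, link, 5, 3, 1))
```

The traversal makes rear deletion take time proportional to the length of the list, whereas the other three linked deque operations take constant time.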

3.5 Priority Queues

A priority queue is a collection of elements such that each element has been assigned

a priority and such that the order in which elements are deleted and processed comes from the

following rules:

1) An element of higher priority is processed before any element of lower priority.

2) Two elements with the same priority are processed according to the order in which

they were added to the queue.

There are various ways for maintaining a priority queue in computer memory. The two main

ways are using one-way list and array representation. We will be discussing each of them in

brief.

3.5.1 Representation in Computer Memory

One-Way List Representation

One way to maintain a priority queue in computer memory is by using a one-way list, as follows:

• Each node in the list will consist of three parts: INFO, which will contain the
information stored in the node; PRN, which will store the priority number of the
node; and LINK, the link field, which will store the address of the next node.

• A node X will precede a node Y in the list (1) when X has higher priority than Y or (2)
when both have the same priority, but X has arrived before Y.

Here a smaller priority number denotes a higher priority, so the node at FRONT always has
the smallest priority number in the list.

The representation of priority queues using linked-list can be shown as in fig. 7.9

FRONT                                         REAR
  |                                             |
  v                                             v
[ A | 1 ] --> [ B | 2 ] --> [ C | 2 ] --> [ D | 5 ] --> NULL

Fig. 7.9 (Representation of priority queue using linked-list)

One can insert or delete an element in a priority queue using above representation by the

following algorithms:

Algorithm: Delete_Priority

1. ITEM = INFO[FRONT]

2. Delete first node from the list

3. Process ITEM

4. Exit

The above algorithm can also be written as similar to LINK_DELETE algorithm in section 3.3.2

Algorithm: Insert_Priority

1. Traverse the one-way list until finding a node X whose priority number exceeds N, the
priority number of ITEM. Insert ITEM in front of node X.

2. If no such node is found, insert ITEM as the last element in the list.

The above algorithm can also be written as follows:

Algorithm : LK_PRIOR_INSERT(INFO, LINK, FRONT, REAR, AVAIL, ITEM, PRN, PRT)

Here PRN indicates the priority number of the various nodes and PRT indicates the priority

number of the item to be inserted.

1. If AVAIL = NULL then Write OVERFLOW and Exit

2. NEW = AVAIL and AVAIL = LINK[AVAIL]

3. INFO[NEW] = ITEM and PRN[NEW] = PRT

4. If FRONT = NULL then

FRONT = REAR = NEW and LINK[NEW] = NULL and Exit

5. SAVE = NULL and PTR = FRONT

6. Repeat steps 7 & 8 while PTR ≠ NULL

7. If PRN[PTR] > PRT then go to step 9

8. SAVE = PTR and PTR = LINK[PTR]

9. If SAVE = NULL then

LINK[NEW] = FRONT and FRONT = NEW

Else

LINK[NEW] = LINK[SAVE] and LINK[SAVE] = NEW

If LINK[NEW] = NULL then REAR = NEW

10. Exit
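The priority insertion can be sketched in Python as follows. The sketch assumes parallel lists `info`, `prn` and `link` with cell 0 unused, `None` for NULL, and a smaller priority number meaning higher priority. It folds the empty-queue case into the general traversal and updates REAR whenever the new node becomes the last node; both are choices of this sketch. The sample data in the usage example is hypothetical.

```python
def prior_insert(info, prn, link, front, rear, avail, item, prt):
    """Insert item with priority number prt into a linked priority queue.

    Returns the updated (front, rear, avail); None stands for NULL.
    """
    if avail is None:
        raise OverflowError("OVERFLOW")
    new = avail
    avail = link[avail]
    info[new], prn[new] = item, prt
    # Find the first node whose priority number exceeds prt;
    # save trails one node behind ptr.
    save, ptr = None, front
    while ptr is not None and prn[ptr] <= prt:
        save, ptr = ptr, link[ptr]
    link[new] = ptr
    if save is None:
        front = new                # new node becomes the first node
    else:
        link[save] = new
    if ptr is None:
        rear = new                 # new node becomes the last node
    return front, rear, avail


if __name__ == "__main__":
    info = [None, "A", "B", "C", "D", None, None, None]
    prn = [None, 1, 2, 2, 5, None, None, None]
    link = [None, 2, 3, 4, None, 6, 7, None]
    print(prior_insert(info, prn, link, 1, 4, 5, "X", 2))
```

Using `<=` in the traversal places a new item after all existing items of the same priority, which preserves the first-come first-served rule within a priority level.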

Array Representation

Another way to maintain a priority queue is to use a separate queue for each level of

priority. Each such queue will appear in its own circular array and will have its own pair of

pointers, i.e. FRONT and REAR. Each array will be allocated same amount of space. Actually a

two-dimensional array can be used for maintaining priority queue in memory.

The array representation of priority queue can be shown as in fig. 7.10

FRONT   REAR   PRIORITY |  1    2    3    4    5
  1       2       1     |  A    B
  2       2       2     |       C
  2       2       3     |       D
  3       4       4     |            E    F
 NULL    NULL     5     |

Fig. 7.10 (Representation of priority queue using arrays)

One can insert or delete an element in a priority queue using above representation by the

following algorithms:

Algorithm: Delete_ArrPrior

1. Find the smallest K such that FRONT[K] ≠ NULL

2. Delete and process the front element in row K of QUEUE

3. Exit

Algorithm: Insert_ArrPrior

1. Insert ITEM as the rear element in row M(priority number) of QUEUE.

2. Exit.
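The array representation, with one circular queue per priority level, can be sketched in Python as follows. The class name `ArrayPriorityQueue`, its method names and the use of `None` for NULL are illustrative; row 0 and cell 0 are left unused so that priorities and positions start at 1 as in the text.

```python
class ArrayPriorityQueue:
    """One circular queue of n cells per priority level 1..levels."""

    def __init__(self, levels, n):
        self.n = n
        self.rows = [[None] * (n + 1) for _ in range(levels + 1)]
        self.front = [None] * (levels + 1)
        self.rear = [None] * (levels + 1)

    def insert(self, item, m):
        """Insert item as the rear element of row m (its priority number)."""
        f, r = self.front[m], self.rear[m]
        if f is not None and ((f == 1 and r == self.n) or f == r + 1):
            raise OverflowError("OVERFLOW")
        if f is None:
            f = r = 1
        else:
            r = 1 if r == self.n else r + 1
        self.rows[m][r] = item
        self.front[m], self.rear[m] = f, r

    def delete(self):
        """Delete and return the front element of the first non-empty row."""
        for k in range(1, len(self.rows)):
            f = self.front[k]
            if f is not None:
                item = self.rows[k][f]
                if f == self.rear[k]:          # row k becomes empty
                    self.front[k] = self.rear[k] = None
                else:
                    self.front[k] = 1 if f == self.n else f + 1
                return item
        raise IndexError("UNDERFLOW")


if __name__ == "__main__":
    pq = ArrayPriorityQueue(levels=5, n=5)
    for item, priority in [("E", 4), ("A", 1), ("C", 2), ("B", 1)]:
        pq.insert(item, priority)
    print(pq.delete(), pq.delete(), pq.delete(), pq.delete())
```

Insertion takes constant time in this representation, while deletion has to search for the first non-empty row; the price is that every priority level reserves its own fixed block of cells.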

3.6 Applications of Queue

Many applications involving queues require priority queues rather than the simple
FIFO strategy. For elements of the same priority, the FIFO order is used. For example, in a multiuser

system, there will be several programs competing for use of the central processor at one time.

The programs have a priority value associated to them and are held in a priority queue. The

program with the highest priority is given first use of the central processor.

Scheduling of jobs within a time-sharing system is another application of queues. In such a system
many users may request processing at a time, and computer time is divided among these requests.
The simplest approach sets up one queue that stores all requests for processing. The computer
processes the request at the front of the queue and finishes it before starting on the next. The same
approach is also used when several users want to use the same output device, say a printer. In a

time sharing system, another common approach used is to process a job only for a specified

maximum length of time. If the program is fully processed within that time, then the computer

goes on to the next process. If the program is not completely processed within the specified time,

the intermediate values are stored and the remaining part of the program is put back on the

queue. This approach is useful in handling a mix of long and short jobs.
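The time-slice approach described above can be illustrated with a small Python simulation. This is only a sketch: the function name `round_robin`, the job names and the time units are hypothetical, and `collections.deque` from the Python standard library is used as the FIFO queue.

```python
from collections import deque


def round_robin(jobs, quantum):
    """Simulate time-slice scheduling.

    jobs is a list of (name, required_time) pairs and quantum is the maximum
    time a job may run per turn. Returns job names in order of completion.
    """
    queue = deque(jobs)
    finished = []
    while queue:
        name, remaining = queue.popleft()       # job at the front of the queue
        if remaining <= quantum:
            finished.append(name)               # fully processed in this turn
        else:
            # Not finished: the remaining part rejoins the queue at the rear.
            queue.append((name, remaining - quantum))
    return finished


if __name__ == "__main__":
    print(round_robin([("long", 5), ("short", 1), ("mid", 3)], quantum=2))
```

With a quantum of 2, the short job completes during its first turn even though the long job arrived before it, which is why this approach handles a mix of long and short jobs well.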

Major applications of queue can be summarized as follows:

1) Serving requests of a single shared resource (printer, disk, CPU), transferring data

asynchronously (data not necessarily received at same rate as sent) between two

processes (IO buffers), e.g., pipes, file IO, sockets.

2) Call center phone systems will use a queue to hold people in line until a service

representative is free.

3) Buffers on MP3 players and portable CD players, iPod playlist. Playlist for jukebox - add

songs to the end, play from the front of the list.

4) When programming a real-time system that can be interrupted (e.g., by a mouse click or

wireless connection), it is necessary to attend to the interrupts immediately, before

proceeding with the current activity. If the interrupts should be handled in the same order

they arrive, then a FIFO queue is the appropriate data structure.

5) When a resource is shared among multiple consumers. Examples include CPU

scheduling, Disk Scheduling.

6) When data is transferred asynchronously (data not necessarily received at same rate as

sent) between two processes. Examples include IO Buffers, pipes, file IO, etc.

4. Summary

In this chapter, we discussed the data structure queue. It has two ends: the front, from
where elements are deleted, and the rear, where elements are added. It
follows first in first out (FIFO) order. A queue can be implemented using arrays or linked lists.
Each representation has its own advantages and disadvantages. The problem with arrays
is that they are limited in space; hence, the queue has a limited capacity. If queues are
implemented using linked lists, then this problem is solved, and the capacity of the queue is
limited only by the available memory. The only overhead is the memory occupied by the pointers.

There are a number of variants of queues. Array-based queues are normally maintained as circular
queues. A special type of queue called the deque was also discussed in this chapter. Deques permit
elements to be added or deleted at either the rear or the front. We also discussed the array and
linked list implementations of the deque. Priority queues were also discussed in detail in this chapter.

Queues are employed in many situations. The items in a queue may be vehicles waiting at a
crossing, cars waiting at a service station, customers in a check-out line at a departmental store,
etc. In computer science queues are generally used in operating systems, networking
simulation, etc.

5. Suggested Readings/Reference material

• Data Structures Using C and C++, Yedidyah Langsam, Moshe J. Augenstein, Aaron M. Tenenbaum, Second Edition, PHI Publications.

• Algorithms + Data Structures = Programs, Niklaus Wirth, PHI Publications.

• Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Galgotia Publications.

• Data Structures and Program Design in C, R. L. Kruse, C. L. Tondo and B. Leung, Pearson Education.

• Fundamentals of Data Structures in C, R. B. Patel, PHI Publications.

• Data Structures and Algorithms, A. V. Aho, J. E. Hopcroft, J. D. Ullman, Pearson India.

• Data Structures, Seymour Lipschutz, Schaum's Outline Series, TMH.

6. Self Assessment Questions (SAQ)

• Suppose each data structure is stored in a circular array with N memory cells.
  (a) Find the number of elements in a queue in terms of FRONT and REAR.
  (b) Find the number of elements in a deque in terms of LEFT and RIGHT.
  (c) When will the array be filled?

• Consider a deque maintained by a circular array with N memory cells.
  (a) Suppose an element is added to the deque. How is LEFT or RIGHT changed?
  (b) Suppose an element is deleted. How is LEFT or RIGHT changed?

• Consider the following queue where QUEUE is allocated 6 memory cells:
  FRONT = 2, REAR = 5 QUEUE: __________, P, Q, R, S, __________
  Describe the queue, including FRONT and REAR, as the following operations take place:
  (a) X is added (b) two elements are deleted (c) Y is added (d) Z is added (e) three elements are
  deleted (f) M is added.

• Suppose a queue is maintained by a circular array QUEUE with N = 12 memory cells. Find
  the number of elements in QUEUE if (a) FRONT = 4, REAR = 8, (b) FRONT = 10, REAR
  = 3 and (c) FRONT = 5, REAR = 6 and then two elements are deleted.

• Compare the array and linked representation of a queue. Explain your answer.

• Describe the representation of a deque in computer memory.

• What is a priority queue? Explain its representation in computer memory.

• What are the various applications of queues? Explain in brief.

• Write the pseudocode for the various possible operations in an input-restricted deque.

• Write the pseudocode for the various possible operations in an output-restricted deque.
