
DATA STRUCTURE AND ALGORITHM

What is an Algorithm?

In computer programming terms, an algorithm is a set of well-defined

instructions to solve a particular problem. It takes a set of input(s) and

produces the desired output. For example,

An algorithm to add two numbers:

1. Take two number inputs

2. Add numbers using the + operator

3. Display the result.

Qualities of a Good Algorithm

 Input and output should be defined precisely.

 Each step in the algorithm should be clear and unambiguous.

 An algorithm should be the most effective among the many different ways to solve a problem.

 An algorithm shouldn't include computer code. Instead, the algorithm should be written in such a way that it can be used in different programming languages.
Algorithm 1: Add two numbers entered by the user

Step 1: Start

Step 2: Declare variables num1, num2 and sum.

Step 3: Read values num1 and num2.

Step 4: Add num1 and num2 and assign the result to sum.

sum←num1+num2

Step 5: Display sum

Step 6: Stop
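Algorithm 1 maps almost line-for-line onto code. Here is a minimal C++ sketch (the function name add is our own choice, not part of the algorithm):

```cpp
// Step 4 of Algorithm 1: sum ← num1 + num2
int add(int num1, int num2) {
    int sum = num1 + num2;   // add the numbers using the + operator
    return sum;
}
```

Reading num1 and num2 with std::cin (Step 3) and printing sum with std::cout (Step 5) would complete the algorithm as a full program.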

Algorithm 2: Find the largest number among three numbers

Step 1: Start

Step 2: Declare variables a,b and c.

Step 3: Read variables a,b and c.

Step 4: If a > b

If a > c

Display a is the largest number.

Else

Display c is the largest number.

Else

If b > c

Display b is the largest number.

Else
Display c is the greatest number.

Step 5: Stop
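As a sketch in C++ (the function name largest is our own), the nested If/Else of Algorithm 2 becomes:

```cpp
// Returns the largest of a, b and c, mirroring Algorithm 2's branches.
int largest(int a, int b, int c) {
    if (a > b) {
        if (a > c) return a;   // a is the largest number
        return c;              // c is the largest number
    }
    if (b > c) return b;       // b is the largest number
    return c;                  // c is the greatest number
}
```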

Algorithm 3: Find Roots of a Quadratic Equation ax² + bx + c = 0


Step 1: Start

Step 2: Declare variables a, b, c, D, r1, r2, rp and ip.

Step 3: Calculate discriminant

D ← b² - 4ac

Step 4: If D ≥ 0

r1 ← (-b+√D)/2a

r2 ← (-b-√D)/2a

Display r1 and r2 as roots.

Else

Calculate real part and imaginary part

rp ← -b/2a

ip ← √(-D)/2a

Display rp+j(ip) and rp-j(ip) as roots

Step 5: Stop
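A C++ sketch of Algorithm 3's discriminant test (the helper names discriminant, root1 and root2 are our own, not from the algorithm):

```cpp
#include <cmath>

// Step 3: D ← b² - 4ac
double discriminant(double a, double b, double c) {
    return b * b - 4 * a * c;
}

// Step 4, D ≥ 0 branch: the two real roots (caller must ensure D ≥ 0)
double root1(double a, double b, double D) { return (-b + std::sqrt(D)) / (2 * a); }
double root2(double a, double b, double D) { return (-b - std::sqrt(D)) / (2 * a); }
```

For x² - 3x + 2 = 0 this gives D = 1 and roots 2 and 1; when D < 0, the real part rp = -b/2a and imaginary part ip = √(-D)/2a form the complex pair rp ± j·ip.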

What are Data Structures?

Data structure is a storage that is used to store and organize data. It is a


way of arranging data on a computer so that it can be accessed and
updated efficiently.
Depending on your requirement and project, it is important to choose the
right data structure for your project. For example, if you want to store data
sequentially in the memory, then you can go for the Array data structure.

Array Data Structure Representation

Note: Data structure and data types are slightly different. Data structure is
the collection of data types arranged in a specific order.

Types of Data Structure

Basically, data structures are divided into two categories:

 Linear data structure

 Non-linear data structure

Let's learn about each type in detail.


Linear data structures

In linear data structures, the elements are arranged in sequence one after
the other. Since elements are arranged in particular order, they are easy to
implement.

However, when the complexity of the program increases, the linear data
structures might not be the best choice because of operational
complexities.

Popular linear data structures are:


1. Array Data Structure

In an array, elements are arranged in contiguous memory. All the elements of an array are of the same type. And, the type of elements that can be stored in the form of arrays is determined by the programming language.

To learn more, visit Java Array.

An array with each element represented by an index

2. Stack Data Structure

In the stack data structure, elements are stored according to the LIFO principle. That is, the last element stored in a stack will be removed first.

It works just like a pile of plates where the last plate kept on the pile will be
removed first. To learn more, visit Stack Data Structure.
In a stack, operations can be performed only from one end (the top here).

3. Queue Data Structure

Unlike a stack, the queue data structure works on the FIFO principle, where the first element stored in the queue will be removed first.

It works just like a queue of people at a ticket counter, where the first person in the queue gets the ticket first. To learn more, visit Queue Data
Structure.

In a queue, addition and removal are performed from separate ends.

4. Linked List Data Structure

In linked list data structure, data elements are connected through a series
of nodes. And, each node contains the data items and address to the next
node.
Non linear data structures

Unlike linear data structures, elements in non-linear data structures are not
in any sequence. Instead they are arranged in a hierarchical manner where
one element will be connected to one or more elements.

Non-linear data structures are further divided into graph and tree based
data structures.

1. Graph Data Structure

In the graph data structure, each node is called a vertex, and each vertex is connected to other vertices through edges.

Graph data structure example

Popular Graph Based Data Structures:


 Spanning Tree and Minimum Spanning Tree
 Strongly Connected Components
 Adjacency Matrix
 Adjacency List
2. Trees Data Structure

Similar to a graph, a tree is also a collection of vertices and edges. However, in a tree data structure, there can be only one path between two vertices.
Popular Tree based Data Structure

 Binary Tree
 Binary Search Tree
 AVL Tree
 B-Tree
 B+ Tree
 Red-Black Tree

Linear Vs Non-linear Data Structures

Now that we know about linear and non-linear data structures, let's see the
major differences between them.

Linear Data Structures:

 The data items are arranged in sequential order, one after the other.
 All the items are present on a single layer.
 It can be traversed in a single run. That is, if we start from the first element, we can traverse all the elements sequentially in a single pass.
 The memory utilization is not efficient.
 The time complexity increases with the data size.
 Examples: Arrays, Stack, Queue

Non-Linear Data Structures:

 The data items are arranged in non-sequential order (a hierarchical manner).
 The data items are present at different layers.
 It requires multiple runs. That is, if we start from the first element, it might not be possible to traverse all the elements in a single pass.
 Different structures utilize memory in different, efficient ways depending on the need.
 The time complexity remains the same.
 Examples: Tree, Graph, Map


Why Learn Data Structures and
Algorithms?
In this article, we will learn why every programmer should learn data
structures and algorithms with the help of examples.

This article is for those who have just started learning algorithms and
wondered how impactful it will be to boost their career/programming skills.
It is also for those who wonder why big companies like Google, Facebook,
and Amazon hire programmers who are exceptionally good at optimizing
Algorithms.

What are Algorithms?

Informally, an algorithm is nothing but a sequence of steps to solve a problem. Algorithms are essentially a solution.

For example, an algorithm to solve the problem of factorials might look


something like this:

Problem: Find the factorial of n

Initialize fact = 1

For every value v in range 1 to n:

Multiply the fact by v

fact contains the factorial of n


Here, the algorithm is written in English. If it was written in a programming language, we would call it code instead. Here is the code for finding the factorial of a number in C++.

int factorial(int n) {
int fact = 1;
for (int v = 1; v <= n; v++) {
fact = fact * v;
}
return fact;
}

Programming is all about data structures and algorithms. Data structures


are used to hold data while algorithms are used to solve the problem using
that data.

The study of data structures and algorithms (DSA) goes through solutions to standard problems in detail and gives you an insight into how efficient it is to use each one of them. It also teaches you the science of evaluating the efficiency of an algorithm. This enables you to choose the best of various choices.

Use of Data Structures and Algorithms to Make Your Code Scalable

Time is precious.
Suppose Alice and Bob are trying to solve a simple problem of finding the sum of the first 10¹¹ natural numbers. While Bob was writing the algorithm, Alice implemented it, proving that it is as simple as criticizing Donald Trump.
Algorithm (by Bob)

Initialize sum = 0

for every natural number n in range 1 to 10¹¹ (inclusive):


add n to sum
sum is your answer

Code (by Alice)

long long findSum() {
    long long sum = 0;
    // 10¹¹ overflows a 32-bit int, so long long is needed for both v and sum
    for (long long v = 1; v <= 100000000000LL; v++) {
        sum += v;
    }
    return sum;
}

Alice and Bob are feeling euphoric that they could build something of their own in almost no time. Let's sneak into their workspace and listen to their conversation.

Alice: Let's run this code and find out the sum.
Bob: I ran this code a few minutes back but it's still not showing the
output. What's wrong with it?

Oops, something went wrong! A computer is the most deterministic


machine. Going back and trying to run it again won't help. So let's analyze
what's wrong with this simple code.

Two of the most valuable resources for a computer program


are time and memory.
The time taken by the computer to run code is:
Time to run code = number of instructions * time to execute each
instruction

The number of instructions depends on the code you used, and the time
taken to execute each code depends on your machine and compiler.

In this case, the total number of instructions executed (let's say x) is x = 1 + (10¹¹ + 1) + 10¹¹ + 1, which is x = 2 × 10¹¹ + 3.

Let us assume that a computer can execute y = 10⁸ instructions in one second (it can vary subject to machine configuration). The time taken to run the above code is

Time to run y instructions = 1 second

Time to run 1 instruction = 1 / y seconds

Time to run x instructions = x * (1/y) seconds = x / y seconds

Hence,

Time to run the code = x / y

= (2 × 10¹¹ + 3) / 10⁸ seconds ≈ 2000 seconds (greater than 33 minutes)

Is it possible to optimize the algorithm so that Alice and Bob do not have to
wait for 33 minutes every time they run this code?

I am sure that you already guessed the right method. The sum of
first N natural numbers is given by the formula:

Sum = N * (N + 1) / 2

Converting it into code will look something like this:

long long sum(long long N) {
    return N * (N + 1) / 2;   // a single closed-form expression, no loop
}

This code executes in just one instruction and gets the task done no matter
what the value is. Let it be greater than the total number of atoms in the
universe. It will find the result in no time.

The time taken to solve the problem, in this case, is 1/y (which is 10
nanoseconds). By the way, the fusion reaction of a hydrogen bomb takes
40-50 ns, which means your program will complete successfully even if
someone throws a hydrogen bomb on your computer at the same time you
ran your code. :)

Note: Computers take a few instructions (not 1) to compute multiplication


and division. I have said 1 just for the sake of simplicity.

More on Scalability

Scalability is scale plus ability, which means the quality of an


algorithm/system to handle the problem of larger size.

Consider the problem of setting up a classroom of 50 students. One of the


simplest solutions is to book a room, get a blackboard, a few chalks, and
the problem is solved.

But what if the size of the problem increases? What if the number of
students increased to 200?
The solution still holds but it needs more resources. In this case, you will
probably need a much larger room (probably a theater), a projector screen
and a digital pen.
What if the number of students increased to 1000?
The solution fails or uses a lot of resources when the size of the problem
increases. This means, your solution wasn't scalable.

What is a scalable solution then?


Consider a site like Khan Academy: millions of students can watch videos and read answers at the same time, and no more resources are required. So, the solution can solve problems of larger size under a resource crunch.
If you see our first solution to find the sum of first N natural numbers, it
wasn't scalable. It's because it required linear growth in time with the linear
growth in the size of the problem. Such algorithms are also known as
linearly scalable algorithms.
Our second solution was very scalable and didn't require the use of any
more time to solve a problem of larger size. These are known as constant-
time algorithms.

Memory is expensive

Memory is not always available in abundance. While dealing with


code/system which requires you to store or produce a lot of data, it is
critical for your algorithm to save the usage of memory wherever possible.
For example: While storing data about people , you can save memory by
storing only their date of birth, not their age. You can always calculate it on
the fly using their date of birth and current date.
Examples of an Algorithm's Efficiency

Here are some examples of what learning algorithms and data structures
enable you to do:

Example 1: Age Group Problem

Problems like finding the people of a certain age group can easily be
solved with a little modified version of the binary search
algorithm (assuming that the data is sorted).
The naive algorithm, which goes through all the persons one by one and checks if each falls in the given age group, is linearly scalable. Binary search, on the other hand, is a logarithmically scalable algorithm. This means that if the size of the problem is squared, the time taken to solve it is only doubled.

Suppose, it takes 1 second to find all the people at a certain age for a
group of 1000. Then for a group of 1 million people,

 the binary search algorithm will take only 2 seconds to solve the
problem

 the naive algorithm might take 1 million seconds, which is around 12


days

The same binary search algorithm is used to find the square root of a
number.
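To make that last claim concrete, here is a C++ sketch of finding the integer square root by binary searching on the answer (the function name isqrt is our own; this is an illustration, not a library routine):

```cpp
// Returns floor(sqrt(x)) for x ≥ 0 by binary search:
// mid*mid grows monotonically with mid, so we can halve the candidate range each step.
long long isqrt(long long x) {
    long long lo = 0, hi = x, ans = 0;
    while (lo <= hi) {
        long long mid = lo + (hi - lo) / 2;
        if (mid * mid <= x) {
            ans = mid;       // mid is feasible; remember it and try larger
            lo = mid + 1;
        } else {
            hi = mid - 1;    // mid² overshoots x; try smaller
        }
    }
    return ans;
}
```

Each iteration halves the search range, so this takes O(log x) steps instead of trying candidates one by one.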

Example 2: Rubik's Cube Problem

Imagine you are writing a program to find the solution of a Rubik's cube.
This cute looking puzzle has annoyingly 43,252,003,274,489,856,000
positions, and these are just positions! Imagine the number of paths one
can take to reach the wrong positions.

Fortunately, the way to solve this problem can be represented by the graph
data structure. There is a graph algorithm known as Dijkstra's
algorithm which allows you to solve this problem in linear time. Yes, you
heard it right. It means that it allows you to reach the solved position in a
minimum number of states.

Example 3: DNA Problem

DNA is a molecule that carries genetic information. It is made up of smaller units which are represented by the letters A, C, T, and G.

Imagine yourself working in the field of bioinformatics. You are assigned


the work of finding out the occurrence of a particular pattern in a DNA
strand.

It is a famous problem in computer science academia. And, the simplest


algorithm takes time proportional to

(number of characters in DNA strand) * (number of characters in pattern)

A typical DNA strand has millions of such units. Eh! Worry not. The KMP algorithm can get this done in time proportional to

(number of characters in DNA strand) + (number of characters in pattern)

Replacing the * operator with + makes a lot of difference.


Considering that the pattern was of 100 characters, your algorithm is now
100 times faster. If your pattern was of 1000 characters, the KMP algorithm
would be almost 1000 times faster. That is, if you were able to find the
occurrence of pattern in 1 second, it will now take you just 1 ms. We can
also put this in another way. Instead of matching 1 strand, you can match
1000 strands of similar length at the same time.

And there are infinite such stories...

Final Words

Generally, software development involves learning new technologies on a


daily basis. You get to learn most of these technologies while using them in
one of your projects. However, it is not the case with algorithms.

If you don't know algorithms well, you won't be able to identify if you can
optimize the code you are writing right now. You are expected to know
them in advance and apply them wherever possible and critical.

We specifically talked about the scalability of algorithms. A software system


consists of many such algorithms. Optimizing any one of them leads to a
better system.

However, it's important to note that this is not the only way to make a system scalable. For example, a technique known as distributed computing allows independent parts of a program to run on multiple machines together, making it even more scalable.
Asymptotic Analysis: Big-O
Notation and More
In this tutorial, you will learn what asymptotic notations are. Also, you will
learn about Big-O notation, Theta notation and Omega notation.

The efficiency of an algorithm depends on the amount of time, storage and


other resources required to execute the algorithm. The efficiency is
measured with the help of asymptotic notations.

An algorithm may not have the same performance for different types of
inputs. With the increase in the input size, the performance will change.

The study of change in performance of the algorithm with the change in the
order of the input size is defined as asymptotic analysis.

Asymptotic Notations

Asymptotic notations are the mathematical notations used to describe the


running time of an algorithm when the input tends towards a particular
value or a limiting value.

For example: In bubble sort, when the input array is already sorted, the
time taken by the algorithm is linear i.e. the best case.
But, when the input array is in reverse condition, the algorithm takes the
maximum time (quadratic) to sort the elements i.e. the worst case.

When the input array is neither sorted nor in reverse order, then it takes
average time. These durations are denoted using asymptotic notations.

There are mainly three asymptotic notations:

 Big-O notation

 Omega notation

 Theta notation

Big-O Notation (O-notation)

Big-O notation represents the upper bound of the running time of an


algorithm. Thus, it gives the worst-case complexity of an algorithm.

Big-O gives the upper bound of a function.
O(g(n)) = { f(n): there exist positive constants c and n0
such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n0 }

The above expression can be described as a function f(n) belongs to the


set O(g(n)) if there exists a positive constant c such that it lies
between 0 and cg(n) , for sufficiently large n .
For any value of n , the running time of an algorithm does not cross the time
provided by O(g(n)) .

Since it gives the worst-case running time of an algorithm, it is widely used


to analyze an algorithm as we are always interested in the worst-case
scenario.
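As a quick worked example of applying the definition (the function f(n) = 3n + 2 is our own illustration, not from the text):

```
f(n) = 3n + 2,  g(n) = n
Choose c = 4 and n0 = 2.
For all n ≥ 2:  0 ≤ 3n + 2 ≤ 3n + n = 4n = c·g(n)
Hence f(n) = O(n).
```

The constants c and n0 are not unique; any pair that makes the inequality hold for all sufficiently large n works.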

Omega Notation (Ω-notation)

Omega notation represents the lower bound of the running time of an


algorithm. Thus, it provides the best case complexity of an algorithm.
Omega gives the lower bound of a function.

Ω(g(n)) = { f(n): there exist positive constants c and n0


such that 0 ≤ cg(n) ≤ f(n) for all n ≥ n0 }

The above expression can be described as a function f(n) belongs to the


set Ω(g(n)) if there exists a positive constant c such that it lies
above cg(n) , for sufficiently large n .
For any value of n , the minimum time required by the algorithm is given by
Omega Ω(g(n)) .

Theta Notation (Θ-notation)

Theta notation encloses the function from above and below. Since it
represents the upper and the lower bound of the running time of an
algorithm, it is used for analyzing the average-case complexity of an
algorithm.
Theta bounds the function within constant factors.

For a function g(n) , Θ(g(n)) is given by the relation:

Θ(g(n)) = { f(n): there exist positive constants c1, c2 and n0


such that 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0 }

The above expression can be described as a function f(n) belongs to the


set Θ(g(n)) if there exist positive constants c1 and c2 such that it can be
sandwiched between c1g(n) and c2g(n) , for sufficiently large n.
If a function f(n) lies anywhere in between c1g(n) and c2g(n) for all n ≥ n0, then g(n) is said to be an asymptotically tight bound of f(n).

Master Theorem

In this tutorial, you will learn what master theorem is and how it is used for
solving recurrence relations.

The master method is a formula for solving recurrence relations of the form:

T(n) = aT(n/b) + f(n),


where,

n = size of input

a = number of subproblems in the recursion

n/b = size of each subproblem. All subproblems are assumed

to have the same size.

f(n) = cost of the work done outside the recursive call,

which includes the cost of dividing the problem and

cost of merging the solutions

Here, a ≥ 1 and b > 1 are constants, and f(n) is an asymptotically


positive function.

An asymptotically positive function means that for a sufficiently large value


of n , we have f(n) > 0 .

The master theorem is used in calculating the time complexity of


recurrence relations (divide and conquer algorithms) in a simple and quick
way.

Master Theorem

If a ≥ 1 and b > 1 are constants and f(n) is an asymptotically positive


function, then the time complexity of a recursive relation is given by

T(n) = aT(n/b) + f(n)


where, T(n) has the following asymptotic bounds:

1. If f(n) = O(n^(log_b a - ϵ)), then T(n) = Θ(n^(log_b a)).

2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log n).

3. If f(n) = Ω(n^(log_b a + ϵ)), then T(n) = Θ(f(n)).

ϵ > 0 is a constant.

Each of the above conditions can be interpreted as:

1. If the cost of solving the sub-problems at each level increases by a certain factor, the value of f(n) will become polynomially smaller than n^(log_b a). Thus, the time complexity is dominated by the cost of the last level, i.e. n^(log_b a).

2. If the cost of solving the sub-problems at each level is nearly equal, then the value of f(n) will be n^(log_b a). Thus, the time complexity will be f(n) times the total number of levels, i.e. n^(log_b a) · log n.

3. If the cost of solving the sub-problems at each level decreases by a certain factor, the value of f(n) will become polynomially larger than n^(log_b a). Thus, the time complexity is dominated by the cost of f(n).

Solved Example of Master Theorem

T(n) = 3T(n/2) + n²

Here,

a = 3
b = 2 (so n/b = n/2)

f(n) = n²

log_b a = log₂ 3 ≈ 1.58 < 2

i.e. f(n) = Ω(n^(log_b a + ϵ)), where ϵ ≈ 0.42 is a constant.

Case 3 applies here.

Thus, T(n) = Θ(f(n)) = Θ(n²)
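As a second check, consider an illustrative recurrence of our own that lands in case 2:

```
T(n) = 2T(n/2) + n

a = 2, b = 2, f(n) = n
log_b a = log₂ 2 = 1
f(n) = Θ(n¹) = Θ(n^(log_b a))   → case 2

Thus, T(n) = Θ(n^(log_b a) · log n) = Θ(n log n)
```

This is exactly the recurrence of merge sort, which is revisited in the divide and conquer section below.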

Master Theorem Limitations

The master theorem cannot be used if:

 T(n) is not monotone, e.g. T(n) = sin n

 f(n) is not a polynomial, e.g. f(n) = 2ⁿ

 a is not a constant, e.g. a = 2ⁿ

 a < 1

Divide and Conquer Algorithm

In this tutorial, you will learn how the divide and conquer algorithm works.
We will also compare the divide and conquer approach versus other
approaches to solve a recursive problem.

A divide and conquer algorithm is a strategy of solving a large problem


by
1. breaking the problem into smaller sub-problems

2. solving the sub-problems, and


3. combining them to get the desired output.

To use the divide and conquer algorithm, recursion is used. Learn about
recursion in different programming languages:
 Recursion in Java
 Recursion in Python
 Recursion in C++

How Do Divide and Conquer Algorithms Work?

Here are the steps involved:

1. Divide: Divide the given problem into sub-problems using recursion.


2. Conquer: Solve the smaller sub-problems recursively. If the
subproblem is small enough, then solve it directly.
3. Combine: Combine the solutions of the sub-problems that are part of
the recursive process to solve the actual problem.
Let us understand this concept with the help of an example.

Here, we will sort an array using the divide and conquer approach
(ie. merge sort).

1. Let the given array be:

Array for merge sort

2. Divide the array into two halves.

Divide the array into two subparts

3. Again, divide each subpart recursively into two halves until you get individual elements.

Divide the array into smaller subparts

4. Now, combine the individual elements in a sorted manner. Here, conquer and combine steps go side by side.

Combine the subparts
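The steps above can be sketched in C++ as a minimal merge sort over std::vector (the names are our own, and it allocates a temporary buffer per merge for clarity rather than speed):

```cpp
#include <vector>

// Sorts a[lo..hi) by divide (split at mid), conquer (recurse), combine (merge).
void mergeSort(std::vector<int>& a, int lo, int hi) {
    if (hi - lo < 2) return;                 // a single element is already sorted
    int mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);                   // conquer the left half
    mergeSort(a, mid, hi);                   // conquer the right half
    std::vector<int> merged;                 // combine the two sorted halves
    int i = lo, j = mid;
    while (i < mid && j < hi)
        merged.push_back(a[i] <= a[j] ? a[i++] : a[j++]);
    while (i < mid) merged.push_back(a[i++]);
    while (j < hi)  merged.push_back(a[j++]);
    for (int k = 0; k < (int)merged.size(); k++)
        a[lo + k] = merged[k];               // copy the merged run back
}
```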

Time Complexity

The complexity of the divide and conquer algorithm is calculated using


the master theorem.

T(n) = aT(n/b) + f(n),


where,

n = size of input

a = number of subproblems in the recursion

n/b = size of each subproblem. All subproblems are assumed to have the
same size.

f(n) = cost of the work done outside the recursive call, which includes
the cost of dividing the problem and cost of merging the solutions

Let us take an example to find the time complexity of a recursive problem.

For a merge sort, the equation can be written as:

T(n) = aT(n/b) + f(n)

= 2T(n/2) + O(n)

Where,

a = 2 (each time, a problem is divided into 2 subproblems)

n/b = n/2 (the size of each subproblem is half of the input)

f(n) = O(n) (the time taken to divide the problem and merge the subproblems)

Here, log_b a = log₂ 2 = 1 and f(n) = Θ(n¹), so case 2 of the master theorem applies.

Thus, T(n) = Θ(n^(log_b a) · log n)

≈ O(n log n)

Divide and Conquer Vs Dynamic approach

The divide and conquer approach divides a problem into smaller


subproblems; these subproblems are further solved recursively. The result
of each subproblem is not stored for future reference, whereas, in a
dynamic approach, the result of each subproblem is stored for future
reference.

Use the divide and conquer approach when the same subproblem is not
solved multiple times. Use the dynamic approach when the result of a
subproblem is to be used multiple times in the future.

Let us understand this with an example. Suppose we are trying to find the
Fibonacci series. Then,

Divide and Conquer approach:

fib(n)

If n < 2, return 1

Else, return fib(n - 1) + fib(n - 2)

Dynamic approach:

mem = []

fib(n)

If n in mem: return mem[n]

Else,

If n < 2, f = 1

Else, f = fib(n - 1) + fib(n - 2)

mem[n] = f

return f
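The dynamic approach above can be sketched in C++ like this (the mem map mirrors the pseudocode's mem list, and we follow its convention that fib(0) = fib(1) = 1):

```cpp
#include <unordered_map>

// Memoized Fibonacci: each subproblem is solved once and stored in mem.
long long fib(int n, std::unordered_map<int, long long>& mem) {
    auto it = mem.find(n);
    if (it != mem.end()) return it->second;            // reuse the stored result
    long long f = (n < 2) ? 1 : fib(n - 1, mem) + fib(n - 2, mem);
    mem[n] = f;                                        // store for future reference
    return f;
}
```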

In a dynamic approach, mem stores the result of each subproblem.

Advantages of Divide and Conquer Algorithm

 The complexity for the multiplication of two matrices using the naive method is O(n³), whereas using the divide and conquer approach (i.e. Strassen's matrix multiplication) it is O(n^2.8074). This approach also simplifies other problems, such as the Tower of Hanoi.
 This approach is suitable for multiprocessing systems.

 It makes efficient use of memory caches.

Divide and Conquer Applications

 Binary Search
 Merge Sort
 Quick Sort
 Strassen's Matrix multiplication

 Karatsuba Algorithm

Stack Data Structure

In this tutorial, you will learn about the stack data structure and its
implementation in Python, Java and C/C++.

A stack is a linear data structure that follows the principle of Last In First
Out (LIFO). This means the last element inserted inside the stack is
removed first.
You can think of the stack data structure as a pile of plates stacked on top of one another.
Stack representation similar to a pile of plates

Here, you can:

 Put a new plate on top

 Remove the top plate

And, if you want the plate at the bottom, you must first remove all the plates
on top. This is exactly how the stack data structure works.

LIFO Principle of Stack

In programming terms, putting an item on top of the stack is


called push and removing an item is called pop.

Stack Push and Pop Operations

In the above image, although item 3 was kept last, it was removed first.
This is exactly how the LIFO (Last In First Out) Principle works.
We can implement a stack in any programming language like C, C++, Java,
Python or C#, but the specification is pretty much the same.
Basic Operations of Stack

There are some basic operations that allow us to perform different actions
on a stack.

 Push: Add an element to the top of a stack


 Pop: Remove an element from the top of a stack
 IsEmpty: Check if the stack is empty
 IsFull: Check if the stack is full
 Peek: Get the value of the top element without removing it

Working of Stack Data Structure

The operations work as follows:

1. A pointer called TOP is used to keep track of the top element in the
stack.
2. When initializing the stack, we set its value to -1 so that we can
check if the stack is empty by comparing TOP == -1 .

3. On pushing an element, we increase the value of TOP and place the


new element in the position pointed to by TOP .

4. On popping an element, we return the element pointed to by TOP and


reduce its value.
5. Before pushing, we check if the stack is already full

6. Before popping, we check if the stack is already empty


Working of Stack Data Structure

Stack Time Complexity

For the array-based implementation of a stack, the push and pop


operations take constant time, i.e. O(1) .
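The six steps above can be sketched as a fixed-size, array-based stack in C++ (a minimal illustration; a production stack would handle overflow and underflow more gracefully):

```cpp
// Array-based stack with a TOP index, as described in the steps above.
struct Stack {
    static const int MAX = 100;
    int items[MAX];
    int top = -1;                                 // TOP == -1 means the stack is empty

    bool isEmpty() const { return top == -1; }
    bool isFull()  const { return top == MAX - 1; }
    void push(int x) { if (!isFull()) items[++top] = x; }   // check before pushing
    int  pop()       { return items[top--]; }     // caller checks isEmpty() first
    int  peek() const { return items[top]; }      // top element, not removed
};
```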

Applications of Stack Data Structure

Although stack is a simple data structure to implement, it is very powerful.


The most common uses of a stack are:

 To reverse a word - Put all the letters in a stack and pop them out.
Because of the LIFO order of stack, you will get the letters in reverse
order.
 In compilers - Compilers use the stack to calculate the value of
expressions like 2 + 4 / 5 * (7 - 9) by converting the expression to
prefix or postfix form.
 In browsers - The back button in a browser saves all the URLs you
have visited previously in a stack. Each time you visit a new page, it
is added on top of the stack. When you press the back button, the
current URL is removed from the stack, and the previous URL is
accessed.
Queue Data Structure

In this tutorial, you will learn what a queue is. Also, you will find
implementation of queue in C, C++, Java and Python.

A queue is a useful data structure in programming. It is similar to the ticket


queue outside a cinema hall, where the first person entering the queue is
the first person who gets the ticket.

Queue follows the First In First Out (FIFO) rule - the item that goes in first
is the item that comes out first.

FIFO Representation of Queue

In the above image, since 1 was kept in the queue before 2, it is the first to
be removed from the queue as well. It follows the FIFO rule.
In programming terms, putting items in the queue is called enqueue, and
removing items from the queue is called dequeue.
We can implement the queue in any programming language like C, C++,
Java, Python or C#, but the specification is pretty much the same.

Basic Operations of Queue

A queue is an object (an abstract data structure - ADT) that allows the
following operations:

 Enqueue: Add an element to the end of the queue


 Dequeue: Remove an element from the front of the queue
 IsEmpty: Check if the queue is empty
 IsFull: Check if the queue is full
 Peek: Get the value of the front of the queue without removing it

Working of Queue

Queue operations work as follows:

 two pointers, FRONT and REAR

 FRONT tracks the first element of the queue

 REAR tracks the last element of the queue

 initially, set the value of FRONT and REAR to -1
Enqueue Operation

 check if the queue is full

 for the first element, set the value of FRONT to 0


 increase the REAR index by 1
 add the new element in the position pointed to by REAR

Dequeue Operation

 check if the queue is empty

 return the value pointed by FRONT

 increase the FRONT index by 1


 for the last element, reset the values of FRONT and REAR to -1
Enqueue and Dequeue Operations
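The enqueue and dequeue steps above can be sketched as a fixed-size, array-based queue in C++ (a minimal illustration with our own names; error handling is kept to the checks described above):

```cpp
// Array-based queue with FRONT and REAR indices, as described above.
struct Queue {
    static const int MAX = 100;
    int items[MAX];
    int front = -1, rear = -1;                    // -1 means the queue is empty

    bool isEmpty() const { return front == -1; }
    bool isFull()  const { return rear == MAX - 1; }

    void enqueue(int x) {
        if (isFull()) return;                     // check if the queue is full
        if (front == -1) front = 0;               // first element: set FRONT to 0
        items[++rear] = x;                        // place x at the new REAR
    }

    int dequeue() {                               // caller checks isEmpty() first
        int x = items[front++];                   // take the value at FRONT
        if (front > rear) front = rear = -1;      // last element gone: reset the queue
        return x;
    }
};
```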
Limitations of Queue

As you can see in the image below, after a bit of enqueuing and dequeuing,
the size of the queue has been reduced.

Limitation of a queue

And we can add to indexes 0 and 1 only when the queue is reset (when all the elements have been dequeued).

After REAR reaches the last index, if we can store extra elements in the
empty spaces (0 and 1), we can make use of the empty spaces. This is
implemented by a modified queue called the circular queue.

Complexity Analysis

The complexity of the enqueue and dequeue operations in an array-based
queue is O(1) . If you use list.pop(n) in Python, the complexity can
be O(n) depending on the position of the item to be popped.

Applications of Queue

 CPU scheduling, Disk Scheduling

 When data is transferred asynchronously between two processes, the
queue is used for synchronization. For example: IO buffers, pipes,
file IO, etc.

 Handling of interrupts in real-time systems.


 Call Center phone systems use Queues to hold people calling them
in order.

Types of Queues

In this tutorial, you will learn about the different types of queues, along
with illustrations.

A queue is a useful data structure in programming. It is similar to the ticket


queue outside a cinema hall, where the first person entering the queue is
the first person who gets the ticket.
There are four different types of queues:

 Simple Queue

 Circular Queue

 Priority Queue

 Double Ended Queue

Simple Queue

In a simple queue, insertion takes place at the rear and removal occurs at
the front. It strictly follows the FIFO (First in First out) rule.

Simple Queue Representation


To learn more, visit Queue Data Structure.

Circular Queue

In a circular queue, the last element points to the first element making a
circular link.

Circular Queue Representation

The main advantage of a circular queue over a simple queue is better


memory utilization. If the last position is full and the first position is empty,
we can insert an element in the first position. This action is not possible in a
simple queue.

Priority Queue

A priority queue is a special type of queue in which each element is


associated with a priority and is served according to its priority. If elements
with the same priority occur, they are served according to their order in the
queue.
Priority Queue Representation

Insertion occurs based on the arrival of the values and removal occurs
based on priority.

To learn more, visit Priority Queue Data Structure.

Deque (Double Ended Queue)

In a double ended queue, insertion and removal of elements can be
performed from either the front or the rear. Thus, it does not follow the
FIFO (First In First Out) rule.

Deque Representation

Circular Queue Data Structure

In this tutorial, you will learn what a circular queue is. Also, you will find
implementation of circular queue in C, C++, Java and Python.

A circular queue is the extended version of a regular queue where the last
element is connected to the first element. Thus forming a circle-like
structure.
Circular queue representation

The circular queue solves the major limitation of the normal queue. In a
normal queue, after a bit of insertion and deletion, there will be non-usable
empty space.

Limitation of the regular Queue

Here, indexes 0 and 1 can only be used after resetting the queue (deletion
of all elements). This reduces the actual size of the queue.

How Circular Queue Works

Circular Queue works by the process of circular increment i.e. when we try
to increment the pointer and we reach the end of the queue, we start from
the beginning of the queue.

Here, the circular increment is performed by modulo division with the queue
size. That is,
if REAR + 1 == 5 (overflow!), REAR = (REAR + 1)%5 = 0 (start of queue)

Circular Queue Operations

The circular queue works as follows:

 two pointers FRONT and REAR

 FRONT tracks the first element of the queue


 REAR tracks the last element of the queue
 initially, set value of FRONT and REAR to -1
1. Enqueue Operation

 check if the queue is full

 for the first element, set value of FRONT to 0


 circularly increase the REAR index by 1 (i.e. if the rear reaches the
end, next it would be at the start of the queue)
 add the new element in the position pointed to by REAR

2. Dequeue Operation

 check if the queue is empty

 return the value pointed by FRONT

 circularly increase the FRONT index by 1


 for the last element, reset the values of FRONT and REAR to -1
However, the check for full queue has a new additional case:

 Case 1: FRONT == 0 && REAR == SIZE - 1

 Case 2: FRONT == REAR + 1

The second case happens when REAR has wrapped around to 0 due to circular
increment; when its value is just 1 less than FRONT , the queue is full.

Enqueue and Dequeue Operations


Circular Queue Complexity Analysis

The complexity of the enqueue and dequeue operations of a circular queue
is O(1) for array implementations.

Applications of Circular Queue

 CPU scheduling

 Memory management

 Traffic Management

Priority Queue

In this tutorial, you will learn what a priority queue is. Also, you will learn
about its implementations in Python, Java, C, and C++.

A priority queue is a special type of queue in which each element is


associated with a priority value. And, elements are served on the basis of
their priority. That is, higher priority elements are served first.
However, if elements with the same priority occur, they are served
according to their order in the queue.

Assigning Priority Value


Generally, the value of the element itself is considered for assigning the
priority. For example,

The element with the highest value is considered the highest priority
element. However, in other cases, we can assume the element with the
lowest value as the highest priority element.
We can also set priorities according to our needs.

Removing Highest
Priority Element

Difference between Priority Queue and Normal Queue

In a queue, the first-in-first-out rule is implemented whereas, in a priority


queue, the values are removed on the basis of priority. The element with
the highest priority is removed first.

Implementation of Priority Queue

Priority queue can be implemented using an array, a linked list, a heap data
structure, or a binary search tree. Among these data structures, heap data
structure provides an efficient implementation of priority queues.

Hence, we will be using the heap data structure to implement the priority
queue in this tutorial. A max-heap is used in the following
operations. If you want to learn more about it, please visit max-heap and
min-heap.
A comparative analysis of different implementations of priority queue is
given below.

Operations          peek    insert     delete

Linked List         O(1)    O(n)       O(1)

Binary Heap         O(1)    O(log n)   O(log n)

Binary Search Tree  O(1)    O(log n)   O(log n)

Priority Queue Operations

Basic operations of a priority queue are inserting, removing, and peeking


elements.

Before studying the priority queue, please refer to the heap data
structure for a better understanding of binary heap as it is used to
implement the priority queue in this article.

1. Inserting an Element into the Priority Queue

Inserting an element into a priority queue (max-heap) is done by the


following steps.

 Insert the new element at the end of the tree.

 Heapify the tree.
Algorithm for insertion of an element into priority queue (max-heap)

If there is no node,

create a newNode.

else (a node is already present)

insert the newNode at the end (last node from left to right.)

heapify the array

For Min Heap, the above algorithm is modified so that parentNode is always
smaller than newNode .

2. Deleting an Element from the Priority Queue

Deleting an element from a priority queue (max-heap) is done as follows:


 Select the element to be deleted.

 Swap it with the last element.

 Remove the last element.

 Heapify the tree.

Algorithm for deletion of an element in the priority queue (max-heap)

If nodeToBeDeleted is the leafNode

remove the node

Else swap nodeToBeDeleted with the lastLeafNode

remove nodeToBeDeleted

heapify the array

For Min Heap, the above algorithm is modified so that both childNodes are
greater than currentNode .

3. Peeking from the Priority Queue (Find max/min)

Peek operation returns the maximum element from Max Heap or minimum
element from Min Heap without deleting the node.

For both Max heap and Min Heap


return rootNode

4. Extract-Max/Min from the Priority Queue

Extract-Max returns the node with maximum value after removing it from a
Max Heap whereas Extract-Min returns the node with minimum value after
removing it from Min Heap.

Priority Queue Applications

Some of the applications of a priority queue are:

 Dijkstra's algorithm

 for implementing stack

 for load balancing and interrupt handling in an operating system

 for data compression in Huffman code

Deque Data Structure

In this tutorial, you will learn what a double ended queue (deque) is. Also,
you will find working examples of different operations on a deque in C, C++,
Java and Python.

Deque or Double Ended Queue is a type of queue in which insertion and


removal of elements can either be performed from the front or the rear.
Thus, it does not follow FIFO rule (First In First Out).

Representation of Deque
Types of Deque

 Input Restricted Deque


In this deque, input is restricted at a single end but allows deletion at
both the ends.
 Output Restricted Deque
In this deque, output is restricted at a single end but allows insertion
at both the ends.

Operations on a Deque

Below is the circular array implementation of deque. In a circular array, if


the array is full, we start from the beginning.
But in a linear array implementation, if the array is full, no more elements
can be inserted. In each of the operations below, if the array is full,
"overflow message" is thrown.

Before performing the following operations, these steps are followed.

1. Take an array (deque) of size n .


2. Set two pointers at the first position and set front = -1 and rear = 0 .
Initialize an array and pointers for deque

1. Insert at the Front

This operation adds an element at the front.

1. Check the position of front.

Check the position of front

2. If front < 1 , reinitialize front = n-1 (last index).

Shift front to the end


3. Else, decrease front by 1.
4. Add the new key 5 into array[front] .

Insert the element at the front
2. Insert at the Rear

This operation adds an element to the rear.

1. Check if the array is full.

Check if deque is full

2. If the deque is full, reinitialize rear = 0 .

3. Else, increase rear by 1.

Increase the rear


4. Add the new key 5 into array[rear] .

Insert the element at the rear
3. Delete from the Front

The operation deletes an element from the front.

1. Check if the deque is empty.

Check if deque is empty

2. If the deque is empty (i.e. front = -1 ), deletion cannot be performed


(underflow condition).
3. If the deque has only one element (i.e. front = rear ), set front = -1 and rear = -1 .

4. Else if front is at the end (i.e. front = n - 1 ), set front = 0 .

5. Else, front = front + 1 .

Increase the front


4. Delete from the Rear

This operation deletes an element from the rear.

1. Check if the deque is empty.

Check if deque is empty

2. If the deque is empty (i.e. front = -1 ), deletion cannot be performed


(underflow condition).
3. If the deque has only one element (i.e. front = rear ), set front = -1 and rear = -1 , else follow the steps below.


4. If rear is at the front (i.e. rear = 0 ), set rear = n - 1 .

5. Else, rear = rear - 1 .

Decrease the rear


5. Check Empty

This operation checks if the deque is empty. If front = -1 , the deque is empty.
6. Check Full

This operation checks if the deque is full. If front = 0 and rear = n - 1 OR front = rear + 1 , the deque is full.

Time Complexity

The time complexity of all the above operations is constant i.e. O(1) .

Applications of Deque Data Structure

1. In undo operations on software.

2. To store history in browsers.

3. For implementing both stacks and queues.

Linked list Data Structure

In this tutorial, you will learn about the linked list data structure and its
implementation in Python, Java, C, and C++.

A linked list is a linear data structure that includes a series of connected


nodes. Here, each node stores the data and the address of the next node.
For example,
Linked list Data Structure

You have to start somewhere, so we give the address of the first node a
special name called HEAD . Also, the last node in the linked list can be
identified because its next portion points to NULL .

Linked lists can be of multiple types: singly, doubly, and circular linked
list. In this article, we will focus on the singly linked list. To learn about
other types, visit Types of Linked List.

Note: You might have played the game Treasure Hunt, where each clue
includes the information about the next clue. That is how the linked list
operates.

Representation of Linked List

Let's see how each node of the linked list is represented. Each node
consists:

 A data item

 An address of another node

We wrap both the data item and the next node reference in a struct as:

struct node
{
int data;
struct node *next;
};

Understanding the structure of a linked list node is the key to having a


grasp on it.

Each struct node has a data item and a pointer to another struct node. Let
us create a simple Linked List with three items to understand how this
works.

/* Initialize nodes */
struct node *head;
struct node *one = NULL;
struct node *two = NULL;
struct node *three = NULL;

/* Allocate memory */
one = malloc(sizeof(struct node));
two = malloc(sizeof(struct node));
three = malloc(sizeof(struct node));

/* Assign data values */


one->data = 1;
two->data = 2;
three->data=3;

/* Connect nodes */
one->next = two;
two->next = three;
three->next = NULL;

/* Save address of first node in head */


head = one;

If you didn't understand any of the lines above, all you need is a refresher
on pointers and structs.
In just a few steps, we have created a simple linked list with three nodes.
Linked list Representation

The power of a linked list comes from the ability to break the chain and
rejoin it. E.g. if you wanted to put an element 4 between 1 and 2, the steps
would be:

 Create a new struct node and allocate memory to it.

 Add its data value as 4

 Point its next pointer to the struct node containing 2 as the data value

 Change the next pointer of "1" to the node we just created.

Linked List Complexity

Time Complexity

Worst case Average Case

Search O(n) O(n)

Insert O(1) O(1)

Deletion O(1) O(1)

Space Complexity: O(n)

Linked List Applications

 Dynamic memory allocation

 Implemented in stack and queue


 In undo functionality of software
 Hash tables, Graphs

Linked List Operations: Traverse, Insert and Delete

In this tutorial, you will learn different operations on a linked list. Also, you
will find implementation of linked list operations in C/C++, Python and Java.

There are various linked list operations that allow us to perform different
actions on linked lists. For example, the insertion operation adds a new
element to the linked list.

Here's a list of basic linked list operations that we will cover in this article.

 Traversal - access each element of the linked list


 Insertion - adds a new element to the linked list
 Deletion - removes the existing elements
 Search - find a node in the linked list
 Sort - sort the nodes of the linked list
Before you learn about linked list operations in detail, make sure to know
about Linked List first.
Things to Remember about Linked List

 head points to the first node of the linked list


 next pointer of the last node is NULL , so if the next pointer of the
current node is NULL , we have reached the end of the linked list.
In all of the examples, we will assume that the linked list has three
nodes 1 --->2 --->3 with node structure as below:

struct node {
int data;
struct node *next;
};
Traverse a Linked List

Displaying the contents of a linked list is very simple. We keep moving the
temp node to the next one and display its contents.

When temp is NULL , we know that we have reached the end of the linked list
so we get out of the while loop.

struct node *temp = head;


printf("\n\nList elements are - \n");
while(temp != NULL) {
printf("%d --->",temp->data);
temp = temp->next;
}

The output of this program will be:

List elements are -


1 --->2 --->3 --->

Insert Elements to a Linked List

You can add elements to either the beginning, middle or end of the linked
list.

1. Insert at the beginning

 Allocate memory for new node

 Store data

 Change next of new node to point to head

 Change head to point to recently created node

struct node *newNode;


newNode = malloc(sizeof(struct node));
newNode->data = 4;
newNode->next = head;
head = newNode;

2. Insert at the End

 Allocate memory for new node

 Store data

 Traverse to last node

 Change next of last node to recently created node

struct node *newNode;


newNode = malloc(sizeof(struct node));
newNode->data = 4;
newNode->next = NULL;

struct node *temp = head;


while(temp->next != NULL){
temp = temp->next;
}

temp->next = newNode;

3. Insert at the Middle

 Allocate memory and store data for new node

 Traverse to node just before the required position of new node

 Change next pointers to include new node in between

struct node *newNode;


newNode = malloc(sizeof(struct node));
newNode->data = 4;

struct node *temp = head;

for(int i=2; i < position; i++) {


if(temp->next != NULL) {
temp = temp->next;
}
}
newNode->next = temp->next;
temp->next = newNode;

Delete from a Linked List

You can delete either from the beginning, end or from a particular position.

1. Delete from beginning

 Point head to the second node

head = head->next;

2. Delete from end

 Traverse to second last element

 Change its next pointer to null

struct node* temp = head;


while(temp->next->next!=NULL){
temp = temp->next;
}
temp->next = NULL;

3. Delete from middle

 Traverse to element before the element to be deleted

 Change next pointers to exclude the node from the chain


struct node *temp = head;
for(int i=2; i< position; i++) {
if(temp->next!=NULL) {
temp = temp->next;
}
}

temp->next = temp->next->next;

Search an Element on a Linked List

You can search for an element in a linked list using a loop, following the
steps below. Here, we are searching for item in the linked list.
 Make head as the current node.
 Run a loop until the current node is NULL because the last element
points to NULL .

 In each iteration, check if the data of the node is equal to item . If it
matches, return true ; if the loop ends without a match, return false .

// Search a node
bool searchNode(struct Node** head_ref, int key) {
struct Node* current = *head_ref;

while (current != NULL) {


if (current->data == key) return true;
current = current->next;
}
return false;
}
Sort Elements of a Linked List

We will use a simple sorting algorithm, Bubble Sort, to sort the elements of
a linked list in ascending order below.
1. Make the head as the current node and create another
node index for later use.
2. If head is null, return.
3. Else, run a loop till the last node (i.e. NULL ).

4. In each iteration, follow the following step 5-6.

5. Store the next node of current in index .

6. Check if the data of the current node is greater than the data of the
next node. If it is greater, swap the data of current and index .

Types of Linked List - Singly linked, doubly linked and

circular

In this tutorial, you will learn different types of linked list. Also, you will find
implementation of linked list in C.

Before you learn about the type of the linked list, make sure you know
about the LinkedList Data Structure.
There are three common types of Linked List.

1. Singly Linked List


2. Doubly Linked List
3. Circular Linked List
Singly Linked List

It is the most common. Each node has data and a pointer to the next node.

Singly linked list

Node is represented as:

struct node {
int data;
struct node *next;
};

A three-member singly linked list can be created as:

/* Initialize nodes */
struct node *head;
struct node *one = NULL;
struct node *two = NULL;
struct node *three = NULL;

/* Allocate memory */
one = malloc(sizeof(struct node));
two = malloc(sizeof(struct node));
three = malloc(sizeof(struct node));

/* Assign data values */


one->data = 1;
two->data = 2;
three->data = 3;

/* Connect nodes */
one->next = two;
two->next = three;
three->next = NULL;

/* Save address of first node in head */


head = one;
Doubly Linked List

We add a pointer to the previous node in a doubly-linked list. Thus, we can


go in either direction: forward or backward.

Doubly linked list

A node is represented as

struct node {
int data;
struct node *next;
struct node *prev;
};

A three-member doubly linked list can be created as

/* Initialize nodes */
struct node *head;
struct node *one = NULL;
struct node *two = NULL;
struct node *three = NULL;

/* Allocate memory */
one = malloc(sizeof(struct node));
two = malloc(sizeof(struct node));
three = malloc(sizeof(struct node));

/* Assign data values */


one->data = 1;
two->data = 2;
three->data = 3;

/* Connect nodes */
one->next = two;
one->prev = NULL;

two->next = three;
two->prev = one;

three->next = NULL;
three->prev = two;

/* Save address of first node in head */


head = one;

If you want to learn more about it, please visit doubly linked list and
operations on it.

Circular Linked List

A circular linked list is a variation of a linked list in which the last element is
linked to the first element. This forms a circular loop.

Circular linked list

A circular linked list can be either singly linked or doubly linked.

 In a singly linked list, the next pointer of the last item points to the
first item.

 In a doubly linked list, the prev pointer of the first item points to the
last item as well.
A three-member circular singly linked list can be created as:
/* Initialize nodes */
struct node *head;
struct node *one = NULL;
struct node *two = NULL;
struct node *three = NULL;

/* Allocate memory */
one = malloc(sizeof(struct node));
two = malloc(sizeof(struct node));
three = malloc(sizeof(struct node));

/* Assign data values */


one->data = 1;
two->data = 2;
three->data = 3;

/* Connect nodes */
one->next = two;
two->next = three;
three->next = one;

/* Save address of first node in head */


head = one;

Hash Table
In this tutorial, you will learn what hash table is. Also, you will find working
examples of hash table operations in C, C++, Java and Python.

The Hash table data structure stores elements in key-value pairs where

 Key - a unique integer that is used for indexing the values

 Value - data that is associated with keys

Key and Value in Hash table


Hashing (Hash Function)

In a hash table, a new index is computed using the key. And, the element
corresponding to that key is stored in the index. This process is
called hashing.
Let k be a key and h(x) be a hash function.
Here, h(k) will give us a new index to store the element linked with k

Hash table Representation

Hash Collision

When the hash function generates the same index for multiple keys, there
will be a conflict (what value to be stored in that index). This is called
a hash collision.
We can resolve the hash collision using one of the following techniques.

 Collision resolution by chaining


 Open Addressing: Linear/Quadratic Probing and Double Hashing

1. Collision resolution by chaining

In chaining, if a hash function produces the same index for multiple


elements, these elements are stored in the same index by using a doubly-
linked list.

If j is the slot for multiple elements, it contains a pointer to the head of the
list of elements. If no element is present, j contains NIL .

Collision resolution using chaining

Pseudocode for operations

chainedHashSearch(T, k)
return T[h(k)]
chainedHashInsert(T, x)
T[h(x.key)] = x //insert at the head
chainedHashDelete(T, x)
T[h(x.key)] = NIL
2. Open Addressing

Unlike chaining, open addressing doesn't store multiple elements into the
same slot. Here, each slot is either filled with a single key or left NIL .

Different techniques used in open addressing are:

i. Linear Probing

In linear probing, collision is resolved by checking the next slot.

h(k, i) = (h′(k) + i) mod m

where

 i = {0, 1, ….}

 h'(k) is a new hash function


If a collision occurs at h(k, 0) , then h(k, 1) is checked. In this way, the
value of i is incremented linearly.
The problem with linear probing is that a cluster of adjacent slots is filled.
When inserting a new element, the entire cluster must be traversed. This
adds to the time required to perform operations on the hash table.

ii. Quadratic Probing

It works similar to linear probing but the spacing between the slots is
increased (greater than one) by using the following relation.

h(k, i) = (h′(k) + c1*i + c2*i^2) mod m

where,

 c1 and c2 are positive auxiliary constants,


 i = {0, 1, ….}

iii. Double hashing

If a collision occurs after applying a hash function h(k) , then another hash
function is calculated for finding the next slot.
h(k, i) = (h1(k) + ih2(k)) mod m
Good Hash Functions

A good hash function may not prevent collisions completely; however, it
can reduce the number of collisions.

Here, we will look into different methods to find a good hash function

1. Division Method

If k is a key and m is the size of the hash table, the hash function h() is
calculated as:
h(k) = k mod m

For example, if the size of a hash table is 10 and k = 112 then h(k) =
112 mod 10 = 2 . The value of m must not be a power of 2 . This is
because the powers of 2 in binary format are 10, 100, 1000, … . When we
find k mod m , we will always get the lower order p bits.

if m = 2^2, k = 17, then h(k) = 17 mod 2^2 = 10001 mod 100 = 01

if m = 2^3, k = 17, then h(k) = 17 mod 2^3 = 10001 mod 1000 = 001

if m = 2^4, k = 17, then h(k) = 17 mod 2^4 = 10001 mod 10000 = 0001

if m = 2^p, then h(k) gives the p lower order bits of k

2. Multiplication Method

h(k) = ⌊m(kA mod 1)⌋

where,

 kA mod 1 gives the fractional part of kA ,

 ⌊ ⌋ gives the floor value

 A is any constant. The value of A lies between 0 and 1. But, an
optimal choice will be A ≈ (√5 - 1)/2, as suggested by Knuth.
3. Universal Hashing

In Universal hashing, the hash function is chosen at random independent of


keys.

Applications of Hash Table

Hash tables are implemented where

 constant time lookup and insertion is required

 cryptographic applications

 indexing data is required

Heap Data Structure


In this tutorial, you will learn what heap data structure is. Also, you will find
working examples of heap operations in C, C++, Java and Python.

Heap data structure is a complete binary tree that satisfies the heap
property, where any given node is
 always greater than its child node/s and the key of the root node is
the largest among all other nodes. This property is also called max
heap property.
 always smaller than the child node/s and the key of the root node is
the smallest among all other nodes. This property is also called min
heap property.

Max-heap

Min-heap

This type of data structure is also called a binary heap.


Heap Operations

Some of the important operations performed on a heap are described


below along with their algorithms.

Heapify

Heapify is the process of creating a heap data structure from a binary tree.
It is used to create a Min-Heap or a Max-Heap.

1. Let the input array be

Initial Array

2. Create a complete binary tree from the array

Complete binary tree

3. Start from the index of the first non-leaf node, which is given
by n/2 - 1 .

Start from the first non-leaf node
4. Set current element i as largest .

5. The index of left child is given by 2i + 1 and the right child is given
by 2i + 2 .

If leftChild is greater than currentElement (i.e. the element at
the ith index), set leftChildIndex as largest .

If rightChild is greater than the element at largest ,
set rightChildIndex as largest .

6. Swap largest with currentElement

Swap if necessary
7. Repeat steps 3-7 until the subtrees are also heapified.

Algorithm

Heapify(array, size, i)
set i as largest
leftChild = 2i + 1
rightChild = 2i + 2

if array[leftChild] > array[largest]
set leftChildIndex as largest
if array[rightChild] > array[largest]
set rightChildIndex as largest
swap array[i] and array[largest]

To create a Max-Heap:

MaxHeap(array, size)
loop from the first index of non-leaf node down to zero
call heapify

For Min-Heap, both leftChild and rightChild must be larger than the
parent for all nodes.

Insert Element into Heap

Algorithm for insertion in Max Heap

If there is no node,
create a newNode.
else (a node is already present)
insert the newNode at the end (last node from left to right.)

heapify the array

1. Insert the new element at the end of the tree.

Insert at the end


2. Heapify the tree.

Heapify the array
For Min Heap, the above algorithm is modified so that parentNode is always
smaller than newNode .

Delete Element from Heap

Algorithm for deletion in Max Heap

If nodeToBeDeleted is the leafNode


remove the node
Else swap nodeToBeDeleted with the lastLeafNode
remove nodeToBeDeleted

heapify the array


1. Select the element to be deleted.

Select the element to be deleted

2. Swap it with the last element.

Swap with the last element

3. Remove the last element.

Remove the last element

4. Heapify the tree.


Heapify the array
For Min Heap, the above algorithm is modified so that both childNodes are
greater than currentNode .

Peek (Find max/min)

Peek operation returns the maximum element from Max Heap or minimum
element from Min Heap without deleting the node.

For both Max heap and Min Heap

return rootNode

Extract-Max/Min

Extract-Max returns the node with maximum value after removing it from a
Max Heap whereas Extract-Min returns the node with minimum value after
removing it from a Min Heap.
Heap Data Structure Applications

 Heap is used while implementing a priority queue.

 Dijkstra's Algorithm

 Heap Sort

Fibonacci Heap
In this tutorial, you will learn what a Fibonacci Heap is. Also, you will find
working examples of different operations on a fibonacci heap in C, C++,
Java and Python.
A fibonacci heap is a data structure that consists of a collection of trees
which follow min heap or max heap property. We have already
discussed min heap and max heap property in the Heap Data
Structure article. These two properties are the characteristics of the trees
present on a fibonacci heap.
In a fibonacci heap, a node can have more than two children or no children
at all. Also, it supports more efficient heap operations than binomial and
binary heaps.

The fibonacci heap is called a fibonacci heap because the trees are
constructed in a way such that a tree of order n has at least Fn+2 nodes in it,
where Fn+2 is the (n + 2)th Fibonacci number.

Fibonacci Heap

Properties of a Fibonacci Heap

Important properties of a Fibonacci heap are:

1. It is a set of min heap-ordered trees. (i.e. The parent is always


smaller than the children.)
2. A pointer is maintained at the minimum element node.

3. It consists of a set of marked nodes. (Decrease key operation)

4. The trees within a Fibonacci heap are unordered but rooted.


Memory Representation of the Nodes in a Fibonacci Heap

The roots of all the trees are linked together for faster access. The child
nodes of a parent node are connected to each other through a circular
doubly linked list as shown below.

There are two main advantages of using a circular doubly linked list.

1. Deleting a node from the tree takes O(1) time.


2. The concatenation of two such lists takes O(1) time.

Fibonacci Heap Structure

Operations on a Fibonacci Heap

Insertion

Algorithm

insert(H, x)
degree[x] = 0
p[x] = NIL
child[x] = NIL
left[x] = x
right[x] = x
mark[x] = FALSE
concatenate the root list containing x with root list H
if min[H] == NIL or key[x] < key[min[H]]
then min[H] = x
n[H] = n[H] + 1

Inserting a node into an already existing heap follows the steps below.

1. Create a new node for the element.

2. Check if the heap is empty.

3. If the heap is empty, set the new node as a root node and mark
it min .

4. Else, insert the node into the root list and update min .

Insertion Example
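The insertion steps above can be sketched in Python. This is a minimal sketch with illustrative Node/FibHeap class names (not from any particular library); each node lives in a circular doubly linked root list and only the min pointer may change:

```python
# A minimal sketch of Fibonacci heap insertion, assuming illustrative
# Node/FibHeap class names (not from any particular library).

class Node:
    def __init__(self, key):
        self.key = key
        self.degree = 0
        self.parent = None
        self.child = None
        self.mark = False
        self.left = self.right = self   # circular doubly linked list of one

class FibHeap:
    def __init__(self):
        self.min = None
        self.n = 0

    def insert(self, key):
        x = Node(key)
        if self.min is None:
            self.min = x                # heap was empty: x starts the root list
        else:
            # splice x into the root list, just left of min (O(1))
            x.right = self.min
            x.left = self.min.left
            self.min.left.right = x
            self.min.left = x
            if x.key < self.min.key:    # keep the min pointer up to date
                self.min = x
        self.n += 1
        return x

h = FibHeap()
for k in (7, 3, 17):
    h.insert(k)
print(h.min.key)  # 3
```

Every insertion is O(1): no restructuring happens until a later extract-min.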

Find Min

The minimum element is always given by the min pointer.


Union

Union of two fibonacci heaps consists of following steps.

1. Concatenate the root lists of both heaps.

2. Update min by selecting the minimum key from the new root list.

Union of two heaps
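The two union steps can be sketched as follows; the Node class and ring() helper are hypothetical, and the point of the sketch is that concatenating two circular doubly linked lists needs only four pointer updates, i.e. O(1):

```python
# A hedged sketch of the union operation (the Node class and ring() helper
# are hypothetical, for illustration only).

class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self

def ring(keys):
    """Build a circular doubly linked root list; return its minimum node."""
    nodes = [Node(k) for k in keys]
    for i, nd in enumerate(nodes):
        nxt = nodes[(i + 1) % len(nodes)]
        nd.right, nxt.left = nxt, nd
    return min(nodes, key=lambda nd: nd.key)

def union(min_a, min_b):
    """Concatenate two root lists given their min nodes; return the new min."""
    if min_a is None:
        return min_b
    if min_b is None:
        return min_a
    a_right, b_left = min_a.right, min_b.left
    min_a.right, min_b.left = min_b, min_a        # splice the two rings:
    b_left.right, a_right.left = a_right, b_left  # four pointer updates, O(1)
    return min_a if min_a.key <= min_b.key else min_b

m = union(ring([7, 18]), ring([3, 52]))
print(m.key)  # 3
```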

Extract Min

It is the most important operation on a Fibonacci heap. In this operation, the
node with the minimum value is removed from the heap and the tree is
readjusted.

The following steps are followed:

1. Delete the min node.

2. Set the min-pointer to the next root in the root list.

3. Create an array of size equal to the maximum degree of the trees in
the heap before deletion.

4. Do the following (steps 5-7) until no two roots have the same degree.

5. Map the degree of the current root (min-pointer) to a slot in the array.

6. Map the degree of the next root to a slot in the array.

7. If two roots map to the same degree, then apply the
union operation to those roots such that the min-heap property is
maintained (i.e. the minimum is at the root).

An implementation of the above steps can be understood in the example below.

1. We will perform an extract-min operation on the heap below.

Fibonacci Heap

2. Delete the min node, add all its child nodes to the root list and set the
min-pointer to the next root in the root list.

Delete the min node

3. The maximum degree in the tree is 3. Create an array of size 4 and
map the degrees of the roots to the array.
Create an array

4. Here, 23 and 7 have the same degrees, so unite them.

Unite those having the same degrees

5. Again, 7 and 17 have the same degrees, so unite them as well.

Unite those having the same degrees
6. Again 7 and 24 have the same degree, so unite them.

Unite those having the same degrees

7. Map the next nodes.

Map the remaining nodes

8. Again, 52 and 21 have the same degree, so unite them.

Unite those having the same degrees


9. Similarly, unite 21 and 18.

Unite those having the same degrees

10. Map the remaining root.

Map the remaining nodes

11. The final heap is shown below.

Final fibonacci heap
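The extract-min steps above can be sketched in Python. For readability this hypothetical version keeps the root list in a plain Python list rather than a circular doubly linked list (so node removal is not O(1) as in the real structure); the consolidation logic of steps 3-7 is the same:

```python
# A sketch of extract-min with consolidation (hypothetical class names).
# The root list is a plain Python list here for brevity; a real Fibonacci
# heap uses a circular doubly linked list.

class Node:
    def __init__(self, key):
        self.key = key
        self.degree = 0
        self.children = []

class FibHeap:
    def __init__(self):
        self.roots = []

    def insert(self, key):
        self.roots.append(Node(key))

    def extract_min(self):
        # delete the min node and promote its children to the root list
        m = min(self.roots, key=lambda nd: nd.key)
        self.roots.remove(m)
        self.roots.extend(m.children)
        # consolidate: link roots until every degree appears at most once
        by_degree = {}
        for x in self.roots:
            while x.degree in by_degree:
                y = by_degree.pop(x.degree)
                if y.key < x.key:
                    x, y = y, x           # the smaller key stays a root
                x.children.append(y)      # y becomes a child of x
                x.degree += 1
            by_degree[x.degree] = x
        self.roots = list(by_degree.values())
        return m.key

h = FibHeap()
for k in (7, 3, 17, 24):
    h.insert(k)
print(h.extract_min())  # 3
```

Repeated calls keep returning keys in increasing order, consolidating the remaining roots each time.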


Decreasing a Key and Deleting a Node

These are the most important operations which are discussed in Decrease
Key and Delete Node Operations.

Complexities

Insertion O(1)

Find Min O(1)

Union O(1)

Extract Min O(log n)

Decrease Key O(1)

Delete Node O(log n)

Fibonacci Heap Applications

1. To improve the asymptotic running time of Dijkstra's algorithm.

Decrease Key and Delete Node Operations on a Fibonacci Heap

In this tutorial, you will learn how decrease key and delete node operations
work. Also, you will find working examples of these operations on a
fibonacci heap in C, C++, Java and Python.
A Fibonacci heap is a tree-based data structure that consists of a
collection of trees with the min-heap or max-heap property. Its operations are
more efficient in terms of time complexity than those of similar data
structures like the binomial heap and the binary heap.
Now, we will discuss two of its important operations.

1. Decrease a key: decreases the value of a key to a lower value
2. Delete a node: deletes the given node
Decreasing a Key

In the decrease-key operation, the value of a key is decreased to a lower value.

The following functions are used for decreasing the key.

Decrease-Key

1. Select the node to be decreased, x , and change its value to the new
value k .
2. If the parent of x , y , is not null and the key of the parent is greater
than k , then call Cut(x) and Cascading-Cut(y) subsequently.
3. If the key of x is smaller than the key of min, then mark x as min.
Cut

1. Remove x from the current position and add it to the root list.
2. If x is marked, then mark it as false.
Cascading-Cut

1. If the parent of y is not null, then follow the steps below.
2. If y is unmarked, then mark y .
3. Else, call Cut(y) and Cascading-Cut(parent of y) .
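The three functions above can be sketched as follows (a minimal illustration with hypothetical class names; child lists are plain Python lists for brevity):

```python
# A minimal sketch of Decrease-Key, Cut and Cascading-Cut
# (hypothetical class names, illustration only).

class Node:
    def __init__(self, key):
        self.key = key
        self.parent = None
        self.children = []
        self.mark = False

class FibHeap:
    def __init__(self):
        self.roots = []
        self.min = None

    def _cut(self, x):
        # remove x from under its parent, add it to the root list, unmark it
        x.parent.children.remove(x)
        x.parent = None
        x.mark = False
        self.roots.append(x)

    def _cascading_cut(self, y):
        # a marked node that loses a second child is cut as well, recursively
        if y.parent is not None:
            if not y.mark:
                y.mark = True
            else:
                p = y.parent
                self._cut(y)
                self._cascading_cut(p)

    def decrease_key(self, x, k):
        x.key = k
        y = x.parent
        if y is not None and x.key < y.key:   # min-heap order violated
            self._cut(x)
            self._cascading_cut(y)
        if self.min is None or x.key < self.min.key:
            self.min = x

# mirror the first example below: decrease 46 (a child of 24) to 15
parent, child = Node(24), Node(46)
child.parent = parent
parent.children.append(child)
h = FibHeap()
h.roots.append(parent)
h.min = parent
h.decrease_key(child, 15)
print(h.min.key)  # 15
```

In this tiny sketch 24 is itself a root, so Cascading-Cut does not mark it; in the worked example that follows, 24 has a parent and therefore gets marked.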
Decrease Key Example

The above operations can be understood in the examples below.

Example: Decreasing 46 to 15.

1. Decrease the value 46 to 15.

Decrease 46 to 15

2. Cut part: Since 24 ≠ nil and 15 < its parent, cut 15 and add it to
the root list. Cascading-Cut part: mark 24.

Add 15 to root list and mark 24


Example: Decreasing 35 to 5

1. Decrease the value 35 to 5.

Decrease 35 to 5

2. Cut part: Since 26 ≠ nil and 5 < its parent, cut 5 and add it to the
root list.
Cut 5 and add it to root list
3. Cascading-Cut part: Since 26 is marked, the flow goes
to Cut and Cascading-Cut .

Cut(26): Cut 26 and add it to the root list and mark it as false.

Cut 26 and add it to the root list
Cascading-Cut(24):
Since 24 is also marked, again call Cut(24) and Cascading-Cut(7) .
These operations result in the tree below.

Cut 24 and add it to root list

4. Since 5 < 7, mark 5 as min.

Mark 5 as min

Deleting a Node

This process makes use of the decrease-key and extract-min operations. The
following steps are followed for deleting a node.
1. Let k be the node to be deleted.
2. Apply decrease-key operation to decrease the value of k to the
lowest possible value (i.e. -∞).
3. Apply extract-min operation to remove this node.
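The composition of the two operations can be sketched as follows. This is a deliberately simplified illustration (hypothetical names; a flat root list and a decrease-key with no cuts), meant only to show how delete reduces to decrease-key plus extract-min:

```python
import math

# A simplified sketch: delete = decrease-key to -infinity + extract-min.
# (Hypothetical names; a real Fibonacci heap's decrease_key may also
# perform cut and cascading-cut.)

class FibHeap:
    def __init__(self):
        self.nodes = []

    def insert(self, key):
        node = [key]                 # one-element list so the key is mutable
        self.nodes.append(node)
        return node

    def decrease_key(self, node, k):
        node[0] = k

    def extract_min(self):
        m = min(self.nodes, key=lambda nd: nd[0])
        self.nodes.remove(m)
        return m[0]

    def delete(self, node):
        self.decrease_key(node, -math.inf)   # node becomes the minimum
        self.extract_min()                   # so extract-min removes it

h = FibHeap()
a, b, c = h.insert(7), h.insert(3), h.insert(17)
h.delete(b)
print(sorted(nd[0] for nd in h.nodes))  # [7, 17]
```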
Complexities

Decrease Key O(1)

Delete Node O(log n)


Tree Data Structure

In this tutorial, you will learn about tree data structure. Also, you will learn
about different types of trees and the terminologies used in tree.

A tree is a nonlinear hierarchical data structure that consists of nodes
connected by edges.

A Tree

Why Tree Data Structure?

Other data structures such as arrays, linked lists, stacks, and queues are
linear data structures that store data sequentially. In a linear data
structure, the time taken to perform an operation grows with the
size of the data, which is often not acceptable in today's computational
world.

Different tree data structures allow quicker and easier access to the data
because a tree is a non-linear data structure.

Tree Terminologies

Node

A node is an entity that contains a key or value and pointers to its child
nodes.
The last nodes of each path are called leaf nodes or external nodes; they
do not contain a link/pointer to child nodes.
A node having at least one child node is called an internal node.
Edge

It is the link between any two nodes.

Nodes and edges of a tree

Root

It is the topmost node of a tree.

Height of a Node

The height of a node is the number of edges from the node to the deepest
leaf (i.e. the longest path from the node to a leaf node).

Depth of a Node

The depth of a node is the number of edges from the root to the node.

Height of a Tree

The height of a tree is the height of the root node, i.e. the depth of the
deepest node.
Height and depth of each node in a tree
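The height and depth definitions can be illustrated with a short sketch (the Node class is hypothetical):

```python
# A short sketch of height and depth (the Node class is hypothetical).

class Node:
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

def height(node):
    # number of edges on the longest downward path from node to a leaf
    if not node.children:
        return 0
    return 1 + max(height(c) for c in node.children)

def depth(root, target, d=0):
    # number of edges from the root down to target (None if absent)
    if root is target:
        return d
    for c in root.children:
        found = depth(c, target, d + 1)
        if found is not None:
            return found
    return None

leaf = Node("d")
root = Node("a", [Node("b", [leaf]), Node("c")])
print(height(root), depth(root, leaf))  # 2 2
```

Note that the height of the tree equals the height of its root node, matching the definition above.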

Degree of a Node

The degree of a node is the total number of branches of that node.

Forest

A collection of disjoint trees is called a forest.

Creating forest from a tree

You can create a forest by cutting the root of a tree.

Types of Tree

1. Binary Tree
2. Binary Search Tree
3. AVL Tree
4. B-Tree
Tree Traversal

In order to perform any operation on a tree, you need to reach the
specific node. A tree traversal algorithm helps in visiting a required node
in the tree.

To learn more, please visit tree traversal.

Tree Applications

 Binary Search Trees (BSTs) are used to quickly check whether an
element is present in a set or not.

 Heap is a kind of tree that is used for heap sort.

 A modified version of a tree called a trie is used in modern routers to
store routing information.

 Most popular databases use B-Trees and T-Trees, which are variants
of the tree structure we learned above, to store their data.

 Compilers use a syntax tree to validate the syntax of every program
you write.

You might also like