data structure tree
A tree is recursively defined as a set of one or more nodes where one node is designated as the root
of the tree and all the remaining nodes can be partitioned into non-empty sets each of which is a
sub-tree of the root.
Figure 9.1 shows a tree where node A is the root node; nodes B, C, and D are children of the root
node and form sub-trees of the tree rooted at node A.
Root node The root node R is the topmost node in the tree. If R = NULL, then it means the tree is
empty.
Sub-trees If the root node R is not NULL, then the trees T1, T2, and T3 are called the sub-trees of R.
Leaf node A node that has no children is called the leaf node or the terminal node.
Path A sequence of consecutive edges is called a path. For example, in Fig. 9.1, the path from the
root node A to node I is given as: A, D, and I.
Ancestor node An ancestor of a node is any predecessor node on the path from root to that node.
The root node does not have any ancestors. In the tree given in Fig. 9.1, nodes A, C, and G are the
ancestors of node K.
Descendant node A descendant node is any successor node on any path from the node to a leaf
node. Leaf nodes do not have any descendants. In the tree given in Fig. 9.1, nodes C, G, J, and K are
among the descendants of node A.
Level number Every node in the tree is assigned a level number in such a way that the root node is
at level 0 and the children of the root node are at level 1. Thus, every node is one level higher
than its parent, and every child node has a level number given by its parent’s level number + 1.
Degree Degree of a node is equal to the number of children that a node has. The degree of a leaf
node is zero.
A binary tree is a data structure that is defined as a collection of elements called nodes.
In a binary tree, the topmost element is called the root node, and each node has 0, 1, or at
the most 2 children.
A node that has zero children is called a leaf node or a terminal node.
Every node contains a data element, a left pointer which points to the left child, and a right
pointer which points to the right child.
The root element is pointed to by a 'root' pointer. If root = NULL, then it means the tree is
empty.
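The node layout described above can be sketched in C as follows. This is a minimal sketch, not a definitive implementation: the names `struct node`, `new_node`, `is_empty`, and `is_leaf` are hypothetical, and the data element is assumed to be an `int`.

```c
#include <assert.h>
#include <stdlib.h>

/* One node of a binary tree: a data element plus left and right child pointers. */
struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Allocate a node with no children (hypothetical helper name). */
struct node *new_node(int data)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);          /* sketch only: abort if allocation fails */
    n->data = data;
    n->left = NULL;
    n->right = NULL;
    return n;
}

/* A tree is empty when its root pointer is NULL. */
int is_empty(const struct node *root)
{
    return root == NULL;
}

/* A leaf (terminal) node has zero children. */
int is_leaf(const struct node *n)
{
    return n != NULL && n->left == NULL && n->right == NULL;
}
```

A caller would keep a single `struct node *root` variable, initialised to NULL for an empty tree, and attach children by assigning to the `left` and `right` fields.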
TRAVERSING A BINARY TREE
Traversing a binary tree is the process of visiting each node in the tree exactly once in a systematic
way.
Unlike linear data structures, in which the elements are traversed sequentially, a tree is a nonlinear
data structure in which the elements can be traversed in many different ways.
The traversal algorithms differ in the order in which the nodes are visited. In this section, we will
discuss these algorithms.
Pre-order Traversal
To traverse a non-empty binary tree in pre-order, the following operations are performed
recursively at each node:
1. Visit the root node.
2. Traverse the left sub-tree.
3. Traverse the right sub-tree.
For a tree whose root A has left child B and right child C, the pre-order traversal is given as
A, B, C: root node first, the left sub-tree next, and then the right sub-tree.
In this algorithm, the left sub-tree is always traversed before the right sub-tree.
The word ‘pre’ in pre-order specifies that the root node is accessed prior to any other nodes in
the left and right sub-trees.
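A pre-order traversal can be sketched recursively in C. This is an illustrative sketch under stated assumptions: the names `struct node` and `preorder` are hypothetical, node labels are single characters, and the caller supplies an output buffer large enough for every node.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    char data;
    struct node *left;
    struct node *right;
};

/* Pre-order: root, then left sub-tree, then right sub-tree.
   Visited labels are stored in out starting at index len;
   the function returns the new length. */
size_t preorder(const struct node *n, char *out, size_t len)
{
    assert(out != NULL);
    if (n == NULL)
        return len;                       /* empty sub-tree: nothing to visit */
    out[len++] = n->data;                 /* visit the root */
    len = preorder(n->left, out, len);    /* traverse the left sub-tree */
    return preorder(n->right, out, len);  /* traverse the right sub-tree */
}
```

For a root A with left child B and right child C, calling `preorder(&a, out, 0)` fills `out` with A, B, C.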
In-order Traversal
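To traverse a non-empty binary tree in in-order, the following operations are performed recursively
at each node:
1. Traverse the left sub-tree.
2. Visit the root node.
3. Traverse the right sub-tree.
For a tree whose root A has left child B and right child C, the in-order traversal is given as
B, A, C: left sub-tree first, the root node next, and then the right sub-tree.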
In this algorithm, the left sub-tree is always traversed before the root node and the right sub-tree.
The word ‘in’ in the in-order specifies that the root node is accessed in between the left and the right
sub-trees.
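The in-order rule differs from pre-order only in where the root is visited. A minimal C sketch, assuming the hypothetical names `struct node` and `inorder` and a caller-supplied buffer that is large enough:

```c
#include <assert.h>
#include <stddef.h>

struct node {
    char data;
    struct node *left;
    struct node *right;
};

/* In-order: left sub-tree, then root, then right sub-tree.
   Returns the new length of out. */
size_t inorder(const struct node *n, char *out, size_t len)
{
    assert(out != NULL);
    if (n == NULL)
        return len;                       /* empty sub-tree: nothing to visit */
    len = inorder(n->left, out, len);     /* traverse the left sub-tree */
    out[len++] = n->data;                 /* visit the root */
    return inorder(n->right, out, len);   /* traverse the right sub-tree */
}
```

For a root A with left child B and right child C, the buffer ends up holding B, A, C.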
Post-order Traversal
To traverse a non-empty binary tree in post-order, the following operations are performed
recursively at each node:
1. Traverse the left sub-tree.
2. Traverse the right sub-tree.
3. Visit the root node.
Consider the tree given in Fig. 9.18. The post-order traversal of the tree is given as B, C, and A:
left sub-tree first, the right sub-tree next, and finally the root node.
In this algorithm, the left sub-tree is always traversed before the right sub-tree and the root node.
The word ‘post’ in the post-order specifies that the root node is accessed after the left and the right
sub-trees.
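Post-order visits the root last. A minimal C sketch under the same assumptions as before (hypothetical names `struct node` and `postorder`, single-character labels, caller-supplied buffer):

```c
#include <assert.h>
#include <stddef.h>

struct node {
    char data;
    struct node *left;
    struct node *right;
};

/* Post-order: left sub-tree, then right sub-tree, then root.
   Returns the new length of out. */
size_t postorder(const struct node *n, char *out, size_t len)
{
    assert(out != NULL);
    if (n == NULL)
        return len;                       /* empty sub-tree: nothing to visit */
    len = postorder(n->left, out, len);   /* traverse the left sub-tree */
    len = postorder(n->right, out, len);  /* traverse the right sub-tree */
    out[len++] = n->data;                 /* visit the root last */
    return len;
}
```

Because children are always processed before their parent, the same order is a natural fit for freeing every node of a tree.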
Representation of Binary Trees in the Memory
In the computer’s memory, a binary tree can be maintained either by using a linked
representation or by using a sequential representation.
Linked representation of binary trees In the linked representation of a binary tree, every node will
have three parts: the data element, a pointer to the left child, and a pointer to the right child.
Every binary tree has a pointer ROOT, which points to the root element (topmost element) of the
tree.
The left position is used to point to the left child of the node or to store the address of the left child
of the node.
Finally, the right position is used to point to the right child of the node or to store the address of the
right child of the node.
Sequential representation of binary trees Though the sequential representation is the simplest
technique for memory representation, it is inefficient as it requires a lot of memory space.
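To see why a sequential (array-based) representation can waste memory, consider the sketch below. It assumes one common layout that the text does not spell out: the root sits at index 0 and the children of the node at index i sit at 2i + 1 and 2i + 2. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 0-based layout: root at index 0; the children of the node at
   index i are stored at 2*i + 1 (left) and 2*i + 2 (right). */
size_t left_index(size_t i)  { return 2 * i + 1; }
size_t right_index(size_t i) { return 2 * i + 2; }

/* Array slots needed to hold any binary tree with the given number of
   levels: 2^levels - 1, even if most of the slots stay unused. */
size_t slots_for_levels(unsigned levels)
{
    assert(levels < 8 * sizeof(size_t));   /* keep the shift in range */
    return ((size_t)1 << levels) - 1;
}
```

Under this layout, a tree of four nodes in which every node has only a right child occupies indices 0, 2, 6, and 14, so the array needs `slots_for_levels(4)` = 15 slots to store 4 values.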
In a binary search tree, all the nodes in the left sub-tree have a value less than that of the root node.
Correspondingly, all the nodes in the right sub-tree have a value either equal to or greater than that
of the root node.
Since the nodes in a binary search tree are ordered, the time needed to search an element in the
tree is greatly reduced.
Whenever we search for an element, we do not need to traverse the entire tree.
For example, in the given tree, if we have to search for 29, then we know that we have to scan only
the left sub-tree. If the value is present in the tree, it will only be in the left sub-tree, as 29 is smaller
than 39 (the root node’s value).
The left sub-tree has a root node with the value 27. Since 29 is greater than 27, we will move to the
right sub-tree, where we will find the element.
Thus, the average running time of a search operation is O(log2 n), as at every step, we eliminate half
of the sub-tree from the search process.
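The search described above can be sketched in C as a loop that discards one sub-tree at every step. This is a sketch under assumptions: the names `struct node` and `bst_search` are hypothetical, and the `visited` counter is added only to show how few nodes are examined.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Returns the node holding key, or NULL if the key is absent.
   *visited receives the number of nodes examined along the way. */
const struct node *bst_search(const struct node *root, int key, int *visited)
{
    const struct node *cur = root;
    assert(visited != NULL);
    *visited = 0;
    while (cur != NULL) {
        (*visited)++;
        if (key == cur->data)
            return cur;
        /* Smaller keys can only be in the left sub-tree; others go right. */
        cur = (key < cur->data) ? cur->left : cur->right;
    }
    return NULL;
}
```

With 39 at the root, 27 as its left child, and 29 as the right child of 27, a search for 29 examines three nodes (39, 27, 29) and never enters the right sub-tree of the root.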
Due to their efficiency in searching elements, binary search trees are widely used in dictionary
problems where the code always inserts and searches the elements that are indexed by some key
value.
Binary search trees also speed up the insertion and deletion operations.
The tree has a speed advantage when the data in the structure changes rapidly.
However, in the worst case, a binary search tree will take O(n) time to search for an element.
The left sub-tree of a node N contains values that are less than N’s value.
The right sub-tree of a node N contains values that are greater than N’s value.
Both the left and the right sub-trees also satisfy these properties and, thus, are binary
search trees.
In this section, we will discuss the different operations that are performed on a binary search
tree. All these operations require comparisons to be made between the nodes.
1. Searching for a Node in a Binary Search Tree
2. Inserting a New Node in a Binary Search Tree
3. Deleting a Node from a Binary Search Tree
4. Determining the Height of a Binary Search Tree
5. Determining the Number of Nodes
6. Finding the Mirror Image of a Binary Search Tree
7. Deleting a Binary Search Tree
8. Finding the Smallest Node in a Binary Search Tree
9. Finding the Largest Node in a Binary Search Tree
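Several of the operations listed above can be sketched in C as follows. This is an illustrative subset (insertion, smallest and largest node, height, node count, and deleting the whole tree), not a complete implementation; deleting a single node and mirroring are omitted. All function names are hypothetical, and the height is assumed to be counted in nodes, so an empty tree has height 0.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Insert val and return the (possibly new) root of this sub-tree.
   Values smaller than a node go left; all others go right. */
struct node *bst_insert(struct node *root, int val)
{
    if (root == NULL) {
        struct node *n = malloc(sizeof *n);
        assert(n != NULL);          /* sketch only: abort if allocation fails */
        n->data = val;
        n->left = n->right = NULL;
        return n;
    }
    if (val < root->data)
        root->left = bst_insert(root->left, val);
    else
        root->right = bst_insert(root->right, val);
    return root;
}

/* The smallest value is in the leftmost node; root must not be NULL. */
int bst_smallest(const struct node *root)
{
    assert(root != NULL);
    while (root->left != NULL)
        root = root->left;
    return root->data;
}

/* The largest value is in the rightmost node; root must not be NULL. */
int bst_largest(const struct node *root)
{
    assert(root != NULL);
    while (root->right != NULL)
        root = root->right;
    return root->data;
}

/* Height counted in nodes (assumed convention): 0 for an empty tree. */
int bst_height(const struct node *root)
{
    int lh, rh;
    if (root == NULL)
        return 0;
    lh = bst_height(root->left);
    rh = bst_height(root->right);
    return 1 + (lh > rh ? lh : rh);
}

/* Total number of nodes in the tree. */
int bst_count(const struct node *root)
{
    if (root == NULL)
        return 0;
    return 1 + bst_count(root->left) + bst_count(root->right);
}

/* Delete the whole tree, children before their parent. */
void bst_delete_tree(struct node *root)
{
    if (root == NULL)
        return;
    bst_delete_tree(root->left);
    bst_delete_tree(root->right);
    free(root);
}
```

Every operation here compares values or follows child pointers from the root downwards, so its cost is bounded by the height of the tree for insertion and the smallest/largest lookups, and by the number of nodes for height, count, and deletion of the tree.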
A threaded binary tree is the same as a binary tree except for the way it stores NULL
pointers. Consider the linked representation of a binary tree as given in Fig. 10.28.
In the linked representation, a number of nodes contain a NULL pointer, either in their left or right
fields or in both.
This space that is wasted in storing a NULL pointer can be efficiently used to store some other useful
piece of information.
For example, the NULL entries can be replaced with a pointer to the in-order predecessor or the
in-order successor of the node.
These special pointers are called threads and binary trees containing threads are called threaded
trees.
In the linked representation of a threaded binary tree, threads will be denoted using arrows.
There are many ways of threading a binary tree and each type may vary according to the way the
tree is traversed.
Apart from this, a threaded binary tree may correspond to one-way threading or two-way
threading.
In one-way threading, a thread will appear either in the right field or the left field of the node.
A one-way threaded tree is also called a single-threaded tree. If the thread appears in the left field,
then the left field will be made to point to the in-order predecessor of the node.
On the contrary, if the thread appears in the right field, then it will point to the in-order successor of
the node.
In a two-way threaded tree, threads will appear in both the left and the right fields of the node.
While the left field will point to the in-order predecessor of the node, the right field will point to its
successor. A two-way threaded binary tree is also called a fully threaded binary tree.
One-way threading and two-way threading of binary trees are explained below. Figure 10.29 shows
a binary tree without threading and its corresponding linked representation. The in-order traversal
of the tree is given as 8, 4, 9, 2, 5, 1, 10, 6, 11, 3, 7, 12.
One-way Threading
Figure 10.30 shows a binary tree with one-way threading and its corresponding linked
representation.
Node 5 contains a NULL pointer in its RIGHT field, so it will be replaced to point to node 1, which is
its in-order successor.
Similarly, the RIGHT field of node 8 will point to node 4, the RIGHT field of node 9 will point to node
2, the RIGHT field of node 10 will point to node 6, the RIGHT field of node 11 will point to node 3,
and the RIGHT field of node 12 will contain NULL because it has no in-order successor.
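Right threads make it possible to traverse a tree in in-order without recursion or a stack. The sketch below is one way to do this in C. It assumes a flag field that distinguishes a thread from an ordinary right child; the names `struct tnode`, `right_is_thread`, `leftmost`, and `threaded_inorder` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct tnode {
    int data;
    struct tnode *left;     /* ordinary left child, or NULL */
    struct tnode *right;    /* right child, or a thread to the in-order successor */
    int right_is_thread;    /* 1 when right holds a thread, 0 when it holds a child */
};

/* Follow left children as far as possible; returns NULL for an empty sub-tree. */
static const struct tnode *leftmost(const struct tnode *n)
{
    while (n != NULL && n->left != NULL)
        n = n->left;
    return n;
}

/* In-order traversal of a one-way (right) threaded tree.
   Writes at most cap values to out and returns how many were written. */
size_t threaded_inorder(const struct tnode *root, int *out, size_t cap)
{
    size_t len = 0;
    const struct tnode *cur = leftmost(root);
    assert(out != NULL);
    while (cur != NULL && len < cap) {
        out[len++] = cur->data;
        if (cur->right_is_thread)
            cur = cur->right;             /* the thread leads to the successor */
        else
            cur = leftmost(cur->right);   /* successor is leftmost in the right sub-tree */
    }
    return len;
}
```

The traversal ends at the node whose RIGHT field is NULL, which in the example above is node 12.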
Two-way Threading
Figure 10.31 shows a binary tree with two-way threading and its corresponding linked
representation.
Node 5 contains a NULL pointer in its LEFT field, so it will be replaced to point to node 2, which is its
in-order predecessor.
Similarly, the LEFT field of node 8 will contain NULL because it has no in-order predecessor, the LEFT
field of node 7 will point to node 3, the LEFT field of node 9 will point to node 4, the LEFT field of
node 10 will point to node 1, the LEFT field of node 11 will point to node 6, and the LEFT field of
node 12 will point to node 7.
APPLICATIONS OF TREES
Trees are used to store simple as well as complex data. Here, simple data means an integer or
character value, and complex data means a structure or a record.
Trees are often used for implementing other types of data structures like hash tables, sets,
and maps.
A self-balancing tree, the red-black tree, is used in operating-system kernel scheduling, including
on massively multiprocessor computers. (We will study red-black trees in the next
chapter.)
Another variation of the tree, the B-tree, is prominently used to store tree structures on disk.
B-trees are used to index a large number of records. (We will study B-trees in Chapter 11.)
B-trees are also used for secondary indexes in databases, where the index facilitates a select
operation to answer some range criteria.
Trees are an important data structure used for compiler construction.
Trees are also used in database design.
Trees are used in file system directories.
Trees are also widely used for information storage and retrieval in symbol tables.
REPRESENTATION OF GRAPHS
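There are three common ways of storing graphs in the computer’s memory.
Two of them, the adjacency matrix and the adjacency list, are described here.
Adjacency Matrix Representation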
An adjacency matrix is used to represent which nodes are adjacent to one another.
By definition, two nodes are said to be adjacent if there is an edge connecting them.
In a directed graph G, if node v is adjacent to node u, then there is definitely an edge from u to
v.
For any graph G having n nodes, the adjacency matrix will have the dimension of n × n. In an
adjacency matrix, the rows and columns are labelled by graph vertices.
An entry a_ij in the adjacency matrix will contain 1 if vertices v_i and v_j are adjacent to each
other.
However, if the nodes are not adjacent, a_ij will be set to zero.
It is summarized in Fig. 13.13. Since an adjacency matrix contains only 0s and 1s, it is called a bit
matrix or a Boolean matrix.
The entries in the matrix depend on the ordering of the nodes in G. Therefore, a change in the
order of nodes will result in a different adjacency matrix.
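An adjacency matrix can be sketched in C as a fixed-size two-dimensional array of 0s and 1s. The sketch assumes that nodes are numbered from 0 and that the capacity `MAX_NODES` is fixed at compile time; the `gm_` names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_NODES 8   /* hypothetical fixed capacity for this sketch */

struct graph_matrix {
    int n;                                     /* number of nodes in use */
    unsigned char adj[MAX_NODES][MAX_NODES];   /* adj[u][v] is 1 if there is an edge u -> v */
};

/* Start with n nodes and no edges. */
void gm_init(struct graph_matrix *g, int n)
{
    assert(g != NULL && n >= 0 && n <= MAX_NODES);
    g->n = n;
    memset(g->adj, 0, sizeof g->adj);
}

/* Directed edge from u to v. */
void gm_add_edge(struct graph_matrix *g, int u, int v)
{
    assert(u >= 0 && u < g->n && v >= 0 && v < g->n);
    g->adj[u][v] = 1;
}

/* Undirected edge: set both entries so the matrix stays symmetric. */
void gm_add_undirected_edge(struct graph_matrix *g, int u, int v)
{
    gm_add_edge(g, u, v);
    gm_add_edge(g, v, u);
}

/* Returns 1 if there is an edge from u to v, otherwise 0. */
int gm_adjacent(const struct graph_matrix *g, int u, int v)
{
    assert(u >= 0 && u < g->n && v >= 0 && v < g->n);
    return g->adj[u][v];
}
```

Checking adjacency is a single array lookup, but the matrix always occupies n × n entries regardless of how many edges the graph has.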
Adjacency List Representation
An adjacency list is another way in which graphs can be represented in the computer’s memory.
This structure consists of a list of all the nodes in G.
Furthermore, every node is in turn linked to its own list that contains the names of all other
nodes that are adjacent to it.
It is easy to follow and clearly shows the adjacent nodes of a particular node.
It is often used for storing graphs that have a small-to-moderate number of edges. That
is, an adjacency list is preferred for representing sparse graphs in the computer’s
memory; otherwise, an adjacency matrix is a good choice.
Adding new nodes in G is easy and straightforward when G is represented using an
adjacency list. Adding new nodes in an adjacency matrix is a difficult task, as the size of
the matrix needs to be changed and existing nodes may have to be reordered.
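An adjacency list can be sketched in C as an array of linked lists, one per node. The sketch assumes nodes are identified by consecutive integers starting at 0; the `gl_` names and the two struct names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct adj_node {
    int vertex;                 /* index of an adjacent node */
    struct adj_node *next;
};

struct graph_list {
    int n;                      /* number of nodes */
    struct adj_node **heads;    /* heads[u] is the list of nodes adjacent to u */
};

void gl_init(struct graph_list *g)
{
    g->n = 0;
    g->heads = NULL;
}

/* Add a new node with an empty list and return its index. */
int gl_add_node(struct graph_list *g)
{
    struct adj_node **grown =
        realloc(g->heads, (size_t)(g->n + 1) * sizeof *grown);
    assert(grown != NULL);      /* sketch only: abort if allocation fails */
    g->heads = grown;
    g->heads[g->n] = NULL;
    return g->n++;
}

/* Directed edge u -> v: push v onto the front of u's list. */
void gl_add_edge(struct graph_list *g, int u, int v)
{
    struct adj_node *e;
    assert(u >= 0 && u < g->n && v >= 0 && v < g->n);
    e = malloc(sizeof *e);
    assert(e != NULL);
    e->vertex = v;
    e->next = g->heads[u];
    g->heads[u] = e;
}

/* Returns 1 if v appears in u's list, otherwise 0. */
int gl_adjacent(const struct graph_list *g, int u, int v)
{
    const struct adj_node *e;
    for (e = g->heads[u]; e != NULL; e = e->next)
        if (e->vertex == v)
            return 1;
    return 0;
}

/* Release every list and the array of list heads. */
void gl_free(struct graph_list *g)
{
    int u;
    for (u = 0; u < g->n; u++) {
        struct adj_node *e = g->heads[u];
        while (e != NULL) {
            struct adj_node *next = e->next;
            free(e);
            e = next;
        }
    }
    free(g->heads);
    g->heads = NULL;
    g->n = 0;
}
```

Adding a node only extends the array of list heads by one entry and leaves the existing lists untouched, which is the ease of growth described above. Memory use grows with the number of nodes plus the number of edges, which suits sparse graphs.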