DS - Unit 2 - Linked List
Introduction
The term “list” refers to a linear collection of data items. It contains a first element, a second
element, … and a last element. Data processing frequently involves storing and processing data
organized into lists. One way to store such data is by means of arrays. The disadvantage of an array
is that it is relatively expensive to insert and delete elements. Also, since an array usually
occupies a block of memory space, one cannot simply double or triple the size of an array when
additional space is required. For this reason, arrays are called dense lists and are said to be static
data structures.
Another way of storing a list in memory is to have each element in the list contain a field called a
link or pointer, which contains the address of the next element in the list. Thus successive
elements in the list need not occupy adjacent space in the memory. This will make it easier to
insert and delete elements in the list.
A linked list is a sequence of data elements, called nodes, which are connected together via
links. Each node contains a connection to another node. The linked list is one of the most
commonly used data structures after the array. The linked list, or one-way list, is a linear
collection of nodes in which the linear order is specified using pointers. That is, each node is
divided into two parts: the first part contains the information of the element, and the second
part, called the link field or next pointer field, contains the address of the next node in the
list.
The following figure shows a diagrammatic representation of a linked list with 4 nodes. Each node
is pictured with two parts. The left part represents the information part of the node, which may
contain an entire record of data items (NAME, ADDRESS, etc.). The right part represents the
next pointer field of the node, and there is an arrow drawn from it to the next node in the list. The
pointer of the last node contains a special value called the NULL pointer, which is an invalid address.
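[Figure: linked list with 4 nodes. START (or NAME) points to the first node; the information field and the next pointer field of the third node are labelled; X marks the null pointer in the last node.]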
In actual practice, 0 or a negative number is used for the null pointer. It is denoted by X in the
above diagram and marks the end of the list. The linked list also contains a list pointer
variable, called START or NAME, which contains the address of the first node in the list. This
is denoted by drawing an arrow from START to the first node. We need only this address in
START to trace through the list. A special case is the list that has no nodes. Such a list is called
a null list or empty list and is denoted by the null pointer in the variable START.
The following terms are important for understanding the concept of a linked list:
Link − Each link of a linked list can store a data item called an element.
Next − Each link of a linked list contains a link to the next link, called Next.
LinkedList − A linked list contains the connection link to the first link, called First.
Linked List as ADT
A linked list is one of the basic structures used to implement the list ADT. It can be used to
create linear and non-linear structures. Each element in a list has zero, one or more
successors.
Compared to an array, a linked list has the major advantage of easy insertion and deletion of
data. There is no need to shift the elements already in the list to accommodate a new
element or to delete an element. On the other hand, since the elements have no physical
sequence, we are restricted to sequential search and cannot use binary search.
Figure (A) given below shows a linked list implementation of a linear list. Every element except
the last has a link which points to its only successor. The link in the last element holds a
NULL pointer, which indicates the end of the list. Figure (B) shows a linked list implementation
of a non-linear list. Every element in a non-linear list can have two or more links, one for
each successor. Figure (C) shows an empty list, which can be either linear or non-linear. An
empty list is represented by a NULL list pointer.
The above figure shows a linked list stored in memory. It shows that the nodes of a list need not
occupy adjacent elements in the arrays INFO and LINK.
Traversing a Linked List
Traversing refers to visiting each node of the list exactly once in order to perform some
operation on it. A linked list is a linear data structure that must be traversed starting from the
head node and continuing until the end of the list. Unlike arrays, where random access is
possible, a linked list can reach its nodes only through sequential traversal. Traversing a linked
list is important in many applications. For example, we may want to print a list or search for a
specific node in the list, or we may want to perform a more advanced operation on each node as we
traverse the list.
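The traversal described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the list is stored in the parallel arrays INFO and LINK used elsewhere in these notes, the null pointer is represented by -1, and the sample data and the function name traverse are hypothetical.

```python
NULL = -1  # assumed sentinel for the null pointer

# Hypothetical sample data: the list A -> B -> C -> D stored out of order.
INFO = ["C", "A", "D", "B"]
LINK = [2, 3, NULL, 0]
START = 1

def traverse(info, link, start):
    """Visit every node once, from START to the end, and return the values seen."""
    visited = []
    ptr = start
    while ptr != NULL:
        visited.append(info[ptr])  # process the current node
        ptr = link[ptr]            # move to the next node
    return visited

print(traverse(INFO, LINK, START))  # ['A', 'B', 'C', 'D']
```

Note that the nodes are visited in list order (A, B, C, D), not in the order in which they happen to be stored in the arrays.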
Searching a Linked List
Let LIST be a linked list in memory. Suppose a specific ITEM of information is given and we
want to find the location LOC of the node where ITEM first appears in LIST. If ITEM is actually
a key value and we are searching through a file for the record containing ITEM, then ITEM can
appear only once in LIST.
Here we discuss searching for ITEM in an unsorted LIST. We search by traversing the list using a
pointer variable PTR and comparing ITEM with the contents INFO[PTR] of each node, one by one.
Before we update the pointer PTR by the statement PTR = LINK[PTR], we need to perform two tests:
1. Check whether we have reached the end of the list, i.e. whether PTR = NULL. If not, then
2. Check whether INFO[PTR] = ITEM.
The two tests cannot be performed at the same time, since INFO[PTR] is not defined
when PTR = NULL. So we use the first test to control the execution of a loop, and let the second
test take place inside the loop. The corresponding algorithm is given below:
Algorithm: SEARCH (INFO, LINK, START, ITEM, LOC)
LIST is a linked list in memory. This algorithm finds the location LOC of the node
where ITEM first appears in LIST, or sets LOC = NULL.
1. Set PTR = START
2. Repeat Step 3 while PTR != NULL
3.     If ITEM = INFO[PTR], then
           Set LOC = PTR and go to Step 5
       Else
           Set PTR = LINK[PTR] (now PTR points to the next node)
       [End of If]
   [End of Step 2 loop]
4. Set LOC = NULL (search is unsuccessful)
5. EXIT
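A minimal Python sketch of the SEARCH algorithm is given below. It assumes the same array representation (INFO, LINK, START) and uses -1 for NULL, which is an assumption of this example. The function returns LOC instead of setting an output parameter, and the sample data is hypothetical.

```python
NULL = -1  # assumed sentinel for the null pointer

def search(info, link, start, item):
    """Return the location of the first node containing item, or NULL."""
    ptr = start
    while ptr != NULL:          # first test: end of list reached?
        if info[ptr] == item:   # second test: only safe when ptr != NULL
            return ptr
        ptr = link[ptr]         # advance to the next node
    return NULL                 # search unsuccessful

# Hypothetical sample data: the list A -> B -> C -> D.
INFO = ["C", "A", "D", "B"]
LINK = [2, 3, NULL, 0]
START = 1

print(search(INFO, LINK, START, "D"))  # 2
print(search(INFO, LINK, START, "Z"))  # -1
```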
Memory Allocation
The maintenance of linked lists in memory assumes the possibility of inserting new nodes into
the list and hence requires some mechanism which provides unused memory space for the new
nodes. Likewise, a mechanism is required whereby the memory space of deleted nodes becomes
available for future use.
Together with the linked list in memory, a special list is maintained which consists of unused
memory cells. This list, which has its own pointer, is called the list of available space or the
free-storage list or the free pool.
Whenever a new node is created, memory is allocated by the system. This memory is taken from
the list of memory locations which are free, i.e. not allocated. This list is called the AVAIL list.
Similarly, whenever a node is deleted, the deleted space becomes reusable and is added to the
list of unused space i.e. to AVAIL List. This unused space can be used in future for memory
allocation. Hence the free storage list will also be called the AVAIL list. Such a data structure
will frequently be denoted by writing:
LIST (INFO, LINK, START, AVAIL)
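The role of the AVAIL list can be illustrated with a small Python sketch. The class name ListStore and its methods allocate and release are hypothetical, as is the use of -1 for NULL. The sketch assumes that all cells start out free and are chained together through LINK.

```python
NULL = -1  # assumed sentinel for the null pointer

class ListStore:
    """Parallel arrays INFO and LINK plus the START and AVAIL pointers."""

    def __init__(self, size):
        self.info = [None] * size
        # Initially every cell is free: chain cell i to cell i + 1.
        self.link = [i + 1 for i in range(size)]
        if size > 0:
            self.link[-1] = NULL
        self.start = NULL                      # the data list is empty
        self.avail = 0 if size > 0 else NULL   # first free cell

    def allocate(self):
        """Remove the first cell from the AVAIL list and return its location."""
        if self.avail == NULL:
            raise MemoryError("OVERFLOW: no free cells left")
        new = self.avail
        self.avail = self.link[new]
        return new

    def release(self, loc):
        """Return the cell at loc to the front of the AVAIL list."""
        self.link[loc] = self.avail
        self.avail = loc

store = ListStore(3)
a = store.allocate()     # takes cell 0
b = store.allocate()     # takes cell 1
store.release(a)         # cell 0 goes back to the free pool
print(store.allocate())  # 0 (the released cell is reused)
```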
Inserting Algorithms
Algorithms which insert nodes into linked lists come up in various situations. They are:
1. Inserting a node at the beginning of the list.
2. Inserting a new node after the node with a given location.
3. Inserting a node in a sorted list.
Inserting after a given node
The following describes how to insert ITEM into LIST so that ITEM follows node A (the node with
location LOC) or, when LOC = NULL, so that ITEM is the first node.
Let N denote the new node (whose location is NEW). If LOC = NULL, then N is inserted as the
first node in LIST. Otherwise, we let node N point to node B (the node which originally followed
A) by the assignment
LINK[NEW] = LINK[LOC]
and we let node A point to the new node N by the assignment
LINK[LOC] = NEW
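The two assignments above, together with the LOC = NULL case, can be sketched in Python as follows. This is an illustration only: the function name insert_after is hypothetical, NULL is assumed to be -1, and the new node is taken from the AVAIL list as described in the Memory Allocation section. Because Python passes integers by value, the updated START and AVAIL are returned.

```python
NULL = -1  # assumed sentinel for the null pointer

def insert_after(info, link, start, avail, loc, item):
    """Insert item after the node at loc, or as the first node when loc is NULL.

    Returns the updated (start, avail) pair.
    """
    if avail == NULL:
        raise MemoryError("OVERFLOW: no free cells left")
    new = avail                # take the first free cell
    avail = link[avail]
    info[new] = item
    if loc == NULL:            # insert as the first node
        link[new] = start
        start = new
    else:                      # insert between A (at loc) and B
        link[new] = link[loc]  # N points to B
        link[loc] = new        # A points to N
    return start, avail

# Hypothetical sample data: list A -> C, with free cells 2 and 3.
INFO = ["A", "C", None, None]
LINK = [1, NULL, 3, NULL]
START, AVAIL = 0, 2

START, AVAIL = insert_after(INFO, LINK, START, AVAIL, 0, "B")     # A -> B -> C
START, AVAIL = insert_after(INFO, LINK, START, AVAIL, NULL, "Z")  # Z -> A -> B -> C
```

The order of the two assignments matters: LINK[LOC] must be copied into LINK[NEW] before it is overwritten, otherwise the location of node B is lost.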
Inserting into a Sorted Linked List
Suppose ITEM is to be inserted into a sorted linked LIST. Then ITEM must be inserted between
nodes A and B so that
INFO(A) < ITEM ≤ INFO(B)
The following is a procedure which finds the location LOC of node A, that is, which finds the
location LOC of the last node in LIST whose value is less than ITEM.
Traverse the list, using a pointer variable PTR and comparing ITEM with INFO[PTR] at each
node. While traversing, keep track of the location of the preceding node by using a pointer
variable SAVE, as pictured in Fig. 5.21. Thus SAVE and PTR are updated by the assignments
SAVE := PTR and PTR := LINK[PTR]
The traversing continues as long as INFO[PTR] < ITEM, or in other words, the traversing stops as
soon as ITEM <= INFO[PTR]. Then PTR points to node B, so SAVE will contain the location of node A.
The formal statement of our procedure follows. The cases where the list is empty or where
ITEM < INFO[START], so that LOC = NULL, are treated separately, since they do not involve the
variable SAVE.
This procedure finds the location LOC of the last node in a sorted list such
that INFO[LOC] < ITEM, or sets LOC = NULL.
Now we have all the components to present an algorithm which inserts ITEM into a sorted linked list.
The simplicity of the algorithm comes from using the previous two procedures.
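The search for node A and the insertion that follows can be sketched in Python as below. The names find_a and insert_sorted are hypothetical, and NULL is assumed to be -1. Two simplifications are assumptions of this sketch rather than part of the text: new cells are obtained by appending to the Python lists instead of using an AVAIL list, and the test `<=` is used in the special case as well as in the loop, so that INFO[LOC] < ITEM holds strictly (this matters only when ITEM equals a value already in the list).

```python
NULL = -1  # assumed sentinel for the null pointer

def find_a(info, link, start, item):
    """Return the location of the last node with info < item, or NULL."""
    if start == NULL or item <= info[start]:
        return NULL                  # special cases: SAVE is not needed
    save, ptr = start, link[start]
    while ptr != NULL:
        if item <= info[ptr]:        # ptr is node B, save is node A
            return save
        save, ptr = ptr, link[ptr]   # update both pointers
    return save                      # item is larger than every value

def insert_sorted(info, link, start, item):
    """Insert item into the sorted list and return the updated start."""
    loc = find_a(info, link, start, item)
    info.append(item)                # simplification: grow the arrays
    link.append(NULL)
    new = len(info) - 1
    if loc == NULL:                  # insert as the first node
        link[new] = start
        start = new
    else:                            # insert after node A
        link[new] = link[loc]
        link[loc] = new
    return start

INFO, LINK, START = [], [], NULL
for value in [30, 10, 20, 10]:
    START = insert_sorted(INFO, LINK, START, value)
# The list now reads 10 -> 10 -> 20 -> 30.
```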
There are special cases in deletion. Let N be the node to be deleted, A the node preceding N and
B the node following N. If N is the first node in the list, then START will point to node B, and
if N is the last node in the list, then node A will contain the NULL pointer.
Deletion Algorithm
Algorithms which delete nodes from linked lists come up in various situations. For example:
1. Deleting a node following a given node.
2. Deleting the node with a given item of information.
Before deleting a node, whether it is the first node or the last node of the list, we must check
whether the list contains any node. If not, a corresponding message (e.g. Underflow) must be
printed.
Deleting the Node Following a Given Node
Let LIST be a linked list in memory. Suppose we are given the location LOC of a node N in LIST.
Furthermore, suppose we are given the location LOCP of the node preceding N or, when N is the
first node, we are given LOCP = NULL. The following algorithm deletes N from the list.
Algorithm 5.8: DEL(INFO, LINK, START, LOC, LOCP)
This algorithm deletes the node N with location LOC. LOCP is the location of
the node which precedes N or, when N is the first node, LOCP = NULL.
1. If LOCP = NULL, then
       Set START = LINK[START]. [Deletes first node.]
   Else
       Set LINK[LOCP] = LINK[LOC]. [Deletes node N.]
   [End of If structure.]
2. Exit.
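A Python sketch of Algorithm 5.8 follows. The function name delete_node is hypothetical and NULL is assumed to be -1. As an addition not shown in the algorithm itself, the sketch also returns the deleted cell to the AVAIL list, following the description in the Memory Allocation section. The updated START and AVAIL are returned.

```python
NULL = -1  # assumed sentinel for the null pointer

def delete_node(link, start, avail, loc, locp):
    """Delete the node at loc; locp is its predecessor, or NULL for the first node.

    Returns the updated (start, avail) pair.
    """
    if locp == NULL:
        start = link[start]      # deletes the first node
    else:
        link[locp] = link[loc]   # predecessor now skips node N
    link[loc] = avail            # extra step: return the cell to the free pool
    avail = loc
    return start, avail

# Hypothetical sample data: list A -> B -> C, no free cells.
INFO = ["A", "B", "C"]
LINK = [1, 2, NULL]
START, AVAIL = 0, NULL

START, AVAIL = delete_node(LINK, START, AVAIL, loc=1, locp=0)     # A -> C
START, AVAIL = delete_node(LINK, START, AVAIL, loc=0, locp=NULL)  # C
```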
To delete the node N which contains a given ITEM, we first traverse the list to find N, again
keeping track of the preceding node with the pointer variable SAVE.
The traversing continues as long as INFO[PTR] ≠ ITEM, or in other words, the traversing stops as
soon as ITEM = INFO[PTR]. Then PTR contains the location LOC of node N and SAVE contains
the location LOCP of the node preceding N.
The formal statement of our procedure follows. The cases where the list is empty or where
INFO[START] = ITEM (i.e., where node N is the first node) are treated separately, since they
do not involve the variable SAVE.
   [End of If structure.]
6. Set SAVE := PTR and PTR := LINK[PTR]. [Updates pointers.]
   [End of Step 4 loop.]
7. Set LOC := NULL. [Search unsuccessful.]
8. Return.
Now we can easily present an algorithm to delete from a linked list the first node N which contains a
given ITEM of information. The simplicity of the algorithm comes from the fact that the task of finding
the location of N and the location of its preceding node has already been done in Procedure 5.9.
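The search for N and its predecessor, followed by the deletion, might look as follows in Python. This is a sketch under assumptions: the names find_b and delete_item are hypothetical, NULL is -1, and the freed cell is not returned to an AVAIL list in order to keep the example short.

```python
NULL = -1  # assumed sentinel for the null pointer

def find_b(info, link, start, item):
    """Return (loc, locp): the first node containing item and its predecessor.

    loc is NULL when item is absent; locp is NULL when N is the first node.
    """
    if start == NULL:
        return NULL, NULL            # empty list
    if info[start] == item:
        return start, NULL           # N is the first node
    save, ptr = start, link[start]
    while ptr != NULL:
        if info[ptr] == item:
            return ptr, save
        save, ptr = ptr, link[ptr]   # update both pointers
    return NULL, NULL                # search unsuccessful

def delete_item(info, link, start, item):
    """Delete the first node containing item. Returns (start, found)."""
    loc, locp = find_b(info, link, start, item)
    if loc == NULL:
        return start, False          # item not in list
    if locp == NULL:
        start = link[start]          # delete the first node
    else:
        link[locp] = link[loc]       # bypass node N
    return start, True

# Hypothetical sample data: list A -> B -> C.
INFO = ["A", "B", "C"]
LINK = [1, 2, NULL]
START = 0
START, found = delete_item(INFO, LINK, START, "B")  # list becomes A -> C
```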
A header linked list is a linked list which always contains a special node, called the header node,
at the beginning of the list. The following are two kinds of widely used header lists:
1. A grounded header list is a header list where the last node contains the null pointer.
2. A circular header list is a header list where the last node points back to the header node.
Figure 5.30 contains schematic diagrams of these header lists.
Observe that the list pointer START always points to the header node. Accordingly, LINK[START] = NULL
indicates that a grounded header list is empty, and LINK[START] = START indicates that a circular
header list is empty.
The term "node," by itself, normally refers to an ordinary node, not the header node, when used
with header lists. Thus the first node in a header list is the node following the header node, and the
location of the first node is LINK[START], not START, as with ordinary linked lists.
Algorithm 5.11, which uses a pointer variable PTR to traverse a circular header list, is essentially
the same as the algorithm which traverses an ordinary linked list, except that now the algorithm
(1) begins with PTR = LINK[START] (not PTR = START) and (2) ends when PTR = START (not PTR = NULL).
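The two differences can be seen in a short Python sketch of traversing a circular header list. The function name and the sample data are hypothetical; cell 0 is assumed to hold the header node.

```python
# Hypothetical sample data: cell 0 is the header node, followed by A -> B -> C,
# and the last node points back to the header.
INFO = [None, "A", "B", "C"]
LINK = [1, 2, 3, 0]
START = 0

def traverse_circular_header(info, link, start):
    """Visit every ordinary node of a circular header list once."""
    visited = []
    ptr = link[start]        # (1) begin after the header node
    while ptr != start:      # (2) stop on returning to the header node
        visited.append(info[ptr])
        ptr = link[ptr]
    return visited

print(traverse_circular_header(INFO, LINK, START))  # ['A', 'B', 'C']
print(traverse_circular_header([None], [0], 0))     # [] (empty list)
```

An empty list needs no special case here: LINK[START] = START, so the loop body never runs.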
Circular header lists are frequently used instead of ordinary linked lists because many operations
are much easier to state and implement using header lists. This comes from the following two
properties of circular header lists:
1. The null pointer is not used, and hence all pointers contain valid addresses.
2. Every (ordinary) node has a predecessor, so the first node may not require a special case.
There are two other variations of linked lists which sometimes appear in the literature:
1. A linked list whose last node points back to the first node instead of containing the null
pointer, called a circular list.
2. A linked list which contains both a special header node at the beginning of the list and a
special trailer node at the end of the list.
Figure 5.32 contains schematic diagrams of these lists.
Polynomials
Header linked lists are frequently used for maintaining polynomials in memory. The header
node plays an important part in this representation, since it is needed to represent the zero
polynomial. Consider, for example, the polynomial
8x^13 + 5x^6 + 2x^8 + x^5 + x^10 + 17x
This polynomial of degree 13 does not have all 14 possible terms (including the constant term). Thus
it is convenient to represent the polynomial with a linked list structure, in which every node holds
the information for a single term of the polynomial. Each node could store three things:
the variable x
the exponent
the coefficient of the term.
However, the name of the variable does not matter: the operations are the same whether the
polynomial is in x or in y. This is important to keep in mind when performing operations on
polynomials. Thus it is better to define a node structure which holds just two integers, exp and coff.
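A node structure holding exp and coff can be sketched in Python as below. The class name Term and the functions make_polynomial and evaluate are hypothetical. The use of a circular header list, the exponent -1 in the header node and the ordering of terms by decreasing exponent are assumptions of this sketch, not requirements stated in the text.

```python
class Term:
    """One node of the polynomial: coefficient, exponent and link to the next term."""
    def __init__(self, coff=0, exp=-1, link=None):
        self.coff = coff
        self.exp = exp
        self.link = link

def make_polynomial(terms):
    """Build a circular header list from (coff, exp) pairs."""
    header = Term()          # header node; it carries no term
    header.link = header     # header pointing to itself = zero polynomial
    tail = header
    for coff, exp in sorted(terms, key=lambda t: -t[1]):
        tail.link = Term(coff, exp, header)  # new last node points back to header
        tail = tail.link
    return header

def evaluate(header, x):
    """Evaluate the polynomial at x by visiting every term once."""
    total = 0
    ptr = header.link
    while ptr is not header:
        total += ptr.coff * x ** ptr.exp
        ptr = ptr.link
    return total

# 8x^13 + 5x^6 + 2x^8 + x^5 + x^10 + 17x
p = make_polynomial([(8, 13), (5, 6), (2, 8), (1, 5), (1, 10), (17, 1)])
print(evaluate(p, 1))                    # 34 (the sum of the coefficients)
print(evaluate(make_polynomial([]), 5))  # 0 (the zero polynomial)
```

Only the six non-zero terms occupy nodes, even though a polynomial of degree 13 could have up to 14 terms.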
A circularly linked list is a special type of linked list in which the link field of the
last node points to the first node of the list. This type of list is mainly used where nodes in
the middle of the list must be reachable without starting at the beginning.
Figure 5.33 is a schematic diagram of a circularly linked list with four nodes. Notice that the
last node does not contain a NULL pointer as in a singly linked list. Instead, the link field of
the last node points to the first node.
Insertion into and deletion from a circularly linked list follow the same pattern used in a singly
linked list. However, in this case the last node points to the first node. Therefore, when inserting
a new last node, we must also point its link field to the first node.
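Inserting a new last node into a circularly linked list (without a header node) might be sketched in Python as follows. The function names are hypothetical, NULL is assumed to be -1 and is used only for an empty list, and new cells are obtained by appending to the Python lists rather than from an AVAIL list.

```python
NULL = -1  # assumed sentinel; used here only to mark an empty list

def append_circular(info, link, start, item):
    """Insert item as the last node of a circular list. Returns the updated start."""
    info.append(item)
    link.append(NULL)
    new = len(info) - 1
    if start == NULL:            # empty list: the node points to itself
        link[new] = new
        return new
    ptr = start
    while link[ptr] != start:    # find the current last node
        ptr = link[ptr]
    link[ptr] = new              # old last node points to the new node
    link[new] = start            # new last node points back to the first node
    return start

def walk_circular(info, link, start):
    """Return the values of a circular list, starting from the first node."""
    out = []
    if start == NULL:
        return out
    ptr = start
    while True:
        out.append(info[ptr])
        ptr = link[ptr]
        if ptr == start:         # back at the first node: stop
            break
    return out

INFO, LINK, START = [], [], NULL
for value in ["A", "B", "C"]:
    START = append_circular(INFO, LINK, START, value)
print(walk_circular(INFO, LINK, START))  # ['A', 'B', 'C']
```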
A two-way list is a linear collection of data elements, called nodes, where each node N is
divided into three parts:
1. An information field INFO which contains the data of N
2. A pointer field FORW which contains the location of the next node in the list
3. A pointer field BACK which contains the location of the preceding node in the list
The list also requires two list pointer variables: FIRST, which points to the first node in the list,
and LAST, which points to the last node in the list. Figure 5.36 contains a schematic diagram of
such a list. Observe that the null pointer appears in the FORW field of the last node in the list and
also in the BACK field of the first node in the list.
Observe that, using the variable FIRST and the pointer field FORW, we can traverse a two-way
list in the forward direction as before. On the other hand, using the variable LAST and the pointer
field BACK, we can also traverse the list in the backward direction.
Suppose LOCA and LOCB are the locations, respectively, of nodes A and B in a two-way list.
Then the way that the pointers FORW and BACK are defined gives us the following:
Pointer property: FORW[LOCA] = LOCB if and only if BACK[LOCB] = LOCA
Operations on Two-Way Lists
Deleting
Suppose we are given the location LOC of a node N in LIST, and suppose we want to delete N from
the list. We assume that LIST is a two-way circular header list. Note that BACK[LOC] and FORW[LOC]
are the locations, respectively, of the nodes which precede and follow node N. Accordingly, as pictured
in Fig. 5.40, N is deleted from the list by changing the following pair of pointers:
FORW[BACK[LOC]] := FORW[LOC] and BACK[FORW[LOC]] := BACK[LOC]
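Deleting a node from a two-way circular header list can be sketched in Python as below. The function name delete_two_way and the sample data are hypothetical; cell 0 is assumed to be the header node. Because every ordinary node in such a list has both a predecessor and a successor, no special cases are needed.

```python
def delete_two_way(forw, back, loc):
    """Unlink the node at loc from a two-way circular header list."""
    forw[back[loc]] = forw[loc]   # predecessor now points forward past N
    back[forw[loc]] = back[loc]   # successor now points back past N

# Hypothetical sample data: header (cell 0) <-> A <-> B <-> C <-> header.
INFO = [None, "A", "B", "C"]
FORW = [1, 2, 3, 0]
BACK = [3, 0, 1, 2]

delete_two_way(FORW, BACK, 2)     # delete B: the list becomes A <-> C
```

After the call, FORW[1] = 3 and BACK[3] = 1, so the pointer property FORW[LOCA] = LOCB if and only if BACK[LOCB] = LOCA still holds for the remaining nodes.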
Inserting
Suppose we are given the locations LOCA and LOCB of adjacent nodes A and B in LIST,
and suppose we want to insert a given ITEM of information between nodes A and B.
We first obtain a new node N, whose address is kept in the variable NEW, and copy the data ITEM into N.
Now, as pictured in Fig. 5.41, the node N with contents ITEM is inserted into the list by changing
the following four pointers:
FORW[LOCA] := NEW, FORW[NEW] := LOCB
BACK[LOCB] := NEW, BACK[NEW] := LOCA
The formal statement of our algorithm follows.
Algorithm 5.17: INSTWL(INFO, FORW, BACK, START, LOCA, LOCB, ITEM)
1. Start.
2. Set NEW := address of (NEWNODE).
3. [Copy new data into node.] Set INFO[NEW] := ITEM.
4. [Insert node into list.]
   Set FORW[LOCA] := NEW, FORW[NEW] := LOCB,
       BACK[LOCB] := NEW, BACK[NEW] := LOCA.
5. Exit.
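Algorithm 5.17 can be sketched in Python as follows. The function name insert_two_way and the sample data are hypothetical, and NULL is assumed to be -1. As a simplification, the new node is obtained by appending to the Python lists rather than from an AVAIL list; cell 0 is assumed to be the header of a two-way circular header list.

```python
NULL = -1  # assumed sentinel; only a placeholder until the links are set

def insert_two_way(info, forw, back, loca, locb, item):
    """Insert item between adjacent nodes A (loca) and B (locb). Returns NEW."""
    info.append(item)            # obtain a new node and copy the data into it
    forw.append(NULL)
    back.append(NULL)
    new = len(info) - 1
    forw[loca] = new             # A points forward to N
    forw[new] = locb             # N points forward to B
    back[locb] = new             # B points back to N
    back[new] = loca             # N points back to A
    return new

# Hypothetical sample data: header (cell 0) <-> A <-> C <-> header.
INFO = [None, "A", "C"]
FORW = [1, 2, 0]
BACK = [2, 0, 1]

insert_two_way(INFO, FORW, BACK, 1, 2, "B")   # list becomes A <-> B <-> C
```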