Topic: 4.1.2 Algorithms: Chapter: 4.1 Computational Thinking and Problem-Solving
An algorithm is a step-by-step set of operations to be performed, much like a cooking recipe in real life.
Algorithms exist that perform calculation, data processing, automated reasoning, sorting of data, and
many other things.
In this chapter, we will be focusing on some of the different sorting and searching algorithms that exist
today.
Linear Search
This is the simplest kind of searching. It is also called the serial search or the sequential search. Searching
starts with the first item and then moves to each item in turn until either a match is found or the search
reaches the end of the data set with no match found.
A search criterion allows a possible match to be found within the records / items stored. If no match is found,
then the process returns an appropriate message.
The Algorithm
Advantages
Linear search is fairly easy to code. For example the pseudo-code below shows the algorithm in
action.
procedure serial_search()
    For i = 0 to 19
        check_for_match(i, list)
        if match_found return 'match found'
    Next i
    return 'no match found'
end
Good performance over small to medium lists. Computers are now very powerful and so checking
potentially every element in the list for a match may not be an issue with lists of moderate length.
The list does not need to be in any order. Other algorithms only work because they assume that
the list is ordered in a certain way. Serial searching makes no assumption at all about the list so it
will work just as well with a randomly arranged list as an ordered list.
Not affected by insertions and deletions. Some algorithms assume the list is ordered in a certain
way. So if an item is inserted or deleted, the computer will need to re-order the list before that
algorithm can be applied again. A serial search needs no such re-ordering.
Computer Science 9608 (Notes)
Chapter: 4.1 Computational thinking and
problem-solving
Disadvantages
May be too slow for very large lists. Computers take a finite amount of time to check each item, so
naturally, the longer the list, the longer it will take to search using the serial method. The worst
case is when no match is found, because every item has to be checked.
This speed disadvantage is why other search methods have been developed.
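The serial search described above can be sketched in Python as follows. This is only a minimal illustration: the function name linear_search and the sample list are assumptions made for this example, and the function returns the position of the match (or -1) instead of a message.

```python
def linear_search(items, target):
    """Return the index of the first item equal to target, or -1 if absent."""
    for index, item in enumerate(items):
        if item == target:      # match found: stop searching
            return index
    return -1                   # reached the end of the list with no match


# Hypothetical sample data; the list does not need to be in any order.
numbers = [20, 47, 12, 53, 32]
print(linear_search(numbers, 53))   # position of 53 in the list
print(linear_search(numbers, 99))   # -1 signals 'no match found'
```

Because the loop simply visits each item in turn, the same function works on an ordered list and on a randomly arranged one.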
Binary Search
Sometimes you may be doing a binary search without realizing it. Suppose you want to find Samuel
Jones in the local telephone book. Would you start from page 1 and then go on from there, page by page?
Unlikely.
You don’t do this because you know an important fact about telephone books: the entries are in
alphabetical order. So you make a guess. J is somewhere near the middle of the alphabet, so you open
the telephone book around halfway. The page you see has names starting with N, so you know J will be
in the first half of the book. Next you open a page about halfway through the first half, and the page has 'H'. So
now Jones must be in the later half of this section.
You are carrying out a 'Binary search' algorithm. Notice that after only two guesses you are getting much
closer to the answer. If you were carrying out a serial search, you would still be at page 2.
The Algorithm
1. Set the highest location in the list to be searched as N
2. Set the lowest location in the list to be searched as L
3. Examine the item at location (L + N) / 2, rounded down (i.e. halfway between L and N)
4. Is it a match? If yes, end the search (match found).
5. If no, is the item less than the search criterion?
6. If yes, set the lower limit L to location + 1 (forces the next search to use the upper half)
7. If no, set the upper limit N to location - 1 (forces the next search to use the lower half)
8. Is the lower limit now greater than the upper limit? If yes, end the search (no match found).
9. Otherwise, repeat from step 3 with the new upper and lower bounds.
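A binary search over a list held in ascending order can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name binary_search, the variable names low, high and mid, and the sample names are all chosen for this example. The midpoint is taken halfway between the lower and upper limits, and the search stops with no match once the lower limit passes the upper limit.

```python
def binary_search(items, target):
    """Search a list sorted in ascending order; return the index or -1."""
    low = 0                     # lowest location still to be searched
    high = len(items) - 1       # highest location still to be searched
    while low <= high:
        mid = (low + high) // 2         # halfway between the two limits
        if items[mid] == target:
            return mid                  # match found
        if items[mid] < target:
            low = mid + 1               # continue in the upper half
        else:
            high = mid - 1              # continue in the lower half
    return -1                           # limits have crossed: no match


# Hypothetical, alphabetically ordered sample data.
names = ["Adams", "Brown", "Hill", "Jones", "Nash", "Smith", "Young"]
print(binary_search(names, "Jones"))
print(binary_search(names, "Zed"))
```

Each pass through the loop discards half of the remaining locations, which is where the speed advantage over a serial search comes from.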
Choosing between a linear search and a binary search:
1. If the list is large and changing often, with items constantly being added or deleted, then the time it
takes to constantly re-order the list to allow for a binary search might be longer than a simple
serial search in the first place.
2. If the list is large and static, e.g. a telephone number database, then a binary search is very fast
compared to a linear search (in math terms, it takes about log2(n) comparisons for a binary search over n items).
3. If the list is small, then it might be simpler to just use a linear search.
4. If the list is in random order (unsorted), then a linear search is the only option unless the list is sorted first.
5. If the list is skewed so that the most often searched items are placed at the beginning, then on
average, a linear search might be better.
Suppose you have an array with N data items and you want to apply a linear search and a binary search to it.
The worst case for a linear search is when the item you are looking for is at the end of the array (or is not present at all).
In this case a linear search takes N iterations to retrieve the item,
whereas a binary search takes about log2(N) iterations.
https://blog.penjee.com/wp-content/uploads/2015/04/binary-and-linear-search-animations.gif
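To make the difference concrete, the following Python sketch counts the iterations each method needs to find the last item of a sorted array. The function names and the choice of 1024 items are assumptions made purely for illustration.

```python
def linear_iterations(items, target):
    """Count how many items a linear search inspects before stopping."""
    count = 0
    for item in items:
        count += 1
        if item == target:
            break
    return count


def binary_iterations(items, target):
    """Count how many midpoints a binary search inspects before stopping."""
    count = 0
    low, high = 0, len(items) - 1
    while low <= high:
        count += 1
        mid = (low + high) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return count


data = list(range(1, 1025))             # 1024 items in ascending order
print(linear_iterations(data, 1024))    # 1024
print(binary_iterations(data, 1024))    # 11
```

For 1024 items the linear search inspects all 1024, while the binary search inspects 11, which is log2(1024) + 1.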
Insertion Sort
In this method we compare each number in turn with the numbers before it in the list. We then insert the
number into its correct position.
Suppose the list is
20 47 12 53 32 84 85 96 45 18
We start with the second number, 47, and compare it with the numbers preceding it. There is only one, 20, and
it is less than 47, so no change in the order is made. We now compare the third number, 12, with its
predecessors. 12 is less than 20, so 12 is inserted before 20 to give the list
12 20 47 53 32 84 85 96 45 18
This is continued until the last number is inserted in its correct position. In Fig. 3.4.j.1 the blue numbers
are the ones before the one we are trying to insert in the correct position. The red number is the one we
are trying to insert.
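The insertion method described above can be sketched in Python as follows. The function name insertion_sort is an assumption for this example, and the starting list is inferred from the worked example (it is the list before 12 is moved in front of 20).

```python
def insertion_sort(values):
    """Return a new list with the values in ascending order."""
    result = list(values)               # work on a copy of the input
    for i in range(1, len(result)):     # start with the second number
        current = result[i]             # the number being inserted
        j = i - 1
        # Shift larger predecessors one place right to make room.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current         # insert into its correct position
    return result


numbers = [20, 47, 12, 53, 32, 84, 85, 96, 45, 18]
print(insertion_sort(numbers))
# [12, 18, 20, 32, 45, 47, 53, 84, 85, 96]
```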
The Bubble sort is a simple sorting algorithm that works by repeatedly stepping through the list to be
sorted, comparing each adjacent pair and swapping them if they are in the wrong order. The pass through the list is
repeated until no swaps are needed, which indicates that the list is sorted. The algorithm gets its name
from the way larger elements “bubble” to the top of the list. It is a very slow way of sorting data and is rarely
used in industry. There are much faster sorting algorithms, such as the Quick sort or the
Heap sort.
Animation link:
http://en.wikibooks.org/wiki/A-level_Computing/AQA/Problem_Solving,_Programming,_Data_Representation_and_Practical_Exercise/Problem_Solving/Searching_and_sorting#/media/File:Bubble-sort.gif
Step-by-step example
Let us take the array of numbers "5 1 4 2 8", and sort the array from lowest number to greatest number
using the bubble sort algorithm. In each step, the elements in square brackets are being compared.
First Pass:
( [5 1] 4 2 8 ) -> ( [1 5] 4 2 8 ), here the algorithm compares the first two elements and swaps them since 5 > 1
( 1 [5 4] 2 8 ) -> ( 1 [4 5] 2 8 ), it then compares the second and third items and swaps them since 5 > 4
( 1 4 [5 2] 8 ) -> ( 1 4 [2 5] 8 ), swap since 5 > 2
( 1 4 2 [5 8] ) -> ( 1 4 2 [5 8] ), since these elements are already in order (8 > 5), the algorithm does not
swap them.
The algorithm has reached the end of the list of numbers and the largest number, 8, has bubbled to the
top. It now starts again.
Second Pass:
( [1 4] 2 5 8 ) -> ( [1 4] 2 5 8 ), no swap needed
( 1 [4 2] 5 8 ) -> ( 1 [2 4] 5 8 ), swap since 4 > 2
( 1 2 [4 5] 8 ) -> ( 1 2 [4 5] 8 ), no swap needed
( 1 2 4 [5 8] ) -> ( 1 2 4 [5 8] ), no swap needed
Now the array is already sorted, but the algorithm does not know that it is finished. The algorithm needs
one whole pass without any swap to know the array is sorted.
Third Pass:
( [1 2] 4 5 8 ) -> ( [1 2] 4 5 8 ), no swap needed
( 1 [2 4] 5 8 ) -> ( 1 [2 4] 5 8 ), no swap needed
( 1 2 [4 5] 8 ) -> ( 1 2 [4 5] 8 ), no swap needed
( 1 2 4 [5 8] ) -> ( 1 2 4 [5 8] ), no swap needed
Finally, the array is sorted, and the algorithm can terminate.
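The three passes traced above can be reproduced with a short Python sketch. The function name bubble_sort and the decision to return the number of passes alongside the sorted list are assumptions made for this example. Like the trace, every pass compares all adjacent pairs.

```python
def bubble_sort(values):
    """Return (sorted copy of values, number of passes made)."""
    result = list(values)
    passes = 0
    swapped = True
    while swapped:                      # repeat until a pass makes no swap
        swapped = False
        passes += 1
        for i in range(len(result) - 1):
            if result[i] > result[i + 1]:
                # Adjacent pair is in the wrong order: swap it.
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
    return result, passes


print(bubble_sort([5, 1, 4, 2, 8]))     # ([1, 2, 4, 5, 8], 3)
```

The third pass makes no swaps, which is how the loop knows it can stop.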
It is important to note that the number of iterations each algorithm takes to sort the data in
either ascending or descending order may vary according to how the initial data is organized.
If the data is partially sorted, then the algorithms may take fewer iterations.
If the data is in descending order and you want to sort it in ascending order, then the algorithm will
take the maximum number of iterations to sort the data.
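The effect of the initial order can be seen by counting the work an insertion sort does on different inputs. The sketch below is illustrative only: the function name count_shifts and the three sample lists are assumptions, and "work" is measured here as the number of times an element has to be shifted one place.

```python
def count_shifts(values):
    """Count the element shifts an insertion sort makes on values."""
    result = list(values)
    shifts = 0
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]   # shift a larger element right
            shifts += 1
            j -= 1
        result[j + 1] = current
    return shifts


print(count_shifts([1, 2, 3, 4, 5]))    # already sorted: 0
print(count_shifts([1, 2, 4, 3, 5]))    # partially sorted: 1
print(count_shifts([5, 4, 3, 2, 1]))    # descending order: 10
```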
Insertion
Consider Fig. 1a which shows a linked list and a free list. The linked list is created by removing cells from
the front of the free list and inserting them in the correct position in the linked list.
Fig. 1a
Now suppose we wish to insert an element between the second and third cells in the linked list. The
pointers have to be changed to those in Fig. 1b
Fig. 1b
The algorithm must check for an empty free list as there is then no way of adding new data. It must also
check to see if the new data is to be inserted at the front of the list. If neither of these is needed, the
algorithm must search the list to find the position for the new data.
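One way to sketch this insertion in Python is to hold the cells in two parallel arrays, one for the data and one for the pointers, with a free list threaded through the unused cells. Everything named here (the class LinkedList, the NULL marker of -1, the methods insert and to_list) is an assumption for illustration, and the list is kept in ascending order.

```python
NULL = -1   # assumed marker for "no next cell"


class LinkedList:
    def __init__(self, capacity):
        self.data = [None] * capacity
        # Initially every cell is on the free list: 0 -> 1 -> ... -> NULL.
        self.next = list(range(1, capacity)) + [NULL]
        self.head = NULL
        self.free = 0 if capacity > 0 else NULL

    def insert(self, value):
        if self.free == NULL:
            raise OverflowError("free list is empty")
        new = self.free                     # take a cell from the free list
        self.free = self.next[new]
        self.data[new] = value
        if self.head == NULL or value < self.data[self.head]:
            self.next[new] = self.head      # new data goes at the front
            self.head = new
            return
        previous = self.head                # search for the position
        while (self.next[previous] != NULL
               and self.data[self.next[previous]] < value):
            previous = self.next[previous]
        self.next[new] = self.next[previous]
        self.next[previous] = new

    def to_list(self):
        values, cell = [], self.head
        while cell != NULL:
            values.append(self.data[cell])
            cell = self.next[cell]
        return values


cells = LinkedList(5)
for value in [30, 10, 20]:
    cells.insert(value)
print(cells.to_list())      # [10, 20, 30]
```

The three checks in insert mirror the text: an empty free list, insertion at the front, and otherwise a search for the correct position.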
Deletion
Suppose we wish to delete the third cell in the linked list shown in Fig. 3.4.h.1. The result is shown in Fig.
1c.
Fig. 1c
In this case, the algorithm must make sure that there is something in the list to delete.
1. Check that the list is not empty.
2. If the list is empty report an error and stop.
3. Search list to find the cell immediately before the cell to be deleted and call it PREVIOUS.
4. If the cell is not in the list, report an error and stop.
5. Set TEMP to pointer in PREVIOUS.
6. Set pointer in PREVIOUS equal to pointer in cell to be deleted.
7. Set pointer in cell to be deleted equal to FREE.
8. Set FREE equal to TEMP and stop.
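The deletion steps can be sketched in Python using the same PREVIOUS, TEMP and FREE roles. This is a hedged illustration: the function name delete, the parallel arrays data and nxt, the NULL marker of -1 and the sample values are assumptions, and deleting the first cell is handled as an extra case that the numbered steps do not cover.

```python
NULL = -1   # assumed marker for "no next cell"


def delete(data, nxt, head, free, target):
    """Unlink the cell holding target; return the new (head, free) pair."""
    if head == NULL:
        raise ValueError("list is empty")
    if data[head] == target:
        # Extra case (an assumption of this sketch): delete the first cell.
        temp = head
        head = nxt[temp]
    else:
        previous = head
        while nxt[previous] != NULL and data[nxt[previous]] != target:
            previous = nxt[previous]
        if nxt[previous] == NULL:
            raise ValueError("cell is not in the list")
        temp = nxt[previous]            # TEMP: the cell to be deleted
        nxt[previous] = nxt[temp]       # PREVIOUS now skips that cell
    nxt[temp] = free                    # deleted cell points to old FREE
    free = temp                         # deleted cell heads the free list
    return head, free


# List A -> B -> C -> D in cells 0..3; cell 4 is the only free cell.
data = ["A", "B", "C", "D", None]
nxt = [1, 2, 3, NULL, NULL]
head, free = delete(data, nxt, head=0, free=4, target="C")
print(head, free, nxt)      # 0 2 [1, 3, 4, -1, -1]
```

Note that the data in the deleted cell is left in place; only the pointers change, and the cell becomes available for reuse through the free list.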
Amendment
The algorithm for amending the data in a cell is
1. Check that the list is not empty.
2. If the list is empty report an error and stop.
3. Search the list to find the cell to be amended.
4. Amend the data but do not change the key.
5. Stop.
Searching
Assuming that the data in a linked list is in ascending order of some key value, the following algorithm
explains how to find where to insert a cell containing new data and keep the list in ascending order. It
assumes that the list is not empty and the data is not to be inserted at the head of the list.
1. POINTER is equal to HEAD
2. NEXT is equal to POINTER.NEXT
3. While (NEXT is not null and newData is greater than NEXT.DATA)
a. POINTER is equal to NEXT
b. NEXT is equal to POINTER.NEXT
4. Endwhile
5. The new cell is inserted between POINTER and NEXT
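The search for the insertion point can be sketched in Python with one object per cell. The names Node and find_insert_position and the sample keys are assumptions for this example. For a list in ascending order the loop advances while the new data is greater than the data in the next cell, and it also stops at the end of the list. As in the text, the list is assumed to be non-empty and the new data is assumed not to belong at the head.

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def find_insert_position(head, new_data):
    """Return the cell after which new_data should be inserted."""
    pointer = head
    nxt = pointer.next
    # Advance while the next cell exists and holds a smaller key.
    while nxt is not None and new_data > nxt.data:
        pointer = nxt
        nxt = pointer.next
    return pointer


# Hypothetical list 10 -> 20 -> 40; insert 30 in ascending order.
head = Node(10, Node(20, Node(40)))
pointer = find_insert_position(head, 30)
pointer.next = Node(30, pointer.next)

cell, keys = head, []
while cell is not None:
    keys.append(cell.data)
    cell = cell.next
print(keys)     # [10, 20, 30, 40]
```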
Note: A number of methods have been shown here to describe algorithms associated with linked lists. Any
method is acceptable provided it explains the method. An algorithm does not have to be in pseudo code,
indeed, the sensible way of explaining these types of algorithm is often by diagram.
The TREE is a general data structure that describes the relationship between data items or 'nodes'. In a
binary tree, each parent node has at most two child nodes.
One of the most powerful uses of the TREE data structure is to sort and manipulate data items.
Most databases use the Tree concept as the basis of storing, searching and sorting their records.
A binary search tree holds data items in sorted order by applying a simple rule.
Rule: The LEFT sub-tree always contains values that come before the root node and the RIGHT sub-tree always
contains values that come after the root node.
For numbers, this means the left sub-tree contains numbers less than the root and the right sub-tree
contains numbers greater than the root. For words, as might be in a sorted dictionary, the order is
alphabetic.
A sequence of numbers is to be formed into a binary search tree. These numbers are available in this order:
20, 17, 29, 22, 45, 9, 19.
The first item is 20 and this becomes the root node.
This is a binary search tree, so there are two child nodes available, the LEFT and the RIGHT. The next
number is 17. The rule is applied (left is less than the parent node), so 17 becomes the LEFT child of 20.
The next number is 29. This is higher than the root node, so it goes to the RIGHT sub-tree, which happens to
be empty at this stage, so 29 becomes the RIGHT child of 20.
The next number is 22. This is more than the root, so it needs to be in the RIGHT sub-tree. The first node
there, 29, is already occupied, so the rule is applied again to that node: 22 comes before 29, so it becomes
the LEFT child of 29.
The next number is 45. This is more than the root and more than the first right node, 29, so it becomes the
RIGHT child of 29.
The next number is 9, which is less than the root. The first left node, 17, is occupied and 9 is less than that node
too, so it becomes the LEFT child of 17.
The next number is 19, which is less than the root, so it needs to be in the left sub-tree. It is greater
than the occupied 17 node, so it becomes the RIGHT child of 17.
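The construction walked through above can be sketched in Python. The names Node, insert and in_order are assumptions for this example, and sending equal values to the right is an arbitrary choice since the sequence contains no duplicates.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert value below root using the left/right rule; return the root."""
    if root is None:
        return Node(value)              # empty position: place the value
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root


def in_order(root):
    """Visit left sub-tree, node, right sub-tree: gives sorted order."""
    if root is None:
        return []
    return in_order(root.left) + [root.value] + in_order(root.right)


root = None
for number in [20, 17, 29, 22, 45, 9, 19]:
    root = insert(root, number)

print(in_order(root))                           # [9, 17, 19, 20, 22, 29, 45]
print(root.left.value, root.right.value)        # 17 29
print(root.left.right.value)                    # 19
```

Reading the finished tree in order (left, node, right) returns the numbers sorted, which is why the structure is useful for sorting as well as searching.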
Data in a tree serves two purposes: one is to be the data itself, the other is to act as a reference for the
creation of the subtree below it. If Joh is deleted, there is no way of knowing which direction to take at that
node to find the data beneath it.
Solution 1 is to store Joh’s subtree in temporary storage and then rewrite it to the tree after Joh is deleted.
(The effect is that one member of the subtree will take over from Joh as the root of that subtree.)
Solution 2 is to mark Joh as deleted, so that the data no longer counts as present but the node can still act as
an index for that part of the tree, allowing the subtree to be navigated correctly.
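Solution 2 can be sketched in Python by giving each node a deleted flag. This is only an illustration under stated assumptions: the names Node, insert, find and mark_deleted are invented for this example, and apart from Joh the names in the tree are hypothetical, since the original diagram is not reproduced here.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.deleted = False    # True once the data is marked as deleted
        self.left = None
        self.right = None


def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def find(root, key):
    """Return the node holding key, or None if absent or marked deleted."""
    node = root
    while node is not None:
        if key == node.key:
            return None if node.deleted else node
        # A deleted node still tells us which direction to take.
        node = node.left if key < node.key else node.right
    return None


def mark_deleted(root, key):
    node = find(root, key)
    if node is not None:
        node.deleted = True
    return node is not None


root = None
for name in ["Mel", "Joh", "Ann", "Kay"]:    # Ann and Kay sit below Joh
    root = insert(root, name)

mark_deleted(root, "Joh")
print(find(root, "Joh"))            # None: Joh no longer counts as present
print(find(root, "Kay").key)        # Kay is still reachable through Joh
```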
Retrieval
Insertion
Fig. 3.4.h.4 shows a stack and its head pointer. Remember, a stack is a last-in-first-out (LIFO) data
structure. If we are to insert an item into a stack we must first check that the stack is not full. Having
done this we shall increment the pointer and then insert the new data item into the cell pointed to by the
stack pointer. This method assumes that the cells are numbered from 1 upwards and that, when the
stack is empty, the pointer is zero.
When an item is deleted from a stack, the item's value is copied and the stack pointer is moved down one
cell. The data itself is not deleted. This time, we must check that the stack is not empty before trying to
delete an item.
These are the only two operations you can perform on a stack.
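The two stack operations can be sketched in Python with a fixed-size array and a stack pointer. The class name Stack and the method names push and pop are assumptions for this example. Cells are numbered from 1 and the pointer is zero when the stack is empty, so cell 0 of the Python list is simply left unused.

```python
class Stack:
    def __init__(self, capacity):
        # Cell 0 is unused so that cells are numbered from 1 upwards.
        self.cells = [None] * (capacity + 1)
        self.capacity = capacity
        self.pointer = 0                # zero means the stack is empty

    def push(self, item):
        if self.pointer == self.capacity:
            raise OverflowError("stack is full")
        self.pointer += 1               # increment the pointer first
        self.cells[self.pointer] = item # then store the new item

    def pop(self):
        if self.pointer == 0:
            raise IndexError("stack is empty")
        item = self.cells[self.pointer] # copy the item's value
        self.pointer -= 1               # move the pointer down one cell
        return item                     # the old data stays in the cell


stack = Stack(3)
for value in [1, 2, 3]:
    stack.push(value)
print(stack.pop())          # 3, the last item in is the first out
print(stack.pointer)        # 2
```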
Queues
Fig. 3.4.h.5 shows a queue and its head and tail pointers. Remember, a queue is a first-in-first-out (FIFO)
data structure. If we are to insert an item into a queue we must first check that the queue is not full.
Having done this we shall increment the head pointer and then insert the new data item into the cell pointed to
by the head pointer. This method assumes that the cells are numbered from 1 upwards and that, when
the queue is empty, the two pointers point to the same cell.
Before trying to delete an item, we must check to see that the queue is not empty. Using the
representation above, this will occur when the head and tail pointers point to the same cell.
These are the only two operations that can be performed on a queue.
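The queue operations can be sketched in Python using the representation described above, in which items are inserted at the head pointer and the queue is empty when the head and tail pointers are equal. The class name Queue, the method names, and the choice to let the tail pointer rest on the cell just before the oldest item are assumptions of this sketch. Many textbooks name the two pointers the other way round.

```python
class Queue:
    def __init__(self, capacity):
        # Cell 0 is unused so that cells are numbered from 1 upwards.
        self.cells = [None] * (capacity + 1)
        self.capacity = capacity
        self.head = 0       # cell of the most recently inserted item
        self.tail = 0       # cell just before the oldest item

    def insert(self, item):
        if self.head == self.capacity:
            raise OverflowError("queue is full")
        self.head += 1                  # increment the head pointer
        self.cells[self.head] = item    # store the item in that cell

    def remove(self):
        if self.head == self.tail:      # both pointers on the same cell
            raise IndexError("queue is empty")
        self.tail += 1
        return self.cells[self.tail]    # oldest item leaves first


queue = Queue(3)
for value in ["a", "b", "c"]:
    queue.insert(value)
print(queue.remove())       # a, the first item in is the first out
print(queue.remove())       # b
```

This simple linear version never reuses cells once the head pointer reaches the last cell; a circular queue would be needed for that.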
A hash table or a hash map is a data structure that associates keys with values. The primary operation it
supports efficiently is a lookup: given a key (e.g. a person’s name), find the corresponding value (e.g. that
person’s telephone number). It works by transforming the key using a hash function into a hash, a number
that the hash table uses to locate the desired value.
Insertion
In a hash table, each slot either contains a key or NIL (if the slot is empty). If the slot is empty, then you
can insert your data item in that slot. If every slot in the hash table has been tried and no empty slot is
found, then the algorithm reports “hash table overflow”. Here m is the number of slots in the table and
h(k, i) gives the slot to try on attempt i.
hash_insert (T, k)
    i := 0
    repeat
        j := h(k, i)
        if T[j] = NIL
            then T[j] := k
                 return j
            else i := i + 1
    until i = m
    error “hash table overflow”
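The insertion routine can be sketched in Python as follows. The probe function used here, h(k, i) = (k + i) mod m (linear probing over integer keys), is an assumption chosen for illustration, as the notes do not say which hash function is used.

```python
NIL = None      # marks an empty slot


def h(k, i, m):
    """Assumed probe function: linear probing over m slots."""
    return (k + i) % m


def hash_insert(table, k):
    """Insert key k into the table; return the slot used."""
    m = len(table)
    for i in range(m):                  # try at most m slots
        j = h(k, i, m)
        if table[j] is NIL:
            table[j] = k                # empty slot found: store the key
            return j
    raise OverflowError("hash table overflow")


table = [NIL] * 5
print(hash_insert(table, 7))    # 2, since 7 mod 5 = 2
print(hash_insert(table, 12))   # 3, since slot 2 is already occupied
print(table)                    # [None, None, 7, 12, None]
```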
If we compare the Binary search to the Linear search, it should be quite obvious by now that the Binary
search takes fewer iterations to return the value being searched for. However, as mentioned before, the
Linear search is very easy to implement and needs no preparation of the data, whereas the Binary search
requires the list to be kept in sorted order, which costs extra work and possibly extra storage. This is the
kind of compromise that you have to make for stronger algorithms. Those which take fewer iterations (take
less time to return an answer) often demand more memory or more preparation, and vice versa.