Algorithms and Data Structures
Simonas Šaltenis
Nykredit Center for Database Research
Aalborg University
simas@[Link]
September 10, 2001
Administration
People
Simonas Šaltenis
Anders B. Christensen
Daniel Povlsen
Home page [Link]
Check the homepage frequently!
Course book: “Introduction to Algorithms”, [Link]
Cormen et al.
Lectures, Fib 15, A, 8:15-10:00, Mondays and
Thursdays
Administration (2)
Exercises: every week after the lecture
Exam: SE course, written
Troubles
Simonas Šaltenis
E1-215
simas@[Link]
Syllabus
Introduction (1)
Correctness, analysis of algorithms (2,3,4)
Sorting (1,6,7)
Elementary data structures, ADTs (10)
Searching, advanced data structures (11,12,13,18)
Dynamic programming (15)
Graph algorithms (22,23,24)
Computational Geometry (33)
NP-Completeness (34)
What is it all about?
Solving problems
Get me from home to work
Balance my checkbook
Simulate a jet engine
Graduate from SPU
Using a computer to help solve problems
Designing programs (architecture, algorithms)
Writing programs
Verifying programs
Documenting programs
Data Structures and Algorithms
Algorithm
Outline, the essence of a computational
procedure, step-by-step instructions
Program – an implementation of an
algorithm in some programming language
Data structure
Organization of data needed to solve the
problem
Overall Picture
Data Structure and Algorithm Design Goals
Correctness
Efficiency
Implementation Goals
Robustness
Adaptability
Reusability
Overall Picture (2)
This course is not about:
Programming languages
Computer architecture
Software architecture
Software design and implementation principles
Issues concerning small and large scale
programming
We will only touch upon the theory of
complexity and computability
History
Name: Persian mathematician Mohammed
al-Khowarizmi, in Latin became Algorismus
First algorithm: Euclidean Algorithm,
greatest common divisor, 400-300 B.C.
19th century – Charles Babbage, Ada
Lovelace.
20th century – Alan Turing, Alonzo Church,
John von Neumann
Algorithmic problem
Specification of input -> ? -> Specification of output as a function of input
Infinite number of input instances satisfying the
specification. For example:
A sorted, non-decreasing sequence of natural
numbers. The sequence is of non-zero, finite length:
1, 20, 908, 909, 100000, 1000000000.
3.
Algorithmic Solution
Input instance, adhering to the specification -> Algorithm -> Output related to the input as required
Algorithm describes actions on the input
instance
Infinitely many correct algorithms for the same
algorithmic problem
Example: Sorting
INPUT
sequence of numbers a1, a2, a3, …, an
OUTPUT
a permutation b1, b2, b3, …, bn of the input sequence of numbers
Example
2 5 4 10 7 -> Sort -> 2 4 5 7 10
Correctness
For any given input the algorithm halts with the output:
• b1 ≤ b2 ≤ b3 ≤ … ≤ bn
• b1, b2, b3, …, bn is a permutation of a1, a2, a3, …, an
Running time
Depends on
• number of elements (n)
• how (partially) sorted they are
• algorithm
Insertion Sort
A: 3 4 6 8 9 7 2 5 1
(positions 1 to n; j marks the element being inserted, i scans the sorted part to its left)
Strategy
• Start “empty handed”
• Insert a card in the right position of the already sorted hand
• Continue until all cards are inserted/sorted
for j=2 to length(A)
  do key=A[j]
     “insert A[j] into the sorted sequence A[1..j-1]”
     i=j-1
     while i>0 and A[i]>key
       do A[i+1]=A[i]
          i--
     A[i+1]:=key
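The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not part of the original slides: the function name insertion_sort is an assumption, and the indices are shifted to Python's 0-based lists, so the slide's j=2..n becomes j=1..n-1 and the test i>0 becomes i>=0.

```python
def insertion_sort(a):
    """Sort the list a in place into non-decreasing order and return it."""
    for j in range(1, len(a)):          # slide's j = 2..n, shifted to 0-based
        key = a[j]                      # the "card" being inserted
        i = j - 1
        # Shift every element of the sorted prefix a[0..j-1] that is
        # greater than key one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key                  # drop key into the gap
    return a


print(insertion_sort([3, 4, 6, 8, 9, 7, 2, 5, 1]))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```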
Analysis of Algorithms
Efficiency:
Running time
Space used
Efficiency as a function of input size:
Number of data elements (numbers, points)
Number of bits in an input number
The RAM model
Very important to choose the level of
detail.
The RAM model:
Instructions (each taking constant time):
Arithmetic (add, subtract, multiply, etc.)
Data movement (assign)
Control (branch, subroutine call, return)
Data types – integers and floats
Analysis of Insertion Sort
Time to compute the running time as a
function of the input size
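                                            cost  times
for j=2 to length(A)                        c1    n
  do key=A[j]                               c2    n-1
     “insert A[j] into the sorted
      sequence A[1..j-1]”                   0     n-1
     i=j-1                                  c3    n-1
     while i>0 and A[i]>key                 c4    \sum_{j=2}^{n} t_j
       do A[i+1]=A[i]                       c5    \sum_{j=2}^{n} (t_j - 1)
          i--                               c6    \sum_{j=2}^{n} (t_j - 1)
     A[i+1]:=key                            c7    n-1
Here t_j is the number of times the while-loop test is executed for that value of j.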
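Summing cost times count over all lines of the table gives the total running time. The formula below is a sketch of that sum, with t_j taken to be the number of times the while-loop test runs for a given j; the two closed forms after it are simply the sums evaluated for t_j = 1 and t_j = j.

```latex
T(n) = c_1 n + c_2 (n-1) + c_3 (n-1)
     + c_4 \sum_{j=2}^{n} t_j
     + c_5 \sum_{j=2}^{n} (t_j - 1)
     + c_6 \sum_{j=2}^{n} (t_j - 1)
     + c_7 (n-1)

% If t_j = 1 for every j:
\sum_{j=2}^{n} t_j = n - 1, \qquad \sum_{j=2}^{n} (t_j - 1) = 0

% If t_j = j for every j:
\sum_{j=2}^{n} t_j = \frac{n(n+1)}{2} - 1, \qquad \sum_{j=2}^{n} (t_j - 1) = \frac{n(n-1)}{2}
```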
Best/Worst/Average Case
Best case: elements already sorted
t_j=1, running time = f(n), i.e., linear time.
Worst case: elements are sorted in inverse order
t_j=j, running time = f(n^2), i.e., quadratic time
Average case: t_j=j/2, running time = f(n^2), i.e., quadratic time
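The best and worst cases can be checked empirically by counting how often the while-loop test of insertion sort is evaluated. The sketch below is an illustration only: the function name count_while_tests and the choice n = 100 are assumptions, and the list is 0-based.

```python
def count_while_tests(seq):
    """Run insertion sort on a copy of seq and return the total number of
    while-loop tests, i.e. the sum of t_j over all j."""
    a = list(seq)
    tests = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while True:
            tests += 1                          # one evaluation of the loop test
            if not (i >= 0 and a[i] > key):
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return tests


n = 100
best = count_while_tests(range(1, n + 1))    # already sorted: t_j = 1
worst = count_while_tests(range(n, 0, -1))   # inverse order:  t_j = j
print(best, worst)                           # 99 5049
```

The sorted input needs n-1 tests in total, while the inversely sorted input needs n(n+1)/2 - 1, which grows quadratically.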
Best/Worst/Average Case (2)
For a specific size of input n, investigate
running times for different input instances:
[Figure: running times of different input instances of the same size n; vertical axis from 1n to 6n]
Best/Worst/Average Case (3)
For inputs of all sizes:
[Figure: worst-case, average-case, and best-case running time (vertical axis, 1n to 6n) plotted against input instance size (horizontal axis, 1, 2, 3, …, 12, …)]
Best/Worst/Average Case (4)
Worst case is usually used:
It is an upper-bound and in certain application
domains (e.g., air traffic control, surgery)
knowing the worst-case time complexity is of
crucial importance
For some algorithms worst case occurs fairly
often
The average case is often as bad as the
worst case
Finding the average case can be very difficult
Growth Functions
[Figure: T(n) on a logarithmic axis from 1,00E-01 to 1,00E+10, for n = 2, 4, 8, …, 1024; curves: log n, sqrt n, n, n log n, 100n, n^2, n^3]
Growth Functions (2)
[Figure: T(n) on a logarithmic axis from 1,00E-01 to 1,00E+155, for n = 2, 4, 8, …, 1024; curves: log n, sqrt n, n, n log n, 100n, n^2, n^3, 2^n]
That’s it?
Is insertion sort the best approach to
sorting?
Alternative strategy based on divide and
conquer
MergeSort
sorting the numbers <4, 1, 3, 9> is split into
sorting <4, 1> and <3, 9> and
merging the results
Running time f(n log n)
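The split-and-merge idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the course's reference implementation: the helper names merge and merge_sort are hypothetical, and the function returns a new list instead of sorting in place.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])      # at most one of these two tails is non-empty
    result.extend(right[j:])
    return result


def merge_sort(seq):
    """Return a new sorted list: split in half, sort each half, merge."""
    if len(seq) <= 1:
        return list(seq)
    mid = len(seq) // 2
    return merge(merge_sort(seq[:mid]), merge_sort(seq[mid:]))


print(merge_sort([4, 1, 3, 9]))  # [1, 3, 4, 9]
```

For the input <4, 1, 3, 9> the two recursive calls sort <4, 1> and <3, 9>, and merge combines <1, 4> with <3, 9>.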
Example 2: Searching
INPUT
• sequence of numbers (database)
• a single number (query)
a1, a2, a3, …, an; q
OUTPUT
• an index of the found number or NIL
j
Examples
2 5 4 10 7; 5 -> 2
2 5 4 10 7; 9 -> NIL
Searching (2)
j=1
while j<=length(A) and A[j]!=q
do j++
if j<=length(A) then return j
else return NIL
Worst-case running time: f(n), average-case:
f(n/2)
We can’t do better. This is a lower bound for the
problem of searching in an arbitrary sequence.
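A Python sketch of the sequential search above, for illustration only. The function name linear_search is an assumption, Python's None stands in for NIL, and the returned index is 1-based to match the slides.

```python
def linear_search(a, q):
    """Return the 1-based index of the first element equal to q, or None."""
    for j, value in enumerate(a, start=1):
        if value == q:
            return j
    return None                  # plays the role of NIL


print(linear_search([2, 5, 4, 10, 7], 5))  # 2
print(linear_search([2, 5, 4, 10, 7], 9))  # None
```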
Example 3: Searching
INPUT
• sorted non-descending sequence of numbers (database)
• a single number (query)
a1, a2, a3, …, an; q
OUTPUT
• an index of the found number or NIL
j
Examples
2 4 5 7 10; 5 -> 3
2 4 5 7 10; 9 -> NIL
Binary search
Idea: Divide and conquer, one of the key design
techniques
left=1
right=length(A)
while left<=right
  do j=floor((left+right)/2)
     if A[j]==q then return j
     else if A[j]>q then right=j-1
     else left=j+1
return NIL
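The binary search idea can be sketched in Python as below. This is an illustrative sketch with some assumptions: the function name binary_search is hypothetical, left and right are kept 1-based as on the slide (hence a[j - 1] when indexing the Python list), None stands in for NIL, and the loop tests left <= right before each probe so that an empty list is handled too.

```python
def binary_search(a, q):
    """Return a 1-based index of q in the sorted list a, or None."""
    left, right = 1, len(a)
    while left <= right:
        j = (left + right) // 2      # middle position, rounded down
        if a[j - 1] == q:
            return j
        elif a[j - 1] > q:
            right = j - 1            # q can only be in the left half
        else:
            left = j + 1             # q can only be in the right half
    return None                      # plays the role of NIL


print(binary_search([2, 4, 5, 7, 10], 5))  # 3
print(binary_search([2, 4, 5, 7, 10], 9))  # None
```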
Binary search – analysis
How many times the loop is executed:
With each execution the length of the remaining search range is cut in half
How many times do you have to cut n in half
to get 1?
lg n
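The halving argument can be checked with a few lines of Python. This is a small sketch for illustration: the function name halvings is an assumption, and halving is done with integer division, so for n that is not a power of two the count equals lg n rounded down.

```python
import math


def halvings(n):
    """Count how many times n can be halved (integer division) until it reaches 1."""
    count = 0
    while n > 1:
        n //= 2
        count += 1
    return count


for n in (2, 16, 1024, 1000):
    # The two numbers printed after n agree: halvings(n) == floor(lg n).
    print(n, halvings(n), math.floor(math.log2(n)))
```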
Next Week
Correctness of algorithms
Asymptotic analysis, big O notation.