5 Specifying Aggregate Data: Arrays
COL 100 - Introduction to Computer Science
II Semester 2015-2016
Department of Computer Science and Engineering
Indian Institute of Technology Delhi
Aggregate Data
Data Types seen so far:
int, char, float, double,...
These are Scalar data types
Aggregate data: collection of variables
of same type: ARRAYS
of different types: struct/class/union/...
Jan 2016
P. R. Panda, IIT Delhi
The Array Data Type
Representing a Collection of Data
ordered set of elements
all of same type
position has significance
(compare: set)
e.g., marks scored by 100
students
Declaring an Array
Declaration has 4 components
specifier (optional)
base type
declarator
name
declarator operator ( [ ], etc.)
initialiser (optional)
extern int Marks [4] = {5, 10, 15, 20};
Specifier: extern
Base type: int
Declarator: Marks [4]
Initialiser: {5, 10, 15, 20}
Array Access
Individual elements can be accessed
READ or WRITTEN
Array access has two components
array name
index (starts at 0)
X = Y; and X = 0; not allowed for arrays in C/C++
X[2] = 5; // Array on LHS => WRITE
X[3] = X[3] + 1; // Array on RHS => READ
Storing an Array in Memory
Array size must be constant
int X[5]; // X[0] to X[4]
int Y[size]; // ?
We need to know how much space it occupies
Statically (at COMPILE-time)
...before execution
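The constant-size rule can be illustrated with a small sketch. The names kSize, bytes_in_array and elements_in_array are hypothetical and only serve the illustration; the point is that the compiler can work out the space from a constant size alone.

```cpp
#include <cstddef>

// Hypothetical constant: the array size is known at compile time.
constexpr std::size_t kSize = 5;

// Total number of bytes occupied by an int array of kSize elements.
std::size_t bytes_in_array() {
    int X[kSize] = {};            // legal: kSize is a constant expression
    return sizeof(X);             // computed by the compiler, before execution
}

// Number of elements, recovered from the two sizeof values.
std::size_t elements_in_array() {
    int X[kSize] = {};
    return sizeof(X) / sizeof(X[0]);
}
```

With kSize equal to 5, elements_in_array() gives 5, while bytes_in_array() depends on how many bytes the hardware uses for an int.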
Storing an Array in Memory
Elements are stored in sequence (contiguous)
How do we implement A[i] = b;?
First compute address of A[i]
Need to know bytes occupied by A's elements
This is Hardware-dependent
sizeof operator
Store value of b at this address
[Figure: byte-addressed memory holding int A[4]; A[0] at address 100, addresses of A[1] to A[3] marked "?"]
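The address computation can be sketched as follows. The helper name element_address is hypothetical; it computes the byte address of element i by hand as base address plus i times the element size, so it can be compared with the address the compiler produces for &A[i].

```cpp
#include <cstddef>

// Hypothetical helper: byte address of element i of an int array,
// computed by hand as base + i * (bytes per element).
const char* element_address(const int* base, std::size_t i) {
    return reinterpret_cast<const char*>(base) + i * sizeof(int);
}

// True when the hand-computed address agrees with the compiler's &A[i].
bool matches_compiler(const int* A, std::size_t i) {
    return element_address(A, i) == reinterpret_cast<const char*>(&A[i]);
}
```

For an array declared as int A[4], matches_compiler(A, i) is expected to hold for every i from 0 to 3.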
Array Index Range
Index expression must be within
RANGE (0 to N-1 for A[N])
What if we have A[4] = 7 or A[-1] = 7 ?
When is the error detected?
Safety vs. Efficiency
Native types targeted at EFFICIENCY
If we need safety, we should build our
own types
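One way such a safer type might look is sketched below. SafeArray and write_is_rejected are hypothetical names, and throwing an exception is just one possible way to report the error; the check costs a comparison on every access, which is the efficiency being traded away.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical bounds-checked wrapper around a fixed-size int array.
template <std::size_t N>
class SafeArray {
public:
    // Checked access: throws instead of touching memory outside the array.
    int& at(long i) {
        if (i < 0 || static_cast<std::size_t>(i) >= N)
            throw std::out_of_range("index out of range");
        return data_[i];
    }
private:
    int data_[N] = {};
};

// Returns true if a write at index i into a 4-element array was rejected.
bool write_is_rejected(long i) {
    SafeArray<4> A;
    try {
        A.at(i) = 7;
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

Here write_is_rejected(4) and write_is_rejected(-1) report the error at run time, while index 3 is accepted.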
Initialisation
Local variables have garbage (undefined) value until explicit initialisation
int i;
cout << i; // prints garbage
Variables can be initialised when they are declared
int i = 0, j = 1;
equivalent to:
int i, j;
i = 0;
j = 1;
Variables in global scope must be initialised with constant expression (required in C; C++ also accepts other expressions)
initialisation done once, before program begins
int x = 0;
int y = x; /* error in C: x is not a constant expression */
int main ( ) {
int p = y;
int q = p + 1; /* OK */
}
Initialising Arrays
List of initialisers in {...}
Array size can be inferred from list size
int A [ ] = {10, 20, 30, 40};
equivalent to:
int A [4] = {10, 20, 30, 40};
int A [4] = {10, 20, 30, 40};
abbreviation for:
int A [4];
A [0] = 10;
A [1] = 20;
A [2] = 30;
A [3] = 40;
Too few initialisers: rest are zero
int A [4] = {10, 20};
equivalent to:
int A [4] = {10, 20, 0, 0};
Too many: error
int A [2] = {10, 20, 30}; /* error */
In general, INITIALISATION is NOT the same as ASSIGNMENT (details later)
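The two rules about list length can be checked with a short sketch. The names inferred, inferred_count and padded_element are hypothetical.

```cpp
#include <cstddef>

// Size inferred from the initialiser list: four elements.
int inferred[] = {10, 20, 30, 40};

// Element count of 'inferred', recovered with sizeof.
std::size_t inferred_count() {
    return sizeof(inferred) / sizeof(inferred[0]);
}

// Too few initialisers: A[2] and A[3] are set to zero, not left as garbage.
// The caller is assumed to pass an index between 0 and 3.
int padded_element(std::size_t i) {
    int A[4] = {10, 20};
    return A[i];
}
```

inferred_count() gives 4, and padded_element(2) and padded_element(3) both give 0.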
Strings in C++
string data type: array of characters
Terminated by \0
string s = "abc";
Stored as: a b c \0
Manipulation:
Copying a string (s = t)
Comparing strings (s == t)
Extracting characters (s[1] == 'b')
Concatenating/Appending (s + t)
others...later
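The listed manipulations map directly onto std::string operators. The sketch below wraps each one in a small helper; the helper names are hypothetical and exist only so that each operation can be tried in isolation.

```cpp
#include <string>

// Copying a string (s = t)
std::string copy_of(const std::string& t) {
    std::string s;
    s = t;
    return s;
}

// Comparing strings (s == t)
bool same(const std::string& s, const std::string& t) {
    return s == t;
}

// Extracting a character; assumes s has at least two characters.
char second_char(const std::string& s) {
    return s[1];
}

// Concatenating/Appending (s + t)
std::string joined(const std::string& s, const std::string& t) {
    return s + t;
}
```

For s = "abc", second_char(s) is 'b' and joined(s, "de") is "abcde".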
Operations with Arrays
Initialising an Array
for (i = 0; i < 5; i++)
A [i] = 0;
Sum of Array elements
sum = 0;
for (i = 0; i < 5; i++)
sum = sum + A [i];
[Figure: a 5-element array A with indices 0 to 4, for which sum = 25]
What does this loop do?
for (i = 0; i < 5; i++) {
t = A [i];
A [i] = A [4 - i];
A [4 - i] = t;
}
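Tracing the last loop by hand suggests that every pair of positions is swapped twice, so the array ends up as it started. The sketch below makes that easy to try; swap_first_half is a hypothetical variant, not from the slide, that stops at the midpoint and therefore reverses the array.

```cpp
#include <array>

using Five = std::array<int, 5>;

// The loop from the slide, run over all five positions.
Five swap_all_positions(Five A) {
    for (int i = 0; i < 5; i++) {
        int t = A[i];
        A[i] = A[4 - i];
        A[4 - i] = t;
    }
    return A;
}

// Hypothetical variant: same body, but stopping at the midpoint.
Five swap_first_half(Five A) {
    for (int i = 0; i < 5 / 2; i++) {
        int t = A[i];
        A[i] = A[4 - i];
        A[4 - i] = t;
    }
    return A;
}
```

Starting from 1 2 3 4 5, swap_all_positions returns 1 2 3 4 5 again, while swap_first_half returns 5 4 3 2 1.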
Searching in an Array
Given array A[N] and value V, is V present in the Array?
found = false;
for (i = 0; i < N; i++)
if (A[i] == V) {
found = true; // V is present
break;
}
How many comparisons?
best case
average case
worst case
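To explore the comparison counts, the search loop can be instrumented. This is a minimal sketch; SearchResult and linear_search are hypothetical names, and only the A[i] == V comparisons are counted.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: whether V was found, and how many
// element comparisons (A[i] == V) were made along the way.
struct SearchResult {
    bool found;
    std::size_t comparisons;
};

SearchResult linear_search(const std::vector<int>& A, int V) {
    SearchResult r{false, 0};
    for (std::size_t i = 0; i < A.size(); i++) {
        r.comparisons++;
        if (A[i] == V) {
            r.found = true;   // V is present
            break;
        }
    }
    return r;
}
```

The best case is 1 comparison (V is at A[0]); the worst case is N comparisons (V is last or absent).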
Prime Number Algorithm
Avoiding multiples of 2 and 3
Avoiding duplicate loop bodies
// n is assumed to have been tested against 2, 3 and 5 already
x = 6; root_n = sqrt (n);
prime = true;
while (x < root_n) {
if (n % (x+1) == 0)
{prime = false; break;}
if (n % (x+5) == 0)
{prime = false; break;}
x = x + 6;
}
cout << (prime ? "Prime" : "Not Prime");
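A complete test built around this loop might look like the sketch below. The handling of small n and of the factors 2, 3 and 5 is an assumption added here, and the loop condition uses (x+1)*(x+1) <= n in place of the slide's sqrt so that no floating-point value is needed.

```cpp
// Hypothetical helper: primality test that skips multiples of 2 and 3.
bool is_prime(long long n) {
    if (n < 2) return false;
    if (n == 2 || n == 3 || n == 5) return true;
    if (n % 2 == 0 || n % 3 == 0 || n % 5 == 0) return false;
    // Remaining candidate divisors have the form 6k+1 and 6k+5.
    for (long long x = 6; (x + 1) * (x + 1) <= n; x += 6) {
        if (n % (x + 1) == 0) return false;
        if (n % (x + 5) == 0) return false;
    }
    return true;
}
```

For example, is_prime(97) is true, while is_prime(121) and is_prime(169) are false.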
Prime Number Algorithm
Avoiding duplicate loop bodies
x = 6; root_n = sqrt (n);
prime = true;
while (x < root_n) {
if (n % (x+1) == 0)
{prime = false; break;}
if (n % (x+5) == 0)
{prime = false; break;}
x = x + 6;
}
cout << (prime ? "Prime" : "Not Prime");
With a single loop body:
x = ???;
root_n = sqrt (n);
prime = true;
while (x < root_n) {
x = ???
if (n % x == 0)
{prime = false; break;}
}
Prime Numbers: Sieve of Eratosthenes
Generalise earlier strategy
Avoid multiples of ALL smaller primes
Algorithm:
For 2 <= x <= n
...
Algorithm
Array of [2..n]
Initialise:
All numbers UNCROSSED
x = 2
WHILE (x <= n)
Proceed to next uncrossed number x. This is a PRIME
CROSS all multiples of x
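The steps above can be sketched as follows. The function name sieve and the use of a vector of flags for the crossed/uncrossed state are assumptions made for the illustration.

```cpp
#include <vector>

// Hypothetical helper: returns the primes in [2..n] by crossing out multiples.
std::vector<int> sieve(int n) {
    std::vector<int> primes;
    if (n < 2) return primes;
    std::vector<bool> crossed(n + 1, false);   // all numbers UNCROSSED
    for (int x = 2; x <= n; x++) {
        if (crossed[x]) continue;              // proceed to next uncrossed number
        primes.push_back(x);                   // uncrossed, so x is a PRIME
        for (long long m = 2LL * x; m <= n; m += x)
            crossed[m] = true;                 // CROSS all multiples of x
    }
    return primes;
}
```

The flag array has n + 1 entries, which is the space cost that grows with n.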
Problem with Algorithm
Too much space
Need array of size ~ n
Let us keep array of prime numbers
Maintaining Array of Primes
Append to p when we find new prime
p [0] = 2;
p [1] = 3;
p [2] = 5;
limit = 2; plim_square = 5*5;
if (x >= plim_square) {
limit = limit + 1;
plim_square = p[limit] * p[limit];
}
[Figure: Array p of Primes, indices 0 to 4, holding 2 ... 11, with markers limit and last_prime]
Checking for Prime
Check if x is divisible by any number in p
j = 2;
prime = true;
while (prime && j < limit) {
if (x % p[j] == 0)
prime = false;
j = j + 1;
}
Too Many Divisions
MOD operator (%) is expensive
Same as division
Try reducing divisions
Consider a given p[j] = 31
[Figure: array p of primes with indices 0 to 11; the entries shown are 11 13 17 19 23 29 31 37, so p[10] = 31]
Suppose we just checked x = 1643 (multiple of 31)
After this, which multiple of 31 will show up as a candidate x?
Next Expected Multiple
No need to check any x between 1643 and 1643+62 for divisibility by 31
(1643+31 = 1674 is even, so it never shows up as a candidate x)
For every p[j], we can store the next relevant
multiple
Compare x with p[j]
if x is greater, update the multiple
if x is smaller?
if x is equal?
Anticipating the Next Multiple
New array multiple[]
multiple[j] contains the next anticipated multiple of p[j]
if not reached, then x is not divisible by p[j]
j = 2;
prime = true;
while (prime && j < limit) {
while (multiple [j] < x)
multiple [j] += p[j] * 2;
if (x == multiple [j])
prime = false;
j = j + 1;
}
Overall Algorithm
Initialise
While (x < n)
get next x
if (x >= square of limit prime)
Update limit
loop through prime multiples
Update multiples
Prime Test: compare with current multiple
if (x is Prime)
Append x to array
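One possible way to assemble these pieces is sketched below. Several details are assumptions, not taken from the slides: the function name primes_below, starting multiple[j] at the square of p[j], and generating candidates x that alternate steps of 2 and 4 so that multiples of 2 and 3 never appear.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: all primes below n, using the arrays p and multiple.
std::vector<long long> primes_below(long long n) {
    std::vector<long long> p;         // primes found so far
    std::vector<long long> multiple;  // multiple[j]: next anticipated multiple of p[j]
    for (long long s : {2LL, 3LL, 5LL}) {
        if (s < n) { p.push_back(s); multiple.push_back(s * s); }
    }
    if (p.size() < 3) return p;       // n <= 5: nothing more to find

    std::size_t limit = 2;            // only p[2] .. p[limit-1] need testing
    long long plim_square = 5 * 5;
    long long x = 5, step = 2;        // candidates 7, 11, 13, 17, ...

    while (true) {
        x += step;                    // get next x
        step = 6 - step;
        if (x >= n) break;
        if (x >= plim_square) {       // update limit
            limit = limit + 1;
            plim_square = p[limit] * p[limit];
        }
        bool prime = true;
        for (std::size_t j = 2; prime && j < limit; j++) {
            while (multiple[j] < x)   // update the anticipated multiple
                multiple[j] += p[j] * 2;
            if (x == multiple[j])     // prime test: compare with current multiple
                prime = false;
        }
        if (prime) {                  // append x to array
            p.push_back(x);
            multiple.push_back(x * x);
        }
    }
    return p;
}
```

No % operation appears in the main loop; divisibility is detected only by comparing x with the stored multiples.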
Searching a Sorted Array
Assume we have a sorted array
values are in increasing (decreasing) order
Search for a given value V
Possible to search using fewer than 5 comparisons (worst case)
Binary Search
Divide and Conquer strategy
at every stage, we reduce the size of the problem to half the earlier stage
Strategy: Compare with the middle element of current range, and eliminate half of the range
// Algorithm Binary Search
int A[N], V, low, high, mid; // Search for V in A[ ]
...
low = 0; high = N-1;
do {
// Calculate mid position of remaining array
mid = (low + high)/2;
if (A[mid] == V) break; // found!
if (V > A [mid])
low = mid + 1;
else
high = mid - 1;
} while (low <= high);
cout << (A[mid] == V ? "yes" : "no");
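A self-contained variant of the algorithm is sketched below. The function name binary_search_index and the convention of returning -1 when V is absent are assumptions; the sketch also uses a while loop instead of do-while, so that an empty array is handled without reading any element.

```cpp
#include <vector>

// Hypothetical helper: index of V in the sorted (increasing) vector A,
// or -1 if V is not present.
int binary_search_index(const std::vector<int>& A, int V) {
    int low = 0;
    int high = static_cast<int>(A.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  // same midpoint as (low + high)/2, without overflow
        if (A[mid] == V) return mid;       // found
        if (V > A[mid])
            low = mid + 1;                 // eliminate the lower half
        else
            high = mid - 1;                // eliminate the upper half
    }
    return -1;
}
```

Each iteration halves the remaining range, so the number of iterations grows with the logarithm of the array size.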
Sorting an Array
Rearranging array contents in increasing or decreasing order
[Figure: an array A before and after sorting in increasing order]
How do we sort?
Simple Sorting Algorithm
[Figure: array A with positions A[0], A[i], A[i+1] and A[N-1] marked]
for (i = 0; i < N; i++) {
k = position of min. element between A [i] and A [N-1]
Swap A [i] and A [k]
}
Simple Sorting Algorithm
for (i = 0; i < N; i++) {
k = position of min. element between A [i] and A [N-1]
Swap A [i] and A [k]
}
Finding k:
k = i;
for (j = i+1; j < N; j++)
if (A[j] < A [k])
k = j;
Swapping A [i] and A [k]:
t = A [ i ];
A [ i ] = A [ k ];
A [ k ] = t;
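Putting the two refinements back into the outer loop gives the following sketch. The function name selection_sort and the use of std::vector are assumptions for the illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sorts A in increasing order, in place.
void selection_sort(std::vector<int>& A) {
    const std::size_t N = A.size();
    for (std::size_t i = 0; i < N; i++) {
        // k = position of min. element between A[i] and A[N-1]
        std::size_t k = i;
        for (std::size_t j = i + 1; j < N; j++)
            if (A[j] < A[k])
                k = j;
        // Swap A[i] and A[k]
        int t = A[i];
        A[i] = A[k];
        A[k] = t;
    }
}
```

After iteration i of the outer loop, A[0] to A[i] hold the i+1 smallest values in increasing order.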
Matrix Multiplication
Input - 4x4 matrices A and B
Output - Product matrix C = A x B
[Figure: row i of A (entries i,1 to i,4) and column j of B (entries 1,j to 4,j) combine to give entry i,j of C]
$C_{i,j}$ = Dot Product (Row i of A, Column j of B)
Matrix Multiplication Formulation
[Figure: entry i,j of C built from the terms k = 1 to k = 4 of row i of A and column j of B]
$C_{i,j} = \sum_{k=1}^{4} A_{i,k} B_{k,j}$
Representing a 2-Dim Matrix
Declare 2-dimensional array:
float A [4] [4];
ARRAY of ARRAYS
Both dimensions start at 0 and end at 3
Generalises to n dimensions
int B [3][4][6];
3-dim array. Can represent 3x4x6 matrix.
      0    1    2    3
0   0,0  0,1  0,2  0,3
1   1,0  1,1  1,2  1,3
2   2,0  2,1  2,2  2,3
3   3,0  3,1  3,2  3,3
$A_{i,j}$ stored in array location A[i][j]
Dot Product Algorithm
Range 0-3 instead of 1-4
Specification (here n = 4)
$C_{i,j} = \sum_{k=0}^{n-1} A_{i,k} B_{k,j}$
Implementation
c[i][j] = 0;
for (k = 0; k < 4; k++) {
c[i][j] += a[i][k] * b[k][j];
}
Matrix Multiplication Algorithm
$C_{i,j} = \sum_{k=0}^{n-1} A_{i,k} B_{k,j}$
int main()
{
float a[4][4], b[4][4], c[4][4];
int i, j, k;
... /* read values into arrays */
for (i = 0; i < 4; i++)
for (j = 0; j < 4; j++) {
// DotProduct of row i of a and column j of b
c[i][j] = 0;
for (k = 0; k < 4; k++) {
c[i][j] += a[i][k] * b[k][j];
}
}
}
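The same triple loop can be packaged as a function so that it is easy to try on known inputs. This is a minimal sketch; the names Matrix and multiply, and the use of std::array in place of the built-in float [4][4], are assumptions.

```cpp
#include <array>

constexpr int N = 4;
using Matrix = std::array<std::array<float, N>, N>;

// Hypothetical helper: returns c = a x b for 4x4 matrices.
Matrix multiply(const Matrix& a, const Matrix& b) {
    Matrix c{};
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            c[i][j] = 0;
            for (int k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];   // dot product of row i and column j
        }
    return c;
}
```

Multiplying any matrix by the identity matrix should return that matrix unchanged, which gives a quick sanity check.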
Storing Multi-dimensional Arrays
ROW-MAJOR order
elements of a row are contiguous in memory
followed in C/C++
Alternatively, COLUMN-MAJOR order
elements of a column are contiguous
      0    1    2    3
0   0,0  0,1  0,2  0,3
1   1,0  1,1  1,2  1,3
2   2,0  2,1  2,2  2,3
3   3,0  3,1  3,2  3,3
Memory (row-major), in increasing address order:
A [0][0], A [0][1], A [0][2], A [0][3],
A [1][0], A [1][1], A [1][2], A [1][3],
A [2][0], A [2][1], A [2][2], A [2][3],
A [3][0], A [3][1], A [3][2], A [3][3]
Accessing Multi-dimensional Arrays
y = A[i][j]
First compute address of A[i][j]
Read data at this address
int A[N][N];
Address of A[i][j] =
Addr_A + sizeof(int) x (N x i + j)
where Addr_A is the address of A [0][0]
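The address formula can be checked against the compiler's own computation. In this sketch N is fixed at 4, and address_by_formula and formula_matches are hypothetical names.

```cpp
#include <cstddef>

constexpr std::size_t N = 4;
int A[N][N];

// Address of A[i][j] computed by hand: Addr_A + sizeof(int) x (N x i + j).
const char* address_by_formula(std::size_t i, std::size_t j) {
    const char* addr_A = reinterpret_cast<const char*>(&A[0][0]);
    return addr_A + sizeof(int) * (N * i + j);
}

// True when the formula agrees with the address the compiler uses for A[i][j].
bool formula_matches(std::size_t i, std::size_t j) {
    return address_by_formula(i, j) == reinterpret_cast<const char*>(&A[i][j]);
}
```

Because C/C++ stores the array in row-major order, formula_matches(i, j) is expected to hold for all valid i and j.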
Can we improve the MM code?
int main()
{
float a[4][4], b[4][4], c[4][4];
int i, j, k;
... /* read values into arrays */
for (i = 0; i < 4; i++)
for (j = 0; j < 4; j++) {
c[i][j] = 0;
for (k = 0; k < 4; k++) {
c[i][j] += a[i][k] * b[k][j];
}
}
}
Improve the MM code
Is the += useful?
avoids repeated computation of the address of c[i][j]
in general, some complex analysis is necessary to establish that two expressions c[i][j] are identical
Address of c[i][j] unchanged in inner loop
Replace c[i][j] by scalar
avoids multiple address computations and memory accesses
Better Version
Original:
int main()
{
float a[4][4], b[4][4], c[4][4];
int i, j, k;
... /* read values into arrays */
for (i = 0; i < 4; i++)
for (j = 0; j < 4; j++) {
c[i][j] = 0;
for (k = 0; k < 4; k++) {
c[i][j] += a[i][k] * b[k][j];
}
}
}
Better, with scalar t:
int main()
{
float a[4][4], b[4][4], c[4][4], t;
int i, j, k;
... /* read values into arrays */
for (i = 0; i < 4; i++)
for (j = 0; j < 4; j++) {
t = 0;
for (k = 0; k < 4; k++) {
t += a[i][k] * b[k][j];
}
c[i][j] = t;
}
}
Which is Better?
Functionally equivalent code may not be equally efficient!
for (i = 0; i < N; i++)
for (j = 0; j < N; j++)
A[i][j] = 10;
versus
for (i = 0; i < N; i++)
for (j = 0; j < N; j++)
A[j][i] = 10;
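One way to compare the two loops is to look at how far apart consecutive accesses of the inner loop land in memory. The sketch below measures that distance; N = 8 and the function names are hypothetical. On hardware with caches, the loop whose accesses are neighbours in memory is generally expected to run faster, though this is a general expectation and not a measurement.

```cpp
#include <cstddef>

constexpr std::size_t N = 8;   // hypothetical size
int A[N][N];

// Distance, in elements, between two locations inside A.
std::ptrdiff_t elements_between(const int* from, const int* to) {
    const char* f = reinterpret_cast<const char*>(from);
    const char* t = reinterpret_cast<const char*>(to);
    return (t - f) / static_cast<std::ptrdiff_t>(sizeof(int));
}

// Inner loop over j with A[i][j]: consecutive accesses are neighbours.
std::ptrdiff_t stride_ij() {
    return elements_between(&A[0][0], &A[0][1]);
}

// Inner loop over j with A[j][i]: consecutive accesses are a whole row apart.
std::ptrdiff_t stride_ji() {
    return elements_between(&A[0][0], &A[1][0]);
}
```

With row-major storage, stride_ij() is 1 element and stride_ji() is N elements.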