
Mathematics For Information Technology

The document titled 'Mathematics for Computer Engineers' covers various mathematical concepts essential for computer engineering, including numerical linear algebra, quadratic forms, inner product spaces, and complex analysis. It provides foundational knowledge and methods such as Gauss elimination, eigenvalues, and complex functions. The content is structured into sections and subsections, detailing specific topics and techniques relevant to the field.

Uploaded by

PrashansaBhatia
Copyright
© © All Rights Reserved

Mathematics for Computer Engineers (MCE)

Rupak R. Gupta (RRG)


Aaditya Joil (Jojo)

2024
Contents

1 Numerical Linear Algebra
    1.1 Foundations
        1.1.1 Elementary Row Operations
        1.1.2 Matrix Augmentation
        1.1.3 Eigenvalues and Eigenvectors
        1.1.4 Scalar Product
    1.2 Linear System of Equations
    1.3 Direct Methods
        1.3.1 Gauss Elimination
        1.3.2 Gauss-Jordan Elimination
        1.3.3 LU Decomposition
    1.4 Indirect Methods
        1.4.1 Gauss-Jacobi Method
        1.4.2 Gauss-Seidel Method
        1.4.3 Power method to find Eigenvalues

2 Quadratic Forms
    2.1 Foundations
        2.1.1 Rank of a Matrix
        2.1.2 Column Space
        2.1.3 Null Space
        2.1.4 Similarity and Diagonalisability
    2.2 Quadratic Forms
        2.2.1 Associated matrices for common vector spaces
        2.2.2 Canonical Form
        2.2.3 Definiteness of Eigenvalues
    2.3 Principal Axes Theorem
    2.4 Cholesky’s Factorization
    2.5 Constraint Optimisation

3 Inner Product Spaces
    3.1 Foundations
        3.1.1 Vectors
        3.1.2 Vector Spaces
        3.1.3 The vector space R^n
        3.1.4 Length of a vector
    3.2 Inner Product
        3.2.1 Rules for Inner Products
        3.2.2 Inner Product Space
    3.3 A few common Inner Product Spaces
        3.3.1 Inner Product for the vector space C^n
        3.3.2 Inner Product for the vector space of continuous real-valued functions
        3.3.3 Inner Product for the vector space of M_{m×n}(C)
        3.3.4 Inner Product for the vector space of P_n[X]
    3.4 Norm (magnitude)
        3.4.1 Unit Vector
    3.5 Orthogonality
        3.5.1 Orthogonal Set
        3.5.2 Orthogonal Basis
        3.5.3 Orthonormal Basis
        3.5.4 Orthogonal Complement
    3.6 Orthogonal Projections
        3.6.1 Orthogonal Projection on a Subspace
    3.7 Gram-Schmidt Orthogonalisation Process
        3.7.1 QR-Factorisation
        3.7.2 Best Approximation of a Vector
        3.7.3 Least Square Method

4 Complex Analysis
    4.1 Foundations
        4.1.1 Imaginary Unit
        4.1.2 Complex Numbers
        4.1.3 Euler’s Formula
        4.1.4 Argand Plane
        4.1.5 Complex Conjugate
        4.1.6 Equality of Complex Numbers
        4.1.7 Powers of Complex Numbers
        4.1.8 Roots of Complex Numbers
    4.2 Complex Functions
        4.2.1 Limits
        4.2.2 Continuity
        4.2.3 Differentiability
        4.2.4 Holomorphic Functions
        4.2.5 Analytic Functions
        4.2.6 Entire Functions
        4.2.7 Harmonic Functions
        4.2.8 Milne-Thomson’s Method
        4.2.9 Bijectivity of Complex Functions
        4.2.10 Some Special Mappings
        4.2.11 Orthogonal Trajectories
    4.3 Conformal Mapping
        4.3.1 Cross Ratio
        4.3.2 Inverse Möbius Transformation
        4.3.3 Fixed Points
    4.4 Complex Integration
        4.4.1 Line Integral
        4.4.2 Cauchy’s Theorem
        4.4.3 Cauchy’s Integral Formula
        4.4.4 Generalized Cauchy’s Integral Formula

5 Elementary Number Theory
    5.1 Foundations
        5.1.1 Integers
        5.1.2 Divisibility
        5.1.3 Prime Numbers
    5.2 Modular Arithmetic
        5.2.1 Modulus Operator
        5.2.2 Properties of Divisibility
        5.2.3 Division Algorithm
    5.3 Greatest Common Divisor
        5.3.1 Properties of the GCD
        5.3.2 Euclidean Algorithm
    5.4 Linear Diophantine Equations
    5.5 Fundamental Theorem of Arithmetic
    5.6 Theory of Congruences
        5.6.1 Properties of Congruence
        5.6.2 Residue Classes
        5.6.3 Solution of Linear Congruences
        5.6.4 Chinese Remainder Theorem (CRT)
        5.6.5 System of Linear Congruences
        5.6.6 Linear Congruences in Two Variables

Acronyms

Chapter 1
Numerical Linear Algebra

1.1 Foundations
Numerical Linear Algebra (NLA) is the study of numerical methods that find approximate solutions
to real-world linear algebra problems in a practical amount of time.

1.1.1 Elementary Row Operations


The elementary row operations that can be performed on a matrix are as follows:
• Scaling: A row $R_i$ can be scaled by a non-zero scalar $\lambda$ as $R_i \to \lambda R_i$. This scales the determinant
of the matrix by $\lambda$.
• Vector addition: A row vector $R_j$ scaled by a scalar $\lambda$ can be added to a different row vector $R_i$
as $R_i \to R_i + \lambda R_j$. This has no effect on the determinant of the matrix.
• Swap: Two rows $R_i$ and $R_j$ can be swapped as $R_i \leftrightarrow R_j$. This negates the determinant
of the matrix.
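
The determinant effects listed above can be checked numerically. The following is a minimal sketch in Python; the helper names (`det3`, `scale_row`, `add_multiple`, `swap_rows`) are illustrative assumptions and not part of these notes. The sample matrix is the coefficient matrix used in Q.1 below.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def scale_row(m, i, lam):
    """R_i -> lam * R_i (returns a new matrix)."""
    return [[lam * x for x in row] if r == i else list(row)
            for r, row in enumerate(m)]


def add_multiple(m, i, j, lam):
    """R_i -> R_i + lam * R_j (returns a new matrix)."""
    return [[x + lam * y for x, y in zip(row, m[j])] if r == i else list(row)
            for r, row in enumerate(m)]


def swap_rows(m, i, j):
    """R_i <-> R_j (returns a new matrix)."""
    out = [list(row) for row in m]
    out[i], out[j] = out[j], out[i]
    return out


A = [[1, 2, 1], [2, 3, 3], [3, -1, 2]]
d = det3(A)

scaled = det3(scale_row(A, 0, 5))        # determinant is multiplied by 5
added = det3(add_multiple(A, 2, 0, -3))  # determinant is unchanged
swapped = det3(swap_rows(A, 0, 1))       # determinant changes sign
```

Integer arithmetic is used here so that the three comparisons are exact.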

Row Echelon Form

Transforming a matrix into an upper triangular matrix by elementary row operations produces
what is called the “row echelon form” of the matrix.
For example,

\[
\begin{bmatrix}
a_{11} & \cdots & a_{1i} & \cdots & a_{1n} \\
a_{21} & \cdots & a_{2i} & \cdots & a_{2n} \\
\vdots & \ddots & \vdots & \ddots & \vdots \\
a_{n1} & \cdots & a_{ni} & \cdots & a_{nn}
\end{bmatrix}
\sim
\begin{bmatrix}
a'_{11} & \cdots & a'_{1i} & \cdots & a'_{1n} \\
0 & \cdots & a'_{2i} & \cdots & a'_{2n} \\
\vdots & \ddots & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & \cdots & a'_{nn}
\end{bmatrix}
\]

Pivot elements: the first non-zero element of each row of a matrix in row echelon form. In
the above example, $a'_{11}, a'_{2i}, \dots, a'_{nn}$ are the pivot elements.

1.1.2 Matrix Augmentation


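An augmented matrix $A|B$ is the matrix obtained when the columns of a matrix $B$ are appended to the right of
a matrix $A$ that has the same number of rows.

For example, consider
\[
A := \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \in \mathcal{M}_{2\times 3}(\mathbb{F})\,^{1}
\quad\text{and}\quad
B := \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \in \mathcal{M}_{2\times 2}(\mathbb{F}).
\]
The augmented matrix $A|B$ is given by,
\[
A|B = \left[\begin{array}{ccc|cc}
a_{11} & a_{12} & a_{13} & b_{11} & b_{12} \\
a_{21} & a_{22} & a_{23} & b_{21} & b_{22}
\end{array}\right] \in \mathcal{M}_{2\times 5}(\mathbb{F})
\]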

1.1.3 Eigenvalues and Eigenvectors


Eigenvectors

A matrix performs a linear transformation when it is multiplied with a vector. The vectors which
are only scaled but not rotated when multiplied by a given matrix are known as the eigenvectors
of that matrix.

Eigenvalues

The factor by which a given eigenvector is scaled on multiplication with its matrix is known as
the eigenvalue of the given matrix for that particular eigenvector.

\[ A x = \lambda x \tag{1.1} \]

Here $x$ is the eigenvector and $\lambda$ is the corresponding eigenvalue.

1.1.4 Scalar Product


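If vectors $u := \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix}^\top$ and $v := \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}^\top \in \mathbb{R}^n$, then their scalar
product (or dot product) is defined as,

\[ u \cdot v := u_1 v_1 + u_2 v_2 + \cdots + u_n v_n \tag{1.2} \]

Norm

The norm of a vector $v := \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}^\top \in \mathbb{R}^n$ is defined as:

\[ \|v\| := \sqrt{v \cdot v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2} \tag{1.3} \]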
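
Definitions (1.2) and (1.3) translate almost directly into code. The sketch below is illustrative only; the function names `dot` and `norm` are assumptions chosen for this example.

```python
import math


def dot(u, v):
    """Scalar product u . v = u1*v1 + ... + un*vn, as in (1.2)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(v):
    """Norm ||v|| = sqrt(v . v), as in (1.3)."""
    return math.sqrt(dot(v, v))


u = [1, 2, 3]
v = [4, 5, 6]
uv = dot(u, v)        # 1*4 + 2*5 + 3*6
length = norm([3, 4])  # sqrt(9 + 16)
```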

1.2 Linear System of Equations


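A system of equations is said to be linear if each equation is a linear combination of the
unknowns.
An $n \times n$ system is as follows:

\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\;\;\vdots \\
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= b_n
\end{aligned}
\]

The above system can be represented as $AX = B$, where $X := \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^\top$ is the vector of unknowns,

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix},
\quad A \in \mathcal{M}_{n\times n}(\mathbb{F})
\]

\[
\text{and}\quad
B = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix},
\quad B \in \mathbb{F}^n
\]

¹ $\mathbb{F}$ represents the scalar field, which may be $\mathbb{R}$ or $\mathbb{C}$.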

1.3 Direct Methods


Direct methods produce the exact solution in a finite number of steps (up to round-off error).
These methods are typically more expensive computationally.

1.3.1 Gauss Elimination


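In the Gaussian elimination method, we perform elementary row operations on the augmented matrix
$A|B$ to convert its coefficient part into an upper triangular matrix.

\[
\therefore\; A|B =
\left[\begin{array}{cccc|c}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} & b_n
\end{array}\right]
\sim
\left[\begin{array}{cccc|c}
a'_{11} & a'_{12} & \cdots & a'_{1n} & b'_1 \\
0 & a'_{22} & \cdots & a'_{2n} & b'_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & a'_{nn} & b'_n
\end{array}\right]
\]

Upon back-substitution, we get

\[ a'_{nn} x_n = b'_n \implies x_n = \frac{b'_n}{a'_{nn}} \]

\[ a'_{(n-1)(n-1)} x_{n-1} + a'_{(n-1)n} x_n = b'_{n-1} \quad\text{and so on.} \]

The time complexity of Gauss elimination is $O(2n^3/3)$ for a system of order $n$.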
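Q.1. Solve using Gauss elimination:

\[
\begin{aligned}
x + 2y + z &= 3 \\
2x + 3y + 3z &= 10 \\
3x - y + 2z &= 13
\end{aligned}
\]

The above system can be written as,

\[
\begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 3 \\ 3 & -1 & 2 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} 3 \\ 10 \\ 13 \end{bmatrix}
\]

By Gauss elimination method,

\[
\begin{aligned}
\left[\begin{array}{ccc|c} 1 & 2 & 1 & 3 \\ 2 & 3 & 3 & 10 \\ 3 & -1 & 2 & 13 \end{array}\right]
&\sim
\left[\begin{array}{ccc|c} 1 & 2 & 1 & 3 \\ 0 & -1 & 1 & 4 \\ 0 & -7 & -1 & 4 \end{array}\right]
&& \begin{pmatrix} R_2 \to R_2 - 2R_1 \\ R_3 \to R_3 - 3R_1 \end{pmatrix} \\
&\sim
\left[\begin{array}{ccc|c} 1 & 2 & 1 & 3 \\ 0 & -1 & 1 & 4 \\ 0 & 0 & -8 & -24 \end{array}\right]
&& \begin{pmatrix} R_3 \to R_3 - 7R_2 \end{pmatrix}
\end{aligned}
\]

Upon back-substitution, we get

\[
\begin{aligned}
-8z = -24 &\implies z = 3 \\
-y + z = 4 &\implies y = -1 \\
x + 2y + z = 3 &\implies x = 2
\end{aligned}
\]

Therefore, $(x, y, z) = (2, -1, 3)$. (Ans.)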
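
The procedure of Q.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `gauss_solve` is hypothetical, exact rational arithmetic (`fractions.Fraction`) is used so the result matches the hand computation, and a row swap is added only to step past a zero pivot, which the worked example above never needs.

```python
from fractions import Fraction


def gauss_solve(A, b):
    """Solve A x = b by Gauss elimination followed by back-substitution."""
    n = len(A)
    # Build the augmented matrix A|b with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]

    for col in range(n):
        # Assumption of this sketch: swap in a lower row if the pivot is zero.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot: R_r -> R_r - factor * R_col.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]

    # Back-substitution, starting from the last row.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


solution = gauss_solve([[1, 2, 1], [2, 3, 3], [3, -1, 2]], [3, 10, 13])
```

For floating-point input, partial pivoting (choosing the largest available pivot) is the usual safeguard; it is left out here to stay close to the hand method.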

1.3.2 Gauss-Jordan Elimination


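In Gauss-Jordan elimination, we perform elementary row operations on the augmented matrix
$A|B$ to convert its coefficient part into a diagonal matrix.

\[
\therefore\; A|B =
\left[\begin{array}{cccc|c}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} & b_n
\end{array}\right]
\sim
\left[\begin{array}{cccc|c}
a'_{11} & 0 & \cdots & 0 & b'_1 \\
0 & a'_{22} & \cdots & 0 & b'_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & a'_{nn} & b'_n
\end{array}\right]
\]

Since the coefficient part is diagonal, each unknown can be read off directly:

\[
\begin{aligned}
a'_{nn} x_n = b'_n &\implies x_n = \frac{b'_n}{a'_{nn}} \\
a'_{(n-1)(n-1)} x_{n-1} = b'_{n-1} &\implies x_{n-1} = \frac{b'_{n-1}}{a'_{(n-1)(n-1)}} \\
&\;\;\vdots \\
a'_{22} x_2 = b'_2 &\implies x_2 = \frac{b'_2}{a'_{22}} \\
a'_{11} x_1 = b'_1 &\implies x_1 = \frac{b'_1}{a'_{11}}
\end{aligned}
\]

The time complexity of Gauss-Jordan elimination is $O(2n^3/3)$ for a system of order $n$.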

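Q.2. Solve using Gauss-Jordan elimination:

\[
\begin{aligned}
x + 2y + z &= 3 \\
2x + 3y + 3z &= 10 \\
3x - y + 2z &= 13
\end{aligned}
\]

The above system can be written as,

\[
\begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 3 \\ 3 & -1 & 2 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} 3 \\ 10 \\ 13 \end{bmatrix}
\]

By Gauss-Jordan elimination,

\[
\begin{aligned}
\left[\begin{array}{ccc|c} 1 & 2 & 1 & 3 \\ 2 & 3 & 3 & 10 \\ 3 & -1 & 2 & 13 \end{array}\right]
&\sim
\left[\begin{array}{ccc|c} 1 & 2 & 1 & 3 \\ 0 & -1 & 1 & 4 \\ 0 & -7 & -1 & 4 \end{array}\right]
&& \begin{pmatrix} R_2 \to R_2 - 2R_1 \\ R_3 \to R_3 - 3R_1 \end{pmatrix} \\
&\sim
\left[\begin{array}{ccc|c} 1 & 2 & 1 & 3 \\ 0 & -1 & 1 & 4 \\ 0 & 0 & -8 & -24 \end{array}\right]
&& \begin{pmatrix} R_3 \to R_3 - 7R_2 \end{pmatrix} \\
&\sim
\left[\begin{array}{ccc|c} 8 & 16 & 0 & 0 \\ 0 & -8 & 0 & 8 \\ 0 & 0 & -8 & -24 \end{array}\right]
&& \begin{pmatrix} R_1 \to 8R_1 + R_3 \\ R_2 \to 8R_2 + R_3 \end{pmatrix} \\
&\sim
\left[\begin{array}{ccc|c} 8 & 0 & 0 & 16 \\ 0 & -8 & 0 & 8 \\ 0 & 0 & -8 & -24 \end{array}\right]
&& \begin{pmatrix} R_1 \to R_1 + 2R_2 \end{pmatrix}
\end{aligned}
\]

Reading off each row, we get

\[
\begin{aligned}
-8z = -24 &\implies z = 3 \\
-8y = 8 &\implies y = -1 \\
8x = 16 &\implies x = 2
\end{aligned}
\]

Therefore, $(x, y, z) = (2, -1, 3)$. (Ans.)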
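
A compact sketch of Gauss-Jordan elimination in Python is given below. It is illustrative only: the name `gauss_jordan_solve` is an assumption, exact fractions are used instead of the integer-preserving row scalings of the hand computation, and a row swap is included only to handle a zero pivot.

```python
from fractions import Fraction


def gauss_jordan_solve(A, b):
    """Solve A x = b by reducing the coefficient part of A|b to diagonal form."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]

    for col in range(n):
        # Assumption of this sketch: swap in a lower row if the pivot is zero.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Unlike Gauss elimination, clear the column above AND below the pivot.
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]

    # The coefficient part is now diagonal: x_i = b'_i / a'_ii.
    return [M[i][n] / M[i][i] for i in range(n)]


solution = gauss_jordan_solve([[1, 2, 1], [2, 3, 3], [3, -1, 2]], [3, 10, 13])
```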

1.3.3 LU Decomposition
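We have the linear system of equations $AX = B$. We can decompose the coefficient matrix $A$
as $A = LU$, where $L$ is a lower triangular matrix and $U$ is an upper triangular matrix.

The time complexity of LU decomposition is $O(n^3/3 + n^2)$ for a system of order $n$, which is
slightly better than Gauss elimination.

Therefore,

\[
\begin{aligned}
AX = B &\implies (LU)X = B \\
&\implies L(UX) = B \\
\therefore\; LY &= B, \quad\text{where } Y := UX
\end{aligned}
\]

\[
\implies
L \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}
\quad\text{and}\quad
U \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
\]

We first solve $LY = B$ for $Y$ by forward substitution and then solve $UX = Y$ for $X$ by back-substitution.

We can define the matrices $L$ and $U$ slightly differently depending on which of them has a unit leading diagonal, which
corresponds to two methods.

Crout’s Method

\[
A = LU, \quad\text{where }
L := \begin{bmatrix}
l_{11} & 0 & \cdots & 0 \\
l_{21} & l_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
l_{n1} & l_{n2} & \cdots & l_{nn}
\end{bmatrix}
\text{ and }
U := \begin{bmatrix}
1 & u_{12} & \cdots & u_{1n} \\
0 & 1 & \cdots & u_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
\]

Doolittle’s Method

\[
A = LU, \quad\text{where }
L := \begin{bmatrix}
1 & 0 & \cdots & 0 \\
l_{21} & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
l_{n1} & l_{n2} & \cdots & 1
\end{bmatrix}
\text{ and }
U := \begin{bmatrix}
u_{11} & u_{12} & \cdots & u_{1n} \\
0 & u_{22} & \cdots & u_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & u_{nn}
\end{bmatrix}
\]
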
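Q.3. Solve using LU decomposition:

\[
\begin{aligned}
2x + 3y + z &= -1 \\
5x + y + z &= 9 \\
3x + 2y + 4z &= 11
\end{aligned}
\]

The above system can be written as,

\[
\begin{bmatrix} 2 & 3 & 1 \\ 5 & 1 & 1 \\ 3 & 2 & 4 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} -1 \\ 9 \\ 11 \end{bmatrix},
\quad\text{or } AX = B.
\]

Using Crout’s method,

\[
\text{Let } A := LU, \text{ where }
L := \begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix}
\text{ and }
U := \begin{bmatrix} 1 & u_{12} & u_{13} \\ 0 & 1 & u_{23} \\ 0 & 0 & 1 \end{bmatrix}.
\]

By LU decomposition,

\[
\begin{aligned}
A &= LU \\
\therefore\;
\begin{bmatrix} 2 & 3 & 1 \\ 5 & 1 & 1 \\ 3 & 2 & 4 \end{bmatrix}
&=
\begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix}
\begin{bmatrix} 1 & u_{12} & u_{13} \\ 0 & 1 & u_{23} \\ 0 & 0 & 1 \end{bmatrix} \\
\therefore\;
\begin{bmatrix} 2 & 3 & 1 \\ 5 & 1 & 1 \\ 3 & 2 & 4 \end{bmatrix}
&=
\begin{bmatrix}
l_{11} & l_{11} u_{12} & l_{11} u_{13} \\
l_{21} & l_{21} u_{12} + l_{22} & l_{21} u_{13} + l_{22} u_{23} \\
l_{31} & l_{31} u_{12} + l_{32} & l_{31} u_{13} + l_{32} u_{23} + l_{33}
\end{bmatrix}
\end{aligned}
\]

Upon comparing like terms and evaluating,

\[
L = \begin{bmatrix} 2 & 0 & 0 \\ 5 & -13/2 & 0 \\ 3 & -5/2 & 40/13 \end{bmatrix}
\quad\text{and}\quad
U = \begin{bmatrix} 1 & 3/2 & 1/2 \\ 0 & 1 & 3/13 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Now,

\[
\begin{aligned}
AX = B &\implies (LU)X = B \implies L(UX) = B \\
\therefore\; LY &= B \quad \left(\text{where } Y = UX := \begin{bmatrix} a & b & c \end{bmatrix}^\top\right) \\
\therefore\;
\begin{bmatrix} 2 & 0 & 0 \\ 5 & -13/2 & 0 \\ 3 & -5/2 & 40/13 \end{bmatrix}
\begin{bmatrix} a \\ b \\ c \end{bmatrix}
&=
\begin{bmatrix} -1 \\ 9 \\ 11 \end{bmatrix}
\end{aligned}
\]

Upon forward substitution, we get

\[
\begin{aligned}
2a = -1 &\implies a = -1/2 \\
5a - (13/2)b = 9 &\implies b = -23/13 \\
3a - (5/2)b + (40/13)c = 11 &\implies c = 21/8
\end{aligned}
\]

\[
\implies Y = \begin{bmatrix} -\dfrac{1}{2} & -\dfrac{23}{13} & \dfrac{21}{8} \end{bmatrix}^\top = UX
\]

\[
\therefore\;
\begin{bmatrix} 1 & 3/2 & 1/2 \\ 0 & 1 & 3/13 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} -1/2 \\ -23/13 \\ 21/8 \end{bmatrix}
\]

Upon back-substitution, we get

\[
\begin{aligned}
z &= 21/8 \\
y + (3/13)z = -23/13 &\implies y = -19/8 \\
x + (3/2)y + (1/2)z = -1/2 &\implies x = 7/4
\end{aligned}
\]

Therefore, $(x, y, z) = \left(\dfrac{7}{4}, -\dfrac{19}{8}, \dfrac{21}{8}\right)$. (Ans.)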
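
The "compare like terms" step of Crout's method follows a fixed pattern: column $j$ of $L$, then row $j$ of $U$. The sketch below reproduces Q.3 with exact fractions. The names `crout`, `forward_sub` and `back_sub` are hypothetical helpers written for this illustration, and no pivoting is attempted, so a zero on the diagonal of $L$ is reported as an error.

```python
from fractions import Fraction


def crout(A):
    """Crout factorisation A = L U, with ones on the diagonal of U."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(0)] * n for _ in range(n)]
    U = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for j in range(n):
        # Column j of L, from the diagonal downwards.
        for i in range(j, n):
            L[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(j))
        if L[j][j] == 0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        # Row j of U, to the right of the unit diagonal.
        for i in range(j + 1, n):
            U[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(j))) / L[j][j]
    return L, U


def forward_sub(L, b):
    """Solve L y = b for lower triangular L."""
    y = []
    for i, row in enumerate(L):
        y.append((Fraction(b[i]) - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def back_sub(U, y):
    """Solve U x = y for upper triangular U."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))) / U[i][i]
    return x


A = [[2, 3, 1], [5, 1, 1], [3, 2, 4]]
B = [-1, 9, 11]
L, U = crout(A)
Y = forward_sub(L, B)   # intermediate vector Y = U X
X = back_sub(U, Y)      # final solution (x, y, z)
```

Once $L$ and $U$ are known, a new right-hand side $B$ only needs the two substitution passes, which is the practical appeal of the factorisation.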

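Q.4. Solve using Crout’s method as well as Doolittle’s method:

\[
\begin{aligned}
3x + 5y + 2z &= 8 \\
8y + 2z &= -7 \\
6x + 2y + 8z &= 26
\end{aligned}
\]

The above system can be written as,

\[
\begin{bmatrix} 3 & 5 & 2 \\ 0 & 8 & 2 \\ 6 & 2 & 8 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} 8 \\ -7 \\ 26 \end{bmatrix},
\quad\text{or } AX = B.
\]

Using Crout’s method,

\[
\text{Let } A = LU, \text{ where }
L := \begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix}
\text{ and }
U := \begin{bmatrix} 1 & u_{12} & u_{13} \\ 0 & 1 & u_{23} \\ 0 & 0 & 1 \end{bmatrix}.
\]

By LU decomposition,

\[
\begin{aligned}
A &= LU \\
\therefore\;
\begin{bmatrix} 3 & 5 & 2 \\ 0 & 8 & 2 \\ 6 & 2 & 8 \end{bmatrix}
&=
\begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix}
\begin{bmatrix} 1 & u_{12} & u_{13} \\ 0 & 1 & u_{23} \\ 0 & 0 & 1 \end{bmatrix} \\
\therefore\;
\begin{bmatrix} 3 & 5 & 2 \\ 0 & 8 & 2 \\ 6 & 2 & 8 \end{bmatrix}
&=
\begin{bmatrix}
l_{11} & l_{11} u_{12} & l_{11} u_{13} \\
l_{21} & l_{21} u_{12} + l_{22} & l_{21} u_{13} + l_{22} u_{23} \\
l_{31} & l_{31} u_{12} + l_{32} & l_{31} u_{13} + l_{32} u_{23} + l_{33}
\end{bmatrix}
\end{aligned}
\]

Upon comparing like terms and evaluating,

\[
L = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 8 & 0 \\ 6 & -8 & 6 \end{bmatrix}
\quad\text{and}\quad
U = \begin{bmatrix} 1 & 5/3 & 2/3 \\ 0 & 1 & 1/4 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Now,

\[
\begin{aligned}
AX = B &\implies (LU)X = B \implies L(UX) = B \\
\therefore\; LY &= B \quad \left(\text{where } Y = UX := \begin{bmatrix} a & b & c \end{bmatrix}^\top\right) \\
\therefore\;
\begin{bmatrix} 3 & 0 & 0 \\ 0 & 8 & 0 \\ 6 & -8 & 6 \end{bmatrix}
\begin{bmatrix} a \\ b \\ c \end{bmatrix}
&=
\begin{bmatrix} 8 \\ -7 \\ 26 \end{bmatrix}
\end{aligned}
\]

Upon forward substitution, we get

\[
\begin{aligned}
3a = 8 &\implies a = 8/3 \\
8b = -7 &\implies b = -7/8 \\
6a - 8b + 6c = 26 &\implies c = 1/2
\end{aligned}
\]

\[
\implies Y = \begin{bmatrix} \dfrac{8}{3} & -\dfrac{7}{8} & \dfrac{1}{2} \end{bmatrix}^\top = UX
\]

\[
\therefore\;
\begin{bmatrix} 1 & 5/3 & 2/3 \\ 0 & 1 & 1/4 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} 8/3 \\ -7/8 \\ 1/2 \end{bmatrix}
\]

Upon back-substitution, we get

\[
\begin{aligned}
z &= 1/2 \\
y + (1/4)z = -7/8 &\implies y = -1 \\
x + (5/3)y + (2/3)z = 8/3 &\implies x = 4
\end{aligned}
\]

Using Doolittle’s method,

\[
\text{Let } A = LU, \text{ where }
L := \begin{bmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{bmatrix}
\text{ and }
U := \begin{bmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{bmatrix}.
\]

By LU decomposition,

\[
\begin{aligned}
A &= LU \\
\therefore\;
\begin{bmatrix} 3 & 5 & 2 \\ 0 & 8 & 2 \\ 6 & 2 & 8 \end{bmatrix}
&=
\begin{bmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{bmatrix}
\begin{bmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{bmatrix} \\
\therefore\;
\begin{bmatrix} 3 & 5 & 2 \\ 0 & 8 & 2 \\ 6 & 2 & 8 \end{bmatrix}
&=
\begin{bmatrix}
u_{11} & u_{12} & u_{13} \\
l_{21} u_{11} & l_{21} u_{12} + u_{22} & l_{21} u_{13} + u_{23} \\
l_{31} u_{11} & l_{31} u_{12} + l_{32} u_{22} & l_{31} u_{13} + l_{32} u_{23} + u_{33}
\end{bmatrix}
\end{aligned}
\]

Upon comparing like terms and evaluating,

\[
L = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & -1 & 1 \end{bmatrix}
\quad\text{and}\quad
U = \begin{bmatrix} 3 & 5 & 2 \\ 0 & 8 & 2 \\ 0 & 0 & 6 \end{bmatrix}.
\]

Now,

\[
\begin{aligned}
AX = B &\implies (LU)X = B \implies L(UX) = B \\
\therefore\; LY &= B \quad \left(\text{where } Y = UX := \begin{bmatrix} a & b & c \end{bmatrix}^\top\right) \\
\therefore\;
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & -1 & 1 \end{bmatrix}
\begin{bmatrix} a \\ b \\ c \end{bmatrix}
&=
\begin{bmatrix} 8 \\ -7 \\ 26 \end{bmatrix}
\end{aligned}
\]

Upon forward substitution, we get

\[
\begin{aligned}
a &= 8 \\
b &= -7 \\
2a - b + c = 26 &\implies c = 3
\end{aligned}
\]

\[
\implies Y = \begin{bmatrix} 8 & -7 & 3 \end{bmatrix}^\top = UX
\]

\[
\therefore\;
\begin{bmatrix} 3 & 5 & 2 \\ 0 & 8 & 2 \\ 0 & 0 & 6 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} 8 \\ -7 \\ 3 \end{bmatrix}
\]

Upon back-substitution, we get

\[
\begin{aligned}
6z = 3 &\implies z = 1/2 \\
8y + 2z = -7 &\implies y = -1 \\
3x + 5y + 2z = 8 &\implies x = 4
\end{aligned}
\]

Therefore, $(x, y, z) = \left(4, -1, \dfrac{1}{2}\right)$. (Ans.)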
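
Doolittle's method differs from Crout's only in where the unit diagonal sits, so the computation proceeds row of $U$ first, then column of $L$. The following sketch, with the assumed helper names `doolittle` and `mat_mul`, reproduces the factors found in Q.4 and checks that their product gives back $A$. It does no pivoting.

```python
from fractions import Fraction


def doolittle(A):
    """Doolittle factorisation A = L U, with ones on the diagonal of L."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    U = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        # Row i of U, from the diagonal rightwards.
        for j in range(i, n):
            U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        if U[i][i] == 0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        # Column i of L, below the unit diagonal.
        for r in range(i + 1, n):
            L[r][i] = (A[r][i] - sum(L[r][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U


def mat_mul(P, Q):
    """Plain matrix product, used here only to verify L U = A."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


A = [[3, 5, 2], [0, 8, 2], [6, 2, 8]]
L, U = doolittle(A)
product = mat_mul(L, U)
```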

Cholesky’s Method

If $A$ is a symmetric and positive definite² matrix, then $A = LL^\top$, where

\[
L := \begin{bmatrix}
l_{11} & 0 & \cdots & 0 \\
l_{21} & l_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
l_{n1} & l_{n2} & \cdots & l_{nn}
\end{bmatrix}.
\]
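
The notes state the factorisation without an algorithm at this point. One common way to compute $L$ is to equate entries of $LL^\top$ with $A$ row by row; the sketch below does that. The function name `cholesky` and the sample matrix are assumptions made for illustration, not taken from these notes.

```python
import math


def cholesky(A):
    """Return lower triangular L with A = L L^T for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Contribution of the already computed columns.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


# Illustrative symmetric positive definite matrix (an assumption of this example).
A = [[4, 12, -16],
     [12, 37, -43],
     [-16, -43, 98]]
L = cholesky(A)
```

The check `d <= 0` doubles as a test for positive definiteness: the square roots exist exactly when the matrix qualifies.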

1.4 Indirect Methods


As seen above, the direct methods are computationally very expensive, so for most practical applications
we prefer indirect (iterative) methods, which approximate the solution to within a chosen tolerance.

1.4.1 Gauss-Jacobi Method


Consider the linear system of equations:

\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\;\;\vdots \\
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= b_n
\end{aligned}
\]

If the absolute value of each pivot (diagonal) coefficient is greater than the sum of the absolute values of the
other coefficients in the same row (that is, the matrix is strictly diagonally dominant), then the Gauss-Jacobi
iteration is guaranteed to converge and we can apply it to approximate the solution.

\[
\begin{aligned}
|a_{11}| &> |a_{12}| + |a_{13}| + \cdots + |a_{1n}| \\
|a_{22}| &> |a_{21}| + |a_{23}| + \cdots + |a_{2n}| \\
&\;\;\vdots \\
|a_{nn}| &> |a_{n1}| + |a_{n2}| + \cdots + |a_{n(n-1)}|
\end{aligned}
\]

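² A matrix $A$ is positive definite if $x^\top A x > 0$ for all non-zero $x \in \mathbb{R}^n$. In other words, all eigenvalues of the symmetric matrix $A$ are real and positive.

We shall start with an initial guess for the solution $\begin{bmatrix} x_1^{(0)} & x_2^{(0)} & \cdots & x_n^{(0)} \end{bmatrix}^\top$ and follow the
recurrence relation:

\[
\begin{aligned}
x_1^{(k+1)} &:= \frac{1}{a_{11}} \left( b_1 - a_{12} x_2^{(k)} - a_{13} x_3^{(k)} - \cdots - a_{1n} x_n^{(k)} \right) \\
x_2^{(k+1)} &:= \frac{1}{a_{22}} \left( b_2 - a_{21} x_1^{(k)} - a_{23} x_3^{(k)} - \cdots - a_{2n} x_n^{(k)} \right) \\
&\;\;\vdots \\
x_n^{(k+1)} &:= \frac{1}{a_{nn}} \left( b_n - a_{n1} x_1^{(k)} - a_{n2} x_2^{(k)} - \cdots - a_{n(n-1)} x_{n-1}^{(k)} \right)
\end{aligned}
\]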
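
The Gauss-Jacobi recurrence can be sketched in a few lines of Python. The function names `is_strictly_diagonally_dominant` and `jacobi` are assumptions of this example; the test system is the one solved by hand in Q.5 below. A fixed iteration count is used instead of a tolerance check to keep the sketch short.

```python
def is_strictly_diagonally_dominant(A):
    """True if |a_ii| > sum of |a_ij| (j != i) holds in every row."""
    return all(
        abs(row[i]) > sum(abs(a) for j, a in enumerate(row) if j != i)
        for i, row in enumerate(A)
    )


def jacobi(A, b, x0, iterations):
    """Gauss-Jacobi: every component of x^(k+1) uses only values from x^(k)."""
    n = len(A)
    x = list(x0)
    for _ in range(iterations):
        # The list comprehension reads the old x throughout, then replaces it.
        x = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
    return x


A = [[10, -5, -2], [4, -10, 3], [1, 6, 10]]
b = [3, -3, -3]
dominant = is_strictly_diagonally_dominant(A)
x2 = jacobi(A, b, [0.0, 0.0, 0.0], 2)   # second Jacobi iterate
```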

1.4.2 Gauss-Seidel Method


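The Gauss-Seidel algorithm is a refinement of the Gauss-Jacobi algorithm in which each variable
is computed from the most recent values available, including those already updated within the same iteration.

\[
\begin{aligned}
x_1^{(k+1)} &:= \frac{1}{a_{11}} \left( b_1 - a_{12} x_2^{(k)} - a_{13} x_3^{(k)} - \cdots - a_{1n} x_n^{(k)} \right) \\
x_2^{(k+1)} &:= \frac{1}{a_{22}} \left( b_2 - a_{21} x_1^{(k+1)} - a_{23} x_3^{(k)} - \cdots - a_{2n} x_n^{(k)} \right) \\
&\;\;\vdots \\
x_n^{(k+1)} &:= \frac{1}{a_{nn}} \left( b_n - a_{n1} x_1^{(k+1)} - a_{n2} x_2^{(k+1)} - \cdots - a_{n(n-1)} x_{n-1}^{(k+1)} \right)
\end{aligned}
\]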
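
In code, the only change from Gauss-Jacobi is that the solution vector is updated in place, so later components see the new values immediately. The following is a minimal sketch; the name `gauss_seidel` is an assumption and the test system is the one from Q.5 below.

```python
def gauss_seidel(A, b, x0, iterations):
    """Gauss-Seidel: x is overwritten component by component within a sweep."""
    n = len(A)
    x = list(x0)
    for _ in range(iterations):
        for i in range(n):
            # x[j] for j < i already holds the value from the current sweep.
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x


A = [[10, -5, -2], [4, -10, 3], [1, 6, 10]]
b = [3, -3, -3]
x1 = gauss_seidel(A, b, [0.0, 0.0, 0.0], 1)   # first Gauss-Seidel iterate
```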

Q.5. Solve using Gauss-Jacobi and Gauss-Seidel methods:

\[
\begin{aligned}
10x - 5y - 2z &= 3 \\
4x - 10y + 3z &= -3 \\
x + 6y + 10z &= -3
\end{aligned}
\]

We can observe that:

\[
\begin{aligned}
|10| &> |-5| + |-2| \\
|-10| &> |4| + |3| \\
|10| &> |1| + |6|
\end{aligned}
\]

Thus, the solution to the given system can be approximated using the Gauss-Jacobi method.

The recurrence relation would be of the form:

\[
\begin{aligned}
x^{(k+1)} &= \frac{1}{10} \left( 3 + 5y^{(k)} + 2z^{(k)} \right) \\
y^{(k+1)} &= \frac{1}{10} \left( 3 + 4x^{(k)} + 3z^{(k)} \right) \\
z^{(k+1)} &= -\frac{1}{10} \left( 3 + x^{(k)} + 6y^{(k)} \right)
\end{aligned}
\]

Let the initial approximation be $x^{(0)} := 0$, $y^{(0)} := 0$, $z^{(0)} := 0$.

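Using Gauss-Jacobi method,

Iteration 1

\[
\begin{aligned}
x^{(1)} &= \tfrac{1}{10} \left( 3 + 5y^{(0)} + 2z^{(0)} \right) = 0.3 \\
y^{(1)} &= \tfrac{1}{10} \left( 3 + 4x^{(0)} + 3z^{(0)} \right) = 0.3 \\
z^{(1)} &= -\tfrac{1}{10} \left( 3 + x^{(0)} + 6y^{(0)} \right) = -0.3
\end{aligned}
\]

Iteration 2

\[
\begin{aligned}
x^{(2)} &= \tfrac{1}{10} \left( 3 + 5y^{(1)} + 2z^{(1)} \right) = 0.39 \\
y^{(2)} &= \tfrac{1}{10} \left( 3 + 4x^{(1)} + 3z^{(1)} \right) = 0.33 \\
z^{(2)} &= -\tfrac{1}{10} \left( 3 + x^{(1)} + 6y^{(1)} \right) = -0.51
\end{aligned}
\]

Iteration 3

\[
\begin{aligned}
x^{(3)} &= \tfrac{1}{10} \left( 3 + 5y^{(2)} + 2z^{(2)} \right) = 0.363 \\
y^{(3)} &= \tfrac{1}{10} \left( 3 + 4x^{(2)} + 3z^{(2)} \right) = 0.303 \\
z^{(3)} &= -\tfrac{1}{10} \left( 3 + x^{(2)} + 6y^{(2)} \right) = -0.537
\end{aligned}
\]

Iteration 4

\[
\begin{aligned}
x^{(4)} &= \tfrac{1}{10} \left( 3 + 5y^{(3)} + 2z^{(3)} \right) = 0.3441 \\
y^{(4)} &= \tfrac{1}{10} \left( 3 + 4x^{(3)} + 3z^{(3)} \right) = 0.2841 \\
z^{(4)} &= -\tfrac{1}{10} \left( 3 + x^{(3)} + 6y^{(3)} \right) = -0.5181
\end{aligned}
\]

Iteration 5

\[
\begin{aligned}
x^{(5)} &= \tfrac{1}{10} \left( 3 + 5y^{(4)} + 2z^{(4)} \right) = 0.33843 \\
y^{(5)} &= \tfrac{1}{10} \left( 3 + 4x^{(4)} + 3z^{(4)} \right) = 0.28221 \\
z^{(5)} &= -\tfrac{1}{10} \left( 3 + x^{(4)} + 6y^{(4)} \right) = -0.50487
\end{aligned}
\]

Iteration 6

\[
\begin{aligned}
x^{(6)} &= \tfrac{1}{10} \left( 3 + 5y^{(5)} + 2z^{(5)} \right) \approx 0.3401 \\
y^{(6)} &= \tfrac{1}{10} \left( 3 + 4x^{(5)} + 3z^{(5)} \right) \approx 0.2839 \\
z^{(6)} &= -\tfrac{1}{10} \left( 3 + x^{(5)} + 6y^{(5)} \right) \approx -0.5032
\end{aligned}
\]

Therefore, our solution is $\begin{bmatrix} x & y & z \end{bmatrix}^\top \approx \begin{bmatrix} 0.3401 & 0.2839 & -0.5032 \end{bmatrix}^\top$
by Gauss-Jacobi in 6 iterations. (Ans.)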
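Using Gauss-Seidel method,

Iteration 1

\[
\begin{aligned}
x^{(1)} &= \tfrac{1}{10} \left( 3 + 5y^{(0)} + 2z^{(0)} \right) = 0.3 \\
y^{(1)} &= \tfrac{1}{10} \left( 3 + 4x^{(1)} + 3z^{(0)} \right) = 0.42 \\
z^{(1)} &= -\tfrac{1}{10} \left( 3 + x^{(1)} + 6y^{(1)} \right) = -0.582
\end{aligned}
\]

Iteration 2

\[
\begin{aligned}
x^{(2)} &= \tfrac{1}{10} \left( 3 + 5y^{(1)} + 2z^{(1)} \right) = 0.3936 \\
y^{(2)} &= \tfrac{1}{10} \left( 3 + 4x^{(2)} + 3z^{(1)} \right) = 0.28284 \\
z^{(2)} &= -\tfrac{1}{10} \left( 3 + x^{(2)} + 6y^{(2)} \right) = -0.509064
\end{aligned}
\]

Iteration 3

\[
\begin{aligned}
x^{(3)} &= \tfrac{1}{10} \left( 3 + 5y^{(2)} + 2z^{(2)} \right) = 0.3396072 \\
y^{(3)} &= \tfrac{1}{10} \left( 3 + 4x^{(3)} + 3z^{(2)} \right) = 0.28312368 \\
z^{(3)} &= -\tfrac{1}{10} \left( 3 + x^{(3)} + 6y^{(3)} \right) = -0.503834928
\end{aligned}
\]

Iteration 4

\[
\begin{aligned}
x^{(4)} &= \tfrac{1}{10} \left( 3 + 5y^{(3)} + 2z^{(3)} \right) \approx 0.3408 \\
y^{(4)} &= \tfrac{1}{10} \left( 3 + 4x^{(4)} + 3z^{(3)} \right) \approx 0.2852 \\
z^{(4)} &= -\tfrac{1}{10} \left( 3 + x^{(4)} + 6y^{(4)} \right) \approx -0.5052
\end{aligned}
\]

Therefore, our solution is $\begin{bmatrix} x & y & z \end{bmatrix}^\top \approx \begin{bmatrix} 0.3408 & 0.2852 & -0.5052 \end{bmatrix}^\top$
by Gauss-Seidel in 4 iterations. (Ans.)
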
1.4.3 Power method to find Eigenvalues
Dominant eigenvalue

Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be the $n$ eigenvalues of a square matrix of order $n$.

If $|\lambda_1| > |\lambda_i|$ for all $i \in \{2, \dots, n\}$, then $\lambda_1$ is called the dominant eigenvalue and its corresponding
eigenvector is called the dominant eigenvector.

Power method

Let $A$ be a square matrix of order $n$. Start with an initial guess: say $x_0$ is the first approximation
of the dominant eigenvector.
Let $x_0 := \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}^\top$. We’ll follow the recurrence relation:

\[ x_{k+1} := A x_k \;\xrightarrow{\text{Scale}}\; \frac{1}{m_{k+1}}\, x_{k+1}, \tag{1.4} \]

where $m_{k+1}$ is the magnitude of the largest-magnitude component of the vector $x_{k+1}$. We scale the
vector in each iteration so that the values do not explode and precision is maintained while using
floating point arithmetic.

Theorem 1. If $x$ is an eigenvector of a matrix $A$, then its corresponding eigenvalue is given by

\[ \lambda = \frac{Ax \cdot x}{x \cdot x} = \frac{Ax \cdot x}{\|x\|^2}. \]

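Q.6. Find the approximate dominant eigenvalue and eigenvector of the matrix

\[
A = \begin{bmatrix} 1 & 2 & 0 \\ -2 & 1 & 2 \\ 1 & 3 & 1 \end{bmatrix}.
\]

Consider the initial guess $x_0 := \begin{bmatrix} 1.00 & 1.00 & 1.00 \end{bmatrix}^\top$.

For the sake of our sanity, we will stay precise up to 2 decimal places.

\[
\therefore\; x_1 := A x_0 =
\begin{bmatrix} 1 & 2 & 0 \\ -2 & 1 & 2 \\ 1 & 3 & 1 \end{bmatrix}
\begin{bmatrix} 1.00 \\ 1.00 \\ 1.00 \end{bmatrix}
= \begin{bmatrix} 3.00 \\ 1.00 \\ 5.00 \end{bmatrix}
\xrightarrow{\text{Scale}}
\begin{bmatrix} 0.60 \\ 0.20 \\ 1.00 \end{bmatrix}
\]

\[
\therefore\; x_2 := A x_1 =
\begin{bmatrix} 1 & 2 & 0 \\ -2 & 1 & 2 \\ 1 & 3 & 1 \end{bmatrix}
\begin{bmatrix} 0.60 \\ 0.20 \\ 1.00 \end{bmatrix}
= \begin{bmatrix} 1.00 \\ 1.00 \\ 2.20 \end{bmatrix}
\xrightarrow{\text{Scale}}
\begin{bmatrix} 0.45 \\ 0.45 \\ 1.00 \end{bmatrix}
\]

\[
\therefore\; x_3 := A x_2 =
\begin{bmatrix} 1 & 2 & 0 \\ -2 & 1 & 2 \\ 1 & 3 & 1 \end{bmatrix}
\begin{bmatrix} 0.45 \\ 0.45 \\ 1.00 \end{bmatrix}
= \begin{bmatrix} 1.35 \\ 1.55 \\ 2.80 \end{bmatrix}
\xrightarrow{\text{Scale}}
\begin{bmatrix} 0.48 \\ 0.55 \\ 1.00 \end{bmatrix}
\approx
\underbrace{\begin{bmatrix} 0.50 \\ 0.50 \\ 1.00 \end{bmatrix}}_{\text{Dominant eigenvector}}
\]

The scaled iterates are settling towards $\begin{bmatrix} 0.50 & 0.50 & 1.00 \end{bmatrix}^\top$, which we take as the approximate dominant eigenvector. (Ans.)

From theorem 1, the dominant eigenvalue $\lambda$ is given by

\[
\begin{aligned}
\lambda &= \frac{Ax \cdot x}{x \cdot x} \\
&= \frac{
\left( \begin{bmatrix} 1 & 2 & 0 \\ -2 & 1 & 2 \\ 1 & 3 & 1 \end{bmatrix}
\begin{bmatrix} 0.50 \\ 0.50 \\ 1.00 \end{bmatrix} \right)
\cdot \begin{bmatrix} 0.50 \\ 0.50 \\ 1.00 \end{bmatrix}
}{
\begin{bmatrix} 0.50 \\ 0.50 \\ 1.00 \end{bmatrix} \cdot \begin{bmatrix} 0.50 \\ 0.50 \\ 1.00 \end{bmatrix}
}
= \frac{
\begin{bmatrix} 1.50 \\ 1.50 \\ 3.00 \end{bmatrix} \cdot \begin{bmatrix} 0.50 \\ 0.50 \\ 1.00 \end{bmatrix}
}{
\begin{bmatrix} 0.50 \\ 0.50 \\ 1.00 \end{bmatrix} \cdot \begin{bmatrix} 0.50 \\ 0.50 \\ 1.00 \end{bmatrix}
} \\
&= \frac{(1.50)(0.50) + (1.50)(0.50) + (3.00)(1.00)}{(0.50)^2 + (0.50)^2 + (1.00)^2} \\
&= \frac{0.75 + 0.75 + 3.00}{0.25 + 0.25 + 1.00} \\
&= \frac{4.50}{1.50} \implies \lambda \approx 3.00
\end{aligned}
\]

(Ans.)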
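
Running more iterations than is comfortable by hand shows the same limit. The sketch below applies recurrence (1.4) and then Theorem 1 to the matrix of Q.6; the names `mat_vec`, `dot` and `power_method` are assumptions of this example, and a fixed iteration count stands in for a convergence test.

```python
def mat_vec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    """Scalar product u . v."""
    return sum(ui * vi for ui, vi in zip(u, v))


def power_method(A, x0, iterations):
    """Return (eigenvalue estimate, scaled eigenvector estimate)."""
    x = list(x0)
    for _ in range(iterations):
        y = mat_vec(A, x)
        m = max(abs(c) for c in y)   # magnitude of the largest component
        if m == 0:
            raise ValueError("iterate collapsed to the zero vector")
        x = [c / m for c in y]       # scale so the largest magnitude is 1
    # Theorem 1: lambda = (A x . x) / (x . x)
    lam = dot(mat_vec(A, x), x) / dot(x, x)
    return lam, x


A = [[1, 2, 0], [-2, 1, 2], [1, 3, 1]]
lam, vec = power_method(A, [1.0, 1.0, 1.0], 40)
```

With 40 iterations the estimate agrees with $\lambda = 3$ and the eigenvector $(0.5, 0.5, 1)$ far beyond the two decimal places used above.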

Chapter 2
Quadratic Forms

2.1 Foundations
2.1.1 Rank of a Matrix
The rank of a matrix is the number of non-zero rows it has when it is in row echelon form.

\[
A = \begin{bmatrix}
\alpha_{11} & \alpha_{12} & \alpha_{13} & \alpha_{14} & \cdots & 0 \\
0 & \alpha_{22} & \alpha_{23} & \alpha_{24} & \cdots & 0 \\
0 & 0 & 0 & \alpha_{34} & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & \cdots & \alpha_{(n-1)(n)} \\
0 & 0 & 0 & 0 & \cdots & 0
\end{bmatrix}_{n \times n}
\]

The above matrix has $n - 1$ non-zero rows in its row echelon form and hence has rank
$n - 1$.
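
The definition suggests a direct computation: reduce to row echelon form and count the pivot rows. The following sketch does this with exact fractions so that no tolerance for "zero" is needed; the function name `rank` and the sample matrices are assumptions of this example.

```python
from fractions import Fraction


def rank(A):
    """Number of non-zero rows after reduction to row echelon form."""
    M = [[Fraction(x) for x in row] for row in A]
    rows = len(M)
    cols = len(M[0]) if M else 0
    r = 0  # index of the row that will receive the next pivot
    for c in range(cols):
        if r == rows:
            break
        # Look for a non-zero entry in column c at or below row r.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
    return r


full = rank([[1, 2, 1], [2, 3, 3], [3, -1, 2]])      # independent rows
deficient = rank([[1, 2, 3], [2, 4, 6], [1, 0, 1]])  # second row = 2 * first row
```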

2.1.2 Column Space


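The column space of a matrix $A$ is the subspace spanned by its columns. It is also known
as the image or the range of the matrix $A$.

\[ \operatorname{Col\,Space}(A) = \operatorname{Im}(A) = \{ A \cdot x \mid x \in V \} \]

where $A$ is a transformation from $V$ to $W$.

Why is the Image of A a subspace?

Consider a matrix $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$ and a vector $x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$.

The vector $A \cdot x$ is given as

\[
A \cdot x =
\begin{bmatrix}
a_{11} x_1 + a_{12} x_2 + a_{13} x_3 \\
a_{21} x_1 + a_{22} x_2 + a_{23} x_3 \\
a_{31} x_1 + a_{32} x_2 + a_{33} x_3
\end{bmatrix}
= x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix}
+ x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ a_{32} \end{bmatrix}
+ x_3 \begin{bmatrix} a_{13} \\ a_{23} \\ a_{33} \end{bmatrix},
\]

which is a linear combination of the columns of $A$. The set of all vectors $A \cdot x$ is therefore the span of the
columns of $A$, and the span of any set of vectors is a subspace of $W$.
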
2.1.3 Null Space
The null space of a matrix $A$ is the set of vectors which are mapped to $0_W$ when transformed
by $A$. It is also known as the kernel of the matrix $A$.

\[ \operatorname{Null\,Space}(A) = \ker(A) = \{ x \in V \mid A \cdot x = 0_W \} \]

where $A$ is a transformation from $V$ to $W$.

2.1.4 Similarity and Diagonalisability


Similarity

A matrix $A$ is said to be similar to another matrix $B$ if there exists an invertible matrix $P$
such that:
\[ B = P^{-1} A P \]

The similarity relation is an equivalence relation.

Diagonalisability

A matrix $A$ is said to be diagonalisable iff it is similar to a diagonal matrix $D$:

\[ \underbrace{D}_{\text{Diagonal matrix}} = \underbrace{P^{-1} A P}_{\text{Similar to } A} \]

The most important theorem about diagonalisability is as follows:

Theorem 2. An $n \times n$ matrix $A$ is diagonalisable if and only if there is an invertible matrix $P$
given by
\[ P = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix} \]
where the $X_k$ are eigenvectors of $A$.

We accept this theorem without proof.

2.2 Quadratic Forms


An expression of the form $Q(x) = k_{11} x_1^2 + k_{12} x_1 x_2 + k_{13} x_1 x_3 + \cdots + k_{22} x_2^2 + \cdots + k_{nn} x_n^2$, where
the degree of each term is 2, is called a quadratic form.
Every Quadratic Form (QF) can be represented in the following way:

\[ Q : \mathbb{R}^n \to \mathbb{R}, \qquad Q(x) = x^\top A\, x \]

where $A$ is called the associated matrix.

2.2.1 Associated matrices for common vector spaces
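The vector space $\mathbb{R}^2$

For the vector space $\mathbb{R}^2$, with vectors of the form $x = \begin{bmatrix} x_1 & x_2 \end{bmatrix}^\top$, consider the associated matrix

\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \]

\[
\begin{aligned}
Q(x) &= x^\top A x \\
&= \begin{bmatrix} x_1 & x_2 \end{bmatrix}
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \\
&= \begin{bmatrix} x_1 & x_2 \end{bmatrix}
\begin{bmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \end{bmatrix} \\
\therefore\; k_{11} x_1^2 + k_{12} x_1 x_2 + k_{22} x_2^2 &= a_{11} x_1^2 + (a_{12} + a_{21}) x_1 x_2 + a_{22} x_2^2
\end{aligned}
\]

Comparing similar terms, we get $a_{11} = k_{11}$, $a_{12} + a_{21} = k_{12}$ and $a_{22} = k_{22}$. We prefer symmetric
matrices as they have a few special properties which will help us, so we choose $a_{12} = a_{21} = k_{12}/2$. Hence, we can say that the
associated matrix $A$ is

\[ A = \begin{bmatrix} k_{11} & k_{12}/2 \\ k_{12}/2 & k_{22} \end{bmatrix} \]

The vector space $\mathbb{R}^3$

Following the same process as for $\mathbb{R}^2$, the associated matrix for a QF in $\mathbb{R}^3$ is

\[
A = \begin{bmatrix}
k_{11} & k_{12}/2 & k_{13}/2 \\
k_{12}/2 & k_{22} & k_{23}/2 \\
k_{13}/2 & k_{23}/2 & k_{33}
\end{bmatrix}
\]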
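
The same rule (square-term coefficients on the diagonal, each cross-term coefficient split equally between the two symmetric positions) works for any $n$. The sketch below builds the symmetric associated matrix from the coefficients $k_{ij}$ and evaluates $x^\top A x$. The names `associated_matrix` and `quadratic_form`, the dictionary input format, and the sample form $2x_1^2 + 6x_1x_2 + 5x_2^2$ are all assumptions of this example.

```python
from fractions import Fraction


def associated_matrix(n, coeffs):
    """Symmetric matrix A with x^T A x = sum of k_ij * x_i * x_j.

    coeffs maps 1-based index pairs (i, j), i <= j, to the coefficient k_ij.
    """
    A = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), k in coeffs.items():
        if i == j:
            A[i - 1][i - 1] += Fraction(k)       # square term on the diagonal
        else:
            half = Fraction(k) / 2               # cross term split in two
            A[i - 1][j - 1] += half
            A[j - 1][i - 1] += half
    return A


def quadratic_form(A, x):
    """Evaluate Q(x) = x^T A x."""
    n = len(A)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Q(x) = 2*x1^2 + 6*x1*x2 + 5*x2^2
A = associated_matrix(2, {(1, 1): 2, (1, 2): 6, (2, 2): 5})
value = quadratic_form(A, [1, 2])   # 2*1 + 6*2 + 5*4
```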

2.2.2 Canonical Form


The canonical form of a QF is a corresponding QF in which only the square terms exist, e.g.
$f(x, y) = ax^2 + by^2$.

To convert a given QF into its canonical form, we convert the associated matrix into a similar
diagonal matrix. Since we have taken $A$ (the associated matrix) to be a symmetric matrix, it is
always diagonalisable by the Spectral Theorem.

It is important to note that for a given QF, there exists a unique canonical form (up to the order of its terms) of the form
$\lambda_1 x_1^2 + \lambda_2 x_2^2 + \cdots + \lambda_n x_n^2 = \sum_{i=1}^{n} \lambda_i x_i^2$, where the $\lambda_i$ are the eigenvalues of the matrix $A$ and the $x_i$ are the coordinates with respect to the eigenvector basis.

2.2.3 Definiteness of Eigenvalues


We define the nature of a quadratic form x⊤ Ax as follows:

• Positive definite: x⊤Ax > 0 ; ∀x ≠ 0
• Positive semi-definite: x⊤Ax ≥ 0 ; ∀x ≠ 0
• Negative definite: x⊤Ax < 0 ; ∀x ≠ 0
• Negative semi-definite: x⊤Ax ≤ 0 ; ∀x ≠ 0
• Indefinite: ∃x, y such that x⊤Ax > 0 ∧ y⊤Ay < 0

Alternatively, we can determine the nature from the eigenvalues λ of the associated matrix:

• Positive definite: ∀λ. λ > 0
• Positive semi-definite: ∀λ. λ ≥ 0
• Negative definite: ∀λ. λ < 0
• Negative semi-definite: ∀λ. λ ≤ 0
• Indefinite: at least one eigenvalue is positive and at least one is negative
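The eigenvalue test above translates directly into a small classifier. The following is a minimal Python sketch; the function name `classify` and the tolerance `tol` (used to treat tiny floating-point values as zero) are assumptions introduced for illustration.

```python
def classify(eigenvalues, tol=1e-12):
    """Classify a quadratic form from the eigenvalues of its symmetric matrix."""
    pos = sum(1 for lam in eigenvalues if lam > tol)    # number of positive eigenvalues
    neg = sum(1 for lam in eigenvalues if lam < -tol)   # number of negative eigenvalues
    zero = len(eigenvalues) - pos - neg                 # eigenvalues treated as zero
    if pos and neg:
        return "indefinite"
    if neg == 0:
        # All eigenvalues >= 0 (the zero form is reported as positive semi-definite).
        return "positive definite" if zero == 0 else "positive semi-definite"
    return "negative definite" if zero == 0 else "negative semi-definite"


print(classify([2, -3]))     # indefinite
print(classify([1, 2, 3]))   # positive definite
```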

Sylvester’s Criterion

Sylvester’s Criterion provides a straightforward method to determine whether a given real, sym-
metric matrix A is positive definite, negative definite, or indefinite. Instead of computing eigen-
values, which can be computationally expensive, this criterion relies on Leading Principal
Minors, making it a more efficient alternative.

Leading Principal Minors. A Leading Principal Minor (LPM) of a matrix is the determinant
of its top-left k × k submatrix. For an n × n symmetric matrix A, the k-th LPM is defined as:

∆k = det(Ak ), k = 1, 2, . . . , n,

where Ak is the submatrix formed by taking the first k rows and columns of A.

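For example, for a 3 × 3 matrix
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \]
the LPMs are
\[ \Delta_1 = a_{11}, \qquad \Delta_2 = \det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad \Delta_3 = \det(A). \]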

Why Use Sylvester’s Criterion? While eigenvalues can also determine definiteness, computing them by hand is much more challenging:

• Finding eigenvalues involves solving the characteristic equation det(A − λI) = 0, which is
a polynomial equation of degree n.
• Polynomial equations of degree 5 or higher have no general solution in radicals (Abel-Ruffini
theorem). Thus, a closed-form formula for the eigenvalues exists in general only for matrices up to 4 × 4; for larger matrices the eigenvalues must be approximated numerically.
• In contrast, Sylvester’s Criterion only requires computing a sequence of determinants, which
is computationally inexpensive for most practical applications.

The Criterion. Sylvester’s Criterion states:


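• A is positive definite if and only if all the LPMs are positive:

∆1 > 0, ∆2 > 0, ..., ∆n > 0.

• A is negative definite if and only if the signs of the LPMs alternate, starting with −:

∆1 < 0, ∆2 > 0, ∆3 < 0, ....

• If the sequence of LPMs meets neither of the above criteria, A is neither positive definite nor negative definite. In that case A is indefinite whenever ∆n = det(A) ≠ 0. If det(A) = 0, A may be semi-definite or indefinite, and the LPMs alone cannot decide.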
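As a hedged illustration of the criterion, the sketch below computes the leading principal minors with a small Gaussian-elimination determinant and checks the two sign patterns. The helper names (`det`, `leading_principal_minors`, `sylvester`) are chosen here for illustration; a numerical library would normally supply the determinant.

```python
def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(M)
    M = [list(map(float, row)) for row in M]   # work on a float copy
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # pivot row
        if abs(M[p][c]) < 1e-12:
            return 0.0                          # (numerically) singular
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d                              # a row swap flips the sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d


def leading_principal_minors(A):
    """Return [det(A_1), ..., det(A_n)] for the top-left k x k submatrices."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]


def sylvester(A):
    minors = leading_principal_minors(A)
    if all(m > 0 for m in minors):
        return "positive definite"
    # Alternating signs starting with a negative first minor.
    if all((m < 0) if k % 2 == 1 else (m > 0)
           for k, m in enumerate(minors, start=1)):
        return "negative definite"
    return "neither positive definite nor negative definite"


print(sylvester([[1, 2, 3], [2, 8, 22], [3, 22, 82]]))  # positive definite
```

The last return value is deliberately cautious: as stated above, failing both patterns does not by itself distinguish an indefinite matrix from a semi-definite one.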

2.3 Principal Axes Theorem


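If $A$ is an $n \times n$ symmetric matrix associated with the quadratic form $x^\top A x$ and if $Q$ is an
orthogonal matrix such that $Q^{-1}AQ = Q^\top AQ = D$ is diagonal, then the change of variable $x = Qy$
transforms the quadratic form $x^\top A x$ into the quadratic form $y^\top D y$.

\[ x^\top A x \mapsto y^\top D y = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2 \]

Proof.

\begin{align*}
x^\top A x &= (Qy)^\top A (Qy) \\
&= (y^\top Q^\top) A (Qy) \\
&= y^\top (Q^\top A Q)\, y \\
\therefore\ x^\top A x &= y^\top D y
\end{align*}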

Note:
Rank:- Number of non-zero eigenvalues/Rank of associated matrix.
Index:- Number of positive eigenvalues
Signature:- Difference between the number of positive and number of negative eigenvalues.

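Q.1. Determine the nature, rank, index and signature of the quadratic form $x_1^2 + 2x_2^2 + 3x_3^2 + 2x_2x_3 - 2x_3x_1 + 2x_1x_2$. Also transform it into its canonical form.

The associated matrix of the above quadratic form is
\[ A = \begin{bmatrix} 1 & 1 & -1 \\ 1 & 2 & 1 \\ -1 & 1 & 3 \end{bmatrix}. \]
Finding the eigenvalues of $A$ using the characteristic equation:
\begin{align*}
\det(A - \lambda I) &= 0 \\
\therefore\ \det\begin{bmatrix} 1-\lambda & 1 & -1 \\ 1 & 2-\lambda & 1 \\ -1 & 1 & 3-\lambda \end{bmatrix} &= 0 \\
\therefore\ \lambda^3 - 6\lambda^2 + 8\lambda + 2 &= 0 \\
\therefore\ \lambda &\in \{-0.214\ldots,\ 3.675\ldots,\ 2.539\ldots\}
\end{align*}

Since there are both positive and negative eigenvalues, the above QF is indefinite. It has rank = 3 (three non-zero eigenvalues), index = 2 (two positive eigenvalues) and signature = 2 − 1 = 1. (Ans.)

Converting the above QF to canonical form, we get $x^\top A x \mapsto y^\top D y$, where
\[ D = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}. \]
Thus, the canonical form is $(-0.214\ldots)y_1^2 + (3.675\ldots)y_2^2 + (2.539\ldots)y_3^2$. (Ans.)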

2.4 Cholesky’s Factorization


In section 1.3.3, we defined a numerical method to find the exact solution of a system of linear
equations by decomposing a symmetric, positive definite matrix into LL⊤ where L is a lower
triangular matrix.
 
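Q.2. Decompose the matrix $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 8 & 22 \\ 3 & 22 & 82 \end{bmatrix}$ using Cholesky factorization.

Finding the eigenvalues of $A$ using the characteristic equation:
\begin{align*}
\det(A - \lambda I) &= 0 \\
\therefore\ \det\begin{bmatrix} 1-\lambda & 2 & 3 \\ 2 & 8-\lambda & 22 \\ 3 & 22 & 82-\lambda \end{bmatrix} &= 0 \\
\therefore\ \lambda^3 - 91\lambda^2 + 249\lambda - 36 &= 0 \\
\therefore\ \lambda &\in \{88.1808\ldots,\ 2.665\ldots,\ 0.153\ldots\}
\end{align*}

Since all the eigenvalues are positive, $A$ is a positive definite matrix. $A$ is also symmetric. Hence $A$ can be decomposed as $LL^\top$.

Consider $L = \begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix}$.

\begin{align*}
A &= LL^\top \\
\therefore\ \begin{bmatrix} 1 & 2 & 3 \\ 2 & 8 & 22 \\ 3 & 22 & 82 \end{bmatrix} &= \begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix} \begin{bmatrix} l_{11} & l_{21} & l_{31} \\ 0 & l_{22} & l_{32} \\ 0 & 0 & l_{33} \end{bmatrix} \\
\therefore\ \begin{bmatrix} 1 & 2 & 3 \\ 2 & 8 & 22 \\ 3 & 22 & 82 \end{bmatrix} &= \begin{bmatrix} l_{11}^2 & l_{11}l_{21} & l_{11}l_{31} \\ l_{11}l_{21} & l_{21}^2 + l_{22}^2 & l_{21}l_{31} + l_{22}l_{32} \\ l_{11}l_{31} & l_{21}l_{31} + l_{22}l_{32} & l_{31}^2 + l_{32}^2 + l_{33}^2 \end{bmatrix}
\end{align*}

Comparing similar terms, we find the values:
\begin{align*}
l_{11} &= \sqrt{1} = 1 \\
l_{21} &= 2/1 = 2 \\
l_{31} &= 3/1 = 3 \\
l_{22} &= \sqrt{8 - 2^2} = \sqrt{4} = 2 \\
l_{32} &= (22 - 6)/2 = 16/2 = 8 \\
l_{33} &= \sqrt{82 - 3^2 - 8^2} = \sqrt{9} = 3
\end{align*}
(We consider only the positive roots.)

Thus, $A = LL^\top$ with $L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 2 & 0 \\ 3 & 8 & 3 \end{bmatrix}$. (Ans.)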
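The term-by-term comparison used in Q.2 generalises to any size. Below is a minimal pure-Python sketch of that procedure (the function name `cholesky` is chosen here for illustration); it assumes the input is symmetric and raises an error when a non-positive value appears under a square root, which signals that the matrix is not positive definite.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L L^T for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Sum of products of the already-computed entries in rows i and j.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)            # diagonal entry (positive root)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]  # below-diagonal entry
    return L


print(cholesky([[1, 2, 3], [2, 8, 22], [3, 22, 82]]))
# [[1.0, 0.0, 0.0], [2.0, 2.0, 0.0], [3.0, 8.0, 3.0]]
```

Because the factorisation fails exactly when the matrix is not positive definite, attempting it is also a practical definiteness test that avoids computing eigenvalues.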

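Q.3. Transform $2x_1^2 + x_2^2 - 3x_3^2 - 8x_2x_3 - 4x_3x_1 + 12x_1x_2$ into its canonical form and find the change of variable.

The associated matrix of the above quadratic form is
\[ A = \begin{bmatrix} 2 & 6 & -2 \\ 6 & 1 & -4 \\ -2 & -4 & -3 \end{bmatrix}. \]
Finding the eigenvalues of $A$ using the characteristic equation:
\begin{align*}
\det(A - \lambda I) &= 0 \\
\therefore\ \det\begin{bmatrix} 2-\lambda & 6 & -2 \\ 6 & 1-\lambda & -4 \\ -2 & -4 & -3-\lambda \end{bmatrix} &= 0 \\
\therefore\ \lambda^3 - 63\lambda - 162 &= 0 \\
\therefore\ \lambda &\in \{-6, -3, 9\}
\end{align*}

Finding the eigenvectors of $A$:

For $\lambda = -6$:
\[ (A + 6I) \cdot X_1 = 0_{\mathbb{R}^3} \quad\therefore\quad \begin{bmatrix} 8 & 6 & -2 \\ 6 & 7 & -4 \\ -2 & -4 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
From Cramer’s rule, $X_1 = \begin{bmatrix} -1 & 2 & 2 \end{bmatrix}^\top$.

For $\lambda = -3$:
\[ (A + 3I) \cdot X_2 = 0_{\mathbb{R}^3} \quad\therefore\quad \begin{bmatrix} 5 & 6 & -2 \\ 6 & 4 & -4 \\ -2 & -4 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
From Cramer’s rule, $X_2 = \begin{bmatrix} 2 & -1 & 2 \end{bmatrix}^\top$.

For $\lambda = 9$:
\[ (A - 9I) \cdot X_3 = 0_{\mathbb{R}^3} \quad\therefore\quad \begin{bmatrix} -7 & 6 & -2 \\ 6 & -8 & -4 \\ -2 & -4 & -12 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
From Cramer’s rule, $X_3 = \begin{bmatrix} -2 & -2 & 1 \end{bmatrix}^\top$.

We know that for diagonalisability, there must exist an invertible matrix $P$ (here $Q$) such that $P^{-1}AP$ is diagonal. (From theorem 2)
If $P$ is orthogonal, then $P^{-1} = P^\top$. The eigenvectors $X_1, X_2, X_3$ are mutually orthogonal and each has norm $\sqrt{1 + 4 + 4} = 3$, so normalising them gives the required orthogonal matrix $Q = \tfrac{1}{3}\begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix}$. With the normalised vectors, we get the eigenvalues as the coefficients of the canonical QF.
The canonical form of the above equation is $-6y_1^2 - 3y_2^2 + 9y_3^2$. (Ans.)

The change of variable is
\[ x = \begin{bmatrix} -1/3 & 2/3 & -2/3 \\ 2/3 & -1/3 & -2/3 \\ 2/3 & 2/3 & 1/3 \end{bmatrix} y. \]
(Ans.)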

2.5 Constraint Optimisation


If we have been given a quadratic form x⊤ Ax with the constraint, ∥x∥ = 1, then:

• The largest (maximum) value of Q(x) is λmax (the maximum eigenvalue), attained at a
corresponding eigenvector scaled so that its norm is 1.
• The smallest (minimum) value of Q(x) is λmin (the minimum eigenvalue), attained at a
corresponding eigenvector scaled so that its norm is 1.
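Q.4. Find the minimum and maximum value of $Q(x) = x_1^2 + 4x_1x_2 - 2x_2^2$ with the constraint $x^\top x = 1$.

The associated matrix of the above quadratic form is $A = \begin{bmatrix} 1 & 2 \\ 2 & -2 \end{bmatrix}$.
Finding the eigenvalues of $A$ using the characteristic equation:
\begin{align*}
\det(A - \lambda I) &= 0 \\
\therefore\ \det\begin{bmatrix} 1-\lambda & 2 \\ 2 & -2-\lambda \end{bmatrix} &= 0 \\
\therefore\ \lambda^2 + \lambda - 6 &= 0 \\
\therefore\ \lambda &\in \{-3, 2\}
\end{align*}

Finding the eigenvectors of $A$:

For $\lambda = -3$:
\[ (A + 3I)X_1 = 0_{\mathbb{R}^2} \quad\therefore\quad \begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
From Cramer’s Rule: $X_1 = \begin{bmatrix} 1 & -2 \end{bmatrix}^\top$. Normalising the vector: $X_1 = \begin{bmatrix} 1/\sqrt{5} & -2/\sqrt{5} \end{bmatrix}^\top$.

For $\lambda = 2$:
\[ (A - 2I)X_2 = 0_{\mathbb{R}^2} \quad\therefore\quad \begin{bmatrix} -1 & 2 \\ 2 & -4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
From Cramer’s Rule: $X_2 = \begin{bmatrix} 2 & 1 \end{bmatrix}^\top$. Normalising the vector: $X_2 = \begin{bmatrix} 2/\sqrt{5} & 1/\sqrt{5} \end{bmatrix}^\top$.

Thus, from the theory of constraint optimisation, we know that the maximum value of $Q(x)$ is 2 for $x = \begin{bmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \end{bmatrix}$ and the minimum value of $Q(x)$ is −3 for $x = \begin{bmatrix} 1/\sqrt{5} \\ -2/\sqrt{5} \end{bmatrix}$. (Ans.)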
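The answer to Q.4 can be sanity-checked numerically by evaluating Q on many points of the unit circle. This brute-force sampling is only a check added here for illustration, not the method of the notes; the sample count is an arbitrary choice.

```python
import math


def Q(x1, x2):
    """The quadratic form of Q.4."""
    return x1**2 + 4 * x1 * x2 - 2 * x2**2


# Every unit vector in R^2 can be written as (cos t, sin t).
N = 100000
values = [Q(math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N))
          for k in range(N)]

print(round(max(values), 4), round(min(values), 4))   # 2.0 -3.0

# The extreme values are attained at the normalised eigenvectors.
s5 = math.sqrt(5)
print(round(Q(2 / s5, 1 / s5), 10), round(Q(1 / s5, -2 / s5), 10))   # 2.0 -3.0
```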

Chapter 3
Inner Product Spaces

3.1 Foundations
Throughout this chapter we will go through the already-known concepts regarding vectors in
3-dimensional space and generalise them for any valid vector space.

3.1.1 Vectors
A vector is the most important mathematical structure for Computer Scientists and Engineers.
The notion of a vector in the mathematical sense is very different from what you would expect
vectors to be in the traditional sense of the word (magnitude and direction). In mathematics, a
vector is simply a structure which usually contains more than one element (usually numbers) in
a specific predefined format. This format may be a list, grid or anything else.

3.1.2 Vector Spaces


A vector space is a set of elements (usually called vectors) defined over a field F and equipped with the
operations of vector addition (+) and scalar multiplication (·) by elements of the field.¹
Vector spaces follow a few basic properties which have been discussed in short here.
Consider a generic vector space V and the vectors u, v and w and the scalars k, l, m from the
field F.

Closure Properties
1. Vector Addition is closed in V .
u+v ∈V
2. Scalar multiplication of vectors is closed in V .
k·u∈V
Addition Properties

3. Vector addition is commutative. u+v =v+u


4. Vector addition is associative. u + (v + w) = (u + v) + w
¹ The addition and scalar multiplication operations are purely notational. They may be defined in any specific
way that we would like.

5. There exists a special vector 0V ∈ V such that for all u ∈ V : u + 0V = u
(Existence of additive identity).
6. For all u ∈ V , there exists −u ∈ V such that: u + (−u) = 0V
(Existence of additive inverse).
Multiplication Properties

7. Compatibility with field multiplication. k · (l · u) = (kl) · u


8. Distribution of scalar multiplication over vector addition. k · (u + v) = k · u + k · v
9. Distribution of scalar multiplication over field addition. (k + l) · u = k · u + l · u
10. There exists a special scalar 1F ∈ F such that for all u ∈ V : 1F · u = u
(Existence of multiplicative identity).

3.1.3 The vector space Rn


The vector space defined over the set of Real Numbers ($\mathbb{R}$) with vectors of the form $v = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}^\top$ is commonly known as the n-dimensional Euclidean Space when equipped
with the standard dot product. This is one of the most familiar and widely studied vector spaces
in mathematics.

3.1.4 Length of a vector


The length of a vector in 3-dimensional Euclidean Space over a real field is defined as
the distance from the origin of the vector space to the point to which the vector points, and is
equal to $|\vec{v}| = \sqrt{x^2 + y^2 + z^2}$ for a vector $\vec{v} = x\hat{\imath} + y\hat{\jmath} + z\hat{k}$.

3.2 Inner Product


The inner product is the generalization of the dot product, defined for n-dimensional Euclidean
Space, to any generic vector space.
Formally, the inner product is a function defined for a vector space over a field F that takes in
an ordered pair of vectors, (u, v) and assigns to them, a number ⟨u, v⟩ ∈ F:

f :V ×V →F

f (u, v) = ⟨u, v⟩

where ⟨u, v⟩ is just a notational placeholder for the inner product.

3.2.1 Rules for Inner Products


The following rules must be followed by a binary function for it to be considered a valid inner
product.
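1. Linearity (in the first argument): $\langle \lambda \cdot u + v, w\rangle = \lambda \langle u, w\rangle + \langle v, w\rangle$.

2. Conjugate Symmetry: $\langle u, v\rangle = \overline{\langle v, u\rangle}$.

3. Non-negativity: $\forall u \in V:\ \langle u, u\rangle \ge 0_F$, and $\langle u, u\rangle = 0_F \iff u = 0_V$.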
3.2.2 Inner Product Space
A vector space which defines a valid inner product is known as an Inner Product Space.

3.3 A few common Inner Product Spaces


Let us look at a few inner product spaces and their corresponding inner products.

3.3.1 Inner Product for the vector space Cn


Consider the vector space where the vectors are defined as a list of n complex numbers as follows:
$z = \begin{bmatrix} z_1 & z_2 & \cdots & z_n \end{bmatrix}^\top$ and $w = \begin{bmatrix} w_1 & w_2 & \cdots & w_n \end{bmatrix}^\top$ where $z_i, w_i \in \mathbb{C}$ for all $i$.
The inner product is defined as follows:

\[ \langle z, w\rangle = z_1\overline{w_1} + z_2\overline{w_2} + z_3\overline{w_3} + \cdots + z_n\overline{w_n} \]

We can easily verify that this function satisfies the above mentioned rules, hence it is a valid
inner product.
The standard dot product for n-dimensional Euclidean space over a real field is just a special
case of this exact inner product. (Remember that for real numbers $\overline{r} = r$.)
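A minimal Python sketch of this inner product is given below. It assumes the convention that the conjugate falls on the second argument, so that the product is linear in the first argument as the linearity rule requires; the function name `inner` is illustrative.

```python
def inner(z, w):
    """Standard inner product on C^n: sum of z_i * conj(w_i)."""
    return sum(zi * wi.conjugate() for zi, wi in zip(z, w))


z = [1 + 2j, 3j]
w = [2 - 1j, 1 + 1j]

print(inner(z, w))            # (3+8j)
print(inner(w, z))            # (3-8j), the conjugate of <z, w>
print(inner(z, z))            # (14+0j), real and non-negative
print(inner([1, 2], [3, 4]))  # 11, the ordinary dot product for real vectors
```

The second and third lines of output illustrate conjugate symmetry and non-negativity for these particular vectors.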

3.3.2 Inner Product for the vector space of continuous real-valued functions

The set of continuous real-valued functions constitutes an inner product space with the standard
inner product as follows:
\[ \langle f, g\rangle = \int_{-1}^{1} f(x) \cdot g(x)\, dx \]

If the functions f and/or g are not defined on the whole domain [−1, 1], then we take the integral
over the largest possible common domain of definition.
Let us verify the validity of this function as the inner product:

1. Linearity:
\begin{align*}
\langle \lambda f + g, h\rangle &= \int_{-1}^{1} (\lambda f(x) + g(x)) \cdot h(x)\, dx \\
&= \int_{-1}^{1} \big(\lambda f(x) \cdot h(x) + g(x) \cdot h(x)\big)\, dx \\
&= \int_{-1}^{1} \lambda f(x) \cdot h(x)\, dx + \int_{-1}^{1} g(x) \cdot h(x)\, dx \\
&= \lambda \int_{-1}^{1} f(x) \cdot h(x)\, dx + \int_{-1}^{1} g(x) \cdot h(x)\, dx \\
\therefore\ \langle \lambda f + g, h\rangle &= \lambda \langle f, h\rangle + \langle g, h\rangle
\end{align*}

2. Conjugate Symmetry:
\begin{align*}
\langle f, g\rangle &= \int_{-1}^{1} f(x) \cdot g(x)\, dx \\
&= \int_{-1}^{1} g(x) \cdot f(x)\, dx \\
&= \overline{\int_{-1}^{1} g(x) \cdot f(x)\, dx} && (\because \text{the integral of a real-valued function is real}) \\
\therefore\ \langle f, g\rangle &= \overline{\langle g, f\rangle}
\end{align*}

3. Non-negativity:
\begin{align*}
\langle f, f\rangle &= \int_{-1}^{1} f(x) \cdot f(x)\, dx \\
&= \int_{-1}^{1} (f(x))^2\, dx
\end{align*}
(∵ The area under the curve of a non-negative function is non-negative and, for a continuous function, is zero only when the function is identically 0.)
\[ \therefore\ \langle f, f\rangle \ge 0 \ \wedge\ \big(\langle f, f\rangle = 0 \iff f(x) = 0\big) \]

Since all three conditions are satisfied, the above defined function is a valid inner product on
the inner product space of real-valued continuous functions defined in the domain [−1, 1].

Q.1. Suppose that for a vector space over the set of continuous real-valued functions defined for
the domains [0, ∞) , [0, 1] has an inner product candidate as follows:
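\[ \langle f, g\rangle = \int_{0}^{1} |f(x) \cdot g(x)|\, dx \]

Is it an inner product space?

The given function is an inner product if it satisfies the three properties as defined before.
1. Linearity:
\begin{align*}
\langle \lambda f + g, h\rangle &= \int_{0}^{1} |(\lambda f(x) + g(x)) \cdot h(x)|\, dx \\
&= \int_{0}^{1} |\lambda (f(x) \cdot h(x)) + (g(x) \cdot h(x))|\, dx \\
\lambda \langle f, h\rangle + \langle g, h\rangle &= \lambda \int_{0}^{1} |f(x) \cdot h(x)|\, dx + \int_{0}^{1} |g(x) \cdot h(x)|\, dx \\
&= \int_{0}^{1} \big(\lambda |f(x) \cdot h(x)| + |g(x) \cdot h(x)|\big)\, dx
\end{align*}

We know that $\exists x, y \in \mathbb{R};\ |x + y| \ne |x| + |y|$. For example, with $f(x) = h(x) = 1$, $g(x) = -1$ and $\lambda = 1$, we get $\langle \lambda f + g, h\rangle = 0$ while $\lambda \langle f, h\rangle + \langle g, h\rangle = 2$.

∴ ⟨λf + g, h⟩ is not always equal to λ ⟨f , h⟩ + ⟨g, h⟩.

Thus, the given function is not an inner product. (Ans.)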
Note:

Whenever the problem statement does not specify a specific inner product, we make use of the
standard inner products defined in these sections.

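Q.2. Find the inner product of $x$ and $x^2$ on [−1, 1].

Let $f(x) = x$ and $g(x) = x^2$.
\begin{align*}
\langle f, g\rangle &= \int_{-1}^{1} f(x) \cdot g(x)\, dx \\
&= \int_{-1}^{1} x \cdot x^2\, dx \\
&= \int_{-1}^{1} x^3\, dx \\
&= \left[ x^4/4 \right]_{-1}^{1} \\
&= 1/4 - 1/4 = 0
\end{align*}

Thus the inner product of $x$ and $x^2$ is $\langle x, x^2\rangle = 0$. (Ans.)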
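When the integral cannot be evaluated by hand, the same inner product can be approximated numerically. The sketch below uses composite Simpson's rule, which is a choice made here for illustration (the notes do not prescribe a numerical method); the function name `inner_fn` and the number of subintervals `n` are assumptions.

```python
def inner_fn(f, g, a=-1.0, b=1.0, n=1000):
    """Approximate <f, g> = integral of f(x) g(x) over [a, b].

    Uses composite Simpson's rule with n subintervals (n must be even).
    """
    h = (b - a) / n
    total = f(a) * g(a) + f(b) * g(b)
    for i in range(1, n):
        x = a + i * h
        # Simpson weights: 4 at odd nodes, 2 at even interior nodes.
        total += (4 if i % 2 else 2) * f(x) * g(x)
    return total * h / 3


# <x, x^2> on [-1, 1]: the integrand x^3 is odd, so the result is ~0.
print(abs(inner_fn(lambda x: x, lambda x: x * x)) < 1e-12)   # True
# <x, x> on [-1, 1] equals 2/3.
print(round(inner_fn(lambda x: x, lambda x: x), 10))         # 0.6666666667
```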

3.3.3 Inner Product for the vector space of Mm×n (C)


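The set of m × n matrices over a complex field constitutes an inner product space with the
standard inner product as follows:

\[ \langle A, B\rangle = \operatorname{tr}(AB^*) \]

where $B^*$ is the conjugate transpose of the matrix $B$. (The conjugate is placed on the second argument so that the product is linear in the first argument, as required by the linearity rule.)

Let us verify the validity of this function as the inner product:

1. Linearity:
\begin{align*}
\langle \lambda A + B, C\rangle &= \operatorname{tr}((\lambda A + B) \cdot C^*) \\
&= \operatorname{tr}(\lambda AC^* + BC^*) \\
&= \operatorname{tr}(\lambda AC^*) + \operatorname{tr}(BC^*) = \lambda \operatorname{tr}(AC^*) + \operatorname{tr}(BC^*) \\
\therefore\ \langle \lambda A + B, C\rangle &= \lambda \langle A, C\rangle + \langle B, C\rangle
\end{align*}

2. Conjugate Symmetry:
\begin{align*}
\langle A, B\rangle &= \operatorname{tr}(AB^*) \\
&= \overline{\overline{\operatorname{tr}(AB^*)}} && (\text{conjugate of conjugate is the number itself}) \\
&= \overline{\operatorname{tr}((AB^*)^*)} \\
&= \overline{\operatorname{tr}((B^*)^* A^*)} \\
&= \overline{\operatorname{tr}(BA^*)} \\
\therefore\ \langle A, B\rangle &= \overline{\langle B, A\rangle}
\end{align*}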

3. Non-negativity:
\begin{align*}
\langle A, A\rangle &= \operatorname{tr}(AA^*) \\
&= \operatorname{tr} \begin{bmatrix} \sum_{i=1}^{n} |a_{1i}|^2 & \cdots & \cdots & \cdot \\ \vdots & \sum_{i=1}^{n} |a_{2i}|^2 & & \vdots \\ \vdots & & \ddots & \vdots \\ \cdot & \cdots & \cdots & \sum_{i=1}^{n} |a_{mi}|^2 \end{bmatrix}_{m \times m} \\
&= \sum_{j=1}^{m} \sum_{i=1}^{n} |a_{ji}|^2 && (\text{sum of the squared moduli of the elements})
\end{align*}
\[ \therefore\ \langle A, A\rangle \ge 0 \ \wedge\ \big(\langle A, A\rangle = 0 \iff A = 0_{m \times n}\big), \]
where $0_{m \times n}$ is the m × n zero matrix.

Since all three conditions are satisfied, the above defined function is a valid inner product on
the inner product space of complex-elemented matrices of order m × n.

3.3.4 Inner Product for the vector space of P n [X]


The set of complex-valued polynomials of degree less than or equal to n constitutes an inner
product space. Assuming the vectors to be of the form:
\[ p(x) = p_0 + p_1 x + p_2 x^2 + \cdots + p_n x^n = \begin{bmatrix} 1 & x & x^2 & \cdots & x^n \end{bmatrix} \cdot \begin{bmatrix} p_0 \\ p_1 \\ p_2 \\ \vdots \\ p_n \end{bmatrix} \]

The standard inner product is defined as follows:

\[ \langle p, q\rangle = p_0\overline{q_0} + p_1\overline{q_1} + p_2\overline{q_2} + \cdots + p_n\overline{q_n} = \sum_{i=0}^{n} p_i \overline{q_i} \]

Note:

Since real-valued polynomials are continuous real-valued functions as well, we can use the stan-
dard inner product for the vector space of continuous real valued functions as well.

Since we can represent the polynomials as column matrices, the vector space is just converted
to an (n + 1)-dimensional Euclidean Space over a complex field with the basis $\{1, x, x^2, \cdots, x^n\}$
instead of $\{x_1, x_2, x_3, \cdots, x_{n+1}\}$. The function ⟨p, q⟩ also has the same output as the inner
product for that vector space.
We know that the inner product for n-dimensional Euclidean Space is valid and hence, the inner
product candidate for the vector space of polynomials of degree less than or equal to n is also
valid.

3.4 Norm (magnitude)
The norm of a vector is the generalisation of the concept of length of a vector. We define the
norm ∥u∥ of a vector u as follows:
\[ \|u\| = \sqrt{\langle u, u\rangle} \]

Note that this reduces to the well-known identity $|\vec{u}| = \sqrt{\vec{u} \cdot \vec{u}}$ for Euclidean Space.

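Q.3. Find the norm of $f(x) = x$ on [−1, 1] for the vector space:
(i) Real-valued continuous functions
(ii) $P^2[x]$

(i) Real-valued continuous functions
\begin{align*}
\|f\| &= \sqrt{\langle f, f\rangle} \\
&= \sqrt{\int_{-1}^{1} f(x) \cdot f(x)\, dx} = \sqrt{\int_{-1}^{1} x \cdot x\, dx} \\
&= \sqrt{\int_{-1}^{1} x^2\, dx} \\
&= \sqrt{\left[ x^3/3 \right]_{-1}^{1}} = \sqrt{1/3 - (-1)/3} \\
&= \sqrt{2/3}
\end{align*}
(Ans.)

(ii) $P^2[x]$

Here $f(x) = 0 + 1 \cdot x + 0 \cdot x^2$, so $(p_0, p_1, p_2) = (0, 1, 0)$.
\begin{align*}
\|f\| &= \sqrt{\langle f, f\rangle} \\
&= \sqrt{p_0 \cdot p_0 + p_1 \cdot p_1 + p_2 \cdot p_2} = \sqrt{0 \cdot 0 + 1 \cdot 1 + 0 \cdot 0} \\
&= \sqrt{1} = 1
\end{align*}
(Ans.)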
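Q.4. If $u = \left( \dfrac{-3}{5}, \dfrac{4}{5} \right)$, $\|u\| = ?$

For the vector space $\mathbb{R}^2$, the inner product is defined as $u_1 \cdot v_1 + u_2 \cdot v_2$ for vectors $\vec{u} = (u_1, u_2)$
and $\vec{v} = (v_1, v_2)$. So,
\begin{align*}
\|u\| &= \sqrt{u_1^2 + u_2^2} \\
&= \sqrt{\left( \frac{-3}{5} \right)^2 + \left( \frac{4}{5} \right)^2} \\
&= \sqrt{\frac{9}{25} + \frac{16}{25}} = \sqrt{\frac{25}{25}} = \sqrt{1} \\
\therefore\ \|u\| &= 1
\end{align*}
(Ans.)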

3.4.1 Unit Vector
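Let V be an inner product space. A vector u ∈ V is said to be a unit vector iff its norm is 1.

\[ \forall u \in V \setminus \{0_V\}:\quad \frac{u}{\|u\|} \text{ is a unit vector.} \]

Proof. Consider the vector $v = \dfrac{u}{\|u\|}$. It is a unit vector iff $\|v\| = 1$.
\begin{align*}
\|v\|^2 &= \left\| \frac{u}{\|u\|} \right\|^2 \\
&= \left\langle \frac{u}{\|u\|}, \frac{u}{\|u\|} \right\rangle \\
&= \frac{1}{\|u\|} \cdot \frac{1}{\|u\|} \cdot \langle u, u\rangle && (\|u\| \text{ is a positive real number}) \\
&= \frac{1}{\|u\|^2} \cdot \left( \sqrt{\langle u, u\rangle} \right)^2 \\
&= \frac{1}{\|u\|^2} \cdot \|u\|^2 = 1 \\
\therefore\ \|v\| &= 1
\end{align*}

e.g. for $(1, 2, 3)^\top \in \mathbb{R}^3$:
\[ \operatorname{Unit}((1, 2, 3)^\top) = \left( \frac{1}{\sqrt{14}}, \frac{2}{\sqrt{14}}, \frac{3}{\sqrt{14}} \right)^\top \]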
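The norm and the normalisation step can be sketched in a few lines of Python for real vectors with the standard dot product. The function names `norm` and `unit` are illustrative, not taken from the notes.

```python
import math


def norm(u):
    """Euclidean norm: the square root of <u, u>."""
    return math.sqrt(sum(x * x for x in u))


def unit(u):
    """Return u / ||u||; undefined for the zero vector."""
    n = norm(u)
    if n == 0:
        raise ValueError("the zero vector cannot be normalised")
    return [x / n for x in u]


print(unit([1, 2, 3]))                   # components 1/sqrt(14), 2/sqrt(14), 3/sqrt(14)
print(round(norm(unit([1, 2, 3])), 12))  # 1.0
```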

3.5 Orthogonality
Two vectors u and v are said to be orthogonal iff ⟨u, v⟩ = 0F .
The concept of orthogonality denotes “perpendicularity” of some sort.

3.5.1 Orthogonal Set


A set of vectors is said to be an Orthogonal Set if all the distinct pairs of vectors in that set are
orthogonal to each other.

3.5.2 Orthogonal Basis


An orthogonal set of non-zero vectors that spans the entire space is known as an Orthogonal Basis.
(Such a set is automatically linearly independent, so it contains the minimum number of elements required to span the space.)

3.5.3 Orthonormal Basis


An orthogonal basis where all the elements are unit vectors is known as an Orthonormal Basis.

3.5.4 Orthogonal Complement
The Orthogonal Complement of a subspace W of a vector space V is another subspace containing
elements of V that are orthogonal to all the elements of W simultaneously. It is referred to as
the perp or perpendicular complement informally and is denoted by W ⊥ .
Q.5. Is the set S = {(2, 1, −1), (0, 1, 1), (1, −1, 1)} an orthogonal basis? If yes, convert it to an
orthonormal basis.
Consider the vectors v1 = (2, 1, −1), v2 = (0, 1, 1), and v3 = (1, −1, 1).

⟨v1 , v2 ⟩ = 2 · 0 + 1 · 1 + (−1) · 1 = 0 + 1 − 1 = 0
⟨v2 , v3 ⟩ = 0 · 1 + 1 · (−1) + 1 · 1 = 0 − 1 + 1 = 0
⟨v3 , v1 ⟩ = 1 · 2 + (−1) · 1 + 1 · (−1) = 2 − 1 − 1 = 0

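S is an orthogonal set of three non-zero vectors in $\mathbb{R}^3$, so it is linearly independent and hence an orthogonal basis of $\mathbb{R}^3$. Consider $S'$ to be the orthonormal version of S. (Ans.)

\begin{align*}
\|v_1\| &= \sqrt{2^2 + 1^2 + (-1)^2} = \sqrt{4 + 1 + 1} = \sqrt{6} \\
\|v_2\| &= \sqrt{0^2 + 1^2 + 1^2} = \sqrt{0 + 1 + 1} = \sqrt{2} \\
\|v_3\| &= \sqrt{1^2 + (-1)^2 + 1^2} = \sqrt{1 + 1 + 1} = \sqrt{3}
\end{align*}
\begin{align*}
\therefore\ S' &= \left\{ \frac{1}{\sqrt{6}}(2, 1, -1),\ \frac{1}{\sqrt{2}}(0, 1, 1),\ \frac{1}{\sqrt{3}}(1, -1, 1) \right\} \\
\therefore\ S' &= \left\{ \left( \frac{2}{\sqrt{6}}, \frac{1}{\sqrt{6}}, \frac{-1}{\sqrt{6}} \right),\ \left( 0, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right),\ \left( \frac{1}{\sqrt{3}}, \frac{-1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right) \right\}
\end{align*}
(Ans.)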
Q.6. Is S1 = {sin(nx) | n = 1, 2, 3, . . . on [0, π]} an orthogonal set? Is it an orthonormal set?

Consider any two positive integers m and n which are not equal to each other. Calculating
the inner product for $f(x) = \sin(mx)$ and $g(x) = \sin(nx)$:
\begin{align*}
\langle f, g\rangle &= \int_{0}^{\pi} f(x) \cdot g(x)\, dx = \int_{0}^{\pi} \sin(mx) \cdot \sin(nx)\, dx \\
&= \int_{0}^{\pi} \frac{-1}{2} \big( \cos((m + n)x) - \cos((m - n)x) \big)\, dx \\
&= -0.5 \left[ \frac{\sin((m + n)x)}{m + n} - \frac{\sin((m - n)x)}{m - n} \right]_{0}^{\pi} \\
&= -0.5 \left( \frac{\sin((m + n)\pi) - \sin(0)}{m + n} - \frac{\sin((m - n)\pi) - \sin(0)}{m - n} \right) \\
&= -0.5 \big( (0 - 0)/(m + n) - (0 - 0)/(m - n) \big) \\
\therefore\ \langle f, g\rangle &= 0
\end{align*}

Since the inner product of any two distinct vectors is 0, the set $S_1$ is an orthogonal set.
(Ans.)

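Consider the norm of an arbitrary vector $f(x) = \sin(mx)$.
\begin{align*}
\|f\|^2 = \langle f, f\rangle &= \int_{0}^{\pi} f(x) \cdot f(x)\, dx \\
&= \int_{0}^{\pi} \frac{-1}{2} \big( \cos((m + m)x) - \cos((m - m)x) \big)\, dx \\
&= -0.5 \int_{0}^{\pi} \big( \cos(2mx) - 1 \big)\, dx \\
&= 0.5 \int_{0}^{\pi} \big( 1 - \cos(2mx) \big)\, dx \\
&= 0.5 \left[ x - \frac{\sin(2mx)}{2m} \right]_{0}^{\pi} \\
&= 0.5 \left( [\pi - 0] - \frac{\sin(2m\pi) - \sin(0)}{2m} \right) = 0.5 \left( \pi - \frac{0 - 0}{2m} \right) \\
&= 0.5\pi \\
\therefore\ \|f\| &= \sqrt{\pi/2}
\end{align*}

Since the norm of every element in $S_1$ is $\sqrt{\pi/2} \ne 1$, the set $S_1$ is not an orthonormal set.
(Ans.)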
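Q.7. Find two vectors with norm 1 which are orthogonal to (3, −4) in $\mathbb{R}^2$.
Consider a vector orthogonal to $\vec{a} = (3, -4)$ in $\mathbb{R}^2$ to be of the form $\vec{b} = (x, y)$. Since
we know that the two vectors are orthogonal,
\begin{align*}
\vec{a} \cdot \vec{b} &= 0 \\
\therefore\ 3x - 4y &= 0 \\
\therefore\ y &= \frac{3}{4}x
\end{align*}

The above equation describes a line in $\mathbb{R}^2$, but we only require unit vectors, so:
\begin{align*}
|\vec{b}| &= 1 \\
\therefore\ \sqrt{x^2 + y^2} &= 1 \\
\therefore\ \sqrt{x^2 + \left( \frac{3}{4}x \right)^2} &= 1 \\
\therefore\ \sqrt{\frac{25}{16}x^2} &= 1 \\
\therefore\ \frac{25}{16}x^2 &= 1^2 \\
\therefore\ x^2 &= \frac{16}{25} \\
\therefore\ x &= \pm\sqrt{\frac{16}{25}} = \pm\frac{4}{5}
\end{align*}

From the above equation $y = \frac{3}{4}x$, we get $y = \pm\frac{3}{5}$ (with the same sign as x). Thus the two unit vectors orthogonal to (3, −4)
in $\mathbb{R}^2$ are $\left( \frac{4}{5}, \frac{3}{5} \right)$ and $\left( -\frac{4}{5}, -\frac{3}{5} \right)$. (Ans.)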

3.6 Orthogonal Projections


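The orthogonal projection of a vector u onto v is defined as follows:

\[ \operatorname{proj}_v(u) = \frac{\langle u, v\rangle}{\|v\|^2} \cdot v \]

To understand this formula, let us take a look at 2-Dimensional Euclidean Space in Cartesian
(Rectangular) Coordinates.

[Figure 3.1: Orthogonal Projection of $\vec{u}$ onto $\vec{v}$. The vector $\vec{u}$ makes an angle θ with $\vec{v}$; its component along $\vec{v}$ has length $|\vec{u}| \cos θ$ and its component perpendicular to $\vec{v}$ has length $|\vec{u}| \sin θ$.]

In Figure 3.1, we can see that the vector $\vec{u}$ casts a “shadow”
onto the vector $\vec{v}$ of length $|\vec{u}| \cos θ$, i.e. the component of $\vec{u}$ in the direction of $\vec{v}$ has the length
$|\vec{u}| \cos θ$ where θ is the angle between the two vectors. This length is called the scalar
projection of $\vec{u}$ onto $\vec{v}$.
If we need the vector in this direction, we already have $\vec{v}$ but it is not of the required length, so
we convert it into a unit vector by dividing it by its length and then multiply the length of the
“shadow” with this new unit vector to get our required vector:
\[ \operatorname{proj}_{\vec{v}} \vec{u} = |\vec{u}| \cos θ \cdot \frac{\vec{v}}{|\vec{v}|} = \frac{\vec{u} \cdot \vec{v}}{|\vec{v}|} \cdot \frac{\vec{v}}{|\vec{v}|} = \frac{\vec{u} \cdot \vec{v}}{|\vec{v}|^2} \cdot \vec{v} \]
The vector (say $\vec{w}$) perpendicular to $\vec{v}$ and going from the tip of the projection of $\vec{u}$ onto $\vec{v}$ to
the tip of $\vec{u}$ is the other component which makes up $\vec{u}$. By the triangle law of addition of vectors, we
can say that:
\begin{align*}
\vec{w} + \operatorname{proj}_{\vec{v}} \vec{u} &= \vec{u} \\
\therefore\ \vec{w} &= \vec{u} - \operatorname{proj}_{\vec{v}} \vec{u}
\end{align*}

Thus, the orthogonal projection of u onto v is just the component of the vector u in the direction
of v, i.e. $\operatorname{proj}_v(u)$, and the component of u orthogonal to v is $u - \operatorname{proj}_v(u)$.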
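The projection formula can be sketched in Python as follows. Exact rational arithmetic (`fractions.Fraction`) is used here only so that the orthogonality of the remainder can be checked exactly; the function names `dot` and `proj` are illustrative.

```python
from fractions import Fraction


def dot(u, v):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def proj(u, v):
    """Orthogonal projection of u onto a non-zero vector v: (<u,v> / <v,v>) v."""
    c = Fraction(dot(u, v)) / Fraction(dot(v, v))
    return [c * x for x in v]


u, v = [1, 0, 5], [3, 1, -7]
p = proj(u, v)                            # component of u along v
w = [a - b for a, b in zip(u, p)]         # component of u orthogonal to v

print(p)          # [Fraction(-96, 59), Fraction(-32, 59), Fraction(224, 59)]
print(dot(w, v))  # 0
```

The final line confirms the decomposition described above: what remains after subtracting the projection is orthogonal to v.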
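Q.8. Find $\operatorname{proj}_{\vec{v}} \vec{u}$ where $\vec{u} = (1, 0, 5)$ and $\vec{v} = (3, 1, -7)$:
\begin{align*}
\|\vec{v}\| &= \sqrt{3^2 + 1^2 + (-7)^2} = \sqrt{59} \\
\vec{u} \cdot \vec{v} &= 1 \cdot 3 + 0 \cdot 1 + 5 \cdot (-7) = 3 - 35 = -32 \\
\operatorname{proj}_{\vec{v}} \vec{u} &= \frac{\langle \vec{u}, \vec{v}\rangle}{\|\vec{v}\|^2} \cdot \vec{v} = \frac{\vec{u} \cdot \vec{v}}{\left( \sqrt{59} \right)^2} \cdot \vec{v} = -\frac{32}{59} \cdot (3, 1, -7) = \left( -\frac{96}{59}, -\frac{32}{59}, \frac{224}{59} \right)
\end{align*}
(Ans.)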

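Q.9. Find the orthogonal complement $W^\perp$ of $W = \left\{ \begin{bmatrix} x \\ y \end{bmatrix} \,\middle|\, 2x - y = 0;\ x, y \in \mathbb{R} \right\}$ and give its
basis.

Consider the elements of $W^\perp$ to be of the form $\begin{bmatrix} u \\ v \end{bmatrix}$. Since the elements of $W^\perp$ are
orthogonal to every element of W, and every element of W satisfies $y = 2x$, we can say that:
\begin{align*}
\begin{bmatrix} u & v \end{bmatrix} \cdot \begin{bmatrix} x \\ y \end{bmatrix} &= 0 \\
\therefore\ ux + vy &= 0 \\
\therefore\ ux + v(2x) &= 0 \\
\therefore\ v &= -0.5u
\end{align*}

The elements of the orthogonal complement $W^\perp$ are of the form $\begin{bmatrix} u \\ v \end{bmatrix}$ where $v = -0.5u$,
i.e. $u \cdot \begin{bmatrix} 1 \\ -0.5 \end{bmatrix}$.

Thus the orthogonal complement of W is $W^\perp = \left\{ \begin{bmatrix} u \\ v \end{bmatrix} \,\middle|\, 2v + u = 0;\ u, v \in \mathbb{R} \right\}$ with the
basis $\left\{ \begin{bmatrix} 1 \\ -0.5 \end{bmatrix} \right\}$. (Ans.)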

3.6.1 Orthogonal Projection on a Subspace


Given a subspace W with an orthogonal basis $\{w_1, w_2, w_3, \cdots, w_n\}$, the orthogonal projection
of u onto W is:
\[ \operatorname{proj}_W u = \operatorname{proj}_{w_1}(u) + \operatorname{proj}_{w_2}(u) + \operatorname{proj}_{w_3}(u) + \cdots + \operatorname{proj}_{w_n}(u) = \sum_{i=1}^{n} \operatorname{proj}_{w_i}(u). \]
We can think of orthogonal projection onto a subspace as casting a “shadow” onto the subspace
when an overhead light source is present.
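Q.10. Let W be the plane in $\mathbb{R}^3$ with equation $x - y + 2z = 0$. Find the orthogonal projection
of $\begin{bmatrix} 3 & -1 & 2 \end{bmatrix}^\top$ onto W.

By trial and error, we know that $\begin{bmatrix} 1 & 1 & 0 \end{bmatrix}^\top$ lies in W. Together with another vector in W perpendicular
to it, this vector forms a basis of W.
Let the other vector be of the form $\begin{bmatrix} x & y & z \end{bmatrix}^\top$; by the equation of the plane, we know
that $x = y - 2z$. The other vector becomes $\begin{bmatrix} y - 2z & y & z \end{bmatrix}^\top$.

Taking the dot product,
\begin{align*}
\begin{bmatrix} y - 2z & y & z \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} &= 0 \\
\therefore\ (y - 2z)(1) + (y)(1) + (z)(0) &= 0 \\
\therefore\ y - 2z + y &= 0 \\
\therefore\ y &= z
\end{align*}
∴ The other vector becomes $\begin{bmatrix} -y & y & y \end{bmatrix}^\top = y \begin{bmatrix} -1 & 1 & 1 \end{bmatrix}^\top$.

We know that one orthogonal basis of W is $\left\{ \begin{bmatrix} 1 & 1 & 0 \end{bmatrix}^\top, \begin{bmatrix} -1 & 1 & 1 \end{bmatrix}^\top \right\}$.

Finding the projection of $\begin{bmatrix} 3 & -1 & 2 \end{bmatrix}^\top$ on $\begin{bmatrix} 1 & 1 & 0 \end{bmatrix}^\top$:
\begin{align*}
\operatorname{proj}_{[1\ 1\ 0]^\top} \begin{bmatrix} 3 & -1 & 2 \end{bmatrix}^\top &= \frac{3(1) - 1(1) + 2(0)}{\left( \sqrt{1^2 + 1^2 + 0^2} \right)^2} \cdot \begin{bmatrix} 1 & 1 & 0 \end{bmatrix}^\top \\
&= \frac{2}{2} \begin{bmatrix} 1 & 1 & 0 \end{bmatrix}^\top \\
&= \begin{bmatrix} 1 & 1 & 0 \end{bmatrix}^\top
\end{align*}

Finding the projection of $\begin{bmatrix} 3 & -1 & 2 \end{bmatrix}^\top$ on $\begin{bmatrix} -1 & 1 & 1 \end{bmatrix}^\top$:
\begin{align*}
\operatorname{proj}_{[-1\ 1\ 1]^\top} \begin{bmatrix} 3 & -1 & 2 \end{bmatrix}^\top &= \frac{3(-1) - 1(1) + 2(1)}{\left( \sqrt{(-1)^2 + 1^2 + 1^2} \right)^2} \cdot \begin{bmatrix} -1 & 1 & 1 \end{bmatrix}^\top \\
&= -\frac{2}{3} \begin{bmatrix} -1 & 1 & 1 \end{bmatrix}^\top \\
&= \begin{bmatrix} \frac{2}{3} & -\frac{2}{3} & -\frac{2}{3} \end{bmatrix}^\top
\end{align*}

Thus,
\begin{align*}
\operatorname{proj}_W \begin{bmatrix} 3 & -1 & 2 \end{bmatrix}^\top &= \operatorname{proj}_{[1\ 1\ 0]^\top} \begin{bmatrix} 3 & -1 & 2 \end{bmatrix}^\top + \operatorname{proj}_{[-1\ 1\ 1]^\top} \begin{bmatrix} 3 & -1 & 2 \end{bmatrix}^\top \\
&= \begin{bmatrix} 1 & 1 & 0 \end{bmatrix}^\top + \begin{bmatrix} \frac{2}{3} & -\frac{2}{3} & -\frac{2}{3} \end{bmatrix}^\top \\
&= \begin{bmatrix} \frac{5}{3} & \frac{1}{3} & -\frac{2}{3} \end{bmatrix}^\top
\end{align*}

Thus the orthogonal projection of $\begin{bmatrix} 3 & -1 & 2 \end{bmatrix}^\top$ on the plane $x - y + 2z = 0$ is $\begin{bmatrix} \frac{5}{3} & \frac{1}{3} & -\frac{2}{3} \end{bmatrix}^\top$.
(Ans.)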

3.7 Gram-Schmidt Orthogonalisation Process
The Gram-Schmidt orthogonalisation process is used to convert a set of linearly independent
vectors into an orthogonal set of vectors that spans the same subspace.
Let $\{x_1, x_2, x_3, \ldots, x_k\}$ be a basis of a subspace W of k dimensions. Then the following set
$\{v_1, v_2, v_3, \ldots, v_k\}$ is an orthogonal basis of W, where:
\begin{align*}
v_1 &= x_1 \\
v_2 &= x_2 - \operatorname{proj}_{v_1} x_2 \\
v_3 &= x_3 - \operatorname{proj}_{v_1} x_3 - \operatorname{proj}_{v_2} x_3 \\
&\ \,\vdots \\
v_k &= x_k - \left( \operatorname{proj}_{v_1} x_k + \operatorname{proj}_{v_2} x_k + \operatorname{proj}_{v_3} x_k + \cdots + \operatorname{proj}_{v_{k-1}} x_k \right)
\end{align*}
Here $v_k$ is the vector component of $x_k$ perpendicular to the subspace formed by the first k − 1 vectors.

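Note:
Even though the bold notation or arrow notation has not been used, the terms $x_i$ and $v_i$ still
represent vectors over here.

Q.11. Construct an orthonormal basis for the subspace $W = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ -1 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 2 \\ 1 \\ 2 \end{bmatrix} \right\}$.

Using the Gram-Schmidt Orthogonalisation Process:
\[ v_1 = x_1 = \begin{bmatrix} 1 & -1 & -1 & 1 \end{bmatrix}^\top \]

Finding the value of $v_2$,
\begin{align*}
\langle x_2, v_1\rangle &= 2(1) + 1(-1) + 0(-1) + 1(1) = 2 - 1 + 0 + 1 = 2 \\
\|v_1\| &= \sqrt{1^2 + (-1)^2 + (-1)^2 + 1^2} = \sqrt{1 + 1 + 1 + 1} = 2 \\
v_2 &= x_2 - \operatorname{proj}_{v_1}(x_2) \\
&= x_2 - \frac{\langle x_2, v_1\rangle}{\|v_1\|^2} \cdot v_1 \\
&= \begin{bmatrix} 2 & 1 & 0 & 1 \end{bmatrix}^\top - \frac{2}{2^2} \begin{bmatrix} 1 & -1 & -1 & 1 \end{bmatrix}^\top \\
&= \begin{bmatrix} 2 & 1 & 0 & 1 \end{bmatrix}^\top - \frac{1}{2} \begin{bmatrix} 1 & -1 & -1 & 1 \end{bmatrix}^\top \\
\therefore\ v_2 &= \begin{bmatrix} 1.5 & 1.5 & 0.5 & 0.5 \end{bmatrix}^\top
\end{align*}

Finding the value of $v_3$,
\begin{align*}
\langle x_3, v_1\rangle &= 2(1) + 2(-1) + 1(-1) + 2(1) = 2 - 2 - 1 + 2 = 1 \\
\langle x_3, v_2\rangle &= 2(1.5) + 2(1.5) + 1(0.5) + 2(0.5) = 3 + 3 + 0.5 + 1 = 7.5 \\
\|v_2\| &= \sqrt{1.5^2 + 1.5^2 + 0.5^2 + 0.5^2} = \sqrt{2.25 + 2.25 + 0.25 + 0.25} = \sqrt{5} \\
v_3 &= x_3 - \operatorname{proj}_{v_1}(x_3) - \operatorname{proj}_{v_2}(x_3) \\
&= x_3 - \frac{\langle x_3, v_1\rangle}{\|v_1\|^2} \cdot v_1 - \frac{\langle x_3, v_2\rangle}{\|v_2\|^2} \cdot v_2 \\
&= \begin{bmatrix} 2 & 2 & 1 & 2 \end{bmatrix}^\top - \frac{1}{2^2} \begin{bmatrix} 1 & -1 & -1 & 1 \end{bmatrix}^\top - \frac{7.5}{\left( \sqrt{5} \right)^2} \begin{bmatrix} 1.5 & 1.5 & 0.5 & 0.5 \end{bmatrix}^\top \\
&= \begin{bmatrix} 2 & 2 & 1 & 2 \end{bmatrix}^\top - \frac{1}{4} \begin{bmatrix} 1 & -1 & -1 & 1 \end{bmatrix}^\top - \frac{3}{2} \begin{bmatrix} 1.5 & 1.5 & 0.5 & 0.5 \end{bmatrix}^\top \\
&= \begin{bmatrix} 2 & 2 & 1 & 2 \end{bmatrix}^\top - \begin{bmatrix} 0.25 & -0.25 & -0.25 & 0.25 \end{bmatrix}^\top - \begin{bmatrix} 2.25 & 2.25 & 0.75 & 0.75 \end{bmatrix}^\top \\
\therefore\ v_3 &= \begin{bmatrix} -0.5 & 0 & 0.5 & 1 \end{bmatrix}^\top \\
\|v_3\| &= \sqrt{(-0.5)^2 + 0^2 + 0.5^2 + 1^2} = \sqrt{0.25 + 0 + 0.25 + 1} = \sqrt{1.5}
\end{align*}
Thus, we have found an orthogonal set of vectors.
To convert it to an orthonormal set, $w_i = v_i / \|v_i\|$:
\begin{align*}
w_1 &= v_1 / \|v_1\| = \begin{bmatrix} 1 & -1 & -1 & 1 \end{bmatrix}^\top / 2 = \begin{bmatrix} 0.5 & -0.5 & -0.5 & 0.5 \end{bmatrix}^\top \\
w_2 &= v_2 / \|v_2\| = \begin{bmatrix} 1.5 & 1.5 & 0.5 & 0.5 \end{bmatrix}^\top / \sqrt{5} = \begin{bmatrix} \frac{3}{2\sqrt{5}} & \frac{3}{2\sqrt{5}} & \frac{1}{2\sqrt{5}} & \frac{1}{2\sqrt{5}} \end{bmatrix}^\top \\
w_3 &= v_3 / \|v_3\| = \begin{bmatrix} -0.5 & 0 & 0.5 & 1 \end{bmatrix}^\top / \sqrt{1.5} = \begin{bmatrix} -\frac{1}{\sqrt{6}} & 0 & \frac{1}{\sqrt{6}} & \frac{2}{\sqrt{6}} \end{bmatrix}^\top
\end{align*}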
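The steps of Q.11 can be reproduced with a short Python sketch of the process. It uses exact fractions so the intermediate vectors match the hand computation, and it assumes the input vectors are linearly independent (otherwise a division by zero occurs). The function names are illustrative.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(xs):
    """Return an orthogonal list v_1..v_k spanning the same subspace as xs."""
    vs = []
    for x in xs:
        v = [Fraction(c) for c in x]
        for b in vs:
            # Subtract the projection of x onto each earlier orthogonal vector b.
            coeff = Fraction(dot(x, b)) / Fraction(dot(b, b))
            v = [vi - coeff * bi for vi, bi in zip(v, b)]
        vs.append(v)
    return vs


vs = gram_schmidt([[1, -1, -1, 1], [2, 1, 0, 1], [2, 2, 1, 2]])
for v in vs:
    print([str(c) for c in v])
# ['1', '-1', '-1', '1']
# ['3/2', '3/2', '1/2', '1/2']
# ['-1/2', '0', '1/2', '1']
```

Dividing each resulting vector by its norm (a floating-point step) then gives the orthonormal vectors $w_i$.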

3.7.1 QR-Factorisation
We use QR-Factorisation to find the linear dependence of a set of vectors. For QR-Factorisation,
we decompose the matrix A into Q · R where:
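• Q is a matrix with orthonormal columns, so that Q⊤Q = I (or Q∗Q = I in the complex case). When Q is square, it is orthogonal (Q⊤ = Q⁻¹) or unitary (Q∗ = Q⁻¹).

• R is an Upper Triangular Square Matrix.

Q can be found by applying the Gram-Schmidt Process to the columns of A and normalising the resulting vectors, and R is
found as follows.
∵ A = Q · R
Premultiplying by Q∗
∴ Q∗ · A = Q∗ · Q · R
∴ R = Q∗ · A (since Q∗ · Q = I)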
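A hedged Python sketch of this construction for real matrices is shown below: the columns of Q come from Gram-Schmidt followed by normalisation, and the entries of R are the inner products $q_i \cdot a_j$, which is exactly $R = Q^\top A$. The function name `qr` and the list-of-rows input format are assumptions; the columns of A are assumed linearly independent.

```python
import math


def qr(A):
    """QR factorisation of a real matrix (list of rows) via classical Gram-Schmidt."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    qs = []                                   # orthonormal columns found so far
    R = [[0.0] * n for _ in range(n)]
    for j, x in enumerate(cols):
        v = [float(c) for c in x]
        for i, q in enumerate(qs):
            R[i][j] = sum(a * b for a, b in zip(q, x))   # entry (i, j) of Q^T A
            v = [vk - R[i][j] * qk for vk, qk in zip(v, q)]
        R[j][j] = math.sqrt(sum(c * c for c in v))       # norm of the orthogonal part
        if R[j][j] == 0:
            raise ValueError("columns are linearly dependent")
        qs.append([c / R[j][j] for c in v])
    Q = [[qs[j][i] for j in range(n)] for i in range(m)]
    return Q, R


A = [[1, 2, 2], [-1, 1, 2], [-1, 0, 1], [1, 1, 2]]
Q, R = qr(A)
for row in R:
    print([round(c, 4) for c in row])
# [2.0, 1.0, 0.5]
# [0.0, 2.2361, 3.3541]
# [0.0, 0.0, 1.2247]
```

Multiplying Q by R and comparing with A is a simple way to validate the factorisation.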
 
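Q.12. Find a QR-Factorisation of $A = \begin{bmatrix} 1 & 2 & 2 \\ -1 & 1 & 2 \\ -1 & 0 & 1 \\ 1 & 1 & 2 \end{bmatrix}$.

For the given matrix A, the matrix Q can be found by performing Gram-Schmidt
Orthogonalisation on the columns of A and normalising the results. The columns of A are exactly the vectors $x_1, x_2, x_3$ of Q.11 (Reason: we have already seen these vectors before), so the columns of Q are the orthonormal vectors $w_1, w_2, w_3$ found there:
\[ Q = \begin{bmatrix} \frac{1}{2} & \frac{3}{2\sqrt{5}} & -\frac{1}{\sqrt{6}} \\ -\frac{1}{2} & \frac{3}{2\sqrt{5}} & 0 \\ -\frac{1}{2} & \frac{1}{2\sqrt{5}} & \frac{1}{\sqrt{6}} \\ \frac{1}{2} & \frac{1}{2\sqrt{5}} & \frac{2}{\sqrt{6}} \end{bmatrix} \]

\begin{align*}
R &= Q^* \cdot A \\
&= \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ \frac{3}{2\sqrt{5}} & \frac{3}{2\sqrt{5}} & \frac{1}{2\sqrt{5}} & \frac{1}{2\sqrt{5}} \\ -\frac{1}{\sqrt{6}} & 0 & \frac{1}{\sqrt{6}} & \frac{2}{\sqrt{6}} \end{bmatrix} \cdot \begin{bmatrix} 1 & 2 & 2 \\ -1 & 1 & 2 \\ -1 & 0 & 1 \\ 1 & 1 & 2 \end{bmatrix} \\
&= \begin{bmatrix} \frac{1 + 1 + 1 + 1}{2} & \frac{2 - 1 + 0 + 1}{2} & \frac{2 - 2 - 1 + 2}{2} \\ \frac{1.5 - 1.5 - 0.5 + 0.5}{\sqrt{5}} & \frac{3 + 1.5 + 0 + 0.5}{\sqrt{5}} & \frac{3 + 3 + 0.5 + 1}{\sqrt{5}} \\ \frac{-0.5 - 0 - 0.5 + 1}{\sqrt{1.5}} & \frac{-1 + 0 + 0 + 1}{\sqrt{1.5}} & \frac{-1 + 0 + 0.5 + 2}{\sqrt{1.5}} \end{bmatrix} \\
\therefore\ R &= \begin{bmatrix} 2 & 1 & \frac{1}{2} \\ 0 & \sqrt{5} & \frac{3\sqrt{5}}{2} \\ 0 & 0 & \frac{\sqrt{6}}{2} \end{bmatrix}
\end{align*}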

3.7.2 Best Approximation of a Vector


Let v be a vector and V be the vector space in which v resides. Let W be a subspace of V .
The best approximation of v onto W is defined as the vector in W whose distance from v is the
least from any other vector in W .

\[ \|v - \text{Best Approximation}\| \le \|v - w\| \qquad \forall w \in W \]

It has been found that the best approximation of a vector in a subspace is the orthogonal projection of that
vector onto that subspace itself, i.e. Best Approximation = $\operatorname{proj}_W(v)$.
      
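Q.13. Find the best approximation of $\begin{bmatrix} 3 \\ 2 \\ 5 \end{bmatrix}$ in the plane $W = \operatorname{span}\left\{ \vec{w_1} = \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}, \vec{w_2} = \begin{bmatrix} 5 \\ 2 \\ -1 \end{bmatrix} \right\}$.

The vectors $\vec{w_1}$ and $\vec{w_2}$ are not orthogonal because $\vec{w_1} \cdot \vec{w_2} = 5 \cdot 1 + 2 \cdot 2 + (-1) \cdot (-1) = 10 \ne 0$.
Applying the Gram-Schmidt Orthogonalisation Process,

Finding the value of $\vec{v_1}$,
\[ \vec{v_1} = \vec{w_1} = \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} \]
Finding the value of $\vec{v_2}$,
\[ \vec{v_2} = \vec{w_2} - \frac{\vec{w_2} \cdot \vec{v_1}}{|\vec{v_1}|^2} \cdot \vec{v_1} = \begin{bmatrix} 5 \\ 2 \\ -1 \end{bmatrix} - \frac{10}{6} \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 10/3 \\ -4/3 \\ 2/3 \end{bmatrix} \]

Now to find the best approximation, we find the orthogonal projection of $\vec{u} = \begin{bmatrix} 3 \\ 2 \\ 5 \end{bmatrix}$ on the
plane formed by $\vec{v_1}$ and $\vec{v_2}$.

\[ \operatorname{proj}_{\vec{v_1}} \vec{u} = \frac{3(1) + 2(2) + 5(-1)}{6} \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 1/3 \\ 2/3 \\ -1/3 \end{bmatrix} \]

\[ \operatorname{proj}_{\vec{v_2}} \vec{u} = \frac{3(10/3) + 2(-4/3) + 5(2/3)}{40/3} \begin{bmatrix} 10/3 \\ -4/3 \\ 2/3 \end{bmatrix} = \begin{bmatrix} 8/3 \\ -16/15 \\ 8/15 \end{bmatrix} \]

\[ \operatorname{proj}_W \vec{u} = \operatorname{proj}_{\vec{v_1}} \vec{u} + \operatorname{proj}_{\vec{v_2}} \vec{u} = \begin{bmatrix} 1/3 \\ 2/3 \\ -1/3 \end{bmatrix} + \begin{bmatrix} 8/3 \\ -16/15 \\ 8/15 \end{bmatrix} = \begin{bmatrix} 9/3 \\ -6/15 \\ 3/15 \end{bmatrix} = \begin{bmatrix} 3 \\ -0.4 \\ 0.2 \end{bmatrix} \]

Thus, the projection of $\begin{bmatrix} 3 \\ 2 \\ 5 \end{bmatrix}$ on the subspace W, which is its best approximation in W, is $\begin{bmatrix} 3 \\ -0.4 \\ 0.2 \end{bmatrix}$. (Ans.)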

3.7.3 Least Square Method


In many cases, there is a direct linear relation between a measured input value and the measured
output value of a system. To devise such a linear relation, we must know the slope and
y-intercept of the line we want to graph. We can never find the exact linear relation, because of
measurement errors and the inherent variability of data. What we can
do, however, is get as close as possible to the line. This is where the Least Square Method comes
in.

[Figure 3.2: Least Square Method: y = a + bx, showing sample points (x1, y1), (x2, y2), (x3, y3) scattered about the fitted line.]

The least square method seeks to minimise the sum of squares of the
absolute errors, i.e. minimise $\sum_{i=1}^{k} (y_i - (a + bx_i))^2$, where k is the number of samples taken.

Page 46 of 98
k
X
(yi − (a + bxi ))2 = (y1 − (a + bx1 ))2 + (y2 − (a + bx2 ))2 + · · · + (yk − (a + bxk ))2
i=1
    2
y1 a + bx1
y2  a + bx2 
   
=   −
 ..   .. 
 
.  . 
yk a + bxk
2
   
y1 1 x1
 " #
y2  1 x2  a
  
 ..  −  ..
=  ·
.. 
 
 .  . . b
|{z}
yk 1 xk u
| {z } | {z }
y A
2
= ∥y − Au∥

2
So we must minimise the value of ∥Au − y∥ instead. We can consider that Au is just the vector
subspace formed by the transforming matrix A for all possible values of u and for a particular
value, Au is the best approximation of y.

∵ Au = projIm(A) y
∴ y − Au ⊥ aj ∀aj ∈ Im(A)
∴ ⟨y − Au, aj ⟩ = 0F
∴ ⟨aj , y − Au⟩ = 0F In this case, F = C

∴ aj (y − Au) = 0C = 0 Since ⟨u, v⟩ = u⊤ · v; ∀u, v ∈ Ck
∴ A∗ (y − Au) = 0Ck
∴ A∗ y − A∗ Au = 0Ck
−1
∴ A∗ y = A∗ Au Pre-multiplying by (A∗ A)
−1
∴ u = (A∗ A) A∗ y
" #
a −1
Thus, the value of u = is given by (A∗ A) A∗ y
b
Q.14. Find the Best Fitted Line using Least Square Method from the set S of samples given as
follows: S = {(−2, 4), (−1, 2), (0, 1), (2, 1), (3, 1)}
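Applying the Least Square Method to find the best fitted line, the line has the equation y = a + bx, where a and b are given as follows:
\[ \begin{bmatrix} a \\ b \end{bmatrix} = u = (A^{*}A)^{-1} A^{*} y \]
where,
• $A = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_5 \end{bmatrix}^{\top} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ -2 & -1 & 0 & 2 & 3 \end{bmatrix}^{\top}$
• $y = \begin{bmatrix} y_1 & y_2 & \cdots & y_5 \end{bmatrix}^{\top} = \begin{bmatrix} 4 & 2 & 1 & 1 & 1 \end{bmatrix}^{\top}$

Therefore, $A^{*} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_5 \end{bmatrix}$.

Therefore, $A^{*}A = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ -2 & -1 & 0 & 2 & 3 \end{bmatrix} \cdot \begin{bmatrix} 1 & -2 \\ 1 & -1 \\ 1 & 0 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 5 & 2 \\ 2 & 18 \end{bmatrix}$.

$|A^{*}A| = 5 \times 18 - 2 \times 2 = 90 - 4 = 86$

Therefore, $(A^{*}A)^{-1} = \dfrac{1}{|A^{*}A|} \cdot \operatorname{adj}(A^{*}A) = \dfrac{1}{86} \begin{bmatrix} 18 & -2 \\ -2 & 5 \end{bmatrix}$

Therefore, $A^{*}y = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ -2 & -1 & 0 & 2 & 3 \end{bmatrix} \cdot \begin{bmatrix} 4 \\ 2 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 9 \\ -5 \end{bmatrix}$

Thus, $u = (A^{*}A)^{-1} A^{*} y = \dfrac{1}{86} \begin{bmatrix} 18 & -2 \\ -2 & 5 \end{bmatrix} \begin{bmatrix} 9 \\ -5 \end{bmatrix} = \dfrac{1}{86} \begin{bmatrix} 172 \\ -43 \end{bmatrix} = \begin{bmatrix} 2 \\ -0.5 \end{bmatrix}$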

Hence, the required line is y = 2 − 0.5x . (Ans.)
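The arithmetic above can be cross-checked with a short script. This is only a minimal sketch: the function name `fit_line` is made up for illustration, the samples are assumed to be real (so that A* is just the transpose of A), and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def fit_line(samples):
    """Return (a, b) minimising sum((y - (a + b*x))**2) via the normal equations."""
    k = len(samples)
    xs = [Fraction(x) for x, _ in samples]
    ys = [Fraction(y) for _, y in samples]
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # For real data: A^T A = [[k, sx], [sx, sxx]] and A^T y = [sy, sxy]
    det = k * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values coincide; the fitted line is not unique")
    # u = (A^T A)^{-1} A^T y, written out with the 2x2 adjugate
    a = (sxx * sy - sx * sxy) / det
    b = (k * sxy - sx * sy) / det
    return a, b


S = [(-2, 4), (-1, 2), (0, 1), (2, 1), (3, 1)]
a, b = fit_line(S)
print(f"y = {a} + ({b})x")
```

For the sample set S of Q.14 this reproduces a = 2 and b = −1/2.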

Chapter 4
Complex Analysis

4.1 Foundations
4.1.1 Imaginary Unit
The imaginary unit is denoted by i. It is defined as the number with the property that $i^2 := -1$.

4.1.2 Complex Numbers


The set of complex numbers C is defined as follows,

C := {z | z := x + iy : x, y ∈ R}

z = x + iy is called the Cartesian (or rectangular ) form of the complex number z. x is called
the real part of z (denoted as R(z) or Re(z)), while y is known as its imaginary part (denoted
as I(z) or Im(z)).
If I(z) = 0, then z = x + 0i = x =⇒ z ∈ R. Therefore, R ⊂ C.

4.1.3 Euler’s Formula


Euler’s formula is given by,
\[ e^{i\theta} = \cos\theta + i\sin\theta \tag{4.1} \]

Therefore, sometimes $e^{i\theta}$ or $\cos\theta + i\sin\theta$ may be denoted as cis θ.

Consider a complex number z := x + iy.

\begin{align*}
z &= x + iy \\
  &= \sqrt{x^2 + y^2} \left( \frac{x}{\sqrt{x^2 + y^2}} + i \frac{y}{\sqrt{x^2 + y^2}} \right) && \text{(multiply and divide by } \sqrt{x^2 + y^2})
\end{align*}
Let $r := \sqrt{x^2 + y^2}$ and θ be the angle such that cos θ := x/r and sin θ := y/r.
\begin{align*}
\therefore\ z &= r(\cos\theta + i\sin\theta) \\
z &= re^{i\theta} && \text{(from eq. (4.1))}
\end{align*}
$z = re^{i\theta}$ is called the Euler (or polar) form of the complex number z. r is called the modulus of z (denoted as |z|), while θ is known as its argument (denoted as arg(z)).

Euler’s Identity

Euler’s identity is a famous identity that brings the five most important mathematical constants
together in one equation.
\[ e^{i\pi} + 1 = 0 \]

Coordinate Conversion

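\[ x + iy = \operatorname{hypot}(x, y)\, e^{i \cdot \operatorname{atan2}(y, x)} \tag{4.2} \]
\[ re^{i\theta} = r\cos\theta + i \cdot r\sin\theta \tag{4.3} \]
where $\operatorname{hypot}(x, y) := \sqrt{x^2 + y^2}$ and atan2(y, x) is a special piece-wise function that returns the principal argument Arg(x + iy).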
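The two conversions map directly onto functions in Python's standard `math` module. The helper names `to_polar` and `to_cartesian` below are illustrative only; this is a sketch of eqs. (4.2) and (4.3), not a replacement for `cmath.polar` and `cmath.rect`, which do the same job.

```python
import cmath
import math


def to_polar(z):
    """Eq. (4.2): return (r, theta) with theta the principal argument in (-pi, pi]."""
    return math.hypot(z.real, z.imag), math.atan2(z.imag, z.real)


def to_cartesian(r, theta):
    """Eq. (4.3): return r*cos(theta) + i*r*sin(theta)."""
    return complex(r * math.cos(theta), r * math.sin(theta))


z = complex(-1.0, 1.0)
r, theta = to_polar(z)
print(r, theta)                # modulus sqrt(2), argument 3*pi/4
print(to_cartesian(r, theta))  # approximately recovers z
print(cmath.polar(z))          # the standard-library equivalent of to_polar
```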

4.1.4 Argand Plane


While real numbers require only a number line, complex numbers shall be represented using a
complex plane (also called the Argand plane), since there are two degrees of freedom.

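Figure 4.1: The Argand plane (the point z = x + iy at distance r from the origin, making angle θ with the positive real axis R; the vertical axis carries the unit i)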

The modulus of z, r := |z|, describes the distance of z from 0. The argument of z, θ := arg(z),
describes the angle made by z with the positive real axis.
Since coterminal angles can imply multiple values for arg(z), we confine the range of the principal
argument of z in the interval (−π, π]. The principal argument of a complex number z is denoted
as Arg(z).

When arg(z) = nπ, n ∈ Z, sin θ = 0 =⇒ I(z) = 0 =⇒ z ∈ R.

4.1.5 Complex Conjugate
The complex conjugate $\bar{z}$ of a complex number z is defined as the reflection of z about the real axis.

Figure 4.2: Complex conjugate (z and $\bar{z}$ make angles θ and −θ with the real axis)

\[ z = x + iy \iff \bar{z} = x - iy \tag{4.4} \]
\[ z = re^{i\theta} \iff \bar{z} = re^{-i\theta} \tag{4.5} \]

Properties of the conjugate

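1. $\overline{\bar{z}} = z$
2. $\bar{z} = z \iff I(z) = 0 \iff z \in \mathbb{R}$
3. $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$
4. $\overline{z_1 z_2} = \bar{z}_1 \cdot \bar{z}_2$
5. $z\bar{z} = |z|^2$
6. $R(z) = \dfrac{z + \bar{z}}{2}$
7. $I(z) = \dfrac{z - \bar{z}}{2i}$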

4.1.6 Equality of Complex Numbers


Rectangular Coordinates

Two complex numbers z1 := x1 + iy1 and z2 := x2 + iy2 are said to be equal iff x1 = x2 and
y1 = y2 .

Polar Coordinates

Two non-zero complex numbers $z_1 := r_1 e^{i\theta_1}$ and $z_2 := r_2 e^{i\theta_2}$ are said to be equal iff $r_1 = r_2$ and $\exists n \in \mathbb{Z} : \theta_1 = \theta_2 + 2n\pi$.

Page 51 of 98
4.1.7 Powers of Complex Numbers
Consider $z := re^{i\theta}$. Then $z^n = r^n e^{in\theta}$, so exponentiating z to an index n can be interpreted as rotating z about 0 by the angle θ a further (n − 1) times and scaling this rotated vector to a length of $r^n$.

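Q.1 If $z = \sqrt{3} + i$, find $z^8$.

By eq. (4.2),
\[ |z| := \operatorname{hypot}(\sqrt{3}, 1) = \sqrt{(\sqrt{3})^2 + 1^2} = 2 \]
\[ \operatorname{Arg}(z) := \operatorname{atan2}(1, \sqrt{3}) = \arctan\left(\frac{1}{\sqrt{3}}\right) = \frac{\pi}{6} \]
\[ \implies z = 2e^{i\pi/6} \]
Exponentiating to the 8th power,
\begin{align*}
\therefore\ z^8 &= \left(2e^{i\pi/6}\right)^8 \\
 &= 2^8 \operatorname{cis}\left(8 \cdot \frac{\pi}{6}\right) \\
 &= 256\, e^{i4\pi/3} \\
 &= 256\left(-\frac{1}{2} + i\left(-\frac{\sqrt{3}}{2}\right)\right) && \text{(applying Euler's formula)}
\end{align*}
\[ \implies z^8 = -128 - 128\sqrt{3}\, i \]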

(Ans.)
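The result of Q.1 can be checked numerically. The sketch below assumes the base is √3 + i and raises it to the 8th power in polar form; `power_polar` is a hypothetical helper name, and `cmath.polar` / `cmath.rect` are the standard-library conversions.

```python
import cmath
import math


def power_polar(z, n):
    """Compute z**n as r**n * e^{i*n*theta}, i.e. scale the modulus and multiply the angle."""
    r, theta = cmath.polar(z)
    return cmath.rect(r ** n, n * theta)


z = complex(math.sqrt(3), 1)
w = power_polar(z, 8)
print(w)       # close to -128 - 128*sqrt(3) i
print(z ** 8)  # the built-in power agrees up to rounding
```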

4.1.8 Roots of Complex Numbers


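Consider $z := re^{i\theta}$ to be an nth root of the complex number $w := Re^{i\phi}$.
\begin{align*}
&\implies z^n := Re^{i\phi} \\
&\therefore\ \left(re^{i\theta}\right)^n = Re^{i\phi} \\
&\therefore\ r^n e^{in\theta} = Re^{i\phi}
\end{align*}
From §4.1.6, we can deduce that:
\[ r^n = R \implies r = R^{1/n} \tag{4.6} \]
\[ \exists m \in \mathbb{Z} : n\theta = \phi + 2m\pi \implies \theta = \frac{\phi}{n} + \frac{2m\pi}{n},\quad m \in [\![0, n-1]\!] \tag{4.7} \]
By the fundamental theorem of algebra, the polynomial equation $z^n = w$ has exactly n solutions (counted with multiplicity), and for w ≠ 0 they are distinct. Values of m outside {0, 1, . . . , n − 1} only repeat these solutions, so we constrain m to the set $\{0, 1, \ldots, n-1\} := [\![0, n-1]\!]$.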
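Equations (4.6) and (4.7) translate almost line for line into code. The following is a minimal sketch (the name `nth_roots` is an assumption made for illustration) that returns the n roots for m = 0, 1, …, n − 1, using the principal argument of w as ϕ.

```python
import cmath
import math


def nth_roots(w, n):
    """Return the n solutions of z**n = w for w != 0, following eqs. (4.6) and (4.7)."""
    R, phi = cmath.polar(w)   # phi is the principal argument of w
    r = R ** (1.0 / n)        # eq. (4.6)
    return [cmath.rect(r, (phi + 2 * m * math.pi) / n) for m in range(n)]  # eq. (4.7)


for root in nth_roots(-8j, 3):
    print(root, root ** 3)    # each cube is approximately -8i
```

Applied to w = −8i with n = 3, the three values are approximately √3 − i, 2i and −√3 − i.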
Q.2. Find all the fifth roots of unity.
Consider $z := re^{i\theta} \in \mathbb{C}$ s.t. $z^5 := 1$.

We can express 1 in polar form as $e^{2i\pi}$. From eq. (4.6),
\[ r^5 = 1 \implies r = 1 \]
From eq. (4.7),
\[ \theta = \frac{2\pi}{5} + \frac{2m\pi}{5} \implies \theta = \frac{2(m+1)\pi}{5},\quad m \in [\![0, 4]\!] \]
Therefore,
\[ z \in \left\{ e^{i2\pi/5},\ e^{i4\pi/5},\ e^{i6\pi/5},\ e^{i8\pi/5},\ 1 \right\} \]

(Ans.)
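Q.3. Find all values of $(-8i)^{1/3}$.
\begin{align*}
-8i &= 0 - 8i \\
    &= 8(0 - i) \\
-8i &= 8e^{-i\pi/2}
\end{align*}
Consider $z := re^{i\theta} \in \mathbb{C}$ s.t. $z^3 := -8i$.

From eq. (4.6),
\[ r^3 = 8 \implies r = 2 \]
From eq. (4.7),
\[ \theta = -\frac{\pi/2}{3} + \frac{2m\pi}{3} \implies \theta = \frac{(4m - 1)\pi}{6},\quad m \in [\![0, 2]\!] \]
Therefore,
\begin{align*}
z &\in \left\{ 2e^{-i\pi/6},\ 2e^{i3\pi/6},\ 2e^{i7\pi/6} \right\} \\
  &\in \left\{ \sqrt{3} - i,\ 2i,\ -\sqrt{3} - i \right\}
\end{align*}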

(Ans.)
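Q.4. Evaluate $\left(-8 - 8\sqrt{3}\, i\right)^{1/4}$.

Consider $z := re^{i\theta} \in \mathbb{C}$ s.t. $z^4 = -8 - 8\sqrt{3}\, i$.
\begin{align*}
&\because\ z^4 = -8 - 8\sqrt{3}\, i \\
&\therefore\ r^4 e^{i(4\theta)} = 8(-1 - \sqrt{3}\, i) = 16\left(-\frac{1}{2} - \frac{\sqrt{3}}{2} i\right) \\
&\therefore\ r^4 e^{i(4\theta)} = 16 e^{i(-2\pi/3)}
\end{align*}
From eq. (4.6),
\[ r^4 = 16 \implies r = 2 \]
From eq. (4.7),
\[ \theta = \frac{-2\pi/3}{4} + \frac{2m\pi}{4} \implies \theta = \frac{(3m - 1)\pi}{6},\quad m \in [\![0, 3]\!] \]
Therefore,
\begin{align*}
z &\in \left\{ 2e^{-i\pi/6},\ 2e^{i\pi/3},\ 2e^{i5\pi/6},\ 2e^{i4\pi/3} \right\} \\
  &\in \left\{ \sqrt{3} - i,\ 1 + \sqrt{3}\, i,\ -\sqrt{3} + i,\ -1 - \sqrt{3}\, i \right\}
\end{align*}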

(Ans.)
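Q.5. Find the cube roots of 3 + 4i.

Consider $z := re^{i\theta}$ s.t. $z^3 = 3 + 4i$.

We can represent the complex number (3 + 4i) as $5e^{i\arctan(4/3)}$.
From eq. (4.6),
\[ r^3 = 5 \implies r = \sqrt[3]{5} \]
From eq. (4.7),
\[ \theta = \frac{\arctan(4/3) + 2m\pi}{3},\quad m \in [\![0, 2]\!] \]
Therefore,
\[ z \in \left\{ \sqrt[3]{5}\, e^{\frac{i}{3}\arctan\left(\frac{4}{3}\right)},\ \sqrt[3]{5}\, e^{\frac{i}{3}\left(\arctan\left(\frac{4}{3}\right) + 2\pi\right)},\ \sqrt[3]{5}\, e^{\frac{i}{3}\left(\arctan\left(\frac{4}{3}\right) + 4\pi\right)} \right\} \]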

(Ans.)

4.2 Complex Functions


Let set S ⊆ C.

A function f defined on the set S is a rule that assigns to each element z ∈ S a complex number w ∈ C.

w := f(z)

Here w is the dependent variable and z is the independent variable.

By convention, we denote z := x + iy, whereas we denote w := u + iv. Therefore, complex


functions can be visualized as mappings from an xy space to a uv space.

Figure 4.3: A complex function f mapping xy space to uv space (a point z in the xy plane is sent to f(z) in the uv plane)

For example, consider the function $f(z) := z^2 + iz$. For z := x + iy,
\begin{align*}
w &:= f(z) \\
  &= z^2 + iz \\
  &= (x + iy)^2 + i(x + iy) \\
  &= x^2 - y^2 + 2xyi + ix - y \\
\implies u + iv &= (x^2 - y^2 - y) + i(2xy + x)
\end{align*}
Therefore, $u = x^2 - y^2 - y$ and $v = 2xy + x$, which means u and v are themselves also functions of x and y.

Q.6. If $f(z) := z^2 + 3z$, then compute f(1 + 3i).
\begin{align*}
f(1 + 3i) &:= (1 + 3i)^2 + 3(1 + 3i) \\
          &= 1^2 - 3^2 + (2 \cdot 1 \cdot 3)i + 3 + 9i \\
\therefore\ f(1 + 3i) &= -5 + 15i
\end{align*}

(Ans.)

4.2.1 Limits
The following is the definition of a limit for real-valued functions:
\[ \lim_{x \to x_0} f(x) = L \ \text{ means } \ \forall \varepsilon > 0\ \exists \delta > 0 : |f(x) - L| < \varepsilon \impliedby 0 < |x - x_0| < \delta. \]
The limit is said to exist and be equal to L when $\lim_{x \to x_0^{+}} f(x) = \lim_{x \to x_0^{-}} f(x) = L$, as shown in fig. 4.4.

Figure 4.4: Definition of a limit (the interval (x0 − δ, x0 + δ) is mapped into the interval (L − ε, L + ε))

Complex Definition

\[ \lim_{z \to z_0} f(z) = l \ \text{ means } \ \forall \varepsilon > 0\ \exists \delta > 0 : |f(z) - l| < \varepsilon \impliedby 0 < |z - z_0| < \delta\ ^{1}. \tag{4.8} \]

1 The condition 0 < |z − z0| < δ represents a disc (excluding circumference) in the complex plane centered at z0 with radius δ, with the centre z0 itself removed.

Figure 4.5: Complex definition of a limit (f maps the disc of radius δ about z0 in the xy plane into the disc of radius ε about l in the uv plane)

Unlike real numbers, a complex number can be approached from any direction, as seen in fig. 4.5. Therefore, the complex limit exists and is equal to l only when the limit along every path of approach exists and equals l.
Q.7. Let $f(z) := \dfrac{z}{\bar{z}}$. Find $\lim_{z \to 0} f(z)$.
Let z := x + iy ⟹ $\bar{z}$ := x − iy.
\[ \therefore\ f(z) = \frac{x + iy}{x - iy} \implies \lim_{z \to 0} \frac{z}{\bar{z}} = \lim_{(x,y) \to (0,0)} \frac{x + iy}{x - iy} \]
Let us assume a line of approach y := mx to the point (0, 0) where m is the slope of the line (∵ y → 0 ⟹ mx → 0 ⟹ x → 0).
\begin{align*}
\therefore\ \lim_{z \to 0} \frac{z}{\bar{z}} &= \lim_{x \to 0} \frac{x + imx}{x - imx} \\
 &= \lim_{x \to 0} \frac{x(1 + im)}{x(1 - im)} \\
 &= \lim_{x \to 0} \frac{1 + im}{1 - im} \\
 &= \frac{1 + im}{1 - im}
\end{align*}
For different directions of approach, the value of m changes, so the value of the limit changes as well. The limit does not exist. (Ans.)
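The direction dependence is easy to observe numerically. The sketch below evaluates f(z) = z/z̄ at a point very close to 0 on several lines y = mx; the helper names `f` and `along_line` and the step 1e-9 are arbitrary choices made for illustration.

```python
def f(z):
    """f(z) = z / conj(z), defined for z != 0."""
    return z / z.conjugate()


def along_line(m, x=1e-9):
    """Evaluate f close to the origin on the line y = m*x."""
    return f(complex(x, m * x))


for m in (0.0, 1.0, -1.0, 2.0):
    predicted = complex(1, m) / complex(1, -m)   # (1 + im) / (1 - im)
    print(m, along_line(m), predicted)
```

Each slope gives a different value (1 for m = 0, i for m = 1, −i for m = −1), matching (1 + im)/(1 − im).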
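Q.8. $f(z) := |z|^2$. Find $\lim_{z \to 0} f(z)$.

Let z := x + iy ⟹ $|z| = \sqrt{x^2 + y^2}$
\[ \therefore\ \lim_{z \to 0} |z|^2 = \lim_{(x,y) \to (0,0)} (x^2 + y^2) \]
Let us assume a line of approach y := mx to the point (0, 0) where m is the slope of the line (∵ y → 0 ⟹ mx → 0 ⟹ x → 0).
\begin{align*}
\lim_{z \to 0} |z|^2 &= \lim_{x \to 0} (x^2 + m^2x^2) \\
 &= \lim_{x \to 0} x^2(1 + m^2) \\
 &= 0 \cdot (1 + m^2) && \text{(applying the limit)}
\end{align*}
The value does not depend on m. More generally, $|z|^2 < \delta^2$ whenever |z| < δ, so the limit is 0 along every path, not only along straight lines.
\[ \implies \lim_{z \to 0} |z|^2 = 0 \]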

(Ans.)
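Q.9. Suppose $f(z) := \dfrac{i \cdot \bar{z}}{2}$. Show that $\lim_{z \to 1} f(z) = \dfrac{i}{2}$.

Proof. According to eq. (4.8), we must show that
\[ \forall \varepsilon > 0\ \exists \delta > 0 : \left| \frac{i \cdot \bar{z}}{2} - \frac{i}{2} \right| < \varepsilon \impliedby 0 < |z - 1| < \delta \]
Let z := x + iy, so that $\bar{z}$ = x − iy. Then
\begin{align*}
\frac{i}{2}(\bar{z} - 1) &= i\left(\frac{x - 1}{2} - i\frac{y}{2}\right) \\
 &= \frac{y}{2} + i\frac{x - 1}{2} \\
\therefore\ \left| \frac{i}{2}(\bar{z} - 1) \right| &= \frac{1}{2} \cdot \sqrt{y^2 + (x - 1)^2}
\end{align*}
On the other hand,
\[ |z - 1| = |(x - 1) + iy| = \sqrt{(x - 1)^2 + y^2} \]
\[ \therefore\ \left| f(z) - \frac{i}{2} \right| = \frac{1}{2}|z - 1| \]
Given any ε > 0, choose δ := 2ε. Then |z − 1| < δ implies $\left| f(z) - \frac{i}{2} \right| = \frac{1}{2}|z - 1| < \frac{\delta}{2} = \varepsilon$.

Since a valid δ exists for every ε, the limit of the given function as z approaches 1 is i/2.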

4.2.2 Continuity
A complex-valued function f is said to be continuous at z0 iff the limit of the function at z0
exists and equals the value of the function at z0 . This can be defined as follows:

\[ \lim_{z \to z_0} f(z) = f(z_0) \ \text{ means } \ \forall \varepsilon > 0\ \exists \delta > 0 : |f(z) - f(z_0)| < \varepsilon \impliedby |z - z_0| < \delta. \]
Q.10. Consider a function f as follows,
\[ f(z) := \begin{cases} \dfrac{R(z^2)}{|z|^2}, & z \neq 0 \\ 0, & \text{otherwise} \end{cases} \]
Is f continuous?
Consider z := x + iy.
When we expand the function for z ≠ 0, we get $f(z) = \dfrac{x^2 - y^2}{x^2 + y^2}$. Let us assume a line of approach from an arbitrary direction specified by the equation y = mx. Thus the limit of f is
\[ \lim_{z \to 0} f(z) = \lim_{x \to 0} \frac{x^2 - m^2x^2}{x^2 + m^2x^2} = \lim_{x \to 0} \frac{x^2(1 - m^2)}{x^2(1 + m^2)} = \frac{1 - m^2}{1 + m^2} \]
After applying the limit, the result depends on the direction of approach (m). Hence the limit does not exist, and the function is not continuous at z = 0. (Ans.)

4.2.3 Differentiability
The limiting ratio of the change in the value of the function f(z) to the change in the value of the complex variable z is known as its derivative. It is defined as follows,
\[ f'(z) := \lim_{\Delta z \to 0} \frac{f(z + \Delta z) - f(z)}{\Delta z} \tag{4.9} \]
f is differentiable at z iff this limit exists.

An alternative definition, at a point z0, is given by,
\[ f'(z_0) := \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} \tag{4.10} \]
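Q.11. Is $f(z) := |z|^2$ differentiable?
By eq. (4.9),
\begin{align*}
f'(z) &:= \lim_{\Delta z \to 0} \frac{f(z + \Delta z) - f(z)}{\Delta z} \\
 &= \lim_{\Delta z \to 0} \frac{|z + \Delta z|^2 - |z|^2}{\Delta z} \\
 &= \lim_{\Delta z \to 0} \frac{(z + \Delta z)\overline{(z + \Delta z)} - z\bar{z}}{\Delta z} \\
 &= \lim_{\Delta z \to 0} \frac{(z + \Delta z)(\bar{z} + \overline{\Delta z}) - z\bar{z}}{\Delta z} \\
 &= \lim_{\Delta z \to 0} \frac{z\bar{z} + z\overline{\Delta z} + \bar{z}\Delta z + \Delta z\overline{\Delta z} - z\bar{z}}{\Delta z} \\
 &= \lim_{\Delta z \to 0} \left( \bar{z} + z\frac{\overline{\Delta z}}{\Delta z} + \overline{\Delta z} \right) \\
 &= \bar{z} + z \lim_{\Delta z \to 0} \frac{\overline{\Delta z}}{\Delta z} && (\Delta z \to 0 \iff \overline{\Delta z} \to 0)
\end{align*}
Assuming that ∆z approaches 0 along the x-axis, $\overline{\Delta z} = \Delta z$ and the given limit equals $(\bar{z} + z)$.

Assuming that ∆z approaches 0 along the y-axis, $\overline{\Delta z} = -\Delta z$ and the given limit equals $(\bar{z} - z)$.

The two values agree only when z = 0. Since the value of the limit differs for different directions of approach at every other point, the function is not differentiable at any point z ≠ 0. (Ans.)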
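The two directional values can be reproduced with a finite difference quotient. This is only a numerical illustration under assumed choices (the sample point 1 + 2i, the step 1e-6 and the helper name `quotient` are arbitrary), not a proof.

```python
def quotient(f, z, dz):
    """Difference quotient (f(z + dz) - f(z)) / dz from eq. (4.9), before taking the limit."""
    return (f(z + dz) - f(z)) / dz


def f(z):
    return abs(z) ** 2


z = 1 + 2j
h = 1e-6
along_x = quotient(f, z, h)        # approach along the real direction
along_y = quotient(f, z, 1j * h)   # approach along the imaginary direction

print(along_x, z.conjugate() + z)  # both close to 2
print(along_y, z.conjugate() - z)  # both close to -4i
```

At z = 1 + 2i the two quotients approach 2 and −4i respectively, so no single derivative exists there.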

Necessary condition for differentiability

Assume z := x + iy is a complex number and f is a complex-valued function s.t. f(x, y) := u(x, y) + i · v(x, y). If f is differentiable at z = z0 then it means that the below limit exists.
\[ f'(z_0) := \lim_{\Delta z \to 0} \frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z} = \lim_{(\Delta x, \Delta y) \to (0,0)} \frac{f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0)}{\Delta x + i\,\Delta y} \]
Assuming that ∆z → 0 along the x-axis i.e. ∆y = 0, the above limit becomes:
\begin{align*}
f'(z_0) &= \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x, y_0) - f(x_0, y_0)}{\Delta x} \\
 &= \frac{\partial}{\partial x} f(x_0, y_0) = \frac{\partial}{\partial x} u(x_0, y_0) + i\frac{\partial}{\partial x} v(x_0, y_0) \\
 &= u_x + iv_x
\end{align*}
Assuming that ∆z → 0 along the y-axis i.e. ∆x = 0, the above limit becomes:
\begin{align*}
f'(z_0) &= \lim_{\Delta y \to 0} \frac{f(x_0, y_0 + \Delta y) - f(x_0, y_0)}{i\,\Delta y} \\
 &= \frac{i}{i^2} \cdot \frac{\partial}{\partial y} f(x_0, y_0) = -i\left( \frac{\partial}{\partial y} u(x_0, y_0) + i\frac{\partial}{\partial y} v(x_0, y_0) \right) \\
 &= v_y - iu_y
\end{align*}
We know that for a complex limit to exist, it must be equal for all directions of approach, so from the above two equations we can say,
\[ u_x = v_y \quad \text{and} \quad v_x = -u_y \tag{4.11} \]

The above equations are known as the Cauchy-Riemann (CR) equations and they hold true
for every complex differentiable function.

Sufficient conditions for differentiability

For a complex function f to be complex differentiable at z = z0, it is sufficient that the following two conditions hold:
i) The function satisfies the CR equations.
ii) The partial derivatives ux , vx , uy and vy are continuous in the neighborhood of z0 .
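The CR equations can be spot-checked numerically at a point by approximating the four partial derivatives with central differences. The sketch below is an illustration only: `partials` and `satisfies_cr` are hypothetical helper names, the step and tolerance are arbitrary, and a numerical check at one point cannot establish continuity of the partial derivatives (condition ii).

```python
def partials(f, x, y, h=1e-6):
    """Approximate (u_x, v_x, u_y, v_y) of f = u + iv at (x, y) by central differences."""
    fx = (complex(f(complex(x + h, y))) - complex(f(complex(x - h, y)))) / (2 * h)
    fy = (complex(f(complex(x, y + h))) - complex(f(complex(x, y - h)))) / (2 * h)
    return fx.real, fx.imag, fy.real, fy.imag


def satisfies_cr(f, x, y, tol=1e-5):
    """True when u_x = v_y and v_x = -u_y hold at (x, y) up to the tolerance."""
    ux, vx, uy, vy = partials(f, x, y)
    return abs(ux - vy) < tol and abs(vx + uy) < tol


print(satisfies_cr(lambda z: z * z + 1j * z, 0.3, -0.7))  # a polynomial in z
print(satisfies_cr(lambda z: abs(z) ** 2, 0.3, -0.7))     # |z|^2 away from the origin
```

The polynomial z² + iz passes the check, while |z|² fails it at the chosen point.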

4.2.4 Holomorphic Functions


A complex function is said to be holomorphic in an open subset U of C if it is complex differen-
tiable for every point in U .

4.2.5 Analytic Functions
A complex function is said to be analytic at a given point z0 of its domain if it can be locally represented as a convergent power series around z0. That is, ∃R > 0 such that f(z) can be written as $\sum_{n=0}^{\infty} a_n (z - z_0)^n$ for all z with |z − z0| < R.
If a complex function is holomorphic in a given open region of its domain then it is analytic at every point of that region.
Q.12. Check the analyticity of the function $f(z) := |z|^2$.
Given that $f(z) := |z|^2$.
Therefore, $f(z) = f(x, y) = x^2 + y^2 \implies u(x, y) = x^2 + y^2 \wedge v(x, y) = 0$.
Since $u_x = 2x$ and $v_y = 0$, the CR equations are not satisfied (except at the origin). Therefore, the function is not holomorphic and therefore not analytic. (Ans.)
Q.13. Show that if f(z) and $\overline{f(z)}$ are both analytic then f is a constant function.
Proof. Let f(z) := u + iv ⟺ $\overline{f(z)}$ = u − iv. If f(z) is analytic then it means that the CR equations are satisfied: $u_x = v_y$ and $v_x = -u_y$.

Similarly if $\overline{f(z)}$ is analytic then we get $u_x = (-v)_y \implies u_x = -v_y$ and $(-v)_x = -u_y \implies v_x = u_y$.
From the above two results we know that $u_x = -u_x$ and $v_x = -v_x$. This can occur iff $u_x = 0 \wedge v_x = 0$.

∴ $f'(z) := u_x + iv_x = 0 \implies f(z) = c$, where c is a constant.

Hence, if a complex function and its conjugate are both analytic then we can say that the function is constant.

4.2.6 Entire Functions


A complex function is said to be entire if it is holomorphic for the entire complex plane C.

Q.14. Check whether $f(z) := (z^2 - 2)e^{-x}e^{-iy}$ is an entire function or not.
\begin{align*}
f(z) &:= (z^2 - 2)e^{-x}e^{-iy} \\
 &= e^{-x}(x^2 - y^2 - 2 + 2ixy)(\cos y - i\sin y) \\
 &= e^{-x}\left((x^2 - y^2 - 2)\cos y + 2xy\sin y\right) + i\,e^{-x}\left(2xy\cos y - (x^2 - y^2 - 2)\sin y\right)
\end{align*}
Comparing with f = u + iv, we can say that:
\begin{align*}
u(x, y) &= e^{-x}\left((x^2 - y^2 - 2)\cos y + 2xy\sin y\right) \\
\therefore\ u_x &= (2x\cos y + 2y\sin y)e^{-x} - \left((x^2 - y^2 - 2)\cos y + 2xy\sin y\right)e^{-x} \\
 &= \left((y^2 + 2x - x^2 + 2)\cos y + (2y - 2xy)\sin y\right)e^{-x} \\
\therefore\ u_y &= \left(-2y\cos y - (x^2 - y^2 - 2)\sin y + 2x\sin y + 2xy\cos y\right)e^{-x} \\
 &= \left((y^2 + 2x - x^2 + 2)\sin y - (2y - 2xy)\cos y\right)e^{-x}
\end{align*}
and
\begin{align*}
v(x, y) &= e^{-x}\left(2xy\cos y - (x^2 - y^2 - 2)\sin y\right) \\
\therefore\ v_x &= (2y\cos y - 2x\sin y)e^{-x} - \left(2xy\cos y - (x^2 - y^2 - 2)\sin y\right)e^{-x} \\
 &= \left((2y - 2xy)\cos y + (x^2 - 2x - y^2 - 2)\sin y\right)e^{-x} \\
\therefore\ v_y &= \left(2x\cos y - 2xy\sin y + 2y\sin y - (x^2 - y^2 - 2)\cos y\right)e^{-x} \\
 &= \left((y^2 + 2x - x^2 + 2)\cos y + (2y - 2xy)\sin y\right)e^{-x}
\end{align*}
As we can clearly see, $u_x = v_y$, $v_x = -u_y$ and the partial derivatives $u_x$, $v_x$, $u_y$ and $v_y$ are continuous over the entire complex plane². Therefore, the function is holomorphic over the entire complex plane and is entire. (Ans.)

4.2.7 Harmonic Functions


A real valued function u of two variables, say x and y, is said to be harmonic if its first and second order partial derivatives exist and are continuous, and they satisfy the Laplace equation:
\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \tag{4.12} \]

Harmonic Conjugates

If two real valued harmonic functions u and v exist such that the function u + iv is analytic in
nature, then v is known as the harmonic conjugate of u.
Q.15. Find the analytic function f(z) := u + iv whose real part is defined as follows: u = sin x · cosh y.
Since f is analytic, we can say that it satisfies the CR equations, i.e. $u_x = v_y = \cos x \cdot \cosh y$ and $u_y = -v_x = \sin x \cdot \sinh y$.

Integrating $v_y$ with respect to y:
\begin{align*}
\frac{\partial v}{\partial y} &= \cos x \cdot \cosh y \\
\therefore\ \int dv &= \int \cos x \cdot \cosh y \, dy \\
\therefore\ v &= \cos x \int \cosh y \, dy \\
\therefore\ v &= \cos x \cdot \sinh y + c_1(x)
\end{align*}
Integrating $v_x$ with respect to x:
\begin{align*}
\frac{\partial v}{\partial x} &= -\sin x \cdot \sinh y \\
\therefore\ \int dv &= \int -\sin x \cdot \sinh y \, dx \\
\therefore\ v &= \sinh y \int -\sin x \, dx \\
\therefore\ v &= \sinh y \cdot \cos x + c_2(y)
\end{align*}
Notice that the constants of integration are actually functions of x and y respectively. This is because while taking a partial derivative, the other variables are considered as constants and consequently the partial derivatives of their functions evaluate to zero.
Now on comparing the two expressions for v which we have evaluated, we get the equation $c_1(x) = c_2(y)$. A function of x alone can equal a function of y alone only if both are the same constant, i.e. $c_1(x) = c_2(y) = k$ where k ∈ R (v is real valued).
Therefore, the entire function f is

f(x, y) := sin x · cosh y + i(cos x · sinh y + k) where k ∈ R


2 because they are compositions of polynomial, exponential and trigonometric sine and cosine functions.

(Ans.)
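Alternatively,
$u_x = \cos x \cdot \cosh y$ and $u_y = \sin x \cdot \sinh y$.
From the CR equations, we can deduce that $v_x = -u_y = -\sin x \cdot \sinh y$ and $v_y = u_x = \cos x \cdot \cosh y$.

Now,
\begin{align*}
v_y &:= \frac{\partial v}{\partial y} \\
\implies \frac{\partial v}{\partial y} &= \cos x \cdot \cosh y \\
\therefore\ \int dv &= \int \cos x \cdot \cosh y \, dy \\
\therefore\ v &= \cos x \cdot \sinh y + \phi(x), \text{ where } \phi(x) \text{ is a function independent of } y.
\end{align*}
Differentiating v with respect to x,
\begin{align*}
v_x &= \frac{\partial}{\partial x}\left(\cos x \cdot \sinh y + \phi(x)\right) \\
\therefore\ v_x &= -\sin x \cdot \sinh y + \phi'(x)
\end{align*}
However, we know from the CR equations that $v_x = -\sin x \cdot \sinh y \implies \phi'(x) = 0 \implies \phi(x) = k$, where k ∈ R is a constant³.

Therefore, the entire function f is

f(x, y) := sin x · cosh y + i(cos x · sinh y + k) where k ∈ R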

(Ans.)
The above procedure can be summarised with the following algorithm:

Algorithm 1: Obtaining analytic function when real part is known


Data: Real part u is given.
1 Find ux and uy .
2 Using CR equations, obtain vx and vy .
3 Integrate vy or vx with respect to y or x respectively.
4 Differentiate v obtained in step 3 with respect to x or y depending on the variable taken
in step 3.
5 Compare vx or vy depending on the variable chosen in step 3.
Result: f (z) := u + iv.
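As a sanity check on Q.15, the pair u = sin x · cosh y, v = cos x · sinh y (taking the constant k = 0) can be compared against the complex sine from the standard library, since sin(x + iy) = sin x cosh y + i cos x sinh y. The sample points below are arbitrary and the function names are illustrative.

```python
import cmath
import math


def u(x, y):
    """Given real part from Q.15."""
    return math.sin(x) * math.cosh(y)


def v(x, y, k=0.0):
    """Harmonic conjugate found above, with the additive constant k."""
    return math.cos(x) * math.sinh(y) + k


for x, y in [(0.3, -0.7), (1.2, 0.5), (-2.0, 1.1)]:
    w = cmath.sin(complex(x, y))
    print(w, complex(u(x, y), v(x, y)))   # the two columns should agree
```

With k = 0 the function found in Q.15 is therefore f(z) = sin z.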

Q.16. If $v := 4x^3y - 4xy^3$, then find the analytic function f(z) := u + iv.
Given $v := 4x^3y - 4xy^3$, therefore $v_x = 12x^2y - 4y^3$ and $v_y = 4x^3 - 12xy^2$.
3 ϕ(x) was independent of y, therefore ϕ′ (x) = 0 iff it is a constant.

Algorithm 2: Obtaining analytic function when imaginary part is known
Data: Imaginary part v is given.
1 Find vx and vy .
2 Using CR equations, obtain ux and uy .
3 Integrate uy or ux with respect to y or x respectively.
4 Differentiate u obtained in step 3 with respect to x or y depending on the variable taken
in step 3.
5 Compare ux or uy depending on the variable chosen in step 3.
Result: f (z) := u + iv.

From the CR equations, we can say that $u_x = v_y$ and $v_x = -u_y$.
\begin{align*}
\frac{\partial u}{\partial x} &= 4x^3 - 12xy^2 \\
\therefore\ du &= (4x^3 - 12xy^2)\, dx \\
\therefore\ \int du &= \int (4x^3 - 12xy^2)\, dx && \text{(integrating both sides)} \\
\therefore\ u &= x^4 - \frac{12}{2}x^2y^2 + \phi(y) = x^4 - 6x^2y^2 + \phi(y)
\end{align*}
Now consider the partial derivative of u with respect to y.
\begin{align*}
\frac{\partial u}{\partial y} &= \frac{\partial}{\partial y}\left(x^4 - 6x^2y^2 + \phi(y)\right) \\
u_y &= -12x^2y + \phi'(y) \\
u_y &= -\left(12x^2y - \phi'(y)\right)
\end{align*}
We know from the CR equations that $v_x = -u_y$. Substituting the values of $v_x$ and $u_y$ we find that $\phi'(y) = 4y^3$. Therefore $\phi(y) = \int 4y^3\, dy = y^4 + c$ where c ∈ R.

Therefore the function f is equal to $(x^4 - 6x^2y^2 + y^4) + i(4x^3y - 4xy^3) + c$ or

$f(z) = z^4 + c$ where c ∈ R

(Ans.)

4.2.8 Milne-Thomson’s Method


Milne-Thomson's method is another way to solve directly for an analytic function f(z) when only its real (or imaginary) part is known.

Algorithm 3: Finding analytic f (z) when u(x, y) or v(x, y) is given


Data: Real (u) or imaginary (v) part is given.
1 $f'(z) := u_x - iu_y$ or $f'(z) := v_y + iv_x$ from CR.
2 In the partial derivatives, substitute x = z and y = 0 ⁴.
3 Integrate the $f'(z)$ now obtained with respect to z to get f(z).
Result: f(z) := u + iv

4 The reason being that $x = \dfrac{z + \bar{z}}{2}$ and $y = \dfrac{z - \bar{z}}{2i}$, which remains true even when x and y ∉ R, meaning that z and $\bar{z}$ can be treated as independent and can be equated to each other.
Q.17. If $v := 4x^3y - 4xy^3$, then find the analytic function f(z) := u + iv using Milne-Thomson's method.
Given $v := 4x^3y - 4xy^3$, therefore $v_x = 12x^2y - 4y^3$ and $v_y = 4x^3 - 12xy^2$.
We know from the CR equations that $u_x = v_y$ and $v_x = -u_y$. Thus $f'(z) = v_y + iv_x$.

Substituting x = z and y = 0 in $v_x$ and $v_y$, $f'(z) = (4z^3 - 0) + i(0 - 0) = 4z^3$.

Upon integration w.r.t. z, we get $f(z) = \int 4z^3\, dz = z^4 + c$, where the constant c must be real for the imaginary part of f to equal the given v.
(Ans.)
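The three steps of Milne-Thomson's method can be imitated numerically for this example. The sketch below is one possible illustration, not the method's canonical form: it evaluates f′(z) = v_y(z, 0) + i·v_x(z, 0) and integrates it along the straight segment from 0 to z with the composite Simpson rule (an assumed choice of quadrature, fixing the constant of integration by f(0) = 0). All function names are illustrative.

```python
def v_x(x, y):
    return 12 * x ** 2 * y - 4 * y ** 3


def v_y(x, y):
    return 4 * x ** 3 - 12 * x * y ** 2


def f_prime(z):
    """Steps 1 and 2: f'(z) = v_y + i*v_x with x replaced by z and y by 0."""
    return v_y(z, 0) + 1j * v_x(z, 0)


def integrate_from_zero(g, z, steps=200):
    """Step 3: composite Simpson rule along the segment from 0 to z (steps must be even)."""
    h = z / steps
    total = g(0) + g(z)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3


z = 1.5 - 0.5j
F = integrate_from_zero(f_prime, z)
print(F, z ** 4)   # the two values should agree up to rounding
print(F.imag, 4 * z.real ** 3 * z.imag - 4 * z.real * z.imag ** 3)
```

The integral reproduces z⁴, and its imaginary part matches the given v at the sample point.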

Finding the entire function when sum or difference of the real and imaginary parts
is given

Consider a complex number w := u + iv which is the output of a complex analytic function f, i.e. f(z) = u + iv.
Now consider the function Φ(z) := (1 + i)f(z). This function is also complex differentiable at all points of the given domain of f, meaning it is holomorphic on the same region as f and hence is analytic as well. Now Φ(z) := (1 + i)f(z) = (u − v) + i(u + v).
Since we know the sum u + v (or the difference u − v), we know the imaginary (or real) part of the analytic function Φ, and hence we can apply Milne-Thomson's method to find Φ. Then we can just divide by (1 + i) to get the original function f.
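Q.18. Find the analytic function f(z) = u + iv when $u + v = \dfrac{2\sin 2x}{e^{2y} + e^{-2y} - 2\cos 2x}$.

We know that,
\begin{align*}
(1 + i)f(z) &= (u - v) + i(u + v) \\
 &:= U + iV
\end{align*}
Thus, $V := u + v = \dfrac{2\sin 2x}{e^{2y} + e^{-2y} - 2\cos 2x}$.
Differentiating,
\begin{align*}
\therefore\ V_x &= \frac{\left(e^{2y} + e^{-2y} - 2\cos 2x\right) 4\cos 2x - 2\sin 2x\,(4\sin 2x)}{\left(e^{2y} + e^{-2y} - 2\cos 2x\right)^2} \\
 &= \frac{\left(e^{2y} + e^{-2y}\right) 4\cos 2x - 8\cos^2 2x - 8\sin^2 2x}{\left(e^{2y} + e^{-2y} - 2\cos 2x\right)^2} \\
\therefore\ V_x &= \frac{\left(e^{2y} + e^{-2y}\right) 4\cos 2x - 8}{\left(e^{2y} + e^{-2y} - 2\cos 2x\right)^2} \\
\therefore\ V_y &= -\frac{2\sin 2x\,\left(2e^{2y} - 2e^{-2y}\right)}{\left(e^{2y} + e^{-2y} - 2\cos 2x\right)^2}
\end{align*}
By Milne-Thomson's method,
\begin{align*}
(1 + i)f'(z) &= V_y(z, 0) + iV_x(z, 0) \\
 &= -\frac{2\sin 2z\,(2 - 2)}{(1 + 1 - 2\cos 2z)^2} + i\,\frac{(1 + 1)\,4\cos 2z - 8}{(1 + 1 - 2\cos 2z)^2} \\
 &= i\,\frac{8(\cos 2z - 1)}{(2 - 2\cos 2z)^2} \\
 &= -8i\,\frac{1 - \cos 2z}{4(1 - \cos 2z)^2} \\
 &= -\frac{2i}{1 - \cos 2z} \\
 &= -\frac{i}{\sin^2 z} && (\text{since } 1 - \cos 2z = 2\sin^2 z) \\
\therefore\ (1 + i)f'(z) &= -i \cdot \csc^2 z \\
\implies (1 + i)f(z) &= -\int i \cdot \csc^2 z \, dz \\
\therefore\ (1 + i)f(z) &= i\cot z + c, \text{ where } c \text{ is the arbitrary constant of integration.} \\
\therefore\ f(z) &= \frac{i\cot z}{1 + i} + \frac{c}{1 + i}
\end{align*}
Let $k := \dfrac{c}{1 + i}$ be another constant.
\[ \therefore\ f(z) = \left(\frac{1 + i}{2}\right)\cot z + k \]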

(Ans.)
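The answer to Q.18 can be spot-checked numerically. The sketch below assumes the denominator uses e^{2y} and e^{−2y}, as in the derivation, and takes k = 0; the names `f` and `target` and the sample points are illustrative. It compares Re f + Im f with the prescribed u + v.

```python
import cmath
import math


def f(z):
    """Candidate answer with k = 0: f(z) = (1 + i)/2 * cot(z)."""
    return (1 + 1j) / 2 * cmath.cos(z) / cmath.sin(z)


def target(x, y):
    """Prescribed u + v = 2 sin 2x / (e^{2y} + e^{-2y} - 2 cos 2x)."""
    return 2 * math.sin(2 * x) / (math.exp(2 * y) + math.exp(-2 * y) - 2 * math.cos(2 * x))


for x, y in [(0.4, 0.9), (1.1, -0.3), (-0.8, 0.6)]:
    w = f(complex(x, y))
    print(w.real + w.imag, target(x, y))   # the two columns should agree
```

Note that k = 0 is consistent with the prescribed sum; a non-zero k would shift u + v by Re k + Im k.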

4.2.9 Bijectivity of Complex Functions


Consider a complex function f : A → B. The function f is said to be bijective if it follows the
two following conditions:
• Injectivity (one-one): ∀z1 , z2 ∈ A f (z1 ) = f (z2 ) =⇒ z1 = z2 i.e. each image has one
and only one pre-image.
• Surjectivity (onto): ∀w ∈ B ∃z ∈ A : w = f (z) i.e. Range of f = B.
Q.19. Restrict the domain and range such that the function exp(z), z ∈ C is bijective.

Assume that the function f : C → C is defined as: f(z) := exp(z).
\begin{align*}
z := x + iy \implies f(z) &= \exp(x + iy) \\
 &= \exp(x) \cdot \exp(iy)
\end{align*}
For $f(z) = e^z$ to be an onto function:

Let the codomain of f be C \ {0} because $e^z$ can never be zero.

For $f(z) = e^z$ to be a one-one function, $e^{z_1} = e^{z_2} \implies z_1 = z_2$.

Consider $z_1 := x_1 + iy_1$ and $z_2 := x_2 + iy_2$.
\begin{align*}
\therefore\ e^{z_1} = e^{z_2} &\implies e^{x_1 + iy_1} = e^{x_2 + iy_2} \\
 &\implies e^{x_1}e^{iy_1} = e^{x_2}e^{iy_2}
\end{align*}
This is true iff $x_1 = x_2$ and $y_1 = y_2 + 2n\pi$, n ∈ Z. The function is therefore not injective on C, since points whose imaginary parts differ by a multiple of 2π share the same image. Restricting y to an interval of length 2π removes the repetition, so let the domain of f be {x + iy | y ∈ [0, 2π), x ∈ R}.

Domain of f : {x + iy | y ∈ [0, 2π), x ∈ R}

Codomain of f : C \ {0}

(Ans.)
Q.20. Restrict the domain and range such that the function $z^2$, z ∈ C is bijective.

Assume that the function f : C → C is defined as: $f(z) := z^2$.
\[ z := x + iy \implies f(z) = z^2 = (x^2 - y^2) + i(2xy) \]
$f(z) = z^2$ is already an onto function because if z is written in the polar form ($re^{i\theta}$) then f(z) becomes $r^2e^{i(2\theta)}$. We can choose r and θ to obtain every complex number in C.

For $f(z) := z^2$ to be a one-one function: $(z_1)^2 = (z_2)^2 \implies z_1 = z_2$ must be true. But:
\begin{align*}
(z_1)^2 = (z_2)^2 &\implies (z_1)^2 - (z_2)^2 = 0 \\
&\therefore\ (z_1 - z_2)(z_1 + z_2) = 0 \\
&\therefore\ z_1 = z_2 \text{ or } z_1 = -z_2
\end{align*}
To make f one-to-one, we must select half of the plane such that z and −z never appear in the domain together (for z ≠ 0), while the range still covers the complex plane. Selecting the set {x + iy | x > 0} ∪ {x + iy | x = 0, y ≥ 0} as the domain, we get the entire complex plane C as the range.

Domain of f : {x + iy | x > 0} ∪ {x + iy | x = 0, y ≥ 0}
Codomain of f : C

(Ans.)

4.2.10 Some Special Mappings


A few of the characteristic complex functions or mappings are widely used and hence, have been
given special names. They are as follows:
• Translation Map: f(z) := z + c where c ∈ C, translates all the points in the complex plane by the vector $\begin{bmatrix} \operatorname{Re}(c) & \operatorname{Im}(c) \end{bmatrix}^{\top}$.
• Rotation and Scaling Map: f(z) := cz where c ∈ C, applies the linear transformation characterised by $\begin{bmatrix} |c|\cos(\arg(c)) & -|c|\sin(\arg(c)) \\ |c|\sin(\arg(c)) & |c|\cos(\arg(c)) \end{bmatrix}$ to all the points in the complex plane.
• Inverse Map: f(z) := 1/z is an important function for many reasons, but its most apparent useful property is that it is the inverse of itself.
Q.21. Find the image of the rectangle (in the complex plane) bound by the lines x = 0, y = 0,
x = 2, y = 2 when the function f which is defined as f (z) := z − (1 − i) is applied to every
point in the complex plane.

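[Figure: the pre-image rectangle with corners (0, 0), (2, 0), (2, 2), (0, 2) in the xy plane and its image with corners (−1, 1), (1, 1), (1, 3), (−1, 3) in the uv plane]

Since f(z) = z + (−1 + i), every point is translated by (−1, 1). As we can see, the points of the image rectangle are now bound by the lines u = −1, u = 1, v = 1 and v = 3. (Ans.)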
Q.22. Find the image of the circle (in the complex plane) which follows the equation |z| = 2
when the function f which is defined as f (z) := z − (3 + 2i) is applied to every point in
the complex plane.

The output of the function f will be another complex number (say w).

∵ w = f (z) := z − (3 + 2i) =⇒ z = w + (3 + 2i)


∴ |z| = 2 =⇒ |w + (3 + 2i)| = 2 ⇐⇒ |w − (−3 − 2i)| = 2

The image of the circle under f is another circle of the same radius, translated so that its centre lies at the point (−3, −2) in the complex plane. (Ans.)

[Figure: the pre-image circle of radius r = 2 centred at (0, 0) and the image circle of radius r = 2 centred at (−3, −2)]

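Q.23. Consider the function f : C → C where f(z) := 3z + (2 + i) and its effect on the region in the complex plane which satisfies the relation |z − 1| = 1.

First express x and y in terms of u and v:
\begin{align*}
f(z) &:= 3z + (2 + i) \\
\therefore\ f(x + iy) &:= (3x + 2) + i(3y + 1) \\
\therefore\ (u + iv) &:= (3x + 2) + i(3y + 1) && \text{(let } f(x + iy) = u + iv) \\
\therefore\ x &= \frac{u - 2}{3} \text{ and } y = \frac{v - 1}{3}
\end{align*}
Then substitute into the equation of the region:
\begin{align*}
|z - 1| = 1 &\implies \sqrt{(x - 1)^2 + y^2} = 1 \\
&\implies (x - 1)^2 + y^2 = 1^2 \\
&\implies \left(\frac{u - 5}{3}\right)^2 + \left(\frac{v - 1}{3}\right)^2 = 1 \\
&\implies (u - 5)^2 + (v - 1)^2 = 3^2
\end{align*}
which is another circle centered at (5, 1) and with radius 3. (Ans.)

[Figure: the pre-image circle centred at (1, 0) and the image circle centred at (5, 1), drawn in the complex plane with axes R and I]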

Q.24. Consider the function f : C → C where f (z) := (1 + i) · z and its effect on the region in
the complex plane bound by x = 0, y = 0, x = 1 and y = 1.
Let z := x + iy and f := u + iv. Therefore,

f (z) := (1 + i) · (x + iy) =⇒ (u + iv) = (x − y) + i(x + y) =⇒ u = x − y and v = x + y

On solving this linear system of equations we get x = (u + v)/2 and y = (v − u)/2 which
we substitute in the different equations governing the shape of the (square) region.

1. x = 0 ⟹ (u + v)/2 = 0 ⟹ u + v = 0 ⟹ u = −v
2. y = 0 ⟹ (v − u)/2 = 0 ⟹ v − u = 0 ⟹ u = v
3. x = 1 ⟹ (u + v)/2 = 1 ⟹ u + v = 2 ⟹ v = −u + 2
4. y = 1 ⟹ (v − u)/2 = 1 ⟹ v − u = 2 ⟹ v = u + 2

[Figure: the pre-image square and its rotated, enlarged image, drawn in the complex plane with axes R and I]

Another square which is rotated by 45° about the origin (in the counterclockwise direction) and has $\sqrt{2}$ times larger sides is the final image. (Ans.)
Q.25. Find out what happens to the circle |z| = 1 in the complex plane when we apply the
transformation f := 1/z to the entire plane.

For $w = f(z) := \dfrac{1}{z} = \dfrac{\bar{z}}{|z|^2}$; $u + iv := \dfrac{x}{|z|^2} + i\dfrac{-y}{|z|^2} \implies x = |z|^2 \cdot u$ and $y = -|z|^2 \cdot v$.
For the points ($z_0$) on the circle where |z| = 1:
\[ |z| = 1 \implies \sqrt{x^2 + y^2} = 1 \implies x^2 + y^2 = 1 \implies |z_0|^4(u^2 + v^2) = 1 \implies u^2 + v^2 = 1. \]
If we have a closer look at the equation $w = \dfrac{1}{|z|^2}(x - iy)$ we can see that every point in the complex plane is flipped about the real axis and its distance from the origin is replaced by the multiplicative inverse of that distance.
In simpler words, all points have been reflected about the real axis, the circle |z| = 1 is mapped onto itself, and points outside the circle |z| = 1 are mapped to locations inside the circle and vice-versa. (Ans.)

Q.26. Find the image of the circle in the complex plane described by the equation |z − 3| = 1
when the inversion map 1/z is applied to the entire plane.
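Let the set K := {k : |k − 3| = 1, k ∈ C} represent the set of values for which we are finding the image of the function and w be the corresponding image of a given element in K. $w \in f(K) \iff f^{-1}(w) \in K \iff 1/w \in K$.
\begin{align*}
&\because\ 1/w \in K \implies |1/w - 3| = 1 \\
&\therefore\ |(1 - 3w)/w| = 1 && \text{(since } |a/b| = |a|/|b|,\ \forall a, b \in \mathbb{C},\ b \neq 0) \\
&\therefore\ |1 - 3w| = |w| \implies |1 - 3w|^2 = |w|^2 \\
&\therefore\ (1 - 3w)\overline{(1 - 3w)} = w \cdot \bar{w} \\
&\therefore\ (1 - 3w)(1 - 3\bar{w}) = w \cdot \bar{w} \\
&\therefore\ 1 - 3(w + \bar{w}) + 9w \cdot \bar{w} = w \cdot \bar{w} \\
&\therefore\ 1 - 3 \cdot 2\operatorname{Re}(w) + 8w \cdot \bar{w} = 0 \\
&\therefore\ 1 - 6u + 8(u^2 + v^2) = 0 && \text{(let } w := u + iv) \\
&\therefore\ 8u^2 - 6u + 9/8 + 8v^2 + 1 = 9/8 && \text{(completing the square)} \\
&\therefore\ \left(2\sqrt{2}\,u - \frac{3}{2\sqrt{2}}\right)^2 + 8v^2 = 1/8 \\
&\therefore\ (u - 3/8)^2 + v^2 = (1/8)^2 && \text{(dividing by 8 on both sides)}
\end{align*}
The circle centred at (3, 0) with radius 1 gets mapped to another circle centred at (3/8, 0) with a radius of 1/8, i.e.
K := {k : |k − 3| = 1, k ∈ C} becomes W := {w : |w − 3/8| = 1/8, w ∈ C} (Ans.)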
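A quick numerical check of this image: the sketch below samples points on the circle |z − 3| = 1, applies the inversion map, and measures the distance of each image from 3/8. The helper name `circle_points` and the number of samples are arbitrary choices made for illustration.

```python
import cmath
import math


def circle_points(center, radius, n=12):
    """Return n equally spaced points on the circle |z - center| = radius."""
    return [center + cmath.rect(radius, 2 * math.pi * k / n) for k in range(n)]


images = [1 / z for z in circle_points(3, 1)]
distances = [abs(w - 3 / 8) for w in images]
print(min(distances), max(distances))   # both should be close to 1/8
```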

Q.27. Find the image of the circle |z − 1| = 1 under w = 1/z.


Let the set K := {k : |k − 1| = 1, k ∈ C} represent the set of values for which we are finding the image of the function and w be the corresponding image of a given element in K. $w \in f(K) \iff f^{-1}(w) \in K \iff 1/w \in K$.
\begin{align*}
1/w \in K &\implies |1/w - 1| = 1 \\
&\implies |1 - w| = |w| && \left(\because \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|}\right) \\
&\implies |w - 1| = |w|
\end{align*}
w ranges over the set of all points whose distances from the points (1, 0) and (0, 0) are the same. This describes a line which passes through the midpoint of these two points and whose direction is perpendicular to that of the line joining these two points.

[Figure: the vertical line Re(w) = 1/2, midway between (0, 0) and (1, 0), drawn in the complex plane with axes R and I]

w ∈ {z ∈ C : Re(z) = 1/2} (Ans.)

Q.28. Find the image of the line {x + iy ∈ C : x = y} under w = 1/z.


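Let the set K represent the set of values for which we are finding the image of the function and w be the corresponding image of a given element in K. $w \in f(K) \iff f^{-1}(w) \in K \iff 1/w \in K$.
\begin{align*}
&\because\ 1/w \in K \implies \operatorname{Re}(1/w) = \operatorname{Im}(1/w) \\
&\therefore\ \frac{u}{u^2 + v^2} = \frac{-v}{u^2 + v^2} && \left(\because \frac{1}{z} = \frac{\bar{z}}{|z|^2}\right) \\
&\therefore\ u = -v
\end{align*}
The line x = y gets mapped onto the line u = −v under the inversion mapping.

w ∈ {u + iv ∈ C : u = −v} (Ans.)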
Q.29. Prove that the inversion mapping 1/z always maps a straight line or a circle onto another
straight line or a circle.

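Proof. Let $Q(x, y) = A(x^2 + y^2) + Bx + Cy + D$ be a polynomial in x and y with A, B, C, D ∈ R. If the two variables are the real and imaginary parts of a complex number respectively, then on the complex plane Q(x, y) = 0 represents
\[ \begin{cases} \text{a circle} & \text{if } A \neq 0 \\ \text{a straight line} & \text{if } A = 0 \end{cases} \]
\begin{align*}
&Q(x, y) = 0 \\
&\therefore\ A(x^2 + y^2) + Bx + Cy + D = 0 \\
&\therefore\ A(z \cdot \bar{z}) + B\left(\frac{z + \bar{z}}{2}\right) + C\left(\frac{z - \bar{z}}{2i}\right) + D = 0 \\
&\therefore\ \frac{A}{w \cdot \bar{w}} + \frac{B}{2}\left(\frac{1}{w} + \frac{1}{\bar{w}}\right) + \frac{C}{2i}\left(\frac{1}{w} - \frac{1}{\bar{w}}\right) + D = 0 && \text{(inversion: } z = 1/w) \\
&\therefore\ \frac{A}{w \cdot \bar{w}} + \frac{B}{2}\left(\frac{w + \bar{w}}{w \cdot \bar{w}}\right) - \frac{C}{2i}\left(\frac{w - \bar{w}}{w \cdot \bar{w}}\right) + D = 0 \\
&\therefore\ D(w \cdot \bar{w}) - C\left(\frac{w - \bar{w}}{2i}\right) + B\left(\frac{w + \bar{w}}{2}\right) + A = 0 \\
&\therefore\ D(u^2 + v^2) - Cv + Bu + A = 0 && (w := u + iv)
\end{align*}
This curve is of the same form as Q(x, y) = 0 and represents a circle or a straight line depending on the value of D (a circle if D ≠ 0, a straight line if D = 0).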

4.2.11 Orthogonal Trajectories
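Consider an analytic function f(z) := u(x, y) + iv(x, y), then the equations $u(x, y) = c_1$ and $v(x, y) = c_2$ represent the level curves in the complex plane where u and v take a specific value. Let us find the slopes of the tangents to these curves.

For the curve $u(x, y) = c_1$:
\[ \frac{\partial u}{\partial x}dx + \frac{\partial u}{\partial y}dy = 0 \quad\therefore\ \left(\frac{dy}{dx}\right)_{u = c_1} = -\frac{u_x}{u_y} \quad\therefore\ m_u = -\frac{u_x}{u_y} \]
For the curve $v(x, y) = c_2$:
\[ \frac{\partial v}{\partial x}dx + \frac{\partial v}{\partial y}dy = 0 \quad\therefore\ \left(\frac{dy}{dx}\right)_{v = c_2} = -\frac{v_x}{v_y} \quad\therefore\ m_v = -\frac{v_x}{v_y} \]
For an analytic function, the CR equations hold, so $m_u \cdot m_v = \dfrac{u_x v_x}{u_y v_y} = \dfrac{v_y \cdot (-u_y)}{u_y v_y} = -1$ (wherever $u_y v_y \neq 0$). This means that the two families of level curves are perpendicular (orthogonal) to each other.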

4.3 Conformal Mapping


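A Linear Fractional Transformation (LFT), also called a Möbius transformation, is a map of the form $w = \dfrac{az + b}{cz + d}$ with a, b, c, d ∈ C and ad − bc ≠ 0.

Every Möbius transformation is conformal, but not every conformal mapping is a Möbius transformation:

Möbius transformations ⊂ Conformal maps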

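Q.30. Find the Möbius transformation such that the points −1, 0, 1 are mapped onto the points −i, 1, i respectively.
Let $w := \dfrac{az + b}{cz + d}$ be the required transformation for some a, b, c, d ∈ C.
\begin{align*}
-1 \mapsto -i &\implies -i = \frac{a(-1) + b}{c(-1) + d} \\
&\therefore\ -i = \frac{-a + b}{-c + d} \implies -a + b - ci + di = 0 && \longrightarrow (1) \\
0 \mapsto 1 &\implies 1 = \frac{a(0) + b}{c(0) + d} \\
&\therefore\ 1 = \frac{b}{d} \implies b - d = 0 && \longrightarrow (2) \\
1 \mapsto i &\implies i = \frac{a(1) + b}{c(1) + d} \\
&\therefore\ i = \frac{a + b}{c + d} \implies a + b - ci - di = 0 && \longrightarrow (3)
\end{align*}
From (1), (2) and (3),
\[ \begin{bmatrix} -1 & 1 & -i & i \\ 0 & 1 & 0 & -1 \\ 1 & 1 & -i & -i \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
By Gauss elimination,
\begin{align*}
\begin{bmatrix} -1 & 1 & -i & i \\ 0 & 1 & 0 & -1 \\ 1 & 1 & -i & -i \end{bmatrix}
&\sim \begin{bmatrix} -1 & 1 & -i & i \\ 0 & 1 & 0 & -1 \\ 0 & 2 & -2i & 0 \end{bmatrix} && R_3 \longrightarrow R_3 + R_1 \\
&\sim \begin{bmatrix} -1 & 1 & -i & i \\ 0 & 1 & 0 & -1 \\ 0 & 0 & -2i & 2 \end{bmatrix} && R_3 \longrightarrow R_3 - 2R_2
\end{align*}
Upon back-substitution, we get
\begin{align*}
-2ci + 2d = 0 &\implies d = ci \\
b - d = 0 &\implies b = d = ci \\
-a + b - ci + di = 0 &\implies a = ci - ci + ci^2 = -c
\end{align*}
\[ \therefore\ w := \frac{az + b}{cz + d} = \frac{-cz + ci}{cz + ci} = \frac{-z + i}{z + i} \]
Check: z = −1 gives (1 + i)/(−1 + i) = −i, z = 0 gives i/i = 1, and z = 1 gives (−1 + i)/(1 + i) = i, as required.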

(Ans.)
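The coefficients obtained from the back-substitution can also be checked numerically. The following is a minimal sketch using Python's built-in complex numbers; the helper name `mobius` is an assumption made for illustration and does not come from these notes.

```python
def mobius(a, b, c, d):
    """Return the map z -> (a z + b) / (c z + d); requires ad - bc != 0."""
    if a * d - b * c == 0:
        raise ValueError("ad - bc must be non-zero for a Mobius transformation")
    return lambda z: (a * z + b) / (c * z + d)

# Back-substitution gave a = -c and b = d = ci; choose c = 1 for convenience.
w = mobius(-1, 1j, 1, 1j)

# Images of the three prescribed points -1, 0 and 1.
images = [w(-1), w(0), w(1)]
```

Any non-zero choice of c gives the same map, since a common factor cancels between numerator and denominator.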

4.3.1 Cross Ratio


A transformation which maps the points z1, z2, z3 to w1, w2, w3 respectively is
\[ \frac{(z_1 - z_2)(z_3 - z)}{(z_2 - z_3)(z - z_1)} = \frac{(w_1 - w_2)(w_3 - w)}{(w_2 - w_3)(w - w_1)} \tag{4.13} \]

Theorem 3. A Möbius transformation is uniquely determined by the assignment of three distinct points z1, z2 and z3 and their corresponding distinct images w1, w2 and w3.
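Q.31. Determine the LFT that maps the points 2, i, −2 onto the points 1, i, −1 respectively using
cross ratio.
Given (z1 = 2 ↦ w1 = 1), (z2 = i ↦ w2 = i) and (z3 = −2 ↦ w3 = −1).

Computing the constant ratios:
\[ \frac{z_1 - z_2}{z_2 - z_3} = \frac{2 - i}{i + 2} = \frac{(2 - i)^2}{2^2 + 1^2} = \frac{3 - 4i}{5}, \qquad \frac{w_1 - w_2}{w_2 - w_3} = \frac{1 - i}{i + 1} = \frac{(1 - i)^2}{1^2 + 1^2} = \frac{-2i}{2} = -i \]

From eq. (4.13),
\begin{align*}
\frac{(3 - 4i)(-2 - z)}{5(z - 2)} &= \frac{-i(-1 - w)}{w - 1} \\
\therefore\; \frac{-(3 - 4i)(z + 2)}{5(z - 2)} &= \frac{i(w + 1)}{w - 1} \\
\therefore\; \frac{(4 + 3i)(z + 2)}{5(z - 2)} &= \frac{w + 1}{w - 1} = 1 + \frac{2}{w - 1} && \text{(dividing by } i \text{, using } -1/i = i\text{)} \\
\therefore\; \frac{(4 + 3i)(z + 2) - 5(z - 2)}{5(z - 2)} &= \frac{2}{w - 1} \\
\therefore\; \frac{(-1 + 3i)z + (18 + 6i)}{5(z - 2)} &= \frac{2}{w - 1} \\
\therefore\; w &= \frac{10z - 20}{(-1 + 3i)z + (18 + 6i)} + 1 \\
\therefore\; w &= \frac{(9 + 3i)z + (-2 + 6i)}{(-1 + 3i)z + (18 + 6i)} = \frac{3z + 2i}{iz + 6} \quad \text{(Ans.)}
\end{align*}
The last step divides the numerator and the denominator by $3 + i$. As a check, $z = 2$ gives $(6 + 2i)/(2i + 6) = 1$, $z = i$ gives $5i/5 = i$ and $z = -2$ gives $(-6 + 2i)/(-2i + 6) = -1$.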

4.3.2 Inverse Möbius Transformation


Consider an LFT $w = \dfrac{az + b}{cz + d}$, $ad - bc \neq 0$.
The inverse transformation is given by
\[ z = \frac{-dw + b}{cw - a}, \qquad (-d)(-a) - bc \neq 0 \implies ad - bc \neq 0 \tag{4.14} \]
Therefore, if w is an LFT, then its inverse is also an LFT. Such a transformation is said to be
bilinear.
Q.32. Can we define a bilinear transformation for the transformation $w = \dfrac{4z + 2}{2z + 1}$?
A bilinear transformation is defined for an LFT if and only if ad − bc ≠ 0. For the given function,
ad − bc = 4 · 1 − 2 · 2 = 0. Hence, we cannot define a bilinear transformation for the given
LFT. (Ans.)

Linear fractional transformations as matrices

Consider representing the transformation $w = \dfrac{az + b}{cz + d}$ by the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Therefore, the
transformation is an LFT iff ad − bc ≠ 0, i.e. $\det\begin{bmatrix} a & b \\ c & d \end{bmatrix} \neq 0$.
The inverse transformation can be represented by the negative adjugate of the above matrix, i.e.
\[ -\operatorname{adj}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} -d & b \\ c & -a \end{bmatrix}. \]
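The matrix picture can be tried out directly: composing two LFTs corresponds to multiplying their matrices. The sketch below stores an LFT as a tuple `(a, b, c, d)`. All function names are assumptions made for illustration, and the three-point construction uses the standard normalisation "send p1, p2, p3 to 0, 1, ∞", which is equivalent to, but written differently from, eq. (4.13).

```python
def det(m):
    a, b, c, d = m
    return a * d - b * c

def compose(m, n):
    """Matrix product m @ n, i.e. the LFT 'apply n first, then m'."""
    a, b, c, d = m
    e, f, g, h = n
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

def inverse(m):
    """Negative adjugate (-d, b, c, -a), which represents the inverse LFT."""
    a, b, c, d = m
    return (-d, b, c, -a)

def apply(m, z):
    a, b, c, d = m
    return (a * z + b) / (c * z + d)

def to_zero_one_inf(p1, p2, p3):
    """Matrix of the LFT sending p1 -> 0, p2 -> 1 and p3 -> infinity."""
    return (p2 - p3, -p1 * (p2 - p3), p2 - p1, -p3 * (p2 - p1))

def lft_from_points(zs, ws):
    """LFT sending zs[k] -> ws[k]: first zs -> (0, 1, inf), then back to ws."""
    return compose(inverse(to_zero_one_inf(*ws)), to_zero_one_inf(*zs))

# Data of Q.31: 2, i, -2 are mapped onto 1, i, -1.
m = lft_from_points((2, 1j, -2), (1, 1j, -1))
```

For the transformation of Q.32, `det((4, 2, 2, 1))` evaluates to 0, which is the matrix form of the statement that no bilinear transformation exists there.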

4.3.3 Fixed Points
A fixed point of a transformation $f(z) = \dfrac{az + b}{cz + d}$ can be obtained by putting f(z) = z.
Q.33. Find the fixed points of the transformation $w = \dfrac{3z - 4}{z + 3}$.
To find the fixed points of the given transformation, put f(z) = z.
\begin{align*}
z &= \frac{3z - 4}{z + 3} \\
\therefore\; z(z + 3) &= 3z - 4 \implies z^2 + 3z = 3z - 4 \\
\therefore\; z^2 &= -4 \\
\therefore\; z &= 2i \lor z = -2i
\end{align*}
Thus, z ∈ {2i, −2i}. (Ans.)

4.4 Complex Integration


4.4.1 Line Integral
A line integral is an integral of a function evaluated along a curve. The line integral of
the function f(z) along the curve C is denoted by $\int_C f(z)\,dz$ (written $\oint_C f(z)\,dz$ when C is closed).
Q.34. Evaluate $\int_0^{2+i} (\bar{z})^2\,dz$
(i) along the line x = 2y.
\[ I = \int_0^{2+i} (\bar{z})^2\,dz = \int_{(x,y)=(0,0)}^{(x,y)=(2,1)} (x^2 - y^2 - 2ixy)(dx + i\,dy) \]
(Since we travel along the given line: x = 2y ⟹ dx = 2 dy, so dx + i dy = (2 + i) dy)
\begin{align*}
&= \int_0^1 (4y^2 - y^2 - 4iy^2)(2 + i)\,dy = (3 - 4i)(2 + i)\int_0^1 y^2\,dy \\
&= (10 - 5i)\left[\frac{y^3}{3}\right]_0^1 \\
I &= \frac{10 - 5i}{3}
\end{align*}
Thus, the value of the given line integral is $\dfrac{10 - 5i}{3}$. (Ans.)

(ii) along the line y = 0 followed by the line x = 2.
\begin{align*}
I &= \int_0^{2+i} (\bar{z})^2\,dz = \int_{(x,y)=(0,0)}^{(x,y)=(2,1)} (x^2 - y^2 - 2ixy)(dx + i\,dy) \\
&= \int_{(0,0)}^{(2,0)} (x^2 - y^2 - 2ixy)(dx + i\,dy) + \int_{(2,0)}^{(2,1)} (x^2 - y^2 - 2ixy)(dx + i\,dy)
\end{align*}
(For the path y = 0, dy = 0 and for the path x = 2, dx = 0)
\begin{align*}
&= \int_0^2 x^2\,dx + i\int_0^1 (4 - y^2 - 4iy)\,dy \\
&= \left[\frac{x^3}{3}\right]_0^2 + i\left[4y - \frac{y^3}{3} - 2iy^2\right]_0^1 = \frac{8}{3} + 2 + i\left(4 - \frac{1}{3}\right) \\
I &= \frac{14}{3} + \frac{11i}{3}
\end{align*}
Thus, the value of the given line integral is $\dfrac{14}{3} + \dfrac{11}{3}i$. (Ans.)
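Both parts of Q.34 can be cross-checked numerically. The sketch below approximates a line integral by a midpoint-rule sum along a parameterized path; `line_integral`, `straight` and `corner` are hypothetical names chosen for this illustration, and the step count is an arbitrary assumption.

```python
def line_integral(f, path, n=20000):
    """Approximate the integral of f along path: [0, 1] -> C (midpoint rule)."""
    total = 0j
    for k in range(n):
        z0 = path(k / n)
        z1 = path((k + 1) / n)
        total += f((z0 + z1) / 2) * (z1 - z0)
    return total

def f(z):
    return z.conjugate() ** 2            # the integrand of Q.34

def straight(t):
    return t * (2 + 1j)                  # (i): along x = 2y from 0 to 2 + i

def corner(t):
    # (ii): along y = 0 from 0 to 2, then along x = 2 from 2 to 2 + i
    return 4 * t + 0j if t <= 0.5 else 2 + 1j * (2 * t - 1)

value_straight = line_integral(f, straight)
value_corner = line_integral(f, corner)
```

The two values differ, which is expected here: the integrand is not analytic, so the integral depends on the path taken between the end points.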
Q.35. Evaluate $\int_{1-i}^{2+i} (2x + iy + 1)\,dz$ along the straight line from the point 1 − i to the point 2 + i.
The given straight line passes through the points (1, −1) and (2, 1). The slope of this line
is (1 − (−1))/(2 − 1) = 2. The intercept of this line can be found by substituting
either of the two given points in the formula y = 2x + c ⟹ c = y − 2x. Substituting
(2, 1) gives c = −3, so the line is y = 2x − 3.
We also require the relation between dx and dy. Differentiating the equation of the line
w.r.t. x and separating the variables, we get dy = 2 dx.
\begin{align*}
I &= \int_{1-i}^{2+i} (2x + iy + 1)\,dz = \int_{(x,y)=(1,-1)}^{(x,y)=(2,1)} (2x + 1 + iy)(dx + i\,dy) \\
&= \int_{x=1}^{x=2} (2x + 1 + i(2x - 3))(1 + 2i)\,dx \\
&= (1 + 2i)\int_1^2 (2x + 1 + i(2x - 3))\,dx \\
&= (1 + 2i)\left[(x^2 + x) + i(x^2 - 3x)\right]_1^2 \\
I &= (1 + 2i)\left((6 - 2i) - (2 - 2i)\right) = (1 + 2i)(4) = 4 + 8i
\end{align*}
Thus, the value of the given line integral is 4 + 8i. (Ans.)

Q.36. Evaluate $\int_0^{1+i} z^2\,dz$ along the parabola $x = y^2$.
\[ I = \int_0^{1+i} z^2\,dz = \int_{(x,y)=(0,0)}^{(x,y)=(1,1)} (x^2 - y^2 + 2ixy)(dx + i\,dy) \]
(Since we travel along the given parabola: $x = y^2 \implies dx = 2y\,dy$)
\begin{align*}
&= \int_{y=0}^{y=1} (y^4 - y^2 + 2iy^3)(2y + i)\,dy = \int_0^1 \left(2y^5 - 2y^3 - 2y^3 + i(y^4 - y^2 + 4y^4)\right)dy \\
&= \int_0^1 \left(2y^5 - 4y^3 + i(5y^4 - y^2)\right)dy = \left[\frac{y^6}{3} - y^4 + i\left(y^5 - \frac{y^3}{3}\right)\right]_0^1 = \left(\frac{1}{3} - 1\right) + i\left(1 - \frac{1}{3}\right) \\
I &= -\frac{2}{3} + \frac{2}{3}i
\end{align*}
Thus, the value of the given line integral is $-\dfrac{2}{3} + \dfrac{2}{3}i$. (Ans.)
Q.37. Evaluate the integral of the function $f(z) = x^2 + ixy$ from A(1, 1) to B(2, 4) along the
curve x = t and $y = t^2$.
For the given curve, (x = t and $y = t^2$) ⟹ (dx = dt and dy = 2t dt). Thus, dz =
(1 + 2it) dt. The function f(z) becomes $f(t) = t^2 + it^3$.
For the points A and B, t = 1 and t = 2 respectively. Thus,
\begin{align*}
I &= \int_{AB} f(z)\,dz = \int_1^2 f(t)(1 + 2it)\,dt \\
&= \int_1^2 (t^2 + it^3)(1 + 2it)\,dt = \int_1^2 \left(t^2 - 2t^4 + i(3t^3)\right)dt = \left[\frac{t^3}{3} - \frac{2}{5}t^5 + i\,\frac{3}{4}t^4\right]_1^2 \\
&= \left(\frac{8}{3} - \frac{64}{5} + 12i\right) - \left(\frac{1}{3} - \frac{2}{5} + \frac{3}{4}i\right) = -\frac{151}{15} + \frac{45}{4}i
\end{align*}
Thus, the value of the given line integral is $-\dfrac{151}{15} + \dfrac{45}{4}i$. (Ans.)

When the curve is a circle

Whenever we encounter a closed curve (e.g. a circle), we can usually parameterize it in such a
way that the closed line integral in Cartesian coordinates becomes an ordinary integral
over an interval of the parameter.
For a circle, this parameter is θ, the angle made with the direction of the
positive x-axis. Thus, for a circle with radius r, centered at (c0, c1), the values of x and y are
c0 + r cos θ and c1 + r sin θ, and $dz = dx + i\,dy = r(-\sin\theta + i\cos\theta)\,d\theta = rie^{i\theta}\,d\theta$.

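Q.38. Evaluate $\oint_C (\bar{z} + 2z)\,dz$ where C is the unit circle.
\[ I = \oint_C (\bar{z} + 2z)\,dz = \oint_C (3x + iy)\,dz \]
(For the unit circle centered at the origin, (c0, c1) = (0, 0) and r = 1)
\begin{align*}
&= \int_{-\pi}^{\pi} (3\cos\theta + i\sin\theta)(-\sin\theta + i\cos\theta)\,d\theta \\
&= \int_{-\pi}^{\pi} \left(-4\cos\theta\sin\theta + i(3\cos^2\theta - \sin^2\theta)\right)d\theta \\
&= \int_{-\pi}^{\pi} \left(-2\sin(2\theta) + i(4\cos^2\theta - 1)\right)d\theta
\end{align*}
(∵ sin(2θ) is an odd function and $\int \cos^2\theta\,d\theta = \dfrac{\theta}{2} + \dfrac{\sin(2\theta)}{4} + c$)
\[ = 0 + \left[2i\theta + i\sin(2\theta) - i\theta\right]_{-\pi}^{\pi} = i\pi - (-i\pi) = 2\pi i \]
Thus, the value of the given line integral is 2πi. (Ans.)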

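Q.39. Evaluate $\int_C \dfrac{2z + 5}{z}\,dz$ where C is the lower half of the circle |z| = 2.
\[ I = \int_C \frac{2z + 5}{z}\,dz = \int_C \left(2 + \frac{5}{z}\right)dz = \int_C \left(2 + \frac{5\bar{z}}{|z|^2}\right)dz \]
(Multiplying and dividing by $\bar{z}$)
\[ = \int_C \left(2 + \frac{5}{2^2}\,\bar{z}\right)dz = \int_C (2 + 1.25\,\bar{z})\,dz \]
(∵ for the given curve, |z| = 2)

(Assuming we travel in the counterclockwise direction, θ ∈ [−π, 0])
\begin{align*}
&= \int_{-\pi}^{0} \left(2 + (1.25)(2\cos\theta - 2i\sin\theta)\right)2(-\sin\theta + i\cos\theta)\,d\theta \\
&= \int_{-\pi}^{0} (4 + 5\cos\theta - 5i\sin\theta)(-\sin\theta + i\cos\theta)\,d\theta \\
&= \int_{-\pi}^{0} \left(-4\sin\theta - 5\cos\theta\sin\theta + 5\cos\theta\sin\theta + 4i\cos\theta + 5i\cos^2\theta + 5i\sin^2\theta\right)d\theta \\
&= \int_{-\pi}^{0} \left(4(-\sin\theta + i\cos\theta) + 5i\right)d\theta \\
&= 4\left[\cos\theta + i\sin\theta\right]_{-\pi}^{0} + 5i\left[\theta\right]_{-\pi}^{0} = 4(2) + 5i\pi
\end{align*}
Thus, the value of the given line integral is 8 + 5πi. (Ans.)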


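Q.40. Evaluate $\oint_C \dfrac{dz}{(z - 3)^4}$ where C : |z − 3| = 4.

Let $w := r_w e^{i\phi} \in \mathbb{C}$ such that z − 3 = w. Then the associated circle |z − 3| = 4 becomes
|w| = 4 (so $r_w = 4$) and the infinitesimal dz becomes $dw = i r_w e^{i\phi}\,d\phi$. (∵ w = z − 3 ⟹ dw = dz)
\begin{align*}
I &= \oint_C \frac{dz}{(z - 3)^4} = \oint_{C'} \frac{dw}{w^4} \\
&= \int_{\phi=-\pi}^{\phi=\pi} (r_w e^{i\phi})^{-4}\, i r_w e^{i\phi}\,d\phi = \frac{i}{4^3}\int_{-\pi}^{\pi} e^{-3i\phi}\,d\phi \\
&= \frac{i}{64}\left[\frac{e^{-3i\phi}}{-3i}\right]_{-\pi}^{\pi} = -\frac{1}{192}\left(e^{-3i\pi} - e^{3i\pi}\right) = -\frac{1}{192}\left(-1 - (-1)\right) \\
I &= 0
\end{align*}
Thus, the value of the given line integral is 0. (Ans.)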


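Q.41. Evaluate $\int_C \bar{z}\,dz$ where C is the left half of the unit circle from z = −i to z = i.
\begin{align*}
I &= \int_C \bar{z}\,dz = \int_{\theta=3\pi/2}^{\theta=\pi/2} (e^{-i\theta})\,d(e^{i\theta}) \\
&= \int_{3\pi/2}^{\pi/2} ie^{-i\theta}e^{i\theta}\,d\theta = i\int_{3\pi/2}^{\pi/2} 1\,d\theta \\
&= i\left[\theta\right]_{3\pi/2}^{\pi/2} = -i\pi
\end{align*}
Thus, the value of the given line integral is −iπ. (Ans.)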

4.4.2 Cauchy’s Theorem


Types of Curves

To study Cauchy’s theorem, we must know about the types of curves we can encounter in the
complex plane:

Figure 4.6: (From left to right) Simple open, Simple closed, not simple but closed

Figure 4.7: (From left to right) Simply connected, multiply connected

Theorem 4. If a function f(z) is analytic and its derivative f′(z) is continuous at each point
within and on a simple closed curve C, then the integral of f(z) along the closed curve C is zero.
\[ \oint_C f(z)\,dz = 0 \tag{4.15} \]

4.4.3 Cauchy’s Integral Formula


If f(z) is analytic within and on a simple closed curve C and z0 is any point within C, then
\[ \oint_C \frac{f(z)}{z - z_0}\,dz = 2\pi i \cdot f(z_0) \tag{4.16} \]

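Q.42. Evaluate the integral of 1/z along the curve |z − 2| = 2.

The given function f(z) = 1/z is not defined for z = 0. Note that |0 − 2| = 2, so this point
lies on the circle |z − 2| = 2 itself rather than strictly inside it, whereas eq. (4.16)
requires z0 to lie within C. If z = 0 is treated as enclosed by the contour (as it is for,
e.g., the circle |z| = 2), we can apply Cauchy's integral formula with
$f(z) = \dfrac{g(z)}{z - z_0} = \dfrac{1}{z} \implies g(z) = 1,\ z_0 = 0$. From eq. (4.16), $\oint_C f(z)\,dz = 2\pi i \cdot g(z_0) = 2\pi i \cdot 1 = 2\pi i$.
Thus, under that assumption, the value of the given line integral is 2πi. (Ans.)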
Q.43. Evaluate $\oint_C \dfrac{z}{z^2 - 3z + 2}\,dz$ where C : |z − 1| = 1/2.
The given function $f(z) = \dfrac{z}{z^2 - 3z + 2}$ is not defined for z = 1 and z = 2, out of which only z = 1
lies in the given region |z − 1| ≤ 1/2. Since a circle is a simple closed curve, we can
apply Cauchy's integral formula with $f(z) = \dfrac{g(z)}{z - z_0} = \dfrac{z}{(z - 1)(z - 2)} \implies g(z) = \dfrac{z}{z - 2}$.
From eq. (4.16), $\oint_C f(z)\,dz = 2\pi i \cdot g(z_0) = 2\pi i \cdot g(1) = 2\pi i \cdot \dfrac{1}{-1} = -2\pi i$.

Thus, the value of the given line integral is −2πi. (Ans.)


Q.44. If $f(z) = \dfrac{z}{z^2 - 5z + 6}$, evaluate $\oint_C f(z)\,dz$ where C : |z − 2| = 2.
The given function $f(z) = \dfrac{z}{z^2 - 5z + 6}$ is not defined for z = 2 and z = 3, both of which
lie inside the given region |z − 2| ≤ 2. Since a circle is a simple closed curve, we can apply
Cauchy's integral formula:
\begin{align*}
I &= \oint_C \frac{z}{z^2 - 5z + 6}\,dz = \oint_C \frac{z}{(z - 2)(z - 3)}\,dz \\
&\quad \text{(Decompose it into partial fractions)} \\
&= \oint_C \left(-\frac{2}{z - 2} + \frac{3}{z - 3}\right)dz \\
&= 2\pi i \cdot g(2) + 2\pi i \cdot h(3), \qquad g(z) = -2,\ h(z) = 3 \\
&= -4\pi i + 6\pi i \\
I &= 2\pi i
\end{align*}
Thus, the value of the given line integral is 2πi. (Ans.)
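The contour integrals of Q.43 and Q.44 can be cross-checked numerically by sampling the circle with the parameterization $z = z_c + re^{i\theta}$ from the earlier subsection. This is only a sketch: `circle_integral` is a hypothetical helper, and the number of sample points is an arbitrary assumption.

```python
import cmath

def circle_integral(f, center, radius, n=2000):
    """Approximate the counterclockwise integral of f around |z - center| = radius."""
    total = 0j
    step = 2 * cmath.pi / n
    for k in range(n):
        theta = (k + 0.5) * step
        e = cmath.exp(1j * theta)
        z = center + radius * e
        dz = 1j * radius * e * step      # dz = i r e^{i theta} d(theta)
        total += f(z) * dz
    return total

# Q.43: z / (z^2 - 3z + 2) around |z - 1| = 1/2
q43 = circle_integral(lambda z: z / (z * z - 3 * z + 2), 1, 0.5)
# Q.44: z / (z^2 - 5z + 6) around |z - 2| = 2
q44 = circle_integral(lambda z: z / (z * z - 5 * z + 6), 2, 2)
```

The sample points never coincide with a singularity here because both poles lie strictly inside or strictly outside the respective circles.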

4.4.4 Generalized Cauchy’s Integral Formula


If f(z) is analytic within and on a simple closed curve C and z0 is any point in the region, then
\[ \oint_C \frac{f(z)}{(z - z_0)^{n+1}}\,dz = \frac{2\pi i}{n!} \cdot \frac{d^n f}{dz^n}(z_0) \tag{4.17} \]

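Q.45. Evaluate $\oint_C \dfrac{z}{z^2 - 4z + 4}\,dz$ where C : |z − 2| = 3.
From eq. (4.17),
\begin{align*}
I &= \oint_C \frac{z}{z^2 - 4z + 4}\,dz = \oint_C \frac{z}{(z - 2)^2}\,dz && \left(\text{which is of the form } \frac{f(z)}{(z - z_0)^{n+1}} \text{ with } f(z) = z,\ z_0 = 2,\ n = 1\right) \\
&= \frac{2\pi i}{1!} \cdot \frac{df}{dz}(2) = 2\pi i \cdot (1) = 2\pi i
\end{align*}
Thus, the value of the given line integral is 2πi. (Ans.)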

Chapter 5
Elementary Number Theory

5.1 Foundations
5.1.1 Integers
The set of integers consists of the number zero (0), the counting numbers (1, 2, 3, . . . ) and their
additive inverses (−1, −2, −3, . . . ). It is denoted by ℤ (or Z).

ℤ := {. . . , −3, −2, −1, 0, 1, 2, 3, . . . }

• The set of integers under the addition (+) operation is closed, associative, has an identity
element (0), has inverses and is commutative.

• The set of integers under the multiplication (×) operation is closed, associative, has an
identity element (1) and is commutative.
Moreover, multiplication distributes over addition in ℤ.
The set of positive integers is denoted by ℤ+ or ℕ (natural numbers) and the set of negative integers by
ℤ−. A discrete range {a, a + 1, a + 2, . . . , b − 2, b − 1, b} is denoted by ⟦a, b⟧.
In this chapter, every variable is assumed to be an integer unless stated otherwise.

5.1.2 Divisibility
For two integers a ̸= 0 and b, if b = aq for some integer q, then a is said to be a factor or divisor
of b, and b is said to be divisible by a or a multiple of a. This is denoted by a|b, which is read
as “a divides b”.

a|b ⇐⇒ ∃q ∈ Z : b = aq (5.1)

If b = aq + r for some integers q and r ∈ ⟦1, |a| − 1⟧, then b is said to be not divisible by a, which
is denoted by a ∤ b.

a ∤ b ⇐⇒ ∃q ∈ ℤ ∃r ∈ ⟦1, |a| − 1⟧ : b = aq + r (5.2)

In the expression b = aq + r, b is called the dividend, a is called the divisor, q is called the
quotient and r is called the remainder.
Note:
The set [a, b] ∩ ℤ is referred to as ⟦a, b⟧ throughout this book.

Q.1. Prove that the square of an even number is even and the square of an odd number is odd.
Proof. We will prove the statement by cases.

Case 1 Consider an even number a := 2q. Squaring both sides, we obtain a2 = 4q 2 =


2(2q 2 ), which is even.

∴ The square of an even number is even. −→ (i)


Case 2 Consider an odd number a := 2q + 1. Squaring both sides, we obtain a2 =
4q 2 + 4q + 1 = 2(2q 2 + 2q) + 1, which is odd.
∴ The square of an odd number is odd. −→ (ii)

From (i) and (ii), we can deduce that the square of an even number is even and the square
of an odd number is odd.
Q.2. Prove that the square of an odd integer can be written in the form 8k + 1.
Proof. Consider an odd integer a := 2q + 1. Squaring both sides, we obtain $a^2 = 4q^2 + 4q + 1 = 4q(q + 1) + 1$.
Since q and q + 1 are consecutive integers, one of them is even, so q(q + 1) is even, i.e. q(q + 1) := 2k
for some integer k.
Therefore, a2 = 4 · 2k + 1 = 8k + 1 =⇒ the square of any odd integer can be written in
the form 8k + 1.

Q.3. Prove that the cube of any integer has one of the forms: 9k, 9k + 1 or 9k + 8.
Proof. We will prove the statement by cases.
Case 1 Consider any integer a of the form a := 3q. Cubing both sides, we obtain a3 =
27q 3 = 9(3q 3 ) := 9k for some integer k −→ (i).
Case 2 Consider any integer a of the form a := 3q + 1. Cubing both sides, we obtain
a3 = 27q 3 + 27q 2 + 9q + 1 = 9(3q 3 + 3q 2 + q) + 1 := 9k + 1 for some integer
k −→ (ii).
Case 3 Consider any integer a of the form a := 3q + 2. Cubing both sides, we obtain
$a^3 = 27q^3 + 54q^2 + 36q + 8 = 9(3q^3 + 6q^2 + 4q) + 8 := 9k + 8$ for some integer
k −→ (iii).

From (i), (ii) and (iii), we can deduce that the cube of any integer has one of the forms:
9k, 9k + 1 or 9k + 8.
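The case analyses of Q.2 and Q.3 can be confirmed by brute force, because the remainder of a power depends only on the remainder of its base. A minimal sketch follows; the function name `residues_of_power` is an assumption made for illustration.

```python
def residues_of_power(exponent, modulus):
    """Remainders of a**exponent modulo `modulus` over one full set of residues."""
    return {pow(a, exponent, modulus) for a in range(modulus)}

# Q.3: cubes can only leave the remainders 0, 1 or 8 modulo 9.
cube_residues = residues_of_power(3, 9)

# Q.2: squares of odd integers modulo 8 (odd residues 1, 3, 5, 7 only).
odd_square_residues = {a * a % 8 for a in range(1, 8, 2)}
```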

5.1.3 Prime Numbers


A prime number is a positive integer that has exactly two distinct positive factors — 1 and itself.
The first few primes are 2, 3, 5, 7, 11, . . .
A number that has more than two distinct positive divisors is said to be composite. The first
few composites are 4, 6, 8, 9, 10, . . .

1 is said to be neither prime nor composite, as it has exactly one positive divisor.

p is prime ⇐⇒ p > 1 ∧ (∀d ∈ N d|p =⇒ d = 1 ∨ d = p)

Cardinality of Primes

There are infinitely many prime numbers. It can be proven using Euclid’s proof.

Proof. Suppose, for contradiction, that there are only r prime numbers: p1, p2, . . . , pr.

Construct the integer $n := \prod_{i=1}^{r} p_i + 1 = p_1 p_2 \cdots p_r + 1$. Since n is larger than every pi, it is not in the
list, so by assumption n must be composite, i.e. at least one prime pi must divide n.

Comparing with b = aq + r, n is the dividend, each pi can be considered a divisor, the corresponding
$\left(\prod_{j=1}^{r} p_j\right)/p_i$ can be considered the quotient and 1 is the remainder.

However, from eq. (5.2) we can see that ∀pi : pi ∤ n. Therefore, no prime in the list divides n, which
contradicts the conclusion that n is composite.
Hence the assumption is false: any finite list of primes is incomplete. Therefore, there are infinitely many primes.

5.2 Modular Arithmetic


5.2.1 Modulus Operator
The modulus operator is used to find the remainder when an integer a is divided by another
integer b. It is denoted as a % b or a mod b, and is read as “a modulo b”.

Modulo m, integers can be imagined as a clock with m numbers labeled from 0 to m − 1. This
is because the remainders cycle every m numbers.

[Figure 5.1: Modulo 3 (clock face labeled 0, 1, 2). Figure 5.2: Modulo 12 (clock face labeled 0 to 11).]

Q.4. Show that 3a2 − 1 is not a perfect square.


Proof. Observe that (3a2 − 1) mod 3 = (3(a2 − 1) + 2) mod 3 = 2.
Now, consider squaring any integer modulo 3:

(3q)2 mod 3 = 9q 2 mod 3 = 0 −→ (i).


(3q + 1)2 mod 3 = (9q 2 + 6q + 1) mod 3 = (3(3q 2 + 2q) + 1) mod 3 = 1 −→ (ii).
(3q + 2)2 mod 3 = (9q 2 + 12q + 4) mod 3 = (3(3q 2 + 4q + 1) + 1) mod 3 = 1 −→ (iii).

From (i), (ii) and (iii), we can deduce that the square of any integer has a remainder of
either 0 or 1 modulo 3, i.e. 3a2 − 1 cannot be the square of any integer.

5.2.2 Properties of Divisibility


For all integers a, b, c and d the following properties hold:

• a|0, 1|a, a|a


• a|1 ⇐⇒ a = ±1
• (a|b ∧ c|d) =⇒ ac|bd
• (a|b ∧ b|c) =⇒ a|c

• (a|b ∧ b|a) =⇒ |a| = |b|


• (a|b ∧ b ̸= 0) =⇒ |a| ≤ |b|
Proof.

a|b =⇒ b := aq q ̸= 0
∴|b| = |aq|, |q| ≥ 1
∴|b| = |a||q|, |q| ≥ 1 =⇒ |b| ≥ |a|

• (a|b ∧ a|c) =⇒ ∀x ∀y : a|(bx + cy)

Proof. a|b ⇐⇒ b := aq1, a|c ⇐⇒ c := aq2

Now, bx + cy = aq1 x + aq2 y = a(q1 x + q2 y) =⇒ a|(bx + cy).
Corollary 1. If a|bk , where k = 0, 1, 2, . . . , n then a|(b0 x0 + b1 x1 + · · · + bn xn ).

5.2.3 Division Algorithm


Given integers a and b with b > 0, there exists a unique ordered pair of integers q
and r such that:
a = bq + r, 0 ≤ r < b

Proof. Consider a set S := {a − bx : a − bx ≥ 0 ∧ x ∈ Z}.


S is a non-empty set of non-negative integers.

Therefore, by the well-ordering principle, there exists a least element r ∈ S. So, r = a − bq ≥ 0.


We need to show that r < b. For contradiction, assume r ≥ b.
Now,

r≥b
=⇒ r − b ≥ 0
=⇒ a − bq − b ≥ 0
=⇒ a − (q + 1)b ≥ 0 =⇒ a − (q + 1)b ∈ S

Or, r − b ∈ S.

∵ b > 0, r − b < r, which implies that r − b is an element lesser than r, which contradicts our
claim that r is the least element.
Therefore, by contradiction r < b.
Now to prove uniqueness:
Suppose that for the integers a and b there exist two ordered pairs (q1, r1) and
(q2, r2) such that a = bq1 + r1 and a = bq2 + r2, where r1, r2 ∈ [0, b).

Consider the integer |r1 − r2 |,

|r1 − r2 | = |bq1 − bq2 |


= |b||q1 − q2 |
= b|q1 − q2 |

But we know that r1 , r2 ∈ J0, b − 1K. Therefore, |r1 − r2 | < b.

∵ |r1 − r2 | < b
∴ b|q1 − q2 | < b
∴ |q1 − q2 | < 1

But since, q1 , q2 ∈ Z. Therefore the only possibility is |q1 − q2 | = 0 =⇒ (q1 = q2 ) ∧ (r1 = r2 ).

Hence, we have proved the uniqueness as well.

Theorem 5. Given two non-zero integers a and b, there exist integers x and y such that ax+by =
gcd(a, b).

Proof. Consider a set S := {ax + by : ax + by > 0 ∧ x, y ∈ Z}.


Therefore, by the well-ordering principle, there exists a least element d ∈ S. So d = au + bv > 0.
Now we must prove that d = gcd(a, b).
By the division algorithm,

a = dq + r, 0 ≤ r < d
r = a − dq
= a − (au + bv)q
= a(1 − uq) + b(−vq)

If r > 0, then r ∈ S with r < d, which contradicts the choice of d as the least element of S.
Therefore the only possible value of r is 0, i.e. r ∉ S.

This means that a = dq, i.e. d|a. Similarly, we can prove that d|b.
Moreover, any common divisor c of a and b divides au + bv = d, so c ≤ d. Hence no common
divisor of a and b is greater than d.
Thus, we have proved that gcd(a, b) is of the form ax + by where x and y are integers.

Corollary 2. If the greatest common divisor of a and b is equal to 1, then there exist integers
x and y such that ax + by = 1.

5.3 Greatest Common Divisor


If a and b are arbitrary integers then a positive integer d is said to be the Greatest Common
Divisor (GCD) (or Highest Common Factor (HCF)) of a and b if d satisfies the following condi-
tions:
(i) d|a ∧ d|b
(ii) c|a ∧ c|b =⇒ c ≤ d
We denote this as gcd(a, b) = d.

Q.5. Evaluate gcd(−12, 30). Prime factorizing,

|−12| = 12 = 2 × 2 × 3
|30| = 30 = 2 × 3 × 5

Therefore, the GCD of −12 and 30 is 2 × 3 = 6 . (Ans.)

If gcd(a, b) = 1, then a and b are said to be relatively prime or coprime.


Theorem 6. If a|c and b|c with gcd(a, b) = 1, then ab|c.

Proof. Since a|c, we can say that c = ak1 and since b|c, we can say that c = bk2 .

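Let $a = p_{11}^{k_{11}} p_{12}^{k_{12}} p_{13}^{k_{13}} \cdots p_{1n}^{k_{1n}}$ and $b = p_{21}^{k_{21}} p_{22}^{k_{22}} p_{23}^{k_{23}} \cdots p_{2m}^{k_{2m}}$, where the $p_{ij}$ (i ∈ {1, 2})
are the prime factors of a and b.
Now we can say that
\begin{align*}
c &= p_{11}^{k_{11}} p_{12}^{k_{12}} \cdots p_{1n}^{k_{1n}} \times k_1 = k_1 \times \prod_{j=1}^{n} p_{1j}^{k_{1j}} \\
c &= p_{21}^{k_{21}} p_{22}^{k_{22}} \cdots p_{2m}^{k_{2m}} \times k_2 = k_2 \times \prod_{j=1}^{m} p_{2j}^{k_{2j}}
\end{align*}
Since gcd(a, b) = 1, none of the prime factors of a is equal to a prime factor of b.
By the uniqueness of prime factorization, the factorization of c must therefore contain both products:
\[ c = \left(\prod_{j=1}^{n} p_{1j}^{k_{1j}}\right) \times \left(\prod_{j=1}^{m} p_{2j}^{k_{2j}}\right) \times k_3 \]
∴ c = abk3
∴ ab | c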

Hence, we have proved that ab|c.

5.3.1 Properties of the GCD


Least Common Multiple (LCM)

\[ \operatorname{lcm}(a, b) = \frac{a \times b}{\gcd(a, b)} \tag{5.3} \]

Euclid’s Lemma

If a|bc with gcd(a, b) = 1 then a|c.

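Proof.

∵ a|bc
∴ bc = ak1 for some integer k1.

∵ gcd(a, b) = 1, by Corollary 2 there exist integers x and y such that ax + by = 1.
Multiplying by c,
c = acx + bcy = acx + ak1 y = a(cx + k1 y)

Therefore, we can say c = ak3 with k3 := cx + k1 y, i.e. a|c.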

5.3.2 Euclidean Algorithm


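Let a and b be two integers. Their greatest common divisor d can be obtained by repeatedly
applying the division algorithm to a and b, using

gcd(a, b) = gcd(b, a mod b) (5.4)

\begin{align*}
a &= bq + r, & 0 &\le r < b \\
\therefore\; b &= rq_1 + r_1, & 0 &\le r_1 < r \\
\therefore\; r &= r_1 q_2 + r_2, & 0 &\le r_2 < r_1 \\
\therefore\; r_1 &= r_2 q_3 + r_3, & 0 &\le r_3 < r_2 \\
&\;\;\vdots \\
\therefore\; r_{n-1} &= r_n q_{n+1} + r_{n+1}, & 0 &\le r_{n+1} < r_n \\
\therefore\; r_n &= r_{n+1} q_{n+2}, & &\text{i.e. for some } n \text{ the remainder } r_{n+2} = 0 \\
\therefore\; \gcd(a, b) &= r_{n+1} & &\text{(the last non-zero remainder)}
\end{align*}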
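Eq. (5.4) translates almost directly into a loop. The sketch below is one possible rendering; the name `euclid_gcd` is an assumption made for illustration, and taking absolute values first is a convenience so that negative inputs such as −12 are handled.

```python
def euclid_gcd(a, b):
    """Greatest common divisor by repeated division, following eq. (5.4)."""
    a, b = abs(a), abs(b)
    while b != 0:
        # Replace (a, b) by (b, a mod b); the last non-zero remainder is the gcd.
        a, b = b, a % b
    return a
```

For example, `euclid_gcd(12, 5)` passes through the pairs (12, 5), (5, 2), (2, 1), (1, 0), which are the same remainders that appear when the division algorithm is applied by hand.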

5.4 Linear Diophantine Equations


The equation
ax + by = c, x, y ∈ ℤ

where a, b and c are integers is called a linear Diophantine equation.


Theorem 7. The linear Diophantine equation ax + by = c has a solution iff d|c where d :=
gcd(a, b).
If (x0, y0) is any particular solution of ax + by = c, then all other solutions are given by:
\[ x := x_0 + \left(\frac{b}{d}\right)t, \qquad y := y_0 - \left(\frac{a}{d}\right)t \]
where t is an arbitrary integer.

Q.6. A theater charges $1.80 for adult admission and $0.75 for children admission for a particular
evening; the total receipts were $90. Assuming more adults than children were present,
how many were there in the theater?
Suppose there were x adults and y children. According to the given conditions,

1.80x + 0.75y = 90, x > y ≥ 0

or, 180x + 75y = 9000


or, 12x + 5y = 600 −→ (i).
By Euclidean algorithm,

12 = 5 × 2 + 2
5=2×2+ 1
2= 1 ×2

Therefore, gcd(12, 5) = 1.
Expressing the GCD as a linear combination of a and b,

1=5−2×2
= 5 − (12 − 5 × 2) × 2
= 5 − 12 × 2 + 5 × 4
∴ 1 = 5(5) + 12(−2)

Therefore, 12(−2) + 5(5) = 1 =⇒ 12(−1200) + 5(3000) = 600 −→ (ii).


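From (i) and (ii), x0 = −1200, y0 = 3000.
The solution to the Diophantine equation is given by
\begin{align*}
x &= x_0 + \left(\frac{5}{1}\right)t & y &= y_0 - \left(\frac{12}{1}\right)t \\
\therefore\; x &= -1200 + 5t & \therefore\; y &= 3000 - 12t
\end{align*}
Given x > y ≥ 0,
\begin{align*}
x &> y & y &\ge 0 \\
\therefore\; -1200 + 5t &> 3000 - 12t & \therefore\; 3000 - 12t &\ge 0 \\
\therefore\; 17t &> 4200 & \therefore\; 12t &\le 3000 \\
\therefore\; t &> \frac{4200}{17} \implies t \ge 248 & \therefore\; t &\le \frac{3000}{12} \implies t \le 250
\end{align*}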

Therefore, 248 ≤ t ≤ 250 =⇒ t ∈ {248, 249, 250}.

t = 248 =⇒ x = −1200 + 5 × 248 = 40


y = 3000 − 12 × 248 = 24

t = 249 =⇒ x = −1200 + 5 × 249 = 45


y = 3000 − 12 × 249 = 12

t = 250 =⇒ x = −1200 + 5 × 250 = 50


y = 3000 − 12 × 250 = 0

There were 40 adults and 24 children , or 45 adults and 12 children ,


or 50 adults and no children in the theater. (Ans.)
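The whole procedure (Euclidean algorithm, back-substitution, general solution, constraints) can be sketched in code. The function names below are assumptions made for illustration, the extended algorithm tracks the coefficients iteratively instead of back-substituting by hand, and positive a and b are assumed.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b), for a, b >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def solutions_with_x_greater_than_y(a, b, c):
    """All integer solutions of a*x + b*y == c with x > y >= 0 (a, b > 0)."""
    g, u, v = extended_gcd(a, b)
    if c % g != 0:
        return []                        # no solution unless gcd(a, b) divides c
    x0, y0 = u * (c // g), v * (c // g)  # one particular solution
    step_x, step_y = b // g, a // g
    t = y0 // step_y                     # largest t that keeps y >= 0
    found = []
    while True:
        x, y = x0 + step_x * t, y0 - step_y * t
        if x <= y:                       # decreasing t only lowers x and raises y
            break
        found.append((x, y))
        t -= 1
    return found

theater = solutions_with_x_greater_than_y(12, 5, 600)
```

For the theater problem the particular solution found this way is the same (x0, y0) = (−1200, 3000) as in the hand computation.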

5.5 Fundamental Theorem of Arithmetic


Theorem 8. Every positive integer greater than 1 is either prime or a product of primes, whose
representation is unique up to the order of the factors in the product.

$n := p_1^{k_1} p_2^{k_2} p_3^{k_3} \cdots p_r^{k_r}$, where $\{k_i\}_{i=1}^{r}$ are positive integers and $\{p_i\}_{i=1}^{r}$ are primes.

Q.7. Find the GCD of $3^3 \cdot 5^2 \cdot 7$ and $2^3 \cdot 3^2 \cdot 5 \cdot 7^2$.

Since the numbers are already given as products of prime powers, we can use these
directly to find the GCD of the two numbers.
For each prime, we take the smaller of its exponents in the two numbers.
\begin{align*}
\gcd(3^3 \cdot 5^2 \cdot 7,\ 2^3 \cdot 3^2 \cdot 5 \cdot 7^2) &= 2^{\min(0,3)} \cdot 3^{\min(3,2)} \cdot 5^{\min(2,1)} \cdot 7^{\min(1,2)} \\
&= 2^0 \cdot 3^2 \cdot 5 \cdot 7 \\
&= 315
\end{align*}
The GCD of the two numbers is 315. (Ans.)

5.6 Theory of Congruences


Let n be a fixed positive integer. Two integers a and b are said to be congruent modulo n iff
n|(a − b) and is written as a ≡ b (mod n).

5.6.1 Properties of Congruence


• a ≡ a (mod n) (Reflexivity)
• a ≡ b (mod n) =⇒ b ≡ a (mod n) (Symmetry)

• a ≡ b (mod n) ∧ b ≡ c (mod n) =⇒ a ≡ c (mod n) (Transitivity)

• a ≡ b (mod n) ∧ c ≡ d (mod n) =⇒ a + c ≡ b + d (mod n)
Corollary 3. a ≡ b (mod n) =⇒ a + c ≡ b + c (mod n)
• a ≡ b (mod n) ∧ c ≡ d (mod n) =⇒ ac ≡ bd (mod n)
Corollary 4. a ≡ b (mod n) =⇒ ac ≡ bc (mod n)
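• a ≡ b (mod n) =⇒ ∀k ∈ ℕ : $a^k \equiv b^k$ (mod n)

Proof.
\begin{align*}
a \equiv b \pmod{n} &\implies n \mid (a - b) \\
&\implies n \mid (a - b) \times \underbrace{\left(a^{k-1} + a^{k-2}b + \cdots + ab^{k-2} + b^{k-1}\right)}_{\in\, \mathbb{Z}} \\
&\implies n \mid (a^k - b^k) \\
\therefore\; a \equiv b \pmod{n} &\implies a^k \equiv b^k \pmod{n}
\end{align*}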

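Q.8. Show that $41 \mid 2^{40} - 1$.

To show that $41 \mid 2^{40} - 1$, we must show that $2^{40} \equiv 1 \pmod{41}$.
\begin{align*}
\text{Consider, } 2^{10} &\equiv 1024 \pmod{41} \\
&\quad \text{(taking the remainder of } 1024/41\text{)} \\
\therefore\; 2^{10} &\equiv 40 \pmod{41} \\
\therefore\; 2^{10} &\equiv -1 \pmod{41} \\
\therefore\; 2^{20} &\equiv (-1)^2 \pmod{41} \\
\therefore\; 2^{20} &\equiv 1 \pmod{41} \\
\therefore\; 2^{40} &\equiv 1^2 \pmod{41} \\
\therefore\; 2^{40} &\equiv 1 \pmod{41}
\end{align*}
Thus, from the definition of congruence, it follows that $41 \mid 2^{40} - 1$. (Ans.)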
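Q.9. Show that $97 \mid 2^{48} - 1$.

To show that $97 \mid 2^{48} - 1$, we must show that $2^{48} \equiv 1 \pmod{97}$.
\begin{align*}
\text{Consider, } 2^{12} &\equiv 4096 \pmod{97} \\
&\quad \text{(taking the remainder of } 4096/97\text{)} \\
\therefore\; 2^{12} &\equiv 22 \pmod{97} \\
\therefore\; 2^{24} &\equiv 22^2 \pmod{97} \\
\therefore\; 2^{24} &\equiv 484 \pmod{97} \\
&\quad \text{(taking the remainder of } 484/97\text{)} \\
\therefore\; 2^{24} &\equiv 96 \pmod{97} \\
\therefore\; 2^{24} &\equiv -1 \pmod{97} \\
\therefore\; 2^{48} &\equiv (-1)^2 \pmod{97} \\
\therefore\; 2^{48} &\equiv 1 \pmod{97}
\end{align*}
Thus, from the definition of congruence, it follows that $97 \mid 2^{48} - 1$. (Ans.)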
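Q.10. Show that $89 \mid 2^{44} - 1$.

To show that $89 \mid 2^{44} - 1$, we must show that $2^{44} \equiv 1 \pmod{89}$.
\begin{align*}
\text{Consider, } 2^{11} &\equiv 2048 \pmod{89} \\
&\quad \text{(taking the remainder of } 2048/89\text{)} \\
\therefore\; 2^{11} &\equiv 1 \pmod{89} \\
\therefore\; 2^{44} &\equiv 1^4 \pmod{89} \\
\therefore\; 2^{44} &\equiv 1 \pmod{89}
\end{align*}
Thus, from the definition of congruence, it follows that $89 \mid 2^{44} - 1$. (Ans.)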
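The idea used in Q.8 to Q.10, reducing modulo n after every squaring so the numbers stay small, is the basis of fast modular exponentiation. A minimal sketch is given below; `power_mod` is a hypothetical name, and Python's built-in three-argument `pow` computes the same quantity.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                  # current binary digit of the exponent is 1
            result = result * base % modulus
        base = base * base % modulus      # square, then reduce modulo n
        exponent >>= 1
    return result

checks = [power_mod(2, 40, 41), power_mod(2, 48, 97), power_mod(2, 44, 89)]
```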

5.6.2 Residue Classes


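The residue class (or congruence class) of an integer a modulo n is denoted by [a] and is the set of all
integers congruent to a modulo n. The set {0, 1, 2, . . . , |n| − 1} contains exactly one representative of
each class and is called a complete residue system modulo n.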

5.6.3 Solution of Linear Congruences


ax ≡ b (mod n)

ax ≡ b (mod n) =⇒ n|(ax − b)
=⇒ ax − b = ny for some y ∈ ℤ
=⇒ ax − ny = b

Now we just need to find a solution of the above linear Diophantine equation; the value of x
is our required answer.
Q.11. Solve:

(i) 25x ≡ 15 (mod 29)


We need to find the solution of the linear Diophantine Equation 25x − 29y = 15.
By Euclidean Algorithm,

29 = 1 × 25 + 4
25 = 6 × 4 + 1
4=4× 1

Therefore, gcd(25, 29) = 1.

Expressing the GCD as a linear combination of a and b.

1 = 25 − 4(6)
= 25 − (29 − 25)(6)
∴ 1 = 25(7) − 29(6)
Multiply by 15
∴ 15 = 25(105) − 29(90)

x0 = 105 is a solution of the above equation.

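To find a solution in the complete residue system modulo 29, consider the general solution of
25x − 29y = 15, in which the coefficient of y is b = −29 and d = 1:
\begin{align*}
x &= x_0 + \frac{b}{d} \cdot t, & x &\in ⟦0, n - 1⟧ \land t \in \mathbb{Z} \\
&= 105 + \frac{-29}{1} \cdot t, & x &\in ⟦0, 28⟧ \\
&= 105 - 29t, & x &\in ⟦0, 28⟧
\end{align*}
Therefore we say,
\begin{align*}
0 \le 105 - 29t &\le 28 \\
\therefore\; -105 \le -29t &\le -77 \\
\therefore\; \frac{105}{29} \ge t &\ge \frac{77}{29} \\
\therefore\; 2.655 \le t &\le 3.620 \\
\therefore\; t &= 3
\end{align*}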

We now substitute, t = 3 in our above equation for x.

x = 105 − 29(3)
= 105 − 87
∴ x = 18

We can easily verify that the 25×18 is congruent to 15 (mod 29) and that the answer
is x = 18 . (Ans.)
(ii) 6x ≡ 15 (mod 21)
We need to find the solution of the linear Diophantine Equation 6x − 21y = 15.

By Euclidean Algorithm,

21 = 3 × 6 + 3
6=2× 3

Therefore, gcd(6, 21) = 3.


Expressing the GCD as a linear combination of a and b.

3 = 21 − 6(3)
Multiply by 5
15 = 21(5) − 6(15)
15 = 6(−15) − 21(−5)

x0 = −15 is a solution of the above equation.

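To find the solutions in the complete residue system modulo 21, consider the general solution of
6x − 21y = 15, in which the coefficient of y is b = −21 and d = 3:
\begin{align*}
x &= x_0 + \frac{b}{d} \cdot t, & x &\in ⟦0, n - 1⟧ \land t \in \mathbb{Z} \\
&= -15 + \frac{-21}{3} \cdot t, & x &\in ⟦0, 20⟧ \\
&= -15 - 7t, & x &\in ⟦0, 20⟧
\end{align*}
Therefore we say,
\begin{align*}
0 \le -15 - 7t &\le 20 \\
\therefore\; 15 \le -7t &\le 35 \\
\therefore\; \frac{15}{-7} \ge t &\ge \frac{35}{-7} \\
\therefore\; -5 \le t &\le -2.142 \\
\therefore\; t &\in \{-5, -4, -3\}
\end{align*}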

We now substitute, t = −5, −4 and −3 in our above equation for x.

x = −15 − 7(−5)
= −15 + 35
∴ x = 20

x = −15 − 7(−4)
= −15 + 28
∴ x = 13

x = −15 − 7(−3)
= −15 + 21
∴x=6

We can easily verify that 6 × 6, 6 × 13 and 6 × 20 are congruent to 15 (mod 21).


Thus, the answer is x ∈ {6, 13, 20} . (Ans.)
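The method of Q.11 can be condensed into a short routine. The sketch below is not the hand method verbatim: instead of back-substituting through the Euclidean algorithm, it divides the congruence by d = gcd(a, n) and uses the modular inverse provided by `pow(x, -1, m)`, which assumes Python 3.8 or newer. The name `solve_linear_congruence` is hypothetical.

```python
from math import gcd

def solve_linear_congruence(a, b, n):
    """Return every x in [0, n - 1] with a*x ≡ b (mod n), in increasing order."""
    d = gcd(a, n)
    if b % d != 0:
        return []                          # solvable only when gcd(a, n) divides b
    a_red, b_red, n_red = a // d, b // d, n // d
    x0 = b_red * pow(a_red, -1, n_red) % n_red
    # The d solutions modulo n are spaced n/d apart.
    return [x0 + k * n_red for k in range(d)]
```

With the data of Q.11, `solve_linear_congruence(25, 15, 29)` and `solve_linear_congruence(6, 15, 21)` reproduce the answers found above.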

5.6.4 Chinese Remainder Theorem (CRT)


The Chinese Remainder Theorem (CRT) helps us to solve systems of linear congruences of the
form

x ≡ b1 (mod n1)
x ≡ b2 (mod n2)
⋮
x ≡ br (mod nr)
where ∀i, j; i ≠ j; gcd(ni, nj) = 1

There exists a solution of the form x = b1 N1 x1 + b2 N2 x2 + · · · + br Nr xr, where
\[ N = n_1 \times n_2 \times \cdots \times n_r = \prod_{i=1}^{r} n_i, \qquad N_i = \frac{N}{n_i} \]
and xi is the inverse of Ni modulo ni, i.e. Ni xi ≡ 1 (mod ni).

This solution is unique modulo the integer n1 n2 n3 · · · nr.

Q.12. Solve the system: x ≡ 2 (mod 3), x ≡ 4 (mod 5), x ≡ 2 (mod 7).
The pairwise GCDs of 3, 5 and 7 are all 1. Thus we can apply the CRT.
There exists a solution of the form x = b1 N1 x1 + b2 N2 x2 + · · · + br Nr xr . The value of
N is 3 × 5 × 7 = 105. Thus, the value of Ni are N1 = 105/3 = 35, N2 = 105/5 = 21 and
N3 = 105/7 = 15.
Now let us find the values of xi ,

N1 x1 ≡ 1 (mod n1 ) N2 x2 ≡ 1 (mod n2 ) N3 x3 ≡ 1 (mod n3 )


∴ 35x1 ≡ 1 (mod 3) ∴ 21x2 ≡ 1 (mod 5) ∴ 15x3 ≡ 1 (mod 7)
∴ x1 ≡ 2 (mod 3) ∴ x2 ≡ 1 (mod 5) ∴ x3 ≡ 1(mod 7)

Now the solution

x = b1 N1 x1 + b2 N2 x2 + b3 N3 x3
= (2 · 35 · 2) + (4 · 21 · 1) + (2 · 15 · 1)
= 140 + 84 + 30
x = 254

Thus, the solution of the above system of linear congruences is x ≡ 254 (mod 105), i.e.
x ≡ 44 (mod 105). (Ans.)
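The construction x = b1 N1 x1 + · · · + br Nr xr can be sketched as follows. The function name `crt` is an assumption made for illustration; `math.prod` and the modular inverse `pow(x, -1, m)` assume Python 3.8 or newer, and the moduli are assumed to be pairwise coprime.

```python
from math import prod

def crt(residues, moduli):
    """Smallest non-negative x with x ≡ residues[i] (mod moduli[i]) for all i."""
    N = prod(moduli)
    x = 0
    for b_i, n_i in zip(residues, moduli):
        N_i = N // n_i
        x_i = pow(N_i, -1, n_i)            # inverse of N_i modulo n_i
        x += b_i * N_i * x_i
    return x % N

q12 = crt([2, 4, 2], [3, 5, 7])
```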

5.6.5 System of Linear Congruences


Consider a set of linear congruences,

a1 x ≡ b1 (mod n1)
a2 x ≡ b2 (mod n2)
⋮
ar x ≡ br (mod nr)

We can solve this system of linear congruences as follows:

1. First solve each linear congruence independently and find its solution modulo
the respective ni.
2. This yields a system of linear congruences of the form:

x ≡ B1 (mod n1)
x ≡ B2 (mod n2)
⋮
x ≡ Br (mod nr)

3. Now we can make use of the CRT to solve this system of linear congruences.

5.6.6 Linear Congruences in Two Variables


Consider the system of linear congruency as follows:

ax + by ≡ r (mod n)
cx + dy ≡ s (mod n)

It has a unique solution if gcd(ad − bc, n) = 1.


Q.13. Find the solution of the system of linear congruences,

2x + 3y ≡ 1 (mod 7) (1)
5x + 9y ≡ 3 (mod 7) (2)

Since the GCD of ad − bc = 2 × 9 − 3 × 5 = 18 − 15 = 3 and n = 7 is 1, there exists a
unique solution to this system of linear congruences.

Multiplying (1) by 3: 6x + 9y ≡ 3 (mod 7) (3)

Subtracting (2) from (3):

(6x + 9y) − (5x + 9y) ≡ (3 − 3) (mod 7)
∴ x ≡ 0 (mod 7)

Now substituting this value in (1):

2(0) + 3y ≡ 1 (mod 7)
∴ 3y ≡ 1 (mod 7)
∴ y ≡ 5 (mod 7)

Thus, the solution is x ≡ 0 (mod 7), y ≡ 5 (mod 7). (Ans.)
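The condition gcd(ad − bc, n) = 1 says that the determinant has an inverse modulo n, which suggests solving the system with Cramer's rule carried out modulo n. This is a different route from the elimination used above, shown only as a sketch; `solve_2x2_mod` is a hypothetical name and `pow(x, -1, n)` assumes Python 3.8 or newer.

```python
def solve_2x2_mod(a, b, r, c, d, s, n):
    """Solve a*x + b*y ≡ r and c*x + d*y ≡ s (mod n) when gcd(ad - bc, n) = 1."""
    determinant = (a * d - b * c) % n
    det_inv = pow(determinant, -1, n)      # raises ValueError if not invertible
    x = det_inv * (r * d - b * s) % n
    y = det_inv * (a * s - c * r) % n
    return x, y

q13 = solve_2x2_mod(2, 3, 1, 5, 9, 3, 7)
```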

Q.14. Prove that the numbers of the sequence 11, 111, 1111, . . . are not perfect squares.
Each integer of the sequence ends in the digits 11, so it can be written as 100m + 11 =
4(25m + 2) + 3. Hence each term is of the form n ≡ 3 (mod 4), i.e. n = 4k + 3 where k ∈ ℤ.
The relation congruent modulo 4 partitions the set of integers into the 4 equivalence classes
{n : n = 4k, k ∈ ℤ}, {n : n = 4k + 1, k ∈ ℤ}, {n : n = 4k + 2, k ∈ ℤ} and {n : n =
4k + 3, k ∈ ℤ}.
Consider the squares of all these forms,
\begin{align*}
n^2 &= (4k)^2 = 16k^2 = 4(4k^2) = 4c_0 & c_0 &\in \mathbb{Z} \\
n^2 &= (4k + 1)^2 = 16k^2 + 8k + 1 = 4(4k^2 + 2k) + 1 = 4c_1 + 1 & c_1 &\in \mathbb{Z} \\
n^2 &= (4k + 2)^2 = 16k^2 + 16k + 4 = 4(4k^2 + 4k + 1) = 4c_2 & c_2 &\in \mathbb{Z} \\
n^2 &= (4k + 3)^2 = 16k^2 + 24k + 9 = 4(4k^2 + 6k + 2) + 1 = 4c_3 + 1 & c_3 &\in \mathbb{Z}
\end{align*}
None of the squares of integers are congruent to 3 modulo 4. Hence, we can say that no
elements of the sequence are perfect squares.
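The two facts used in the argument can be spot-checked for the first few terms. The sketch below is only an illustration: the variable names and the cut-off of 30 digits are arbitrary assumptions, and a finite check does not replace the proof.

```python
from math import isqrt

# Possible remainders of a perfect square modulo 4.
square_residues_mod_4 = {k * k % 4 for k in range(4)}

# The terms 11, 111, 1111, ... with 2 to 30 digits.
terms = [int("1" * digits) for digits in range(2, 31)]

all_congruent_to_3 = all(n % 4 == 3 for n in terms)
none_is_square = all(isqrt(n) ** 2 != n for n in terms)
```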

Acronyms

CR Cauchy-Riemann 59–64, 71
CRT Chinese Remainder Theorem 93–95

GCD Greatest Common Divisor 86, 88, 89, 91, 92, 94, 95

HCF Highest Common Factor 86

Jojo Aaditya Joil 1

LCM Least Common Multiple 86


LFT Linear Fractional Transformation 71–73

MCE Mathematics for Computer Engineers 1

NLA Numerical Linear Algebra 5

QF Quadratic Form 22, 23, 25, 26

RRG Rupak R. Gupta 1

The End.
