
BMTE-141

LINEAR ALGEBRA
Indira Gandhi National Open University
School of Sciences

Block 5
INNER PRODUCTS AND QUADRATIC FORMS

UNIT 14  Inner Product Spaces
UNIT 15  Hermitian and Unitary Operators
UNIT 16  Real Quadratic Forms
UNIT 17  Conics
Miscellaneous Exercises

Curriculum Design Committee
Prof. J.N. Kapur (Chairman), Jawaharlal Nehru University, New Delhi
Prof. M.S. Narasimhan, Tata Institute of Fundamental Research, Bombay
Prof. Izhar Husain, Dept. of Mathematics, Aligarh Muslim University, Aligarh
Prof. (Mrs.) A.R. Singal, Dept. of Mathematics, Meerut University
Prof. D.N. Misra, Council of Scientific & Industrial Research, New Delhi
Prof. M.P. Singh, I.I.T., Delhi

Course Preparation Committee


Prof. Izhar Husain (Editor), Dept. of Mathematics, Aligarh Muslim University, Aligarh
Prof. Parvin Sinclair, School of Sciences, IGNOU, New Delhi
Prof. M. Mohsin, Dept. of Mathematics, Aligarh Muslim University, Aligarh
Dr. Manik Patwardhan, School of Sciences, IGNOU, New Delhi
Prof. K.D. Singh, Dept. of Mathematics & Astronomy, Lucknow University, Lucknow

Course Coordinators: Dr. S. Venkataraman

Acknowledgement: To Dr. S. Venkataraman for his comments on the manuscript. To Sh.
Prashant Kumar for the word processing. The material was adapted by Dr. S. Biswas from
Block 4, Linear Algebra (MTE-02) of the Bachelor’s Degree Programme.

Disclaimer – Any materials adapted from web-based resources in this course are being used for
educational purposes only and not for commercial purpose.

April, 2022
© Indira Gandhi National Open University, 2021
ISBN-
All rights reserved. No part of this work may be reproduced in any form, by mimeograph or any other
means, without permission in writing from the Indira Gandhi National Open University.
Further information on the Indira Gandhi National Open University courses, may be obtained from the
University’s office at Maidan Garhi, New Delhi-110 068 and IGNOU website www.ignou.ac.in.
Printed and published on behalf of the Indira Gandhi National Open University, New Delhi by
Prof. Sujatha Varma, Director, School of Sciences.

BLOCK 5 INNER PRODUCTS AND QUADRATIC FORMS
This is the last block of this course. It deals with the interesting properties of a special
class of vector spaces which are known as inner product spaces. In this block the
only vector spaces we consider will be over ℝ or ℂ.

In the first unit (Unit 14) we introduce the basic notion of the inner product of two
vectors, along with its properties. This product helps us in introducing the well-known
geometrical notions of lengths and angles between vectors. We go on to discuss the
concept of orthogonality and the solution of the basic problem of the existence of an
orthonormal basis in a finite-dimensional inner product space.

The second unit (Unit 15) deals with the problem of characterising linear functionals in
inner product spaces. We show that such functionals are represented as inner
products. This further helps us in proving the existence of a unique adjoint for every
given operator. Some interesting relations between an operator and its adjoint lead us
to define self-adjoint and unitary operators. We also establish some of their properties.
Then we introduce you to Hermitian, unitary and orthogonal matrices and the concept
of orthogonal similarity.

In Units 16 and 17 we deal with real vector spaces only. The purpose of these two units
is to use the methods of linear algebra that you have studied in the course so far, to
reduce quadratic forms in ℝ² and ℝ³ into simpler forms. In Unit 17 you will study
various conics in detail.

What you study in these units will stand you in good stead in various mathematics
courses, particularly in geometry and mechanics.

In case you are interested in knowing more about the material covered in this block,
you may look up the books that we have given in the course introduction.

UNIT 14

INNER PRODUCT SPACES


Structure

14.1 Introduction
     Objectives
14.2 Inner Product
14.3 Norm of a Vector
14.4 Orthogonality
14.5 Summary
14.6 Solutions/Answers

14.1 INTRODUCTION
So far you have studied many interesting vector spaces over various fields. In
this unit, and the following ones, we will only consider real and complex
vector spaces. In Unit 2 you studied geometrical notions like the length of a
vector, the angle between two vectors and the dot product in ℝ² or ℝ³. In this
unit we carry these concepts over to a more general setting. We will define a
certain special class of vector spaces which open up new and interesting vistas
for investigations in mathematics and physics. Hence their study is extremely
fruitful as far as the applications of the theory to problems are concerned. This
fact will become clear in Units 16 and 17.

Before going further we suggest that you refer to Unit 2 for the definitions and
properties of the length and the scalar product of vectors of ℝ² or ℝ³.

Objectives
After studying this unit, you should be able to:
• define and give examples of inner product spaces;

• define the norm of a vector and discuss its properties;

• define orthogonal vectors and discuss some properties of sets of orthogonal vectors;

• obtain an orthonormal basis from a given basis of a finite-dimensional inner product space.

14.2 INNER PRODUCT


In this section we start by defining a concept which generalises the scalar product
that you have learned before. Recall that if $(x_1, x_2, x_3)$ and $(y_1, y_2, y_3)$
are two vectors in $\mathbb{R}^3$, then their scalar product is

$$(x_1, x_2, x_3) \cdot (y_1, y_2, y_3) = x_1 y_1 + x_2 y_2 + x_3 y_3.$$

We also remind you that given any complex number $z = a + ib$, where $a, b \in \mathbb{R}$,
its complex conjugate is $\bar{z} = a - ib$.

Further, $z\bar{z} = |z|^2 = a^2 + b^2$, and $\bar{\bar{z}} = z$.

Now we are ready to define an inner product.

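Definition 1: Let $V$ be a vector space over the field $\mathbb{F}$, where $\mathbb{F} = \mathbb{R}$
or $\mathbb{C}$. A map $\langle\ ,\ \rangle : V \times V \to \mathbb{F}$, written
$(x, y) \mapsto \langle x, y\rangle$, is called an inner product (or scalar product) on $V$
if it satisfies the following conditions:

IP1) $\langle x, x\rangle \geq 0$ for all $x \in V$.

IP2) $\langle x, x\rangle = 0$ iff $x = 0$.

IP3) $\langle x + y, z\rangle = \langle x, z\rangle + \langle y, z\rangle$ for all $x, y, z \in V$.

IP4) $\langle \alpha x, y\rangle = \alpha\langle x, y\rangle$ for $\alpha \in \mathbb{F}$ and $x, y \in V$.

IP5) $\langle y, x\rangle = \overline{\langle x, y\rangle}$ for all $x, y \in V$. (Here
$\overline{\langle x, y\rangle}$ denotes the complex conjugate of the number $\langle x, y\rangle$.)

The scalar $\langle x, y\rangle$ is called the inner product (or scalar product) of the vector $x$
with the vector $y$.

A vector space $V$ over which an inner product has been defined is called an
inner product space, and is denoted by $(V, \langle\ ,\ \rangle)$.

We make a remark here.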

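Remark 1: Let $\alpha \in \mathbb{F}$. Then $\bar{\alpha} = \alpha$ iff $\alpha \in \mathbb{R}$.
So IP5 implies the following statements.

a) $\langle x, x\rangle \in \mathbb{R}$ for all $x \in V$, since $\langle x, x\rangle = \overline{\langle x, x\rangle}$.

b) If $\mathbb{F} = \mathbb{R}$, then $\langle x, y\rangle = \langle y, x\rangle$ for all $x, y \in V$.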

Now, let us examine a familiar example.

Example 1: Show that $\mathbb{R}^3$ is an inner product space.

Solution: We need to define an inner product on $\mathbb{R}^3$. For this define
$\langle u, v\rangle = u \cdot v$ for all $u, v \in \mathbb{R}^3$ ('$\cdot$' denoting the dot product).
Then, for $u = (x_1, x_2, x_3)$ and $v = (y_1, y_2, y_3)$,
$\langle u, v\rangle = x_1 y_1 + x_2 y_2 + x_3 y_3$. We want to check if $\langle\ ,\ \rangle$
satisfies IP1-IP5.

i) IP1 is satisfied because $\langle u, u\rangle = x_1^2 + x_2^2 + x_3^2$, which is always non-negative.

ii) Now, $\langle u, u\rangle = 0 \Rightarrow x_1^2 + x_2^2 + x_3^2 = 0 \Rightarrow x_1 = 0, x_2 = 0, x_3 = 0$,
since a sum of non-negative real numbers is zero if and only if each of them is zero.
Therefore, $u = 0$.
Also, if $u = 0$, then $x_1 = x_2 = x_3 = 0$. Therefore, $\langle u, u\rangle = 0$.
So, we have shown that IP2 is satisfied by $\langle\ ,\ \rangle$.

iii) IP3 is satisfied because, for $w = (z_1, z_2, z_3)$,

$$\begin{aligned}
\langle u + v, w\rangle &= (x_1 + y_1) z_1 + (x_2 + y_2) z_2 + (x_3 + y_3) z_3 \\
&= (x_1 z_1 + x_2 z_2 + x_3 z_3) + (y_1 z_1 + y_2 z_2 + y_3 z_3) = \langle u, w\rangle + \langle v, w\rangle.
\end{aligned}$$
∗∗∗

We suggest that you verify IP4 and IP5. That’s what E1 says!

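In $\mathbb{R}^n$, the usual or standard inner product is defined by

$$\langle u, v\rangle = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$$

for $u = (x_1, \ldots, x_n)$ and $v = (y_1, \ldots, y_n)$ in $\mathbb{R}^n$.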

Example 2: What is the value of the standard inner product of u = (5, −1, 2) and
v = (−1, 0, 1)?

Solution: ⟨u, v⟩ = (5) (−1) + (−1) (0) + (2) (1) = −3.

∗∗∗
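
The computation in Example 2 can also be checked mechanically. The following is a minimal Python sketch; the function name `standard_inner_product` is our own choice for illustration and does not come from the course material.

```python
def standard_inner_product(u, v):
    """Standard inner product on R^n: x1*y1 + x2*y2 + ... + xn*yn."""
    if len(u) != len(v):
        raise ValueError("both vectors must have the same number of components")
    return sum(x * y for x, y in zip(u, v))


# Example 2 from the text: u = (5, -1, 2), v = (-1, 0, 1)
print(standard_inner_product((5, -1, 2), (-1, 0, 1)))  # prints -3
```

Swapping the two arguments gives the same value, which is what IP5 says over ℝ.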

E1) Check that the inner product on ℝ3 in Example 1 satisfies IP4 and IP5.

The inner product that has been given in Example 1 can be generalised to the
inner product $\langle\ ,\ \rangle$ on $\mathbb{R}^n$, defined by
$\langle (x_1, \ldots, x_n), (y_1, \ldots, y_n)\rangle = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$.
This is called the standard inner product on $\mathbb{R}^n$.

Let us consider another example now.

Example 3: Take $\mathbb{F} = \mathbb{C}$ and, for $x, y \in \mathbb{C}$, define
$\langle x, y\rangle = x\bar{y}$. Show that the map
$\langle\ ,\ \rangle : \mathbb{C} \times \mathbb{C} \to \mathbb{C}$ is an inner product.

Solution: IP1 and IP2 are satisfied because, for any complex number $x$,
$x\bar{x} = |x|^2 \geq 0$. Also, $x\bar{x} = 0$ if and only if $x = 0$.

∗∗∗

To complete the solution you can try E2.

E2) Show that IP3, IP4 and IP5 are true for Example 3.

E3) Find the standard inner product of

a) u = (2, −1), v = (1, 2)


b) u = (1, 2, 3), v = (2, 1, 3)

In fact, Example 3 can be generalised to $\mathbb{C}^n$, for any $n > 0$. We can define the
inner product of two arbitrary vectors $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$
in $\mathbb{C}^n$ by

$$\langle x, y\rangle = \sum_{i=1}^{n} x_i \bar{y}_i.$$

This inner product is called the standard inner product on $\mathbb{C}^n$.

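
To see the role of the complex conjugate in the standard inner product on ℂⁿ, here is a small Python sketch. The function name and the two sample vectors are assumptions made only for illustration; they are not taken from the exercises.

```python
def complex_inner_product(x, y):
    """Standard inner product on C^n: the sum of x_i * conj(y_i)."""
    if len(x) != len(y):
        raise ValueError("both vectors must have the same number of components")
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


x = (1 + 2j, 3j)
y = (2 - 1j, 1 + 1j)

print(complex_inner_product(x, y))   # (3+8j)
print(complex_inner_product(y, x))   # (3-8j), the conjugate, as IP5 requires
print(complex_inner_product(x, x))   # (14+0j): real and non-negative, as IP1 requires
```

Without the conjugate on the second argument, the value of ⟨x, x⟩ would in general not be a non-negative real number, so IP1 would fail.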
The next example deals with a general complex vector space.

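Example 4: Let $V$ be a complex vector space of dimension $n$. Let $B = \{e_1, \ldots, e_n\}$
be a basis of $V$. Given $x, y \in V$, there exist unique scalars
$a_1, \ldots, a_n, b_1, \ldots, b_n \in \mathbb{C}$ such that

$$x = \sum_{i=1}^{n} a_i e_i, \qquad y = \sum_{i=1}^{n} b_i e_i.$$

Define $\langle x, y\rangle = \sum_{i=1}^{n} a_i \bar{b}_i$.
Verify that $\langle\ ,\ \rangle$ is an inner product.

Solution: Let $x = \sum_{i=1}^{n} a_i e_i$, $y = \sum_{i=1}^{n} b_i e_i$, $z = \sum_{i=1}^{n} c_i e_i$,
where $a_i, b_i, c_i \in \mathbb{C}$ for all $i = 1, \ldots, n$. Then

$$\langle x, x\rangle = \sum_{i=1}^{n} a_i \bar{a}_i = \sum_{i=1}^{n} |a_i|^2 \geq 0.$$

Also, $\langle x, x\rangle = 0 \Leftrightarrow a_i = 0$ for all $i = 1, 2, \ldots, n \Leftrightarrow x = 0$.

$$\langle x + y, z\rangle = \sum_{i=1}^{n} (a_i + b_i)\bar{c}_i = \sum_{i=1}^{n} a_i \bar{c}_i + \sum_{i=1}^{n} b_i \bar{c}_i = \langle x, z\rangle + \langle y, z\rangle.$$

Also, for any $\alpha \in \mathbb{C}$,

$$\langle \alpha x, y\rangle = \sum_{i=1}^{n} \alpha a_i \bar{b}_i = \alpha \sum_{i=1}^{n} a_i \bar{b}_i = \alpha\langle x, y\rangle.$$

Finally, since $\overline{ab} = \bar{a}\,\bar{b}$ for all $a, b \in \mathbb{C}$,

$$\langle y, x\rangle = \sum_{i=1}^{n} b_i \bar{a}_i = \sum_{i=1}^{n} \overline{a_i \bar{b}_i} = \overline{\sum_{i=1}^{n} a_i \bar{b}_i} = \overline{\langle x, y\rangle}.$$

Thus, IP1-IP5 are satisfied. This proves that $\langle\ ,\ \rangle$ is an inner product on $V$.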


∗∗∗
Note that, in Example 4, the inner product depended on the basis of V that we
chose. This suggests that an inner product can be defined on any
finite-dimensional vector space. In fact, many such products can be defined by
choosing different bases in the same vector space.

You may like to try the following exercise now.

E4) Let $X = \{x_1, \ldots, x_n\}$ be a set and $V$ be the set of all functions from $X$ to $\mathbb{C}$.
Then, with respect to pointwise addition and scalar multiplication, $V$ is a
vector space over $\mathbb{C}$. Now, for any $f, g \in V$, define

$$\langle f, g\rangle = \sum_{i=1}^{n} f(x_i)\,\overline{g(x_i)}.$$

Show that $(V, \langle\ ,\ \rangle)$ is an inner product space.

E5) Find the standard inner product of

a) u = (2, −i), v = (i, 2)


b) u = (1 + i, i, 1 − i), v = (i, 1 − i, 1 + i)

Example 5: Show that $C[a, b]$, where '$C[a, b]$' denotes the set of all continuous
real-valued functions defined on the closed interval $[a, b]$, is an inner product
space.

Solution: We need to define an inner product on $C[a, b]$. For this, we define a
simple inner product

$$\langle f_1, f_2\rangle = \int_a^b f_1(x)\,f_2(x)\,dx \quad \text{for all } f_1, f_2 \in C[a, b].$$

We want to check if $\langle\ ,\ \rangle$ satisfies IP1-IP5.

i) IP1 is satisfied because $\langle f_1, f_1\rangle = \int_a^b f_1^2(x)\,dx$, which is always non-negative.

ii) Now $\langle f_1, f_1\rangle = 0 \Rightarrow \int_a^b f_1^2(x)\,dx = 0$, which, since $f_1$ is
continuous, is possible if and only if the function $f_1$ is identically zero.
So, we have shown that IP2 is satisfied by $\langle\ ,\ \rangle$.

iii) IP3 is satisfied because

$$\begin{aligned}
\langle f_1 + f_2, f_3\rangle &= \int_a^b (f_1(x) + f_2(x))\,f_3(x)\,dx \\
&= \int_a^b f_1(x)\,f_3(x)\,dx + \int_a^b f_2(x)\,f_3(x)\,dx \\
&= \langle f_1, f_3\rangle + \langle f_2, f_3\rangle.
\end{aligned}$$

∗∗∗

We suggest that you verify IP4 and IP5. That’s what E6 says!

We now state some properties of inner products that immediately follow from
IP1-IP5.

Theorem 1: Let $(V, \langle\ ,\ \rangle)$ be an inner product space. Then, for any $x, y, z \in V$ and
$\alpha, \mu \in \mathbb{F}$,

a) $\langle \alpha x + \mu y, z\rangle = \alpha\langle x, z\rangle + \mu\langle y, z\rangle$

b) $\langle x, \alpha y + \mu z\rangle = \bar{\alpha}\langle x, y\rangle + \bar{\mu}\langle x, z\rangle$

c) $\langle 0, x\rangle = \langle x, 0\rangle = 0$

d) $\langle x - y, z\rangle = \langle x, z\rangle - \langle y, z\rangle$

e) $\langle x, z\rangle = \langle y, z\rangle$ for all $z \in V \Rightarrow x = y$.

Proof: We will prove (a) and (c), and leave the rest to you.

a) $\langle \alpha x + \mu y, z\rangle = \langle \alpha x, z\rangle + \langle \mu y, z\rangle$ (by IP3)
$= \alpha\langle x, z\rangle + \mu\langle y, z\rangle$ (by IP4).

c) The vector $0 \in V$ can be written as $0 = 0 \cdot y$ for some $y \in V$.
Thus, $\langle 0, x\rangle = \langle 0 \cdot y, x\rangle = 0\langle y, x\rangle = 0$.
Then, by IP5, $\langle x, 0\rangle = \overline{\langle 0, x\rangle} = \bar{0} = 0$.

The proof of this theorem will be complete once you solve E7.

E6) Show that IP4 and IP5 are true for Example 5.

E7) Prove (b),(d) and (e) of Theorem 1.

We will now discuss the concept of the length of a vector.

14.3 NORM OF A VECTOR


In a previous unit we defined the length of a vector $v$ in $\mathbb{R}^2$ or $\mathbb{R}^3$ to be
$\sqrt{v \cdot v}$. We will extend this definition to the length of a vector in any inner product space.

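Definition 2: If $(V, \langle\ ,\ \rangle)$ is an inner product space and $x \in V$, then the norm (or
length) of the vector $x$ is defined to be $\sqrt{\langle x, x\rangle}$. It is denoted by $\|x\|$.
Hence, $\|x\| = \sqrt{\langle x, x\rangle}$.

We make some pertinent remarks here.

Remark 2: a) By IP1, $\langle x, x\rangle \geq 0$ for all $x \in V$. Thus $\|x\| \geq 0$.
Also, by IP2, $\|x\| = 0$ iff $x = 0$.

b) For any $\alpha \in \mathbb{F}$, we get $\|\alpha x\| = |\alpha|\,\|x\|$, because

$$\|\alpha x\| = \sqrt{\langle \alpha x, \alpha x\rangle} = \sqrt{\alpha\bar{\alpha}\langle x, x\rangle} = \sqrt{|\alpha|^2 \langle x, x\rangle} = |\alpha|\sqrt{\langle x, x\rangle} = |\alpha|\,\|x\|.$$

We call $x \in V$ a unit vector if $\|x\| = 1$.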

Example 6: Calculate the norm of

a) $u = (2, 1, -1)$, using the standard inner product in $\mathbb{R}^3$.

b) $u = (1, i, -i)$, using the standard inner product in $\mathbb{C}^3$.

c) $f(x) = x^2$, using the simple inner product defined in $C[0, 1]$.

Solution:

a) $\|u\| = \sqrt{\langle u, u\rangle} = \sqrt{2^2 + 1^2 + (-1)^2} = \sqrt{6}$.

b) $\|u\| = \sqrt{\langle u, u\rangle} = \sqrt{1^2 + (i)(-i) + (-i)(i)} = \sqrt{1 - 2i^2} = \sqrt{3}$.

c) $\|f\| = \langle f, f\rangle^{1/2} = \sqrt{\int_0^1 x^2 \cdot x^2\,dx} = \sqrt{\left[\frac{x^5}{5}\right]_0^1} = \frac{1}{\sqrt{5}}$.

∗∗∗
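
The three norms asked for in Example 6 can be computed with a short Python sketch. The helper names, and the choice to store a polynomial as a list of coefficients with the constant term first, are assumptions made for this illustration.

```python
import math
from fractions import Fraction


def norm(u):
    """Norm from the standard inner product on R^n or C^n: sqrt(sum |u_i|^2)."""
    return math.sqrt(sum(abs(c) ** 2 for c in u))


def poly_inner_product(p, q, a=0, b=1):
    """Exact integral over [a, b] of p(x) * q(x).

    Polynomials are coefficient lists with the constant term first,
    so x^2 is [0, 0, 1].
    """
    total = Fraction(0)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            k = i + j + 1  # the integral of x^(i+j) is x^k / k
            total += Fraction(pi) * Fraction(qj) * (Fraction(b) ** k - Fraction(a) ** k) / k
    return total


print(norm((2, 1, -1)))          # part (a): square root of 2^2 + 1^2 + (-1)^2
print(norm((1, 1j, -1j)))        # part (b): square root of 1 + 1 + 1
f = [0, 0, 1]                    # part (c): f(x) = x^2
print(poly_inner_product(f, f))  # 1/5, so the norm of f is 1/sqrt(5)
```

Exact fractions are used for the integral so that the value of ⟨f, f⟩ is not blurred by rounding.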

E8) Show that for any $x \in V$, $x \neq 0$, $\frac{x}{\|x\|}$ is a unit vector.

E9) Calculate the norm of

a) $u = (1, -1, 3)$, using the standard inner product in $\mathbb{R}^3$.

b) $u = (1, -1, 1 - i)$, using the standard inner product in $\mathbb{C}^3$.

Observe that in E8, $x$ is any non-zero vector of $V$ and $V$ is any inner product space. So
we can create a unit vector $\frac{x}{\|x\|}$ from any non-zero vector $x$. This leads us to
the following definition.

Definition 3: Given any vector $x \in V$, $x \neq 0$, $\frac{x}{\|x\|}$ is the normalised form of $x$.

E8 tells us that the normalised form of a vector is always a unit vector.
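
As an illustration of Definition 3, the sketch below normalises a vector with respect to the standard inner product. The function names `norm` and `normalise` are our own; the check that the result has norm 1 is numerical and does not replace the proof asked for in E8.

```python
import math


def norm(x):
    """Norm induced by the standard inner product on C^n (also works on R^n)."""
    return math.sqrt(sum(abs(c) ** 2 for c in x))


def normalise(x):
    """Return x / ||x||. The zero vector has no normalised form."""
    length = norm(x)
    if length == 0:
        raise ValueError("the zero vector cannot be normalised")
    return tuple(c / length for c in x)


u = normalise((3, 4))
print(u)        # (0.6, 0.8)
print(norm(u))  # 1 up to rounding error
```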

We will now prove some results involving norms. The first one is the
Cauchy-Schwarz inequality. It is very simple, but very important because it
allows us to prove many other useful statements.

This inequality was discovered independently by the French mathematician
Cauchy, the German mathematician Schwarz and the Russian mathematician
Bunyakowski. However, in most of the literature available in English it is
ascribed only to Cauchy and Schwarz.

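Theorem 2: Let $(V, \langle\ ,\ \rangle)$ be an inner product space and $x, y \in V$.
Then $|\langle x, y\rangle| \leq \|x\|\,\|y\|$.

Proof: If $x = 0$ or $y = 0$, then $|\langle x, y\rangle| = 0 = \|x\|\,\|y\|$.

So, let us assume that $x \neq 0$ and $y \neq 0$. Hence, $\|y\| > 0$.
Let $z = \frac{y}{\|y\|}$. Then $z \in V$, and $\|z\| = 1$. Now, for any $\alpha \in \mathbb{F}$,
consider the norm of the vector $x - \alpha z \in V$.

$$\begin{aligned}
\|x - \alpha z\|^2 &= \langle x - \alpha z, x - \alpha z\rangle \\
&= \langle x, x\rangle - \alpha\langle z, x\rangle - \bar{\alpha}\langle x, z\rangle + \alpha\bar{\alpha}\langle z, z\rangle, \text{ using Theorem 1,} \\
&= \|x\|^2 - \alpha\overline{\langle x, z\rangle} - \bar{\alpha}\langle x, z\rangle + \alpha\bar{\alpha}, \text{ since } \langle z, z\rangle = 1.
\end{aligned}$$

Adding and subtracting $\langle x, z\rangle\overline{\langle x, z\rangle}$, we get

$$\begin{aligned}
\|x - \alpha z\|^2 &= \|x\|^2 - \alpha\overline{\langle x, z\rangle} - \bar{\alpha}\langle x, z\rangle + \alpha\bar{\alpha} + \langle x, z\rangle\overline{\langle x, z\rangle} - \langle x, z\rangle\overline{\langle x, z\rangle} \\
&= \|x\|^2 - |\langle x, z\rangle|^2 + \{\langle x, z\rangle - \alpha\}\{\overline{\langle x, z\rangle} - \bar{\alpha}\} \\
&= \|x\|^2 - |\langle x, z\rangle|^2 + |\langle x, z\rangle - \alpha|^2.
\end{aligned}$$

Now $\|x - \alpha z\|^2 \geq 0$. This means that
$\|x\|^2 - |\langle x, z\rangle|^2 + |\langle x, z\rangle - \alpha|^2 \geq 0$ for all $\alpha \in \mathbb{F}$.
In particular, if we choose $\alpha = \langle x, z\rangle$, we get $0 \leq \|x\|^2 - |\langle x, z\rangle|^2$.
Hence, $|\langle x, z\rangle| \leq \|x\|$, that is,

$$\left|\left\langle x, \frac{y}{\|y\|}\right\rangle\right| \leq \|x\| \Leftrightarrow \frac{1}{\|y\|}|\langle x, y\rangle| \leq \|x\| \Leftrightarrow |\langle x, y\rangle| \leq \|x\|\,\|y\|,$$

which is the required inequality. ■
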
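
A numerical check of the Cauchy-Schwarz inequality for the standard inner product on ℂⁿ can be sketched as follows. The helper names, the sample vectors and the small tolerance `tol` (which only absorbs floating-point rounding) are assumptions made for this illustration.

```python
import math


def inner(x, y):
    """Standard inner product on C^n: the sum of x_i * conj(y_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x).real)


def cauchy_schwarz_holds(x, y, tol=1e-12):
    """Check |<x, y>| <= ||x|| ||y|| up to a rounding tolerance."""
    return abs(inner(x, y)) <= norm(x) * norm(y) + tol


x = (1 + 1j, 2, -1j)
y = (3, 1 - 2j, 2j)
print(abs(inner(x, y)), norm(x) * norm(y))   # left side is the smaller number
print(cauchy_schwarz_holds(x, y))            # True

# When one vector is a scalar multiple of the other, the two sides agree.
z = tuple((2 - 1j) * c for c in x)
print(math.isclose(abs(inner(x, z)), norm(x) * norm(z)))  # True
```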
Let us see what the Cauchy-Schwarz inequality looks like in some cases.

Example 7: Write the expression for the Cauchy-Schwarz inequality for the
vector space given in E4.

Solution: For any $f \in V$, $\|f\|^2 = \langle f, f\rangle = \sum_{i=1}^{n} |f(x_i)|^2$. Thus, Theorem 2 says that

$$\left|\sum_{i=1}^{n} f(x_i)\,\overline{g(x_i)}\right| \leq \sqrt{\sum_{i=1}^{n} |f(x_i)|^2}\ \sqrt{\sum_{i=1}^{n} |g(x_i)|^2} \quad \text{for all } f, g \in V.$$

∗∗∗

Note: Two vectors $x$ and $y$ are called proportional if there exists $\alpha \in \mathbb{F}$,
$\alpha \neq 0$, with $x = \alpha y$.

Do try these exercises now.

E10) Write down the expressions for the Cauchy-Schwarz inequality for the
spaces given in Examples 1, 3 and 4.

E11) If y = 𝛼x, show that |⟨x, y⟩| = ‖x‖ ‖y‖.

We come to the next theorem now, which is a generalisation of well-known
results of Euclidean geometry.

Theorem 3: If $(V, \langle\ ,\ \rangle)$ is an inner product space and $x, y \in V$, then

a) $\|x + y\| \leq \|x\| + \|y\|$ (Triangle inequality)

b) $\|x + y\|^2 + \|x - y\|^2 = 2\left(\|x\|^2 + \|y\|^2\right)$ (Parallelogram law)

Note: If $z = a + ib \in \mathbb{C}$, then
a) the real part of $z$ is $a$, and is denoted by $\mathrm{Re}(z)$,
b) $z + \bar{z} = 2\,\mathrm{Re}(z)$,
c) $\mathrm{Re}(z) \leq |z|$.

Proof: a) Now

$$\begin{aligned}
\|x + y\|^2 = \langle x + y, x + y\rangle &= \|x\|^2 + \langle x, y\rangle + \langle y, x\rangle + \|y\|^2 \\
&= \|x\|^2 + \langle x, y\rangle + \overline{\langle x, y\rangle} + \|y\|^2 \\
&= \|x\|^2 + 2\,\mathrm{Re}\langle x, y\rangle + \|y\|^2 \\
&\leq \|x\|^2 + 2|\langle x, y\rangle| + \|y\|^2, \text{ since } \mathrm{Re}\langle x, y\rangle \leq |\langle x, y\rangle|, \\
&\leq \|x\|^2 + 2\|x\|\,\|y\| + \|y\|^2 \text{ (by Theorem 2)} \\
&= (\|x\| + \|y\|)^2.
\end{aligned}$$

Hence, $\|x + y\|^2 \leq (\|x\| + \|y\|)^2$. Taking square roots of both sides we obtain

$$\|x + y\| \leq \|x\| + \|y\|.$$

b) To prove the parallelogram law we expand $\|x + y\|^2 + \|x - y\|^2$ to get

$$\langle x, x\rangle + \langle x, y\rangle + \langle y, x\rangle + \langle y, y\rangle + \langle x, x\rangle - \langle x, y\rangle - \langle y, x\rangle + \langle y, y\rangle = 2\left(\|x\|^2 + \|y\|^2\right).$$

Thus, (b) is also proved. ■

Fig. 1: $\|x + y\| \leq \|x\| + \|y\|$

The reason (a) is called the triangle inequality is that for any triangle the sum of
the lengths of any two sides is greater than or equal to the length of the third side.
So, if we consider a triangle in any Euclidean space, two of whose sides are
the vectors $x$ and $y$, then the third side is $x + y$ (see Fig. 1), and hence,
$\|x\| + \|y\| \geq \|x + y\|$.

Similarly, (b) is called the parallelogram law because it generalises the fact that
the sum of the squares of the lengths of the diagonals of a parallelogram in
Euclidean space is always equal to double the sum of the squares of the lengths of
two adjacent sides (Fig. 2).
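
Both parts of Theorem 3 can be checked numerically for the standard inner product on ℂⁿ. The sketch below is only an illustration; the helper names and the sample vectors are our own assumptions.

```python
import math


def inner(x, y):
    """Standard inner product on C^n: the sum of x_i * conj(y_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x).real)


def add(x, y):
    return tuple(a + b for a, b in zip(x, y))


def sub(x, y):
    return tuple(a - b for a, b in zip(x, y))


def triangle_inequality_holds(x, y, tol=1e-12):
    """Check ||x + y|| <= ||x|| + ||y|| up to a rounding tolerance."""
    return norm(add(x, y)) <= norm(x) + norm(y) + tol


def parallelogram_sides(x, y):
    """Return both sides of the parallelogram law as a pair (lhs, rhs)."""
    lhs = norm(add(x, y)) ** 2 + norm(sub(x, y)) ** 2
    rhs = 2 * (norm(x) ** 2 + norm(y) ** 2)
    return lhs, rhs


x = (1, 2j, -1)
y = (2 - 1j, 1, 3)
print(triangle_inequality_holds(x, y))   # True
print(parallelogram_sides(x, y))         # both entries are 42 up to rounding
```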

Fig. 2: $\|x + y\|^2 + \|x - y\|^2 = 2\left(\|x\|^2 + \|y\|^2\right)$

E12) Show that $\left|\,\|x\| - \|y\|\,\right| \leq \|x - y\|$ for $x, y \in (V, \langle\ ,\ \rangle)$.
(Hint: Use the triangle inequality for $y$ and $(x - y)$.)

Let us now discuss a general version of what we already discussed about
orthogonality in a previous block.

14.4 ORTHOGONALITY
In Theorem 2 we showed that $\frac{|\langle x, y\rangle|}{\|x\|\,\|y\|} \leq 1$ for any non-zero
$x, y \in V$. We know that, for non-zero vectors $x$ and $y$ in $\mathbb{R}^2$ or $\mathbb{R}^3$,
$\frac{|\langle x, y\rangle|}{\|x\|\,\|y\|}$ is equal to the magnitude of the cosine of the angle
between them. We generalise this concept now.

For any inner product space $V$ and for any non-zero $x, y \in V$, we take
$\frac{|\langle x, y\rangle|}{\|x\|\,\|y\|}$ to be the magnitude of the cosine of the angle between
the two vectors $x$ and $y$.

So what happens if $x$ and $y$ are perpendicular to each other? We find that
$\langle x, y\rangle = 0$. This leads us to the following definition.

Definition 4: If $(V, \langle\ ,\ \rangle)$ is an inner product space and $x, y \in V$, then $x$ is said to
be orthogonal (or perpendicular) to $y$ if $\langle x, y\rangle = 0$. This is denoted by $x \perp y$.

For example, $i = (1, 0)$ is orthogonal to $j = (0, 1)$ with respect to the standard
inner product in $\mathbb{R}^2$.
We now give some properties involving orthogonality. Their proof is left as an
exercise for you.

E13) Using the definitions of inner product and orthogonality, prove the
following results for an inner product space $V$.
a) $0 \perp x$ for all $x \in V$.
b) $x \perp x$ iff $x = 0$, where $x \in V$.
c) $x \perp y \Rightarrow y \perp x$, for $x, y \in V$.
d) $x \perp y \Rightarrow \alpha x \perp y$ for any $\alpha \in \mathbb{F}$, where $x, y \in V$.

Let us consider some examples now.

Example 8: Consider $V = \mathbb{R}^n$. If $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ are any two
vectors of $V$, we define the inner product of $x$ with $y$ by

$$\langle x, y\rangle = \sum_{i=1}^{n} x_i y_i.$$

Let $B = \{e_1, \ldots, e_n\}$ be the standard basis of $V$. Show that $e_i \perp e_j$ when
$i \neq j$, $i, j = 1, \ldots, n$. What happens when $i = j$?

Solution: Consider $e_1 = (1, 0, 0, \ldots, 0)$ and $e_2 = (0, 1, 0, \ldots, 0)$. We find that
$\langle e_1, e_2\rangle = 1 \cdot 0 + 0 \cdot 1 + 0 + \cdots + 0 = 0$. Hence, $e_1 \perp e_2$. In a similar
way, we can show that $e_i \perp e_j$, for $i \neq j$, $i, j = 1, \ldots, n$.

Now let us see what $\langle e_i, e_i\rangle$ is for all $i = 1, \ldots, n$.

$\langle e_1, e_1\rangle = 1 \cdot 1 + 0 + \cdots + 0 = 1$.
$\langle e_2, e_2\rangle = 0 + 1 \cdot 1 + 0 + \cdots + 0 = 1$.

Similarly, $\langle e_i, e_i\rangle = 1$ for all $i = 1, \ldots, n$.
Thus, $\|e_i\| = 1$ for all $i = 1, \ldots, n$.
On the lines of Example 8, we can also show that the elements of the standard
basis of $\mathbb{C}^n$ are mutually orthogonal and of unit length with respect to the
standard inner product.

∗∗∗

Try the following exercises now.

E14) For $x, y \in (V, \langle\ ,\ \rangle)$ such that $x \perp y$, show that

$$\|x + y\|^2 = \|x\|^2 + \|y\|^2.$$

(This is the Pythagoras Theorem when $V = \mathbb{R}^2$ (see Fig. 3).)

Fig. 3: $\|x + y\|^2 = \|x\|^2 + \|y\|^2$

We will now define a set of orthogonal vectors.

Definition 5: A set $A \subset V$ is called orthogonal if $x \perp y$ for all $x, y \in A$ such that $x \neq y$.
An orthogonal set $A$ is called orthonormal if $\|x\| = 1$ for all $x \in A$.
For example, the set $B$ in Example 8 is orthogonal and orthonormal.

By definition, every orthonormal set is orthogonal. But the converse is not true,
as the following example tells us.

Example 9: Consider the standard basis $B = \{e_1, \ldots, e_n\}$ of $\mathbb{R}^n$. Show that the
set $C = \{2e_1, 2e_2, \ldots, 2e_n\}$ is orthogonal but not orthonormal with respect to the
standard inner product.

Solution: For $i \neq j$, $\langle 2e_i, 2e_j\rangle = 4\langle e_i, e_j\rangle = 0$. Thus, $C$ is an orthogonal set.
But $\|2e_i\| = \sqrt{4\langle e_i, e_i\rangle} = 2$ for all $i = 1, \ldots, n$.
Therefore, $C$ is not an orthonormal set.

∗∗∗
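
Definition 5 translates directly into a check on pairs of vectors. The Python sketch below uses the standard inner product on ℝⁿ with integer entries, so that the comparisons with 0 and 1 are exact; the function names are our own and are not part of the course text.

```python
def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal(vectors):
    """True if every pair of distinct vectors in the list has inner product 0."""
    return all(
        dot(u, v) == 0
        for i, u in enumerate(vectors)
        for v in vectors[i + 1:]
    )


def is_orthonormal(vectors):
    """True if the list is orthogonal and every vector has norm 1."""
    return is_orthogonal(vectors) and all(dot(v, v) == 1 for v in vectors)


def standard_basis(n):
    return [tuple(1 if i == j else 0 for j in range(n)) for i in range(n)]


B = standard_basis(3)                         # the set B of Example 8, with n = 3
C = [tuple(2 * c for c in e) for e in B]      # the set C of Example 9

print(is_orthonormal(B))                      # True
print(is_orthogonal(C), is_orthonormal(C))    # True False
```

With floating-point entries the exact comparisons would have to be replaced by comparisons up to a tolerance.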
Try the following exercise now.

E15) Let $P_n$ be the real vector space of all polynomials of degree $\leq n$. We define
an inner product on $P_n$ by

$$\left\langle \sum_{i=0}^{n} a_i x^i, \sum_{i=0}^{n} b_i x^i \right\rangle = \sum_{i=0}^{n} a_i b_i.$$

Show that the basis $\{1, x, x^2, \ldots, x^n\}$ of $P_n$ is an orthonormal set.

In the next two theorems we present some properties of an orthogonal set,
related to the linear combination of its vectors.

Theorem 4: Let $(V, \langle\ ,\ \rangle)$ be an inner product space and $x, y_1, \ldots, y_n \in V$ such that
$x \perp y_i$ for all $i = 1, \ldots, n$. Then $x$ is orthogonal to every linear combination of the
vectors $y_1, \ldots, y_n$.

Proof: Let $y = \sum_{i=1}^{n} a_i y_i$, where $a_i \in \mathbb{F}$ for all $i = 1, \ldots, n$.
Then, $y \in V$ and

$$\langle x, y\rangle = \left\langle x, \sum_{i=1}^{n} a_i y_i \right\rangle = \sum_{i=1}^{n} \bar{a}_i \langle x, y_i\rangle = 0,$$

because $\langle x, y_i\rangle = 0$ for all $i$. This shows that $x \perp y$. ■

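Theorem 5: Let $(V, \langle\ ,\ \rangle)$ be an inner product space and $A = \{x_1, \ldots, x_n\} \subseteq V$ be an
orthogonal set. Then, for any $a_i \in \mathbb{F}$ $(i = 1, \ldots, n)$, we have

$$\left\| \sum_{i=1}^{n} a_i x_i \right\|^2 = \sum_{i=1}^{n} |a_i|^2 \|x_i\|^2.$$

Proof: Our hypothesis says that $\langle x_i, x_j\rangle = 0$ if $i \neq j$. Consider $y = \sum_{i=1}^{n} a_i x_i$.

$$\begin{aligned}
\|y\|^2 = \langle y, y\rangle &= \left\langle \sum_{i=1}^{n} a_i x_i, \sum_{j=1}^{n} a_j x_j \right\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} \langle a_i x_i, a_j x_j\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i \bar{a}_j \langle x_i, x_j\rangle \\
&= \sum_{i=1}^{n} a_i \bar{a}_i \langle x_i, x_i\rangle, \text{ since } \langle x_i, x_j\rangle = 0 \text{ for } i \neq j, \\
&= \sum_{i=1}^{n} |a_i|^2 \|x_i\|^2.
\end{aligned}$$

This proves the result. ■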

Note: If $a_i = 1$ for all $i$, in Theorem 5, we get

$$\left\| \sum_{i=1}^{n} x_i \right\|^2 = \sum_{i=1}^{n} \|x_i\|^2.$$

This is a generalised form of what we gave in E14.
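
The identity of Theorem 5 can be checked on a concrete orthogonal set. In the sketch below, the set `X`, the coefficients `a` and all function names are assumptions chosen only for illustration; the inner product is the standard one on ℂ³.

```python
import math


def inner(x, y):
    """Standard inner product on C^n: the sum of x_i * conj(y_i)."""
    return sum(p * complex(q).conjugate() for p, q in zip(x, y))


def norm_sq(x):
    return inner(x, x).real


def linear_combination(coeffs, vectors):
    """Return the sum of a_i * x_i, computed component by component."""
    dim = len(vectors[0])
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(dim))


# An orthogonal (but not orthonormal) set in C^3, and arbitrary coefficients
X = [(1, 1, 0), (1, -1, 0), (0, 0, 2)]
a = [2, -3j, 1 + 1j]

lhs = norm_sq(linear_combination(a, X))                          # || sum a_i x_i ||^2
rhs = sum(abs(ai) ** 2 * norm_sq(xi) for ai, xi in zip(a, X))    # sum |a_i|^2 ||x_i||^2
print(lhs, rhs)   # both are 34 up to rounding
```

If the vectors in `X` were not mutually orthogonal, the cross terms in the proof would not vanish and the two numbers would in general differ.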

We now give an important result, which is actually a corollary to Theorem 5.

Theorem 6: Let $A$ be an orthogonal set of non-zero vectors of an inner product
space $V$. Then $A$ is a linearly independent set.

Proof: To show that $A$ is linearly independent we will have to prove that any
finite subset $\{x_1, \ldots, x_n\}$ of vectors of $A$ is linearly independent. For this, assume
that $y = \sum_{i=1}^{n} a_i x_i = 0$.

Then, by Theorem 5,

$$\begin{aligned}
\|y\|^2 = 0 &\Rightarrow \sum_{i=1}^{n} |a_i|^2 \|x_i\|^2 = 0 \Rightarrow |a_i|^2 \|x_i\|^2 = 0 \text{ for all } i \\
&\Rightarrow |a_i|^2 = 0 \text{ for } i = 1, \ldots, n, \text{ since } \|x_i\|^2 \neq 0 \text{ for any } i \\
&\Rightarrow a_i = 0 \text{ for } i = 1, \ldots, n.
\end{aligned}$$

Thus, $\{x_1, \ldots, x_n\}$ is linearly independent. Hence, $A$ is linearly independent. ■

We have just proved that any orthogonal set of non-zero vectors is linearly
independent. Therefore, any such set in an inner product space V of dimension n
can have at most n elements. So, for example, any orthogonal set of non-zero
vectors in ℝ³ can have 3 elements, at the most.

We shall use Theorem 6 as a stepping stone towards showing that any
finite-dimensional inner product space has an orthonormal set as a basis. But
first, some definitions and remarks.

Definition 6: A basis of an inner product space is called an orthonormal basis
if its elements form an orthonormal set.

For example, the standard basis of ℝⁿ is an orthonormal basis (Example 8).

Now, a small exercise.

E16) Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis for a real inner product space $V$.
Let $x = \sum_{i=1}^{n} x_i e_i$ and $y = \sum_{i=1}^{n} y_i e_i$ be elements of $V$. Show that
$\langle x, y\rangle = \sum_{i=1}^{n} x_i y_i$.

We make a few observations now.

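Remark 3: a) If $A \subseteq V$ is orthogonal, then the set
$B = \left\{ \frac{x}{\|x\|} \,\middle|\, x \in A \text{ and } x \neq 0 \right\}$ is
orthonormal. For example, consider $\mathbb{R}^2$, with the dot product. Let $v = (1, 1)$
and $w = (1, -1)$. Then $v \cdot w = 1 - 1 = 0$. Thus, $v \perp w$. Therefore,

$$\left\{ \frac{v}{\|v\|}, \frac{w}{\|w\|} \right\} = \left\{ \left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right), \left( \frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}} \right) \right\}$$

is an orthonormal set in $\mathbb{R}^2$. In fact, this is a basis of $\mathbb{R}^2$, since $\{v, w\}$ is a
linearly independent set and $\dim_{\mathbb{R}} \mathbb{R}^2 = 2$.

b) For any $0 \neq x \in V$, $\left\{ \frac{x}{\|x\|} \right\}$ can be regarded as an orthonormal set in $V$.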

We now state the theorem that tells us of the existence of an orthonormal
basis. Its proof consists of a method called the Gram-Schmidt
orthogonalisation process.

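Theorem 7: Let $(V, \langle\ ,\ \rangle)$ be a non-zero inner product space of dimension $n$.
Then $V$ has an orthonormal basis.

Proof: We shall first show that it has an orthogonal basis, and then obtain an
orthonormal basis.
Let $\{v_1, \ldots, v_n\}$ be a basis of $V$. From this basis we shall obtain an orthogonal
basis $\{w_1, w_2, \ldots, w_n\}$ of $V$ in the following way.

Take $w_1 = v_1$. Define $w_2 = v_2 - \frac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle} w_1$. Then
$w_2 = v_2 - \frac{\langle v_2, v_1\rangle}{\langle v_1, v_1\rangle} v_1$, and

$$\langle w_2, v_1\rangle = \langle v_2, v_1\rangle - \frac{\langle v_2, v_1\rangle}{\langle v_1, v_1\rangle}\langle v_1, v_1\rangle = 0.$$

That is, $\langle w_2, w_1\rangle = 0$. Further, $v_2 = c_1 w_1 + w_2$, where
$c_1 = \frac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle} \in \mathbb{F}$.

Define $w_3 = v_3 - \frac{\langle v_3, w_2\rangle}{\langle w_2, w_2\rangle} w_2 - \frac{\langle v_3, w_1\rangle}{\langle w_1, w_1\rangle} w_1$.
Then $\langle w_3, w_2\rangle = 0 = \langle w_3, w_1\rangle$.

Also, $v_3 = c_1 w_1 + c_2 w_2 + w_3$, where $c_1, c_2 \in \mathbb{F}$. Continuing in this manner, we can
define

$$w_{m+1} = v_{m+1} - c_1 w_1 - c_2 w_2 - \cdots - c_m w_m, \text{ where } c_i = \frac{\langle v_{m+1}, w_i\rangle}{\langle w_i, w_i\rangle} \in \mathbb{F},$$

so that $v_{m+1} = c_1 w_1 + c_2 w_2 + \cdots + c_m w_m + w_{m+1}$, for $m = 0, \ldots, n - 1$.

Each $w_{m+1}$ is non-zero, since otherwise $v_{m+1}$ would be a linear combination of
$w_1, \ldots, w_m$, and hence of $v_1, \ldots, v_m$, contradicting the linear independence of the $v_i$'s.
This way we obtain an orthogonal set of non-zero vectors $\{w_1, w_2, \ldots, w_n\}$, such that the
$v_i$'s are linear combinations of the $w_i$'s. By Theorem 6 this set is linearly
independent, and hence forms a basis of $V$.

From this basis, we immediately obtain an orthonormal basis of $V$ by using
Remark 3. Thus, $\left\{ \frac{w_1}{\|w_1\|}, \ldots, \frac{w_n}{\|w_n\|} \right\}$ is an orthonormal basis of $V$. ■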
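
The construction in the proof of Theorem 7 can be sketched in Python for the standard inner product on ℝⁿ. This is a minimal illustration under our own assumptions: the function names are hypothetical, the input vectors are chosen arbitrarily, and exact fractions are used for the orthogonal vectors so that the orthogonality checks are not affected by rounding.

```python
import math
from fractions import Fraction


def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Turn linearly independent vectors v_1, ..., v_n into orthogonal w_1, ..., w_n."""
    orthogonal = []
    for v in vectors:
        w = [Fraction(c) for c in v]
        for u in orthogonal:
            coeff = dot(v, u) / dot(u, u)      # c_i = <v, w_i> / <w_i, w_i>
            w = [wc - coeff * uc for wc, uc in zip(w, u)]
        if all(c == 0 for c in w):
            raise ValueError("the input vectors are linearly dependent")
        orthogonal.append(w)
    return orthogonal


def normalise(w):
    """Divide w by its norm (floating point)."""
    length = math.sqrt(dot(w, w))
    return [float(c) / length for c in w]


ws = gram_schmidt([(1, 1, 0), (1, 0, 1), (0, 1, 1)])
print(ws[1])                       # w_2 = (1/2, -1/2, 1)
print(ws[2])                       # w_3 = (-2/3, 2/3, 2/3)
orthonormal_basis = [normalise(w) for w in ws]
```

The check for a zero vector mirrors the step in the proof where each w_{m+1} is shown to be non-zero.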

Note: The same process can be used to show that:

If $(V, \langle\ ,\ \rangle)$ is an inner product space and $Y = \{y_1, \ldots, y_n\}$ a set of linearly
independent vectors of $V$, then an orthonormal set $X = \{x_1, x_2, \ldots, x_n\}$ can be
obtained from $Y$ such that the linear spans of $X$ and $Y$ coincide.

Let us see how the Gram-Schmidt process works in an example.

Example 10: Obtain an orthonormal basis for $P_2$, the space of all real
polynomials of degree at most 2, the inner product being defined by

$$\langle p_1, p_2\rangle = \int_0^1 p_1(t)\,p_2(t)\,dt.$$

Solution: $\{1, t, t^2\}$ is a basis for $P_2$. From this we will obtain an orthogonal
basis $\{w_1, w_2, w_3\}$. Now $w_1 = 1$ and $\langle w_1, w_1\rangle = \int_0^1 dt = 1$.

Next, $w_2 = t - \frac{\langle t, w_1\rangle}{\langle w_1, w_1\rangle} w_1$. Now
$\langle t, w_1\rangle = \int_0^1 t\,dt = \left[\frac{t^2}{2}\right]_0^1 = \frac{1}{2}$. Therefore, $w_2 = t - \frac{1}{2}$, and

$$\langle w_2, w_2\rangle = \int_0^1 \left(t - \frac{1}{2}\right)^2 dt = \frac{1}{12}.$$

Finally, $\langle t^2, w_1\rangle = \int_0^1 t^2\,dt = \frac{1}{3}$ and
$\langle t^2, w_2\rangle = \int_0^1 t^2\left(t - \frac{1}{2}\right) dt = \frac{1}{12}$, so

$$\begin{aligned}
w_3 &= t^2 - \frac{\langle t^2, w_2\rangle}{\langle w_2, w_2\rangle} w_2 - \frac{\langle t^2, w_1\rangle}{\langle w_1, w_1\rangle} w_1 \\
&= t^2 - 12\left\{\frac{1}{12}\left(t - \frac{1}{2}\right)\right\} - \frac{1}{3} = t^2 - t + \frac{1}{6}.
\end{aligned}$$

Also $\langle w_3, w_3\rangle = \int_0^1 \left(t^2 - t + \frac{1}{6}\right)^2 dt = \frac{1}{180}$.
Thus, the orthonormal basis is

$$\left\{ \frac{w_1}{\|w_1\|}, \frac{w_2}{\|w_2\|}, \frac{w_3}{\|w_3\|} \right\} = \left\{ 1, \sqrt{12}\left(t - \frac{1}{2}\right), \sqrt{180}\left(t^2 - t + \frac{1}{6}\right) \right\}.$$

∗∗∗
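
The arithmetic of Example 10 can be reproduced exactly with rational numbers. In the sketch below a polynomial is stored as a list of coefficients with the constant term first (so t² is `[0, 0, 1]`); this representation and the function names are assumptions made for the illustration.

```python
from fractions import Fraction


def poly_inner(p, q):
    """Integral over [0, 1] of p(t) * q(t), computed exactly term by term."""
    return sum(
        Fraction(a) * Fraction(b) / (i + j + 1)   # integral of t^(i+j) over [0, 1]
        for i, a in enumerate(p)
        for j, b in enumerate(q)
    )


def poly_sub_scaled(p, c, q):
    """Return p - c*q, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [Fraction(a) - c * Fraction(b) for a, b in zip(p, q)]


def gram_schmidt_polys(basis):
    """Gram-Schmidt process for polynomials under the inner product poly_inner."""
    ws = []
    for v in basis:
        w = [Fraction(c) for c in v]
        for u in ws:
            w = poly_sub_scaled(w, poly_inner(v, u) / poly_inner(u, u), u)
        ws.append(w)
    return ws


w1, w2, w3 = gram_schmidt_polys([[1], [0, 1], [0, 0, 1]])
print(w2)                   # coefficients of t - 1/2
print(w3)                   # coefficients of t^2 - t + 1/6
print(poly_inner(w2, w2))   # 1/12
print(poly_inner(w3, w3))   # 1/180
```

Dividing each w_i by the square root of ⟨w_i, w_i⟩ then gives the orthonormal basis.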

Here’s an exercise.

E17) Obtain an orthonormal basis with respect to the standard inner product,
for
a) the subspace of ℝ³ generated by (1, 0, 3) and (2, 1, 1)
b) the subspace of ℝ⁴ generated by (1, 0, 2, 0) and (1, 2, 3, 1).

We will now prove a theorem that leads us to an important inequality, which is
used for studying Fourier coefficients.

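Theorem 8: Let $(V, \langle\ ,\ \rangle)$ be an inner product space and $A = \{x_1, \ldots, x_n\}$ be an
orthonormal set in $V$. Then, for any $y \in V$,

$$\left\| y - \sum_{i=1}^{n} \langle y, x_i\rangle x_i \right\|^2 = \|y\|^2 - \sum_{i=1}^{n} |\langle y, x_i\rangle|^2.$$

Proof: Let $x = \sum_{i=1}^{n} a_i x_i$ $(a_i \in \mathbb{F})$ be any linear combination of the elements of $A$.
Then

$$\begin{aligned}
\|y - x\|^2 = \langle y - x, y - x\rangle &= \|y\|^2 - \langle y, x\rangle - \langle x, y\rangle + \|x\|^2 \\
&= \|y\|^2 - \left\langle y, \sum_{i=1}^{n} a_i x_i \right\rangle - \left\langle \sum_{i=1}^{n} a_i x_i, y \right\rangle + \|x\|^2 \\
&= \|y\|^2 - \sum_{i=1}^{n} \bar{a}_i \langle y, x_i\rangle - \sum_{i=1}^{n} \langle a_i x_i, y\rangle + \sum_{i=1}^{n} |a_i|^2 \|x_i\|^2,
\end{aligned}$$

since $\langle x_i, x_j\rangle = 0$ for $i \neq j$.

As $\|x_i\|^2 = 1$ for all $i$, it follows that

$$\begin{aligned}
\|y - x\|^2 &= \|y\|^2 - \sum_{i=1}^{n} \bar{a}_i \langle y, x_i\rangle - \sum_{i=1}^{n} a_i \langle x_i, y\rangle + \sum_{i=1}^{n} a_i \bar{a}_i \\
&= \|y\|^2 - \sum_{i=1}^{n} |\langle y, x_i\rangle|^2 + \sum_{i=1}^{n} |\langle y, x_i\rangle|^2 - \sum_{i=1}^{n} \bar{a}_i \langle y, x_i\rangle - \sum_{i=1}^{n} a_i \langle x_i, y\rangle + \sum_{i=1}^{n} a_i \bar{a}_i \\
&= \|y\|^2 - \sum_{i=1}^{n} |\langle y, x_i\rangle|^2 + \sum_{i=1}^{n} \langle y, x_i\rangle \overline{\langle y, x_i\rangle} - \sum_{i=1}^{n} \bar{a}_i \langle y, x_i\rangle - \sum_{i=1}^{n} a_i \overline{\langle y, x_i\rangle} + \sum_{i=1}^{n} a_i \bar{a}_i \\
&= \|y\|^2 - \sum_{i=1}^{n} |\langle y, x_i\rangle|^2 + \sum_{i=1}^{n} \langle y, x_i\rangle \left\{ \overline{\langle y, x_i\rangle} - \bar{a}_i \right\} - \sum_{i=1}^{n} a_i \left\{ \overline{\langle y, x_i\rangle} - \bar{a}_i \right\} \\
&= \|y\|^2 - \sum_{i=1}^{n} |\langle y, x_i\rangle|^2 + \sum_{i=1}^{n} \left\{ \langle y, x_i\rangle - a_i \right\} \left\{ \overline{\langle y, x_i\rangle} - \bar{a}_i \right\} \\
&= \|y\|^2 - \sum_{i=1}^{n} |\langle y, x_i\rangle|^2 + \sum_{i=1}^{n} |\langle y, x_i\rangle - a_i|^2.
\end{aligned}$$

This is true for any $a_i \in \mathbb{F}$. Now choose $a_i = \langle y, x_i\rangle$ for all $i = 1, \ldots, n$. Then we get

$$\left\| y - \sum_{i=1}^{n} \langle y, x_i\rangle x_i \right\|^2 = \|y\|^2 - \sum_{i=1}^{n} |\langle y, x_i\rangle|^2,$$

which is the desired result. ■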
i=1 i=1

And now we come to a corollary of Theorem 8, known as Bessel’s inequality.


It is named after the German astronomer, Friedrich Wilhelm Bessel
(1784-1846).

Corollary 1: Let A = {x1 , … , xn } be any orthonormal set in (V, ⟨ , ⟩). Then, for any y ∈ V,
$$\sum_{i=1}^{n} |\langle y, x_i\rangle|^2 \le \|y\|^2 .$$
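To see Theorem 8 and Bessel's inequality at work on actual numbers, here is a small sketch in ℂ³. It assumes the standard inner product ⟨u, v⟩ = ∑ uₖ v̄ₖ. The orthonormal set {x1, x2} and the vector y below are our own illustrative choices and do not come from the unit.

```python
import math

def inner(u, v):
    """Standard inner product on C^n: sum of u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

s = 1 / math.sqrt(2)
x1 = [s, s * 1j, 0j]      # an orthonormal pair in C^3 (illustrative choice)
x2 = [s, -s * 1j, 0j]
y = [1 + 2j, 3 - 1j, 2j]  # an arbitrary vector

# Coefficients <y, x_i> and the combination sum_i <y, x_i> x_i.
coeffs = [inner(y, x) for x in (x1, x2)]
combo = [sum(c * x[k] for c, x in zip(coeffs, (x1, x2))) for k in range(3)]
residual = [a - b for a, b in zip(y, combo)]

norm_sq = inner(y, y).real                       # ||y||^2
bessel_sum = sum(abs(c) ** 2 for c in coeffs)    # sum of |<y, x_i>|^2
lhs = inner(residual, residual).real             # left side of Theorem 8
rhs = norm_sq - bessel_sum                       # right side of Theorem 8

print(lhs, rhs, bessel_sum <= norm_sq)
```

Both sides of the identity in Theorem 8 agree up to rounding, and the sum of the squared coefficients does not exceed ‖y‖², as the corollary asserts.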

E18) Prove the corollary given above.

We end the unit by summarising what we have covered in it.

14.5 SUMMARY
In this unit we have discussed the following points. We have

1. defined and given examples of inner product spaces.

2. defined the norm of a vector.

3. proved the Cauchy-Schwarz inequality.

4. defined an orthogonal and an orthonormal set of vectors.

5. shown that every finite-dimensional inner product space has an


orthonormal basis, using the Gram-Schmidt orthogonalisation process.

6. proved Bessel’s inequality.

14.6 SOLUTIONS/ANSWERS

E1) For 𝛼 ∈ ℝ and (x1 , x2 , x3 ) , (y1 , y2 , y3 ) ∈ ℝ3 ,

⟨𝛼 (x1 , x2 , x3 ) , (y1 , y2 , y3 )⟩ = (𝛼x1 , 𝛼x2 , 𝛼x3 ) ⋅ (y1 , y2 , y3 )
= 𝛼x1 y1 + 𝛼x2 y2 + 𝛼x3 y3 = 𝛼 (x1 y1 + x2 y2 + x3 y3 )
= 𝛼⟨(x1 , x2 , x3 ) , (y1 , y2 , y3 )⟩.
∴ IP4 is satisfied.
Also, for any x = (x1 , x2 , x3 ) and y = (y1 , y2 , y3 ) in ℝ3 ,

⟨x, y⟩ = x1 y1 + x2 y2 + x3 y3 = y1 x1 + y2 x2 + y3 x3 = ⟨y, x⟩.

∴ IP5 is satisfied.

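E2) For $x, y, z \in \mathbb{C}$ and $\alpha \in \mathbb{C}$ we have

$\langle x + y, z\rangle = (x + y)\bar{z} = x\bar{z} + y\bar{z} = \langle x, z\rangle + \langle y, z\rangle,$

$\langle \alpha x, y\rangle = (\alpha x)\bar{y} = \alpha(x\bar{y}) = \alpha\langle x, y\rangle,$

$\overline{\langle x, y\rangle} = \overline{x\bar{y}} = \bar{x}y = y\bar{x} = \langle y, x\rangle.$

∴ ⟨ , ⟩ satisfies IP3, IP4 and IP5.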

E3) a) ⟨u, v⟩ = 2 − 2 = 0
b) ⟨u, v⟩ = 2 + 2 + 9 = 13
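E4) Let $f, g, h \in V$ and $\alpha \in \mathbb{C}$. Then

$$\langle f, f\rangle = \sum_{i=1}^{n} f(x_i)\overline{f(x_i)} = \sum_{i=1}^{n} |f(x_i)|^2 \ge 0.$$

$$\langle f, f\rangle = 0 \Leftrightarrow f(x_i) = 0\ \ \forall\ i = 1, \dots, n \Leftrightarrow f \text{ is the zero function.}$$

$$\begin{aligned}
\langle f + g, h\rangle &= \sum_{i=1}^{n} (f + g)(x_i)\overline{h(x_i)} = \sum_{i=1}^{n} \{f(x_i) + g(x_i)\}\overline{h(x_i)} \\
&= \sum_{i=1}^{n} f(x_i)\overline{h(x_i)} + \sum_{i=1}^{n} g(x_i)\overline{h(x_i)} = \langle f, h\rangle + \langle g, h\rangle.
\end{aligned}$$

$$\langle \alpha f, g\rangle = \sum_{i=1}^{n} (\alpha f)(x_i)\overline{g(x_i)} = \sum_{i=1}^{n} \alpha f(x_i)\overline{g(x_i)} = \alpha \sum_{i=1}^{n} f(x_i)\overline{g(x_i)} = \alpha\langle f, g\rangle.$$

$$\overline{\langle f, g\rangle} = \overline{\sum_{i=1}^{n} f(x_i)\overline{g(x_i)}} = \sum_{i=1}^{n} \overline{f(x_i)}\, g(x_i) = \sum_{i=1}^{n} g(x_i)\overline{f(x_i)} = \langle g, f\rangle.$$

∴ (V, ⟨ , ⟩) is an inner product space.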

E5) a) ⟨u, v⟩ = (2) (−i) + (−i) (2) = −4i


b) ⟨u, v⟩ = (1 + i) (−i) + i(1 + i) + (1 − i) (1 − i) = −2i
E6) $\langle \alpha f_1, f_2\rangle = \int_a^b (\alpha f_1(x))\, f_2(x)\, dx = \alpha \int_a^b f_1(x)\, f_2(x)\, dx = \alpha\langle f_1, f_2\rangle$
∴ IP4 is satisfied.
Also, for any f1 , f2 ∈ C[a, b],
$\langle f_1, f_2\rangle = \int_a^b f_1(x)\, f_2(x)\, dx = \int_a^b f_2(x)\, f_1(x)\, dx = \langle f_2, f_1\rangle$

∴ IP5 is satisfied.

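E7) b) $\langle x, \alpha y + \mu z\rangle = \overline{\langle \alpha y + \mu z, x\rangle}$, by IP5
$= \overline{\alpha\langle y, x\rangle + \mu\langle z, x\rangle}$, by Theorem 1(a)
$= \bar{\alpha}\,\overline{\langle y, x\rangle} + \bar{\mu}\,\overline{\langle z, x\rangle}$
$= \bar{\alpha}\langle x, y\rangle + \bar{\mu}\langle x, z\rangle$, by IP5.
∴ (b) is proved.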

d) ⟨x − y, z⟩ = ⟨x + (−1)y, z⟩ = ⟨x, z⟩ + (−1)⟨y, z⟩, by Theorem 1(a).


= ⟨x, z⟩ − ⟨y, z⟩.

e) ⟨x, z⟩ = ⟨y, z⟩ ∀ z ∈ V
⇒ ⟨x − y, z⟩ = 0 ∀ z ∈ V, by (d) above.
⇒ ⟨x − y, x − y⟩ = 0, taking z = x − y, in particular.
⇒ x − y = 0, by IP2.
⇒ x = y.

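E8) Let $u = \frac{x}{\|x\|}$. Then
$$\langle u, u\rangle = \left\langle \frac{x}{\|x\|}, \frac{x}{\|x\|}\right\rangle = \frac{1}{\|x\|^2}\langle x, x\rangle = \frac{1}{\|x\|^2}\|x\|^2 = 1.$$

∴ ‖u‖ = √⟨u, u⟩ = 1.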

E9) a) $\|u\| = \sqrt{\langle u, u\rangle} = \sqrt{1 + 1 + 9} = \sqrt{11}$

b) $\|u\| = \sqrt{\langle u, u\rangle} = \sqrt{1(1) + (-1)(-1) + (1 - i)(1 + i)} = \sqrt{1 + 1 + 1^2 - i^2} = \sqrt{4} = 2.$
E10) In the situation of Example 1 we get |u ⋅ v| ≤ ‖u‖ ‖v‖ for u, v ∈ ℝ³. Thus,

$$|x_1 y_1 + x_2 y_2 + x_3 y_3| \le \sqrt{x_1^2 + x_2^2 + x_3^2}\ \sqrt{y_1^2 + y_2^2 + y_3^2} \quad \forall\ (x_1, x_2, x_3), (y_1, y_2, y_3) \in \mathbb{R}^3 .$$

In the situation of Example 3 we get

$$|x\bar{y}| \le |x|\,|y| \quad \forall\ x, y \in \mathbb{C}.$$

Theorem 2 and Example 4 give us

$$\left|\sum_{i=1}^{n} a_i \bar{b}_i\right| \le \sqrt{\sum_{i=1}^{n} |a_i|^2}\ \sqrt{\sum_{i=1}^{n} |b_i|^2},$$

where $\sum_{i=1}^{n} a_i e_i$ and $\sum_{i=1}^{n} b_i e_i$ are elements of V.

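E11) $\|y\| = \sqrt{\langle \alpha x, \alpha x\rangle} = \sqrt{|\alpha|^2 \langle x, x\rangle} = |\alpha|\,\|x\|.$

$\therefore \|x\|\,\|y\| = |\alpha|\,\|x\|^2 = |\alpha|\langle x, x\rangle = |\bar{\alpha}\langle x, x\rangle| = |\langle x, \alpha x\rangle| = |\langle x, y\rangle|.$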

E12) ‖y + (x − y)‖ ≤ ‖y‖ + ‖x − y‖

⇒‖x‖ ≤ ‖y‖ + ‖x − y‖
⇒‖x‖ − ‖y‖ ≤ ‖x − y‖

Similarly, ‖y‖ − ‖x‖ ≤ ‖y − x‖ = ‖x − y‖, since ‖x‖ = ‖−x‖.


∴ |‖x‖ − ‖y‖| ≤ ‖x − y‖, since |𝛼| = 𝛼 or −𝛼 for any 𝛼 ∈ ℝ.

E13) a) Use Theorem 1(c).


b) Since ⟨x, x⟩ = 0 ⇔ x = 0, (b) is true.
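c) $x \perp y \Rightarrow \langle x, y\rangle = 0 \Rightarrow \overline{\langle y, x\rangle} = 0 \Rightarrow \langle y, x\rangle = 0 \Rightarrow y \perp x.$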

d) x ⟂ y ⇒⟨x, y⟩ = 0 ⇒ 𝛼⟨x, y⟩ = 0 ∀𝛼 ∈ F
⇒ ⟨𝛼x, y⟩ = 0 ∀𝛼 ∈ F ⇒ 𝛼x ⟂ y ∀ 𝛼 ∈ F.
E14) If x ⟂ y, then ⟨x, y⟩ = 0. Hence,

‖x + y‖2 = ⟨x, x⟩ + ⟨y, y⟩ = ‖x‖2 + ‖y‖2 .

E15) For i ≠ j, ⟨xi , xj ⟩ = 1.0 + 0.1 = 0.


Also, ∀ i = 0, … , n, ⟨xi , xi ⟩ = 1.1 = 1.
∴, the given set is orthonormal.

E16) $\langle x, y\rangle = \Big\langle \sum_i x_i e_i, \sum_j y_j e_j\Big\rangle = \sum_i \sum_j x_i \bar{y}_j \langle e_i, e_j\rangle = \sum_i x_i \bar{y}_i$, since ⟨ei , ei ⟩ = 1 ∀ i = 1, … , n and ⟨ei , ej ⟩ = 0 for i ≠ j.
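E17) a) Here v1 = (1, 0, 3), v2 = (2, 1, 1).
We want the set $\left\{\frac{w_1}{\|w_1\|}, \frac{w_2}{\|w_2\|}\right\}$, where $w_1 = v_1$ and $w_2 = v_2 - \frac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle}\, w_1$.
Now, ⟨v2 , w1 ⟩ = ⟨v2 , v1 ⟩ = 2 + 0 + 3 = 5.
Also ⟨w1 , w1 ⟩ = ⟨v1 , v1 ⟩ = 10, so that ‖w1 ‖ = √10.
$\therefore w_2 = (2, 1, 1) - \frac{5}{10}(1, 0, 3) = \left(\frac{3}{2}, 1, -\frac{1}{2}\right)$

$\therefore \|w_2\| = \sqrt{\frac{9}{4} + 1 + \frac{1}{4}} = \sqrt{\frac{7}{2}}$

$\therefore \left\{\frac{1}{\sqrt{10}}(1, 0, 3),\ \sqrt{\frac{2}{7}}\left(\frac{3}{2}, 1, -\frac{1}{2}\right)\right\}$ is the required orthonormal basis.

b) w1 = (1, 0, 2, 0)
$w_2 = (1, 2, 3, 1) - \frac{7}{5}(1, 0, 2, 0) = \left(-\frac{2}{5}, 2, \frac{1}{5}, 1\right)$,
$\|w_1\| = \sqrt{5}$, $\|w_2\| = \sqrt{\frac{26}{5}}$.
Then $\left\{\frac{w_1}{\|w_1\|}, \frac{w_2}{\|w_2\|}\right\}$ is the required basis.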
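The arithmetic in E17 can be checked with a short sketch of the Gram-Schmidt process for real vectors under the standard inner product. The function names `dot` and `gram_schmidt` are our own. The sketch returns the orthogonal vectors w1, w2, … in exact fractions and leaves the final division by the norms to you.

```python
from fractions import Fraction as Fr

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return orthogonal vectors w_1, ..., w_k spanning the same subspace.

    Assumes the input vectors are linearly independent and have real entries.
    """
    ws = []
    for v in vectors:
        w = [Fr(c) for c in v]
        for u in ws:
            coeff = dot(v, u) / dot(u, u)   # <v, u> / <u, u>
            w = [a - coeff * b for a, b in zip(w, u)]
        ws.append(w)
    return ws

part_a = gram_schmidt([(1, 0, 3), (2, 1, 1)])
part_b = gram_schmidt([(1, 0, 2, 0), (1, 2, 3, 1)])

print(part_a[1], dot(part_a[1], part_a[1]))   # w2 and ||w2||^2 for part (a)
print(part_b[1], dot(part_b[1], part_b[1]))   # w2 and ||w2||^2 for part (b)
```

The squared norms that come out are 7/2 for part (a) and 26/5 for part (b), in agreement with the values of ‖w2‖ obtained above.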

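E18) Theorem 8 says that
$$\|y\|^2 - \sum_{i=1}^{n} |\langle y, x_i\rangle|^2 = \left\| y - \sum_{i=1}^{n} \langle y, x_i\rangle x_i \right\|^2 \ge 0,$$
since the square of a norm is non-negative. Hence $\sum_{i=1}^{n} |\langle y, x_i\rangle|^2 \le \|y\|^2$.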
UNIT 15

HERMITIAN AND UNITARY OPERATORS

Structure
15.1 Introduction
     Objectives
15.2 Linear Functionals of Inner Product Spaces
15.3 Adjoint of an Operator
15.4 Some Special Operators
     Self-adjoint Operators
     Unitary Operators
15.5 Hermitian and Unitary Matrices
     Matrix of the Adjoint Operator
     Hermitian Matrix
     Unitary (Orthogonal) Matrix
15.6 Summary
15.7 Solutions/Answers

15.1 INTRODUCTION
In the preceding unit we discussed general properties of inner product spaces.
In this unit we will show that we can precisely determine the nature of linear
functionals defined over inner product spaces.

We, then, discuss the adjoint of an operator. The behaviour of this adjoint leads
us to the concepts of self-adjoint operators and unitary operators. As usual, we
will discuss their matrix analogues also. This will entail studying the definitions
and properties of Hermitian, unitary and orthogonal matrices.

Regarding the notation in this unit, F will always denote ℝ or ℂ. And, unless
otherwise mentioned, the inner product of ℝⁿ or ℂⁿ will be the standard inner
product. Also, if T is a function acting on x, then we will often write Tx for T(x),
for our convenience.

Before reading this unit, we advise you to look up and recall the definitions of a
linear functional and a dual space.
Objectives
After studying this unit, you should be able to:

• represent a linear functional on an inner product space as an inner product with a unique vector;

• prove the existence of a unique adjoint of any given linear operator on an inner product space;

• identify self-adjoint, Hermitian, unitary and orthogonal linear operators;

• establish the relationship between self-adjoint (or unitary) operators and Hermitian (or unitary) matrices;

• prove and use the fact that a matrix is unitary iff its rows (or columns) form an orthonormal set of vectors;

• use the fact that any real symmetric matrix is orthogonally similar to a diagonal matrix.

15.2 LINEAR FUNCTIONALS OF INNER


PRODUCT SPACES

If V is a non-zero inner product space over F, then ∃ 0 ≠ x ∈ V. Consider the linear functional f on V defined by

f(v) = ⟨v, x⟩ ∀ v ∈ V.

Then f(x) = ⟨x, x⟩ ≠ 0, since x ≠ 0. Therefore, f ≠ 0. Also, f ∈ V∗ . Therefore, V∗ ≠ {0}. But, what do the elements of V∗ look like?

Before going into the detailed study of such functions let us consider an
example.

Example 1: Consider V = ℝ2 . Take y = (1, 2) ∈ ℝ2 and define, for any

x = (x1 , x2 ) ∈ ℝ2 , f ∶ ℝ2 → ℝ

by f(x) = ⟨x, y⟩ = x1 + 2x2 . Show that f is a linear functional on ℝ2 .

Solution: Firstly,

f [(x1 , x2 ) + (y1 , y2 )] = (x1 + y1 ) + 2(x2 + y2 ) = (x1 + 2x2 ) + (y1 + 2y2 ) = f(x1 , x2 ) + f(y1 , y2 ) ∀ (x1 , x2 ), (y1 , y2 ) ∈ ℝ2 .

Also, for any a ∈ ℝ,

f (a(x1 , x2 )) = af(x1 , x2 ), ∀ (x1 , x2 ) ∈ ℝ2 .

Therefore, f is a linear functional on ℝ2 .


∗∗∗

Try the following exercise on the same lines as Example 1.



E1) Fix y ∈ ℝ2 . Show that the function fy ∶ ℝ2 → ℝ ∶ fy (x) = ⟨x, y⟩ is a linear


functional on ℝ2 .

Let us now consider any inner product space (V, ⟨ , ⟩) . We choose a vector
z ∈ V and fix it. With the help of this vector we can obtain a linear functional
f ∈ V∗ = L (V, F) in the following way:

Define f ∶ V → F by f(x) = ⟨x, z⟩ ∀ x ∈ V. Clearly f is a well-defined map, and

f(x + y) = ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩


= f(x) + f(y).

Also f(𝛼x) = ⟨𝛼x, z⟩ = 𝛼⟨x, z⟩ = 𝛼f(x) for any 𝛼 ∈ F.

Hence, f is a linear functional on V. (To show the relationship of f with z, we


sometimes denote f by fz .)

Thus, we have succeeded in proving the following result.

Theorem 1: If (V, ⟨ , ⟩) is an inner product space over F (F = ℝ or ℂ) and z is a


given vector of V, then the map

fz ∶ V → F ∶ fz (x) = ⟨x, z⟩,

is a linear functional on V.

Theorem 1 is true for any finite-dimensional or infinite-dimensional inner


product space. What is interesting about finite-dimensional inner product
spaces is that the converse of this result is also true. We now proceed to state
and prove it.

Theorem 2: If (V, ⟨ , ⟩) is an inner product space over F with dimension n, and f


is a linear functional defined on V, then ∃ a unique element z in V such that
f(x) = ⟨x, z⟩ for x ∈ V, that is, f = fz .

Proof: As dim V = n, there exists a finite orthonormal basis for V. Let this basis be B = {e1 , e2 , … , en }. Then

$$\langle e_i, e_j\rangle = \begin{cases} 0, & i \ne j \\ 1, & i = j. \end{cases}$$

Let f(ei ) = ai (i = 1, … , n).

Now, any x ∈ V can be written as $x = \sum_{i=1}^{n} b_i e_i$, $b_i \in F$.

Then
$$f(x) = f\Big(\sum_{i=1}^{n} b_i e_i\Big) = \sum_{i=1}^{n} b_i f(e_i) = \sum_{i=1}^{n} b_i a_i \qquad \ldots(1)$$

Now consider the vector z ∈ V such that $z = \sum_{i=1}^{n} \bar{a}_i e_i$.
As each ai is known to us, z is a known vector of V. Also,
$$\begin{aligned}
\langle x, z\rangle &= \Big\langle \sum_{i=1}^{n} b_i e_i, \sum_{j=1}^{n} \bar{a}_j e_j\Big\rangle \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \langle b_i e_i, \bar{a}_j e_j\rangle \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} b_i a_j \langle e_i, e_j\rangle \\
&= \sum_{i=1}^{n} b_i a_i, \text{ since B is an orthonormal set} \\
&= f(x), \text{ from (1) above.}
\end{aligned}$$

Thus, f(x) = ⟨x, z⟩∀ x ∈ V.

Suppose there also exists z1 ∈ V such that f(x) = ⟨x, z1 ⟩∀ x ∈ V.

Then,

⟨x, z⟩ − ⟨x, z1 ⟩ = 0 for all x ∈ V, i.e.

⟨x, z − z1 ⟩ = 0 for all x ∈ V.

Taking x = z − z1 , in particular, we get ⟨z − z1 , z − z1 ⟩ = 0. Hence, we obtain z − z1 = 0, i.e., z = z1 .

Thus, there exists a unique z ∈ V such that

f(x) = ⟨x, z⟩ ∀ x ∈ V.

We can also represent f in Theorem 2 by f = ⟨•, z⟩. Thus, in Example 1, f = ⟨•, (1, 2)⟩.
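The construction in the proof of Theorem 2 can be tried out numerically. The sketch below works in ℂ³ with the standard inner product and the standard basis. The functional `f` is a made-up example, not one taken from this unit. Following the proof, the representing vector is built from the conjugates of the values f(ei).

```python
def inner(u, v):
    """Standard inner product on C^n: sum of u_k * conj(v_k)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def f(x):
    """A sample linear functional on C^3 (coefficients chosen only for illustration)."""
    return (2 + 1j) * x[0] - 3 * x[1] + 1j * x[2]

basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # standard orthonormal basis e1, e2, e3

# z = sum of conj(f(e_i)) e_i, so the i-th coordinate of z is conj(f(e_i)).
z = [complex(f(e)).conjugate() for e in basis]

x = [1 - 1j, 2j, 4]          # any test vector
print(z)
print(f(x), inner(x, z))     # the two values coincide
```

Note that the conjugates matter only when F = ℂ. For a real inner product space, z is simply (f(e1), …, f(en)), as in Example 1.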

See if Theorem 2 can help you in solving the following exercise.

E2) Define f ∶ ℂ3 → ℂ by f (z1 , z2 , z3 ) = (z1 + z2 + z3 )/3.
Find the vector y ∈ ℂ3 such that f = ⟨•, y⟩.

Let us now use linear functionals to define the adjoint of a linear transformation
from V to V.

15.3 ADJOINT OF AN OPERATOR


In this section we will obtain a linear transformation from V to V, which
corresponds to a given linear operator T ∶ V → V.

Let V be a finite-dimensional inner product space over F, and let T ∶ V → V be a linear operator. Choose any vector y ∈ V. Then, keeping T and y fixed, we can define
a map f ∶ V → F by

f(x) = ⟨Tx, y⟩ ∀ x ∈ V.

Try the following exercise to verify that f is a linear functional.

E3) Show that f is a linear functional, i.e. f ∈ V∗ .

By E3 and Theorem 2, ∃ a unique element z ∈ V such that f = ⟨•, z⟩, that is,
f(x) = ⟨x, z⟩ ∀ x ∈ V, that is, ⟨Tx, y⟩ = ⟨x, z⟩ ∀ x ∈ V.

Note that the choice of this vector z depends upon the fixed vector y. This is
because if the fixed vector y is replaced by another vector y1 , we shall get
another linear functional f1 , and f1 will be represented as an inner product with
some other vector z1 . Of course, you can see that f depends on T also!

So, for each y ∈ V, ∃ a unique vector z ∈ V, that depends only upon y, if we


keep T fixed. Therefore, we get a function

T∗ ∶ V → V ∶ T∗ (y) = z.

Then, we can write

⟨Tx, y⟩ = ⟨x, T∗ y⟩ for all x, y ∈ V (since both are equal to ⟨x, z⟩).

We will look at some characteristics of the map T∗ in the following two


theorems. Henceforth, unless otherwise mentioned, we will only deal with
finite-dimensional inner product spaces.

Theorem 3: If (V, ⟨ , ⟩) is an inner product space over the field F and T ∈ A(V),
then T∗ is a linear transformation, i.e. T∗ ∈ A(V).

Proof: Choose y1 , y2 ∈ V. Then, for any x ∈ V,

⟨x, T∗ (y1 + y2 )⟩ = ⟨Tx, y1 + y2 ⟩, by definition.


= ⟨Tx, y1 ⟩ + ⟨Tx, y2 ⟩
= ⟨x, T∗ y1 ⟩ + ⟨x, T∗ y2 ⟩, by definition.
= ⟨x, T∗ y1 + T∗ y2 ⟩

This is true for any x ∈ V.


Therefore, T∗ (y1 + y2 ) = T∗ (y1 ) + T∗ (y2 ) ∀ y1 , y2 ∈ V, by Unit 14 (Theorem 1).
Again, choose y ∈ V. Then, for any x ∈ V, and 𝛼 ∈ F,

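$\langle x, T^*(\alpha y)\rangle = \langle Tx, \alpha y\rangle = \bar{\alpha}\langle Tx, y\rangle = \bar{\alpha}\langle x, T^* y\rangle = \langle x, \alpha T^* y\rangle,$

which implies that T∗ (𝛼y) = 𝛼T∗ (y).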

Thus, we have shown that T∗ is linear.

So, we have shown that given T ∈ A(V) ∃ T∗ ∈ A(V), such that ⟨Tx, y⟩ = ⟨x, T∗ y⟩ for
x, y ∈ V. Now, we will show that T∗ is unique. ■

Theorem 4: If (V, ⟨ , ⟩) is an inner product space over F and T ∈ A(V), then ∃ a


unique T∗ ∈ A(V) for which ⟨Tx, y⟩ = ⟨x, T∗ y⟩ for all x, y ∈ V.
Proof: Suppose T∗ is not unique. Then there will exist at least two operators $T_1^*, T_2^* \in A(V)$ such that $\langle Tx, y\rangle = \langle x, T_1^* y\rangle$ and $\langle Tx, y\rangle = \langle x, T_2^* y\rangle$ for all x, y ∈ V. This will mean that, ∀ x, y ∈ V,
$\langle x, T_1^* y\rangle = \langle x, T_2^* y\rangle$. ∴ $\langle x, T_1^*(y) - T_2^*(y)\rangle = 0$ ∀ x ∈ V.
∴ $T_1^* y = T_2^* y$ for all y ∈ V. This shows that $T_1^* = T_2^*$. ■

Theorem 4 allows us to give the following definition.

Definition 1: If (V, ⟨ , ⟩) is an inner product space over the field F and T ∈ A(V),
then the unique operator T∗ ∈ A(V) for which ⟨Tx, y⟩ = ⟨x, T∗ y⟩ holds for all
x, y, ∈ V, is called the adjoint of the operator T. (We also call T∗ the adjoint
operator.)

Let us look at some examples.

Example 2: Let Pn (ℂ) denote the vector space of all polynomials of degree ≤ n with complex coefficients. Show that we can define an inner product on Pn (ℂ) = Pn as follows:
$$\langle f, g\rangle = \sum_{i=0}^{n} a_i \bar{b}_i, \text{ where } f = a_0 + a_1 t + \cdots + a_n t^n \text{ and } g = b_0 + b_1 t + \cdots + b_n t^n .$$
Find T∗ for the operator T defined by (Tf)(t) = af(t), a ∈ ℂ.

Solution: Take B = {1, t, t² , … , tⁿ } in Example 4 of Unit 14. Then you can see that ⟨ , ⟩, defined above, is an inner product. Now, for f, g ∈ Pn ,

$\langle Tf, g\rangle = \langle af, g\rangle = a\langle f, g\rangle = \langle f, \bar{a}g\rangle.$

∴ $\langle f, T^* g\rangle = \langle f, \bar{a}g\rangle$ ∀ f, g ∈ Pn . ∴ $T^* g = \bar{a}g$ ∀ g ∈ Pn .

∴ we get $T^* : P_n \to P_n : T^*(f) = \bar{a}f$.
∗∗∗
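Example 3: Find D∗ for the differential operator D, defined on Pn by Df(t) = f′(t).

Solution: For $f = a_0 + a_1 t + \cdots + a_n t^n$ and $g = b_0 + b_1 t + \cdots + b_n t^n$, we have

$$\begin{aligned}
\langle Df, g\rangle &= \langle f', g\rangle = \langle a_1 + 2a_2 t + \cdots + n a_n t^{n-1}, g\rangle \\
&= a_1 \bar{b}_0 + 2a_2 \bar{b}_1 + \cdots + n a_n \bar{b}_{n-1} \\
&= \langle a_0 + a_1 t + \cdots + a_n t^n,\ b_0 t + 2b_1 t^2 + \cdots + n b_{n-1} t^n\rangle .
\end{aligned}$$

∴ $D^*(b_0 + b_1 t + \cdots + b_n t^n) = b_0 t + 2b_1 t^2 + \cdots + n b_{n-1} t^n = t\,(b_0 + 2b_1 t + \cdots + n b_{n-1} t^{n-1})$.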

∗∗∗
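The formula for D∗ obtained in Example 3 can be tested on coefficient lists. In this sketch a polynomial a0 + a1 t + ⋯ + an tⁿ is stored as `[a0, a1, ..., an]`, and the inner product is assumed to be ⟨f, g⟩ = ∑ ai b̄i as in Example 2. The function names and the sample polynomials are ours, chosen for illustration.

```python
def inner(f, g):
    """<f, g> = sum of a_i * conj(b_i) over the coefficients of f and g."""
    return sum(a * complex(b).conjugate() for a, b in zip(f, g))

def D(f):
    """Derivative: [a0, a1, ..., an] -> [a1, 2*a2, ..., n*an, 0]."""
    return [k * f[k] for k in range(1, len(f))] + [0]

def D_star(g):
    """Adjoint from Example 3: [b0, ..., bn] -> [0, b0, 2*b1, ..., n*b_(n-1)]."""
    return [0] + [k * g[k - 1] for k in range(1, len(g))]

f = [1 + 1j, 2, -1j, 3]      # sample polynomials of degree 3
g = [2, 1j, 1 - 1j, 5]

left = inner(D(f), g)        # <Df, g>
right = inner(f, D_star(g))  # <f, D*g>
print(left, right)
```

The two printed values are equal, as the defining relation ⟨Df, g⟩ = ⟨f, D∗g⟩ requires. Observe that D∗ is not the derivative again: it shifts each coefficient up by one degree and multiplies it by the new degree.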

Try the following exercise now.

E4) Obtain the adjoint of the operator T ∶ ℝn → ℝn ∶ T (x1 , … , xn ) = (x1 , 0, … , 0) .

Let us now look at some basic properties of the adjoint operator.

Theorem 5: Let (V, ⟨ , ⟩) be an inner product space over F. Then, for S, T ∈ A(V),
the following relations hold.
a) I∗ = I, I being the identity operator.

b) (S + T)∗ = S∗ + T∗

c) $(\alpha T)^* = \bar{\alpha}\, T^*$, for any 𝛼 ∈ F.

d) ⟨T∗ (y), x⟩ = ⟨y, T(x)⟩, for all x, y ∈ V.

e) T∗∗ = T (T∗∗ means (T∗ )∗ )

f) T∗ T = 0 iff T = 0.

g) (T ∘ S)∗ = S∗ ∘ T∗ .

Proof: We will prove (e), (f) and (g) here, assuming (a) to (d). We leave the
proof of (a)-(d) to you (see E5.)
e) Choose any two vectors x, y ∈ V. Then,

⟨T∗∗ (x), y⟩ = ⟨(T∗ )∗ (x), y⟩ = ⟨x, T∗ (y)⟩, by (d.)


= ⟨T(x), y⟩, by definition.

This is true for any y ∈ V. Therefore, T∗∗ (x) = T(x) ∀ x ∈ V. Hence, T∗∗ = T.

f) If T∗ T = 0, then, for each x ∈ V, T∗ T(x) = 0.


Hence, ⟨T∗ T(x), y⟩ = 0 for any y ∈ V.
Thus, for y = x we get 0 = ⟨T∗ T(x), x⟩ = ⟨T∗ (T(x)), x⟩ = ⟨T(x), T(x)⟩, by (d)
⇒ T(x) = 0, by IP2 (Unit 14).
Therefore, T(x) = 0 for each x ∈ V. Hence T = 0.
Conversely, if T = 0 then

T(x) = 0 ∀ x ∈ V
⇒ T∗ T(x) = 0 ∀ x ∈ V
⇒ T∗ T = 0.

g) For any x, y ∈ V,

⟨(T ∘ S)∗ (x), y⟩ = ⟨x, (T ∘ S)(y)⟩, by (d)


= ⟨x, T(S(y))⟩ = ⟨T∗ (x), S(y)⟩, by (d).
= ⟨S∗ (T∗ (x)), y⟩, by (d).
= ⟨(S∗ ∘ T∗ )(x), y⟩

∴ (T ∘ S)∗ (x) = (S∗ ∘ T∗ )(x) for any x ∈ V. Hence, (T ∘ S)∗ = S∗ ∘ T∗ .



To complete the proof of this theorem, try E5.

E5) Prove (a)-(d) of Theorem 5.

Now, look closely at (e) and (f) of Theorem 5. They tell us that for any T ∈ A(V)

TT∗ = 0 ⇔ T∗∗ T∗ = 0, since T∗∗ = T.


⇔ T∗ = 0, by (f) applied to T∗ .

Try the following exercises now.

E6) Show that if T = 0, then so is T∗ .

E7) Show that the map 𝜙 ∶ A(V) → A(V) ∶ 𝜙(T) = T∗ is sesquilinear, that is,
$\phi(S + T) = \phi(S) + \phi(T)$, and $\phi(\alpha S) = \bar{\alpha}\,\phi(S)$, ∀ S, T ∈ A(V) and 𝛼 ∈ F.

E8) Using Theorem 5, prove that if T ∈ A(V) and T−1 exists then (T−1 )∗ = (T∗ )−1 .

Now that you are familiar with the adjoint operator, let us look at some
operators whose adjoints have special properties.

15.4 SOME SPECIAL OPERATORS


In this section we will define two types of transformations. They are classified
according to the way their adjoints behave. The two types are self-adjoint
operators and unitary operators.

15.4.1 Self-adjoint Operators


As the name indicates, the members of this class will consist of operators that
are the same as their adjoints. We make a formal definition.

Definition 2: Let (V, ⟨ , ⟩) be an inner product space over F and T ∈ A(V). T is


said to be self-adjoint (or Hermitian) if T = T∗ .

Thus, if T is self-adjoint, then

$\langle Tx, y\rangle = \langle x, Ty\rangle = \overline{\langle Ty, x\rangle}$

for any x, y ∈ V.

If V is a real inner product space and T is self-adjoint, then the above condition
is reduced to ⟨Tx, y⟩ = ⟨Ty, x⟩ (since $\bar{z} = z$ ∀ z ∈ ℝ).
In this case T is said to be symmetric.

Can you think of an example of a self-adjoint operator? Theorem 5 tells us that


the identity operator is self-adjoint.

The following exercises deal with self-adjoint operators.

E9) Define a function f ∶ ℝ2 → ℝ2 ∶ f(x, y) = (y, x). Show that f is self-adjoint.

E10) If S, T ∈ A(V) are self-adjoint, then show that S ∘ T is self-adjoint iff


S ∘ T = T ∘ S, i.e. S and T commute. (Use Theorem 5.)

In Unit 12, you have already studied about the eigenvalues and eigenvectors of
operators. Let us see what they look like in the case of self-adjoint operators.

Theorem 6: Let (V, ⟨ , ⟩) be an inner product space and T ∈ A(V) be self-adjoint.


Then the eigenvalues of T are all real.
Proof: Let 𝛼 be an eigenvalue of T. Then ∃v ∈ V, v ≠ 0, such that T(v) = 𝛼v. We
want to show that 𝛼 ∈ ℝ. Now,

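$$\begin{aligned}
\alpha\langle v, v\rangle &= \langle \alpha v, v\rangle = \langle Tv, v\rangle \\
&= \langle v, T^* v\rangle = \langle v, Tv\rangle, \text{ since } T = T^* \\
&= \langle v, \alpha v\rangle = \bar{\alpha}\langle v, v\rangle .
\end{aligned}$$

Since ⟨v, v⟩ ≠ 0, we get $\alpha = \bar{\alpha}$. This means that 𝛼 ∈ ℝ. ■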
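As a quick numerical illustration of Theorem 6, the sketch below computes the eigenvalues of a 2 × 2 matrix from its characteristic polynomial λ² − (trace)λ + det = 0. The matrix `A` is a sample Hermitian matrix, viewed as an operator on ℂ² with the standard inner product. The function name `eigenvalues_2x2` is our own.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

# A equals its own conjugate transpose, so it is Hermitian.
A = [[2, 1 + 1j], [1 - 1j, 0]]
lam1, lam2 = eigenvalues_2x2(A)
print(lam1, lam2)
```

Here the trace is 2 and the determinant is −2, so the eigenvalues are 1 ± √3. Their imaginary parts are zero even though A has non-real entries.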


The following exercise tells us something about skew-Hermitian operators. (T ∈ A(V) is called skew-Hermitian if T∗ = −T.)

E11) Let V be a complex inner product space and T ∈ A(V) such that T∗ = −T.
Show that
a) iT is self-adjoint, where i = √−1.
b) the eigenvalues of T are purely imaginary numbers or 0.

We will now prove a useful result about self-adjoint operators.

Theorem 7: Let (V, ⟨ , ⟩) be an inner product space and T ∈ A(V) be self-adjoint.


Then T = 0 iff ⟨Tx, x⟩ = 0 ∀ x ∈ V.

Proof: For any operator T,

T = 0 ⇒ Tx = 0 ∀ x ∈ V ⇒ ⟨Tx, x⟩ = 0 ∀ x ∈ V.

Conversely, assume that ⟨Tx, x⟩ = 0 ∀ x ∈ V.


Then ⟨T(x + y), x + y⟩ = 0 ∀ x, y ∈ V.

⇒ ⟨Tx, y⟩ + ⟨Ty, x⟩ = 0 ∀ x, y ∈ V, since ⟨Tx, x⟩ = 0 = ⟨Ty, y⟩. …(2)

⇒ ⟨Tx, y⟩ + ⟨y, Tx⟩ = 0 ∀ x, y ∈ V, since T = T∗ .

⇒ $\langle Tx, y\rangle + \overline{\langle Tx, y\rangle} = 0$ ∀ x, y ∈ V.
⇒ Re⟨Tx, y⟩ = 0 ∀ x, y ∈ V.

Now two cases arise: F = ℝ or F = ℂ. If F = ℝ, then ⟨Tx, y⟩ = Re⟨Tx, y⟩ = 0 ∀ x, y ∈ V.
∴ T = 0. If F = ℂ, then ⟨T(ix + y), ix + y⟩ = 0 ∀ x, y ∈ V gives us i⟨Tx, y⟩ − i⟨Ty, x⟩ = 0, that is,

⟨Tx, y⟩ − ⟨Ty, x⟩ = 0 ∀ x, y ∈ V.

This, with (2), gives us ⟨Tx, y⟩ = 0 ∀ x, y ∈ V.

∴, again, T = 0. ■

This theorem will come in useful in the next sub section, where we look at
another type of linear transformation.

15.4.2 Unitary Operators


We will now study the class of operators which satisfy the condition T∗ = T−1 .
First, a definition.
Definition 3: If (V, ⟨ , ⟩) is an inner product space over F and T ∈ A(V), then T is
called unitary if

TT∗ = I = T∗ T.

Thus, T is unitary if and only if T∗ = T−1 .

If F = ℝ, a unitary operator is also called orthogonal.

Can you think of an example of a unitary operator? Does the identity operator
satisfy the equation II∗ = I = I∗ I? Yes.
Another example is f ∶ ℝ2 → ℝ2 ∶ f(x, y) = (y, x).
From E9 you know that f = f∗ . Also

ff∗ (x1 , x2 ) = f(x2 , x1 ) = (x1 , x2 ). ∴ ff∗ = I.

Similarly f∗ f = I. ∴ f is unitary.

In both these examples you may have noticed that the operators are also
self-adjoint. The following exercise will give you an example of a unitary
operator which is not self-adjoint.

E12) Show that the operator

T ∶ ℝ3 → ℝ3 ∶ T (x1 , x2 , x3 ) = (x3 , x1 , x2 )

is not self-adjoint, but it is unitary.


(Hint: Show that T∗ = T2 and T3 = I.)

We will now prove a theorem that shows the utility of a unitary (orthogonal)
operator.

Theorem 8: If (V, ⟨ , ⟩) is an inner product space over F and T ∈ A(V), then the
following conditions are equivalent.

a) T∗ T = I.

b) ⟨Tx, Ty⟩ = ⟨x, y⟩ for all x, y ∈ V.

c) ‖Tx‖ = ‖x‖ for all x ∈ V.

Proof: We shall prove (a)⇒(b)⇒(c)⇒(a). This will show that all three
statements are equivalent.

(a) ⇒(b): Assume (a). Then, for any x, y ∈ V, ⟨x, y⟩ = ⟨Ix, y⟩ = ⟨T∗ Tx, y⟩ = ⟨Tx, Ty⟩.

Thus (b) holds.

(b) ⇒(c): If (b) holds for all x, y ∈ V, then it also holds when x = y. This means
that, ∀ x ∈ V,

⟨Tx, Tx⟩ = ⟨x, x⟩ or ‖Tx‖² = ‖x‖².

∴ ‖Tx‖ = ‖x‖ ∀ x ∈ V. Thus, (c) holds.
(c) ⇒(a): If (c) holds, then
⟨Tx, Tx⟩ = ⟨x, x⟩ for all x ∈ V.

⇒⟨T∗ Tx , x⟩ = ⟨x, x⟩ for all x ∈ V.


⇒⟨T∗ Tx , x⟩ − ⟨x, x⟩ = 0 for all x ∈ V.
⇒⟨(T∗ T − I) x, x⟩ = 0 for all x ∈ V.
⇒T∗ T − I = 0 (by Theorem 7, since T∗ T − I is self-adjoint)
⇒T∗ T = I, which shows that (a) holds.

Remark 1: Theorem 8 says that T is a unitary operator iff

i) ⟨Tx, Ty⟩ = ⟨x, y⟩ ∀ x, y ∈ V, that is, T preserves inner products.

ii) ‖Tx‖ = ‖x‖ ∀ x ∈ V, that is, T preserves the length of a vector.

You will learn about some properties of unitary operators from the following
exercises.

E13) If V is a given inner product space over ℂ and S, T ∈ A(V) are unitary
operators, show that
a) S ∘ T is a unitary operator.
b) 𝛼T is a unitary operator for 𝛼 ∈ ℂ iff |𝛼| = 1.

E14) Show that the characteristic roots of a unitary operator have absolute
value 1.

Let us now talk about the action of a unitary operator on an orthonormal basis.
From Unit 14 (Theorem 7) you know that (V, ⟨ , ⟩) has an orthonormal basis. The
following theorem characterises unitary operators in terms of their action on an
orthonormal basis.

Theorem 9: Let (V, ⟨ , ⟩) be an inner product space over F of dimension n. Then T ∈ A(V) is unitary if and only if T maps an orthonormal basis of V onto an
orthonormal basis of V.

Proof: Let {e1 , … , en } be an orthonormal basis of V. Then ⟨ei , ej ⟩ = 0 if i ≠ j and ⟨ei , ei ⟩ = 1 ∀ i = 1, … , n.

We will first show that if T is unitary then {Te1 , … , Ten } is an orthonormal basis of
V. Now, since T preserves inner products, we get ⟨Tei , Tej ⟩ = 0 for i ≠ j, and
⟨Tei , Tei ⟩ = 1 ∀ i = 1, … , n. Also, since T is invertible (in fact, T−1 = T∗ ), you know
that T maps a basis to a basis. Hence {Te1 , … , Ten } is an orthonormal basis.

Conversely, we will show that if B = {Te1 , … , Ten } is an orthonormal basis then T is unitary. For this, consider
$x = \sum_{i=1}^{n} \alpha_i e_i$, $y = \sum_{i=1}^{n} \beta_i e_i$ in V, where 𝛼i , 𝛽i ∈ F ∀ i = 1, … , n.
Then

$$\begin{aligned}
\langle Tx, Ty\rangle &= \Big\langle \sum_i \alpha_i T(e_i), \sum_j \beta_j T(e_j)\Big\rangle \\
&= \sum_i \sum_j \alpha_i \bar{\beta}_j \langle Te_i, Te_j\rangle \\
&= \sum_i \alpha_i \bar{\beta}_i, \text{ since B forms an orthonormal basis.}
\end{aligned}$$

Also, $\langle x, y\rangle = \sum_i \alpha_i \bar{\beta}_i$. Thus, ⟨Tx, Ty⟩ = ⟨x, y⟩ ∀ x, y ∈ V. Hence, by Theorem 8 we can say that T is unitary. ■
We will use Theorem 9 to solve the following example.

Example 4: Let (V, ⟨ , ⟩) be a real inner product space of dimension 2. Obtain


an orthogonal operator T ∈ A(V) such that ⟨Tx, x⟩ = 0 ∀ x ∈ V.

Solution: Let {e1 , e2 } be an orthonormal basis of V. Then, so is {e2 , −e1 } . If we


define T ∈ A(V) by T(e1 ) = e2 and T(e2 ) = −e1 , by Theorem 9 we know that T is
orthogonal. Also, ⟨Te1 , e1 ⟩ = 0 = ⟨Te2 , e2 ⟩.
Now take any x ∈ V. Then ∃ a, b ∈ F such that x = ae1 + be2 . What is ⟨Tx, x⟩? It is
⟨T (ae1 + be2 ) , ae1 + be2 ⟩
= ⟨ae2 − be1 , ae1 + be2 ⟩
= ab − ab = 0. Thus, T is the required operator.

∗∗∗

Note that this example shows us that Theorem 7 is false if T is not self-adjoint.
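You can see the operator of Example 4 in action through the following sketch. It assumes V = ℝ² with the standard inner product and takes {e1, e2} to be the standard basis, so that T(a, b) = (−b, a). These concrete choices are ours and are made only for illustration.

```python
def dot(u, v):
    """Standard inner product on R^2."""
    return sum(a * b for a, b in zip(u, v))

def T(x):
    """T(e1) = e2 and T(e2) = -e1, i.e. T(a, b) = (-b, a)."""
    a, b = x
    return (-b, a)

samples = [(1, 0), (0, 1), (3, -2), (-5, 7)]

# <Tx, x> vanishes for every sample vector although T is not the zero operator.
quadratic_values = [dot(T(x), x) for x in samples]

# T preserves lengths, as an orthogonal operator must.
norms_preserved = all(dot(T(x), T(x)) == dot(x, x) for x in samples)

# Matrix of T with respect to {e1, e2}; its columns are T(e1) and T(e2).
M = [[0, -1], [1, 0]]
is_symmetric = M == [list(row) for row in zip(*M)]

print(quadratic_values, norms_preserved, is_symmetric)
```

The matrix M is not symmetric, so T is not self-adjoint. This is exactly why ⟨Tx, x⟩ = 0 ∀ x ∈ V can hold here without forcing T = 0.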

Try the following exercise now.

E15) If T ∈ A(V) be such that T2 = I, show that T is Hermitian if and only if T is


unitary.

So far we have been discussing various kinds of operators. You may have
wondered about their matrix analogues. That is what we will discuss in the next
section.

15.5 HERMITIAN AND UNITARY MATRICES


In the previous block you have seen the inter-relationship between operators and
the matrices representing them. In this section we will show you the link between
self-adjoint operators and Hermitian matrices, and between unitary operators
and unitary matrices.

15.5.1 Matrix of the Adjoint Operator


Let (V, ⟨ , ⟩) be an inner product space over F. Given the matrix representation of
an operator T ∈ A(V), a natural question that we can ask is: what is the matrix
representation of its adjoint T∗ ?
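To answer it, let us consider an orthonormal basis B = {e1 , … , en } of V. Let [T]B = [aij ]
and [T∗ ]B = [bij ]. Then we know that

T(ei ) = a1i e1 + a2i e2 + ⋯ + ani en ∀ i = 1, … , n,

and

T∗ (ei ) = b1i e1 + ⋯ + bni en ∀ i = 1, … , n.

Now for ei , ej ∈ B, we have ⟨Tei , ej ⟩ = ⟨ei , T∗ ej ⟩

$$\Rightarrow \Big\langle \sum_{k=1}^{n} a_{ki} e_k, e_j\Big\rangle = \Big\langle e_i, \sum_{k=1}^{n} b_{kj} e_k\Big\rangle$$
$$\Rightarrow \sum_{k=1}^{n} a_{ki} \langle e_k, e_j\rangle = \sum_{k=1}^{n} \bar{b}_{kj} \langle e_i, e_k\rangle$$
$$\Rightarrow a_{ji} = \bar{b}_{ij}, \text{ since } \langle e_i, e_j\rangle = \begin{cases} 0, & \text{if } i \ne j \\ 1, & \text{if } i = j. \end{cases}$$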

Thus, we have proved the following result.

Theorem 10: Let V be an inner product space over F (F = ℝ or ℂ) of dimension n, and T ∈ A(V) have the matrix representation [aij ] with respect to a given
orthonormal basis B. Then the matrix representation of the adjoint T∗ of T with
respect to the same basis is the matrix [bij ], where $b_{ij} = \bar{a}_{ji}$.

Note: When F = ℝ, then bij = aji .

Recall that given a matrix A = [aij ], its conjugate transpose is the matrix $A^* = [a^*_{ij}]$,
where $a^*_{ij} = \bar{a}_{ji}$, i.e., $A^* = \bar{A}^t$.

Thus, Theorem 10 says that: If A = [aij ] is the matrix representation of T ∈ A(V) with respect to B, then the matrix representation of the adjoint T∗ of T with
respect to B is $A^* = \bar{A}^t$.

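For example, if D ∶ P2 → P2 is the differential operator, then its matrix with respect to the orthonormal basis B = {1, x, x² } is

$$[D]_B = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{bmatrix} \quad \therefore\ [D^*]_B = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 2 & 0 \end{bmatrix}$$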

Try the following exercises about the conjugate transpose of matrices.

E16) Show that


a) (AB)∗ = B∗ A∗ for any two n × n matrices A and B.
(Hint: Show that the (i, j)th elements of (AB)∗ and B∗ A∗ are the same.)
b) If an n × n matrix A is invertible, then so is A∗ and (A∗ )−1 = (A−1 )∗ .

Now let us look at the matrix of a self-adjoint operator.

15.5.2 Hermitian Matrix
Recall, that a matrix A is said to be Hermitian if it is equal to its conjugate
transpose, that is, if A = A∗ . The following result tells us that the matrix of a
Hermitian operator is Hermitian.

Theorem 11: Let V be an inner product space over F and T ∈ A(V). Let the
matrix representation of T with respect to an orthonormal basis B = {e1 , … , en }
be A. Then T is self-adjoint iff A is Hermitian.

Proof: Let [T]B = A = [aij ]. Then, by Theorem 10, [T∗ ]B = [bij ], where $b_{ij} = \bar{a}_{ji}$. That
is, [T∗ ]B = A∗ .

If T is self-adjoint, then T = T∗ . Therefore, [T]B = [T∗ ]B . Therefore, A = A∗ , which means A is Hermitian.
Conversely, if A is Hermitian, then A = A∗ . Therefore, $a_{ij} = a^*_{ij} = \bar{a}_{ji}$ ∀ i, j = 1, … , n.
Now, by definition, $T(e_i) = \sum_{j=1}^{n} a_{ji} e_j$. Therefore,

$$\langle Te_i, e_k\rangle = \Big\langle \sum_{j=1}^{n} a_{ji} e_j, e_k\Big\rangle = \sum_{j=1}^{n} a_{ji} \langle e_j, e_k\rangle = a_{ki} = \bar{a}_{ik}\ \ \forall\ i, k = 1, \dots, n.$$

Also $T^* e_i = \sum_{j=1}^{n} a^*_{ji} e_j$ ∀ i = 1, 2, … , n.

∴ $\langle T^* e_i, e_k\rangle = a^*_{ki} = \bar{a}_{ik}$ ∀ i, k = 1, … , n.
∴ ⟨Tei , ek ⟩ = ⟨T∗ ei , ek ⟩ ∀ i, k = 1, … , n.

This means that T = T∗ , that is, T is self-adjoint. Thus, the theorem is proved. ■

So, by Theorem 11 we know the matrix of the operator in E9, with respect to
0 1
the standard basis, is a Hermitian matrix. That is, [ ] is Hermitian.
1 0

Theorem 11 also tells us that the following Hermitian matrices, treated as
operators, are self-adjoint:

$$[3], \quad \begin{bmatrix} 2 & 1+i \\ 1-i & 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} k & a+ib & c+id \\ a-ib & m & e+if \\ c-id & e-if & n \end{bmatrix},$$

where a, b, c, d, e, f, k, m, n ∈ ℝ.
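
The condition A = A∗ is easy to test mechanically. The following is a minimal Python sketch, not part of the course material: the helper names `conjugate_transpose` and `is_hermitian` are our own, and a matrix is assumed to be stored as a list of rows.

```python
def conjugate_transpose(m):
    """Return A* for a square matrix m given as a list of rows."""
    n = len(m)
    # Entry (i, j) of A* is the complex conjugate of entry (j, i) of A.
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def is_hermitian(m):
    """True when the matrix equals its own conjugate transpose."""
    as_complex = [[complex(x) for x in row] for row in m]
    return conjugate_transpose(m) == as_complex


swap = [[0, 1], [1, 0]]            # matrix of the operator in E9
h2 = [[2, 1 + 1j], [1 - 1j, 0]]    # second example listed above
not_h = [[0, 1j], [1j, 0]]         # unitary, but not Hermitian

print(is_hermitian(swap), is_hermitian(h2), is_hermitian(not_h))
```

The third matrix is included only to show that the test can fail: its off-diagonal entries are equal rather than conjugate to each other.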

You may like to try the following exercises now.

E17) How many characteristic roots of a Hermitian matrix are purely imaginary?
(Hint: Use Theorem 6.)

E18) Show that the matrix A of a skew-Hermitian operator T ∈ A(V) (i.e.,


T = −T∗ ), with respect to an orthonormal basis of V is skew-Hermitian (i.e.
A = −A∗ ).

We will now introduce you to the matrix corresponding to a unitary operator.

15.5.3 Unitary (Orthogonal) Matrix
Remember, whenever we discuss unitary operators, we include orthogonal
operators, that is, the case F = ℝ. We will lead you to the definition of a unitary
matrix, via the following theorem.

Theorem 12: Let V be an inner product space over F with dim V = n. Let
U ∈ A(V) have a matrix representation $A = [a_{ij}]$, with respect to an orthonormal
basis B of V. If U is unitary, then

a) $\sum_{k=1}^{n} a_{ik}\,\overline{a_{jk}} = \delta_{ij}$,

b) $\sum_{k=1}^{n} \overline{a_{ki}}\,a_{kj} = \delta_{ij}$, for all i, j = 1, …, n.

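Proof: U has the matrix representation $A = [a_{ij}]$, with respect to B. Therefore,
U∗ has the matrix representation $A^* = [a^*_{ij}]$ with respect to B, where $a^*_{ij} = \overline{a_{ji}}$. Since U is
unitary, UU∗ = I = U∗U. Therefore, AA∗ = I = A∗A.

That is, $[a_{ij}][a^*_{ij}] = [\delta_{ij}] = [a^*_{ij}][a_{ij}]$.

Now, $[a_{ij}][a^*_{ij}] = [\delta_{ij}]$

$$\Rightarrow \sum_{k=1}^{n} a_{ik}\,a^*_{kj} = \delta_{ij} \ \Rightarrow\ \sum_{k=1}^{n} a_{ik}\,\overline{a_{jk}} = \delta_{ij}.$$

Similarly, $[a^*_{ij}][a_{ij}] = [\delta_{ij}]$

$$\Rightarrow \sum_{k=1}^{n} a^*_{ik}\,a_{kj} = \delta_{ij} \ \Rightarrow\ \sum_{k=1}^{n} \overline{a_{ki}}\,a_{kj} = \delta_{ij}.$$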

The above result leads us to the following definition.

Definition 4: If A is a given n × n matrix with entries in a field F, then A is said to


be a unitary matrix (an orthogonal matrix, if F = ℝ) if AA∗ = I = A∗ A.

Thus, Theorem 12 says that:

The matrix representation of any unitary (or orthogonal) operator on an inner


product space V, with respect to an orthonormal basis, is a unitary (or
orthogonal) matrix.

Example 5: Show that the matrix $A = \begin{bmatrix} 1 & -1 \\ 2 & 3 \end{bmatrix}$ is not orthogonal.

Solution: A∗ = Aᵗ in this case, since the entries of A are real. Thus,

$$AA^* = \begin{bmatrix} 1 & -1 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} 2 & -1 \\ -1 & 13 \end{bmatrix} \neq I.$$

This means that A is not orthogonal.

∗∗∗
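
The product in Example 5 can be re-computed with a few lines of Python. This is only an illustrative sketch: `transpose` and `matmul` are hypothetical helper names of our own, written for small matrices stored as lists of rows.

```python
def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two compatible matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[1, -1], [2, 3]]
identity = [[1, 0], [0, 1]]

# For a real matrix, A* is simply the transpose.
product = matmul(A, transpose(A))
print(product)              # [[2, -1], [-1, 13]]
print(product == identity)  # False, so A is not orthogonal
```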

The following exercises will give you some examples of unitary matrices.

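E19) Which of the following matrices are unitary?

$$\begin{bmatrix} 0 & 1 & i \\ -1 & i & 2 \end{bmatrix}, \quad \begin{bmatrix} 0 & i \\ i & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 & 1+i \\ 1-i & 0 \end{bmatrix}.$$

E20) Is $\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$ orthogonal?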

We will now derive a basic property of unitary (and orthogonal) matrices.

Theorem 13: For a square matrix A over ℂ the following are equivalent.

a) A is unitary.

b) The rows of A form an orthonormal set of vectors.

c) The columns of A form an orthonormal set of vectors.

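Proof: We will prove that (a)⇔(b) and (a)⇔(c).

(a)⇔(b): Let $A = [a_{ij}] = \begin{bmatrix} R_1 \\ \vdots \\ R_n \end{bmatrix}$, where $R_i$ is the ith row of A. Then $\overline{R_i}^{\,t}$ will be the ith
column of A∗.

$$\therefore AA^* = I \Leftrightarrow \begin{bmatrix} R_1 \\ \vdots \\ R_n \end{bmatrix} \begin{bmatrix} \overline{R_1}^{\,t}, \dots, \overline{R_n}^{\,t} \end{bmatrix} = I$$

$$\Leftrightarrow \begin{bmatrix} R_1\overline{R_1}^{\,t} & R_1\overline{R_2}^{\,t} & \dots & R_1\overline{R_n}^{\,t} \\ \vdots & \vdots & & \vdots \\ R_n\overline{R_1}^{\,t} & R_n\overline{R_2}^{\,t} & \dots & R_n\overline{R_n}^{\,t} \end{bmatrix} = I$$

$$\Leftrightarrow R_i\,\overline{R_j}^{\,t} = \delta_{ij} \quad \forall\ i, j = 1, \dots, n$$

$$\Leftrightarrow [a_{i1}, \dots, a_{in}] \begin{bmatrix} \overline{a_{j1}} \\ \vdots \\ \overline{a_{jn}} \end{bmatrix} = \delta_{ij}$$

$$\Leftrightarrow \sum_{k=1}^{n} a_{ik}\,\overline{a_{jk}} = \delta_{ij}$$

⇔ the vectors $(a_{i1}, \dots, a_{in})$, i = 1, …, n, form an orthonormal set
⇔ the rows of A are orthonormal.
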

Hence, we have proved that (a)⇔(b).

Similarly, using the fact that A∗ A = I, we can prove that (a)⇔(c). Hence, we
have proved the theorem. ■

Note: Just as we have proved Theorem 13, we can prove that a real square
matrix is orthogonal iff its rows (or columns) form an orthonormal set of vectors.
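
The row and column criterion in the note above lends itself to a quick numerical experiment. The sketch below is an assumption-laden illustration, not a proof: `dot` and `is_orthonormal` are names we introduce here, the angle 0.7 is arbitrary, and a small tolerance is used because the entries are floating-point numbers.

```python
import math


def dot(u, v):
    """Standard inner product of two real vectors."""
    return sum(x * y for x, y in zip(u, v))


def is_orthonormal(vectors, tol=1e-12):
    """Check that <v_i, v_j> equals 1 when i == j and 0 otherwise."""
    for i, u in enumerate(vectors):
        for j, v in enumerate(vectors):
            expected = 1.0 if i == j else 0.0
            if abs(dot(u, v) - expected) > tol:
                return False
    return True


theta = 0.7  # arbitrary angle, chosen only for illustration
c, s = math.cos(theta), math.sin(theta)
R = [[c, s], [-s, c]]

rows = R
cols = [list(col) for col in zip(*R)]
print(is_orthonormal(rows), is_orthonormal(cols))

# The rows of the matrix from Example 5 fail the same test.
print(is_orthonormal([[1, -1], [2, 3]]))
```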

You can apply what we have just said to solve the following exercise.

E21) Consider the matrix representing the linear operator

T ∶ ℝ3 → ℝ3 ∶ T(x, y, z) = (x cos 𝜃 − y sin 𝜃, x sin 𝜃 + y cos 𝜃, z) ,

with respect to the standard basis. Do its columns form an orthonormal


set of vectors?

Now let us look at real matrices only for the rest of the section.

Recall that a matrix A is symmetric if A = At . In Unit 12, you also came across
the concept of similar matrices. We now define an allied concept.

Definition 5: Two square matrices A and B, of the same order, are said to be
orthogonally similar if A = P−1 BP, for some orthogonal matrix P.

Remember that if P is orthogonal, then it is invertible, and its inverse is Pt . Thus,


A and B are orthogonally similar if A = Pt BP, for an orthogonal matrix P.

Let us consider an example.

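Example 6: Show that $\begin{bmatrix} 1 & 2 \\ -1 & -2 \end{bmatrix}$ and $\begin{bmatrix} -2 & -1 \\ 2 & 1 \end{bmatrix}$ are orthogonally similar.

Solution: Suppose $P = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is an orthogonal matrix satisfying

$$\begin{bmatrix} 1 & 2 \\ -1 & -2 \end{bmatrix} = P^t \begin{bmatrix} -2 & -1 \\ 2 & 1 \end{bmatrix} P.$$

Then we have

$$\begin{bmatrix} 1 & 2 \\ -1 & -2 \end{bmatrix} = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \begin{bmatrix} -2 & -1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

$$= \begin{bmatrix} -2a+2c & -a+c \\ -2b+2d & -b+d \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} (2a+c)(c-a) & (2b+d)(c-a) \\ (2a+c)(d-b) & (2b+d)(d-b) \end{bmatrix}.$$

Solving the equations

1 = (2a + c)(c − a)
−1 = (2a + c)(d − b)
2 = (2b + d)(c − a)
−2 = (2b + d)(d − b),

together with the requirement that P be orthogonal, we get

$$P = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}.$$

$$\therefore \begin{bmatrix} 1 & 2 \\ -1 & -2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}^t \begin{bmatrix} -2 & -1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

$$\text{or} \quad \begin{bmatrix} 1 & 2 \\ -1 & -2 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}^t \begin{bmatrix} -2 & -1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}.$$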

Check that these equalities do hold, by multiplying the right hand side.

∗∗∗
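
If you would like to confirm the two equalities of Example 6 without multiplying by hand, a short Python sketch such as the following can be used. The helpers `transpose` and `matmul` are our own illustrative names, and matrices are assumed to be lists of rows.

```python
def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two compatible matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[1, 2], [-1, -2]]
B = [[-2, -1], [2, 1]]
identity = [[1, 0], [0, 1]]
candidates = [[[0, 1], [1, 0]], [[0, -1], [-1, 0]]]

results = []
for P in candidates:
    Pt = transpose(P)
    is_orthogonal = matmul(Pt, P) == identity   # P^t P = I
    similar = matmul(matmul(Pt, B), P) == A     # P^t B P = A
    results.append((is_orthogonal, similar))

print(results)  # [(True, True), (True, True)]
```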

This example shows that there can be several orthogonal matrices P such that
A = Pt BP.

Now we shall use an orthogonal matrix to diagonalise a real symmetric matrix.


In Unit 12 you studied how to diagonalise matrices. Theorem 5 of Unit 12
gives you a practical method of diagonalising a square matrix. We will use this
theorem to prove the following result.

Theorem 14: Let A be a real symmetric matrix of order n with distinct


eigenvalues 𝛼1 , … , 𝛼n . Let X1 , … , Xn ∈ Vn (ℝ) be normalised eigenvectors of A
corresponding to 𝛼1 , … , 𝛼n , respectively. Let P = (X1 , … , Xn ) . Then

a) P is orthogonal.

b) Pt AP is the diagonal matrix, diag (𝛼1 , … , 𝛼n ) .

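Proof: a) We will first show that $\{X_1, \dots, X_n\}$ is an orthonormal set in $V_n(ℝ)$.
Remember that the standard inner product in $V_n(ℝ)$ is given by

$$X \cdot Y = \sum_{i=1}^{n} x_i y_i = X^t Y \quad \forall\ X = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix},\ Y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \text{ in } V_n(ℝ).$$

Now,

$$\begin{aligned}
(\alpha_1 - \alpha_2)(X_1 \cdot X_2) &= \alpha_1 (X_1 \cdot X_2) - \alpha_2 (X_1 \cdot X_2) \\
&= (\alpha_1 X_1 \cdot X_2) - (X_1 \cdot \alpha_2 X_2) = (AX_1 \cdot X_2) - (X_1 \cdot AX_2) \\
&= (AX_1)^t X_2 - X_1^t A X_2 \\
&= X_1^t A X_2 - X_1^t A X_2 \quad (\text{since } A^t = A) \\
&= 0.
\end{aligned}$$

Since $\alpha_1 \neq \alpha_2$, we get $X_1 \cdot X_2 = 0$.
Similarly, $X_i \cdot X_j = 0$ for all i ≠ j.
Also, $\|X_i\| = 1$ for all i = 1, …, n, since the $X_i$'s are normalised vectors.
Therefore, $\{X_1, X_2, \dots, X_n\}$ is an orthonormal set.
Therefore, by Theorem 13, P is orthogonal.

b) From Unit 12 (Theorem 5) you know that $P^{-1}AP = \mathrm{diag}(\alpha_1, \dots, \alpha_n)$. Since P is orthogonal, $P^{-1} = P^t$. That is,
$P^t AP = \mathrm{diag}(\alpha_1, \dots, \alpha_n)$.
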

■

What Theorem 14 says is that any real symmetric n × n matrix with n
distinct eigenvalues is orthogonally similar to a diagonal matrix.

Note: Though we have proved Theorem 14 for real symmetric matrices with
distinct eigenvalues, it is true for any real symmetric matrix. That is, any real
symmetric matrix is orthogonally similar to a diagonal matrix. The proof of
this result is beyond the scope of this course.

Let us consider an example of how to use Theorem 14.

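Example 7: Reduce $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & -1 \\ 1 & -1 & -1 \end{bmatrix}$ to diagonal form.

Solution: The matrix $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & -1 \\ 1 & -1 & -1 \end{bmatrix}$ is a real symmetric matrix. Its
characteristic equation is

$$\begin{vmatrix} \lambda-1 & -1 & -1 \\ -1 & \lambda-1 & 1 \\ -1 & 1 & \lambda+1 \end{vmatrix} = 0.$$

This shows us that the eigenvalues of A are 1, 2, −2.

Eigenvectors corresponding to them are (1, −1, 1), (1, 1, 0) and (−1, 1, 2),
respectively. Therefore, the normalised eigenvectors are

(1/√3) (1, −1, 1), (1/√2) (1, 1, 0), (1/√6) (−1, 1, 2).

These vectors give us the orthogonal matrix

$$P = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{2} & -1/\sqrt{6} \\ -1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & 0 & 2/\sqrt{6} \end{bmatrix}.$$

Then, we get $P^t AP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -2 \end{bmatrix}$.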
∗∗∗
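
The procedure of Example 7 (normalise the eigenvectors, use them as the columns of P, then form PᵗAP) can be traced numerically. The following Python sketch is only an illustration under our own naming (`transpose`, `matmul`, `normalise`); it takes the eigenvectors from the example as given rather than computing them, and it compares floating-point results up to a small tolerance.

```python
import math


def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two compatible matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def normalise(v):
    """Scale a non-zero real vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


A = [[1, 1, 1], [1, 1, -1], [1, -1, -1]]
eigenvectors = [(1, -1, 1), (1, 1, 0), (-1, 1, 2)]  # for eigenvalues 1, 2, -2

# The normalised eigenvectors become the columns of P.
P = transpose([normalise(v) for v in eigenvectors])
D = matmul(matmul(transpose(P), A), P)

for row in D:
    print([round(x, 10) + 0.0 for x in row])
```

Up to rounding, the printed matrix should be diag(1, 2, −2), in agreement with the example.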

Do try the following exercise now.

E22) Reduce $\begin{bmatrix} 7 & -1 & -10 \\ -1 & 7 & 10 \\ -10 & 10 & -2 \end{bmatrix}$ to diagonal form.
(Its eigenvalues are 6, −12 and 18.)

Let us end with summarising what we have covered in this unit.

15.6 SUMMARY
As in the previous unit, the vector spaces considered in this unit are all defined
over the fields ℂ or ℝ. We made the following points in this unit.

1. Any linear functional on an inner product space is represented by the
inner product with a fixed vector.

2. The definition and properties of the adjoint of an operator defined on an


inner product space.

3. The definition and properties of a self-adjoint operator.

4. The definition and properties of a unitary (orthogonal) operator.

5. A self-adjoint operator on an inner product space is represented by a


Hermitian matrix, with respect to an orthonormal basis of the underlying
space.

6. A unitary (orthogonal) transformation on an inner product space is


represented by a unitary (orthogonal) matrix, with respect to an
orthonormal basis of the underlying space.

7. A matrix is unitary (orthogonal) iff its rows form an orthonormal set of


vectors iff its columns form an orthonormal set of vectors.

8. Any real symmetric matrix is orthogonally similar to a diagonal matrix.

15.7 SOLUTIONS/ANSWERS
E1) For any x1 , x2 ∈ ℝ2 , we have

fy (x1 + x2 ) = ⟨x1 + x2 , y⟩ = ⟨x1 , y⟩ + ⟨x2 , y⟩


= fy (x1 ) + fy (x2 ).

Also, for any a ∈ ℝ and x ∈ ℝ2 ,

fy (ax) = ⟨ax, y⟩ = afy (x).

∴ fy is a linear functional on ℝ2 .

E2) {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is an orthonormal basis of ℂ³.
Now f(1, 0, 0) = 1/3 = f(0, 1, 0) = f(0, 0, 1).
∴, as in the proof of Theorem 2,

$$y = \tfrac{1}{3}(1, 0, 0) + \tfrac{1}{3}(0, 1, 0) + \tfrac{1}{3}(0, 0, 1) = \left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right)$$

is what we want. To check whether y is the required vector you must
ensure that f(z) = ⟨z, y⟩ ∀ z ∈ ℂ³.
E3) For x1 and x2 ∈ V, we have

f(x1 + x2 ) = ⟨T(x1 + x2 ), y⟩ = ⟨Tx1 + Tx2 , y⟩ = ⟨Tx1 , y⟩ + ⟨Tx2 , y⟩


= f(x1 ) + f(x2 )

Also, for a ∈ F and x ∈ V,

f(ax) = ⟨T(ax), y⟩ = ⟨aTx, y⟩ = a⟨Tx, y⟩ = af(x).

∴ f ∈ V∗.

E4) ⟨T(x1 , … , xn ), (y1 , … , yn )⟩ = ⟨(x1 , 0, … , 0), (y1 , … , yn )⟩
= x1 y1 = ⟨(x1 , … , xn ), (y1 , 0, … , 0)⟩.
∴ T∗ (y1 , … , yn ) = (y1 , 0, … , 0)
∴ T∗ = T in this case.

E5) a) For any x, y ∈ V, ⟨I(x), y⟩ = ⟨x, y⟩

⇒⟨x, I∗ y⟩ = ⟨x, y⟩
⇒I∗ (y) = y ∀ y ∈ V ⇒ I∗ = I.

b) For any x, y ∈ V,

⟨(S + T)(x), y⟩ = ⟨Sx + Tx, y⟩ = ⟨Sx, y⟩ + ⟨Tx, y⟩


= ⟨x, S∗ y⟩ + ⟨x, T∗ y⟩
= ⟨x, S∗ y + T∗ y⟩
= ⟨x, (S∗ + T∗ )y⟩.

∴ (S + T)∗ = S∗ + T∗ .
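c) For any x, y ∈ V,

$$\langle(\alpha T)(x), y\rangle = \langle \alpha Tx, y\rangle = \alpha\langle Tx, y\rangle = \alpha\langle x, T^*y\rangle = \langle x, (\overline{\alpha}\,T^*)(y)\rangle.$$

$$\therefore (\alpha T)^* = \overline{\alpha}\,T^*.$$

d) $\langle T^*y, x\rangle = \overline{\langle x, T^*y\rangle} = \overline{\langle Tx, y\rangle} = \langle y, Tx\rangle.$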
E6)
T = 0 ⇒ Tx = 0 ∀ x ∈ V ⇒ ⟨Tx, y⟩ = 0 ∀ x, y ∈ V.
⇒ ⟨x, T∗ y⟩ = 0 ∀ x, y ∈ V.
In particular, for x = T∗ (y) ∈ V, we get
⟨T∗ y, T∗ y⟩ = 0 ∀ y ∈ V.

⇒ T∗ y = 0 ∀ y ∈ V ⇒ T∗ = 0.

E7) By Theorem 5(b), 𝜙(S + T) = 𝜙(S) + 𝜙(T).


By Theorem 5(c), 𝜙(𝛼S) = 𝛼𝜙(S).
E8) By Theorem 5, (T ⋅ T−1 )∗ = (T−1 )∗ ⋅ (T∗ )

⇒ I∗ = (T−1 )∗ ⋅ (T∗ )
⇒ I = (T−1 )∗ ⋅ (T∗ )

Similarly, T∗ (T−1 )∗ = I. ∴ (T−1 )∗ = (T∗ )−1 .


E9) Now

⟨f(x1 , x2 ), (y1 , y2 )⟩ = ⟨(x2 , x1 ), (y1 , y2 )⟩


= x2 y1 + x1 y2
= x1 y2 + x2 y1
= ⟨(x1 , x2 ), (y2 , y1 )⟩

∴ f∗ (y1 , y2 ) = (y2 , y1 ) = f(y1 , y2 ) ∀ (y1 , y2 ) ∈ ℝ2 .


∴ f∗ = f.

E10) S ∘ T = (S ∘ T)∗
⇔ S ∘ T = T∗ ∘ S∗ = T ∘ S, since S = S∗ and T = T∗ .

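E11) a) $(iT)^* = \overline{i}\,T^* = (-i)(-T) = iT.$

b) Let α ∈ ℂ be an eigenvalue of T. Then ∃ 0 ≠ v ∈ V such that Tv = αv.
We will show that $\alpha = -\overline{\alpha}$.
Now

$$\alpha\langle v, v\rangle = \langle \alpha v, v\rangle = \langle Tv, v\rangle = \langle v, T^*v\rangle = \langle v, -Tv\rangle = -\langle v, \alpha v\rangle = -\overline{\alpha}\langle v, v\rangle.$$

$\therefore \alpha = -\overline{\alpha}$ ⇒ α = 0 or α is purely imaginary.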

E12) $\langle T(x_1, x_2, x_3), (y_1, y_2, y_3)\rangle = \langle (x_3, x_1, x_2), (y_1, y_2, y_3)\rangle$

$$= x_3y_1 + x_1y_2 + x_2y_3 = x_1y_2 + x_2y_3 + x_3y_1 = \langle (x_1, x_2, x_3), (y_2, y_3, y_1)\rangle.$$

∴ $T^*(y_1, y_2, y_3) = (y_2, y_3, y_1) = T^2(y_1, y_2, y_3)$ ∀ $(y_1, y_2, y_3) ∈ ℝ^3$.
∴ $T^* = T^2 \neq T$.
Also $T^3(x) = x$ ∀ $x ∈ ℝ^3$. ∴ $T^2 = T^{-1}$, i.e., $T^* = T^{-1}$.
∴ T is unitary.

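E13) a) $(ST)^* = T^*S^* = T^{-1}S^{-1} = (ST)^{-1}.$

b) $(\alpha T)^* = (\alpha T)^{-1} \Leftrightarrow \overline{\alpha}\,T^* = \alpha^{-1}T^{-1} \Leftrightarrow \overline{\alpha}\,T^{-1} = \alpha^{-1}T^{-1}$
$\Leftrightarrow \overline{\alpha} = \alpha^{-1} \Leftrightarrow \alpha\overline{\alpha} = 1 \Leftrightarrow |\alpha| = 1.$

E14) Let α be a characteristic root, i.e., an eigenvalue of a unitary operator
T ∈ A(V). Then ∃ 0 ≠ v ∈ V such that T(v) = αv.
Now

$$\alpha\langle v, v\rangle = \langle \alpha v, v\rangle = \langle Tv, v\rangle = \langle v, T^*v\rangle = \langle v, T^{-1}v\rangle = \langle v, \alpha^{-1}v\rangle \ (\because T^{-1}v = \alpha^{-1}v) = \overline{\alpha^{-1}}\langle v, v\rangle.$$

$\therefore \alpha = \overline{\alpha}^{\,-1} \Rightarrow \alpha\overline{\alpha} = 1 \Rightarrow |\alpha| = 1.$

E15) $T^2 = I \Leftrightarrow T = T^{-1}$. Now,
T is Hermitian ⇔ $T = T^*$ ⇔ $T^{-1} = T^*$ (∵ $T = T^{-1}$)
⇔ T is unitary.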

E16) a) Let $A = [a_{ij}]$, $B = [b_{ij}]$ and $AB = C = [c_{ij}]$. Then the (i, j)th element of C∗
= conjugate of the (j, i)th element of C

$$= \overline{\sum_{k=1}^{n} a_{jk}b_{ki}} = \sum_{k=1}^{n} \overline{a_{jk}}\;\overline{b_{ki}} \qquad \dots(3)$$

Now, if $B^* = [d_{ij}]$ and $A^* = [e_{ij}]$, then $d_{ij} = \overline{b_{ji}}$ and $e_{ij} = \overline{a_{ji}}$. Also, the (i, j)th
element of

$$B^*A^* = \sum_{k=1}^{n} d_{ik}e_{kj} = \sum_{k=1}^{n} \overline{b_{ki}}\;\overline{a_{jk}} = \sum_{k=1}^{n} \overline{a_{jk}}\;\overline{b_{ki}} \qquad \dots(4)$$

(3) and (4) ⇒ C∗ = B∗ A∗ .

b) Let B = A−1 . Then (AB)∗ = I∗ ⇒ B∗ A∗ = I.
Similarly, A∗ B∗ = I.
∴ B∗ = (A∗ )−1 , that is, (A−1 )∗ = (A∗ )−1 .

E17) 𝜆 is a characteristic root of a Hermitian matrix A.


⇔ 𝜆 is an eigenvalue of A.
⇔ 𝜆 is an eigenvalue of A, treated as an operator.
⇔ 𝜆 is real, by Theorems 6 and 11.
∴, no characteristic root of A is purely imaginary.

E18) Let $B = \{e_1, \dots, e_n\}$ be an orthonormal basis of V. Let $[T]_B = A = [a_{ij}]$. Then
$[T^*]_B = A^* = [b_{ij}]$, where $b_{ij} = \overline{a_{ji}}$. Now, $T = -T^* \Rightarrow [T]_B = -[T^*]_B$
$\Rightarrow A = -A^*$.
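E19) Since $\begin{bmatrix} 0 & 1 & i \\ -1 & i & 2 \end{bmatrix}$ is not a square matrix, it can't be unitary.

Now, if $A = \begin{bmatrix} 0 & i \\ i & 0 \end{bmatrix}$, then $A^* = \begin{bmatrix} 0 & -i \\ -i & 0 \end{bmatrix}$.

$$\therefore AA^* = \begin{bmatrix} 0 & i \\ i & 0 \end{bmatrix} \begin{bmatrix} 0 & -i \\ -i & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

Similarly, A∗A = I.
∴ A is unitary.

If $A = \begin{bmatrix} 1 & 1+i \\ 1-i & 0 \end{bmatrix}$, then $A^* = \begin{bmatrix} 1 & 1+i \\ 1-i & 0 \end{bmatrix}$.

$$\therefore AA^* = \begin{bmatrix} 1 & 1+i \\ 1-i & 0 \end{bmatrix} \begin{bmatrix} 1 & 1+i \\ 1-i & 0 \end{bmatrix} = \begin{bmatrix} 3 & 1+i \\ 1-i & 2 \end{bmatrix} \neq I.$$

∴ A is not unitary.

E20) Let $A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$. Then $A^* = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$.

∴ $AA^* = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. Also A∗A = I.
∴ A is orthogonal.

E21) The matrix is $A = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

Then $A^* = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

∴ AA∗ = A∗A = I.
∴ A is unitary. ∴, its columns form an orthonormal set of vectors.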
E22) The eigenvectors corresponding to 6, −12 and 18 are $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix}$,
respectively.
∴, the normalised eigenvectors are

$$\frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad \frac{1}{\sqrt{6}}\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \quad \frac{1}{\sqrt{3}}\begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix}.$$

∴ $P = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{6} & 1/\sqrt{3} \\ 1/\sqrt{2} & -1/\sqrt{6} & -1/\sqrt{3} \\ 0 & 2/\sqrt{6} & -1/\sqrt{3} \end{bmatrix}$ is an orthogonal matrix such that

$$P^t AP = \begin{bmatrix} 6 & 0 & 0 \\ 0 & -12 & 0 \\ 0 & 0 & 18 \end{bmatrix}.$$

UNIT 16

REAL QUADRATIC FORMS


Structure
16.1 Introduction
     Objectives
16.2 Quadratic Forms
16.3 Quadratic Form as Matrix Product
16.4 Transformation of a Quadratic Form Under a Change of Basis
16.5 Rank of a Quadratic Form
16.6 Orthogonal Canonical Reduction
16.7 Normal Canonical Form
16.8 Summary
16.9 Solutions/Answers

16.1 INTRODUCTION
So far you have studied various kinds of matrices and inner products. In this
unit, we shall discuss a particular kind of inner product, which is closely
connected to symmetric matrices. This is called a quadratic form. It can also
be thought of as a particular kind of second degree polynomial, which is the
way we shall first define it. We will discuss the geometric aspect of a particular
case of quadratic forms in the next unit.

Quadratic forms are encountered in various mathematical and physical


problems. For example, in physics, expressions for moment of inertia, energy,
rate of generation of heat and stress ellipsoid in the theory of elasticity involve
quadratic forms. Quadratic forms also appear while studying chemistry, the life
sciences, and of course, many branches of mathematics.

In this unit we will always assume that the underlying field is ℝ.

Before going further, make sure that you are familiar with Units 14 and 15.

Objectives
After studying this unit, you should be able to:
• identify a real quadratic form;

• find the symmetric matrix associated to a quadratic form;

• calculate the rank of a quadratic form;

• obtain the orthogonal canonical reduction of a quadratic form;

• find the normal canonical reduction of a quadratic form;

• calculate the signature of a quadratic form.

16.2 QUADRATIC FORMS


The word "quadratic" is not new to you. You have already encountered it when
solving equations of the type

$$ax^2 + bx + c = 0, \quad a, b, c ∈ ℝ,\ a \neq 0, \qquad \dots(1)$$

which are called quadratic equations. The left hand side of Eqn. (1) is a
quadratic function in one variable over ℝ. We call the second degree term in
Eqn. (1), i.e., $ax^2$, a quadratic form of order one. It is said to be of order one,
since it involves only one variable.

The most general quadratic equation over ℝ involving two variables x and y is

$$(ax^2 + 2hxy + by^2) + (2gx + 2fy) + c = 0, \quad a, b, c, f, g, h ∈ ℝ,$$

where at least one of a, h, b is non-zero. Its left hand side is a quadratic
function, or quadratic polynomial, of order 2. The second degree terms
occurring in this equation, i.e., the expression

$$ax^2 + 2hxy + by^2,$$

is called a quadratic form of order two, since it involves two variables x and y.

The most general quadratic equation over ℝ involving three variables is

$$(ax^2 + by^2 + cz^2 + 2hxy + 2gxz + 2fyz) + 2ux + 2vy + 2wz + d = 0,$$

a, b, c, d, f, g, h, u, v, w ∈ ℝ, where at least one of a, b, c, f, g, h is non-zero. Its left
hand side is a quadratic function, or quadratic polynomial, in three variables.
The bracketed part of this equation, containing only second degree terms, is
called a quadratic form of order three.

By now you can see how we can generalise this concept. We call the non-zero
form

$$\sum_{i,j=1}^{n} a_{ij}x_ix_j$$

a quadratic form over ℝ of order n, where the $a_{ij}$'s are real constants and $x_1, x_2, \dots, x_n$
are real variables.

Remark 1: These expressions are called quadratic, since they are of second
degree. They are called forms, since every term in them has the same degree.

We are now ready to make a formal definition.

Definition 1: A homogeneous polynomial of degree two is called a quadratic
form. Its order is the number of variables that occur in it.

For example, $x^2 - 3y^2 + 4xz$ is a quadratic form of order 3.

A quadratic form is real if its variables can only take real values and the
coefficients are real numbers. We have already stated, in the unit introduction,
that all spaces considered in this unit shall be over ℝ. Therefore, by a
quadratic form we shall always mean a real quadratic form.

From the definition of a quadratic form it is clear that a real valued function will
be a quadratic form if and only if it satisfies each of the following conditions:

a) it is a polynomial,

b) it is homogeneous, and

c) it is of degree two.

Let us look at some examples now.

Example 1: Which of the following are quadratic forms? In the case of


quadratic forms, find the order.

a) $x^2 + x + 1$

b) $2x^2 + y^2 + z^2$

c) $x^2 - \sqrt{2}\,y^2 = 0$

d) $3x_1^2 + x_1x_2 - \sqrt{3}\,x_2^2$

e) $x_1^3 - x_2^2 + x_2x_3$

f) $x^3 + x^2y - y^3$

g) $x^2 + \log x$.

Solution: (c) is an equation, and not a polynomial. (a) and (e) are
polynomials, but they are not homogeneous. (f) is a polynomial which is
homogeneous, but its degree is three and not two. (g) is not a polynomial. Only
(b) and (d) represent quadratic forms. (b) involves three variables, and hence,
its order is three. (d) involves two variables, and thus, has order two.

∗∗∗

Try the following exercises now.

E1) Give an example of a function that is

a) a non-homogeneous polynomial of degree 2.


b) a homogeneous polynomial, but not of degree 2.

E2) Which of the following represent quadratic forms?

a) $x^2 - xy$
b) $x_1 + x_2$
c) $x_1^3$
d) $x^3 - xy^2$
e) $\sin(x^2 + 2y^2)$
f) $x_1^2 - \sqrt{2}\,x_2^2 = 0$

E3) Find the values of the integer k for which the following will represent
quadratic forms.

a) $x^2 - 2y^2 - kxy^2$
b) $x^k + 2y^2$
c) $x_1^4 + 2x_1x_2 - x_1^k$

E4) Let Q1 and Q2 be two quadratic forms, both of order n, in the n variables
x1 , x2 , … , xn . Which of the following will be a quadratic form?

Q1 + Q2 , aQ1 + bQ2 , Q1 − Q2 , Q1 Q2 , Q1 /Q2 .

Let us now see how to represent a quadratic form as a product of matrices. In


fact, you will see how a quadratic form can be written as an inner product.

16.3 QUADRATIC FORM AS MATRIX PRODUCT
Consider the quadratic form of order two,

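$$Q = 2x^2 + 2xy + 3y^2.$$

Putting $X = \begin{bmatrix} x \\ y \end{bmatrix}$ and $A = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}$, we find that

$$Q = X^tAX = [x \ \ y] \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad \dots(2)$$

The question now is whether we can replace the matrix A by another matrix
without changing the quadratic form Q. In fact, you can check that
$Q = X^tBX$, where $B = \begin{bmatrix} 2 & 2 \\ 0 & 3 \end{bmatrix}$, and $Q = X^tCX$, where $C = \begin{bmatrix} 2 & -1 \\ 3 & 3 \end{bmatrix}$.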

Thus, we see that if we replace A by B or C in (2), the quadratic form is not


changed. This shows us that the choice of the matrix A in (2) is not unique. In
this section we shall find the reason for this, and also investigate the general
matrix which can replace A in (2).

Note that we can also write Q = ⟨AX, X⟩, where $\langle Y, Z\rangle = Z^tY$ for any $Y, Z ∈ V_2(ℝ)$.
So, as you go along, remember that we are simultaneously discussing the
representation of Q as a matrix product, as well as an inner product.

Look carefully at the matrices A, B and C, given above. Do they have a
common feature? You must have noticed that the diagonal elements of all
these matrices are the same, i.e., A, B and C have the same diagonal. Now,
what about the off-diagonal (i.e., non-diagonal) entries? Have you noticed that
the sum of the off-diagonal entries in all these matrices is 2? Note that the
coefficient of the term xy, of the given quadratic form, is also 2.

E5) Change one of the diagonal entries of A and verify that this will change
the quadratic form.

In fact, any matrix $P = \begin{bmatrix} 2 & a \\ b & 3 \end{bmatrix}$, with a + b = 2, can replace A without changing the
quadratic form Q. This is because the coefficient of xy in the quadratic form
$X^tPX$ is (a + b). However, if we insist that the matrix P should be symmetric, then
we must have a = b = 1; and hence, the choice is unique, namely, $\begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}$.
1 3

We, therefore, conclude that A is the only symmetric matrix for which Q = Xt AX.

This symmetric matrix A is called the matrix of the quadratic form Q, or the
matrix associated to the quadratic form Q. Observe that

$$A = \begin{bmatrix} \text{coef. of } x^2 & \tfrac{1}{2}\,\text{coef. of } xy \\ \tfrac{1}{2}\,\text{coef. of } xy & \text{coef. of } y^2 \end{bmatrix},$$

where coef. is short for coefficient.

We can sum up the above discussion as follows:

Given a quadratic form Q of order 2, there are infinitely many square matrices
B for which $Q = X^tBX$. However, there will be a unique symmetric matrix A for
which $Q = X^tAX$. This matrix A, which is called the matrix of the quadratic form
Q, is given by the rule

$$A = \begin{bmatrix} \text{coef. of } x^2 & \tfrac{1}{2}\,\text{coef. of } xy \\ \tfrac{1}{2}\,\text{coef. of } xy & \text{coef. of } y^2 \end{bmatrix} \qquad \dots(3)$$

Actually, there is a one-to-one correspondence between the set of all
symmetric square matrices of order 2 and the set of all quadratic forms of order
2. This is because, given any 2 × 2 symmetric matrix $B = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$, we can obtain
a unique quadratic form of order 2 corresponding to it, namely,
$X^tBX = ax^2 + 2bxy + dy^2$. Conversely, given any quadratic form of order 2 we can
obtain a unique 2 × 2 symmetric matrix by the rule in Eqn. (3). The following
examples will illustrate this correspondence.

Example 2: What is the quadratic form generated by $A = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$?

Solution: The quadratic form generated by A is $[x \ \ y] \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$.
On expanding this we get $x^2 - 2xy + y^2$.

Observe that you could have obtained the quadratic form simply by applying
the rule (3) as follows:

Comparing the given matrix A with the matrix in (3) gives
coef. of $x^2$ = 1, coef. of $y^2$ = 1, (1/2) coef. of xy = −1.
Therefore, the required quadratic form is $x^2 - 2xy + y^2$.

∗∗∗

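Example 3: A general diagonal matrix of order 2 is $A = \begin{bmatrix} \alpha_1 & 0 \\ 0 & \alpha_2 \end{bmatrix}$.
What is the corresponding quadratic form?

Solution: Once again you can either compute

$$X^tAX = [x \ \ y] \begin{bmatrix} \alpha_1 & 0 \\ 0 & \alpha_2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \alpha_1x^2 + \alpha_2y^2,$$

or use rule (3) to get

coef. of $x^2$ = $\alpha_1$, coef. of $y^2$ = $\alpha_2$, coef. of xy = 0.

∴, the required form is $\alpha_1x^2 + \alpha_2y^2$.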

Such a quadratic form is called a diagonal form.

∗∗∗

Example 4: Find the matrices associated to the following quadratic forms.

a) $x^2$

b) $-y^2 - 4xy$

Solution: Rule (3) is very handy for writing the symmetric matrix of a given
quadratic form. It is easy to see that the corresponding matrices will be

a) $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$, b) $\begin{bmatrix} 0 & -2 \\ -2 & -1 \end{bmatrix}$.

∗∗∗

Now for an exercise!

E6) Find the 2 × 2 matrices associated to

a) $-y^2$, b) $2x^2 + y^2$, c) $2xy$, d) $px^2 + qxy + ry^2$.

The above discussion involved matrices and quadratic forms of order two. It
can be extended to matrices and quadratic forms of higher orders. Let us look
at the case of quadratic forms of order 3.

Let us consider a general 3 × 3 matrix

$$A = [a_{ij}] = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}.$$

The quadratic form determined by A will be

$$Q = X^tAX, \qquad \dots(4)$$

where $X^t = [x_1 \ \ x_2 \ \ x_3]$.

Expand the matrix product in Eqn. (4) and verify that

$$Q = a_{11}x_1^2 + a_{22}x_2^2 + a_{33}x_3^2 + (a_{12} + a_{21})x_1x_2 + (a_{23} + a_{32})x_2x_3 + (a_{13} + a_{31})x_1x_3. \qquad \dots(5)$$

Observe that the diagonal elements of A, i.e., $a_{11}$, $a_{22}$ and $a_{33}$, are the
coefficients of $x_1^2$, $x_2^2$ and $x_3^2$, respectively, in Q given by Eqn. (5).

Also note that the sum of the two entries $a_{12}$ and $a_{21}$ determines the
coefficient of $x_1x_2$, while these two entries do not occur elsewhere in Eqn. (5).
So, if we replace $a_{12}$ and $a_{21}$ by two different numbers $a'_{12}$ and $a'_{21}$ such that
$a'_{12} + a'_{21} = a_{12} + a_{21}$, while keeping the other entries of A unchanged, the new matrix
A′, thus obtained, will not be equal to A. But the quadratic forms generated by A
and A′ will be the same, i.e.,

$$Q = X^tAX = X^tA'X.$$

Similar changes can be made for the entries contributing to the coefficients of
$x_1x_3$ and $x_2x_3$, to obtain matrices different from A which can replace A without changing
the quadratic form. However, if the matrix A′ is restricted to being symmetric,
then the choice is unique, i.e.,

$$a'_{12} = a'_{21} = \tfrac{1}{2}(a_{12} + a_{21}) = \tfrac{1}{2}(\text{coef. of } x_1x_2),$$

$$a'_{13} = a'_{31} = \tfrac{1}{2}(a_{13} + a_{31}) = \tfrac{1}{2}(\text{coef. of } x_1x_3),$$

$$\text{and} \quad a'_{23} = a'_{32} = \tfrac{1}{2}(a_{23} + a_{32}) = \tfrac{1}{2}(\text{coef. of } x_2x_3).$$

Therefore, the unique symmetric matrix corresponding to the quadratic form (5)
will be

$$A' = \begin{bmatrix} \text{coef. of } x_1^2 & \tfrac{1}{2}\,\text{coef. of } x_1x_2 & \tfrac{1}{2}\,\text{coef. of } x_1x_3 \\ \tfrac{1}{2}\,\text{coef. of } x_1x_2 & \text{coef. of } x_2^2 & \tfrac{1}{2}\,\text{coef. of } x_2x_3 \\ \tfrac{1}{2}\,\text{coef. of } x_1x_3 & \tfrac{1}{2}\,\text{coef. of } x_2x_3 & \text{coef. of } x_3^2 \end{bmatrix} \qquad \dots(6)$$

We sum up the above discussion as follows:

Given a quadratic form of order 3, there are infinitely many matrices of order 3
which will generate it. However, the symmetric matrix that generates a given
quadratic form of order three is unique. This symmetric matrix is called the
matrix associated to the quadratic form, or simply, the matrix of the quadratic
form.

Just as in the case of order 2 forms, there is a one-to-one correspondence
between the set of all symmetric matrices of order three and the set of all
quadratic forms of order three. The next few examples will illustrate the above
discussion.

Example 5: Find the quadratic form Q corresponding to the symmetric matrix

$$A = \begin{bmatrix} 1 & -2 & 3 \\ -2 & 4 & 1 \\ 3 & 1 & 2 \end{bmatrix} = [a_{ij}], \text{ say.}$$

Solution: A straightforward way will be to expand $X^tAX$, where
$X^t = [x_1, x_2, x_3]$. Then we would get

$$Q = x_1^2 + 4x_2^2 + 2x_3^2 - 4x_1x_2 + 6x_1x_3 + 2x_2x_3.$$

But a quicker way is to use the rule (6). Comparing the entries of A′ in (6) with
those of A above, we can obtain all the coefficients of the quadratic form as
follows:

Coefficients of $x_1^2$, $x_2^2$, $x_3^2$ will be the elements of the diagonal in A, i.e., 1, 4 and 2,
respectively.

coef. of $x_1x_2 = a_{12} + a_{21} = -4$
coef. of $x_1x_3 = a_{13} + a_{31} = 6$
coef. of $x_2x_3 = a_{23} + a_{32} = 2$

Then the required quadratic form is Q, as obtained above.

∗∗∗
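
The coefficient-reading step of Example 5 can be written as a small routine. The following Python sketch is an illustration only; the function name `form_coefficients` and the convention of labelling the coefficient of $x_ix_j$ by the pair (i, j) with i ≤ j are our own assumptions.

```python
def form_coefficients(A):
    """Map (i, j), i <= j (1-based), to the coefficient of x_i x_j in X^t A X."""
    n = len(A)
    coeffs = {}
    for i in range(n):
        # Diagonal entries are the coefficients of the squares.
        coeffs[(i + 1, i + 1)] = A[i][i]
        for j in range(i + 1, n):
            # Mirrored off-diagonal entries add up to the cross-term coefficient.
            coeffs[(i + 1, j + 1)] = A[i][j] + A[j][i]
    return coeffs


A = [[1, -2, 3], [-2, 4, 1], [3, 1, 2]]
coefficients = form_coefficients(A)
for (i, j), value in sorted(coefficients.items()):
    print(f"coef. of x{i}x{j} = {value}")
```

The routine adds mirrored entries rather than doubling one of them, so it gives the same answer for any matrix generating the form, symmetric or not.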

Example 6: Find the symmetric matrix associated with the form

$$2x_1^2 - x_2^2 + x_3^2 + 2x_1x_2 - 6x_1x_3.$$

Solution: Using the rule (6), we can write the matrix as $\begin{bmatrix} 2 & 1 & -3 \\ 1 & -1 & 0 \\ -3 & 0 & 1 \end{bmatrix}$.

∗∗∗
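
Rule (6) also works in the reverse direction, from the coefficients to the symmetric matrix. The sketch below is a hedged illustration: `matrix_of_form` is a hypothetical helper of our own, the form is assumed to be given as a dictionary from index pairs (i, j), i ≤ j, to integer coefficients, and exact fractions are used so that halving an odd coefficient causes no rounding.

```python
from fractions import Fraction


def matrix_of_form(coeffs, n):
    """Build the symmetric n x n matrix of a form from its coefficients.

    coeffs maps 1-based pairs (i, j) with i <= j to the coefficient of x_i x_j.
    """
    A = [[Fraction(0) for _ in range(n)] for _ in range(n)]
    for (i, j), c in coeffs.items():
        if i == j:
            A[i - 1][i - 1] += Fraction(c)
        else:
            # Split the cross-term coefficient equally between mirrored entries.
            half = Fraction(c, 2)
            A[i - 1][j - 1] += half
            A[j - 1][i - 1] += half
    return A


# The form of Example 6: 2x1^2 - x2^2 + x3^2 + 2x1x2 - 6x1x3.
example6 = {(1, 1): 2, (2, 2): -1, (3, 3): 1, (1, 2): 2, (1, 3): -6}
M = matrix_of_form(example6, 3)
for row in M:
    print([str(entry) for entry in row])
```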

Example 7: Find the quadratic form associated with the zero matrix of order
three.

Solution: All the entries of a zero matrix are zero. Therefore, using (6), we
get all the coefficients to be zero. The associated quadratic form is, then,

$$0x_1^2 + 0x_2^2 + 0x_3^2 + 0x_1x_2 + 0x_1x_3 + 0x_2x_3,$$

which is the zero quadratic form of order three.

∗∗∗

Example 8: Consider the general diagonal matrix of order three, $\begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}$.
What is the associated quadratic form?

Solution: The associated quadratic form is the diagonal form $\lambda_1x_1^2 + \lambda_2x_2^2 + \lambda_3x_3^2$.

∗∗∗

The following exercises deal with quadratic forms of orders 2 and 3.

E7) Write the following quadratic forms as $X^t A X$, where A is a symmetric matrix.

a) $7x^2 + 7y^2 - 2z^2 + 20yz - 20zx - 2xy$ (in ℝ³)
b) $x_1^2 + x_2^2 - x_1x_2$ (in ℝ²)
c) $x_1^2 - 2x_1x_2$ (in ℝ³)
d) $2yz + 2zx$ (in ℝ³)

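E8) Expand $X^t A X$ as a polynomial, where $X^t = [x, y, z]$ and A is

a) $\begin{bmatrix} a & h & g \\ h & b & f \\ g & f & c \end{bmatrix}$, b) $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$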

Can we extend the comments about quadratic forms of order two and three to a quadratic form of any finite order n? Yes. You know that a general quadratic form of order n is given by

$$Q = \sum_{i,j=1}^{n} a_{ij} x_i x_j, \text{ where } a_{ij} = a_{ji} \ \forall\ i, j = 1, \ldots, n.$$

The associated symmetric matrix A of order n will be

$$A = \begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{bmatrix}, \text{ where } a_{ij} = a_{ji} \ \forall\ i, j = 1, \ldots, n.$$

Thus, Q can be written as

$$Q = X^t A X, \text{ where } X^t = [x_1\ x_2\ \ldots\ x_n].$$

So, there is a one-to-one correspondence between the set of all symmetric matrices of order n and the set of quadratic forms of order n. Under this correspondence the matrix A corresponds to the quadratic form $X^t A X$. The following exercise illustrates this for order 4.
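The correspondence singles out the symmetric matrix because a non-symmetric matrix M defines the same polynomial as its symmetric part $(M + M^t)/2$, since $x_ix_j = x_jx_i$. The sketch below illustrates this in Python; the matrix `M` and the helper names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def symmetrize(M):
    """Return (M + M^t) / 2 with exact rational entries."""
    n = len(M)
    return [[Fraction(M[i][j] + M[j][i], 2) for j in range(n)] for i in range(n)]


def evaluate(M, x):
    """Evaluate X^t M X as a double sum."""
    n = len(M)
    return sum(M[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


M = [[1, 4],
     [0, 3]]            # a non-symmetric matrix, made up for this sketch
S = symmetrize(M)       # its symmetric part
print(S)
print(evaluate(M, [2, 5]), evaluate(S, [2, 5]))
```

Both evaluations agree at every point, so M and S represent the same quadratic form, while only S is symmetric.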

E9) Expand $X^t A X$ as a polynomial, where $X^t = [x_1, x_2, x_3, x_4]$ and

$$A = \begin{bmatrix} 4 & 0 & 0 & 2 \\ 2 & 1 & 0 & 0 \\ 3 & 6 & 0 & 1 \\ 0 & 0 & 0 & 4 \end{bmatrix}.$$

Find the symmetric matrix A′ such that $X^t A X = X^t A' X$.

Before going further, we would like to remind you that the quadratic form of order n, $X^t A X$, is simply the inner product $\langle AX, X \rangle$ in $V_n(\mathbb{R})$.

Let us now see what happens to the matrix of a quadratic form if we change
the basis of the underlying vector space.

16.4 TRANSFORMATION OF A QUADRATIC FORM UNDER A CHANGE OF BASIS

In the previous section you have seen that a quadratic form Q of order n can be expressed as $X^t A X$, where $X^t = [x_1, x_2, \ldots, x_n]$ and A is a real symmetric matrix of order n. Now, $x_1, x_2, \ldots, x_n$ are the components (or the coordinates) of the vector X with respect to a preassigned basis $\{e_1, e_2, \ldots, e_n\}$ of ℝⁿ. If we change the basis of ℝⁿ from $B = \{e_1, e_2, \ldots, e_n\}$ to another basis $B' = \{e'_1, \ldots, e'_n\}$, the components of X will also change. Therefore, the expression of the quadratic form Q will also change. We show that, under a change of basis, the quadratic form changes according to a certain transformation law.

Let P be the matrix of the change of basis from B to B′. Then $P = [a_{ij}]$, where

$$e'_j = \sum_{i=1}^{n} a_{ij} e_i.$$

Previously, we have seen that P is invertible. Note that the columns of P are
the components of the vectors of the new basis B′ , expressed in terms of the
original basis B.

Now, if $X^t = [x_1, \ldots, x_n]$ and $Y^t = [y_1, \ldots, y_n]$ denote the coordinates of a vector in ℝⁿ with respect to B and B′, respectively, then

$$\sum_{i=1}^{n} x_i e_i = \sum_{j=1}^{n} y_j e'_j = \sum_{i,j=1}^{n} y_j a_{ij} e_i.$$

Since $\{e_1, \ldots, e_n\}$ is a basis, we get

$$x_i = \sum_{j=1}^{n} a_{ij} y_j \ \forall\ i = 1, 2, \ldots, n.$$

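This is equivalent to the matrix equation

$$\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} a_{11} & \ldots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \ldots & a_{nn} \end{bmatrix} \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix},$$

i.e., X = PY.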

This equation is the coordinate transformation corresponding to the change of basis from B to B′. The change of basis will convert the quadratic form $X^t A X$ into

$$(PY)^t A (PY) = Y^t (P^t A P) Y = Y^t C Y, \text{ where } C = P^t A P.$$
But is C symmetric? Well, since A is symmetric, $C^t = (P^t A P)^t = P^t A^t (P^t)^t = P^t A P = C$. ∴ C is symmetric.

The above discussion shows that, under a change of basis given by the
invertible matrix P, the coordinate transformation is given by X = PY, and the
quadratic form Xt AX gets transformed into another quadratic form Yt CY, where
C = Pt AP. This leads us to the following definitions.

Definition 2: Two real symmetric matrices A and B are called congruent if there exists an invertible real matrix P such that $B = P^t A P$.

Two quadratic forms $X^t A X$ and $Y^t B Y$ are called equivalent if their matrices, A and B, are congruent.

In particular, if the matrices A and B are orthogonally similar (see Unit 15) then
the corresponding quadratic forms, Xt AX and Yt BY are called orthogonally
equivalent.

So, under a change of basis, a quadratic form gets transformed to an equivalent quadratic form. They may or may not be orthogonally equivalent. Let us look at an example.

Example 9: Consider the change of basis of ℝ² from the standard basis $B_1 = \{(1, 0), (0, 1)\}$ to $B_2 = \{(1, 0), (1, 2)\}$. Let $(x_1, x_2)$ and $(y_1, y_2)$ represent coordinates with respect to $B_1$ and $B_2$, respectively.

a) Find the coordinate transformation that expresses $x_1, x_2$ in terms of $y_1, y_2$.

b) Let $Q(X) = x_1^2 - 2x_1x_2 + 4x_2^2$. Find the expression of Q in terms of $y_1$ and $y_2$.

Solution:

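a) The change of basis from $B_1$ to $B_2$ is given by the coordinate transformation

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \text{ or } X = PY, \text{ say.} \tag{7}$$

(Remember that the columns of P are the components of the new basis vectors expressed in terms of the old basis.) From (7),

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} y_1 + y_2 \\ 2y_2 \end{bmatrix},$$

i.e., $x_1 = y_1 + y_2$, $x_2 = 2y_2$,

which is the required coordinate transformation.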

b) Now

$$Q(X) = [x_1, x_2] \begin{bmatrix} 1 & -1 \\ -1 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = X^t A X, \text{ say.} \tag{8}$$

Using (7),

$$Q(Y) = Y^t (P^t A P) Y, \tag{9}$$

where

$$P^t A P = \begin{bmatrix} 1 & 0 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ -1 & 4 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ -1 & 13 \end{bmatrix}.$$

Using this in (9), we get

$$Q(Y) = y_1^2 - 2y_1y_2 + 13y_2^2. \tag{10}$$

Thus, under the change of basis given by X = PY, the given quadratic form transforms into (10).

∗∗∗
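The computation of $P^t A P$ in Example 9 can be reproduced with a few lines of Python. This is only a sketch with hand-written helpers (`transpose`, `matmul` and `congruent_transform` are names introduced here, not taken from the unit); it uses integer arithmetic, so the result is exact.

```python
def transpose(M):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*M)]


def matmul(A, B):
    """Return the matrix product AB."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def congruent_transform(A, P):
    """Return C = P^t A P, the matrix of the form after X = PY."""
    return matmul(matmul(transpose(P), A), P)


A = [[1, -1],
     [-1, 4]]    # matrix of Q in Example 9
P = [[1, 1],
     [0, 2]]     # columns are the new basis vectors (1, 0) and (1, 2)
C = congruent_transform(A, P)
print(C)
```

The same helper applies to any invertible P, and the output is symmetric whenever A is.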

The following exercises will give you some more practice in dealing with
quadratic forms under a change of basis.

E10) Verify that the matrix P in Example 9 is not orthogonal. (Therefore, (7) is not an orthogonal transformation. Therefore, (8) and (10) are equivalent, but not orthogonally equivalent.)

E11) Consider the quadratic form given in Example 9. Replace $B_2$ by $\{(1, 0), (1, 1)\}$. Is this change of basis orthogonal? Find the quadratic form with respect to the new basis $B_2$.

E12) Let a quadratic form have the expression

$$7x^2 + 52xy - 32y^2$$

with respect to the standard basis $B_1 = \{(1, 0), (0, 1)\}$ of ℝ². Find its expression with respect to the basis $B_2 = \{(2, 1), (1, -2)\}$.

Now let us see what we mean by the rank of a quadratic form.

16.5 RANK OF A QUADRATIC FORM

You have already studied the rank of a matrix. Here we will discuss the rank of a quadratic form. Since quadratic forms are closely associated with matrices, the concept of the rank of a matrix can be used to define the rank of a quadratic form. But first we shall prove the following result.

Theorem 1: Congruent matrices have the same rank.

Proof: Let A and B be congruent matrices. Then there is a non-singular matrix P such that $B = P^t A P$.

Recall that multiplication by a non-singular matrix does not change the rank of
a matrix. Therefore,

rank (B) = rank (Pt AP) = rank (A),

which proves the theorem. ■

We are now all set to define the rank of a quadratic form.

Definition 3: The rank of a quadratic form is the rank of its associated matrix.

You may think that this definition is not meaningful, because the associated
matrix depends on the basis of the vector space. But Theorem 2 will assure us
that the definition is meaningful. So, basically if we find the rank of any
associated matrix, that rank will be the rank of the given quadratic form. (See
Example 10(b)).

Theorem 2: The rank of a quadratic form does not change under a change of
basis.

Proof: Let $Q(X) = X^t A X$ be a quadratic form of rank r. Under a change of basis, let X = PY. Then $Q(Y) = Y^t (P^t A P) Y$, and

$$\text{rank } Q(X) = \text{rank } A = \text{rank } (P^t A P) \ \text{(by Theorem 1)} = \text{rank } Q(Y).$$

Thus, we have proved the theorem. ■

Try the following simple exercise.

E13) Verify that the rank of a diagonal form is the number of non-zero terms in
its expression.

Now let us obtain the ranks of some more quadratic forms.

Example 10: Consider the quadratic form

$$Q(X) = 2x_1^2 + 2x_1x_2 + 2x_2^2 - 6x_1x_3 - 6x_2x_3 + 6x_3^2,$$

where $[x_1, x_2, x_3]$ are the coordinates of X with respect to the standard basis of ℝ³.

a) Find the expression of Q with respect to the basis

$$B = \left\{ \left( \tfrac{1}{\sqrt{6}}, \tfrac{1}{\sqrt{6}}, \tfrac{-2}{\sqrt{6}} \right), \left( \tfrac{1}{\sqrt{2}}, \tfrac{-1}{\sqrt{2}}, 0 \right), \left( \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}} \right) \right\}.$$

b) What is the rank of Q?
Solution:

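a) Let $Y^t = [y_1, y_2, y_3]$ denote the coordinates with respect to the new basis B. Then, the change of coordinates is given by

$$X = \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{6}} & \frac{-1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\ \frac{-2}{\sqrt{6}} & 0 & \frac{1}{\sqrt{3}} \end{bmatrix} Y = PY \ \text{(say)}.$$

The given quadratic form can be written as $X^t A X$, where

$$A = \begin{bmatrix} 2 & 1 & -3 \\ 1 & 2 & -3 \\ -3 & -3 & 6 \end{bmatrix}.$$

The change of coordinates given by X = PY will convert $X^t A X$ into $Y^t (P^t A P) Y$, where

$$P^t A P = \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}} & \frac{-2}{\sqrt{6}} \\ \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{bmatrix} \begin{bmatrix} 2 & 1 & -3 \\ 1 & 2 & -3 \\ -3 & -3 & 6 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{6}} & \frac{-1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\ \frac{-2}{\sqrt{6}} & 0 & \frac{1}{\sqrt{3}} \end{bmatrix} = \begin{bmatrix} 9 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$

Using this, we get $Q(Y) = 9y_1^2 + y_2^2$, which is the required quadratic form. Note that P is an orthogonal matrix. ∴ Q(X) and Q(Y) are orthogonally equivalent.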

b) Now, let us obtain rank (Q) directly. We know that rank (A) = 2.
∴ rank (Xt AX) = 2, i.e., the rank of Q is 2.

Another way of showing that rank Q(X) = 2 is as follows: Q(X) and Q(Y)
are equivalent, and the rank of the diagonal quadratic form Q(Y) is two. ∴,
rank of Q(X) is also two.

∗∗∗
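The rank found in Example 10(b) can also be obtained by row reduction. The following Python sketch is an illustration only: the function `rank` is a hand-written Gaussian elimination over exact fractions, introduced here and not part of the unit.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix via Gaussian elimination with exact arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0                                   # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


A = [[2, 1, -3],
     [1, 2, -3],
     [-3, -3, 6]]           # matrix of Q(X) in Example 10
D = [[9, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]             # matrix of Q(Y) in Example 10
print(rank(A), rank(D))
```

Both matrices have the same rank, as Theorem 2 requires for two expressions of the same quadratic form.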

The following exercise will give you some practice in obtaining the rank of a
quadratic form.

E14) Find the rank of the following quadratic forms in ℝ³.

a) $5x^2 + 6y^2 + 7z^2 - 4xy - 4yz$
b) $x^2 + y^2 + z^2 + 2xy + 2yz + 2xz$
c) $2x^2 + 2y^2 + 2z^2 + 2xy + 2yz + 2xz$
d) $x^2 - y^2$

We shall now use results of Units 14 and 15 to establish a method to reduce a quadratic form into a diagonal form, by using a suitable orthogonal change of basis.

16.6 ORTHOGONAL CANONICAL REDUCTION
Recall from Unit 15 that for any real symmetric matrix A, we can construct an orthogonal matrix R whose columns are a set of orthonormal eigenvectors (say, $U_1, U_2, \ldots, U_n$) of A such that

$$R^t A R = \text{diag}(\lambda_1, \ldots, \lambda_n), \tag{11}$$

$\lambda_1, \ldots, \lambda_n$ being the eigenvalues of A corresponding to the eigenvectors $U_1, \ldots, U_n$, respectively.

Remember, R may not be unique. This could be due to two factors:

i) Changing the order in which eigenvectors are taken will change R.

ii) An orthonormal eigenvector corresponding to an eigenvalue need not be unique.

We shall now use the relation (11) to transform any quadratic form to a
diagonal form.

Let A be the matrix of a quadratic form with respect to a pre-assigned basis. Let R be an orthogonal matrix obtained from A as indicated above. Now consider the change of basis from the pre-assigned basis to the basis $\{U_1, U_2, \ldots, U_n\}$. The coordinate transformation will be given by

$$X = RY, \tag{12}$$

$Y^t = [y_1, y_2, \ldots, y_n]$ being the coordinates with respect to the new basis. R being orthogonal, (12) is an orthogonal transformation which will convert $X^t A X$ into

$$Y^t (R^t A R) Y = \lambda_1 y_1^2 + \cdots + \lambda_n y_n^2, \tag{13}$$

because of (11).

Thus $X^t A X$ is orthogonally equivalent to the diagonal form in (13), whose coefficients are the eigenvalues of A. The form in (13) is called an orthogonal canonical reduction of $X^t A X$.

We say that the orthogonal transformation (12) has reduced the quadratic form $X^t A X$ into its orthogonal canonical form, given by (13). The form in (13) is called orthogonal since the transformation used to convert $X^t A X$ into it is orthogonal. It is called canonical as the reduced form is the simplest orthogonal reduction of $X^t A X$. The elements of the basis which diagonalise the quadratic form (in this case they are $U_1, \ldots, U_n$) are called the principal axes of the quadratic form.

We can summarise the above discussion in the form of a theorem.

Theorem 3: A real quadratic form $X^t A X$ can always be reduced to the diagonal form

$$\lambda_1 y_1^2 + \cdots + \lambda_n y_n^2$$

by an orthogonal change of basis, where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of A. The new ordered basis is an orthonormal set of eigenvectors corresponding to the eigenvalues $\lambda_1, \ldots, \lambda_n$.

Now, if the matrix of a quadratic form is orthogonally similar to $\text{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, it is also orthogonally similar to $\text{diag}(\lambda_2, \lambda_1, \ldots, \lambda_n)$. Thus, the orthogonal canonical form to which a quadratic form is orthogonally equivalent is unique except for the order of the coefficients. If we insist that the non-zero eigenvalues be written in decreasing order followed by the zero eigenvalues, if any, then we can obtain a unique orthogonal canonical form.

So, we can state the following result.

Theorem 4: A quadratic form of rank r is orthogonally equivalent to a unique orthogonal canonical form $\lambda_1 y_1^2 + \cdots + \lambda_r y_r^2$, where $\lambda_1, \ldots, \lambda_r$ are the non-zero eigenvalues of the matrix of the quadratic form, such that $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_r$.

Proof: Let $X^t A X$ be a quadratic form of rank r. Then rank (A) = r. Therefore, A has r non-zero eigenvalues. We write them as $\lambda_1, \ldots, \lambda_r$, in decreasing order. Now, by Theorem 3 we get the required result. ■

So far we have spoken about the orthogonal canonical form in an abstract way.
Let us now look at a practical method of reducing a quadratic form to its
orthogonal canonical form.

Step by step procedure for orthogonal canonical reduction: We will now give the sequence of operations which are needed to reduce a given quadratic form to its orthogonal canonical form, and to obtain the required coordinate transformation or the new basis.

1. Construct the symmetric matrix A associated with the given quadratic form $\sum_{i,j=1}^{n} a_{ij} x_i x_j$.

2. Form the characteristic equation $\det(A - \lambda I) = 0$ and find the eigenvalues of A. Let $\lambda_1, \ldots, \lambda_r$ be the non-zero eigenvalues arranged in decreasing order, i.e., $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_r$.

3. An orthogonal canonical reduction of the given quadratic form is $\lambda_1 y_1^2 + \cdots + \lambda_r y_r^2$.

4. Obtain an ordered system of n orthonormal vectors $U_1, \ldots, U_n$ consisting of eigenvectors corresponding to the eigenvalues $\lambda_1, \ldots, \lambda_n$ (here $\lambda_{r+1} = \cdots = \lambda_n = 0$). Note that for repeated eigenvalues also we must obtain mutually orthogonal (and hence linearly independent) eigenvectors.

5. Construct the orthogonal matrix P whose columns are the eigenvectors $U_1, \ldots, U_n$.

6. The required change of basis is given by X = PY.

7. The new basis $\{U_1, U_2, \ldots, U_n\}$ is called the canonical basis and its elements are the principal axes of the given quadratic form.

In step 2 you are required to find the eigenvalues, i.e., the roots of the characteristic equation. In a realistic situation the roots can be irrational numbers and we may have to use numerical methods to determine such roots. We have avoided irrational numbers by carefully selecting the quadratic forms in our examples and exercises so that the roots of the characteristic equations are rational numbers.
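For a form of order two, the steps above can be sketched numerically. The Python function below is an illustration under stated assumptions, not the unit's own method: it handles only a 2 × 2 symmetric matrix $\begin{bmatrix} a & b \\ b & d \end{bmatrix}$, uses the closed-form roots of the characteristic equation, and works in floating point with a hypothetical tolerance `tol`. The name `orthogonal_reduction_2x2` is introduced here for illustration.

```python
import math


def orthogonal_reduction_2x2(a, b, d, tol=1e-12):
    """Orthogonally reduce a*x1^2 + 2*b*x1*x2 + d*x2^2.

    Returns ((lam1, lam2), P), where the columns of P are orthonormal
    eigenvectors of [[a, b], [b, d]] for lam1 and lam2, respectively.
    """
    # Step 2: roots of the characteristic equation, with lam1 >= lam2.
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    lam1, lam2 = mean + radius, mean - radius

    # Step 4: an eigenvector for lam1.
    if abs(b) > tol:
        u1 = (b, lam1 - a)           # solves (A - lam1*I) u = 0
    elif a >= d:
        u1 = (1.0, 0.0)              # A is already diagonal
    else:
        u1 = (0.0, 1.0)
    norm = math.hypot(u1[0], u1[1])
    u1 = (u1[0] / norm, u1[1] / norm)
    u2 = (-u1[1], u1[0])             # unit vector orthogonal to u1

    # Non-zero eigenvalues come first, as required in step 2.
    if abs(lam1) <= tol < abs(lam2):
        lam1, lam2, u1, u2 = lam2, lam1, u2, u1

    # Step 5: P has the eigenvectors as its columns.
    P = [[u1[0], u2[0]], [u1[1], u2[1]]]
    return (lam1, lam2), P


# The form 5*x1^2 - 6*x1*x2 + 5*x2^2 has a = 5, b = -3, d = 5.
lams, P = orthogonal_reduction_2x2(5, -3, 5)
print(lams)
print(P)
```

The second eigenvector is taken as the rotation of the first through a right angle, which is enough in two dimensions because eigenvectors of a symmetric matrix for distinct eigenvalues are orthogonal. The signs of the columns of P are one possible choice among several.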

To clarify the procedure given above we present some examples and exercises.

Example 11: Obtain the unique orthogonal canonical form of the quadratic form $5x_1^2 - 6x_1x_2 + 5x_2^2$.

Also give the associated coordinate transformation, canonical basis and principal axes of the given form.

Solution: The matrix of this quadratic form is $A = \begin{bmatrix} 5 & -3 \\ -3 & 5 \end{bmatrix}$.

The eigenvalues of A are given by $\begin{vmatrix} \lambda - 5 & 3 \\ 3 & \lambda - 5 \end{vmatrix} = 0$, i.e.,

$$\lambda^2 - 10\lambda + 16 = 0 \Rightarrow \lambda = 8, 2.$$

Thus, the required orthogonal canonical reduction will be $8y_1^2 + 2y_2^2$.

The normalised eigenvectors corresponding to the eigenvalues 8 and 2 are $U_1$ and $U_2$, where

$$U_1 = \begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \text{ and } U_2 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}.$$

Thus, the new orthonormal basis is $\{U_1, U_2\}$, which is the canonical basis. $U_1$ and $U_2$ are the principal axes of the given form.

The associated coordinate transformation will be

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix},$$

i.e., $x_1 = \frac{1}{\sqrt{2}}(-y_1 + y_2)$, $x_2 = \frac{1}{\sqrt{2}}(y_1 + y_2)$.
∗∗∗

Remark 2: Remember that the choice of normalised eigenvectors is not unique. You could as well have taken $-U_1$ or $-U_2$, instead of $U_1$ and $U_2$, respectively.


E15) In Example 11 take the normalised eigenvectors corresponding to 8 and 2 to be $-U_1$ and $-U_2$, respectively. Find the coordinate transformation needed for the orthogonal canonical reduction.

Now we look at an example in which the associated matrix has repeated eigenvalues.

Example 12: Consider the quadratic form

$$x^2 + y^2 + z^2 + 2xy + 2xz + 2yz. \tag{14}$$

Find its orthogonal canonical reduction and the corresponding new basis.

Solution: The matrix of (14) is $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$.

The eigenvalues of A are 3, 0, 0. Thus, the orthogonal canonical reduction of (14) is

$$3x_1^2, \tag{15}$$

where $x_1, y_1, z_1$ are the new coordinates.

A normalised eigenvector corresponding to the eigenvalue 3 is $\begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix}$.

Eigenvectors corresponding to the eigenvalue 0 are given by

$$\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},$$

i.e.,

$$x + y + z = 0. \tag{16}$$

Here we can choose any two mutually orthogonal normalised vectors satisfying (16). Let us choose $\begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 1/\sqrt{6} \\ 1/\sqrt{6} \\ -2/\sqrt{6} \end{bmatrix}$.

The new basis, in this case, is

$$\left\{ \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix}, \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \\ 0 \end{bmatrix}, \begin{bmatrix} 1/\sqrt{6} \\ 1/\sqrt{6} \\ -2/\sqrt{6} \end{bmatrix} \right\},$$

which is the canonical basis. Its elements are the principal axes of (14). The change of basis needed to convert (14) into (15) is given by

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & -1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & 0 & -2/\sqrt{6} \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix}.$$

We again observe that the canonical basis, principal axes and the coordinate transformation needed for reduction are not uniquely determined. We could have chosen any two mutually orthogonal normalised eigenvectors corresponding to 0.

∗∗∗
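One systematic way to pick two mutually orthogonal normalised vectors in the plane x + y + z = 0 of Example 12 is the Gram-Schmidt process from Unit 14, applied to any two linearly independent solutions. The Python sketch below is an illustration; the starting vectors (1, −1, 0) and (1, 0, −1) are an arbitrary choice made here, and `gram_schmidt` is a name introduced for this sketch.

```python
import math


def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Orthonormalise a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            proj = dot(v, e)                         # component of v along e
            w = [wi - proj * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis


# Two independent solutions of x + y + z = 0.
e1, e2 = gram_schmidt([(1, -1, 0), (1, 0, -1)])
print(e1)
print(e2)
```

With this starting pair the process returns, up to rounding, the two eigenvectors chosen in Example 12. A different starting pair would give a different but equally valid canonical basis.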

The next few exercises will give you some practice in applying the procedure of
reduction.

E16) Find the orthogonal canonical forms to which the following quadratic forms can be reduced by means of an orthogonal change of basis. Also obtain a set of principal axes for them.

a) $x^2 + 4xy + y^2$
b) $8x^2 - 4xy + 5y^2$
c) $3x_2^2 + 3x_3^2 + 4x_1x_2 + 4x_1x_3 - 2x_2x_3$

E17) Which of the following quadratic forms are orthogonally equivalent?

a) $9x_2^2 + 9x_3^2 + 12x_1x_2 + 12x_1x_3 - 6x_2x_3$
b) $-3y_1^2 + 6y_2^2 + 6y_3^2 - 12y_1y_2 + 12y_1y_3 + 6y_2y_3$
c) $11z_1^2 - 4z_2^2 + 11z_3^2 + 8z_1z_2 - 2z_1z_3 + 8z_2z_3$

E18) Show that the quadratic forms $x^2 - 2y^2 + z^2$ and $z_1^2 - 2x_1^2 + y_1^2$ are orthogonally equivalent. Find the orthogonal transformation which will transform the first of these into the second.

We will now try to reduce the matrix of a quadratic form to a diagonal form
whose diagonal elements are only 1, −1 or 0.

16.7 NORMAL CANONICAL FORM


If we do not restrict ourselves to an orthogonal change of basis, then we can reduce a quadratic form to a simpler form than the one we considered in the previous section. In this simpler version the coefficients of the reduced form are ±1 or zero.

Let

$$X^t A X = \sum_{i,j=1}^{n} a_{ij} x_i x_j \tag{17}$$

be a quadratic form of order n. From Theorem 4 we know that $X^t A X$ can be reduced to its unique orthogonal canonical form

$$\lambda_1 y_1^2 + \cdots + \lambda_r y_r^2, \tag{18}$$

where $\lambda_1, \ldots, \lambda_r$ are the non-zero eigenvalues of A such that $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_r$. Thus, rank (A) = r or, equivalently, the rank of (17) is r.

Now consider the coordinate transformation

$$\left. \begin{aligned} z_i &= \sqrt{|\lambda_i|}\, y_i, & i &= 1, 2, \ldots, r \\ z_i &= y_i, & i &= r + 1, \ldots, n \end{aligned} \right\} \tag{19}$$
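This is a non-singular transformation which will convert (18) into

$$\text{sign}(\lambda_1) z_1^2 + \cdots + \text{sign}(\lambda_n) z_n^2, \quad \text{i.e.,} \quad \sum_{i=1}^{n} \text{sign}(\lambda_i)\, z_i^2, \tag{20}$$

where, for α ∈ ℝ,

$$\text{sign}(\alpha) = \begin{cases} 1, & \text{if } \alpha > 0 \\ -1, & \text{if } \alpha < 0 \\ 0, & \text{if } \alpha = 0. \end{cases}$$

Remember, $\text{sign}(\lambda_{r+1}) = \cdots = \text{sign}(\lambda_n) = 0$.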

Thus, by two successive transformations, one orthogonal and the other non-singular, we have reduced the given quadratic form to a diagonal form (20) of order n whose coefficients are ±1 or 0. We call the form (20) the normal canonical form of the quadratic form (17). We give the following definition.

Definition 4: A diagonal quadratic form, whose coefficients are ±1 or 0, is called a normal canonical form.

For example, $x^2 - y^2$ is a normal canonical form, but $2x^2 + y^2$ is not.

The procedure involved in transforming (17) to (20) is described as reducing a quadratic form to its normal canonical form.

E19) The transformation (19) is not, in general, an orthogonal transformation. Under what conditions will it become orthogonal?

We can sum up the above discussion in the following theorem.

Theorem 5: A real quadratic form can always be reduced to a normal canonical form by a suitable non-singular transformation.

Let us now look at some examples that will help you in understanding the
procedure.

Example 13: Reduce the quadratic form

$$5x_1^2 - 6x_1x_2 + 5x_2^2 \tag{21}$$

to a normal canonical form.

Solution: From Example 11 we know that (21) can be reduced to

$$8y_1^2 + 2y_2^2. \tag{22}$$

Now consider the coordinate transformation

$$z_1 = \sqrt{8}\, y_1, \quad z_2 = \sqrt{2}\, y_2,$$

i.e., $Z = \begin{bmatrix} \sqrt{8} & 0 \\ 0 & \sqrt{2} \end{bmatrix} Y$, where $Z = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$ and $Y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$.

This transformation, which is non-singular but not orthogonal, will convert (22) into $z_1^2 + z_2^2$, which is the required normal canonical form.

∗∗∗
Example 14: Reduce the diagonal form $2x_1^2 - 3x_2^2 - 7x_3^2$ into its normal canonical form.

Solution: Consider the transformation

$$y_1 = \sqrt{2}\, x_1, \quad y_2 = \sqrt{3}\, x_2, \quad y_3 = \sqrt{7}\, x_3,$$

i.e., $Y = \begin{bmatrix} \sqrt{2} & 0 & 0 \\ 0 & \sqrt{3} & 0 \\ 0 & 0 & \sqrt{7} \end{bmatrix} X$.

This will convert the given diagonal form into $y_1^2 - y_2^2 - y_3^2$, which is the required normal canonical form.

∗∗∗
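The scaling used in Examples 13 and 14 follows the pattern of (19): each coordinate is multiplied by the square root of the absolute value of its non-zero coefficient, and the coefficient is replaced by its sign. A minimal Python sketch of this step, assuming the form is already diagonal, is given below; `normal_canonical` is an illustrative name, not something defined in the unit.

```python
import math


def normal_canonical(diag_coeffs):
    """For a diagonal form sum(c_i * x_i^2), return (scales, signs).

    scales[i] is the factor in y_i = scales[i] * x_i, and signs[i] is the
    coefficient (+1, -1 or 0) of y_i^2 in the normal canonical form.
    """
    scales = [math.sqrt(abs(c)) if c != 0 else 1.0 for c in diag_coeffs]
    signs = [(c > 0) - (c < 0) for c in diag_coeffs]
    return scales, signs


coeffs = [2, -3, -7]                 # the diagonal form of Example 14
scales, signs = normal_canonical(coeffs)
print(signs)

# Check at a sample point that both expressions give the same value.
x = [1.0, 2.0, 3.0]
original = sum(c * xi * xi for c, xi in zip(coeffs, x))
reduced = sum(s * (k * xi) ** 2 for s, k, xi in zip(signs, scales, x))
print(original, reduced)
```

Zero coefficients are left unscaled, matching the second line of (19).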

Try the following exercise now.

E20) Reduce the following quadratic forms to their normal canonical forms.

a) $8x^2 - 4xy + 5y^2$
b) $2y^2 - 2yz + 2zx - 2xy$

E21) Show that the rank of a normal canonical form is the number of non-zero
terms in its expression.

E22) Show that a quadratic form and its normal canonical reduction have the
same rank.

In view of the above exercises, a normal canonical reduction of a quadratic form of rank r has the form

$$y_1^2 + \cdots + y_p^2 - y_{p+1}^2 - \cdots - y_r^2,$$

where p is the number of positive terms in the reduced form.

But is a normal canonical reduction of a quadratic form unique? In other words, is the number of positive terms in a normal canonical reduction of a quadratic form uniquely determined? We answer this question in the following theorem, due to the English mathematician J.J. Sylvester (1814-1897).

Theorem 6 (Sylvester): The number of positive terms in a normal canonical reduction of a quadratic form is uniquely determined. Consequently, a quadratic form of rank r has a unique normal canonical reduction

$$y_1^2 + \cdots + y_p^2 - y_{p+1}^2 - \cdots - y_r^2.$$

Proof: Let Q be a quadratic form of order n and rank r. Let $\{u_1, \ldots, u_n\}$ be a basis of ℝⁿ in which Q is represented by

$$Q(X) = x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_r^2, \tag{23}$$

where $X = \sum_{i=1}^{n} x_i u_i$.

Let $\{v_1, \ldots, v_n\}$ be another basis of ℝⁿ in which Q is represented by

$$Q(Y) = y_1^2 + \cdots + y_{p'}^2 - y_{p'+1}^2 - \cdots - y_r^2, \tag{24}$$

where $Y = \sum_{i=1}^{n} y_i v_i$.

Thus, (23) and (24) are both normal canonical reductions of Q, in which the numbers of positive terms are p and p′, respectively. To prove the theorem we have to prove that p = p′. Let U and V be the subspaces of ℝⁿ generated by $\{u_1, \ldots, u_p\}$ and $\{v_{p'+1}, \ldots, v_n\}$, respectively.

Thus, dim U = p and dim V = n − p′. We will show that U ∩ V = {0}.

Suppose U ∩ V ≠ {0}. Let 0 ≠ u ∈ U ∩ V.

Now, since u ∈ U and u ≠ 0, we have

$$u = a_1 u_1 + \cdots + a_p u_p, \quad a_i \in \mathbb{R} \ \forall\ i, \text{ where } a_i \neq 0 \text{ for some } i.$$

Therefore, from (23),

$$Q(u) = a_1^2 + \cdots + a_p^2 > 0. \tag{25}$$

Also, since u ∈ V, we have

$$u = b_{p'+1} v_{p'+1} + \cdots + b_n v_n, \quad b_i \in \mathbb{R} \ \forall\ i, \ b_i \neq 0 \text{ for some } i.$$

∴ from (24) we get

$$Q(u) = -b_{p'+1}^2 - \cdots - b_r^2 \leq 0. \tag{26}$$

(25) and (26) bring us to a contradiction. ∴ our supposition must be wrong.

∴ U ∩ V = {0}.

At this stage, recall that

dim U + dim V − dim (U ∩ V) = dim (U + V) .

Therefore, p + n − p′ = dim (U + V) ≤ dim(ℝn ) = n, as U + V ⊆ ℝn .

⇒ p + n − p′ ≤ n

⇒ p ≤ p′ …(27)

Interchanging the roles of p and p′ in the above argument, we get

p′ ≤ p …(28)

(27) and (28) show that p = p′ , which proves the theorem. ■

By Theorem 1 and Sylvester's theorem, the rank r and the number p remain unchanged under a change of basis, i.e., under a non-singular transformation. Hence, the number 2p − r also remains unchanged.

Definition 5: The signature of a quadratic form is defined to be (the number of positive terms) − (the number of negative terms) appearing in its normal canonical reduction. It is denoted by the letter s.

Thus, s = p − (r − p) = 2p − r.

For example, for the form in Example 13, we have p = 2, r = 2 and s = 2. For the form in Example 14, p = 1, r = 3, s = −1.
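Since rescaling a coordinate does not change the sign of its coefficient, the numbers p, r and s can be read off any diagonal form equivalent to the given one, without carrying out the normalisation. The short Python sketch below assumes the form has already been diagonalised; `rank_and_signature` is a name introduced for this illustration.

```python
def rank_and_signature(diag_coeffs):
    """Return (r, s) for the diagonal form sum(c_i * y_i^2)."""
    p = sum(1 for c in diag_coeffs if c > 0)   # number of positive terms
    q = sum(1 for c in diag_coeffs if c < 0)   # number of negative terms
    return p + q, p - q


print(rank_and_signature([8, 2]))         # orthogonal canonical form of Example 13
print(rank_and_signature([2, -3, -7]))    # diagonal form of Example 14
```

The first value returned is r = p + (r − p) and the second is s = 2p − r, in agreement with the definition above.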

E23) Find the rank and signature of the quadratic forms given in E 20.

The rank and the signature completely determine the normal canonical
reduction. Also, any two quadratic forms having the same canonical reduction
will be equivalent. We can, therefore, state the following result.

Theorem 7: Two quadratic forms are equivalent if and only if they have the
same rank and signature.

In Section 16.3 we said that there is a one-to-one correspondence between the set of all symmetric matrices of order n and the set of quadratic forms of order n. So we can expect Sylvester's theorem to have a matrix interpretation. This is as follows:

A symmetric matrix of order n and rank r is congruent to a unique diagonal matrix of the type

$$\begin{bmatrix} I_p & 0 & 0 \\ 0 & -I_{r-p} & 0 \\ 0 & 0 & 0_{(n-r) \times (n-r)} \end{bmatrix}.$$

And now we end the unit by briefly recalling what we have done in it.

16.8 SUMMARY
In this unit all the spaces considered are over the field R. In it we have covered
the following points.

1. A homogeneous polynomial of degree two is called a quadratic form. Its order is the number of variables occurring in its expression.

2. Each quadratic form can be uniquely expressed as $X^t A X$, where A is a symmetric matrix, called the matrix of the quadratic form.

3. There is a one-to-one correspondence between the set of real symmetric n × n matrices and the set of real quadratic forms of order n.

4. Two quadratic forms are called equivalent (respectively, orthogonally equivalent) if their matrices are congruent (respectively, orthogonally similar). Two equivalent (respectively, orthogonally equivalent) quadratic forms can be converted into each other by a suitable change of basis.
5. The rank of a quadratic form is defined to be the rank of its matrix.

6. A quadratic form $X^t A X$ of rank r is orthogonally equivalent to a unique diagonal form

$$\lambda_1 y_1^2 + \cdots + \lambda_r y_r^2, \quad \lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_r,$$

called its orthogonal canonical reduction, where $\lambda_1, \ldots, \lambda_r$ are the non-zero eigenvalues of A.

7. A quadratic form of rank r is equivalent to a unique diagonal form

$$y_1^2 + \cdots + y_p^2 - y_{p+1}^2 - \cdots - y_r^2,$$

called its normal canonical reduction. Here the number p is uniquely determined (Sylvester's theorem). The number 2p − r is called the signature of the quadratic form.

16.9 SOLUTIONS/ANSWERS

E1) There are plenty of possible answers. We give one each.

a) $x^2 + 1$, b) $x^3$.

E2) Only (a).

E3) a) k = 0, otherwise the polynomial is of degree 3.


b) k = 2
c) k = 4.

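E4) The first three will be quadratic forms, if they are non-zero. $Q_1 Q_2$ will be of degree 4. $Q_1/Q_2$ will also not be quadratic; in fact, it may not even be a polynomial.

E5) For example, the matrix $\begin{bmatrix} 1 & 1 \\ 1 & 3 \end{bmatrix}$ gives us the quadratic form

$$X^t \begin{bmatrix} 1 & 1 \\ 1 & 3 \end{bmatrix} X = x^2 + 2xy + 3y^2.$$

E6) a) $\begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix}$, b) $\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$, c) $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, d) $\begin{bmatrix} p & q/2 \\ q/2 & r \end{bmatrix}$.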

E7) a) $[x\ y\ z] \begin{bmatrix} 7 & -1 & -10 \\ -1 & 7 & 10 \\ -10 & 10 & -2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}$

b) $[x_1\ x_2] \begin{bmatrix} 1 & -\frac{1}{2} \\ -\frac{1}{2} & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$

c) $[x_1\ x_2\ x_3] \begin{bmatrix} 1 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$

d) $[x\ y\ z] \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}$

E8) a) $ax^2 + by^2 + cz^2 + 2hxy + 2gxz + 2fyz$
b) $x^2 - y^2$
c) $4x^2 - \sqrt{2}\, y^2 - z^2$

E9) $X^t A X = 4x_1^2 + x_2^2 + 4x_4^2 + 2x_1x_2 + 3x_1x_3 + 2x_1x_4 + 6x_2x_3 + x_3x_4$.

$$A' = \begin{bmatrix} 4 & 1 & 3/2 & 1 \\ 1 & 1 & 3 & 0 \\ 3/2 & 3 & 0 & 1/2 \\ 1 & 0 & 1/2 & 4 \end{bmatrix}$$

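E10) Since its columns are not orthonormal, it is not orthogonal.

E11) Now X = PY, where $P = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. This is also not orthogonal, since its columns are not orthonormal.
Now $Q(Y) = Y^t (P^t A P) Y$, where

$$P^t A P = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ -1 & 4 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}.$$

∴ $Q(Y) = y_1^2 + 3y_2^2$.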

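E12) $A = \begin{bmatrix} 7 & 26 \\ 26 & -32 \end{bmatrix}$

The coordinate transformation corresponding to the change from $B_1$ to $B_2$ is given by the matrix $P = \begin{bmatrix} 2 & 1 \\ 1 & -2 \end{bmatrix}$. ∴ the matrix of the form will now be

$$P^t A P = \begin{bmatrix} 2 & 1 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} 7 & 26 \\ 26 & -32 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & -2 \end{bmatrix} = \begin{bmatrix} 100 & 0 \\ 0 & -225 \end{bmatrix}.$$

∴ the quadratic form will now be expressed as $100x'^2 - 225y'^2$.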

E13) The rank of the quadratic form $a_1 x_1^2 + \cdots + a_n x_n^2$
= the rank of the matrix $\text{diag}(a_1, \ldots, a_n)$
= number of non-zero $a_i$'s
= number of non-zero terms in the expression of the quadratic form.

E14) a) The rank of the form = rank of $\begin{bmatrix} 5 & -2 & 0 \\ -2 & 6 & -2 \\ 0 & -2 & 7 \end{bmatrix}$ = 3, since its determinant rank is 3 (its determinant, 162, is non-zero).

b) rank (Q) = rank of $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$ = 1, since its row-reduced echelon form is $\begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$.

c) rank (Q) = rank of $\begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$ = 3, since its determinant is non-zero.

d) rank (Q) = rank of $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ = 2, using the determinant rank method.
E15) The required transformation is X = PY, where $P = [-U_1\ \ {-U_2}]$, i.e.,

$$x_1 = \frac{1}{\sqrt{2}}(y_1 - y_2), \quad x_2 = \frac{-1}{\sqrt{2}}(y_1 + y_2).$$

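E16) a) The matrix of the form is $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$. Its eigenvalues are 3 and −1. ∴, the given form is equivalent to $3x_1^2 - y_1^2$. Normalised eigenvectors corresponding to 3 and −1 are $\begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$ and $\begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}$, respectively. ∴, they form a set of principal axes of the form. Remember that the principal axes are not unique.
b) Its orthogonal canonical form is $9x_1^2 + 4y_1^2$.
A set of principal axes is $\left\{ \begin{bmatrix} -2/\sqrt{5} \\ 1/\sqrt{5} \end{bmatrix}, \begin{bmatrix} 1/\sqrt{5} \\ 2/\sqrt{5} \end{bmatrix} \right\}$.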
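c) Its orthogonal canonical reduction is $4y_1^2 + 4y_2^2 - 2y_3^2$.
Eigenvectors corresponding to the eigenvalue 4 are given by

$$\begin{bmatrix} 0 & 2 & 2 \\ 2 & 3 & -1 \\ 2 & -1 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 4 \begin{bmatrix} x \\ y \\ z \end{bmatrix} \Rightarrow 2x - y - z = 0.$$

∴, two linearly independent eigenvectors corresponding to 4 can be obtained by putting x = 0 and y = 0, respectively, in this equation. So we get $(0, 1, -1)^t$ and $(1, 0, 2)^t$. These two vectors are not orthogonal to each other, so we apply the Gram-Schmidt process to them and get

$$\begin{bmatrix} 0 \\ 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}, \quad \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix}$$

as the required orthonormal eigenvectors.

Also, corresponding to the eigenvalue −2, we get a normalised eigenvector $\begin{bmatrix} -2/\sqrt{6} \\ 1/\sqrt{6} \\ 1/\sqrt{6} \end{bmatrix}$.

∴, a set of principal axes is $\left\{ \begin{bmatrix} 0 \\ 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}, \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix}, \begin{bmatrix} -2/\sqrt{6} \\ 1/\sqrt{6} \\ 1/\sqrt{6} \end{bmatrix} \right\}$.
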

E17) Any two forms are orthogonally equivalent iff they have the same orthogonal canonical forms as given in Theorem 4. ∴, their matrices should have the same eigenvalues (including repetitions).
Now, the eigenvalues of the matrices in (a) and (c) are 12, 12 and −6. ∴, the forms in (a) and (c) are orthogonally equivalent. The matrix of the form in (b) has eigenvalues 9, 9, −9. ∴, it is not orthogonally equivalent to the others.
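E18) Both the forms have the same diagonal form, as given in Theorem 4, namely $x'^2 + y'^2 - 2z'^2$. If $\begin{bmatrix} x \\ y \\ z \end{bmatrix} = P \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}$ and $\begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} = Q \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}$, then $PQ^{-1}$ will transform the first to the second, and

$$PQ^{-1} = PQ^t \ (\text{since } Q \text{ is orthogonal}) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}.$$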

E19) The transformation (19) is given by Y = PZ, where

$$P = \mathrm{diag}\left( \frac{1}{\sqrt{|\lambda_1|}}, \ldots, \frac{1}{\sqrt{|\lambda_r|}}, 1, \ldots, 1 \right).$$

This matrix is orthogonal provided $PP^t = I$, i.e., $|\lambda_i| = 1$ ∀ i = 1, … , r, i.e., $\lambda_i = 1$ or −1 ∀ i = 1, … , r.
E20) a) First obtain the orthogonal canonical form $9x_1^2 + 4y_1^2$. Then obtain its normal canonical form $x_2^2 + y_2^2$.
b) $x_1^2 - y_1^2$ is the normal canonical form.
E21) The rank of any diagonal form is the number of non-zero terms in its
expression.
E22) Since the normal canonical reduction is obtained by non-singular
transformations, the rank remains unchanged.
E23) a) rank= 2, signature= 2 × 2 − 2 = 2.
b) rank= 2, signature= 2 × 1 − 2 = 0.

UNIT 17

CONICS
Structure
17.1 Introduction
Objectives
17.2 Definitions and Equations
What is a Conic?
Standard Equations of Conics
17.3 Ellipse
Description
Geometrical Properties
17.4 Hyperbola
Description
Geometrical Properties
17.5 Parabola
Description
Geometrical Properties
17.6 The General Theory of Second Order Curves in ℝ²
17.7 Summary
17.8 Solutions/Answers

17.1 INTRODUCTION
In Unit 16 you have studied about real quadratic forms of any order n. This unit
is only a geometric extension of the previous one. In it we shall confine
ourselves to the two dimensional case.

Circles, parabolas, hyperbolas and ellipses are curves which we come across quite often. The ancient Greeks studied these curves and named them conic sections, since they could be obtained by taking a plane section of a right circular double cone (Fig. 1). However, from the analytic viewpoint, the Greek definition of conics, as sections of a cone, is not particularly useful. We shall consider a conic to be a curve which can be represented by an equation of second degree.

Fig. 1: Right circular double cone

After defining conics, we shall list the different types of standard conics. Then
we shall study the ellipse, the hyperbola, and the parabola in detail. In the last
section we will look at one of the basic problems of plane analytic geometry
that deals with conics–how to obtain a rectangular coordinate system in which
the equation of a given conic takes the standard form.

Before going further, we suggest that you revise Unit 16.

Objectives
After studying this unit, you should be able to:
• recognise different types of conics and their standard equations;

• reduce a general equation of second degree to one of the standard forms of conics;

• trace a conic whose standard equation is given.

17.2 DEFINITIONS AND EQUATIONS

You have come across polynomials in several variables already. We will consider the curves that are represented by polynomial equations of degree two, in two variables.

17.2.1 What is a Conic?


Let us go back to Sec. 16.2, where we told you that the general equation of second degree in ℝ² is

$$ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0, \qquad …(1)$$

where a, h, b, g, f and c are real constants, of which at least one of a, h, b is non-zero. Note that if a, h, b are all zero, then (1) will become an equation of first degree, and hence, will represent a straight line.

Now, (1) represents a curve in ℝ². We call this curve a conic. Let us make some formal definitions now.

Definition 1: The set of points of ℝ² whose coordinates satisfy an equation of second degree is called a conic.

It may happen that there is no point of ℝ² that satisfies a given equation of second degree. (For example, no point of ℝ² satisfies the equation $x^2 + y^2 = -1$.) In such a case we say that the conic represented by the equation is an imaginary conic.

Let us look at some examples.

Example 1: Investigate the nature of the conic given by

$$x^2 + y^2 = a, \quad a \in \mathbb{R}. \qquad …(2)$$

Solution: There are three cases to consider, depending on the sign of a: a < 0, a = 0, a > 0.

Case 1: If a < 0, then no real values of x and y will satisfy (2), and therefore, the conic represented by (2) will be imaginary.

Case 2: If a = 0, then the only real solution of (2) is x = 0 and y = 0. Hence, the conic represented by (2) will consist of just one point, i.e., (0, 0). (A conic consisting of only one point is called a point conic.)

Case 3: If a > 0, then $\sqrt{a} \in \mathbb{R}$ and $a = (\sqrt{a})^2$. ∴, a point (x, y) will satisfy (2) if and only if the distance of (x, y) from the origin is $\sqrt{a}$. Hence, the conic represented by (2) will be a circle of radius $\sqrt{a}$ and centre (0, 0).

∗∗∗
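
The three cases of Example 1 translate directly into a small decision function. This is only an illustrative sketch; the function name `classify_circle_family` is hypothetical and does not come from the unit.

```python
import math

def classify_circle_family(a):
    """Describe the conic x^2 + y^2 = a for a real constant a (Example 1)."""
    if a < 0:
        # No real point satisfies the equation.
        return "imaginary conic"
    if a == 0:
        # Only the origin satisfies the equation.
        return "point conic: (0, 0)"
    # For a > 0 the points lie at distance sqrt(a) from the origin.
    return f"circle of radius {math.sqrt(a)} centred at (0, 0)"

for value in (-1, 0, 9):
    print(value, "->", classify_circle_family(value))
```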

Example 2: Find the nature of the conic represented by

$$2x^2 - xy - 3x = 0. \qquad …(3)$$

Solution: Equation (3) can be written as x(2x − y − 3) = 0. This shows that a point (x, y) will satisfy (3) if and only if it satisfies x = 0 or 2x − y − 3 = 0. Therefore, we see that the points satisfying (3) are the points of the lines x = 0 and 2x − y − 3 = 0. (A first degree equation in ℝ² represents a straight line.) ∴, the conic consists of a pair of straight lines.

∗∗∗

The examples above show that a circle, a point and a pair of straight lines are
conics.

Try the following exercises now.

E1) Find equations of second degree which will represent a pair of
a) parallel lines, b) coincident lines.
(Hint: Remember that parallel lines have the same slope.)

E2) Find the nature of the conics represented by the following equations.
a) $x^2 - 2xy + y^2 = 0$
b) $4x^2 - 9x + 2 = 0$
c) $x^2 = 0$
d) $xy = 0$

In the examples and exercises that you have done so far, you have dealt with
simple second degree equations. These and other simple forms are what we
will discuss now.

17.2.2 Standard Equations of Conics

Did you notice that we have not given any examples of conics like

$$x^2 + 5xy + y^2 + 2x - 6y + 10 = 0$$

so far? This is because we can always choose a coordinate system so that the equation of the conic in this system is in the "simplest" form, that is, it has as few terms as possible. Such a form is called the standard equation of the conic. In this sub-section we shall discuss this form.

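Table 1: Standard forms of Conics

| Conic | Standard Equation |
|---|---|
| Ellipse | $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$, $a, b > 0$ |
| Circle | $x^2 + y^2 = a^2$, $a \neq 0$ |
| Hyperbola | $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$, $a, b > 0$ |
| Parabola | $y^2 = 4px$, $p > 0$ |
| Pair of intersecting lines | $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 0$, $a, b \neq 0$ |
| Pair of parallel lines | $y^2 = a^2$, $a \neq 0$ |
| Pair of coincident lines | $y^2 = 0$ |
| Point conic | $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 0$, $a, b \neq 0$ |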

There are several types of standard conics to which a general quadratic equation can be reduced. The classification is made on the basis of the coefficients of the various terms and the constant term appearing in the equation. In Table 1 we list different types of real conics along with their standard equations.

From the standard equations of conics that we have listed in Table 1, we can obtain other equally simple equations by the following two methods.

i) Interchanging the roles of the axes: We apply the orthogonal transformation

$$x = Y, \quad y = X \qquad …(4)$$

to the conic.

ii) Reversing the direction of an axis: For example, the direction of the x-axis can be reversed by applying the orthogonal transformation

$$x = -X, \quad y = Y \qquad …(5)$$

to the conic.
Similarly, we can reverse the direction of the y-axis by applying the orthogonal transformation x = X, y = −Y.

Let us illustrate the above discussion.

Example 3: Consider the standard equation $y^2 = 4px$ (p > 0) of a parabola. What are the different forms of this equation that we can obtain under transformations (4) and (5)?

Solution: If we interchange the x and y axes, the given equation will transform to $X^2 = 4pY$, p > 0.

To apply (5) we replace x by −X and y by Y. Then the given equation will transform to $Y^2 = -4pX$, p > 0.

All three equations represent the same parabola with respect to different
coordinate systems.

∗∗∗
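
The claim that all three equations of Example 3 describe the same parabola can be checked numerically. The sketch below assumes the standard parametrisation $(pt^2, 2pt)$ of the points of $y^2 = 4px$, which is not used in this unit, and the value p = 1.5 is an arbitrary choice.

```python
import math

def on_parabola(x, y, p):
    """True if (x, y) satisfies the standard equation y^2 = 4px."""
    return math.isclose(y * y, 4 * p * x, abs_tol=1e-9)

p = 1.5  # arbitrary positive parameter chosen for the check

# Points of y^2 = 4px in the old (x, y) coordinates, parametrised by t.
old_points = [(p * t * t, 2 * p * t) for t in (-2.0, -0.5, 0.0, 1.0, 3.0)]

# Transformation (4): x = Y, y = X, so the new coordinates are (X, Y) = (y, x).
swapped = [(y, x) for x, y in old_points]
# Transformation (5): x = -X, y = Y, so the new coordinates are (X, Y) = (-x, y).
reversed_x = [(-x, y) for x, y in old_points]

# The new coordinates should satisfy X^2 = 4pY and Y^2 = -4pX, respectively.
swap_ok = all(math.isclose(X * X, 4 * p * Y, abs_tol=1e-9) for X, Y in swapped)
reverse_ok = all(math.isclose(Y * Y, -4 * p * X, abs_tol=1e-9)
                 for X, Y in reversed_x)
print(swap_ok, reverse_ok)
```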

Try the following exercise now.

E3) What are the different forms of the equation of the circle $x^2 + y^2 = a^2$ that we get on applying the transformations (4) and (5) given above?

Let us now study some of these conics in detail. In the following sections we
will describe ellipses, hyperbolas, parabolas and other conics. As we go along
we will also pictorially show you how conics occur as planar sections of a right
circular double cone.
Before starting these sections you may like to recall what you studied about
curve tracing in Block 2 of the Calculus course.

17.3 ELLIPSE
In school, you have already studied that any planet orbits the sun in an elliptical path, with the sun at a focus of the ellipse. In this section, you will see what exactly an ellipse is and study some of its geometrical properties. In Fig. 2 you can see why an ellipse is called a conic.

Fig. 2: Ellipse as a section of a double cone

17.3.1 Description
From Sec. 17.2 you know that the standard equation of an ellipse is

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \quad a, b > 0. \qquad …(6)$$

We may assume a > b. (If b > a, then we can interchange the x and y axes to arrive at the assumed case.) We want to trace the ellipse (6). For this purpose we start gathering information.

a) (6) is symmetric about the axes: If we replace x by (−x) or y by (−y) in (6), it remains unchanged. This shows that the ellipse is symmetric with respect to both the axes.

b) (6) is a central conic: If we replace both x and y by (−x) and (−y), respectively, in (6), it remains unchanged. Thus, the ellipse is symmetric with respect to the origin. Hence, (0, 0) is the centre of the ellipse.
(a) and (b) tell us that it is enough to sketch the graph in the first quadrant only, i.e., for x, y ≥ 0.

c) (6) is contained in the rectangle bounded by x = ±a and y = ±b: (6) can be written as $x^2 = a^2(1 - y^2/b^2)$.
This shows that there are no real values of x for |y| > b. Hence, the ellipse does not exist in the regions y < −b and y > b. Similarly, writing the equation as $y^2 = b^2(1 - x^2/a^2)$, we see that the ellipse does not exist in the regions given by |x| > a, i.e., for x < −a and x > a.

d) (6) is bounded by the circle $x^2 + y^2 = a^2$: If a point $P(x_1, y_1)$ lies on (6), then $\frac{x_1^2}{a^2} + \frac{y_1^2}{b^2} = 1$. Since a ≥ b, we get $\frac{y_1^2}{a^2} \le \frac{y_1^2}{b^2}$. Therefore,

$$\frac{x_1^2 + y_1^2}{a^2} \le \frac{x_1^2}{a^2} + \frac{y_1^2}{b^2} = 1,$$

i.e., $x_1^2 + y_1^2 \le a^2$. This shows that P lies inside, or on, the circle $x^2 + y^2 = a^2$.

e) (6) intersects the coordinate axes in (±a, 0) and (0, ±b).

f) The part of (6) in the first quadrant is given by

$$y = \frac{b}{a}\sqrt{a^2 - x^2}, \quad 0 \le x \le a \qquad …(7)$$

or

$$x = \frac{a}{b}\sqrt{b^2 - y^2}, \quad 0 \le y \le b. \qquad …(8)$$

Fig. 3: The ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$

Here y is a continuous function of x, and it attains its maximum value b at x = 0. As x increases continuously from 0 to a, y will continuously decrease from b to 0. From (7) above, y is a differentiable function of x over the interval [0, a). The tangent at B(0, b) is y = b. From (8), x is a differentiable function of y over the interval [0, b), the tangent at A(a, 0) being x = a.

E4) Prove that the tangents at (a, 0) and (0, b) of the ellipse (6) are x = a and
y = b, respectively.

From the above information the ellipse (6) will be represented by the curve in
Fig. 3.

The terms related to this ellipse are given below.

i) The points (±a, 0) are called its vertices.

ii) A′A and B′B are called the major and minor axes of the ellipse, respectively. Their lengths are 2a and 2b, respectively. These axes are the principal axes of the ellipse. Can you see why? It is because they are given by the normalised eigenvectors $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ of the matrix of the form $x^2/a^2 + y^2/b^2$.

iii) The positive real number e defined by $a^2e^2 = a^2 - b^2$ is called the eccentricity of the ellipse. Note that 0 < e < 1.

iv) The points (ae, 0) and (−ae, 0) are called the foci (plural of focus).

v) The line x = a/e is called the directrix (plural: directrices) corresponding to the focus (ae, 0). Similarly, x = −a/e is the directrix corresponding to (−ae, 0).

Fig. 4: Circle as a section of a cone.

Remark 1: If a = b, the equation (6) reduces to $x^2 + y^2 = a^2$, which represents a circle of radius a (see Fig. 4). A circle is, thus, a special case of an ellipse.

We will study a circle in the following example.

Example 4: Find the eccentricity, foci and directrices of the circle $x^2 + y^2 = a^2$.

Solution: Since $x^2 + y^2 = a^2$ is a special case of (6) with b = a, we get e = 0. ∴, both the foci, (±ae, 0), coincide at the origin, (0, 0). The two directrices x = ±a/e diverge to infinity as e → 0, and do not exist in the real plane.

∗∗∗

We have seen what happens if a = b in (6). But what happens if b > a in (6)?
The role of the major and minor axes will be interchanged and the terminology
given for an ellipse will have to be suitably modified as follows:

i) the points (0, ±b) will be the vertices.

ii) B′ B and A′ A will be the major and minor axes, and their lengths will be 2b
and 2a, respectively.

iii) the eccentricity e will be defined by b2 e2 = b2 − a2 .

iv) the points (0, ±be) will be the foci. They will lie on the y-axis. Therefore,
the major axis will lie along the y-axis.

v) The lines y = b/e and y = −b/e will be the directrices corresponding to the
foci (0, be) and (0, −be), respectively.
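
The terminology given above, for both a > b and b > a, can be collected into one small function. This is a sketch only: the name `ellipse_data`, the dictionary keys and the way a directrix is returned as a pair (variable, value) are illustrative choices, and the circle a = b is excluded because its directrices do not exist in the real plane.

```python
import math

def ellipse_data(a, b):
    """Vertices, eccentricity, foci and directrices of x^2/a^2 + y^2/b^2 = 1."""
    if a <= 0 or b <= 0 or a == b:
        raise ValueError("need a, b > 0 with a != b")
    semi_major, semi_minor = max(a, b), min(a, b)
    # Eccentricity from (semi_major)^2 e^2 = (semi_major)^2 - (semi_minor)^2.
    e = math.sqrt(semi_major ** 2 - semi_minor ** 2) / semi_major
    c = semi_major * e   # distance of each focus from the centre
    d = semi_major / e   # distance of each directrix from the centre
    if a > b:            # major axis along the x-axis
        return {"eccentricity": e,
                "vertices": [(a, 0.0), (-a, 0.0)],
                "foci": [(c, 0.0), (-c, 0.0)],
                "directrices": [("x", d), ("x", -d)]}
    return {"eccentricity": e,           # major axis along the y-axis
            "vertices": [(0.0, b), (0.0, -b)],
            "foci": [(0.0, c), (0.0, -c)],
            "directrices": [("y", d), ("y", -d)]}

# Illustrative values a = 5, b = 3 (not one of the exercises).
print(ellipse_data(5, 3))
```

For a = 5, b = 3 this gives e = 4/5, foci (±4, 0) and directrices x = ±25/4, up to floating-point rounding.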

By now you must be ready to describe an ellipse yourself. Try the following
exercise.

E5) Find the vertices, eccentricity, foci and directrices of the ellipse
a) $9x^2 + 4y^2 = 36$ (see Fig. 5.)
b) $16x^2 + 25y^2 = 400$

Fig. 5: The ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$.

Now let us look closely at some properties of an ellipse.

17.3.2 Geometrical Properties

The ellipse has some very interesting geometrical properties. We shall study three important ones here. (The eccentricity measures the ratio of the distance of a point on the curve from the focus to its distance from the corresponding directrix.)

Focus-directrix Property: The distance of any point of the ellipse from a focus is e times its distance from the corresponding directrix, where e is the eccentricity of the ellipse.

Proof: Let $P(x_1, y_1)$ be a point on the ellipse $x^2/a^2 + y^2/b^2 = 1$, a > b (see Fig. 6). Let $F_1(ae, 0)$ be the focus under consideration. The directrix corresponding to $F_1$ is x = a/e. Let D be the foot of the perpendicular from P to the directrix x = a/e. Since P lies on the ellipse we have

$$\frac{x_1^2}{a^2} + \frac{y_1^2}{b^2} = 1 \Rightarrow b^2x_1^2 + a^2y_1^2 = a^2b^2$$

$$\Rightarrow (a^2 - a^2e^2)x_1^2 + a^2y_1^2 = a^2(a^2 - a^2e^2), \text{ since } b^2 = a^2 - a^2e^2$$

$$\Rightarrow x_1^2 + y_1^2 + a^2e^2 = e^2x_1^2 + a^2.$$

Adding $-2aex_1$ to both sides, we get

$$(x_1 - ae)^2 + y_1^2 = (ex_1 - a)^2 \Rightarrow (x_1 - ae)^2 + y_1^2 = e^2(x_1 - a/e)^2,$$

which is equivalent to $PF_1^2 = e^2 PD^2$, i.e., $PF_1 = e(PD)$, which proves the statement for the focus $F_1$. For completing the proof, try E6. ■

Fig. 6: The ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$.

E6) Prove the focus-directrix property for the other focus F2 .

Another property that holds for an ellipse is the following.

String Property: For each point P of the ellipse the sum of the distances of P from the two foci of the ellipse is the same, and is equal to the length of the major axis.

Proof: Let P be a point on the ellipse whose foci are $F_1$ and $F_2$ (see Fig. 6). Let $D_1$ and $D_2$ be the feet of the perpendiculars from P to the two directrices. Since P lies between the two directrices, $PD_1 + PD_2 = D_1D_2 = 2a/e$. Using the focus-directrix property, we get

$$PF_1 + PF_2 = e(PD_1 + PD_2) = e(D_1D_2) = e(2a/e) = 2a,$$

which proves the string property. ■
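
Both properties proved above can be verified numerically at sample points of an ellipse. The sketch below assumes the standard parametrisation $(a\cos t, b\sin t)$ of the ellipse, which is not introduced in this unit; the values a = 5, b = 3 and the function name `ellipse_distances` are illustrative.

```python
import math

a, b = 5.0, 3.0                       # semi-axes, with a > b
e = math.sqrt(a * a - b * b) / a      # eccentricity from a^2 e^2 = a^2 - b^2

def ellipse_distances(t):
    """Distances of P = (a cos t, b sin t) from both foci and both directrices."""
    x, y = a * math.cos(t), b * math.sin(t)
    pf1 = math.hypot(x - a * e, y)    # distance from the focus (ae, 0)
    pf2 = math.hypot(x + a * e, y)    # distance from the focus (-ae, 0)
    pd1 = abs(x - a / e)              # distance from the directrix x = a/e
    pd2 = abs(x + a / e)              # distance from the directrix x = -a/e
    return pf1, pf2, pd1, pd2

for t in (0.0, 0.4, 1.3, 2.5, 4.0):
    pf1, pf2, pd1, pd2 = ellipse_distances(t)
    # Focus-directrix property for each focus, then the string property.
    print(math.isclose(pf1, e * pd1), math.isclose(pf2, e * pd2),
          math.isclose(pf1 + pf2, 2 * a))
```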

You may wonder why this property is called the string property. It provides a mechanical method to construct an ellipse by using a string. Let us see what the method is.

Fig. 7: Sketching an ellipse using a string

A mechanical method for drawing an ellipse: Take a piece of string of length 2a and fix its end points at the points $F_1$ and $F_2$ ($F_1F_2 < 2a$) of a plane sheet of paper (see Fig. 7). Use the point of a pencil to stretch the string into two segments. Now move the pencil point all around on the paper while sliding it along the string, making sure that the string is taut all the time. The curve traced will be an ellipse whose foci are $F_1$ and $F_2$, and whose major axis has length 2a.

E7) Use the method we have just given to draw an ellipse whose eccentricity
is 0 and minor axis is 3 inches in length, on a piece of paper.

An ellipse has another important property which we shall state, but not prove, in this course.

Reflected Wave Property: A ray of light (or sound, or any other type of wave) emitted from one focus of an ellipse is reflected back from its reflecting interior to the other focus (see Fig. 8). An interesting consequence of this property is that rooms with an ellipsoidal ceiling have whispering galleries. A person standing at one focus of the ellipse can whisper so as to be heard by a person at the other focus, while the people in between cannot hear what is said.

Fig. 8: Reflected wave property

Let us now study the hyperbola in detail.

17.4 HYPERBOLA
In this section we shall present the description and some geometrical properties of a hyperbola. See Fig. 9 for a representation of a hyperbola as a planar section of a double cone.

Fig. 9: Hyperbola as a section of a double cone

17.4.1 Description
From Table 1 you know that the standard equation of a hyperbola is

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1, \quad a, b > 0. \qquad …(9)$$

You can check that this is symmetric about both the axes, and hence about the origin. The origin is, therefore, the centre of the hyperbola. Thus, the hyperbola is a central conic. The x-axis meets the hyperbola in (±a, 0) while the y-axis does not meet it at all.

Due to symmetry about both the axes, it is enough to sketch the hyperbola in the first quadrant only, i.e., for x, y ≥ 0. In this quadrant it is given by

$$y = b\sqrt{\frac{x^2}{a^2} - 1} \quad \left( \text{or } x = a\sqrt{\frac{y^2}{b^2} + 1} \right).$$

This provides the following information.

a) The hyperbola does not exist in the region |x| < a.

b) y = 0 for x = a.

c) y is a continuous function of x, which increases continuously from 0 to ∞ as x increases from a to ∞. The hyperbola, therefore, extends to infinity.

d) x is a differentiable function of y, and hence, a tangent can be drawn at each point of the hyperbola. The tangent at (a, 0) is parallel to the y-axis.

All this information allows us to sketch the hyperbola as in Fig. 10.

Can you see that the hyperbola consists of two branches? Of all the conics,
this property is typical of hyperbolas only.

The terminology for the hyperbola is as follows:

i) The points (±a, 0) are called its vertices.

Fig. 10: The hyperbola $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$

ii) The line segment joining the vertices is called the principal (or transverse) axis, while the line segment joining B(0, b) and B′(0, −b) is called the conjugate axis. The length of the principal axis is 2a, while the length of the conjugate axis is 2b.
As in the case of an ellipse, these axes are in the direction of the normalised eigenvectors $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ of the matrix of the form $x^2/a^2 - y^2/b^2$.

iii) The positive real number e, defined by $a^2e^2 = a^2 + b^2$, is called the eccentricity of the hyperbola.
Note that e > 1 in this case.

iv) The points (±ae, 0) are the foci of the hyperbola.

v) The line x = a/e (respectively, x = −a/e) is called the directrix corresponding to the focus (ae, 0) (respectively, (−ae, 0)).

Can you solve the following exercise now?

E8) Find the vertices, eccentricity, foci and directrices of the hyperbola
a) $9x^2 - 16y^2 = 144$
b) $25x^2 - 9y^2 = 225$.

Let us look at the geometry of a hyperbola now.

17.4.2 Geometrical Properties


A hyperbola has properties analogous to those of an ellipse. We discuss some important properties here.

Fig. 11: String property for a hyperbola

Focus-directrix Property: The distance of any point of the hyperbola from either focus is e times its distance from the corresponding directrix.

Proof: We will start the proof and you can complete it! Let $P(x_1, y_1)$ be any point of the hyperbola $x^2/a^2 - y^2/b^2 = 1$, a, b > 0. Consider the foci $F_1(ae, 0)$ and $F_2(-ae, 0)$. Now do E9. ■

E9) Prove that $PF_1 = e\,PD$, where PD is the distance of P from the directrix x = a/e. Also show that $PF_2 = e\,PD'$, where PD′ is the distance of P from the line x = −a/e.

So you have proved the focus-directrix property.

Corresponding to the string property of an ellipse we have the following property for a hyperbola.

String Property: For each point of a hyperbola the absolute value of the difference of its distances from the two foci is the same, and is equal to the length of the principal axis.

Proof: Let P be a point of the hyperbola whose foci are $F_1$ and $F_2$. Let $D_1$ and $D_2$ be the feet of the perpendiculars from P on the two directrices. Fig. 11 shows the two cases, when P is on one branch or the other.

From the focus-directrix property,

$$PF_1 = e\,PD_1, \qquad PF_2 = e\,PD_2.$$

Hence,

$$|PF_1 - PF_2| = e\,|PD_1 - PD_2| = e(D_1D_2) = e \cdot \frac{2a}{e} = 2a,$$

which proves the string property. ■
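
As with the ellipse, the two properties of the hyperbola can be checked numerically on both branches. The sketch assumes the parametrisation $(\pm a\cosh t, b\sinh t)$ of the two branches, which is a standard fact not used in this unit; the values a = 3, b = 4 and the name `hyperbola_distances` are illustrative.

```python
import math

a, b = 3.0, 4.0
e = math.sqrt(a * a + b * b) / a      # eccentricity from a^2 e^2 = a^2 + b^2

def hyperbola_distances(t, branch=1):
    """Distances of P = (branch * a cosh t, b sinh t) from foci and directrices.

    branch = 1 gives the right-hand branch, branch = -1 the left-hand one.
    """
    x, y = branch * a * math.cosh(t), b * math.sinh(t)
    pf1 = math.hypot(x - a * e, y)    # distance from the focus (ae, 0)
    pf2 = math.hypot(x + a * e, y)    # distance from the focus (-ae, 0)
    pd1 = abs(x - a / e)              # distance from the directrix x = a/e
    pd2 = abs(x + a / e)              # distance from the directrix x = -a/e
    return pf1, pf2, pd1, pd2

for branch in (1, -1):
    for t in (-1.5, 0.0, 0.7, 2.0):
        pf1, pf2, pd1, pd2 = hyperbola_distances(t, branch)
        # Focus-directrix property for F1, then the string property.
        print(math.isclose(pf1, e * pd1),
              math.isclose(abs(pf1 - pf2), 2 * a))
```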

You must have noticed the similarity in the properties of an ellipse and a
hyperbola. Sometimes an ellipse or a hyperbola is defined by the
focus-directrix property, an ellipse being when e < 1, and a hyperbola when
e > 1. What happens when e = 1? In other words, what is the locus of a point
whose distance from a fixed point (a focus) is equal to its distance from a fixed
line (a directrix)? We shall answer this question in the next section.

17.5 PARABOLA
Have you ever noticed the path of a projectile when it is acted upon by the force of gravity only? It is a parabola. In this section we will discuss parabolas in some detail. In Fig. 12 we show how it can be represented by a planar section of a cone.

Fig. 12: Parabola as a section of a double cone

17.5.1 Description

Table 1 tells you that the standard equation of a parabola is $y^2 = 4px$, p > 0.
You can verify the following information about it, as you have done for an ellipse or a hyperbola.

a) It is symmetrical about the x-axis, and not about the y-axis. ∴, this is not a central conic.

b) For x < 0 there are no real values of y, and hence, this parabola does not exist in the second and third quadrants.

c) This parabola meets the axes only at the origin.

In view of (a) and (b), it is enough to sketch the parabola in the first quadrant only. The part of the parabola in the first quadrant is given by

$$x = \frac{y^2}{4p} \quad (\text{or } y = 2\sqrt{px},\ x \ge 0).$$

x is a continuous and differentiable function of y, and hence, the tangent exists at each point.

The tangent at (0, 0) is the y-axis. As x increases continuously from 0 to ∞, y also increases from 0 to ∞. Hence the parabola is an infinite curve.

From the above information we draw the parabola in Fig. 13.

Fig. 13: The parabola $y^2 = 4px$

For the parabola given in Fig. 13:

i) the origin is called the vertex.

ii) the line of symmetry, i.e. the x-axis, is its axis.

iii) (p, 0) is the focus.

iv) the line x = −p is the directrix.

You can use this knowledge to solve the following exercise.

E10) Find the coordinates of the focus, and the equation of the directrix, of the parabola
a) $y^2 = 3x$, b) $x^2 = 4ay$, c) $y^2 = -4ax$.
Draw a rough sketch of these curves also.

Fig. 14: PF = PD

We will now discuss the geometry of a parabola.

17.5.2 Geometrical Properties

We will talk about two geometrical properties of a parabola now.

Focus-directrix Property: Each point of a parabola is equidistant from the focus and the directrix of the parabola.

Proof: Let the parabola have standard equation $y^2 = 4px$. Then F(p, 0) is its focus. Let $P(x_1, y_1)$ be any point on the parabola (see Fig. 14). Then $y_1^2 = 4px_1$. Now

$$PF^2 = (x_1 - p)^2 + y_1^2 = (x_1 - p)^2 + 4px_1 = (x_1 + p)^2 = (\text{distance of } P \text{ from the directrix } x = -p)^2 = PD^2.$$

Hence, PF = PD, which proves the focus-directrix property. ■
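
The equality PF = PD can also be checked numerically. The sketch below assumes the parametrisation $(pt^2, 2pt)$ of the points of $y^2 = 4px$, which is not part of this unit; the function name and the value p = 2 are illustrative.

```python
import math

def focus_and_directrix_distances(p, t):
    """Return (PF, PD) for the point P = (p t^2, 2 p t) of y^2 = 4px."""
    x, y = p * t * t, 2 * p * t
    pf = math.hypot(x - p, y)   # distance from the focus (p, 0)
    pd = abs(x + p)             # distance from the directrix x = -p
    return pf, pd

p = 2.0
for t in (-3.0, -0.5, 0.0, 1.0, 2.5):
    pf, pd = focus_and_directrix_distances(p, t)
    print(t, pf, pd, math.isclose(pf, pd))
```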

Now we state (without proof) another important geometrical, as well as physical, property of a parabolic curve.

Reflected Wave Property: If a source of light (or sound, or any other type of wave) is placed at the focus of a parabola which has a reflecting surface (see Fig. 15), the rays that meet the reflecting surface of the parabola will be reflected parallel to the axis of the parabola. Conversely, the rays of light (or sound, or any other type of wave) entering parallel to the axis are reflected to converge at the focus.

Fig. 15: Reflected wave property for a parabola

As a consequence of this property, paraboloidal surfaces are used in the headlights of cars, optical and radio telescopes, radars, etc. (A paraboloid is a surface generated by revolving a parabola about its axis.)

The focus-directrix property is common to an ellipse, a hyperbola and a parabola. Each of them can be considered as a locus of a point whose distance from a fixed point (a focus) is a constant, e, times its distance from a fixed line (a directrix). The locus is an ellipse, parabola or hyperbola according as e < 1, e = 1, e > 1. The focus-directrix property, therefore, unifies all these conics. (The ellipse, hyperbola and parabola are called non-degenerate conics.) What about the rest of the conics given in Table 1? They are all limiting cases of an ellipse, a hyperbola or a parabola.

For example, the pair of intersecting lines $x^2 - k^2y^2 = 0$ is a limiting case of the hyperbola

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1, \quad a, b > 0, \quad \text{as } a \to 0,\ b \to 0.$$

(Writing the equation as $x^2 - (a^2/b^2)y^2 = a^2$ and taking limits as a → 0, b → 0 such that a/b → k (finite), we get $x^2 - k^2y^2 = 0$.)

Similarly, the ellipse $x^2/a^2 + y^2/b^2 = 1$ degenerates into the pair of parallel lines given by $y^2 = b^2$, as a → ∞.

So far you have studied quite a few conics. But you must be wondering about
curves that are represented by the general equation of second degree.

We will now look at any conic and see how to reduce it to one of the standard
forms given in Sec. 17.2.

17.6 THE GENERAL THEORY OF SECOND ORDER CURVES IN ℝ²
You know that the most general form of an equation of second degree is

$$ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0, \qquad …(10)$$

where a, h, b, g, f, c ∈ ℝ and a, h, b are not all zero.

We will see how to reduce this equation to standard form, that is, one of the
forms listed in Table 1. You will see that the whole of this section will be
devoted to using the following theorem.

Theorem 1: If the conic represented by (10) is not imaginary, then it is always possible to choose a rectangular coordinate system in which the equation (10) will reduce to one of the standard forms of conics.

We will give a rough outline of the proof of this theorem. The idea is to first reduce the quadratic form $ax^2 + 2hxy + by^2$ to the orthogonal canonical form $\lambda_1x_1^2 + \lambda_2y_1^2$, with $\lambda_1 \ge \lambda_2$ (ref. Sec. 16.6). Let this transformation be given by

$$\begin{bmatrix} x \\ y \end{bmatrix} = P \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}.$$

On substituting these values of x and y in (10) we get a conic in $x_1$ and $y_1$. If this conic has any linear terms, we eliminate them by applying a translation of the form

$$x_1 = X + \alpha, \quad y_1 = Y + \beta, \quad \alpha, \beta \in \mathbb{R}.$$

We will choose α and β in such a manner that the linear terms are reduced to zero. Then our conic (10) will finally be transformed to one of the standard conics.

Our proof may seem vague to you. To understand the method of reduction
consider the following examples.

Example 5: Reduce the conic $7x^2 - 8xy + y^2 = a$ to standard form. Hence,
identify it.

Solution: The matrix of the quadratic form $7x^2 - 8xy + y^2$ is $\begin{bmatrix} 7 & -4 \\ -4 & 1 \end{bmatrix}$.

∗∗∗

Its eigenvalues are 9 and −1. ∴ from Unit 16 (Theorem 4) you know that we
can find an orthogonal transformation which will reduce $7x^2 - 8xy + y^2$ into
$9X^2 - Y^2$. This transformation will reduce the given conic to $9X^2 - Y^2 = a$.

The nature of this conic will depend on the value of a.

If a = 0, it will represent the pair of intersecting lines 3X − Y = 0 and 3X + Y = 0.

If a ≠ 0, it will represent a hyperbola.
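
If you would like to check the eigenvalues in Example 5 numerically, the following minimal Python sketch may help. It uses the closed-form eigenvalues of a 2 × 2 symmetric matrix; the function names `symmetric_eigenvalues` and `nature_of_example_5` are our own illustrative choices and are not part of the course material.

```python
import math

def symmetric_eigenvalues(a, h, b):
    """Eigenvalues (larger one first) of the symmetric matrix [[a, h], [h, b]]."""
    mean = (a + b) / 2
    radius = math.hypot((a - b) / 2, h)
    return mean + radius, mean - radius

def nature_of_example_5(a_rhs):
    """Nature of the reduced conic 9X^2 - Y^2 = a_rhs of Example 5."""
    return "pair of intersecting lines" if a_rhs == 0 else "hyperbola"

# Matrix of the quadratic form 7x^2 - 8xy + y^2.
lam1, lam2 = symmetric_eigenvalues(7, -4, 1)
print(lam1, lam2)
print(nature_of_example_5(0), "/", nature_of_example_5(3))
```

The two printed eigenvalues are the coefficients of X² and Y² in the reduced equation.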

Example 6: Investigate the nature of the conic

$5x^2 - 6xy + 5y^2 + \sqrt{2}(x + y) = a$.

Solution: The second degree terms in the given equation are the same as in
the quadratic form considered in Example 11 of Unit 16. The orthogonal
coordinate transformation

$x = (1/\sqrt{2})(-y_1 + y_2)$
$y = (1/\sqrt{2})(y_1 + y_2)$

will convert $5x^2 - 6xy + 5y^2$ into $8y_1^2 + 2y_2^2$, and hence will transform the given
equation into

$8y_1^2 + 2y_2^2 + (-y_1 + y_2 + y_1 + y_2) = a$,

i.e., $8y_1^2 + 2(y_2 + 1/2)^2 = a + 1/2$.

Now a translation of axes given by

$y_1 = X$, $y_2 = Y - 1/2$

will transform the above equation into $8X^2 + 2Y^2 = a + 1/2$, which is in standard
form.

∗∗∗

The nature of this conic will depend on the value of a. We have the following
three cases:

Case 1: a + 1/2 < 0. In this case no real values of X and Y satisfy the conic, and
hence the conic is imaginary.

Case 2: a + 1/2 = 0. In this case the conic is a point conic.

Case 3: a + 1/2 > 0. In this case the equation can be written as

$\dfrac{X^2}{(2a + 1)/16} + \dfrac{Y^2}{(2a + 1)/4} = 1$,

which represents an ellipse.

Note that we have used two successive transformations in Example 6 to
convert the given equation into standard form. The first one was an orthogonal
transformation. The second one was a translation. Both these transformations
preserve the geometric nature of the curve. Thus, the given equation and its
reduced form represent the same conic in the coordinate systems (x, y) and
(X, Y), respectively.

Over here we would like to make the following remark.

Remark 2: When we apply an orthogonal transformation, what are we doing
geometrically? We are rotating the axes, possibly combined with a reflection.
In fact, orthogonal matrices correspond to rotations and reflections.

In the following example you can see what a conic looks like before and after
reduction to standard form.

Example 7: Let a = 4 in the equation considered in Example 6. Find the
coordinate transformation that will convert it into standard form.

Fig. 16: The ellipse $5x^2 - 6xy + 5y^2 + \sqrt{2}(x + y) = 4$ (a) before reduction, (b) after
reduction.

Solution: The composite of the two transformations in Example 6 is

$x = \dfrac{1}{\sqrt{2}}(-X + Y - 1/2)$
$y = \dfrac{1}{\sqrt{2}}(X + Y - 1/2)$,

which is the required coordinate transformation. Solving for X and Y, we get

$X = \dfrac{y - x}{\sqrt{2}}$
$Y = \dfrac{y + x}{\sqrt{2}} + \dfrac{1}{2}$

For a = 4 the reduced equation becomes

$\dfrac{X^2}{9/16} + \dfrac{Y^2}{9/4} = 1$.

We give the sketch of the original equation in Fig. 16(a) and the sketch of the
reduced equation in Fig. 16(b).

So, you see, the shape and size of the conic remain unchanged under the
transformations that we apply to reduce it to standard form.
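
The composite transformation of Example 7 can also be checked numerically. The sketch below is only an illustration (the helper names `to_original`, `original_lhs` and `max_residual` are ours): it takes points on the reduced ellipse, maps them back to (x, y), and measures how far the original left-hand side is from 4.

```python
import math

def to_original(X, Y):
    """Composite transformation of Example 7, mapping (X, Y) to (x, y)."""
    r2 = math.sqrt(2)
    return (-X + Y - 0.5) / r2, (X + Y - 0.5) / r2

def original_lhs(x, y):
    """Left-hand side 5x^2 - 6xy + 5y^2 + sqrt(2)(x + y) of the given equation."""
    return 5 * x * x - 6 * x * y + 5 * y * y + math.sqrt(2) * (x + y)

def max_residual(samples=12):
    """Largest deviation of the original left-hand side from 4 over sample points."""
    worst = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        # (X, Y) lies on X^2/(9/16) + Y^2/(9/4) = 1
        X, Y = 0.75 * math.cos(t), 1.5 * math.sin(t)
        x, y = to_original(X, Y)
        worst = max(worst, abs(original_lhs(x, y) - 4))
    return worst

print(max_residual())
```

The printed residual is only rounding error, which is what you would expect if every sampled point of the reduced ellipse corresponds to a point of the original curve.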

∗∗∗

Let us look at another example in which we identify a conic by reducing it to
standard form.

Example 8: Find the nature of the conic

$x^2 + 2xy + y^2 - 6x - 2y + 4 = 0$.

Solution: The matrix of the quadratic form $x^2 + 2xy + y^2$ is $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, whose
eigenvalues are 2, 0. Normalised eigenvectors corresponding to the
eigenvalues 2, 0 are $(1/\sqrt{2}, 1/\sqrt{2})$ and $(-1/\sqrt{2}, 1/\sqrt{2})$, respectively. Hence, the
coordinate transformation

$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$,

i.e., $x = (y_1 - y_2)/\sqrt{2}$, $y = (y_1 + y_2)/\sqrt{2}$, will convert $x^2 + 2xy + y^2$ into $2y_1^2$, and the
given equation into

$2y_1^2 - 3\sqrt{2}(y_1 - y_2) - \sqrt{2}(y_1 + y_2) + 4 = 0$,

i.e., $(y_1 - \sqrt{2})^2 = -\sqrt{2}\,y_2$. Now, we want to get rid of the linear terms. If we apply
the translation

$y_1 - \sqrt{2} = X$, $y_2 = Y$,

we can reduce the conic further into $X^2 = -\sqrt{2}\,Y$.

This represents a parabola. Hence, the given equation represents a parabola.


∗∗∗
Let us formally write down what we have done in the various examples.

Step by step procedure for reducing a second degree equation in ℝ².

Consider the second degree equation

$ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0$ …(11)

Step 1: Use the method of Section 16.6 to reduce $ax^2 + 2hxy + by^2$ to $\lambda_1 y_1^2 + \lambda_2 y_2^2$
using an orthogonal transformation. This transformation will reduce (11) to

$\lambda_1 y_1^2 + \lambda_2 y_2^2 + Ay_1 + By_2 + C = 0$ …(12)

Step 2: Now use a suitable translation of axes (y1 , y2 ) ↦ (X, Y) to eliminate the
linear terms and reduce (12) into one of the standard forms. This will give the
reduction of (11).
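
The two steps can be sketched in Python as follows. This is a rough illustration under stated assumptions, not a definitive implementation: the function name `reduce_conic`, the rotation-angle formula used to carry out Step 1, and the tolerance `EPS` for deciding when a floating-point value counts as zero are all our own choices.

```python
import math

EPS = 1e-9  # assumed tolerance; suitable only for coefficients of moderate size

def reduce_conic(a, h, b, g, f, c):
    """Classify ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0 (a, h, b not all zero).

    Returns (lam1, lam2, kind), where lam1 >= lam2 are the eigenvalues of
    [[a, h], [h, b]] and kind is the name of the conic.
    """
    # Step 1: rotate the axes through theta so that the cross term vanishes.
    # x = cs*y1 - sn*y2, y = sn*y1 + cs*y2 is an orthogonal transformation.
    theta = 0.5 * math.atan2(2 * h, a - b)
    cs, sn = math.cos(theta), math.sin(theta)
    lam1 = a * cs * cs + 2 * h * cs * sn + b * sn * sn
    lam2 = a * sn * sn - 2 * h * cs * sn + b * cs * cs
    A = 2 * (g * cs + f * sn)  # coefficient of y1 in (12)
    B = 2 * (f * cs - g * sn)  # coefficient of y2 in (12)
    C = c

    # Step 2: translate, i.e. complete the square where the square term exists.
    if abs(lam1) > EPS:
        C -= A * A / (4 * lam1)
        A = 0.0
    if abs(lam2) > EPS:
        C -= B * B / (4 * lam2)
        B = 0.0

    if abs(lam1) > EPS and abs(lam2) > EPS:
        if lam1 * lam2 > 0:
            if abs(C) <= EPS:
                kind = "point"
            elif C * lam1 > 0:
                kind = "imaginary"
            else:
                kind = "circle" if abs(lam1 - lam2) <= EPS else "ellipse"
        else:
            kind = "pair of intersecting lines" if abs(C) <= EPS else "hyperbola"
    else:
        lam = lam1 if abs(lam1) > EPS else lam2
        if abs(A) > EPS or abs(B) > EPS:
            kind = "parabola"  # a linear term survives in the other variable
        elif abs(C) <= EPS:
            kind = "pair of coincident lines"
        elif C * lam < 0:
            kind = "pair of parallel lines"
        else:
            kind = "imaginary"
    return lam1, lam2, kind

# Example 6 with a = 4, and Example 8.
print(reduce_conic(5, -3, 5, math.sqrt(2) / 2, math.sqrt(2) / 2, -4))
print(reduce_conic(1, 1, 1, -3, -1, 4))
```

Note that the arguments follow the convention of (11), so g and f are half the coefficients of x and y, and the right-hand side must be moved into c.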

By now you must be wanting to try and reduce equations on your own. Try this
exercise.

E11) Reduce the following second degree equations to standard form. (Here
a ∈ ℝ.) What is the type of conic they represent?
a) x2 + 4xy + y2 = a
b) 8x2 − 4xy + 5y2 = a
c) 3x2 − 4xy = a
d) 4x2 − 4xy + y2 = 1
e) 16x2 − 24xy + 9y2 − 104x − 172y + 44 = 0
f) 4x2 − 4xy + y2 − 12x + 6y + 9 = 0

We end this unit by briefly mentioning what has been done in it.

17.7 SUMMARY
In this unit we have covered the following points.

1. A conic is defined to be the set of points in ℝ² that satisfy an equation of
second degree. Conics can be real or imaginary.

2. Real conics can be one of the following types:
ellipse, circle, hyperbola, parabola, pair of straight lines, pair of parallel
lines, pair of coincident lines, or a point. Their standard equations are
listed in Table 1.

3. All these conics, except for a pair of parallel lines, can be obtained by
taking a plane section of a right circular double cone.

4. An ellipse, a parabola and a hyperbola satisfy the focus-directrix property,
i.e., the distance of any point P on them from a fixed point (a focus) is e
(the eccentricity) times the distance of P from a fixed line (a directrix).

5. e = 1, e > 1 or e < 1 according as the conic is a parabola, a hyperbola or
an ellipse.

6. An ellipse (a hyperbola) satisfies the string property, i.e., for each point P
on the ellipse (hyperbola), the sum (absolute value of the difference) of
the distances of P from the two foci is constant, and is equal to the length
of the major (principal) axis.

7. The ellipse and parabola satisfy the reflected wave properties.

8. The ellipse, hyperbola and parabola are called non-degenerate conics.
The rest of the conics can be obtained as limiting cases of the
non-degenerate conics. The ellipse and hyperbola are non-degenerate
conics with a unique centre, and hence, are called central conics.

9. Any second degree equation can be reduced to standard form by
orthogonal transformations and translations.

17.8 SOLUTIONS/ANSWERS

E1) There can be many answers. We give the following:


a) y = x + 1 and y = x − 1 are a pair of parallel lines.
∴ {y − (x + 1)} {y − (x − 1)} = 0 represents a pair of parallel lines.

b) [y − (x + 1)]2 = 0 represents a pair of lines, both of which are y = x + 1.

E2) a) $x^2 - 2xy + y^2 = 0 \Leftrightarrow (x - y)^2 = 0$. This represents the pair of coincident
lines x − y = 0, i.e., y = x.
b) The equation represents the pair of parallel lines
$(x - 2)\left(x - \frac{1}{4}\right) = 0$,
i.e., (x − 2)(4x − 1) = 0.
c) The coincident lines x = 0, i.e., the y-axis.
d) The pair of lines x = 0 and y = 0, i.e., the y-axis and the x-axis.
E3) The equation of a circle is x2 + y2 = a2 , a ≠ 0. Applying (4) we get
Y2 + X2 = a2 . Applying (5) we get (−X)2 + Y2 = a2 , i.e., X2 + Y2 = a2 .
So, under either of these transformations the circle remains unchanged.
E4) $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1 \Rightarrow \dfrac{dy}{dx} = \dfrac{-bx}{a\sqrt{a^2 - x^2}}$ and $\dfrac{dx}{dy} = \dfrac{-ay}{b\sqrt{b^2 - y^2}}$.
∴ $\dfrac{dy}{dx} = 0$ at (0, b) and $\dfrac{dx}{dy} = 0$ at (a, 0).

∴, the tangents at (a, 0) and (0, b) are the lines x = a and y = b, respectively.

E5) The given ellipse is $\dfrac{x^2}{4} + \dfrac{y^2}{9} = 1$. ∴ a = 2, b = 3.
∴, the vertices are (0, ±3), $e = \sqrt{\dfrac{9 - 4}{9}} = \dfrac{\sqrt{5}}{3}$, the foci are $(0, \pm\sqrt{5})$ and the
corresponding directrices are $y = \pm\dfrac{9}{\sqrt{5}}$.

E6) Let P(α, β) lie on the ellipse. Then $\alpha^2 + \beta^2 + a^2e^2 = e^2\alpha^2 + a^2$,
i.e., $(\alpha + ae)^2 + \beta^2 = e^2\left(\alpha + \dfrac{a}{e}\right)^2$
⇒ distance of P from (−ae, 0) = e (distance of P from $x = -\dfrac{a}{e}$).

E7) In this case e = 0. ∴ b = a = 3. ∴ 2a = 6.


E8) a) The hyperbola is $\dfrac{x^2}{16} - \dfrac{y^2}{9} = 1$.
Here a = 4, b = 3. ∴ $e = \sqrt{\dfrac{16 + 9}{16}} = \dfrac{5}{4}$.
The vertices are (±4, 0).
The foci are (±5, 0).
The corresponding directrices are $x = \pm\dfrac{16}{5}$.
b) The hyperbola is $\dfrac{x^2}{9} - \dfrac{y^2}{25} = 1$.
Here, a = 3, b = 5. ∴ $e = \sqrt{\dfrac{9 + 25}{9}} = \dfrac{\sqrt{34}}{3}$. The vertices are (±3, 0). The foci
are $(\pm\sqrt{34}, 0)$. The corresponding directrices are $x = \pm\dfrac{9}{\sqrt{34}}$.

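E9) Now $b^2x_1^2 - a^2y_1^2 = a^2b^2$
$\Rightarrow x_1^2 + y_1^2 + a^2e^2 = e^2x_1^2 + a^2$ …(13)
$\Rightarrow (x_1 - ae)^2 + y_1^2 = e^2\left(x_1 - \dfrac{a}{e}\right)^2 \Rightarrow PF_1 = e(PD)$.
Also, (13) $\Rightarrow (x_1 + ae)^2 + y_1^2 = e^2\left(x_1 + \dfrac{a}{e}\right)^2 \Rightarrow PF_2 = e(PD')$.

E10) a) Here $p = \dfrac{3}{4}$. ∴, its focus is $\left(\dfrac{3}{4}, 0\right)$. The directrix is $x = -\dfrac{3}{4}$.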

b) The focus is (0, a) and directrix is y = −a.


c) The focus is (−a, 0) and directrix is x = a.
Their sketches are in Fig. 17.

Fig. 17: For E10

E11) a) The second degree terms give the quadratic form $x^2 + 4xy + y^2$. This
reduces to $3x_1^2 - x_2^2$. ∴, the given conic reduces to $3x_1^2 - x_2^2 = a$.
If a = 0, this is a pair of straight lines.
If a ≠ 0, this is a hyperbola.
b) $8x^2 - 4xy + 5y^2 = a$ reduces to $9x_1^2 + 4x_2^2 = a$.
If a = 0, this is a point conic.
If a < 0, this is imaginary.
If a > 0, this is an ellipse.
c) It reduces to $4x_1^2 - x_2^2 = a$.
If a = 0, it is a pair of lines.
If a ≠ 0, this is a hyperbola.
d) This reduces to $5x_1^2 = 1$. This represents a pair of parallel lines.
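e) The matrix of the form $16x^2 - 24xy + 9y^2$ is $\begin{bmatrix} 16 & -12 \\ -12 & 9 \end{bmatrix}$. Its eigenvalues
are 25 and 0. The corresponding normalised eigenvectors are $\begin{bmatrix} 4/5 \\ -3/5 \end{bmatrix}$
and $\begin{bmatrix} 3/5 \\ 4/5 \end{bmatrix}$.
∴, applying the transformation

$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 4/5 & 3/5 \\ -3/5 & 4/5 \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}$,

the conic becomes

$25x_1^2 - 104\left(\tfrac{4}{5}x_1 + \tfrac{3}{5}y_1\right) - 172\left(-\tfrac{3}{5}x_1 + \tfrac{4}{5}y_1\right) + 44 = 0$
$\Rightarrow 25x_1^2 + 20x_1 - 200y_1 + 44 = 0$
$\Rightarrow (5x_1 + 2)^2 - 40(5y_1 - 1) = 0$.

Now apply the translation $X = x_1 + \tfrac{2}{5}$, $Y = y_1 - \tfrac{1}{5}$.
We get $25X^2 = 200Y$, i.e., $X^2 = 8Y$, a parabola. ∴, the original equation is a parabola.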
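
The algebra in E11 e) is easy to get wrong by hand, so here is a small exact check in Python using rational arithmetic. It is only a sketch; the helper names `original`, `rotated` and `from_rotated` are ours. It confirms, on a grid of rational points, that the orthogonal substitution turns the given left-hand side into $25x_1^2 + 20x_1 - 200y_1 + 44$, and that this equals $(5x_1 + 2)^2 - 40(5y_1 - 1)$.

```python
from fractions import Fraction as F

def original(x, y):
    """Left-hand side of the given equation in E11 e)."""
    return 16 * x * x - 24 * x * y + 9 * y * y - 104 * x - 172 * y + 44

def rotated(x1, y1):
    """Left-hand side after the orthogonal transformation."""
    return 25 * x1 * x1 + 20 * x1 - 200 * y1 + 44

def from_rotated(x1, y1):
    """x = (4 x1 + 3 y1)/5, y = (-3 x1 + 4 y1)/5."""
    return (4 * x1 + 3 * y1) / 5, (-3 * x1 + 4 * y1) / 5

# A 5 x 5 grid of exact rational test points.
points = [(F(p), F(q)) for p in range(-2, 3) for q in range(-2, 3)]

agree = all(original(*from_rotated(x1, y1)) == rotated(x1, y1) for x1, y1 in points)
completed = all(
    rotated(x1, y1) == (5 * x1 + 2) ** 2 - 40 * (5 * y1 - 1) for x1, y1 in points
)
print(agree, completed)
```

Since both sides are polynomials of degree 2 in each variable, agreement on this grid is enough to conclude that they are identical.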
f) The matrix of $4x^2 - 4xy + y^2$ is $\begin{bmatrix} 4 & -2 \\ -2 & 1 \end{bmatrix}$.
Its eigenvalues are 5 and 0, and corresponding eigenvectors are

$\begin{bmatrix} -2/\sqrt{5} \\ 1/\sqrt{5} \end{bmatrix}$, $\begin{bmatrix} 1/\sqrt{5} \\ 2/\sqrt{5} \end{bmatrix}$.

∴, the transformation

$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -2/\sqrt{5} & 1/\sqrt{5} \\ 1/\sqrt{5} & 2/\sqrt{5} \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}$

transforms the conic to

$5x_1^2 - \dfrac{12}{\sqrt{5}}(-2x_1 + y_1) + \dfrac{6}{\sqrt{5}}(x_1 + 2y_1) + 9 = 0$
$\Rightarrow (\sqrt{5}x_1 + 3)^2 = 0$.

Now we apply the translation $X = x_1 + \dfrac{3}{\sqrt{5}}$, $Y = y_1$. We get $5X^2 = 0$, i.e., $X^2 = 0$. This
represents a pair of coincident lines.


MISCELLANEOUS EXERCISES
The few exercises given below cover the concepts and processes you have
studied in this block. Solving them will give you a better understanding of
the underlying concepts. Trying exercises on your own will also give
you more practice and confidence in solving such problems.

1. Calculate the norm of f(x) = x + 1, using the simple inner product defined
in C[0, 1].

2. Obtain a vector v = (x, y, z) ∈ ℝ³ so that v is perpendicular to (1, 0, 0) as well
as (−1, 2, 0), with respect to the standard inner product.

3. Let V be a complex inner product space and T ∈ A(V) such that T∗ = −T.
Show that eigenvectors of T corresponding to distinct eigenvalues are
mutually orthogonal.

4. Show that a triangular Hermitian matrix is a diagonal matrix.

5. For which pairs of vectors does equality hold in the Cauchy-Schwarz
inequality?

6. Can the inner product of two vectors be a negative real number?

7. Can the norm of a vector be a negative real number?

8. Let $P_2$, the space of all real polynomials of degree at most 2, have the
inner product

$\langle p, q \rangle = \int_{-1}^{1} p(x)\,q(x)\,dx$

and let p = x, q = x². Examine whether they are orthogonal or not.

9. Apply the Gram-Schmidt orthogonalisation process to find an orthonormal
basis for the subspace of ℂ⁴ generated by the vectors
{(1, i, 0, 1), (1, 0, i, 0), (−i, 0, 1, −1)}.

10. Find an orthonormal basis for ℂ³ by applying the Gram-Schmidt
orthogonalisation process to the basis
{(1, i, 0), (−i, 0, 2), (0, −i, 2)}.

11. For α ∈ ℂ, let $T_\alpha : \mathbb{C}^n \to \mathbb{C}^n$ be the linear transformation defined by
$T_\alpha(v) = \alpha v$. Find the adjoint of $T_\alpha$. When will $T_\alpha$ be Hermitian? When will
$T_\alpha$ be unitary? Justify your answer.

12. Consider the linear operator $T : \mathbb{C}^4 \to \mathbb{C}^4$, defined by

$T(z_1, z_2, z_3, z_4) = (-iz_2, iz_1, -iz_4, z_3)$.

Compute $T^*$ and check whether T is self-adjoint under the standard inner
product on ℂ⁴. Further, check whether T is unitary.

13. Suppose T is self-adjoint and $T^2 = 0$. Show that T = 0.

14. Show that T∗ T and TT∗ are self-adjoint for any operator T on V.

15. Find the quadratic form corresponding to the symmetric matrix

$\begin{bmatrix} 2 & 1 & 5 \\ 1 & -3 & -4 \\ 5 & -4 & 3 \end{bmatrix}$.

16. Express the quadratic form in the matrix notation $X^T A X$, where A is a
symmetric matrix.
a) $3x_1^2 + x_2^2 - 6x_1x_2$   b) $2x^2 + 3y^2 - z^2 + 2xy - 4xz + yz$

17. Expand $X^t A X$ as a polynomial, where $X^t = [x, y, z]$ and A is

$\begin{bmatrix} 4 & 0 & 0 \\ 0 & -\sqrt{2} & 0 \\ 0 & 0 & -1 \end{bmatrix}$.

18. Consider the change of basis of ℝ3 from the standard basis

B = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}

to the basis

B′ = {(−2, 6, 3), (3, −2, 6), (6, 3, −2)} .

Find the coordinate transformation corresponding to this change of basis.

19. Find an orthogonal coordinate transformation that eliminates the cross
product terms in the quadratic form Q and also write the orthogonal
canonical form of the quadratic form Q.
a) $Q = -x^2 - y^2 - 6xy$   b) $Q = 2x^2 + 2y^2 + 2z^2 + 4xy$

20. Identify the conic section represented by the equation
(a) $x^2 - y^2 = 10$   (b) $3x^2 + 5y^2 - 15 = 0$
(c) $5y^2 - 3x = 0$   (d) $2x^2 + 2y^2 - 50 = 0$

21. Reduce the following equations to standard form and identify the conic
section represented by the equation.
a) −x2 − y2 − 6xy = 8
b) 5x2 − 6xy + 5y2 = 16

SOLUTIONS/ANSWERS TO
MISCELLANEOUS EXERCISES
E1) $\|f\| = \langle f, f \rangle^{1/2} = \sqrt{\int_0^1 (x + 1)^2\,dx}$
$= \sqrt{\int_0^1 (x^2 + 2x + 1)\,dx}$
$= \sqrt{\left[\dfrac{x^3}{3} + x^2 + x\right]_0^1}$
$= \sqrt{\dfrac{1}{3} + 1 + 1}$
$= \sqrt{\dfrac{7}{3}}$
E2) v ⟂ (1, 0, 0) ⇒ x ⋅ 1 + y ⋅ 0 + z ⋅ 0 = 0 ⇒ x = 0
v ⟂ (−1, 2, 0) ⇒ x ⋅ (−1) + y ⋅ 2 + z ⋅ 0 = 0 ⇒ −x + 2y = 0.
So, we get x = 0, y = 0. Thus, v is of the form (0, 0, z) for z ∈ ℝ.
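E3) Let α, β ∈ ℂ be distinct eigenvalues of T. Let v, w ∈ V be eigenvectors
corresponding to α and β, respectively. Then Tv = αv and Tw = βw.
Since $T^* = -T$, we have $\beta\langle w, w\rangle = \langle Tw, w\rangle = \langle w, T^*w\rangle = -\langle w, Tw\rangle = -\bar{\beta}\langle w, w\rangle$,
so that $\bar{\beta} = -\beta$.
Now

$\alpha\langle v, w\rangle = \langle \alpha v, w\rangle = \langle Tv, w\rangle = \langle v, T^*w\rangle$
$= -\langle v, Tw\rangle = -\langle v, \beta w\rangle = -\bar{\beta}\langle v, w\rangle$
$= \beta\langle v, w\rangle$ (∵ $\bar{\beta} = -\beta$).

$\Rightarrow (\alpha - \beta)\langle v, w\rangle = 0 \Rightarrow \langle v, w\rangle = 0$, since α ≠ β.
⇒ v is orthogonal to w.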

E4) Let A be an upper triangular Hermitian matrix. Then $a_{ij} = 0$ for i > j. Also
$A = A^*$. ∴ $a_{ij} = \overline{a_{ji}}$.
∴, for i < j, $a_{ij} = \overline{a_{ji}} = \bar{0} = 0$, since j > i.
∴ ∀ i ≠ j, $a_{ij} = 0$. ∴ A is a diagonal matrix.
Similarly, if A is Hermitian and lower triangular, it must be a diagonal
matrix.
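E5) Let (V, ⟨ , ⟩) be an inner product space and x, y ∈ V. Now if x and y are
linearly dependent then one of them is a scalar multiple of the other (in
that case without any loss of generality you can assume that y = αx,
where α is a scalar).
Then

$|\langle x, y\rangle| = |\langle x, \alpha x\rangle| = |\alpha|\,|\langle x, x\rangle|$
$= |\alpha|\,\|x\|^2$
$= \|x\|\,|\alpha|\,\|x\|$
$= \|x\|\,\|\alpha x\|$
$= \|x\|\,\|y\|$

So, in this case, we can see that equality holds in the
Cauchy-Schwarz inequality. Conversely, if x and y are linearly
independent, the inequality is strict. Thus, equality holds exactly for
linearly dependent pairs of vectors.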
E6) Yes. See Example 2 in Unit 14.
E7) No. By definition, the norm of a vector is a non-negative real number.
E8) $\langle p, q\rangle = \int_{-1}^{1} x \cdot x^2\,dx = \int_{-1}^{1} x^3\,dx = 0$.
So, p ⟂ q.
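
Inner products of polynomials defined by an integral, as in E1 and E8, can be computed exactly with rational arithmetic. The following Python sketch is an illustration only; polynomials are stored as coefficient lists, and the helper names `poly_mul`, `integrate` and `inner` are our own.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists [c0, c1, c2, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += Fraction(pi) * Fraction(qj)
    return out

def integrate(p, lo, hi):
    """Exact integral of the polynomial p over the interval [lo, hi]."""
    lo, hi = Fraction(lo), Fraction(hi)
    return sum(c * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1) for k, c in enumerate(p))

def inner(p, q, lo, hi):
    """Inner product <p, q> = integral of p(x) q(x) over [lo, hi]."""
    return integrate(poly_mul(p, q), lo, hi)

x, x_squared, x_plus_1 = [0, 1], [0, 0, 1], [1, 1]
print(inner(x, x_squared, -1, 1))       # <x, x^2> on [-1, 1], prints 0
print(inner(x_plus_1, x_plus_1, 0, 1))  # <x + 1, x + 1> on [0, 1], prints 7/3
```

The second value is the square of the norm, so the norm of x + 1 on [0, 1] is its square root.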
E9) Let $v_1 = (1, i, 0, 1)$, $v_2 = (1, 0, i, 0)$, $v_3 = (-i, 0, 1, -1)$.
We want the set $\left\{\dfrac{u_1}{\|u_1\|}, \dfrac{u_2}{\|u_2\|}, \dfrac{u_3}{\|u_3\|}\right\}$, where $u_1 = v_1$, $u_2 = v_2 - \dfrac{\langle v_2, u_1\rangle}{\langle u_1, u_1\rangle}u_1$,
and $u_3 = v_3 - \dfrac{\langle v_3, u_1\rangle}{\langle u_1, u_1\rangle}u_1 - \dfrac{\langle v_3, u_2\rangle}{\langle u_2, u_2\rangle}u_2$.
Now, $\langle v_2, u_1\rangle = \langle v_2, v_1\rangle = 1 + 0 + 0 + 0 = 1$.
Also $\langle u_1, u_1\rangle = \langle v_1, v_1\rangle = 3$, so that $\|u_1\| = \sqrt{3}$.
∴ $u_2 = (1, 0, i, 0) - \dfrac{1}{3}(1, i, 0, 1) = \left(\dfrac{2}{3}, -\dfrac{i}{3}, i, -\dfrac{1}{3}\right)$
∴ $\|u_2\| = \dfrac{\sqrt{15}}{3}$.
∴ $u_3 = (-i, 0, 1, -1) - \dfrac{-1 - i}{3}(1, i, 0, 1) - \dfrac{1 - 5i}{5}\left(\dfrac{2}{3}, -\dfrac{i}{3}, i, -\dfrac{1}{3}\right)$
$= \left(\dfrac{1}{5}, \dfrac{2i}{5}, -\dfrac{i}{5}, -\dfrac{3}{5}\right)$.
∴ $\|u_3\| = \dfrac{\sqrt{15}}{5}$.
∴ $\left\{\dfrac{1}{\sqrt{3}}(1, i, 0, 1),\ \dfrac{3}{\sqrt{15}}\left(\dfrac{2}{3}, -\dfrac{i}{3}, i, -\dfrac{1}{3}\right),\ \dfrac{5}{\sqrt{15}}\left(\dfrac{1}{5}, \dfrac{2i}{5}, -\dfrac{i}{5}, -\dfrac{3}{5}\right)\right\}$ is the required
orthonormal basis.
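E10) Let $v_1 = (1, i, 0)$, $v_2 = (-i, 0, 2)$, $v_3 = (0, -i, 2)$.
We want the set $\left\{\dfrac{u_1}{\|u_1\|}, \dfrac{u_2}{\|u_2\|}, \dfrac{u_3}{\|u_3\|}\right\}$, where $u_1 = v_1$, $u_2 = v_2 - \dfrac{\langle v_2, u_1\rangle}{\langle u_1, u_1\rangle}u_1$,
and $u_3 = v_3 - \dfrac{\langle v_3, u_1\rangle}{\langle u_1, u_1\rangle}u_1 - \dfrac{\langle v_3, u_2\rangle}{\langle u_2, u_2\rangle}u_2$.
Now, $\langle v_2, u_1\rangle = \langle v_2, v_1\rangle = -i + 0 + 0 = -i$.
Also $\langle u_1, u_1\rangle = \langle v_1, v_1\rangle = 2$, so that $\|u_1\| = \sqrt{2}$.
∴ $u_2 = (-i, 0, 2) - \dfrac{-i}{2}(1, i, 0) = \left(-\dfrac{i}{2}, -\dfrac{1}{2}, 2\right)$
∴ $\|u_2\| = \dfrac{3}{\sqrt{2}}$.
∴ $u_3 = v_3 - \dfrac{\langle v_3, u_1\rangle}{\langle u_1, u_1\rangle}u_1 - \dfrac{\langle v_3, u_2\rangle}{\langle u_2, u_2\rangle}u_2$
$= \left(\dfrac{4}{9} + \dfrac{4i}{9},\ \dfrac{4}{9} - \dfrac{4i}{9},\ \dfrac{2}{9} - \dfrac{2i}{9}\right)$.
∴ $\|u_3\| = \dfrac{2\sqrt{2}}{3}$.
∴ $\left\{\dfrac{1}{\sqrt{2}}(1, i, 0),\ \dfrac{\sqrt{2}}{3}\left(-\dfrac{i}{2}, -\dfrac{1}{2}, 2\right),\ \dfrac{3}{2\sqrt{2}}\left(\dfrac{4}{9} + \dfrac{4i}{9},\ \dfrac{4}{9} - \dfrac{4i}{9},\ \dfrac{2}{9} - \dfrac{2i}{9}\right)\right\}$ is the required
orthonormal basis.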
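
If you want to check your Gram-Schmidt computations in E9 and E10, a short Python sketch like the following may be useful. It is only an illustration: the function names `inner` and `gram_schmidt` are ours, the inner product is taken to be the standard one (linear in the first argument), and the input vectors are assumed to be linearly independent.

```python
import math

def inner(u, v):
    """Standard inner product on C^n: linear in u, conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthonormal list spanning the same subspace as the inputs."""
    ortho = []
    for v in vectors:
        u = list(v)
        for w in ortho:
            # subtract the component of v along the earlier vector w
            coeff = inner(v, w) / inner(w, w)
            u = [ui - coeff * wi for ui, wi in zip(u, w)]
        ortho.append(u)
    # normalise each orthogonal vector
    return [[ui / math.sqrt(inner(u, u).real) for ui in u] for u in ortho]

basis3 = gram_schmidt([(1, 1j, 0), (-1j, 0, 2), (0, -1j, 2)])                # E10
basis4 = gram_schmidt([(1, 1j, 0, 1), (1, 0, 1j, 0), (-1j, 0, 1, -1)])       # E9
for vec in basis3:
    print(vec)
```

The printed vectors are decimal approximations, so compare them with the exact answers above only up to rounding.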
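E11) We have, for u, v ∈ ℂⁿ, $\langle T_\alpha u, v\rangle = \langle \alpha u, v\rangle = \langle u, \bar{\alpha} v\rangle = \langle u, T_{\bar{\alpha}} v\rangle$. Since
$\langle T_\alpha u, v\rangle = \langle u, T_\alpha^* v\rangle$, it follows that $T_\alpha^* v = \bar{\alpha} v$. If $T_\alpha$ is self-adjoint, we have
$T_\alpha v = T_\alpha^* v$, i.e., $\alpha v = \bar{\alpha} v$ for all v ∈ ℂⁿ. If v ≠ 0, $(\alpha - \bar{\alpha})v = 0 \Leftrightarrow \alpha - \bar{\alpha} = 0$. So,

$T_\alpha = T_\alpha^* \Leftrightarrow \alpha v = \bar{\alpha} v\ \forall v \Leftrightarrow \alpha - \bar{\alpha} = 0 \Leftrightarrow \alpha$ is a real number.

So, $T_\alpha$ is self-adjoint iff α ∈ ℝ.

We have $(T_\alpha^* \circ T_\alpha)(v) = T_\alpha^*(\alpha v) = \bar{\alpha}\alpha v = |\alpha|^2 v$. Therefore

$T_\alpha$ is unitary $\Leftrightarrow T_\alpha^* \circ T_\alpha = I$
$\Leftrightarrow (T_\alpha^* \circ T_\alpha)(v) = v\ \forall v \in \mathbb{C}^n$
$\Leftrightarrow |\alpha|^2 v = v\ \forall v \in \mathbb{C}^n$
$\Leftrightarrow |\alpha| = 1$

Therefore $T_\alpha$ is unitary iff |α| = 1.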


E12) Let $u = (u_1, u_2, u_3, u_4)$ and $w = (w_1, w_2, w_3, w_4) \in \mathbb{C}^4$. We have

$\langle T(u), w\rangle = \langle(-iu_2, iu_1, -iu_4, u_3), (w_1, w_2, w_3, w_4)\rangle$
$= -iu_2\bar{w}_1 + iu_1\bar{w}_2 - iu_4\bar{w}_3 + u_3\bar{w}_4$
$= u_1\overline{(-iw_2)} + u_2\overline{(iw_1)} + u_3\bar{w}_4 + u_4\overline{(iw_3)}$
$= \langle(u_1, u_2, u_3, u_4), (-iw_2, iw_1, w_4, iw_3)\rangle$

∴ $T^*(w_1, w_2, w_3, w_4) = (-iw_2, iw_1, w_4, iw_3)$

Here $T \neq T^*$. So, T is not self-adjoint.

We have

$(TT^*)(u_1, u_2, u_3, u_4) = T(-iu_2, iu_1, u_4, iu_3)$
$= ((-i)(iu_1), i(-iu_2), (-i)(iu_3), u_4)$
$= (u_1, u_2, u_3, u_4)$

$TT^* = I$, so T is unitary.
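
The defining property of the adjoint and the unitarity of T can be spot-checked on sample vectors. The Python sketch below is an illustration only; the two test vectors `u` and `w` are arbitrary choices of ours, and passing such a check is evidence rather than a proof.

```python
def T(z):
    """The operator T(z1, z2, z3, z4) = (-i z2, i z1, -i z4, z3)."""
    z1, z2, z3, z4 = z
    return (-1j * z2, 1j * z1, -1j * z4, z3)

def T_star(w):
    """The adjoint computed above: T*(w1, w2, w3, w4) = (-i w2, i w1, w4, i w3)."""
    w1, w2, w3, w4 = w
    return (-1j * w2, 1j * w1, w4, 1j * w3)

def inner(u, v):
    """Standard inner product on C^4: linear in u, conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

# Arbitrary sample vectors with Gaussian-integer entries.
u = (1 + 2j, -1j, 3, 2 - 1j)
w = (2, 1 + 1j, -1 + 4j, 1j)

print(inner(T(u), w) == inner(u, T_star(w)))  # adjoint property on this pair
print(T(T_star(u)) == u)                      # T T* acts as the identity on u
print(T(u) == T_star(u))                      # False here, so T is not self-adjoint
```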

E13) For any v ∈ V, we have, since $T^* = T$,

$\|Tv\|^2 = \langle Tv, Tv\rangle = \langle v, T^*Tv\rangle = \langle v, T^2v\rangle$
$= \langle v, 0\rangle$
$= 0$.

Thus, ‖Tv‖ = 0 and hence Tv = 0. Since Tv = 0 for every v ∈ V, we have
T = 0.

E14) (T∗ T)∗ = T∗ T∗∗ = T∗ T,


and (TT∗ )∗ = T∗∗ T∗ = TT∗ .
Therefore, T∗ T and TT∗ are self-adjoint.

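E15) Quadratic form $= 2x^2 - 3y^2 + 3z^2 + 2xy + 10xz - 8yz$.

E16) a) $A = \begin{bmatrix} 3 & -3 \\ -3 & 1 \end{bmatrix}$

b) $A = \begin{bmatrix} 2 & 1 & -2 \\ 1 & 3 & \frac{1}{2} \\ -2 & \frac{1}{2} & -1 \end{bmatrix}$

E17) The required polynomial $= 4x^2 - \sqrt{2}\,y^2 - z^2$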
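
The rule used in E15 to E17 is that a diagonal entry $a_{ii}$ gives the coefficient of the square term, while the off-diagonal entries $a_{ij}$ and $a_{ji}$ together give the coefficient of the cross term. A minimal Python sketch of this rule follows; the function name `quadratic_form_terms` and the dictionary representation are our own illustrative choices.

```python
def quadratic_form_terms(A, names):
    """Map each monomial of X^T A X to its coefficient, for a symmetric matrix A."""
    terms = {}
    n = len(A)
    for i in range(n):
        terms[names[i] + "^2"] = A[i][i]
        for j in range(i + 1, n):
            # a_ij and a_ji both contribute to the x_i x_j term
            terms[names[i] + names[j]] = A[i][j] + A[j][i]
    return terms

# The symmetric matrix of Exercise 15.
A = [[2, 1, 5], [1, -3, -4], [5, -4, 3]]
terms = quadratic_form_terms(A, "xyz")
print(terms)
```

Reading the dictionary term by term gives the quadratic form asked for in Exercise 15.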


E18) Let the coordinates of a vector be $X = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$ and $Y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$
with respect to the bases B and B′, respectively. Then the coordinate
transformation is given by $X = \begin{bmatrix} -2 & 3 & 6 \\ 6 & -2 & 3 \\ 3 & 6 & -2 \end{bmatrix} Y$.

$\Rightarrow x_1 = -2y_1 + 3y_2 + 6y_3$
$x_2 = 6y_1 - 2y_2 + 3y_3$
$x_3 = 3y_1 + 6y_2 - 2y_3$

is the required coordinate transformation.


E19) a) The matrix of the quadratic form is $\begin{bmatrix} -1 & -3 \\ -3 & -1 \end{bmatrix}$.
The eigenvalues are −4, 2.
∴, the required orthogonal canonical form will be $-4x_1^2 + 2y_1^2$.
The eigenvectors corresponding to −4 and 2 are $\begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$ and $\begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$,
respectively.
∴, the orthogonal coordinate transformation is given by

$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}$.

b) The matrix of the quadratic form is $\begin{bmatrix} 2 & 2 & 0 \\ 2 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$.
The eigenvalues are 4, 2, 0.
The required orthogonal canonical form is $4x_1^2 + 2y_1^2$.
The eigenvectors corresponding to 4, 2, 0 are $\begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \\ 0 \end{bmatrix}$.
∴, the orthogonal coordinate transformation is given by

$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & 0 & -1/\sqrt{2} \\ 1/\sqrt{2} & 0 & 1/\sqrt{2} \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix}$


E20) a) hyperbola
b) ellipse
c) parabola
d) circle
E21) a) The second degree terms give the quadratic form $-x^2 - y^2 - 6xy$. This
reduces to $-4x_1^2 + 2y_1^2$. (After using the transformation as we have
done in E19).
∴, the given conic reduces to $-4x_1^2 + 2y_1^2 = 8$.
∴ The standard form is

$\dfrac{y_1^2}{2^2} - \dfrac{x_1^2}{(\sqrt{2})^2} = 1$

This is a hyperbola with the $y_1$-axis as principal axis and the $x_1$-axis as
conjugate axis.
b) The second degree terms give the quadratic form $5x^2 - 6xy + 5y^2$.
This reduces to $8x_1^2 + 2y_1^2$. (After using the transformation as we have
done in Example 6).
∴, the given conic reduces to $8x_1^2 + 2y_1^2 = 16$.
∴ The standard form is

$\dfrac{x_1^2}{(\sqrt{2})^2} + \dfrac{y_1^2}{(\sqrt{8})^2} = 1$

which is an ellipse with the $y_1$-axis as major axis and the $x_1$-axis as minor axis.
