Geometry Draft 2025 2

The document is a draft for a Geometry I course, outlining various topics such as affine space, Cartesian coordinates, Euclidean space, and curves and surfaces. It includes detailed sections on geometric vectors, affine maps, isometries, and quadratic curves, among others. The document serves as a comprehensive guide for first-year students studying geometry, with appendices covering additional mathematical concepts.

Uploaded by

sararoza123321
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
48 views225 pages

Geometry Draft 2025 2

The document is a draft for a Geometry I course, outlining various topics such as affine space, Cartesian coordinates, Euclidean space, and curves and surfaces. It includes detailed sections on geometric vectors, affine maps, isometries, and quadratic curves, among others. The document serves as a comprehensive guide for first-year students studying geometry, with appendices covering additional mathematical concepts.

Uploaded by

sararoza123321
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 225

Geometry I

Draft 2025
Contents

1 Affine space 5
1.1 Geometric Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Vector space structure of geometric vectors . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3 Affine space structure of the Euclidean space . . . . . . . . . . . . . . . . . . . . . . . . 17

2 Cartesian coordinates 19
2.1 Frames in dimension 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Frames in dimension 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.3 Frames in dimension n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3 Affine subspaces 31
3.1 Lines in A2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2 Planes in A3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3 Lines in A3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.4 Affine subspaces of An . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

4 Euclidean space 49
4.1 Angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.2 Scalar product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.3 Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

5 Area and volume 67


5.1 Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.2 Cross product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.3 Volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

6 Affine maps 87
6.1 Properties of affine maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

2
Geometry - first year [email protected]

6.2 Projections and reflections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

7 Isometries 99
7.1 Affine form of isometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
7.2 Isometries in dimension 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
7.3 Isometries in dimension 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.4 Moving points with isometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

8 Curves and surfaces 115


8.1 Smooth curves in En . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
8.2 Smooth hypersurfaces in En . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

9 Quadratic curves (conics) 128


9.1 Ellipse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
9.2 Hyperbola . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
9.3 Parabola . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

10 Hyperquadrics 142
10.1 Hyperquadrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
10.2 Reducing to canonical form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
10.3 Classification of conics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
10.4 Classification of quadrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

11 Canonical equations of real quadrics 159


11.1 Ellipsoid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
11.2 Elliptic Cone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
11.3 Hyperboloid of one sheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
11.4 Hyperboloid of two sheets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
11.5 Elliptic paraboloid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
11.6 Hyperbolic paraboloid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

Appendices 178

A Axioms 179

B Lines and the real numbers 182

C Changing the basis in a vector space 185

D Coordinate systems 187


D.1 Polar coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
D.2 Cylindrical coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
D.3 Spherical coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
D.4 Barycentric coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189


E Bundles of lines and planes 190


E.1 Bundle of lines in A2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
E.2 Bundle of planes in A3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

F Some classical theorems in affine geometry 193

G Eigenvalues and Eigenvectors 205


G.1 Characteristic polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206

H Bilinear forms and symmetric matrices 208


H.1 Affine diagonalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
H.2 Isometric diagonalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212

I Trigonometric functions 216

J Some classical theorems in Euclidean geometry 219

K Quaternions and rotations 221


K.1 Algebraic considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
K.2 Geometric considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

Bibliography 224

CHAPTER 1

Affine space

Contents
1.1 Geometric Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Vector space structure of geometric vectors . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3 Affine space structure of the Euclidean space . . . . . . . . . . . . . . . . . . . . . . . 17


1.1 Geometric Vectors


In this section, we introduce the concept of a vector based on the axiomatic framework provided by
Hilbert in his Foundations of Geometry [14]. Vectors are at the heart of our understanding of Euclidean
geometry. Axioms are the starting point of any process of logical reasoning. Hilbert’s axioms can be
found in Appendix A.
We denote by E the Euclidean space. It consists of a set of elements called points, which are
governed by the axioms. Points are primitives and the axioms describe how primitives interact,
not what they are. The interaction is expressed through relations, such as collinearity, betweenness,
coplanarity, incidence and congruence.
One approach to understanding a theory from its axioms is to systematically explore the infor-
mation derived from a given set of primitives. If a single point is given, no additional information
can be inferred. If two points are given, we arrive at the concept of a vector. A vector encapsulates all
the information carried by two points through the axioms. In Euclidean geometry, the information
carried by two points is length and direction. To formally define vectors, we treat two (not necessarily
distinct) points A and B as an ordered pair (A, B).
We begin with the concept of length. Given two points A and B, Axiom I.1 ensures the existence
of a straight line passing through them, commonly denoted as AB. For brevity, we use the shorter
term line for a straight line. Using the betweenness relation, the Axioms of Order allow us to define
segments. The segment [AB] is the set of points on the line AB which lie between the points A and B
together with the points A and B. Segments [AA], with equal endpoints, are called trivial segments.
If [AB] is congruent to [CD] we write [AB] ≡ [CD].

Definition 1.1. We say that two ordered pairs of points (A, B) and (C, D) are equidistant, and we write
(A, B) ≡ (C, D) if and only if the segments [AB] and [CD] are congruent¹.

Proposition 1.2. The equidistance relation is an equivalence relation.

Proof. The equidistance relation is equivalent to the congruence relation on segments. We show that
the latter relation is an equivalence relation. We need to show reflexivity, symmetry and transitivity.
In order to show that the relation is reflexive, fix [AB] and construct a congruent segment [A′ B′ ] using
Axiom III.1. Then, applying Axiom III.2 to the congruences [AB] ≡ [A′ B′ ] and [AB] ≡ [A′ B′ ] we obtain
[AB] ≡ [AB]. For symmetry, assume that [AB] ≡ [CD]. Applying Axiom III.2 to the congruences
[CD] ≡ [CD] and [AB] ≡ [CD] we obtain [CD] ≡ [AB] as desired. For transitivity, assume that [AB] ≡
[CD] and that [CD] ≡ [EF]. By symmetry we have [EF] ≡ [CD]. Then, applying Axiom III.2 to the
congruences [AB] ≡ [CD] and [EF] ≡ [CD] we obtain [AB] ≡ [EF] as desired.

Definition 1.3. The equivalence classes of the equidistance relation are called distances or lengths.
The distance defined by the pair (A, B) is denoted by |AB|. It is also called the length of the segment
[AB]. Explicitly, we have
|AB| = { ordered pairs (X, Y ) such that (X, Y ) ≡ (A, B) }.

¹ The axioms sometimes refer to segments as being ‘congruent or equal’ which may suggest that congruence is reserved
for unequal segments. For us, if two segments are equal they are congruent.


We say that (A, B) represents the distance |AB| and that [AB] represents the length |AB|. If |AB| = |CD| we
say that [AB] and [CD] have the same length. The set of distances/lengths defined by segments in E is
the set of equivalence classes:
L = { |AB| : (A, B) ∈ E × E } = (E × E)/≡ .
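Although coordinates only appear in Chapter 2, a small numerical sketch may help fix ideas. Here we model points of E as pairs of real numbers, which is an assumption outside the axiomatic development, and test equidistance by comparing Euclidean distances; the function names are illustrative only.

```python
import math

def dist(P, Q):
    """Euclidean distance between points given as (x, y) tuples."""
    return math.hypot(Q[0] - P[0], Q[1] - P[1])

def equidistant(pair1, pair2):
    """(A, B) ≡ (C, D) in the coordinate model: the segments have equal length."""
    (A, B), (C, D) = pair1, pair2
    return math.isclose(dist(A, B), dist(C, D))

# (0,0)-(3,4) and (1,1)-(1,6) both represent the length 5,
# so the two ordered pairs belong to the same class in L.
print(equidistant(((0, 0), (3, 4)), ((1, 1), (1, 6))))  # True
```

In this model a length, i.e. an element of L, is identified with a non-negative real number, anticipating Theorem B.9.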

Next, we introduce the concept of direction. The Axioms of Order also allow us to define the
notion of sides of the line AB with respect to the point A. The points B, C ∈ AB are on the same side of
AB relative to A if A is not between B and C (see [14, p.8]). The set of points on the same side of a line
relative to A is also called a ray emanating from A. If it contains the point B, we denote it by (AB.

Lemma 1.4. For any segment [AB], any line ℓ and any point A′ ∈ ℓ there is a unique segment [A′ B′ ],
congruent to [AB], lying on ℓ on a given side relative to A′ .

Proof. The existence of the point B′ , and therefore of the segment [A′ B′ ], is guaranteed by Axiom
III.1. In order to prove uniqueness, suppose for a contradiction that there is another point B′′ on
the ray (A′ B′ such that [A′ B′ ] ≡ [A′ B′′ ]. Choose a point C ′ not on the line A′ B′ . The existence of
such a point is guaranteed by Axioms I.2 and I.8. We have [A′ C ′ ] ≡ [A′ C ′ ], [A′ B′ ] ≡ [A′ B′′ ] and
∡C ′ A′ B′ ≡ ∡C ′ A′ B′′ . Applying Axiom III.6 we obtain ∡A′ C ′ B′ ≡ ∡A′ C ′ B′′ which contradicts Axiom
III.4.


When doing geometry it is customary to avoid degenerate cases and overlaps. However, for our
purpose of building on the axioms, we take degenerate cases into account when defining directions.
This will open the way to linear algebra tools. Before giving the definition, let us recall that the
Axioms of Order allow us to define the notion of sides of a plane α with respect to a line contained
in α (see [14, p.8]).

Definition 1.5. We say that two ordered pairs of points (A, B) and (C, D) are equidirectional, and we
write (A, B) ≙ (C, D), in the following three cases:

1. if A = B and C = D;

2. if A ≠ B, the points A, B, C are collinear and

2.1. A = C and B, D are on the same side relative to A, or


2.2. B, C are on the same side of A and A, D are on opposite sides of C, or
2.3. B, C are on opposite sides of A and A, D are on the same side of C;


(Figure: representative configurations of the collinear cases 2.1, 2.2 and 2.3.)

3. if A ≠ B, the points A, B, C are not collinear, AB and CD are parallel and B and D lie on the same
side of AC in the plane that the four points determine.
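In the coordinate model of the plane (again anticipating Chapter 2 and not part of the axiomatic development), the three cases above collapse into a single algebraic condition: either both pairs are trivial, or the displacement D − C is a positive scalar multiple of B − A. A sketch, with illustrative names:

```python
def equidirectional(pairAB, pairCD, tol=1e-9):
    """Coordinate model of Definition 1.5: (A, B) ≙ (C, D) iff both pairs
    are trivial, or D - C is a positive scalar multiple of B - A."""
    (A, B), (C, D) = pairAB, pairCD
    u = (B[0] - A[0], B[1] - A[1])
    v = (D[0] - C[0], D[1] - C[1])
    if u == (0, 0) or v == (0, 0):
        return u == (0, 0) and v == (0, 0)   # case 1: A = B and C = D
    cross = u[0] * v[1] - u[1] * v[0]        # parallelism test
    dot = u[0] * v[0] + u[1] * v[1]          # same-orientation test
    return abs(cross) < tol and dot > 0

# parallel and pointing the same way:
print(equidirectional(((0, 0), (2, 0)), ((5, 3), (9, 3))))   # True
# parallel but with opposite orientation:
print(equidirectional(((0, 0), (2, 0)), ((9, 3), (5, 3))))   # False
```

The second call illustrates that |AB⟩ and −|AB⟩ are distinct directions.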

Proposition 1.6. Consider two pairs of distinct points (A, B) and (C, D).

1. If AB = CD then (A, B) ≙ (C, D) if and only if (AB ∩ (CD = (AB or (AB ∩ (CD = (CD.

2. If AB, CD are distinct but parallel then (A, B) ≙ (C, D) if and only if [AD] intersects [BC].

Proof. 1. If A = C then, by definition, we have (A, B) ≙ (C, D) if and only if B, D are on the same side
of A. Thus, by definition, we have (AB = (CD, hence (AB∩(CD = (AB = (CD. For the remaining cases
we make use of the following fact which intuitively is clear: (∗) four points on a line can always be
relabeled P1 , P2 , P3 , P4 such that P2 and P3 lie between P1 and P4 , and furthermore, that P2 lies between
P1 and P3 and P3 lies between P2 and P4 . This is Theorem 5 in [14].
Consider the case where A , C. For the implication from left to right, assume first that B, C are
on the same side of A. Since C is between A and D, by (∗), for any point X on (CD we have C between
A and X, thus X ∈ (AC. Since (AC = (AB we showed that (AB ∩ (CD = (CD. Next assume that B, C
are on opposite sides of A. By (∗), for any X ∈ (AB we have A between C and X, thus X ∈ (CA. Since
A, D are on the same side of C we also have (CA = (CD, hence (AB ∩ (CD = (AB. The implication from
right to left is easier.
2. Now assume that the lines AB and CD are distinct and parallel. We show the implication from
left to right since the other direction is easier. Assume for a contradiction that B and C lie on the same

side of the line AD. Then (DC lies on the same side of AD as B. By assumption, since (A, B) = (C, D),
we also have (CD on the same side of AC as B. By Axiom II.2, we may choose a point P between C
and D. Since P lies on both (CD and (DC it lies on the same side of AC and AD as B. Thus B lies in
the interior of the angle described by (AC and (AD. Hence AB intersects [CD], contradicting AB∥CD.
Thus B and C lie on opposite sides of AD, i.e. [BC] intersects AD. Interchanging the role of B, C with
A, D we find that [AD] intersects BC. Therefore [BC] intersects [AD].

Proposition 1.7. The equidirectional relation is an equivalence relation.


Proof. We need to show that the equidirectional relation is reflexive, symmetric and transitive. Re-
flexivity and symmetry follow directly from the definition. For transitivity, let (A, B), (C, D) and (E, F)
be three ordered pairs of points. Assuming that (A, B) ≙ (C, D) and that (C, D) ≙ (E, F), we need to
show that (A, B) ≙ (E, F). If A = B then C = D, thus E = F and the claim follows in this case. For the
rest of the proof assume that the considered pairs consist of distinct points.


Next, we prove transitivity along lines by treating the case when the four points are collinear.
By Proposition 1.6, we have (AB ∩ (CD = (AB or (CD and similarly, (EF ∩ (CD = (EF or (CD. In all
four cases, by Axiom V.1 we find a point P such that (AB = (AP , (CD = (CP and (EF = (EP . Then, if
(AP ∩ (CP = (AP and (EP ∩ (CP = (EP , we have A, E between C and P . It follows from Theorem 5 in
[14] that E lies between C and A or that A lies between C and E. In the first case (AP ∩ (EP = (EP
and in the second case (AP ∩ (EP = (AP . The other three cases are treated similarly. Thus, for the rest
of the proof we may assume that we consider pairs of distinct points and that the six points are not
collinear.
Consider the cases where two of the lines overlap. The cases where AB or EF equal CD follow
directly from transitivity along lines proved in the previous paragraph. It remains to consider the
case where AB = EF = ℓ. Since the six points are not collinear, the line CD is distinct from ℓ and
parallel to ℓ. We need to show that (A, B) ≙ (E, F). By transitivity along lines, we may replace (A, B)
and (E, F) by other equidirectional pairs of points on ℓ. Thus, we may assume that A = E. Then B and
F are on the same side of the line AC = EC since (A, B) ≙ (C, D) and since (E, F) ≙ (C, D). Hence B, F
are on the same side of A.
Finally, consider the case where the three lines AB, CD and EF are distinct. Again, by transitivity
along lines, we may shift the pairs of points along these three lines. Thus we may assume that C is the
intersection of AE with CD. Then B, D are on the same side of the line AC = AE since (A, B) ≙ (C, D)
and D, F are on the same side of the line CE = AE since (C, D) ≙ (E, F). Thus B, F are on the same side
relative to the line AE.

Definition 1.8. The equivalence classes of the equidirectional relation are called directions. The
direction containing the ordered pair (A, B) is denoted by |AB⟩. Explicitly, we have
|AB⟩ = { ordered pairs (X, Y ) such that (X, Y ) ≙ (A, B) }.

We say that (A, B) is a representative of the direction |AB⟩. If |AB⟩ = |CD⟩ we say that (A, B) and (C, D)
define the same direction. The set of directions defined with points in E is the set of equivalence
classes:
D = { |AB⟩ : (A, B) ∈ E × E } = (E × E)/≙ .
The direction |BA⟩ is called the opposite direction of |AB⟩ and is denoted by −|AB⟩. This defines an
involution −□ : D → D.
At the intersection of the equidistance relation and the equidirectional relation lies the equipol-
lence relation which is used to define vectors. Before defining this relation, we define degenerate
parallelograms first.
Definition 1.9. A parallelogram ABCD is an ordered quadruple of points (A, B, C, D) with the usual
property of having parallel opposite sides if the four points are not collinear. In this case, the labeling
is such that C, D are on the same side of AB and the points D, A are on the same side of BC.
If the four points are collinear we require that the segments [AC] and [BD] have the same mid-
point. In this case, we say that the parallelogram is degenerate.
Definition 1.10. Two ordered pairs of points (A, B) and (C, D) are called equipollent, and we write
(A, B) ∼ (C, D), if the segments [AD] and [BC] have the same midpoint.
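Equipollence is easy to test numerically in the coordinate model of the plane (a sketch outside the axiomatic development; the names are illustrative): (A, B) ∼ (C, D) exactly when [AD] and [BC] share their midpoint, which in coordinates amounts to B − A = D − C.

```python
def midpoint(P, Q):
    """Midpoint of the segment [PQ] in coordinates."""
    return ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)

def equipollent(pairAB, pairCD, tol=1e-9):
    """Definition 1.10 in coordinates: (A, B) ~ (C, D) iff [AD] and [BC]
    share their midpoint."""
    (A, B), (C, D) = pairAB, pairCD
    M1, M2 = midpoint(A, D), midpoint(B, C)
    return abs(M1[0] - M2[0]) < tol and abs(M1[1] - M2[1]) < tol

# ABDC parallelogram: A = (0,0), B = (2,1), C = (4,4), D = (6,5);
# both diagonals [AD] and [BC] have midpoint (3, 2.5).
print(equipollent(((0, 0), (2, 1)), ((4, 4), (6, 5))))   # True
```

Note that in the example B − A = D − C = (2, 1), in agreement with the equivalent characterizations collected in Proposition 1.11 below.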



Proposition 1.11. For two ordered pairs of points (A, B) and (C, D) the following statements are
equivalent:

1. (B, A) ∼ (D, C).

2. (A, B) ∼ (C, D).

3. ABDC is a parallelogram, possibly degenerate.

4. |AB⟩ = |CD⟩ and [AB] ≡ [CD].

5. |AB⟩ = |CD⟩ and |AB| = |CD|.

Proof. First notice that the equivalence (1. ⇔ 2.) follows directly from Definition 1.10 and that (4. ⇔
5.) is by Definition 1.1. Thus, it suffices to show (2. ⇒ 3.), (3. ⇒ 4.) and (4. ⇒ 2.).
(2. ⇒ 3.) Let (A, B) ∼ (C, D). Denote by M the common midpoint of the segments [AD] and [BC].
If A = B then [AD] and [AC] have M as common midpoint. Then, if A = C or A = D it follows that all
four points coincide, hence ABDC is a parallelogram by definition. If on the other hand A ≠ C and
A ≠ D then M lies between A and C and between A and D. From Lemma 1.4 it follows that C = D.
Thus, for A = B we showed that ABDC is a degenerate parallelogram. A similar argument shows
that ABDC is a parallelogram if C = D. For the rest of the proof we may assume that A ≠ B and that
C ≠ D.
Assume further that A, B, C are collinear. Since M is the common midpoint of [AD] and [BC] we
have, as an equality of lines, DM = AM = BC. Thus, the four points are collinear and ABDC is a degenerate parallelogram by
definition.
Assume next that A ≠ B, C ≠ D and that A, B, C are not collinear. In this case the lines AB and CD
are distinct. Therefore, the common midpoint M of [AD] and [BC] cannot lie on any of these two
lines. Thus, the angles ∡CMD and ∡AMB are defined (see [14, p.11]). They are so-called vertical
angles (see [14, §6]) and it follows from [14, Theorem 14] that vertical angles are congruent. Thus,
by the first congruence theorem for triangles [14, Theorem 12], the triangles MCD and MBA are
congruent. In particular ∡MDC is congruent to ∡MAB and, again by [14, Theorem 14], they have
congruent supplementary angles. It then follows from [14, Theorem 30] that the lines AB and CD
are parallel. Similarly, one shows that AC and BD are parallel. Thus ABDC is a parallelogram.
(3. ⇒ 4.) Let ABDC be a parallelogram. Assume first that it is degenerate, i.e. the four points are
collinear. Let M be the common midpoint of [AD] and [CB]. First consider the case where A = B. If
C = D we are done since [AB] ≡ [CD] and |AB⟩ = |CD⟩. Assume for a contradiction that C ≠ D. Then



D, M are on the same side of A and C, M are on the same side of B = A. From Lemma 1.4 it follows
that C = D. A similar argument shows that if C = D then A = B and our claim follows. Thus, we may
assume that A ≠ B and C ≠ D. Assume now that A = C. Then D, M are on the same side of A and
B, M are on the same side of C = A. From Lemma 1.4 it follows that B = D and our claim follows.
Next, assume that A = D. Then M = A = D. Since M is the midpoint of [CB], we have that A = D
lies between C and B, hence |CD⟩ = |AB⟩. Moreover, since A = D is the midpoint of [CB] we have
[CD] ≡ [AB] by Lemma 1.4. We may assume for the rest of the proof that the four points are distinct.
Since M is the common midpoint of the segments [AD] and [BC], from Axiom III.3 we deduce
that [AB] ≡ [CD]. It remains to show that |AB⟩ = |CD⟩. Since M is the midpoint of [AD], the points
A, D lie on opposite sides of M on the line AD. Similarly for B and C. Thus, we have two possibilities:
either A and B are on the same side relative to M or A and C are on the same side relative to M. In
the first case, there are two possibilities: either B is between A and M or A is between B and M. If B
is between A and M, since M is the common midpoint of [AD] and [BC], it follows from Axiom III.3
that C lies between M and D. Therefore A and D are on opposite sides relative to C, i.e. (A, B) and
(C, D) define the same direction. The remaining three cases are treated similarly.
Now assume that ABDC is not degenerate. By definition, the opposite sides of ABDC are parallel.
By [14, Theorem 30], we have the following congruences ∡CDA ≡ ∡DAB and ∡DAC ≡ ∡ADB. By
the second congruence theorem for triangles [14, Theorem 13], the triangles ACD and DBA are
congruent, in particular [AB] ≡ [CD]. Moreover, by definition, the labeling of the vertices is such
that B and D lie on the same side of the line determined by A and C. Thus, since the sides are parallel,
the pairs (A, B) and (C, D) define the same direction.


(4. ⇒ 2.) Assume that [AB] ≡ [CD] and |AB⟩ = |CD⟩. Then AB is parallel to CD. If the four


points are on the same line ℓ, the configuration is degenerate. By definition, since (A, B) and (C, D)
define the same direction, we have two cases. In the first case B, C are on the same side of ℓ relative
to A and A, D are on distinct sides of ℓ relative to C. Assume that C is between B and D and let M be
the midpoint of [AD]. Then, by Axiom III.3 applied to M, A, B and M, D, C we have [MB] ≡ [MC], i.e.
M is also the midpoint of [BC]. The other cases are treated similarly. The only thing to pay attention
to is that Axiom III.3 requires segments which do not overlap.
Finally, assume that the four points are not collinear. Denote by M the intersection of the diago-
nals. It exists by Point 2. of Proposition 1.6. By [14, Theorem 30] and the second congruence theorem
for triangles [14, Theorem 13], the triangles MAB and MDC are congruent. Thus [AM] ≡ [MD] and
[BM] ≡ [MC], i.e. M is the common midpoint of [AD] and [BC].

Vectors are defined as equivalence classes of the equipollence relation (see Definition 1.13 below).
To this end, the following proposition shows that this relation is indeed an equivalence relation.

Proposition 1.12. The equipollence relation is an equivalence relation.

Proof. By Proposition 1.11 we have that (A, B) ∼ (C, D) if and only if |AB⟩ = |CD⟩ and [AB] ≡ [CD].
By Proposition 1.2 and Proposition 1.7 both the equidistance relation and the equidirectional relation
are equivalence relations. Thus, reflexivity, symmetry and transitivity for the equipollence relation
follow from the corresponding properties of these two relations.

Definition 1.13. The equivalence classes of the equipollence relation are called vectors. The vector
containing the ordered pair (A, B) is denoted by \overrightarrow{AB}. Explicitly, we have

\overrightarrow{AB} = { ordered pairs (X, Y ) such that (X, Y ) ∼ (A, B) }.

We say that (A, B) is a representative of the vector \overrightarrow{AB}. If \overrightarrow{AB} = \overrightarrow{CD} we say that (A, B) and (C, D) define
the same vector. By Proposition 1.11 we have that \overrightarrow{AB} = \overrightarrow{CD} if and only if |AB⟩ = |CD⟩ and |AB| = |CD|.
The set of vectors defined with points in E is the set of equivalence classes:

V = { \overrightarrow{AB} : (A, B) ∈ E × E } = (E × E)/∼ .

The vector \overrightarrow{AA} is called the zero vector and we denote it by \vec{0} or simply by 0 when there is no risk of
confusion. Since all representatives of a vector \overrightarrow{AB} define the same length |AB|, this will also be the
length of the vector \overrightarrow{AB} and we denote it by |\overrightarrow{AB}|. The vector \overrightarrow{BA} is called the opposite of the vector
\overrightarrow{AB} and is denoted by −\overrightarrow{AB}. This defines an involution −□ : V → V.

The first observation about vectors (which we prove below) is that for any fixed but arbitrary
point O there is a 1-to-1 correspondence between points A and vectors \overrightarrow{OA}. In this correspondence
\overrightarrow{OA} is called the position vector of A relative to O or, if it is clear from the context what O is, we simply
say the position vector of A.

Proposition 1.14. For any ordered pair of points (A, B) and any point O, there exists a unique point
X such that (A, B) ∼ (O, X).


Proof. Assume first that A, B and O are not collinear. By Axiom III.4 there is a line ℓ passing through
O and having the same angles with AO as AB. By [14, Theorem 22] the lines ℓ and AB cannot have a
point in common, i.e. they are parallel. By Axiom IV, the line ℓ is the unique line passing through O
which is parallel to AB. Similarly, there is a unique line ℓ ′ passing through B and which is parallel to
OA. Let X be the intersection point of ℓ and ℓ ′ . Then ABXO is a parallelogram. Hence, by Proposition
1.11, we have (A, B) ∼ (O, X).
Now assume that A, B and O lie on a line ℓ. If O = A then [AB] = [OB], thus (A, B) ∼ (O, X) if
and only if X = B. If O ≠ A, consider the two sides in which A divides ℓ. If O and B are on the same
side, there exists a unique segment [OX] on ℓ congruent to [AB] and such that A and X are not on the
same side of ℓ relative to O (by Lemma 1.4). Thus (A, B) and (O, X) define the same direction, hence
(A, B) ∼ (O, X). If O and B are on different sides of ℓ relative to A, there exists a unique segment [OX]
on ℓ which is congruent to [AB] and such that A and X lie on the same side of ℓ relative to O (by
Lemma 1.4). As in the previous case, X is the unique point such that (A, B) ∼ (O, X).

Corollary 1.15. For any point O, the map φO : E → V defined by φO (A) = \overrightarrow{OA} is a bijection.

Proof. Fix a vector \overrightarrow{CD}. We show that there is a point A such that φO (A) = \overrightarrow{CD}. By Proposition
1.14, there is a unique point A such that (C, D) ∼ (O, A), i.e. there exists a unique point A such that
\overrightarrow{CD} = \overrightarrow{OA} = φO (A). The existence of A implies the surjectivity of φO and the uniqueness implies that
φO is injective. Thus, φO is bijective.
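In coordinates (a sketch outside the axiomatic development; `translate` and `phi` are illustrative names), the unique point X of Proposition 1.14 is simply O + (B − A), and the bijection φO of Corollary 1.15 becomes A ↦ A − O:

```python
def translate(pair, O):
    """The unique X with (A, B) ~ (O, X) in coordinates: X = O + (B - A)."""
    (A, B) = pair
    return (O[0] + B[0] - A[0], O[1] + B[1] - A[1])

def phi(O, A):
    """phi_O(A): the position vector of A relative to O, as a coordinate vector."""
    return (A[0] - O[0], A[1] - O[1])

A, B, O = (1, 2), (4, 6), (0, 0)
print(translate((A, B), O))   # (3, 4): the representative of vector AB based at O
print(phi(O, (3, 4)))         # (3, 4): phi_O is inverted by v -> O + v
```

Surjectivity and injectivity of φO correspond to the existence and uniqueness of X in Proposition 1.14.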

Remark. It is clear that if in the set of ordered pairs of points E × E we fix the first entry then we
obtain a bijection with E. What Corollary 1.15 says is that, starting from the axioms, vectors do not
carry more information than two points do. So why not simply work with pairs of points instead? In
effect we are: vectors are pairs of points to which we have formally attached the concepts of length
and direction.

1.2 Vector space structure of geometric vectors


For a fixed point O, Corollary 1.15 allows us to identify points with geometric vectors. However, the
set V of vectors has more structure than the set of points. In this section we show that V is a real
vector space.

(Figure: the sum a + b constructed in two ways: along the path O, A, X of Definition 1.16, and as the diagonal of the parallelogram OAY B.)


Definition 1.16. Consider two vectors a and b. If we fix a point O then, by Proposition 1.14, there
is a unique point A such that a = \overrightarrow{OA} and for the point A there exists a unique point X such that
b = \overrightarrow{AX}. The sum of a and b is by definition the vector \overrightarrow{OX} and we denote the sum by a + b.
Equivalently, for a fixed point O there are unique points A and B such that a = \overrightarrow{OA} and b = \overrightarrow{OB}
and for the points O, A and B there is a unique point Y such that OAY B is a parallelogram. It follows
that X = Y and therefore a + b = \overrightarrow{OY} = \overrightarrow{OX}.
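The construction in Definition 1.16 can be mimicked in coordinates (a sketch outside the axiomatic development; the names are illustrative): represent each vector by the displacement of any representative pair, append a representative of b at the tip of a representative of a, and observe, as Proposition 1.17 asserts, that the result does not depend on the chosen base point.

```python
def vec(P, Q):
    """The coordinate vector of the ordered pair (P, Q)."""
    return (Q[0] - P[0], Q[1] - P[1])

def add_at(O, a, b):
    """Build a + b as in Definition 1.16, starting from the base point O:
    A is the tip of a's representative OA, X is the tip of b's representative AX."""
    A = (O[0] + a[0], O[1] + a[1])
    X = (A[0] + b[0], A[1] + b[1])
    return vec(O, X)

a, b = (2, 1), (3, 2)
print(add_at((0, 0), a, b))   # (5, 3)
print(add_at((7, -4), a, b))  # (5, 3): the same vector from a different base point
```

The agreement of the two printed values is a numerical instance of the well-definedness proved in Proposition 1.17.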

Proposition 1.17. The addition of vectors is well defined.

Proof. Let a, b ∈ V and O, A, X be as in Definition 1.16. The sum maps (a, b) to a+b as in the definition.
It is a map □ + □ : V × V → V on equivalence classes which is defined using representatives. The
proposition claims that the definition does not depend on the choice of representatives, i.e. if we replace
O with a different point O′ , then the above construction yields the same equivalence class.
Fix a point O′ ≠ O. Let A′ be the unique point such that a = \overrightarrow{O′A′} and let X ′ be the unique point
such that b = \overrightarrow{A′X′}. We need to show that \overrightarrow{OX} = \overrightarrow{O′X′}. Since \overrightarrow{O′A′} = a = \overrightarrow{OA}, we have that OAA′ O′
is a parallelogram. Similarly AXX ′ A′ is a parallelogram and it suffices to show that OXX ′ O′ is a
parallelogram.
If O = A, then O′ = A′ and OXX ′ O′ is a parallelogram since AXX ′ A′ is a parallelogram. The
cases where A = X or X = O are similar. Thus, we may assume that the points O, A, X are pairwise
distinct. Now, if O, A, X are collinear then [OX] is congruent to [O′ X ′ ] by Axiom III.3. Moreover, by
the construction in Definition 1.16, we have |OX⟩ = |O′X′⟩ and therefore \overrightarrow{OX} = \overrightarrow{O′X′}.
Finally, assume that A ∉ OX. By the third congruence theorem for triangles [14, Theorem 18], we
have that the triangles OAX and OA′ X ′ are congruent. In particular ∡AOX ≡ ∡A′ O′ X ′ . Moreover,
since the lines OA and O′ A′ are parallel, they meet OO′ in congruent angles (by [14, Theorem 30]).
From this we deduce that OX and O′ X ′ meet OO′ in congruent angles. Thus OX and O′ X ′ are
parallel. Since OO′ ∥AA′ and AA′ ∥XX ′ we also have OO′ ∥XX ′ . Thus OXX ′ O′ is a parallelogram.

Proposition 1.18. The set of vectors V with addition is an abelian group.

Proof. Proposition 1.17 shows that addition is indeed an operation on vectors. Thus we may choose
convenient representatives if needed. Let a, b, c be three vectors. Addition is associative if (a+b)+c =
a + (b + c). Fix a point A. Then, by Proposition 1.14, there exist unique points B, C and D such
that \overrightarrow{AB} = a, \overrightarrow{BC} = b, \overrightarrow{CD} = c. But then (\overrightarrow{AB} + \overrightarrow{BC}) + \overrightarrow{CD} = \overrightarrow{AC} + \overrightarrow{CD} = \overrightarrow{AD} = \overrightarrow{AB} + \overrightarrow{BD} =
\overrightarrow{AB} + (\overrightarrow{BC} + \overrightarrow{CD}) which proves associativity. The neutral element is 0 = \overrightarrow{AA} since \overrightarrow{AA} + \overrightarrow{AB} =
\overrightarrow{AB} = \overrightarrow{AB} + \overrightarrow{BB} = \overrightarrow{AB} + \overrightarrow{AA}. The inverse element of \overrightarrow{AB} is the opposite vector −\overrightarrow{AB} = \overrightarrow{BA} since
\overrightarrow{AB} + (−\overrightarrow{AB}) = \overrightarrow{AB} + \overrightarrow{BA} = \overrightarrow{AA} = 0. Finally, to show commutativity we notice the following. Since
a + b is constructed on the diagonal of a parallelogram with a and b represented on the sides, the
construction is symmetric and does not depend on the order of a and b in the sum.

Up to this point we have used the concept of length solely as equivalence classes given by the
congruence relation on segments (Definition 1.8). While this was enough for our discussion so far, in
what follows we need more structure on the set L of lengths. It is natural to ask about all the consequences that the Axioms have on L. If we choose a segment [AB], a unit segment, one can deduce from

14
Geometry - first year [email protected]

the axioms that the line AB can be identified with the set of real numbers R and that L can be iden-
tified with the set of non-negative real numbers R≥0 such that |AB| = 1. These identifications require
care. They are available in Appendix B for the interested reader (see in particular Theorem B.9).
The identification is necessary for example in the existence and uniqueness claim of the following
definition which uses Proposition B.11.
Definition 1.19. Assume that a unit segment was chosen. Consider a non-zero vector a = OA and a scalar x ∈ R. If x > 0, there is a unique point X on the ray (OA such that |OX| = x · |OA|. The multiplication of the vector a with the scalar x, denoted x · a (or simply xa), is given by

        | OX        for a ≠ 0, x > 0 and X as above,
x · a = | −(|x|a)   for a ≠ 0, x < 0,
        | 0         for a = 0 or x = 0.

[Figure: for x > 0, the vector xa = OX points in the same direction as a; for x < 0, xa points in the direction of −a.]

Proposition 1.20. Assume that a unit segment was chosen. The multiplication of vectors with scalars
is well defined.
Proof. Let OA = O′A′. For x > 0, construct X from O and A as in Definition 1.19 and similarly, construct X′ from O′ and A′. We need to show that OX = O′X′. By construction OX and O′X′ define the same direction, namely both define the same direction as OA = O′A′. Thus, by Proposition 1.11, it suffices to show that |OX| = |O′X′|. For x > 0 this is clear, since |OX| = x · |OA| = x · |O′A′| = |O′X′|. For x < 0 the claim then follows, since x · a = −(|x|a) and |x|a is well defined by the case above. The case x = 0 is trivially true.

Proposition 1.21. Assume that a unit segment was chosen. For a, b ∈ V and x, y ∈ R we have
1. 0 · a = 0.

2. 1 · a = a.

3. (−1) · a = −a.

4. (x + y) · a = x · a + y · a.

5. x · (y · a) = (xy) · a.

6. x · (a + b) = x · a + x · b.


Proof. The first five claims follow from Proposition B.12 and Definition 1.19. We prove the last claim. If x = 0 the statement follows from the first assertion. Moreover, we may assume that x > 0. Indeed, if x < 0 we have x · (a + b) = −(|x| · (a + b)) = −(|x| · a + |x| · b) = x · a + x · b, where the middle equality follows from the case x > 0. Now, by Proposition 1.17 and Proposition 1.20 the operations do not depend on the choice of representatives. Let OAQB be a parallelogram such that a = OA and b = OB. By Proposition B.11, there is a unique point A′ ∈ (OA such that xa = OA′ and there is a unique point B′ ∈ (OB such that xb = OB′. Moreover, we have x|a| = |OA′| and x|b| = |OB′|. Let Q′ be the unique point such that OA′Q′B′ is a parallelogram. We have OQ′ = xa + xb and we need to show that x · OQ = OQ′. The points O, Q, Q′ are collinear and the angles in the triangles OAQ and OA′Q′ are pairwise congruent by [14, Theorem 30]. It then follows from the proportionality of the sides of similar triangles (see Theorem 41 in [14]) that |OQ′| : |OQ| = |OA′| : |OA| = x, which proves the claim.

Theorem 1.22. The set of vectors V with vector addition and scalar multiplication is a vector space.
Proof. For the axioms of a vector space see for example [13, Definition 1.1]. They follow directly
from Proposition 1.18 and Proposition 1.21.

Lemma 1.23. Let A, B, C be three non-collinear points and let π be the unique plane containing them.
Then, a point Q lies in π if and only if there exists a parallelogram AXQY with X ∈ AB and Y ∈ AC.
Moreover, if such a parallelogram exists, it is unique.
Theorem 1.24. Assume that a unit segment was chosen. Let S be a subset of E and let O be a point
in S.
1. The set S is a line if and only if φO (S) is a 1-dimensional vector subspace.

2. The vectors OA, OB are linearly dependent if and only if the points O, A, B are collinear.

3. The set S is a plane if and only if φO (S) is a 2-dimensional vector subspace.

4. The vectors OA, OB, OC are linearly dependent if and only if the points O, A, B, C are coplanar.

5. If S is a line or a plane then the vector subspace φO (S) is independent of the choice of O in S.
Proof. Assume that S is a line. By definition of the vector space operations (Definitions 1.16 and
1.19), φO (S) is stable under multiplication with scalars and vector addition. Moreover, by Proposi-
tion B.11, any two vectors represented on the line S are linearly dependent. Thus φO (S) is a vector
subspace of dimension 1.
For the other implication, assume that φO (S) is a 1-dimensional vector subspace and let e be a
basis vector. Let A be the unique point such that φO (A) = e. By the argument in the first paragraph,
φO (OA) is a 1-dimensional vector subspace of V. Since e is a basis vector for both φO (OA) and φO (S)
we have φO (OA) = φO (S) and since φO is bijective we have OA = S. For 2., notice that O, A, B are collinear if and only if OB belongs to the 1-dimensional vector subspace φO (OA), which in turn is equivalent to the two vectors being linearly dependent.
Assume now that S is a plane. Then there exist other two points A and B such that O, A, B are
non-collinear. By the above paragraph φO (OA) and φO (OB) are two distinct 1-dimensional vector


subspaces contained in φO (S). Now, for any point Q in the plane S there is a unique parallelogram
−−−→ −−−→ −−−→
OXQY with X ∈ OA and Y ∈ OB (by Lemma 1.23). Thus OQ = OX + OY . In other words, any vector
in φO (S) is a linear combination of vectors in φO (OA) and φO (OB), i.e. φO (S) is the vector space
generated by these two one dimensional subspaces. For the other implication assume that φO (S) is a
2-dimensional vector subspace. Let e1 , e2 be a basis of φO (S) and let A, B be the unique points such
that φO (A) = e1 and φO (B) = e2 . Then O, A, B are non-collinear. Let π be the unique plane containing
these three points. By the first part of the argument φO (π) is a 2-dimensional vector subspace. Since
it is included in φO (S) we must have φO (S) = φO (π). Since φO is bijective, we have S = π.
The proof for 4. and 5. is similar.

Definition 1.25. Because of the above theorem, there is an overlap in terminology which we accept.
We say that two vectors are collinear if they are linearly dependent. Moreover, since two vectors are
linearly dependent if and only if they have the same or opposite directions, we say that such vectors
are parallel. Similarly, we say that three vectors are coplanar if they are linearly dependent.

Proposition 1.26. The dimension of V is at least 3.

Proof. Assume for a contradiction that dim V ≤ 2. Then dim V cannot be 0 since by Axiom I.8 there are at least 3 vectors in V. If dim V = 1 then by Theorem 1.24 there are no planes, which contradicts Axiom I.3, which guarantees that there are at least 3 non-collinear points, i.e. there is at least one plane by Axiom I.4. If dim V = 2, by Theorem 1.24, all points lie in one plane, which contradicts Axiom I.8.

1.3 Affine space structure of the Euclidean space


The vector space structure deduced in Theorem 1.22 consists in particular of two maps

□+□ : V×V → V and □ · □ : R × V → V.

Moreover, with Corollary 1.15 we can define an ‘addition’ of vectors with points. For a vector a and
−−−→
a point O there is a unique point X such that a = OX , i.e. we have a so-called translation map

□ + □ : V × E → E given by a + O = X. (1.1)

We say that the vectors in V act on the set of points E by translations. This is one key observation which
allows us to use real vector spaces as an underlying model for the Euclidean space E. This is made
precise by the following definition.

Definition 1.27. A real affine space A is a triple (P, V, t) where P is a non-empty set whose elements are
called points, where V is a real vector space called direction space of A, and where t is a map V × P → P
called translation map which satisfies the following two axioms:

(AS1) For every A, B ∈ P there is a unique a ∈ V such that

B = t(a, A).


(AS2) For every A ∈ P and a, b ∈ V we have

t(a, t(b, A)) = t(a + b, A).

The dimension of the affine space A is by definition the dimension of V and is denoted by dim A. The
set of points is rarely mentioned separately. When we refer to ‘points in A’ we mean elements of P.
When dealing with an affine space, the direction space V often does not have a label of its own. In that case, we denote it by D(A).

Remark. Notice that if we fix a point O ∈ A, by Axiom (AS1), for each point P ∈ A there is a unique
vector v such that P = t(v, O). This vector is called the position vector of P relative to O and is denoted
−−→ −−→
by OP . This gives a bijection φO : A → V defined by φO (P ) = OP .

Theorem 1.28. The Euclidean space E has the structure of a real affine space of dimension dim E ≥ 3.

Proof. Considering the set P of points in E and the set of geometric vectors V, we observed in (1.1)
the existence of a translation map V × E → E using Corollary 1.15. This gives E the structure of a real
affine space by definition. The claim on the dimension follows from Proposition 1.26.

Example 1.29. The main, and in fact the only examples of finite dimensional real affine spaces are
the following. Every real vector space V is a real affine space over itself. Indeed, we may take the set
of points A to be V and the map

t : V × A → A, defined by t(v, P ) = v + P .

We know that up to isomorphism there is a unique real vector space of dimension n. We write Vn
for such a vector space and we know that Vn ≅ Rn . However, since Rn has a standard basis, we use
the notation Vn in order to ignore the standard basis. Consequently, if dim V = n, we denote the
corresponding affine space by An .
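The following is a minimal computational sketch of this example (the helper names `translate` and `vector_between` are ours, not from the text): we model the points of A3 by triples in R3 and check the axioms (AS1) and (AS2) on sample data.

```python
# Example 1.29 made concrete: V = R^3 acting on itself by translation.
# The helper names translate and vector_between are illustrative only.

def translate(v, P):
    """The translation map t(v, P) = v + P, componentwise."""
    return tuple(vi + pi for vi, pi in zip(v, P))

def vector_between(A, B):
    """The unique vector a with B = t(a, A), as required by axiom (AS1)."""
    return tuple(bi - ai for ai, bi in zip(A, B))

A = (1.0, 2.0, 3.0)
B = (4.0, 0.0, -2.0)

# (AS1): the vector a = vector_between(A, B) indeed moves A to B.
a = vector_between(A, B)
assert translate(a, A) == B

# (AS2): t(a, t(b, A)) = t(a + b, A).
b = (0.5, -1.0, 2.0)
a_plus_b = tuple(x + y for x, y in zip(a, b))
assert translate(a, translate(b, A)) == translate(a_plus_b, A)
```

Here the set of points and the vector space are literally the same set of triples; only the two roles played by a triple differ.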

Remark. The concept of a real affine space unlocks all linear algebra tools for Euclidean geometry.
In this set-up we identify the set of points with V and view the elements of V in two distinct ways:
as points and as vectors which act on points. A vector space has an origin, the zero vector, but on a
line or in a plane all points are equal. One way of phrasing this is by saying that ‘an affine space is
nothing more than a vector space whose origin we try to forget about’ [4, Chapter 2].

Remark. The mathematical library mathlib [22] written in Lean [21] also uses the concept of affine
space as the underlying model for Euclidean geometry (see [23]).

CHAPTER 2

Cartesian coordinates

Contents
2.1 Frames in dimension 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.1.1 Coordinates as projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.1.2 Changing frames (an example) . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.1.3 Orientation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2 Frames in dimension 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2.1 Coordinates as projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2.2 Changing frames (an example) . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2.3 Orientation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3 Frames in dimension n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3.1 Algorithm for changing frames . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.3.2 Orientation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

Coordinates are tuples of numbers associated to points in a given object or space. For an object
or space S, we seek a subset C ⊆ Rn and a bijective map C → S that establishes a correspondence
(x1 , . . . , xn ) ↔ P between coordinates and points. Typically, there are many choices of such maps, and
the specific choice determines the level of control you have over the object or space S. This choice
often depends on the intended purpose or application involving S. In this chapter, we focus exclu-
sively on Cartesian frames, which are named after René Descartes1 . For other types of coordinate
systems we refer to Appendix D.
Here is a passage from [15, Section 6.8]: “in the eighteenth century, historians of mathemat-
ics (French ones, in particular) considered Descartes the revolutionary who had freed them from
bondage to the tedious methods of the ancient Greeks, by reducing hard geometric problems to
1 1596–1650


simple algebraic ones. This is a view which is often now regarded with some suspicion, although
Descartes himself promoted it [. . . ] In any case, his project was [. . . ] specific: the relation of ge-
ometry and algebra. A standard modern textbook criticizes Descartes for not being more practical
[. . . ] This criticism is interesting, but, I think, misplaced. Coordinate geometry even today is not
‘intrinsically’ practical – even the statistician who studies whether points in a scatter graph lie near
a straight line y = ax + b, let alone the geometer who wishes to picture the curve y² = x³ + x² (Fig. 2)
are not thinking as surveyors or geographers. On the other hand, for some practical tasks, the new
ideas were very well adapted, as Newton and Leibniz were to understand.”
It should be clear that Descartes did not consider the concept of affine space in the form that we
introduced it in the previous chapter. However, the main idea is the same. Examples of classical
theorems which can be proved using Cartesian frames and algebraic manipulations are given in
Appendix F.

2.1 Frames in dimension 2


We denote by E2 an arbitrary plane. Fix two lines ℓ1 and ℓ2 in E2 which intersect in exactly one
point O. We can describe any point P in E2 as follows. By Theorem 1.24, φO (ℓ1 ) and φO (ℓ2 ) are
1-dimensional, linearly independent vector subspaces of the 2-dimensional vector space V2 = D(E2 ).
If we select non-zero vectors i ∈ φO (ℓ1 ) and j ∈ φO (ℓ2 ), then (i, j) is a basis of V2 . Consequently, there
exist unique scalars xP , yP ∈ R such that
φO (P ) = OP = xP i + yP j.        (2.1)

[Figure: the point P (4.2, 3.8) with position vector OP = 4.2 i + 3.8 j decomposed along the x- and y-axes.]

This establishes a bijection E2 ↔ R2 between points in the plane and ordered pairs of real num-
bers. This bijection arises as the composition of two bijections:
−−→
1. The bijection φO : E2 → V2 , which assigns to each point P its position vector OP .

2. The decomposition of vectors with respect to a basis, establishing the identification V2  R2 .

Thus, the bijection E2 ↔ R2 depends on two choices:

1. Selecting a point O for the map φO .

2. Choosing two vectors i and j that form a basis of V2 .


Definition 2.1. A frame in E2 is a pair K = (O, B), where O is a point in E2 and B = (i, j) is a basis of V2 .
A frame is also referred to as a Cartesian coordinate system, or a Cartesian frame. We use the shorter
term for convenience. Given a frame K, the unique pair (xP , yP ) in (2.1) is called the coordinates of P with
respect to the frame K. In other words, the coordinates of P are the components of the position vector
of P relative to O in the basis B. We write PK (xP , yP ) when we want to indicate the coordinates. If it is
clear from the context what K is, we omit the subscript K and simply write P (xP , yP ). By convention,
the first coordinate is typically denoted x, and the second y. The line ℓ1 is called the x-axis, denoted
Ox, while the line ℓ2 is the y-axis, denoted Oy. The point O is referred to as the origin of the frame K.
For computational purposes, points PK (xP , yP ) and vectors a = ax i+ay j are identified with column
matrices, using their coordinates and components, respectively:
" # " #
x a
[P ]K = P ∈ R2 and [a]B = x ∈ R2 .
yP ay

Here, the subscript K indicates the frame relative to which P has the indicated coordinate, while the
subscript B indicates the basis in which the components of the vector a are expressed.

2.1.1 Coordinates as projections


We explore projections extensively in Chapter 6. Here, we simply observe that the coordinates can be
interpreted as projections onto coordinate axes. Let K = (O, B) be a frame in E2 , where B = (i, j) is a
−−→
basis of V2 . By definition, a point P has coordinates (xP , yP ) if and only if OP = xP i + yP j. Since O, i, j
are fixed, this is equivalent to the existence of unique points X ∈ Ox and Y ∈ Oy such that OXP Y is
a parallelogram.

[Figure: the point P (4.2, 3.8) with projections Prx (P ) = (4.2, 0) on Ox and Pry (P ) = (0, 3.8) on Oy.]

Definition 2.2. This defines the following maps

Prx : E2 → Ox, Prx (P ) = X(xP , 0) and Pry : E2 → Oy, Pry (P ) = Y (0, yP ).

The map Prx is called the projection on Ox along Oy, and Pry is called the projection on Oy along Ox.
For vectors a = ax i + ay j, we have similar maps defined as follows

prx : V2 → R, prx (a) = ax and pry : V2 → R, pry (a) = ay .


The map prx is called the projection on the first component, and pry is called the projection on the second
component. From the definitions we immediately deduce the following identities
OPrx (P ) = prx (OP) i,    OPry (P ) = pry (OP) j,    OP = prx (OP) i + pry (OP) j    and

        | xP |   | prx (OP) |
[P ]K = | yP | = | pry (OP) | = [OP]B .
Remark. In the process of showing that E has the structure of an affine space, we made a choice: we
fixed a unit segment. If i and j are unit vectors, then for a point P as above, we have |OX| = xP , |OY | =
yP . In other words, the coordinates of P are the lengths of the sides of the parallelogram OXP Y .
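In a fixed frame, the projection maps of Definition 2.2 act on coordinates in a very simple way. A small sketch (the function names are ours):

```python
# Projections of Definition 2.2, expressed in coordinates with respect to
# a fixed frame K = (O, (i, j)). The function names are illustrative only.

def Pr_x(P):
    """Projection on Ox along Oy: P(x_P, y_P) -> X(x_P, 0)."""
    return (P[0], 0.0)

def Pr_y(P):
    """Projection on Oy along Ox: P(x_P, y_P) -> Y(0, y_P)."""
    return (0.0, P[1])

def pr_x(a):
    """Projection of a vector on its first component."""
    return a[0]

def pr_y(a):
    """Projection of a vector on its second component."""
    return a[1]

P = (4.2, 3.8)
assert Pr_x(P) == (4.2, 0.0) and Pr_y(P) == (0.0, 3.8)

# The position vector OP has the same components as the coordinates of P:
OP = (4.2, 3.8)
assert (pr_x(OP), pr_y(OP)) == P
```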

2.1.2 Changing frames (an example)


With a fixed frame K, we may identify E2 with R2 . However, a pair of numbers (xP , yP ) has no
intrinsic geometric meaning in the absence of a frame. Moreover, the coordinates of a point vary
depending on the frame used. The process of translating from one Cartesian frame to another is
made precise in Section 2.3. Here, we illustrate this process with an example. Let K = (O, B) and
K′ = (O′ , B ′ ) be two frames in E2 , with B = (i, j) and B ′ = (i′ , j′ ). Assume that O′ , i′ and j′ are known
relative to K:

[O′ ]K = |  7 | ,   [i′ ]B = | −2 | ,   [j′ ]B = | 1 | .        (2.2)
         | −1 |              |  1 |              | 2 |
How can we translate the coordinates of points from K to K′ ?

[Figure: the frames K = (O, (i, j)) and K′ = (O′ , (i′ , j′ )) drawn in the same plane.]

We can achieve this in two steps: (a) first, we change the origin, i.e., we go from (O, B) to (O′ , B), and
(b) we change the direction of the coordinate axes, i.e. we go from (O′ , B) to (O′ , B ′ ). The first step is
simply a translation, while the second corresponds to the usual base change from linear algebra.


[Figure: (a) Change the origin. (b) Change the direction of the axes.]

In the first step, when going from K = (O, B) to K̃ = (O′ , B), we are looking for the components of the
position vector of A relative to O′ with respect to B. We find these components by noticing that
O′A = OA − OO′    and therefore    [O′A]B = [OA]B − [OO′ ]B .

In the second step, when going from K̃ = (O′ , B) to K′ = (O′ , B ′ ), we are looking for the components
of the position vector of A relative to O′ with respect to B ′ . From linear algebra, we know that this
is done with the base change matrix (see Appendix C). Let MB ′ ,B be the base change matrix from the
basis B to the basis B ′ . Then,
[OA]B′ = MB′,B · [OA]B .
Thus, composing the two operations, (a) and (b), we obtain
[A]K′ = [O′A]B′ = MB′,B · [O′A]B = MB′,B · ([OA]B − [OO′ ]B)        (2.3)

Now, suppose that A has coordinates (1, 2) with respect to the frame K. Since [OO′ ]B = [O′ ]K , by (2.2), we already know the terms in the parentheses on the right-hand side of (2.3). Furthermore, from our assumptions (2.2), it is easy to write down the matrix MB,B′ and then MB′,B = M−1B,B′ . Thus

[A]K′ = | −2 1 |−1 ( | 1 | − |  7 | ) = 1/(−5) · |  2 −1 | ( | 1 | − |  7 | ) = | 3 | .
        |  1 2 |    ( | 2 |   | −1 | )           | −1 −2 | ( | 2 |   | −1 | )   | 0 |
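The computation above can be reproduced numerically; the helpers `inv2` and `mat_vec` are ours, a minimal stand-in for a linear algebra library.

```python
# The frame change of Section 2.1.2, checked numerically.

def inv2(M):
    """Inverse of a 2x2 matrix given as a pair of rows ((a, b), (c, d))."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def mat_vec(M, v):
    """Matrix-vector product for matrices given as tuples of rows."""
    return tuple(sum(M[r][k] * v[k] for k in range(len(v))) for r in range(len(M)))

# M_{B,B'} has the components of i' and j' (relative to B) as columns, see (2.2).
M_B_Bp = ((-2.0, 1.0),
          (1.0, 2.0))
O_prime_K = (7.0, -1.0)   # [O']_K
A_K = (1.0, 2.0)          # [A]_K

# Formula (2.3): [A]_{K'} = M_{B,B'}^{-1} ([A]_K - [O']_K).
diff = tuple(p - o for p, o in zip(A_K, O_prime_K))
A_Kp = mat_vec(inv2(M_B_Bp), diff)
assert all(abs(x - y) < 1e-9 for x, y in zip(A_Kp, (3.0, 0.0)))
```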

2.1.3 Orientation
Consider the bases E = (e1 , e2 ) and F = (f1 , f2 ) of V2 . Represent all four vectors in a common point O
and rotate the basis F such that the vector f1 points in the same direction as the vector e1 , i.e. such
that f1 = λe1 for some positive scalar λ.


[Figure: the bases E = (e1 , e2 ) and F = (f1 , f2 ) before and after rotating F so that f1 points in the direction of e1 .]

The line passing through O with direction vector e1 (and also f1 ) separates the plane E2 in two
half-planes. Considering the positioning of the second vectors, two things can happen: e2 and f2
point towards the same half-plane or they point towards different half-planes.

Proposition 2.3. Let E and F be as above.

1. If det(ME,F ) > 0 then e2 and f2 point to the same half-plane.

2. If det(ME,F ) < 0 then e2 and f2 point to different half-planes.

Proof. In the above process, we changed the basis F = (f1 , f2 ) to the basis F ′ = (f′1 , f′2 ) with a rotation. So, the base change matrix MF′,F is a 2 × 2 rotation matrix, which has determinant equal to 1 (for more on rotations see Chapter 7). Now, considering the coordinates of the vectors in F ′ with respect to E, we have f′1 = (λ, 0) and f′2 = (a, b). Thus, the vectors e2 and f′2 point to the same half-plane if and only if b > 0. If we now calculate the above determinant, we obtain

det(ME,F ) = det(ME,F′ · MF′,F ) = det(ME,F′ ) · det(MF′,F ) = | λ a | · 1 = λ·b
                                                              | 0 b |

and, since λ is positive, the proof is finished.

Definition 2.4. In the first case of Proposition 2.3 we say that E and F have the same orientation and in the second case we say that E and F have opposite orientation.
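The orientation test of Proposition 2.3 is easy to implement: with components of all four vectors taken in any common auxiliary basis, the sign of det(ME,F) is the sign of the product of the two column determinants. A sketch (function names are ours):

```python
# Orientation test from Proposition 2.3. Components of all vectors are
# taken with respect to one fixed auxiliary basis.

def det2(u, v):
    """Determinant of the 2x2 matrix with columns u and v."""
    return u[0] * v[1] - u[1] * v[0]

def same_orientation(E, F):
    """True if the bases E = (e1, e2) and F = (f1, f2) of V2 have the same
    orientation, i.e. det(M_{E,F}) > 0: that determinant has the same sign
    as det2(e1, e2) * det2(f1, f2)."""
    return det2(*E) * det2(*F) > 0

E = ((1.0, 0.0), (0.0, 1.0))   # reference basis
F = ((0.0, 1.0), (1.0, 0.0))   # the two vectors swapped
G = ((2.0, 1.0), (-1.0, 1.0))  # det2 = 3 > 0

assert not same_orientation(E, F)   # swapping the vectors reverses orientation
assert same_orientation(E, G)
```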

Why is this relevant? Besides giving a geometric interpretation of the sign of the determinant det(ME,F ), it also allows us to understand some signs which appear in calculations of areas (see Chapter 5). Moreover, the trigonometry that we know to hold true in E2 implicitly builds on the notion of an oriented Euclidean plane E2. Mathematically, the distinction in orientation is only a matter of keeping track of the signs of some determinants. However, in relation to the physical world this distinction is more concrete.


Definition 2.5. Let (i, j) be a basis of V2 represented in a common point O ∈ E2 such that i = OX and j = OY . Rotate the plane such that i points downwards. If Y is in the right half-plane determined
by the line OX, then we say that the basis (i, j) is right oriented. If Y lies in the left half-plane, we say
that the basis (i, j) is left oriented. A coordinate system (O, i, j) is left or right oriented if the basis (i, j) is
left respectively right oriented.
Fixing an orientation in the Euclidean plane E2 is equivalent to choosing a coordinate system
K = (O, B) and calling it right oriented. Then, all other bases of V2 either have the same orientation
as B, in which case they are also called right oriented, or they have opposite orientation, in which
case they are called left oriented. When it comes to a concrete configuration of points, on a sheet of
paper for instance, such a choice can be made with the right-hand rule. Once we have such a choice,
E2 is called oriented. In other words, the oriented plane E2 is the usual Euclidean plane together with
a choice of which of the two opposite classes of bases contains the ‘preferred’ bases.

2.2 Frames in dimension 3


The situation in dimension 3 is similar to the one discussed for dimension 2 (Section 2.1) and is
a particular case of the n-dimensional case presented in Section 2.3. Fix three non-coplanar lines
ℓ1 , ℓ2 and ℓ3 in E3 which intersect in exactly one point O. We can describe any point P ∈ E3 as
follows: by Theorem 1.24, the sets φO (ℓ1 ), φO (ℓ2 ) and φO (ℓ3 ) are 1-dimensional, linearly independent
vector subspaces of the 3-dimensional vector space V3 = D(E3 ). Thus, if we choose non-zero vectors
i ∈ φO (ℓ1 ), j ∈ φO (ℓ2 ) and k ∈ φO (ℓ3 ), we obtain a basis (i, j, k) of V3 and there exist unique scalars
xP , yP , zP ∈ R such that
φO (P ) = OP = xP i + yP j + zP k.        (2.4)
This defines a bijection between points P ∈ E3 and triples of real numbers (xP , yP , zP ) in R3 . As in
the case of E2 , this bijection is determined by the choice of O, i, j, k and it is the composition of two
bijections.
Definition 2.6. A frame in E3 is a pair K = (O, B) where O is a point in E3 and B = (i, j, k) is a basis
of V3 . Given a frame K, the unique triple (xP , yP , zP ) in (2.4) is called the coordinates of P with respect to the
frame K. Again, the coordinates of P are just the components of the position vector of P relative to O.
We write PK (xP , yP , zP ) when we want to indicate the coordinates, or simply P (xP , yP , zP ) if it is clear
from the context what K is. By convention, we use x for the first coordinate, y for the second, and z
for the third. The line ℓ1 is the x-axis, denoted Ox, the line ℓ2 is the y-axis, denoted Oy, and ℓ3 is the
z-axis, denoted Oz. The point O is the origin of the frame K. The planes containing two coordinate
axes are the coordinate planes. We let Oxy denote the plane containing Ox and Oy, and similarly for
the other two.
Here again, for computational purposes, points P (xP , yP , zP ) and vectors a = ax i + ay j + az k are represented as column matrices, using their coordinates and components, respectively:

        | xP |                    | ax |
[P ]K = | yP | ∈ R3  and  [a]B = | ay | ∈ R3 .
        | zP |                    | az |


2.2.1 Coordinates as projections


Projections are explored in detail in Chapter 6. Here we invoke the 3-dimensional image for viewing
coordinates as projections on coordinate axes. Let K = (O, B) be frame in E3 , where B = (i, j, k). By
−−→
definition, a point P has coordinates (xP , yP , zP ) if and only if OP = xP i + yP j + zP k. Since O, i, j, k are
fixed, this is equivalent to the existence of unique points X ∈ Ox, Y ∈ Oy, Z ∈ Oz such that O, X, Y , Z, P
are vertices of a parallelepiped.
In dimension 3, we have projection maps defined by Prx (P ) = (xP , 0, 0), Pry (P ) = (0, yP , 0) and
Prz (P ) = (0, 0, zP ). The map Prx is the projection on Ox along the plane Oyz, and similarly for the other
two. These are projections on coordinate axes along coordinate planes. For vectors a = ax i + ay j + az k
in V3 , we have maps prx , pry , prz : V3 → R defined by prx (a) = ax , pry (a) = ay , prz (a) = az . From the
definitions we immediately deduce the following identities

OPrx (P ) = prx (OP) i,    OPry (P ) = pry (OP) j,    OPrz (P ) = prz (OP) k,

OP = prx (OP) i + pry (OP) j + prz (OP) k    and

        | xP |   | prx (OP) |
[P ]K = | yP | = | pry (OP) | = [OP]B .
        | zP |   | prz (OP) |


2.2.2 Changing frames (an example)

The process of translating from one Cartesian frame to another is made precise in Section 2.3. Here,
we illustrate this process with an example. Let K = (O, B) and K′ = (O′ , B ′ ) be two frames in E3 with
B = (i, j, k) and B ′ = (i′ , j′ , k′ ). Assume that O′ , i′ , j′ and k′ are known relative to K:
       
 4  −1 −2 0
[O′ ]K =  5  , i′ = −i − 2j = −2 , j′ = −2i + j =  1  , k′ = j + 2k = 1 .
       
−1 0 0 2
       

In any dimension, the coordinates with respect to one frame can be obtained from the coordinates
with respect to another frame in two steps.

(a) Change the origin. (b) Change the direction of the axes.

Let B be the point with coordinates (1, 5, 1) relative to K. The argument used for dimension 2 (Section 2.1.2) literally translates to our 3-dimensional setting and we have

 −1            
−1 −2 0 1  4  −2 −4 2  1  4  1
1
[B]K′ = M−1 ′
K,K′ ·([B]K − [O ]K ) = 
−2 1 1 5 −  5  = −4 2 −1 5 −  5  = 1 .
        
     10 
             
0 0 2 1 −1 0 0 5 1 −1 1
      


2.2.3 Orientation
Consider the bases E = (e1 , e2 , e3 ) and F = (f1 , f2 , f3 ) of V3 . Represent all six vectors in a common
point O and rotate the basis F such that the plane passing through O in the direction of f1 , f2 coincides with the plane π passing through O in the direction of e1 , e2 . If in the plane π the bases (f1 , f2 )
and (e1 , e2 ) have opposite orientation, flip the vectors (f1 , f2 ) with a rotation such that they end up
having the same orientation with (e1 , e2 ). Any rotation with 180◦ around a line in the plane π which
passes through the origin will work and such a rotation has determinant equal to 1 (see Chapter 7).
Then, the plane π separates the space E3 in two half-spaces. Considering the positioning of the
third vectors, two things can happen: e3 and f3 point towards the same half-space or they point to-
wards different half-spaces. How can we tell the two cases apart? A similar argument as in dimension
2 shows that the sign of det(ME,F ) gives the answer. This fact is true in any dimension (see Section
2.3.2). In dimension 3, the orientation of a basis explains some signs which appear in calculations of
volumes (see Chapter 5). In relation to the physical world this distinction is more concrete.
−−−→
Definition 2.7. Let (i, j, k) be a basis of V3 represented in a common point O ∈ E3 such that k = OZ .
We say that the basis is right oriented if (i, j) is a right oriented basis of the plane Oxy when observed
from the point Z. We say that the basis is left oriented if (i, j) is a left oriented basis when observed
from the point Z. A coordinate system (O, i, j, k) is left or right oriented if the basis (i, j, k) is left
respectively right oriented.
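A sketch of the determinant test in dimension 3 (function names are ours): a basis, given by its components with respect to a fixed right-oriented basis, is right oriented exactly when the determinant of the matrix with those components as columns is positive.

```python
# Orientation of a basis of V3 via the sign of a 3x3 determinant.
# Components are taken with respect to a fixed right-oriented basis.

def det3(M):
    """Determinant of a 3x3 matrix given as a tuple of rows."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_right_oriented(i, j, k):
    """True if (i, j, k) has the same orientation as the reference basis,
    i.e. the determinant with [i], [j], [k] as columns is positive."""
    M = tuple(zip(i, j, k))  # rows of the matrix whose columns are i, j, k
    return det3(M) > 0

assert is_right_oriented((1, 0, 0), (0, 1, 0), (0, 0, 1))
# Swapping two vectors of a basis reverses its orientation.
assert not is_right_oriented((0, 1, 0), (1, 0, 0), (0, 0, 1))
```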
There are many equivalent ways of deciding if a basis of V3 is left or right oriented. The Swiss liked the three-finger rule so much that they put it on their 200-franc banknotes.

2.3 Frames in dimension n


Definition 2.8. A frame in An is a pair K = (O, B), where O is a point in the affine space An and
B = (i1 , . . . , in ) is a basis of the direction space D(An ). A frame is also called Cartesian coordinate
system, or Cartesian frame. We use the shorter term for convenience. Given a frame K, any point
P ∈ An has a unique corresponding position vector OP (see Definition 1.27) which relative to B has
(unique) components (x1 , . . . , xn ), i.e.
−−→
OP = x1 i1 + · · · + xn in .
The n-tuple (x1 , . . . , xn ) is called the coordinates of P with respect to the frame K and we write PK (x1 , . . . , xn )
when we want to indicate the coordinates, or simply P (x1 , . . . , xn ) if it is clear from the context what K


is. The point O is the origin of the frame K. For computational purposes, points P (x1 , . . . , xn ) and vec-
tors a = a1 i1 + · · · + an in are identified with column matrices, using their coordinates and components,
respectively:
$$[P]_K = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \in \mathbb{R}^n \quad\text{and}\quad [a]_B = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} \in \mathbb{R}^n.$$

Definition 2.9. For vectors a = a1 i1 +· · ·+an in in D(An ) we have projection maps pr1 , . . . , prn : D(An ) → R
defined by pr1 (a) = a1 , . . . , prn (a) = an . From the definition we see that for a point P (x1 , . . . , xn ) we have
$$[P]_K = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} \mathrm{pr}_1(\overrightarrow{OP}) \\ \vdots \\ \mathrm{pr}_n(\overrightarrow{OP}) \end{pmatrix} = [\overrightarrow{OP}]_B.$$

A frame K allows us to identify the set of points in An with Rn . However, an n-tuple of numbers
has no geometric meaning in the absence of a frame. Moreover, with respect to different frames,
points have different coordinates. The process of translating from one Cartesian frame to another is
made precise with the following theorem.

Theorem 2.10. Let K = (O, B) and K′ = (O′ , B ′ ) be two frames in An . For any point P ∈ An we have

$$[P]_{K'} = M_{B',B}\cdot([P]_K - [O']_K) = M_{B,B'}^{-1}\cdot([P]_K - [O']_K) = M_{B',B}\cdot[P]_K + [O]_{K'}. \tag{2.5}$$

Proof. Since the coordinates of a point are the components of its position vector, Equation (2.5) is
equivalent to:
$$[\overrightarrow{O'P}]_{B'} = M_{B',B}\cdot([\overrightarrow{OP}]_B - [\overrightarrow{OO'}]_B) = M_{B,B'}^{-1}\cdot([\overrightarrow{OP}]_B - [\overrightarrow{OO'}]_B) = M_{B',B}\cdot[\overrightarrow{OP}]_B + [\overrightarrow{O'O}]_{B'}. \tag{2.6}$$
Since $[\overrightarrow{O'P}]_B = [\overrightarrow{OP}]_B - [\overrightarrow{OO'}]_B$ and since $M_{B',B}$ is the base change matrix from B to B', the first
equality follows. The second equality follows from the property that $M_{B',B} = M_{B,B'}^{-1}$ (see Appendix
C). The last equality is obtained by opening the parentheses and noticing that
$$M_{B,B'}^{-1}[\overrightarrow{OO'}]_B = [\overrightarrow{OO'}]_{B'} = -[\overrightarrow{O'O}]_{B'}.$$

2.3.1 Algorithm for changing frames


Let us flesh out the steps needed to translate coordinates from one Cartesian frame to another, as
exemplified in Sections 2.1.2 and 2.2.2. We expand Theorem 2.5 using Appendix C.
Let K = (O, B) and K′ = (O′ , B ′ ) be two frames in An with B = (i1 , . . . , in ) and B ′ = (i′1 , . . . , i′n ).
Suppose we know K′ in terms of K, i.e., the coordinates of O′ relative to K and the components of
i′1 , . . . , i′n with respect to B are given. If the coordinates of a point P are given relative to K, we can
find the coordinates of P relative to K′ with the following steps:


1. Construct the base change matrix MB,B ′ by placing [i′1 ]B , . . . , [i′n ]B in the columns of the matrix.

2. Calculate MB ′ ,B by inverting MB,B ′ .

3. Calculate [P ]K′ = MB ′ ,B ·([P ]K − [O′ ]K ) since [P ]K and [O′ ]K are given.
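The three steps above can be sketched computationally. The following is an illustrative sketch only, not part of the course text: all function names are ours, the frames are given in dimension 2 for brevity, and the 2×2 inverse is computed directly from the adjugate formula.

```python
def inv2(m):
    """Invert a 2x2 matrix given as [[a, b], [c, d]] via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    assert det != 0, "a base change matrix is always invertible"
    return [[d / det, -b / det], [-c / det, a / det]]

def change_frame(i1p, i2p, o_prime, p):
    """Coordinates of P in K' = (O', (i1', i2')); all inputs are given in K."""
    # Step 1: M_{B,B'} has [i1']_B and [i2']_B as its columns.
    m_b_bp = [[i1p[0], i2p[0]], [i1p[1], i2p[1]]]
    # Step 2: M_{B',B} is the inverse of M_{B,B'}.
    m_bp_b = inv2(m_b_bp)
    # Step 3: [P]_{K'} = M_{B',B} ([P]_K - [O']_K).
    d = [p[0] - o_prime[0], p[1] - o_prime[1]]
    return [m_bp_b[0][0] * d[0] + m_bp_b[0][1] * d[1],
            m_bp_b[1][0] * d[0] + m_bp_b[1][1] * d[1]]

# Example: K' swaps the two basis vectors and moves the origin to O'(1, 1).
print(change_frame([0, 1], [1, 0], [1, 1], [3, 2]))  # -> [1.0, 2.0]
```

The same three steps work verbatim in any dimension once a general matrix inverse is available.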

2.3.2 Orientation
Definition 2.11. Two bases E and F of the space Vn of geometric vectors are said to have the same
orientation if det(ME,F ) > 0. They have opposite orientation if det(ME,F ) < 0. Two frames K and K′
have the same orientation if their bases have the same orientation and they have opposite orientation
otherwise.
We say that the Euclidean space En is oriented if there is a choice of a frame K = (O, B) which is
called right oriented. Then, all other frames of En with the same orientation as K are also right oriented
and all frames with the opposite orientation are called left oriented.
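The sign criterion of Definition 2.11 is easy to compute. The sketch below is illustrative (names are ours); it assumes E is the standard basis of V3, so that the columns of $M_{E,F}$ are simply the components of the vectors of F.

```python
def det3(m):
    """3x3 determinant by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def same_orientation(f1, f2, f3):
    """True if (f1, f2, f3) has the same orientation as the standard basis,
    i.e. det(M_{E,F}) > 0 where the columns of M_{E,F} are f1, f2, f3."""
    m = [[f1[0], f2[0], f3[0]],
         [f1[1], f2[1], f3[1]],
         [f1[2], f2[2], f3[2]]]
    d = det3(m)
    assert d != 0, "the three vectors do not form a basis"
    return d > 0

print(same_orientation([1, 0, 0], [0, 1, 0], [0, 0, 1]))  # -> True
print(same_orientation([0, 1, 0], [1, 0, 0], [0, 0, 1]))  # -> False
```

Swapping two basis vectors changes the sign of the determinant, hence the orientation, which matches the geometric picture.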

Remark. Unless otherwise stated, whenever we consider a frame K = (O, B) of En , we will assume
that it is right oriented and that En is therefore an oriented Euclidean space.

CHAPTER 3

Affine subspaces

Contents
3.1 Lines in A2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.1.1 Parametric equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.1.2 Cartesian equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.1.3 Relative positions of two lines in A2 . . . . . . . . . . . . . . . . . . . . . . . . 34
3.2 Planes in A3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2.1 Parametric equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2.2 Cartesian equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2.3 Relative positions of two planes in A3 . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 Lines in A3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3.1 Parametric equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3.2 Cartesian equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3.3 Relative positions of two lines in A3 . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.4 Relative positions of a line and a plane in A3 . . . . . . . . . . . . . . . . . . . 41
3.4 Affine subspaces of An . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.4.1 Hyperplanes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.4.2 Lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.4.3 Relative positions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.4.4 Changing the reference frame . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48


3.1 Lines in A2
Let E2 denote an arbitrary plane. It is a 2-dimensional real affine space. In order to emphasize the
fact that we treat E2 as an affine space only we denote it by A2 . In this setting, a line in A2 is a set of
points S such that the set of vectors which can be represented by points in S form a 1-dimensional
vector subspace of V2 (see Theorem 1.24). In terms of the map $\varphi_Q^2 : \mathbb{A}^2 \to V_2$ which identifies points
with vectors when a point Q ∈ A2 is fixed, the subset S ⊆ A2 is a line if and only if for a point Q ∈ S
$$\varphi_Q^2(S) = \{\,\overrightarrow{QP} : P \in S\,\} \text{ is a 1-dimensional vector subspace of } V_2.$$
It is not difficult to see that if the above description holds for one point Q ∈ A2, it holds for any point
Q ∈ A2. If S is a line, we call the vector subspace $\varphi_Q^2(S)$ of V2 the direction space of the line S and
denote it D(S).

3.1.1 Parametric equations


For a line S and any two (distinct) points P, Q in S the vector $\overrightarrow{QP}$ is called a direction vector of S.
Since D(S) is 1-dimensional, all direction vectors are linearly dependent, i.e. v is a direction vector
for S if and only if it is linearly dependent on $\overrightarrow{QP}$. So, for any direction vector v of S there is a unique
scalar t ∈ R such that
$$\overrightarrow{QP} = tv.$$
Now, if you fix Q and let P vary on the line, then t varies in R. Since $\varphi_Q^2 : \mathbb{A}^2 \to V_2$ is a bijection, the
line S can be described as
$$S = \{\,P \in \mathbb{A}^2 : \overrightarrow{QP} = tv \text{ for some } t \in \mathbb{R}\,\}.$$

In this description, the point Q ∈ S is arbitrary but fixed. If we want to emphasize that the description
depends on fixing Q, we refer to this point as the base point of the line. Moreover, for any point O ∈ A2
we may split the vector $\overrightarrow{QP}$ in the equation $\overrightarrow{QP} = tv$ to obtain
$$\overrightarrow{OP} = \overrightarrow{OQ} + tv. \tag{3.1}$$

[Figure: the point P on the line through Q with direction vector v, with position vectors $\overrightarrow{OQ}$ and $\overrightarrow{OP}$ and the translation tv]

So, we can describe the line S as the set of points P in A2 which satisfy Equation (3.1) for some
t ∈ R. This equation is called the vector equation of the line S relative to O, having base point Q and
direction vector v, or simply a vector equation of the line S. If $v = \overrightarrow{QA}$, this equation also describes the


segment [QA] if we restrict the parameter to t ∈ [0, 1]. Moreover, for t > 0 and t < 0 we obtain two
rays emanating from Q.
Notice that a vector equation depends on the choice of the base point Q and on the choice of the
direction vector v. In particular, a line does not have a unique vector equation. Notice also that the
vector equation does not depend on the coordinate system. In the above description O can be any
point in A2 .
Now fix a frame K = (O, B). If we write Equation (3.1) in coordinates relative to K, we obtain
$$S : \begin{cases} x = x_Q + t v_x \\ y = y_Q + t v_y \end{cases} \quad\text{or, in matrix form}\quad S : \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x_Q \\ y_Q \end{pmatrix} + t \begin{pmatrix} v_x \\ v_y \end{pmatrix} \tag{3.2}$$
where Q = Q(xQ , yQ ), v = v(vx , vy ) and where t is the parameter: for different values of t we obtain
different points (x, y) on the line. The two equations in the System (3.2) are called parametric equations
of the line S. Traditionally, they are written in the form of a system of equations as indicated on
the left. Writing them as one equation, as indicated on the right, is closer to the computational
perspective where we identify points with column matrices. Clearly, the two ways of writing such
parametric equations are equivalent.
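The computational perspective can be made concrete with a tiny sketch (illustrative only; the names `q`, `v` and `line_point` are ours):

```python
def line_point(q, v, t):
    """The point (x, y) = Q + t*v on the line with base point Q and direction vector v,
    exactly as in the parametric equations (3.2)."""
    return (q[0] + t * v[0], q[1] + t * v[1])

# Base point Q(3, 5) and direction vector v(2, 0).
print(line_point((3, 5), (2, 0), 0))  # -> (3, 5), the base point itself
print(line_point((3, 5), (2, 0), 2))  # -> (7, 5)
```

Restricting t to [0, 1] traces the segment from Q to Q + v, matching the earlier remark about the segment [QA].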

3.1.2 Cartesian equations


It is possible to eliminate the parameter t in (3.2). By expressing t in both equations and setting the
two expressions equal, we obtain
x − xQ y − yQ
= . (3.3)
vx vy
We refer to Equation (3.3) as symmetric equation of the line S. It could happen that vx or vy are zero.
In that case, translate back to the parametric equations to understand what happens.
Example 3.1. The line with symmetric equation
$$\frac{x - 3}{2} = \frac{y - 5}{0} \quad\text{has parametric equations}\quad \begin{cases} x = 3 + 2t \\ y = 5 + 0\cdot t \end{cases}.$$
Thus, it is the line parallel to Ox described by the equation y = 5.


We have just described a line with a linear equation relative to the frame K. The converse is also
true.
Proposition 3.2. Every line in A2 can be described with a linear equation in two variables

ax + by + c = 0 (3.4)

relative to a fixed coordinate system and any linear equation in two variables relative to a fixed
coordinate system describes a line if the constants a, b are not both zero. Moreover, if Equation (3.4)
describes the line ℓ relative to a coordinate system K = (O, B), then the direction space D(ℓ) of the
line is the 1-dimensional subspace of V2 which, relative to the basis B, satisfies the equation

D(ℓ) : ax + by = 0.


Proof. This is a particular case of Theorem 3.7.

Equation (3.4) is called a Cartesian equation of the line which it describes. Notice that there are
infinitely many Cartesian equations describing the same line, since you can multiply one equation
by a non-zero constant. It is sometimes useful to rearrange the linear equation (3.4) in order to
emphasize some geometric properties. For example, you can rearrange it in the form
$$\frac{x}{\alpha} + \frac{y}{\beta} = 1 \quad\text{where}\quad \alpha = -\frac{c}{a} \quad\text{and}\quad \beta = -\frac{c}{b}.$$
In this form we have the equation of the line where we can read off the intersection points with the
coordinate axes, since this line intersects Ox in (α, 0) and it intersects Oy in (0, β).

[Figure: the line with intercepts (α, 0) on Ox and (0, β) on Oy]

Or, we may express the linear equation (3.4) with a determinant
$$\begin{vmatrix} x - x_Q & y - y_Q \\ v_x & v_y \end{vmatrix} = 0$$
which says that a point P belongs to the line if $\overrightarrow{QP}$ is linearly dependent on v. In this form we may
describe the two half-planes which are separated by the line with the inequalities
$$\begin{vmatrix} x - x_Q & y - y_Q \\ v_x & v_y \end{vmatrix} < 0 \quad\text{and}\quad \begin{vmatrix} x - x_Q & y - y_Q \\ v_x & v_y \end{vmatrix} > 0.$$
Indeed, any point P in the plane either lies on the line, or $(\overrightarrow{QP}, v)$ is a left oriented basis, or $(\overrightarrow{QP}, v)$ is
a right oriented basis.
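The half-plane test amounts to evaluating the sign of that 2×2 determinant. A hedged sketch (names are ours, not from the text):

```python
def side_of_line(q, v, p):
    """Sign of the determinant with rows (x - xQ, y - yQ) and (vx, vy):
    0 if P lies on the line through Q with direction v,
    +1 and -1 for the two half-planes on either side."""
    d = (p[0] - q[0]) * v[1] - (p[1] - q[1]) * v[0]
    return (d > 0) - (d < 0)

# The line Ox: base point (0, 0), direction (1, 0).
print(side_of_line((0, 0), (1, 0), (4, 0)))   # -> 0, on the line
print(side_of_line((0, 0), (1, 0), (0, 1)))   # -> -1, one half-plane
print(side_of_line((0, 0), (1, 0), (0, -1)))  # -> 1, the other half-plane
```

Which sign corresponds to which half-plane depends on the orientation of $(\overrightarrow{QP}, v)$, as the text explains.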

3.1.3 Relative positions of two lines in A2


The tools of linear algebra readily apply to describe intersections of lines in A2 . Assume that we have
two lines
ℓ1 : a1 x + b1 y + c1 = 0 and ℓ2 : a2 x + b2 y + c2 = 0.
In order to determine if they intersect, one has to discuss the system:
$$\begin{cases} \ell_1 : a_1 x + b_1 y + c_1 = 0 \\ \ell_2 : a_2 x + b_2 y + c_2 = 0 \end{cases} \tag{3.5}$$

Discussing this system is basic linear algebra (see for example [9, Section 3.6]). In the plane the
situation is very simple:


• two lines intersect in a unique point, the coordinates of which are the solution to (3.5); or

• they don’t intersect and (3.5) doesn’t have solutions, in which case the lines are parallel; or

• System (3.5) has infinitely many solutions, in which case ℓ1 = ℓ2 .
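The three cases can be decided by the determinant of the coefficient matrix of (3.5). The sketch below is illustrative only (names are ours; it assumes each equation genuinely describes a line, i.e. (a, b) ≠ (0, 0)):

```python
def classify_lines(a1, b1, c1, a2, b2, c2):
    """Relative position of a1*x + b1*y + c1 = 0 and a2*x + b2*y + c2 = 0.
    Assumes (a1, b1) != (0, 0) and (a2, b2) != (0, 0)."""
    det = a1 * b2 - a2 * b1
    if det != 0:
        # System (3.5) has a unique solution, found here by Cramer's rule.
        x = (b1 * c2 - b2 * c1) / det
        y = (a2 * c1 - a1 * c2) / det
        return ("intersecting", (x, y))
    # Directions are proportional; the lines are equal iff the equations are proportional.
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        return ("equal", None)
    return ("parallel", None)

print(classify_lines(1, -1, 0, 1, 1, -2))  # -> ('intersecting', (1.0, 1.0))
print(classify_lines(1, 1, 0, 1, 1, 1))    # -> ('parallel', None)
print(classify_lines(1, 1, 1, 2, 2, 2))    # -> ('equal', None)
```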

[Figure: incident lines and parallel lines]

3.2 Planes in A3
The usual Euclidean space E3 is a 3-dimensional real affine space. In order to emphasize the fact that
we treat E3 as an affine space only we denote it by A3 . A plane in A3 is a set of points S such that
the set of vectors which can be represented by points in S form a 2-dimensional vector subspace of
V3 (see Theorem 1.24). Considering the bijection $\varphi_Q^3 : \mathbb{A}^3 \to V_3$ for a point Q ∈ A3 , the subset S is a
plane if and only if for any Q ∈ S
$$\varphi_Q^3(S) = \{\,\overrightarrow{QP} : P \in S\,\} \text{ is a 2-dimensional vector subspace of } V_3.$$
It is not difficult to see that the above description does not depend on the point Q ∈ S. If S is a plane,
we call the vector subspace $\varphi_Q^3(S)$ of V3 the direction space of the plane S and denote it D(S).

3.2.1 Parametric equations


Since D(S) is a 2-dimensional vector space, any basis will contain two vectors. Let (v, w) be a basis of
D(S). Then, for any two points P, Q ∈ S, the vector $\overrightarrow{QP}$ is a linear combination of the basis vectors,
i.e. there exist unique scalars s, t ∈ R such that
$$\overrightarrow{QP} = sv + tw.$$

Now, if we fix Q and let P vary in the plane S then s and t vary in R. Since $\varphi_Q^3 : \mathbb{A}^3 \to V_3$ is a bijection,
the plane S can be described as
$$S = \{\,P \in \mathbb{A}^3 : \overrightarrow{QP} = sv + tw \text{ for some } s, t \in \mathbb{R}\,\}.$$


In this description, the point Q ∈ S is arbitrary but fixed. If we want to emphasize that the description
depends on fixing Q, we refer to this point as the base point. Moreover, for any point O ∈ A3 we may
split the vector $\overrightarrow{QP}$ in the equation $\overrightarrow{QP} = sv + tw$ to obtain
$$\overrightarrow{OP} = \overrightarrow{OQ} + sv + tw. \tag{3.6}$$

So, we can describe the plane S as the set of points P in A3 which satisfy Equation (3.6) for
some s, t ∈ R. This equation is called the vector equation of the plane S relative to O, having base point
Q and direction vectors v and w, or simply a vector equation of the plane S. If $v = \overrightarrow{QA}$, $w = \overrightarrow{QC}$
and $v + w = \overrightarrow{QB}$, this equation also describes the interior of the parallelogram QABC if we restrict
the parameters s, t to (0, 1). The interior of the triangle QAC is obtained if we further impose the
condition that s + t < 1. Moreover, for t > 0 and t < 0 we obtain two half-planes separated by the line
QA.
Notice that a vector equation depends on the choice of the base point Q and on the choice of the
vectors v and w. In analogy with the case of the line in A2 we may call such vectors direction vectors
for the plane S. In particular, a plane does not have a unique vector equation. Notice also that the
vector equation does not depend on the coordinate system. In the above description O can be any
point in A3 .
Now fix a frame K = (O, B). If we write Equation (3.6) in coordinates relative to K then we obtain
$$S : \begin{cases} x = x_Q + s v_x + t w_x \\ y = y_Q + s v_y + t w_y \\ z = z_Q + s v_z + t w_z \end{cases} \quad\text{or, in matrix form}\quad S : \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x_Q \\ y_Q \\ z_Q \end{pmatrix} + s \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix} + t \begin{pmatrix} w_x \\ w_y \\ w_z \end{pmatrix} \tag{3.7}$$
where Q = QK (xQ , yQ , zQ ), v = vK (vx , vy , vz ), w = wK (wx , wy , wz ). The values s and t are called parameters
and for different parameters we obtain different points (x, y, z) in the plane S. The three equations
in the System (3.7) are called parametric equations for the plane S.

3.2.2 Cartesian equations


As in the case of the line in A2 , it is possible to eliminate the parameters s, t in (3.7) to obtain
$$\left(\frac{v_x}{w_x} - \frac{v_z}{w_z}\right)\left(\frac{x - x_Q}{w_x} - \frac{y - y_Q}{w_y}\right) = \left(\frac{v_x}{w_x} - \frac{v_y}{w_y}\right)\left(\frac{x - x_Q}{w_x} - \frac{z - z_Q}{w_z}\right). \tag{3.8}$$


We will not give this equation a name, because it is a bit much to keep in mind, and one has to make
sense of what happens when the denominators are zero. We simply notice that it is a linear equation
in x, y and z and that it can be obtained by eliminating the parameters in (3.7).
There is an easier way of describing S with a linear equation. For this, you can interpret (3.7) as
saying that the vector $\overrightarrow{QP}$ is linearly dependent on the vectors v and w. With this in mind, the point
P (x, y, z) lies in the plane S if and only if
$$\begin{vmatrix} x - x_Q & y - y_Q & z - z_Q \\ v_x & v_y & v_z \\ w_x & w_y & w_z \end{vmatrix} = 0. \tag{3.9}$$
In particular, considering $\overrightarrow{QE}$, $v = \overrightarrow{QF}$ and $w = \overrightarrow{QG}$ for some points Q, E, F and G, then the four
points are coplanar if and only if
$$\begin{vmatrix} x_E - x_Q & y_E - y_Q & z_E - z_Q \\ x_F - x_Q & y_F - y_Q & z_F - z_Q \\ x_G - x_Q & y_G - y_Q & z_G - z_Q \end{vmatrix} = 0. \tag{3.10}$$
Notice that Equation (3.10) is just a restatement of the fact that four points Q, E, F and G are coplanar
if and only if the vectors $\overrightarrow{QE}$, $\overrightarrow{QF}$ and $\overrightarrow{QG}$ are linearly dependent (see Theorem 1.24). Notice also
that if we replace the equals sign in (3.9) with inequalities, we describe the two half-spaces separated
by this plane since, for a point P not in the given plane, $(\overrightarrow{QP}, v, w)$ is a basis which is either left or
right oriented.
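The coplanarity criterion (3.10) is a single 3×3 determinant, so it is easy to evaluate. An illustrative sketch (function names are ours):

```python
def det3(m):
    """3x3 determinant by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def coplanar(q, e, f, g):
    """Equation (3.10): Q, E, F, G are coplanar iff the determinant with rows
    E - Q, F - Q, G - Q vanishes."""
    rows = [[p[i] - q[i] for i in range(3)] for p in (e, f, g)]
    return det3(rows) == 0

print(coplanar((0, 0, 0), (1, 0, 0), (0, 1, 0), (2, 3, 0)))  # -> True, all in the plane z = 0
print(coplanar((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)))  # -> False
```

With floating-point coordinates one would test `abs(det3(rows)) < eps` instead of exact equality.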
We have just described a plane with a linear equation (Equation (3.9)) relative to the coordinate
system K. The converse is also true.
Proposition 3.3. Every plane in A3 can be described with a linear equation in three variables
ax + by + cz + d = 0 (3.11)
relative to a fixed coordinate system and any linear equation relative to a fixed coordinate system in
three variables describes a plane if the constants a, b, c are not all zero. Moreover, if Equation (3.11)
describes the plane π relative to a coordinate system K = (O, B), then the direction space D(π) of the
plane is the 2-dimensional subspace of V3 which, relative to the basis B, satisfies the equation
D(π) : ax + by + cz = 0.
Proof. This is a particular case of Theorem 3.7.
Equation (3.11) is called a Cartesian equation of the plane it describes. Notice that there are
infinitely many Cartesian equations describing the same plane, since you can multiply one equation
by a non-zero constant. Here again it may be useful to rearrange the linear equation (3.11) in order
to emphasize some geometric properties. For example, you can rearrange it in the form
$$\frac{x}{\alpha} + \frac{y}{\beta} + \frac{z}{\gamma} = 1 \quad\text{where}\quad \alpha = -\frac{d}{a},\ \beta = -\frac{d}{b} \quad\text{and}\quad \gamma = -\frac{d}{c}.$$
In this form we have the equation of the plane where we can read off the intersection points with the
coordinate axes since the plane intersects Ox in (α, 0, 0), it intersects Oy in (0, β, 0) and it intersects Oz
in (0, 0, γ).


3.2.3 Relative positions of two planes in A3


In order to describe intersections of planes we make use of linear algebra. Assume that we have two
planes
π1 : a1 x + b1 y + c1 z + d1 = 0 and π2 : a2 x + b2 y + c2 z + d2 = 0.

We determine if they intersect or not by discussing the system:


$$\begin{cases} \pi_1 : a_1 x + b_1 y + c_1 z + d_1 = 0 \\ \pi_2 : a_2 x + b_2 y + c_2 z + d_2 = 0 \end{cases} \tag{3.12}$$

Discussing this system is basic linear algebra (see for example [9, Section 3.6]). Here again, the
situation is very simple. Let M be the matrix of the system and M̃ the extended matrix of the system.
Then we have:

• the two planes intersect in a line, the coordinates of the points on the line being the solutions
to (3.12); this happens if the rank of M and the rank of M̃ are both equal to 2; or

• they don't intersect and (3.12) doesn't have solutions, in which case the planes are parallel; this
happens if the rank of M is strictly less than the rank of M̃; or

• the solutions to System (3.12) depend on two parameters, in which case π1 = π2 ; this happens
if the rank of M and the rank of M̃ are both equal to 1.
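The rank discussion can be sketched directly for two planes. The code below is illustrative only (names are ours); since both M and M̃ have exactly two rows, their rank is 2 precisely when some 2×2 minor is nonzero, which spares us a general rank computation.

```python
def rank2xn(m):
    """Rank of a 2-row matrix: 2 if some 2x2 minor is nonzero,
    1 if at least one entry is nonzero, 0 otherwise."""
    cols = len(m[0])
    for i in range(cols):
        for j in range(i + 1, cols):
            if m[0][i] * m[1][j] - m[0][j] * m[1][i] != 0:
                return 2
    return 1 if any(m[0]) or any(m[1]) else 0

def classify_planes(p1, p2):
    """p1, p2 = (a, b, c, d) for a*x + b*y + c*z + d = 0.
    Returns 'line', 'parallel' or 'equal' following the rank criterion."""
    a1, b1, c1, d1 = p1
    a2, b2, c2, d2 = p2
    m = [[a1, b1, c1], [a2, b2, c2]]
    me = [[a1, b1, c1, d1], [a2, b2, c2, d2]]
    if rank2xn(m) == 2:
        return "line"                      # rank M = rank M~ = 2
    if rank2xn(m) < rank2xn(me):
        return "parallel"                  # incompatible system
    return "equal"                         # both ranks equal 1

print(classify_planes((0, 0, 1, 0), (1, 0, 0, 0)))   # -> line (z = 0 meets x = 0)
print(classify_planes((0, 0, 1, 0), (0, 0, 1, -1)))  # -> parallel (z = 0 and z = 1)
print(classify_planes((0, 0, 1, 0), (0, 0, 2, 0)))   # -> equal
```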

3.3 Lines in A3
Here again we treat the usual Euclidean space E3 as a 3-dimensional real affine space and denote it
by A3 . As in the case of A2 , by Theorem 1.24, a line in A3 is a set of points S such that the set of
vectors which can be represented by points in S form a 1-dimensional vector subspace of V3 . Hence,
the subset S is a line if for any Q ∈ S we have that
$$\varphi_Q^3(S) = \{\,\overrightarrow{QP} : P \in S\,\} \text{ is a 1-dimensional vector subspace of } V_3.$$
If S is a line, we denote by D(S) the vector subspace $\varphi_Q^3(S)$ of V3 .


3.3.1 Parametric equations


For a line S and any two (distinct) points P, Q in S the vector $\overrightarrow{QP}$ is called a direction vector of S.
Since $\varphi_Q^3(S)$ is 1-dimensional, all direction vectors are linearly dependent and v is a direction vector
for S if and only if it is linearly dependent on $\overrightarrow{QP}$. So, for any direction vector v of S there is a scalar
t ∈ R such that
$$\overrightarrow{QP} = tv.$$
Now, if you fix Q and let P vary on the line then t varies in R. Since $\varphi_Q^3$ is a bijection, the line S can
be described as
$$S = \{\,P \in \mathbb{A}^3 : \overrightarrow{QP} = tv \text{ for some } t \in \mathbb{R}\,\}.$$

In this description, the point Q is arbitrary but fixed. If we want to emphasize that this description
depends on fixing Q, we refer to this point as the base point. Moreover, for any point O ∈ A3 we may
split the vector $\overrightarrow{QP}$ in the equation $\overrightarrow{QP} = tv$ to obtain
$$\overrightarrow{OP} = \overrightarrow{OQ} + tv. \tag{3.13}$$

The image that goes with this description is the same as the one in dimension 2. The only difference
is that we interpret it in the 3-dimensional space A3 . So, again, we can describe the line S as the
set of points P in A3 which satisfy Equation (3.13) for some t ∈ R. This equation is called the vector
equation of the line S relative to O, having base point Q and direction vector v, or simply a vector equation
of the line S.
So far, the description of a line in A3 is ad litteram the one used for A2 . Now fix a frame K = (O, B).
If we write Equation (3.13) in coordinates relative to K then we obtain
$$\begin{cases} x = x_Q + t v_x \\ y = y_Q + t v_y \\ z = z_Q + t v_z \end{cases} \quad\text{or, in matrix form,}\quad \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x_Q \\ y_Q \\ z_Q \end{pmatrix} + t \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix} \tag{3.14}$$
where Q = Q(xQ , yQ , zQ ) and v = v(vx , vy , vz ) relative to K and where t is the parameter yielding different
points (x, y, z) on the line. The three equations in the system (3.14) are called parametric equations for
the line S.

3.3.2 Cartesian equations


It is possible to eliminate the parameter t in (3.14) in order to obtain
$$\frac{x - x_Q}{v_x} = \frac{y - y_Q}{v_y} = \frac{z - z_Q}{v_z}. \tag{3.15}$$
We refer to the Equations (3.15) as symmetric equations of the line S. It could happen that vx , vy or vz
is zero. In that case, translate back to the parametric equations to understand what happens.
We have just described a line with two linear equations (Equations (3.15)) relative to the coordi-
nate system K. The converse is also true.
We have just described a line with two linear equations (Equations (3.15)) relative to the coordi-
nate system K. The converse is also true.


Proposition 3.4. Every line in A3 can be described with two linear equations in three variables
$$\begin{cases} a_1 x + b_1 y + c_1 z + d_1 = 0 \\ a_2 x + b_2 y + c_2 z + d_2 = 0 \end{cases} \tag{3.16}$$
relative to a fixed coordinate system, and any compatible system of two linear equations of rank 2
in three variables relative to a fixed coordinate system describes a line. Moreover, if the Equations
(3.16) describe the line ℓ relative to a coordinate system K = (O, B), then the direction space D(ℓ) of
the line is the 1-dimensional subspace of V3 which, relative to the basis B, satisfies the equations
$$D(\ell) : \begin{cases} a_1 x + b_1 y + c_1 z = 0 \\ a_2 x + b_2 y + c_2 z = 0 \end{cases} \tag{3.17}$$

Proof. This is a particular case of Theorem 3.7.

The Equations (3.16) are called Cartesian equations of the line which they describe. Notice that
they describe a line as an intersection of two planes.
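Conversely, a parametric description can be turned back into two Cartesian equations. The sketch below is illustrative (names are ours); it assumes vx ≠ 0, so that the equations obtained from the coordinate pairs (x, y) and (x, z) of the symmetric equations are independent. For other direction vectors one would pick different pairs.

```python
def line_to_cartesian(q, v):
    """Two planes (a, b, c, d), each meaning a*x + b*y + c*z + d = 0,
    whose intersection is the line through q with direction v.
    Assumes v[0] != 0."""
    xq, yq, zq = q
    vx, vy, vz = v
    assert vx != 0, "pick other coordinate pairs when vx = 0"
    # From (x - xq)/vx = (y - yq)/vy:  vy (x - xq) - vx (y - yq) = 0
    plane1 = (vy, -vx, 0, vx * yq - vy * xq)
    # From (x - xq)/vx = (z - zq)/vz:  vz (x - xq) - vx (z - zq) = 0
    plane2 = (vz, 0, -vx, vx * zq - vz * xq)
    return plane1, plane2

p1, p2 = line_to_cartesian((1, 2, 3), (1, 1, 1))
print(p1)  # -> (1, -1, 0, 1): the plane x - y + 1 = 0
print(p2)  # -> (1, 0, -1, 2): the plane x - z + 2 = 0
```

One checks directly that the base point (1, 2, 3) satisfies both equations.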

3.3.3 Relative positions of two lines in A3


Again, the intersections of lines can be determined with linear algebra. Assume we have two lines
$$\ell_1 : \begin{cases} a_1 x + b_1 y + c_1 z + d_1 = 0 \\ a_2 x + b_2 y + c_2 z + d_2 = 0 \end{cases} \quad\text{and}\quad \ell_2 : \begin{cases} a_3 x + b_3 y + c_3 z + d_3 = 0 \\ a_4 x + b_4 y + c_4 z + d_4 = 0 \end{cases}$$

One way to determine if they intersect is to discuss the system:
$$\begin{cases} a_1 x + b_1 y + c_1 z + d_1 = 0 \\ a_2 x + b_2 y + c_2 z + d_2 = 0 \\ a_3 x + b_3 y + c_3 z + d_3 = 0 \\ a_4 x + b_4 y + c_4 z + d_4 = 0 \end{cases} \tag{3.18}$$


Discussing this system is basic linear algebra (see for example [9, Section 3.6]). It is somewhat easier
to discuss the relative positions of lines in A3 via their parametric equations:
$$\ell_1 : \begin{cases} x = x_1 + t v_x \\ y = y_1 + t v_y \\ z = z_1 + t v_z \end{cases} \quad\text{and}\quad \ell_2 : \begin{cases} x = x_2 + t u_x \\ y = y_2 + t u_y \\ z = z_2 + t u_z \end{cases}.$$

We have the following cases:


• if the direction vectors v(vx , vy , vz ) and u(ux , uy , uz ) are proportional then the two lines are par-
allel;

• if they are parallel and have a point in common then the two lines are equal;

• if they are not parallel then they are coplanar (they lie in the same plane) if
$$\begin{vmatrix} x_1 - x_2 & y_1 - y_2 & z_1 - z_2 \\ v_x & v_y & v_z \\ u_x & u_y & u_z \end{vmatrix} = 0,$$
in which case they intersect in exactly one point;

• if they are not parallel and they don't intersect, then we say that the two lines ℓ1 and ℓ2 are skew
relative to each other.
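The case analysis above can be sketched in a few lines (illustrative names, ours). Proportionality of the directions is tested with the componentwise cross product, and the coplanarity determinant is exactly the scalar triple product of q2 − q1, v1 and v2.

```python
def cross(a, b):
    """Componentwise cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def classify_lines3(q1, v1, q2, v2):
    """Relative position of the lines through q1, q2 with directions v1, v2:
    'parallel', 'equal', 'intersecting' or 'skew'."""
    c = cross(v1, v2)
    w = [q2[i] - q1[i] for i in range(3)]
    if c == [0, 0, 0]:
        # directions proportional: equal iff q2 - q1 is also proportional to v1
        return "equal" if cross(w, v1) == [0, 0, 0] else "parallel"
    # scalar triple product = the 3x3 coplanarity determinant of the text
    triple = w[0] * c[0] + w[1] * c[1] + w[2] * c[2]
    return "intersecting" if triple == 0 else "skew"

print(classify_lines3((0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 0)))  # -> skew
print(classify_lines3((0, 0, 0), (1, 0, 0), (0, 0, 0), (0, 1, 0)))  # -> intersecting
```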

[Figure: incident, parallel and skew lines]

3.3.4 Relative positions of a line and a plane in A3


Consider the plane
$$\pi : ax + by + cz + d = 0$$
and the line
$$\ell : \begin{cases} x = x_0 + t v_x \\ y = y_0 + t v_y \\ z = z_0 + t v_z \end{cases}.$$

In order to see if they intersect, we check to see which points in ℓ satisfy the equation of π:

a(x0 + tvx ) + b(y0 + tvy ) + c(z0 + tvz ) + d = 0 ⇔ (avx + bvy + cvz )t + ax0 + by0 + cz0 + d = 0. (3.19)

The possibilities are:


• avx + bvy + cvz = 0 and ax0 + by0 + cz0 + d ≠ 0, in which case Equation (3.19) has no solution, i.e.
the plane and the line don't intersect: they are parallel; or

• avx + bvy + cvz = 0 and ax0 + by0 + cz0 + d = 0, in which case any t ∈ R is a solution to Equation
(3.19), i.e. the line is contained in the plane; in particular they are parallel; or

• avx + bvy + cvz ≠ 0, in which case Equation (3.19) has the unique solution
$$t_0 = -\frac{a x_0 + b y_0 + c z_0 + d}{a v_x + b v_y + c v_z}.$$
Hence, the point corresponding to the parameter t0 is the intersection point ℓ ∩ π.
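The three cases translate directly into code. An illustrative sketch (names are ours) that substitutes the parametric point into the plane equation exactly as in (3.19):

```python
def line_plane(plane, q, v):
    """plane = (a, b, c, d) for a*x + b*y + c*z + d = 0; line through q with direction v.
    Returns ('point', P), ('parallel', None) or ('contained', None)."""
    a, b, c, d = plane
    denom = a * v[0] + b * v[1] + c * v[2]          # coefficient of t in (3.19)
    num = a * q[0] + b * q[1] + c * q[2] + d        # constant term in (3.19)
    if denom == 0:
        return ("contained", None) if num == 0 else ("parallel", None)
    t0 = -num / denom
    return ("point", (q[0] + t0 * v[0], q[1] + t0 * v[1], q[2] + t0 * v[2]))

# The plane z = 0 and the vertical line through (0, 0, 2).
print(line_plane((0, 0, 1, 0), (0, 0, 2), (0, 0, 1)))  # -> ('point', (0.0, 0.0, 0.0))
print(line_plane((0, 0, 1, 0), (0, 0, 2), (1, 0, 0)))  # -> ('parallel', None)
```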

3.4 Affine subspaces of An


Definition 3.5. A d-dimensional affine subspace of the affine space An is a subset S ⊆ An such that the
set of vectors D(S) which can be represented by points in S form a d-dimensional vector subspace
of Vn . The vector subspace D(S) is then called the direction space of S. Moreover, given two affine
subspaces S1 and S2 in An we say that S1 is parallel to S2 , and we write S1 ∥S2 , if and only if D(S1 ) ⊆
D(S2 ) or D(S2 ) ⊆ D(S1 ). The dimension of an affine subspace S is denoted by dim(S), and is defined
by dim(S) = dim(D(S)).

Proposition 3.6. An affine subspace of An is an affine space with the affine structure inherited from
An .

Proof. The proof is a simple matter of unpacking the definition of affine spaces (Definition 1.27).
The space An is a triple (P, V, t) where P is the set of points, V is an n-dimensional vector space and


t : V × P → P is the translation map. If S is an affine subspace, it is in particular a subset of P. The


direction space D(S) consists of vectors in V which can be represented by points in S, thus D(S) is
a vector subspace of V. The inherited affine structure is obtained by restricting the translation map
to obtain t ′ : D(S) × S → S. Since t satisfies the axioms of an affine space, so does t ′ , hence the triple
(S, D(S), t ′ ) is an affine space.

Fixing a point O ∈ An , a point Q ∈ S and a basis (v1 , v2 , . . . , vd ) of D(S), it follows from the
definition that S is a d-dimensional affine subspace if and only if
$$S = \{\,P \in \mathbb{A}^n : \overrightarrow{OP} = \overrightarrow{OQ} + t_1 v_1 + t_2 v_2 + \cdots + t_d v_d \text{ for some } t_1, \ldots, t_d \in \mathbb{R}\,\}. \tag{3.20}$$

The equation in (3.20) is called the vector equation of the affine subspace S relative to O, having base
point Q and direction vectors v1 , v2 , . . . , vd , or simply a vector equation of S.
Fixing a coordinate system with origin O and translating the equation in (3.20) in coordinates,
one obtains parametric equations of the affine space S.
       
x1  q1  v1,1  vd,1 
x2  q2  v1,2  vd,2 
       
S :  .  =  .  + t1  .  + · · · + td  . 
       t1 , . . . , td ∈ R. (3.21)
 ..   ..   ..   .. 
       
xn qn v1,n vd,n

Another way of representing an affine subspace is by Cartesian equations (Equations (3.22)) as follows.

Theorem 3.7. Fix a coordinate system K = (O, B) in the affine space An . Let

$$\begin{cases} a_{11} x_1 + \cdots + a_{1n} x_n = b_1 \\ \quad\vdots \\ a_{t1} x_1 + \cdots + a_{tn} x_n = b_t \end{cases} \tag{3.22}$$
be a system of linear equations in the unknowns x1 , . . . , xn . The set S of points of An whose coordinates
are solutions to (3.22), if there are any, is an affine space of dimension d = n − r where r is the rank of
the matrix of coefficients of the system. The direction space D(S) is the vector subspace of Vn whose
equations relative to B are given by the associated homogeneous system
$$D(S) : \begin{cases} a_{11} x_1 + \cdots + a_{1n} x_n = 0 \\ \quad\vdots \\ a_{t1} x_1 + \cdots + a_{tn} x_n = 0 \end{cases} \tag{3.23}$$

Conversely, for every affine subspace S of An of dimension d there is a system of n−d linear equations
in n unknowns whose solutions correspond precisely to the coordinates of the points in S.

Proof. We follow the proof in [19, Section 8]. Denote by W the vector subspace defined by the ho-
mogeneous system (3.23). First we show that the set of solutions to (3.22) is an affine subspace of An


with the indicated properties. By assumption we only consider the cases where S , ∅. Thus, we may
fix a point Q(q1 , . . . , qn ) ∈ S. Then, for any point P (p1 , . . . , pn ) belonging to S we have

aj1 (p1 − q1 ) + · · · + ajn (pn − qn ) = (aj1 p1 + · · · + ajn pn ) − (aj1 q1 + · · · + ajn qn ) = 0


| {z } | {z }
bj bj

for each j = 1, . . . , t. Thus, $\overrightarrow{QP} \in W$. This shows that S is contained in the affine subspace T passing
through Q and parallel to W. Conversely, if R(r1 , . . . , rn ) ∈ T , then $\overrightarrow{QR} \in W$ and so, the components
(r1 − q1 , . . . , rn − qn ) of $\overrightarrow{QR}$ are solutions to (3.23). Thus,

$$0 = a_{j1}(r_1 - q_1) + \cdots + a_{jn}(r_n - q_n) = a_{j1} r_1 + \cdots + a_{jn} r_n - \underbrace{(a_{j1} q_1 + \cdots + a_{jn} q_n)}_{b_j}$$

for each j = 1, . . . , t. That is, R ∈ S. Thus S = T , hence S is an affine subspace. Moreover, dim(S) =
dim(W) = n − r, where r is the rank of the matrix of coefficients of (3.23).
Next we show that an affine subspace S ⊆ An has a description by a linear system of the form
(3.22). Let S be any affine subspace of An with direction space W of dimension s. Being an s-
dimensional subspace of V, W can be described by a homogeneous system with n − s equations

$$W : \begin{cases} a_{11} x_1 + \cdots + a_{1n} x_n = 0 \\ \quad\vdots \\ a_{n-s,1} x_1 + \cdots + a_{n-s,n} x_n = 0 \end{cases}$$
Fix a point Q ∈ S. The points P (p1 , . . . , pn ) of S are characterized by the condition that $\overrightarrow{QP} \in W$, i.e.

aj1 (p1 − q1 ) + · · · + ajn (pn − qn ) = 0

for each j = 1, . . . , n − s, equivalently,

aj1 p1 + · · · + ajn pn = bj

where we have put bj = aj1 q1 + · · · + ajn qn . Thus, the points P (p1 , . . . , pn ) ∈ S are precisely those points
in An whose coordinates satisfy the equations

$$\begin{cases} a_{11} x_1 + \cdots + a_{1n} x_n = b_1 \\ \quad\vdots \\ a_{n-s,1} x_1 + \cdots + a_{n-s,n} x_n = b_{n-s} \end{cases}$$

3.4.1 Hyperplanes
Definition 3.8. Affine subspaces in An which have dimension n − 1 are called hyperplanes.


Let H be a hyperplane and let (v1 , . . . , vn−1 ) be a basis of D(H). With respect to a frame K =
(O, e1 , . . . , en ) of An , parametric equations of H are of the form
$$H : \begin{cases} x_1 = q_1 + t_1 v_{1,1} + \cdots + t_{n-1} v_{n-1,1} \\ x_2 = q_2 + t_1 v_{1,2} + \cdots + t_{n-1} v_{n-1,2} \\ \quad\vdots \\ x_n = q_n + t_1 v_{1,n} + \cdots + t_{n-1} v_{n-1,n} \end{cases} \quad\text{or}\quad H : \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{pmatrix} + t_1 \begin{pmatrix} v_{1,1} \\ v_{1,2} \\ \vdots \\ v_{1,n} \end{pmatrix} + \cdots + t_{n-1} \begin{pmatrix} v_{n-1,1} \\ v_{n-1,2} \\ \vdots \\ v_{n-1,n} \end{pmatrix} \tag{3.24}$$

where vi = vi (vi,1 , . . . , vi,n ), where Q = Q(q1 , . . . , qn ) is a point in H and where ti ∈ R for each i ∈
{1, . . . , n − 1}. These parametric equations express the fact that a point P belongs to H if and only if
the vector $\overrightarrow{QP}$ is a linear combination of the basis vectors v1 , . . . , vn−1 , i.e. if and only if the vectors
$\overrightarrow{QP}, v_1, \ldots, v_{n-1}$ are linearly dependent. We can reformulate this as follows. A point P (x1 , . . . , xn )
belongs to the hyperplane H if and only if
$$\begin{vmatrix} x_1 - q_1 & x_2 - q_2 & \cdots & x_n - q_n \\ v_{1,1} & v_{1,2} & \cdots & v_{1,n} \\ \vdots & \vdots & & \vdots \\ v_{n-1,1} & v_{n-1,2} & \cdots & v_{n-1,n} \end{vmatrix} = 0. \tag{3.25}$$

This is a Cartesian equation of the hyperplane H.
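The determinant condition (3.25) is easy to test numerically. A minimal sketch with a hypothetical plane in A³ (the point Q and the basis of D(H) are made up):

```python
import numpy as np

def hyperplane_det(x, q, basis):
    """Evaluate the determinant in (3.25): the rows are x - q and a basis of D(H).
    The point with coordinates x lies on H iff the value is 0 (up to rounding)."""
    rows = np.vstack([np.asarray(x, float) - np.asarray(q, float), basis])
    return np.linalg.det(rows)

# Hypothetical plane in A^3 through Q = (0, 0, 1) with direction vectors v1, v2.
q = [0.0, 0.0, 1.0]
basis = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])

print(hyperplane_det([2.0, 5.0, 1.0], q, basis))  # a point of H: determinant 0
print(hyperplane_det([0.0, 0.0, 2.0], q, basis))  # off H: determinant nonzero
```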

3.4.2 Lines
A line in An is a 1-dimensional affine subspace. If ℓ is such a line, then its direction space D(ℓ) is 1-dimensional, so any two vectors in D(ℓ) are linearly dependent. Any non-zero vector v ∈ D(ℓ) is called a direction vector of ℓ. Thus, ℓ can be described as

ℓ = { P ∈ An : −−→OP = −−→OQ + tv for some t ∈ R }

for any point O ∈ An and any point Q ∈ ℓ. The image that goes with this description is the same as the
one in dimension 2, but here we interpret it in the n-dimensional space An . In coordinates, relative
to a given frame K of An , we obtain parametric equations for ℓ. They are of the form:

ℓ : x1 = q1 + tv1
    x2 = q2 + tv2
    ..
    xn = qn + tvn

or, in matrix notation, ℓ : [x1 . . . xn ]ᵀ = [q1 . . . qn ]ᵀ + t [v1 . . . vn ]ᵀ

where Q = Q(q1 , . . . , qn ) and v = v(v1 , . . . , vn ) relative to K. Here again it is possible to eliminate the
parameter t in order to obtain symmetric equations of the line ℓ:
ℓ : (x1 − q1 )/v1 = (x2 − q2 )/v2 = · · · = (xn − qn )/vn .


These are in fact a system of n − 1 linear equations which you can rearrange to look like this:

ℓ : a1,1 x1 + · · · + a1,n xn = b1
    ..
    an−1,1 x1 + · · · + an−1,n xn = bn−1

Notice that the rank of this system is n−1 since dim(ℓ) = 1. Moreover, notice that each linear equation
in the above system describes a hyperplane. So, this is saying that a line in An can be described by
the intersection of n − 1 hyperplanes.
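The description of a line as an intersection of n − 1 hyperplanes can be illustrated numerically; the point Q and the direction vector v below are hypothetical:

```python
import numpy as np

# Hypothetical line in A^3 through Q = (1, 0, 2) with direction v = (2, 3, 1).
q = np.array([1.0, 0.0, 2.0])
v = np.array([2.0, 3.0, 1.0])

# Rearranging the symmetric equations (x1-q1)/v1 = (x2-q2)/v2 = (x3-q3)/v3
# gives n - 1 = 2 linear equations, each one describing a hyperplane:
#   v2(x1 - q1) - v1(x2 - q2) = 0   and   v3(x2 - q2) - v2(x3 - q3) = 0
A = np.array([[v[1], -v[0], 0.0],
              [0.0, v[2], -v[1]]])
b = A @ q

# The coefficient matrix has rank n - 1 = 2, as stated in the text.
assert np.linalg.matrix_rank(A) == 2

# Every point q + t*v of the line solves the system.
for t in (-1.0, 0.5, 3.0):
    assert np.allclose(A @ (q + t * v), b)
```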

3.4.3 Relative positions


Let S and T be two affine subspaces of An . If S ∥ T (see Definition 3.5) then they are disjoint or one
is included in the other (see Proposition 3.9 below). Notice that if dim(S) = dim(T ), then S and T
are parallel if and only if D(S) = D(T ). In particular, if S and T are lines, they are parallel if they
have the same direction, i.e. any two of their direction vectors are proportional. Notice also that two
hyperplanes are parallel if the coefficients of the unknowns in their equations are proportional.
Proposition 3.9. Let S and T be parallel affine subspaces of An with dim(S) ≤ dim(T ).
1.) If S and T have a point in common then S ⊆ T .
2.) If dim(S) = dim(T ), and S and T have a point in common then S = T .
Proof. Fix Q ∈ S ∩ T . Since S and T are parallel, without loss of generality we may assume that
D(S) ⊆ D(T ). Then, we obtain 1.) from
−−→ −−→
S = {P ∈ An : QP ∈ D(S)} ⊆ {P ∈ An : QP ∈ D(T )} = T .

We obtain 2.) by noticing that dim(S) = dim(T ) implies equality in the above equation. Indeed, by
definition dim(S) = dim(T ) means that dim(D(S)) = dim(D(T )), but then D(S) is a vector subspace of
D(T ) of maximal dimension, hence D(S) = D(T ).

As a consequence of Proposition 3.9 we obtain the following corollary which implies the ‘parallel
postulate’ of Euclidean geometry (Axiom IV in Appendix A). The axioms of affine spaces therefore
imply the validity of this postulate.
Corollary 3.10. If S is an affine subspace of An and Q a point in An , there is a unique affine subspace
T of An which contains Q, is parallel to S and has the same dimension as S.
−−→
Proof. Fix a point Q ∈ An and a point P ∈ S. To see that T exists, translate all points of S with QP ,
i.e. consider
−−→ −−→ −−−→ −−−→
T = QP + S = QP + {P ′ ∈ An : P P ′ ∈ D(S)} = {P ′ ∈ An : QP ′ ∈ D(S)}
−−→ −−−→ −−−→
where for the last equality we use QP + P P ′ = QP ′ . Then T is an affine subspace with D(T ) = D(S),
in particular it is parallel to S and dim(T ) = dim(S). To see that T is unique, assume that T ′ is another
affine subspace passing through Q which is parallel to S and of the same dimension. By point 2.) of
Proposition 3.9 we see that T ′ has to equal T .


Definition 3.11. If two affine subspaces S and T of An are not parallel, they are said to be either skew if they do not meet, or incident if they have a point in common.
In order to determine the intersection S ∩ T , suppose that the two subspaces are given by the Cartesian equations

S : ai1 x1 + · · · + ain xn = bi   for i = 1, . . . , n − s   (3.26)

T : ck1 x1 + · · · + ckn xn = dk   for k = 1, . . . , n − t.   (3.27)

The intersection S ∩ T is the locus of points in An whose coordinates are simultaneously solutions to both (3.26) and (3.27), i.e. they are solutions to the system

S ∩ T : ai1 x1 + · · · + ain xn = bi   for i = 1, . . . , n − s,
        ck1 x1 + · · · + ckn xn = dk   for k = 1, . . . , n − t.   (3.28)

By Theorem 3.7, if the System (3.28) has a solution, then it describes an affine subspace. Thus, if
S ∩ T is non-empty it is an affine subspace of An .
Proposition 3.12. If the intersection S ∩ T of two affine subspaces of An is non-empty it is an affine subspace satisfying

dim(S) + dim(T ) − dim(An ) ≤ dim(S ∩ T ) ≤ min{ dim(S), dim(T ) }.   (3.29)

Proof. Let s = dim(S) and let t = dim(T ), and write S and T via Cartesian equations as in (3.26) and (3.27). Since S ∩ T is non-empty, the System (3.28) is compatible and the dimension of S ∩ T is n − r where r is the rank of the matrix of coefficients of this system. Notice that

r ≤ (n − s) + (n − t) = 2n − (s + t).

Thus,

dim(S ∩ T ) = n − r ≥ n − [2n − (s + t)] = s + t − n = dim(S) + dim(T ) − dim(An ).

The last inequality is clear since S ∩ T ⊆ S, T implies D(S ∩ T ) ⊆ D(S), D(T ) and therefore dim(S ∩ T ) ≤ dim(S), dim(T ).
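The rank argument in the proof can be replayed numerically. A sketch with two hypothetical planes in A⁴:

```python
import numpy as np

# Hypothetical check of (3.29) in A^4: S and T are planes (s = t = 2),
# each given by n - 2 = 2 Cartesian equations.
A_S = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0]])   # S : x1 = 0, x2 = 0
A_T = np.array([[0.0, 0.0, 1.0, 0.0],
                [1.0, 0.0, 0.0, 1.0]])   # T : x3 = 1, x1 + x4 = 2

# The stacked system describes S ∩ T, which here is the single point (0, 0, 1, 2),
# so the system is compatible and dim(S ∩ T) = n - r.
A = np.vstack([A_S, A_T])
n = 4
r = np.linalg.matrix_rank(A)
dim_intersection = n - r
print(dim_intersection)

# Lower bound of (3.29): s + t - n = 2 + 2 - 4 = 0 <= dim(S ∩ T)
assert dim_intersection >= 2 + 2 - 4
```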

Notice that in the previous proposition the second inequality is an equality if S ⊆ T or T ⊆ S. What about the first inequality? When do we have equality there?
Proposition 3.13. Let S and T be two affine subspaces of An and denote the direction space of An by Vn . Then Vn = D(S) + D(T ) if and only if S ∩ T ≠ ∅ and

dim(S ∩ T ) = dim(S) + dim(T ) − dim(An ). (3.30)


Proof. We use Grassmann’s identity, which states that for any vector subspaces W and U we have

dim(W) + dim(U) = dim(W ∩ U) + dim(W + U).

Let s = dim(S) and let t = dim(T ), and write S and T via Cartesian equations as in (3.26) and (3.27). By convention dim(∅) = −∞, thus (3.30) holds only if S ∩ T ≠ ∅. Assume that S ∩ T is non-empty. Then, the System (3.28) is compatible and the dimension of S ∩ T is n − r where r denotes the rank of the matrix of coefficients of this system. Moreover we notice that (3.30) is equivalent to

n − r = s + t − n   ⇔   dim(D(S ∩ T )) = dim(D(S)) + dim(D(T )) − dim(Vn )

and since S ∩ T ≠ ∅ this is in turn equivalent to

dim(D(S) ∩ D(T )) = dim(D(S)) + dim(D(T )) − dim(Vn ).

Rearranging the equation and using Grassmann’s identity, this is equivalent to

dim(Vn ) = dim(D(S)) + dim(D(T )) − dim(D(S) ∩ D(T )) = dim(D(S) + D(T )).

At this point we use the fact that the vector subspace D(S) + D(T ) of Vn has maximal dimension if and only if it equals the ambient space, i.e. if and only if D(S) + D(T ) = Vn .

3.4.4 Changing the reference frame


Let S be an affine subspace of An given with respect to the frame K = (O, B) via the parametric equations (3.21). Then, if K′ = (O′ , B ′ ) is another frame, by Theorem 2.10, parametric equations with respect to K′ are

S : [x′1 . . . x′n ]ᵀ = MB′,B · [q1 . . . qn ]ᵀ + [O]K′ + t1 · MB′,B · [v1,1 . . . v1,n ]ᵀ + · · · + td · MB′,B · [vd,1 . . . vd,n ]ᵀ .

In terms of Cartesian equations, notice that (3.22) can be written in the form

S : A · [x1 . . . xn ]ᵀ = b.

Then, with respect to K′ , this system translates as follows:

S : A · MB,B′ · [x′1 . . . x′n ]ᵀ = b − A · [O′ ]K ,

that is, S : A′ · [x′1 . . . x′n ]ᵀ = b′ , where A′ = A · MB,B′ and b′ = b − A · [O′ ]K .
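The transformation rule for Cartesian equations under a change of frame can be checked on a small example; the frame change below (the matrix M and the new origin) is made up for illustration:

```python
import numpy as np

# Hypothetical line in A^2:  S : x1 + x2 = 3  with respect to K = (O, B).
A = np.array([[1.0, 1.0]])
b = np.array([3.0])

M = np.array([[0.0, -1.0],      # hypothetical M_{B,B'}: takes B'-coordinates
              [1.0, 0.0]])      # to B-coordinates
O_new = np.array([1.0, 1.0])    # coordinates [O']_K of the new origin

A_new = A @ M                   # A' = A * M_{B,B'}
b_new = b - A @ O_new           # b' = b - A * [O']_K

# A point P with K-coordinates x satisfies A x = b; its K'-coordinates are
# x' = M^{-1} (x - [O']_K), and this x' must satisfy A' x' = b'.
x = np.array([2.0, 1.0])        # on S since 2 + 1 = 3
x_new = np.linalg.solve(M, x - O_new)
assert np.allclose(A_new @ x_new, b_new)
```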

CHAPTER 4

Euclidean space

Contents
4.1 Angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.1.1 Orthonormal frames . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.1.2 Oriented angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.2 Scalar product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.2.1 The Euclidean space Rn (first revision) . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.2 Gram-Schmidt algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.2.3 Normal vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.2.4 Angles between lines and hyperplanes . . . . . . . . . . . . . . . . . . . . . . . 60
4.3 Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.3.1 Distance from a point to a hyperplane . . . . . . . . . . . . . . . . . . . . . . . 62
4.3.2 Loci of points equidistant from affine subspaces . . . . . . . . . . . . . . . . . 63
4.3.3 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66


4.1 Angles
In Chapter 1 we extracted the information that the Axioms encode in two points and arrived at the notion of affine space. One natural way to proceed is to consider the information that the Axioms encode in three points. Three non-collinear points define a plane, 12 angles, a triangle and its area, etc. Here, we focus on angles, a second key concept in the formulation of the Axioms.
There was much debate among philosophers as to the particular category (according to the Aris-
totelian scheme) in which an angle should be placed; is it, namely, a quantity, a quality or a relation
(see [10, p.177]). Such questions are for philosophers. Based on Hilbert’s Axioms we will take an
angle ∡(h, k) to be the information carried by two rays h and k, emanating from a common point O.
One can deduce from the Axioms that an angle defines a unique plane which the two rays separate
in a convex and a concave subset. Visually, it is common to identify an angle with the convex subset
they define. As a matter of notation, for points A ∈ h and B ∈ k we may write ∡AOB for the angle ∡(h, k). Notice that if three points O, A, B are given, the symbol ∡AOB requires that A ≠ O and B ≠ O.
In this section we derive properties of angles needed to introduce the scalar product. Standard
results concerning the trigonometry of the oriented Euclidean plane are deduced in Appendix I.
Throughout, we are interested in properties of angles up to congruence, i.e. in those properties
which all congruent angles share. From Axioms III.4 and III.5 one can deduce that congruence of
angles is indeed an equivalence relation and one may consider equivalence classes of angles under
this relation. This treatment is implicit in the notion of angles between two vectors. Instead of adding
more notation, we will simply say angle up to congruence to mean that the angle may be replaced with
a congruent angle.
Two lines which intersect in exactly one point form four angles. The opposite angles are con-
gruent (see [14]) and the adjacent angles are called supplementary. A right angle is an angle which is
congruent to its supplementary angle. If one of the angles of two intersecting lines is a right angle
then all of them are right angles and we say that the lines are orthogonal, or that the lines are perpen-
dicular to eachother. An acute angle is an angle less than a right angle and an obtuse angle is an angle
greater than a right angle.
Definition 4.1. Let ∡(h, k) be an angle with the two rays emanating from the point O. Let ℓ be the line containing k and choose A ∈ h and B ∈ ℓ such that AB is orthogonal to ℓ. The sine and cosine of the angle ∡(h, k) are defined by

sin ∡(h, k) = |AB|/|OA|

and

cos ∡(h, k) =  0             if the angle is a right angle;
               |OB|/|OA|     if the angle is acute;
               −|OB|/|OA|    if the angle is obtuse.



Proposition 4.2. The sine and cosine of an angle are well defined. Moreover, the following hold:
1. For an angle θ we have sin(θ) ∈ [0, 1], cos(θ) ∈ [−1, 1] and cos(θ)2 + sin(θ)2 = 1.
2. Two angles are congruent if and only if their cosines are equal.
3. Two angles have the same sine if and only if they are congruent or supplementary up to con-
gruence.
Proof. The sine function and the cosine function each attribute a real value to an angle by means of
certain choices. We need to show that these values are independent of the choices made. We show
this for acute angles, the other cases are similar. Let ∡(h, k) be an acute angle and let O, A, B be as in Definition 4.1. Consider two other points A′ ∈ h and B′ ∈ k such that A′ B′ is orthogonal to k. By Thales’ Intercept Theorem (see Theorem F.1) the ratio |A′ B′ |/|OA′ | equals |AB|/|OA| and the ratio
It remains to show that the definition does not depend on the order of the two rays. By Lemma
1.4 there is a unique point A′ ∈ k such that [OA] is congruent to [OA′ ]. Let B′ ∈ h be such that A′ B′ is
orthogonal to h. By the Second Congruence Theorem [14, Theorem 13], the triangles OAB and OB′ A′
are congruent, hence the ratio |OB′ |/|OA′ | equals |A′ B|/|B′ A| and |OB′ |/|OA′ | = |OB|/|OA|.


It remains to consider the last three claims. Claim 1. follows from the fact that the length of
the catheti in a right angle triangle are always less than the length of the hypotenuse and, cos(θ)2 +
sin(θ)2 = 1 follows from Pythagoras’ Theorem. Since congruence of triangles is an equivalence rela-
tion, Claim 2. and 3. can be deduced with Axiom III.4.
Definition 4.3. For two non-zero vectors a = −−→OA and b = −−→OB , the unoriented angle between a and b, denoted ∡(a, b), is the angle ∡AOB up to congruence. By Proposition 4.2, the values of sine and cosine do not change under congruence, thus, the sine sin ∡(a, b) and cosine cos ∡(a, b) of the unoriented angle ∡(a, b) are well defined. If cos ∡(a, b) = 0 we say that a and b are orthogonal and we write a ⊥ b.
We denote the set of all unoriented angles by W (from the German word ‘Winkel’).



Proposition 4.4. The unoriented angle of two (non-zero) vectors is well defined. Moreover, for any
two non-zero vectors a, b and a real number x > 0, the following hold:

1. ∡(a, b) = ∡(xa, b) = ∡(a, xb),

2. ∡(−xa, b) = ∡(a, −xb),

3. cos ∡(a, b) = − cos ∡(−a, b),

4. sin ∡(a, b) = sin ∡(−a, b).

Proof. Let a, b ∈ V2 be two non-zero vectors. Given a point O ∈ E2 , there are unique points A, B ∈ E2 such that a = −−→OA and b = −−→OB . Thus we obtain an angle ∡AOB. For a different point O′ ∈ E2 , again there are unique points A′ , B′ ∈ E2 such that a = −−→O′A′ and b = −−→O′B′ , giving us a second angle ∡A′ O′ B′ . It is not difficult to see that the two angles are congruent.

Remark. Let ℓ be a line in a plane π and let O be a point on ℓ. Choose a side of π relative to ℓ and consider the semicircle ½S¹ = { A : A on the given side of ℓ or on ℓ, and |OA| = 1 }. It follows from Proposition 4.4 that the set of unoriented angles W is in bijection with ½S¹.


Definition 4.5. For a vector a we have orthogonal projection maps pr⊥a : V → R and Pr⊥a : V → V defined as follows. For a vector b let O, A, B be such that a = −−→OA , b = −−→OB . Drop a perpendicular BB′ on OA with B′ ∈ OA. Then

Pr⊥a (b) = −−→OB′   and   Pr⊥a (b) = pr⊥a (b) · a/|a| .



Proposition 4.6. The orthogonal projection map on vectors is a well defined linear map. Moreover, for two vectors a and b we have

cos ∡(a, b) = pr⊥a (b)/|b| = pr⊥b (a)/|a| .

Proof. Showing that the map is well defined is left as an exercise. It is enough to show that pr⊥a is linear, since then

Pr⊥a (xb + yc) = pr⊥a (xb + yc) · a/|a| = ( x pr⊥a (b) + y pr⊥a (c) ) · a/|a| = x Pr⊥a (b) + y Pr⊥a (c).

Let i be the unit vector a/|a| and let j be a vector orthogonal to i. The linearity of pr⊥a follows from the fact that it is the projection on the coordinate x-axis of the frame having O as origin and (i, j) as basis. The last claim follows from the definition of cosine.
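The linearity claim of Proposition 4.6 can be tested in coordinates; a sketch with hypothetical vectors, identifying V with R² via an orthonormal frame:

```python
import numpy as np

def pr_scalar(a, b):
    """Scalar orthogonal projection pr⊥_a(b) of b on the line spanned by a."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return a @ b / np.linalg.norm(a)

def Pr_vector(a, b):
    """Vector orthogonal projection Pr⊥_a(b) = pr⊥_a(b) · a/|a|."""
    a = np.asarray(a, float)
    return pr_scalar(a, b) * a / np.linalg.norm(a)

a = np.array([3.0, 0.0])
b = np.array([2.0, 5.0])
c = np.array([-1.0, 4.0])

# Linearity of pr⊥_a, as claimed in the proposition:
lhs = pr_scalar(a, 2 * b + 3 * c)
rhs = 2 * pr_scalar(a, b) + 3 * pr_scalar(a, c)
assert np.isclose(lhs, rhs)

# cos ∡(a, b) = pr⊥_a(b) / |b| agrees with the usual cosine formula.
cos_ab = pr_scalar(a, b) / np.linalg.norm(b)
assert np.isclose(cos_ab, a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```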

4.1.1 Orthonormal frames


Definition 4.7. A basis B = (e1 , . . . , en ) of Vn is called orthogonal if the vectors ei are mutually orthogonal, i.e. if ei ⊥ ej for all i, j ∈ {1, . . . , n} with i ≠ j. The basis B is called orthonormal if it is orthogonal and all ei are unit vectors. A coordinate system K = (O, B) is called orthogonal or orthonormal if the basis B is orthogonal or respectively orthonormal.
In dimension 2, the existence of orthonormal frames is a consequence of the existence of right
angles. In general the existence of these frames follows from the defining properties of the scalar
product (Proposition 4.15) in the context of bilinear forms (see Corollary H.9) or constructively using
the Gram-Schmidt algorithm (see Section 4.2.2).
Notice that by Proposition 4.6, for any vector a we have pr⊥ei (a) = |a| cos ∡(a, ei ). Thus, with respect to an orthonormal frame the coordinates of a point P (x1 , . . . , xn ) and its position vector a = −−→OP are

[P ]K = [x1 . . . xn ]ᵀ = [ |a| cos ∡(a, e1 ) . . . |a| cos ∡(a, en ) ]ᵀ = [ −−→OP ]B .

4.1.2 Oriented angles


Proposition 4.4 shows that the set of (unoriented) angles can be represented with a semicircle. If we
are in dimension 2, i.e., when considering the Euclidean plane E2 , we may consider the exterior of
an angle to be an angle as well. This only works in dimension 2 because lines are hyperplanes here.
Definition 4.8. For an angle ∡(h, k) denote by −∡(h, k) the exterior of the angle, i.e. the region in the plane which is not between the two rays h and k. Assume that a right oriented frame in E2 has been chosen. Let h and k be two rays emanating from the same point O and let A ∈ h and B ∈ k. The oriented angle defined by h and k is

∡or (h, k) =  ∡(h, k)     if ( −−→OA , −−→OB ) is right-oriented,
             −∡(h, k)    otherwise.


For two vectors a = −−→OA and b = −−→OB the oriented angle ∡or (a, b) is the oriented angle defined by the rays (OA and (OB. The orientation of ∡or (h, k) is the orientation of the basis (a, b). We denote the set of all oriented angles up to congruence by Wor .


Proposition 4.9. There is a bijection between the set Wor of oriented angles and the unit circle S1 .

Lemma 4.10. Fix an orientation in E2 , an oriented angle ∡or (a, b) and a length x. For any non-zero
vector c there is a unique vector d of length x such that ∡or (a, b) = ∡or (c, d).

Definition 4.11 (Counterclockwise sum of angles). We define the sum of two oriented angles ∡or (a, b)
and ∡or (c, d) as follows. By Lemma 4.10, there is a unique unit vector d′ such that ∡or (c, d) = ∡or (b, d′ )
and we define
∡or (a, b) + ∡or (c, d) = ∡or (a, d′ )


Proposition 4.12. The set Wor of oriented angles with addition is an abelian group.

Definition 4.13. For a vector v ∈ V2 let J(v) be the unique vector in V2 satisfying the following properties

(a) J(v) ⊥ v,

(b) |J(v)| = |v|,


(c) (v, J(v)) is a right oriented basis of V2 .

The sine of the oriented angle ∡or (a, b) is defined to be

sin ∡or (a, b) = pr⊥J(a) (b)/|b| = pr⊥J(b) (a)/|a| .   (4.1)



4.2 Scalar product


Definition 4.14. The scalar product (or, dot product) of two vectors a, b ∈ V is the real number

⟨a, b⟩ =  0                          if one of the two vectors is zero;
          |a| · |b| · cos ∡(a, b)    if both vectors are non-zero.

It is a map ⟨ , ⟩ : V × V → R. Notice that from the definitions we have

Pr⊥a (b) = (⟨a, b⟩/⟨a, a⟩) · a   and   Pr⊥b (a) = (⟨b, a⟩/⟨b, b⟩) · b.   (4.2)


Proposition 4.15. The scalar product satisfies the following properties.

(SP1) It is bilinear, i.e. for all a, b ∈ R and all v, w, u ∈ V2 we have

⟨av + bw, u⟩ = a⟨v, u⟩ + b⟨w, u⟩ and ⟨v, aw + bu⟩ = a⟨v, w⟩ + b⟨v, u⟩.

(SP2) It is symmetric, i.e. for all v, w ∈ V2 we have

⟨v, w⟩ = ⟨w, v⟩.

(SP3) It is positive definite, i.e. for all v ∈ V2

if v , 0 then ⟨v, v⟩ > 0.

(SP4) It recognizes right angles and unit lengths, i.e. for all non-zero vectors v, w ∈ V2

v⊥w ⇔ ⟨v, w⟩ = 0 and |v| = 1 ⇔ ⟨v, v⟩ = 1.


Proof. Since cos ∡(a, b) = cos ∡(b, a), the scalar product is symmetric. Since cos ∡(a, a) = 1, the scalar product is positive definite. In order to show bilinearity, notice that by symmetry it is enough to show that the scalar product is linear in the first argument. For any non-zero vectors a, b, by Proposition 4.6, we have

⟨a, b⟩ = |a| · |b| · cos ∡(a, b) = |a| · |b| · pr⊥b (a)/|a| = |b| · pr⊥b (a)

which is linear in a since b is fixed and since pr⊥b is a linear map. The last property follows directly from the definition.

Proposition 4.16. Let K = (O, B) be an orthonormal frame. For two vectors a(a1 , . . . , an ) and b(b1 , . . . , bn ) we have

⟨a, b⟩ = a1 b1 + · · · + an bn   (4.3)

and therefore

|a| = √(a1² + a2² + · · · + an²) ,

cos ∡(a, b) = (a1 b1 + a2 b2 + · · · + an bn ) / ( √(a1² + a2² + · · · + an²) · √(b1² + b2² + · · · + bn²) ) ,

a ⊥ b ⇔ a1 b1 + a2 b2 + · · · + an bn = 0.
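With respect to an orthonormal frame these formulas become one-liners; the vectors below are hypothetical:

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])
b = np.array([2.0, 0.0, 1.0])

dot = float(a @ b)                       # a1 b1 + ... + an bn, formula (4.3)
norm_a = float(np.sqrt(np.sum(a**2)))    # |a| = sqrt(a1^2 + ... + an^2)
cos_ab = dot / (np.linalg.norm(a) * np.linalg.norm(b))
print(dot, norm_a, cos_ab)

# Orthogonality test:  a ⊥ c  ⇔  a1 c1 + ... + an cn = 0
c = np.array([2.0, -1.0, 0.0])
assert np.isclose(a @ c, 0.0)
```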

4.2.1 The Euclidean space Rn (first revision)


In Chapter 1 we denoted by E the Euclidean space, a set of points P governed by Hilbert's Axioms
listed in Appendix A. Fixing a unit segment, one shows that any line can be identified with the real
numbers R (see Appendix B). Moreover, from the Axioms we extracted the concept of geometric
vectors and, having fixed a unit segment, we showed that the set of vectors V is a vector space. Then,
we concluded that the set of points P is a real affine space over V. This structure encapsulates in
particular the Axioms of Continuity.
At this point we made the extra assumption that the dimension of V is n which we indicate with the notation En . Then, the Cartesian frames introduced in Chapter 2 allow us to identify P with the affine space Rn which opened the way to linear algebra. This identification amounts to a choice of a frame and in particular contains the choice of an orientation.
In Chapter 3 we saw that lines, planes and higher dimensional analogues correspond to systems
of linear equations. This allows one to revise the relations of incidence, betweenness and parallelism.
In particular, by considering En as a real affine space we encapsulate the Axioms of Incidence, the
Axioms of Order and the Axiom of Parallels.
Thus, it remains to shed more light on the Axioms of Congruence. In other words, we wish
for a more precise description of the congruence relation for segments and angles. Congruence of
segments was already extracted in the notion of length. By definition, two segments are congruent if
and only if they define the same length and the set of lengths L can be identified with R≥0 (Appendix
B). By Proposition 4.2, two angles are congruent if and only if they have the same cosine. Therefore,
the scalar product allows us not only to efficiently calculate lengths and measure angles, but also to
give an explicit description of the congruence relation – we finish this in Section 7.1.1.


By Corollary H.10, the scalar product is the unique positive definite symmetric bilinear form
which recognizes right angles and unit lengths. By Corollary H.9 for any positive definite bilinear
form (i.e. satisfying (SP1), (SP2) and (SP3) in Proposition 4.15) there is a basis in which it looks like
the scalar product (4.3). Thus, computationally such bilinear forms are indistinguishable.
Definition 4.17. The n-dimensional Euclidean space En is the pair (An , ⟨ , ⟩) where An is the n-dimensional real affine space and where ⟨ , ⟩ is the unique positive definite symmetric bilinear form on D(An ) which recognizes right angles and unit lengths, i.e. with respect to an orthonormal basis B it has the expression (4.3). Then, the distance between two points P , Q ∈ En is

d(P , Q) = | −−→QP | = √⟨ −−→QP , −−→QP ⟩ = √( (p1 − q1 )² + (p2 − q2 )² + · · · + (pn − qn )² )   (4.4)

where the coordinates of P (p1 , . . . , pn ) and Q(q1 , . . . , qn ) are relative to an orthonormal frame K.
Remark (The Euclidean space Rn ). Choosing an orthonormal frame we may identify both An and Vn with Rn . Since computationally there is no difference, it is more economical to say that the Euclidean space is Rn with the standard basis as an orthonormal basis. This is the starting point of any Analysis course, and it is the advantage that Newton and Leibniz in particular saw in Descartes' work.
Remark. The advantages of Definition 4.17 are conceptual. For example, by Proposition 3.6, a d-
dimensional affine subspace S of An is itself an affine space, which can be identified with Ad ; we
write S  Ad . It is easy to see that a scalar product on An defined with the properties in Proposition
4.15 restricts to a scalar product on S  Ad ⊆ An . Thus, an inclusion S  Ad ⊆ An automatically
translates to an inclusion S  Ed ⊆ En . Formally this can be stated as follows: An affine subspace of
En is a Euclidean space with the scalar product inherited from En . In particular, all the results which
we know to hold true for E2 or E3 will hold true when we consider 2-dimensional or 3-dimensional
subspaces of the Euclidean space En .

4.2.2 Gram-Schmidt algorithm


Clearly, not all coordinate systems are orthonormal, so what do we do if we have to deal with a non-orthonormal reference frame K? The familiar formulas in Proposition 4.16 no longer hold true. We have two options: 1. We deal with the scalar product in the given reference frame K, or 2. We find an orthonormal reference frame K′ starting from K and translate everything to K′ . For the first option one makes use of the Gram matrix (see Appendix H). We discuss the second option in this section.

Fix a basis B = (e1 , e2 , . . . , en ) of Vn . We want to construct an orthonormal basis B ′ starting from B. Recall from (4.2) that the orthogonal projection of ei on ej is (⟨ej , ei ⟩/⟨ej , ej ⟩) ej . We construct B ′ in two steps:

1. Construct an orthogonal basis e′1 , e′2 , . . . , e′n as follows

   e′1 = e1
   e′2 = e2 − (⟨e′1 , e2 ⟩/⟨e′1 , e′1 ⟩) e′1
   e′3 = e3 − (⟨e′1 , e3 ⟩/⟨e′1 , e′1 ⟩) e′1 − (⟨e′2 , e3 ⟩/⟨e′2 , e′2 ⟩) e′2
   ..


2. Normalize the vectors to obtain the basis

   B′ = ( e′1 /|e′1 | , . . . , e′n /|e′n | ).

This process of obtaining an orthonormal basis from a given basis is called the Gram-Schmidt process. It can be used in an infinite-dimensional vector space, hence the name process. If the vector space is finite-dimensional, the process terminates and we may call it an algorithm.
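The two steps above translate directly into code. A sketch (the basis B is made up; vectors are rows, and the scalar product is the standard one of an orthonormal frame):

```python
import numpy as np

def gram_schmidt(basis):
    """Gram-Schmidt: turn a basis (rows of `basis`) into an orthonormal basis.
    Follows the two steps of the text: orthogonalize, then normalize."""
    ortho = []
    for e in np.asarray(basis, float):
        v = e.copy()
        for u in ortho:
            v -= (u @ e) / (u @ u) * u   # subtract the projection (<u,e>/<u,u>) u
        ortho.append(v)
    return np.array([v / np.linalg.norm(v) for v in ortho])

B = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Bp = gram_schmidt(B)

# B' is orthonormal: its Gram matrix is the identity.
assert np.allclose(Bp @ Bp.T, np.eye(3))
```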
Proposition 4.18. The basis B ′ obtained from the basis B with the Gram-Schmidt algorithm is an
orthonormal basis.
Proof. A more general statement and proof is given in [19, Theorem 17.4]. The vectors e′1 , e′2 , . . . , e′k , . . . are constructed by induction on k. For each k we consider the vector subspace Vk generated by Bk = (e1 , e2 , . . . , ek ) and show that Bk′ = (e′1 , e′2 , . . . , e′k ) is an orthogonal basis for Vk .

If k = 1 then e′1 = e1 and the claim is obvious. Assume that the claim holds for k. By definition we have

e′k+1 = ek+1 − v   where   v = (⟨e′1 , ek+1 ⟩/⟨e′1 , e′1 ⟩) e′1 + · · · + (⟨e′k , ek+1 ⟩/⟨e′k , e′k ⟩) e′k .

Notice that e′k+1 ≠ 0: otherwise ek+1 = v would be a linear combination of Bk′ and therefore of Bk (since both are bases of Vk ), contradicting the linear independence of the basis B. It follows that B′k+1 is a basis of Vk+1 . Moreover, for each i = 1, . . . , k we have

⟨e′k+1 , e′i ⟩ = ⟨ek+1 − v, e′i ⟩ = ⟨ek+1 , e′i ⟩ − (⟨e′i , ek+1 ⟩/⟨e′i , e′i ⟩) ⟨e′i , e′i ⟩ = ⟨ek+1 , e′i ⟩ − ⟨e′i , ek+1 ⟩ = 0

since, by the inductive hypothesis, ⟨e′i , e′j ⟩ = 0 for all 1 ≤ i, j ≤ k with i ≠ j. Thus, B′k+1 is an orthogonal basis and normalizing it we obtain an orthonormal basis for Vk+1 .

4.2.3 Normal vectors


Recall that, with respect to a frame K = (O, B), a hyperplane H is given by a linear equation
H : a1 x1 + a2 x2 + · · · + an xn = b. (4.5)
Assume now that K is orthonormal, i.e. that B = (e1 , . . . , en ) is orthonormal. Fix a point Q(q1 , . . . , qn ) in H. Since Q lies in H, it satisfies the equation (4.5), i.e. we have b = a1 q1 + a2 q2 + · · · + an qn . Having expressed the constant b in terms of Q, any other point P (p1 , . . . , pn ) is a solution to (4.5) if and only if

a1 (p1 − q1 ) + a2 (p2 − q2 ) + · · · + an (pn − qn ) = 0.

Therefore, if we denote by n the vector with components (a1 , . . . , an ) then

P ∈ H if and only if ⟨n, −−→QP ⟩ = 0, equivalently, if and only if n ⊥ −−→QP .

In other words, the vector n of coefficients in (4.5) is orthogonal to any vector parallel to H, and we say that it is orthogonal to H.


Definition 4.19. Let H be a hyperplane of En . A vector v is called a normal vector of H if it is orthogonal to H.
Example 4.20. Hyperplanes in E2 are lines. The line with equation

ℓ : x + 3y − 3 = 0,

relative to some orthonormal coordinate system, admits v(1, 3) as normal vector.


Example 4.21. Hyperplanes in E3 are planes. The plane with equation

π : 2x − y + (1/3)z + 7 = 0,

relative to some orthonormal coordinate system, admits v(6, −3, 1) as normal vector.
Proposition 4.22 (Hesse normal form). Let K = (O, B) be an orthonormal frame with B = (e1 , . . . , en ).
Any hyperplane H has, up to sign, a unique normal vector n of length 1. The components of n are
(cos(θ1 ), cos(θ2 ), . . . , cos(θn )) where θi is the angle ∡(n, ei ). Consequently, relative to K any hyper-
plane H has a unique equation of the form

H : cos(θ1 )x1 + cos(θ2 )x2 + · · · + cos(θn )xn = c. (4.6)

for some c > 0. Moreover, the distance d(O, H) from the origin to the hyperplane equals c.
Proof. Let H be a hyperplane with equation (4.5). A normal vector of H is n(a1 , . . . , an ). By the discussion in Section 4.1.1, the normalized vector ñ = n/|n| has components (cos(θ1 ), . . . , cos(θn )) for some angles θ1 , . . . , θn ∈ [0, π). Thus, with c = b/|n| the equation of H has the form (4.6). If c < 0 replace each θi by π − θi to change the sign of c (this is equivalent to changing the sign of ñ). The last claim follows from the distance formula (4.8) deduced below in Proposition 4.25.
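The reduction to the form (4.6) can be sketched in code; the helper name hesse_normal_form is hypothetical:

```python
import numpy as np

def hesse_normal_form(a, b):
    """Rewrite a1 x1 + ... + an xn = b as cos(θ1)x1 + ... + cos(θn)xn = c with c >= 0.
    Returns the unit normal (the direction cosines) and c = d(O, H)."""
    a = np.asarray(a, float)
    n_unit = a / np.linalg.norm(a)
    c = b / np.linalg.norm(a)
    if c < 0:                    # flip the sign of the unit normal if needed
        n_unit, c = -n_unit, -c
    return n_unit, c

# The line of Example 4.20:  x + 3y = 3
n_unit, c = hesse_normal_form([1.0, 3.0], 3.0)
print(n_unit, c)                 # c is the distance from the origin to the line

assert np.isclose(np.linalg.norm(n_unit), 1.0)
```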

4.2.4 Angles between lines and hyperplanes


The scalar product offers an efficient way of calculating angles with respect to an orthonormal basis
due to (4.3). In Section 4.1 we defined the sine and cosine of angles. The familiar properties of these
functions are derived in Appendix I. Notice however that due to Proposition 4.2 the cosine is enough
to distinguish between (unoriented) angles.
Let ℓ1 and ℓ2 be two lines in E2 . They define two angles: if v1 is a direction vector for ℓ1 and if v2 is
a direction vector for ℓ2 then the two angles described by ℓ1 and ℓ2 are ∡(v1 , v2 ) and ∡(−v1 , v2 ). They


are supplementary angles so if you know one of them you know the other one. We may calculate this with the scalar product since

cos ∡(v1 , v2 ) = ⟨v1 , v2 ⟩ / ( |v1 | · |v2 | ).


Notice also that the two angles can be described with normal vectors: if n1 and n2 are normal vectors for ℓ1 and ℓ2 respectively, then the two angles between ℓ1 and ℓ2 are ∡(n1 , n2 ) and ∡(−n1 , n2 ). So, if these vectors are known we may calculate

cos ∡(n1 , n2 ) = ⟨n1 , n2 ⟩ / ( |n1 | · |n2 | ).

On the other hand, if we know a direction vector v1 for the first line and a normal vector n2 for the second line then the acute angle between ℓ1 and ℓ2 is

π/2 − arccos( ⟨v1 , n2 ⟩ / ( |v1 | · |n2 | ) ) ∈ [0, π/2).
This generalizes in three ways. In En consider two lines ℓ1 and ℓ2 with direction vectors v1 and v2
respectively, as well as two hyperplanes H1 and H2 with normal vectors n1 and n2 respectively.
1. ℓ1 and ℓ2 define two supplementary angles: ∡(v1 , v2 ) and ∡(−v1 , v2 ) which can be calculated
with
cos ∡(v1 , v2 ) = ⟨v1 , v2 ⟩ / (|v1 | · |v2 |).
2. H1 and H2 define two supplementary angles: ∡(n1 , n2 ) and ∡(−n1 , n2 ) which can be calculated
with
cos ∡(n1 , n2 ) = ⟨n1 , n2 ⟩ / (|n1 | · |n2 |).
3. ℓ1 and H1 define two supplementary angles: if cos ∡(v1 , n1 ) ≥ 0 then ∡(v1 , n1 ) is acute and the
acute angle between ℓ1 and H1 is
π/2 − arccos( ⟨v1 , n1 ⟩ / (|v1 | · |n1 |) ).
Else, if cos ∡(v1 , n1 ) < 0, replace n1 with the normal vector −n1 of H1 .
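The three cosine formulas above translate directly into code. The following Python sketch (the function names are ours, not from the text) computes unoriented angles via cos ∡(u, v) = ⟨u, v⟩/(|u| · |v|), and the acute angle between a line and a hyperplane; the absolute value plays the role of replacing n1 by −n1 when the cosine is negative.

```python
import math

def angle(u, v):
    """Unoriented angle between two non-zero vectors, from cos = <u,v> / (|u||v|)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))

def line_hyperplane_angle(v, n):
    """Acute angle between a line with direction v and a hyperplane with normal n."""
    return abs(math.pi / 2 - angle(v, n))

# In E2: the x-axis and the line with direction (1, 1) meet at 45 degrees.
theta = angle((1.0, 0.0), (1.0, 1.0))
```

Both helpers are dimension-agnostic: the same code works for vectors of E2, E3 or En.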


The angles between (hyper)planes in E3 are referred to as dihedral angles. Two planes, π1 and π2 ,
define four dihedral angles which are the four regions in which the two planes divide E3 . More
precisely, let ℓ be the line π1 ∩ π2 , choose a plane π orthogonal to ℓ and consider the lines ℓ1 = π ∩ π1
and ℓ2 = π ∩ π2 . The angles between ℓ1 and ℓ2 do not depend on the choice of π, i.e. if we choose a
different plane orthogonal to ℓ we obtain congruent angles. So, up to congruence we have two angles
and they can be calculated using normal vectors as indicated above.

4.3 Distance
Definition 4.23. The distance between two points P , Q in En , denoted d(P , Q), is the length of the
segment [P Q] and we calculate it with (4.4). The distance between two sets of points S1 and S2 is
d(S1 , S2 ) := inf { d(P , Q) : P ∈ S1 and Q ∈ S2 }. (4.7)

4.3.1 Distance from a point to a hyperplane


Proposition 4.24. Consider a hyperplane H and a point P not in H. Drop a perpendicular line from
P to H and let P ′ be the point in which the line intersects the hyperplane. Then

d(P , H) = |P P ′ |.

Proof. For any other point Q in H, distinct from P ′ , we have a right-angled triangle P P ′ Q with the
right angle at P ′ . Since the hypotenuse is longer than either cathetus, we have

|P P ′ | < |P Q|

for any point Q ∈ H distinct from P ′ . Hence the infimum in (4.7) is attained at P ′ , so d(P , H) = |P P ′ |.

Proposition 4.25. Let K be a frame of En . Consider the point P (p1 , . . . , pn ) and a hyperplane H :
a1 x1 + a2 x2 + · · · + an xn = b. Then

d(P , H) = |a1 p1 + a2 p2 + · · · + an pn − b| / √(a1² + a2² + · · · + an²). (4.8)


Proof. Drop a perpendicular line from P to H, intersecting H in P ′ . By Proposition 4.24, we have
d(P , H) = |P P ′ |. Now let Q be a point in H, distinct from P ′ , and consider the normal vector
n = n(a1 , . . . , an ). Then |P P ′ | is the length of the orthogonal projection of QP→ on n. To see this, let N
be such that n = QN→ and look at the quadrilateral QP ′ P N . We have

d(P , H) = | (⟨n, QP→⟩/|n|²) · n | = |⟨n, QP→⟩| / |n| = |⟨n, OP→ − OQ→⟩| / |n| = |⟨n, OP→⟩ − b| / |n|.

If we write this explicitly in coordinates we obtain the claim.
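Formula (4.8) is straightforward to evaluate in code. The following is a minimal Python sketch (the function name is ours, not from the text):

```python
import math

def dist_point_hyperplane(p, a, b):
    """Distance (4.8) from P(p1, ..., pn) to the hyperplane a1*x1 + ... + an*xn = b."""
    numerator = abs(sum(ai * pi for ai, pi in zip(a, p)) - b)
    denominator = math.sqrt(sum(ai * ai for ai in a))
    return numerator / denominator

# In E2: the distance from the origin to the line x + y = 2 is sqrt(2).
d = dist_point_hyperplane((0.0, 0.0), (1.0, 1.0), 2.0)
```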


Proposition 4.26. Let S be an affine subspace of En parallel to a hyperplane H. Then


d(S, H) = d(P , H)
for any P in S.
Proof. Notice that a point is parallel to any affine subspace (this follows from Definition 3.5), so the
claim makes sense for any affine subspace S parallel to H. Take two points A and B in S and from
each of them drop a perpendicular on H, intersecting the hyperplane in M and N respectively. Since
the sides AM and BN are parallel to a normal vector n of H, the quadrilateral ABN M is planar (lies
in a plane). But then AB has to be parallel to MN : otherwise the two lines would intersect in a point
lying in S ∩ H, contradicting S∥H. Since we have right angles, ABN M is in fact a rectangle. This
shows that d(A, H) = d(B, H). Thus, the distance from S to H is the distance from any point A ∈ S to H:
d(S, H) = d(A, H) ∀A ∈ S.

4.3.2 Loci of points equidistant from affine subspaces


Definition 4.27. Let S be a set of points in En . For c ∈ R, the set of points at distance c from S is
L(S, c) = { P ∈ En : d(P , S) = c }.
Proposition 4.28. Let S be an affine subspace of En and let c > 0 be a constant. Table 4.1 classifies
the possible shapes of L(S, c).


 n | a point    | a line            | a plane
---+------------+-------------------+-----------
 1 | two points | -                 | -
 2 | circle     | two lines         | -
 3 | sphere     | circular cylinder | two planes
Table 4.1: Loci of points at constant distance from an affine subspace.

Proof. We notice first that if c = 0 then L(S, c) is the set S itself by definition, (4.7). Moreover, if
c < 0 then L(S, c) is empty since distances are non-negative. The non-trivial cases appear when c > 0.
If S consists of a single point, in dimension n = 1, then L(S, c) consists of two points, namely the
endpoints of a segment with midpoint S. We can discuss the diagonal entries in Table 4.1 together by
considering hyperplanes since, in dimension 1, these are points, in dimension 2, these are lines and,
in dimension 3, these are planes. Let the hyperplane S be given by the equation a1 x1 +a2 x2 +· · ·+an xn =
b. By (4.8), a point Q(q1 , . . . , qn ) belongs to L(S, c) if and only if

d(Q, S) = c ⇔ |a1 q1 + a2 q2 + · · · + an qn − b| = c′

where c′ = c · √(a1² + a2² + · · · + an²). Thus, L(S, c) is the union of two hyperplanes with equations

a1 x1 + a2 x2 + · · · + an xn = b + c′ and a1 x1 + a2 x2 + · · · + an xn = b − c′.

Consider the case where S is a line in E3 . We notice that in dimension 3 we don’t yet have a
formula for the distance between a point and a line. Such a formula will be deduced in Section 5.2.2
and can be used to deduce equations for L(S, c) if S is an arbitrary line. However, we have other
options as well, for instance we can ‘scan’ the three dimensional space with planes passing through S
and notice that in each such plane we obtain two lines at equal distance from S. Yet another option,
since we are only interested in the possible shapes of L(S, c), is to make a good choice of a frame.
Choose the frame K such that S is the z-axis. An arbitrary point in S is P (0, 0, t) with t ∈ R. Then, for
a point Q(xQ , yQ , zQ ), the distance to S is attained at the point P (0, 0, zQ ), so

d(Q, S) = c ⇔ xQ² + yQ² = c².

Thus, this locus of points has equation L(S, c) : x² + y² = c², which describes a cylinder with axis S.
Lastly, in any dimension the set of points at distance c from a point Q(q1 , . . . , qn ) is the (hy-
per)sphere of radius c centered in Q, since

d(P , Q) = c ⇔ (p1 − q1 )² + (p2 − q2 )² + · · · + (pn − qn )² = c².

Definition 4.29. If S ′ is another set, the locus of points equidistant from S and S ′ is
L(S, S ′ ) = { P ∈ En : d(P , S) = d(P , S ′ ) }.

Proposition 4.30. Let S and S ′ be two affine subspaces of En . Table 4.2 classifies the possible shapes
of L(S, S ′ ).


 n | 2 points | point + line       | point + plane | 2 lines               | line + plane               | 2 planes
---+----------+--------------------+---------------+-----------------------+----------------------------+---------------------
 1 | point    | -                  | -             | -                     | -                          | -
 2 | line     | parabola           | -             | line/lines            | -                          | -
 3 | plane    | parabolic cylinder | paraboloid    | planes or hyperboloid | cone or parabolic cylinder | plane or two planes
Table 4.2: Loci of points equidistant from two distinct affine subspaces.

Proof. Consider the first column in Table 4.2. Let A(a1 , . . . , an ) and B(b1 , . . . , bn ) be two fixed points.
A point P (p1 , . . . , pn ) is equidistant from A and B if and only if d(A, P ) = d(P , B). In coordinates this
gives the equation
(a1 − p1 )² + · · · + (an − pn )² = (b1 − p1 )² + · · · + (bn − pn )²,

which is equivalent to P satisfying the equation

(b1 − a1 )x1 + · · · + (bn − an )xn + ½(a1² − b1² + · · · + an² − bn²) = 0,

which is the equation of a hyperplane with normal vector AB→. Moreover, it is easy to check that
the hyperplane contains the midpoint of the segment [AB]. It is called the perpendicular bisecting
hyperplane of the segment [AB]. In dimension 2 it is the perpendicular bisector of the segment [AB].
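The computation above is easy to check numerically. A small Python sketch (our own helper, using the sign convention in which the normal vector is AB→) builds the perpendicular bisecting hyperplane and verifies that the midpoint of [AB] lies on it:

```python
def perpendicular_bisector(A, B):
    """Coefficients (c, c0) of the hyperplane c1*x1 + ... + cn*xn + c0 = 0
    equidistant from A and B; the normal vector is AB."""
    c = [b - a for a, b in zip(A, B)]
    c0 = sum(a * a - b * b for a, b in zip(A, B)) / 2
    return c, c0

A, B = (0.0, 0.0), (2.0, 2.0)
c, c0 = perpendicular_bisector(A, B)
midpoint = tuple((a + b) / 2 for a, b in zip(A, B))
# The midpoint must satisfy the hyperplane equation, i.e. residual == 0.
residual = sum(ci * xi for ci, xi in zip(c, midpoint)) + c0
```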
For the first diagonal in Table 4.2, consider a hyperplane H and a point Q outside H. We choose
the frame K such that Q is on the positive part of the last coordinate axis, such that the last coordinate
axis is orthogonal to H and such that the origin O is at the same distance from H and Q. Moreover,
possibly changing the unit segment, we may assume that Q has coordinates (0, . . . , 0, 1) and H has
equation xn + 1 = 0. Then, a point P is equidistant from H and Q if and only if

d(P , H) = d(P , Q) ⇔ |pn + 1| = √(p1² + · · · + pn−1² + (pn − 1)²)

which is equivalent to P satisfying the equation

x1² + · · · + xn−1² = 4xn

which is the equation of a hyperparaboloid. In dimension 1 this is a point (the origin, 0). In dimension 2 it is a parabola and in dimension 3 a paraboloid (see Chapter 11).
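In the frame chosen above with n = 2, the locus is the parabola x² = 4y with focus Q(0, 1) and directrix y = −1. A quick Python check (a sketch with our own helper) confirms the equidistance for sample points:

```python
import math

Q = (0.0, 1.0)  # focus; the directrix is the line y = -1

def equidistant(P, tol=1e-12):
    """Is P equidistant from Q and from the line y = -1?"""
    d_focus = math.dist(P, Q)
    d_directrix = abs(P[1] + 1.0)
    return abs(d_focus - d_directrix) < tol

# Every point (t, t^2/4) satisfies x^2 = 4y, hence lies on the locus.
on_parabola = all(equidistant((t, t * t / 4)) for t in (-3.0, -1.0, 0.0, 0.5, 2.0))
```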
Next, consider the case of two hyperplanes

H : a1 x1 + · · · + an xn + an+1 = 0 and H′ : b1 x1 + · · · + bn xn + bn+1 = 0.

We may assume that the normal vectors, which can be read off from the equations, are unit vectors. A
point P (p1 , . . . , pn ) is at the same distance from H and H′ if

d(P , H) = d(P , H′ ) ⇔ |a1 p1 + · · · + an pn + an+1 | = |b1 p1 + · · · + bn pn + bn+1 |.

This translates to two equations

(a1 − b1 )x1 + · · · + (an − bn )xn + (an+1 − bn+1 ) = 0 and (a1 + b1 )x1 + · · · + (an + bn )xn + (an+1 + bn+1 ) = 0


and we notice that if the hyperplanes are parallel then only one of the equations has solutions (the
hyperplane lying at half-distance between H and H′ ). In dimension 1 this is the midpoint of the
segment [HH′ ]. Assume that H and H′ are not parallel. In dimension 2 these are the angle bisectors
of the angles described by the two lines H and H′ . In dimension 3 these are bisecting planes of the
dihedral angles between H and H′ and in general these are angle bisecting hyperplanes.
Next consider the case of a point S = {Q} and a line S ′ in E3 . We choose a frame K such that the
point Q has coordinates (1, 0, 0) and such that the line S ′ contains the point (−1, 0, 0) and has direction
vector j(0, 1, 0). We anticipate and use the distance formula from a point to a line in dimension 3
(Section 5.2.2). Notice however that because of the choice of the frame, the distance formula can be
easily deduced in this case. Let A(−1, 0, 0) be the chosen point of S ′ . For a point P , we have

d(P , S) = d(P , S ′ ) ⇔ |QP→| = |AP→ × j| ⇔ (xP − 1)² + yP² + zP² = zP² + (xP + 1)² ⇔ yP² = 4xP

which is the equation of a parabolic cylinder.


Next consider the case of two lines in E3 . Assume first that the lines intersect in one point. We
assume that the two lines S and S ′ are the x-axis and the y-axis respectively. Then, for a point P , we
have

d(P , S) = d(P , S ′ ) ⇔ |OP→ × i| = |OP→ × j| ⇔ yP² + zP² = zP² + xP² ⇔ (xP − yP )(xP + yP ) = 0.

These are two planes orthogonal to the plane containing the two lines and which bisect the angles
formed by the two lines.
Assume now that the lines are skew. We choose a frame such that S is the x-axis and S ′ contains
the point Q(0, 1, 0) and has v(λ, 0, 1) as direction vector. Then, for a point P , we have

d(P , S) = d(P , S ′ ) ⇔ |OP→ × i| = |QP→ × v| / |v| ⇔ (λ² + 1)(yP² + zP²) = (xP − λzP )² + (λ² + 1)(yP − 1)²

which corresponds to a one-sheeted hyperboloid (see Chapter ??). In the last subcase, where the two
lines S and S ′ are parallel, one checks that L(S, S ′ ) is a plane.
Finally, the last case that we need to consider is that of a line S and a plane S ′ in E3 . With a
similar argument we see that we obtain an elliptic cone if the line punctures the plane, and a parabolic
cylinder if the line and the plane are parallel.

4.3.3 Convergence
Once we have a clear notion of distance, the notion of proximity to a point also becomes clear. Being
close to a point P means lying in a ball of radius ε centered at the point P and you may adjust ε at
will. The set of open balls centered at all points defines a topology on En , called the standard topology.
Recall Chapter 7 of your Analysis course [16]. All the results from your analysis course hold true for
En by fixing an orthonormal frame which identifies En with Rn .

CHAPTER 5

Area and volume

Contents
5.1 Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.1.1 Area of polygons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.1.2 Oriented area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.2 Cross product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.2.1 Algebraic identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.2.2 Distance between a point and a line . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.2.3 Common perpendicular line of two skew lines . . . . . . . . . . . . . . . . . . 77
5.2.4 Distance between two skew lines . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.3 Volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.3.1 Volume of polyhedra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.3.2 Oriented volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.3.3 Hypervolume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84


5.1 Area
5.1.1 Area of polygons
Definition 5.1. A polygonal segment A1 A2 . . . An is a union of segments [A1 A2 ], . . . , [An−1 An ], such that
consecutive segments [Ai Ai+1 ] and [Ai+1 Ai+2 ] intersect only in their shared vertex Ai+1 . If all the
vertices lie in the same plane, the polygonal segment is said to be planar. The polygonal segment
A1 A2 . . . An is said to intersect itself in a point P if there exist indices i and j with i + 1 < j, such that
[Ai Ai+1 ] intersects [Aj Aj+1 ] at P .


A polygon is an n-tuple of points (A1 , A2 , . . . , An ), denoted A1 A2 . . . An , such that the polygonal


segment A1 A2 . . . An A1 intersects itself only at the vertex A1 . The segments [Ai Ai+1 ] are called the
sides of the polygon. Two polygons are said to be congruent if their corresponding sides and the angles
between them are pairwise congruent. Two polygons P1 = A1 A2 . . . An and P2 = B1 B2 . . . Bn are said
to be similar if their corresponding angles are congruent and the lengths of corresponding sides are
proportional. In this case, the ratio of the lengths of corresponding sides is constant, and we denote
this ratio by P1 : P2 .

We will only consider planar polygons. A planar polygon separates the points of the plane not ly-
ing on its sides into two regions (see [14, Theorem 9]): the interior and the exterior. These regions have
the following property: if A is a point of the interior (an inner point) and B is a point of the exterior
(an exterior point), then every polygonal segment lying in the plane of the polygon and connecting A
with B must intersect the sides of the polygon at least once.
The interior S of a (planar) polygon is bounded in the sense that, for any point P in the plane,
there is an r > 0 such that all points in S are at distance at most r from P . Indeed, take r to be the
maximum of the distances from P to all vertices. Our goal is to measure such bounded regions of
the plane - namely, interiors of polygons. We do this by means of triangles, which are the simplest


possible polygons. Any polygon can be subdivided into triangles (see for example [3, Theorem 3.1]),
for instance with the so-called ear clipping method. Such a subdivision partitions the interior of a
polygon into interiors of triangles, if we are willing to disregard segments. Given all this, we may
measure the interior of polygons by comparing them with the interior of a unit square.


Definition 5.2. For brevity, we use the term area of a polygon to mean the area of the interior of a
polygon. To define area, we consider two polygons to be disjoint if their interiors are disjoint. Denote
by P the set of all finite unions of polygons. We define an area function Area : P → R≥0 through the
following properties:

(A1) If S is a square of side length 1 then Area(S) = 1.

(A2) If S1 and S2 are similar polygons with ratio x then Area(S1 ) = x2 Area(S2 ).

(A3) If S1 and S2 are disjoint polygons then Area(S1 ∪ S2 ) = Area(S1 ) + Area(S2 ).

Remark. The above definition captures basic principles of the intuitive notion of area. We note that
in property (A2), the term ‘similar polygons’ can be replaced with ‘congruent polygons’, resulting in
a weaker assumption. Furthermore, in property (A3) the finite union can be extended to a countable
union, in which case the sum is replaced by a countable sum (see Section 5.3.3). Our definition of
area allows us to deduce the following well-known facts.

Proposition 5.3. We have

1. The area of a rectangle of sides a and b is ab.

2. The area of a triangle ABC with side length a and corresponding height h is ah/2.

3. The area of a parallelogram ABCD is twice the area of the triangle ABC.

Proof. Denote by Sa a square of side length a and by Ra,b a rectangle of sides length a and b. First,
notice that since the ratio Sa : S1 equals a, we have Area(Sa ) = a2 Area(S1 ) = a2 , by (A2) and (A1).


1. Place Sa and Sb in opposite corners of Sa+b . The complement of the two small squares in the
big square consists of two disjoint rectangles which are congruent to Ra,b . Thus, by (A3) we have

Area(Sa+b ) = Area(Sa ∪ Ra,b ∪ Ra,b ∪ Sb ) = Area(Sa ) + Area(Sb ) + 2 Area(Ra,b )

hence
(a + b)2 = a2 + b2 + 2 Area(Ra,b )
and therefore Area(Ra,b ) = ab.


For claim 3, it suffices to notice that a diagonal divides a parallelogram in two congruent trian-
gles. For claim 2, we use claim 3 and check the known area formula for a parallelogram ABCD, i.e.
Area(ABCD) = ah where h is a height and a the length of the corresponding side. Let H be a point on
AB such that h = |DH|. The triangle ADH can be moved on the opposite side of the parallelogram to
form a rectangle with side lengths a and h. Thus, by claim 1, Area(ABCD) = Area(Ra,h ) = ah.

Orthonormal frames offer an efficient method for calculating the area of a parallelogram, and
therefore, the areas of triangles and polygons.


Proposition 5.4. Suppose the vertices A(xA , yA ), B(xB , yB ) and D(xD , yD ) of the parallelogram ABCD
are given with respect to an orthonormal frame. Then,

Area(ABCD) = | det((xA , yA , 1), (xB , yB , 1), (xD , yD , 1)) | = | det((vx , vy ), (wx , wy )) | = |vx wy − vy wx |,

with the determinants written row by row, where v(vx , vy ) = AB→ and w(wx , wy ) = AD→.


Proof. Drop a perpendicular line from D on AB and denote by H its intersection with AB. By
Proposition 5.3, we have

Area(ABCD) = |AB→| · |HD→| = |v| · | prJ(v) (w)| = |v| · |⟨J(v), w⟩| / |J(v)| = |⟨J(v), w⟩| = |vx wy − vy wx |.

Moreover, using properties of determinants we obtain

det((vx , vy ), (wx , wy )) = det((xB − xA , yB − yA ), (xD − xA , yD − yA ))
= det((xA , yA , 1), (xB − xA , yB − yA , 0), (xD − xA , yD − yA , 0))
= det((xA , yA , 1), (xB , yB , 1), (xD , yD , 1)).
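The determinant formula of Proposition 5.4 can be evaluated mechanically. Here is a short Python sketch (the function name is ours):

```python
def parallelogram_area(A, B, D):
    """Area of the parallelogram ABCD from the vertices A, B, D given in an
    orthonormal frame, via |vx*wy - vy*wx| with v = AB and w = AD."""
    vx, vy = B[0] - A[0], B[1] - A[1]
    wx, wy = D[0] - A[0], D[1] - A[1]
    return abs(vx * wy - vy * wx)

# Unit square A(0,0), B(1,0), D(0,1) has area 1.
area = parallelogram_area((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
```

Halving the result gives the area of the triangle ABD.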

5.1.2 Oriented area


Fix an orientation in E2 , i.e. fix a right-oriented basis B. Proposition 5.4 shows that, up to sign, the
area of a parallelogram ABDC is given by the determinant of the base change matrix MB,B ′ , where
B ′ = (AB→, AC→). The value det(MB,B ′ ) does not depend on the orthonormal basis B in which the
vectors are expressed. The proof of this claim is the same in any dimension (see Proposition 5.21).
Definition 5.5. With the above notation, the box product of v and w is det(MB,B ′ ) and we denote it by
[v, w]. The oriented area of the parallelogram ABDC spanned by v = AB→ and w = AC→ is

Areaor (ABDC) = [v, w].

Similarly, the oriented area of the triangle ABC is Areaor (ABC) = Areaor (ABDC)/2.
Proposition 5.6. Let v = AB→ and w = AC→ be two vectors with ABC a triangle. Then

sin ∡or (v, w) = [v, w] / (|v| · |w|) = 2 · Areaor (ABC) / (|AB| · |AC|),

where [v, w] = det(MB,B ′ ) with B ′ = (v, w).
Proof. This is a direct consequence of the definition of the sine function for oriented angles and of
the definition of oriented area of a triangle.

5.2 Cross product


The cross product is the 3-dimensional analogue of the operator J introduced in Section 4.1.2.
Throughout this section, we consider E3 . For two non-zero vectors a and b, the orthogonal complements
V⊥a and V⊥b are 2-dimensional vector subspaces of V3 . If a and b are linearly independent, then
V⊥a ∩ V⊥b is a 1-dimensional vector subspace of V3 . In other words, up to scalar multiple, there is a
unique vector c perpendicular to both a and b. When prescribing the length of c, we have exactly two
options: ±c. For these options, (a, b, c) is either left- or right-oriented. By fixing such an orientation,
the choice of c becomes unique. This brings us to the following definition.


Definition 5.7. Let a, b ∈ V3 be two vectors. The cross product (or vector product) of a and b, denoted
a × b, is the vector defined by the following properties:
1. if a and b are parallel then a × b = 0.

2. if a and b are not parallel, then

(a) |a × b| equals the area of a parallelogram spanned by a and b,


(b) a × b ⊥ a and a × b ⊥ b,
(c) (a, b, a × b) is a right oriented basis of V3 .

In particular, the vectors a and b are parallel if and only if a × b = 0.


The above geometric definition can be translated into a more accurate algebraic description as
follows. Fix a non-zero vector a. Consider the orthogonal complement of a, i.e. V⊥a = {v ∈ V3 :
⟨v, a⟩ = 0}. It is a 2-dimensional vector subspace of V3 . For any vector v ∈ V3 , we have a unique
decomposition
v = v∥ + v⊥ with v∥ parallel to a and v⊥ orthogonal to a.
This gives a projection map Pr⊥a : V3 → V⊥a defined by Pr⊥a (v) = v⊥ . Let J⊥a (v⊥ ) denote the unique
vector in V⊥a of length |v⊥ | and such that (v∥ , v⊥ , J⊥a (v⊥ )) is right-oriented.


Proposition 5.8. With the above notation, for any two non-zero vectors a, b ∈ V3 we have

a × b = |a| · J⊥a (Pr⊥a (b)).

Proof. By construction, we see that J⊥a (Pr⊥a (b)) is orthogonal to both a and b. Furthermore, by the
definition of the operator J⊥a , the vector J⊥a (Pr⊥a (b)) is a positive scalar multiple of a × b. Therefore,
all we need to show is that the vector on the right-hand side has length |a × b|. We have

|a × b| = |a| · |b| · sin ∡(a, b) = |a| · | Pr⊥a (b)| = |a| · |J⊥a (Pr⊥a (b))|

where the last equality follows from the fact that J⊥a is a rotation by a right angle - in particular it
does not change the length of vectors.

Proposition 5.9. The cross product □ × □ : V3 × V3 → V3 satisfies the following properties.


(CP1) It is bilinear, i.e. for all a, b ∈ R and all v, w, u ∈ V3 we have

(av + bw) × u = a(v × u) + b(w × u) and v × (aw + bu) = a(v × w) + b(v × u).

(CP2) It is skew-symmetric, i.e. for all v, w ∈ V3 we have

v × w = −w × v.

Proof. Skew-symmetry follows from the orientation requirement in the definition of the cross prod-
uct. Indeed, if (a, b, v) is right-oriented then (b, a, v) is left-oriented, hence (b, a, −v) is right-oriented.
To check bilinearity, we need to verify that the cross product is linear in each argument. If the first
argument is fixed, the map a × □ : V3 → V3 is a composition of linear maps (by Proposition 5.8), so it
is linear. For the second argument we may use the skew-symmetry of the cross product.

Since the cross product is bilinear, its values are determined by the values on a basis. If B = (i, j, k)
is a right-oriented orthonormal basis, one can check with the definition of the cross product that the
values on the basis vectors are:
×    i    j    k
i    0    k   −j
j   −k    0    i
k    j   −i    0
This table allows us to calculate the cross product for arbitrary vectors a(a1 , a2 , a3 ) and b(b1 , b2 , b3 ),
with components relative to B:

(a1 i + a2 j + a3 k) × (b1 i + b2 j + b3 k) = (a2 b3 − a3 b2 )i + (a3 b1 − a1 b3 )j + (a1 b2 − a2 b1 )k
= det((i, j, k), (a1 , a2 , a3 ), (b1 , b2 , b3 )),

with the determinant written row by row.

We can therefore derive, for example, a formula for the area of a parallelogram P spanned by a and b:

Area(P ) = |a × b| = √( (a2 b3 − a3 b2 )² + (a1 b3 − a3 b1 )² + (a1 b2 − a2 b1 )² ).
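The coordinate formulas above translate directly into code. The following Python sketch (our own helpers) implements the cross product with respect to a right-oriented orthonormal basis, and the resulting area formula:

```python
import math

def cross(a, b):
    """Cross product of a(a1,a2,a3) and b(b1,b2,b3), components taken
    with respect to a right-oriented orthonormal basis."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def parallelogram_area(a, b):
    """Area of the parallelogram spanned by a and b equals |a x b|."""
    return math.sqrt(sum(c * c for c in cross(a, b)))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
```

Evaluating cross on the basis vectors reproduces the multiplication table above.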


5.2.1 Algebraic identities


When calculating consecutive cross products, we notice that the cross product is not associative. For
example, if (i, j, k) is a right oriented orthonormal basis, then:

(i × j) × j = k × j = −i whereas i × (j × j) = 0.

However, there is a rule which explains how iterated cross products behave when evaluated in dif-
ferent ways. This rule is the Jacobi identity:

(a × b) × c + (b × c) × a + (c × a) × b = 0 for all a, b, c ∈ V3 . (5.1)

One way to prove this identity is to write out the above expression in coordinates and check that the
left-hand side simplifies to the zero vector. The geometric interpretation of the identity is as follows:
given a tetrahedron OABC, the three planes passing through the edges adjacent to O and orthogonal
to the respective opposite faces intersect in one line.

A short way to prove (5.1) is by using the double cross formula. This formula is of independent
interest, as it provides an efficient way of calculating iterated cross products. Specifically, it replaces
the calculation of a determinant with the calculation of two scalar products. Other beautiful identi-
ties, which can be derived with linear algebra only, can be found in the exercises and in [11, Chapter
4].
Theorem 5.10 (Double Cross Formula). For any vectors a, b and c we have

(a × b) × c = ⟨a, c⟩ · b − ⟨b, c⟩ · a. (5.2)

Proof. Notice that (5.2) can be checked directly in coordinates. A different proof is provided in [11,
p.74] or in [5, p.70]. Our proof has three steps. First, note that we may assume all three vectors to be
non-zero; otherwise, it is straightforward to check that both sides of the equality are zero.


(Step 1) It suffices to prove (5.2) in the special case where all three vectors are unit vectors.
Indeed, assuming this special case, by the linearity of the cross product (Proposition 5.9) and the
linearity of the scalar product we obtain:

(a × b) × c = |a| · |b| · |c| · ((a/|a|) × (b/|b|)) × (c/|c|)
= |a| · |b| · |c| · ( ⟨a/|a|, c/|c|⟩ · b/|b| − ⟨b/|b|, c/|c|⟩ · a/|a| )
= ⟨a, c⟩ · b − ⟨b, c⟩ · a.

(Step 2) It suffices to prove (5.2) in the special cases where c = a and c = b. Indeed, since a
and b are non-parallel unit vectors, (a, b, a × b) is a basis and c = αa + βb + γ a × b for some scalars
α, β, γ ∈ R. Then, using the linearity of the cross product (Proposition 5.9) and the properties of the
scalar product, we have:

(a × b) × c = (a × b) × (αa + βb + γ a × b)
= α (a × b) × a + β (a × b) × b
= α ( ⟨a, a⟩b − ⟨b, a⟩a ) + β ( ⟨a, b⟩b − ⟨b, b⟩a )
= −( α⟨b, a⟩ + β⟨b, b⟩ ) a + ( α⟨a, a⟩ + β⟨a, b⟩ ) b
= ⟨−αa − βb, b⟩a + ⟨αa + βb, a⟩b
= ⟨−αa − βb − γ a × b, b⟩a + ⟨αa + βb + γ a × b, a⟩b
= ⟨a, c⟩b − ⟨b, c⟩a.

(Step 3) It remains to show that (5.2) holds in the special cases where c = a and c = b, assuming
a and b are unit vectors. Using the skew-symmetry of the cross product, it is easy to see that the
two cases are equivalent. By Step 1, we may also assume that all three vectors are unit vectors. In
particular ⟨a, a⟩ = 1 and ⟨b, a⟩ = cos θ, where θ = ∡(a, b). Thus, it suffices to prove the identity:

(a × b) × a = b − cos θ · a.

Let v denote the left-hand side and w the right-hand side. Notice that Bv = (a × b, a, v) is a right-
oriented orthogonal basis. Since there is a unique vector of length |v| with this property, it suffices to
show that Bw = (a × b, a, w) is a right-oriented orthogonal basis and that |v| = |w|. The first property
can be verified by calculating the determinant of the base-change matrix to the basis B = (a, b, a × b),
which is also right-oriented:

det(MB,Bw ) = det((0, 1, −cos θ), (0, 0, 1), (1, 0, 0)) = 1 > 0,

the determinant written row by row.

Consequently, Bw is right-oriented. Furthermore, because ⟨a, w⟩ = ⟨a, b⟩ − cos θ⟨a, a⟩ = 0, it follows that
(a × b, a, w) is an orthogonal basis (the remaining orthogonality requirements are straightforward to
verify). It remains to check the length of w. Since the vectors a and b are unit vectors, and because
∡(a × b, a) is a right angle, we have

|v|² = |a × b|² = (sin θ)² = 1 − 2(cos θ)² + (cos θ)² = ⟨b − ⟨b, a⟩ · a, b − ⟨b, a⟩ · a⟩ = |w|².

Thus, Bw equals Bv , hence v = w. This completes the proof of Step 3 and of the theorem.
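Both the double cross formula (5.2) and the Jacobi identity (5.1) can be sanity-checked numerically on random vectors; the Python sketch below (helpers ours) does exactly that:

```python
import random

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rand_vec():
    return tuple(random.uniform(-1.0, 1.0) for _ in range(3))

random.seed(0)
ok = True
for _ in range(100):
    a, b, c = rand_vec(), rand_vec(), rand_vec()
    # Double cross formula (5.2): (a x b) x c = <a,c> b - <b,c> a
    lhs = cross(cross(a, b), c)
    rhs = tuple(dot(a, c) * bi - dot(b, c) * ai for ai, bi in zip(a, b))
    ok = ok and all(abs(x - y) < 1e-12 for x, y in zip(lhs, rhs))
    # Jacobi identity (5.1): (a x b) x c + (b x c) x a + (c x a) x b = 0
    jac = tuple(p + q + r for p, q, r in zip(cross(cross(a, b), c),
                                             cross(cross(b, c), a),
                                             cross(cross(c, a), b)))
    ok = ok and all(abs(x) < 1e-12 for x in jac)
```

This is of course no substitute for the proof; it only guards against transcription errors in the formulas.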


5.2.2 Distance between a point and a line


A line ℓ in dimension 2 is a hyperplane, and for any point P of E2 we have an efficient way of
calculating d(ℓ, P ) with respect to an orthonormal frame of E2 .
Now consider the 3-dimensional case. Let ℓ be a line in E3 and let P be a point which does not lie
on ℓ. Suppose that P has coordinates (xP , yP , zP ) with respect to an orthonormal frame K of E3 , and
that ℓ is given by a point A(xA , yA , zA ) ∈ ℓ and a direction vector v(vx , vy , vz ) with respect to K. There
is a plane π containing both ℓ and P . It is possible to calculate a frame Kπ of π, extend it to a frame K′
of E3 , translate the given coordinates and components from K to K′ , and then do the calculations in π
(with respect to Kπ ), where ℓ is a hyperplane.
A much more efficient way to calculate d(ℓ, P ) can be deduced with the cross product. Let B be
another point on ℓ such that AB→ = v and consider the parallelogram ABCP . Then d(ℓ, P ) is the height
of the parallelogram corresponding to the side [AB]. Thus

d(ℓ, P ) = Area(ABCP ) / |AB| = |AP→ × v| / |v|.


From this we immediately deduce a formula for the distance between two parallel lines in E3 . Let
ℓ ′ be a line parallel to ℓ which passes through a point B(xB , yB , zB ). We may apply Proposition 4.26 to
conclude that d(ℓ, ℓ ′ ) = d(ℓ, Q) for any point Q in ℓ ′ . Thus,

d(ℓ, ℓ ′ ) = d(ℓ, B) = |AB→ × v| / |v|.
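Both distance formulas above fit in a few lines of Python (a sketch; the helper names are ours):

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def dist_point_line(P, A, v):
    """Distance in E3 from P to the line through A with direction v: |AP x v| / |v|."""
    AP = tuple(p - a for p, a in zip(P, A))
    return norm(cross(AP, v)) / norm(v)

# Distance from P(0, 3, 4) to the x-axis is sqrt(3^2 + 4^2) = 5.
d = dist_point_line((0.0, 3.0, 4.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```

For two parallel lines, calling dist_point_line with any point B of the second line gives d(ℓ, ℓ′).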



5.2.3 Common perpendicular line of two skew lines


Consider two vectors v and w. When proving Theorem 5.10 we reduced the claim to (v × w) × v. Let
us give more insight into this expression by deducing equations for the common perpendicular line
of two skew lines. Let ℓ and ℓ ′ be two skew lines in E3 passing respectively through A and B. Let v
be the direction vector for ℓ and let w be the direction vector for ℓ ′ .

The common perpendicular line of ℓ and ℓ ′ is the unique line d which intersects the two lines and
is orthogonal to both of them, i.e. the line satisfying the following properties:

• d ⊥ ℓ and d ⊥ ℓ ′ , and

• d ∩ ℓ ≠ ∅ and d ∩ ℓ ′ ≠ ∅.

Since the two lines are skew relative to each other, the vectors v and w are linearly independent,
hence the vector v × w is non-zero and perpendicular to both v and w. Let π be the plane passing
through A and parallel to v and v × w. Notice that n = (v × w) × v is a normal vector for π and that ℓ is
included in π. Similarly, there is a plane π′ containing ℓ ′ and having normal vector n′ = (v × w) × w.
The intersection of these two planes is a line d and we claim that it is the common perpendicular of
ℓ and ℓ ′ . Since d = π ∩ π′ , a direction vector for d is
   
n × n′ = ((v × w) × v) × ((v × w) × w).

Denote v × w by a and notice that, by Theorem 5.10, we have

n × n′ = (a × v) × (a × w) = ⟨a, a × w⟩v − ⟨v, a × w⟩a,

where ⟨a, a × w⟩ = 0 since the two vectors are orthogonal; hence n × n′ = −⟨v, a × w⟩a is proportional
to a = v × w. It follows that a is a direction vector for d. But a = v × w is orthogonal to both v and w,
hence it is orthogonal to both ℓ and ℓ ′ . Moreover, since d


and ℓ lie in the plane π and have non-parallel direction vectors they necessarily intersect. Similarly,
we see that d ∩ ℓ ′ ≠ ∅. This shows that d is indeed the common perpendicular of ℓ and ℓ ′ .
Writing the equations of π and π′ in coordinates (determinants written row by row) we obtain

d = π ∩ π′ :   π : det((x − xA , y − yA , z − zA ), (vx , vy , vz ), (ax , ay , az )) = 0
               π′ : det((x − xB , y − yB , z − zB ), (wx , wy , wz ), (ax , ay , az )) = 0

where we used the coordinates of A and B with respect to an orthonormal frame K = (O, B) and the
components of v, w and a with respect to B.
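The construction of d via the two planes π and π′ is easy to check numerically. The following Python sketch uses hypothetical sample lines (my own data, not from the text) and confirms that the direction n × n′ of d is proportional to a = v × w, as shown above.

```python
def cross(u, v):
    # cross product of 3-dimensional vectors
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return sum(x*y for x, y in zip(u, v))

# Skew lines: l through A with direction v, l' through B with direction w.
A, v = (0, 0, 0), (1, 0, 0)
B, w = (0, 1, 1), (0, 1, 0)

a = cross(v, w)       # direction of the common perpendicular
n = cross(a, v)       # normal of pi  (the plane containing l,  parallel to a)
n2 = cross(a, w)      # normal of pi' (the plane containing l', parallel to a)

# The line d = pi ∩ pi' has direction n × n', a nonzero multiple of a.
d_dir = cross(n, n2)
print(a, d_dir)
```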

5.2.4 Distance between two skew lines


As in the previous section, let ℓ and ℓ ′ be two skew lines in E3 passing respectively through A(xA , yA , zA )
and B(xB , yB , zB ). Let v(vx , vy , vz ) be the direction vector for ℓ and let w(wx , wy , wz ) be the direction
vector for ℓ ′ . We assume that the coordinates of the points and the components of the vectors are
known.
Let MN be the common perpendicular line of ℓ and ℓ′ with M ∈ ℓ and N ∈ ℓ′. Considering the
cuboid spanned by the vectors MA, NM and NB, it is easy to show that d(A, B) ≥ d(M, N) = |MN|. Hence
d(ℓ, ℓ′) = |MN|. Moreover, we could calculate this distance by writing down equations for MN (as in
Section 5.2.3), intersecting with ℓ and ℓ′ to determine the coordinates of M and N, and then calculating
|MN|. However, there is a shorter way of obtaining this distance.

Since the lines are skew, there is a unique plane π containing ℓ ′ which is parallel to ℓ. Let P be
a point in π. Considering the cuboid with vertices M, N and P we see that d(ℓ, π) = |MN |. Hence


d(ℓ, ℓ ′ ) = d(ℓ, π). Now, dropping a perpendicular from A on π which intersects π in H we see that
d(ℓ, π) = |AH|. Therefore
d(ℓ, ℓ′) = d(ℓ, π) = |AH| = | Pr⊥v×w(AB) | = | ⟨v × w, AB⟩ | / |v × w|.

Hence, if we let a(ax , ay , az ) = v × w, we obtain

d(ℓ, ℓ′) = | ⟨v × w, AB⟩ | / |v × w| = | ax(xB − xA) + ay(yB − yA) + az(zB − zA) | / √(ax² + ay² + az²).

A further interpretation of this formula is the following. Let P be a parallelepiped spanned by the
three vectors v, w, AB and denote by F a face spanned by v and w. The distance between the two
lines is the height of P corresponding to the face F. Moreover, anticipating the discussion in Section
5.3, we have

d(ℓ, ℓ′) = Vol(P)/Area(F) = | [v, w, AB] | / |v × w|.

5.3 Volume
5.3.1 Volume of polyhedra
Definition 5.11. ‘A polyhedron may be defined as a finite, connected set of plane polygons, such
that every side of each polygon belongs also to just one other polygon, with the proviso that the
polygons surrounding each vertex form a single circuit’ [7]. Well known examples are prisms having
two congruent parallel faces, in particular triangular prisms and parallelepipeds, or pyramids having
just one point, the apex, outside the plane of the polygonal base. The height of a prism is the distance
between the two main faces, and the height of a pyramid is the distance from the apex to the base.

Triangulation of a polyhedron, or tetrahedralization, refers to the subdivision of a polyhedron into
tetrahedral meshes. As with polygons in dimension 2, it is always possible to break a polyhedron
into tetrahedra. Unlike the two-dimensional case, it is not always possible to do this without
adding vertices. The smallest example of a polyhedron where this is not possible is the Schönhardt
polyhedron. For a discussion of the various challenges and algorithms see [20, Chapter 25].


For our purposes, the following non-standard definition suffices. The interior of a tetrahedron
is a polyhedral set. If ABCD and ABCD ′ are two tetrahedra with disjoint interior then the union of
the interior of the two tetrahedra together with the interior of the triangle ABC is a polyhedral set
obtained by gluing the two tetrahedra. Any polyhedral set is obtained after a finite number of gluings
of tetrahedra.

Definition 5.12. Let P denote the set of all polyhedral sets in E3 . We define a volume function
Vol : P → R≥0 through the following properties:

(V1) If S is a cuboid of side lengths 1, 1 and x then Vol(S) = x.

(V2) If S1 and S2 are similar polyhedral sets with ratio x = S1 : S2 then Vol(S1 ) = x3 · Vol(S2 ).

(V3) If S1 and S2 are disjoint polyhedral sets then Vol(S1 ∪ S2 ) = Vol(S1 ) + Vol(S2 ).

Proposition 5.13. We have

1. The volume of a prism of height h and base area a is ah. In particular, the volume of a parallelepiped with a face of area a and corresponding height h is ah.

2. The volume of a pyramid of height h and base area a is ah/3. In particular, the volume of a
tetrahedron ABCD with Area(BCD) = a and d(A, BCD) = h is ah/3.

3. The volume of a rectangular parallelepiped of side lengths a, b and c is abc.

Proof. It is clear that 3. follows from 1. However, the proof needs to start from the definitions. We
do this by first proving 3. Denote by Ra,b,c a rectangular parallelepiped with side lengths a, b and c.
By (V1) and (V2), we have
Vol(Ra,a,1) = a³ · Vol(R1,1,1/a) = a³ · (1/a) = a²
for any a > 0. Now, as in the proof of Proposition 5.3, consider Ra,a,1 and Rb,b,1 in opposite corners
of Ra+b,a+b,1 . By (V3), we have Vol(Ra+b,a+b,1 ) = Vol(Ra,a,1 ) + Vol(Rb,b,1 ) + 2 · Vol(Ra,b,1 ) hence (a + b)2 =
a2 + b2 + 2 · Vol(Ra,b,1 ) and therefore Vol(Ra,b,1 ) = ab for any a, b > 0. Then, for an arbitrary rectangular
parallelepiped we have
Vol(Ra,b,c) = c³ · Vol(Ra/c, b/c, 1) = c³ · (ab/c²) = abc.
Next, we use 3. to deduce the volume of an arbitrary parallelepiped. For this we section and rear-
range the parallelepiped as in the figure (this is Figure 4.1.2 from [1]) below.


In (a), (b) and (c) the parallelepiped is transformed into a prism with base a parallelogram by
cutting and pasting congruent triangular prisms. In particular, the height h and the base area a are
unchanged. In the last step (d), the base is rearranged into a rectangle without changing its area - as
in the proof of Proposition 5.3. The final result is a rectangular parallelepiped Rx,y,h with volume ah
equal to the volume of the initial parallelepiped.
Now, slicing a parallelepiped along the diagonals of two opposite faces we obtain a triangular
prism with volume ha where a is half the area of the face which was cut. Thus, a triangular prism
has the claimed volume. For an arbitrary prism, use a triangulation of the base polygon to deduce the
claim.

For 2. we first consider tetrahedra and follow the argument in [1, §4.1]. Slice a triangular prism as
in the following figure1 . The proof of Proposition 5 of Book XII of Euclid’s Elements (see for example
[10]), is correct for our settings and assumptions. It implies that two triangular pyramids (tetrahedra)
with the same height and with congruent bases have the same volume [10, Volume 3, p.390].

1 Figure 4.1.4 from [1]


In the above slicing the two tetrahedra with white base have the same heights and the bottom and
top tetrahedra are clearly congruent. Thus, if T is any of the three tetrahedra of the triangular prism
P , then
Vol(T) = (1/3) Vol(P) = ah/3
where a is the area of a base triangle of P and h is the height of P . Then, for an arbitrary pyramid,
the claim follows by considering a tetrahedralization.

Remark. Similar to our definition of area, the above definition of volume tries to capture basic prin-
ciples of the intuitive notion of volume. It is an over-simplified version of a so-called Lebesgue
measure (see Section 5.3.3). In property (V1) we may replace the rectangular parallelepiped with
a cube of side-length 1. However, after weakening the assumption, it is not possible to cover an
arbitrary cuboid allowing only the uniform scaling given in (V2). To see this difficulty, keep the
notation in the proof of Proposition 5.13 and place three cubes Ca , Cb and Cc on the diagonal of a
cube Ca+b+c .

The planes of the small cubes divide Ca+b+c in cuboids the sides of which can be checked to correspond
to the algebraic formula for the cube of a + b + c. We have

Vol(Ca+b+c) = (a + b + c)³ = a³ + b³ + c³ + 3a²b + 3a²c + 3ab² + 3b²c + 3ac² + 3bc² + 6abc. (5.3)

Assuming that the volume Vol(Rx,y,y) of a cuboid with two equal adjacent sides is xy², it follows
from (V2), (V3) and (5.3) that Vol(Ra,b,c) = abc. Thus, it is necessary and sufficient to show that
Vol(Rx,y,y) = xy².
subdivision of the big cube corresponds to

Vol(Ca+b ) = (a + b)3 = a3 + b3 + 3ab(a + b).

From this equation and (V2), (V3) it follows that Vol(Ra,b,(a+b)) = ab(a + b) for any a, b > 0. However,
this does not yield a decomposition of a cube into smaller cubes and cuboids congruent to Rx,y,y
for arbitrary x and y. Compare this to the two dimensional case.

Orthonormal frames offer an efficient way of calculating the volume of a parallelogram, and
therefore volumes of tetrahedra and polyhedra provided that the coordinates of vertices are known.


Proposition 5.14. The volume of a parallelepiped P spanned by the vectors a, b and c is

Vol(P) = |⟨a × b, c⟩| = | det( a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 ) |

where the components of a(a1 , a2 , a3 ), b(b1 , b2 , b3 ) and c(c1 , c2 , c3 ) are with respect to an orthonormal
basis.

Proof. The area of the face spanned by a and b is |a × b| and the height corresponding to this face is
|pr⊥a×b(c)|. Thus

Vol(P) = |a × b| · |pr⊥a×b(c)| = |a × b| · | (⟨a × b, c⟩/⟨a × b, a × b⟩) (a × b) | = |⟨a × b, c⟩|.
⟨a × b, a × b⟩

Moreover, with respect to an orthonormal basis we have

⟨a × b, c⟩ = ⟨ det( i j k ; a1 a2 a3 ; b1 b2 b3 ), c1 i + c2 j + c3 k ⟩ = det( c1 c2 c3 ; a1 a2 a3 ; b1 b2 b3 ) = det( a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 ).

5.3.2 Oriented volume


Definition 5.15. The above proposition shows that, up to sign, the volume of a parallelepiped
spanned by a, b and c is the determinant of the base change matrix MB,B′ where B is a right-oriented
basis and B′ = (a, b, c). The value det(MB,B′) does not depend on the orthonormal basis B in which the
vectors are expressed. The proof of this claim is the same in any dimension (see Proposition 5.21).
We call this determinant the box product of a, b and c and denote it by [a, b, c]. We define the oriented
volume of a parallelepiped P spanned by the vectors a, b and c to be

Volor (P ) = [a, b, c].


Similarly, the oriented volume of the tetrahedron ABCD is Volor(ABCD) = [AB, AC, AD]/6. Proposition
5.14 shows that mixing the cross product with the scalar product gives the box product. For this
reason ⟨a × b, c⟩ = [a, b, c] is sometimes called the mixed product of the vectors a, b and c.


Proposition 5.16. The box product defines a map [ · , · , · ] : V3 × V3 × V3 → R, (a, b, c) ↦ [a, b, c], which is
linear in each argument. Moreover, for three vectors a, b, c we have

[a, b, c] = [b, c, a] = [c, a, b] = −[b, a, c] = −[a, c, b] = −[c, b, a].

The coplanarity (i.e. linear dependency) condition for a, b, c is

[a, b, c] = 0.

The sign of the box product determines the orientation of the triple (a, b, c):

[a, b, c] > 0 : (a, b, c) is right oriented;
[a, b, c] = 0 : (a, b, c) is not a basis;
[a, b, c] < 0 : (a, b, c) is left oriented.

Proof. In dimension 3, linearity of the box product follows from the fact that it is the composition of
linear maps, cross product and scalar product. You can also deduce the linearity of this map from the
fact that with respect to an orthonormal basis it is the determinant of a base change matrix. The other
properties follow from this latter description and from known properties of the determinant.
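The orientation criterion is easy to experiment with. The following Python sketch (my own sample vectors) evaluates the box product as the determinant of the components in a right oriented orthonormal basis.

```python
def det3(a, b, c):
    return (a[0]*(b[1]*c[2] - b[2]*c[1])
          - a[1]*(b[0]*c[2] - b[2]*c[0])
          + a[2]*(b[0]*c[1] - b[1]*c[0]))

def box(a, b, c):
    # [a, b, c]: determinant of components w.r.t. a right oriented orthonormal basis
    return det3(a, b, c)

e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(box(e1, e2, e3))         # -> 1   right oriented
print(box(e2, e1, e3))         # -> -1  left oriented (one transposition)
print(box(e1, e2, (1, 1, 0)))  # -> 0   coplanar, not a basis
```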

Proposition 5.17 (Triple cross product formula). For vectors a, b, c, d ∈ V3 we have

(a × b) × (c × d) = b · [a, c, d] − a · [b, c, d] = c · [a, b, d] − d · [a, b, c]
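Proposition 5.17 is stated without proof; as a sanity check, the identity can be verified in exact integer arithmetic on sample vectors (chosen arbitrarily by me):

```python
def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return sum(x*y for x, y in zip(u, v))

def box(a, b, c):          # [a, b, c] = <a x b, c>
    return dot(cross(a, b), c)

def scale(t, u):
    return tuple(t*x for x in u)

def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

a, b, c, d = (1, 2, 3), (4, 5, 6), (7, 8, 10), (1, 0, 1)
lhs = cross(cross(a, b), cross(c, d))
rhs1 = sub(scale(box(a, c, d), b), scale(box(b, c, d), a))  # b[a,c,d] - a[b,c,d]
rhs2 = sub(scale(box(a, b, d), c), scale(box(a, b, c), d))  # c[a,b,d] - d[a,b,c]
print(lhs == rhs1 == rhs2)  # -> True
```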

5.3.3 Hypervolume
The higher-dimensional analogues of polygons and polyhedra are polytopes. The simplest polytope in
dimension n is an n-simplex. If n = 2 these are triangles and if n = 3 these are tetrahedra. In general,
in dimension n, consider n + 1 points P0 , P1 , . . . , Pn which do not lie in a hyperplane. This is equivalent
to asking for the vectors P0P1, . . . , P0Pn to be linearly independent, i.e. to form a basis of Vn. The
convex hull of P0, P1, . . . , Pn is an n-simplex. The (n − 1)-faces of an n-simplex are (n − 1)-simplices,
in particular, the faces of a 4-simplex are tetrahedra. We may define polytopes as sets obtained by
gluing n-simplices along their faces.
A hyperparallelepiped spanned by v1 , . . . , vn is the set of all points obtained from a given point
P with linear combinations of the given vectors considering only coefficients in the interval [0, 1],
concretely
H = P + {α1 v1 + · · · + αn vn : α1 , . . . , αn ∈ [0, 1]} .
A hypercube is a hyperparallelepiped with all 1-faces of equal length and all angles right angles.

Definition 5.18. Let P denote the set of all polytopes in En. We define a rational hypervolume function
Vol : P → R≥0 through the following properties:

(V1) If S is a hypercube of side length 1 then Vol(S) = 1.

(V2) If S1 and S2 are congruent polytopes then Vol(S1 ) = Vol(S2 ).

(V3) If S1 and S2 are disjoint polytopes then Vol(S1 ∪ S2 ) = Vol(S1 ) + Vol(S2 ).


This volume function is called rational because it does not allow us to measure all polytopes. For
instance, it is not possible to use it to deduce the volume of a cube whose side length is an arbitrary
irrational number.

Definition 5.19. We say that a set S is measurable if it contains a sequence of strictly growing interior
polytopes Ii and if it is contained in a sequence of strictly shrinking exterior polytopes Ei such that
the volumes of these sets converge to a common value. Formally, the following properties need to
hold

Ii ⊆ Ij ⊆ S ⊆ Ej ⊆ Ei  ∀ i < j  and  lim_{i→∞} Vol(Ii) = lim_{i→∞} Vol(Ei).

If the limit exists, it is unique and we denote it by Vol(S). Denoting by M the set of all measurable sets
of En, we extend the (hyper)volume function to Vol : M → R≥0. This is a version of the Lebesgue
measure (see for example [18, Chapter 11]).
From this definition one may deduce that a cuboid with side lengths a1, . . . , an has volume a1 · · · an. Or
that if P is a hyperparallelepiped in dimension n, and if S is any n-simplex constructed from vertices
of P then
Vol(S) = (1/n!) · Vol(P)
which is the higher-dimensional analogue of the volume formula for a tetrahedron.
As in the case of dimension 2 and 3, the efficient way of calculating hypervolume is given by the
box product which allows us to calculate the hypervolume of hyperparallelepipeds and therefore of
n-simplices and of polytopes in general.

Definition 5.20. Let v1 , . . . , vn be vectors in Vn with components vi (vi,1 , . . . , vi,n ) relative to a right
oriented orthonormal basis B of Vn . The (n-fold) box product of these vectors is

                  | v1,1  v2,1  . . .  vn,1 |
[v1, . . . , vn] =  | v1,2  v2,2  . . .  vn,2 |
                  |  ...    ...          ... |
                  | v1,n  v2,n  . . .  vn,n |
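Evaluating the n-fold box product thus amounts to computing an n × n determinant. Below is a Python sketch in exact rational arithmetic; Laplace expansion along the first row is inefficient in general but adequate for the small n used here.

```python
from fractions import Fraction

def det(M):
    # determinant by Laplace expansion along the first row
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1)**j * M[0][j] * det(minor)
    return total

def box(*vectors):
    # [v1, ..., vn]: determinant of the matrix whose COLUMNS are the vi
    n = len(vectors)
    M = [[Fraction(vectors[j][i]) for j in range(n)] for i in range(n)]
    return det(M)

# 4-dimensional example: the standard basis has box product 1.
print(box((1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)))  # -> 1
```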

Proposition 5.21. The definition of the box product does not depend on the choice of the right
oriented orthonormal basis B.

Proof. Let V = (v1, . . . , vn) be an n-tuple of n-dimensional vectors and let MB,V be the matrix whose
columns are the components of the vectors vi. Notice that

[v1 , . . . , vn ] = det(MB,V )

If the vectors are linearly dependent, then det(MB,V) = 0 for any basis B. If the vectors are linearly
independent, then V is a basis and if B′ is another right oriented orthonormal basis then

det(MB′,V) = det(MB′,B MB,V) = det(MB′,B) det(MB,V)


and it suffices to show that det(MB′,B) = 1. Let ei denote the vectors in B. Then, since B′ is orthonormal,
we have

MTB′,B MB′,B = ( ⟨e1, e1⟩  ⟨e1, e2⟩  . . .  ⟨e1, en⟩ )
              ( ⟨e2, e1⟩  ⟨e2, e2⟩  . . .  ⟨e2, en⟩ )
              (    ...       ...             ...   )  = In
              ( ⟨en, e1⟩  ⟨en, e2⟩  . . .  ⟨en, en⟩ )

hence

1 = det(In) = det(MTB′,B MB′,B) = det(MTB′,B) det(MB′,B) = det(MB′,B)².
It follows that det(MB ′ ,B ) = ±1 and since the two bases are both right-oriented, we deduce that
det(MB ′ ,B ) = 1 and the claim follows.

Definition 5.22. Let B be a basis of Vn . The oriented volume of the basis B, denoted by Volor (B), is the
value of the box product of the vectors in B. The volume of the basis B is the absolute value | Volor (B)|
and we denote it by Vol(B). If we are in dimension 2, we refer to these values as the area and oriented
area of B, and denote them by Area(B) and Areaor (B) respectively.
One can also generalize the J-operator and the cross product to higher dimensions. For n − 1
vectors v1 , . . . , vn−1 , the wedge product of these vectors, denoted by v1 ∧ · · · ∧ vn−1 is the unique vector
w such that |w| is the hypervolume of the hyperparallelepiped spanned by v1 , . . . , vn−1 , the vector w
is orthogonal to each vi and (v1 , . . . , vn−1 , w) is right oriented.

CHAPTER 6

Affine maps

Contents
6.1 Properties of affine maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.1.1 Homogeneous coordinates and homogeneous matrix . . . . . . . . . . . . . . . 90
6.2 Projections and reflections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.2.1 Tensor product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.2.2 Parallel projection on a hyperplane . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.2.3 Parallel reflection in a hyperplane . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.2.4 Parallel projection on a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.2.5 Parallel reflection in a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98


6.1 Properties of affine maps


Definition 6.1. A map φ : An → Am is an affine map if and only if there exists A ∈ Matm×n (R) and
b ∈ Matm×1 (R) such that
[φ(P )]K′ = A · [P ]K + b (6.1)
relative to some frame K = (O, B) of An and some frame K′ = (O′ , B ′ ) of Am . Such a map defines a
linear map lin(φ) : D(An ) → D(Am ) where
[lin(φ)(PQ)]B′ = A · [PQ]B   (6.2)

since
[lin(φ)(PQ)]B′ = [φ(P)φ(Q)]B′ = (A · [Q]K + b) − (A · [P]K + b) = A · ([Q]K − [P]K) = A · [PQ]B.

We notice that both A and b depend on the choice of the frames K and K′ and moreover A is the
matrix of lin(φ) relative to the bases in K and K′ .
If n = m, then φ : An → An is called an affine endomorphism. The set of all such endomorphisms
is denoted by Endaff(An). It is easy to see that the affine map φ is invertible if and only if the
map lin(φ) is invertible; equivalently, φ is invertible if and only if the matrix A of the linear map
lin(φ) is invertible (see proof of Proposition 6.5). An invertible affine endomorphism φ is called an
affine automorphism or affine transformation. The set of all affine transformations of An is denoted by
AGL(An).
Moreover, if in addition to n = m, O = O′ and b = 0 in (6.1) then φ can be viewed as a linear map
from Vn to Vn since it is given by multiplication with the matrix A. The set of invertible linear maps
Vn → Vn is denoted by GL(Vn ) and we have the following inclusion

GL(Vn ) ⊆ AGL(An ).

Example 6.2. A homothety φC,λ of En with center C is the map which rescales the space with a factor
λ along lines passing through the point C. With respect to a coordinate system K = (C, B) with origin
in C it has the form
[φC,λ (P )]K = λ · [P ]K .
Notice that if λ = 1 this is the identity map, if λ < 1 this is a contraction, if λ > 1 it is an expansion.

Example 6.3. The various parallel projections and reflections described in the next sections are ex-
amples of affine maps as well as isometries discussed in Chapter 7.

Proposition 6.4. Let φ : En → Em be an affine map. If a line ℓ is mapped onto a line ℓ ′ under φ, then
φ preserves the oriented ratio on ℓ, i.e. if A, B, C, D are points on ℓ, then
AC / AB = φ(A)φ(C) / φ(A)φ(B).   (6.3)

In particular the ratios on ℓ are preserved.


Proof. The oriented ratio AC / AB denotes the scalar λ such that AC = λ AB. Since φ(A)φ(C) = lin(φ)(AC)
and since lin(φ) is a linear map we have

φ(A)φ(C) = lin(φ)(AC) = lin(φ)(λ AB) = λ lin(φ)(AB) = λ φ(A)φ(B)

and the claim follows.

Proposition 6.5. A map φ : En → En is an affine transformation if and only if

1. φ is injective;

2. φ preserves lines;

3. φ preserves the oriented ratio on lines.

Proof. Throughout we let φ be an affine map as in (6.1). First we notice that φ is an affine
transformation, i.e. invertible, if and only if the matrix A is invertible. Indeed, for two points P, Q we have

[φ(P)]K′ = [φ(Q)]K′ ⇔ A · [P]K + b = A · [Q]K + b ⇔ A · [P]K = A · [Q]K.

Therefore, injectivity and surjectivity of φ are equivalent to injectivity respectively surjectivity of
multiplication with A, i.e. φ is bijective if and only if A is invertible.
Now we start with the proof of our proposition. Assume that φ is an affine transformation. It is
bijective, so, in particular injective. By the above paragraph, the matrix A is invertible. Then, a point
on a line ℓ = {P + tv : t ∈ R} is mapped to

A · [P + tv]K + b = (A · [P ] + b) + t · A · [v]B

which shows that φ(ℓ) is a line passing through (A · [P ] + b) ∈ An and having direction vector A · [v]B ∈
D(An) (which is a non-zero vector since A is invertible). The third claim follows from Proposition 6.4.
For the converse, assume that φ : An → An is a map with the three indicated properties. Since φ
preserves oriented ratios, it preserves midpoints of segments, i.e. if M is the midpoint of the segment
[AB] then φ(M) is the midpoint of the segment [φ(A)φ(B)]. Therefore, φ preserves parallelograms.
Hence φ preserves sums of any two vectors, i.e. if OA + OC = OB then φ(O)φ(A) + φ(O)φ(C) = φ(O)φ(B).
Moreover, since φ preserves oriented ratios, if OA = λ OB then φ(O)φ(A) = λ φ(O)φ(B).
These two observations show that φ induces a linear map on vectors, i.e. that the map

lin(φ) : OA ↦ φ(O)φ(A)

is linear. Now, since φ is injective, it is easy to see that lin(φ) : D(An) → D(An) is injective. Moreover,
it is a linear algebra fact that an injective linear map from a vector space to itself is bijective.
Therefore, lin(φ) is bijective.


Now let K = (O, B) be a frame in An where B = (OX1, . . . , OXn). Let O′ = φ(O), Yi = φ(Xi) and
B′ = (O′Y1, . . . , O′Yn). Since lin(φ) is bijective, B′ is a basis of D(An) and for any point P we have

[φ(P)]K′ = [O′φ(P)]B′ = [φ(O)φ(P)]B′ + [O′φ(O)]B′ = MB,B′(lin(φ)) · [P]K + [O′φ(O)]B′

where A = MB,B′(lin(φ)) and b = [O′φ(O)]B′.

This shows that φ has the expression (6.1), i.e. it is an affine map, and since A is invertible, φ is a
transformation by the first paragraph of the proof.

6.1.1 Homogeneous coordinates and homogeneous matrix


Conceptually, homogeneous coordinates are used to describe the affine space An inside the projective
space Pn . This is outside the scope of these notes. However, there is also a computational advantage
to homogeneous coordinates which is what we describe here.

Definition 6.6. Let K be a reference frame of An . The homogeneous coordinates of the point P (p1 , . . . , pn )
are (p1 , . . . , pn , 1). Yes, they are the ordinary coordinates with an extra 1 at the end.

Definition 6.7. The homogeneous matrix of an affine map φ : An → Am defined with respect to some
reference frames K and K′ by φ(x) = Ax + b is

M̂K′,K(φ) = [ A  b ]
            [ 0  1 ].

The utility of introducing these notions is the following: composition of affine maps corresponds
to multiplication of homogeneous matrices. Indeed, let ψ(x) = A′x + b′ be another affine map defined
on Am. Then

ψ ◦ φ(x) = ψ(φ(x)) = ψ(Ax + b) = A′(Ax + b) + b′ = (A′ · A)x + (A′ · b + b′).

In terms of homogeneous matrices, we notice that

M̂K′′,K(ψ ◦ φ) = [ A′·A   A′·b + b′ ] = [ A′  b′ ] · [ A  b ] = M̂K′′,K′(ψ) · M̂K′,K(φ)
                [  0         1     ]   [ 0   1  ]   [ 0  1 ]

and the homogeneous coordinates of the values of the map φ can be obtained through a matrix
multiplication as well

[ φ(x) ] = [ A  b ] · [ x ]
[  1   ]   [ 0  1 ]   [ 1 ].

Thus

[ ψ ◦ φ(x) ] = [ A′  b′ ] · [ A  b ] · [ x ]
[     1    ]   [ 0   1  ]   [ 0  1 ]   [ 1 ].
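The correspondence between composition and matrix multiplication is easy to check numerically. In the Python sketch below the two affine maps of A2 are hypothetical examples of my own.

```python
def hom(A, b):
    # homogeneous matrix [[A, b], [0, 1]] of the affine map x -> A x + b (2D)
    return [[A[0][0], A[0][1], b[0]],
            [A[1][0], A[1][1], b[1]],
            [0, 0, 1]]

def matmul(M, N):
    return [[sum(M[i][k]*N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

phi = hom([[1, 2], [0, 1]], [3, 4])    # phi(x) = A x + b
psi = hom([[2, 0], [1, 1]], [-1, 0])   # psi(x) = A'x + b'

# Composition corresponds to multiplying the homogeneous matrices:
comp = matmul(psi, phi)

# Apply to the point P(1, 1), written in homogeneous coordinates (1, 1, 1):
P = [[1], [1], [1]]
print(matmul(comp, P))  # -> [[11], [11], [1]]
```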


6.2 Projections and reflections


6.2.1 Tensor product
Tensor products are widely used throughout Mathematics. We introduce this notion here to give
compact expressions for the projections and reflections discussed in the following subsections.
Definition 6.8. Let v(v1 , . . . , vn ) and w(w1 , . . . , wn ) be two vectors with components relative to a basis
B. The tensor product v ⊗B w is the n × n matrix defined by (v ⊗B w)ij = vi wj . In other words

                  [ v1 ]                     [ v1w1  . . .  v1wn ]
v ⊗B w = v · wT = [ ...] · [ w1  . . .  wn ] = [  ...           ... ] .
                  [ vn ]                     [ vnw1  . . .  vnwn ]

We write ⊗ instead of ⊗B if it is clear from the context what B is.


Proposition 6.9. The map ⊗B : Rn × Rn → Matn×n (R) given by (v, w) 7→ v ⊗B w has the following
properties:
1. It is linear in both arguments,
2. (v ⊗B w)T = w ⊗B v.
3. If B is orthonormal then (u ⊗B v) · w = ⟨v, w⟩u for any u, v, w ∈ Vn .
Proof. 1. Linearity follows from the linearity of matrix multiplication. To see this for the first argument,
take two scalars α, β, three vectors u, v, w and notice that

(αu + βv) ⊗ w = (αu + βv) · wT = α(u · wT) + β(v · wT) = αu ⊗ w + βv ⊗ w.
2. For the second claim we use properties of transposition of matrices
(v ⊗B w)T = (v · wT )T = (wT )T · vT = w · vT = w ⊗ v.
3. For the last point we use the fact that ⟨v, w⟩ = vT · w since we are in an orthonormal basis. Then
(u ⊗B v) · w = (u · vT ) · w = u · (vT · w) = u · ⟨v, w⟩ = ⟨v, w⟩ · u.

6.2.2 Parallel projection on a hyperplane


Example 6.10. Fix a reference frame K of A2 . We want to project on the line (hyperplane)
ℓ : x+y −1 = 0
in the direction of the vector v(−2, −1). How do we do this? We do it pointwise. Take an arbitrary
point P(xP, yP) and consider the line ℓP passing through P in the direction of v. It has parametric
equations

ℓP :  x = xP − 2t
      y = yP − t.
The projection of P on ℓ in the direction of v is the point P ′ = ℓ ∩ ℓP .



ℓP
P
P′

Figure 6.1: Projection of the point P on the line ℓ in the direction of the vector v.

To determine P′, we check which point on ℓP satisfies the equation of ℓ:

xP − 2t + yP − t − 1 = 0  ⇒  t = (1/3)(xP + yP − 1).

Thus, the projection of P is

[P′]K = [  (1/3)xP − (2/3)yP + 2/3 ]  =  (1/3) [  1  −2 ] [ xP ]  +  (1/3) [ 2 ]
        [ −(1/3)xP + (2/3)yP + 1/3 ]          [ −1   2 ] [ yP ]           [ 1 ]

with A = (1/3) [ 1 −2 ; −1 2 ] and b = (1/3) (2, 1)T.
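The computation of this example can be double-checked in exact arithmetic: the image of any point lies on ℓ, and the point minus its image is parallel to v. The test point below is my own.

```python
from fractions import Fraction as F

# Project P on l : x + y - 1 = 0 along v(-2, -1), as in the worked example.
def project(xP, yP):
    t = (xP + yP - 1) / 3          # the parameter found above
    return (xP - 2*t, yP - t)

P = (F(5), F(2))
x, y = project(*P)
print(x + y - 1 == 0)                          # image lies on l -> True
print((P[0] - x)*(-1) - (P[1] - y)*(-2) == 0)  # P - P' parallel to v -> True
```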

Definition 6.11. Let H be a hyperplane and let v be a vector in Vn which is not parallel to H. For
any point P ∈ En there is a unique line ℓP passing through P and having v as direction vector. The
line ℓP is not parallel to H, hence, it intersects H in a unique point P ′ . We denote P ′ by PrH,v (P ) and
call it the projection of the point P on the hyperplane H parallel to v. This gives a map

PrH,v : An → An

called, the projection on the hyperplane H parallel to v.

Fix a reference frame K = (O, B) of An . Consider the hyperplane

H : a1 x1 + · · · + an xn + an+1 = 0 (6.4)


and a line ℓP passing through the point P (p1 , . . . , pn ) and having v(v1 , . . . , vn ) as direction vector:

ℓP = {P + tv : t ∈ R}. (6.5)

The intersection ℓP ∩ H can be described as follows

P + tv ∈ ℓ ∩ H ⇔ a1 (p1 + tv1 ) + · · · + an (pn + tvn ) + an+1 = 0.

So, the intersection point PrH,v (P ) = P ′ is

a1 p1 + · · · + an pn + an+1 aT · P + an+1
P′ =P − v=P − v. (6.6)
a1 v1 + · · · + an vn aT · v

where a = a(a1 , . . . , an ) and where in the second equality we use the convention that points and vectors
are identified with column matrices of their coordinates and components respectively. Hence, if we
denote by p′1, . . . , p′n the coordinates of the projected point PrH,v(P) then

p′i = pi + µvi,  i = 1, . . . , n,  where  µ = −(a1p1 + · · · + anpn + an+1)/(a1v1 + · · · + anvn).

Since aT · P · v = v · aT · P , in matrix form we have

PrH,v(P) = ( In − (v · aT)/(vT · a) ) · P − (an+1/(vT · a)) v

where In is the n × n identity matrix. In particular, if B is orthonormal, the linear part of this map is

MB(lin(PrH,v)) = In − (v ⊗ a)/⟨v, a⟩.

Parallel projections on hyperplanes are affine maps. Obviously, they are not bijective, so

PrH,v ∈ Endaff(An) but PrH,v ∉ AGL(An).
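In coordinates the projection formula can be sketched as follows; the hyperplane H and the direction v are hypothetical sample data of my own.

```python
from fractions import Fraction as F

# Pr_{H,v} via the scalar form p'_i = p_i + mu * v_i.
# Sample data: H : x + y + z - 3 = 0, direction v(0, 0, 1).
a, a_last = (F(1), F(1), F(1)), F(-3)
v = (F(0), F(0), F(1))

def project(P):
    va = sum(vi*ai for vi, ai in zip(v, a))                      # v^T a
    mu = -(sum(ai*pi for ai, pi in zip(a, P)) + a_last) / va
    return tuple(pi + mu*vi for pi, vi in zip(P, v))

Q = project((F(1), F(1), F(5)))
print(Q)  # the image satisfies x + y + z - 3 = 0
```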

Definition 6.12. The orthogonal projection Pr⊥H on the hyperplane H ⊆ En is the projection on H in
the direction of a vector which is orthogonal to H, i.e.

Pr⊥H = PrH,v

where v is a normal vector of H. With the above notation we see that

Pr⊥H(P) = ( In − (a ⊗ a)/|a|² ) · P − (an+1/|a|²) a

since we may choose v = a.


6.2.3 Parallel reflection in a hyperplane


Example 6.13. We consider again the setup in Example 6.10. But this time, we want to reflect in the
line (hyperplane)
ℓ : x+y −1 = 0
in the direction of the vector v(−2, −1). We do it pointwise. Take an arbitrary point P(xP, yP) and
consider the line ℓP passing through P in the direction of v. It has parametric equations

ℓP :  x = xP − 2t
      y = yP − t.

The reflection of P in ℓ in the direction of v is the point P′ such that Prℓ,v(P) = ℓ ∩ ℓP is the midpoint
of the segment [PP′]. Thus, identifying points with column matrices of their coordinates relative to
K we have

(P + P′)/2 = Prℓ,v(P)  ⇒  P′ = 2 Prℓ,v(P) − P.

ℓP
P
P′

Figure 6.2: Reflection of the point P in the line ℓ parallel to the vector v.

Using the calculation in Example 6.10, the reflection of P is

[P′]K = [ −(1/3)xP − (4/3)yP + 4/3 ]  =  (1/3) [ −1  −4 ] [ xP ]  +  (2/3) [ 2 ]
        [ −(2/3)xP + (1/3)yP + 2/3 ]          [ −2   1 ] [ yP ]            [ 1 ]

with A = (1/3) [ −1 −4 ; −2 1 ] and b = (2/3) (2, 1)T.

Definition 6.14. Let H be a hyperplane and let v be a vector in Vn which is not parallel to H. For
any point P ∈ An there is a unique point P ′ such that PrH,v (P ) is the midpoint of the segment [P P ′ ].
We denote P ′ by RefH,v (P ) and call it the reflection of the point P in the hyperplane H parallel to v. This
gives a map
RefH,v : An → An
called, the reflection in the hyperplane H parallel to v.


We keep the notation in the previous section. In particular the hyperplane H is given by the
equation (6.4). The idea here is the same as in Example 6.13: PrH,v (P ) is the midpoint of the segment
[PP′]. Thus, identifying points with column matrices of their coordinates relative to K we have

PrH,v(P) = (P + P′)/2  ⇒  RefH,v(P) = P − 2 (aT · P + an+1)/(vT · a) · v.

Again, since aT · P · v = v · aT · P, in matrix form we have

RefH,v(P) = ( In − 2 (v · aT)/(vT · a) ) · P − 2 (an+1/(vT · a)) v.

In particular, if B is orthonormal, the linear part of this map is

MB(lin(RefH,v)) = In − 2 (v ⊗ a)/⟨v, a⟩.

Parallel reflections in hyperplanes are affine maps. Obviously, they are bijective, so

RefH,v ∈ AGL(An ) ⊆ Endaff (An ).
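Since a reflection is an involution, applying the formula twice should give the identity, which provides a convenient numerical check. The sample data below are hypothetical, matching the earlier projection sketch.

```python
from fractions import Fraction as F

# Ref_{H,v}(P) = 2 Pr_{H,v}(P) - P, with H : x + y + z - 3 = 0 and v(0, 0, 1).
a, a_last = (F(1), F(1), F(1)), F(-3)
v = (F(0), F(0), F(1))

def project(P):
    va = sum(vi*ai for vi, ai in zip(v, a))
    mu = -(sum(ai*pi for ai, pi in zip(a, P)) + a_last) / va
    return tuple(pi + mu*vi for pi, vi in zip(P, v))

def reflect(P):
    Q = project(P)
    return tuple(2*qi - pi for qi, pi in zip(Q, P))

P = (F(1), F(1), F(5))
print(reflect(reflect(P)) == P)  # reflecting twice gives the identity -> True
```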

Definition 6.15. The orthogonal reflection Ref⊥H in the hyperplane H ⊆ En is the reflection in H
parallel to a vector which is orthogonal to H, i.e.

Ref⊥H = RefH,v

where v is a normal vector of H. With the above notation we see that

Ref⊥H(P) = ( In − 2 (a ⊗ a)/|a|² ) · P − 2 (an+1/|a|²) a   (6.7)

since we may choose v = a.


6.2.4 Parallel projection on a line


Example 6.16. Consider again Example 6.10. We have a line ℓ and a point P which we want to project
on ℓ in the direction of v(−2, −1). We know how to do this, but let us change the role of the Cartesian
equation with that of parametric equations. Parametric equations for ℓ are

ℓ :  x = 1 − t
     y = t.

For an arbitrary point P (xP , yP ) the line ℓP passing through P has direction space given by the equa-
tion
D(ℓP ) : x − 2y = 0

with respect to the basis B of the current coordinate system K. Thus, ℓP is described by

ℓP : (x − xP ) − 2(y − yP ) = 0.

The projection of P on ℓ in the direction of v is the point P ′ = ℓ ∩ ℓP and the corresponding Figure is
6.1. The only difference is that we describe ℓ and ℓP with different types of equations. To determine
P ′ we find the intersection by plugging in the points of ℓ in the equation of ℓP

(1 − t − x_P) − 2(t − y_P) = 0   ⇒   t = −(x_P − 2y_P − 1)/3.
As expected, a short calculation shows that P ′ has the same expression as in Example 6.10.
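The computation in Example 6.16 can be replayed numerically. The sketch below (Python/NumPy, illustrative only; the function name is ours) uses the value of t derived above and checks that P′ lies on ℓ and that P′ − P is parallel to v(−2, −1):

```python
import numpy as np

def project_on_line(P):
    """Projection of P on l: (x, y) = (1 - t, t) in the direction v = (-2, -1),
    with t = -(x_P - 2 y_P - 1)/3 as computed in Example 6.16."""
    xP, yP = P
    t = -(xP - 2 * yP - 1) / 3
    return np.array([1 - t, t])

P = np.array([5.0, 7.0])
Pp = project_on_line(P)
d = Pp - P
print(Pp[0] + Pp[1])              # P' lies on l: x + y = 1 -> 1.0
print(d[0] * (-1) - d[1] * (-2))  # 2x2 determinant of (P'-P, v): parallel -> 0.0
```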

Definition 6.17. Let ℓ be a line and let W be an (n − 1)-dimensional vector subspace in Vn which is
not parallel to ℓ. For any point P ∈ An there is a unique hyperplane HP passing through P and having
W as associated vector subspace. The hyperplane HP is not parallel to ℓ, hence, it intersects ℓ in a
unique point P ′ . We denote P ′ by Prℓ,W (P ) and call it the projection of the point P on the line ℓ parallel
to W. This gives a map
Prℓ,W : An → An

called, the projection on the line ℓ parallel to W.


With respect to the reference frame K, the vector subspace W is given by a homogeneous equation
 
W : a_1 x_1 + a_2 x_2 + · · · + a_n x_n = 0   ⇔   a^T · (x_1, x_2, . . . , x_n)^T = 0        (6.8)

where a = a(a1 , . . . , an ). Thus, for a given point P (p1 , . . . , pn ) ∈ An , the equation of HP is


 
H_P : a_1(x_1 − p_1) + a_2(x_2 − p_2) + · · · + a_n(x_n − p_n) = 0   ⇔   a^T · (x_1, x_2, . . . , x_n)^T = a^T · P

and Prℓ,W (P ) = PrHP ,v (Q) for a fixed but arbitrary point Q(q1 , . . . , qn ) ∈ ℓ. Hence, if we denote by
p1′ , . . . , pn′ the coordinates of the projected point Prℓ,W (P ) then, by (6.6),
 ′
 p1 = q1 + v 1 µ
aT · Q − aT · P

..


where µ = −



 . aT · v
 p′ = q + v µ

n n n

In matrix form we can rearrange this as follows

Pr_{ℓ,W}(P) = (v · a^T)/(v^T · a) · P + (I_n − (v · a^T)/(v^T · a)) · Q.

In particular, if B is orthonormal, the linear part of this map is


M_B(lin(Pr_{ℓ,W})) = (v ⊗ a)/⟨v, a⟩.

Parallel projections on lines are affine maps. Obviously, they are not bijective, so

Pr_{ℓ,W} ∈ End_aff(A^n)   but   Pr_{ℓ,W} ∉ AGL(A^n).

Definition 6.18. The orthogonal projection Pr⊥_ℓ on the line ℓ ⊆ E^n is the projection on ℓ parallel to
vectors which are orthogonal to the line ℓ, i.e.

Pr⊥_ℓ = Pr_{ℓ,v^⊥}

where v is a direction vector of ℓ. With the above notation we see that


Pr⊥_ℓ(P) = (a ⊗ a)/|a|² · P + (I_n − (a ⊗ a)/|a|²) · Q

since we may choose v = a.
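The orthogonal-projection formula above can be sanity-checked numerically; this sketch (Python/NumPy, not part of the text; function name and test lines are ours) projects points onto two familiar lines of E²:

```python
import numpy as np

def orth_project_on_line(P, Q, a):
    """Pr_l(P) = (a(x)a/|a|^2) P + (I - a(x)a/|a|^2) Q for the line l through Q
    with direction vector a (orthonormal frame assumed)."""
    P, Q, a = (np.asarray(u, float) for u in (P, Q, a))
    M = np.outer(a, a) / (a @ a)
    return M @ P + (np.eye(len(a)) - M) @ Q

print(orth_project_on_line([2, 0], [1, 1], [1, 1]))  # line y = x -> [1. 1.]
print(orth_project_on_line([3, 4], [0, 0], [1, 0]))  # x-axis -> [3. 0.]
```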


6.2.5 Parallel reflection in a line


Definition 6.19. Let ℓ be a line and let W be an (n − 1)-dimensional vector subspace in Vn which is
not parallel to ℓ. For any point P ∈ An there is a unique point P ′ such that Prℓ,W (P ) is the midpoint
of the segment [P P ′ ]. We denote P ′ by Refℓ,W (P ) and call it the reflection of the point P in the line ℓ
parallel to W. This gives a map
Refℓ,W : An → An
called, the reflection in the line ℓ parallel to W.

As in Section 6.2.4, the vector subspace W is given by the homogeneous equation (6.8). The idea is
similar to the one used in Section 6.2.3: since Pr_{ℓ,W}(P) is the midpoint of the segment [P P′], we have

Ref_{ℓ,W}(P) = 2 Pr_{ℓ,W}(P) − P.
Here again we use the convention that point and vectors are identified with column matrices of their
coordinates and components respectively. Rearranging this in matrix form we obtain
Ref_{ℓ,W}(P) = (2 (v · a^T)/(v^T · a) − I_n) · P + 2 (I_n − (v · a^T)/(v^T · a)) · Q,
where Q(q1 , . . . , qn ) is a point on ℓ and v(v1 , . . . , vn ) is a direction vector for ℓ. In particular, if B is
orthonormal, the linear part of this map is
M_B(lin(Ref_{ℓ,W})) = 2 (v ⊗ a)/⟨v, a⟩ − I_n.
Parallel reflections in lines are affine maps. Obviously, they are bijective, so
Refℓ,W ∈ AGL(An ) ⊆ Endaff (An ).
Definition 6.20. The orthogonal reflection Ref⊥_ℓ in the line ℓ ⊆ E^n is the reflection in ℓ parallel to
vectors which are orthogonal to the line ℓ, i.e.

Ref⊥_ℓ = Ref_{ℓ,v^⊥}

where v is a direction vector of ℓ. With the above notation we see that for any point Q ∈ ℓ

Ref⊥_ℓ(P) = (2 (a ⊗ a)/|a|² − I_n) · P + 2 (I_n − (a ⊗ a)/|a|²) · Q
since we may choose v = a.
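The line-reflection formula and the midpoint property of Definition 6.19 can be checked together; the following sketch (Python/NumPy, illustrative only; the function name is ours) reflects a point across the line y = x:

```python
import numpy as np

def orth_reflect_in_line(P, Q, a):
    """Ref_l(P) = (2 a(x)a/|a|^2 - I) P + 2(I - a(x)a/|a|^2) Q, with Q on l and
    a a direction vector of l (orthonormal frame assumed)."""
    P, Q, a = (np.asarray(u, float) for u in (P, Q, a))
    M = np.outer(a, a) / (a @ a)
    I = np.eye(len(a))
    return (2 * M - I) @ P + 2 * (I - M) @ Q

R = orth_reflect_in_line([2, 0], [0, 0], [1, 1])
print(R)                               # reflection across y = x -> [0. 2.]
print((np.array([2.0, 0.0]) + R) / 2)  # the midpoint is the projection -> [1. 1.]
```

The midpoint (1, 1) lies on the line and equals Pr⊥_ℓ(P), as Definition 6.19 requires.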

CHAPTER 7

Isometries

Contents
7.1 Affine form of isometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
7.1.1 The Euclidean space En (second revision) . . . . . . . . . . . . . . . . . . . . . 102
7.2 Isometries in dimension 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
7.2.1 Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
7.2.2 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
7.3 Isometries in dimension 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.3.1 Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.3.2 Euler angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.3.3 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7.4 Moving points with isometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7.4.1 Cycloids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7.4.2 Surfaces of revolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112


7.1 Affine form of isometries


Definition 7.1. An isometry is a map φ : En → En which preserves distances, i.e.

d(φ(P ), φ(Q)) = d(P , Q)

for any points P , Q ∈ En .


Proposition 7.2. Isometries are affine transformations.
Proof. Let φ : En → En be an isometry. We prove our claim using the characterization of affine
transformations in Proposition 6.5.
The map φ is injective: if φ(P ) = φ(Q) then

0 = d(φ(P ), φ(Q)) = d(P , Q)

which implies that P = Q.


The map φ preserves lines: consider three collinear points A, B, C ∈ En . One of the points will
lie between the other two, assume that C ∈ [AB]. Then d(A, B) = d(A, C) + d(C, B). Let A′ = φ(A),
B′ = φ(B) and C′ = φ(C). Then d(A′, B′) = d(A, B), d(A′, C′) = d(A, C) and d(B′, C′) = d(B, C). Therefore

d(A′ , B′ ) = d(A, B) = d(A, C) + d(C, B) = d(A′ , C ′ ) + d(C ′ , B′ )

which implies C ′ ∈ [A′ B′ ].


The map φ preserves oriented ratios: notice that

d(A′, C′)/d(A′, B′) = d(A, C)/d(A, B)

hence

\overrightarrow{A′C′} / \overrightarrow{A′B′} = ± d(A′, C′)/d(A′, B′) = ± d(A, C)/d(A, B) = ± \overrightarrow{AC} / \overrightarrow{AB}.
But, the two ratios have the same sign since, as in the previous paragraph, C is between A and B if
and only if C ′ is between A′ and B′ .

Remark (Notation). At this point some simplifications in notation are useful. When it is clear from
the context that we are working in a frame K = (O, B), we will simply write v instead of [v]B . We
identify points with column matrices. For example, we write x to mean the column matrix with
entries (x1 , . . . , xn ). This will represent both the coordinates of a point and the components of the
position vector.
Moreover, when there is no risk of confusion, we denote the linear map induced by an affine
transformation with the same letter, i.e. we write ψ instead of lin(ψ). The arguments are either
points or vectors, and the context will make it clear which one it is.
Example 7.3. Translation maps are isometries. Let v be a vector then, the translation with the vector
v is the map Tv : En → En given by Tv (x) = x + v. It is just the translation map of the affine structure
of En where we fixed the vector argument to be v.


Proposition 7.4. Let φ ∈ AGL(En ) be an affine transformation given by φ(x) = Ax + b with respect to
some orthonormal frame. The following are equivalent:
1. φ is an isometry
2. φ preserves lengths of vectors
3. φ preserves the scalar product, i.e. for any vectors v, w we have
⟨φ(v), φ(w)⟩ = ⟨v, w⟩.
Proof. The length of a vector \overrightarrow{AB} is by definition the length of the segment [AB] which is the distance
between A and B, so 1. is equivalent to 2. If φ is an isometry, it is an affine map, by Proposition 7.2.
In particular it preserves oriented ratios by Proposition 6.5. From this it follows that it preserves the
cosine of angles. Then
⟨φ(v), φ(w)⟩ = |φ(v)| · |φ(w)| · cos ∡(φ(v), φ(w)) = |v| · |w| · cos ∡(v, w) = ⟨v, w⟩.
For the converse, let v = \overrightarrow{AB} and notice that if φ preserves the scalar product then

d(φ(A), φ(B)) = |φ(\overrightarrow{AB})| = √⟨φ(\overrightarrow{AB}), φ(\overrightarrow{AB})⟩ = √⟨\overrightarrow{AB}, \overrightarrow{AB}⟩ = |\overrightarrow{AB}| = d(A, B).
This shows that 3. implies 1. and 2.
Proposition 7.5. Let φ ∈ AGL(En ) be an affine transformation given by φ(x) = Ax + b with respect to
some orthonormal frame. The following are equivalent:
1. φ is an isometry
2. A−1 = AT .
Proof. Consider ψ(x) = φ(x) − b = Ax. Since translations are isometries, φ is an isometry if and only if
ψ is an isometry. Let K = (O, B) with B = (e1 , . . . , en ) be the orthonormal frame with respect to which
φ is given in the form φ(x) = Ax + b. Let f1 = ψ(e1 ) = Ae1 , f2 = ψ(e2 ) = Ae2 , . . . , fn = ψ(en ) = Aen .
To see that 1. implies 2., notice that if ψ is an isometry, by Proposition 7.4 we have
⟨f_i, f_j⟩ = ⟨ψ(e_i), ψ(e_j)⟩ = ⟨e_i, e_j⟩ = 1 if i = j, and 0 if i ≠ j.
Since the components of fi = Aei are the entries in the i-th column of the matrix A, it follows that
AT A is the identity matrix In .
To see that 2. implies 1. consider two arbitrary points x and y in En . We have |φ(x) − φ(y)| =
|(Ax + b) − (Ay + b)| = |A(x − y)| and therefore
|φ(x) − φ(y)|2 = |A(x − y)|2
= ⟨A(x − y), A(x − y)⟩
= (A(x − y))T · A(x − y)
= (x − y)T AT A(x − y)
= (x − y)T (x − y)
= |x − y|2
which is equivalent to d(φ(x), φ(y)) = d(x, y).
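Proposition 7.5 is easy to test numerically; the sketch below (Python/NumPy, not part of the text; the angle, translation and test points are arbitrary) checks that a rotation matrix satisfies AᵀA = I and that the resulting affine map preserves distances:

```python
import numpy as np

th = 0.7
A = np.array([[np.cos(th), -np.sin(th)],
              [np.sin(th),  np.cos(th)]])
b = np.array([5.0, -1.0])
phi = lambda u: A @ u + b                # an affine map phi(x) = A x + b

print(np.allclose(A.T @ A, np.eye(2)))   # A is orthogonal: A^T A = I -> True
x, y = np.array([1.0, 2.0]), np.array([-3.0, 0.5])
print(np.isclose(np.linalg.norm(phi(x) - phi(y)),
                 np.linalg.norm(x - y)))  # distances are preserved -> True
```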


Definition 7.6. A matrix A ∈ Matn×n (R) such that AT A = In is called orthogonal and the set of all such
matrices is denoted by O(n).
Proposition 7.7. If A is an orthogonal matrix then det(A) = ±1.
Proof. Since AT A = In we have 1 = det(In ) = det(AT A) = det(AT ) det(A) = det(A)2 .

Definition 7.8. The set of matrices in O(n) with determinant 1 is denoted by SO(n). Such matrices
are called special orthogonal. The set O(n) is a subgroup of AGL(An ) and SO(n) is a normal subgroup
of O(n). With group theory notation we write this as follows:
SO(n) ⊴ O(n) ≤ AGL(Rn ).
Definition 7.9. Let φ be an isometry of En given by φ(x) = Ax + b with respect to some right-oriented
orthonormal frame. Then φ is called a displacement, or a direct isometry, if A ∈ SO(n). Else, if det(A) =
−1, the map φ is called an indirect isometry.

7.1.1 The Euclidean space En (second revision)


In Section 4.2.1 we have revised En to be Rn with the standard scalar product. This point of view
encapsulates all aspects of the Hilbert’s axioms (Appendix A) but doesn’t immediately give a trans-
parent description of the congruence relation. This description can be done now: a segment [AB] is
congruent to a segment [CD] if and only if there is an isometry φ such that φ(A) = C and φ(B) = D.
This shows that the notion of congruence (in dimension n) is completely described by translations
and by the orthogonal group O(n).
It is tautological to say that congruence is described by isometries. The point made here is that
isometries can be classified and described precisely by focusing the analysis on O(n). We do this for
dimension 2 and 3 in the following two sections.

7.2 Isometries in dimension 2


7.2.1 Rotations
Proposition 7.10. A matrix A is in SO(2) if and only if A equals
R_θ = [ cos(θ)  −sin(θ) ]
      [ sin(θ)   cos(θ) ]        (7.1)
for some θ ∈ R.
Proof. It is easy to check that Rθ is special orthogonal. For the converse consider the matrix
A = [ a  b ]
    [ c  d ] ∈ SO(2).

We have det(A) = ad − bc = 1 and

A^T A = [ a² + c²   ab + cd ]   [ 1  0 ]
        [ ab + cd   b² + d² ] = [ 0  1 ] .

102
Geometry - first year [email protected]

Thus, the entries of the matrix A satisfy the system

  a² + c² = 1
  b² + d² = 1
  ab + cd = 0
  ad − bc = 1.

From the first two equations it follows that there exist θ, θ̃ such that a = cos(θ), c = sin(θ), d = cos(θ̃)
and b = sin(θ̃). Then, the last two equations become

  0 = ab + cd = cos(θ) sin(θ̃) + sin(θ) cos(θ̃) = sin(θ + θ̃)
  1 = ad − bc = cos(θ) cos(θ̃) − sin(θ) sin(θ̃) = cos(θ + θ̃)

and therefore θ̃ = −θ + 2kπ for some integer k, i.e. b = −sin(θ) and d = cos(θ), so A = R_θ.

Corollary 7.11. A direct isometry φ of E2 that fixes a point is either the identity or a rotation. More-
over, the angle θ of the rotation is such that

cos(θ) = tr(lin(φ))/2.

Proof. Let φ(x) = Ax + b with respect to a right oriented orthonormal frame. By Proposition 7.10, it
is a direct isometry if A ∈ SO(2) which is equivalent to A = Rθ where Rθ is the rotation matrix (7.1).
If θ = 0 then φ is a (possibly trivial) translation by b thus, it has a fixed point only if b = 0 in which
case φ equals the identity map, i.e. φ = Id. For the rest of the proof assume that θ ≠ 0.
Let p be a fixed point for φ. Then

φ(p) = p ⇔ Rθ p + b = p ⇔ b = (I2 − Rθ )p. (7.2)

Now notice that


det(I_2 − R_θ) = det [ 1 − cos(θ)     sin(θ)   ]
                    [  −sin(θ)    1 − cos(θ)  ] = (1 − cos(θ))² + sin(θ)² = 2 − 2 cos(θ).

So, if θ ≠ 0 then equation (7.2) has a unique solution p, i.e. the fixed point of φ is unique. To finish
the proof we show that φ is a rotation around the point p. We notice that

φ(x) − φ(p) = φ(x) − p = Rθ x + b − p = Rθ x + (I2 − Rθ )p − p = Rθ (x − p)

which means that φ rotates \overrightarrow{px} by θ around p. Finally, the trace of the linear map lin(φ) is the trace
of its matrix with respect to any orthonormal basis. Thus, we may use A = Rθ to conclude that

tr(lin(φ)) = tr(Rθ ) = 2 cos(θ).
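Both claims of Corollary 7.11, the unique fixed point from (7.2) and the trace formula, can be verified numerically; the sketch below (Python/NumPy, with an arbitrary angle and translation, not from the text) solves b = (I − R_θ)p and checks the fixed point:

```python
import numpy as np

th = 1.2
R = np.array([[np.cos(th), -np.sin(th)],
              [np.sin(th),  np.cos(th)]])
b = np.array([3.0, -2.0])

p = np.linalg.solve(np.eye(2) - R, b)   # unique solution of (7.2): b = (I - R)p
print(np.allclose(R @ p + b, p))        # p is a fixed point of phi -> True
print(np.isclose(np.cos(th), np.trace(R) / 2))  # cos(theta) = tr(lin(phi))/2 -> True
```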


Remark. Let us notice the effect of rotations on coordinates and on basis vectors in dimension 2. Let
K = (O, B) be a right-oriented orthonormal frame of E2 with B = (i, j). A rotation around the origin
with angle θ is given by the map
[ x ]    [ cos θ  −sin θ ] [ x ]   [ x′ ]
[ y ] ↦ [ sin θ   cos θ ] [ y ] = [ y′ ] ,   the matrix being Rot_θ.

Thus, Rot_θ is a base change matrix M_{B′,B} where B′ = (i′, j′) is another orthonormal basis. The
components of i′ and j′ with respect to B are the columns of the matrix M_{B′,B}⁻¹ = M_{B′,B}^T :

i′ = [  cos θ ]      j′ = [ sin θ ]
     [ −sin θ ] ,         [ cos θ ] .

We notice that the vectors in B ′ are obtained by rotating the vectors in B by −θ. Indeed rotating
points counterclockwise with respect to K is equivalent to rotating K clockwise.

7.2.2 Classification
Theorem 7.12 (Chasles). A direct isometry of the plane E2 is either

a) the identity, or

b) a translation, or

c) a rotation.

Proof. Consider a direct isometry φ(x) = Ax + b given with respect to a right-oriented orthonormal
frame. By Proposition 7.10, A = R_θ where R_θ is the rotation matrix (7.1) for some θ. If θ = 0 then φ is
the identity map if b = 0 and it is a translation if b ≠ 0. Suppose therefore that θ ≠ 0. Then, by the
proof of Corollary 7.11 the map φ has a (unique) fixed point p. Thus, φ is a direct isometry which
fixes a point and the proof is finished by applying Corollary 7.11.

Definition 7.13. A glide reflection in the line ℓ is the composition of an orthogonal reflection in ℓ and
a translation in the direction of ℓ.

Example 7.14. A reflection in the x-axis followed by a translation with λi is a glide reflection. In
matrix form the map is given by
x ↦ Ref⊥_{Ox}(x) + λi = [ 1   0 ] [ x_1 ]   [ λ ]
                         [ 0  −1 ] [ x_2 ] + [ 0 ] .

Lemma 7.15. A rotation with angle θ around the origin after a reflection in the x-axis is a reflection
in the line y = tan(θ/2)x.


Proof. The mentioned map, the reflection in the x-axis followed by a rotation about the origin, has the
form φ : x ↦ Ax where

A = [ cos(θ)  −sin(θ) ] [ 1   0 ]   [ cos(θ)   sin(θ) ]
    [ sin(θ)   cos(θ) ] [ 0  −1 ] = [ sin(θ)  −cos(θ) ] .

Notice that det(A − I_2) = 0 thus φ fixes at least a line. Consider the vector v(cos(θ/2), sin(θ/2)). A
calculation shows that (A − I_2)v = 0 which means that v is fixed by A. Hence the line passing through the
origin in the direction of v is fixed by φ. To see that it is a reflection, check that J(v) is mapped to
−J(v).
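Lemma 7.15 can be confirmed numerically; the sketch below (Python/NumPy, with an arbitrary angle, not from the text) checks that the composed matrix fixes v(cos(θ/2), sin(θ/2)) and flips the orthogonal direction:

```python
import numpy as np

th = 0.9
A = np.array([[np.cos(th),  np.sin(th)],
              [np.sin(th), -np.cos(th)]])     # Rot_theta composed with Ref in x-axis
v  = np.array([np.cos(th/2), np.sin(th/2)])   # direction of the line y = tan(th/2) x
Jv = np.array([-np.sin(th/2), np.cos(th/2)])  # v rotated by pi/2

print(np.allclose(A @ v, v))     # the line is fixed pointwise -> True
print(np.allclose(A @ Jv, -Jv))  # the orthogonal direction is flipped -> True
```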

Theorem 7.16. An indirect isometry of the plane E2 fixes a line ℓ and is either

a) a reflection in ℓ, or

b) a glide-reflection in ℓ.

Proof. Let φ(x) = Ax + b be an indirect isometry given with respect to a right-oriented orthonormal
basis. Consider the reflection ψ(x) = Ref⊥ Ox (x) = Cx, which is also an indirect isometry. Since (φ ◦
ψ)(x) = ACx+b and since det(A) = det(C) = −1 we see that AC is an orthogonal matrix of determinant
1, i.e. φ ◦ ψ is a direct isometry. Then by Theorem 7.12, we have 3 cases.
If φ ◦ ψ is the identity or a translation, then AC = I_2 which implies A = C⁻¹. Since C is the matrix
of a reflection we have C² = I_2, hence A = C. Therefore φ is the composition of Ref⊥_{Ox} followed by a
translation with b(b1 , b2 ). Then
φ(x) = ψ′(x) + b_1 i   where   ψ′(x) = [ 1   0 ] x + b_2 j
                                        [ 0  −1 ]

and one checks (for example with (6.7)) that ψ ′ is a reflection in the line y = b2 /2. Hence φ is the
composition of a reflection and a (possibly trivial) translation along the reflection axis. Thus, if b_1 = 0
it is a reflection and if b_1 ≠ 0 it is a glide reflection.
If φ◦ψ is a rotation then AC = Rθ for some θ. In this case we have A = Rθ C. By Lemma 7.15, φ is a
reflection in a line ℓ followed by a translation and the claim follows as in the previous paragraph.

7.3 Isometries in dimension 3


7.3.1 Rotations
Theorem 7.17 (Euler). A direct isometry φ of E3 that fixes a point is either the identity or a rotation
around an axis that passes through that point. Moreover, the angle θ of the rotation is such that

cos(θ) = (tr(lin(φ)) − 1)/2.


Proof. By choosing the fixed point of φ to be the origin, we may assume that φ has the form φ(x) = Ax
with respect to a right-oriented orthonormal frame. Since φ is a direct isometry, we have A ∈ SO(3).
A rotation around an axis fixes the rotation axis. To see that this is the case for φ, it suffices to show
that A has an eigenvector v for the eigenvalue 1 since then

φ(tv) = A(tv) = t(Av) = tv ∀t ∈ R

which means that φ fixes the line passing through the origin in the direction of v. Notice that

det(A − I3 ) = det(AT ) det(A − I3 ) since det(A) = 1


= det(AT (A − I3 ))
= det(AT A − AT )
= det(I3 − AT ) since A is orthogonal
= det((I3 − A)T )
= det(I3 − A)

and since A − I3 is a 3 × 3 matrix we have det(I3 − A) = − det(A − I3 ). Thus

det(A − I3 ) = − det(A − I3 ) ⇒ det(A − I3 ) = 0.

Thus, A admits 1 as eigenvalue. Let v be a corresponding eigenvector of length 1.


Next we show that φ is a rotation around the axis Rv. Choose a unit vector u1 ⊥ v and let
u2 = v × u1 . Then Au1 and Au2 are also orthogonal to v since

⟨Aui , v⟩ = ⟨Aui , Av⟩ = ⟨ui , v⟩ = 0.

In fact (u1 , u2 ) is a basis for v⊥ , the orthogonal complement to v. Since φ is an isometry it maps v⊥
to itself. In particular, restricting φ to v⊥ we have an isometry in dimension 2. Thus, with respect to
the basis B = (u_1, u_2, v) we have

A = [ B  0 ]
    [ 0  1 ]
where B is a 2 × 2 matrix. Since B is right-oriented and φ is a direct isometry, we have det(A) = 1.
Therefore det(B) = 1. Hence φ restricts to a direct isometry on v⊥ . Since it fixes the origin, it cannot
be a translation. Therefore, by Theorem 7.12, it must be a (possibly trivial) rotation. Then, the matrix
of φ with respect to B is
 
             [ cos(θ)  −sin(θ)  0 ]
[lin(φ)]_B = [ sin(θ)   cos(θ)  0 ]
             [   0        0     1 ]

for some θ ∈ R. This is a rotation around the axis Rv. Finally, the trace of the linear map lin(φ) is the
trace of the matrix [lin(φ)]B . Thus

tr(lin(φ)) = tr([lin(φ)]_B) = 2 cos(θ) + 1.
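The two computational facts in Euler's theorem, that 1 is an eigenvalue of A ∈ SO(3) and the trace formula for the angle, can be checked numerically; the following sketch (Python/NumPy, with an arbitrarily composed rotation, not from the text) does both:

```python
import numpy as np

def rx(t):
    return np.array([[1, 0, 0],
                     [0, np.cos(t), -np.sin(t)],
                     [0, np.sin(t),  np.cos(t)]])

def rz(t):
    return np.array([[np.cos(t), -np.sin(t), 0],
                     [np.sin(t),  np.cos(t), 0],
                     [0, 0, 1]])

A = rz(0.4) @ rx(1.1) @ rz(-0.3)                    # some matrix in SO(3)
print(np.isclose(np.linalg.det(A - np.eye(3)), 0))  # 1 is an eigenvalue -> True

w, V = np.linalg.eig(A)
v = np.real(V[:, np.argmin(np.abs(w - 1))])         # axis: eigenvector for eigenvalue 1
print(np.allclose(A @ v, v))                        # the axis is fixed -> True

# cos(theta) = (tr(lin phi) - 1)/2, checked on a rotation with known angle:
print(np.isclose((np.trace(rz(0.8)) - 1) / 2, np.cos(0.8)))  # True
```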


Proposition 7.18 (Euler-Rodrigues). Let v be a unit vector and θ ∈ R. The rotation of angle θ and
axis Rv is given by
Rotv,θ (x) = cos(θ)x + sin(θ)(v × x) + (1 − cos(θ))⟨v, x⟩v. (7.3)

Proof. Let x = x∥ + x⊥ where x∥ is parallel to v and x⊥ is orthogonal to v. Since v is a unit vector, we


know that
x∥ = ⟨v, x⟩v and x⊥ = x − x∥ .
Let x′⊥ = v × x and notice that the vector x′⊥ is orthogonal to x∥ and x⊥ and has the same length as x⊥
since
|x′⊥ |2 = ⟨v × x, v × x⟩
= ⟨v, v⟩⟨x, x⟩ − ⟨v, x⟩⟨v, x⟩
= |x|2 − |x∥ |2
= |x⊥ |2
where for the second equality we used Lagrange's identity. Therefore, since Rot_{v,θ} rotates x⊥ by θ
around the axis Rv, we have

Rotv,θ (x) = x∥ + cos(θ)x⊥ + sin(θ)x′⊥ .

Replacing x∥ , x⊥ and x′⊥ with their expressions in terms of x we obtain (7.3).
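Formula (7.3) can be verified against an ordinary rotation matrix; this sketch (Python/NumPy, illustrative only; the axis, angle and test vector are ours) checks it for the z-axis, where the rotation is Rot_z,θ:

```python
import numpy as np

def rot_axis(v, th, x):
    """Euler-Rodrigues formula (7.3) for a unit vector v:
    cos(th) x + sin(th) (v x x) + (1 - cos(th)) <v, x> v."""
    v, x = np.asarray(v, float), np.asarray(x, float)
    return np.cos(th) * x + np.sin(th) * np.cross(v, x) \
        + (1 - np.cos(th)) * np.dot(v, x) * v

th, x = 0.6, np.array([1.0, 2.0, 3.0])
Rz = np.array([[np.cos(th), -np.sin(th), 0],
               [np.sin(th),  np.cos(th), 0],
               [0, 0, 1]])
print(np.allclose(rot_axis([0, 0, 1], th, x), Rz @ x))  # matches Rot_z,theta -> True
```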

7.3.2 Euler angles


Fix a right oriented orthonormal frame K = (O, B). Rotations around the coordinate axes are given
by the following matrices:
     
1 0 0   cos θ 0 sin θ  cos θ − sin θ 0
Rotx,θ = 0 cos θ − sin θ  , Roty,θ =  0 1 0  , Rotz,θ =  sin θ cos θ 0 .
     
0 sin θ cos θ − sin θ 0 cos θ 0 0 1
     

Consider the composition of the following three rotations Rot_z,γ Rot_x,β Rot_z,α =

[ cos γ cos α − sin γ cos β sin α    −cos γ sin α − sin γ cos β cos α     sin γ sin β ]
[ sin γ cos α + cos γ cos β sin α    −sin γ sin α + cos γ cos β cos α    −cos γ sin β ]
[           sin β sin α                         sin β cos α                  cos β    ] .

You may think of the composition of these three rotations as follows: Each of the three rotations is
a base change matrix. The first rotation, Rotz,α = MK′ ,K , rotates the versor of the x-axis and that of the
y-axis by −α (and therefore rotates points with respect to K by α). The next rotation Rotx,β = MK′′ ,K′ ,
rotates the versors of the y ′ -axis and that of the z′ -axis by −β (and therefore rotates points with
respect to K′ by β). Similarly with the last rotation. The observation here is that Rotx,β is a rotation
around the current x-axis, i.e. a rotation around the first axis of the coordinate system that you
are in. If we want to point out that at each step the coordinate system is changing we may write
Rotz′′ ,γ Rotx′ ,β Rotz,α for the overall rotation.
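The product matrix above can be checked numerically; the sketch below (Python/NumPy, with arbitrary angles, not from the text) multiplies the three elementary rotations, compares the third row with the closed form, and confirms that the composition stays in SO(3):

```python
import numpy as np

def rx(t):
    return np.array([[1, 0, 0],
                     [0, np.cos(t), -np.sin(t)],
                     [0, np.sin(t),  np.cos(t)]])

def rz(t):
    return np.array([[np.cos(t), -np.sin(t), 0],
                     [np.sin(t),  np.cos(t), 0],
                     [0, 0, 1]])

a, b, g = 0.3, 0.7, -0.5
M = rz(g) @ rx(b) @ rz(a)          # Rot_z,gamma Rot_x,beta Rot_z,alpha
# Third row of the product: (sin b sin a, sin b cos a, cos b).
print(np.allclose(M[2], [np.sin(b)*np.sin(a), np.sin(b)*np.cos(a), np.cos(b)]))  # True
print(np.allclose(M.T @ M, np.eye(3)) and np.isclose(np.linalg.det(M), 1))  # M in SO(3) -> True
```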


Figure 7.1: Euler angles1

Proposition 7.19. All rotations in dimension 3 are of this form, i.e. any matrix in SO(3) can be
written in this form for some α, β, γ ∈ R.

Definition 7.20. You may restrict the range of the values α, β, γ by α ∈ [0, 2π[, β ∈ [0, π[ and γ ∈
[0, 2π[. Then, each triple (α, β, γ) corresponds to a unique rotation Rotz,γ Rotx,β Rotz,α ∈ SO(3). The
angles α, β and γ are called Euler angles. Another way of describing rotations in E3 is via quaternions
(see Appendix K).

Remark. Euler angles give coordinates on SO(3). Not to be confused with spherical coordinates (See
Appendix D).

7.3.3 Classification
Definition 7.21. A glide-rotation (or helical displacement) is the composition of a rotation in E3 with
a translation parallel to the rotation axis.

Theorem 7.22 (Chasles). A direct isometry of the Euclidean space E3 is either

a) the identity, or

b) a translation, or

c) a rotation, or

d) a glide-rotation.

Proof. Let φ be the direct isometry given by φ(x) = Ax + b with respect to a right-oriented frame. We
know that A ∈ SO(3). If A = I3 , then φ is a translation or the identity. Suppose that A , I3 . Applying
Theorem 7.17 to x 7→ Ax we see that A is a rotation matrix. Let v be a direction vector of length 1 for
the rotation axis of A and decompose the vector b into its components parallel to v and orthogonal
to v:
b = b1 + b2 where b1 ∥v and b2 ⊥ v.
1 Image source: Wikipedia


We have b1 = ⟨v, b⟩v and b2 = (v × b) × v. Now consider the isometries

φ1 (x) = x + b1 and φ2 (x) = Ax + b2 .

Let π be the plane passing through the origin and orthogonal to the rotation axis. Then φ₂ is an
isometry which leaves π invariant (φ₂(π) = π). By Chasles' Theorem in dimension 2 (Theorem 7.12),
the restriction of φ2 to π is a rotation around a fixed point p ∈ π. Thus, φ2 is a rotation around the
axis p + Rv. On the other hand, φ1 is a translation by b1 parallel to the rotation axis. This finishes
the proof since φ = φ1 ◦ φ2 .

Theorem 7.23. An indirect isometry of the Euclidean space E3 fixes a plane π and is either

a) a reflection in π, or

b) the composition of a reflection in π with a translation parallel to π, in which case it is called a


glide-reflection, or

c) the composition of a reflection in π with a rotation of axis orthogonal to π, in which case it is


called a rotation-reflection.

Proof. Let φ be the indirect isometry given by φ(x) = Ax + b with respect to a right-oriented frame. One
can show that a matrix A ∈ O(3) of determinant −1 admits −1 as an eigenvalue. Let v be an eigen-
vector for the eigenvalue −1, i.e. Av = −v. Notice that we also have AT v = A−1 v = −v. Calculating we
obtain
⟨v, φ(x)⟩ = ⟨v, Ax + b⟩
= ⟨v, Ax⟩ + ⟨v, b⟩
= ⟨AT v, x⟩ + ⟨v, b⟩
= ⟨−v, x⟩ + ⟨v, b⟩.

Thus, the plane

π : ⟨v, x⟩ = (1/2) ⟨v, b⟩
is invariant under the isometry φ. Moreover, if we choose the frame with the first basis vectors
parallel to π then

A = [ B   0 ]
    [ 0  −1 ]

and det(A) = −det(B), hence det(B) = 1. Thus, the restriction of φ to the plane π is a direct isometry. Therefore, by
Theorem 7.12, it is either the identity, or a translation or a rotation, which correspond to the three
cases stated in the theorem.

7.4 Moving points with isometries


7.4.1 Cycloids
Here we restrict to the Euclidean plane E2 . Rotation matrices give an effective way of describing
circular motions which can be used to construct curves as trajectories of a particle. Here we look


at some cycloids. For this, let us first deduce the homogeneous matrix of a rotation around a point
C(c1 , c2 ) ∈ E2 .
       
[ 1  0  c_1 ]   [ cos(θ)  −sin(θ)  0 ]   [ 1  0  −c_1 ]   [ cos(θ)  −sin(θ)  −c_1 cos(θ) + c_2 sin(θ) + c_1 ]
[ 0  1  c_2 ] · [ sin(θ)   cos(θ)  0 ] · [ 0  1  −c_2 ] = [ sin(θ)   cos(θ)  −c_1 sin(θ) − c_2 cos(θ) + c_2 ]
[ 0  0   1  ]   [   0        0     1 ]   [ 0  0    1  ]   [   0        0                     1              ]

Now, if you choose the center C to be the point (0, 1), you have the following homogeneous rotation
matrix

[ cos(θ)  −sin(θ)      sin(θ)    ]
[ sin(θ)   cos(θ)   −cos(θ) + 1  ]
[   0        0           1       ]
If you move the origin with this rotation you obtain
    
[ cos(θ)  −sin(θ)      sin(θ)    ] [ 0 ]   [    sin(θ)    ]
[ sin(θ)   cos(θ)   −cos(θ) + 1  ] [ 0 ] = [ −cos(θ) + 1  ]
[   0        0           1       ] [ 1 ]   [      1       ]

since the homogeneous coordinates of the origin are (0, 0, 1). If you ‘vary θ with time t’ you are
rotating the origin around C counterclockwise. If you want to have a clockwise motion you just
change the sign of the angle to obtain
   
[ 0 ]    [   −sin(t)   ]
[ 0 ] ↦ [ −cos(t) + 1 ] .
[ 1 ]    [      1      ]

Now, if at the same time t you translate the point along the x-axis in the direction of i, you get
      
[ 0 ]    [ 1  0  t ] [   −sin(t)   ]   [ −sin(t) + t ]
[ 0 ] ↦ [ 0  1  0 ] [ −cos(t) + 1 ] = [ −cos(t) + 1 ] .
[ 1 ]    [ 0  0  1 ] [      1      ]   [      1      ]

What you obtain is the trajectory of the point O as it rotates on the blue circle while the circle is
moving like a wheel on the x-axis. The corresponding curve is called a cycloid:
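The resulting parametrization (t − sin t, 1 − cos t) can be probed at a few times; this sketch (Python/NumPy, illustrative only; the function name is ours) checks the start, the top of an arch, and the return to the ground after one full turn:

```python
import numpy as np

def cycloid(t):
    """Position (t - sin t, 1 - cos t) derived above: the origin glued to a
    unit wheel rolling on the x-axis."""
    return np.array([t - np.sin(t), 1 - np.cos(t)])

print(cycloid(0.0))        # start -> [0. 0.]
print(cycloid(np.pi))      # top of the arch, height 2 (x close to pi)
print(cycloid(2 * np.pi))  # wheel back on the ground (y close to 0)
```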

Let’s do something else. Instead of rotating the circle on the x-axis let us rotate it on a bigger
circle centered at the origin. We can think of the small circle as the trajectory of the point P(3, 0)
rotated around the center C(4, 0). The corresponding rotation matrix is
   
[ cos(θ)  −sin(θ)  −x_C cos(θ) + y_C sin(θ) + x_C ]   [ cos(θ)  −sin(θ)  −4 cos(θ) + 4 ]
[ sin(θ)   cos(θ)  −x_C sin(θ) − y_C cos(θ) + y_C ] = [ sin(θ)   cos(θ)    −4 sin(θ)   ]
[   0        0                     1              ]   [   0        0            1      ]


and thus, rotating P(3, 0) with time t we obtain

[ cos(t)  −sin(t)  −4 cos(t) + 4 ] [ 3 ]   [ 3 cos(t) − 4 cos(t) + 4 ]
[ sin(t)   cos(t)    −4 sin(t)   ] [ 0 ] = [   3 sin(t) − 4 sin(t)   ]
[   0        0            1      ] [ 1 ]   [            1            ]
   

If at the same time we rotate around the origin, P will move on the small circle which rotates around
a big circle of radius 3 centered at the origin:

[ cos(t′)  −sin(t′)  0 ] [ 3 cos(t) − 4 cos(t) + 4 ]
[ sin(t′)   cos(t′)  0 ] [   3 sin(t) − 4 sin(t)   ] = . . .
[    0         0     1 ] [            1            ]

If we do this simultaneously, i.e. if we choose t ′ = t then we obtain the following trajectory for P :

However, if we want the small circle to rotate like a wheel on the big circle, then, after an entire
revolution of the small circle we need to have traversed the length of this circle on the big circle, i.e.
2π. But the big circle is 3 times longer, so we need to choose t = 3t ′ , i.e. the rotation on the small
circle is 3-times faster:

This is an example of an epicycloid.
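The rolling construction can be sketched numerically; the code below (Python/NumPy, not from the text; the function name and the closed-form comparison are ours) rotates P(3, 0) around C(4, 0) three times as fast as it rotates around the origin, and checks the result against the standard closed form 4cos s − cos 4s, 4sin s − sin 4s of this epicycloid, which follows by expanding the product:

```python
import numpy as np

def epicycloid(s):
    """P(3,0) rotated by 3s around C(4,0), then by s around the origin
    (t = 3t' as above)."""
    p = np.array([4 - np.cos(3 * s), -np.sin(3 * s)])  # position on the small circle
    R = np.array([[np.cos(s), -np.sin(s)],
                  [np.sin(s),  np.cos(s)]])            # rotation around the origin
    return R @ p

print(np.allclose(epicycloid(0.0), [3, 0]))  # starting point P(3, 0) -> True
s = 0.4
print(np.allclose(epicycloid(s),
                  [4*np.cos(s) - np.cos(4*s),
                   4*np.sin(s) - np.sin(4*s)]))        # closed form -> True
```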


7.4.2 Surfaces of revolution


Another way of looking at Euler angles Rot_z,γ Rot_x,β Rot_z,α (see Section 7.3.2) is by considering the
trajectory of a given point when you vary α, β and γ. For instance

                        [ 1 ]   [ cos γ cos α − sin γ cos β sin α ]
Rot_z,γ Rot_x,β Rot_z,α [ 0 ] = [ sin γ cos α + cos γ cos β sin α ] .
                        [ 0 ]   [           sin β sin α           ]

If in this expression you fix γ = 0 and let α ∈ [0, 2π[, β ∈ [0, π[ vary, you obtain
   
1  cos α 
0 7→ cos β sin α  ,
   
0 sin β sin α
   

i.e. the trajectory of the point (1, 0, 0) is a sphere and varying α and β corresponds to the map
 
                            [    cos α    ]
[0, 2π[ × [0, π[ ∋ (α, β) ↦ [ cos β sin α ] ,        (7.4)
                            [ sin β sin α ]

which is a parametrization of the sphere. You can get other parametrizations of the sphere if you fix
α and let β and γ vary.
From a different perspective, notice that a plane in E3 can be described as the set of points which
you touch if you translate a line in a given direction. This can be seen with the parametric equations:
             
x xQ  vx  wx   xQ  vx   wx 
π : y  = yQ  + s vy  + t wy  = yQ  + s vy  +t wy  (7.5)
             
z zQ vz wz zQ vz wz
             
| {z } | {z }
line ℓ translation with tw

This is a general method of constructing surfaces starting from curves: you start with a curve in E3
and apply a motion to it. What you obtain, if non-degenerate, is a parametrization of a surface. This
can be exemplified with the parametrization of the sphere in (7.4) which you can rewrite as follows
     
1 0 0  cos α   cos α 
(α, β) 7→ 0 cos β − sin β   sin α  = cos β sin α  .
     
 
0 sin β cos β 0 sin β sin α
     
| {z } | {z }
rotation Rotx,β circle in Oxy-plane

This describes the unit sphere centered at the origin as the set of points which you touch with the
unit circle centered at the origin in the Oxy-plane if you rotate the circle around the x-axis.
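This rotate-a-circle description is easy to test numerically; the sketch below (Python/NumPy, not from the text; the function name and sample angles are ours) applies Rot_x,β to a point of the unit circle and checks that the result lies on the unit sphere and matches (7.4):

```python
import numpy as np

def sphere_point(alpha, beta):
    """Rot_x,beta applied to the circle point (cos a, sin a, 0), as in the
    rewritten form of (7.4)."""
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(beta), -np.sin(beta)],
                   [0, np.sin(beta),  np.cos(beta)]])
    return Rx @ np.array([np.cos(alpha), np.sin(alpha), 0.0])

p = sphere_point(1.1, 2.3)
print(np.isclose(p @ p, 1.0))  # the point lies on the unit sphere -> True
print(np.allclose(sphere_point(0.7, 0.2),
                  [np.cos(0.7), np.cos(0.2)*np.sin(0.7),
                   np.sin(0.2)*np.sin(0.7)]))  # agrees with (7.4) -> True
```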


Definition 7.24. A surface of revolution in E3 is a surface obtained by rotating a curve around a line
ℓ. The line ℓ is called the axis of the surface.

Example 7.25. In (7.5), instead of translating the line ℓ we can rotate it around a line which is parallel
to ℓ. In this way we obtain a cylinder. For example, if
     
x 1 0
ℓ : y  = 0 + s 0
     
z 0 1
     

and if we rotate around the z-axis we obtain a parametrization of a cylinder of radius 1 and axis the
z-axis:

           [ cos θ  −sin θ  0 ] ( [ 1 ]     [ 0 ] )   [ cos θ ]
(θ, s) ↦  [ sin θ   cos θ  0 ] ( [ 0 ] + s [ 0 ] ) = [ sin θ ]
           [   0       0    1 ] ( [ 0 ]     [ 1 ] )   [   s   ]

Example 7.26. If instead we consider two skew lines and rotate one around the other, we obtain


hyperboloids of revolution. For example, if


     
x 1 0
ℓ : y  = 0 + s 1
     
z 0 1
     

and if we rotate around the z-axis we obtain a parametrization of a hyperboloid


       
           [ cos θ  −sin θ  0 ] ( [ 1 ]     [ 0 ] )   [ cos θ − s sin θ ]
(θ, s) ↦  [ sin θ   cos θ  0 ] ( [ 0 ] + s [ 1 ] ) = [ sin θ + s cos θ ] .
           [   0       0    1 ] ( [ 0 ]     [ 1 ] )   [        s        ]

CHAPTER 8

Curves and surfaces

Contents
8.1 Smooth curves in En . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
8.1.1 Kinematic versus geometric properties . . . . . . . . . . . . . . . . . . . . . . . 119
8.1.2 Area enclosed by a simple planar loop . . . . . . . . . . . . . . . . . . . . . . . 121
8.2 Smooth hypersurfaces in En . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8.2.1 Level sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8.2.2 Local parametrization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.2.3 Volume enclosed by a simple surface of revolution . . . . . . . . . . . . . . . . 126


8.1 Smooth curves in En


Definition 8.1. A parametrized curve of the Euclidean space En is a map
γ : I → En
where I is a segment, a ray or a line. Fixing a unit segment and an orientation, we may view I as an
interval of R. Then, with respect to an orthonormal coordinate system K of En we have

[γ]_K : R ⊃ I → Rⁿ,   [γ(t)]_K = (γ₁(t), . . . , γₙ(t))ᵀ,
where γ1 , . . . , γn : I → R. When there is no risk of confusion, we write γ(t) instead of [γ(t)]K . The
parametrized curve γ is called smooth if γ1 , . . . , γn are smooth, meaning they are infinitely differen-
tiable (i.e., have derivatives of all orders). The parametrized curve γ is called piecewise smooth if it is
smooth except in a finite number of points.
Geometrically, curves are certain sets of points in a Euclidean space En . The curve C correspond-
ing to the parametrization γ is the image of the map γ
C = {γ(t) : t ∈ I}.
It may have many different parametrizations. We say that a curve C is smooth if it admits a smooth
parametrization. Similarly, a curve is called piecewise smooth if it admits a piecewise smooth
parametrization.

Example 8.2. The cardioid is a curve obtained as the trajectory of a point on a circle that is rolling
around a fixed circle of the same radius. Let a be the radius of the two circles. As in Section ?? we may
deduce a parametrization of such a curve using rotations. For this, let the fixed circle be centered at
the origin, let the rolling circle have initial center (2a, 0) and consider the trajectory of the contact
point (a, 0) of the two circles in the initial position. We obtain
γ(t) = Rot_t ∘ (T_{(2a,0)} ∘ Rot_t ∘ T_{−(2a,0)}) · (a, 0)ᵀ = (a + 2a(1 − cos t) cos t, 2a(1 − cos t) sin t)ᵀ,

where the outer Rot_t is the rotation around the fixed circle and T_{(2a,0)} ∘ Rot_t ∘ T_{−(2a,0)} is the rotation of the moving circle about its own center.


Notice that if we translate the reference frame to the right by (a, 0) the parametrization changes to
γ(t) = (2a(1 − cos t) cos t, 2a(1 − cos t) sin t)ᵀ    (8.1)

since the coordinates are modified by (−a, 0). We will take this to be the parametrization of our
cardioid.
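As a numerical cross-check (a Python sketch with a = 1, my illustration), the points of this parametrization satisfy the implicit equation (x² + y²)² + 4ax(x² + y²) − 4a²y² = 0 that reappears in Example 8.27:

```python
import math

a = 1.0

def gamma(t):
    # cardioid parametrization (8.1)
    r = 2*a*(1 - math.cos(t))
    return (r*math.cos(t), r*math.sin(t))

def f(x, y):
    # implicit equation of the cardioid (cf. Example 8.27)
    return (x*x + y*y)**2 + 4*a*x*(x*x + y*y) - 4*a*a*y*y

# every sampled point of the parametrized curve lies on the zero set of f
for k in range(12):
    x, y = gamma(2*math.pi*k/12)
    assert abs(f(x, y)) < 1e-10
```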

Figure 8.1: Cardioid (Image source: Wikipedia)

Example 8.3. The Archimedean spiral is defined as the trajectory of a point obtained by a rotation
with time t around a point O, followed by a homothety with factor |t| centered at the same point O.
More precisely, if we choose the center O to be the origin then a parametrization of this curve is given
by

γ(t) = (t cos t, t sin t)ᵀ.    (8.2)
The associated conical Archimedean spiral is obtained by simultaneously translating with time t in a
direction orthogonal to the spiral.

Figure 8.2: Archimedean spiral and associated conical spiral (Image source: Wikipedia)


Definition 8.4. Let C be a piecewise smooth curve and P a point on C. A line ℓ is called tangent at P
to C if the points on the curve are arbitrarily close to ℓ in a small enough neighbourhood of P . More
concretely, if for any ε > 0 there exists δ > 0 such that
for all Q ∈ C ∩ B(P , δ) we have d(Q, ℓ) < ε
where B(P , δ) is the open ball centered at P and of radius δ. If the line ℓ is tangent at the point P to
the curve C then it is called the tangent line to C at P and it is denoted by
TP C.

Proposition 8.5. Let C be a smooth curve with parametrization γ : I → En . Consider the point
P = γ(t₀) for some t₀ ∈ I. Then the velocity vector at γ(t₀),

γ̇(t₀) = (dγ₁/dt(t₀), . . . , dγₙ/dt(t₀))ᵀ,

is a direction vector for T_P C.
Example 8.6. For the cardioid parametrized with Equation (8.1), one calculates that for t₀ = 2π/3 we have γ̇(t₀) = (−2a√3, 0).
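This value can be checked by a central-difference approximation of γ̇ (a Python sketch with a = 1, not part of the text):

```python
import math

a = 1.0

def gamma(t):
    # cardioid parametrization (8.1)
    r = 2*a*(1 - math.cos(t))
    return (r*math.cos(t), r*math.sin(t))

t0, h = 2*math.pi/3, 1e-6
# central-difference approximation of the velocity vector at t0
v = tuple((gamma(t0 + h)[i] - gamma(t0 - h)[i]) / (2*h) for i in range(2))
# matches γ̇(t0) = (−2a√3, 0) from Example 8.6
assert abs(v[0] + 2*a*math.sqrt(3)) < 1e-7
assert abs(v[1]) < 1e-7
```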

Definition 8.7. The tangent space to the smooth curve C is the 1-dimensional vector subspace gener-
ated by the velocity vector
TP C := ⟨γ̇(t0 )⟩
for any parametrization γ of C. It is the direction space of the tangent line at P , i.e.
TP C = D(TP C).


8.1.1 Kinematic versus geometric properties


Let C be a curve with smooth parametrization γ : I → En . Quantities (or objects) defined starting
from the curve γ(I) = C, that depend only on C and not on the parametrization γ, are called geometric
quantities (or objects); whereas those that depend on the parametrization γ are called kinematic
quantities (or objects). Note, in particular, that the parametrization γ depends on the chosen frame
K whereas geometric properties are independent of K. For example the velocity vector and the
acceleration vector

γ̇(t₀) := dγ/dt(t₀) = (dγ₁/dt(t₀), . . . , dγₙ/dt(t₀))ᵀ  and  γ̈(t₀) := d²γ/dt²(t₀) = (d²γ₁/dt²(t₀), . . . , d²γₙ/dt²(t₀))ᵀ

are kinematic objects. Therefore, the velocity Vγ (t) = |γ̇(t)| and the acceleration Aγ (t) = |γ̈(t)| are
kinematic quantities.
Definition 8.8. The arc length of a parametrized curve γ : I → Rn is the integral of its velocity
ℓ(γ) = ∫_I Vγ(t) dt.
Example 8.9. Consider the cardioid in Example 8.2. We have
γ(t) = 2a((1 − cos t) cos t, (1 − cos t) sin t)ᵀ  and therefore  γ̇(t) = 2a(sin 2t − sin t, cos t − cos 2t)ᵀ.

Thus

Vγ(t)² = 4a²((sin 2t − sin t)² + (cos t − cos 2t)²)
       = 4a²(2 − 2 sin 2t sin t − 2 cos t cos 2t)
       = 4a²(2 − 4 cos t sin²t − 2 cos t(cos²t − sin²t))
       = 4a²(2 − 2 cos t sin²t − 2 cos³t)
       = 4a²(2 − 2 cos t)
       = 16a² sin²(t/2),  therefore  Vγ(t) = 4a sin(t/2)

for I = [0, 2π], since t/2 ∈ [0, π]. Then

ℓ(γ) = ∫₀^{2π} 4a sin(t/2) dt = −8a cos(t/2) |₀^{2π} = 16a.
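This value can be confirmed numerically. The Python sketch below (with a = 1; the midpoint rule is my choice, not the text's) integrates the speed Vγ(t) over [0, 2π]:

```python
import math

a = 1.0

def speed(t):
    # Vγ(t) = |γ̇(t)| with γ̇(t) = 2a(sin 2t − sin t, cos t − cos 2t)
    dx = 2*a*(math.sin(2*t) - math.sin(t))
    dy = 2*a*(math.cos(t) - math.cos(2*t))
    return math.hypot(dx, dy)

n = 100_000
h = 2*math.pi/n
length = sum(speed((i + 0.5)*h) for i in range(n)) * h  # midpoint rule
assert abs(length - 16*a) < 1e-6   # matches ℓ(γ) = 16a
```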
Proposition 8.10. The arc length of a parametrized curve is a geometric quantity.
Definition 8.11. The vector field of normalized velocity vectors of a parametrized curve γ : I → Rn is
denoted by
T(γ, t) := γ̇(t) / Vγ(t).


Example 8.12. Continuing with Example 8.9, for the cardioid γ : [0, 2π] → R2 we have
T(γ, t) = (1 / (2 sin(t/2))) · (sin 2t − sin t, cos t − cos 2t)ᵀ.

Figure 8.3: Velocity vectors and normalized tangent vectors.

Proposition 8.13. Up to sign, the vector field T(γ, t) is a geometric object attached to the curve
C = γ(I).

Definition 8.14. We say that a parametrization γ : I → R² is a natural parametrization if Vγ(t) = 1 for all t ∈ I.

Proposition 8.15. Any smooth curve admits a natural parametrization, i.e., it can be reparametrized
by arc length.

Definition 8.16. The vector field of curvature vectors of a parametrized curve γ : I → Rⁿ is by definition

K(γ, t) := (1/Vγ(t)) · (d/dt) T(γ, t).
The curvature of the curve C = γ(I) at the point γ(t) is

κ(γ, t) = |K(γ, t)|.

Proposition 8.17. If γ : I → R² is a natural parametrization, then

K(γ, t) = γ̈(t) and κ(γ, t) = |γ̈(t)|.
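The curvature can also be checked numerically. The sketch below uses the standard plane-curve formula κ = |ẋÿ − ẏẍ| / (ẋ² + ẏ²)^{3/2} (equivalent to Definition 8.16 for curves in R²; my choice of formulation), with derivatives approximated by central differences, and recovers κ = 1/R for a circle of radius R:

```python
import math

def curvature(gamma, t, h=1e-4):
    # κ = |x'y'' − y'x''| / (x'² + y'²)^(3/2), derivatives via central differences
    xp, yp = gamma(t + h)
    xm, ym = gamma(t - h)
    x0, y0 = gamma(t)
    x1, y1 = (xp - xm)/(2*h), (yp - ym)/(2*h)               # first derivatives
    x2, y2 = (xp - 2*x0 + xm)/h**2, (yp - 2*y0 + ym)/h**2   # second derivatives
    return abs(x1*y2 - y1*x2) / (x1*x1 + y1*y1)**1.5

R = 2.0
circle = lambda t: (R*math.cos(t), R*math.sin(t))
# a circle of radius R has constant curvature 1/R
assert abs(curvature(circle, 0.7) - 1.0/R) < 1e-5
```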


Figure 8.4: Curvature vectors and osculating circles.

Proposition 8.18. Up to sign, the vector field K(γ, t) is a geometric object attached to the curve
C = γ(I). In particular, the curvature κ(γ, t) is a geometric quantity.

Proposition 8.19. A parametrized curve γ : [a, b] → Rn has zero curvature if and only if it is a line
segment.

8.1.2 Area enclosed by a simple planar loop


Definition 8.20. A parametrized curve γ : [a, b] → E2 is called simple if it is injective on (a, b), i.e. if
it doesn’t have self-intersection points. It is a loop if γ(a) = γ(b). A Jordan curve is a simple loop.

A Jordan curve divides the plane into an interior and an exterior. In some cases, it is possible
to compute the area of the interior. One may try to approximate the interior with a polygon (the
area of which can be calculated by subdividing into triangles) and the remaining pieces (yellow


in the picture) may be calculated by choosing the frame such that the corresponding portion of γ is described by a function along the sides of the polygon, which can be integrated using standard calculus techniques. While this approach is theoretically straightforward, it is impractical in most cases. Instead, one may approximate the interior using polygons with a large number of vertices.
Example 8.21. One may think of a parametrization γ : [a, b] → E2 as a curved segment. There are
many curved segments between two points. Choosing two such segments that intersect only at their
endpoints, we obtain a simple loop.

If the two curved segments can be described as graphs of functions – i.e., γ1 (t) = (t, f (t)) for some
function f : [a, b] → R and similarly for the second parametrized curved segment γ2 – then the area
enclosed between them can be computed by ordinary integration.
Example 8.22. Consider again the case of the cardioid with parametrization γ : I → R2 given in (8.1).
From the symmetry of the curve, we see that the area of the region enclosed by the cardioid is twice
the area above the x-axis. This corresponds to t ∈ [0, π].

From Example 8.6, we see that the y-values of γ are a function of x for t ∈ [0, 2π/3] and t ∈ [2π/3, π] respectively. Moreover, γ(2π/3) = a(−3/2, 3√3/2). We may therefore translate the current frame with the vector (−3a/2, 0) to arrive in the red frame (see above image). In this frame we have

γ(t) = (2a(1 − cos t) cos t + 3a/2, 2a(1 − cos t) sin t)ᵀ.


Thus

Areaint(C) = 2 ∫₀^{3√3a/2} x_right(y) dy − 2 ∫₀^{3√3a/2} x_left(y) dy,

where x_right(y) is the branch with t ∈ [0, 2π/3] and x_left(y) the branch with t ∈ [2π/3, π]. Substituting y = y(t) in the first integral, with x(t)/2a = (1 − cos t) cos t + 3/4 and ẏ(t)/2a = sin²t + cos t − cos²t, we calculate

∫₀^{3√3a/2} x_right(y) dy = ∫₀^{2π/3} x(t) ẏ(t) dt = 4a² ∫₀^{2π/3} ((1 − cos t) cos t + 3/4)(sin²t + cos t − cos²t) dt = · · · = 4a²(π/2 − 9√3/32),

where one opens the parentheses of the integrand and calculates ∫ sinⁿ(t) cosᵐ(t) dt based on the different values of n and m. Similarly, on the other side of the red y-axis, we obtain

∫₀^{3√3a/2} x_left(y) dy = ∫_π^{2π/3} x(t) ẏ(t) dt = −4a²(π/4 + 9√3/32)

and therefore

Areaint(C) = 2 · 4a²(π/2 − 9√3/32) + 2 · 4a²(π/4 + 9√3/32) = 6πa².
The above calculation is tedious. In many cases where a curve is defined by rotations, it is much
easier to integrate using polar coordinates (Appendix D). Notice that Equation (8.1) easily translates
into polar coordinates:
x(t) = 2a(1 − cos t) cos t,  y(t) = 2a(1 − cos t) sin t  ⇔  r(t) = 2a(1 − cos t),  t ∈ [0, 2π[,

where r = √(x² + y²). Then, with Theorem 8.23 below, we calculate
Areaint(C) = 2a² ∫₀^{2π} (1 − cos t)² dt
           = 2a² ∫₀^{2π} (1 − 2 cos t + cos²t) dt
           = 2a² ∫₀^{2π} (3/2 − 2 cos t + (1/2) cos 2t) dt
           = 2a² (3t/2 − 2 sin t + (1/4) sin 2t) |₀^{2π}
           = 6πa².

Theorem 8.23. Let γ be a parametrized curve, given in polar coordinates (r, θ) by the equation r = f(θ). Then

Area(γ, θ₁, θ₂) = (1/2) ∫_{θ₁}^{θ₂} f(θ)² dθ.
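Applied to the cardioid r = 2a(1 − cos θ), the theorem can be tested numerically (a Python sketch with a = 1; the midpoint rule is my choice) against the exact area 6πa²:

```python
import math

a = 1.0
f = lambda t: 2*a*(1 - math.cos(t))   # the cardioid in polar form, r = f(θ)

n = 20_000
h = 2*math.pi/n
# Theorem 8.23: Area = (1/2) ∫ f(θ)² dθ, approximated by a midpoint sum
area = 0.5 * sum(f((i + 0.5)*h)**2 for i in range(n)) * h
assert abs(area - 6*math.pi*a**2) < 1e-9
```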


8.2 Smooth hypersurfaces in En


The term hypersurfaces refers to objects that generalize curves in E2 and surfaces in E3 to the n-
dimensional setting – in the same way that hyperplanes generalize lines and planes. Loosely speak-
ing, hypersurfaces are bent hyperplanes or portions thereof. There are two ways of describing such
objects relative to a Cartesian frame: through equations (as zero-sets of certain functions) or through
parametrizations.

8.2.1 Level sets


Given a function f : En → R, the zero set of f is

Z(f ) = {P ∈ En : f (P ) = 0}

it is a particular case of the level set of f

L(f , c) = {P ∈ En : f (P ) = c}

for the value c ∈ R.

Example 8.24. If we are in dimension 3 and f(x, y, z) = x² + y² + z² relative to some frame, then Z(f) equals the origin, while for c > 0 we get spheres of radius √c:

L(f , c) = Z(f − c) = {(x, y, z) : x2 + y 2 + z2 = c}

and for c < 0 we have the empty set.

Example 8.25. In En , the function f (x1 , . . . , xn ) = a1 x1 +· · ·+an xn (where not all coefficients ai are zero)
describes a family of parallel hyperplanes. Indeed

L(f , c) = Z(f − c) = {(x1 , . . . , xn ) : a1 x1 + · · · + an xn = c}

are hyperplanes with normal vector n(a1 , . . . , an ), thus all are parallel to Z(f ), which passes through
the origin.
This example generalizes to affine maps. If φ : An → Am is an affine map given by φ(x) = Ax + b
then Z(φ) is an affine subspace of An defined by the system Ax + b = 0 and similarly for any c ∈ Am ,
the level set L(φ, c) is the affine subspace of An defined by the system Ax = c − b (again a family of
parallel subspaces, since the corresponding homogeneous system is the same).

Definition 8.26. While in the linear case (Example 8.25) level sets always define hypersurfaces (hy-
perplanes), this is no longer true in general (as can be seen in Example 8.24). In Example 8.24 only
the level sets L(f , c) with c > 0 are considered surfaces. To avoid empty sets, e.g. L(f , −1), we may sim-
ply ask that the level set is non-empty. (However, notice that if we work over the complex numbers C, then L(f, −1) is non-empty; it is a so-called complex sphere.) The second degenerate case, where
the level set is a point, i.e. L(f , 0), can be avoided by asking that the Jacobian J(f ) = (∂x1 f , . . . , ∂xn f ) is
non-zero.


We say that a smooth hypersurface is the set of solutions to an equation of the form

L(f , c) : f (x) = c

where f : Rn → R is a smooth function such that there exists p ∈ Rn with f (p) = c and such that
J(f )(p) is non-zero.

Example 8.27. Consider again the case of the cardioid C given via the parametrization (8.1). One can
show that it satisfies the equation

C : f (x, y) = 0 where f (x, y) = (x2 + y 2 )2 + 4ax(x2 + y 2 ) − 4a2 y 2 .

Various level sets of the function f are visible in the following image.

Example 8.28. In 3-dimensional space, consider the function f (x, y, z) = x2 + y 2 − z2 . Then the level
set L(f , 0) is a cone, L(f , c) is a hyperboloid of two sheets if c < 0, and a hyperboloid of one sheet if
c > 0.


8.2.2 Local parametrization


The second way of describing a hypersurface is via parameters. We have already seen how to con-
struct surfaces via motion (e.g. the cone and the hyperboloid discussed in Section ??), which pro-
duced parametrizations γ : V → R3 with V a product of intervals.
In view of the definition in terms of equations (Definition 8.26) it is natural to ask how the two
descriptions are related. In particular, if a hypersurface is given by an equation, can we describe it via
a parametrization? The following proposition gives an affirmative answer locally. It is a particular
case of the Implicit Function Theorem.

Proposition 8.29. Let f : Rn → R be a smooth function, consider the zero set

S : f (x) = 0 and a point p(p1 , . . . , pn ) ∈ S.

If the Jacobian J(f )(p) is non-zero, then there exists a neighbourhood U of p, an open subset V ⊆ Rn−1
and a smooth injective map γ : V → Rn such that

∀q ∈ U : f (q) = 0 ⇔ ∃(t1 , . . . , tn−1 ) ∈ V : q = γ(t1 , . . . , tn−1 )

i.e., locally, in the neighbourhood U of the point p, the map γ is a parametrization of S.
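Proposition 8.29 can be illustrated on the simplest example, the unit circle f(x, y) = x² + y² − 1 near the point p = (0, 1) (my choice of example, not from the text), where the local parametrization is the graph map t ↦ (t, √(1 − t²)):

```python
import math

f = lambda x, y: x*x + y*y - 1.0           # the zero set S is the unit circle
gamma = lambda t: (t, math.sqrt(1 - t*t))  # local parametrization near p = (0, 1), t ∈ (−1, 1)

# γ(t) stays on S throughout the chart
for t in (-0.5, 0.0, 0.3, 0.9):
    x, y = gamma(t)
    assert abs(f(x, y)) < 1e-12
```

Note that no single graph map of this kind covers all of S; the Implicit Function Theorem only guarantees a parametrization near p, which is exactly what Proposition 8.29 asserts.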

8.2.3 Volume enclosed by a simple surface of revolution


Example 8.30. Rotating the cardioid around the x-axis yields the following apple-shaped surface.
Its volume is . . .

Example 8.31. Consider a circle of radius r in the Oyz-plane centered at (0, R, 0). A torus T is
obtained by rotating (revolving) this circle around the z-axis. Then r is the minor radius and R the


major radius of the torus. From this description it is not difficult to obtain a parametrization. Such a
torus has volume 2π²Rr². Moreover, one may show that

T : f(x, y, z) = 0  where  f(x, y, z) = (x² + y² + z² + R² − r²)² − 4R²(x² + y²).

Figure 8.5: A torus (Image source: Wikipedia)
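One way to check the displayed implicit equation is against the usual angle parametrization of the torus (the parametrization itself is not written out in the text; the one below is the standard choice, with sample radii R = 3, r = 1):

```python
import math

R, r = 3.0, 1.0

def torus_point(theta, phi):
    # standard parametrization: a circle of radius r at distance R from the
    # z-axis (angle φ), revolved around the z-axis (angle θ)
    x = (R + r*math.cos(phi)) * math.cos(theta)
    y = (R + r*math.cos(phi)) * math.sin(theta)
    z = r*math.sin(phi)
    return x, y, z

def f(x, y, z):
    # implicit equation of the torus from Example 8.31
    return (x*x + y*y + z*z + R*R - r*r)**2 - 4*R*R*(x*x + y*y)

for theta in (0.0, 1.0, 2.7):
    for phi in (0.0, 0.8, 4.2):
        assert abs(f(*torus_point(theta, phi))) < 1e-8
```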

CHAPTER 9

Quadratic curves (conics)

Contents
9.1 Ellipse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
9.1.1 Geometric description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
9.1.2 Canonical equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
9.1.3 Parametric equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
9.1.4 Relative position of a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
9.1.5 Tangent line at a given point - algebraic . . . . . . . . . . . . . . . . . . . . . . 132
9.1.6 Tangent line at a given point - via gradients . . . . . . . . . . . . . . . . . . . . 133
9.1.7 Reflective properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
9.2 Hyperbola . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
9.2.1 Geometric description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
9.2.2 Canonical equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
9.2.3 Parametric equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
9.2.4 Relative position of a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
9.2.5 Tangent line at a given point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
9.2.6 Reflective properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
9.3 Parabola . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
9.3.1 Geometric description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
9.3.2 Canonical equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
9.3.3 Parametric equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
9.3.4 Relative position of a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
9.3.5 Tangent line at a given point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
9.3.6 Reflective properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141


Definition 9.1. A quadratic curve (or conic) in E2 is a curve defined by a quadratic equation

ax2 + bxy + cy 2 + dx + ey + f = 0

where a, b, c, d, e, f ∈ R.

Figure 9.1: Conic sections (Image source: Wikipedia)

9.1 Ellipse
9.1.1 Geometric description


Definition 9.2. An ellipse is the locus of points in E2 for which the sum of the distances from two
fixed points, called focal points (or foci), is constant.

9.1.2 Canonical equation


In general, we can describe a conic with a canonical equation, obtained with respect to a suitably
chosen coordinate frame.


Proposition 9.3. Let F1 and F2 be two points in E2 and let a > 0 be a real number. Choose the
−−−−→
coordinate system Oxy = (O, i, j) such that F1 and F2 are on the x-axis, such that F2 F1 has the same
direction as i and such that O is the midpoint of [F1 F2 ]. With these choices, the ellipse with focal
points F1 and F2 for which the sum of distances from the focal points is 2a has the equation

Ea,b : x²/a² + y²/b² = 1    (9.1)
for some number b > 0. We denote this ellipse by Ea,b and call a the semi-major axis and b the semi-
minor axis.


• Equation (9.1) is called the canonical equation of the ellipse Ea,b . Clearly, with respect to some
other coordinate system, the same ellipse will have a different equation.

• If 2c denotes the distance between F1 and F2 , then b2 = a2 − c2 .

• The intersections of Ea,b with the coordinate axes are the points (±a, 0) and (0, ±b).

• The numerical quantity

ε = c/a = √(1 − b²/a²) ∈ [0, 1)
is called the eccentricity of the ellipse Ea,b . It measures the flatness or roundness of the ellipse.

• The canonical equation shows that M(xM , yM ) ∈ Ea,b if and only if (±xM , ±yM ) ∈ Ea,b .

9.1.3 Parametric equations


Parametric equations are never unique. Depending on the intended application, one parametrization
may be preferred over another. Equation (9.1) allows us to express y in terms of x:

y(x) = ± (b/a) √(a² − x²).

This gives a partial parametrization of Ea,b. For the ‘northern part’ we have the parametrization

φ : [−a, a] → E²  given by  φ(x) = (x, y(x)) = (x, (b/a)√(a² − x²)).


This is the graph of the function

y(x) = (b/a)√(a² − x²),  for which  y′(x) = −bx / (a√(a² − x²))  and  y″(x) = ab / ((x − a)(x + a)√(a² − x²)).

Thus, we can use the known methods to verify the monotonicity and the convexity of y(x), which
describes this part of the ellipse.
A second way of parametrizing the ellipse Ea,b is with

φ : R → E2 defined by φ(t) = (a cos(t), b sin(t)).

It is easy to check using equation (9.1) that this is a parametrization. Moreover, with this parametriza-
tion, we may view the ellipse as the orbit of a rotation followed by a dilation along the coordinate
axes:

t ↦ A · Rot_t · (1, 0)ᵀ,

where A is the dilation matrix with rows (a, 0), (0, b) and Rot_t is the rotation matrix with rows (cos t, −sin t), (sin t, cos t).
We call such a transformation elliptical motion.
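A quick numerical check (a Python sketch with sample values a = 3, b = 2, my choice) that φ(t) = (a cos t, b sin t) satisfies (9.1):

```python
import math

a, b = 3.0, 2.0

def phi(t):
    # parametric point of the ellipse E_{a,b}
    return (a*math.cos(t), b*math.sin(t))

for t in (0.0, 0.9, 2.4, 5.1):
    x, y = phi(t)
    # φ(t) satisfies the canonical equation (9.1)
    assert abs((x/a)**2 + (y/b)**2 - 1.0) < 1e-12
```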

9.1.4 Relative position of a line

Consider the canonical equation (9.1) of the ellipse Ea,b . Let ℓ be a line with equation y = kx + m.
The intersection of the two objects is the set of points whose coordinates are solutions to the system
x²/a² + y²/b² = 1,  y = kx + m  ⇔  x²/a² + (kx + m)²/b² = 1,  y = kx + m.

The solutions to this system are (x, y) = (x, kx + m) where x is a solution to the first equation. Let us
now discuss that equation after substituting y = kx + m.

(b² + a²k²)x² + 2kma²x + a²(m² − b²) = 0.

This is a quadratic equation in x since a, b, k, m are fixed. The discriminant of this equation is

∆ = 4k²m²a⁴ − 4a²(m² − b²)(b² + a²k²) = 4a²b²(a²k² + b² − m²).

So, the number of real solutions is controlled by a²k² + b² − m²:


• −√(a²k² + b²) < m < √(a²k² + b²), in which case ℓ intersects Ea,b in two distinct points.

• m = ±√(a²k² + b²), in which case ℓ intersects Ea,b in a unique point. Such a point is a double intersection point because it is obtained as a double solution to the algebraic equation. For these two values of m, the line ℓ : y = kx + m is tangent to the ellipse. Therefore, if a slope k is given, there are two tangent lines to the ellipse with this slope:

y = kx ± √(a²k² + b²).

• m < −√(a²k² + b²) or m > √(a²k² + b²), in which case there is no intersection point between ℓ and Ea,b.
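These two tangent lines can be checked numerically. In the sketch below (sample values a = 3, b = 2, k = 1/2, my choice), m = ±√(a²k² + b²) makes the discriminant vanish, and the resulting double root is a point on the ellipse:

```python
import math

a, b, k = 3.0, 2.0, 0.5

for m in (math.sqrt(a*a*k*k + b*b), -math.sqrt(a*a*k*k + b*b)):
    # (b² + a²k²)x² + 2kma²x + a²(m² − b²) = 0 must have a double root
    A, B, C = b*b + a*a*k*k, 2*k*m*a*a, a*a*(m*m - b*b)
    assert abs(B*B - 4*A*C) < 1e-9       # discriminant vanishes
    x0 = -B/(2*A)
    y0 = k*x0 + m
    # the double root is the point of tangency; it lies on the ellipse
    assert abs((x0/a)**2 + (y0/b)**2 - 1.0) < 1e-12
```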

9.1.5 Tangent line at a given point - algebraic


Consider an ellipse and a line:
Ea,b : x²/a² + y²/b² = 1  and  ℓ : x = x₀ + t·vx,  y = y₀ + t·vy,
which have the point (x0 , y0 ) in common. When is ℓ tangent to the ellipse? If the intersection Ea,b ∩ ℓ
has a unique point. In order to determine when this is the case, we check which points on ℓ satisfy
the equation of the ellipse:
(x₀ + vx·t)²/a² + (y₀ + vy·t)²/b² = 1.
The parameters t satisfying the above equations correspond to points on ℓ, which lie on Ea,b . The
equation is equivalent to

(vx²/a² + vy²/b²) t² + 2(x₀vx/a² + y₀vy/b²) t + (x₀²/a² + y₀²/b² − 1) = 0,

where the last bracket vanishes because (x₀, y₀) ∈ Ea,b.

In order for ℓ to be tangent to Ea,b , there needs to be a unique solution t to the above equation. Since
t = 0 is obviously a solution, this needs to be the only solution. In other words, t = 0 should be a
double solution. For this to happen we must have
x₀vx/a² + y₀vy/b² = 0  ⇔  ⟨n, v⟩ = 0

where n = (x₀/a², y₀/b²). Thus, ℓ is tangent to the ellipse if and only if the vector n is orthogonal to ℓ, i.e. if
and only if n is a normal vector for ℓ. It follows that ℓ is tangent to Ea,b in the point (x0 , y0 ) ∈ Ea,b if
and only if it satisfies the Cartesian equation:
ℓ : (x₀/a²)(x − x₀) + (y₀/b²)(y − y₀) = 0.
a b
We call this line the tangent line to Ea,b at the point (x0 , y0 ) ∈ Ea,b and denote it by T(x0 ,y0 ) Ea,b . Rearrang-
ing the above equation we see that:
T(x₀,y₀)Ea,b : x₀x/a² + y₀y/b² = 1.    (9.2)


9.1.6 Tangent line at a given point - via gradients


It is possible to describe the tangent line T(x0 ,y0 ) Ea,b to Ea,b at the point (x0 , y0 ) ∈ Ea,b using gradients.
For this consider the map

ψ : E² → R  defined by  ψ(x, y) = x²/a² + y²/b²

and notice that Ea,b = ψ⁻¹(1). The gradient at a point (x₀, y₀) ∈ Ea,b is

∇(x₀,y₀)(ψ) = (2x/a², 2y/b²) |_(x₀,y₀) = (2x₀/a², 2y₀/b²).

By using a parametrization φ : I → E2 of Ea,b , and applying the chain rule to ∂t ψ(φ(t)), one shows
that ∇(x0 ,y0 ) (ψ) is orthogonal to the tangent vectors at the point (x0 , y0 ). In other words, ∇(x0 ,y0 ) (ψ) is a
normal vector at the point (x0 , y0 ) ∈ Ea,b . This gives a different way of obtaining the equation (9.2).
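The orthogonality of ∇(x₀,y₀)(ψ) to the tangent direction can be observed numerically (a sketch using the parametrization φ(t) = (a cos t, b sin t) and central differences; sample values a = 3, b = 2, my choice):

```python
import math

a, b = 3.0, 2.0

def phi(t):          # parametrization of the ellipse
    return (a*math.cos(t), b*math.sin(t))

def grad_psi(x, y):  # gradient of ψ(x, y) = x²/a² + y²/b²
    return (2*x/(a*a), 2*y/(b*b))

h = 1e-6
for t in (0.3, 1.2, 4.0):
    x, y = phi(t)
    # tangent vector via central differences
    tx = (phi(t + h)[0] - phi(t - h)[0]) / (2*h)
    ty = (phi(t + h)[1] - phi(t - h)[1]) / (2*h)
    gx, gy = grad_psi(x, y)
    # the gradient is orthogonal to the tangent vector
    assert abs(gx*tx + gy*ty) < 1e-6
```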

9.1.7 Reflective properties


An ellipse has the following reflective property: a ray starting at one focal point is reflected by the ellipse to the other focal point, i.e. if P is a point on the ellipse then the tangent line at P is the exterior angle bisector of the angle ∡F₁PF₂.



9.2 Hyperbola
9.2.1 Geometric description


Definition 9.4. A hyperbola is the locus of points in E2 for which the difference of the distances from
two given points, the focal points, is constant.

9.2.2 Canonical equation


Proposition 9.5. Let F1 and F2 be two points in E2 and let a > 0 be a real number. Choose the
−−−−→
coordinate system Oxy = (O, i, j) such that F1 and F2 are on the x-axis, such that F2 F1 has the same
direction as i and such that O is the midpoint of [F1 F2 ]. With these choices, the hyperbola with focal
points F1 and F2 for which the absolute value of the difference of distances from the focal points
equals 2a has the equation
Ha,b : x²/a² − y²/b² = 1    (9.3)
for some positive scalar b ∈ R. We denote this hyperbola by Ha,b and call a the semi-major axis and b
the semi-minor axis.



• Equation (9.3) is called the canonical equation of the hyperbola Ha,b . Clearly, the same hyperbola
may be represented by a different equation in another frame.
• If 2c denotes the distance between F1 and F2 , then b2 = c2 − a2 .
• The intersections of Ha,b with the coordinate axes are the points (±a, 0).
• The numerical quantity

ε = c/a = √(1 + b²/a²) ∈ (1, ∞)
is called the eccentricity of the hyperbola Ha,b . It measures how widely the branches of the
hyperbola open.
• The canonical equation shows that M(xM , yM ) ∈ Ha,b if and only if (±xM , ±yM ) ∈ Ha,b .


9.2.3 Parametric equations


Parametric equations are never unique. Depending on the intended application, one parametrization
may be preferred over another. Equation (9.3) allows us to express y in terms of x:
y(x) = ± (b/a) √(x² − a²).

This gives a partial parametrization of Ha,b. For the ‘northern part’ we have the parametrization

φ : (−∞, −a] ∪ [a, ∞) → E²  given by  φ(x) = (x, y(x)) = (x, (b/a)√(x² − a²)).

This is the graph of the function

y(x) = (b/a)√(x² − a²),  for which  y′(x) = bx / (a√(x² − a²))  and  y″(x) = ab / ((a − x)(x + a)√(x² − a²)).
Thus, we can use the known methods to verify the monotonicity and the convexity of y(x), which
describes this part of the hyperbola.


9.2.4 Relative position of a line

Consider the canonical equation (9.3) of the hyperbola Ha,b . Let ℓ be a line with equation y =
kx + m. The intersection of the two objects is the set of points whose coordinates are solutions to the
system
x²/a² − y²/b² = 1,  y = kx + m  ⇔  x²/a² − (kx + m)²/b² = 1,  y = kx + m.
The solutions to this system are (x, y) = (x, kx + m) where x is a solution to the first equation. Let us
discuss that equation:
(b² − a²k²)x² − 2kma²x − a²(m² + b²) = 0    (9.4)

This is a quadratic equation in x since a, b, k, m are fixed. The discriminant of this equation is

∆ = 4k²m²a⁴ + 4a²(m² + b²)(b² − a²k²) = 4a²b²(m² + b² − a²k²).

So, the number of real solutions is controlled by m² + b² − a²k² . . . if the equation is quadratic. Suppose equation (9.4) is quadratic, i.e. b² − a²k² ≠ 0.
• m < −√(a²k² − b²) or m > √(a²k² − b²), in which case ℓ intersects Ha,b in two distinct points.

• m = ±√(a²k² − b²), in which case ℓ intersects Ha,b in a unique point. Such a point is a double intersection point because it is obtained as a double solution to the algebraic equation. For these two values of m, the line ℓ : y = kx + m is tangent to the hyperbola. Therefore, if k satisfies b² − a²k² ≠ 0, i.e., if k ≠ ±b/a, then there are two tangent lines to the hyperbola with slope k:

y = kx ± √(a²k² − b²).

• −√(a²k² − b²) < m < √(a²k² − b²), in which case there is no intersection point between ℓ and Ha,b.
Suppose equation (9.4) is not quadratic, i.e. b² − a²k² = 0, and the equation is

−2kma²x − a²(m² + b²) = 0.

Notice that k ≠ 0 and a ≠ 0 and that k = ±b/a. We have two cases:


• m ≠ 0, hence the unique solution x = −(m² + b²)/(2km), which corresponds to a unique intersection point. In this case, it is a simple intersection point, corresponding to a simple solution of the algebraic equation (not a double solution).

• m = 0, in which case there is no intersection point and ℓ is either y = (b/a)x or y = −(b/a)x. These are the two asymptotes of the hyperbola Ha,b. One can check with the parametrization in the previous section that these two lines really are asymptotes.
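The asymptote claim can also be observed numerically: along the upper-right branch, the vertical gap to the line y = (b/a)x is positive and shrinks towards 0 (a sketch with sample values a = 2, b = 1, my choice):

```python
import math

a, b = 2.0, 1.0
y_branch = lambda x: (b/a)*math.sqrt(x*x - a*a)   # upper-right branch of H_{a,b}
asym = lambda x: (b/a)*x                           # the asymptote y = (b/a)x

gaps = [asym(x) - y_branch(x) for x in (10.0, 100.0, 1000.0)]
assert gaps[0] > gaps[1] > gaps[2] > 0   # the gap is positive and decreasing
assert gaps[2] < 1.1e-3                  # and it tends to 0
```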

9.2.5 Tangent line at a given point


The tangent line to Ha,b at the point (x0 , y0 ) ∈ Ha,b has the equation

T(x₀,y₀)Ha,b : x₀x/a² − y₀y/b² = 1.    (9.5)
This can be deduced either with the algebraic method or via the gradient as in the case of the ellipse.

9.2.6 Reflective properties


A hyperbola has the following reflective properties: for a point P on the hyperbola, the tangent line
at P is the angle bisector of the angle ∡F1 P F2 .



9.3 Parabola
9.3.1 Geometric description


Definition 9.6. A parabola is the locus of points in E2 for which the distance from a given point, the
focal point, equals the distance to a given line, the directrix.

9.3.2 Canonical equation


Proposition 9.7. Let F be a point, d be a line in E2 and p > 0 be a real number. Choose the coordinate
system Oxy = (O, i, j) such that F lies on the x-axis, such that the x-axis is orthogonal to d, the origin
−−→
is equidistant from d and F, and the vector i has the same direction as OF . With these choices, the
parabola with focal point F and directrix d for which d(F, d) = p has the equation

Pp : y² = 2px    (9.6)

We denote this parabola by Pp and call p the parameter of the parabola.

• Equation (9.6) is called the canonical equation of the parabola Pp . Clearly, with respect to some
other coordinate system, the same parabola will have a different equation.
• The focal point is F(p/2, 0) and the directrix has equation d : x = −p/2.

• The intersection of Pp with the coordinate axes is the point (0, 0).

• The canonical equation shows that M(xM , yM ) ∈ Pp if and only if (xM , ±yM ) ∈ Pp .



9.3.3 Parametric equations


Parametric equations are never unique. Depending on the intended application, one parametrization
may be preferred over another. Equation (9.6) allows us to express y in terms of x:
y(x) = ± √(2px).

This gives a partial parametrization of Pp. For the ‘northern part’ we have the parametrization

φ : [0, ∞) → E²  given by  φ(x) = (x, y(x)) = (x, √(2px)).

This is the graph of the function


y(x) = √(2px),  for which  y′(x) = √(2p) / (2√x)  and  y″(x) = −√(2p) / (4x^{3/2}).

Thus, we can use the known methods to verify the monotonicity and the convexity of y(x), which
describes this part of the parabola.
We can, in fact, parametrize the whole parabola if we express x in terms of y, which is another
way of reading equation (9.6). We then have the parametrization
φ : R → E2 given by φ(y) = (x(y), y) = (y²/(2p), y).
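Both parametrizations can be sanity-checked numerically. The following sketch (in Python; the value of p is arbitrary) verifies that both maps land on Pp and that only the second one reaches the 'southern part':

```python
import math

p = 3.0  # an arbitrary parameter p > 0

# Partial parametrization of the 'northern part': phi(x) = (x, sqrt(2 p x))
def phi_north(x):
    return (x, math.sqrt(2 * p * x))

# Global parametrization: phi(y) = (y^2 / (2 p), y)
def phi_global(y):
    return (y * y / (2 * p), y)

def on_parabola(pt):
    x, y = pt
    return abs(y * y - 2 * p * x) < 1e-9

# Both parametrizations satisfy y^2 = 2 p x
assert all(on_parabola(phi_north(x)) for x in [0.0, 0.5, 1.0, 7.25])
assert all(on_parabola(phi_global(y)) for y in [-4.0, -1.0, 0.0, 2.5])

# phi_global also reaches the 'southern part' (y < 0), which phi_north misses
assert phi_global(-4.0)[1] < 0
```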

9.3.4 Relative position of a line


Consider the canonical equation (9.6) of the parabola Pp . Let ℓ be a line with equation y = kx + m. The intersection of the two objects is the set of points whose coordinates are solutions to the system

y² = 2px,  y = kx + m   ⇔   (kx + m)² = 2px,  y = kx + m.

The solutions to this system are (x, y) = (x, kx + m) where x is a solution to the first equation. Let us discuss that equation:

k²x² + 2(km − p)x + m² = 0    (9.7)
This is a quadratic equation in x, since p, k, m are fixed and k ≠ 0 (for a horizontal line y = m the equation becomes linear, and ℓ meets Pp in exactly one point, so we assume k ≠ 0). The discriminant of this equation is

∆ = 4(km − p)² − 4k²m² = 4p(p − 2km).

So, the number of real solutions is controlled by the sign of p − 2km:
• km < p/2, in which case ℓ intersects Pp in two distinct points.
• km = p/2, in which case ℓ intersects Pp in a unique point. Such a point is a double intersection
point because it is obtained as a double solution to the algebraic equation. For this value of m,
the line ℓ : y = kx + m is tangent to the parabola. Therefore, if a slope k is given, there is one
tangent line to the parabola having the given slope:
y = kx + p/(2k).
• km > p/2, in which case there is no intersection point between ℓ and Pp .
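These three cases can be illustrated numerically (a Python sketch; the values of p, k, m are arbitrary, with k ≠ 0):

```python
def intersection_count(p, k, m):
    # Number of real solutions of k^2 x^2 + 2(km - p) x + m^2 = 0, i.e. of
    # the intersection of y = kx + m (k != 0) with the parabola y^2 = 2px
    disc = 4 * p * (p - 2 * k * m)   # the discriminant of (9.7)
    if disc > 0:
        return 2
    if disc == 0:
        return 1
    return 0

p, k = 2.0, 1.0
assert intersection_count(p, k, m=0.0) == 2          # km < p/2: two points
assert intersection_count(p, k, m=p / (2 * k)) == 1  # km = p/2: tangent line
assert intersection_count(p, k, m=3.0) == 0          # km > p/2: no intersection

# The unique tangent with slope k is y = kx + p/(2k); its contact point lies on Pp
m = p / (2 * k)
x0 = -2 * (k * m - p) / (2 * k * k)   # the double root of (9.7)
y0 = k * x0 + m
assert abs(y0 * y0 - 2 * p * x0) < 1e-9
```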

9.3.5 Tangent line at a given point


The tangent line to Pp at the point (x0 , y0 ) ∈ Pp has the equation
T(x0 ,y0 ) Pp : yy0 = p(x + x0 ) (9.8)
This can be deduced either with the algebraic method or via the gradient as in the case of the ellipse.


9.3.6 Reflective properties


A parabola has the following reflective properties: rays starting at the focal point are reflected by the parabola into rays parallel to the axis of the parabola. Equivalently, rays inside the parabola which are parallel to the axis are reflected into rays passing through the focal point.

These properties are used for lenses, parabolic reflectors, satellite dishes, etc.
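The first property can be checked by a small computation: at a point P of Pp , reflect an axis-parallel ray across the normal of the tangent line (9.8) and verify that the reflected ray points at the focus. A Python sketch (p and the point are chosen arbitrarily):

```python
import math

p = 1.0
F = (p / 2, 0.0)                # the focal point of y^2 = 2px

def reflect(u, n):
    # Reflect direction u across the line with unit normal n: u - 2 (u . n) n
    d = u[0] * n[0] + u[1] * n[1]
    return (u[0] - 2 * d * n[0], u[1] - 2 * d * n[1])

y0 = 2.0
x0 = y0 * y0 / (2 * p)          # a point P = (x0, y0) on the parabola
t = (y0, p)                     # tangent direction at P, read off from y*y0 = p(x + x0)
n = (p, -y0)                    # a normal direction (t . n = 0)
ln = math.hypot(*n)
n = (n[0] / ln, n[1] / ln)

u = (-1.0, 0.0)                 # incoming ray parallel to the axis
r = reflect(u, n)

# The reflected direction is parallel to F - P, so the ray passes through the focus
v = (F[0] - x0, F[1] - y0)
cross = r[0] * v[1] - r[1] * v[0]
assert abs(cross) < 1e-9
```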

CHAPTER 10

Hyperquadrics

Contents
10.1 Hyperquadrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
10.2 Reducing to canonical form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
10.3 Classification of conics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
10.3.1 Isometric classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
10.3.2 Algorithm 1: Isometric invariants . . . . . . . . . . . . . . . . . . . . . . . . . . 155
10.3.3 Affine classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
10.3.4 Algorithm 2: Lagrange’s method . . . . . . . . . . . . . . . . . . . . . . . . . . 157
10.4 Classification of quadrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157


10.1 Hyperquadrics
Definition 10.1. A hyperquadric Q in En is the set of points whose coordinates satisfy a quadratic
equation, i.e.,
Q : Σ_{i,j=1}^{n} aij xi xj + Σ_{i=1}^{n} bi xi + c = 0    (10.1)

with respect to some Cartesian frame.

• Hyperquadrics in E2 are called conic sections (see the previous chapter).


• In Chapter ??, we consider hyperquadrics in E3 . In dimension 3 they are called quadrics.
• Notice that Equation (10.1) is equivalent to
Q : Σ_{i,j=1}^{n} qij xi xj + Σ_{i=1}^{n} bi xi + c = 0    (10.2)

where qii = aii and qij = qji = (aij + aji )/2. The matrix Q = (qij ) is symmetric and we call it the symmetric matrix associated to Equation (10.2) of the quadric Q.
• The matrix Q defines a homogeneous polynomial of degree 2 in the above equation.
• Notice that Equation (10.2) can be rearranged in matrix form as follows

Q : xT · Q · x + bT · x + c = 0 (10.3)

where x = (x1 , . . . , xn ) and b = (b1 , . . . , bn ) are regarded as column vectors.

10.2 Reducing to canonical form


Let Q be the hyperquadric described by Equation (10.2) with respect to the right-oriented orthonormal frame K = (O, B) where B = (e1 , . . . , en ). Let Q be the matrix associated to Equation (10.2) of Q.

[Step 1 - Rotation] By the Spectral Theorem (see Corollary ??), there is an orthonormal basis B ′ of
eigenvectors for Q which diagonalizes Q. Changing the frame from K = (O, B) to K′ = (O, B ′ ), the
equation of the hyperquadric becomes
 
Q : yᵀ · D · y + vᵀ · y + c = 0   where   D = diag(λ1 , λ2 , . . . , λn )    (10.4)

and where y = (y1 , . . . , yn ) and v = (v1 , . . . , vn ). This change of coordinates consists of replacing x by
MB,B ′ y since x = MB,B ′ y. Notice that, since the two bases B and B ′ are orthonormal, the matrix MB,B ′
is an orthogonal matrix, i.e., MB,B ′ ∈ O(n).


For the proof of the Spectral theorem, one uses the fact that, since Q is symmetric, the eigenvalues λ1 , . . . , λn are all real. Hence, possibly after permuting the basis vectors in B′, we may assume that λ1 , . . . , λp > 0, λp+1 , . . . , λr < 0 and λr+1 = · · · = λn = 0, where r is the rank of Q. This permutation corresponds to changing B′ = (e′1 , . . . , e′n ) to B′′ = (e′σ(1) , . . . , e′σ(n) ) for some permutation σ of {1, . . . , n}. The change of basis matrix MB′,B′′ is again orthogonal. Thus, so far, we have a change of coordinates from K = (O, B) to K′′ = (O, B′′) given by the change of basis matrix MB,B′′ = MB,B′ MB′,B′′ ∈ O(n).
Recall that det(MB,B ′′ ) = 1 if MB,B ′′ ∈ SO(n) and det(MB,B ′′ ) = −1 if MB,B ′′ is not special orthogonal.
In the latter case, the matrix MB,B′′ changes the orientation, i.e., the change of coordinates is an indirect isometry. So, if we replace one vector of B′′ by its negative, for example e′′1 by −e′′1 , then B′′ has the same orientation as B and MB,B′′ ∈ SO(n), i.e., the change of coordinates is a displacement. In conclusion, we may assume that B′′ is such that MB,B′′ ∈ SO(n).
Writing out the equation of Q in K′′ we have:
Q : λ1 y12 + λ2 y22 + · · · + λr yr2 + v1 y1 + v2 y2 + · · · + vn yn + c = 0. (10.5)

[Step 2 - Translation] Notice that r > 0, i.e., at least one eigenvalue is distinct from 0; otherwise Q = 0n (the n × n zero matrix) and (10.2) is not a quadratic polynomial. Now, if v1 ≠ 0, then

λ1 y1² + v1 y1 = λ1 (y1² + 2 · (v1 /(2λ1 )) y1 ) = λ1 (y1 + v1 /(2λ1 ))² − v1²/(4λ1 ).

Thus, if for i ∈ {1, . . . , r} we let zi = yi + vi /(2λi ), and zi = yi for i > r, then Equation (10.5) becomes

Q : λ1 z1² + λ2 z2² + · · · + λr zr² + vr+1 zr+1 + · · · + vn zn = k    (10.6)


for some k ∈ R. This change of variables corresponds to a change of coordinates from K′′ = (O, B ′′ ) to
K′′′ = (O′ , B ′′ ) given by the translation
(z1 , . . . , zr , zr+1 , . . . , zn ) = (y1 , . . . , yr , yr+1 , . . . , yn ) + (v1 /(2λ1 ), . . . , vr /(2λr ), 0, . . . , 0).
Thus, up until now, we changed the coordinates from K to K′′ with a direct isometry that fixes
the origin (corresponding to a matrix in SO(n)) and from K′′ to K′′′ with a translation. These are
isometries, so the composition of these two transformations is an isometry.
[Step 3 - non-isometric deformation] It is possible to further simplify Equation (10.6) of Q by mak-
ing other changes of coordinates that scale the axes such that all the coefficients are ±1. However, in
general, these do not correspond to isometries (see the discussion in Section 10.3.3).
We reduced Equation (10.2) to Equation (10.6). Equation (10.6) is not yet the canonical form, but
it is an important step towards the canonical form. We may refer to Equation (10.6) as an interme-
diate canonical form. In what follows we explore what more can be done if we restrict to conics and
quadrics, i.e., if we look at hyperquadrics in E2 and E3 .
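Steps 1 and 2 can be traced numerically. The sketch below (Python with numpy; the sample coefficients are illustrative, and both eigenvalues are assumed nonzero, i.e. r = n = 2) performs the rotation and the translation and then checks a point of the reduced conic against the original equation:

```python
import numpy as np

# Sample conic  x^T Q x + b^T x + c = 0  (coefficients chosen for illustration)
Q = np.array([[73.0, 36.0], [36.0, 52.0]])
b = np.array([-10.0, 55.0])
c = 25.0

# Step 1 (rotation): orthonormal eigenbasis of the symmetric matrix Q
lam, M = np.linalg.eigh(Q)        # eigenvalues in ascending order
if np.linalg.det(M) < 0:          # flip one eigenvector so that M lies in SO(2)
    M[:, 0] *= -1
v = M.T @ b                       # linear coefficients in the rotated frame

# Step 2 (translation): complete the square, z_i = y_i + v_i / (2 lam_i),
# which turns the equation into  lam_1 z_1^2 + lam_2 z_2^2 = k  with
k = np.sum(v**2 / (4 * lam)) - c

# Sanity check: a solution of the reduced equation solves the original one
z = np.array([np.sqrt(k / lam[0]), 0.0])   # lam_1 z_1^2 = k, z_2 = 0
y = z - v / (2 * lam)                      # undo the translation
x = M @ y                                  # undo the rotation
assert abs(x @ Q @ x + b @ x + c) < 1e-8
```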


10.3 Classification of conics


10.3.1 Isometric classification
A hyperquadric in E2 is a curve given by an equation of the form

C : q11 x2 + 2q12 xy + q22 y 2 + b1 x + b2 y + c = 0. (10.7)

From the discussion in Section 10.2, we may apply a rotation and a translation to change the frame
such that Equation (10.7) becomes

C : λ1 x2 + λ2 y 2 = k or C : λ1 x2 + v2 y = k. (10.8)

We may also assume that λ1 > 0: if λ1 < 0, we can multiply the entire equation by −1 (replacing k by −k).
Case 1: If λ2 > 0 and k = 0, then the equation has only (0, 0) as solution, in this case the curve
degenerates to a point: the origin.
Case 2: If λ2 > 0 and k < 0, then there are no real solutions to the equation.
Case 3: If λ2 > 0 and k > 0, after dividing by k the equation becomes

x²/(k/λ1 ) + y²/(k/λ2 ) = 1

which is the equation of an ellipse. The ellipse is in canonical form if k/λ1 > k/λ2 . If this is not the case, we need one additional change of coordinates, rotating the reference frame by 90°, which is a direct isometry.
Case 4: If λ2 < 0 and k = 0, then the equation becomes
(√λ1 x − √(−λ2 ) y)(√λ1 x + √(−λ2 ) y) = 0.

This is the union of two lines.


Case 5: If λ2 < 0 and k < 0, we may multiply the whole equation by −1 and interchange the roles of
λ1 and λ2 . Doing so requires one additional coordinate change, obtained by rotating the reference frame by 90°, which is a direct isometry. Then we end up in the following case.
Case 6: If λ2 < 0 and k > 0, after dividing by k the equation becomes

x²/(k/λ1 ) − y²/(k/(−λ2 )) = 1

which is the equation of a hyperbola.


Case 7: If λ2 = 0 we have the equation

C : λ1 x2 + v2 y = k.
If v2 = 0 and k ≥ 0, then we have two lines described by the equation (√λ1 x − √k)(√λ1 x + √k) = 0. If k = 0, this is a double line: x² = 0.


Case 8: If v2 = 0 and k < 0 then we have no solutions.


Case 9: If v2 ≠ 0 then, dividing by |v2 |, the equation becomes

(λ1 /|v2 |) x² = k/|v2 | − (v2 /|v2 |) y

and we may change the coordinates by the map (x, y) ↦ (x, k/|v2 | − (v2 /|v2 |) y), so that the equation simplifies to

x² = (|v2 |/λ1 ) y.

This change of coordinates corresponds to a reflection and/or a translation (again, an isometric change of coordinates) and we recognize the equation of a parabola with parameter p = |v2 |/(2λ1 ). However, here again, we do not yet have the canonical form of a parabola. In order to obtain Pp we need to interchange the x-axis with the y-axis.
[Conclusion] Starting with an equation of the form (10.7), we may use rotations, translations and
reflections to transform the equation into one of the following

• degenerate cases: two lines, double lines, points, or

• non-degenerate cases:

Ea,b : x²/a² + y²/b² = 1   or   Ha,b : x²/a² − y²/b² = 1   or   Pp : y² = 2px.

Moreover, we can either carefully select direct isometries at each step or ensure at the end that the
composition of all isometries used is direct. So, the curves described by an equation of the form
(10.7) are conic sections and can be reduced to such curves via direct isometries (displacements).

Example 10.2 (Ellipse). Consider the curve with equation

C : 73x2 + 72xy + 52y 2 − 10x + 55y + 25 = 0. (10.9)


It is the ellipse in the image above. However, a priori, it is not at all clear that C is an ellipse. The
symmetric matrix associated to this equation is
" #
73 36
Q=
36 52

and, in matrix form, Equation (10.9) becomes


" #" # " #
h i 73 36 x h i x
C: x y + −10 55 + 25 = 0. (10.10)
36 52 y | {z } y
| {z }
b
Q

The eigenvalues of Q are λ1 = 100 and λ2 = 25. Corresponding eigenvectors are (4, 3) for λ1 and
(3, −4) for λ2 . These two vectors form an orthogonal basis. Thus, an orthonormal basis is B ′ =
(e′1 (4/5, 3/5), e′2 (3/5, −4/5)). The change of basis matrix from B ′ to B is
" #
1 4 3
MB,B ′ = .
5 3 −4

We know that this matrix is orthogonal, which in this example is easy to check directly. In particular, M⁻¹B,B′ = MᵀB,B′ . Moreover, the determinant is −1, which means that MB,B′ is not a direct isometry, i.e., MB,B′ ∈ O(2)\SO(2). If we change the direction of the second eigenvector, we still have an eigenvector for the eigenvalue λ2 , but now, with respect to the basis B′ = (e′1 (4/5, 3/5), e′2 (−3/5, 4/5)), we have

MB,B′ = (1/5) [4 −3; 3 4] ∈ SO(2).

Changing the frame from K to K′ = (O, B ′ ), Equation (10.9) becomes


" ′# " ′#
h
′ ′
i
T x x
C : x y MB,B ′ Q MB,B ′ ′ + b MB,B ′ ′ + 25 = 0
y y


which one calculates to be


C : [x′ y′] [100 0; 0 25] [x′; y′] + [25 50] [x′; y′] + 25 = 0,   where Q′ = [100 0; 0 25] and b′ = [25 50],

and we have
C : 100x′2 + 25y ′2 + 25x′ + 50y ′ + 25 = 0
equivalently
C : 4x′2 + y ′2 + x′ + 2y ′ + 1 = 0
equivalently

C : 4(x′² + x′/4 + 1/64) − 1/16 + (y′² + 2y′ + 1) − 1 + 1 = 0

equivalently

C : 4(x′ + 1/8)² + (y′ + 1)² − 1/16 = 0.

Now, let us change the frame again, using a translation by the vector (−1/8, −1). The new frame is K′′ = (O′′, B′′), where O′′ = (−1/8, −1), and the basis B′′ = B′ does not change. In K′′, the equation of C becomes

C : 4x′′² + y′′² − 1/16 = 0   ⇔   x′′²/(1/64) + y′′²/(1/16) − 1 = 0.

Clearly, this is the equation of an ellipse. However, it is not yet in canonical form because the focal
points are on the y-axis. To obtain the canonical form, we perform a final coordinate change and
permute the coordinate axes, for instance with
" # " #
0 −1 0 1
∈ SO(2) or with ∈ O(2).
1 0 1 0


With either of the two transformations we obtain


C = E1/4,1/8 : x′′′²/(1/16) + y′′′²/(1/64) − 1 = 0.

To recap:

• We changed the coordinates from K to K′ by the rotation MB,B′ of angle θ, where cos(θ) = 4/5.

• We changed the coordinates from K′ to K′′ with a translation by the vector (−1/8, −1).

• We changed the coordinates from K′′ to K′′′ in order to interchange the variables.

• We obtained E1/4,1/8 .
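The rotation used in this example can be double-checked with a short computation (Python with numpy):

```python
import numpy as np

Q = np.array([[73.0, 36.0], [36.0, 52.0]])
b = np.array([-10.0, 55.0])
M = np.array([[4.0, -3.0], [3.0, 4.0]]) / 5.0   # the matrix M_{B,B'} chosen above

# M is orthogonal with determinant +1, i.e. a rotation in SO(2)
assert np.allclose(M.T @ M, np.eye(2))
assert np.isclose(np.linalg.det(M), 1.0)

# It diagonalizes Q with eigenvalues 100 and 25
assert np.allclose(M.T @ Q @ M, np.diag([100.0, 25.0]))

# The transformed linear part is b M_{B,B'} = [25, 50]
assert np.allclose(b @ M, [25.0, 50.0])
```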

Example 10.3 (Hyperbola). Consider the curve with equation


C : −94x2 + 360xy + 263y 2 − 91x + 221y + 169 = 0. (10.11)
It is the hyperbola in the image above. However, a priori, it is not at all clear that C is a hyperbola.
The symmetric matrix associated to this equation is
" #
−94 180
Q=
180 263


and, in matrix form, Equation (10.11) becomes

C : [x y] · Q · [x; y] + b · [x; y] + 169 = 0,   where b = [−91 221].    (10.12)

The eigenvalues of Q are λ1 = 338 and λ2 = −169. Corresponding eigenvectors are (5, 12) for λ1
and (12, −5) for λ2 . These two vectors form an orthogonal basis. Thus, an orthonormal basis is
B ′ = (e′1 (5/13, 12/13), e′2 (12/13, −5/13)). The change of basis matrix from B ′ to B is
" #
1 5 12
MB,B ′ = .
13 12 −5
We know that this matrix is orthogonal, which in this example is easy to check directly. In particular, M⁻¹B,B′ = MᵀB,B′ . Moreover, the determinant is −1, which means that MB,B′ is not a direct isometry, i.e., MB,B′ ∈ O(2)\SO(2). If we change the direction of the second eigenvector, we still have an eigenvector for the eigenvalue λ2 , but now, with respect to the basis B′ = (e′1 (5/13, 12/13), e′2 (−12/13, 5/13)), we have

MB,B′ = (1/13) [5 −12; 12 5] ∈ SO(2).
Changing the frame from K to K′ = (O, B ′ ), Equation (10.11) becomes
" ′# " ′#
h
′ ′
i
T x x
C : x y MB,B ′ Q MB,B ′ ′ + b MB,B ′ ′ + 169 = 0
y y

which one calculates to be


# " ′#
i x′
" " #
h i 338 0 x h

C: x y ′ + 169 169 + 169 = 0
0 −169 y ′ | {z } y ′
| {z } ′ b
Q′

and we have
C : 338x′2 − 169y ′2 + 169x′ + 169y ′ + 169 = 0


equivalently
C : 2x′2 − y ′2 + x′ + y ′ + 1 = 0
equivalently

C : 2(x′² + x′/2 + 1/16) − 1/8 − (y′² − y′ + 1/4) + 1/4 + 1 = 0

equivalently

C : 2(x′ + 1/4)² − (y′ − 1/2)² + 9/8 = 0.

Now, let us change the frame again, using a translation by the vector (−1/4, 1/2). The new frame is K′′ = (O′′, B′′), where O′′ = (−1/4, 1/2), and the basis B′′ = B′ does not change. In K′′, the equation of C becomes

C : 2x′′² − y′′² + 9/8 = 0   ⇔   −x′′²/(9/16) + y′′²/(9/8) − 1 = 0.
Clearly, this is the equation of a hyperbola. However, it is not yet in canonical form because the focal
points are on the y-axis. To obtain the canonical form, we perform a final coordinate change and
permute the coordinate axes, for instance with
" # " #
0 −1 0 1
∈ SO(2) or with ∈ O(2).
1 0 1 0

With either of the two transformations we obtain


C = H3/(2√2), 3/4 : x′′′²/(9/8) − y′′′²/(9/16) − 1 = 0.


To recap:

• We changed the coordinates from K to K′ by the rotation MB,B′ of angle θ, where cos(θ) = 5/13.

• We changed the coordinates from K′ to K′′ with a translation by the vector (−1/4, 1/2).

• We changed the coordinates from K′′ to K′′′ in order to interchange the variables.

• We obtained H3/(2√2), 3/4 .

Example 10.4 (Parabola). Consider the curve with equation


C : 128x2 + 480xy + 450y 2 + 391x + 119y + 289 = 0. (10.13)
It is the parabola in the image above. However, a priori, it is not at all clear that C is a parabola. The
symmetric matrix associated to this equation is
" #
128 240
Q=
240 225
and, in matrix form Equation, (10.13) becomes
" #" # " #
h i 128 240 x h i x
C: x y + 391 119 + 25 = 0. (10.14)
240 225 y | {z } y
| {z }
b
Q


The eigenvalues of Q are λ1 = 578 and λ2 = 0. Corresponding eigenvectors are (8, 15) for λ1 and
(15, −8) for λ2 . These two vectors form an orthogonal basis. Thus, an orthonormal basis is B ′ =
(e′1 (8/17, 15/17), e′2 (15/17, −8/17)). The change of basis matrix from B ′ to B is
" #
1 8 15
MB,B ′ = .
17 15 −8
We know that this matrix is orthogonal, which in this example is easy to check directly. In particular, M⁻¹B,B′ = MᵀB,B′ . Moreover, the determinant is −1, which means that MB,B′ is not a direct isometry, i.e., MB,B′ ∈ O(2)\SO(2). If we change the direction of the second eigenvector, we still have an eigenvector for the eigenvalue λ2 , but now, with respect to the basis B′ = (e′1 (8/17, 15/17), e′2 (−15/17, 8/17)), we have

MB,B′ = (1/17) [8 −15; 15 8] ∈ SO(2).
Changing the frame from K to K′ = (O, B ′ ), Equation (10.13) becomes
" ′# " ′#
h
′ ′
i
T x x
C : x y MB,B ′ Q MB,B ′ ′ + b MB,B ′ ′ + 289 = 0
y y
which one calculates to be
C : [x′ y′] [578 0; 0 0] [x′; y′] + [289 −289] [x′; y′] + 289 = 0,   where Q′ = [578 0; 0 0] and b′ = [289 −289],

and we have
C : 578x′2 + 289x′ − 289y ′ + 289 = 0
equivalently
C : 2x′2 + x′ − y ′ + 1 = 0
equivalently

C : 2(x′² + x′/2 + 1/16) − 1/8 − y′ + 1 = 0

equivalently

C : 2(x′ + 1/4)² − (y′ − 7/8) = 0.


Now, let us change the frame again, using a translation by the vector (−1/4, 7/8). The new frame is K′′ = (O′′, B′′), where O′′ = (−1/4, 7/8), and the basis B′′ = B′ does not change. In K′′, the equation of C becomes

C : 2x′′² − y′′ = 0   ⇔   x′′² = (1/2) y′′.
Clearly, this is the equation of a parabola. However, it is not yet in canonical form because the
focal point is on the y-axis. To obtain the canonical form, we perform a final coordinate change and
permute the coordinate axes, for instance with
" # " #
0 −1 0 1
∈ SO(2) or with ∈ O(2).
1 0 1 0

With either of the two transformations we obtain


C = P1/4 : y′′′² = 2 · (1/4) x′′′.

To recap:

• We changed the coordinates from K to K′ by the rotation MB,B′ of angle θ, where cos(θ) = 8/17.


• We changed the coordinates from K′ to K′′ with a translation by the vector (−1/4, 7/8).


• We changed the coordinates from K′′ to K′′′ in order to interchange the variables.
• We obtained P1/4 .

10.3.2 Algorithm 1: Isometric invariants


The discussion in Subsection 10.3.1 is an algebraic case-by-case analysis which may appear lengthy.
It is the honest way of doing mathematics. However, it may be difficult to carry out such steps.
If we only want to know what type of conic we are dealing with, it is not difficult to extract a
recipe which allows us to determine what type of curve a given quadratic equation describes. For
this, notice that we may write the Equation (10.1) of Q in a slightly different matrix form. We only
do this in dimension 2, so consider the quadratic curve:
C : q11 x2 + 2q12 xy + q22 y 2 + 2b1 x + 2b2 y + c = 0. (10.15)
The equation can be rewritten as

[x y 1] · Q̂ · [x; y; 1] = 0,   where   Q̂ = [q11 q12 b1; q12 q22 b2; b1 b2 c].

The matrix Q̂ is symmetric and we call it the extended symmetric matrix associated to Equation (10.15)
of the conic C. Notice that changing coordinates via a rotation R [Step 1], followed by a translation
with a vector v [Step 2], amounts to a single matrix multiplication:
 ′
# x ′  x ′ 
     
x  " # x  " −1 −1
 ′  R v  
  R −R v  ′   ′ 
y  = y  ⇔ y  = y 
  0 1   0 1    
1 1 | {z } 1 1
=M

after which, the equation becomes


 ′
h i x 
T b 
x y 1 M QM y ′  = 0.
′ ′ 
1
 

Observe that det(Q̂) does not change when reducing to the canonical form with displacements. We say that det(Q̂) is invariant. Indeed, det(M) = det(R⁻¹) = 1, so M does not affect the determinant of Q̂:

det(Mᵀ Q̂ M) = det(Mᵀ) det(Q̂) det(M) = det(Q̂).
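This invariance is easy to test numerically. In the sketch below (Python with numpy), Q̂ is built from the conic (10.9) of Example 10.2 — note that in the normalization (10.15) its linear part gives b1 = −5 and b2 = 55/2 — and M encodes a random displacement:

```python
import numpy as np

rng = np.random.default_rng(0)

# Extended matrix Q_hat of the conic (10.9), with b1 = -5, b2 = 27.5, c = 25
Qh = np.array([[73.0, 36.0, -5.0],
               [36.0, 52.0, 27.5],
               [-5.0, 27.5, 25.0]])

# A random displacement: rotation R in SO(2) followed by a translation v
theta = rng.uniform(0, 2 * np.pi)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
v = rng.uniform(-5, 5, size=2)

Rinv = R.T                       # R^{-1} = R^T for a rotation
M = np.eye(3)
M[:2, :2] = Rinv
M[:2, 2] = -Rinv @ v             # the block matrix M of the text

# det(Q_hat) is unchanged under the displacement, since det(M) = 1
assert np.isclose(np.linalg.det(M.T @ Qh @ M), np.linalg.det(Qh))
```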

In order to distinguish between the different conics, we use three invariants. As before, Q denotes
the symmetric matrix associated to Equation (10.15) of the conic C. The invariants are:
D̂ = det(Q̂),   D = det(Q)   and   T = tr(Q).


By checking Cases 1–9 in Subsection 10.3.1, and since D̂, D and T do not change under displacements, we have the following result:
Proposition 10.5. The type of curve described by Equation (10.15) is determined by the following
table.

D̂        D       T          curve C
D̂ = 0    D > 0   -          A point
D̂ = 0    D = 0   -          Two lines or the empty set
D̂ = 0    D < 0   -          Two lines
D̂ ≠ 0    D > 0   D̂T < 0     An ellipse
D̂ ≠ 0    D > 0   D̂T > 0     The empty set
D̂ ≠ 0    D = 0   -          A parabola
D̂ ≠ 0    D < 0   -          A hyperbola

Table 10.1: Classification in dimension 2 via invariants.
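Table 10.1 can be turned into a small decision procedure. The sketch below (Python, exact rational arithmetic) re-derives the types of the three worked examples; note that (10.15) writes the linear part as 2b1 x + 2b2 y, so b1 and b2 are half of the coefficients displayed in (10.9), (10.11) and (10.13):

```python
from fractions import Fraction as F

def det2(a, b, c, d):
    return a * d - b * c

def det3(m):
    # Determinant of a 3x3 matrix by cofactor expansion along the first row
    return (m[0][0] * det2(m[1][1], m[1][2], m[2][1], m[2][2])
          - m[0][1] * det2(m[1][0], m[1][2], m[2][0], m[2][2])
          + m[0][2] * det2(m[1][0], m[1][1], m[2][0], m[2][1]))

def classify(q11, q12, q22, b1, b2, c):
    # Invariants of (10.15): D_hat = det(Q_hat), D = det(Q), T = tr(Q)
    Dh = det3([[q11, q12, b1], [q12, q22, b2], [b1, b2, c]])
    D = det2(q11, q12, q12, q22)
    T = q11 + q22
    if Dh == 0:
        return "point" if D > 0 else ("two lines" if D < 0 else "two lines or empty")
    if D > 0:
        return "ellipse" if Dh * T < 0 else "empty"
    return "parabola" if D == 0 else "hyperbola"

assert classify(73, 36, 52, F(-10, 2), F(55, 2), 25) == "ellipse"        # (10.9)
assert classify(-94, 180, 263, F(-91, 2), F(221, 2), 169) == "hyperbola" # (10.11)
assert classify(128, 240, 450, F(391, 2), F(119, 2), 289) == "parabola"  # (10.13)
```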

10.3.3 Affine classification


In the isometric classification, we allowed only coordinate changes that are isometries. In fact, we
noticed that it suffices to use direct isometries to reduce to the canonical form. The non-degenerate
curves we obtained are:
Ea,b : x²/a² + y²/b² = 1   or   Ha,b : x²/a² − y²/b² = 1   or   Pp : y² = 2px.
It is possible further to change the coordinates and rescale the basis vectors of the coordinate axes.
Rescalings are clearly not isometries.
Example 10.6. Let us consider the case of Ea,b . If we replace x with ax and y with by, we are performing the coordinate change

[x; y] = [a 0; 0 b] [x′; y′].

Then the equation of Ea,b becomes:

Ea,b : [x y] [1/a² 0; 0 1/b²] [x; y] = 1   ⇔   [x′ y′] [a 0; 0 b] [1/a² 0; 0 1/b²] [a 0; 0 b] [x′; y′] = 1.

Thus, after changing coordinates, the equation of Ea,b becomes

x′² + y′² = 1.

Such affine transformations can, in general, be applied to Equation (10.6) for hyperquadrics.
However, if we restrict our attention to E2 , a case-by-case analysis shows that curves defined by a
quadratic equation correspond to the possibilities listed in the following table. In this table, the
second column indicates the signature of Q.


r = rank Q   (p, r − p)         equation          name
2            (0, 2) or (2, 0)   x² + y² + 1 = 0   imaginary ellipse
2            (1, 1)             x² − y² − 1 = 0   hyperbola
2            (0, 2) or (2, 0)   x² + y² − 1 = 0   ellipse
2            (0, 2) or (2, 0)   x² + y² = 0       two complex lines
2            (1, 1)             x² − y² = 0       two real lines
1            (0, 1) or (1, 0)   x² + 1 = 0        two complex lines
1            (0, 1) or (1, 0)   x² − 1 = 0        two real lines
1            (0, 1) or (1, 0)   x² = 0            a real double-line
1            (0, 1) or (1, 0)   x² − y = 0        parabola

Table 10.2: Affine classification in dimension 2.

10.3.4 Algorithm 2: Lagrange’s method


One reason the discussion in Subsection 10.3.1 is lengthy is the interpretation we added to each
step (such as rotation, translation, and the fact that eigenvectors determine the directions of the
new axes). We did this because we aimed to change the reference frame via displacements, without
metrically altering the objects.
If we are only interested in the shape of the object, we are allowed to use affine transformations,
which are changes of coordinates. This allows us to focus on the algebra behind the calculations
and the method of bringing to canonical form becomes particularly simple. This point of view is
sometimes attributed to Lagrange and works in any dimension.
Fix a quadratic curve, i.e., a hyperquadric in dimension 2

C : q11 x2 + 2q12 xy + q22 y 2 + b1 x + b2 y + c = 0. (10.16)

(Step 1) Eliminate the mixed terms by completing the squares.

(Step 2) Eliminate the linear terms by completing the squares.

(Step 3) Then, with respect to some frame, the curve C has the equation

ax2 + by 2 + c = 0 or ax2 + by + c = 0

where a, b, c are obtained in Steps 1 and 2. It is now easy to identify the type of curve (see Table
10.2).

This works because Step 1 and Step 2 correspond to affine changes of coordinates.
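For a binary quadratic form, Step 1 amounts to one completion of a square. A minimal sketch (Python, exact arithmetic, assuming q11 ≠ 0), applied to the quadratic parts of the three examples of Subsection 10.3.1:

```python
from fractions import Fraction as F

def lagrange_diagonal(q11, q12, q22):
    # One Lagrange step (assuming q11 != 0): completing the square in x,
    #   q11 x^2 + 2 q12 xy + q22 y^2 = q11 (x + (q12/q11) y)^2 + (q22 - q12^2/q11) y^2,
    # which is the affine change of coordinates u = x + (q12/q11) y, v = y.
    return F(q11), F(q22) - F(q12) ** 2 / F(q11)

# Signs of the diagonal coefficients give the affine type (compare Table 10.2):
a, b = lagrange_diagonal(73, 36, 52)       # quadratic part of (10.9)
assert a > 0 and b > 0                     # signature (2, 0): ellipse type

a, b = lagrange_diagonal(-94, 180, 263)    # quadratic part of (10.11)
assert a * b < 0                           # signature (1, 1): hyperbola type

a, b = lagrange_diagonal(128, 240, 450)    # quadratic part of (10.13)
assert b == 0                              # rank 1: parabola type
```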

10.4 Classification of quadrics


A hyperquadric in E3 is a surface given by an equation of the form

C : q11 x2 + q22 y 2 + q33 z2 + 2q12 xy + 2q23 yz + 2q13 xz + b1 x + b2 y + b3 z + c = 0. (10.17)


From the discussion in Section 10.2, we may apply an orthogonal change of coordinates and a trans-
lation to change the frame so that Equation (10.17) becomes

Q : λ1 x12 + λ2 x22 + · · · + λr xr2 + vr+1 xr+1 + · · · + vn xn = k. (10.18)

The isometric classification in this case is similar: we work out all possible cases to see what we
obtain. One important remark is that the change of basis matrix in Step 1 – the matrix MB,B ′′ used
to obtain Equation (10.18) – is an element of the group SO(3). Thus, by Euler’s theorem (Theorem
7.17), this transformation is a rotation around an axis.
However, as in Section 10.3.3 and Section 10.3.4, one can ‘stretch’ the coordinate axes with affine
transformations which are not isometries, to show that one may change the frame so that Equation
(10.17) is one of the possibilities listed in the following table.

r = rank Q   (p, r − p)         equation               name
3            (3, 0) or (0, 3)   x² + y² + z² − 1 = 0   ellipsoid
3            (2, 1) or (1, 2)   x² + y² − z² − 1 = 0   hyperboloid of one sheet
3            (2, 1) or (1, 2)   x² − y² − z² − 1 = 0   hyperboloid of two sheets
3            (3, 0) or (0, 3)   x² + y² + z² + 1 = 0   imaginary ellipsoid
3            (3, 0) or (0, 3)   x² + y² + z² = 0       imaginary cone
3            (2, 1) or (1, 2)   x² + y² − z² = 0       (real, elliptic) cone
2            (2, 0) or (0, 2)   x² + y² + 1 = 0        cylinder on imaginary ellipse
2            (1, 1)             x² − y² − 1 = 0        cylinder on hyperbola
2            (2, 0) or (0, 2)   x² + y² − 1 = 0        cylinder on ellipse
2            (2, 0) or (0, 2)   x² + y² = 0            cylinder on two complex lines
2            (1, 1)             x² − y² = 0            cylinder on two real lines
1            (1, 0) or (0, 1)   x² + 1 = 0             two complex planes
1            (1, 0) or (0, 1)   x² − 1 = 0             two real planes
1            (1, 0) or (0, 1)   x² = 0                 a double plane
2            (2, 0) or (0, 2)   x² + y² − z = 0        elliptic paraboloid (EP)
2            (1, 1)             x² − y² − z = 0        hyperbolic paraboloid (HP)
1            (1, 0) or (0, 1)   x² + y = 0             cylinder on parabola

CHAPTER 11

Canonical equations of real quadrics

Contents
11.1 Ellipsoid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
11.1.1 Canonical equation - global description . . . . . . . . . . . . . . . . . . . . . . 161
11.1.2 Tangent planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
11.1.3 Parametrizations - local description . . . . . . . . . . . . . . . . . . . . . . . . . 163
11.2 Elliptic Cone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
11.2.1 Canonical equation - global description . . . . . . . . . . . . . . . . . . . . . . 163
11.2.2 Conic sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
11.2.3 Tangent planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
11.2.4 Ca,b,c as ruled surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
11.2.5 Parametrizations - local description . . . . . . . . . . . . . . . . . . . . . . . . . 165
11.3 Hyperboloid of one sheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
11.3.1 Canonical equation - global description . . . . . . . . . . . . . . . . . . . . . . 165
11.3.2 Tangent planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
11.3.3 H1a,b,c as ruled surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
11.3.4 Parametrizations - local description . . . . . . . . . . . . . . . . . . . . . . . . . 169
11.4 Hyperboloid of two sheets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
11.4.1 Canonical equation - global description . . . . . . . . . . . . . . . . . . . . . . 170
11.4.2 Tangent planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
11.4.3 Parametrizations - local description . . . . . . . . . . . . . . . . . . . . . . . . . 172
11.5 Elliptic paraboloid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
11.5.1 Canonical equation - global description . . . . . . . . . . . . . . . . . . . . . . 172
11.5.2 Tangent planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
11.5.3 Parametrizations - local description . . . . . . . . . . . . . . . . . . . . . . . . . 174


11.6 Hyperbolic paraboloid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174


11.6.1 Canonical equation - global description . . . . . . . . . . . . . . . . . . . . . . 174
11.6.2 Tangent planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
11.6.3 Pha,b as ruled surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
11.6.4 Parametrizations - local description . . . . . . . . . . . . . . . . . . . . . . . . . 178


Here we look at the main examples of quadrics in E3 , sometimes called quadratic surfaces. These
are surfaces which satisfy a quadratic equation of the form

S : q11 x2 + q22 y 2 + q33 z2 + 2q12 xy + 2q13 xz + 2q23 yz + a1 x + a2 y + a3 z + c = 0. (11.1)

Notice that an equation as above may not define a surface. It could happen that there are no solutions
or that the coordinates of only one point satisfy a given quadratic equation.

11.1 Ellipsoid
11.1.1 Canonical equation - global description
An ellipsoid is a surface which (in some coordinate system) satisfies an equation of the form

Ea,b,c : x²/a² + y²/b² + z²/c² = 1,   so   Ea,b,c = ϕ⁻¹(1)   where ϕ(x, y, z) = x²/a² + y²/b² + z²/c²,

for some positive constants a, b, c ∈ R.

Figure 11.1: Ellipsoid (image source: Wikipedia)

If we fix one variable, we obtain intersections with planes parallel to the coordinate planes. For this surface the intersections are either ellipses, points or the empty set. Check this for z = h and deduce the axes of the ellipses that you obtain.

11.1.2 Tangent planes


Using the gradient or the algebraic method, we obtain the tangent plane at a point p = (xp , yp , zp ) ∈ Ea,b,c to be

Tp Ea,b,c : xp x/a² + yp y/b² + zp z/c² = 1.


We will check this with the algebraic method. Let us consider a line passing through p in parametric form

l : (x, y, z) = (xp , yp , zp ) + t (vx , vy , vz )   ⇔   l = p + ⟨v⟩,   where v = (vx , vy , vz ).

Recall that the tangent plane Tp Ea,b,c is the union of all lines intersecting the quadric Ea,b,c at p in a
special way, i.e. in a double point. This is why we investigate the intersection of our quadric with l.
How do we obtain the intersection Ea,b,c ∩ l? We look at those points of l which satisfy the equation
of the ellipsoid, i.e. we look for solutions t for the equation
   
ϕ(p + t v) = 1   ⇔   (xp + tvx )²/a² + (yp + tvy )²/b² + (zp + tvz )²/c² − 1 = 0.
   

If we rearrange the left-hand side as a polynomial in t we get

(v_x²/a² + v_y²/b² + v_z²/c²) t² + 2 (x_p v_x/a² + y_p v_y/b² + z_p v_z/c²) t + (x_p²/a² + y_p²/b² + z_p²/c² − 1) = 0.

The constant term vanishes because x_p²/a² + y_p²/b² + z_p²/c² = 1 (p lies on the ellipsoid), while the leading coefficient is nonzero, so

(v_x²/a² + v_y²/b² + v_z²/c²) t² + 2 (x_p v_x/a² + y_p v_y/b² + z_p v_z/c²) t = 0.

This equation in t admits the solution t = 0. That is clear, the point on l corresponding to t = 0 is
the point p which, by assumption, lies on Ea,b,c . Furthermore, the second solution will correspond to
a second point of intersection. However, the line l is tangent to our quadric if p is a double point of
intersection. This happens if and only if the above equation has t = 0 as double solution, i.e. if and
only if
x_p v_x/a² + y_p v_y/b² + z_p v_z/c² = 0.
How do we interpret this? In the Euclidean setting this can be interpreted as saying that (x_p/a², y_p/b², z_p/c²) is perpendicular to the direction vector v of the line. All lines which are tangent to the surface and contain p need to satisfy this condition, so

T_p E_{a,b,c} : (x_p/a², y_p/b², z_p/c²) · (x − x_p, y − y_p, z − z_p) = 0   ⇔   x_p x/a² + y_p y/b² + z_p z/c² − 1 = 0.

Deduce this equation also with the gradient.
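The double-root criterion is easy to test numerically. The Python sketch below (all constants are arbitrary choices) picks a point p on the ellipsoid, a direction v orthogonal to (x_p/a², y_p/b², z_p/c²), and confirms that ϕ(p + t v) − 1 has t = 0 as a double root:

```python
import math

a, b, c = 2.0, 3.0, 1.0  # assumed semi-axes (any positive values work)

# A point p on the ellipsoid, taken from the parametrization below.
t1, t2 = 0.7, 0.3
p = (a*math.cos(t1)*math.cos(t2), b*math.sin(t1)*math.cos(t2), c*math.sin(t2))
n = (p[0]/a**2, p[1]/b**2, p[2]/c**2)  # normal direction (x_p/a^2, y_p/b^2, z_p/c^2)

# A direction v orthogonal to n, hence tangent at p.
v = (-n[1], n[0], 0.0)

# Coefficients of phi(p + t v) - 1 as a polynomial A t^2 + B t + C.
A = v[0]**2/a**2 + v[1]**2/b**2 + v[2]**2/c**2
B = 2*(p[0]*v[0]/a**2 + p[1]*v[1]/b**2 + p[2]*v[2]/c**2)
C = p[0]**2/a**2 + p[1]**2/b**2 + p[2]**2/c**2 - 1

# C = 0 says p lies on the ellipsoid; B = 0 says t = 0 is a double root,
# so the line p + <v> is tangent to the surface at p.
```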


11.1.3 Parametrizations - local description


A parametrization of this surface is

x(θ₁, θ₂) = a cos(θ₁) cos(θ₂)
y(θ₁, θ₂) = b sin(θ₁) cos(θ₂)        θ₁ ∈ [0, 2π[,  θ₂ ∈ [−π/2, π/2[
z(θ₁, θ₂) = c sin(θ₂)
Why? Check this and deduce a parametrization of the tangent plane Tp Ea,b,c at the point
 
p = ( x(θ_{1,p}, θ_{2,p}), y(θ_{1,p}, θ_{2,p}), z(θ_{1,p}, θ_{2,p}) ) ∈ E_{a,b,c}.
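One way to parametrize the tangent plane is p plus the span of the two partial derivatives of the parametrization. The Python sketch below (semi-axes and parameter values are arbitrary choices) verifies numerically that both partials are orthogonal to the normal direction (x_p/a², y_p/b², z_p/c²), so they indeed span T_p E_{a,b,c}:

```python
import math

a, b, c = 2.0, 1.5, 1.0  # assumed semi-axes
t1, t2 = 1.1, 0.4        # parameters of the point p

def sigma(u, v):
    return (a*math.cos(u)*math.cos(v), b*math.sin(u)*math.cos(v), c*math.sin(v))

p = sigma(t1, t2)

# Numerical partial derivatives of sigma at (t1, t2): two directions
# spanning the tangent plane T_p E_{a,b,c}.
eps = 1e-6
d1 = tuple((sigma(t1+eps, t2)[i] - sigma(t1-eps, t2)[i]) / (2*eps) for i in range(3))
d2 = tuple((sigma(t1, t2+eps)[i] - sigma(t1, t2-eps)[i]) / (2*eps) for i in range(3))

# Both must be orthogonal to the normal (x_p/a^2, y_p/b^2, z_p/c^2).
n = (p[0]/a**2, p[1]/b**2, p[2]/c**2)
dot1 = sum(n[i]*d1[i] for i in range(3))
dot2 = sum(n[i]*d2[i] for i in range(3))
```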
 

11.2 Elliptic Cone


11.2.1 Canonical equation - global description
An elliptic cone is a surface which (in some coordinate system) satisfies an equation of the form
C_{a,b,c} : x²/a² + y²/b² − z²/c² = 0,   so   C_{a,b,c} = ϕ⁻¹(0),

where ϕ(x, y, z) denotes the left-hand side,

for some positive constants a, b, c ∈ R.

Figure 11.2: Elliptic cone2

If we fix one variable, we obtain intersections with planes parallel to the coordinate planes. For this
surface the intersections are either ellipses, hyperbolas or a point. Check this for z = h and deduce
the axes of the ellipses that you obtain.
2 Image source: Wikipedia


11.2.2 Conic sections


Above we noticed that the intersections of an elliptic cone with planes parallel to the coordinate planes are (possibly degenerate) quadratic curves. In fact we have the following result.

Proposition 11.1. The intersection of an elliptic cone with an arbitrary plane is a (possibly degener-
ate) quadratic curve.

Figure 11.3: Conic sections3

11.2.3 Tangent planes


As in the case of the ellipsoid, using the gradient or the algebraic method we obtain the tangent plane
at a point p = (xp , yp , zp ) ∈ Ca,b,c to be

T_p C_{a,b,c} : x_p x/a² + y_p y/b² − z_p z/c² = 0.

11.2.4 Ca,b,c as ruled surface


A ruled surface is a surface S such that for any point p on the surface S there is a line l which contains
p and which is contained in S:

∀p ∈ S, ∃ a line l such that p ∈ l and l ⊆ S.

If we denote by L the family of all these lines, it is easy to see that the surface is the union of them:
S = ⋃_{l∈L} l.

3 Image source: Wikipedia


The lines in L are called rectilinear generators of the surface S. We refer to them as generators since we
don’t consider here non-rectilinear generators.
So, how is the cone a ruled surface? Fix a point (x0 , y0 , z0 ) ∈ Ca,b,c and notice that for any t ∈ R we
have
(t x₀)²/a² + (t y₀)²/b² − (t z₀)²/c² = t² (x₀²/a² + y₀²/b² − z₀²/c²) = 0.
Thus, the line {(t x₀, t y₀, t z₀) : t ∈ R}, which passes through the given point and the origin, is contained in C_{a,b,c}. The set of all lines obtained in this way forms the family of generators L of the cone.
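The computation above can be replayed numerically. In the Python sketch below (cone parameters are arbitrary choices), we take a point on the cone and check that all of its scalar multiples stay on the cone:

```python
import math

a, b, c = 1.0, 2.0, 3.0  # assumed cone parameters

def phi(x, y, z):
    return x**2/a**2 + y**2/b**2 - z**2/c**2

# A point on C_{a,b,c}: at height z = c we get the ellipse x^2/a^2 + y^2/b^2 = 1.
theta = 0.8
p0 = (a*math.cos(theta), b*math.sin(theta), c)

# The whole line {t * p0 : t in R} through p0 and the origin stays on the cone.
on_cone = [math.isclose(phi(t*p0[0], t*p0[1], t*p0[2]), 0.0, abs_tol=1e-12)
           for t in (-2.0, -0.5, 0.0, 1.0, 3.7)]
```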

11.2.5 Parametrizations - local description


The way we parametrize an elliptic cone generalizes to the cone over any planar curve. Suppose you have a parametrization of a curve in the plane Oxy; in our case, for the ellipse
x(θ) = a cos(θ),  y(θ) = b sin(θ),   θ ∈ [0, 2π[.

You want to rescale this curve with the height such that when the height z = 0 you have a point, and
for all other values of z you have a rescaled version of your curve:



x(θ, h) = h a cos(θ)
y(θ, h) = h b sin(θ)        θ ∈ [0, 2π[,  h ∈ R.
z(θ, h) = h c
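A quick numerical check of this construction (Python sketch; a, b, c are arbitrary choices): every point (h a cos θ, h b sin θ, h c) satisfies the cone's equation, for all heights h including h = 0:

```python
import math

a, b, c = 1.5, 1.0, 2.0  # assumed cone parameters

def sigma(theta, h):
    # Ellipse scaled by the height h, lifted to z = h*c.
    return (h*a*math.cos(theta), h*b*math.sin(theta), h*c)

samples = [sigma(t, h) for t in (0.0, 1.0, 2.5) for h in (-1.0, 0.0, 0.5, 2.0)]
residuals = [x**2/a**2 + y**2/b**2 - z**2/c**2 for (x, y, z) in samples]
```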

11.3 Hyperboloid of one sheet


11.3.1 Canonical equation - global description
A hyperboloid of one sheet is a surface which (in some coordinate system) satisfies an equation of the
form
H1_{a,b,c} : x²/a² + y²/b² − z²/c² = 1,   so   H1_{a,b,c} = ϕ⁻¹(1),   (11.2)

where ϕ(x, y, z) denotes the left-hand side,

for some positive constants a, b, c ∈ R.


Figure 11.4: (a) Hyperboloid of one sheet⁴ (b) Kobe Port Tower

If we fix one variable, we obtain intersections with planes parallel to the coordinate planes. For this
surface the intersections are either ellipses, hyperbolas or two lines. Check this for y = h and deduce
the axes of the hyperbolas that you obtain.

11.3.2 Tangent planes


Using the gradient or the algebraic method we obtain the tangent plane at a point p = (xp , yp , zp ) ∈
H1a,b,c to be
T_p H1_{a,b,c} : x_p x/a² + y_p y/b² − z_p z/c² = 1.
We will check this with the algebraic method. Let us consider a line passing through p in parametric
form
     
x xp  vx 
l : y  = yp  + t vy  ⇔ l = p + ⟨v⟩.
     
z zp vz
     
|{z}
=v

Recall that the tangent plane Tp H1a,b,c is the union of all lines intersecting the quadric H1a,b,c at p in a
special way, i.e. in a double point. This is why we investigate the intersection of our quadric with l.
How do we obtain the intersection H1a,b,c ∩ l? We look at those points of l which satisfy the equation
of our hyperboloid, i.e. we look for solutions t for the equation
   
ϕ(x_p + t v_x, y_p + t v_y, z_p + t v_z) = 1   ⇔   (x_p + t v_x)²/a² + (y_p + t v_y)²/b² − (z_p + t v_z)²/c² − 1 = 0.   (11.3)
   

4 Image source: Wikipedia


If we rearrange the left-hand side as a polynomial in t we get

(v_x²/a² + v_y²/b² − v_z²/c²) t² + 2 (x_p v_x/a² + y_p v_y/b² − z_p v_z/c²) t + (x_p²/a² + y_p²/b² − z_p²/c² − 1) = 0.

The constant term vanishes because x_p²/a² + y_p²/b² − z_p²/c² = 1 (p lies on H1_{a,b,c}), so

(v_x²/a² + v_y²/b² − v_z²/c²) t² + 2 (x_p v_x/a² + y_p v_y/b² − z_p v_z/c²) t = 0,   (11.4)

where this time the leading coefficient may or may not be zero.
This equation in t admits the solution t = 0. That is clear, the point on l corresponding to t = 0 is
the point p which, by assumption, lies on H1a,b,c . Furthermore, a second solution will correspond to
a second point of intersection. However, the line l is tangent to our quadric if p is a double point of
intersection. This happens if and only if the above equation has t = 0 as double solution, i.e. if and
only if
v_x²/a² + v_y²/b² − v_z²/c² ≠ 0   and   x_p v_x/a² + y_p v_y/b² − z_p v_z/c² = 0.
How do we interpret the second condition? In the Euclidean setting this can also be interpreted as saying that (x_p/a², y_p/b², −z_p/c²) is perpendicular to the direction vector v of the line. All lines which are tangent to the surface and contain p need to satisfy this equation, so the tangent plane is

T_p H1_{a,b,c} : (x_p/a², y_p/b², −z_p/c²) · (x − x_p, y − y_p, z − z_p) = 0   ⇔   x_p x/a² + y_p y/b² − z_p z/c² − 1 = 0.

Deduce this equation also with the gradient.


How do we interpret the first condition? If
v_x²/a² + v_y²/b² − v_z²/c² = 0   (11.5)
then equation (11.4) is linear, and has one simple solution t = 0. This means that l intersects H1a,b,c
only once (it punctures the surface in one point). Such lines are not tangent to the surface. How can
we visualize this? Let us start with what we know: the vector v = (vx , vy , vz ) satisfies the equation
(11.5), so we can think of it as the position vector of some point on the cone Ca,b,c . How does this
cone relate to our hyperboloid? Our surface H1a,b,c is the union of hyperbolas (revolving on ellipses
around the z-axis) and if we take the union of all the asymptotes to these hyperbolas we get Ca,b,c (see
Figure 11.5).


Figure 11.5: Hyperboloid and asymptotic cone5

This should help to see that, when the vector v = (vx , vy , vz ) satisfies equation (11.5), the line l is
parallel to a line contained in the cone Ca,b,c . It will therefore intersect H1a,b,c in at most one point.
Notice also that if l ⊆ Ca,b,c , it will not intersect H1a,b,c at all, but this cannot happen in our setting
because we chose the point p such that it lies both on l and on our quadric. In fact, the related question ‘does a given line l intersect H1_{a,b,c}?’ can be answered by investigating equation (11.3) without
the assumption that p lies on the surface H1a,b,c . How would you do this?
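One way to do it is sketched below in Python (the unit values a = b = c = 1 are an arbitrary choice): substitute the line p + tv into the equation and classify the resulting polynomial in t — genuinely quadratic (0, 1 or 2 intersection points), linear (one puncture point), or identically zero (the line lies on the surface):

```python
import math

a, b, c = 1.0, 1.0, 1.0  # assumed: the standard hyperboloid x^2 + y^2 - z^2 = 1

def intersections(p, v, tol=1e-12):
    """Number of solutions t of phi(p + t v) = 1 (math.inf for a line on the surface)."""
    A = v[0]**2/a**2 + v[1]**2/b**2 - v[2]**2/c**2
    B = 2*(p[0]*v[0]/a**2 + p[1]*v[1]/b**2 - p[2]*v[2]/c**2)
    C = p[0]**2/a**2 + p[1]**2/b**2 - p[2]**2/c**2 - 1
    if abs(A) > tol:                      # genuine quadratic
        disc = B*B - 4*A*C
        return 2 if disc > tol else (1 if disc > -tol else 0)
    if abs(B) > tol:                      # linear: one puncture point
        return 1
    return math.inf if abs(C) < tol else 0  # line on the surface / no common point

# v on the asymptotic cone (1 + 0 - 1 = 0), p on the surface: one puncture point.
n_puncture = intersections((1.0, 0.0, 0.0), (1.0, 0.0, 1.0))
# A generic transversal line through p: two intersection points.
n_secant = intersections((1.0, 0.0, 0.0), (1.0, 0.0, 0.0))
# A ruling line (a generator) lies entirely on the surface.
n_ruling = intersections((1.0, 0.0, 0.0), (0.0, 1.0, 1.0))
```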

11.3.3 H1a,b,c as ruled surface


Here is a fact: the hyperboloid with one sheet is a doubly ruled surface (this is visible in Figure 11.4a).
Hmm… I know what a ruled surface is, because cones and cylinders are ruled surfaces, but ‘doubly
ruled’? Doubly ruled just means that it is a ruled surface in two ways. So, there are two distinct
families of lines, L1 and L2 respectively, such that
H1_{a,b,c} = ⋃_{l∈L₁} l   and   H1_{a,b,c} = ⋃_{l∈L₂} l.

One way to see where the two families of lines come from is to rearrange Equation (11.2):
x²/a² + y²/b² − z²/c² = 1  ⇔  x²/a² − z²/c² = 1 − y²/b²  ⇔  (x/a − z/c)(x/a + z/c) = (1 − y/b)(1 + y/b).   (11.6)

Now, assume that the factors in the last equation are not 0; then we can divide to obtain

(x/a − z/c)/(1 − y/b) = (1 + y/b)/(x/a + z/c) = µ/λ

for some parameters λ and µ. We introduced these parameters in order to separate the above equation:

l_{λ,µ} :  λ(x/a − z/c) = µ(1 − y/b)
           µ(x/a + z/c) = λ(1 + y/b).

5 Prof. C. Pintea - lecture notes


What we end up with is a system of two equations, which are linear in x, y, z and which depend on
the parameters λ and µ. For each fixed pair of parameters, λ and µ, we get a line which we denote
with lλ,µ . Reading the above deduction backwards it is easy to see that all points on such a line satisfy
the equation of H1_{a,b,c}. So, we have a family of lines contained in our hyperboloid.
We assumed that the factors in (11.6) are not zero. In fact, you only divide by two of them, so if one of those two is zero, you can flip the above fraction and divide by the other two. That will lead to the same family of lines L₁ = {l_{λ,µ} : λ, µ ∈ R, λ² + µ² ≠ 0}.
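This is easy to verify concretely. The Python sketch below (a, b, c, λ, µ are arbitrary nonzero choices) parametrizes the line l_{λ,µ} by its y-coordinate — solving λ(x/a − z/c) = µ(1 − y/b) and µ(x/a + z/c) = λ(1 + y/b) for x and z — and checks that every point obtained satisfies the equation of the hyperboloid:

```python
a, b, c = 2.0, 1.0, 1.5  # assumed semi-axes
lam, mu = 1.3, 0.7       # fixed parameters of the line l_{lam,mu}

def point_on_line(y):
    # u = x/a - z/c and v = x/a + z/c from the two linear equations of l_{lam,mu}.
    u = (mu/lam) * (1 - y/b)
    v = (lam/mu) * (1 + y/b)
    return (a*(u + v)/2, y, c*(v - u)/2)

residuals = []
for y in (-2.0, 0.0, 0.4, 5.0):
    x, yy, z = point_on_line(y)
    residuals.append(x**2/a**2 + yy**2/b**2 - z**2/c**2 - 1)
```

The check works because (x/a)² − (z/c)² = uv = (1 − y/b)(1 + y/b) = 1 − y²/b².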
OK, what about L₂? The second family of generators (these lines are called generators) is obtained if you group the terms differently:

(x/a − z/c)/(1 + y/b) = (1 − y/b)/(x/a + z/c) = µ/λ.

Then, you obtain:


l̃_{λ,µ} :  λ(x/a − z/c) = µ(1 + y/b)
           µ(x/a + z/c) = λ(1 − y/b).


As above, one can check that points on these lines satisfy the equation of our hyperboloid.
One important thing to notice is that, although we write down two parameters, λ and µ, we don’t
necessarily get distinct lines for distinct parameters: lλ,µ = ltλ,tµ for any nonzero scalar t. So, in fact,
L1 depends on one parameter. More concretely
L₁ = { l_α :  x/a − z/c = α(1 − y/b),  α(x/a + z/c) = 1 + y/b }  ∪  { l_∞ :  1 − y/b = 0,  x/a + z/c = 0 }

and similarly for L2 .

11.3.4 Parametrizations - local description


Two parametrizations of this surface are
σ₁(θ₁, θ₂) = ( a√(1 + θ₂²) cos(θ₁),  b√(1 + θ₂²) sin(θ₁),  c θ₂ )

and

σ₂(θ₁, θ₂) = ( a cosh(θ₂) cos(θ₁),  b cosh(θ₂) sin(θ₁),  c sinh(θ₂) )

for θ1 ∈ [0, 2π[ and θ2 ∈ R. Why? The parameter θ1 is used to rotate on ellipses the curve obtained for
θ1 = 0. What is this curve that we ‘rotate’? Check this and deduce a parametrization of the tangent
plane Tp H1a,b,c at the point
 
p = ( x(θ_{1,p}, θ_{2,p}), y(θ_{1,p}, θ_{2,p}), z(θ_{1,p}, θ_{2,p}) ) ∈ H1_{a,b,c}
 

where θ1,p and θ2,p are the parameters of the point p, i.e. p = p(x(θ1,p , θ2,p ), y(θ1,p , θ2,p ), z(θ1,p , θ2,p )).
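Both parametrizations can be verified numerically. The Python sketch below (semi-axes and sample parameters are arbitrary choices) checks that σ₁ and σ₂ land on H1_{a,b,c}; for σ₁ this uses (1 + θ₂²) − θ₂² = 1, for σ₂ the identity cosh² − sinh² = 1:

```python
import math

a, b, c = 1.0, 2.0, 0.5  # assumed semi-axes

def sigma1(t1, t2):
    r = math.sqrt(1 + t2**2)
    return (a*r*math.cos(t1), b*r*math.sin(t1), c*t2)

def sigma2(t1, t2):
    return (a*math.cosh(t2)*math.cos(t1), b*math.cosh(t2)*math.sin(t1), c*math.sinh(t2))

def residual(q):
    return q[0]**2/a**2 + q[1]**2/b**2 - q[2]**2/c**2 - 1

params = [(t1, t2) for t1 in (0.0, 1.2, 4.0) for t2 in (-1.5, 0.0, 2.0)]
res1 = [residual(sigma1(*p)) for p in params]
res2 = [residual(sigma2(*p)) for p in params]
```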


11.4 Hyperboloid of two sheets


11.4.1 Canonical equation - global description
A hyperboloid of two sheets is a surface which (in some coordinate system) satisfies an equation of the
form
H2_{a,b,c} : x²/a² + y²/b² − z²/c² = −1,   so   H2_{a,b,c} = ϕ⁻¹(−1),

where ϕ(x, y, z) denotes the left-hand side,

for some positive constants a, b, c ∈ R.

Figure 11.6: Hyperboloid of two sheets6

If we fix one variable, we obtain intersections with planes parallel to the coordinate planes. For this
surface the intersections are either ellipses, hyperbolas or the empty set. Check this for y = h and
deduce the axes of the hyperbolas that you obtain.

11.4.2 Tangent planes


Using the gradient or the algebraic method we obtain the tangent plane at a point p = (xp , yp , zp ) ∈
H2a,b,c to be
T_p H2_{a,b,c} : x_p x/a² + y_p y/b² − z_p z/c² = −1.
We will check this with the algebraic method. Let us consider a line passing through p in parametric
form

l : (x, y, z) = (x_p, y_p, z_p) + t (v_x, v_y, v_z)   ⇔   l = p + ⟨v⟩,   where v = (v_x, v_y, v_z).

Recall that the tangent plane Tp H2a,b,c is the union of all lines intersecting the quadric H2a,b,c at p in a
special way, i.e. in a double point. This is why we investigate the intersection of our quadric with l.
How do we obtain the intersection H2a,b,c ∩ l? We look at those points of l which satisfy the equation

6 Image source: Wikipedia


of our hyperboloid, i.e. we look for solutions t for the equation


   
ϕ(x_p + t v_x, y_p + t v_y, z_p + t v_z) = −1   ⇔   (x_p + t v_x)²/a² + (y_p + t v_y)²/b² − (z_p + t v_z)²/c² + 1 = 0.

If we rearrange the left-hand side as a polynomial in t we get

(v_x²/a² + v_y²/b² − v_z²/c²) t² + 2 (x_p v_x/a² + y_p v_y/b² − z_p v_z/c²) t + (x_p²/a² + y_p²/b² − z_p²/c² + 1) = 0.

The constant term vanishes because x_p²/a² + y_p²/b² − z_p²/c² = −1 (p lies on H2_{a,b,c}), so

(v_x²/a² + v_y²/b² − v_z²/c²) t² + 2 (x_p v_x/a² + y_p v_y/b² − z_p v_z/c²) t = 0,   (11.7)

where the leading coefficient may or may not be zero.

This equation in t admits the solution t = 0. That is clear, the point on l corresponding to t = 0 is
the point p which, by assumption, lies on H2a,b,c . Furthermore, a second solution will correspond to
a second point of intersection. However, the line l is tangent to our quadric if p is a double point of
intersection. This happens if and only if the above equation has t = 0 as double solution, i.e. if and
only if
v_x²/a² + v_y²/b² − v_z²/c² ≠ 0   and   x_p v_x/a² + y_p v_y/b² − z_p v_z/c² = 0.
How do we interpret the second condition? Similar to the case of the hyperboloid of one sheet:

T_p H2_{a,b,c} : (x_p/a², y_p/b², −z_p/c²) · (x − x_p, y − y_p, z − z_p) = 0   ⇔   x_p x/a² + y_p y/b² − z_p z/c² + 1 = 0.

Deduce this equation also with the gradient.


How do we interpret the first condition? If
v_x²/a² + v_y²/b² − v_z²/c² = 0   (11.8)

then equation (11.7) is linear, and has one simple solution t = 0. This means that l intersects H2a,b,c
only once (it punctures the surface in one point). Such lines are not tangent to the surface. How can
we visualize this? Let us start with what we know: the vector v = (vx , vy , vz ) satisfies the equation
(11.8), so we can think of it as the position vector of some point on the cone Ca,b,c . How does this cone
relate to our hyperboloid? Our surface H2a,b,c is the union of hyperbolas and if we take the union of
all the asymptotes to these hyperbolas we get Ca,b,c (see Figure 11.5). So, when l is parallel to a line
contained in Ca,b,c , it will intersect H2a,b,c in at most one point. Notice also that if l ⊆ Ca,b,c , it will not
intersect H2a,b,c at all, but this cannot happen because we chose the point p such that it lies both on l
and on our quadric.


11.4.3 Parametrizations - local description


Two parametrizations of this surface are
σ₁(θ₁, θ₂) = ( a√(θ₂² − 1) cos(θ₁),  b√(θ₂² − 1) sin(θ₁),  c θ₂ )

and

σ₂(θ₁, θ₂) = ( a sinh(θ₂) cos(θ₁),  b sinh(θ₂) sin(θ₁),  ε c cosh(θ₂) )

for θ1 ∈ [0, 2π[, θ2 ∈ R and ε ∈ {±1}. Why? The parameter θ1 is used to ‘rotate’ on ellipses the curve
obtained for θ1 = 0. What is this curve? Check this and deduce a parametrization of the tangent
plane Tp H2a,b,c at the point
 
p = ( x(θ_{1,p}, θ_{2,p}), y(θ_{1,p}, θ_{2,p}), z(θ_{1,p}, θ_{2,p}) ) ∈ H2_{a,b,c}
 

where θ1,p and θ2,p are the parameters of the point p, i.e. p = p(x(θ1,p , θ2,p ), y(θ1,p , θ2,p ), z(θ1,p , θ2,p )).
Notice also that with σ2 we have a parametrization for each sheet of this hyperboloid, with ε = 1 we
get one sheet and with ε = −1 we get the other sheet. One should also be careful with where the
parameters live: for σ1 you want to choose θ2 in ] − ∞, −1] ∪ [1, ∞[ so that the square root is defined.
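A numerical check of both parametrizations (Python sketch; a, b, c are arbitrary choices) — σ₁ uses θ₂² − 1 ≥ 0, σ₂ uses sinh² − cosh² = −1, and the sign of ε selects the sheet:

```python
import math

a, b, c = 1.0, 1.0, 2.0  # assumed semi-axes

def sigma1(t1, t2):       # needs |t2| >= 1 for the square root
    r = math.sqrt(t2**2 - 1)
    return (a*r*math.cos(t1), b*r*math.sin(t1), c*t2)

def sigma2(t1, t2, eps):  # eps = +1 or -1 selects the sheet
    return (a*math.sinh(t2)*math.cos(t1), b*math.sinh(t2)*math.sin(t1),
            eps*c*math.cosh(t2))

def residual(q):
    return q[0]**2/a**2 + q[1]**2/b**2 - q[2]**2/c**2 + 1

res = [residual(sigma1(t1, t2)) for t1 in (0.3, 2.0) for t2 in (-3.0, 1.0, 1.8)]
res += [residual(sigma2(t1, t2, e)) for t1 in (0.3, 2.0) for t2 in (0.0, 1.4)
        for e in (1, -1)]
# sigma2 with eps = +1 lands on the sheet z > 0, eps = -1 on the sheet z < 0.
signs = [sigma2(0.5, 0.7, e)[2] > 0 for e in (1, -1)]
```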

11.5 Elliptic paraboloid


11.5.1 Canonical equation - global description
An elliptic paraboloid is a surface which (in some coordinate system) satisfies an equation of the form

P^e_{a,b} : x²/a + y²/b − 2z = 0,   so   P^e_{a,b} = ϕ⁻¹(0),

where ϕ(x, y, z) denotes the left-hand side,

for some positive constants a, b ∈ R.

Figure 11.7: Elliptic paraboloid7

7 Image source: Wikipedia


If we fix one variable, we obtain intersections with planes parallel to the coordinate planes. For this
surface the intersections are either ellipses, parabolas or the empty set. Check this for y = h and see
what parabolas you obtain.

11.5.2 Tangent planes


Using the gradient or the algebraic method we obtain the tangent plane at a point p = (x_p, y_p, z_p) ∈ P^e_{a,b} to be

T_p P^e_{a,b} : x_p x/a + y_p y/b − z_p − z = 0.
We will check this with the algebraic method. Let us consider a line passing through p in parametric
form

l : (x, y, z) = (x_p, y_p, z_p) + t (v_x, v_y, v_z)   ⇔   l = p + ⟨v⟩,   where v = (v_x, v_y, v_z).
Recall that the tangent plane T_p P^e_{a,b} is the union of all lines intersecting the quadric P^e_{a,b} at p in a special way, i.e. in a double point. This is why we investigate the intersection of our quadric with l. How do we obtain the intersection P^e_{a,b} ∩ l? We look at those points of l which satisfy the equation of our paraboloid, i.e. we look for solutions t for the equation
   
ϕ(x_p + t v_x, y_p + t v_y, z_p + t v_z) = 0   ⇔   (x_p + t v_x)²/a + (y_p + t v_y)²/b − 2(z_p + t v_z) = 0.
   

If we rearrange the left-hand side as a polynomial in t we get

(v_x²/a + v_y²/b) t² + 2 (x_p v_x/a + y_p v_y/b − v_z) t + (x_p²/a + y_p²/b − 2z_p) = 0.

The constant term vanishes because x_p²/a + y_p²/b = 2z_p (p lies on P^e_{a,b}), while the leading coefficient is nonzero (for v not parallel to the z-axis), so

(v_x²/a + v_y²/b) t² + 2 (x_p v_x/a + y_p v_y/b − v_z) t = 0.   (11.9)

This equation in t admits the solution t = 0. That is clear, the point on l corresponding to t = 0 is the point p which, by assumption, lies on P^e_{a,b}. Furthermore, a second solution will correspond to
a second point of intersection. However, the line l is tangent to our quadric if p is a double point of
intersection. This happens if and only if the above equation has t = 0 as double solution, i.e. if and
only if
x_p v_x/a + y_p v_y/b − v_z = 0.


How do we interpret this condition? Similar to the quadrics treated in the previous sections, so

T_p P^e_{a,b} : (x_p/a, y_p/b, −1) · (x − x_p, y − y_p, z − z_p) = 0   ⇔   x_p x/a + y_p y/b − z_p − z = 0.
     

Deduce this equation also with the gradient.

11.5.3 Parametrizations - local description


A parametrization of this surface is
√ 
 √aθ2 cos(θ1 )
σ (θ1 , θ2 ) =  bθ2 sin(θ1 )  θ1 ∈ [0, 2π[ θ2 ∈ [0, ∞[
 
θ2 /2
 

Why? Check this and deduce a parametrization of the tangent plane T_p P^e_{a,b} at the point

p = ( x(θ_{1,p}, θ_{2,p}), y(θ_{1,p}, θ_{2,p}), z(θ_{1,p}, θ_{2,p}) ) ∈ P^e_{a,b},

where θ1,p and θ2,p are the parameters of the point p, i.e. p = p(x(θ1,p , θ2,p ), y(θ1,p , θ2,p ), z(θ1,p , θ2,p )).
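A numerical check (Python sketch; a, b are arbitrary positive choices) that the map (θ₁, θ₂) ↦ (√(aθ₂) cos θ₁, √(bθ₂) sin θ₁, θ₂/2) lands on the elliptic paraboloid — the key identity is aθ₂ cos²θ₁/a + bθ₂ sin²θ₁/b = θ₂ = 2z:

```python
import math

a, b = 2.0, 0.5  # assumed positive constants

def sigma(t1, t2):  # t2 >= 0
    return (math.sqrt(a*t2)*math.cos(t1), math.sqrt(b*t2)*math.sin(t1), t2/2)

residuals = []
for t1 in (0.0, 1.0, 3.0):
    for t2 in (0.0, 0.8, 4.0):
        x, y, z = sigma(t1, t2)
        residuals.append(x**2/a + y**2/b - 2*z)
```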

11.6 Hyperbolic paraboloid


11.6.1 Canonical equation - global description
A hyperbolic paraboloid is a surface which (in some coordinate system) satisfies an equation of the
form
P^h_{a,b} : x²/a − y²/b − 2z = 0,   so   P^h_{a,b} = ϕ⁻¹(0),   (11.10)

where ϕ(x, y, z) denotes the left-hand side,

for some positive constants a, b ∈ R.


Figure 11.8: (a) Hyperbolic paraboloid⁸ (b) Potato chips⁹

If we fix one variable, we obtain intersections with planes parallel to the coordinate planes. For this surface the intersections are either parabolas, hyperbolas or two lines. Check this for y = h and
deduce the parabolas that you obtain.

11.6.2 Tangent planes


Using the gradient or the algebraic method we obtain the tangent plane at a point p = (x_p, y_p, z_p) ∈ P^h_{a,b} to be

T_p P^h_{a,b} : x_p x/a − y_p y/b − z_p − z = 0.
We will check this with the algebraic method. Let us consider a line passing through p in parametric
form

l : (x, y, z) = (x_p, y_p, z_p) + t (v_x, v_y, v_z)   ⇔   l = p + ⟨v⟩,   where v = (v_x, v_y, v_z).

Recall that the tangent plane T_p P^h_{a,b} is the union of all lines intersecting the quadric P^h_{a,b} at p in a special way, i.e. in a double point. This is why we investigate the intersection of our quadric with l. How do we obtain the intersection P^h_{a,b} ∩ l? We look at those points of l which satisfy the equation of our paraboloid, i.e. we look for solutions t for the equation

ϕ(x_p + t v_x, y_p + t v_y, z_p + t v_z) = 0   ⇔   (x_p + t v_x)²/a − (y_p + t v_y)²/b − 2(z_p + t v_z) = 0.
   

8 Image source: Wikipedia


9 Image source: the internet


If we rearrange the left-hand side as a polynomial in t we get


(v_x²/a − v_y²/b) t² + 2 (x_p v_x/a − y_p v_y/b − v_z) t + (x_p²/a − y_p²/b − 2z_p) = 0.

The constant term vanishes because x_p²/a − y_p²/b = 2z_p (p lies on P^h_{a,b}), so

(v_x²/a − v_y²/b) t² + 2 (x_p v_x/a − y_p v_y/b − v_z) t = 0,   (11.11)

where the leading coefficient may or may not be zero.
This equation in t admits the solution t = 0. That is clear, the point on l corresponding to t = 0 is the point p which, by assumption, lies on P^h_{a,b}. Furthermore, a second solution will correspond to
a second point of intersection. However, the line l is tangent to our quadric if p is a double point of
intersection. This happens if and only if the above equation has t = 0 as double solution, i.e. if and
only if
v_x²/a − v_y²/b ≠ 0   and   x_p v_x/a − y_p v_y/b − v_z = 0.
How do we interpret the second condition? Similar to the quadrics treated in the previous sections, so

T_p P^h_{a,b} : (x_p/a, −y_p/b, −1) · (x − x_p, y − y_p, z − z_p) = 0   ⇔   x_p x/a − y_p y/b − z_p − z = 0.
     

Deduce this equation also with the gradient.


How do we interpret the first condition? If
v_x²/a − v_y²/b = 0   ⇔   (v_x/√a − v_y/√b)(v_x/√a + v_y/√b) = 0   (11.12)
then equation (11.11) is linear, and has one simple solution t = 0. This means that l intersects P^h_{a,b} only once (it punctures the surface in one point). Such lines are not tangent to the surface. How can we visualize this? Let us start with what we know: the vector v = (v_x, v_y, v_z) satisfies the equation (11.12), so we can think of it as the position vector of some point on the cylinder Cyl(A, k) where A is the union of the two lines given by the equation (11.12). These two lines are the intersection of our quadric P^h_{a,b} with the coordinate plane z = 0 (they are visible in Figure 11.8a). Three things can
happen here:
1. l is one of the lines in A; then all points of l lie in P^h_{a,b}, so we have infinitely many solutions t for equation (11.11), or

2. l is one of the lines in A translated in the positive direction of the z-axis, in which case l will not intersect the surface P^h_{a,b} (this cannot happen for our choice of l because we assume that p ∈ l ∩ P^h_{a,b}), or

3. l is parallel to one of the lines in A (excluding the previous two cases), in which case l will puncture the surface P^h_{a,b} in a simple point (it will not be tangent to the surface).


11.6.3 P^h_{a,b} as ruled surface
Here is another fact: the hyperbolic paraboloid is a doubly ruled surface (like the hyperboloid of one
sheet). In other words, there are two families of lines, L1 and L2 respectively, such that
P^h_{a,b} = ⋃_{l∈L₁} l   and   P^h_{a,b} = ⋃_{l∈L₂} l.

Every point on this surface lies on one line in L1 and on one line in L2 . The generators containing
the saddle point (with our equation, this point is the origin of the coordinate system) are visible in
Figure 11.8a.
Again, one way to see where the two families of lines come from is to rearrange (11.10)

x²/a − y²/b − 2z = 0   ⇔   x²/a − y²/b = 2z   ⇔   (x/√a − y/√b)(x/√a + y/√b) = 2z.   (11.13)

Similar to the case of the hyperboloid of one sheet, we can introduce two parameters λ and µ, in
order to separate the above equation:
l_{λ,µ} :  λ(x/√a − y/√b) = 2µz
           µ(x/√a + y/√b) = λ.

What we end up with is a system of two equations, which are linear in x, y, z and which depend on
the parameters λ and µ. For each fixed pair of parameters, λ and µ, we get a line which we denote
with l_{λ,µ}. It is easy to check that all points on such a line satisfy the equation of P^h_{a,b}. This is the first
family of lines L1 = {lλ,µ : λ, µ not both zero}.
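The check is the same as for the hyperboloid of one sheet. The Python sketch below (a, b, λ, µ are arbitrary nonzero choices) parametrizes l_{λ,µ} by its z-coordinate — from λ(x/√a − y/√b) = 2µz and µ(x/√a + y/√b) = λ — and confirms each point satisfies the equation of P^h_{a,b}:

```python
a, b = 4.0, 1.0     # assumed positive constants
lam, mu = 0.8, 1.5  # fixed parameters of the line l_{lam,mu}
ra, rb = a**0.5, b**0.5

def point_on_line(z):
    u = 2*mu*z/lam  # u = x/sqrt(a) - y/sqrt(b)
    v = lam/mu      # v = x/sqrt(a) + y/sqrt(b)
    return (ra*(u + v)/2, rb*(v - u)/2, z)

residuals = []
for z in (-1.0, 0.0, 0.6, 3.0):
    x, y, zz = point_on_line(z)
    residuals.append(x**2/a - y**2/b - 2*zz)
```

The check works because x²/a − y²/b = uv = (2µz/λ)(λ/µ) = 2z.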
The second family of generators, L2 , is obtained if you group the terms differently:
l̃_{λ,µ} :  λ(x/√a + y/√b) = 2µz
           µ(x/√a − y/√b) = λ.

As above, one can check that points on these lines satisfy the equation of our paraboloid.
Again, one important thing to notice is that, although we write down two parameters, λ and µ,
we don’t necessarily get distinct lines for distinct parameters: lλ,µ = ltλ,tµ for any nonzero scalar t. So,
in fact, L1 depends on one parameter. More concretely
L₁ := { l_α :  x/√a − y/√b = 2αz,  α(x/√a + y/√b) = 1 }  ∪  { l_∞ :  2z = 0,  x/√a + y/√b = 0 }

and similarly for L2 . You might have noticed that l∞ is one of the lines visible in Figure 11.8a, since
it lies in the plane z = 0. The other one, l˜∞ , belongs to the family L2 .


11.6.4 Parametrizations - local description


A parametrization of this surface is
 √ 
 √aθ1 
σ2 (θ1 , θ2 ) =  θ1 , θ2 ∈ R
 

 1 2 2 2 

2 (θ1 − θ2 )

Check this and deduce a parametrization of the tangent plane T_p P^h_{a,b} at the point

p = ( x(θ_{1,p}, θ_{2,p}), y(θ_{1,p}, θ_{2,p}), z(θ_{1,p}, θ_{2,p}) ) ∈ P^h_{a,b},
 

where θ1,p and θ2,p are the parameters of the point p, i.e. p = p(x(θ1,p , θ2,p ), y(θ1,p , θ2,p ), z(θ1,p , θ2,p )).
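Reading the parametrization as x = √a θ₁, y = √b θ₂, z = (θ₁² − θ₂²)/2, a quick numerical check in Python (a, b are arbitrary positive choices) confirms it lands on P^h_{a,b}, since x²/a − y²/b = θ₁² − θ₂² = 2z:

```python
a, b = 2.0, 3.0  # assumed positive constants

def sigma(t1, t2):
    return (a**0.5 * t1, b**0.5 * t2, (t1**2 - t2**2) / 2)

residuals = []
for t1 in (-1.0, 0.0, 2.0):
    for t2 in (-0.5, 1.5):
        x, y, z = sigma(t1, t2)
        residuals.append(x**2/a - y**2/b - 2*z)
```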

APPENDIX A

Axioms

‘That it was the Greeks who added the element of logical structure to geometry is virtually universally admitted today’ [6, p.47]. According to Aristotle¹, for Thales² ‘the primary question was not
what do we know, but how we know it’ [6, Chapter 4]. The ancient Greeks revolutionized mathematics
through their systematic use of ordered sequences of logical deductions. In Euclid’s3 Elements these
deductions are grounded in what are considered self-evident truths – definitions and postulates.
A major revision of the fundamental assumptions underlying Euclidean geometry was carried
out by Hilbert4 in his Foundations of Geometry [14]. The axioms Hilbert proposed as the basis for
Euclidean geometry are presented below. Hilbert showed that these axioms are independent. They
are necessary and sufficient assumptions for describing the interactions between primitives (points,
lines, and planes). The purpose of any axiomatic treatment is to put all statements on a solid ground,
the axioms from which they can be deduced. As Blumenthal recounts, Hilbert remarked that ‘one
must be able to say “tables, chairs, beer-mugs” each time in place of “points, lines, planes” ’ [12,
p.208].

Remark. We use the notation [AB] for ‘the segment AB’. Moreover, Axiom III.5 does not appear in
[14] but it is needed for the transitivity of the congruence relation on angles.

Axiom Group I: Axioms of Incidence

I.1 For every two points A, B there exists a line a that contains each of the points A, B.

I.2 For every two points A, B there exists no more than one line that contains each of the points A,
B.
1 384–322 BC
2 c.626/623 – c.548/545 BC
3 ∼ 300 BC
4 1862 – 1943


I.3 There exist at least two points on a line. There exist at least three points that do not lie on a
line.

I.4 For any three points A, B, C that do not lie on the same line there exists a plane α that contains
each of the points A, B, C. For every plane there exists a point which it contains.

I.5 For any three points A, B, C that do not lie on one and the same line there exists no more than
one plane that contains each of the three points A, B, C.

I.6 If two points A, B of a line a lie in a plane α then every point of a lies in the plane α.

I.7 If two planes α, β have a point A in common then they have at least one more point B in
common.

I.8 There exist at least four points which do not lie in a plane.

Axiom Group II: Axioms of Order

II.1 If a point B lies between a point A and a point C then the points A, B, C are three distinct points
of a line, and B then also lies between C and A.

II.2 For two points A and C, there always exists at least one point B on the line AC such that C lies
between A and B.

II.3 Of any three points on a line there exists no more than one that lies between the other two.

II.4 (Pasch’s Axiom) Let A, B, C be three points that do not lie on a line and let a be a line in the
plane ABC which does not meet any of the points A, B, C. If the line a passes through a point of
the segment [AB], it also passes through a point of the segment [AC], or through a point of the
segment [BC].

Axiom Group III: Axioms of Congruence

III.1 If A, B are two points on a line a, and A′ is a point on the same or on another line a′ then it is
always possible to find a point B′ on a given side of the line a′ through A′ such that the segment
[AB] is congruent or equal to the segment [A′ B′ ]. In symbols [AB] ≡ [A′ B′ ].

III.2 If a segment [A′B′] and a segment [A′′B′′] are congruent to the same segment [AB], then the segment [A′B′] is also congruent to the segment [A′′B′′], or briefly, if two segments are congruent
to a third one they are congruent to each other.

III.3 On the line a let [AB] and [BC] be two segments which except for B have no point in common.
Furthermore, on the same or on another line a′ let [A′ B′ ] and [B′ C ′ ] be two segments which
except for B′ also have no point in common. In that case, if [AB] ≡ [A′ B′ ] and [BC] ≡ [B′ C ′ ] then
[AC] ≡ [A′ C ′ ].


III.4 Let ∡(h, k) be an angle in a plane α and a′ a line in a plane α ′ and let a definite side of a′ of α ′
be given. Let h′ be a ray on the line a′ that emanates from the point O′ . Then there exists in the
plane α ′ one and only one ray k ′ such that the angle ∡(h, k) is congruent or equal to the angle
∡(h′ , k ′ ) and at the same time all interior points of the angle ∡(h′ , k ′ ) lie on the given side of a′ .
Symbolically ∡(h, k) ≡ ∡(h′ , k ′ ). Every angle is congruent to itself, i.e., ∡(h, k) ≡ ∡(h, k) is always
true.

III.5 If an angle ∡(h′ , k ′ ) and an angle ∡(h′′ , k ′′ ) are congruent to the same angle ∡(h, k), then the
angle ∡(h′ , k ′ ) is also congruent to the angle ∡(h′′ , k ′′ ), or briefly, if two angles are congruent to
a third one they are congruent to each other.

III.6 If for two triangles ABC and A′ B′ C ′ the congruences AB ≡ A′ B′ , AC ≡ A′ C ′ , ∡BAC ≡ ∡B′ A′ C ′
hold, then the congruence ∡ABC ≡ ∡A′ B′ C ′ is also satisfied.

Axiom Group IV: Axiom of Parallels

IV (Euclid’s Axiom). Let a be any line and A a point not on it. Then there is at most one line in the
plane, determined by a and A, that passes through A and does not intersect a.

Axiom Group V: Axioms of Continuity

V.1 (Axiom of measure or Archimedes’ Axiom). If [AB] and [CD] are any segments then there exists
a number n such that n segments [CD] constructed contiguously from A, along the ray from A
through B, will pass beyond the point B.

V.2 (Axiom of line completeness). An extension of a set of points on a line, with its order and
congruence relations, that would preserve the relations existing among the original elements
as well as the fundamental properties of line order and congruence that follow from Axioms
I–III and from V.1 is impossible.

APPENDIX B

Lines and the real numbers

We are used to thinking about lengths of segments as positive real numbers R≥0 . We are also used to
drawing the field of real numbers R as a line with 0 represented by a point which separates positive
and negative numbers. In this section we describe a path which connects Hilbert’s Axioms with real
numbers using vectors. The zero vector corresponds to 0 ∈ R, but R also has an identity element 1
which needs a correspondent. For this, we need to make a choice.

Definition B.1. A unit segment is a non-trivial segment in E which we choose. Once a unit segment
[AB] is chosen the length |AB| is called the unit length and, equivalently the distance from A to B is
called unit distance (see Definition 1.3). Then, vectors having unit length are called unit vectors or
versors.

Definition B.2. For two points A and B, we denote by $F_{AB}$ the set of vectors represented by points
on the line AB. Observe that we have a partition
$$F_{AB} = \underbrace{\left\{\overrightarrow{XY} : X, Y \in AB \text{ and } |XY\rangle = |AB\rangle\right\}}_{F^+_{AB}} \ \cup\ \{\vec{0}\} \ \cup\ \underbrace{\left\{\overrightarrow{XY} : X, Y \in AB \text{ and } |XY\rangle = -|AB\rangle\right\}}_{F^-_{AB}}.$$

Proposition B.3. Let O be a point on a line AB. The following maps are bijections:
1. The map $\varphi_O : AB \to F_{AB}$ defined by $\varphi_O(P) = \overrightarrow{OP}$,
2. The map $\psi_+ : F^+_{AB} \cup \{\vec{0}\} \to L$ defined by $\psi_+(\overrightarrow{XY}) = |\overrightarrow{XY}|$,
3. The map $\psi_- : F^-_{AB} \cup \{\vec{0}\} \to L$ defined by $\psi_-(\overrightarrow{XY}) = |\overrightarrow{XY}|$.
Sketch of proof. For 1., if $\overrightarrow{OP} = \overrightarrow{OQ}$ then P = Q by Lemma 1.4. For 2. and 3. let X, Y be two distinct
points on the line AB. By Lemma 1.4, there are exactly two points Z, Z′ with O between them such

that the segment [XY] is congruent to [OZ] and to [Z′O]. Then $\overrightarrow{OZ} = -\overrightarrow{OZ'}$ and exactly one of them
will have direction |AB⟩.

Definition B.4. Under the bijections in Proposition B.3 we identify F+AB ∪{0} with L and we call these
elements positive. The elements in F+AB are called strictly positive. We call the elements in F−AB ∪ {0}
negative and the ones in F−AB are called strictly negative. Notice that the involution in Definition 1.13
restricts to an involution −□ : FAB → FAB which interchanges positive and negative elements.
Definition B.5. We define an ordering on $F_{AB}$. If $\overrightarrow{AC}$, $\overrightarrow{AD}$ are positive, we let $\overrightarrow{AC} \le \overrightarrow{AD}$ if and only
if $[AC] \subseteq [AD]$. If $\overrightarrow{AC}$, $\overrightarrow{AD}$ are negative, we let $\overrightarrow{AC} \le \overrightarrow{AD}$ if and only if $[AD] \subseteq [AC]$.

Proposition B.6. The ordering in Definition B.5 is a total order. With this ordering and with addition
of vectors, FAB is an abelian totally ordered group and L is an ordered submonoid of FAB .

Sketch of proof. Since FAB is stable under addition, it follows from Proposition 1.18 that it is an
abelian group. The other claims follow by inspecting the definitions.

Definition B.7 (Multiplication of segments). Assume that a unit segment 1 was chosen. Let a and
b be two segments which represent the lengths x and y respectively. If x = 0 or y = 0, by definition
we have x · y = 0. If they are not both zero, let us cite the construction from [14, p.52]: “Lay off the
segments 1 and b from the vertex O on a side of a right angle. Then lay off the segment a on the other
side. Join the end points of the segments 1 and a with a line and through the end point of the segment
b draw a parallel to this line. It will delineate a segment c on the other side. This segment is then
called the product of the segment a by the segment b” and we write c = ab.

(Figure: on one side of a right angle with vertex O lie the segments 1 and b; on the other side lies a; the parallel through the endpoint of b cuts off the segment ab.)

This defines a multiplication on L which we identified with the positive elements F+AB ∪ {0}. The
product can be extended to the whole of FAB by requiring that x · y = (−x) · (−y) whenever x, y are
negative and that (−x) · y = −(x · y) = x · (−y) whenever x is negative and y is positive.

Proposition B.8. Assume that a unit segment was chosen. With addition of vectors and the above
multiplication and ordering, $F_{AB}$ is a totally ordered field and L is an ordered submonoid of $F_{AB}$
with respect to both addition and multiplication.

Sketch of proof. The sum and product defined above are shown to be associative, commutative and
distributive [14, §5]. Moreover it is easy to see that 1 is the neutral element for the product and it is
easy to construct inverses. The other claims follow by inspecting the definitions.


The upshot of this section is the next theorem stating that the Axioms imply that FAB is isomor-
phic to R. Let us point out first that there are several ways of describing R. It can be described as
the ‘smallest complete ordered field’. Concretely, this description requires that R satisfies a certain
set of axioms (see for example [18, Chapter 1]). In order to show that it exists one can rely on several
constructions. The construction closest to our setting is the construction of R via Dedekind cuts (see
for example [18, Appendix to Chapter 1]).

Theorem B.9. For any unit segment [AB] there is a unique isomorphism φAB : FAB → R of ordered
fields mapping $\vec{0}$ to 0 and $\overrightarrow{AB}$ to 1. This isomorphism maps L to $\mathbb{R}_{\ge 0}$.

Sketch of proof. By Proposition B.8, we have that $F_{AB}$ is an ordered field. Then, by Archimedes’ Ax-
iom (Axiom V.1) the field $F_{AB}$ contains Z as a subring. Thus, it contains Q as an ordered subfield. By
the Axiom of line completeness (Axiom V.2), the field $F_{AB}$ is a complete field. However, any com-
plete ordered field containing Q as an ordered subfield is isomorphic to R through an order-
preserving isomorphism [17, p.17].

Definition B.10. Assume that a unit segment [AB] was chosen. By Theorem B.9, we may identify
$F_{AB}$ with R, which gives a multiplication of real numbers with vectors. More precisely, we have a map
$$\square \cdot \square : \mathbb{R} \times F_{AB} \to F_{AB} \quad \text{given by} \quad (r, a) \mapsto r \cdot a := \varphi_{AB}^{-1}(r)\,a.$$

Proposition B.11. Assume that a unit segment and a point O have been chosen. For any point X and
any $r \in \mathbb{R}$ there is a unique point Y such that $\overrightarrow{OY} = r \cdot \overrightarrow{OX}$. Moreover, we have $|\overrightarrow{OY}| = |r| \cdot |\overrightarrow{OX}|$.

Proposition B.12. In the setting of Definition B.10, for any vector a and any scalars x, y ∈ R, we have
(x + y) · a = x · a + y · a.

APPENDIX C

Changing the basis in a vector space

Let φ : V → W be a linear map between the vector spaces V and W . Let E = (e1 , . . . , en ) be a basis for
V and let F be a basis for W . In your Algebra course, you used the notation [φ]E,F for the matrix of
the linear map φ with respect to the bases E and F [9, Definition 3.4.1]. We will use the notation

MF ,E (φ) = [φ]E,F .

Notice that the indices E, F are reversed. Recall that this is the matrix whose columns are the compo-
nents of the φ(ei )’s with respect to the basis F :
 
 ↑ ... ↑ 
MF ,E (φ) = [φ(e1 )]F . . . [φ(en )]F  .
 

↓ ... ↓
 

You have also learned [9, Theorem 3.4.8] that if ψ : W → U is another linear map, to some vector
space U with basis G, then
MG,E (ψ ◦ φ) = MG,F (ψ) · MF ,E (φ).

In particular, if V = W = U , G = F and φ = ψ = IdV then

In = MF ,F (IdV ) = MF ,F (IdV ◦ IdV ) = MF ,E (IdV ) · ME,F (IdV )

hence MF ,E (IdV ) = ME,F (IdV )−1 and thus

MF ,F (φ) = MF ,E (IdV ) · ME,E (φ) · ME,F (IdV ) = ME,F (IdV )−1 · ME,E (φ) · ME,F (IdV ).

So, the matrix of φ with respect to the basis F is obtained from the matrix of φ with respect to the
basis E by conjugating with the matrix ME,F (IdV ).
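As a sanity check of the conjugation formula, here is a small numerical sketch in plain Python (the basis F = ((1, 1), (1, −1)) of R² and the map φ, diagonal in the standard basis E, are hypothetical choices, not taken from the text):

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_inv(A):
    """Inverse of a 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

# Hypothetical example: E is the standard basis of R^2 and F = (f1, f2)
# with f1 = (1, 1), f2 = (1, -1) written in E-components.
M_EF = [[1.0, 1.0],
        [1.0, -1.0]]     # columns are [f1]_E and [f2]_E, i.e. M_{E,F}(Id_V)
A_E = [[2.0, 0.0],
       [0.0, 3.0]]       # [phi]_E = M_{E,E}(phi) for some linear map phi

# Conjugation: M_{F,F}(phi) = M_{E,F}^{-1} . M_{E,E}(phi) . M_{E,F}
A_F = mat_mul(mat_inv(M_EF), mat_mul(A_E, M_EF))
```

Indeed the first column of `A_F` is [φ(f1)]F: φ(f1) = (2, 3) = 2.5 f1 − 0.5 f2.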


Definition C.1. The matrix ME,F := ME,F (IdV ) is called the change of basis matrix from the basis F to
the basis E. It is the matrix whose columns are the components of the vectors in F with respect to E.
If $F = (f_1, \dots, f_n)$ then
$$M_{E,F} = \begin{pmatrix} \uparrow & \dots & \uparrow \\ [f_1]_E & \dots & [f_n]_E \\ \downarrow & \dots & \downarrow \end{pmatrix}.$$

APPENDIX D

Coordinate systems

D.1 Polar coordinates


Polar coordinates use oriented angles with a line. They establish a bijection between points distinct
from the origin and pairs of numbers (r, ϕ) in R>0 × [0, 2π[.

Figure D.1: Polar coordinates1 .

The correspondence between Cartesian coordinates (x, y) and polar coordinates (r, ϕ) is as follows:
$$\begin{cases} x = r\cos\varphi \\ y = r\sin\varphi \end{cases} \qquad \begin{cases} r = r(x, y) = \sqrt{x^2 + y^2} \\ \varphi = f(x, y) \end{cases}$$
where, for $r \neq 0$,
$$f(x, y) = \begin{cases} \arccos\left(\frac{x}{r}\right) & \text{if } y \ge 0 \\ 2\pi - \arccos\left(\frac{x}{r}\right) & \text{if } y < 0. \end{cases}$$
(The second branch is shifted by 2π so that ϕ always lies in [0, 2π[.)
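These formulas can be turned into a small conversion routine; the sketch below (plain Python, function names are ours) produces ϕ in [0, 2π[ by using 2π − arccos(x/r) when y < 0:

```python
import math

def to_polar(x, y):
    """Cartesian (x, y) -> polar (r, phi) with phi in [0, 2*pi)."""
    r = math.hypot(x, y)
    if r == 0:
        raise ValueError("polar coordinates are undefined at the origin")
    phi = math.acos(x / r) if y >= 0 else 2 * math.pi - math.acos(x / r)
    return r, phi

def to_cartesian(r, phi):
    """Inverse correspondence: (r, phi) -> (x, y)."""
    return r * math.cos(phi), r * math.sin(phi)
```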
1 Image source: Wikipedia


D.2 Cylindrical coordinates


Cylindrical coordinates extend polar coordinates to dimension 3. They establish a bijection between
points not lying on the z-axis and triples of numbers (r, ϕ, z) in R>0 × [0, 2π[ × R.

Figure D.2: Cylindrical coordinates2 .

The correspondence between Cartesian coordinates (x, y, z) and cylindrical coordinates (r, ϕ, z) is as follows:
$$\begin{cases} x = r\cos\varphi \\ y = r\sin\varphi \\ z = z \end{cases} \qquad \begin{cases} r = r(x, y) = \sqrt{x^2 + y^2} \\ \varphi = f(x, y) \\ z = z \end{cases}$$

where f is as in the previous Section D.1.

D.3 Spherical coordinates


Spherical coordinates also extend polar coordinates to dimension 3 but use an angle for the third coor-
dinate. They establish a bijection between points not lying on the z-axis and triples of numbers (r, ϕ, θ)
in R>0 × [0, 2π[ × ]0, π[.

Figure D.3: Spherical coordinates3 .


2 Image source: Wikipedia


The correspondence between Cartesian coordinates (x, y, z) and spherical coordinates (r, ϕ, θ) is as fol-
lows:
$$\begin{cases} x = r\cos\varphi\sin\theta \\ y = r\sin\varphi\sin\theta \\ z = r\cos\theta \end{cases} \qquad \begin{cases} r = r(x, y, z) = \sqrt{x^2 + y^2 + z^2} \\ \varphi = f(x, y) \\ \theta = \arccos\left(\frac{z}{r}\right) \end{cases}$$
where f is as in the previous Section D.1.
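A direct implementation of these formulas (plain Python; the function names are ours, and on the Oz axis ϕ is set to 0 by convention since it is not determined there):

```python
import math

def cartesian_to_spherical(x, y, z):
    """(x, y, z) -> (r, phi, theta) with theta = arccos(z/r) as above."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0:
        raise ValueError("spherical coordinates are undefined at the origin")
    theta = math.acos(z / r)
    rho = math.hypot(x, y)          # distance to the Oz axis
    if rho == 0:
        phi = 0.0                   # phi is not determined on the Oz axis
    else:
        phi = math.acos(x / rho) if y >= 0 else 2 * math.pi - math.acos(x / rho)
    return r, phi, theta

def spherical_to_cartesian(r, phi, theta):
    """Inverse correspondence: (r, phi, theta) -> (x, y, z)."""
    return (r * math.cos(phi) * math.sin(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(theta))
```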

D.4 Barycentric coordinates


In the n-dimensional affine space $A^n$, let $P_0, P_1, \dots, P_n$ be points such that $B = (\overrightarrow{P_0P_1}, \dots, \overrightarrow{P_0P_n})$ is a
basis of the direction space $D(A^n)$. This is the case if and only if the given points are the vertices of
an n-simplex. Then K = (P_0, B) is a Cartesian frame. Every point A has a unique expression of the
form
$$A = P_0 + a_1\,\overrightarrow{P_0P_1} + \dots + a_n\,\overrightarrow{P_0P_n} = (1 - a_1 - \dots - a_n)P_0 + a_1P_1 + \dots + a_nP_n = \lambda_0P_0 + \lambda_1P_1 + \dots + \lambda_nP_n.$$
The n + 1 values $(\lambda_0, \lambda_1, \dots, \lambda_n)$ are the barycentric coordinates of the point A. Notice that the sum of
these coordinates equals 1. The above equation also shows how to translate from Cartesian coordi-
nates to barycentric coordinates. This is the algebraic point of view.
The geometric interpretation in dimension 2 is the following. Let ABC be a triangle of area 1. A
point P in the plane of the triangle has barycentric coordinates
$$\lambda_1 = \mathrm{Area}_{or}(\triangle PBC), \quad \lambda_2 = \mathrm{Area}_{or}(\triangle PCA), \quad \lambda_3 = \mathrm{Area}_{or}(\triangle PAB),$$
where $\mathrm{Area}_{or}$ stands for the oriented area of a triangle; the coordinate attached to a vertex is the
oriented area of the sub-triangle opposite to that vertex. Similarly, in higher dimensions, the barycentric
coordinates are given by oriented volumes relative to an n-simplex.
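In dimension 2, the passage from oriented areas to barycentric coordinates can be sketched as follows (plain Python; names are ours, and each coordinate is the oriented area of the sub-triangle opposite the corresponding vertex, normalized by the area of ABC):

```python
def oriented_area(A, B, C):
    """Oriented area of triangle ABC (positive when counterclockwise)."""
    return 0.5 * ((B[0] - A[0]) * (C[1] - A[1])
                  - (B[1] - A[1]) * (C[0] - A[0]))

def barycentric(P, A, B, C):
    """Barycentric coordinates of P relative to a non-degenerate triangle ABC."""
    total = oriented_area(A, B, C)
    return (oriented_area(P, B, C) / total,   # coordinate attached to A
            oriented_area(A, P, C) / total,   # coordinate attached to B
            oriented_area(A, B, P) / total)   # coordinate attached to C
```

The three coordinates always sum to 1, and P = λ_A·A + λ_B·B + λ_C·C.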

3 Image source: Wikipedia

APPENDIX E

Bundles of lines and planes

E.1 Bundle of lines in A2


Definition E.1. Fix a point Q ∈ A2 . The set LQ of all lines in A2 passing through Q is called a bundle
of lines and Q is called the center of the bundle LQ .

Proposition E.2. If ℓ1 : a1 x + b1 y + c1 = 0 and ℓ2 : a2 x + b2 y + c2 = 0 are two distinct lines in the bundle
LQ , then LQ consists of the lines having equations of the form
$$\ell_{\lambda,\mu} : \lambda(a_1x + b_1y + c_1) + \mu(a_2x + b_2y + c_2) = 0,$$
where λ, µ ∈ R are not both zero. In particular, if Q = Q(x0 , y0 ), ℓ1 : x = x0 and ℓ2 : y = y0 , then
$$L_Q = \left\{\, \ell_{\lambda,\mu} : \lambda(x - x_0) + \mu(y - y_0) = 0 \ :\ \lambda, \mu \in \mathbb{R} \text{ not both zero} \,\right\}.$$

Bundles of lines are useful when a point Q is given as the intersection of two lines, but its coor-
dinates are not known explicitly, and one wants to find the equation of a line passing through Q and
satisfying some other conditions. For example, the condition that it contains some point P distinct
from Q, or that it is parallel to a given line.


Notice that there is redundancy in the two parameters λ, µ, meaning that we do not have two
independent parameters here. If λ ≠ 0 then one can divide the equation of $\ell_{\lambda,\mu}$ by λ to obtain
$$\ell_{1,t} : (a_1x + b_1y + c_1) + t(a_2x + b_2y + c_2) = 0,$$
where $t = \frac{\mu}{\lambda} \in \mathbb{R}$. So $\ell_{1,\mu/\lambda}$ and $\ell_{\lambda,\mu}$ are in fact the same line.

Definition E.3. A reduced bundle is the set of all lines LQ passing through a common point Q from
which we remove one line, i.e. it is LQ \{ℓ2 } for some ℓ2 ∈ LQ . With the above notation and discussion,
it is the set
$$\left\{\, \ell_{1,t} : (a_1x + b_1y + c_1) + t(a_2x + b_2y + c_2) = 0 \ :\ t \in \mathbb{R} \,\right\}.$$
The fact that we use one parameter instead of two, to describe almost all lines passing through
Q, simplifies calculations.
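For example (a hypothetical pair of lines): to find the member of the reduced bundle through a given point P, one solves a single linear equation in t, without ever computing Q explicitly:

```python
def l1(x, y):
    return x + y - 2.0          # l1 : x + y - 2 = 0

def l2(x, y):
    return x - y                # l2 : x - y = 0, so l1 and l2 meet at Q(1, 1)

# The reduced-bundle member l_{1,t} through P(3, 0): solve l1(P) + t*l2(P) = 0.
P = (3.0, 0.0)
t = -l1(*P) / l2(*P)            # requires l2(P) != 0, i.e. P not on l2

def bundle_line(x, y):
    """Equation of the line l_{1,t} of the reduced bundle passing through P."""
    return l1(x, y) + t * l2(x, y)
```

By construction `bundle_line` vanishes both at Q and at P.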
Definition E.4. Let v ∈ V2 . The set Lv of all lines in A2 with direction vector v is called an improper
bundle of lines, and v is called a direction vector of the bundle Lv .

The connection between bundles of lines and improper bundles of lines is best understood through
projective geometry, where the improper bundle of lines is the set of all lines intersecting in the same
point at infinity.

E.2 Bundle of planes in A3


Definition E.5. Let ℓ ⊆ A3 be a line. The set Πℓ of all planes in A3 containing ℓ is called a bundle of
planes and ℓ is called the axis (or carrier line) of the bundle Πℓ .


Proposition E.6. If π1 : a1 x + b1 y + c1 z + d1 = 0 and π2 : a2 x + b2 y + c2 z + d2 = 0 are two distinct planes
in the bundle Πℓ , then Πℓ consists of the planes having equations of the form
$$\pi_{\lambda,\mu} : \lambda(a_1x + b_1y + c_1z + d_1) + \mu(a_2x + b_2y + c_2z + d_2) = 0,$$
where λ, µ ∈ R are not both zero.

Bundles of planes are useful when a line ℓ is given as the intersection of two planes (see Sub-
section 3.3.2) and one wants to find the equation of a plane containing ℓ and satisfying some other
conditions. For example, the condition that it contains some point P which does not belong to ℓ, or
that it is parallel to a given line.
As in the case of line bundles, there is redundancy in the two parameters λ, µ. If λ ≠ 0 then one
can divide the equation of $\pi_{\lambda,\mu}$ by λ to obtain
$$\pi_{1,t} : (a_1x + b_1y + c_1z + d_1) + t(a_2x + b_2y + c_2z + d_2) = 0,$$
where $t = \frac{\mu}{\lambda} \in \mathbb{R}$. So $\pi_{1,\mu/\lambda}$ and $\pi_{\lambda,\mu}$ are in fact the same plane.

Definition E.7. A reduced bundle is the set of all planes Πℓ with axis ℓ from which we remove one
plane, i.e. it is Πℓ \ {π2 } for some π2 ∈ Πℓ . With the above notation and discussion, it is the set
$$\left\{\, \pi_{1,t} : (a_1x + b_1y + c_1z + d_1) + t(a_2x + b_2y + c_2z + d_2) = 0 \ :\ t \in \mathbb{R} \,\right\}.$$

The fact that we use one parameter instead of two, to describe almost all planes containing ℓ,
simplifies calculations.

Definition E.8. Let W ⊆ V3 be a vector subspace of dimension 2. The set ΠW of all planes in A3
which admit W as direction space is called an improper bundle of planes, and W is called the vector
subspace associated to the bundle ΠW .

The connection between bundles of planes and improper bundles of planes is best understood
through projective geometry, where we can think of the improper bundle of planes as the set of all
planes intersecting in a line at infinity.

APPENDIX F

Some classical theorems in affine geometry

In this section we collect classical theorems in affine geometry. The proofs of the first three theorems
are vectorial (the theorem of Thales F.1, the affine version of Pappus’ theorem F.2 and the theorem
of Desargues F.3). We then introduce Lemma F.4 and Lemma F.5, which allow us to prove the next
four theorems by means of frames (Pappus’ hexagon theorem F.6, the Newton–Gauss theorem F.7, and the
theorems of Menelaus F.8 and Ceva F.9).
It is important to point out that in the process of deducing the affine space structure from
Hilbert’s Axioms we made use of some of these results. Indeed, in [14], Theorem 40 is Pappus’
affine theorem F.2 and is a particular case of Pascal’s theorem F.11. This result was implicitly used
to define and derive properties of the multiplication of vectors with scalars. Therefore, proving the
following theorems using results from [14] may introduce circular reasoning.
The value of the following proofs lies in the fact that they start from the notion
of an affine space only. Moreover, the proofs are such that they work not only for real affine spaces
but for affine spaces over other commutative fields. These latter affine spaces are outside the scope
of these notes. However, the proofs which allow this type of generalization are certainly of interest
since they capture the essence of the statements.

The following theorem asserts that the ratio in which parallel lines cut a transversal line ℓ does
not depend on ℓ but only on the parallel lines.

Theorem F.1 (Thales’1 intercept theorem). Let H, H ′ and H ′′ be three distinct parallel lines in the
affine plane A2 . Let ℓ1 and ℓ2 be two lines not parallel to H, H ′ and H ′′ . For i = 1, 2 let

Pi = ℓi ∩ H
Pi′ = ℓi ∩ H ′
Pi′′ = ℓi ∩ H ′′
1 c.626/623 – c.548/545 BC


and let $k_1$, $k_2$ be scalars such that
$$\overrightarrow{P_iP_i''} = k_i\,\overrightarrow{P_iP_i'}.$$
Then $k_1 = k_2$.

Proof. We follow the proof in [19]. If ℓ1 = ℓ2 the theorem is trivial. Suppose that ℓ1 ≠ ℓ2 . Then, if
P1 = P2 the points P1′ and P2′ must be distinct. Thus, interchanging H with H ′ we may assume that
P1 ≠ P2 . Now let $v = \overrightarrow{P_1P_2}$ and consider the vectors
$$\overrightarrow{P_2P_2'} - \overrightarrow{P_1P_1'} = \overrightarrow{P_1'P_2'} - \overrightarrow{P_1P_2} = \alpha v,$$
$$\overrightarrow{P_2P_2''} - \overrightarrow{P_1P_1''} = \overrightarrow{P_1''P_2''} - \overrightarrow{P_1P_2} = \beta v,$$
where α and β are scalars.
If α = 0, from the first equation we have $\overrightarrow{P_2P_2'} = \overrightarrow{P_1P_1'}$. Since these are direction vectors of ℓ2
and ℓ1 it follows that the two lines are parallel. Since $\overrightarrow{P_2P_2''}$ and $\overrightarrow{P_1P_1''}$ are also direction vectors for
the two lines, we see from the second equation that βv is a direction vector for ℓ1 and ℓ2 . But v is a
direction vector for H which is not parallel to ℓ1 . Thus, β = 0. Moreover, in this case we have
$$\overrightarrow{P_2P_2''} = k_2\,\overrightarrow{P_2P_2'} = k_2\,\overrightarrow{P_1P_1'}, \qquad \overrightarrow{P_1P_1''} = k_1\,\overrightarrow{P_1P_1'},$$
and since $\overrightarrow{P_2P_2''} = \overrightarrow{P_1P_1''}$ (the second equation with β = 0), it follows that $k_1 = k_2$.
If α ≠ 0, then
$$\overrightarrow{P_2P_2''} - \overrightarrow{P_1P_1''} = \beta v = \alpha^{-1}\beta(\alpha v) = \alpha^{-1}\beta\,\overrightarrow{P_2P_2'} - \alpha^{-1}\beta\,\overrightarrow{P_1P_1'}.$$
On the other hand,
$$\overrightarrow{P_2P_2''} - \overrightarrow{P_1P_1''} = k_2\,\overrightarrow{P_2P_2'} - k_1\,\overrightarrow{P_1P_1'}.$$
Now, since α ≠ 0, the vectors $\overrightarrow{P_1P_1'}$ and $\overrightarrow{P_2P_2'}$ are not parallel, i.e. they are linearly independent.
Therefore, the last two equations imply $k_2 = \alpha^{-1}\beta = k_1$.


Remark. Theorem F.1 can be generalized to the case where the three parallel lines are replaced by
parallel hyperplanes. The proof does not change if H, H ′ and H ′′ are replaced by hyperplanes.
Theorem F.2 (Pappus’2 affine theorem). Let H, H ′ be two distinct lines in the affine plane A2 . Let
P , Q, R ∈ H and P ′ , Q′ , R′ ∈ H ′ be distinct points, none of which lies at the intersection H ∩ H ′ . If
P Q′ ∥ P ′ Q and QR′ ∥ Q′ R then P R′ ∥ P ′ R.

Proof. We follow the proof in [19]. Suppose H and H ′ are not parallel, and let H ∩ H ′ = {O}. By
Theorem F.1, for some scalars h, k we have
$$\overrightarrow{OP'} = k\,\overrightarrow{OQ'}, \quad \overrightarrow{OQ} = k\,\overrightarrow{OP}, \qquad \text{since } PQ' \parallel P'Q,$$
$$\overrightarrow{OQ'} = h\,\overrightarrow{OR'}, \quad \overrightarrow{OR} = h\,\overrightarrow{OQ}, \qquad \text{since } QR' \parallel Q'R.$$
But then,
$$\overrightarrow{PR'} = \overrightarrow{OR'} - \overrightarrow{OP} = h^{-1}\,\overrightarrow{OQ'} - k^{-1}\,\overrightarrow{OQ},$$
$$\overrightarrow{RP'} = \overrightarrow{OP'} - \overrightarrow{OR} = k\,\overrightarrow{OQ'} - h\,\overrightarrow{OQ},$$
and so $\overrightarrow{RP'} = hk\,\overrightarrow{PR'}$, that is, RP′ ∥ PR′. If H ∥ H ′ , then
$$\overrightarrow{PQ} = \overrightarrow{Q'P'}, \qquad \text{since } PQ' \parallel P'Q \text{ and } H \parallel H',$$
$$\overrightarrow{QR} = \overrightarrow{R'Q'}, \qquad \text{since } QR' \parallel Q'R \text{ and } H \parallel H',$$
and so
$$\overrightarrow{PR} = \overrightarrow{PQ} + \overrightarrow{QR} = \overrightarrow{Q'P'} + \overrightarrow{R'Q'} = \overrightarrow{R'P'}.$$
Thus PR′ ∥ P′R.
2 c.290 – c.350


Theorem F.3 (Desargues’3 theorem). Let A, B, C, A′ , B′ , C ′ ∈ A2 be points such that no three are collinear,
and such that AB ∥ A′ B′ , BC ∥ B′ C ′ and AC ∥ A′ C ′ . Then the three lines AA′ , BB′ and CC ′ are either
parallel or have a point in common.

Proof. We follow the proof in [19]. Suppose that AA′, BB′ and CC′ are not parallel. Then two of them
meet, and we may assume that AA′ ∩ BB′ = {O}. By Theorem F.1 applied to AB and A′B′ we have
$$\overrightarrow{OA'} = k\,\overrightarrow{OA} \quad \text{and} \quad \overrightarrow{OB'} = k\,\overrightarrow{OB}$$
for some scalar k. Let {C′′} = OC ∩ A′C′. By Theorem F.1, now applied to AC and A′C′, we have
$$\overrightarrow{OC''} = k\,\overrightarrow{OC}$$
since $\overrightarrow{OA'} = k\,\overrightarrow{OA}$. On the other hand, putting {C′′′} = OC ∩ B′C′, Theorem F.1 applied to the lines
BC and B′C′ implies
$$\overrightarrow{OC'''} = k\,\overrightarrow{OC}$$
since $\overrightarrow{OB'} = k\,\overrightarrow{OB}$. Then, the last two equations imply that C′′ = C′′′ = C′, and so O, C and C′ are
collinear.

Lemma F.4. Let K be a frame of A2 . Consider the lines
$$\ell_1 : \frac{x}{a} + \frac{y}{b'} = 1 \quad \text{and} \quad \ell_2 : \frac{x}{b} + \frac{y}{a'} = 1$$
for some non-zero scalars a, a′ , b, b′ . Then the lines are parallel if and only if aa′ − bb′ = 0. If they are
not parallel, they meet in the point with coordinates
$$\left(\frac{ab(a'-b')}{aa'-bb'},\ \frac{a'b'(a-b)}{aa'-bb'}\right).$$

3 1591 – 1661


Proof. A direction vector for ℓ1 is (−a, b′ ) and a direction vector for ℓ2 is (−b, a′ ). The two vectors are
linearly dependent if and only if
$$\frac{a}{b} = \frac{b'}{a'} \iff aa' = bb' \iff aa' - bb' = 0.$$
If ℓ1 and ℓ2 are not parallel, the intersection point is obtained by solving the system given by the
equations of the two lines.
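The formula of Lemma F.4 is easy to test numerically; a small sketch in plain Python with hypothetical values a = 2, b′ = 1, b = 1, a′ = 2:

```python
def intersect(a, b, a_p, b_p):
    """Intersection of l1 : x/a + y/b_p = 1 and l2 : x/b + y/a_p = 1 (Lemma F.4)."""
    d = a * a_p - b * b_p
    if d == 0:
        raise ValueError("the lines are parallel")
    return a * b * (a_p - b_p) / d, a_p * b_p * (a - b) / d

# l1 : x/2 + y = 1 and l2 : x + y/2 = 1 meet at (2/3, 2/3).
x, y = intersect(2.0, 1.0, 2.0, 1.0)
```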

Lemma F.5. Let K be a frame of A2 . Consider the points A(a, 0), B(b, 0), A′ (a′ , h′ ) and B′ (b′ , h′ ) with
a ≠ b, a′ ≠ b′ and h′ ≠ 0. If the lines AB′ and A′B are not parallel, they meet in the point with
coordinates
$$\left(\frac{bb'-aa'}{b+b'-a-a'},\ \frac{h'(b-a)}{b+b'-a-a'}\right).$$

Proof. A direction vector for AB′ is (b′ − a, h′ ) and a direction vector for A′B is (a′ − b, h′ ). The equations
of the two lines are
$$AB' : \frac{x-a}{b'-a} = \frac{y}{h'} \quad \text{and} \quad A'B : \frac{x-b}{a'-b} = \frac{y}{h'}.$$
If AB′ and A′B are not parallel, the intersection point is obtained by solving the system given by the
equations of the two lines.


Theorem F.6 (Pappus’4 hexagon theorem). Let H, H ′ be two distinct lines in the affine plane A2 . Let
P , Q, R ∈ H and P ′ , Q′ , R′ ∈ H ′ be distinct points, none of which lies at the intersection H ∩H ′ . Assume
that P Q′ ∩ P ′ Q = {X}, P R′ ∩ P ′ R = {Y } and QR′ ∩ Q′ R = {Z}. Then the points X, Y , Z are collinear.

Proof. We first consider the case where H and H ′ are not parallel. Let H ∩ H ′ = {O} and choose a
frame K with the origin in O, with the first coordinate axis H and the second coordinate axis H ′ .
With respect to K the coordinates of the given points are as follows:
P (p, 0), Q(q, 0), R(r, 0), P ′ (0, p′ ), Q′ (0, q′ ), R′ (0, r ′ ).

Applying Lemma F.4 three times with the frame K, we obtain the coordinates of the intersection
points:
$$X\left(\frac{pq(p'-q')}{pp'-qq'}, \frac{p'q'(p-q)}{pp'-qq'}\right), \quad Y\left(\frac{pr(p'-r')}{pp'-rr'}, \frac{p'r'(p-r)}{pp'-rr'}\right), \quad Z\left(\frac{qr(q'-r')}{qq'-rr'}, \frac{q'r'(q-r)}{qq'-rr'}\right).$$

We may check the collinearity of these points by calculating a determinant. This amounts to an
algebraic calculation. We may simplify this calculation by noticing that the frame K can be chosen
such that p = p′ = 1. A calculation shows that
$$\begin{vmatrix} \frac{q(1-q')}{1-qq'} & \frac{q'(1-q)}{1-qq'} & 1 \\[2pt] \frac{r(1-r')}{1-rr'} & \frac{r'(1-r)}{1-rr'} & 1 \\[2pt] \frac{qr(q'-r')}{qq'-rr'} & \frac{q'r'(q-r)}{qq'-rr'} & 1 \end{vmatrix} = \frac{1}{(1-qq')(1-rr')(qq'-rr')} \begin{vmatrix} q(1-q') & q'(1-q) & 1-qq' \\ r(1-r') & r'(1-r) & 1-rr' \\ qr(q'-r') & q'r'(q-r) & qq'-rr' \end{vmatrix} = 0.$$

Now consider the case where H and H ′ are parallel. Choose a frame K with origin P and with the
first coordinate axis H and the second coordinate axis the line P P ′ . With respect to K the coordinates
of the points are as follows:
P (0, 0), Q(q, 0), R(r, 0), P ′ (0, h′ ), Q′ (q′ , h′ ), R′ (r ′ , h′ ).

4 c.290 – c.350


Applying Lemma F.5 three times with the frame K, we obtain the coordinates of the intersection
points:
$$X\left(\frac{qq'}{q+q'}, \frac{h'q}{q+q'}\right), \quad Y\left(\frac{rr'}{r+r'}, \frac{h'r}{r+r'}\right), \quad Z\left(\frac{qq'-rr'}{q+q'-r-r'}, \frac{h'(q-r)}{q+q'-r-r'}\right).$$
As in the previous case, the collinearity of these points can be checked by calculating a determi-
nant. It is easy to see that this determinant is zero:
$$\frac{h'}{(q+q')(r+r')(q+q'-r-r')} \begin{vmatrix} qq' & q & q+q' \\ rr' & r & r+r' \\ qq'-rr' & q-r & q+q'-r-r' \end{vmatrix} = 0.$$

Remark. The two theorems of Pappus can be unified in the context of projective geometry. If in
the previous theorem (Theorem F.6) we let the points X, Y , Z go to infinity, we obtain Theorem F.2.
Moreover, the union of the two lines H and H ′ in these theorems is a degenerate conic. Both these
theorems are particular cases of Pascal’s Theorem F.11.


Theorem F.7 (Newton–Gauss line). A complete quadrilateral is the configuration of four lines, no
three of which pass through the same point. Let O, A, B, C, A′ , B′ be the intersection points of these
lines, i.e. the vertices of the complete quadrilateral, such that A lies between O and B and such that
A′ lies between O and B′ . The diagonals of this quadrilateral are the segments [OC], [AA′ ] and [BB′ ].
The midpoints of the diagonals of a complete quadrilateral are collinear.
Proof. Choose the coordinate frame with origin O and basis $(\overrightarrow{OA}, \overrightarrow{OA'})$. Then the coordinates of the
points are O(0, 0), A(1, 0), A′ (0, 1), B(b, 0), B′ (0, b′ ) and, by Lemma F.4,
$$C\left(\frac{b(1-b')}{1-bb'}, \frac{b'(1-b)}{1-bb'}\right).$$
Thus, if X, Y , Z are the midpoints of [OC], [AA′ ], [BB′ ] respectively, then
$$X\left(\frac{b(1-b')}{2(1-bb')}, \frac{b'(1-b)}{2(1-bb')}\right), \quad Y\left(\frac{1}{2}, \frac{1}{2}\right), \quad Z\left(\frac{b}{2}, \frac{b'}{2}\right).$$
The collinearity of the three points is equivalent to the vanishing of the following determinant:
$$\frac{1}{4(1-bb')} \begin{vmatrix} b(1-b') & b'(1-b) & 1-bb' \\ 1 & 1 & 1 \\ b & b' & 1 \end{vmatrix} = 0.$$
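The proof can also be replayed numerically in the same frame (plain Python; the values b = 3, b′ = 2 are a hypothetical choice):

```python
b, b_p = 3.0, 2.0
O, A, A_p, B, B_p = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (b, 0.0), (0.0, b_p)

# C from Lemma F.4 applied to the lines AB' and A'B.
d = 1.0 - b * b_p
C = (b * (1.0 - b_p) / d, b_p * (1.0 - b) / d)

def midpoint(P, Q):
    return ((P[0] + Q[0]) / 2.0, (P[1] + Q[1]) / 2.0)

# Midpoints of the three diagonals [OC], [AA'], [BB'].
X, Y, Z = midpoint(O, C), midpoint(A, A_p), midpoint(B, B_p)

def collinearity_det(P, Q, R):
    """Zero exactly when P, Q, R are collinear."""
    return (Q[0] - P[0]) * (R[1] - P[1]) - (Q[1] - P[1]) * (R[0] - P[0])
```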

The following two theorems give necessary and sufficient conditions for three points to be collinear
and, respectively, for three lines to be concurrent, in terms of oriented ratios of the sides of a triangle.
In the context of projective geometry, Menelaus’ theorem and Ceva’s theorem may be seen as projec-
tive duals (see for example [2]). In order to state them, we need to introduce oriented ratios. Let P be
a point on the line AB, distinct from B. The oriented ratio (or signed ratio) AP : P B, in which P divides the segment
[AB], is the ordinary ratio |AP | : |P B| if P is between A and B, and it is −|AP | : |P B| otherwise. Notice
that the signed ratio is defined by the equation
$$\overrightarrow{AP} = \left(AP : PB\right)\overrightarrow{PB} \tag{F.1}$$
which is why it is sometimes written as $\overrightarrow{AP}/\overrightarrow{PB}$.

Theorem F.8 (Menelaus’5 Theorem). Let ABC be a triangle and ℓ a line which does not pass through
the vertices of the triangle. Let X, Y , Z be points on the lines BC, CA and AB respectively and consider
the signed ratios α = BX : XC, β = CY : Y A and γ = AZ : ZB. Then X, Y , Z are collinear if and only if

α · β · γ = −1.

5 c.70–c.130


Proof. By Proposition 6.4, affine transformations preserve oriented ratios, i.e. oriented ratios are in-
dependent of the choice of the Cartesian frame. We choose a frame with origin A and basis $(\overrightarrow{AC}, \overrightarrow{AZ})$.
Consider the coordinates of the points in this frame: A(0, 0), B(0, b), C(1, 0), Y (y, 0), Z(0, 1). By Lemma
F.4, the point X ∈ BC lies on the line Y Z if and only if it has the following coordinates
$$X\left(\frac{y(1-b)}{1-by}, \frac{b(1-y)}{1-by}\right). \tag{F.2}$$
Considering the components of the relevant vectors,
$$\overrightarrow{BX} = \frac{y(1-b)}{1-by}(1, -b) \quad \text{and} \quad \overrightarrow{XC} = \frac{1-y}{1-by}(1, -b),$$
we see that (F.2) holds if and only if
$$\alpha = BX : XC = \frac{y(1-b)}{1-y}. \tag{F.3}$$
Considering the other vectors we have
$$\overrightarrow{AZ} = (0, 1), \quad \overrightarrow{ZB} = (b-1)(0, 1) \ \Rightarrow\ \gamma = AZ : ZB = \frac{1}{b-1}$$
and
$$\overrightarrow{CY} = (y-1)(1, 0), \quad \overrightarrow{YA} = -y(1, 0) \ \Rightarrow\ \beta = CY : YA = \frac{y-1}{-y}.$$
It follows that (F.3) is equivalent to
$$\alpha = -\frac{1}{\beta \cdot \gamma} \iff \alpha \cdot \beta \cdot \gamma = -1.$$
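A numerical instance of the theorem (plain Python; the triangle and the transversal are hypothetical, and `signed_ratio` implements the definition via equation (F.1)):

```python
def signed_ratio(P, Q, X):
    """Signed ratio PX : XQ for a point X on the line PQ (P != Q, X != Q)."""
    i = 0 if abs(Q[0] - P[0]) >= abs(Q[1] - P[1]) else 1   # best coordinate
    return (X[i] - P[i]) / (Q[i] - X[i])

A, B, C = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)
# The transversal y = x - 1 misses the vertices and cuts the side lines in:
X = (2.5, 1.5)    # on BC : x + y = 4
Y = (0.0, -1.0)   # on CA : x = 0
Z = (1.0, 0.0)    # on AB : y = 0

alpha = signed_ratio(B, C, X)   # BX : XC
beta = signed_ratio(C, A, Y)    # CY : YA
gamma = signed_ratio(A, B, Z)   # AZ : ZB
```

Since X, Y, Z are collinear, the product of the three signed ratios is −1.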

Theorem F.9 (Ceva’s6 Theorem). Let ABC be a triangle. Let E ∈ BC, F ∈ CA and G ∈ AB and consider
the signed ratios α = BE : EC, β = CF : FA and γ = AG : GB. The three lines AE, BF and CG are
concurrent if and only if
α · β · γ = 1.
6 1647-1734


Proof. By Proposition 6.4, affine transformations preserve oriented ratios, i.e. oriented ratios are in-
dependent of the choice of the Cartesian frame. We choose a frame with origin A and basis $(\overrightarrow{AG}, \overrightarrow{AF})$.
Consider the coordinates of the points in this frame: A(0, 0), B(b, 0), C(0, c), F(0, 1), G(1, 0). By Lemma
F.4, the intersection point P of the lines BF and CG has coordinates
$$P\left(\frac{b(1-c)}{1-bc}, \frac{c(1-b)}{1-bc}\right).$$
Then, the lines AE, BF, CG are concurrent if and only if P lies on AE, i.e. if and only if E is the
intersection of AP with BC. These two lines are described by the equations
$$BC : \frac{x}{b} + \frac{y}{c} = 1 \quad \text{and} \quad AP : \begin{cases} x = \frac{b(1-c)}{1-bc}\,t \\[2pt] y = \frac{c(1-b)}{1-bc}\,t. \end{cases}$$
Their intersection point is
$$\left(\frac{b(1-c)}{2-b-c}, \frac{c(1-b)}{2-b-c}\right). \tag{F.4}$$
Considering the components of the relevant vectors,
$$\overrightarrow{BE} = \frac{b-1}{2-b-c}(b, -c) \quad \text{and} \quad \overrightarrow{EC} = \frac{c-1}{2-b-c}(b, -c),$$
we see that E has coordinates (F.4) if and only if
$$\alpha = \frac{b-1}{c-1}. \tag{F.5}$$
Considering the other vectors we have
$$\overrightarrow{AG} = (1, 0), \quad \overrightarrow{GB} = (b-1)(1, 0) \ \Rightarrow\ \gamma = AG : GB = \frac{1}{b-1}$$
b−1


and
$$\overrightarrow{CF} = (1-c)(0, 1), \quad \overrightarrow{FA} = -(0, 1) \ \Rightarrow\ \beta = CF : FA = \frac{1-c}{-1},$$
and we see that (F.5) is equivalent to
$$\alpha = \frac{1}{\beta \cdot \gamma} \iff \alpha \cdot \beta \cdot \gamma = 1.$$

Remark. The lines AE, BF and CG are known as cevians. As is often the case with mathematical
discoveries, attributing a theorem to a single mathematician can be challenging. Today, we know
that a proof of Ceva’s theorem was discovered much earlier by Al-Mu’taman7 .
The intersection point of two cevians can be described vectorially as follows.
Proposition F.10. Let ABC be a triangle. Let G ∈ AB and F ∈ AC be two points distinct from the
vertices of the triangle and let P = BF ∩ CG. Consider the signed ratios λ = AG : GB and µ = AF : FC.
For any point O we have
$$\overrightarrow{OP} = \frac{\overrightarrow{OA} + \lambda\,\overrightarrow{OB} + \mu\,\overrightarrow{OC}}{1 + \lambda + \mu}. \tag{F.6}$$
−−−→ −−→ −−→ −−→
Proof. By (F.1) we have AG = λ GB and AF = µ FC . Therefore
−−→
−−→ −−−→ −−→ −−→ GB −−−→ 1 + λ −−−→
AB = AG + GB = (1 + λ) GB = (1 + λ) −−−→ AG = AG .
AG λ

and
−−→
−−→ −−→ −−→ −−→ FC −−→ 1 + µ −−→
AC = AF + FC = (1 + µ) FC = (1 + µ) −−→ AF = AF .
AF µ
−−→ −−→
In other words, with respect to the frame with origin A and basis ( AB , AC ), we have the following
coordinates B(1, 0), C(0, 1), G(λ/(1 + λ), 0) and F(0, µ/(1 + µ)). Thus, by Lemma F.4, the intersection
point P has coordinates !
λ µ
P , .
1+λ+µ 1+λ+µ
Hence, for any point O we have
−−→ −−→ −−−→ −−→ −−−→ −−−→ −−→ −−−→
−−→ −−−→ −−→ −−−→ λ AB + µ AC −−−→ (λ + µ) AO + λ OB + µ OC OA + λ OB + µ OC
OP = OA + AP = OA + = OA + =
1+λ+µ 1+λ+µ 1+λ+µ
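Formula (F.6) can be checked numerically; the sketch below (plain Python, hypothetical triangle, λ = 2 and µ = 1) takes O to be the origin and verifies that P lies on both cevians:

```python
A, B, C = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)
lam, mu = 2.0, 1.0

def divide(P, Q, k):
    """The point X on PQ with signed ratio PX : XQ = k, i.e. PX = k * XQ."""
    return ((P[0] + k * Q[0]) / (1.0 + k), (P[1] + k * Q[1]) / (1.0 + k))

G = divide(A, B, lam)   # AG : GB = lam
F = divide(A, C, mu)    # AF : FC = mu

# Formula (F.6) with O the origin:
P = tuple((A[i] + lam * B[i] + mu * C[i]) / (1.0 + lam + mu) for i in range(2))

def collinear(P1, P2, P3):
    return abs((P2[0] - P1[0]) * (P3[1] - P1[1])
               - (P2[1] - P1[1]) * (P3[0] - P1[0])) < 1e-12
```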

Remark. Equation (F.6) in the above proposition does not depend on the point O. The coefficients
of the three vectors are the barycentric coordinates of the point P relative to the triangle ABC (See
Section D.4). The above proposition can be applied to derive the barycentric coordinates of the
intersection point of different cevians, such as the medians, the altitudes, etc.
7 11th century


Theorem F.11 (Pascal’s hexagrammum mysticum theorem). If all six vertices of a hexagon lie on a
conic section and the three pairs of opposite sides intersect, then the three points of intersection are
collinear.

Proof. In the case where the conic section is a circle, a proof of this theorem based on Menelaus’
theorem can be found in [8, §.3.8].

APPENDIX G

Eigenvalues and Eigenvectors

Definition G.1. Let V be a R-vector space and let φ : V → V be a linear map. A non-zero vector v ∈ V
is called an eigenvector of φ if there is a scalar λ ∈ R such that φ(v) = λv. The scalar λ is then called
the eigenvalue associated to the eigenvector v. The set of eigenvalues of φ is called the spectrum of φ.
For A ∈ Matn×n (R), an eigenvector of A is an eigenvector v ∈ Rn of the map φA : Rn → Rn defined
by A through φA (x) = Ax, and an eigenvalue of A is an eigenvalue of φA .

• If φ = IdV , then every non-zero vector v is an eigenvector of φ with eigenvalue λ = 1.

• Every non-zero vector in ker(φ) is an eigenvector for φ with eigenvalue λ = 0.

Proposition G.2. The eigenvalue associated to an eigenvector is unique.

Proposition G.3. If v1 , v2 ∈ V are eigenvectors with the same eigenvalue λ, then for every c1 , c2 ∈ R
the vector c1 v1 + c2 v2 , if it is non-zero, is also an eigenvector with eigenvalue λ.

Definition G.4. From the above proposition it follows that for each λ ∈ R the set

Vλ (φ) = {v ∈ V : v is an eigenvector of φ with eigenvalue λ} ∪ {0}

is a vector subspace of V, called the eigenspace for the eigenvalue λ. For a matrix A ∈ Matn×n (R) the
eigenspace for the eigenvalue λ is defined to be the subspace Vλ (A) := Vλ (φA ) in Rn .

Proposition G.5. If v1 , . . . , vk ∈ V are eigenvectors with eigenvalues λ1 , . . . , λk respectively, and these


λi are pairwise distinct, then v1 , . . . , vk are linearly independent.

Proposition G.6. If every v ∈ V \ {0} is an eigenvector of φ then there exists λ ∈ R such that φ = λ IdV .


G.1 Characteristic polynomial


In order to find the eigenvalues of a linear map V → V one uses the characteristic polynomial.
Proposition G.7. Let V be a finite dimensional vector space and let φ : V → V be a linear map. A
scalar λ ∈ R is an eigenvalue of φ if and only if the map
φ − λ IdV : V → V defined by (φ − λ IdV )(v) = φ(v) − λv
is not bijective, that is, if and only if det(φ − λ IdV ) = 0.

• Let B = (e1 , . . . , en ) be a basis of V. The matrix associated to the map λ IdV is the diagonal matrix

[λ IdV ]B = diag(λ, . . . , λ),

and if A = (aij ) = [φ]B then

                ( a11 − λ   a12       . . .   a1n     )
[φ − λ IdV ]B = ( a21       a22 − λ   . . .   a2n     )
                ( . . .     . . .     . . .   . . .   )
                ( an1       an2       . . .   ann − λ )

Definition G.8. Let A ∈ Matn×n (R). The determinant

            | a11 − T    a12       . . .   a1n     |
PA (T ) :=  | a21        a22 − T   . . .   a2n     | = det(A − T Idn )
            | . . .      . . .     . . .   . . .   |
            | an1        an2       . . .   ann − T |

is a polynomial of degree n in T , called the characteristic polynomial of A. If φ : V → V is linear and
B = (e1 , . . . , en ) is a basis of V then the characteristic polynomial of φ is Pφ := P[φ]B .
Proposition G.9. The definition of Pφ is independent of the basis.
Corollary G.10. Let V be a vector space of dimension n, and let φ : V → V be linear. Then λ ∈ R
is an eigenvalue of φ if and only if λ is a root of the polynomial Pφ . In particular, φ has at most n
eigenvalues.
Proposition G.11. Let V be a finite dimensional vector space. A linear map φ : V → V is diagonaliz-
able if and only if there is a basis of V consisting entirely of eigenvectors of φ.
Theorem G.12. Let V be a real vector space of dimension n, and let φ : V → V be linear. If
{λ1 , . . . , λk } ⊆ R is the spectrum of φ, then
dim(Vλ1 (φ)) + · · · + dim(Vλk (φ)) ≤ n
with equality if and only if φ is diagonalizable.


Corollary G.13. If dim(V) = n and φ has n distinct eigenvalues then it is diagonalizable.

• We have a practical method for finding eigenvalues and eigenvectors.

• Suppose we are given A ∈ Matn×n (R).

1. Calculate the characteristic polynomial PA .


2. Find the eigenvalues of A by calculating the roots of PA which lie in R.
3. For each eigenvalue λ ∈ R, the homogeneous system of n equations in n unknowns

(A − λ Idn ) · x = 0,   x = (x1 , . . . , xn )ᵀ ,

has rank r < n and therefore has nontrivial solutions. The space of solutions is the eigenspace
Vλ (A).

• If the sum of the dimensions of the eigenspaces found by letting λ vary over the roots of PA is
equal to n then A is diagonalizable (by Theorem G.12).
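For a 2 × 2 matrix the three steps can be traced in a few lines of code. The sketch below (plain Python; the matrix, tolerances and helper are ours, for illustration only) finds the eigenvalues and checks the eigenvectors of A = [[2, 1], [1, 2]]:

```python
import math

a, b, c, d = 2.0, 1.0, 1.0, 2.0

# Step 1: the characteristic polynomial is P_A(T) = T^2 - (a + d) T + (ad - bc)
trace, det = a + d, a * d - b * c

# Step 2: its real roots are the eigenvalues
disc = trace ** 2 - 4 * det
lam1 = (trace + math.sqrt(disc)) / 2
lam2 = (trace - math.sqrt(disc)) / 2

# Step 3: a nontrivial solution of (A - lam * Id) x = 0; for a 2x2 matrix
# with b != 0 one may take x = (b, lam - a)
def eigenvector(lam):
    return (b, lam - a)

for lam in (lam1, lam2):
    x, y = eigenvector(lam)
    # check A v = lam v componentwise
    assert math.isclose(a * x + b * y, lam * x, abs_tol=1e-12)
    assert math.isclose(c * x + d * y, lam * y, abs_tol=1e-12)
```

Since the two eigenvalues found here are distinct, the sum of the dimensions of the eigenspaces is 2 = n and this A is diagonalizable.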

• A linear map V → V need not have any eigenvalues or eigenvectors.

• If dim(V) is odd, then the characteristic polynomial has odd degree and so has at least one real
root. Thus, every linear map V → V on an odd dimensional real vector space has at least one
eigenvalue, and so at least one eigenvector.

• If we replace R by C, by the fundamental theorem of algebra, PA has roots in C. Thus, every


linear map V → V on a finite dimensional complex vector space has at least one eigenvalue, and
so at least one eigenvector. Of course, this does not mean that the linear map is diagonalizable.

Definition G.14. Let V be a finite dimensional R-vector space and let φ : V → V be linear admitting
λ as eigenvalue. The number dim(Vλ (φ)) is called the geometric multiplicity of λ for φ. The algebraic
multiplicity of λ for φ is instead the multiplicity of λ as a root of the characteristic polynomial Pφ ; this
is denoted by hφ (λ).

Proposition G.15. For any linear map φ : V → V and λ ∈ R one has

dim(Vλ (φ)) ≤ hφ (λ),

that is, the geometric multiplicity is not larger than the algebraic multiplicity.
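A minimal example where the inequality is strict (an illustration of ours, not from the text): for A = [[1, 1], [0, 1]] the characteristic polynomial is (1 − T)², so λ = 1 has algebraic multiplicity 2, while (A − Id)x = 0 forces the second coordinate to vanish, so the geometric multiplicity is 1 and A is not diagonalizable.

```python
A = [[1, 1], [0, 1]]
lam = 1

def is_eigenvector(x, y):
    # (x, y) is an eigenvector for lam iff it is non-zero and A v = lam v
    return (x, y) != (0, 0) and \
        (A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y) == (lam * x, lam * y)

assert is_eigenvector(1, 0)       # (1, 0) spans the eigenspace
assert not is_eigenvector(0, 1)   # any vector with second coordinate != 0 fails
assert not is_eigenvector(1, 1)
```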

APPENDIX H

Bilinear forms and symmetric matrices

The purpose of this appendix is threefold:

1. We prove Sylvester’s theorem (Theorem H.7) which shows that the properties in Proposition
4.15 define scalar products (Corollary H.10).

2. From the proof of Sylvester’s theorem one obtains a simple algorithm for the affine classification
of quadrics (see Chapter 10).

3. We prove the Spectral theorem (Theorem H.14) which is used in the isometric classification of
quadratic surfaces (see Chapter 10).

Bilinear forms are the natural context for the proofs of both theorems.

H.1 Affine diagonalization


Throughout we let V denote a finite dimensional real vector space. The proofs for the statements in
this appendix can be found in [19, Chapter 15 and 16].

Definition H.1. A map φ : V × V → R is called a bilinear form if it is linear in both arguments, i.e.

φ(au + bv, w) = aφ(u, w) + bφ(v, w) and φ(u, av + bw) = aφ(u, v) + bφ(u, w). (H.1)

for all u, v, w ∈ V and all a, b ∈ R.

With respect to a basis B = (e1 , . . . , en ) of V, a bilinear form φ is determined by its values on the
basis vectors. These values are the entries of the Gram matrix GB (φ) of φ relative to B, whose (i, j)
entry is φ(ei , ej ). This matrix determines φ since

φ(v, w) = [v]TB · GB (φ) · [w]B .


Moreover if B ′ is another basis, then

GB ′ (φ) = MTB,B ′ · GB (φ) · MB,B ′ . (H.2)

In particular, the rank of GB (φ) does not depend on B. It depends only on φ. We denote it by rank(φ)
and call it the rank of φ.
Definition H.2. A bilinear form φ : V × V → R is called symmetric if its Gram matrix GB (φ) with
respect to some basis B is a symmetric matrix. It follows from (H.2) that this definition does not
depend on the choice of the basis B.
Definition H.3. Let φ be a bilinear form on the vector space V. The quadratic form associated to φ is
the map
qφ : V → R defined by qφ (v) = φ(v, v).
Proposition H.4. Let φ be a symmetric bilinear form on the vector space V. The quadratic form qφ
associated to φ satisfies

qφ (λv) = λ² qφ (v)   and   2φ(v, w) = qφ (v + w) − qφ (v) − qφ (w)

for every λ ∈ R and every v, w ∈ V.
Proof. The first property is an immediate consequence of (H.1). Further,
qφ (v + w) − qφ (v) − qφ (w) = φ(v + w, v + w) − φ(v, v) − φ(w, w)
= φ(v, w) + φ(w, v)
= 2φ(v, w).

Remark. In particular, Proposition H.4 implies that the quadratic form qφ uniquely determines the
symmetric bilinear form φ, i.e. the correspondence φ ↔ qφ is a bijection between symmetric bilinear
forms and quadratic forms.
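The polarization identity is easy to check numerically. In the sketch below (plain Python; the Gram matrix G holds sample values of ours), φ is evaluated through its Gram matrix as φ(v, w) = [v]ᵀ · G · [w]:

```python
G = [[2.0, 1.0], [1.0, 3.0]]   # a sample symmetric Gram matrix

def phi(v, w):
    # phi(v, w) = v^T G w, written out for dimension 2
    return sum(v[i] * G[i][j] * w[j] for i in range(2) for j in range(2))

def q(v):
    # the associated quadratic form q_phi(v) = phi(v, v)
    return phi(v, v)

v, w = (1.0, -2.0), (0.5, 4.0)
s = (v[0] + w[0], v[1] + w[1])
# polarization identity: 2 phi(v, w) = q(v + w) - q(v) - q(w)
assert abs(2 * phi(v, w) - (q(s) - q(v) - q(w))) < 1e-12
```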
Definition H.5. Let φ be a symmetric bilinear form on the vector space V. A diagonalizing basis for
φ is a basis B of V such that GB (φ) is a diagonal matrix.
Theorem H.6. Let φ be a symmetric bilinear form on the vector space V. Then there exists a diago-
nalizing basis for φ.
Proof. We take the proof from [19, p.232]. The proof is ‘by completing the square’ and induction on
n. If n = 1 there is nothing to prove. Suppose therefore that n ≥ 2, and that every symmetric bilinear
form on a space of dimension less than n has a diagonalizing basis. Choose a basis B = (b1 , . . . , bn )
of V. If φ is the zero form, then B is diagonalizing and there is nothing to prove. Otherwise we can
obtain from B a second basis B ′ = (c1 , . . . , cn ) for which φ(c1 , c1 ) ≠ 0. Indeed, if there is some i for
which φ(bi , bi ) ≠ 0 then it suffices to exchange b1 and bi . If, on the other hand, φ(bi , bi ) = 0 for all i,
then there are i ≠ j for which φ(bi , bj ) ≠ 0 (otherwise φ is the zero form), and again we can exchange
these with b1 and b2 so that φ(b1 , b2 ) ≠ 0. The new basis

B ′ = (b1 + b2 , b2 , . . . , bn )


has the required property. In the basis B ′ , the quadratic form qφ associated to φ has the form

q(v(y1 , . . . , yn )) = h11 y1² + 2 Σ_{i=2}^{n} h1i y1 yi + Σ_{i,j=2}^{n} hij yi yj ,

where hij = φ(ci , cj ). Since h11 = φ(c1 , c1 ) ≠ 0 we can rewrite the above equation as

q(v(y1 , . . . , yn )) = h11 ( y1 + h11⁻¹ Σ_{i=2}^{n} h1i yi )² + (terms not involving y1 ).

We now change coordinates as follows

z1 = y1 + h11⁻¹ Σ_{i=2}^{n} h1i yi ,   z2 = y2 , . . . , zn = yn ,

which corresponds to a change of basis from B ′ to B ′′ = (d1 , . . . , dn ) given by d1 = c1 , and for i > 1,
di = ci − h11⁻¹ h1i c1 . In these coordinates, qφ has the form

q(v(z1 , . . . , zn )) = h11 z1² + q′ (z2 , . . . , zn ),

where q′ is a homogeneous polynomial of degree 2 in z2 , . . . , zn , and so defines a quadratic form on the
space ⟨d2 , . . . , dn ⟩. By the inductive hypothesis, ⟨d2 , . . . , dn ⟩ has a basis (e2 , . . . , en ) which diagonalizes
q′ . Thus, the basis (d1 , e2 , . . . , en ) is diagonalizing for q.
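The induction above is effectively an algorithm: apply matching row and column operations, recording them in a change-of-basis matrix, until the matrix is diagonal. A rough Python sketch of ours (it only handles the cases needed for the demo; a full version would also try to swap in a non-zero diagonal entry first):

```python
def congruence_diagonalize(A):
    # Returns (P, D) with D = P^T A P diagonal, for symmetric A, following
    # the completing-the-square induction (symmetric row/column operations).
    n = len(A)
    A = [row[:] for row in A]
    P = [[float(i == j) for j in range(n)] for i in range(n)]

    def add_col(dst, src, t):
        for i in range(n):                 # column_dst += t * column_src ...
            A[i][dst] += t * A[i][src]
        for i in range(n):                 # ... and the same on rows, so A
            A[dst][i] += t * A[src][i]     # stays symmetric (congruence)
        for i in range(n):
            P[i][dst] += t * P[i][src]     # record the change of basis

    for k in range(n):
        if A[k][k] == 0:                   # the 'b1 + b2' trick of the proof
            for i in range(k + 1, n):
                if A[k][i] != 0:
                    add_col(k, i, 1.0)
                    break
        if A[k][k] == 0:
            continue                       # row/column k already zero
        for j in range(k + 1, n):          # clear row/column k past the pivot
            add_col(j, k, -A[k][j] / A[k][k])
    return P, A

A = [[0.0, 1.0], [1.0, 0.0]]               # the form q(x, y) = 2xy
P, D = congruence_diagonalize(A)
assert abs(D[0][1]) < 1e-12 and abs(D[1][0]) < 1e-12
assert D[0][0] > 0 and D[1][1] < 0          # one positive, one negative square
```

On this example the algorithm finds that 2xy is equivalent to a form with one positive and one negative square, i.e. to x² − y².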

Theorem H.7 (Sylvester’s1 Theorem). Let φ be a symmetric bilinear form of rank r on the vector
space V. Then there is an integer p depending only on φ, and a basis B of V such that the Gram
matrix GB (φ) has the form

( Ip    0       0 )
( 0    −Ir−p    0 )        (H.3)
( 0     0       0 )

where 0 stands for zero matrices of appropriate sizes.

Proof. We take the proof from [19, p.232]. By Theorem H.6 there is a basis B = (b1 , . . . , bn ) for which

qφ (v) = b11 y1² + · · · + bnn yn²

for each v = y1 b1 + · · · + yn bn ∈ V. The number of non-zero coefficients bii is equal to the rank of the
quadratic form qφ , and so depends only on φ. After possibly reordering the basis, we can suppose
that the first p coefficients are positive, the next r − p are negative, and the remaining n − r are zero.
Then one has

b11 = β1² , . . . , bpp = βp² ,   bp+1,p+1 = −βp+1² , . . . , br,r = −βr² ,   br+1,r+1 = · · · = bn,n = 0
1 1814 – 1897


for appropriate β1 , . . . , βr ∈ R, which can be taken to be positive. Then, with respect to the basis

c1 = b1 /β1 , . . . , cr = br /βr , cr+1 = br+1 , . . . , cn = bn ,

the matrix of φ is that given in (H.3), and so the quadratic form qφ is

qφ (v(x1 , . . . , xn )) = x1² + · · · + xp² − xp+1² − · · · − xr²

for each v = x1 c1 + · · · + xn cn ∈ V.
There remains only to prove that p depends only on φ and not on the particular basis B. Suppose
that with respect to another basis B ′ = (c1 , . . . , cn ), qφ is

qφ (v) = z1² + · · · + zt² − zt+1² − · · · − zr² ,

for each v = z1 c1 + · · · + zn cn , for some integer t ≤ r. We need to show that t = p. If p ≠ t we can suppose
that t < p. Consider the subspaces of V:

S = ⟨b1 , . . . , bp ⟩ and T = ⟨ct+1 , . . . , cn ⟩.

Since dim(S) + dim(T ) > n it follows from Grassmann’s formula that S ∩ T ≠ {0}, and so there is a
v ∈ S ∩ T with v ≠ 0. Then,

v = x1 b1 + · · · + xp bp = zt+1 ct+1 + · · · + zn cn .

Since v ≠ 0 it follows, using the expression of qφ relative to B, that

qφ (v) = x1² + · · · + xp² > 0

and, using the expression of qφ relative to B ′ , we have

qφ (v) = −zt+1² − · · · − zr² ≤ 0.

This is clearly a contradiction, so t = p.

Theorem H.8 (Sylvester’s Theorem - matrix formulation). Let A be a symmetric matrix of rank r.
Then there is an integer p depending only on A, and an invertible matrix P such that

          ( Ip    0       0 )
Pᵀ AP =   ( 0    −Ir−p    0 )
          ( 0     0       0 )

where 0 stands for zero matrices of appropriate sizes.

Proof. Let φ be the symmetric bilinear form associated to the matrix A with respect to some basis B ′
of V. Then GB ′ (φ) = A. By Theorem H.7 there is a basis B and an integer p depending only on φ such
that

Pᵀ AP = MᵀB ′ ,B · GB ′ (φ) · MB ′ ,B = GB (φ),

which is the matrix (H.3), where P = MB ′ ,B . Since p depends only on φ, it does not depend on B and
B ′ , i.e. it depends only on A, and the claim follows.


Corollary H.9. Let φ be a positive definite symmetric bilinear form on an n-dimensional vector space
V. There is a basis B of V such that GB (φ) = In .

Proof. By Theorem H.7 there is a basis B = (b1 , . . . , bn ) such that the Gram matrix of φ has the form
(H.3). Since φ is positive definite, i.e. φ(v, v) > 0 for all non-zero vectors v, its Gram matrix must
equal the identity matrix In (otherwise φ(bn , bn ) would equal 0 or −1, a contradiction).

Corollary H.10. Assume that the dimension of the space V of geometric vectors is n and assume that
a unit segment has been chosen. Then, the properties in Proposition 4.15 define the scalar product.
More concretely, there is a unique positive definite symmetric bilinear form φ which recognizes right
angles and unit lengths.

Proof. The first three properties in Proposition 4.15 assert that the scalar product is a positive
definite symmetric bilinear form on the space Vn of geometric vectors. By Corollary H.9 there is a
basis B such that the Gram matrix of the scalar product equals In . Therefore, by property (SP4), B is
an orthonormal basis.
Let φ be another positive definite symmetric bilinear form. Since φ is symmetric, the Gram
matrix GB (φ) is symmetric. If φ satisfies property (SP4), i.e. if it recognizes right angles and unit
lengths, then we must have GB (φ) = In since B is orthonormal. Hence φ is the scalar product.

H.2 Isometric diagonalization


In essence, the spectral theorem says that a symmetric linear map or a symmetric bilinear form can
be diagonalized with orthogonal transformations. In order to make this statement precise, we first
put side by side some facts about linear maps and bilinear forms, and make precise what we mean
by a ‘symmetric’ linear map. Symmetric bilinear forms were discussed in Section H.1.
Given a basis B of Vn , an n × n-matrix M with real entries gives rise to the linear map

φ : Vn → Vn , defined by φ(v) = M · [v]B

and to the bilinear form

ψ : Vn × Vn → R, defined by ψ(v, w) = [v]TB · M · [w]B .

We say that φ is the linear map associated to M in the basis B and ψ is the bilinear form associated to
M in the basis B. The other way around, given a linear map φ : Vn → Vn , it has an associated matrix
MB,B (φ) with respect to the bases B. Similarly, given a bilinear form ψ : Vn × Vn → R we associate to
it the Gram matrix GB (ψ) (see Section H.1). Then

φ(v) = MB,B (φ) · [v]B and ψ(v, w) = [v]TB · GB (ψ) · [w]B .

If we change the basis from B to B ′ , it is an exercise in linear algebra to show that

MB ′ ,B ′ (φ) = M⁻¹B,B ′ · MB,B (φ) · MB,B ′   and   GB ′ (ψ) = MᵀB,B ′ · GB (ψ) · MB,B ′      (H.4)

where MB,B ′ is the base change matrix from B ′ to B.


In this setting we add the following assumptions: we consider the scalar product ⟨ , ⟩ on Vn , we
let B be an orthonormal basis and we assume that M is a symmetric matrix, i.e. M = M T . Then, we
let φ be the linear map associated to M in the basis B and we let ψ be the bilinear form associated to
M in the basis B. Then, since the Gram matrix of the scalar product is In , we have
ψ(v, w) = [v]TB · M · [w]B = ⟨v, M · [w]B ⟩ = ⟨v, φ(w)⟩.
Since M is symmetric, we also have
[v]TB · M · [w]B = (M T · [v]B )T · [w]B = (M · [v]B )T · [w]B
and therefore
⟨φ(v), w⟩ = ψ(v, w) = ⟨v, φ(w)⟩. (H.5)
Definition H.11. A linear map φ : Vn → Vn is called symmetric (relative to the scalar product) if (H.5)
holds for all v, w ∈ Vn .
The proof of the Spectral Theorem uses the concept of orthogonal complement to a vector.
Definition H.12. Let v ∈ Vn . The orthogonal complement, denoted by v⊥ , is the set of all vectors in Vn
which are orthogonal to v. So n o
v⊥ = w ∈ Vn : ⟨v, w⟩ = 0 .
Since the scalar product is bilinear, the map fv : Vn → R defined by fv (w) = ⟨v, w⟩ is linear and we
notice that v⊥ = ker(fv ). Thus, if v is non-zero, v⊥ is an (n − 1)-dimensional vector subspace of Vn .
The proof of the Spectral Theorem also uses the following linear algebra fact.
Lemma H.13. The characteristic polynomial of a symmetric matrix M ∈ Matn×n (R) has only real
roots. Equivalently, the eigenvalues of a symmetric linear map are all real.
Proof. The equivalence of the two statements is obvious, by considering the linear map associated to
M. We take the proof of the first claim from [19, p.232].
We can consider M as a matrix over C (that is, with complex entries), and so we may view φ as a
linear map Cn → Cn . Let λ ∈ C be a root of the characteristic polynomial of M, and let x(x1 , . . . , xn ) ∈
Cn be a corresponding eigenvector. Then
Mx = λx.
Taking the complex conjugates of both sides gives
Mx = λx.
Consider the scalar xT Mx. Writing it in two different ways using the above equations gives
xT Mx = xT (Mx) = xT (λx) = λxT x (H.6)
xT Mx = (xT M)x = (Mx)T x = (λx)T x = λxT x (H.7)
Note that
xT x = x1 x1 + . . . xn xn
is a strictly positive real number, since x , 0. We can therefore deduce from (H.6) and (H.7) that
λ = λ, that is that λ is real.

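In the lowest-dimensional case the lemma can also be seen from the discriminant: for a symmetric [[a, b], [b, c]] the characteristic polynomial T² − (a + c)T + (ac − b²) has discriminant (a + c)² − 4(ac − b²) = (a − c)² + 4b² ≥ 0, so both roots are real. A quick numerical sanity check of ours:

```python
import random

random.seed(0)
for _ in range(1000):
    a, b, c = (random.uniform(-10, 10) for _ in range(3))
    # discriminant of the characteristic polynomial of [[a, b], [b, c]]
    disc = (a + c) ** 2 - 4 * (a * c - b * b)
    # it equals (a - c)^2 + 4 b^2, hence is non-negative: real roots
    assert abs(disc - ((a - c) ** 2 + 4 * b * b)) < 1e-9
    assert disc >= -1e-9
```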

Theorem H.14 (Spectral Theorem). Let φ : Vn → Vn be a symmetric linear map. Then, there exists
an orthonormal basis B such that MB,B (φ) is a diagonal matrix.
Equivalently, let ψ : Vn ×Vn → R be a symmetric bilinear form. Then, there exists an orthonormal
basis B such that GB (ψ) is a diagonal matrix.
Proof. We take the proof of the first claim from [19, p.232]. The proof is by induction on n = dim(Vn ).
If n = 1 there is nothing to prove. Suppose therefore that n ≥ 2, and that the theorem holds for spaces
of dimension n − 1. Since φ is symmetric, it has only real eigenvalues, by Lemma H.13. Thus φ has an
eigenvalue λ; let e1 be a corresponding eigenvector, which we can take to be of length 1. Let U = e1⊥ ,
the orthogonal complement to e1 . For each u ∈ U,

⟨φ(u), e1 ⟩ = ⟨u, φ(e1 )⟩ = ⟨u, λe1 ⟩ = λ⟨u, e1 ⟩ = λ · 0 = 0,

and so φ(u) ∈ U, that is, φ induces a map φU : U → U. Since φU (u) = φ(u) for every u ∈ U, the map φU
is a symmetric linear map. By the inductive hypothesis, U has an orthonormal basis (e2 , . . . , en ) which
diagonalizes φU . Thus B = (e1 , e2 , . . . , en ) is an orthonormal basis of V which diagonalizes φ.
For the second claim we use (H.5). Let φ be the symmetric linear map associated to the Gram
matrix GB ′ (ψ) of ψ for some basis B ′ . Let B = (e1 , e2 , . . . , en ) be the diagonalizing orthonormal basis
for φ obtained in the first part of the proof. By construction, it consists of eigenvectors of φ. Then,
by (H.5), we have

ψ(ei , ej ) = ⟨φ(ei ), ej ⟩ = ⟨λi ei , ej ⟩ = λi ⟨ei , ej ⟩ = λi if i = j, and 0 if i ≠ j,

hence B is a diagonalizing basis for ψ.

Theorem H.15 (Spectral Theorem - matrix formulation). Let M ∈ Matn×n (R) be a symmetric matrix.
There exists an orthogonal matrix P such that

P⁻¹ MP = Pᵀ MP = diag(λ1 , . . . , λn )

where λ1 , . . . , λn are the eigenvalues of M.


Proof. Let B ′ be an orthonormal basis of Vn . Let φ be the linear map associated to M in the basis B ′ .
By the proof of Theorem H.14, there is an orthonormal basis B such that

MB,B (φ) = diag(λ1 , . . . , λn )

where λ1 , . . . , λn are the eigenvalues of φ, and therefore of M. Then we may choose P = MB ′ ,B since

P⁻¹ MP = M⁻¹B ′ ,B · MB ′ ,B ′ (φ) · MB ′ ,B = MB,B (φ)


by (H.4). Moreover, since B and B ′ are both orthonormal, it follows that P is an orthogonal matrix,
i.e. P⁻¹ = Pᵀ, and the proof is finished. Indeed, since B ′ is orthonormal and since P = MB ′ ,B is the
matrix whose columns are the components of the vectors of the orthonormal basis B, we have
Pᵀ · P = In , hence P⁻¹ = Pᵀ.
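For a symmetric 2 × 2 matrix the whole statement can be verified by hand; the sketch below (plain Python; the matrix, helpers and tolerances are ours) builds P from unit eigenvectors and checks that Pᵀ M P is the diagonal matrix of eigenvalues:

```python
import math

M = [[2.0, 1.0], [1.0, 2.0]]
a, b, c = M[0][0], M[0][1], M[1][1]

# eigenvalues of [[a, b], [b, c]] (real, by the previous lemma)
root = math.sqrt((a - c) ** 2 + 4 * b * b)
lam1, lam2 = (a + c + root) / 2, (a + c - root) / 2

def unit(v):
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)

# unit eigenvectors (assuming b != 0); eigenvectors for distinct eigenvalues
# of a symmetric map are automatically orthogonal
e1, e2 = unit((b, lam1 - a)), unit((b, lam2 - a))
P = [[e1[0], e2[0]], [e1[1], e2[1]]]   # columns form an orthonormal basis

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Pt = [[P[j][i] for j in range(2)] for i in range(2)]
D = mat_mul(Pt, mat_mul(M, P))         # should be diag(lam1, lam2)
assert abs(D[0][1]) < 1e-12 and abs(D[1][0]) < 1e-12
assert abs(D[0][0] - lam1) < 1e-12 and abs(D[1][1] - lam2) < 1e-12
```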

APPENDIX I

Trigonometric functions

Let a and b be two vectors in V2 . In Section 4.1 we noticed that there is a bijection between the
set W of unoriented angles ∡(a, b) and a semicircle of the unit circle S1 , as well as a bijection between
the set Wor of oriented angles ∡or (a, b) and the unit circle S1 . It is clearly uncomfortable to do
calculations in this setting. For a more comfortable manipulation, one attaches numbers to angles (we
parametrize angles). The standard way to do this is to define a measure of an angle, a numerical value
which we may use instead of describing the angle as two rays emanating from a point. The standard
measure is that of radians, which can be introduced as follows.
Definition I.1. Let a = −−→OA and b = −−→OB be two non-zero vectors in V2 . The interior of the angle
∡AOB is the set of points P with the property that there are scalars α, β ≥ 0 such that −−→OP = αa + βb.
The sector S(AOB) is the set of points P in the interior of the angle and in the interior of the circle of
radius 1 centered in O, i.e. d(O, P ) ≤ 1.
The measure of the angle ∡(a, b), denoted by m(∡(a, b)), is

m(∡(a, b)) = 2 · Area(S(AOB)) if a and b are linearly independent,
m(∡(a, b)) = 0 if a and b have the same direction,
m(∡(a, b)) = π if a and b have opposite direction,

where π is by definition the area of a unit circle. The definition does not depend on the representatives.
Theorem I.2. The properties of the area function translate into properties of angles. For two non-
zero vectors a and b we have
1. 0 ≤ m(∡(a, b)) ≤ π

2. If v is between a and b, then m(∡(a, v)) + m(∡(v, b)) = m(∡(a, b)).

3. For any θ ∈ [0, π] there is a vector c ∈ V such that m(∡(a, c)) = θ


4. m(∡(a, b)) = m(∡(c, d)) if and only if ⟨a, b⟩/(|a| · |b|) = ⟨c, d⟩/(|c| · |d|).
.

5. m(∡(a, b)) = m(∡(b, a)).

6. m(∡(a, b)) + m(∡(b, −a)) = π.

7. m(∡(a, b)) = 0 if and only if b and a have the same direction.

8. If a ⊥ b, then m(∡(a, b)) = π/2.

9. If ⟨a, b⟩ > 0, then 0 ≤ m(∡(a, b)) < π/2 and we say that the angle is acute.

10. If ⟨a, b⟩ < 0, then π/2 < m(∡(a, b)) ≤ π and we say that the angle is obtuse.

11. In particular, a ⊥ b if and only if m(∡(a, b)) = π/2.

Definition I.3. The measure of the angle ∡or (a, b), denoted by m(∡or (a, b)), is

m(∡or (a, b)) = m(∡(a, b)) if [a, b] ≥ 0, and −m(∡(a, b)) if [a, b] < 0.

Proposition I.4. The measure of an oriented angle satisfies the following properties

1. −π < m(∡or (a, b)) ≤ π

2. 0 < m(∡or (a, b)) < π if and only if (a, b) is right oriented.
3. m(∡or (a, J(a))) = π/2

4. m(∡or (a, b)) = −m(∡or (b, a))

These definitions of measures of angles give the standard bijections

W ↔ [0, π] and Wor ↔ (−π, π]. (I.1)

This allows us to view sin, cos, tan as functions (−π, π] → R. Notice also that the measure of an
oriented angle may be negative; under θ ↦ −θ the sine function changes sign while the cosine does
not, i.e. we have the sign rules that we are familiar with:

sin(−θ) = − sin(θ) and cos(−θ) = cos(θ).

In what follows we work out main properties of the trigonometric functions. The bijections (I.1)
allow us to simplify notation: we write ∡or (a, b) instead of m(∡or (a, b)).

Proposition I.5. If a and b are two non-zero vectors in the oriented Euclidean plane E2 , then

cos(θ) = ⟨a, b⟩/(|a| · |b|)   and   sin(θ) = [a, b]/(|a| · |b|)

where θ = ∡or (a, b).
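In computational terms, Proposition I.5 says that the measure of the oriented angle can be recovered with the two-argument arctangent: the common positive factor |a| · |b| cancels inside atan2. A small sketch of ours:

```python
import math

def oriented_angle(a, b):
    # theta = atan2(sin theta, cos theta) = atan2([a, b], <a, b>)
    dot = a[0] * b[0] + a[1] * b[1]   # scalar product <a, b>
    det = a[0] * b[1] - a[1] * b[0]   # determinant [a, b]
    return math.atan2(det, dot)       # value in (-pi, pi]

assert math.isclose(oriented_angle((1, 0), (0, 1)), math.pi / 2)
assert math.isclose(oriented_angle((1, 0), (0, -1)), -math.pi / 2)
assert math.isclose(oriented_angle((1, 0), (-1, 0)), math.pi)
```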


Proposition I.6. If a is a non-zero vector in the oriented Euclidean plane E2 , and if c2 + s2 = 1 then
there is a vector b such that

cos(∡or (a, b)) = c and sin(∡or (a, b)) = s

and all such vectors are proportional.

Instead of viewing ∡or (a, b) as a value in (−π, π] it is more convenient to view it as an equivalence
class of R modulo 2π, i.e. θ ∈ R represents the angle ∡or (a, b) if θ = ∡or (a, b) + 2kπ for some integer
k. With this convention we have.

Proposition I.7. If a, b, c are three non-zero vectors in the oriented Euclidean plane E2 , then

∡or (a, b) = ∡or (a, c) + ∡or (c, b) mod 2π.

Theorem I.8. For all α, β ∈ R, we have

cos(α + β) = cos(α) cos(β) − sin(α) sin(β)


sin(α + β) = sin(α) cos(β) + cos(α) sin(β).
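These identities can be sanity-checked numerically (this is of course no substitute for a proof; the check below is ours):

```python
import math
import random

random.seed(1)
for _ in range(200):
    al = random.uniform(-math.pi, math.pi)
    be = random.uniform(-math.pi, math.pi)
    # cos(a + b) = cos a cos b - sin a sin b
    assert math.isclose(math.cos(al + be),
                        math.cos(al) * math.cos(be) - math.sin(al) * math.sin(be),
                        abs_tol=1e-12)
    # sin(a + b) = sin a cos b + cos a sin b
    assert math.isclose(math.sin(al + be),
                        math.sin(al) * math.cos(be) + math.cos(al) * math.sin(be),
                        abs_tol=1e-12)
```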

Proposition I.9. The following half angle formulas hold true:

tan(α/2) = sin(α)/(1 + cos(α))
2 cos²(α/2) = 1 + cos(α)
2 sin²(α/2) = 1 − cos(α)

Proposition I.10. For all θ ∈ [0, π/2), we have

sin(θ) ≤ θ ≤ tan(θ).

Corollary I.11. For all θ, we have

0 ≤ 1 − cos(θ) ≤ θ² .

Proposition I.12. We have the following limits:

lim_{θ→0} (1 − cos(θ))/θ = 0   and   lim_{θ→0} sin(θ)/θ = 1.

Corollary I.13. The sine and cosine functions are differentiable and the derivatives are

cos′ (θ) = − sin(θ) and sin′ (θ) = cos(θ).

APPENDIX J

Some classical theorems in Euclidean geometry

In this section we collect classical theorems in Euclidean geometry. The theorems in Appendix F are
purely affine but they also belong to this list.
Theorem J.1 (Euclid’s first theorem). Let ABC be a right triangle with −−→BA ⊥ −−→BC , and let
H ∈ [AC] be such that −−→BH ⊥ −−→AC . Then

d(A, B)² = d(A, C) · d(A, H).

Theorem J.2 (Pythagoras’ theorem). Let the triangle ABC be right-angled at B. Then

d(A, C)² = d(A, B)² + d(B, C)².

Theorem J.3 (Theorem of heights). Let the triangle ABC be right-angled at B and let H be the foot of
the altitude on the hypotenuse [AC]. Then

d(B, H)² = d(A, H) · d(H, C).

Theorem J.4 (Thales’ circle theorem). If A, B, C are points on a circle, then [AC] is a diameter if and
only if ∡ABC is a right angle.

Theorem J.5 (Central angle theorem). Let ABC be a triangle and consider the circumcircle with
center O. We say that the angle ϕ = ∡(ACB) is subtended by the chord [AB] and that the angle
ψ = ∡AOB is the corresponding central angle. We have ψ = 2ϕ.

Theorem J.6 (Ptolemy’s theorem). A quadrilateral is called inscribed if its vertices lie on a circle. A
quadrilateral is inscribed if and only if the sum of the products of the lengths of its two pairs of
opposite sides is equal to the product of the lengths of its diagonals.


Theorem J.7 (Euler’s line). The orthocenter H, the centroid G and the circumcenter U of a triangle
are collinear. Moreover,

−−→HG = 2 −−→GU .

Theorem J.8 (Feuerbach Circle). In every triangle, the three midpoints of the sides, the three feet of
the altitudes, and the three midpoints of the segments joining the vertices to the orthocenter lie on
a circle. Moreover, the center of this circle lies on Euler’s line, halfway between the orthocenter and
the circumcenter.

APPENDIX K

Quaternions and rotations

K.1 Algebraic considerations


Some of the aspects considered here are also covered in [?, Section 4.4].

Definition K.1. Denote the standard basis of R4 by 1, i, j, k and consider the bilinear map

· : R4 × R4 → R4

given on the basis vectors by

·   1    i    j    k
1   1    i    j    k
i   i   −1    k   −j
j   j   −k   −1    i
k   k    j   −i   −1

We denote R4 with the above multiplication by H. The elements of H are called quaternions. The
product is the Hamilton product.

Remark. From the definition we observe that

1. The multiplication map on arbitrary quaternions p = a1 +b1 i+c1 j+d1 k and q = a2 +b2 i+c2 j+d2 k
is
pq = (a1 a2 − b1 b2 − c1 c2 − d1 d2 ) + (a1 b2 + a2 b1 + c1 d2 − c2 d1 )i
(K.1)
+ (a1 c2 + a2 c1 − b1 d2 + b2 d1 )j + (a1 d2 + a2 d1 + b1 c2 − b2 c1 )k

2. Direct calculations show that H is an algebra, usually called quaternion algebra.


3. H is not commutative: i · j = k = −j · i.

4. R · 1 is a subfield of H so we just write R for it.

5. C = R · 1 + R · i is a subfield of H.
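Formula (K.1) can be tested directly; below is a small sketch (plain Python; the function name is ours), which confirms a few entries of the multiplication table:

```python
def hamilton(p, q):
    # Hamilton product (K.1) on 4-tuples (a, b, c, d) standing for a+bi+cj+dk
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + a2*b1 + c1*d2 - c2*d1,
            a1*c2 + a2*c1 - b1*d2 + b2*d1,
            a1*d2 + a2*d1 + b1*c2 - b2*c1)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
assert hamilton(i, j) == k                  # i * j = k
assert hamilton(j, i) == (0, 0, 0, -1)      # j * i = -k: not commutative
assert hamilton(i, i) == (-1, 0, 0, 0)      # i^2 = -1
```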

Definition K.2. For a quaternion q = a + bi + cj + dk ∈ H, a is the real part ℜ(q) of q and bi + cj + dk


the imaginary part ℑ(q) of q. We say that q is real if it equals its real part. We say that q is purely
imaginary if it equals its imaginary part.

Proposition K.3. A quaternion is real if and only if it commutes with all quaternions, i.e. the center
of H is R.

Proposition K.4. A quaternion is purely imaginary if and only if its square is real and non-positive.

Definition K.5. For a quaternion q = a + bi + cj + dk ∈ H, the conjugate of q is

q̄ = a − bi − cj − dk = ℜ(q) − ℑ(q) ∈ H.

Proposition K.6. For p, q ∈ H and a ∈ R we have

1. the conjugate of p + q is p̄ + q̄

2. the conjugate of ap is a p̄

3. the conjugate of p̄ is p

4. the conjugate of p · q is q̄ · p̄

5. p ∈ R ⇔ p = p̄

6. p is purely imaginary ⇔ p̄ = −p

7. ℜ(p) = ½ (p + p̄)

8. ℑ(p) = ½ (p − p̄)

K.2 Geometric considerations


By construction H is R4 as a real vector space, so we may view it as a 4-dimensional real affine space.
If in addition we consider the 4-dimensional Euclidean structure we may identify H with E4 . In
particular, we may consider the standard scalar product ⟨ , ⟩ : H × H → R on H ≅ R4 .

Proposition K.7 (Compare this with the similar statements for C ≅ E2 ). For p, q ∈ H we have

1. ⟨p, q⟩ = ½ (p q̄ + q p̄)

2. ⟨p, p⟩ = p p̄

3. |p| = √(p p̄)


If in addition p and q are purely imaginary, we have

4. ⟨p, q⟩ = −½ (pq + qp) = −ℜ(pq)

5. ⟨p, p⟩ = −p²

6. |p| = √(−p²)

7. ⟨p, q⟩ = 0 ⇔ pq = −qp.

Definition K.8. With our identification, quaternions are vectors in V4 ≅ D(H) ≅ H and the norm |q|
of a quaternion q equals √(q q̄). If |q| = 1 we say that q is a unit quaternion.

Proposition K.9. For any p, q ∈ H we have

|pq| = |p| · |q|.

In particular, left and right multiplication by unit quaternions are isometries.

Proposition K.10. H is a skew field. The inverse of q ∈ H \ {0} is

q⁻¹ = q̄ / |q|² .

We identified H with E4 . Next, we view E3 as a subspace of H, identifying it with the purely
imaginary quaternions ℑ(H) = Ri + Rj + Rk.

Proposition K.11. Let q1 , q2 be two quaternions with ai = ℜqi , vi = ℑqi . Making use of the scalar
product and the cross product in E3 , we have

q1 q2 = (a1 + v1 )(a2 + v2 ) = a1 a2 − ⟨v1 , v2 ⟩ + a2 v1 + a1 v2 + v1 × v2 . (K.2)

Proposition K.12. Let v = vi i + vj j + vk k ∈ D(E3 ) ≅ ℑ(H) be a unit quaternion and p ∈ E3 ≅ ℑ(H) a
point. The rotation of p around the axis Rv by an angle θ is given by

p′ = q p q⁻¹

where

q = cos(θ/2) + sin(θ/2) v.
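Proposition K.12 can be verified numerically. The sketch below (ours; it reuses the Hamilton product formula (K.1)) rotates the point (1, 0, 0) a quarter turn about the z-axis by conjugating with q:

```python
import math

def hamilton(p, q):
    # Hamilton product (K.1) on 4-tuples (a, b, c, d) for a + bi + cj + dk
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + a2*b1 + c1*d2 - c2*d1,
            a1*c2 + a2*c1 - b1*d2 + b2*d1,
            a1*d2 + a2*d1 + b1*c2 - b2*c1)

def rotate(point, axis, theta):
    # q = cos(theta/2) + sin(theta/2) v, with v the unit axis
    s = math.sin(theta / 2)
    q = (math.cos(theta / 2), s * axis[0], s * axis[1], s * axis[2])
    q_inv = (q[0], -q[1], -q[2], -q[3])   # for |q| = 1 the inverse is q-bar
    r = hamilton(hamilton(q, (0.0, *point)), q_inv)
    return r[1:]                          # the purely imaginary part

# a quarter turn about the z-axis sends (1, 0, 0) to (0, 1, 0)
x, y, z = rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2)
assert abs(x) < 1e-12 and abs(y - 1.0) < 1e-12 and abs(z) < 1e-12
```

Note that q and −q give the same rotation, consistent with the angle θ/2 appearing in q.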

Bibliography

[1] C. Alsina, B. Nelsen, A mathematical space odyssey. Solid geometry in the 21st century. The
Dolciani Mathematical Expositions, 50. Mathematical Association of America, Washington, DC,
2015.
[2] J. Benítez, A unified proof of Ceva and Menelaus’ theorems using projective geometry. J. Geom.
Graph. 11 (2007), no. 1, 39–44.
[3] M. de Berg, O. Cheong, M. van Kreveld, M. Overmars, Computational geometry. Algorithms
and applications. Third edition. Springer-Verlag, Berlin, 2008.
[4] M. Berger, Geometry I. Translated from the 1977 French original by M. Cole and S. Levy. Fourth
printing of the 1987 English translation Universitext. Springer-Verlag, Berlin, 2009.
[5] P.A. Blaga, Geometrie liniară. Cu un ochi către grafica pe calculator, Vol. I, Presa Universitară
Clujeană, 2022.
[6] C.B. Boyer, A history of mathematics. Second edition. Edited and with a preface by Uta C.
Merzbach. John Wiley & Sons, Inc., New York, 1989.
[7] H.S.M. Coxeter, Regular polytopes. Methuen, London, 1948.
[8] H.S.M. Coxeter, S.L. Greitzer, Geometry revisited. New Mathematical Library, 19. Random
House, Inc., New York, 1967.
[9] S. Crivei, Basic Linear Algebra. Presa Universitară Clujeană, 2022.
[10] Euclid. The thirteen books of Euclid’s Elements translated from the text of Heiberg. Vol. I: In-
troduction and Books I, II. Vol. II: Books III–IX. Vol. III: Books X–XIII and Appendix. Translated
with introduction and commentary by Thomas L. Heath. 2nd ed. Dover Publications, Inc., New
York, 1956.
[11] A.E. Fekete, Real linear algebra. Monographs and Textbooks in Pure and Applied Mathematics,
91. Marcel Dekker, Inc., New York, 1985.


[12] I. Grattan-Guinness, The search for mathematical roots, 1870–1940. Logics, set theories and
the foundations of mathematics from Cantor through Russell to Gödel. Princeton Paperbacks.
Princeton University Press, Princeton, NJ, 2000.

[13] J. Hefferon, Linear Algebra, 2020.

[14] D. Hilbert, Foundations of Geometry. Second English edition. Translated by Unger L. from 10th
ed. Revised and Enlarged by Bernays P. Open Court, Illinois, 1971.

[15] L. Hodgkin, A history of mathematics. From Mesopotamia to modernity. Oxford University


Press, Oxford, 2005.

[16] M. Nechita, Lecture notes for mathematical analysis, 2025.

[17] C.C. Pugh, Real mathematical analysis. Second edition. Undergraduate Texts in Mathematics.
Springer, Cham, 2015.

[18] W. Rudin, Principles of mathematical analysis. Third edition. International Series in Pure and
Applied Mathematics. McGraw-Hill Book Co., New York-Auckland-Düsseldorf, 1976.

[19] E. Sernesi, Linear Algebra. A geometric approach. Translated by Montaldi J., Chapman &
Hall/CRC, 1993.

[20] Handbook of discrete and computational geometry. Second edition. Edited by Jacob E. Good-
man and Joseph O’Rourke. Discrete Mathematics and its Applications (Boca Raton). Chapman
& Hall/CRC, Boca Raton, FL, 2004.

[21] Lean 4, Programming Language and Theorem Prover, https://lean-lang.org/

[22] The mathlib Community: The Lean mathematical library. In: Proceedings of the 9th ACM SIG-
PLAN International Conference on Certified Programs and Proofs. p. 367–381. CPP 2020, New
York, NY, USA (2020).

[23] A mathlib overview, https://leanprover-community.github.io/mathlib-overview.html

