Geometry
Draft 2025
Contents
1 Affine space 5
1.1 Geometric Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Vector space structure of geometric vectors . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3 Affine space structure of the Euclidean space . . . . . . . . . . . . . . . . . . . . . . . . 17
2 Cartesian coordinates 19
2.1 Frames in dimension 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Frames in dimension 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.3 Frames in dimension n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3 Affine subspaces 31
3.1 Lines in A2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2 Planes in A3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3 Lines in A3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.4 Affine subspaces of An . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4 Euclidean space 49
4.1 Angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.2 Scalar product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.3 Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6 Affine maps 87
6.1 Properties of affine maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
7 Isometries 99
7.1 Affine form of isometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
7.2 Isometries in dimension 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
7.3 Isometries in dimension 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.4 Moving points with isometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
10 Hyperquadrics 142
10.1 Hyperquadrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
10.2 Reducing to canonical form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
10.3 Classification of conics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
10.4 Classification of quadrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
Appendices 178
A Axioms 179
Bibliography 224
CHAPTER 1
Affine space
Contents
1.1 Geometric Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Vector space structure of geometric vectors . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3 Affine space structure of the Euclidean space . . . . . . . . . . . . . . . . . . . . . . . 17
Definition 1.1. We say that two ordered pairs of points (A, B) and (C, D) are equidistant, and we write
(A, B) ≡ (C, D) if and only if the segments [AB] and [CD] are congruent1 .
Proposition 1.2. The equidistance relation ≡ is an equivalence relation.
Proof. The equidistance relation is equivalent to the congruence relation on segments. We show that
the latter relation is an equivalence relation. We need to show reflexivity, symmetry and transitivity.
In order to show that the relation is reflexive, fix [AB] and construct a congruent segment [A′B′] using
Axiom III.1. Then, applying Axiom III.2 to the congruences [AB] ≡ [A′ B′ ] and [AB] ≡ [A′ B′ ] we obtain
[AB] ≡ [AB]. For symmetry, assume that [AB] ≡ [CD]. Applying Axiom III.2 to the congruences
[CD] ≡ [CD] and [AB] ≡ [CD] we obtain [CD] ≡ [AB] as desired. For transitivity, assume that [AB] ≡
[CD] and that [CD] ≡ [EF]. By symmetry we have [EF] ≡ [CD]. Then, applying Axiom III.2 to the
congruences [AB] ≡ [CD] and [EF] ≡ [CD] we obtain [AB] ≡ [EF] as desired.
Definition 1.3. The equivalence classes of the equidistance relation are called distances or lengths.
The distance defined by the pair (A, B) is denoted by |AB|. It is also called the length of the segment
[AB]. Explicitly, we have
|AB| = { ordered pairs (X, Y) such that (X, Y) ≡ (A, B) }.
1 The axioms sometimes refer to segments as being ‘congruent or equal’ which may suggest that congruence is reserved
for unequal segments. For us, if two segments are equal they are congruent.
We say that (A, B) represents the distance |AB| and that [AB] represents the length |AB|. If |AB| = |CD| we
say that [AB] and [CD] have the same length. The set of distances/lengths defined by segments in E is
the set of equivalence classes:
L = { |AB| : (A, B) ∈ E × E } = (E × E)/≡.
Next, we introduce the concept of direction. The Axioms of Order also allow us to define the
notion of sides of the line AB with respect to the point A. The points B, C ∈ AB are on the same side of
AB relative to A if A is not between B and C (see [14, p.8]). The set of points on the same side of a line
relative to A is also called a ray emanating from A. If it contains the point B, we denote it by (AB.
Lemma 1.4. For any segment [AB], any line ℓ and any point A′ ∈ ℓ there is a unique segment [A′ B′ ],
congruent to [AB], lying on ℓ on a given side relative to A′ .
Proof. The existence of the point B′ , and therefore of the segment [A′ B′ ], is guaranteed by Axiom
III.1. In order to prove uniqueness, suppose for a contradiction that there is another point B′′ on
the ray (A′ B′ such that [A′ B′ ] ≡ [A′ B′′ ]. Choose a point C ′ not on the line A′ B′ . The existence of
such a point is guaranteed by Axioms I.2 and I.8. We have [A′ C ′ ] ≡ [A′ C ′ ], [A′ B′ ] ≡ [A′ B′′ ] and
∡C ′ A′ B′ ≡ ∡C ′ A′ B′′ . Applying Axiom III.6 we obtain ∡A′ C ′ B′ ≡ ∡A′ C ′ B′′ which contradicts Axiom
III.4.
When doing geometry it is customary to avoid degenerate cases and overlaps. However, for our
purpose of building on the axioms, we take degenerate cases into account when defining directions.
This will open the way to linear algebra tools. Before giving the definition, let us recall that the
Axioms of Order allow us to define the notion of sides of a plane α with respect to a line contained
in α (see [14, p.8]).
Definition 1.5. We say that two ordered pairs of points (A, B) and (C, D) are equidirectional, and we write (A, B) ≙ (C, D), in the following three cases:
1. if A = B and C = D;
3. if A ≠ B, the points A, B, C are not collinear, AB and CD are parallel and B and D lie on the same
side of AC in the plane that the four points determine.
Proposition 1.6. Consider two pairs of distinct points (A, B) and (C, D).
1. If AB = CD then (A, B) ≙ (C, D) if and only if (AB ∩ (CD = (AB or (AB ∩ (CD = (CD.
2. If AB, CD are distinct but parallel then (A, B) ≙ (C, D) if and only if [AD] intersects [BC].
Proof. 1. If A = C then, by definition, we have (A, B) ≙ (C, D) if and only if B, D are on the same side
of A. Thus, by definition, we have (AB = (CD, hence (AB∩(CD = (AB = (CD. For the remaining cases
we make use of the following fact which intuitively is clear: (∗) four points on a line can always be
relabeled P1 , P2 , P3 , P4 such that P2 and P3 lie between P1 and P4 , and furthermore, that P2 lies between
P1 and P3 and P3 lies between P2 and P4 . This is Theorem 5 in [14].
Consider the case where A ≠ C. For the implication from left to right, assume first that B, C are
on the same side of A. Since C is between A and D, by (∗), for any point X on (CD we have C between
A and X, thus X ∈ (AC. Since (AC = (AB we showed that (AB ∩ (CD = (CD. Next assume that B, C
are on opposite sides of A. By (∗), for any X ∈ (AB we have A between C and X, thus X ∈ (CA. Since
A, D are on the same side of C we also have (CA = (CD, hence (AB ∩ (CD = (AB. The implication from
right to left is easier.
2. Now assume that the lines AB and CD are distinct and parallel. We show the implication from
left to right since the other direction is easier. Assume for a contradiction that B and C lie on the same
side of the line AD. Then (DC lies on the same side of AD as B. By assumption, since (A, B) ≙ (C, D),
we also have (CD on the same side of AC as B. By Axiom II.2, we may choose a point P between C
and D. Since P lies on both (CD and (DC it lies on the same side of AC and AD as B. Thus B lies in
the interior of the angle described by (AC and (AD. Hence AB intersects [CD], contradicting AB∥CD.
Thus B and C lie on opposite sides of AD, i.e. [BC] intersects AD. Interchanging the role of B, C with
A, D we find that [AD] intersects BC. Therefore [BC] intersects [AD].
Next, we prove transitivity along lines by treating the case when the four points are collinear.
By Proposition 1.6, we have (AB ∩ (CD = (AB or (CD and similarly, (EF ∩ (CD = (EF or (CD. In all
four cases, by Axiom V.1 we find a point P such that (AB = (AP , (CD = (CP and (EF = (EP . Then, if
(AP ∩ (CP = (AP and (EP ∩ (CP = (EP , we have A, E between C and P . It follows from Theorem 5 in
[14] that E lies between C and A or that A lies between C and E. In the first case (AP ∩ (EP = (EP
and in the second case (AP ∩ (EP = (AP . The other three cases are treated similarly. Thus, for the rest
of the proof we may assume that we consider pairs of distinct points and that the six points are not
collinear.
Consider the cases where two of the lines overlap. The cases where AB or EF equal CD follow
directly from transitivity along lines proved in the previous paragraph. It remains to consider the
case where AB = EF = ℓ. Since the six points are not collinear, the line CD is distinct from ℓ and
parallel to ℓ. We need to show that (A, B) ≙ (E, F). By transitivity along lines, we may replace (A, B)
and (E, F) by other equidirectional pairs of points on ℓ. Thus, we may assume that A = E. Then B and
F are on the same side of the line AC = EC since (A, B) ≙ (C, D) and since (E, F) ≙ (C, D). Hence B, F
are on the same side of A.
Finally, consider the case where the three lines AB, CD and EF are distinct. Again, by transitivity
along lines, we may shift the pairs of points along these three lines. Thus we may assume that C is the
intersection of AE with CD. Then B, D are on the same side of the line AC = AE since (A, B) ≙ (C, D) and D, F are on the same side of the line CE = AE since (E, F) ≙ (C, D). Thus B, F are on the same side
relative to the line AE.
Definition 1.8. The equivalence classes of the equidirectional relation are called directions. The
direction containing the ordered pair (A, B) is denoted by |AB⟩. Explicitly, we have
|AB⟩ = { ordered pairs (X, Y) such that (X, Y) ≙ (A, B) }.
We say that (A, B) is a representative of the direction |AB⟩. If |AB⟩ = |CD⟩ we say that (A, B) and (C, D)
define the same direction. The set of directions defined with points in E is the set of equivalence
classes:
D = { |AB⟩ : (A, B) ∈ E × E } = (E × E)/≙.
The direction |BA⟩ is called the opposite direction of |AB⟩ and is denoted by −|AB⟩. This defines an
involution −□ : D → D.
At the intersection of the equidistance relation and the equidirectional relation lies the equipol-
lence relation which is used to define vectors. Before defining this relation, we define degenerate
parallelograms first.
Definition 1.9. A parallelogram ABCD is an ordered quadruple of points (A, B, C, D) with the usual
property of having parallel opposite sides if the four points are not collinear. In this case, the labeling
is such that C, D are on the same side of AB and the points D, A are on the same side of BC.
If the four points are collinear we require that the segments [AC] and [BD] have the same mid-
point. In this case, we say that the parallelogram is degenerate.
Definition 1.10. Two ordered pairs of points (A, B) and (C, D) are called equipollent, and we write
(A, B) ∼ (C, D), if the segments [AD] and [BC] have the same midpoint.
Proposition 1.11. For two ordered pairs of points (A, B) and (C, D) the following statements are equivalent:
1. the segments [AD] and [BC] have the same midpoint;
2. (A, B) ∼ (C, D);
3. ABDC is a parallelogram;
4. [AB] ≡ [CD] and |AB⟩ = |CD⟩;
5. |AB| = |CD| and |AB⟩ = |CD⟩.
Proof. First notice that the equivalence (1. ⇔ 2.) follows directly from Definition 1.10 and that (4. ⇔
5.) is by Definition 1.1. Thus, it suffices to show (2. ⇒ 3.), (3. ⇒ 4.) and (4. ⇒ 2.).
(2. ⇒ 3.) Let (A, B) ∼ (C, D). Denote by M the common midpoint of the segments [AD] and [BC].
If A = B then [AD] and [AC] have M as common midpoint. Then, if A = C or A = D it follows that all four points coincide, hence ABDC is a parallelogram by definition. If on the other hand A ≠ C and A ≠ D then M lies between A and C and between A and D. From Lemma 1.4 it follows that C = D. Thus, for A = B we showed that ABDC is a degenerate parallelogram. A similar argument shows that ABDC is a parallelogram if C = D. For the rest of the proof we may assume that A ≠ B and that C ≠ D.
Assume further that A, B, C are collinear. Since M is the common midpoint of [AD] and [BC] we
have DM = AM = BC as lines. Thus, the four points are collinear and ABDC is a degenerate parallelogram by
definition.
Assume next that A ≠ B, C ≠ D and that A, B, C are not collinear. In this case the lines AB and CD
are distinct. Therefore, the common midpoint M of [AD] and [BC] cannot lie on any of these two
lines. Thus, the angles ∡CMD and ∡AMB are defined (see [14, p.11]). They are so-called vertical
angles (see [14, §6]) and it follows from [14, Theorem 14] that vertical angles are congruent. Thus,
by the first congruence theorem for triangles [14, Theorem 12], the triangles MCD and MBA are
congruent. In particular ∡MDC is congruent to ∡MAB and, again by [14, Theorem 14], they have
congruent supplementary angles. It then follows from [14, Theorem 30] that the lines AB and CD
are parallel. Similarly, one shows that AC and BD are parallel. Thus ABDC is a parallelogram.
(3. ⇒ 4.) Let ABDC be a parallelogram. Assume first that it is degenerate, i.e. the four points are
collinear. Let M be the common midpoint of [AD] and [CB]. First consider the case where A = B. If
C = D we are done since [AB] ≡ [CD] and |AB⟩ = |CD⟩. Assume for a contradiction that C ≠ D. Then
D, M are on the same side of A and C, M are on the same side of B = A. From Lemma 1.4 it follows
that C = D. A similar argument shows that if C = D then A = B and our claim follows. Thus, we may
assume that A ≠ B and C ≠ D. Assume now that A = C. Then D, M are on the same side of A and
B, M are on the same side of C = A. From Lemma 1.4 it follows that B = D and our claim follows.
Next, assume that A = D. Then M = A = D. Since M is the midpoint of [CB], we have that A = D
lies between C and B, hence |CD⟩ = |AB⟩. Moreover, since A = D is the midpoint of [CB] we have
[CD] ≡ [AB] by Lemma 1.4. We may assume for the rest of the proof that the four points are distinct.
Since M is the common midpoint of the segments [AD] and [BC], from Axiom III.3 we deduce
that [AB] ≡ [CD]. It remains to show that |AB⟩ = |CD⟩. Since M is the midpoint of [AD], the points
A, D lie on opposite sides of M on the line AD. Similarly for B and C. Thus, we have two possibilities:
either A and B are on the same side relative to M or A and C are on the same side relative to M. In
the first case, there are two possibilities: either B is between A and M or A is between B and M. If B
is between A and M, since M is the common midpoint of [AD] and [BC], it follows from Axiom III.3
that C lies between M and D. Therefore A and D are on opposite sides relative to C, i.e. (A, B) and
(C, D) define the same direction. The remaining three cases are treated similarly.
Now assume that ABDC is not degenerate. By definition, the opposite sides of ABDC are parallel.
By [14, Theorem 30], we have the following congruences ∡CDA ≡ ∡DAB and ∡DAC ≡ ∡ADB. By
the second congruence theorem for triangles [14, Theorem 13], the triangles ACD and DBA are
congruent, in particular [AB] ≡ [CD]. Moreover, by definition, the labeling of the vertices is such
that B and D lie on the same side of the line determined by A and C. Thus, since the sides are parallel,
the pairs (A, B) and (C, D) define the same direction.
(4. ⇒ 2.) Assume that [AB] ≡ [CD] and |AB⟩ = |CD⟩. Then AB is parallel to CD. If the four
points are on the same line ℓ, the configuration is degenerate. By definition, since (A, B) and (C, D)
define the same direction, we have two cases. In the first case B, C are on the same side of ℓ relative
to A and A, D are on distinct sides of ℓ relative to C. Assume that C is between B and D and let M be
the midpoint of [AD]. Then, by Axiom III.3 applied to M, A, B and M, D, C we have [MB] ≡ [MC], i.e.
M is also the midpoint of [BC]. The other cases are treated similarly. The only thing to pay attention
to is that Axiom III.3 requires segments which do not overlap.
Finally, assume that the four points are not collinear. Denote by M the intersection of the diago-
nals. It exists by Point 2. of Proposition 1.6. By [14, Theorem 30] and the second congruence theorem
for triangles [14, Theorem 13], the triangles MAB and MDC are congruent. Thus [AM] ≡ [MD] and
[BM] ≡ [MC], i.e. M is the common midpoint of [AD] and [BC].
Vectors are defined as equivalence classes of the equipollence relation (see Definition 1.13 below).
To this end, the following proposition shows that this relation is indeed an equivalence relation.
Proposition 1.12. The equipollence relation ∼ is an equivalence relation.
Proof. By Proposition 1.11 we have that (A, B) ∼ (C, D) if and only if |AB⟩ = |CD⟩ and [AB] ≡ [CD]. By Proposition 1.2 and Proposition 1.7, both the equidistance relation and the equidirectional relation
are equivalence relations. Thus, reflexivity, symmetry and transitivity for the equipollence relation
follow from the corresponding properties of these two relations.
Definition 1.13. The equivalence classes of the equipollence relation are called vectors. The vector
containing the ordered pair (A, B) is denoted by AB⃗. Explicitly, we have
AB⃗ = { ordered pairs (X, Y) such that (X, Y) ∼ (A, B) }.
The vector AA⃗ is called the zero vector and we denote it by 0⃗ or simply by 0 when there is no risk of confusion. Since all representatives of a vector AB⃗ define the same length |AB|, this will also be the length of the vector AB⃗ and we denote it by |AB⃗|. The vector BA⃗ is called the opposite of the vector AB⃗ and is denoted by −AB⃗. This defines an involution −□ : V → V.
The first observation about vectors (which we prove below) is that for any fixed but arbitrary
point O there is a 1-to-1 correspondence between points A and vectors OA⃗. In this correspondence, OA⃗ is called the position vector of A relative to O or, if it is clear from the context what O is, we simply
say the position vector of A.
Proposition 1.14. For any ordered pair of points (A, B) and any point O, there exists a unique point
X such that (A, B) ∼ (O, X).
Proof. Assume first that A, B and O are not collinear. By Axiom III.4 there is a line ℓ passing through
O and having the same angles with AO as AB. By [14, Theorem 22] the lines ℓ and AB cannot have a
point in common, i.e. they are parallel. By Axiom IV, the line ℓ is the unique line passing through O
which is parallel to AB. Similarly, there is a unique line ℓ ′ passing through B and which is parallel to
OA. Let X be intersection point of ℓ and ℓ ′ . Then ABXO is a parallelogram. Hence, by Proposition
1.11, we have (A, B) ∼ (O, X).
Now assume that A, B and O lie on a line ℓ. If O = A then [AB] = [OB], thus (A, B) ∼ (O, X) if
and only if X = B. If O ≠ A, consider the two sides in which A divides ℓ. If O and B are on the same
side, there exists a unique segment [OX] on ℓ congruent to [AB] and such that A and X are not on the
same side of ℓ relative to O (by Lemma 1.4). Thus (A, B) and (O, X) define the same direction, hence
(A, B) ∼ (O, X). If O and B are on different sides of ℓ relative to A, there exists a unique segment [OX]
on ℓ which is congruent to [AB] and such that A and X lie on the same side of ℓ relative to O (by
Lemma 1.4). As in the previous case, X is the unique point such that (A, B) ∼ (O, X).
Corollary 1.15. For any point O, the map φO : E → V defined by φO(A) = OA⃗ is a bijection.
Proof. Fix a vector CD⃗. We show that there is a point A such that φO(A) = CD⃗. By Proposition 1.14, there is a unique point A such that (C, D) ∼ (O, A), i.e. there exists a unique point A such that CD⃗ = OA⃗ = φO(A). The existence of A implies the surjectivity of φO and the uniqueness implies that
φO is injective. Thus, φO is bijective.
Remark. It is clear that if in the set of ordered pairs of points E × E we fix the first entry then we
obtain a bijection with E. What Corollary 1.15 is saying is that, starting from the Axioms, vectors do
not carry more information than two points do. So, why not simply work with pairs of points instead?
We are simply working with pairs of points to which we formally attached the concept of length and
direction.
Definition 1.16. Consider two vectors a and b. If we fix a point O then, by Proposition 1.14, there
is a unique point A such that a = OA⃗ and for the point A there exists a unique point X such that b = AX⃗. The sum of a and b is by definition the vector OX⃗ and we denote the sum by a + b.
Equivalently, for a fixed point O there are unique points A and B such that a = OA⃗ and b = OB⃗ and for the points O, A and B there is a unique point Y such that OAYB is a parallelogram. It follows that X = Y and therefore a + b = OY⃗ = OX⃗.
Proposition 1.17. Vector addition is well defined, i.e. the vector a + b does not depend on the choice of the point O in Definition 1.16.
Proof. Let a, b ∈ V and O, A, X be as in Definition 1.16. The sum maps (a, b) to a + b as in the definition.
It is a map □ + □ : V × V → V on equivalence classes which is defined using representatives. The
proposition claims that the definition does not depend on the choice of representatives, i.e. if we replace
O with a point distinct from O, then the above construction yields the same equivalence class.
Fix a point O′ ≠ O. Let A′ be the unique point such that a = O′A′⃗ and let X′ be the unique point such that b = A′X′⃗. We need to show that OX⃗ = O′X′⃗. Since O′A′⃗ = a = OA⃗, we have that OAA′O′
is a parallelogram. Similarly AXX ′ A′ is a parallelogram and it suffices to show that OXX ′ O′ is a
parallelogram.
If O = A, then O′ = A′ and OXX ′ O′ is a parallelogram since AXX ′ A′ is a parallelogram. The
cases where A = X or X = O are similar. Thus, we may assume that the points O, A, X are pairwise
distinct. Now, if O, A, X are collinear then [OX] is congruent to [O′ X ′ ] by Axiom III.3. Moreover, by
the construction in Definition 1.16, we have |OX⟩ = |O′X′⟩ and therefore OX⃗ = O′X′⃗.
Finally, assume that A ∉ OX. By the third congruence theorem for triangles [14, Theorem 18], we
have that the triangles OAX and O′A′X′ are congruent. In particular ∡AOX ≡ ∡A′O′X′. Moreover,
since the lines OA and O′ A′ are parallel, they meet OO′ in congruent angles (by [14, Theorem 30]).
From this we deduce that OX and O′ X ′ meet OO′ in congruent angles. Thus OX and O′ X ′ are
parallel. Since OO′ ∥AA′ and AA′ ∥XX ′ we also have OO′ ∥XX ′ . Thus OXX ′ O′ is a parallelogram.
Proposition 1.18. Vector addition is associative and commutative, the zero vector is a neutral element and every vector has an inverse; in other words, (V, +) is a commutative group.
Proof. Proposition 1.17 shows that addition is indeed an operation on vectors. Thus we may choose
convenient representatives if needed. Let a, b, c be three vectors. Addition is associative if (a+b)+c =
a + (b + c). Fix a point A. Then, by Proposition 1.14, there exist unique points B, C and D such
that AB⃗ = a, BC⃗ = b, CD⃗ = c. But then (AB⃗ + BC⃗) + CD⃗ = AC⃗ + CD⃗ = AD⃗ = AB⃗ + BD⃗ = AB⃗ + (BC⃗ + CD⃗), which proves associativity. The neutral element is 0⃗ = AA⃗ since AA⃗ + AB⃗ = AB⃗ = AB⃗ + BB⃗ = AB⃗ + AA⃗. The inverse element of AB⃗ is the opposite vector −AB⃗ = BA⃗ since AB⃗ + (−AB⃗) = AB⃗ + BA⃗ = AA⃗ = 0⃗. Finally, to show commutativity we notice the following. Since
a + b is constructed on the diagonal of a parallelogram with a and b represented on the sides, the
construction is symmetric and does not depend on the order of a and b in the sum.
Up to this point we have used the concept of length solely as equivalence classes given by the
congruence relation on segments (Definition 1.3). While this was enough for our discussion so far, in
what follows we need more structure on the set L of lengths. It is natural to ask about all the conse-
quences that the Axioms have on L. If we choose a segment [AB], a unit segment, one can deduce from
the axioms that the line AB can be identified with the set of real numbers R and that L can be iden-
tified with the set of non-negative real numbers R≥0 such that |AB| = 1. These identifications require
care. They are available in Appendix B for the interested reader (see in particular Theorem B.9).
The identification is necessary for example in the existence and uniqueness claim of the following
definition which uses Proposition B.11.
Definition 1.19. Assume that a unit segment was chosen. Consider a non-zero vector a = OA⃗ and a scalar x ∈ R. If x > 0, there is a unique point X on the ray (OA such that |OX| = x · |OA|. The multiplication of the vector a with the scalar x, denoted x · a (or simply xa), is given by
x · a = OX⃗ for a ≠ 0⃗, x > 0 and X as above,
x · a = −(|x| a) for a ≠ 0⃗, x < 0,
x · a = 0⃗ for a = 0⃗ or x = 0.
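For instance, 2 · a has the same direction as a and twice its length, while (−3) · a = −(3a) has three times the length of a and the opposite direction.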
Proposition 1.20. Assume that a unit segment was chosen. The multiplication of vectors with scalars
is well defined.
Proof. Let OA⃗ = O′A′⃗. For r > 0, construct X from O and A as in Definition 1.19 and similarly, construct X′ from O′ and A′. We need to show that OX⃗ = O′X′⃗. By construction OX⃗ and O′X′⃗ define the same direction, namely both define the same direction as OA⃗ = O′A′⃗. Thus, by Proposition 1.11,
it suffices to show that |OX| = |O′ X ′ |. Since r > 0, this is clear since |OX| = r · |OA| = r · |O′ A′ | = |O′ X ′ |.
If r < 0 the claim also follows since |XO| = |OX|. The case where r = 0 is trivially true.
Proposition 1.21. Assume that a unit segment was chosen. For a, b ∈ V and x, y ∈ R we have
1. 0 · a = 0⃗.
2. 1 · a = a
3. −1 · a = −a
4. (x + y) · a = x · a + y · a
5. x · (y · a) = (xy) · a
6. x · (a + b) = x · a + x · b
Proof. The first five claims follow from Proposition B.12 and Definition 1.19. We prove the last claim.
If x = 0 the statement follows from the first assertion. Moreover, we may assume that x > 0. Indeed,
if x < 0 we have (−x) · (−(a + b)) = (−x) · (−a) + (−x) · (−b) which follows from the case x > 0. Now, by
Proposition 1.17 and Proposition 1.20 the operations do not depend on the choice of representatives.
Let OAQB be a parallelogram such that a = OA⃗ and b = OB⃗. By Proposition B.11, there is a unique point A′ ∈ (OA such that xa = OA′⃗ and there is a unique point B′ ∈ (OB such that xb = OB′⃗. Moreover, we have x|a| = |OA′⃗| and x|b| = |OB′⃗|. Let Q′ be the unique point such that OA′Q′B′ is a parallelogram. We have OQ′⃗ = xa + xb and we need to show that x · OQ⃗ = OQ′⃗. The points O, Q, Q′ are collinear and the angles in the triangles OAQ and OA′Q′ are pairwise congruent by [14, Theorem 30]. It then follows from the proportionality of the sides of similar triangles (see Theorem 41 in [14]) that |OQ′| : |OQ| = |OA′| : |OA| = x.
Theorem 1.22. The set of vectors V with vector addition and scalar multiplication is a vector space.
Proof. For the axioms of a vector space see for example [13, Definition 1.1]. They follow directly
from Proposition 1.18 and Proposition 1.21.
Lemma 1.23. Let A, B, C be three non-collinear points and let π be the unique plane containing them.
Then, a point Q lies in π if and only if there exists a parallelogram AXY Q with X ∈ AB and Y ∈ AC.
Moreover, if such a parallelogram exists, it is unique.
Theorem 1.24. Assume that a unit segment was chosen. Let S be a subset of E and let O be a point
in S.
1. The set S is a line if and only if φO (S) is a 1-dimensional vector subspace.
2. The vectors OA⃗, OB⃗ are linearly dependent if and only if the points O, A, B are collinear.
3. The set S is a plane if and only if φO(S) is a 2-dimensional vector subspace.
4. The vectors OA⃗, OB⃗, OC⃗ are linearly dependent if and only if the points O, A, B, C lie in a common plane.
5. If S is a line or a plane then the vector subspace φO (S) is independent of the choice of O in S.
Proof. Assume that S is a line. By definition of the vector space operations (Definitions 1.16 and
1.19), φO (S) is stable under multiplication with scalars and vector addition. Moreover, by Proposi-
tion B.11, any two vectors represented on the line S are linearly dependent. Thus φO (S) is a vector
subspace of dimension 1.
For the other implication, assume that φO (S) is a 1-dimensional vector subspace and let e be a
basis vector. Let A be the unique point such that φO (A) = e. By the argument in the first paragraph,
φO (OA) is a 1-dimensional vector subspace of V. Since e is a basis vector for both φO (OA) and φO (S)
we have φO (OA) = φO (S) and since φO is bijective we have OA = S. For 2., notice that O, A, B are
collinear if and only if OB⃗ belongs to the 1-dimensional vector subspace φO(OA) which in turn is
equivalent to the two vectors being linearly dependent.
Assume now that S is a plane. Then there exist two other points A and B such that O, A, B are
non-collinear. By the above paragraph φO (OA) and φO (OB) are two distinct 1-dimensional vector
subspaces contained in φO (S). Now, for any point Q in the plane S there is a unique parallelogram
OXQY with X ∈ OA and Y ∈ OB (by Lemma 1.23). Thus OQ⃗ = OX⃗ + OY⃗. In other words, any vector
in φO (S) is a linear combination of vectors in φO (OA) and φO (OB), i.e. φO (S) is the vector space
generated by these two one dimensional subspaces. For the other implication assume that φO (S) is a
2-dimensional vector subspace. Let e1 , e2 be a basis of φO (S) and let A, B be the unique points such
that φO (A) = e1 and φO (B) = e2 . Then O, A, B are non-collinear. Let π be the unique plane containing
these three points. By the first part of the argument φO (π) is a 2-dimensional vector subspace. Since
it is included in φO (S) we must have φO (S) = φO (π). Since φO is bijective, we have S = π.
The proof for 4. and 5. is similar.
Definition 1.25. Because of the above theorem, there is an overlap in terminology which we accept.
We say that two vectors are collinear if they are linearly dependent. Moreover, since two vectors are
linearly dependent if and only if they have the same or opposite directions, we say that such vectors
are parallel. Similarly, we say that three vectors are coplanar if they are linearly dependent.
Proposition 1.26. The vector space V of geometric vectors satisfies dim V ≥ 3.
Proof. Assume for a contradiction that dim V ≤ 2. Then dim V cannot be 0 since by Axiom I.8 there
are at least 3 vectors in V. If dim V = 1 then by Theorem 1.24 there are no planes - this contradicts
Axiom I.3 which guarantees that there are at least 3 non-collinear points, i.e. there is at least one plane
by Axiom I.4. If dim V = 2, by Theorem 1.24, all points lie in one plane which is a contradiction with
Axiom I.8.
Moreover, with Corollary 1.15 we can define an ‘addition’ of vectors with points. For a vector a and
a point O there is a unique point X such that a = OX⃗, i.e. we have a so-called translation map
□ + □ : V × E → E given by a + O = X. (1.1)
We say that the vectors in V act on the set of points E by translations. This is one key observation which
allows us to use real vector spaces as an underlying model for the Euclidean space E. This is made
precise by the following definition.
Definition 1.27. A real affine space A is a triple (P, V, t) where P is a non-empty set whose elements are
called points, where V is a real vector space called direction space of A, and where t is a map V × P → P
called translation map which satisfies the following two axioms:
(AS1) for any two points A, B ∈ P there is a unique vector a ∈ V such that B = t(a, A);
(AS2) for all vectors a, b ∈ V and every point A ∈ P we have t(a + b, A) = t(a, t(b, A)).
The dimension of the affine space A is by definition the dimension of V and is denoted by dim A. The
set of points is rarely mentioned separately. When we refer to ‘points in A’ we mean elements of P.
When dealing with an affine space it is common for the vector space V not to be given a separate name. In that case, we denote it by D(A).
Remark. Notice that if we fix a point O ∈ A, by Axiom (AS1), for each point P ∈ A there is a unique
vector v such that P = t(v, O). This vector is called the position vector of P relative to O and is denoted
by OP⃗. This gives a bijection φO : A → V defined by φO(P) = OP⃗.
Theorem 1.28. The Euclidean space E has the structure of a real affine space of dimension dim E ≥ 3.
Proof. Considering the set P of points in E and the set of geometric vectors V, we observed in (1.1)
the existence of a translation map V × E → E using Corollary 1.15. This gives E the structure of a real
affine space by definition. The claim on the dimension follows from Proposition 1.26.
Example 1.29. The main, and in fact the only examples of finite dimensional real affine spaces are
the following. Every real vector space V is a real affine space over itself. Indeed, we may take the set
of points A to be V and the map
t : V × A → A, defined by t(v, P) = v + P.
We know that up to isomorphism there is a unique real vector space of dimension n. We write Vn for such a vector space and we know that Vn ≅ Rn. However, since Rn has a standard basis, we use the notation Vn in order to ignore the standard basis. Consequently, if dim V = n, we denote the
corresponding affine space by An .
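As a quick check, not spelled out in the text, that this map satisfies the two axioms of Definition 1.27: for points P, Q ∈ A = V the only vector v with t(v, P) = Q is v = Q − P, since v + P = Q determines v; and t(v, t(w, P)) = v + (w + P) = (v + w) + P = t(v + w, P), so translating by w and then by v is the same as translating by v + w.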
Remark. The concept of a real affine space unlocks all linear algebra tools for Euclidean geometry.
In this set-up we identify the set of points with V and view the elements of V in two distinct ways:
as points and as vectors which act on points. A vector space has an origin, the zero vector, but on a
line or in a plane all points are equal. One way of phrasing this is by saying that ‘an affine space is
nothing more than a vector space whose origin we try to forget about’ [4, Chapter 2].
Remark. The mathematical library mathlib [22] written in Lean [21] also uses the concept of affine
space as the underlying model for Euclidean geometry (see [23]).
CHAPTER 2
Cartesian coordinates
Contents
2.1 Frames in dimension 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.1.1 Coordinates as projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.1.2 Changing frames (an example) . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.1.3 Orientation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2 Frames in dimension 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2.1 Coordinates as projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2.2 Changing frames (an example) . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2.3 Orientation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3 Frames in dimension n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3.1 Algorithm for changing frames . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.3.2 Orientation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Coordinates are tuples of numbers associated to points in a given object or space. For an object
or space S, we seek a subset C ⊆ Rn and a bijective map C → S that establishes a correspondence
(x1 , . . . , xn ) ↔ P between coordinates and points. Typically, there are many choices of such maps, and
the specific choice determines the level of control you have over the object or space S. This choice
often depends on the intended purpose or application involving S. In this chapter, we focus exclu-
sively on Cartesian frames, which are named after René Descartes1 . For other types of coordinate
systems we refer to Appendix D.
Here is a passage from [15, Section 6.8]: “in the eighteenth century, historians of mathemat-
ics (French ones, in particular) considered Descartes the revolutionary who had freed them from
bondage to the tedious methods of the ancient Greeks, by reducing hard geometric problems to
1 1596–1650
simple algebraic ones. This is a view which is often now regarded with some suspicion, although
Descartes himself promoted it [. . . ] In any case, his project was [. . . ] specific: the relation of ge-
ometry and algebra. A standard modern textbook criticizes Descartes for not being more practical
[. . . ] This criticism is interesting, but, I think, misplaced. Coordinate geometry even today is not
‘intrinsically’ practical – even the statistician who studies whether points in a scatter graph lie near
a straight line y = ax + b, let alone the geometer who wishes to picture the curve y 2 = x3 + x2 (Fig. 2)
are not thinking as surveyors or geographers. On the other hand, for some practical tasks, the new
ideas were very well adapted, as Newton and Leibniz were to understand.”
It should be clear that Descartes did not consider the concept of affine space in the form that we
introduced it in the previous chapter. However, the main idea is the same. Examples of classical
theorems which can be proved using Cartesian frames and algebraic manipulations are given in
Appendix F.
[Figure: the point P(4.2, 3.8) with position vector OP⃗ = 4.2 i + 3.8 j.]
This establishes a bijection E2 ↔ R2 between points in the plane and ordered pairs of real num-
bers. This bijection arises as the composition of two bijections:
1. The bijection φO : E2 → V2, which assigns to each point P its position vector OP⃗.
2. The bijection V2 → R2 which assigns to each vector its pair of components with respect to the basis (i, j).
Definition 2.1. A frame in E2 is a pair K = (O, B), where O is a point in E2 and B = (i, j) is a basis of V2 .
A frame is also referred to as a Cartesian coordinate system, or a Cartesian frame. We use the shorter
term for convenience. Given a frame K, the unique pair (xP , yP ) in (2.1) is called coordinates of P with
respect to the frame K. In other words, the coordinates of P are the components of the position vector
of P relative to O in the basis B. We write PK (xP , yP ) when we want to indicate the coordinates. If it is
clear from the context what K is, we omit the subscript K and simply write P (xP , yP ). By convention,
the first coordinate is typically denoted x, and the second y. The line ℓ1 is called the x-axis, denoted
Ox, while the line ℓ2 is the y-axis, denoted Oy. The point O is referred to as the origin of the frame K.
For computational purposes, points PK (xP , yP ) and vectors a = ax i+ay j are identified with column
matrices, using their coordinates and components, respectively:
[P]K = (xP, yP)ᵀ ∈ R2 and [a]B = (ax, ay)ᵀ ∈ R2,
where (·)ᵀ indicates that the tuple is to be read as a column matrix.
Here, the subscript K indicates the frame relative to which P has the indicated coordinate, while the
subscript B indicates the basis in which the components of the vector a are expressed.
[Figure: the point P(4.2, 3.8) and its projection Prx(P) = (4.2, 0) on the x-axis.]
The map Prx is called the projection on Ox along Oy, and Pry is called the projection on Oy along Ox.
For vectors a = ax i + ay j, we have similar maps prx, pry : V2 → R defined by prx(a) = ax and pry(a) = ay.
The map prx is called the projection on the first component, and pry is called the projection on the second
component. From the definitions we immediately deduce the following identities
OPrx(P)⃗ = prx(OP⃗) i, OPry(P)⃗ = pry(OP⃗) j, OP⃗ = prx(OP⃗) i + pry(OP⃗) j and
[P]K = (xP, yP)ᵀ = (prx(OP⃗), pry(OP⃗))ᵀ = [OP⃗]B.
Remark. In the process of showing that E has the structure of an affine space, we made a choice, we
fixed a unit segment. If i and j are unit vectors, then for a point P as above, we have |OX| = xP , |OY | =
yP . In other words, the coordinates of P are the lengths of the sides of the parallelogram OXP Y .
We can achieve this in two steps: (a) first, we change the origin, i.e., we go from (O, B) to (O′ , B), and
(b) we change the direction of the coordinate axes, i.e. we go from (O′ , B) to (O′ , B ′ ). The first step is
simply a translation, while the second corresponds to the usual base change from linear algebra.
(a) Change the origin. (b) Change the direction of the axes.
In the first step, when going from K = (O, B) to K̃ = (O′ , B), we are looking for the components of the
position vector of A relative to O′ with respect to B. We find these components by noticing that
O′A⃗ = OA⃗ − OO′⃗ and therefore [O′A⃗]B = [OA⃗]B − [OO′⃗]B.
In the second step, when going from K̃ = (O′ , B) to K′ = (O′ , B ′ ), we are looking for the components
of the position vector of A relative to O′ with respect to B ′ . From linear algebra, we know that this
is done with the base change matrix (see Appendix C). Let MB ′ ,B be the base change matrix from the
basis B to the basis B ′ . Then,
[OA⃗]B′ = MB′,B [OA⃗]B.
Thus, composing the two operations, (a) and (b), we obtain
[A]K′ = [O′A⃗]B′ = MB′,B · [O′A⃗]B = MB′,B · ([OA⃗]B − [OO′⃗]B). (2.3)
Now, suppose that A has coordinates (1, 2) with respect to the frame K. Since [OO′⃗]B = [O′]K, by (2.2), we already know the terms in the parentheses on the right-hand side of (2.3). Furthermore, from our assumptions (2.2), it is easy to write down the matrix MB,B′ and then MB′,B = (MB,B′)⁻¹. Thus
[A]K′ = [[−2, 1], [1, 2]]⁻¹ · ((1, 2)ᵀ − (7, −1)ᵀ) = (1/(−5)) · [[2, −1], [−1, −2]] · ((1, 2)ᵀ − (7, −1)ᵀ) = (3, 0)ᵀ,
where a matrix [[r1], [r2]] is listed row by row.
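As a quick numerical cross-check of this computation (a sketch in Python, not part of the original text; it assumes numpy and takes the matrix and coordinates exactly as they appear above):

    import numpy as np

    # Data of the example above: M_{B,B'} (its columns are [i']_B and [j']_B), the
    # coordinates [A]_K of the point A and [O']_K of the new origin, all relative to K.
    M_B_Bprime = np.array([[-2.0, 1.0],
                           [ 1.0, 2.0]])
    A_K = np.array([1.0, 2.0])
    Oprime_K = np.array([7.0, -1.0])

    # Formula (2.3): [A]_{K'} = M_{B',B} ([A]_K - [O']_K), with M_{B',B} the inverse of M_{B,B'}.
    A_Kprime = np.linalg.solve(M_B_Bprime, A_K - Oprime_K)
    print(A_Kprime)  # [3. 0.], the coordinates of A relative to K'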
2.1.3 Orientation
Consider the bases E = (e1 , e2 ) and F = (f1 , f2 ) of V2 . Represent all four vectors in a common point O
and rotate the basis F such that the vector f1 points in the same direction as the vector e1 , i.e. such
that f1 = λe1 for some positive scalar λ.
The line passing through O with direction vector e1 (and also f1 ) separates the plane E2 in two
half-planes. Considering the positioning of the second vectors, two things can happen: e2 and f2
point towards the same half-plane or they point towards different half-planes.
Proposition 2.3. In the first case (e2 and f2 point towards the same half-plane) we have det(ME,F) > 0 and in the second case (they point towards different half-planes) we have det(ME,F) < 0.
Proof. In the above process, we changed the basis F = (f1, f2) to the basis F′ = (f′1, f′2) with a rotation.
So, the base change matrix MF ′ ,F is a 2 × 2-rotation matrix which has determinant equal to 1 (For
more on rotations see Chapter 7). Now, considering the coordinates of the vectors in F ′ with respect
to E, we have f′1 = (λ, 0) and f′2 = (a, b). Thus, we notice that the vectors e2 and f′2 point towards the same half-plane if and only if b > 0. If we now calculate the above determinant, we obtain
det(ME,F) = det(ME,F′ · MF′,F) = det(ME,F′) · det(MF′,F) = det([[λ, a], [0, b]]) · 1 = λ · b.
Since λ > 0, the sign of det(ME,F) is the sign of b, and the claim follows.
Definition 2.4. In the first case of Proposition 2.3 we say that E and F have the same orientation
and in the second case we say that E and F have opposite orientation.
Why is this relevant? Besides giving a geometric interpretation of the sign of the determinant det(ME,F), the notion of orientation also allows us to understand some signs which appear in calculations of
areas (see Chapter 5). Moreover, the trigonometry that we know to hold true in E2 implicitly builds
on the notion of oriented Euclidean plane E2 . Mathematically, the distinction in orientation is only
a matter of keeping track of the signs of some determinants. However, in relation to the physical
world this distinction is more concrete.
Definition 2.5. Let (i, j) be a basis of V2 represented in a common point O ∈ E2 such that i = OX⃗ and j = OY⃗. Rotate the plane such that i points downwards. If Y is in the right half-plane determined
by the line OX, then we say that the basis (i, j) is right oriented. If Y lies in the left half-plane, we say
that the basis (i, j) is left oriented. A coordinate system (O, i, j) is left or right oriented if the basis (i, j) is
left respectively right oriented.
Fixing an orientation in the Euclidean plane E2 is equivalent to choosing a coordinate system
K = (O, B) and calling it right oriented. Then, all other bases of V2 either have the same orientation
as B, in which case they are also called right oriented, or they have opposite orientation, in which
case they are called left oriented. When it comes to a concrete configuration of points, on a sheet of
paper for instance, such a choice can be made with the right-hand rule. Once we have such a choice,
E2 is called oriented. In other words, the oriented plane E2 is the usual Euclidean plane together with
a choice of which of the two opposite classes of bases contains the ‘preferred’ bases.
OP⃗ = prx(OP⃗) i + pry(OP⃗) j + prz(OP⃗) k and [P]K = (xP, yP, zP)ᵀ = (prx(OP⃗), pry(OP⃗), prz(OP⃗))ᵀ = [OP⃗]B.
The process of translating from one Cartesian frame to another is made precise in Section 2.3. Here,
we illustrate this process with an example. Let K = (O, B) and K′ = (O′ , B ′ ) be two frames in E3 with
B = (i, j, k) and B ′ = (i′ , j′ , k′ ). Assume that O′ , i′ , j′ and k′ are known relative to K:
[O′]K = (4, 5, −1)ᵀ, i′ = −i − 2j = (−1, −2, 0)ᵀ, j′ = −2i + j = (−2, 1, 0)ᵀ, k′ = j + 2k = (0, 1, 2)ᵀ.
In any dimension, the coordinates with respect to one frame can be obtained from the coordinates
with respect to another frame in two steps.
(a) Change the origin. (b) Change the direction of the axes.
Let B be the point with coordinates (1, 5, 1) relative to K. The argument used for dimension 2 (Section
2.1.2) literally translates to our 3-dimensional setting and we have
[B]K′ = (MK,K′)⁻¹ · ([B]K − [O′]K) = [[−1, −2, 0], [−2, 1, 1], [0, 0, 2]]⁻¹ · ((1, 5, 1)ᵀ − (4, 5, −1)ᵀ)
= (1/10) · [[−2, −4, 2], [−4, 2, −1], [0, 0, 5]] · ((1, 5, 1)ᵀ − (4, 5, −1)ᵀ) = (1, 1, 1)ᵀ.
2.2.3 Orientation
Consider the bases E = (e1 , e2 , e3 ) and F = (f1 , f2 , f3 ) of V3 . Represent all six vectors in a common
point O and rotate the basis F such that the plane passing through O in the direction of f1 , f2 coin-
cides with the plane π passing through O in the direction of e1 , e2 . If in the plane π the bases (f1 , f2 )
and (e1 , e2 ) have opposite orientation, flip the vectors (f1 , f2 ) with a rotation such that they end up
having the same orientation with (e1 , e2 ). Any rotation with 180◦ around a line in the plane π which
passes through the origin will work and such a rotation has determinant equal to 1 (see Chapter 7).
Then, the plane π separates the space E3 in two half-spaces. Considering the positioning of the
third vectors, two things can happen: e3 and f3 point towards the same half-space or they point to-
wards different half-spaces. How can we tell the two cases apart? A similar argument as in dimension
2 shows that the sign of det(ME,F ) gives the answer. This fact is true in any dimension (see Section
2.3.2). In dimension 3, the orientation of a basis explains some signs which appear in calculations of
volumes (see Chapter 5). In relation to the physical world this distinction is more concrete.
Definition 2.7. Let (i, j, k) be a basis of V3 represented in a common point O ∈ E3 such that k = OZ⃗.
We say that the basis is right oriented if (i, j) is a right oriented basis of the plane Oxy when observed
from the point Z. We say that the basis is left oriented if (i, j) is a left oriented basis when observed
from the point Z. A coordinate system (O, i, j, k) is left or right oriented if the basis (i, j, k) is left
respectively right oriented.
There are many equivalent ways of deciding if a basis of V3 is left or right oriented. The Swiss
liked the three-finger rule so much, they put it on their 200-franc banknotes.
is. The point O is the origin of the frame K. For computational purposes, points P (x1 , . . . , xn ) and vec-
tors a = a1 i1 + · · · + an in are identified with column matrices, using their coordinates and components,
respectively:
[P]K = (x1, . . . , xn)ᵀ ∈ Rn and [a]B = (a1, . . . , an)ᵀ ∈ Rn.
Definition 2.9. For vectors a = a1 i1 +· · ·+an in in D(An ) we have projection maps pr1 , . . . , prn : D(An ) → R
defined by pr1 (a) = a1 , . . . , prn (a) = an . From the definition we see that for a point P (x1 , . . . , xn ) we have
[P]K = (x1, . . . , xn)ᵀ = (pr1(OP⃗), . . . , prn(OP⃗))ᵀ = [OP⃗]B.
A frame K allows us to identify the set of points in An with Rn . However, an n-tuple of numbers
has no geometric meaning in the absence of a frame. Moreover, with respect to different frames,
points have different coordinates. The process of translating from one Cartesian frame to another is
made precise with the following theorem.
Theorem 2.10. Let K = (O, B) and K′ = (O′, B′) be two frames in An. For any point P ∈ An we have
[P]K′ = MB′,B · ([P]K − [O′]K) = (MB,B′)⁻¹ · ([P]K − [O′]K) = MB′,B · [P]K + [O]K′. (2.5)
Proof. Since the coordinates of a point are the components of its position vector, Equation (2.5) is
equivalent to:
[O′P⃗]B′ = MB′,B · ([OP⃗]B − [OO′⃗]B) = (MB,B′)⁻¹ · ([OP⃗]B − [OO′⃗]B) = MB′,B · [OP⃗]B + [O′O⃗]B′. (2.6)
Since [O′P⃗]B = [OP⃗]B − [OO′⃗]B and since MB′,B is the base change matrix from B to B′, the first equality follows. The second equality follows from the property that MB′,B = (MB,B′)⁻¹ (see Appendix
C). The last equality is obtained by opening the parentheses and noticing that
(MB,B′)⁻¹ · [OO′⃗]B = [OO′⃗]B′ = −[O′O⃗]B′.
2.3.1 Algorithm for changing frames
Given [P]K and the data of the new frame K′ = (O′, B′) relative to K, the coordinates [P]K′ can be computed as follows (a code sketch is given below).
1. Construct the base change matrix MB,B′ by placing [i′1]B, . . . , [i′n]B in the columns of the matrix.
2. Invert it to obtain MB′,B = (MB,B′)⁻¹.
3. Compute [P]K′ = MB′,B · ([P]K − [O′]K), as in Theorem 2.10.
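The following Python sketch carries out these steps (the function name and argument layout are choices made here, and numpy is assumed); it is one possible way of organizing the computation, not the only one:

    import numpy as np

    def change_frame(P_K, Oprime_K, new_basis_in_B):
        """[P]_{K'} from [P]_K, for K' = (O', B') given relative to K = (O, B).

        P_K            : coordinates [P]_K of the point,
        Oprime_K       : coordinates [O']_K of the new origin,
        new_basis_in_B : the components [i'_1]_B, ..., [i'_n]_B of the new basis vectors.
        """
        # Step 1: base change matrix M_{B,B'} with the new basis vectors as columns.
        M = np.column_stack(new_basis_in_B)
        # Steps 2 and 3: [P]_{K'} = M_{B,B'}^{-1} ([P]_K - [O']_K), as in Theorem 2.10.
        return np.linalg.solve(M, np.asarray(P_K, float) - np.asarray(Oprime_K, float))

    # The 3-dimensional example of Section 2.2.2:
    print(change_frame([1, 5, 1], [4, 5, -1], [[-1, -2, 0], [-2, 1, 0], [0, 1, 2]]))
    # prints [1. 1. 1.]

Using np.linalg.solve instead of explicitly inverting MB,B′ is numerically preferable but computes the same coordinates.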
2.3.2 Orientation
Definition 2.11. Two bases E and F of the space Vn of geometric vectors are said to have the same
orientation if det(ME,F ) > 0. They have opposite orientation if det(ME,F ) < 0. Two frames K and K′
have the same orientation if their bases have the same orientation and they have opposite orientation
otherwise.
We say that the Euclidean space En is oriented if there is a choice of a frame K = (O, B) which is
called right oriented. Then, all other frames of En with the same orientation as K are also called right oriented and all frames with the opposite orientation are called left oriented.
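In computations, the sign test of Definition 2.11 is straightforward once both bases are expressed in a common basis. The following Python sketch (assuming numpy; the helper name is chosen here) implements it:

    import numpy as np

    def same_orientation(E, F):
        """True if the bases E and F have the same orientation (Definition 2.11).

        E, F : lists of basis vectors, all expressed in one common basis.
        """
        # If the columns of two matrices hold the components of E and F in a common basis,
        # then det(M_{E,F}) equals the determinant of the F-matrix divided by that of the
        # E-matrix, so its sign is the sign of the product of the two determinants.
        return np.linalg.det(np.column_stack(E)) * np.linalg.det(np.column_stack(F)) > 0

    # Swapping the two vectors of the standard basis of V2 reverses the orientation:
    print(same_orientation([[1, 0], [0, 1]], [[0, 1], [1, 0]]))  # False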
Remark. Unless otherwise stated, whenever we consider a frame K = (O, B) of En , we will assume
that it is right oriented and that En is therefore an oriented Euclidean space.
CHAPTER 3
Affine subspaces
Contents
3.1 Lines in A2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.1.1 Parametric equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.1.2 Cartesian equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.1.3 Relative positions of two lines in A2 . . . . . . . . . . . . . . . . . . . . . . . . 34
3.2 Planes in A3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2.1 Parametric equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2.2 Cartesian equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2.3 Relative positions of two planes in A3 . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 Lines in A3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3.1 Parametric equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3.2 Cartesian equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3.3 Relative positions of two lines in A3 . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.4 Relative positions of a line and a plane in A3 . . . . . . . . . . . . . . . . . . . 41
3.4 Affine subspaces of An . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.4.1 Hyperplanes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.4.2 Lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.4.3 Relative positions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.4.4 Changing the reference frame . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.1 Lines in A2
Let E2 denote an arbitrary plane. It is a 2-dimensional real affine space. In order to emphasize the
fact that we treat E2 as an affine space only we denote it by A2 . In this setting, a line in A2 is a set of
points S such that the set of vectors which can be represented by points in S form a 1-dimensional
vector subspace of V2 (see Theorem 1.24). In terms of the map φ²Q : A2 → V2 which identifies points with vectors when a point Q ∈ A2 is fixed, the subset S ⊆ A2 is a line if and only if for a point Q ∈ S
φ²Q(S) = { QP⃗ : P ∈ S } is a 1-dimensional vector subspace of V2.
It is not difficult to see that if the above description holds for one point Q ∈ A2, it holds for any point Q ∈ A2. If S is a line, we call the vector subspace φ²Q(S) of V2 the direction space of the line S and denote it D(S).
In this description, the point Q ∈ S is arbitrary but fixed. If we want to emphasize that the description
depends on fixing Q, we refer to this point as the base point of the line. Moreover, for any point O ∈ A2
we may split the vector QP⃗ in the equation QP⃗ = tv to obtain
OP⃗ = OQ⃗ + tv. (3.1)
So, we can describe the line S as the set of points P in A2 which satisfy Equation (3.1) for some
t ∈ R. This equation is called the vector equation of the line S relative to O, having base point Q and
direction vector v, or simply a vector equation of the line S. If v = QA⃗, this equation also describes the
segment [QA] if we restrict the parameter to t ∈ [0, 1]. Moreover, for t > 0 and t < 0 we obtain two
rays emanating from Q.
Notice that a vector equation depends on the choice of the base point Q and on the choice of the direction vector v. In particular, a line does not have a unique vector equation. Notice also that the vector equation does not depend on the coordinate system. In the above description O can be any
point in A2 .
Now fix a frame K = (O, B). If we write Equation (3.1) in coordinates relative to K, we obtain
S : x = xQ + t·vx, y = yQ + t·vy or, in matrix form, S : (x, y)ᵀ = (xQ, yQ)ᵀ + t · (vx, vy)ᵀ (3.2)
where Q = Q(xQ, yQ), v = v(vx, vy) and where t is the parameter - for different values of t we obtain
different points (x, y) on the line. The two equations in the System (3.2) are called parametric equations
of the line S. Traditionally, they are written in the form of a system of equations as indicated on
the left. Writing them as one equation, as indicated on the right, is closer to the computational
perspective where we identify points with column matrices. Clearly, the two ways of writing such
parametric equations are equivalent.
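As a small illustration (the numbers are chosen here and are not from the text): the line with base point Q(1, 2) and direction vector v = 3i − j has parametric equations x = 1 + 3t, y = 2 − t. Eliminating the parameter, t = 2 − y gives x = 1 + 3(2 − y), i.e. the linear equation x + 3y − 7 = 0, which is a Cartesian equation of the kind discussed next.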
ax + by + c = 0 (3.4)
relative to a fixed coordinate system and any linear equation in two variables relative to a fixed
coordinate system describes a line if the constants a, b are not both zero. Moreover, if Equation (3.4)
describes the line ℓ relative to a coordinate system K = (O, B), then the direction space D(ℓ) of the
line is the 1-dimensional subspace of V2 which, relative to the basis B, satisfies the equation
D(ℓ) : ax + by = 0.
Equation (3.4) is called a Cartesian equation of the line which it describes. Notice that there are
infinitely many Cartesian equations describing the same line, since you can multiply one equation
by a non-zero constant. It is sometimes useful to rearrange the linear equation (3.4) in order to
emphasize some geometric properties. For example, you can rearrange it in the form
x/α + y/β = 1 where α = −c/a and β = −c/b.
In this form we have the equation of the line where we can read off the intersection points with the coordinate axes, since this line intersects Ox in (α, 0) and it intersects Oy in (0, β).
The Cartesian equation can also be written with a determinant:
det([[x − xQ, y − yQ], [vx, vy]]) = 0,
which says that a point P belongs to the line if QP⃗ is linearly dependent on v. In this form we may describe the two half-planes which are separated by the line with the inequalities
det([[x − xQ, y − yQ], [vx, vy]]) < 0 and det([[x − xQ, y − yQ], [vx, vy]]) > 0.
Indeed, any point P in the plane either lies on the line or (QP⃗, v) is a left oriented basis or (QP⃗, v) is
a right oriented basis.
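Continuing the illustration from above (Q(1, 2), v = 3i − j): det([[x − 1, y − 2], [3, −1]]) = −(x + 3y − 7), so the determinant vanishes exactly on the line x + 3y − 7 = 0 and the two inequalities single out the two half-planes x + 3y − 7 > 0 and x + 3y − 7 < 0.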
Discussing this system is basic linear algebra (see for example [9, Section 3.6]). In the plane the
situation is very simple:
• two lines intersect in a unique point, the coordinates of which are the solution to (3.5); or
• they don’t intersect and (3.5) doesn’t have solutions, in which case the lines are parallel; or
• the solutions to (3.5) depend on one parameter, in which case the two lines are equal.
A computational sketch of these cases is given below.
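The following Python sketch (with numpy) illustrates the three cases, assuming that (3.5) is the system formed by the two Cartesian equations a1 x + b1 y + c1 = 0 and a2 x + b2 y + c2 = 0; the function name and the coefficient values are illustrative only.

import numpy as np

def relative_position(a1, b1, c1, a2, b2, c2):
    M = np.array([[a1, b1], [a2, b2]], dtype=float)
    rhs = np.array([-c1, -c2], dtype=float)
    if abs(np.linalg.det(M)) > 1e-12:                       # unique intersection point
        return "incident", np.linalg.solve(M, rhs)
    # parallel direction spaces: either the same line or strictly parallel lines
    if np.linalg.matrix_rank(np.column_stack([M, rhs])) == 1:
        return "equal", None
    return "parallel", None

print(relative_position(1, -1, 0, 2, -2, 3))   # parallel
print(relative_position(1, 1, -2, 1, -1, 0))   # incident, intersection point (1, 1)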
3.2 Planes in A3
The usual Euclidean space E3 is a 3-dimensional real affine space. In order to emphasize the fact that
we treat E3 as an affine space only we denote it by A3 . A plane in A3 is a set of points S such that
the set of vectors which can be represented by points in S forms a 2-dimensional vector subspace of
V3 (see Theorem 1.24). Considering the bijection φQ3 : A3 → V3 for a point Q ∈ A3 , the subset S is a
plane if and only if for any Q ∈ S
φQ3 (S) = { QP : P ∈ S } is a 2-dimensional vector subspace of V3 .
It is not difficult to see that the above description does not depend on the point Q ∈ S. If S is a plane,
we call the vector subspace φQ3 (S) of V3 the direction space of the plane S and denote it D(S).
Fix a basis (v, w) of D(S), so that every point P ∈ S satisfies QP = sv + tw for suitable scalars s and t.
Now, if we fix Q and let P vary in the plane S then s and t vary in R. Since φQ3 : A3 → V3 is a bijection,
the plane S can be described as
S = { P ∈ A3 : QP = sv + tw for some s, t ∈ R }.
In this description, the point Q ∈ S is arbitrary but fixed. If we want to emphasize that the description
depends on fixing Q, we refer to this point as the base point. Moreover, for any point O ∈ A3 we may
split the vector QP in the equation QP = sv + tw to obtain
OP = OQ + sv + tw. (3.6)
So, we can describe the plane S as the set of points P in A3 which satisfy Equation (3.6) for
some s, t ∈ R. This equation is called the vector equation of the plane S relative to O, having base point
Q and direction vectors v and w, or simply a vector equation of the plane S. If v = QA , w = QC
and v + w = QB , this equation also describes the interior of the parallelogram QABC if we restrict
the parameters s, t to (0, 1). The interior of the triangle QAC is obtained if we further impose the
condition that s + t < 1. Moreover, for t > 0 and t < 0 we obtain two half-planes separated by the line
QA.
Notice that a vector equation depends on the choice of the base point Q and on the choice of the
vectors v and w. In analogy with the case of the line in A2 we may call such vectors direction vectors
for the plane S. In particular, a plane does not have a unique vector equation. Notice also that the
vector equation does not depend on the coordinate system. In the above description O can be any
point in A3 .
Now fix a frame K = (O, B). If we write Equation (3.6) in coordinates relative to K then we obtain
S : x = xQ + s vx + t wx
    y = yQ + s vy + t wy
    z = zQ + s vz + t wz
or, in matrix form, S : (x, y, z) = (xQ , yQ , zQ ) + s (vx , vy , vz ) + t (wx , wy , wz )    (3.7)
where Q = QK (xQ , yQ , zQ ), v = vK (vx , vy , vz ), w = wK (wx , wy , wz ). The values s and t are called parame-
ters and for different parameters we obtain different points (x, y, z) in the plane S. The three equations
in the System (3.7) are called parametric equations for the plane S.
Eliminating the parameters s and t in (3.7), one obtains for example
( vx /wx − vz /wz ) · ( (x − xQ )/wx − (y − yQ )/wy ) = ( vx /wx − vy /wy ) · ( (x − xQ )/wx − (z − zQ )/wz ). (3.8)
We will not give this equation a name, because it is a bit much to keep in mind, and one has to make
sense of what happens when the denominators are zero. We simply notice that it is a linear equation
in x, y and z and that it can be obtained by eliminating the parameters in (3.7).
There is an easier way of describing S with a linear equation. For this, you can interpret (3.7) as
saying that the vector QP is linearly dependent on the vectors v and w. With this in mind, the point
P (x, y, z) lies in the plane S if and only if
| x − xQ   y − yQ   z − zQ |
|   vx       vy       vz   |  = 0.    (3.9)
|   wx       wy       wz   |
In particular, considering QE in place of QP , v = QF and w = QG for some points Q, E, F and G, the four
points are coplanar if and only if
| xE − xQ   yE − yQ   zE − zQ |
| xF − xQ   yF − yQ   zF − zQ |  = 0.    (3.10)
| xG − xQ   yG − yQ   zG − zQ |
Notice that Equation (3.10) is just a restatement of the fact that four points Q, E, F and G are coplanar
if and only if the vectors QE , QF and QG are linearly dependent (see Theorem 1.24). Notice also
that if we replace the equals sign in (3.9) with inequalities, we describe the two half spaces separated
by this plane since, for a point P not in the given plane, ( QP , v, w) is a basis which is either left or
right oriented.
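A short numerical sketch of the coplanarity test (3.10), in Python with numpy; the function name and the four points are illustrative.

import numpy as np

def coplanar(Q, E, F, G, tol=1e-9):
    # rows are the components of QE, QF and QG relative to the chosen frame
    M = np.array([E, F, G], dtype=float) - np.array(Q, dtype=float)
    return abs(np.linalg.det(M)) < tol

print(coplanar((0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)))  # True: all four points lie in z = 0
print(coplanar((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)))  # False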
We have just described a plane with a linear equation (Equation (3.9)) relative to the coordinate
system K. The converse is also true.
Proposition 3.3. Every plane in A3 can be described with a linear equation in three variables
ax + by + cz + d = 0 (3.11)
relative to a fixed coordinate system and any linear equation relative to a fixed coordinate system in
three variables describes a plane if the constants a, b, c are not all zero. Moreover, if Equation (3.11)
describes the plane π relative to a coordinate system K = (O, B), then the direction space D(π) of the
plane is the 2-dimensional subspace of V3 which, relative to the basis B, satisfies the equation
D(π) : ax + by + cz = 0.
Proof. This is a particular case of Theorem 3.7.
Equation (3.11) is called a Cartesian equation of the plane it describes. Notice that there are
infinitely many Cartesian equations describing the same plane, since you can multiply one equation
by a non-zero constant. Here again it may be useful to rearrange the linear equation (3.11) in order
to emphasize some geometric properties. For example, you can rearrange it in the form
x/α + y/β + z/γ = 1     where α = −d/a ,  β = −d/b  and  γ = −d/c.
In this form we have the equation of the plane where we can read off the intersection points with the
coordinate axes since the plane intersects Ox in (α, 0, 0), it intersects Oy in (0, β, 0) and it intersects Oz
in (0, 0, γ).
Discussing this system is basic linear algebra (see for example [9, Section 3.6]). Here again, the
situation is very simple. Let M be the matrix of the system and M̃ the extended matrix of the system.
Then we have:
• two planes either intersect in a line, and the coordinates of the points on the line are the solutions
to (3.12); this happens if the rank of M and the rank of M̃ both equal 2; or
• they don’t intersect and (3.12) doesn’t have solutions, in which case the planes are parallel; this
happens if the rank of M is strictly less than the rank of M̃; or
• the solutions to System (3.12) depend on two parameters, in which case π1 = π2 ; this happens
if the rank of M and the rank of M̃ are both equal to 1.
These three cases are illustrated computationally below.
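The rank test above can be sketched numerically as follows (Python with numpy); the function name and the plane coefficients are illustrative, with each plane encoded as (a, b, c, d) for ax + by + cz + d = 0.

import numpy as np

def planes_position(p1, p2):
    a1, b1, c1, d1 = p1
    a2, b2, c2, d2 = p2
    M = np.array([[a1, b1, c1], [a2, b2, c2]], dtype=float)
    M_ext = np.column_stack([M, [-d1, -d2]])
    rM, rE = np.linalg.matrix_rank(M), np.linalg.matrix_rank(M_ext)
    if rM == 2:
        return "intersect in a line"
    if rM == 1 and rE == 2:
        return "parallel"
    return "equal"

print(planes_position((1, 0, 0, -1), (0, 1, 0, -1)))   # intersect in a line
print(planes_position((1, 1, 1, 0), (2, 2, 2, 5)))     # parallel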
3.3 Lines in A3
Here again we treat the usual Euclidean space E3 as a 3-dimensional real affine space and denote it
by A3 . As in the case of A2 , by Theorem 1.24, a line in A3 is a set of points S such that the set of
vectors which can be represented by points in S forms a 1-dimensional vector subspace of V3 . Hence,
the subset S is a line if for any Q ∈ S we have that
φQ3 (S) = { QP : P ∈ S } is a 1-dimensional vector subspace of V3 .
If S is a line, we denote by D(S) the vector subspace φQ3 (S) of V3 .
In this description, the point Q is arbitrary but fixed. If we want to emphasize that this description
depends on fixing Q, we refer to this point as the base point. Moreover, for any point O ∈ A3 we may
split the vector QP in the equation QP = tv to obtain
OP = OQ + tv. (3.13)
The image that goes with this description is the same as the one in dimension 2. The only difference
is that we interpret it in the 3-dimensional space A3 . So, again, we can describe the line S as the
set of points P in A3 which satisfy Equation (3.13) for some t ∈ R. This equation is called the vector
equation of the line S relative to O, having base point Q and direction vector v, or simply a vector equation
of the line S.
So far, the description of a line in A3 is ad litteram the one used for A2 . Now fix a frame K = (O, B).
If we write Equation (3.13) in coordinates relative to K then we obtain
x = xQ + t vx
y = yQ + t vy
z = zQ + t vz
or, in matrix form, (x, y, z) = (xQ , yQ , zQ ) + t (vx , vy , vz )    (3.14)
where Q = Q(xQ , yQ , zQ ), v = v(vx , vy , vz ) relative to K and where t is the parameter yielding different
points (x, y, z) on the line. The three equations in the system (3.14) are called parametric equations for
the line S.
Eliminating the parameter t, we obtain
(x − xQ )/vx = (y − yQ )/vy = (z − zQ )/vz . (3.15)
We refer to the Equations (3.15) as symmetric equations of the line S. It could happen that vx , vy or vz
are zero. In that case, translate back to the parametric equations to understand what happens.
We have just described a line with two linear equations (Equations (3.15)) relative to the coordi-
nate system K. The converse is also true.
Proposition 3.4. Every line in A3 can be described with two linear equations in three variables
a1 x + b1 y + c1 z + d1 = 0
a2 x + b2 y + c2 z + d2 = 0    (3.16)
relative to a fixed coordinate system and any compatible system of two linear equations of rank 2
in three variables relative to a fixed coordinate system describes a line. Moreover, if the Equations
(3.16) describe the line ℓ relative to a coordinate system K = (O, B), then the direction space D(ℓ) of
the line is the 1-dimensional subspace of V3 which, relative to the basis B, satisfies the equations
D(ℓ) : a1 x + b1 y + c1 z = 0
       a2 x + b2 y + c2 z = 0 .    (3.17)
The Equations (3.16) are called Cartesian equations of the line which they describe. Notice that
they describe a line as an intersection of two planes.
Discussing this system is basic linear algebra (see for example [9, Section 3.6]). It is somewhat easier
to discuss the relative positions of lines in A3 via their parametric equations:
ℓ1 : x = x1 + t vx ,  y = y1 + t vy ,  z = z1 + t vz
and
ℓ2 : x = x2 + s ux ,  y = y2 + s uy ,  z = z2 + s uz .
• if they are parallel and have a point in common then the two lines are equal;
• if they are not parallel then they are coplanar (they lie in the same plane) if
| x1 − x2   y1 − y2   z1 − z2 |
|    vx        vy        vz   |  = 0;
|    ux        uy        uz   |
• if they are not parallel and they don’t intersect, then we say that the two lines ℓ1 and ℓ2 are skew
relative to each other. A computational sketch of these cases is given below.
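The following Python sketch (with numpy) classifies the relative position of two lines in A3 from a point and a direction vector of each; the function name and the sample lines are illustrative.

import numpy as np

def lines_position(P1, v, P2, u, tol=1e-9):
    P1, v, P2, u = (np.array(x, dtype=float) for x in (P1, v, P2, u))
    if np.linalg.norm(np.cross(v, u)) < tol:                 # parallel direction vectors
        return "equal" if np.linalg.norm(np.cross(P2 - P1, v)) < tol else "parallel"
    coplanarity = np.linalg.det(np.array([P1 - P2, v, u]))   # the determinant from the text
    return "incident" if abs(coplanarity) < tol else "skew"

print(lines_position((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)))  # skew
print(lines_position((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 1, 0)))  # incident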
In order to see if a line and a plane intersect, let ℓ be the line through the point (x0 , y0 , z0 ) with direction vector v(vx , vy , vz ) and let π : ax + by + cz + d = 0 be a plane. We check which points of ℓ satisfy the equation of π:
a(x0 + tvx ) + b(y0 + tvy ) + c(z0 + tvz ) + d = 0 ⇔ (avx + bvy + cvz )t + ax0 + by0 + cz0 + d = 0. (3.19)
• avx + bvy + cvz = 0 and ax0 + by0 + cz0 + d ≠ 0, in which case Equation (3.19) has no solution, i.e.
the plane and the line don’t intersect, they are parallel; or
• avx + bvy + cvz = 0 and ax0 + by0 + cz0 + d = 0, in which case any t ∈ R is a solution to Equation
(3.19), i.e. the line is contained in the plane, in particular they are parallel; or
• avx + bvy + cvz ≠ 0, in which case Equation (3.19) has the unique solution
t = −(ax0 + by0 + cz0 + d)/(avx + bvy + cvz ),
i.e. the line and the plane intersect in exactly one point. A computational sketch of these cases follows.
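The discussion of (3.19) can be sketched numerically as follows (Python with numpy); the function name and the inputs are illustrative.

import numpy as np

def line_plane(P0, v, a, b, c, d, tol=1e-12):
    n = np.array([a, b, c], dtype=float)
    P0, v = np.array(P0, dtype=float), np.array(v, dtype=float)
    denom = n @ v                      # a*vx + b*vy + c*vz
    rest = n @ P0 + d                  # a*x0 + b*y0 + c*z0 + d
    if abs(denom) < tol:
        return "line contained in plane" if abs(rest) < tol else "parallel, no intersection"
    t = -rest / denom                  # the unique solution of (3.19)
    return P0 + t * v                  # the intersection point

print(line_plane((0, 0, 0), (0, 0, 1), 0, 0, 1, -2))   # [0. 0. 2.]
print(line_plane((0, 0, 0), (1, 0, 0), 0, 0, 1, -2))   # parallel, no intersection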
Proposition 3.6. An affine subspace of An is an affine space with the affine structure inherited from
An .
Proof. The proof is a simple matter of unpacking the definition of affine spaces (Definition 1.27).
The space An is a triple (P, V, t) where P is the set of points, V is an n-dimensional vector space and
Fixing a point O ∈ An , a point Q ∈ S and a basis (v1 , v2 , . . . , vd ) of D(S), it follows from the defini-
tion that S is a d-dimensional affine subspace if and only if
S = { P ∈ An : OP = OQ + t1 v1 + t2 v2 + · · · + td vd for some t1 , . . . , td ∈ R }. (3.20)
The equation in (3.20) is called the vector equation of the affine subspace S relative to O, having base
point Q and direction vectors v1 , v2 , . . . , vd , or simply a vector equation of S.
Fixing a coordinate system with origin O and translating the equation in (3.20) in coordinates,
one obtains parametric equations of the affine space S.
S : (x1 , x2 , . . . , xn ) = (q1 , q2 , . . . , qn ) + t1 (v1,1 , v1,2 , . . . , v1,n ) + · · · + td (vd,1 , vd,2 , . . . , vd,n ),    t1 , . . . , td ∈ R.    (3.21)
Another way of representing an affine subspace is by Cartesian equations (Equations (3.22)) as follows.
Theorem 3.7. Fix a coordinate system K = (O, B) in the affine space An . Let
a11 x1 + · · · + a1n xn = b1
 ...
at1 x1 + · · · + atn xn = bt    (3.22)
be a system of linear equations in the unknowns x1 , . . . , xn . The set S of points of An whose coordinates
are solutions to (3.22), if there are any, is an affine space of dimension d = n − r where r is the rank of
the matrix of coefficients of the system. The direction space D(S) is the vector subspace of Vn whose
equations relative to B are given by the associated homogeneous system
D(S) : a11 x1 + · · · + a1n xn = 0
        ...
       at1 x1 + · · · + atn xn = 0    (3.23)
Conversely, for every affine subspace S of An of dimension d there is a system of n−d linear equations
in n unknowns whose solutions correspond precisely to the coordinates of the points in S.
Proof. We follow the proof in [19, Section 8]. Denote by W the vector subspace defined by the ho-
mogeneous system (3.23). First we show that the set of solutions to (3.22) is an affine subspace of An
with the indicated properties. By assumption we only consider the cases where S ≠ ∅. Thus, we may
fix a point Q(q1 , . . . , qn ) ∈ S. Then, for any point P (p1 , . . . , pn ) belonging to S we have
aj1 (p1 − q1 ) + · · · + ajn (pn − qn ) = 0
for each j = 1, . . . , t. Thus, QP ∈ W. This shows that S is contained in the affine subspace T passing
through Q and parallel to W. Conversely, if R(r1 , . . . , rn ) ∈ T , then QR ∈ W and so, the components
(r1 − q1 , . . . , rn − qn ) of QR are solutions to (3.23). Thus,
aj1 r1 + · · · + ajn rn = bj
for each j = 1, . . . , t. That is, R ∈ S. Thus S = T , hence S is an affine subspace. Moreover, dim(S) =
dim(W) = n − r, where r is the rank of the matrix of coefficients of (3.23).
Next we show that an affine subspace S ⊆ An has a description by a linear system of the form
(3.22). Let S be any affine subspace of An with direction space W of dimension s. Being an s-
dimensional subspace of V, W can be described by a homogeneous system with n − s equations
W : a11 x1 + · · · + a1n xn = 0
     ...
    an−s,1 x1 + · · · + an−s,n xn = 0
Fix a point Q ∈ S. The points P (p1 , . . . , pn ) of S are characterized by the condition that QP ∈ W, i.e.
aj1 p1 + · · · + ajn pn = bj
where we have put bj = aj1 q1 + · · · + ajn qn . Thus, the points P (p1 , . . . , pn ) ∈ S are precisely those points
in An whose coordinates satisfy the equations
a11 x1 + · · · + a1n xn = b1
 ...
an−s,1 x1 + · · · + an−s,n xn = bn−s
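Theorem 3.7 can be explored numerically: from the coefficient matrix one reads off the dimension of the solution set, a particular solution gives a base point, and the null space gives the direction space. The sketch below (Python with numpy) uses one illustrative equation describing a plane in A3; it is only a sketch under these assumptions, not part of the text.

import numpy as np

A = np.array([[1.0, 1.0, 1.0]])      # coefficient matrix of a system of type (3.22)
b = np.array([2.0])

r = np.linalg.matrix_rank(A)
n = A.shape[1]
print("dim S =", n - r)              # here 3 - 1 = 2: the solution set is a plane

Q = np.linalg.lstsq(A, b, rcond=None)[0]   # one particular solution: a base point of S
_, s, Vt = np.linalg.svd(A)
D = Vt[r:]                                  # rows span D(S), i.e. the solutions of (3.23)
print("base point", Q)
print("direction space basis", D)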
3.4.1 Hyperplanes
Definition 3.8. Affine subspaces in An which have dimension n − 1 are called hyperplanes.
Let H be a hyperplane and let (v1 , . . . , vn−1 ) be a basis of D(H). With respect to a frame K =
(O, e1 , . . . , en ) of An , parametric equations of H are of the form
H : x1 = q1 + t1 v1,1 + · · · + tn−1 vn−1,1
    x2 = q2 + t1 v1,2 + · · · + tn−1 vn−1,2
    ...
    xn = qn + t1 v1,n + · · · + tn−1 vn−1,n
or, in matrix form, H : (x1 , . . . , xn ) = (q1 , . . . , qn ) + t1 (v1,1 , . . . , v1,n ) + · · · + tn−1 (vn−1,1 , . . . , vn−1,n ),    (3.24)
where vi = vi (vi,1 , . . . , vi,n ), where Q = Q(q1 , . . . , qn ) is a point in H and where ti ∈ R for each i ∈
{1, . . . , n − 1}. These parametric equations express the fact that a point P belongs to H if and only if
the vector QP is a linear combination of the basis vectors v1 , . . . , vn−1 , i.e. if and only if the vectors
QP , v1 , . . . , vn−1 are linearly dependent. We can reformulate this as follows. A point P (x1 , . . . , xn )
belongs to the hyperplane H if and only if
| x1 − q1   x2 − q2   . . .   xn − qn |
|  v1,1      v1,2     . . .    v1,n   |
|   ...       ...               ...   |  = 0.    (3.25)
| vn−1,1    vn−1,2    . . .   vn−1,n  |
3.4.2 Lines
A line in An is a 1-dimensional affine subspace. If ℓ is such a line, then, by definition, the vectors
which can be represented by points in ℓ are linearly dependent. Any such non-zero vector v is called
a direction vector of ℓ. Thus, ℓ can be described as
ℓ = { P ∈ An : OP = OQ + tv for some t ∈ R }
for any point O ∈ An and any point Q ∈ ℓ. The image that goes with this description is the same as the
one in dimension 2, but here we interpret it in the n-dimensional space An . In coordinates, relative
to a given frame K of An , we obtain parametric equations for ℓ. They are of the form:
ℓ : x1 = q1 + t v1
    x2 = q2 + t v2
    ...
    xn = qn + t vn
or, in matrix notation, ℓ : (x1 , . . . , xn ) = (q1 , . . . , qn ) + t (v1 , . . . , vn )
where Q = Q(q1 , . . . , qn ) and v = v(v1 , . . . , vn ) relative to K. Here again it is possible to eliminate the
parameter t in order to obtain symmetric equations of the line ℓ:
ℓ : (x1 − q1 )/v1 = (x2 − q2 )/v2 = · · · = (xn − qn )/vn .
These are in fact a system of n − 1 linear equations which you can rearrange to look like this:
ℓ : a1,1 x1 + · · · + a1,n xn = b1
    ...
    an−1,1 x1 + · · · + an−1,n xn = bn−1
Notice that the rank of this system is n−1 since dim(ℓ) = 1. Moreover, notice that each linear equation
in the above system describes a hyperplane. So, this is saying that a line in An can be described by
the intersection of n − 1 hyperplanes.
We obtain 2.) by noticing that dim(S) = dim(T ) implies equality in the above equation. Indeed, by
definition dim(S) = dim(T ) means that dim(D(S)) = dim(D(T )), but then D(S) is a vector subspace of
D(T ) of maximal dimension, hence D(S) = D(T ).
As a consequence of Proposition 3.9 we obtain the following corollary which implies the ‘parallel
postulate’ of Euclidean geometry (Axiom IV in Appendix A). The axioms of affine spaces therefore
imply the validity of this postulate.
Corollary 3.10. If S is an affine subspace of An and Q a point in An , there is a unique affine subspace
T of An which contains Q, is parallel to S and has the same dimension as S.
Proof. Fix a point Q ∈ An and a point P ∈ S. To see that T exists, translate all points of S with PQ ,
i.e. consider
T = PQ + S = PQ + {P ′ ∈ An : PP ′ ∈ D(S)} = {P ′ ∈ An : QP ′ ∈ D(S)}
where for the last equality we use QP + PP ′ = QP ′ . Then T is an affine subspace with D(T ) = D(S),
in particular it is parallel to S and dim(T ) = dim(S). To see that T is unique, assume that T ′ is another
affine subspace passing through Q which is parallel to S and of the same dimension. By point 2.) of
Proposition 3.9 we see that T ′ has to equal T .
Definition 3.11. If two affine subspaces S and T of An are not parallel, they are said to be either skew
if they do not meet, or incident if they have a point in common.
In order to determine the intersection S ∩ T , suppose that the two subspaces are given by the
Cartesian equations
S : ai1 x1 + · · · + ain xn = bi    for i = 1, . . . , n − s,    (3.26)
T : ck1 x1 + · · · + ckn xn = dk    for k = 1, . . . , n − t.    (3.27)
The intersection S ∩ T is the locus of points in An whose coordinates are simultaneously solutions to
both (3.26) and (3.27), i.e. they are solutions to the system
S ∩ T : ai1 x1 + · · · + ain xn = bi    for i = 1, . . . , n − s,
        ck1 x1 + · · · + ckn xn = dk    for k = 1, . . . , n − t.    (3.28)
By Theorem 3.7, if the System (3.28) has a solution, then it describes an affine subspace. Thus, if
S ∩ T is non-empty it is an affine subspace of An .
Proposition 3.12. If the intersection S ∩ T of two affine subspaces of An is non-empty it is an affine
subspace satisfying
dim(S) + dim(T ) − dim(An ) ≤ dim(S ∩ T ) ≤ min{ dim(S), dim(T ) }. (3.29)
Proof. Let s = dim(S) and t = dim(T ). By Theorem 3.7, S and T can be described by systems
S : ai1 x1 + · · · + ain xn = bi    for i = 1, . . . , n − s,
T : ck1 x1 + · · · + ckn xn = dk    for k = 1, . . . , n − t.
Since S ∩ T is non-empty, the above system is compatible and the dimension of S ∩ T is n − r where r
is the rank of the matrix of coefficients of this system. Notice that
r ≤ n − s + n − t = 2n − (s + t)
Thus,
dim(S ∩ T ) = n − r ≥ n − [2n − (s + t)] = s + t − n = dim(S) + dim(T ) − dim(An ).
The last inequality is clear since S ∩T ⊆ S, T implies D(S ∩T ) ⊆ D(S), D(T ) and therefore dim(S ∩T ) ≤
dim(S), dim(T ).
Proof. We use Grassmann’s identity which states that for any vector subspaces W and U we have
dim(W) + dim(U) = dim(W ∩ U) + dim(W + U).
Let s = dim(S) and let t = dim(T ). By Theorem 3.7 we have
S : ai1 x1 + · · · + ain xn = bi    for i = 1, . . . , n − s,
T : ck1 x1 + · · · + ckn xn = dk    for k = 1, . . . , n − t.
By convention dim(∅) = −∞, thus (3.30) holds only if S ∩ T ≠ ∅. Assume that S ∩ T is non-empty.
Then, the above system is compatible and the dimension of S ∩ T is n − r where r denotes the rank of
the matrix of coefficients of this system. Moreover we notice that (3.30) is equivalent to
n−r = s+t−n ⇔ dim(D(S ∩ T )) = dim(D(S)) + dim(D(T )) − dim(Vn )
and since S ∩ T ≠ ∅ this is in turn equivalent to
dim(D(S) ∩ D(T )) = dim(D(S)) + dim(D(T )) − dim(Vn ).
Rearranging and using Grassmann’s identity, this is equivalent to
dim(Vn ) = dim(D(S)) + dim(D(T )) − dim(D(S) ∩ D(T )) = dim(D(S) + D(T )).
At this point we use the fact that the vector subspace D(S) + D(T ) of Vn has maximal dimension if
and only if it equals the ambient space, i.e. if and only if D(S) + D(T ) = Vn .
CHAPTER 4
Euclidean space
Contents
4.1 Angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.1.1 Orthonormal frames . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.1.2 Oriented angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.2 Scalar product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.2.1 The Euclidean space Rn (first revision) . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.2 Gram-Schmidt algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.2.3 Normal vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.2.4 Angles between lines and hyperplanes . . . . . . . . . . . . . . . . . . . . . . . 60
4.3 Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.3.1 Distance from a point to a hyperplane . . . . . . . . . . . . . . . . . . . . . . . 62
4.3.2 Loci of points equidistant from affine subspaces . . . . . . . . . . . . . . . . . 63
4.3.3 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.1 Angles
In Chapter 1 we extracted the information that the Axioms encode in two points and arrived at the
notion of affine space. One natural way to proceed is to consider the information that the Axioms
encode in three points. Three non-collinear points define a plane, 12 angles, a triangle and its area,
etc. Here, we focus on angles, which form a second key concept in the formulation of the Axioms.
There was much debate among philosophers as to the particular category (according to the Aris-
totelian scheme) in which an angle should be placed; is it, namely, a quantity, a quality or a relation
(see [10, p.177]). Such questions are for philosophers. Based on Hilbert’s Axioms we will take an
angle ∡(h, k) to be the information carried by two rays h and k, emanating from a common point O.
One can deduce from the Axioms that an angle defines a unique plane which the two rays separate
in a convex and a concave subset. Visually, it is common to identify an angle with the convex subset
they define. As a matter of notation, for points A ∈ h and B ∈ k we may write ∡AOB for the angle
∡(h, k). Notice that if three points O, A, B are given, the symbol ∡AOB requires that A ≠ O and B ≠ O.
In this section we derive properties of angles needed to introduce the scalar product. Standard
results concerning the trigonometry of the oriented Euclidean plane are deduced in Appendix I.
Throughout, we are interested in properties of angles up to congruence, i.e. in those properties
which all congruent angles share. From Axioms III.4 and III.5 one can deduce that congruence of
angles is indeed an equivalence relation and one may consider equivalence classes of angles under
this relation. This treatment is implicit in the notion of angles between two vectors. Instead of adding
more notation, we will simply say angle up to congruence to mean that the angle may be replaced with
a congruent angle.
Two lines which intersect in exactly one point form four angles. The opposite angles are con-
gruent (see [14]) and the adjacent angles are called supplementary. A right angle is an angle which is
congruent to its supplementary angle. If one of the angles of two intersecting lines is a right angle
then all of them are right angles and we say that the lines are orthogonal, or that the lines are perpen-
dicular to each other. An acute angle is an angle less than a right angle and an obtuse angle is an angle
greater than a right angle.
Definition 4.1. Let ∡(h, k) be an angle with the two rays emanating from the point O. Let ℓ be the
line containing k and choose A ∈ h and B ∈ ℓ such that AB is orthogonal to ℓ. The sine and cosine of
the angle ∡(h, k) are defined by
sin ∡(h, k) = |AB| / |OA|    and    cos ∡(h, k) =
  0               if the angle is a right angle;
  |OB| / |OA|     if the angle is acute;
  − |OB| / |OA|   if the angle is obtuse.
[Figure: the rays h and k, with A, A′ on h and B, B′ on the line ℓ containing k, in the acute and in the obtuse case.]
Proposition 4.2. The sine and cosine of an angle are well defined. Moreover, the following hold:
1. For an angle θ we have sin(θ) ∈ [0, 1], cos(θ) ∈ [−1, 1] and cos(θ)2 + sin(θ)2 = 1.
2. Two angles are congruent if and only if their cosines are equal.
3. Two angles have the same sine if and only if they are congruent or supplementary up to con-
gruence.
Proof. The sine function and the cosine function each attribute a real value to an angle by means of
certain choices. We need to show that these values are independent of the choices made. We show
this for acute angles, the other cases are similar. Let ∡(h, k) be an acute angle and let O, A, B be as
in Definition 4.1. Consider other two points A′ ∈ h and B′ ∈ k such that A′ B′ is orthogonal to k.
By Thales’ Intercept Theorem (see Theorem F.1) the ratio |A′ B′ |/|OA′ | equals |AB|/|OA| and the ratio
|OB′ |/|OA′ | equals |OB|/|OA|.
It remains to show that the definition does not depend on the order of the two rays. By Lemma
1.4 there is a unique point A′ ∈ k such that [OA] is congruent to [OA′ ]. Let B′ ∈ h be such that A′ B′ is
orthogonal to h. By the Second Congruence Theorem [14, Theorem 13], the triangles OAB and OB′ A′
are congruent, hence |A′ B′ | = |AB| and |OB′ | = |OB|, and therefore |A′ B′ |/|OA′ | = |AB|/|OA| and |OB′ |/|OA′ | = |OB|/|OA|.
It remains to consider the last three claims. Claim 1 follows from the fact that the length of
a cathetus in a right-angled triangle is always less than the length of the hypotenuse, and cos(θ)2 +
sin(θ)2 = 1 follows from Pythagoras’ Theorem. Since congruence of triangles is an equivalence rela-
tion, Claims 2 and 3 can be deduced with Axiom III.4.
Definition 4.3. For two non-zero vectors a = OA and b = OB , the unoriented angle between a and
b, denoted ∡(a, b), is the angle ∡AOB up to congruence. By Proposition 4.2, the values of sine and
cosine do not change under congruence, thus, the sine sin ∡(a, b) and cosine cos ∡(a, b) of the unoriented
angle ∡(a, b) are well defined. If cos ∡(a, b) = 0 we say that a and b are orthogonal and we write a ⊥ b.
We denote the set of all unoriented angles by W (from the German word ‘Winkel’).
Proposition 4.4. The unoriented angle of two (non-zero) vectors is well defined. Moreover, for any
two non-zero vectors a, b and a real number x > 0, the following hold:
Proof. Let a, b ∈ V2 be two non-zero vectors. Given a point O ∈ E2 , there are unique points A, B ∈ E2
such that a = OA and b = OB . Thus we obtain an angle ∡AOB. For a different point O′ ∈ E2 ,
again there are unique points A′ , B′ ∈ E2 such that a = O′ A′ and b = O′ B′ giving us a second angle
∡A′ O′ B′ . It is not difficult to see that the two angles are congruent.
Remark. Let ℓ be a line in a plane π and let O be a point on ℓ. Choose a side of π relative to ℓ and
consider the semicircle S¹₂ = { A : A on the given side of ℓ or on ℓ, and |OA| = 1 }. It follows from
Proposition 4.4 that the set of unoriented angles W is in bijection with S¹₂ .
Pr⊥a (b) = pr⊥a (b) · a/|a|      and      Pr⊥b (a) = pr⊥b (a) · b/|b| .
Proposition 4.6. The orthogonal projection map on vectors is a well defined linear map. Moreover,
for two vectors a and b we have
cos ∡(a, b) = pr⊥a (b) / |b| = pr⊥b (a) / |a| .
Proof. Showing that the map is well defined is left as an exercise. It is enough to show that pr⊥a is
linear, since then
Pr⊥a (xb + yc) = pr⊥a (xb + yc) · a/|a| = ( x pr⊥a (b) + y pr⊥a (c) ) · a/|a| = x Pr⊥a (b) + y Pr⊥a (c).
Let i be the unit vector a/|a| and let j be a vector orthogonal to i. The linearity of pr⊥a follows from the
fact that it is the projection on the coordinate x-axis of the frame having O as origin and (i, j) as basis.
The last claim follows from the definition of cosine.
For two vectors a = OA and b = OB the oriented angle ∡or (a, b) is the oriented angle defined by the
rays (OA and (OB. The orientation of ∡or (h, k) is the orientation of the basis (a, b). We denote the set
of all oriented angles up to congruence by Wor .
Proposition 4.9. There is a bijection between the set Wor of oriented angles and the unit circle S1 .
Lemma 4.10. Fix an orientation in E2 , an oriented angle ∡or (a, b) and a length x. For any non-zero
vector c there is a unique vector d of length x such that ∡or (a, b) = ∡or (c, d).
Definition 4.11 (Counterclockwise sum of angles). We define the sum of two oriented angles ∡or (a, b)
and ∡or (c, d) as follows. By Lemma 4.10, there is a unique unit vector d′ such that ∡or (c, d) = ∡or (b, d′ )
and we define
∡or (a, b) + ∡or (c, d) = ∡or (a, d′ )
Proposition 4.12. The set Wor of oriented angles with addition is an abelian group.
Definition 4.13. For a vector v ∈ V2 let J(v) be the unique vector in V2 satisfying the following
properties
(a) J(v) ⊥ v,
sin ∡or (a, b) = pr⊥J(a) (b) / |b| = pr⊥J(b) (a) / |a| . (4.1)
Pr⊥a (b) = ( ⟨a, b⟩ / ⟨a, a⟩ ) a      and      Pr⊥b (a) = ( ⟨b, a⟩ / ⟨b, b⟩ ) b. (4.2)
⟨av + bw, u⟩ = a⟨v, u⟩ + b⟨w, u⟩ and ⟨v, aw + bu⟩ = a⟨v, w⟩ + b⟨v, u⟩.
(SP4) It recognizes right angles and unit lengths, i.e. for all non-zero vectors v, w ∈ V2
Proof. Since cos ∡(a, b) = cos ∡(b, a), the scalar product is symmetric. Since cos ∡(a, a) = 1, the scalar
product is positive definite. In order to show bilinearity, notice that by symmetry it is enough to show
that the scalar product is linear in the first argument. For any non-zero vectors a, b, by Proposition
4.6, we have
⟨a, b⟩ = |a| · |b| · cos ∡(a, b) = |a| · |b| · pr⊥b (a)/|a| = |b| · pr⊥b (a)
which is linear in a since b is fixed and since pr⊥b is a linear map. The last property follows directly
from the definition.
Proposition 4.16. Let K = (O, B) be an orthonormal frame. For two vectors a(a1 , . . . , an ) and b(b1 , . . . , bn )
we have
⟨a, b⟩ = a1 b1 + · · · + an bn (4.3)
and therefore
|a| = √( a1² + a2² + · · · + an² ) ,
cos ∡(a, b) = ( a1 b1 + a2 b2 + · · · + an bn ) / ( √( a1² + a2² + · · · + an² ) · √( b1² + b2² + · · · + bn² ) ) ,
a ⊥ b ⇔ a1 b1 + a2 b2 + · · · + an bn = 0.
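The coordinate formulas of Proposition 4.16 are easy to try out numerically. The following Python sketch (with numpy) uses two illustrative vectors whose components are taken relative to an orthonormal frame.

import numpy as np

a = np.array([1.0, 2.0, 2.0])
b = np.array([2.0, -1.0, 0.0])

dot = a @ b                                   # a1*b1 + a2*b2 + a3*b3
norm_a = np.sqrt(a @ a)                       # |a|
cos_angle = dot / (np.linalg.norm(a) * np.linalg.norm(b))
print(dot, norm_a, cos_angle)
print("orthogonal:", abs(dot) < 1e-12)        # a ⊥ b exactly when the scalar product vanishes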
By Corollary H.10, the scalar product is the unique positive definite symmetric bilinear form
which recognizes right angles and unit lengths. By Corollary H.9 for any positive definite bilinear
form (i.e. satisfying (SP1), (SP2) and (SP3) in Proposition 4.15) there is a basis in which it looks like
the scalar product (4.3). Thus, computationally such bilinear forms are indistinguishable.
Definition 4.17. The n-dimensional Euclidean space En is the pair (An , ⟨ , ⟩) where An is the n-dimensional
real affine space and where ⟨ , ⟩ is the unique positive definite symmetric bilinear form on D(An )
which recognizes right angles and unit lengths, i.e. with respect to an orthonormal basis B it has the
expression (4.3). Then, the distance between two points P , Q ∈ En is
d(P , Q) = | QP | = √( ⟨ QP , QP ⟩ ) = √( (p1 − q1 )² + (p2 − q2 )² + · · · + (pn − qn )² ). (4.4)
where the coordinates of P (p1 , . . . , pn ) and Q(q1 , . . . , qn ) are relative to an orthonormal frame K.
Remark (The Euclidean space Rn ). Choosing an orthonormal frame we may identify both An and Vn
with Rn . Since computationally there is no difference, it is more economical to say that the Euclidean
space is Rn with the standard basis taken as an orthonormal basis. This is the starting point of any Analysis
course, and it is the advantage that Newton and Leibniz in particular saw in Descartes’ work.
Remark. The advantages of Definition 4.17 are conceptual. For example, by Proposition 3.6, a d-
dimensional affine subspace S of An is itself an affine space, which can be identified with Ad ; we
write S ≅ Ad . It is easy to see that a scalar product on An defined with the properties in Proposition
4.15 restricts to a scalar product on S ≅ Ad ⊆ An . Thus, an inclusion S ≅ Ad ⊆ An automatically
translates to an inclusion S ≅ Ed ⊆ En . Formally this can be stated as follows: An affine subspace of
En is a Euclidean space with the scalar product inherited from En . In particular, all the results which
we know to hold true for E2 or E3 will hold true when we consider 2-dimensional or 3-dimensional
subspaces of the Euclidean space En .
Given a basis B = (e1 , e2 , . . . ), define
e′1 = e1
e′2 = e2 − ( ⟨e′1 , e2 ⟩ / ⟨e′1 , e′1 ⟩ ) e′1
e′3 = e3 − ( ⟨e′1 , e3 ⟩ / ⟨e′1 , e′1 ⟩ ) e′1 − ( ⟨e′2 , e3 ⟩ / ⟨e′2 , e′2 ⟩ ) e′2
...
This process of obtaining an orthonormal basis from a given basis is called the Gram-Schmidt process.
It can be used in an infinite dimensional vector space, hence the name process. If the vector space is
finite dimensional, the process terminates and we may call it an algorithm.
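A minimal Python sketch of the Gram-Schmidt algorithm displayed above, followed by normalization; the starting basis of R3 and the function name are illustrative.

import numpy as np

def gram_schmidt(basis):
    ortho = []
    for e in basis:
        e_prime = np.array(e, dtype=float)
        for f in ortho:                        # subtract the projections on e'_1, ..., e'_k
            e_prime -= (f @ e_prime) / (f @ f) * f
        ortho.append(e_prime)
    return [f / np.linalg.norm(f) for f in ortho]   # renormalize to obtain an orthonormal basis

B = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
for f in gram_schmidt(B):
    print(np.round(f, 4))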
Proposition 4.18. The basis B ′ obtained from the basis B with the Gram-Schmidt algorithm is an
orthonormal basis.
Proof. A more general statement and proof is given in [19, Theorem 17.4]. The vectors e′1 , e′2 , . . . , e′k , . . .
are constructed by induction on k. For each k we consider the vector subspace Vk generated by
Bk = (e1 , e2 , . . . , ek ) and show that Bk′ = (e′1 , e′2 , . . . , e′k ) is an orthogonal basis for Vk .
If k = 1 then e′1 = e1 and the claim is obvious. Assume that the claim holds for k. By definition we
have
e′k+1 = ek+1 − ( ⟨e′1 , ek+1 ⟩/⟨e′1 , e′1 ⟩ ) e′1 − · · · − ( ⟨e′k , ek+1 ⟩/⟨e′k , e′k ⟩ ) e′k ,
where we denote the subtracted sum by v.
Since the vectors e′i form a basis of Vk and ek+1 does not lie in Vk , we have v ≠ ek+1 and hence e′k+1 ≠ 0; otherwise ek+1 = v would be a linear combination of B′k and
therefore of Bk (since both are bases of Vk ). It follows that B′k+1 is a basis of Vk+1 . Moreover, for each
i = 1, . . . , k we have
P ⟨e′ ,ek+1 ⟩
⟨e′k+1 , e′i ⟩ = ⟨ek+1 − kj=1 ⟨ej ′ ,e′ ⟩ e′j , e′i ⟩
j j
⟨e′i ,ek+1 ⟩ ′ ′
= ⟨ek+1 , ei ⟩ − ⟨e′i ,e′i ⟩
⟨ei , ei ⟩
= ⟨ek+1 , ei ⟩ − ⟨ek+1 , e′i ⟩ = 0
since, by the inductive hypothesis, ⟨e′i , e′j ⟩ = 0 for all 1 ≤ i ≠ j ≤ k. Thus, B′k+1 is an orthogonal basis
and renormalizing it we obtain an orthonormal basis for Vk+1 .
ℓ : x + 3y − 3 = 0,
for some c > 0. Moreover, the distance d(O, H) from the origin to the hyperplane equals c.
Proof. Let H be a hyperplane with equation (4.5). A normal vector of H is n(a1 , . . . , an ). By the
discussion in Section 4.1.1, the normalized vector ñ = n/|n| has components (cos(θ1 ), . . . , cos(θn )) for
some angles θ1 , . . . , θn ∈ [0, π). Thus, with c = b/|n| the equation of H has the form (4.6). If c < 0
replace each θi by π − θi to change the sign of c (this is equivalent to changing the sign of ñ). The last
claim follows from the distance formula (4.8) deduced below in Proposition 4.25.
are supplementary angles so if you know one of them you know the other one. We may calculate this
with the scalar product since
cos ∡(v1 , v2 ) = ⟨v1 , v2 ⟩ / ( |v1 | · |v2 | ).
Notice also that the two angles can be described with normal vectors: if n1 and n2 are normal
vectors for ℓ1 and ℓ2 respectively, then the two angles between ℓ1 and ℓ2 are ∡(n1 , n2 ) and ∡(−n1 , n2 ).
So, if these vectors are known we may calculate
cos ∡(n1 , n2 ) = ⟨n1 , n2 ⟩ / ( |n1 | · |n2 | ).
On the other hand, if we know a direction vector v1 for the first line and a normal vector n2 for the
second line then the acute angle between ℓ1 and ℓ2 is
π/2 − arccos( ⟨v1 , n2 ⟩ / ( |v1 | · |n2 | ) ) ∈ [0, π/2).
This generalizes in three ways. In En consider two lines ℓ1 and ℓ2 with direction vectors v1 and v2
respectively, as well as two hyperplanes H1 and H2 with normal vectors n1 and n2 respectively.
1. ℓ1 and ℓ2 define two supplementary angles: ∡(v1 , v2 ) and ∡(−v1 , v2 ), which can be calculated
with
cos ∡(v1 , v2 ) = ⟨v1 , v2 ⟩ / ( |v1 | · |v2 | ).
2. H1 and H2 define two supplementary angles: ∡(n1 , n2 ) and ∡(−n1 , n2 ), which can be calculated
with
cos ∡(n1 , n2 ) = ⟨n1 , n2 ⟩ / ( |n1 | · |n2 | ).
3. ℓ1 and H1 define two supplementary angles: if cos ∡(v1 , n1 ) ≥ 0 then ∡(v1 , n1 ) is acute and the
acute angle between ℓ1 and H1 is
π/2 − arccos( ⟨v1 , n1 ⟩ / ( |v1 | · |n1 | ) ).
Else, if cos ∡(v1 , n1 ) < 0, replace n1 with the normal vector −n1 of H1 . A computational sketch of these formulas is given below.
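The following Python sketch (with numpy) evaluates these formulas; the direction vectors v1 , v2 and the normal vector n1 are illustrative values, not examples from the text.

import numpy as np

def angle(u, w):
    u, w = np.array(u, dtype=float), np.array(w, dtype=float)
    return np.arccos(np.clip(u @ w / (np.linalg.norm(u) * np.linalg.norm(w)), -1.0, 1.0))

v1, v2 = (1, 1, 0), (1, 0, 0)        # direction vectors of two lines
n1 = (0, 0, 1)                       # normal vector of a hyperplane

print(np.degrees(angle(v1, v2)))     # 45 degrees: one of the two angles between the lines
theta = angle(v1, n1)
if np.cos(theta) < 0:                # replace n1 by -n1 if necessary
    theta = np.pi - theta
print(np.degrees(np.pi / 2 - theta)) # acute angle between the line and the hyperplane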
The angles between (hyper)planes in E3 are referred to as dihedral angles. Two planes, π1 and π2 ,
define four dihedral angles which are the four regions in which the two planes divide E3 . More
precisely, let ℓ be the line π1 ∩ π2 , choose a plane π orthogonal to ℓ and consider the lines ℓ1 = π ∩ π1
and ℓ2 = π ∩ π2 . The angles between ℓ1 and ℓ2 do not depend on the choice of π, i.e. if we choose a
different plane orthogonal to ℓ we obtain congruent angles. So, up to congruence we have two angles
and they can be calculated using normal vectors as indicated above.
4.3 Distance
Definition 4.23. The distance between two points P , Q in En , denoted d(P , Q), is the length of the
segment [P Q] and we calculate it with (4.4). The distance between two sets of points S1 and S2 is
d(S1 , S2 ) := inf { d(P , Q) : P ∈ S1 and Q ∈ S2 }. (4.7)
If P ′ is the foot of the perpendicular dropped from a point P to a hyperplane H, then
d(P , H) = |P P ′ |.
Proof. For any other point Q in H, distinct from P ′ , we have a right-angled triangle P P ′ Q. Since the
hypotenuse is larger than the catheti we have
d(P , H) ≤ |P P ′ | < |P Q|
Proposition 4.25. Let K be an orthonormal frame of En . Consider the point P (p1 , . . . , pn ) and a hyperplane H :
a1 x1 + a2 x2 + · · · + an xn = b. Then
d(P , H) = |a1 p1 + a2 p2 + · · · + an pn − b| / √( a1² + a2² + · · · + an² ). (4.8)
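Formula (4.8) is easy to evaluate numerically. The Python sketch below (with numpy) uses an illustrative point and plane; the function name is illustrative as well.

import numpy as np

def distance_point_hyperplane(p, a, b):
    p, a = np.array(p, dtype=float), np.array(a, dtype=float)
    return abs(a @ p - b) / np.linalg.norm(a)

# distance from P(1, 2, 3) to the plane x + 2y + 2z = 3
print(distance_point_hyperplane((1, 2, 3), (1, 2, 2), 3))   # |1 + 4 + 6 - 3| / 3 = 8/3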
Proof. We notice first that if c = 0 then L(S, c) is the set S itself by definition, (4.7). Moreover, if
c < 0 then L(S, c) is empty since distances are positive. The non-trivial cases appear when c > 0.
If S consists of a single point, in dimension n = 1, then L(S, c) consists of two points, namely the
endpoints of a segment with midpoint S. We can discuss the diagonal entries in Table 4.1 together by
considering hyperplanes since, in dimension 1, these are points, in dimension 2, these are lines and,
in dimension 3, these are planes. Let the hyperplane S be given by the equation a1 x1 +a2 x2 +· · ·+an xn =
b. By (4.8), a point Q(q1 , . . . , qn ) belongs to L(S, c) if and only if
d(Q, S) = c ⇔ |a1 q1 + a2 q2 + · · · + an qn − b| = c′
where c′ = c · √( a1² + a2² + · · · + an² ). Thus, L(S, c) is the union of two hyperplanes with equations
a1 x1 + a2 x2 + · · · + an xn = b + c′ and a1 x1 + a2 x2 + · · · + an xn = b − c′ .
Consider the case where S is a line in E3 . We notice that in dimension 3 we don’t yet have a
formula for the distance between a point and a line. Such a formula will be deduced in Section 5.2.2
and can be used to deduce equations for L(S, c) if S is an arbitrary line. However, we have other
options as well, for instance we can ‘scan’ the three dimensional space with planes passing through S
and notice that in each such plane we obtain two lines at equal distance from S. Yet another option,
since we are only interested in the possible shapes of L(S, c), is to make a good choice of a frame.
Choose the frame K such that S is the z-axis. An arbitrary point in S is P (0, 0, t) with t ∈ R. Then, for
a point Q(xQ , yQ , zQ ) we have
d(Q, S) = c ⇔ xQ² + yQ² = c².
Thus, this locus of points has equation L(S, c) : x2 + y 2 = c2 which describes a cylinder with axis S.
Lastly, in any dimension the set of points at distance c from a point Q(q1 , . . . , qn ) is the (hy-
per)sphere of radius c centered in Q, since d(P , Q) = c ⇔ (p1 − q1 )² + · · · + (pn − qn )² = c².
Definition 4.29. If S ′ is another set, the locus of points equidistant from S and S ′ is
L(S, S ′ ) = { P ∈ En : d(P , S) = d(P , S ′ ) }.
Proposition 4.30. Let S and S ′ be two affine subspaces of En . Table 4.2 classifies the possible shapes
of L(S, S ′ ).
Table 4.2: Loci of points equidistant from two distinct affine subspaces.
Proof. Consider the first column in Table 4.2. Let A(a1 , . . . , an ) and B(b1 , . . . , bn ) be two fixed points.
A point P (p1 , . . . , pn ) is equidistant from A and B if and only if d(A, P ) = d(P , B). In coordinates this
gives the equation
(a1 − p1 )2 + · · · + (an − pn )2 = (b1 − p1 )2 + · · · + (bn − pn )2 .
which is equivalent to P satisfying the equation
(a1 − b1 )x1 + · · · + (an − bn )xn − (a1² − b1² + · · · + an² − bn² ) / 2 = 0
which is the equation of a hyperplane with normal vector AB . Moreover, it is easy to check that
the hyperplane contains the midpoint of the segment [AB]. It is called the perpendicular bisecting
hyperplane of the segment [AB]. In dimension 2 it is the perpendicular bisector of the segment [AB].
For the first diagonal in Table 4.2, consider a hyperplane H and a point Q outside H. We choose
the frame K such that Q is on the positive part of the last coordinate axis, such that the last coordinate
axis is orthogonal to H and such that the origin O is at the same distance from H and Q. Moreover,
eventually changing the unit segment, we may assume that Q has coordinates (0, . . . , 0, 1) and H has
equation xn + 1 = 0. Then, a point P is equidistant from H and Q if and only if
d(P , H) = d(P , Q) ⇔ |pn + 1| = √( p1² + · · · + pn−1² + (pn − 1)² ),
and squaring both sides we obtain the equation
x1² + · · · + xn−1² = 4 xn ,
which is the equation of a hyperparaboloid. In dimension 1 this is a point (the origin, 0). In dimen-
sion 2 it is a parabola and in dimension 3 a paraboloid (see Chapter 11).
Next, consider the case of two hyperplanes H : a1 x1 + · · · + an xn + an+1 = 0 and H′ : b1 x1 + · · · + bn xn + bn+1 = 0.
We may assume that the normal vectors which can be read off from the equations are unit vectors. A
point P (p1 , . . . , pn ) is at the same distance from H and H′ if and only if it satisfies one of the equations
(a1 − b1 )x1 + · · · + (an − bn )xn + (an+1 − bn+1 ) = 0 or (a1 + b1 )x1 + · · · + (an + bn )xn + (an+1 + bn+1 ) = 0
and we notice that if the hyperplanes are parallel then only one of the equations has solutions (the
hyperplane lying at half-distance between H and H′ ). In dimension 1 this is the midpoint of the
segment [HH′ ]. Assume that H and H′ are not parallel. In dimension 2 these are the angle bisectors
of the angles described by the two lines H and H′ . In dimension 3 these are bisecting planes of the
dihedral angles between H and H′ and in general these are angle bisecting hyperplanes.
Next consider the case of a point S = {Q} and a line S ′ in E3 . We choose a frame K such that the
point Q has coordinates (1, 0, 0) and such that the line S ′ contains the point (−1, 0, 0) and has direction
vector j(0, 1, 0). We anticipate and use the distance formula from a point to a line in dimension 3
(Section 5.2.2). Notice however that because of the choice of the frame, the distance formula can be
easily deduced in this case. For a point P , we have
d(P , S) = d(P , S ′ ) ⇔ | QP | = | Q′ P × j| ⇔ (xP − 1)² + yP² + zP² = zP² + (xP + 1)² ⇔ yP² = 4xP ,
where Q′ denotes the point (−1, 0, 0) on S ′ .
These are two planes orthogonal to the plane containing the two lines and which bisect the angles
formed by the two lines.
Assume now that the lines are skew. We choose a frame such that S is the x-axis and S ′ contains
the point Q(0, 1, 0) and has v(λ, 0, 1) as direction vector. Then, for a point P , we have
d(P , S) = d(P , S ′ ) ⇔ | OP × i| = | QP × v| / |v| ⇔ (λ² + 1)(yP² + zP² ) = (xP − λzP )² + (λ² + 1)(yP − 1)²
which corresponds to a hyperbolic paraboloid (see Chapter ??). In the last subcase, where the two
lines S and S ′ are parallel, one checks that L(S, S ′ ) is a plane.
Finally, the last case that we need to consider is that of a line S and a plane S ′ in E3 . With a
similar argument we see that in this case we obtain an elliptic cone if the line punctures the plane
and, if the line and the plane are parallel we obtain a parabolic cylinder.
4.3.3 Convergence
Once we have a clear notion of distance, the notion of proximity to a point also becomes clear. Being
close to a point P means lying in a ball of radius ε centered at the point P and you may adjust ε at
will. The set of open balls centered at all points defines a topology on En , called the standard topology.
Recall Chapter 7 of your Analysis course [16]. All the results from your analysis course hold true for
En by fixing an orthonormal frame which identifies En with Rn .
CHAPTER 5
Contents
5.1 Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.1.1 Area of polygons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.1.2 Oriented area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.2 Cross product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.2.1 Algebraic identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.2.2 Distance between a point and a line . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.2.3 Common perpendicular line of two skew lines . . . . . . . . . . . . . . . . . . 77
5.2.4 Distance between two skew lines . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.3 Volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.3.1 Volume of polyhedra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.3.2 Oriented volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.3.3 Hypervolume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.1 Area
5.1.1 Area of polygons
Definition 5.1. A polygonal segment A1 A2 . . . An is a union of segments [A1 A2 ], . . . , [An−1 An ], such that
consecutive segments [Ai Ai+1 ] and [Ai+1 Ai+2 ] intersect only in their shared vertex Ai+1 . If all the
vertices lie in the same plane, the polygonal segment is said to be planar. The polygonal segment
A1 A2 . . . An is said to intersect itself in a point P if there exist indices i and j with i + 1 < j, such that
[Ai Ai+1 ] intersects [Aj Aj+1 ] at P .
We will only consider planar polygons. A planar polygon separates the points of the plane not ly-
ing on its sides into two regions (see [14, Theorem 9]): the interior and the exterior. These regions have
the following property: if A is a point of the interior (an inner point) and B is a point of the exterior
(an exterior point), then every polygonal segment lying in the plane of the polygon and connecting A
with B must intersect the sides of the polygon at least once.
The interior S of a (planar) polygon is bounded in the sense that, for any point P in the plane,
there is an r > 0 such that all points in S are at distance at most r from P . Indeed, take r to be the
maximum of the distances from P to all vertices. Our goal is to measure such bounded regions of
the plane - namely, interiors of polygons. We do this by means of triangles, which are the simplest
possible polygons. Any polygon can be subdivided into triangles (see for example [3, Theorem 3.1]),
for instance with the so-called ear clipping method. Such a subdivision partitions the interior of a
polygon into interiors of triangles, if we are willing to disregard segments. Given all this, we may
measure the interior of polygons by comparing them with the interior of a unit square.
Definition 5.2. For brevity, we use the term area of a polygon to mean the area of the interior of a
polygon. To define area, we consider two polygons to be disjoint if their interiors are disjoint. Denote
by P the set of all finite unions of polygons. We define an area function Area : P → R≥0 through the
following properties:
(A2) If S1 and S2 are similar polygons with ratio x then Area(S1 ) = x2 Area(S2 ).
Remark. The above definition captures basic principles of the intuitive notion of area. We note that
in property (A2), the term ‘similar polygons’ can be replaced with ‘congruent polygons’, resulting in
a weaker assumption. Furthermore, in property (A3) the finite union can be extended to a countable
union, in which case the sum is replaced by a countable sum (see Section 5.3.3). Our definition of
area allows us to deduce the following well-known facts.
2. The area of a triangle ABC with side length a and corresponding height h is ah/2.
3. The area of a parallelogram ABCD is twice the area of the triangle ABC.
Proof. Denote by Sa a square of side length a and by Ra,b a rectangle with side lengths a and b. First,
notice that since the ratio Sa : S1 equals a, we have Area(Sa ) = a² Area(S1 ) = a², by (A2) and (A1).
1. Place Sa and Sb in opposite corners of Sa+b . The complement of the two small squares in the
big square consists of two disjoint rectangles which are congruent to Ra,b . Thus, by (A3) we have
Area(Sa+b ) = Area(Sa ) + Area(Sb ) + 2 Area(Ra,b ),
hence
(a + b)² = a² + b² + 2 Area(Ra,b )
and therefore Area(Ra,b ) = ab.
For claim 3, it suffices to notice that a diagonal divides a parallelogram in two congruent trian-
gles. For claim 2, we use claim 3 and check the known area formula for a parallelogram ABCD, i.e.
Area(ABCD) = ah where h is a height and a the length of the corresponding side. Let H be a point on
AB such that h = |DH|. The triangle ADH can be moved on the opposite side of the parallelogram to
form a rectangle with side lengths a and h. Thus, by claim 1, Area(ABCD) = Area(Ra,h ) = ah.
Orthonormal frames offer an efficient method for calculating the area of a parallelogram, and
therefore, the areas of triangles and polygons.
Proposition 5.4. Suppose the vertices A(xA , yA ), B(xB , yB ) and D(xD , yD ) of the parallelogram ABCD
are given with respect to an orthonormal frame. Then,
Area(ABCD) equals the absolute value of each of the determinants
| xA   yA   1 |
| xB   yB   1 |        and        | vx   vy |
| xD   yD   1 |                   | wx   wy |
where v(vx , vy ) = AB and w(wx , wy ) = AD .
Proof. Drop a perpendicular line from D on AB and denote by H its intersection with AB. By Propo-
sition 5.3, we have
Area(ABCD) = | AB | · | HD | = |v| · | prJ(v) (w)| = |v| · |⟨J(v), w⟩| / |J(v)| = |⟨J(v), w⟩| = | vx wy − vy wx |.
Moreover,
| vx   vy |     | xB − xA   yB − yA |
| wx   wy |  =  | xD − xA   yD − yA |

                | xA        yA        1 |     | xA   yA   1 |
             =  | xB − xA   yB − yA   0 |  =  | xB   yB   1 | .
                | xD − xA   yD − yA   0 |     | xD   yD   1 |
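A small Python sketch (with numpy) of the area formulas of Proposition 5.4; the vertices are illustrative and their coordinates are assumed to be given relative to an orthonormal frame.

import numpy as np

A, B, D = np.array([0.0, 0.0]), np.array([3.0, 1.0]), np.array([1.0, 2.0])
v, w = B - A, D - A

area = abs(np.linalg.det(np.array([v, w])))          # |det(v, w)|
area_affine = abs(np.linalg.det(np.array([[A[0], A[1], 1.0],
                                           [B[0], B[1], 1.0],
                                           [D[0], D[1], 1.0]])))
print(area, area_affine, area / 2)                   # the last value is the area of the triangle ABD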
Similarly, the oriented area of the triangle ABC is Areaor (ABC) = Areaor (ABDC)/2.
Proposition 5.6. Let v = AB and w = AC be two vectors with ABC a triangle. Then
sin ∡or (v, w) = [v, w] / ( |v| · |w| ) = 2 · Areaor (ABC) / ( |AB| · |AC| )
where B ′ = (v, w).
Proof. This is a direct consequence of the definition of the sine function for oriented angles and of
the definition of oriented area of a triangle.
Definition 5.7. Let a, b ∈ V3 be two vectors. The cross product (or vector product) of a and b, denoted
a × b, is the vector defined by the following properties:
1. if a and b are parallel then a × b = 0.
Proposition 5.8. With the above notation, for any two non-zero vectors a, b ∈ V3 we have
a × b = |a| · J⊥a ( Pr⊥a (b) ).
Proof. By construction, we see that J⊥a (Pr⊥a (b)) is orthogonal to both a and b. Furthermore, by the
definition of the operator J⊥a , the vector J⊥a (Pr⊥a (b)) is a positive scalar multiple of a × b. Therefore,
all we need to show is that the vector on the right-hand side has length |a × b|. We have
|a × b| = |a| · |b| · sin ∡(a, b) = |a| · | Pr⊥a (b)| = |a| · |J⊥a (Pr⊥a (b))|
where the last equality follows from the fact that J⊥a is a rotation with a right angle - in particular it
doesn’t change the length of vectors.
(av + bw) × u = a(v × u) + b(w × u) and v × (aw + bu) = a(v × w) + b(v × u).
v × w = −w × v.
Proof. Skew-symmetry follows from the orientation requirement in the definition of the cross prod-
uct. Indeed, if (a, b, v) is right-oriented then (b, a, v) is left-oriented, hence (b, a, −v) is right-oriented.
To check bilinearity, we need to verify that the cross product is linear in each argument. If the first
argument is fixed, the map a × □ : V3 → V3 is a composition of linear maps (by Proposition 5.8), so it
is linear. For the second argument we may use the skew-symmetry of the cross product.
Since the cross product is bilinear, its values are determined by the values on a basis. If B = (i, j, k)
is a right-oriented orthonormal basis, one can check with the definition of the cross product that the
values on the basis vectors are:
 ×  |  i    j    k
----+----------------
 i  |  0    k   −j
 j  | −k    0    i
 k  |  j   −i    0
This table allows us to calculate the cross product for arbitrary vectors a(a1 , a2 , a3 ) and b(b1 , b2 , b3 ),
with components relative to B:
(a1 i + a2 j + a3 k) × (b1 i + b2 j + b3 k) = (a2 b3 − a3 b2 )i + (a3 b1 − a1 b3 )j + (a1 b2 − a2 b1 )k =
| i    j    k  |
| a1   a2   a3 | .
| b1   b2   b3 |
We can therefore derive, for example, a formula for the area of a parallelogram P spanned by a and
b:
Area(P ) = |a × b| = √( (a2 b3 − a3 b2 )² + (a1 b3 − a3 b1 )² + (a1 b2 − a2 b1 )² ).
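The coordinate formula for the cross product and the resulting area formula can be checked with the following Python sketch (numpy); the components of a and b are illustrative and taken relative to a right-oriented orthonormal basis.

import numpy as np

a = np.array([1.0, 2.0, 0.0])
b = np.array([0.0, 1.0, 3.0])

c = np.cross(a, b)                  # (a2*b3 - a3*b2, a3*b1 - a1*b3, a1*b2 - a2*b1)
print(c)                            # [ 6. -3.  1.]
print(np.linalg.norm(c))            # area of the parallelogram spanned by a and b
print(a @ c, b @ c)                 # both 0: a × b is orthogonal to a and to b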
(i × j) × j = k × j = −i whereas i × (j × j) = 0.
However, there is a rule which explains how iterated cross products behave when evaluated in dif-
ferent ways. This rule is the Jacobi identity:
(a × b) × c + (b × c) × a + (c × a) × b = 0 for all a, b, c ∈ V3 . (5.1)
One way to prove this identity is to write out the above expression in coordinates and check that the
left-hand side simplifies to the zero vector. The geometric interpretation of the identity is as follows:
given a tetrahedron OABC, the three planes passing through the edges adjacent to O and orthogonal
to the respective opposite faces intersect in one line.
A short way to prove (5.1) is by using the double cross formula. This formula is of independent
interest, as it provides an efficient way of calculating iterated cross products. Specifically, it replaces
the calculation of a determinant with the calculation of two scalar products. Other beautiful identi-
ties, which can be derived with linear algebra only, can be found in the exercises and in [11, Chapter
4].
Theorem 5.10 (Double Cross Formula). For any vectors a, b and c we have
(a × b) × c = ⟨a, c⟩ · b − ⟨b, c⟩ · a. (5.2)
Proof. Notice that (5.2) can be checked directly in coordinates. A different proof is provided in [11,
p.74] or in [5, p.70]. Our proof has three steps. First, note that we may assume all three vectors to be
non-zero; otherwise, it is straightforward to check that both sides of the equality are zero.
(Step 1) It suffices to prove (5.2) in the special cases where all three vectors are unit vectors.
Indeed, under this assumption, using the linearity of the cross product (Proposition 5.9) and the linearity
of the scalar product we obtain:
(a × b) × c = |a| · |b| · |c| · ( (a/|a|) × (b/|b|) ) × (c/|c|)
            = |a| · |b| · |c| · [ ⟨a/|a| , c/|c|⟩ · b/|b| − ⟨b/|b| , c/|c|⟩ · a/|a| ] = ⟨a, c⟩ · b − ⟨b, c⟩ · a.
(Step 2) It suffices to prove (5.2) in the special cases where c = a and c = b. Indeed, since a
and b are non-parallel unit vectors, (a, b, a × b) is a basis and c = αa + βb + γ a × b for some scalars
α, β, γ ∈ R. Then, using the linearity of the cross product (Proposition 5.9) and the properties of the
scalar product, we have:
(a × b) × c = (a × b) × (αa + βb + γ a × b)
            = α (a × b) × a + β (a × b) × b
            = α ( ⟨a, a⟩b − ⟨b, a⟩a ) + β ( ⟨a, b⟩b − ⟨b, b⟩a )
            = ( −α⟨b, a⟩ − β⟨b, b⟩ ) a + ( α⟨a, a⟩ + β⟨a, b⟩ ) b
            = ⟨−αa − βb, b⟩ a + ⟨αa + βb, a⟩ b
            = ⟨−αa − βb − γ a × b, b⟩ a + ⟨αa + βb + γ a × b, a⟩ b
            = ⟨a, c⟩ b − ⟨b, c⟩ a.
(Step 3) It remains to show that (5.2) holds in the special cases where c = a and c = b, assuming
a and b are unit vectors. Using the skew-symmetry of the cross product, it is easy to see that the
two cases are equivalent. By Step 1, we may also assume that all three vectors are unit vectors. In
particular ⟨a, a⟩ = 1 and ⟨b, a⟩ = cos θ, where θ = ∡(a, b). Thus, it suffices to prove the identity:
(a × b) × a = b − cos θ · a.
Let v denote the left-hand side and w the right-hand side. Notice that Bv = (a × b, a, v) is a right-
oriented orthogonal basis. Since there is a unique vector of length |v| with this property, it suffices to
show that Bw = (a × b, a, w) is a right-oriented orthogonal basis and that |v| = |w|. The first property
can be verified by calculating the determinant of the base-change matrix to the basis B = (a, b, a × b),
which is also right-oriented:
\[
\det(M_{B,B_w}) = \begin{vmatrix} 0 & 1 & -\cos\theta \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{vmatrix} = 1 > 0,
\]
\[
|v|^2 = |a \times b|^2 = (\sin\theta)^2 = 1 - (\cos\theta)^2 = 1 - 2(\cos\theta)^2 + (\cos\theta)^2 = \langle b - \langle b, a\rangle\cdot a,\; b - \langle b, a\rangle\cdot a\rangle = |w|^2.
\]
Thus, Bw equals Bv , hence v = w. This completes the proof of Step 3 and of the theorem.
As an application, we obtain a formula for the distance from a point P to a line ℓ passing through two points A and B, with direction vector v = \overrightarrow{AB}:
\[
d(\ell, P) = \frac{\operatorname{Area}(ABCP)}{|AB|} = \frac{|\overrightarrow{AP} \times v|}{|v|},
\]
where ABCP denotes the parallelogram spanned by \overrightarrow{AB} and \overrightarrow{AP}.
(Figure: the point P, the parallelogram ABCP over the segment [AB], and the distance d(ℓ, P) to the line ℓ with direction vector v.)
From this we immediately deduce a formula for the distance between two parallel lines in E3 . Let
ℓ ′ be a line parallel to ℓ which passes through a point B(xB , yB , zB ). We may apply Proposition 4.26 to
conclude that d(ℓ, ℓ ′ ) = d(ℓ, Q) for any point Q in ℓ ′ . Thus,
\[
d(\ell, \ell') = d(\ell, B) = \frac{|\overrightarrow{AB} \times v|}{|v|}.
\]
(Figure: the parallel lines ℓ and ℓ′, the vector \overrightarrow{AB} from A ∈ ℓ to B ∈ ℓ′, and the distance d(ℓ, ℓ′).)
Consider now two skew lines ℓ and ℓ′, passing through the points A and B and having direction vectors v and w respectively. The common perpendicular line of ℓ and ℓ′ is the unique line d which intersects the two lines and is orthogonal to both of them, i.e. the line satisfying the following properties:
• d ⊥ ℓ and d ⊥ ℓ′, and
• d ∩ ℓ ≠ ∅ and d ∩ ℓ′ ≠ ∅.
Since the two lines are skew relative to each other, the vectors v and w are linearly independent,
hence the vector v × w is non-zero and perpendicular to both v and w. Let π be the plane passing
through A and parallel to v and v × w. Notice that n = (v × w) × v is a normal vector for π and that ℓ is
included in π. Similarly, there is a plane π′ containing ℓ ′ and having normal vector n′ = (v × w) × w.
The intersection of these two planes is a line d and we claim that it is the common perpendicular of
ℓ and ℓ ′ . Since d = π ∩ π′ , a direction vector for d is
\[
n \times n' = \big((v \times w) \times v\big) \times \big((v \times w) \times w\big) = \langle a, a \times w\rangle\, v - \langle v, a \times w\rangle\, a = -\langle v, a \times w\rangle\, a,
\]
where a = v × w and where, by the double cross formula (Theorem 5.10), ⟨a, a × w⟩ = 0 since the two vectors are orthogonal. It follows that a is a direction vector for d.
But a = v × w is orthogonal to both v and w, hence it is orthogonal to both ℓ and ℓ ′ . Moreover, since d
and ℓ lie in the plane π and have non-parallel direction vectors they necessarily intersect. Similarly,
we see that d ∩ ℓ′ ≠ ∅. This shows that d is indeed the common perpendicular of ℓ and ℓ′.
Writing the equations of π and π′ in coordinates we obtain
\[
d = \pi \cap \pi' : \qquad \pi : \begin{vmatrix} x - x_A & y - y_A & z - z_A \\ v_x & v_y & v_z \\ a_x & a_y & a_z \end{vmatrix} = 0, \qquad \pi' : \begin{vmatrix} x - x_B & y - y_B & z - z_B \\ w_x & w_y & w_z \\ a_x & a_y & a_z \end{vmatrix} = 0,
\]
where we used the coordinates of A and B with respect to an orthonormal frame K = (O, B) and the
components of v, w and a with respect to B.
Since the lines are skew, there is a unique plane π containing ℓ ′ which is parallel to ℓ. Let P be
a point in π. Considering the cuboid with vertices M, N and P we see that d(ℓ, π) = |MN |. Hence
d(ℓ, ℓ ′ ) = d(ℓ, π). Now, dropping a perpendicular from A on π which intersects π in H we see that
d(ℓ, π) = |AH|. Therefore
\[
d(\ell, \ell') = d(\ell, \pi) = |AH| = \Big|\operatorname{Pr}^{\perp}_{v \times w}\big(\overrightarrow{AB}\big)\Big| = \left|\frac{\langle v \times w, \overrightarrow{AB}\rangle}{|v \times w|}\right|
\]
and, in coordinates,
\[
d(\ell, \ell') = \left|\frac{\langle v \times w, \overrightarrow{AB}\rangle}{|v \times w|}\right| = \frac{|a_x(x_B - x_A) + a_y(y_B - y_A) + a_z(z_B - z_A)|}{\sqrt{a_x^2 + a_y^2 + a_z^2}}.
\]
A further interpretation of this formula is the following. Let P be a parallelepiped spanned by the three vectors v, w, \overrightarrow{AB} and denote by F a face spanned by v and w. The distance between the two lines is the height of P corresponding to the face F. Moreover, anticipating the discussion in Section 5.3, we have
\[
d(\ell, \ell') = \frac{\operatorname{Vol}(P)}{\operatorname{Area}(F)} = \left|\frac{[\,v, w, \overrightarrow{AB}\,]}{|v \times w|}\right|.
\]
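A small numerical sketch (Python/NumPy) of the distance formula for skew lines derived above; the function name and the example lines are ours.

import numpy as np

def skew_line_distance(A, v, B, w):
    """d(l, l') = |<v x w, AB>| / |v x w| for the skew lines A + R v and B + R w."""
    a = np.cross(v, w)                       # direction of the common perpendicular
    AB = np.asarray(B, float) - np.asarray(A, float)
    return abs(np.dot(a, AB)) / np.linalg.norm(a)

# Example: the x-axis and the line through (0, 0, 1) with direction (0, 1, 0).
print(skew_line_distance([0, 0, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]))  # 1.0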
5.3 Volume
5.3.1 Volume of polyhedra
Definition 5.11. ‘A polyhedron may be defined as a finite, connected set of plane polygons, such
that every side of each polygon belongs also to just one other polygon, with the proviso that the
polygons surrounding each vertex form a single circuit’ [7]. Well known examples are prisms having
two congruent parallel faces, in particular triangular prisms and parallelepipeds, or pyramids having
just one point, the apex, outside the plane of the polygonal base. The height of a prism is the distance
between the two main faces, and the height of a pyramid is the distance from the apex to the base.
For our purposes, the following non-standard definition suffices. The interior of a tetrahedron
is a polyhedral set. If ABCD and ABCD′ are two tetrahedra with disjoint interiors, then the union of the interiors of the two tetrahedra together with the interior of the common triangle ABC is a polyhedral set
of tetrahedra.
Definition 5.12. Let P denote the set of all polyhedral sets in E3 . We define a volume function
Vol : P → R≥0 through the following properties:
(V1) The volume of a rectangular parallelepiped R1,1,c with two sides of length 1 and one side of length c is Vol(R1,1,c ) = c.
(V2) If S1 and S2 are similar polyhedral sets with ratio x = S1 : S2 then Vol(S1 ) = x3 · Vol(S2 ).
(V3) If S1 and S2 are disjoint polyhedral sets then Vol(S1 ∪ S2 ) = Vol(S1 ) + Vol(S2 ).
Proposition 5.13.
1. The volume of a prism of height h and base area a is ah. In particular, the volume of a parallelepiped with a face of area a and corresponding height h is ah.
2. The volume of a pyramid of height h and base area a is ah/3. In particular, the volume of a tetrahedron ABCD with Area(BCD) = a and d(A, BCD) = h is ah/3.
3. The volume of a rectangular parallelepiped Ra,b,c with side lengths a, b and c is abc.
Proof. It is clear that 3. follows from 1. However, the proof needs to start from the definitions. We
do this by first proving 3. Denote by Ra,b,c a rectangular parallelepiped with side lengths a, b and c.
By (V1) and (V2), we have
\[
\operatorname{Vol}(R_{a,a,1}) = a^3 \cdot \operatorname{Vol}(R_{1,1,\frac{1}{a}}) = a^3 \cdot \frac{1}{a} = a^2
\]
for any a > 0. Now, as in the proof of Proposition 5.3, consider Ra,a,1 and Rb,b,1 in opposite corners
of Ra+b,a+b,1 . By (V3), we have Vol(Ra+b,a+b,1 ) = Vol(Ra,a,1 ) + Vol(Rb,b,1 ) + 2 · Vol(Ra,b,1 ) hence (a + b)2 =
a2 + b2 + 2 · Vol(Ra,b,1 ) and therefore Vol(Ra,b,1 ) = ab for any a, b > 0. Then, for an arbitrary rectangular
parallelepiped we have
\[
\operatorname{Vol}(R_{a,b,c}) = c^3 \cdot \operatorname{Vol}(R_{\frac{a}{c},\frac{b}{c},1}) = c^3 \cdot \frac{ab}{c^2} = abc.
\]
Next, we use 3. to deduce the volume of an arbitrary parallelepiped. For this we section and rear-
range the parallelepiped as in the figure (this is Figure 4.1.2 from [1]) below.
In (a), (b) and (c) the parallelepiped is transformed into a prism with base a parallelogram by
cutting and pasting congruent triangular prisms. In particular, the height h and the base area a are
unchanged. In the last step (d), the base is rearranged into a rectangle without changing its area - as
in the proof of Proposition 5.3. The final result is a rectangular parallelepiped Rx,y,h with volume ah
equal to the volume of the initial parallelepiped.
Now, slicing a parallelepiped along the diagonals of two opposite faces we obtain a triangular
prism with volume ha where a is half the area of the face which was cut. Thus, a triangular prism
has the claimed volume. For an arbitrary prism, use a triangulation of the base polygon to deduce the
claim.
For 2. we first consider tetrahedra and follow the argument in [1, §4.1]. Slice a triangular prism as in the following figure. The proof of Proposition 5 of Book XII of Euclid's Elements (see for example
[10]), is correct for our settings and assumptions. It implies that two triangular pyramids (tetrahedra)
with the same height and with congruent bases have the same volume [10, Volume 3, p.390].
In the above slicing the two tetrahedra with white base have the same heights and the bottom and
top tetrahedra are clearly congruent. Thus, if T is any of the three tetrahedra of the triangular prism
P , then
\[
\operatorname{Vol}(T) = \frac{1}{3}\operatorname{Vol}(P) = \frac{ah}{3}
\]
where a is the area of a base triangle of P and h is the height of P . Then, for an arbitrary pyramid,
the claim follows by considering a tetrahedralization.
Remark. Similar to our definition of area, the above definition of volume tries to capture basic prin-
ciples of the intuitive notion of volume. It is an over-simplified version of a so-called Lebesgue
measure (see Section 5.3.3). In property (V1) we may replace the rectangular parallelepiped with
a cube of side-length 1. However, after weakening the assumption, it is not possible to recover the volume of an arbitrary rectangular parallelepiped using only the uniform scaling given in (V2). To see this difficulty, keep the
notation in the proof of Proposition 5.13 and place three cubes Ca , Cb and Cc on the diagonal of a
cube Ca+b+c .
The planes of the small cubes divide Ca+b+c into cuboids the sides of which can be checked to correspond to the algebraic formula for the cube of a + b + c. We have
\[
\operatorname{Vol}(C_{a+b+c}) = (a + b + c)^3 = a^3 + b^3 + c^3 + 3a^2b + 3a^2c + 3ab^2 + 3b^2c + 3ac^2 + 3bc^2 + 6abc. \tag{5.3}
\]
Assuming that the volume Vol(Rx,y,y ) of a cuboid with two equal adjacent sides is xy², it follows from (V2), (V3) and (5.3) that Vol(Ra,b,c ) = abc. Thus, it suffices to show that Vol(Rx,y,y ) = xy². For this, place two cubes Ca and Cb on the diagonal of Ca+b and notice that the subdivision of the big cube corresponds to
\[
(a + b)^3 = a^3 + b^3 + 3ab(a + b).
\]
From this equation and (V2), (V3) it follows that Vol(Ra,b,(a+b) ) = ab(a + b) for any a, b > 0. However,
this does not yield a decomposition of a cube into smaller cubes and cuboids congruent to Rx,y,y
for arbitrary x and y. Compare this to the two dimensional case.
Orthonormal frames offer an efficient way of calculating the volume of a parallelepiped, and
therefore volumes of tetrahedra and polyhedra provided that the coordinates of vertices are known.
The volume of a parallelepiped P spanned by the vectors a, b and c is
\[
\operatorname{Vol}(P) = |\langle a \times b, c\rangle| = \left|\;\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}\;\right|
\]
where the components of a(a1 , a2 , a3 ), b(b1 , b2 , b3 ) and c(c1 , c2 , c3 ) are with respect to an orthonormal
basis.
Proof. The area of the face spanned by a and b is |a × b| and the height corresponding to this face is |pr⊥a×b (c)|. Thus
\[
\operatorname{Vol}(P) = |a \times b| \cdot \big|\operatorname{pr}^{\perp}_{a\times b}(c)\big| = |a \times b| \cdot \left|\frac{\langle a \times b, c\rangle}{\langle a \times b, a \times b\rangle}\, a \times b\right| = |\langle a \times b, c\rangle|.
\]
Moreover,
\[
\langle a \times b, c\rangle = \Big\langle \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix},\; c_1\mathbf{i} + c_2\mathbf{j} + c_3\mathbf{k} \Big\rangle = \begin{vmatrix} c_1 & c_2 & c_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}.
\]
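The identity above is easy to check numerically; the following Python sketch (with made-up vectors) compares the scalar triple product with the determinant.

import numpy as np

a = np.array([1.0, 0.0, 0.0])
b = np.array([1.0, 2.0, 0.0])
c = np.array([0.0, 1.0, 3.0])

vol_triple = abs(np.dot(np.cross(a, b), c))          # |<a x b, c>|
vol_det = abs(np.linalg.det(np.vstack([a, b, c])))   # |det| with rows a, b, c
print(vol_triple, vol_det)                           # both equal 6.0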
In particular, the vectors a, b and c are linearly dependent if and only if [a, b, c] = 0.
The sign of the box product determines the orientation of the triple (a, b, c):
\[
[a, b, c]\;\begin{cases} > 0 & \text{then } (a, b, c) \text{ is right oriented} \\ = 0 & \text{then } (a, b, c) \text{ is not a basis} \\ < 0 & \text{then } (a, b, c) \text{ is left oriented.} \end{cases}
\]
Proof. In dimension 3, linearity of the box product follows from the fact that it is the composition of
linear maps, cross product and scalar product. You can also deduce the linearity of this map from the
fact that with respect to an orthonormal basis it is the determinant of a base change matrix. The other
properties follow from this latter description and from known properties of the determinant.
5.3.3 Hypervolume
The higher-dimensional analogues of polygons and polyhedra are polytopes. The simplest polytope in
dimension n is an n-simplex. If n = 2 these are triangles and if n = 3 these are tetrahedra. In general,
in dimension n, consider n + 1 points P0 , P1 , . . . , Pn which do not lie in a hyperplane. This is equivalent to asking for the vectors \overrightarrow{P_0P_1}, \ldots, \overrightarrow{P_0P_n} to be linearly independent, i.e. to form a basis of Vn . The convex hull of P0 , P1 , . . . , Pn is an n-simplex. The (n − 1)-faces of an n-simplex are (n − 1)-simplices,
in particular, the faces of a 4-simplex are tetrahedra. We may define polytopes as sets obtained by
gluing n-simplices along their faces.
A hyperparallelepiped spanned by v1 , . . . , vn is the set of all points obtained from a given point
P with linear combinations of the given vectors considering only coefficients in the interval [0, 1],
concretely
H = P + {α1 v1 + · · · + αn vn : α1 , . . . , αn ∈ [0, 1]} .
A hypercube is a hyperparallelepiped with all 1-faces of equal length and all angles right angles.
Definition 5.18. Let P denote the set of all polytopes in En . We define a rational hypervolume function
Vol : P → R≥0 through the following properties:
This volume function is called rational because it does not allow us to measure all polytopes. For instance, it is not possible to use it to deduce the volume of a cube whose side length is an arbitrary irrational number.
Definition 5.19. We say that a set S is measurable if it contains a sequence of strictly growing internal polytopes Ii and if it is contained in a sequence of strictly shrinking exterior polytopes Ei such that the volumes of these sets converge to a common value. Formally, the following properties need to hold:
\[
I_i \subseteq I_j \subseteq S \subseteq E_j \subseteq E_i \quad \forall i < j \qquad\text{and}\qquad \lim_{i\to\infty} \operatorname{Vol}(I_i) = \lim_{i\to\infty} \operatorname{Vol}(E_i).
\]
If the limit exists, it is unique and we denote it by Vol(S). Denoting by M the set of all measurable sets of En , we extend the (hyper)volume function to Vol : M → R≥0 . This is a version of the Lebesgue measure (see for example [18, Chapter 11]).
From this definition one may deduce that a cuboid with side lengths a1 , . . . , an has volume \prod_i a_i. Or that if P is a hyperparallelepiped spanned at a point P0 by the vectors v1 , . . . , vn , and if S is the n-simplex with vertices P0 , P0 + v1 , . . . , P0 + vn , then
\[
\operatorname{Vol}(S) = \frac{1}{n!}\operatorname{Vol}(P),
\]
which is the higher-dimensional analogue of the volume formula for a tetrahedron.
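The simplex formula is easy to test numerically; below is a short Python sketch (the helper name and the example data are ours) computing the hypervolume of an n-simplex from its vertices.

import math
import numpy as np

def simplex_volume(P0, vertices):
    """Hypervolume of the n-simplex with vertices P0, P1, ..., Pn (rows of `vertices`)."""
    V = np.array([np.asarray(Q, float) - np.asarray(P0, float) for Q in vertices])  # edge vectors at P0
    n = V.shape[0]
    return abs(np.linalg.det(V)) / math.factorial(n)

# Unit 3-cube: the simplex on a vertex and its three neighbours has volume 1/6.
print(simplex_volume([0, 0, 0], [[1, 0, 0], [0, 1, 0], [0, 0, 1]]))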
As in the case of dimension 2 and 3, the efficient way of calculating hypervolume is given by the
box product which allows us to calculate the hypervolume of hyperparallelepipeds and therefore of
n-simplices and of polytopes in general.
Definition 5.20. Let v1 , . . . , vn be vectors in Vn with components vi (vi,1 , . . . , vi,n ) relative to a right-oriented orthonormal basis B of Vn . The (n-fold) box product of these vectors is
\[
[v_1, \ldots, v_n] = \begin{vmatrix} v_{1,1} & v_{2,1} & \cdots & v_{n,1} \\ v_{1,2} & v_{2,2} & \cdots & v_{n,2} \\ \vdots & \vdots & & \vdots \\ v_{1,n} & v_{2,n} & \cdots & v_{n,n} \end{vmatrix},
\]
the determinant of the matrix whose columns are the components of v1 , . . . , vn with respect to B.
Proposition 5.21. The definition of the box product does not depend on the choice of the right
oriented orthonormal basis B.
Proof. Let V = (v1 , . . . , vn ) be an n-tuple of n-dimensional vectors and let MB,V be the matrix whose columns are the components of the vi 's. Notice that
\[
[v_1, \ldots, v_n] = \det(M_{B,V}).
\]
If the vectors are linearly dependent, then det(MB,V ) = 0 for any basis B. If the vectors are linearly independent, then V is a basis and if B′ is another right-oriented orthonormal basis then
\[
\det(M_{B',V}) = \det(M_{B',B}\,M_{B,V}) = \det(M_{B',B})\det(M_{B,V})
\]
and it suffices to show that det(MB ′ ,B ) = 1. Let ei denote the vectors in B. Then, since B ′ is orthonor-
mal, we have
\[
M^{T}_{B',B}\,M_{B',B} = \begin{bmatrix} \langle e_1, e_1\rangle & \langle e_1, e_2\rangle & \cdots & \langle e_1, e_n\rangle \\ \langle e_2, e_1\rangle & \langle e_2, e_2\rangle & \cdots & \langle e_2, e_n\rangle \\ \vdots & \vdots & & \vdots \\ \langle e_n, e_1\rangle & \langle e_n, e_2\rangle & \cdots & \langle e_n, e_n\rangle \end{bmatrix} = I_n
\]
hence
1 = det(In ) = det(MTB ′ ,B MB ′ ,B ) = det(MTB ′ ,B ) det(MB ′ ,B ) = det(MB ′ ,B )2 .
It follows that det(MB ′ ,B ) = ±1 and since the two bases are both right-oriented, we deduce that
det(MB ′ ,B ) = 1 and the claim follows.
Definition 5.22. Let B be a basis of Vn . The oriented volume of the basis B, denoted by Volor (B), is the
value of the box product of the vectors in B. The volume of the basis B is the absolute value | Volor (B)|
and we denote it by Vol(B). If we are in dimension 2, we refer to these values as the area and oriented
area of B, and denote them by Area(B) and Areaor (B) respectively.
One can also generalize the J-operator and the cross product to higher dimensions. For n − 1
vectors v1 , . . . , vn−1 , the wedge product of these vectors, denoted by v1 ∧ · · · ∧ vn−1 is the unique vector
w such that |w| is the hypervolume of the hyperparallelepiped spanned by v1 , . . . , vn−1 , the vector w
is orthogonal to each vi and (v1 , . . . , vn−1 , w) is right oriented.
CHAPTER 6
Affine maps
Contents
6.1 Properties of affine maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.1.1 Homogeneous coordinates and homogeneous matrix . . . . . . . . . . . . . . . 90
6.2 Projections and reflections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.2.1 Tensor product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.2.2 Parallel projection on a hyperplane . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.2.3 Parallel reflection in a hyperplane . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.2.4 Parallel projection on a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.2.5 Parallel reflection in a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
since
\[
\big[\operatorname{lin}(\varphi)(\overrightarrow{PQ})\big]_{B'} = \big[\overrightarrow{\varphi(P)\varphi(Q)}\big]_{B'} = (A\cdot[Q]_K + b) - (A\cdot[P]_K + b) = A\cdot\big([Q]_K - [P]_K\big) = A\cdot\big[\overrightarrow{PQ}\big]_B.
\]
We notice that both A and b depend on the choice of the frames K and K′ and moreover A is the
matrix of lin(φ) relative to the bases in K and K′ .
If n = m, then φ : An → An is called an affine endomorphism. The set of all such endomorphisms
is denoted by Endaff (An ). It is easy to see that the affine map φ is invertible if and only if the
map lin(φ) is invertible, equivalently, φ is invertible if and only if the matrix A of the linear map
lin(φ) is invertible (see proof of Proposition 6.5). An invertible affine endomorphism φ is called an
affine automorphism or affine transformation. The set of all affine transformations of An is denoted by
AGL(An )
Moreover, if in addition to n = m, O = O′ and b = 0 in (6.1) then φ can be viewed as a linear map
from Vn to Vn since it is given by multiplication with the matrix A. The set of invertible linear maps
Vn → Vn is denoted by GL(Vn ) and we have the following inclusion
GL(Vn ) ⊆ AGL(An ).
Example 6.2. A homothety φC,λ of En with center C is the map which rescales the space with a factor
λ along lines passing through the point C. With respect to a coordinate system K = (C, B) with origin
in C it has the form
[φC,λ (P )]K = λ · [P ]K .
Notice that if λ = 1 this is the identity map, if λ < 1 this is a contraction, if λ > 1 it is an expansion.
Example 6.3. The various parallel projections and reflections described in the next sections are ex-
amples of affine maps as well as isometries discussed in Chapter 7.
Proposition 6.4. Let φ : En → Em be an affine map. If a line ℓ is mapped onto a line ℓ′ under φ, then φ preserves the oriented ratio on ℓ, i.e. if A, B, C are points on ℓ with A ≠ B, then
\[
\frac{\overrightarrow{AC}}{\overrightarrow{AB}} = \frac{\overrightarrow{\varphi(A)\varphi(C)}}{\overrightarrow{\varphi(A)\varphi(B)}}. \tag{6.3}
\]
Proposition 6.5. Let φ : An → An be an affine transformation. Then
1. φ is injective;
2. φ preserves lines;
3. φ preserves oriented ratios on lines.
Conversely, any map φ : An → An with these three properties is an affine transformation.
Proof. Throughout we let φ be an affine map as in (6.1). First we notice that φ is an affine transformation, i.e. invertible, if the matrix A is invertible. Indeed, for two points P, Q we have φ(P) − φ(Q) = A·([P]K − [Q]K ), so φ is bijective whenever A is invertible. Moreover, if ℓ = {P + tv : t ∈ R} is a line, then
\[
A\cdot[P + tv]_K + b = (A\cdot[P]_K + b) + t\cdot A\cdot[v]_B,
\]
which shows that φ(ℓ) is a line passing through (A · [P ] + b) ∈ An and having direction vector A · [v]B ∈
D(An ) (which is a non-zero vector since A is invertible). The third claim follows from Proposition
6.3.
For the converse, assume that φ : An → An is a map with the three indicated properties. Since φ
preserves oriented ratios, it preserves midpoints of segments, i.e. if M is the midpoint of the segment
[AB] then φ(M) is the midpoint of the segment [φ(A)φ(B)]. Therefore, φ preserves parallelograms.
Hence φ preserves sums of any two vectors, i.e. if \overrightarrow{OA} + \overrightarrow{OC} = \overrightarrow{OB} then \overrightarrow{\varphi(O)\varphi(A)} + \overrightarrow{\varphi(O)\varphi(C)} = \overrightarrow{\varphi(O)\varphi(B)}. Moreover, since φ preserves oriented ratios, if \overrightarrow{OA} = \lambda\,\overrightarrow{OB} then \overrightarrow{\varphi(O)\varphi(A)} = \lambda\,\overrightarrow{\varphi(O)\varphi(B)}.
These two observations show that φ induces a linear map on vectors, i.e. that the map
\[
\operatorname{lin}(\varphi) : \overrightarrow{OA} \mapsto \overrightarrow{\varphi(O)\varphi(A)}
\]
is linear. Now, since φ is injective, it is easy to see that lin(φ) : D(An ) → D(An ) is injective. Moreover, it is a linear algebra fact that an injective linear map from a vector space to itself is bijective.
Therefore, lin(φ) is bijective.
Now let K = (O, B) be a frame in An where B = (\overrightarrow{OX_1}, \ldots, \overrightarrow{OX_n}). Let O′ = φ(O), Yi = φ(Xi ) and B′ = (\overrightarrow{O'Y_1}, \ldots, \overrightarrow{O'Y_n}). Since lin(φ) is bijective, B′ is a basis of D(An ) and for any point P we have
\[
\overrightarrow{O'\varphi(P)} = \operatorname{lin}(\varphi)\big(\overrightarrow{OP}\big), \qquad\text{i.e.}\qquad [\varphi(P)]_{K'} = A\cdot[P]_K,
\]
where K′ = (O′ , B′ ) and A is the matrix of lin(φ) relative to B and B′.
This shows that φ has the expression (6.1), i.e. it is an affine map, and since A is invertible, φ is a
transformation by the first paragraph of the proof.
Definition 6.6. Let K be a reference frame of An . The homogeneous coordinates of the point P (p1 , . . . , pn )
are (p1 , . . . , pn , 1). Yes, they are the ordinary coordinates with an extra 1 at the end.
Definition 6.7. The homogeneous matrix of an affine map φ : An → Am defined with respect to some
reference frames K and K′ by φ(x) = Ax + b is
"#
A b
M̂K,K (φ) = .
0 1
The utility of introducing these notions is the following: composition of affine maps corresponds
to multiplication of homogeneous matrices. Indeed, let ψ(x) = A′ x + b′ be another affine map defined on Am . Then
\[
\psi(\varphi(x)) = A'(Ax + b) + b' = A'A\,x + (A'b + b'), \qquad\text{so}\qquad \widehat{M}(\psi\circ\varphi) = \begin{bmatrix} A'A & A'b + b' \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} A' & b' \\ 0 & 1 \end{bmatrix}\begin{bmatrix} A & b \\ 0 & 1 \end{bmatrix},
\]
and the homogeneous coordinates of the values of the map φ can be obtained through a matrix multiplication as well:
" # " # " #
φ(x) A b x
= · .
1 0 1 1
Thus # " ′
A b′ A b x
" # " # " #
ψ ◦ φ(x)
= · · .
1 0 1 0 1 1
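The following Python sketch (in dimension 2, with made-up maps and our own helper name) checks that composing affine maps corresponds to multiplying their homogeneous matrices.

import numpy as np

def homogeneous(A, b):
    """Homogeneous (n+1)x(n+1) matrix of the affine map x -> A x + b."""
    n = A.shape[0]
    M = np.eye(n + 1)
    M[:n, :n] = A
    M[:n, n] = b
    return M

A, b = np.array([[0.0, -1.0], [1.0, 0.0]]), np.array([1.0, 2.0])     # phi(x) = A x + b
Ap, bp = np.array([[2.0, 0.0], [0.0, 2.0]]), np.array([0.0, -1.0])   # psi(x) = A'x + b'

x = np.array([3.0, 4.0])
lhs = homogeneous(Ap, bp) @ homogeneous(A, b) @ np.append(x, 1.0)    # psi(phi(x)) in homogeneous coordinates
rhs = Ap @ (A @ x + b) + bp
print(lhs[:2], rhs)                                                  # identical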
Given vectors v(v1 , . . . , vn ) and w(w1 , . . . , wn ), their tensor product is the n × n matrix
\[
v \otimes w = v\cdot w^{T} = \begin{bmatrix} v_1w_1 & \cdots & v_1w_n \\ \vdots & & \vdots \\ v_nw_1 & \cdots & v_nw_n \end{bmatrix},
\]
so that, identifying vectors with column matrices, (v ⊗ w)·x = ⟨w, x⟩·v for every vector x.
Figure 6.1: Projection of the point P on the line ℓ in the direction of the vector v.
Definition 6.11. Let H be a hyperplane and let v be a vector in Vn which is not parallel to H. For
any point P ∈ En there is a unique line ℓP passing through P and having v as direction vector. The
line ℓP is not parallel to H, hence, it intersects H in a unique point P ′ . We denote P ′ by PrH,v (P ) and
call it the projection of the point P on the hyperplane H parallel to v. This gives a map
PrH,v : An → An
called the projection on the hyperplane H parallel to v. With respect to a reference frame K = (O, B), suppose that the hyperplane H is given by the equation
\[
H : a_1x_1 + \cdots + a_nx_n + a_{n+1} = 0 \tag{6.4}
\]
and a line ℓP passing through the point P (p1 , . . . , pn ) and having v(v1 , . . . , vn ) as direction vector:
ℓP = {P + tv : t ∈ R}. (6.5)
Substituting the points of ℓP from (6.5) into the equation (6.4) of H and solving for t, we obtain
\[
P' = P - \frac{a_1p_1 + \cdots + a_np_n + a_{n+1}}{a_1v_1 + \cdots + a_nv_n}\,v = P - \frac{a^{T}\cdot P + a_{n+1}}{a^{T}\cdot v}\,v, \tag{6.6}
\]
where a = a(a1 , . . . , an ) and where in the second equality we use the convention that points and vectors
are identified with column matrices of their coordinates and components respectively. Hence, if we
denote by p1′ , . . . , pn′ the coordinates of the projected point PrH,v (P ) then
\[
\begin{cases} p_1' = p_1 + \mu v_1 \\ \quad\vdots \\ p_n' = p_n + \mu v_n \end{cases} \qquad\text{where } \mu = -\frac{a_1p_1 + \cdots + a_np_n + a_{n+1}}{a_1v_1 + \cdots + a_nv_n}.
\]
Rearranging this in matrix form we obtain
\[
\operatorname{Pr}_{H,v}(P) = \Big(I_n - \frac{v\cdot a^{T}}{v^{T}\cdot a}\Big)\cdot P - \frac{a_{n+1}}{v^{T}\cdot a}\,v
\]
where In is the n × n identity matrix. In particular, if B is orthonormal, the linear part of this map is
\[
M_B\big(\operatorname{lin}(\operatorname{Pr}_{H,v})\big) = I_n - \frac{v\otimes a}{\langle v, a\rangle}.
\]
Parallel projections on hyperplanes are affine maps. Obviously, they are not bijective, so PrH,v ∈ Endaff (An ) \ AGL(An ). The orthogonal projection Pr⊥H on the hyperplane H ⊆ En is the projection on H parallel to a normal vector of H, i.e.
\[
\operatorname{Pr}^{\perp}_{H} = \operatorname{Pr}_{H,v}
\]
where v is a normal vector of H.
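A brief numerical sketch (Python/NumPy, our own helper name and example data) of the projection formula (6.6) onto a hyperplane along a direction v.

import numpy as np

def pr_hyperplane(P, a, a_last, v):
    """Pr_{H,v}(P) = P - (<a, P> + a_last) / <a, v> * v  for H : <a, x> + a_last = 0."""
    P, a, v = (np.asarray(u, float) for u in (P, a, v))
    return P - (a @ P + a_last) / (a @ v) * v

# H : x + y + z - 1 = 0, projected along v = (0, 0, 1).
Q = pr_hyperplane([2.0, 3.0, 5.0], [1.0, 1.0, 1.0], -1.0, [0.0, 0.0, 1.0])
print(Q, np.isclose(Q.sum() - 1.0, 0.0))   # the image lies on H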
The reflection of P in ℓ in the direction of v is the point P ′ such that Prℓ,v (P ) = ℓ ∩ ℓP is the midpoint
of the segment [P P ′ ]. Thus, identifying points with column matrices of their coordinates relative to
K we have
\[
\frac{P + P'}{2} = \operatorname{Pr}_{\ell,v}(P) \quad\Longrightarrow\quad P' = 2\operatorname{Pr}_{\ell,v}(P) - P.
\]
Figure 6.2: Reflection of the point P in the line ℓ parallel to the vector v.
Definition 6.14. Let H be a hyperplane and let v be a vector in Vn which is not parallel to H. For
any point P ∈ An there is a unique point P ′ such that PrH,v (P ) is the midpoint of the segment [P P ′ ].
We denote P ′ by RefH,v (P ) and call it the reflection of the point P in the hyperplane H parallel to v. This
gives a map
RefH,v : An → An
called the reflection in the hyperplane H parallel to v.
We keep the notation in the previous section. In particular the hyperplane H is given by the
equation (6.4). The idea here is the same as in Example 6.13: PrH,v (P ) is the midpoint of the segment
[P P ′ ]. Thus, identifying points with column matrices of their coordinates relative to K we have
\[
\operatorname{Pr}_{H,v}(P) = \frac{P + P'}{2} \quad\Longrightarrow\quad \operatorname{Ref}_{H,v}(P) = P - 2\,\frac{a^{T}\cdot P + a_{n+1}}{v^{T}\cdot a}\,v.
\]
Rearranging this in matrix form we obtain
\[
\operatorname{Ref}_{H,v}(P) = \Big(I_n - 2\,\frac{v\cdot a^{T}}{v^{T}\cdot a}\Big)\cdot P - 2\,\frac{a_{n+1}}{v^{T}\cdot a}\,v.
\]
Parallel reflections in hyperplanes are affine maps. Obviously, they are bijective, so RefH,v ∈ AGL(An ) ⊆ Endaff (An ). The orthogonal reflection Ref⊥H in the hyperplane H ⊆ En is the reflection in H parallel to a normal vector of H, i.e.
\[
\operatorname{Ref}^{\perp}_{H} = \operatorname{Ref}_{H,v}
\]
where v is a normal vector of H.
For an arbitrary point P (xP , yP ) the line ℓP passing through P has direction space given by the equa-
tion
D(ℓP ) : x − 2y = 0
with respect to the basis B of the current coordinate system K. Thus, ℓP is described by
ℓP : (x − xP ) − 2(y − yP ) = 0.
The projection of P on ℓ in the direction of v is the point P ′ = ℓ ∩ ℓP and the corresponding Figure is
6.1. The only difference is that we describe ℓ and ℓP with different types of equations. To determine
P ′ we find the intersection by plugging in the points of ℓ in the equation of ℓP
\[
(1 - t - x_P) - 2(t - y_P) = 0 \quad\Longrightarrow\quad t = -\frac{1}{3}(x_P - 2y_P - 1).
\]
As expected, a short calculation shows that P ′ has the same expression as in Example 6.10.
Definition 6.17. Let ℓ be a line and let W be an (n − 1)-dimensional vector subspace in Vn which is
not parallel to ℓ. For any point P ∈ An there is a unique hyperplane HP passing through P and having
W as associated vector subspace. The hyperplane HP is not parallel to ℓ, hence, it intersects ℓ in a
unique point P ′ . We denote P ′ by Prℓ,W (P ) and call it the projection of the point P on the line ℓ parallel
to W. This gives a map
Prℓ,W : An → An
With respect to the reference frame K, the vector subspace W is given by a homogeneous equation
\[
W : a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0 \quad\Longleftrightarrow\quad a^{T}\cdot\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = 0 \tag{6.8}
\]
and Prℓ,W (P ) = PrHP ,v (Q) for a fixed but arbitrary point Q(q1 , . . . , qn ) ∈ ℓ. Hence, if we denote by
p1′ , . . . , pn′ the coordinates of the projected point Prℓ,W (P ) then, by (6.6),
\[
\begin{cases} p_1' = q_1 + v_1\mu \\ \quad\vdots \\ p_n' = q_n + v_n\mu \end{cases} \qquad\text{where } \mu = -\frac{a^{T}\cdot Q - a^{T}\cdot P}{a^{T}\cdot v}.
\]
Rearranging this in matrix form we obtain
\[
\operatorname{Pr}_{\ell,W}(P) = \frac{v\cdot a^{T}}{v^{T}\cdot a}\,P + \Big(I_n - \frac{v\cdot a^{T}}{v^{T}\cdot a}\Big)\,Q.
\]
Parallel projections on lines are affine maps. Obviously, they are not bijective, so Prℓ,W ∈ Endaff (An ) \ AGL(An ). The orthogonal projection Pr⊥ℓ on the line ℓ ⊆ En is the projection on ℓ parallel to the vectors orthogonal to ℓ, i.e.
\[
\operatorname{Pr}^{\perp}_{\ell} = \operatorname{Pr}_{\ell,v^{\perp}}
\]
where v is a direction vector of ℓ.
As in Section 6.2.4, the vector subspace W is given by the homogeneous equation (6.8). The idea is similar to the one used in Section 6.2.3: since Prℓ,W (P ) is the midpoint of the segment [P P ′ ], we have
Refℓ,W (P ) = 2 Prℓ,W (P ) − P
Here again we use the convention that point and vectors are identified with column matrices of their
coordinates and components respectively. Rearranging this in matrix form we obtain
\[
\operatorname{Ref}_{\ell,W}(P) = \Big(2\,\frac{v\cdot a^{T}}{v^{T}\cdot a} - I_n\Big)\,P + 2\Big(I_n - \frac{v\cdot a^{T}}{v^{T}\cdot a}\Big)\,Q,
\]
where Q(q1 , . . . , qn ) is a point on ℓ and v(v1 , . . . , vn ) is a direction vector for ℓ. In particular, if B is
orthonormal, the linear part of this map is
\[
M_B\big(\operatorname{lin}(\operatorname{Ref}_{\ell,W})\big) = 2\,\frac{v\otimes a}{\langle v, a\rangle} - I_n.
\]
Parallel reflections in lines are affine maps. Obviously, they are bijective, so
Refℓ,W ∈ AGL(An ) ⊆ Endaff (An ).
Definition 6.20. The orthogonal reflection Ref⊥ℓ in the line ℓ ⊆ En is the reflection in ℓ parallel to vectors which are orthogonal to the line ℓ, i.e.
\[
\operatorname{Ref}^{\perp}_{\ell} = \operatorname{Ref}_{\ell,v^{\perp}}
\]
where v is a direction vector of ℓ. With the above notation we see that for any point Q ∈ ℓ
\[
\operatorname{Ref}^{\perp}_{\ell}(P) = \Big(2\,\frac{a\otimes a}{|a|^{2}} - I_n\Big)\,P + 2\Big(I_n - \frac{a\otimes a}{|a|^{2}}\Big)\,Q
\]
since we may choose v = a.
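A short numerical sketch (Python/NumPy, our own helper name) of the orthogonal reflection in a line described above.

import numpy as np

def reflect_in_line(P, Q, a):
    """Orthogonal reflection of P in the line Q + R a, via (2 a(x)a/|a|^2 - I) P + 2 (I - a(x)a/|a|^2) Q."""
    P, Q, a = (np.asarray(u, float) for u in (P, Q, a))
    n = len(P)
    T = np.outer(a, a) / (a @ a)        # the matrix a (x) a / |a|^2
    return (2 * T - np.eye(n)) @ P + 2 * (np.eye(n) - T) @ Q

# Reflection in the x-axis of the plane sends (1, 2) to (1, -2).
print(reflect_in_line([1.0, 2.0], [0.0, 0.0], [1.0, 0.0]))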
CHAPTER 7
Isometries
Contents
7.1 Affine form of isometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
7.1.1 The Euclidean space En (second revision) . . . . . . . . . . . . . . . . . . . . . 102
7.2 Isometries in dimension 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
7.2.1 Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
7.2.2 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
7.3 Isometries in dimension 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.3.1 Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.3.2 Euler angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.3.3 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7.4 Moving points with isometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7.4.1 Cycloids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7.4.2 Surfaces of revolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
\[
\frac{d(A', C')}{d(A', B')} = \frac{d(A, C)}{d(A, B)}
\]
hence
\[
\frac{\overrightarrow{A'C'}}{\overrightarrow{A'B'}} = \pm\frac{d(A', C')}{d(A', B')} = \pm\frac{d(A, C)}{d(A, B)} = \pm\frac{\overrightarrow{AC}}{\overrightarrow{AB}}.
\]
But, the two ratios have the same sign since, as in the previous paragraph, C is between A and B if
and only if C ′ is between A′ and B′ .
Remark (Notation). At this point some simplifications in notation are useful. When it is clear from
the context that we are working in a frame K = (O, B), we will simply write v instead of [v]B . We
identify points with column matrices. For example, we write x to mean the column matrix with
entries (x1 , . . . , xn ). This will represent both the coordinates of a point and the components of the
position vector.
Moreover, when there is no risk of confusion, we denote the linear map induced by an affine
transformation with the same letter, i.e. we write ψ instead of lin(ψ). The arguments are either
points or vectors, and the context will make it clear which one it is.
Example 7.3. Translation maps are isometries. Let v be a vector; then the translation by the vector v is the map Tv : En → En given by Tv (x) = x + v. It is just the translation map of the affine structure
of En where we fixed the vector argument to be v.
Proposition 7.4. Let φ ∈ AGL(En ) be an affine transformation given by φ(x) = Ax + b with respect to
some orthonormal frame. The following are equivalent:
1. φ is an isometry
2. φ preserves lengths of vectors
3. φ preserves the scalar product, i.e. for any vectors v, w we have
⟨φ(v), φ(w)⟩ = ⟨v, w⟩.
Proof. The length of a vector \overrightarrow{AB} is by definition the length of the segment [AB] which is the distance
between A and B, so 1. is equivalent to 2. If φ is an isometry, it is an affine map, by Proposition 7.2.
In particular it preserves oriented ratios by Proposition 6.5. From this it follows that it preserves the
cosine of angles. Then
⟨φ(v), φ(w)⟩ = |φ(v)| · |φ(w)| · cos ∡(φ(v), φ(w)) = |v| · |w| · cos ∡(v, w) = ⟨v, w⟩.
For the converse, let v = \overrightarrow{AB} and notice that if φ preserves the scalar product then
\[
d(\varphi(A), \varphi(B)) = \big|\varphi(\overrightarrow{AB})\big| = \sqrt{\big\langle\varphi(\overrightarrow{AB}), \varphi(\overrightarrow{AB})\big\rangle} = \sqrt{\big\langle\overrightarrow{AB}, \overrightarrow{AB}\big\rangle} = \big|\overrightarrow{AB}\big| = d(A, B).
\]
This shows that 3. implies 1. and 2.
Proposition 7.5. Let φ ∈ AGL(En ) be an affine transformation given by φ(x) = Ax + b with respect to
some orthonormal frame. The following are equivalent:
1. φ is an isometry
2. A−1 = AT .
Proof. Consider ψ(x) = φ(x) − b = Ax. Since translations are isometries, φ is an isometry if and only if
ψ is an isometry. Let K = (O, B) with B = (e1 , . . . , en ) be the orthonormal frame with respect to which
φ is given in the form φ(x) = Ax + b. Let f1 = ψ(e1 ) = Ae1 , f2 = ψ(e2 ) = Ae2 , . . . , fn = ψ(en ) = Aen .
To see that 1. implies 2., notice that if ψ is an isometry, by Proposition 7.4 we have
\[
\langle f_i, f_j\rangle = \langle\psi(e_i), \psi(e_j)\rangle = \langle e_i, e_j\rangle = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases}
\]
Since the components of fi = Aei are the entries in the i-th column of the matrix A, it follows that
AT A is the identity matrix In .
To see that 2. implies 1. consider two arbitrary points x and y in En . We have |φ(x) − φ(y)| =
|(Ax + b) − (Ay + b)| = |A(x − y)| and therefore
|φ(x) − φ(y)|2 = |A(x − y)|2
= ⟨A(x − y), A(x − y)⟩
= (A(x − y))T · A(x − y)
= (x − y)T AT A(x − y)
= (x − y)T (x − y)
= |x − y|2
which is equivalent to d(φ(x), φ(y)) = d(x, y).
Definition 7.6. A matrix A ∈ Matn×n (R) such that AT A = In is called orthogonal and the set of all such
matrices is denoted by O(n).
Proposition 7.7. If A is an orthogonal matrix then det(A) = ±1.
Proof. Since AT A = In we have 1 = det(In ) = det(AT A) = det(AT ) det(A) = det(A)2 .
Definition 7.8. The set of matrices in O(n) with determinant 1 is denoted by SO(n). Such matrices
are called special orthogonal. The set O(n) is a subgroup of AGL(An ) and SO(n) is a normal subgroup
of O(n). With group theory notation we write this as follows:
SO(n) ⊴ O(n) ≤ AGL(Rn ).
Definition 7.9. Let φ be an isometry of En given by φ(x) = Ax + b with respect to some right-oriented
orthonormal frame. Then φ is called a displacement, or a direct isometry, if A ∈ SO(n). Else, if det(A) =
−1, the map φ is called an indirect isometry.
\[
\begin{cases} a^2 + c^2 = 1 \\ b^2 + d^2 = 1 \\ ab + cd = 0 \\ ad - bc = 1. \end{cases}
\]
From the first two equations it follows that there exist θ, θ̃ such that a = cos(θ), c = sin(θ), d = cos(θ̃) and b = sin(θ̃). Then, the last two equations become
\[
\begin{cases} 0 = ab + cd = \cos(\theta)\sin(\tilde\theta) + \sin(\theta)\cos(\tilde\theta) = \sin(\theta + \tilde\theta) \\ 1 = ad - bc = \cos(\theta)\cos(\tilde\theta) - \sin(\theta)\sin(\tilde\theta) = \cos(\theta + \tilde\theta), \end{cases}
\]
hence θ + θ̃ is a multiple of 2π. We may therefore take θ̃ = −θ, so that b = −sin(θ), d = cos(θ) and A = Rθ .
Corollary 7.11. A direct isometry φ of E2 that fixes a point is either the identity or a rotation. More-
over, the angle θ of the rotation is such that
\[
\cos(\theta) = \frac{\operatorname{tr}(\operatorname{lin}(\varphi))}{2}.
\]
Proof. Let φ(x) = Ax + b with respect to a right-oriented orthonormal frame. By Proposition 7.10, φ is a direct isometry if and only if A ∈ SO(2), which is equivalent to A = Rθ where Rθ is the rotation matrix (7.1).
If θ = 0 then φ is a (possibly trivial) translation by b thus, it has a fixed point only if b = 0 in which
case φ equals the identity map, i.e. φ = Id. For the rest of the proof assume that θ , 0.
Let p be a fixed point for φ. Then
\[
p = \varphi(p) = R_\theta\,p + b, \qquad\text{equivalently}\qquad (I_2 - R_\theta)\,p = b. \tag{7.2}
\]
So, if θ ≠ 0 then equation (7.2) has a unique solution p (the matrix I2 − Rθ is invertible since det(I2 − Rθ ) = 2(1 − cos θ) > 0), i.e. the fixed point of φ is unique. To finish the proof we show that φ is a rotation around the point p. We notice that
\[
\varphi(x) - p = (R_\theta\,x + b) - (R_\theta\,p + b) = R_\theta\,(x - p),
\]
which means that φ rotates \overrightarrow{px} by θ around p. Finally, the trace of the linear map lin(φ) is the trace of its matrix with respect to any orthonormal basis. Thus, we may use A = Rθ to conclude that
\[
\operatorname{tr}(\operatorname{lin}(\varphi)) = \operatorname{tr}(R_\theta) = 2\cos(\theta).
\]
Remark. Let us notice the effect of rotations on coordinates and on basis vectors in dimension 2. Let
K = (O, B) be a right-oriented orthonormal frame of E2 with B = (i, j). A rotation around the origin
with angle θ is given by the map
" # " # " # " ′#
x cos θ − sin θ x x
7→ . = ′ .
y sin θ cos θ y y
| {z }
=Rotθ
Thus, Rotθ is a base change matrix MB′,B where B′ = (i′ , j′ ) is another orthonormal basis. The components of i′ and j′ with respect to B are the columns of the matrix M^{-1}_{B',B} = M^{T}_{B',B}:
\[
\mathbf{i}' = \begin{bmatrix} \cos\theta \\ -\sin\theta \end{bmatrix}, \qquad \mathbf{j}' = \begin{bmatrix} \sin\theta \\ \cos\theta \end{bmatrix}.
\]
We notice that the vectors in B ′ are obtained by rotating the vectors in B by −θ. Indeed rotating
points counterclockwise with respect to K is equivalent to rotating K clockwise.
7.2.2 Classification
Theorem 7.12 (Chasles). A direct isometry of the plane E2 is either
a) the identity, or
b) a translation, or
c) a rotation.
Proof. Consider a direct isometry φ(x) = Ax + b given with respect to a right-oriented orthonormal
frame. By Proposition 7.10 A = Rθ where Rθ is the rotation matrix (7.1) for some θ. If θ = 0 then φ is the identity map if b = 0 and it is a translation if b ≠ 0. Suppose therefore that θ ≠ 0. Then, by the
proof of Corollary 7.11 the map φ has a (unique) fixed point p. Thus, φ is a direct isometry which
fixes a point and the proof is finished by applying Corollary 7.11.
Definition 7.13. A glide reflection in the line ℓ is the composition of an orthogonal reflection in ℓ and
a translation in the direction of ℓ.
Example 7.14. A reflection in the x-axis followed by a translation with λi is a glide reflection. In
matrix form the map is given by
" #" # " #
1 0 x1 λ
x 7→ Ref⊥
Ox (x) + λi = + .
0 −1 x2 0
Lemma 7.15. A rotation with angle θ around the origin after a reflection in the x-axis is a reflection
in the line y = tan(θ/2)x.
Proof. The mentioned map, the reflection in the x-axis followed by a rotation around the origin, has the form φ : x 7→ Ax where
\[
A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}.
\]
Notice that det(A − I2 ) = 0, thus φ fixes at least a line. Consider the vector v(cos(θ/2), sin(θ/2)). A calculation shows that (A − I2 )v = 0 which means that v is fixed by A. Hence the line passing through the
origin in the direction of v is fixed by φ. To see that it is a reflection, check that J(v) is mapped to
−J(v).
Theorem 7.16. An indirect isometry of the plane E2 fixes a line ℓ and is either
a) a reflection in ℓ, or
b) a glide-reflection in ℓ.
Proof. Let φ(x) = Ax + b be an indirect isometry given with respect to a right-oriented orthonormal
basis. Consider the reflection ψ(x) = Ref⊥ Ox (x) = Cx, which is also an indirect isometry. Since (φ ◦
ψ)(x) = ACx+b and since det(A) = det(C) = −1 we see that AC is an orthogonal matrix of determinant
1, i.e. φ ◦ ψ is a direct isometry. Then by Theorem 7.12, we have 3 cases.
If φ ◦ ψ is the identity or a translation, then AC = I2 which implies A = C −1 . Since C is the matrix of a reflection we have C 2 = I2 , hence A = C. Therefore φ is the composition of Ref⊥Ox followed by a translation with b(b1 , b2 ). Then
\[
\varphi(x) = \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}x + b_2\,\mathbf{j}}_{\psi'(x)} + b_1\,\mathbf{i}
\]
and one checks (for example with (6.7)) that ψ ′ is a reflection in the line y = b2 /2. Hence φ is the
composition of a reflection and a (possibly trivial) translation along the reflection axis. Thus, if b1 = 0
it is a reflection and if b1 , 0 it is a glide reflection.
If φ◦ψ is a rotation then AC = Rθ for some θ. In this case we have A = Rθ C. By Lemma 7.15, φ is a
reflection in a line ℓ followed by a translation and the claim follows as in the previous paragraph.
Theorem 7.17. A direct isometry φ of E3 which fixes a point is a (possibly trivial) rotation around an axis passing through that point. Moreover, the angle θ of the rotation is such that
\[
\cos(\theta) = \frac{\operatorname{tr}(\operatorname{lin}(\varphi)) - 1}{2}.
\]
Proof. By choosing the fixed point of φ to be the origin, we may assume that φ has the form φ(x) = Ax
with respect to a right-oriented orthonormal frame. Since φ is a direct isometry, we have A ∈ SO(3).
A rotation around an axis fixes the rotation axis. To see that this is the case for φ, it suffices to show
that A has an eigenvector v for the eigenvalue 1, since then
\[
\varphi(tv) = A(tv) = t\,(Av) = tv \quad\text{for all } t \in \mathbb{R},
\]
which means that φ fixes the line passing through the origin in the direction of v. Notice that
\[
\det(A - I_3) = \det\big(A(I_3 - A^{T})\big) = \det(A)\det\big((I_3 - A)^{T}\big) = \det(I_3 - A) = -\det(A - I_3),
\]
hence det(A − I3 ) = 0 and such a unit eigenvector v exists. Choose a unit vector u1 orthogonal to v and set u2 = v × u1 . In fact (u1 , u2 ) is a basis for v⊥ , the orthogonal complement to v. Since φ is an isometry it maps v⊥ to itself. In particular, restricting φ to v⊥ we have an isometry in dimension 2. Thus, with respect to
the basis B = (u1 , u2 , v) we have
\[
A = \begin{bmatrix} B & 0 \\ 0 & 1 \end{bmatrix}
\]
where B is a 2 × 2 matrix. Since B is right-oriented and φ is a direct isometry, we have det(A) = 1.
Therefore det(B) = 1. Hence φ restricts to a direct isometry on v⊥ . Since it fixes the origin, it cannot
be a translation. Therefore, by Theorem 7.12, it must be a (possibly trivial) rotation. Then, the matrix
of φ with respect to B is
\[
[\operatorname{lin}(\varphi)]_B = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
for some θ ∈ R. This is a rotation around the axis Rv. Finally, the trace of the linear map lin(φ) is the trace of the matrix [lin(φ)]B . Thus
\[
\operatorname{tr}(\operatorname{lin}(\varphi)) = 2\cos(\theta) + 1,
\]
which gives the stated formula for cos(θ).
Proposition 7.18 (Euler-Rodrigues). Let v be a unit vector and θ ∈ R. The rotation of angle θ and
axis Rv is given by
Rotv,θ (x) = cos(θ)x + sin(θ)(v × x) + (1 − cos(θ))⟨v, x⟩v. (7.3)
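The Euler–Rodrigues formula (7.3) is easy to check numerically; the following Python sketch (our own function name) compares it with the rotation matrix around the z-axis.

import numpy as np

def rotate(v, theta, x):
    """Rot_{v,theta}(x) for a unit vector v, following formula (7.3)."""
    v, x = np.asarray(v, float), np.asarray(x, float)
    return (np.cos(theta) * x
            + np.sin(theta) * np.cross(v, x)
            + (1 - np.cos(theta)) * np.dot(v, x) * v)

theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
x = np.array([1.0, 2.0, 3.0])
print(np.allclose(rotate([0, 0, 1], theta, x), Rz @ x))   # True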
Consider the composition of the following three rotations:
\[
\operatorname{Rot}_{z,\gamma}\operatorname{Rot}_{x,\beta}\operatorname{Rot}_{z,\alpha} = \begin{bmatrix} \cos\gamma\cos\alpha - \sin\gamma\cos\beta\sin\alpha & -\cos\gamma\sin\alpha - \sin\gamma\cos\beta\cos\alpha & \sin\gamma\sin\beta \\ \sin\gamma\cos\alpha + \cos\gamma\cos\beta\sin\alpha & -\sin\gamma\sin\alpha + \cos\gamma\cos\beta\cos\alpha & -\cos\gamma\sin\beta \\ \sin\beta\sin\alpha & \sin\beta\cos\alpha & \cos\beta \end{bmatrix}.
\]
You may think of the composition of these three rotations as follows: Each of the three rotations is
a base change matrix. The first rotation, Rotz,α = MK′ ,K , rotates the versor of the x-axis and that of the
y-axis by −α (and therefore rotates points with respect to K by α). The next rotation Rotx,β = MK′′ ,K′ ,
rotates the versors of the y ′ -axis and that of the z′ -axis by −β (and therefore rotates points with
respect to K′ by β). Similarly with the last rotation. The observation here is that Rotx,β is a rotation
around the current x-axis, i.e. a rotation around the first axis of the coordinate system that you
are in. If we want to point out that at each step the coordinate system is changing we may write
Rotz′′ ,γ Rotx′ ,β Rotz,α for the overall rotation.
Proposition 7.19. All rotations in dimension 3 are of this form, i.e. any matrix in SO(3) can be
written in this form for some α, β, γ ∈ R.
Definition 7.20. You may restrict the range of the values α, β, γ by α ∈ [0, 2π[, β ∈ [0, π[ and γ ∈
[0, 2π[. Then, each triple (α, β, γ) corresponds to a unique rotation Rotz,γ Rotx,β Rotz,α ∈ SO(3). The
angles α, β and γ are called Euler angles. Another way of describing rotations in E3 is via quaternions
(see Appendix K).
Remark. Euler angles give coordinates on SO(3). Not to be confused with spherical coordinates (See
Appendix D).
7.3.3 Classification
Definition 7.21. A glide-rotation (or helical displacement) is the composition of a rotation in E3 with
a translation parallel to the rotation axis.
Theorem 7.22. A direct isometry of the Euclidean space E3 is either

a) the identity, or
b) a translation, or
c) a rotation, or
d) a glide-rotation.
Proof. Let φ be the direct isometry given by φ(x) = Ax + b with respect to a right-oriented frame. We
know that A ∈ SO(3). If A = I3 , then φ is a translation or the identity. Suppose that A , I3 . Applying
Theorem 7.17 to x 7→ Ax we see that A is a rotation matrix. Let v be a direction vector of length 1 for
the rotation axis of A and decompose the vector b into its components parallel to v and orthogonal
to v:
b = b1 + b2 where b1 ∥ v and b2 ⊥ v. Define φ1 (x) = x + b1 and φ2 (x) = Ax + b2 , so that φ = φ1 ◦ φ2 .
Let π be the plane passing through the origin and orthogonal to the rotation axis. Then φ2 is an
isometry which leaves π invariant (φ2 (π) = π). By Chasles' Theorem in dimension 2 (Theorem 7.12),
the restriction of φ2 to π is a rotation around a fixed point p ∈ π. Thus, φ2 is a rotation around the
axis p + Rv. On the other hand, φ1 is a translation by b1 parallel to the rotation axis. This finishes
the proof since φ = φ1 ◦ φ2 .
Theorem 7.23. An indirect isometry of the Euclidean space E3 fixes a plane π and is either

a) a reflection in π, or

b) a glide-reflection in π, i.e. the composition of the reflection in π with a translation parallel to π, or

c) the composition of the reflection in π with a rotation around an axis orthogonal to π.

Proof. Let φ be the indirect isometry given by φ(x) = Ax + b with respect to a right-oriented frame. One
can show that a matrix A ∈ O(3) of determinant −1 admits −1 as an eigenvalue. Let v be an eigen-
vector for the eigenvalue −1, i.e. Av = −v. Notice that we also have AT v = A−1 v = −v. Calculating we
obtain
⟨v, φ(x)⟩ = ⟨v, Ax + b⟩
= ⟨v, Ax⟩ + ⟨v, b⟩
= ⟨AT v, x⟩ + ⟨v, b⟩
= ⟨−v, x⟩ + ⟨v, b⟩
Thus, the plane
\[
\pi : \langle v, x\rangle = \frac{1}{2}\langle v, b\rangle
\]
is invariant under the isometry φ. Moreover, if we choose the frame with the first two basis vectors parallel to π then
\[
A = \begin{bmatrix} B & 0 \\ 0 & -1 \end{bmatrix}
\]
and det(B) = −det(A) = 1. Thus, the restriction of φ to the plane π is a direct isometry. Therefore, by
Theorem 7.12, it is either the identity, or a translation or a rotation, which correspond to the three
cases stated in the theorem.
at some cycloids. For this, let us first deduce the homogeneous matrix of a rotation around a point
C(c1 , c2 ) ∈ E2 .
\[
\begin{bmatrix} 1 & 0 & c_1 \\ 0 & 1 & c_2 \\ 0 & 0 & 1 \end{bmatrix}\cdot\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}\cdot\begin{bmatrix} 1 & 0 & -c_1 \\ 0 & 1 & -c_2 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & -c_1\cos\theta + c_2\sin\theta + c_1 \\ \sin\theta & \cos\theta & -c_1\sin\theta - c_2\cos\theta + c_2 \\ 0 & 0 & 1 \end{bmatrix}
\]
Now, if you choose the center C to be the point (0, 1), you have the following homogeneous rotation matrix:
\[
\begin{bmatrix} \cos\theta & -\sin\theta & \sin\theta \\ \sin\theta & \cos\theta & -\cos\theta + 1 \\ 0 & 0 & 1 \end{bmatrix}
\]
If you move the origin with this rotation you obtain
\[
\begin{bmatrix} \cos\theta & -\sin\theta & \sin\theta \\ \sin\theta & \cos\theta & -\cos\theta + 1 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} \sin\theta \\ -\cos\theta + 1 \\ 1 \end{bmatrix}
\]
since the homogeneous coordinates of the origin are (0, 0, 1). If you ‘vary θ with time t’ you are
rotating the origin around C counterclockwise. If you want to have a clockwise motion you just
change the sign of the angle to obtain
\[
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \mapsto \begin{bmatrix} -\sin t \\ -\cos t + 1 \\ 1 \end{bmatrix}.
\]
Now, if at the same time t you translate the point along the x-axis in the direction of i, you get
\[
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \mapsto \begin{bmatrix} 1 & 0 & t \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} -\sin t \\ -\cos t + 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -\sin t + t \\ -\cos t + 1 \\ 1 \end{bmatrix}.
\]
What you obtain is the trajectory of the point O as it rotates on the circle of radius 1 centered at C(0, 1) while this circle is moving like a wheel on the x-axis. The corresponding curve is called a cycloid.
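The following Python sketch samples points of this cycloid exactly as above, by rotating clockwise (angle −t) around C(0, 1) and translating by t; the function name is ours.

import numpy as np

def cycloid_point(t):
    rot = np.array([[np.cos(-t), -np.sin(-t), np.sin(-t)],
                    [np.sin(-t),  np.cos(-t), 1 - np.cos(-t)],
                    [0, 0, 1]])                          # homogeneous rotation around (0, 1) with angle -t
    trans = np.array([[1, 0, t], [0, 1, 0], [0, 0, 1]])  # translation by t along the x-axis
    return (trans @ rot @ np.array([0.0, 0.0, 1.0]))[:2]

ts = np.linspace(0, 2 * np.pi, 5)
print(np.round([cycloid_point(t) for t in ts], 3))       # matches (t - sin t, 1 - cos t)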
Let’s do something else. Instead of rotating the circle on the x-axis let us rotate it on a bigger circle centered at the origin. We can think of the small circle as the trajectory of the point P (3, 0) rotated around the center C(4, 0). The corresponding rotation matrix is
\[
\begin{bmatrix} \cos\theta & -\sin\theta & -x_C\cos\theta + y_C\sin\theta + x_C \\ \sin\theta & \cos\theta & -x_C\sin\theta - y_C\cos\theta + y_C \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & -4\cos\theta + 4 \\ \sin\theta & \cos\theta & -4\sin\theta \\ 0 & 0 & 1 \end{bmatrix}
\]
Applying this rotation, with θ = t, to the point P we obtain
\[
\begin{bmatrix} \cos t & -\sin t & -4\cos t + 4 \\ \sin t & \cos t & -4\sin t \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 3 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3\cos t - 4\cos t + 4 \\ 3\sin t - 4\sin t \\ 1 \end{bmatrix}.
\]
If at the same time we rotate around the origin with an angle t′, P will move on the small circle, which itself rotates around a big circle of radius 3 centered at the origin. If we do this simultaneously, i.e. if we choose t′ = t, then we obtain the following trajectory for P :
However, if we want the small circle to rotate like a wheel on the big circle, then, after an entire
revolution of the small circle we need to have traversed the length of this circle on the big circle, i.e.
2π. But the big circle is 3 times longer, so we need to choose t = 3t ′ , i.e. the rotation on the small
circle is 3-times faster:
If in this expression you fix γ = 0 and let α ∈ [0, 2π[, β ∈ [0, π[ vary, you obtain
\[
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \mapsto \begin{bmatrix} \cos\alpha \\ \cos\beta\sin\alpha \\ \sin\beta\sin\alpha \end{bmatrix},
\]
i.e. the trajectory of the point (1, 0, 0) is a sphere and varying α and β corresponds to the map
\[
[0, 2\pi[\ \times\ [0, \pi[\ \ni\ (\alpha, \beta) \mapsto \begin{bmatrix} \cos\alpha \\ \cos\beta\sin\alpha \\ \sin\beta\sin\alpha \end{bmatrix}, \tag{7.4}
\]
which is a parametrization of the sphere. You can get other parametrizations of the sphere if you fix
α and let β and γ vary.
From a different perspective, notice that a plane in E3 can be described as the set of points which
you touch if you translate a line in a given direction. This can be seen with the parametric equations:
\[
\pi : \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x_Q \\ y_Q \\ z_Q \end{bmatrix} + s\begin{bmatrix} v_x \\ v_y \\ v_z \end{bmatrix} + t\begin{bmatrix} w_x \\ w_y \\ w_z \end{bmatrix} = \underbrace{\begin{bmatrix} x_Q \\ y_Q \\ z_Q \end{bmatrix} + s\begin{bmatrix} v_x \\ v_y \\ v_z \end{bmatrix}}_{\text{line } \ell} + \underbrace{t\begin{bmatrix} w_x \\ w_y \\ w_z \end{bmatrix}}_{\text{translation with } tw} \tag{7.5}
\]
This is a general method of constructing surfaces starting from curves: you start with a curve in E3
and apply a motion to it. What you obtain, if non-degenerate, is a parametrization of a surface. This
can be exemplified with the parametrization of the sphere in (7.4) which you can rewrite as follows
\[
(\alpha, \beta) \mapsto \underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\beta & -\sin\beta \\ 0 & \sin\beta & \cos\beta \end{bmatrix}}_{\text{rotation } \operatorname{Rot}_{x,\beta}}\underbrace{\begin{bmatrix} \cos\alpha \\ \sin\alpha \\ 0 \end{bmatrix}}_{\text{circle in the } Oxy\text{-plane}} = \begin{bmatrix} \cos\alpha \\ \cos\beta\sin\alpha \\ \sin\beta\sin\alpha \end{bmatrix}.
\]
This describes the unit sphere centered at the origin as the set of points which you touch with the
unit circle centered at the origin in the Oxy-plane if you rotate the circle around the x-axis.
Definition 7.24. A surface of revolution in E3 is a surface obtained by rotating a curve around a line
ℓ. The line ℓ is called the axis of the surface.
Example 7.25. In (7.5), instead of translating the line ℓ we can rotate it around a line which is parallel
to ℓ. In this way we obtain a cylinder. For example, if
\[
\ell : \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + s\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\]
and if we rotate around the z-axis we obtain a parametrization of a cylinder of radius 1 and axis the
z-axis:
\[
(\theta, s) \mapsto \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}\left(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + s\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} \cos\theta \\ \sin\theta \\ s \end{bmatrix}
\]
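Sampling points of a surface of revolution follows the same pattern; here is a short Python sketch (our own helper name) that rotates a point of the profile curve around the z-axis, reproducing the cylinder parametrization above.

import numpy as np

def revolve(curve_point, theta):
    """Rotate a point (x, y, z) of the profile curve around the z-axis by theta."""
    R = np.array([[np.cos(theta), -np.sin(theta), 0],
                  [np.sin(theta),  np.cos(theta), 0],
                  [0, 0, 1]])
    return R @ np.asarray(curve_point, float)

# Profile line of the cylinder: (1, 0, s); its revolution gives (cos theta, sin theta, s).
print(revolve([1.0, 0.0, 2.5], np.pi / 3))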
Example 7.26. If instead we consider two skew lines and rotate one around the other, we obtain a hyperboloid of one sheet.
CHAPTER 8
Contents
8.1 Smooth curves in En . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
8.1.1 Kinematic versus geometric properties . . . . . . . . . . . . . . . . . . . . . . . 119
8.1.2 Area enclosed by a simple planar loop . . . . . . . . . . . . . . . . . . . . . . . 121
8.2 Smooth hypersurfaces in En . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8.2.1 Level sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8.2.2 Local parametrization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.2.3 Volume enclosed by a simple surface of revolution . . . . . . . . . . . . . . . . 126
Example 8.2. The cardioid is a curve obtained as the trajectory of a point on a circle that is rolling
around a fixed circle of the same radius. Let a be the radius of the two circles. As in Section ?? we may
deduce a parametrization of such a curve using rotations. For this, let the fixed circle be centered at
the origin, let the rolling circle have initial center (2a, 0) and consider the trajectory of the contact
point (a, 0) of the two circles in the initial position. We obtain
" # " #
a a + 2a(1 − cos(t)) cos(t))
γ(t) = Rott ◦ T(2a,0) ◦ Rott ◦T−(2a,0) · =
|{z} | {z } 0 2a(1 − cos(t)) sin(t))
rotation around the fixed circle rotation of the moving circle
Notice that if we translate the reference frame to the right by (a, 0) the parametrization changes to
" #
2a(1 − cos(t)) cos(t)
γ(t) = (8.1)
2a(1 − cos(t)) sin(t)
since the coordinates are modified by (−a, 0). We will take this to be the parametrization of our
cardioid.
Example 8.3. The Archimedean spiral is defined as the trajectory of a point obtained by a rotation
with time t around a point O, followed by a homothety with factor |t| centered at the same point O.
More precisely, if we choose the center O to be the origin then a parametrization of this curve is given
by
\[
\gamma(t) = \begin{bmatrix} t\cos t \\ t\sin t \end{bmatrix}. \tag{8.2}
\]
The associated conical Archimedean spiral is obtained by simultaneously translating with time t in a direction orthogonal to the plane of the spiral.
Figure 8.2: Archimedean spiral and associated conical spiral (Image source: Wikipedia)
Definition 8.4. Let C be a piecewise smooth curve and P a point on C. A line ℓ is called tangent at P
to C if the points on the curve are arbitrarily close to ℓ in a small enough neighbourhood of P . More
concretely, if for any ε > 0 there exists δ > 0 such that
for all Q ∈ C ∩ B(P , δ) we have d(Q, ℓ) < ε
where B(P , δ) is the open ball centered at P and of radius δ. If the line ℓ is tangent at the point P to
the curve C then it is called the tangent line to C at P and it is denoted by
TP C.
Proposition 8.5. Let C be a smooth curve with parametrization γ : I → En . Consider the point
P = γ(t0 ) for some t0 ∈ I. Then the velocity vector at γ(t0 ),
\[
\dot\gamma(t_0) = \begin{bmatrix} \frac{d\gamma_1}{dt}(t_0) \\ \vdots \\ \frac{d\gamma_n}{dt}(t_0) \end{bmatrix},
\]
is a direction vector for TP C.
Example 8.6. For the cardioid parametrized with Equation (8.1), one calculates that for t0 = 2π/3 we have γ̇(t0 ) = (−2a√3, 0).
Definition 8.7. The tangent space to the smooth curve C is the 1-dimensional vector subspace gener-
ated by the velocity vector
TP C := ⟨γ̇(t0 )⟩
for any parametrization γ of C. It is the direction space of the tangent line at P , i.e.
TP C = D(TP C).
are kinematic objects. Therefore, the velocity Vγ (t) = |γ̇(t)| and the acceleration Aγ (t) = |γ̈(t)| are
kinematic quantities.
Definition 8.8. The arc length of a parametrized curve γ : I → Rn is the integral of its velocity
\[
\ell(\gamma) = \int_I V_\gamma(t)\,dt.
\]
Example 8.9. Consider the cardioid in Example 8.2. We have
" # " #
(1 − cos(t)) cos(t) sin(2t) − sin(t)
γ(t) = 2a and therefore γ̇(t) = 2a .
(1 − cos(t)) sin(t) cos(t) − cos(2t)
Thus
Vγ (t)2 = 4a2 (sin(2t) − sin(t))2 + (cos(t) − cos(2t))2
= 4a2 2 − 2 sin(2t) sin(t) − 2 cos(t) cos(2t)
= 4a2 2 − 4 cos(t) sin(t)2 − 2 cos(t)(cos(t)2 − sin(t)2 )
= 4a2 2 − 2 cos(t) sin(t)2 − 2 cos(t)3
= 4a2 2 − 2 cos(t)
t t
= 16a2 sin( )2 therefore Vγ (t) = 4a sin( )
2 2
t
for I = [0, 2π] since 2 ∈ [0, π]. Then
2π
t 2π
Z
t
ℓ(γ) = 4a sin( )dt = −8a cos( ) = 16a.
0 2 2 0
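A quick numerical sanity check of this computation (Python/NumPy, approximating the arc length by summing small segment lengths for a = 1):

import numpy as np

a = 1.0
t = np.linspace(0.0, 2.0 * np.pi, 200001)
x = 2 * a * (1 - np.cos(t)) * np.cos(t)
y = 2 * a * (1 - np.cos(t)) * np.sin(t)
# approximate the integral of the speed by summing the lengths of small segments
length = np.sum(np.hypot(np.diff(x), np.diff(y)))
print(length)   # approximately 16.0 = 16a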
Proposition 8.10. The arc length of a parametrized curve is a geometric quantity.
Definition 8.11. The vector field of normalized velocity vectors of a parametrized curve γ : I → Rn is
denoted by
\[
T(\gamma, t) := \frac{\dot\gamma(t)}{V_\gamma(t)}.
\]
Example 8.12. Continuing with Example 8.9, for the cardioid γ : [0, 2π] → R2 we have
" #
1 sin(2t) − sin(t)
T(γ, t) = .
2 sin( 2t ) cos(t) − cos(2t)
Proposition 8.13. Up to sign, the vector field T(γ, t) is a geometric object attached to the curve
C = γ(I).
Proposition 8.15. Any smooth curve admits a natural parametrization, i.e., it can be reparametrized
by arc length.
Proposition 8.18. Up to sign, the vector field K(γ, t) is a geometric object attached to the curve
C = γ(I). In particular, the curvature κ(γ, t) is a geometric quantity.
Proposition 8.19. A parametrized curve γ : [a, b] → Rn has zero curvature if and only if it is a line
segment.
A Jordan curve divides the plane into an interior and an exterior. In some cases, it is possible
to compute the area of the interior. One may try to approximate the interior with a polygon (the
area of which can be calculated by subdividing into triangles), while the area of the remaining pieces may be calculated by choosing the frame such that the corresponding portion of γ is described by a function along the sides of the polygon, which can then be integrated using standard calculus techniques. While this approach is theoretically straightforward, it is impractical in most cases. Instead, one may approximate the interior using polygons with a large number of
vertices.
Example 8.21. One may think of a parametrization γ : [a, b] → E2 as a curved segment. There are
many curved segments between two points. Choosing two such segments that intersect only at their
endpoints, we obtain a simple loop.
If the two curved segments can be described as graphs of functions – i.e., γ1 (t) = (t, f (t)) for some
function f : [a, b] → R and similarly for the second parametrized curved segment γ2 – then the area
enclosed between them can be computed by ordinary integration.
Example 8.22. Consider again the case of the cardioid with parametrization γ : I → R2 given in (8.1).
From the symmetry of the curve, we see that the area of the region enclosed by the cardioid is twice
the area above the x-axis. This corresponds to t ∈ [0, π].
From Example 8.6, we see that the x-values of γ are a function of y for t ∈ [0, 2π/3] and t ∈ [2π/3, π] respectively. Moreover, γ(2π/3) = a(−3/2, 3√3/2). We may therefore translate the current frame with the vector (−3a/2, 0). In this frame we have
\[
\gamma(t) = \begin{bmatrix} 2a(1 - \cos t)\cos t + \frac{3a}{2} \\ 2a(1 - \cos t)\sin t \end{bmatrix}.
\]
Thus
\[
\operatorname{Area}_{int}(C) = 2\int_0^{2\pi/3} x(y)\,dy - 2\int_{2\pi/3}^{\pi} x(y)\,dy
\]
and therefore
Areaint (C) = 6πa2 .
The above calculation is tedious. In many cases where a curve is defined by rotations, it is much
easier to integrate using polar coordinates (Appendix D). Notice that Equation (8.1) easily translates
into polar coordinates:
\[
\begin{cases} x(t) = 2a(1 - \cos t)\cos t \\ y(t) = 2a(1 - \cos t)\sin t \end{cases} \quad\Longleftrightarrow\quad r(t) = 2a(1 - \cos t), \quad t \in [0, 2\pi[
\]
where r = \sqrt{x^2 + y^2}. Then, with Theorem 8.23 below, we calculate
\begin{align*}
\operatorname{Area}_{int}(C) &= 2a^2\int_0^{2\pi}(1 - \cos t)^2\,dt \\
&= 2a^2\int_0^{2\pi}\big(1 - 2\cos t + \cos^2 t\big)\,dt \\
&= 2a^2\int_0^{2\pi}\Big(\frac{3}{2} - 2\cos t + \frac{1}{2}\cos 2t\Big)\,dt \\
&= 2a^2\Big(\frac{3}{2}t - 2\sin t + \frac{1}{4}\sin 2t\Big)\Big|_0^{2\pi} \\
&= 6\pi a^2.
\end{align*}
Theorem 8.23. Let γ be a parametrized curve, given in polar coordinates (r, θ) by the equation r = f (θ). Then
\[
\operatorname{Area}(\gamma, \theta_1, \theta_2) = \frac{1}{2}\int_{\theta_1}^{\theta_2} f(\theta)^2\,d\theta.
\]
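The polar area formula is also easy to verify numerically; the following Python sketch integrates f(θ)² for the cardioid r = 2a(1 − cos θ) with a = 1 and compares with 6πa².

import numpy as np

a = 1.0
theta = np.linspace(0.0, 2.0 * np.pi, 100001)
f = 2 * a * (1 - np.cos(theta))
area = 0.5 * np.trapz(f**2, theta)   # (1/2) integral of f(theta)^2
print(area, 6 * np.pi * a**2)        # the two values agree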
Given a function f : En → R and a constant c ∈ R, consider the zero set and the level sets of f :
\[
Z(f) = \{P \in E^n : f(P) = 0\}, \qquad L(f, c) = \{P \in E^n : f(P) = c\}.
\]
Example 8.25. In En , the function f (x1 , . . . , xn ) = a1 x1 + · · · + an xn (where not all coefficients ai are zero) describes a family of parallel hyperplanes. Indeed, the level sets
\[
L(f, c) : a_1x_1 + \cdots + a_nx_n = c
\]
are hyperplanes with normal vector n(a1 , . . . , an ), thus all are parallel to Z(f ), which passes through
the origin.
This example generalizes to affine maps. If φ : An → Am is an affine map given by φ(x) = Ax + b
then Z(φ) is an affine subspace of An defined by the system Ax + b = 0 and similarly for any c ∈ Am ,
the level set L(φ, c) is the affine subspace of An defined by the system Ax = c − b (again a family of
parallel subspaces, since the corresponding homogeneous system is the same).
Definition 8.26. While in the linear case (Example 8.25) level sets always define hypersurfaces (hy-
perplanes), this is no longer true in general (as can be seen in Example 8.24). In Example 8.24 only
the level sets L(f , c) with c > 0 are considered surfaces. To avoid empty sets, e.g. L(f , −1), we may sim-
ply ask that the level set is non-empty. (However, notice that if we work over the complex numbers
C, then L(f, −1) is non-empty; it is a so-called complex sphere.) The second degenerate case, where
the level set is a point, i.e. L(f , 0), can be avoided by asking that the Jacobian J(f ) = (∂x1 f , . . . , ∂xn f ) is
non-zero.
We say that a smooth hypersurface is the set of solutions to an equation of the form
L(f , c) : f (x) = c
where f : Rn → R is a smooth function such that there exists p ∈ Rn with f (p) = c and such that
J(f )(p) is non-zero.
Example 8.27. Consider again the case of the cardioid C given via the parametrization (8.1). One can show that it satisfies the equation
(x² + y² + 2ax)² = 4a²(x² + y²),
i.e. C = Z(f) for f(x, y) = (x² + y² + 2ax)² − 4a²(x² + y²). Various level sets of the function f are visible in the following image.
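The implicit equation in Example 8.27 can be checked numerically from the parametrization (8.1); the sketch below uses a = 1 as a sample value and the equation as written above (derived from (8.1), not quoted).

import numpy as np

a = 1.0
t = np.linspace(0.0, 2*np.pi, 1000, endpoint=False)
x = 2*a*(1 - np.cos(t))*np.cos(t)
y = 2*a*(1 - np.cos(t))*np.sin(t)
residual = (x**2 + y**2 + 2*a*x)**2 - 4*a**2*(x**2 + y**2)
print(np.max(np.abs(residual)))                  # numerically zero: every point of the cardioid satisfies f = 0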
Example 8.28. In 3-dimensional space, consider the function f (x, y, z) = x2 + y 2 − z2 . Then the level
set L(f , 0) is a cone, L(f , c) is a hyperboloid of two sheets if c < 0, and a hyperboloid of one sheet if
c > 0.
If the Jacobian J(f)(p) is non-zero, then there exists a neighbourhood U of p, an open subset V ⊆ Rn−1 and a smooth injective map γ : V → Rn such that γ(V) = U ∩ L(f, c), i.e. the level set is locally parametrized around p.
Example 8.31. Consider a circle of radius r in the Oyz-plane centered at (0, R, 0). A torus T is
obtained by rotating (revolving) this circle around the z-axis. Then r is the minor radius and R the
major radius of the torus. From this description it is not difficult to obtain a parametrization. Such a torus has volume 2π²Rr². Moreover, one may show that T satisfies the implicit equation (√(x² + y²) − R)² + z² = r².
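A minimal numerical sketch of Example 8.31: the parametrization below is a standard choice obtained by revolving the circle (it is an assumption, not quoted from the text); it satisfies the implicit equation above, and the volume formula gives a concrete number for sample radii.

import numpy as np

R, r = 3.0, 1.0                                   # sample major and minor radii (R > r > 0)
u, v = np.meshgrid(np.linspace(0, 2*np.pi, 200), np.linspace(0, 2*np.pi, 200))
x = (R + r*np.cos(v))*np.cos(u)                   # revolve the circle of radius r centred at (0, R, 0)
y = (R + r*np.cos(v))*np.sin(u)
z = r*np.sin(v)
residual = (np.sqrt(x**2 + y**2) - R)**2 + z**2 - r**2
print(np.max(np.abs(residual)))                   # numerically zero
print(2*np.pi**2*R*r**2)                          # the volume 2*pi^2*R*r^2, about 59.22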
CHAPTER 9
Contents
9.1 Ellipse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
9.1.1 Geometric description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
9.1.2 Canonical equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
9.1.3 Parametric equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
9.1.4 Relative position of a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
9.1.5 Tangent line at a given point - algebraic . . . . . . . . . . . . . . . . . . . . . . 132
9.1.6 Tangent line at a given point - via gradients . . . . . . . . . . . . . . . . . . . . 133
9.1.7 Reflective properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
9.2 Hyperbola . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
9.2.1 Geometric description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
9.2.2 Canonical equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
9.2.3 Parametric equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
9.2.4 Relative position of a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
9.2.5 Tangent line at a given point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
9.2.6 Reflective properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
9.3 Parabola . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
9.3.1 Geometric description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
9.3.2 Canonical equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
9.3.3 Parametric equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
9.3.4 Relative position of a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
9.3.5 Tangent line at a given point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
9.3.6 Reflective properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Definition 9.1. A quadratic curve (or conic) in E2 is a curve defined by a quadratic equation
ax2 + bxy + cy 2 + dx + ey + f = 0
where a, b, c, d, e, f ∈ R.
9.1 Ellipse
9.1.1 Geometric description
Definition 9.2. An ellipse is the locus of points in E2 for which the sum of the distances from two
fixed points, called focal points (or foci), is constant.
Proposition 9.3. Let F1 and F2 be two points in E2 and let a > 0 be a real number. Choose the coordinate system Oxy = (O, i, j) such that F1 and F2 are on the x-axis, such that the vector →F2F1 has the same direction as i and such that O is the midpoint of [F1F2]. With these choices, the ellipse with focal points F1 and F2 for which the sum of distances from the focal points is 2a has the equation
Ea,b : x²/a² + y²/b² = 1   (9.1)
for some number b > 0. We denote this ellipse by Ea,b and call a the semi-major axis and b the semi-minor axis.
• Equation (9.1) is called the canonical equation of the ellipse Ea,b . Clearly, with respect to some
other coordinate system, the same ellipse will have a different equation.
• The intersections of Ea,b with the coordinate axes are the points (±a, 0) and (0, ±b).
• The canonical equation shows that M(xM , yM ) ∈ Ea,b if and only if (±xM , ±yM ) ∈ Ea,b .
Solving Equation (9.1) for y we obtain
y(x) = ±(b/a)√(a² − x²).
This gives a partial parametrization of Ea,b. For the ‘northern part’ we have the parametrization φ : [−a, a] → E2 given by φ(x) = (x, y(x)) = (x, (b/a)√(a² − x²)).
Thus, we can use the known methods to verify the monotonicity and the convexity of y(x), which
describes this part of the ellipse.
A second way of parametrizing the ellipse Ea,b is with ψ : [0, 2π[ → E2, ψ(t) = (a cos(t), b sin(t)). It is easy to check using equation (9.1) that this is a parametrization. Moreover, with this parametrization, we may view the ellipse as the orbit of a rotation followed by a dilation along the coordinate axes:
t ↦ [a 0; 0 b] · [cos(t) −sin(t); sin(t) cos(t)] · [1; 0].
We call such a transformation elliptical motion.
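The elliptical motion can be checked directly: applying the rotation and then the dilation to the point (1, 0) gives (a cos(t), b sin(t)), which satisfies (9.1). A small sketch with sample values for a, b and t:

import numpy as np

a, b, t = 3.0, 2.0, 0.7
S = np.array([[a, 0.0], [0.0, b]])                               # dilation along the coordinate axes
R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])  # rotation of angle t
p = S @ R @ np.array([1.0, 0.0])                                 # elliptical motion applied to (1, 0)
print(p)                                                         # (a*cos(t), b*sin(t))
print(p[0]**2/a**2 + p[1]**2/b**2)                               # 1.0, so the point lies on E_{a,b}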
Consider the canonical equation (9.1) of the ellipse Ea,b . Let ℓ be a line with equation y = kx + m.
The intersection of the two objects is the set of points whose coordinates are solutions to the system
{ x²/a² + y²/b² = 1,  y = kx + m }   ⇔   { x²/a² + (kx + m)²/b² = 1,  y = kx + m }.
The solutions to this system are (x, y) = (x, kx + m) where x is a solution to the first equation. Let us now discuss that equation after substituting y = kx + m:
(b² + a²k²)x² + 2kma²x + a²(m² − b²) = 0.
This is a quadratic equation in x since a, b, k, m are fixed. The discriminant of this equation is
∆ = 4k²m²a⁴ − 4a²(m² − b²)(b² + a²k²) = 4a²b²(a²k² + b² − m²).
So, the number of real solutions is controlled by a²k² + b² − m²:
• −√(a²k² + b²) < m < √(a²k² + b²), in which case ℓ intersects Ea,b in two distinct points.
• m = ±√(a²k² + b²), in which case ℓ intersects Ea,b in a unique point. Such a point is a double intersection point because it is obtained as a double solution to the algebraic equation. For these two values of m, the line ℓ : y = kx + m is tangent to the ellipse. Therefore, if a slope k is given, there are two tangent lines to the ellipse with this slope:
y = kx ± √(a²k² + b²).
• m < −√(a²k² + b²) or m > √(a²k² + b²), in which case there is no intersection point between ℓ and Ea,b.
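The tangency criterion is easy to test numerically: for m = √(a²k² + b²) the quadratic obtained by substituting y = kx + m into (9.1) has vanishing discriminant, and the double root gives a point of Ea,b. A small sketch with sample values:

import numpy as np

a, b, k = 3.0, 2.0, 0.5
m = np.sqrt(a**2*k**2 + b**2)          # intercept of one of the two tangent lines of slope k
A = 1/a**2 + k**2/b**2                 # substituting y = kx + m into (9.1) gives A x^2 + B x + C = 0
B = 2*k*m/b**2
C = m**2/b**2 - 1
print(B**2 - 4*A*C)                    # numerically zero: a double intersection point
x0 = -B/(2*A)
y0 = k*x0 + m
print(x0**2/a**2 + y0**2/b**2)         # 1.0: the tangency point lies on E_{a,b}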
Now let (x0, y0) ∈ Ea,b and let ℓ be the line through (x0, y0) with direction vector v = (vx, vy), i.e. ℓ : (x, y) = (x0 + tvx, y0 + tvy). Substituting into (9.1) and using that (x0, y0) ∈ Ea,b, we obtain
(vx²/a² + vy²/b²) t² + 2 (x0 vx/a² + y0 vy/b²) t = 0.
In order for ℓ to be tangent to Ea,b, there needs to be a unique solution t to the above equation. Since t = 0 is obviously a solution, this needs to be the only solution. In other words, t = 0 should be a double solution. For this to happen we must have
x0 vx/a² + y0 vy/b² = 0   ⇔   ⟨n, v⟩ = 0
where n = (x0/a², y0/b²). Thus, ℓ is tangent to the ellipse if and only if the vector n is orthogonal to ℓ, i.e. if and only if n is a normal vector for ℓ. It follows that ℓ is tangent to Ea,b in the point (x0, y0) ∈ Ea,b if and only if it satisfies the Cartesian equation:
ℓ : (x0/a²)(x − x0) + (y0/b²)(y − y0) = 0.
We call this line the tangent line to Ea,b at the point (x0, y0) ∈ Ea,b and denote it by T(x0,y0)Ea,b. Rearranging the above equation we see that:
T(x0,y0)Ea,b : x0 x/a² + y0 y/b² = 1.   (9.2)
Consider the function
ψ : E2 → R defined by ψ(x, y) = x²/a² + y²/b²
and notice that Ea,b = ψ⁻¹(1). The gradient at a point (x0, y0) ∈ Ea,b is
∇(x0,y0)(ψ) = (2x/a², 2y/b²)|(x0,y0) = (2x0/a², 2y0/b²).
By using a parametrization φ : I → E2 of Ea,b , and applying the chain rule to ∂t ψ(φ(t)), one shows
that ∇(x0 ,y0 ) (ψ) is orthogonal to the tangent vectors at the point (x0 , y0 ). In other words, ∇(x0 ,y0 ) (ψ) is a
normal vector at the point (x0 , y0 ) ∈ Ea,b . This gives a different way of obtaining the equation (9.2).
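The chain-rule argument can be reproduced symbolically: the gradient of ψ evaluated along the parametrization (a cos(t), b sin(t)) used above is orthogonal to the tangent vector. A short sympy sketch:

import sympy as sp

a, b, t = sp.symbols('a b t', positive=True)
x, y = sp.symbols('x y')
psi = x**2/a**2 + y**2/b**2
grad = sp.Matrix([sp.diff(psi, x), sp.diff(psi, y)])
phi = sp.Matrix([a*sp.cos(t), b*sp.sin(t)])        # a parametrization of E_{a,b}
tangent = sp.diff(phi, t)
grad_on_curve = grad.subs({x: phi[0], y: phi[1]})
print(sp.simplify(grad_on_curve.dot(tangent)))     # 0: the gradient is a normal vector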
9.2 Hyperbola
9.2.1 Geometric description
Definition 9.4. A hyperbola is the locus of points in E2 for which the difference of the distances from
two given points, the focal points, is constant.
• Equation (9.3) is called the canonical equation of the hyperbola Ha,b . Clearly, the same hyperbola
may be represented by a different equation in another frame.
• If 2c denotes the distance between F1 and F2 , then b2 = c2 − a2 .
• The intersections of Ha,b with the coordinate axes are the points (±a, 0).
• The numerical quantity
ε = c/a = √(1 + b²/a²) ∈ (1, ∞)
is called the eccentricity of the hyperbola Ha,b. It measures how widely the branches of the hyperbola open.
• The canonical equation shows that M(xM , yM ) ∈ Ha,b if and only if (±xM , ±yM ) ∈ Ha,b .
Consider the canonical equation (9.3) of the hyperbola Ha,b . Let ℓ be a line with equation y =
kx + m. The intersection of the two objects is the set of points whose coordinates are solutions to the
system
{ x²/a² − y²/b² = 1,  y = kx + m }   ⇔   { x²/a² − (kx + m)²/b² = 1,  y = kx + m }.
The solutions to this system are (x, y) = (x, kx + m) where x is a solution to the first equation. Let us discuss that equation:
(b² − a²k²)x² − 2kma²x − a²(m² + b²) = 0   (9.4)
This is a quadratic equation in x since a, b, k, m are fixed. The discriminant of this equation is
∆ = 4k²m²a⁴ + 4a²(m² + b²)(b² − a²k²) = 4a²b²(m² + b² − a²k²).
So, the number of real solutions is controlled by m² + b² − a²k² . . . if the equation is quadratic. Suppose equation (9.4) is quadratic, i.e. b² − a²k² ≠ 0.
• m < −√(a²k² − b²) or m > √(a²k² − b²), in which case ℓ intersects Ha,b in two distinct points.
• m = ±√(a²k² − b²), in which case ℓ intersects Ha,b in a unique point. Such a point is a double intersection point because it is obtained as a double solution to the algebraic equation. For these two values of m, the line ℓ : y = kx + m is tangent to the hyperbola. Therefore, if k satisfies b² − a²k² ≠ 0, i.e., if k ≠ ±b/a, then there are two tangent lines to the hyperbola with slope k:
y = kx ± √(a²k² − b²).
• −√(a²k² − b²) < m < √(a²k² − b²), in which case there is no intersection point between ℓ and Ha,b.
Suppose equation (9.4) is not quadratic, i.e. b² − a²k² = 0 and the equation is
−2kma²x − a²(m² + b²) = 0.
Notice that k ≠ 0 and a ≠ 0 and that k = ±b/a. We have two cases:
• m ≠ 0, hence the unique solution x = −(m² + b²)/(2km), which corresponds to a unique intersection point. In this case, it is a simple intersection point, corresponding to a simple solution of the algebraic equation (not a double solution).
• m = 0, in which case the equation reduces to −a²b² = 0, which has no solutions: the line y = ±(b/a)x is an asymptote of Ha,b and does not intersect it.
As for the ellipse, the tangent line at a point (x0, y0) ∈ Ha,b is
T(x0,y0)Ha,b : x0 x/a² − y0 y/b² = 1.   (9.5)
This can be deduced either with the algebraic method or via the gradient as in the case of the ellipse.
9.3 Parabola
9.3.1 Geometric description
Definition 9.6. A parabola is the locus of points in E2 for which the distance from a given point, the
focal point, equals the distance to a given line, the directrix.
Pp : y² = 2px   (9.6)
• Equation (9.6) is called the canonical equation of the parabola Pp . Clearly, with respect to some
other coordinate system, the same parabola will have a different equation.
• The focal point is F(p/2, 0) and the directrix has equation d : x = −p/2.
• The intersection of Pp with the coordinate axes is the point (0, 0).
• The canonical equation shows that M(xM , yM ) ∈ Pp if and only if (xM , ±yM ) ∈ Pp .
Solving Equation (9.6) for y we obtain y(x) = ±√(2px). This gives a partial parametrization of Pp. For the ‘northern part’ we have the parametrization φ : [0, ∞) → E2 given by φ(x) = (x, y(x)) = (x, √(2px)).
Thus, we can use the known methods to verify the monotonicity and the convexity of y(x), which
describes this part of the parabola.
We can, in fact, parametrize the whole parabola if we express x in terms of y, which is another way of reading equation (9.6). We then have the parametrization
φ : R → E2 given by φ(y) = (x(y), y) = (y²/(2p), y).
Consider the canonical equation (9.6) of the parabola Pp . Let ℓ be a line with equation y = kx + m.
The intersection of the two objects is the set of points whose coordinates are solutions to the system
{ y² = 2px,  y = kx + m }   ⇔   { (kx + m)² = 2px,  y = kx + m }.
The solutions to this system are (x, y) = (x, kx + m) where x is a solution to the first equation. Let us
discuss that equation:
k 2 x2 + 2(km − p)x + m2 = 0 (9.7)
This is a quadratic equation in x since p, k, m are fixed. The discriminant of this equation is
∆ = 4p(p − 2km).
So, the number of real solutions is controlled by p − 2km:
• km < p/2, in which case ℓ intersects Pp in two distinct points.
• km = p/2, in which case ℓ intersects Pp in a unique point. Such a point is a double intersection
point because it is obtained as a double solution to the algebraic equation. For this value of m,
the line ℓ : y = kx + m is tangent to the parabola. Therefore, if a slope k is given, there is one
tangent line to the parabola having the given slope:
y = kx + p/(2k).
• km > p/2, in which case there is no intersection point between ℓ and Pp .
These properties are used for lenses, parabolic reflectors, satellite dishes, etc.
CHAPTER 10
Hyperquadrics
Contents
10.1 Hyperquadrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
10.2 Reducing to canonical form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
10.3 Classification of conics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
10.3.1 Isometric classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
10.3.2 Algorithm 1: Isometric invariants . . . . . . . . . . . . . . . . . . . . . . . . . . 155
10.3.3 Affine classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
10.3.4 Algorithm 2: Lagrange’s method . . . . . . . . . . . . . . . . . . . . . . . . . . 157
10.4 Classification of quadrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
10.1 Hyperquadrics
Definition 10.1. A hyperquadric Q in En is the set of points whose coordinates satisfy a quadratic equation, i.e.,
Q : Σ_{i,j=1}^{n} aij xi xj + Σ_{i=1}^{n} bi xi + c = 0.   (10.1)
Grouping the mixed terms symmetrically, Equation (10.1) can be rewritten as
Q : Σ_{i,j=1}^{n} qij xi xj + Σ_{i=1}^{n} bi xi + c = 0   (10.2)
where qii = aii and qij = qji = (aij + aji)/2. The matrix Q = (qij) is symmetric and we call it the symmetric matrix associated to Equation (10.2) of the quadric Q.
• The matrix Q defines a homogeneous polynomial of degree 2 in the above equation.
• Notice that Equation (10.2) can be rearranged in matrix form as follows
Q : xᵀ · Q · x + bᵀ · x + c = 0   (10.3)
[Step 1 - Rotation] By the Spectral Theorem (see Corollary ??), there is an orthonormal basis B ′ of
eigenvectors for Q which diagonalizes Q. Changing the frame from K = (O, B) to K′ = (O, B ′ ), the
equation of the hyperquadric becomes
Q : yᵀ · D · y + vᵀ · y + c = 0,   where D = diag(λ1, λ2, . . . , λn)   (10.4)
and where y = (y1 , . . . , yn ) and v = (v1 , . . . , vn ). This change of coordinates consists of replacing x by
MB,B ′ y since x = MB,B ′ y. Notice that, since the two bases B and B ′ are orthonormal, the matrix MB,B ′
is an orthogonal matrix, i.e., MB,B ′ ∈ O(n).
In the proof of the Spectral Theorem one uses the fact that, since Q is symmetric, the eigenvalues λ1, . . . , λn are all real. Hence, eventually after permuting the basis vectors in B′ we may assume that λ1, . . . , λp > 0, λp+1, . . . , λr < 0 and λr+1, . . . , λn = 0, where r is the rank of Q. This permutation corresponds to changing B′ = (e′1, . . . , e′n) to B′′ = (e′σ(1), . . . , e′σ(n)) for some permutation σ of {1, . . . , n}. The change of basis matrix MB′,B′′ is again orthogonal. Thus, so far, we have a change of coordinates from K = (O, B) to K′′ = (O, B′′) given by the change of basis matrix MB,B′′ = MB,B′ MB′,B′′ ∈ O(n).
Recall that det(MB,B′′) = 1 if MB,B′′ ∈ SO(n) and det(MB,B′′) = −1 if MB,B′′ is not special orthogonal. In the latter case, the matrix MB,B′′ changes the orientation of the basis, i.e., the change of coordinates is an indirect isometry. So, if we replace one vector in B′′ by minus that vector, for example e′′1 ↔ −e′′1, then B′′ has the same orientation as B and MB,B′′ ∈ SO(n), i.e., the change of coordinates is a displacement. In conclusion, we may assume that B′′ is such that MB,B′′ ∈ SO(n).
Writing out the equation of Q in K′′ we have:
Q : λ1 y12 + λ2 y22 + · · · + λr yr2 + v1 y1 + v2 y2 + · · · + vn yn + c = 0. (10.5)
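Step 1 can be carried out numerically: diagonalize the symmetric matrix Q with an orthonormal eigenbasis and, if necessary, flip one eigenvector so that the change of basis matrix is a rotation. A minimal numpy sketch (the ordering of the eigenvalues may still need the permutation discussed above):

import numpy as np

def rotate_to_principal_axes(Q):
    """Return (D, M) with M in SO(n) and M.T @ Q @ M = D diagonal."""
    eigenvalues, M = np.linalg.eigh(Q)   # columns of M form an orthonormal basis of eigenvectors
    if np.linalg.det(M) < 0:             # replacing one eigenvector by its opposite makes M a rotation
        M[:, 0] = -M[:, 0]
    return np.diag(eigenvalues), M

Q = np.array([[2.0, 1.0], [1.0, 2.0]])   # a sample symmetric matrix
D, M = rotate_to_principal_axes(Q)
print(np.round(D, 6), np.round(np.linalg.det(M), 6))   # diag(1, 3) and determinant 1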
[Step 2 - translation] Notice that r > 0, i.e., there is at least one eigenvalue distinct from 0, otherwise Q = 0n (the n × n zero matrix) and (10.2) is not a quadratic polynomial. Now, if v1 ≠ 0 then
λ1 y1² + v1 y1 = λ1 ( y1² + 2·(v1/(2λ1))·y1 ) = λ1 ( y1 + v1/(2λ1) )² − v1²/(4λ1).
Thus, if for i ∈ {1, . . . , r} we let zi = yi + vi/(2λi) and zi = yi for i > r, then Equation (10.5) becomes
From the discussion in Section 10.2, we may apply a rotation and a translation to change the frame
such that Equation (10.7) becomes
C : λ1 x2 + λ2 y 2 = k or C : λ1 x2 + v2 y = k. (10.8)
We may also assume that λ1 > 0, since if λ1 < 0 we can multiply the entire equation by −1.
Case 1: If λ2 > 0 and k = 0, then the equation has only (0, 0) as solution, in this case the curve
degenerates to a point: the origin.
Case 2: If λ2 > 0 and k < 0, then there are no real solutions to the equation.
Case 3: If λ2 > 0 and k > 0, after dividing by k the equation becomes
x²/(k/λ1) + y²/(k/λ2) = 1,
which is the equation of an ellipse. The ellipse is in canonical form if k/λ1 > k/λ2. If this is not the case, we need to do one additional change of coordinates by rotating the reference frame by 90°, which is a direct isometry.
Case 4: If λ2 < 0 and k = 0, then the equation becomes
(√λ1 x − √(−λ2) y)(√λ1 x + √(−λ2) y) = 0,
so the curve degenerates into two lines through the origin.
Case 5: If λ2 < 0 and k ≠ 0, we may assume k > 0 (otherwise multiply the equation by −1 and interchange the roles of x and y); after dividing by k the equation becomes
x²/(k/λ1) − y²/(k/(−λ2)) = 1,
which is the equation of a hyperbola.
Case 6: If λ2 = 0, the equation is
C : λ1 x² + v2 y = k.
If v2 = 0 and k ≥ 0 then we have two lines described by the equation (√λ1 x − √k)(√λ1 x + √k) = 0. If k = 0 this is a double line, x² = 0. If v2 = 0 and k < 0 there are no real solutions.
If v2 ≠ 0, we divide the equation by |v2| and obtain
(λ1/|v2|) x² = k/|v2| − (v2/|v2|) y
and we may change the coordinates with the translation/glide reflection (x, k/|v2| − (v2/|v2|) y) → (x, y) such that the equation simplifies to
x² = (|v2|/λ1) y.
This change of coordinates corresponds to a reflection and a translation (again, an isometric change of coordinates) and we recognize the equation of a parabola with parameter p = |v2|/(2λ1). However, here again, we do not yet have the canonical form of a parabola. In order to obtain Pp we need to interchange the x-axis with the y-axis.
[Conclusion] Starting with an equation of the form (10.7), we may use rotations, translations and
reflections to transform the equation into one of the following
• non-degenerate cases:
Ea,b : x²/a² + y²/b² = 1   or   Ha,b : x²/a² − y²/b² = 1   or   Pp : y² = 2px.
Moreover, we can either carefully select direct isometries at each step or ensure at the end that the
composition of all isometries used is direct. So, the curves described by an equation of the form
(10.7) are conic sections and can be reduced to such curves via direct isometries (displacements).
It is the ellipse in the image above. However, a priori, it is not at all clear that C is an ellipse. The
symmetric matrix associated to this equation is
" #
73 36
Q=
36 52
The eigenvalues of Q are λ1 = 100 and λ2 = 25. Corresponding eigenvectors are (4, 3) for λ1 and
(3, −4) for λ2 . These two vectors form an orthogonal basis. Thus, an orthonormal basis is B ′ =
(e′1(4/5, 3/5), e′2(3/5, −4/5)). The change of basis matrix from B′ to B is
MB,B′ = (1/5)[4 3; 3 −4].
We know that this matrix is orthogonal, which in this example is easy to check directly. In particular MB,B′⁻¹ = MB,B′ᵀ. Moreover, the determinant is −1, which means that MB,B′ is not a direct isometry, i.e., MB,B′ ∈ O(2)\SO(2). If we change the direction of the second eigenvector, we still have an eigenvector for the eigenvalue λ2, but now, with respect to the basis B′ = (e′1(4/5, 3/5), e′2(−3/5, 4/5)), we have
MB,B′ = (1/5)[4 −3; 3 4] ∈ SO(2).
Changing the frame from K to K′ = (O, B′), we have
C : 100x′2 + 25y ′2 + 25x′ + 50y ′ + 25 = 0
equivalently
C : 4x′2 + y ′2 + x′ + 2y ′ + 1 = 0
equivalently
C : 4(x′² + x′/4 + 1/64) − 1/16 + (y′² + 2y′ + 1) − 1 + 1 = 0
equivalently
C : 4(x′ + 1/8)² + (y′ + 1)² − 1/16 = 0.
Now, let us change the frame again, using a translation by the vector (−1/8, −1). The new frame is K′′ = (O′′, B′′), where O′′ = (−1/8, −1), and the basis B′′ = B′ does not change. In K′′, the equation of C becomes
C : 4x′′² + y′′² − 1/16 = 0   ⇔   x′′²/(1/64) + y′′²/(1/16) − 1 = 0.
Clearly, this is the equation of an ellipse. However, it is not yet in canonical form because the focal
points are on the y-axis. To obtain the canonical form, we perform a final coordinate change and
permute the coordinate axes, for instance with
" # " #
0 −1 0 1
∈ SO(2) or with ∈ O(2).
1 0 1 0
To recap:
• We changed the coordinates from K to K′ by the rotation MB,B′ of angle θ, where cos(θ) = 4/5.
• We changed the coordinates from K′ to K′′ with a translation by the vector (−1/8, −1).
• We changed the coordinates from K′′ to K′′′ in order to interchange the variables.
• We obtained E_{1/4, 1/8}.
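The recap can be verified numerically: the matrix MB,B′ chosen in the example lies in SO(2) and diagonalizes Q to diag(100, 25), and the reduced equation 4x′′² + y′′² = 1/16 is an ellipse whose semi-axes are 1/4 and 1/8. A short numpy check:

import numpy as np

Q = np.array([[73.0, 36.0], [36.0, 52.0]])
M = np.array([[4.0, -3.0], [3.0, 4.0]]) / 5.0       # the rotation chosen in the example
print(np.round(np.linalg.det(M), 6))                # 1.0, so M is in SO(2)
print(np.round(M.T @ Q @ M, 6))                     # diag(100, 25)
print(np.sqrt((1/16)/1.0), np.sqrt((1/16)/4.0))     # 0.25 and 0.125: the semi-axes 1/4 and 1/8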
The eigenvalues of Q are λ1 = 338 and λ2 = −169. Corresponding eigenvectors are (5, 12) for λ1
and (12, −5) for λ2 . These two vectors form an orthogonal basis. Thus, an orthonormal basis is
B′ = (e′1(5/13, 12/13), e′2(12/13, −5/13)). The change of basis matrix from B′ to B is
MB,B′ = (1/13)[5 12; 12 −5].
We know that this matrix is orthogonal, which in this example is easy to check directly. In particular MB,B′⁻¹ = MB,B′ᵀ. Moreover, the determinant is −1, which means that MB,B′ is not a direct isometry, i.e., MB,B′ ∈ O(2)\SO(2). If we change the direction of the second eigenvector, we still have an eigenvector for the eigenvalue λ2, but now, with respect to the basis B′ = (e′1(5/13, 12/13), e′2(−12/13, 5/13)), we have
MB,B′ = (1/13)[5 −12; 12 5] ∈ SO(2).
Changing the frame from K to K′ = (O, B ′ ), Equation (10.11) becomes
" ′# " ′#
h
′ ′
i
T x x
C : x y MB,B ′ Q MB,B ′ ′ + b MB,B ′ ′ + 169 = 0
y y
and we have
C : 338x′2 − 169y ′2 + 169x′ + 169y ′ + 169 = 0
equivalently
C : 2x′² − y′² + x′ + y′ + 1 = 0
equivalently
C : 2(x′² + x′/2 + 1/16) − 1/8 − (y′² − y′ + 1/4) + 1/4 + 1 = 0
equivalently
C : 2(x′ + 1/4)² − (y′ − 1/2)² + 9/8 = 0.
Now, let us change the frame again, using a translation by the vector (−1/4, 1/2). The new frame is K′′ = (O′′, B′′), where O′′ = (−1/4, 1/2), and the basis B′′ = B′ does not change. In K′′, the equation of C becomes
C : 2x′′² − y′′² + 9/8 = 0   ⇔   −x′′²/(9/16) + y′′²/(9/8) − 1 = 0.
Clearly, this is the equation of a hyperbola. However, it is not yet in canonical form because the focal
points are on the y-axis. To obtain the canonical form, we perform a final coordinate change and
permute the coordinate axes, for instance with
" # " #
0 −1 0 1
∈ SO(2) or with ∈ O(2).
1 0 1 0
To recap:
• We changed the coordinates from K to K′ by the rotation MB,B′ of angle θ, where cos(θ) = 5/13.
The eigenvalues of Q are λ1 = 578 and λ2 = 0. Corresponding eigenvectors are (8, 15) for λ1 and
(15, −8) for λ2 . These two vectors form an orthogonal basis. Thus, an orthonormal basis is B ′ =
(e′1 (8/17, 15/17), e′2 (15/17, −8/17)). The change of basis matrix from B ′ to B is
" #
1 8 15
MB,B ′ = .
17 15 −8
We know that this matrix is orthogonal, which in this example is easy to check directly. In particular MB,B′⁻¹ = MB,B′ᵀ. Moreover, the determinant is −1, which means that MB,B′ is not a direct isometry, i.e., MB,B′ ∈ O(2)\SO(2). If we change the direction of the second eigenvector, we still have an eigenvector for the eigenvalue λ2, but now, with respect to the basis B′ = (e′1(8/17, 15/17), e′2(−15/17, 8/17)), we have
MB,B′ = (1/17)[8 −15; 15 8] ∈ SO(2).
Changing the frame from K to K′ = (O, B ′ ), Equation (10.13) becomes
" ′# " ′#
h
′ ′
i
T x x
C : x y MB,B ′ Q MB,B ′ ′ + b MB,B ′ ′ + 289 = 0
y y
which one calculates to be
0 x′ h i x′
" #" # " #
h i 578
′
C: x y ′ + 289 −289 + 289 = 0
0 0 y ′ | {z } y ′
| {z } ′ b
Q′
and we have
C : 578x′2 + 289x′ − 289y ′ + 289 = 0
equivalently
C : 2x′2 + x′ − y ′ + 1 = 0
equivalently
C : 2(x′² + x′/2 + 1/16) − 1/8 − y′ + 1 = 0
equivalently
C : 2(x′ + 1/4)² − (y′ − 7/8) = 0.
Now, let us change the frame again, using a translation by the vector (−1/4, 7/8). The new frame is K′′ = (O′′, B′′), where O′′ = (−1/4, 7/8), and the basis B′′ = B′ does not change. In K′′, the equation of C becomes
C : 2x′′² − y′′ = 0   ⇔   x′′² = (1/2)y′′.
Clearly, this is the equation of a parabola. However, it is not yet in canonical form because the
focal point is on the y-axis. To obtain the canonical form, we perform a final coordinate change and
permute the coordinate axes, for instance with
" # " #
0 −1 0 1
∈ SO(2) or with ∈ O(2).
1 0 1 0
To recap:
• We changed the coordinates from K to K′ by the rotation MB,B′ of angle θ, where cos(θ) = 8/17.
The matrix Q̂ is symmetric and we call it the extended symmetric matrix associated to Equation (10.15) of the conic C. Notice that changing coordinates via a rotation R [Step 1], followed by a translation with a vector v [Step 2], amounts to a single matrix multiplication:
(x, y, 1)ᵀ = [R v; 0 1] · (x′, y′, 1)ᵀ   ⇔   (x′, y′, 1)ᵀ = [R⁻¹ −R⁻¹v; 0 1] · (x, y, 1)ᵀ,
and we denote the latter 3 × 3 matrix by M.
Observe that det(Q̂) does not change when reducing to the canonical form with displacements. We say that det(Q̂) is invariant. Indeed det(M) = det(R⁻¹) = 1, so it does not affect the determinant of Q̂:
det(Mᵀ Q̂ M) = det(Mᵀ) det(Q̂) det(M) = det(Q̂).
In order to distinguish between the different conics, we use three invariants. As before, Q denotes the symmetric matrix associated to Equation (10.15) of the conic C. The invariants are:
D̂ = det(Q̂),   D = det(Q)   and   T = tr(Q).
D̂        D        T         curve C
D̂ = 0    D > 0    -         A point
D̂ = 0    D = 0    -         Two lines or the empty set
D̂ = 0    D < 0    -         Two lines
D̂ ≠ 0    D > 0    D̂T < 0    An ellipse
D̂ ≠ 0    D > 0    D̂T > 0    The empty set
D̂ ≠ 0    D = 0    -         A parabola
D̂ ≠ 0    D < 0    -         A hyperbola
(Step 3) Then, with respect to some frame, the curve C has the equation
ax2 + by 2 + c = 0 or ax2 + by + c = 0
where a, b, c are obtained in Steps 1 and 2. It is now easy to identify the type of curve (see Table
10.2).
This works because Step 1 and Step 2 correspond to affine changes of coordinates.
From the discussion in Section 10.2, we may apply an orthogonal change of coordinates and a trans-
lation to change the frame so that Equation (10.17) becomes
The isometric classification in this case is similar: we work out all possible cases to see what we
obtain. One important remark is that the change of basis matrix in Step 1 – the matrix MB,B ′′ used
to obtain Equation (10.18) – is an element of the group SO(3). Thus, by Euler’s theorem (Theorem
7.17), this transformation is a rotation around an axis.
However, as in Section 10.3.3 and Section 10.3.4, one can ‘stretch’ the coordinate axes with affine
transformations which are not isometries, to show that one may change the frame so that Equation
(10.17) is one of the possibilities listed in the following table.
CHAPTER 11
Contents
11.1 Ellipsoid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
11.1.1 Canonical equation - global description . . . . . . . . . . . . . . . . . . . . . . 161
11.1.2 Tangent planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
11.1.3 Parametrizations - local description . . . . . . . . . . . . . . . . . . . . . . . . . 163
11.2 Elliptic Cone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
11.2.1 Canonical equation - global description . . . . . . . . . . . . . . . . . . . . . . 163
11.2.2 Conic sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
11.2.3 Tangent planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
11.2.4 Ca,b,c as ruled surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
11.2.5 Parametrizations - local description . . . . . . . . . . . . . . . . . . . . . . . . . 165
11.3 Hyperboloid of one sheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
11.3.1 Canonical equation - global description . . . . . . . . . . . . . . . . . . . . . . 165
11.3.2 Tangent planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
11.3.3 H1a,b,c as ruled surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
11.3.4 Parametrizations - local description . . . . . . . . . . . . . . . . . . . . . . . . . 169
11.4 Hyperboloid of two sheets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
11.4.1 Canonical equation - global description . . . . . . . . . . . . . . . . . . . . . . 170
11.4.2 Tangent planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
11.4.3 Parametrizations - local description . . . . . . . . . . . . . . . . . . . . . . . . . 172
11.5 Elliptic paraboloid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
11.5.1 Canonical equation - global description . . . . . . . . . . . . . . . . . . . . . . 172
11.5.2 Tangent planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
11.5.3 Parametrizations - local description . . . . . . . . . . . . . . . . . . . . . . . . . 174
Here we look at the main examples of quadrics in E3 , sometimes called quadratic surfaces. These
are surfaces which satisfy a quadratic equation of the form
Notice that an equation as above may not define a surface. It could happen that there are no solutions
or that the coordinates of only one point satisfy a given quadratic equation.
11.1 Ellipsoid
11.1.1 Canonical equation - global description
An ellipsoid is a surface which (in some coordinate system) satisfies an equation of the form
Ea,b,c : x²/a² + y²/b² + z²/c² = 1,
so Ea,b,c = ϕ⁻¹(1), where ϕ(x, y, z) denotes the left-hand side x²/a² + y²/b² + z²/c².
If we fix one variable, we obtain intersections with planes parallel to the coordinate axes. For this
surface the intersections are either ellipses, points or the empty set. Check this for z = h and deduce
the axes of the ellipses that you obtain.
We will check this with the algebraic method. Let us consider a line passing through p in parametric
form
l : (x, y, z) = (xp, yp, zp) + t (vx, vy, vz)   ⇔   l = p + ⟨v⟩.
Recall that the tangent plane Tp Ea,b,c is the union of all lines intersecting the quadric Ea,b,c at p in a
special way, i.e. in a double point. This is why we investigate the intersection of our quadric with l.
How do we obtain the intersection Ea,b,c ∩ l? We look at those points of l which satisfy the equation
of the ellipsoid, i.e. we look for solutions t for the equation
ϕ(xp + tvx, yp + tvy, zp + tvz) = 1   ⇔   (xp + tvx)²/a² + (yp + tvy)²/b² + (zp + tvz)²/c² − 1 = 0.
Expanding, we obtain
(vx²/a² + vy²/b² + vz²/c²) t² + 2 (xp vx/a² + yp vy/b² + zp vz/c²) t + (xp²/a² + yp²/b² + zp²/c² − 1) = 0
and, since p ∈ Ea,b,c, the last bracket vanishes, so
(vx²/a² + vy²/b² + vz²/c²) t² + 2 (xp vx/a² + yp vy/b² + zp vz/c²) t = 0,
where the coefficient of t² is non-zero.
This equation in t admits the solution t = 0. That is clear, the point on l corresponding to t = 0 is
the point p which, by assumption, lies on Ea,b,c . Furthermore, the second solution will correspond to
a second point of intersection. However, the line l is tangent to our quadric if p is a double point of
intersection. This happens if and only if the above equation has t = 0 as double solution, i.e. if and
only if
xp vx/a² + yp vy/b² + zp vz/c² = 0.
How do we interpret this? In the Euclidean setting this can be interpreted as saying that the vector (xp/a², yp/b², zp/c²) is perpendicular to the direction vector v of the line. All lines which are tangent to the surface and contain p need to satisfy this condition, so
Tp Ea,b,c : (xp/a², yp/b², zp/c²) · (x − xp, y − yp, z − zp) = 0   ⇔   xp x/a² + yp y/b² + zp z/c² − 1 = 0.
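The computation above can be reproduced symbolically: substituting the line p + tv into ϕ and expanding in t, the linear coefficient is 2(xp vx/a² + yp vy/b² + zp vz/c²) and the constant term is ϕ(p) − 1. A short sympy sketch:

import sympy as sp

a, b, c, t = sp.symbols('a b c t', positive=True)
xp, yp, zp, vx, vy, vz = sp.symbols('x_p y_p z_p v_x v_y v_z')

phi = lambda X, Y, Z: X**2/a**2 + Y**2/b**2 + Z**2/c**2
expr = sp.expand(phi(xp + t*vx, yp + t*vy, zp + t*vz) - 1)

print(sp.simplify(expr.coeff(t, 1) - 2*(xp*vx/a**2 + yp*vy/b**2 + zp*vz/c**2)))  # 0
print(sp.simplify(expr.coeff(t, 0) - (phi(xp, yp, zp) - 1)))                     # 0, and this term vanishes when p lies on the ellipsoid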
If we fix one variable, we obtain intersections with planes parallel to the coordinate axes. For this
surface the intersections are either ellipses, hyperbolas or a point. Check this for z = h and deduce
the axes of the ellipses that you obtain.
Proposition 11.1. The intersection of an elliptic cone with an arbitrary plane is a (possibly degener-
ate) quadratic curve.
Tp Ca,b,c : xp x/a² + yp y/b² − zp z/c² = 0.
If we denote by L the family of all these lines, it is easy to see that the surface is the union of them:
S = ⋃_{l∈L} l.
The lines in L are called rectilinear generators of the surface S. We refer to them as generators since we
don’t consider here non-rectilinear generators.
So, how is the cone a ruled surface? Fix a point (x0 , y0 , z0 ) ∈ Ca,b,c and notice that for any t ∈ R we
have
(tx0)²/a² + (ty0)²/b² − (tz0)²/c² = t² ( x0²/a² + y0²/b² − z0²/c² ) = 0.
Thus, the line {(tx0 , ty0 , tz0 ) : t ∈ R}, which passes through the given point and the origin is contained
in Ca,b,c . The set of all lines obtained in this way form the generators L of the cone.
You want to rescale this curve with the height such that when the height z = 0 you have a point, and
for all other values of z you have a rescaled version of your curve:
x(θ, h) = ha cos(θ),   y(θ, h) = hb sin(θ),   z(θ, h) = hc,   θ ∈ [0, 2π[,  h ∈ R.
If we fix one variable, we obtain intersections with planes parallel to the coordinate axes. For this
surface the intersections are either ellipses, hyperbolas or two lines. Check this for y = h and deduce
the axes of the hyperbolas that you obtain.
Recall that the tangent plane Tp H1a,b,c is the union of all lines intersecting the quadric H1a,b,c at p in a
special way, i.e. in a double point. This is why we investigate the intersection of our quadric with l.
How do we obtain the intersection H1a,b,c ∩ l? We look at those points of l which satisfy the equation
of our hyperboloid, i.e. we look for solutions t for the equation
ϕ(xp + tvx, yp + tvy, zp + tvz) = 1   ⇔   (xp + tvx)²/a² + (yp + tvy)²/b² − (zp + tvz)²/c² − 1 = 0.   (11.3)
Expanding, we obtain
(vx²/a² + vy²/b² − vz²/c²) t² + 2 (xp vx/a² + yp vy/b² − zp vz/c²) t + (xp²/a² + yp²/b² − zp²/c² − 1) = 0
and, since p ∈ H1a,b,c, the last bracket vanishes, so
(vx²/a² + vy²/b² − vz²/c²) t² + 2 (xp vx/a² + yp vy/b² − zp vz/c²) t = 0,   (11.4)
where this time the coefficient of t² may be zero.
This equation in t admits the solution t = 0. That is clear, the point on l corresponding to t = 0 is
the point p which, by assumption, lies on H1a,b,c . Furthermore, a second solution will correspond to
a second point of intersection. However, the line l is tangent to our quadric if p is a double point of
intersection. This happens if and only if the above equation has t = 0 as double solution, i.e. if and
only if
vx²/a² + vy²/b² − vz²/c² ≠ 0   and   xp vx/a² + yp vy/b² − zp vz/c² = 0.
How do we interpret the second condition? In the Euclidean setting this can also be interpreted as saying that the vector (xp/a², yp/b², −zp/c²) is perpendicular to the direction vector v of the line. In both cases, all lines which are tangent to the surface and contain p need to satisfy these conditions, so the tangent plane is
Tp H1a,b,c : (xp/a², yp/b², −zp/c²) · (x − xp, y − yp, z − zp) = 0   ⇔   xp x/a² + yp y/b² − zp z/c² − 1 = 0.
This should help to see that, when the vector v = (vx , vy , vz ) satisfies equation (11.5), the line l is
parallel to a line contained in the cone Ca,b,c . It will therefore intersect H1a,b,c in at most one point.
Notice also that if l ⊆ Ca,b,c , it will not intersect H1a,b,c at all, but this cannot happen in our setting
because we chose the point p such that it lies both on l and on our quadric. In fact, the related ques-
tion ‘does a given line l intersect H1a,b,c ?’ can be answered by investigating equation (11.3) without
the assumption that p lies on the surface H1a,b,c . How would you do this?
One way to see where the two families of lines come from is to rearrange Equation (11.2):
x²/a² + y²/b² − z²/c² = 1   ⇔   x²/a² − z²/c² = 1 − y²/b²   ⇔   (x/a − z/c)(x/a + z/c) = (1 − y/b)(1 + y/b)   (11.6)
Now, assume that the factors in the last equation are not 0; then we can divide to obtain
(x/a − z/c)/(1 − y/b) = (1 + y/b)/(x/a + z/c) = µ/λ
for some parameters λ and µ. We introduced these parameters in order to separate the above equation:
lλ,µ :  λ(x/a − z/c) = µ(1 − y/b)   and   µ(x/a + z/c) = λ(1 + y/b).
What we end up with is a system of two equations, which are linear in x, y, z and which depend on
the parameters λ and µ. For each fixed pair of parameters, λ and µ, we get a line which we denote
with lλ,µ . Reading the above deduction backwards it is easy to see that all points on such a line satisfy
the equation of H1a,b,c . So, we have a family of lines contained in your hyperboloid.
We assumed that the factors in (11.6) are not zero. In fact, you only divide by two of them, so if one of those two is zero, you can flip the above fraction and divide by the other two. That will lead to the same family of lines L1 = {lλ,µ : λ, µ ∈ R, λ² + µ² ≠ 0}.
OK, what about L2? The second family of generators (these lines are called generators) is obtained if you group the terms differently:
(x/a − z/c)/(1 + y/b) = (1 − y/b)/(x/a + z/c) = µ/λ.
As above, one can check that points on these lines satisfy the equation of our hyperboloid.
One important thing to notice is that, although we write down two parameters, λ and µ, we don’t
necessarily get distinct lines for distinct parameters: lλ,µ = ltλ,tµ for any nonzero scalar t. So, in fact,
L1 depends on one parameter. More concretely,
L1 = { lα : x/a − z/c = α(1 − y/b),  α(x/a + z/c) = 1 + y/b,  α ∈ R }  ∪  { l∞ : 1 − y/b = 0,  x/a + z/c = 0 }.
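That every point of lλ,µ lies on H1a,b,c can also be checked symbolically: solving the two linear equations for x/a and z/c (assuming λµ ≠ 0) and substituting into the equation of the hyperboloid gives zero. A minimal sympy sketch (the symbol s stands for y/b):

import sympy as sp

lam, mu, s = sp.symbols('lambda mu s')
u = (mu*(1 - s)/lam + lam*(1 + s)/mu) / 2       # x/a, solved from the two equations of l_{lambda,mu}
w = (lam*(1 + s)/mu - mu*(1 - s)/lam) / 2       # z/c, solved likewise
print(sp.simplify(u**2 + s**2 - w**2 - 1))      # 0: the point satisfies x^2/a^2 + y^2/b^2 - z^2/c^2 = 1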
for θ1 ∈ [0, 2π[ and θ2 ∈ R. Why? The parameter θ1 is used to rotate on ellipses the curve obtained for
θ1 = 0. What is this curve that we ‘rotate’? Check this and deduce a parametrization of the tangent
plane Tp H1a,b,c at the point
p = ( x(θ1,p, θ2,p), y(θ1,p, θ2,p), z(θ1,p, θ2,p) ) ∈ H1a,b,c
where θ1,p and θ2,p are the parameters of the point p, i.e. p = p(x(θ1,p , θ2,p ), y(θ1,p , θ2,p ), z(θ1,p , θ2,p )).
If we fix one variable, we obtain intersections with planes parallel to the coordinate axes. For this
surface the intersections are either ellipses, hyperbolas or the empty set. Check this for y = h and
deduce the axes of the hyperbolas that you obtain.
Recall that the tangent plane Tp H2a,b,c is the union of all lines intersecting the quadric H2a,b,c at p in a
special way, i.e. in a double point. This is why we investigate the intersection of our quadric with l.
How do we obtain the intersection H2a,b,c ∩ l? We look at those points of l which satisfy the equation
of our hyperboloid, i.e. we look for solutions t of ϕ(p + tv) = −1. Expanding, we obtain
(vx²/a² + vy²/b² − vz²/c²) t² + 2 (xp vx/a² + yp vy/b² − zp vz/c²) t + (xp²/a² + yp²/b² − zp²/c² + 1) = 0
and, since p ∈ H2a,b,c, the last bracket vanishes, so
(vx²/a² + vy²/b² − vz²/c²) t² + 2 (xp vx/a² + yp vy/b² − zp vz/c²) t = 0,   (11.7)
where again the coefficient of t² may be zero.
This equation in t admits the solution t = 0. That is clear, the point on l corresponding to t = 0 is
the point p which, by assumption, lies on H2a,b,c . Furthermore, a second solution will correspond to
a second point of intersection. However, the line l is tangent to our quadric if p is a double point of
intersection. This happens if and only if the above equation has t = 0 as double solution, i.e. if and
only if
vx²/a² + vy²/b² − vz²/c² ≠ 0   and   xp vx/a² + yp vy/b² − zp vz/c² = 0.
How do we interpret the second condition? Similar to the case of the hyperboloid of one sheet:
Tp H2a,b,c : (xp/a², yp/b², −zp/c²) · (x − xp, y − yp, z − zp) = 0   ⇔   xp x/a² + yp y/b² − zp z/c² + 1 = 0.
If, on the other hand, vx²/a² + vy²/b² − vz²/c² = 0, then equation (11.7) is linear, and has one simple solution t = 0. This means that l intersects H2a,b,c
only once (it punctures the surface in one point). Such lines are not tangent to the surface. How can
we visualize this? Let us start with what we know: the vector v = (vx , vy , vz ) satisfies the equation
(11.8), so we can think of it as the position vector of some point on the cone Ca,b,c . How does this cone
relate to our hyperboloid? Our surface H2a,b,c is the union of hyperbolas and if we take the union of
all the asymptotes to these hyperbolas we get Ca,b,c (see Figure 11.5). So, when l is parallel to a line
contained in Ca,b,c , it will intersect H2a,b,c in at most one point. Notice also that if l ⊆ Ca,b,c , it will not
intersect H2a,b,c at all, but this cannot happen because we chose the point p such that it lies both on l
and on our quadric.
for θ1 ∈ [0, 2π[, θ2 ∈ R and ε ∈ {±1}. Why? The parameter θ1 is used to ‘rotate’ on ellipses the curve
obtained for θ1 = 0. What is this curve? Check this and deduce a parametrization of the tangent
plane Tp H2a,b,c at the point
p = ( x(θ1,p, θ2,p), y(θ1,p, θ2,p), z(θ1,p, θ2,p) ) ∈ H2a,b,c
where θ1,p and θ2,p are the parameters of the point p, i.e. p = p(x(θ1,p , θ2,p ), y(θ1,p , θ2,p ), z(θ1,p , θ2,p )).
Notice also that with σ2 we have a parametrization for each sheet of this hyperboloid, with ε = 1 we
get one sheet and with ε = −1 we get the other sheet. One should also be careful with where the
parameters live: for σ1 you want to choose θ2 in ] − ∞, −1] ∪ [1, ∞[ so that the square root is defined.
Pᵉa,b : x²/a + y²/b − 2z = 0,
so Pᵉa,b = ϕ⁻¹(0), where ϕ(x, y, z) = x²/a + y²/b − 2z.
If we fix one variable, we obtain intersections with planes parallel to the coordinate axes. For this
surface the intersections are either ellipses, parabolas or the empty set. Check this for y = h and see
what parabolas you obtain.
As before, let p = (xp, yp, zp) ∈ Pᵉa,b, let l = p + ⟨v⟩ be a line through p, and look for the solutions t of ϕ(p + tv) = 0. Expanding and using that xp²/a + yp²/b − 2zp = 0, we obtain
(vx²/a + vy²/b) t² + 2 (xp vx/a + yp vy/b − vz) t + (xp²/a + yp²/b − 2zp) = 0
⇔ (vx²/a + vy²/b) t² + 2 (xp vx/a + yp vy/b − vz) t = 0,   (11.9)
where the coefficient of t² is non-zero.
This equation in t admits the solution t = 0. That is clear, the point on l corresponding to t = 0 is
the point p which, by assumption, lies on Pᵉa,b. Furthermore, a second solution will correspond to
a second point of intersection. However, the line l is tangent to our quadric if p is a double point of
intersection. This happens if and only if the above equation has t = 0 as double solution, i.e. if and
only if
xp vx/a + yp vy/b − vz = 0.
How do we interpret this condition? Similar to the quadrics treated in the previous sections, so
Tp Pᵉa,b : (xp/a, yp/b, −1) · (x − xp, y − yp, z − zp) = 0   ⇔   xp x/a + yp y/b − zp − z = 0.
Why? Check this and deduce a parametrization of the tangent plane Tp Pᵉa,b at the point
p = ( x(θ1,p, θ2,p), y(θ1,p, θ2,p), z(θ1,p, θ2,p) ) ∈ Pᵉa,b,
where θ1,p and θ2,p are the parameters of the point p, i.e. p = p(x(θ1,p, θ2,p), y(θ1,p, θ2,p), z(θ1,p, θ2,p)).
If we fix one variable, we obtain intersections with planes parallel to the coordinate axes. For
this surface the intersections are either parabolas, hyperbolas or two lines. Check this for y = h and
deduce the parabolas that you obtain.
Recall that the tangent plane Tp Pʰa,b is the union of all lines intersecting the quadric Pʰa,b at p in a special way, i.e. in a double point. This is why we investigate the intersection of our quadric with l. How do we obtain the intersection Pʰa,b ∩ l? We look at those points of l which satisfy the equation of our paraboloid, i.e. we look for solutions t for the equation
ϕ(xp + tvx, yp + tvy, zp + tvz) = 0   ⇔   (xp + tvx)²/a − (yp + tvy)²/b − 2(zp + tvz) = 0.
⇔ (vx²/a − vy²/b) t² + 2 (xp vx/a − yp vy/b − vz) t = 0,   (11.11)
where the coefficient of t² may be zero.
This equation in t admits the solution t = 0. That is clear, the point on l corresponding to t = 0 is
the point p which, by assumption, lies on Pʰa,b. Furthermore, a second solution will correspond to
a second point of intersection. However, the line l is tangent to our quadric if p is a double point of
intersection. This happens if and only if the above equation has t = 0 as double solution, i.e. if and
only if
vx²/a − vy²/b ≠ 0   and   xp vx/a − yp vy/b − vz = 0.
How do we interpret the second condition? Similar to the quadrics treated in the previous sections, so
Tp Pʰa,b : (xp/a, −yp/b, −1) · (x − xp, y − yp, z − zp) = 0   ⇔   xp x/a − yp y/b − zp − z = 0.
11.6.3 Pʰa,b as ruled surface
Here is another fact: the hyperbolic paraboloid is a doubly ruled surface (like the hyperboloid of one
sheet). In other words, there are two families of lines, L1 and L2 respectively, such that
Pʰa,b = ⋃_{l∈L1} l   and   Pʰa,b = ⋃_{l∈L2} l.
Every point on this surface lies on one line in L1 and on one line in L2 . The generators containing
the saddle point (with our equation, this point is the origin of the coordinate system) are visible in
Figure 11.8a.
Again, one way to see where the two families of lines come from is to rearrange (11.10)
x²/a − y²/b − 2z = 0   ⇔   x²/a − y²/b = 2z   ⇔   (x/√a − y/√b)(x/√a + y/√b) = 2z   (11.13)
Similar to the case of the hyperboloid of one sheet, we can introduce two parameters λ and µ, in order to separate the above equation:
lλ,µ :  λ(x/√a − y/√b) = 2µz   and   µ(x/√a + y/√b) = λ.
What we end up with is a system of two equations, which are linear in x, y, z and which depend on
the parameters λ and µ. For each fixed pair of parameters, λ and µ, we get a line which we denote
with lλ,µ. It is easy to check that all points on such a line satisfy the equation of Pʰa,b. This is the first family of lines L1 = {lλ,µ : λ, µ not both zero}.
The second family of generators, L2, is obtained if you group the terms differently:
l̃λ,µ :  λ(x/√a + y/√b) = 2µz   and   µ(x/√a − y/√b) = λ.
As above, one can check that points on these lines satisfy the equation of our paraboloid.
Again, one important thing to notice is that, although we write down two parameters, λ and µ,
we don’t necessarily get distinct lines for distinct parameters: lλ,µ = ltλ,tµ for any nonzero scalar t. So,
in fact, L1 depends on one parameter. More concretely
L1 := { lα : x/√a − y/√b = 2αz,  α(x/√a + y/√b) = 1,  α ∈ R }  ∪  { l∞ : 2z = 0,  x/√a + y/√b = 0 }
and similarly for L2 . You might have noticed that l∞ is one of the lines visible in Figure 11.8a, since
it lies in the plane z = 0. The other one, l˜∞ , belongs to the family L2 .
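As for the hyperboloid of one sheet, one can check symbolically that the lines lλ,µ lie on Pʰa,b: the two linear equations force the values of x/√a − y/√b and x/√a + y/√b, and their product is 2z. A minimal sympy sketch:

import sympy as sp

lam, mu, z = sp.symbols('lambda mu z')
minus = 2*mu*z/lam             # x/sqrt(a) - y/sqrt(b), forced by the first equation of l_{lambda,mu}
plus = lam/mu                  # x/sqrt(a) + y/sqrt(b), forced by the second equation
print(sp.simplify(minus*plus - 2*z))   # 0: hence x^2/a - y^2/b - 2z = 0 on the whole line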
Check this and deduce a parametrization of the tangent plane Tp Pʰa,b at the point
p = ( x(θ1,p, θ2,p), y(θ1,p, θ2,p), z(θ1,p, θ2,p) ) ∈ Pʰa,b,
where θ1,p and θ2,p are the parameters of the point p, i.e. p = p(x(θ1,p, θ2,p), y(θ1,p, θ2,p), z(θ1,p, θ2,p)).
APPENDIX A
Axioms
‘That it was the Greeks who added the element of logical structure to geometry is virtually univer-
sally admitted today’ [6, p.47]. According to Aristotle1 , for Thales2 ‘the primary question was not
what do we know, but how we know it’ [6, Chapter 4]. The ancient Greeks revolutionized mathematics
through their systematic use of ordered sequences of logical deductions. In Euclid’s3 Elements these
deductions are grounded in what are considered self-evident truths – definitions and postulates.
A major revision of the fundamental assumptions underlying Euclidean geometry was carried
out by Hilbert4 in his Foundations of Geometry [14]. The axioms Hilbert proposed as the basis for
Euclidean geometry are presented below. Hilbert showed that these axioms are independent. They
are necessary and sufficient assumptions for describing the interactions between primitives (points,
lines, and planes). The purpose of any axiomatic treatment is to put all statements on a solid ground,
the axioms from which they can be deduced. As Blumenthal recounts, Hilbert remarked that ‘one
must be able to say “tables, chairs, beer-mugs” each time in place of “points, lines, planes” ’ [12,
p.208].
Remark. We use the notation [AB] for ‘the segment AB’. Moreover, Axiom III.5 does not appear in
[14] but it is needed for the transitivity of the congruence relation on angles.
I.1 For every two points A, B there exists a line a that contains each of the points A, B.
I.2 For every two points A, B there exists no more than one line that contains each of the points A,
B.
1 384–322 BC
2 c.626/623 – c.548/545 BC
3 ∼ 300 BC
4 1862 – 1943
I.3 There exist at least two points on a line. There exist at least three points that do not lie on a
line.
I.4 For any three points A, B, C that do not lie on the same line there exists a plane α that contains
each of the points A, B, C. For every plane there exists a point which it contains.
I.5 For any three points A, B, C that do not lie on one and the same line there exists no more than
one plane that contains each of the three points A, B, C.
I.6 If two points A, B of a line a lie in a plane α then every point of a lies in the plane α.
I.7 If two planes α, β have a point A in common then they have at least one more point B in
common.
I.8 There exist at least four points which do not lie in a plane.
II.1 If a point B lies between a point A and a point C then the points A, B, C are three distinct points
of a line, and B then also lies between C and A.
II.2 For two points A and C, there always exists at least one point B on the line AC such that C lies
between A and B.
II.3 Of any three points on a line there exists no more than one that lies between the other two.
II.4 (Pasch’s Axiom) Let A, B, C be three points that do not lie on a line and let a be a line in the
plane ABC which does not meet any of the points A, B, C. If the line a passes through a point of
the segment [AB], it also passes through a point of the segment [AC], or through a point of the
segment [BC].
III.1 If A, B are two points on a line a, and A′ is a point on the same or on another line a′ then it is
always possible to find a point B′ on a given side of the line a′ through A′ such that the segment
[AB] is congruent or equal to the segment [A′ B′ ]. In symbols [AB] ≡ [A′ B′ ].
III.2 If a segment [A′ B′ ] and a segment [A′′ B′′ ] are congruent to the same segment [AB], then the seg-
ment [A′ B′ ] is also congruent to the segment [A′′ B′′ ], or briefly, if two segments are congruent
to a third one they are congruent to each other.
III.3 On the line a let [AB] and [BC] be two segments which except for B have no point in common.
Furthermore, on the same or on another line a′ let [A′ B′ ] and [B′ C ′ ] be two segments which
except for B′ also have no point in common. In that case, if [AB] ≡ [A′ B′ ] and [BC] ≡ [B′ C ′ ] then
[AC] ≡ [A′ C ′ ].
III.4 Let ∡(h, k) be an angle in a plane α and a′ a line in a plane α ′ and let a definite side of a′ of α ′
be given. Let h′ be a ray on the line a′ that emanates from the point O′ . Then there exists in the
plane α ′ one and only one ray k ′ such that the angle ∡(h, k) is congruent or equal to the angle
∡(h′ , k ′ ) and at the same time all interior points of the angle ∡(h′ , k ′ ) lie on the given side of a′ .
Symbolically ∡(h, k) ≡ ∡(h′ , k ′ ). Every angle is congruent to itself, i.e., ∡(h, k) ≡ ∡(h, k) is always
true.
III.5 If an angle ∡(h′ , k ′ ) and an angle ∡(h′′ , k ′′ ) are congruent to the same angle ∡(h, k), then the
angle ∡(h′ , k ′ ) is also congruent to the angle ∡(h′′ , k ′′ ), or briefly, if two angles are congruent to
a third one they are congruent to each other.
III.6 If for two triangles ABC and A′ B′ C ′ the congruences AB ≡ A′ B′ , AC ≡ A′ C ′ , ∡BAC ≡ ∡B′ A′ C ′
hold, then the congruence ∡ABC ≡ ∡A′ B′ C ′ is also satisfied.
IV (Euclid’s Axiom). Let a be any line and A a point not on it. Then there is at most one line in the
plane, determined by a and A, that passes through A and does not intersect a.
V.1 (Axiom of measure or Archimedes’ Axiom). If [AB] and [CD] are any segments then there exists
a number n such that n segments [CD] constructed contiguously from A, along the ray from A
through B, will pass beyond the point B.
V.2 (Axiom of line completeness). An extension of a set of points on a line with its order and
congruence relations that would preserve the relations existing among the original elements
as well as the fundamental properties of line order and congruence that follows from Axioms
I-III, and from V.1 is impossible.
APPENDIX B
We are used to thinking about lengths of segments as positive real numbers R≥0 . We are also used to
drawing the field of real numbers R as a line with 0 represented by a point which separates positive
and negative numbers. In this section we describe a path which connects Hilbert’s Axioms with real
numbers using vectors. The zero vector corresponds to 0 ∈ R, but R also has an identity element 1
which needs a correspondent. For this, we need to make a choice.
Definition B.1. A unit segment is a non-trivial segment in E which we choose. Once a unit segment
[AB] is chosen, the length |AB| is called the unit length and, equivalently, the distance from A to B is called unit distance (see Definition 1.3). Then, vectors having unit length are called unit vectors or
versors.
Definition B.2. For two points A and B, we denote by FAB the set of vectors represented with points on the line AB. Observe that we have a partition
FAB = { →XY : X, Y ∈ AB and |XY⟩ = |AB⟩ } ∪ { →0 } ∪ { →XY : X, Y ∈ AB and |XY⟩ = −|AB⟩ },
where the first set is denoted F⁺AB and the last one F⁻AB.
Proposition B.3. Let O be a point on a line AB. The following maps are bijections:
1. The map φO : AB → FAB defined by φO(P) = →OP,
2. The map ψ+ : F⁺AB ∪ {→0} → L defined by ψ+(→XY) = |→XY|,
3. The map ψ− : F⁻AB ∪ {→0} → L defined by ψ−(→XY) = |→XY|.
Sketch of proof. For 1., if →OP = →OQ then P = Q by Lemma 1.4. For 2. and 3. let X, Y be two distinct
points on the line AB. By Lemma 1.4, there are exactly two points Z, Z ′ with O between them such
that the segment [XY] is congruent to [OZ] and to [Z′O]. Then →OZ = −→OZ′ and exactly one of them will have direction |AB⟩.
Definition B.4. Under the bijections in Proposition B.3 we identify F+AB ∪{0} with L and we call these
elements positive. The elements in F+AB are called strictly positive. We call the elements in F−AB ∪ {0}
negative and the ones in F−AB are called strictly negative. Notice that the involution in Definition 1.13
restricts to an involution −□ : FAB → FAB which interchanges positive and negative elements.
Definition B.5. We define an ordering on FAB. If →AC, →AD are positive, we let →AC ≤ →AD if and only if [AC] ⊆ [AD]. If →AC, →AD are negative, we let →AC ≤ →AD if and only if [AD] ⊆ [AC].
Proposition B.6. The ordering in Definition B.5 is a total order. With this ordering and with addition
of vectors, FAB is an abelian totally ordered group and L is an ordered submonoid of FAB .
Sketch of proof. Since FAB is stable under addition, it follows from Proposition 1.18 that it is an
abelian group. The other claims follow by inspecting the definitions.
Definition B.7 (Multiplication of segments). Assume that a unit segment 1 was chosen. Let a and
b be two segments which represent the lengths x and y respectively. If x = 0 or y = 0 by definition
we have x · y = 0. If they are not both zero, let us cite the construction from [14, p.52]: “Lay off the
segments 1 and b from the vertex O on a side of a right angle. Then lay off the segment a on the other
side. Join the end points of the segment 1 and a with a line and through the end point of the segment
b draw a parallel to this line. It will delineate a segment c on the other side. This segment is then
called the product of the segments a by the segment b” and we write c = ab.
(Figure: the segments 1 and b laid off from O on one side of a right angle, a on the other side, and the product ab cut out by the parallel line.)
This defines a multiplication on L which we identified with the positive elements F+AB ∪ {0}. The
product can be extended to the whole of FAB by requiring that x · y = (−x) · (−y) whenever x, y are
negative and that (−x) · y = −(x · y) = x · (−y) whenever x is negative and y is positive.
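As an illustration of why this construction deserves the name of a product: the parallel line in the construction cuts the two sides of the angle proportionally, so in terms of lengths
\[
\frac{|c|}{|a|} = \frac{|b|}{1}, \qquad\text{hence}\qquad |c| = |a|\cdot|b| .
\]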
Proposition B.8. Assume that a unit segment was chosen. With addition of vectors and the above
multiplication and ordering, FAB is a totally ordered field and L is an ordered submonoid of FAB
with respect to both addition and multiplication.
Sketch of proof. The sum and product defined above are shown to be associative, commutative and
distributive [14, §5]. Moreover it is easy to see that 1 is the neutral element for the product and it is
easy to construct inverses. The other claims follow by inspecting the definitions.
The upshot of this section is the next theorem stating that the Axioms imply that FAB is isomor-
phic to R. Let us point out first that there are several ways of describing R. It can be described as
the ‘smallest complete ordered field’. Concretely, this description requires that R satisfies a certain
set of axioms (see for example [18, Chapter 1]). In order to show that it exists one can rely on several
constructions. The construction closest to our setting is the construction of R via Dedekind cuts (see
for example [18, Appendix to Chapter 1]).
Theorem B.9. For any unit segment [AB] there is a unique isomorphism φAB : FAB → R of ordered
fields mapping \vec{0} to 0 and \vec{AB} to 1. This isomorphism maps L to R≥0.
Sketch of proof. By Proposition B.8, we have that FAB is an ordered field. Then, by Archimedes’ Ax-
iom (Axiom V.1) the field FAB contains Z as a subring. Thus, it contains Q as an ordered subfield. By
the Axiom of line completeness (Axiom V.2), the field FAB is a complete field. Finally, any complete
ordered field containing Q as an ordered subfield is isomorphic to R through an order preserving
isomorphism [17, p.17].
Definition B.10. Assume that a unit segment [AB] was chosen. By Theorem B.9, we may identify
FAB with R which gives a multiplication of real numbers with vectors. More precisely we have a map
□ · □ : R × FAB → FAB given by (r, a) ↦ r · a := φAB^{-1}(r) a.
Proposition B.11. Assume that a unit segment and a point O have been chosen. For any point X and
any r ∈ R there is a unique point Y such that \vec{OY} = r · \vec{OX}. Moreover we have |\vec{OY}| = |r| · |\vec{OX}|.
Proposition B.12. In the setting of Definition B.10, for any vector a and any scalars x, y ∈ R, we have
(x + y) · a = x · a + y · a.
APPENDIX C
Let φ : V → W be a linear map between the vector spaces V and W . Let E = (e1 , . . . , en ) be a basis for
V and let F be a basis for W . In your Algebra course, you used the notation [φ]E,F for the matrix of
the linear map φ with respect to the bases E and F [9, Definition 3.4.1]. We will use the notation
MF ,E (φ) = [φ]E,F .
Notice that the indices E, F are reversed. Recall that this is the matrix whose columns are the compo-
nents of the φ(ei )’s with respect to the basis F :
MF,E(φ) = ( [φ(e1)]F  · · ·  [φ(en)]F ).
You have also learned [9, Theorem 3.4.8] that if ψ : W → U is another linear map, to some vector
space U with basis G, then
MG,E (ψ ◦ φ) = MG,F (ψ) · MF ,E (φ).
In particular, writing φ = IdV ◦ φ ◦ IdV and applying this formula twice, we obtain
MF,F(φ) = MF,E(IdV) · ME,E(φ) · ME,F(IdV) = ME,F(IdV)−1 · ME,E(φ) · ME,F(IdV).
So, the matrix of φ with respect to the basis F is obtained from the matrix of φ with respect to the
basis E by conjugating with the matrix ME,F (IdV ).
Definition C.1. The matrix ME,F := ME,F (IdV ) is called the change of basis matrix from the basis F to
the basis E. It is the matrix whose columns are the components of the vectors in F with respect to E.
If F = (f1 , . . . , fn ) then
ME,F = ( [f1]E  · · ·  [fn]E ).
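As a small worked example (with a basis chosen here only for illustration): if E = (e1, e2) is the standard basis of R2 and F = (f1, f2) with f1 = e1 + e2 and f2 = e1 − e2, then
\[
M_{E,F} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
\qquad
[v]_E = M_{E,F}\,[v]_F \quad\text{for every } v \in \mathbb{R}^2 .
\]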
APPENDIX D
Coordinate systems
The correspondence between Cartesian coordinates (x, y) and polar coordinates (r, ϕ) is as follows:
x = r cos ϕ, y = r sin ϕ;        r = r(x, y) = √(x² + y²), ϕ = f(x, y),
where, for r ≠ 0,
f(x, y) = arccos(x/r) if y ≥ 0 and f(x, y) = −arccos(x/r) if y < 0.
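For example, for the point with Cartesian coordinates (x, y) = (−1, 1) the formulas give
\[
r = \sqrt{(-1)^2 + 1^2} = \sqrt{2},
\qquad
\varphi = \arccos\!\Big(\tfrac{-1}{\sqrt{2}}\Big) = \tfrac{3\pi}{4}
\]
(since y ≥ 0), and indeed r cos ϕ = −1 and r sin ϕ = 1.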
The correspondence between Cartesian coordinates (x, y, z) and cylindrical coordinates (r, ϕ, z) is as follows:
x = r cos ϕ, y = r sin ϕ, z = z;        r = r(x, y) = √(x² + y²), ϕ = f(x, y), z = z.
The correspondence between Cartesian coordinates (x, y, z) and spherical coordinates (r, ϕ, θ) is as follows:
x = r cos ϕ sin θ, y = r sin ϕ sin θ, z = r cos θ;        r = r(x, y, z) = √(x² + y² + z²), ϕ = f(x, y), θ = arccos(z/r),
where f is as in the previous Section D.1.
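For example, the point (x, y, z) = (1, 1, √2) has
\[
r = \sqrt{1+1+2} = 2,
\qquad
\theta = \arccos\!\Big(\tfrac{\sqrt{2}}{2}\Big) = \tfrac{\pi}{4},
\qquad
\varphi = f(1,1) = \arccos\!\Big(\tfrac{1}{\sqrt{2}}\Big) = \tfrac{\pi}{4},
\]
and indeed x = 2 cos(π/4) sin(π/4) = 1, y = 1 and z = 2 cos(π/4) = √2.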
The barycentric coordinates of a point relative to a triangle are given by ratios of oriented areas, where Areaor stands
for the oriented area of a triangle. Similarly, in higher dimensions, the barycentric coordinates are given by oriented volumes relative to an n-simplex.
APPENDIX E
Bundles of lines are useful when a point Q is given as the intersection of two lines, but its coor-
dinates are not known explicitly, and one wants to find the equation of a line passing through Q and
satisfying some other conditions. For example, the condition that it contains some point P distinct
from Q, or that it is parallel to a given line.
Notice that there is redundancy in the two parameters λ, µ, meaning that we don't have two
independent parameters here. If λ ≠ 0 then one can divide the equation of ℓλ,µ by λ to obtain the line ℓ1,t with t = µ/λ.
Definition E.3. A reduced bundle is the set of all lines LQ passing through a common point Q from
which we remove one line, i.e. it is LQ \{ℓ2 } for some ℓ2 ∈ LQ . With the above notation and discussion,
it is the set
{ ℓ1,t : t ∈ R }, where ℓ1,t is the line of equation (a1x + b1y + c1) + t(a2x + b2y + c2) = 0.
The fact that we use one parameter instead of two, to describe almost all lines passing through
Q, simplifies calculations.
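For instance (with two lines chosen here only for illustration), let ℓ1 : x + y − 1 = 0 and ℓ2 : x − y = 0, which meet at Q(1/2, 1/2). The line of the reduced bundle passing through P(2, 0) is obtained by imposing that P satisfies the equation of ℓ1,t:
\[
(2 + 0 - 1) + t\,(2 - 0) = 0 \;\Longrightarrow\; t = -\tfrac{1}{2},
\qquad
\ell_{1,-1/2} : \; x + 3y - 2 = 0,
\]
and one checks directly that this line passes through both Q and P.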
Definition E.4. Let v ∈ V2 . The set Lv of all lines in A2 with direction vector v is called an improper
bundle of lines, and v is called a direction vector of the bundle Lv .
The connection between bundles of lines and improper bundles of lines is best understood through
projective geometry, where the improper bundle of lines is the set of all lines intersecting in the same
point at infinity.
Bundles of planes are useful when a line ℓ is given as the intersection of two planes (see Sub-
section 3.3.2) and one wants to find the equation of a plane containing ℓ and satisfying some other
conditions. For example, the condition that it contains some point P which does not belong to ℓ, or
that it is parallel to a given line.
As in the case of line bundles, there is redundancy in the two parameters λ, µ. If λ ≠ 0 then one
can divide the equation of πλ,µ by λ to obtain the plane π1,t with t = µ/λ.
Definition E.7. A reduced bundle is the set of all planes Πℓ with axis ℓ from which we remove one
plane, i.e. it is Πℓ \ {π2 } for some π2 ∈ Πℓ . With the above notation and discussion, it is the set
{ π1,t : t ∈ R }, where π1,t is the plane of equation (a1x + b1y + c1z + d1) + t(a2x + b2y + c2z + d2) = 0.
The fact that we use one parameter instead of two, to describe almost all planes containing ℓ,
simplifies calculations.
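As a small illustration (with planes chosen only for this example): take π1 : x + y + z − 1 = 0 and π2 : z = 0, whose intersection is a line ℓ. The plane of the reduced bundle through P(0, 0, 2) is obtained from
\[
(0 + 0 + 2 - 1) + t\cdot 2 = 0 \;\Longrightarrow\; t = -\tfrac{1}{2},
\qquad
\pi_{1,-1/2} : \; x + y + \tfrac{z}{2} - 1 = 0 ,
\]
which indeed contains ℓ (set z = 0, x + y = 1) as well as the point P.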
Definition E.8. Let W ⊆ V3 be a vector subspace of dimension 2. The set ΠW of all planes in A3
which admit W as direction space is called an improper bundle of planes, and W is called the vector
subspace associated to the bundle ΠW .
The connection between bundles of planes and improper bundles of planes is best understood
through projective geometry, where we can think of the improper bundle of planes as the set of all
planes intersecting in a line at infinity.
APPENDIX F
In this section we collect classical theorems in affine geometry. The first three theorems are proved
vectorially (the theorem of Thales F.1, the affine version of Pappus' theorem F.2 and the theorem
of Desargues F.3). We then introduce Lemma F.4 and Lemma F.5, which allow us to prove the next
four theorems by means of frames (Pappus' hexagon theorem F.6, the Newton-Gauss theorem F.7, and the
theorems of Menelaus F.8 and Ceva F.9).
It is important to point out that in the process of deducing the affine space structure from
Hilbert’s Axioms we made use of some of these results. Indeed, in [14], Theorem 40 is Pappus’
affine theorem F.2 and is a particular case of Pascal’s theorem F.11. This result was implicitly used
to define and derive properties of the multiplication of vectors with scalars. Therefore, proving the
following theorem after using results from [14] may introduce circular reasoning.
The value of the following proofs lies in the fact that we prove them starting from the notion
of an affine space only. Moreover, the proofs are such that they work not only for real affine spaces
but for affine spaces over other commutative fields. These latter affine spaces are outside the scope
of these notes. However, the proofs which allow this type of generalization are certainly of interest
since they capture the essence of the statements.
The following theorem asserts that the ratio in which parallel lines cut a transversal line ℓ does
not depend on ℓ but only on the parallel lines.
Theorem F.1 (Thales'1 intercept theorem). Let H, H′ and H″ be three distinct parallel lines in the
affine plane A2. Let ℓ1 and ℓ2 be two lines not parallel to H, H′ and H″. For i = 1, 2 let
Pi = ℓi ∩ H,   Pi′ = ℓi ∩ H′,   Pi″ = ℓi ∩ H″.
Then, writing \vec{Pi Pi″} = ki \vec{Pi Pi′} for i = 1, 2, we have k1 = k2.
1 c.626/623 – c.548/545 BC
Proof. We follow the proof in [19]. If ℓ1 = ℓ2 the theorem is trivial. Suppose that ℓ1 ≠ ℓ2. Then, if
P1 = P2 the points P1′ and P2′ must be distinct. Thus, interchanging H with H′ we may assume that
P1 ≠ P2. Now let v = \vec{P1P2} and consider the vectors
\vec{P2P2′} − \vec{P1P1′} = \vec{P1′P2′} − \vec{P1P2} = αv
\vec{P2P2″} − \vec{P1P1″} = \vec{P1″P2″} − \vec{P1P2} = βv
where α and β are scalars.
If α = 0, from the first equation we have \vec{P2P2′} = \vec{P1P1′}. Since these are direction vectors of ℓ1
and ℓ2 it follows that the two lines are parallel. Since \vec{P2P2″} and \vec{P1P1″} are also direction vectors for
the two lines, we see from the second equation that βv is a direction vector for ℓ1 and ℓ2. But v is a
direction vector for H which is not parallel to ℓ1. Thus, β = 0. Moreover, in this case we have
\vec{P2P2″} = k2 \vec{P2P2′} = k2 \vec{P1P1′}
\vec{P1P1″} = k1 \vec{P1P1′}
and since \vec{P2P2″} = \vec{P1P1″} it follows that k1 = k2.
If α ≠ 0, then
\vec{P2P2″} − \vec{P1P1″} = βv = α−1β(αv) = α−1β \vec{P2P2′} − α−1β \vec{P1P1′}.
On the other hand,
\vec{P2P2″} − \vec{P1P1″} = k2 \vec{P2P2′} − k1 \vec{P1P1′}.
Now, since α ≠ 0, the vectors \vec{P1P1′} and \vec{P2P2′} are not parallel, i.e. they are linearly independent.
Therefore, the last two equations imply k2 = α−1β = k1.
Remark. Theorem F.1 can be generalized to the case where the three parallel lines are replaced by
parallel hyperplanes. The proof does not change if H, H′ and H″ are replaced by hyperplanes.
Theorem F.2 (Pappus’2 affine theorem). Let H, H ′ be two distinct lines in the affine plane A2 . Let
P , Q, R ∈ H and P ′ , Q′ , R′ ∈ H ′ be distinct points, none of which lies at the intersection H ∩ H ′ . If
P Q′ ∥ P ′ Q and QR′ ∥ Q′ R then P R′ ∥ P ′ R.
Proof. We follow the proof in [19]. Suppose H and H ′ are not parallel, and let H ∩ H ′ = {O}. By
Theorem F.1, for some scalars h, k we have
\vec{OP′} = k \vec{OQ′}, \vec{OQ} = k \vec{OP}, since PQ′ ∥ P′Q,
\vec{OQ′} = h \vec{OR′}, \vec{OR} = h \vec{OQ}, since QR′ ∥ Q′R.
But then,
\vec{PR′} = \vec{OR′} − \vec{OP} = h−1 \vec{OQ′} − k−1 \vec{OQ}
\vec{RP′} = \vec{OP′} − \vec{OR} = k \vec{OQ′} − h \vec{OQ},
and so \vec{RP′} = hk \vec{PR′}, that is, RP′ ∥ PR′. If H ∥ H′, then
\vec{PQ} = \vec{Q′P′}, since PQ ∥ Q′P′ and PQ′ ∥ P′Q,
\vec{QR} = \vec{R′Q′}, since QR ∥ R′Q′ and QR′ ∥ Q′R,
and so
\vec{PR} = \vec{PQ} + \vec{QR} = \vec{Q′P′} + \vec{R′Q′} = \vec{R′P′}.
Thus PR′ ∥ P′R.
2 c.290 – c.350
Theorem F.3 (Desargues’3 theorem). Let A, B, C, A′ , B′ , C ′ ∈ A2 be points such that no three are collinear,
and such that AB ∥ A′ B′ , BC ∥ B′ C ′ and AC ∥ A′ C ′ . Then the three lines AA′ , BB′ and CC ′ are either
parallel or have a point in common.
Proof. We follow the proof in [19]. Suppose that AA′ , BB′ and CC ′ are not parallel. Then two of them
meet, and we may assume that AA′ ∩ BB′ = {O}. By Theorem F.1 applied to AB and A′ B′ we have
\vec{OA′} = k \vec{OA} and \vec{OB′} = k \vec{OB}
for some scalar k. Let {C″} = OC ∩ A′C′. By Theorem F.1, now applied to AC and A′C′, we have
\vec{OC″} = k \vec{OC}
since \vec{OA′} = k \vec{OA}. On the other hand, putting {C‴} = OC ∩ B′C′, Theorem F.1 applied to the lines
BC and B′C′ implies
\vec{OC‴} = k \vec{OC}
since \vec{OB′} = k \vec{OB}. Then, the last two equations imply that C″ = C‴ = C′, and so O, C and C′ are
collinear.
Lemma F.4. Let K be a frame of A2. Consider the points A(a, 0), B(b, 0), A′(0, a′) and B′(0, b′) with
a, b, a′, b′ non-zero. If the lines ℓ1 = AB′ and ℓ2 = A′B are not parallel, they meet in the point with
coordinates
( ab(a′ − b′)/(aa′ − bb′), a′b′(a − b)/(aa′ − bb′) ).
3 1591 – 1661
Proof. A direction vector for ℓ1 is (−a, b′) and a direction vector for ℓ2 is (−b, a′). The two vectors are
linearly dependent if and only if
a/b = b′/a′ ⇔ aa′ = bb′ ⇔ aa′ − bb′ = 0.
If ℓ1 and ℓ2 are not parallel, the intersection point is obtained by solving the system given by the
equations of the two lines.
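As a quick numerical check of the formula in Lemma F.4: for a = 2, b = 1, a′ = 2, b′ = 1 the lines are AB′ : x/2 + y = 1 and A′B : x + y/2 = 1, whose intersection is (2/3, 2/3), and the formula gives the same value,
\[
\Big(\frac{ab(a'-b')}{aa'-bb'},\ \frac{a'b'(a-b)}{aa'-bb'}\Big)
= \Big(\frac{2\cdot 1\cdot(2-1)}{4-1},\ \frac{2\cdot 1\cdot(2-1)}{4-1}\Big)
= \Big(\tfrac{2}{3},\tfrac{2}{3}\Big).
\]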
Lemma F.5. Let K be a frame of A2. Consider the points A(a, 0), B(b, 0), A′(a′, h′) and B′(b′, h′) with
a ≠ b, a′ ≠ b′ and h′ ≠ 0. If the lines AB′ and A′B are not parallel, they meet in the point with
coordinates
( (bb′ − aa′)/(b + b′ − a − a′), h′(b − a)/(b + b′ − a − a′) ).
Proof. A direction vector for AB′ is (b′ − a, h′) and a direction vector for A′B is (a′ − b, h′). The equations
of the two lines are
AB′ : (x − a)/(b′ − a) = y/h′ and A′B : (x − b)/(a′ − b) = y/h′.
If the two lines are not parallel, the intersection point is obtained by solving the system given by these equations.
Theorem F.6 (Pappus’4 hexagon theorem). Let H, H ′ be two distinct lines in the affine plane A2 . Let
P , Q, R ∈ H and P ′ , Q′ , R′ ∈ H ′ be distinct points, none of which lies at the intersection H ∩H ′ . Assume
that P Q′ ∩ P ′ Q = {X}, P R′ ∩ P ′ R = {Y } and QR′ ∩ Q′ R = {Z}. Then the points X, Y , Z are collinear.
Proof. We first consider the case where H and H′ are not parallel. Let H ∩ H′ = {O} and choose a
frame K with the origin in O, with the first coordinate axis H and the second coordinate axis H′.
With respect to K the coordinates of the given points are P(p, 0), Q(q, 0), R(r, 0) and P′(0, p′), Q′(0, q′), R′(0, r′).
Applying Lemma F.4 three times with the frame K, we obtain the coordinates of the intersection
points X = PQ′ ∩ P′Q, Y = PR′ ∩ P′R and Z = QR′ ∩ Q′R.
We may check the collinearity of these points by calculating a determinant. This amounts to an
algebraic calculation. We may simplify this calculation by noticing that the frame K can be chosen
such that p = p′ = 1. A calculation shows that
\[
\begin{vmatrix}
\frac{q(1-q')}{1-qq'} & \frac{q'(1-q)}{1-qq'} & 1 \\
\frac{r(1-r')}{1-rr'} & \frac{r'(1-r)}{1-rr'} & 1 \\
\frac{qr(q'-r')}{qq'-rr'} & \frac{q'r'(q-r)}{qq'-rr'} & 1
\end{vmatrix}
= \frac{1}{(1-qq')(1-rr')(qq'-rr')}
\begin{vmatrix}
q(1-q') & q'(1-q) & 1-qq' \\
r(1-r') & r'(1-r) & 1-rr' \\
qr(q'-r') & q'r'(q-r) & qq'-rr'
\end{vmatrix}
= 0.
\]
Now consider the case where H and H′ are parallel. Choose a frame K with origin P and with the
first coordinate axis H and the second coordinate axis the line PP′. With respect to K the coordinates
of the points are P(0, 0), Q(q, 0), R(r, 0) and P′(0, h′), Q′(q′, h′), R′(r′, h′).
4 c.290 – c.350
Applying Lemma F.5 three times with the frame K, we obtain the coordinates of the intersection
points:
X( qq′/(q + q′), h′q/(q + q′) ),   Y( rr′/(r + r′), h′r/(r + r′) ),   Z( (qq′ − rr′)/(q + q′ − r − r′), h′(q − r)/(q + q′ − r − r′) ).
Similar to the previous case, the collinearity of these points can be checked by calculating a determi-
nant. It is easy to see that this determinant is zero:
\[
\frac{h'}{(q+q')(r+r')(q+q'-r-r')}
\begin{vmatrix}
qq' & q & q+q' \\
rr' & r & r+r' \\
qq'-rr' & q-r & q+q'-r-r'
\end{vmatrix}
= 0.
\]
Remark. The two theorems of Pappus can be unified in the context of projective geometry. If in
the previous theorem (Theorem F.6) we let the points X, Y , Z go to infinity, we obtain Theorem F.2.
Moreover, the union of the two lines H and H ′ in these theorems is a degenerate conic. Both these
theorems are particular cases of Pascal’s Theorem F.11.
Theorem F.7 (Newton-Gauss line). A complete quadrilateral is the configuration of four lines, no
three of which pass through the same point. Let O, A, B, C, A′, B′ be the intersection points of such
lines, i.e. the vertices of the complete quadrilateral, such that A lies between O and B and such that
A′ lies between O and B′. The diagonals of this quadrilateral are the segments [OC], [AA′] and [BB′].
The midpoints of the diagonals of a complete quadrilateral are collinear.
Proof. Choose the coordinate frame with origin O and basis (\vec{OA}, \vec{OA′}). Then the coordinates of the
points are O(0, 0), A(1, 0), A′(0, 1), B(b, 0), B′(0, b′) and by Lemma F.4
C( b(1 − b′)/(1 − bb′), b′(1 − b)/(1 − bb′) ).
The midpoints of the diagonals [OC], [AA′] and [BB′] are
X( b(1 − b′)/(2(1 − bb′)), b′(1 − b)/(2(1 − bb′)) ),   Y( 1/2, 1/2 ),   Z( b/2, b′/2 ).
The collinearity of the three points is equivalent to the vanishing of the following determinant:
\[
\frac{1}{4(1-bb')}
\begin{vmatrix}
b(1-b') & b'(1-b) & 1-bb' \\
1 & 1 & 1 \\
b & b' & 1
\end{vmatrix}
= 0.
\]
The following two theorems give necessary and sufficient conditions for three points to be collinear
and, respectively, for three lines to be concurrent in terms of oriented ratios of the sides of a triangle.
In the context of projective geometry, Menelaus’ theorem and Ceva’s theorem may be seen as projec-
tive duals (see for example [2]). In order to state them, we need to introduce oriented ratios. Let P be
a point on the line AB. The oriented ratio (or signed ratio) AP : PB, in which P divides the segment
[AB], is the ordinary ratio |AP| : |PB| if P is between A and B and it is −|AP| : |PB| otherwise. Notice
that the signed ratio is defined by the equation
\vec{AP} = (AP : PB)\,\vec{PB}   (F.1)
which is why it is sometimes written as \vec{AP}/\vec{PB}.
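For example, if P is the midpoint of [AB] then \vec{AP} = \vec{PB} and AP : PB = 1, while if B is the midpoint of [AP] then
\[
\vec{AP} = 2\,\vec{AB} = -2\,\vec{PB}, \qquad AP : PB = -2,
\]
consistent with the convention that the ratio is negative exactly when P lies outside the segment [AB].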
Theorem F.8 (Menelaus’5 Theorem). Let ABC be a triangle and ℓ a line which does not pass through
the vertices of the triangle. Let X, Y , Z be points on the lines BC, CA and AB respectively and consider
the signed ratios α = BX : XC, β = CY : Y A and γ = AZ : ZB. Then X, Y , Z are collinear if and only if
α · β · γ = −1.
5 c.70–c.130
Proof. By Proposition 6.4, affine transformations preserve oriented ratios, i.e. oriented ratios are independent
of the choice of the Cartesian frame. We choose a frame with origin A and basis (\vec{AC}, \vec{AZ}).
Consider the coordinates of the points in this frame: A(0, 0), B(0, b), C(1, 0), Y(y, 0), Z(0, 1). By Lemma
F.4, the point X ∈ BC lies on the line YZ if and only if it has the following coordinates
X( y(1 − b)/(1 − by), b(1 − y)/(1 − by) ).   (F.2)
Considering the components of the relevant vectors
\vec{BX} = ( y(1 − b)/(1 − by) ) (1, −b) and \vec{XC} = ( (1 − y)/(1 − by) ) (1, −b)
we see that (F.2) holds if and only if
α = BX : XC = y(1 − b)/(1 − y).   (F.3)
Considering the other vectors we have
\vec{AZ} = (0, 1), \vec{ZB} = (b − 1)(0, 1) ⇒ γ = AZ : ZB = 1/(b − 1) and
\vec{CY} = (y − 1)(1, 0), \vec{YA} = −y(1, 0) ⇒ β = CY : YA = (y − 1)/(−y).
It follows that (F.3) is equivalent to
α = −1/(β · γ) ⇔ α · β · γ = −1.
Theorem F.9 (Ceva’s6 Theorem). Let ABC be a triangle. Let E ∈ BC, F ∈ CA and G ∈ AB and consider
the signed ratios α = BE : EC, β = CF : FA and γ = AG : GB. The three lines AE, BF and CG are
concurrent if and only if
α · β · γ = 1.
6 1647-1734
Proof. By Proposition 6.4, affine transformations preserve oriented ratios, i.e. oriented ratios are independent
of the choice of the Cartesian frame. We choose a frame with origin A and basis (\vec{AG}, \vec{AF}).
Consider the coordinates of the points in this frame: A(0, 0), B(b, 0), C(0, c), F(0, 1), G(1, 0). By Lemma
F.4, the intersection point P of the lines BF and CG has coordinates
P( b(1 − c)/(1 − bc), c(1 − b)/(1 − bc) ).
Then, the lines AE, BF, CG are concurrent if and only if P lies on AE, i.e. if and only if E is the
intersection of AP with BC. These two lines are described by the equations
BC : x/b + y/c = 1 and AP : x = ( b(1 − c)/(1 − bc) ) t, y = ( c(1 − b)/(1 − bc) ) t.
Solving this system, the line AP meets BC in the point with coordinates ( b(1 − c)/(2 − b − c), c(1 − b)/(2 − b − c) ),
and computing the signed ratio in which this point divides [BC] we see that E is this point if and only if
α = BE : EC = (b − 1)/(c − 1).   (F.5)
Considering the other vectors we have
\vec{AG} = (1, 0), \vec{GB} = (b − 1)(1, 0) ⇒ γ = AG : GB = 1/(b − 1) and
\vec{CF} = (1 − c)(0, 1), \vec{FA} = −(0, 1) ⇒ β = CF : FA = c − 1.
It follows that (F.5) is equivalent to
α = 1/(β · γ) ⇔ α · β · γ = 1.
Remark. The lines AE, BF and CG are known as cevians. As is often the case with mathematical
discoveries, attributing a theorem to a single mathematician can be challenging. Today, we know
that a proof of Ceva’s theorem was discovered much earlier by Al-Mu’taman7 .
The intersection point of two cevians can be described vectorially as follows.
Proposition F.10. Let ABC be a triangle. Let G ∈ AB and F ∈ AC be two points distinct from the
vertices of the triangle and let P = BF ∩ CG. Consider the signed ratios λ = AG : GB and µ = AF : FC.
For any point O we have
\vec{OP} = ( \vec{OA} + λ \vec{OB} + µ \vec{OC} ) / (1 + λ + µ).   (F.6)
Proof. By (F.1) we have \vec{AG} = λ \vec{GB} and \vec{AF} = µ \vec{FC}. Therefore
\vec{AB} = \vec{AG} + \vec{GB} = (1 + λ) \vec{GB} = ( (1 + λ)/λ ) \vec{AG}
and
\vec{AC} = \vec{AF} + \vec{FC} = (1 + µ) \vec{FC} = ( (1 + µ)/µ ) \vec{AF}.
In other words, with respect to the frame with origin A and basis (\vec{AB}, \vec{AC}), we have the following
coordinates: B(1, 0), C(0, 1), G(λ/(1 + λ), 0) and F(0, µ/(1 + µ)). Thus, by Lemma F.4, the intersection
point P has coordinates
P( λ/(1 + λ + µ), µ/(1 + λ + µ) ).
Hence, for any point O we have
\vec{OP} = \vec{OA} + \vec{AP} = \vec{OA} + ( λ \vec{AB} + µ \vec{AC} )/(1 + λ + µ) = \vec{OA} + ( (λ + µ) \vec{AO} + λ \vec{OB} + µ \vec{OC} )/(1 + λ + µ) = ( \vec{OA} + λ \vec{OB} + µ \vec{OC} )/(1 + λ + µ).
Remark. Equation (F.6) in the above proposition does not depend on the point O. The coefficients
of the three vectors are the barycentric coordinates of the point P relative to the triangle ABC (See
Section D.4). The above proposition can be applied to derive the barycentric coordinates of the
intersection point of different cevians, such as the medians, the altitudes, etc.
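For instance, for the medians from B and C the points F and G are the midpoints of [AC] and [AB], so λ = µ = 1 and (F.6) gives the familiar formula for the centroid:
\[
\vec{OP} = \frac{\vec{OA} + \vec{OB} + \vec{OC}}{3},
\]
i.e. the barycentric coordinates of the centroid relative to ABC are (1 : 1 : 1).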
7 11th century
Theorem F.11 (Pascal’s hexagrammum mysticum theorem). If all six vertices of a hexagon lie on a
conic section and the three pairs of opposite sides intersect, then the three points of intersection are
collinear.
Proof. In the case where the conic section is a circle, a proof of this theorem based on Menelaus'
theorem can be found in [8, §3.8].
APPENDIX G
Definition G.1. Let V be a R-vector space and let φ : V → V be a linear map. A non-zero vector v ∈ V
is called an eigenvector of φ if there is a scalar λ ∈ R such that φ(v) = λv. The scalar λ is then called
the eigenvalue associated to the eigenvector v. The set of eigenvalues of φ is called the spectrum of φ.
For A ∈ Matn×n(R) an eigenvector of A is an eigenvector v ∈ Rn for the map φA : Rn → Rn defined
by φA(x) = Ax, and an eigenvalue of A is an eigenvalue for φA.
Proposition G.3. If v1 , v2 ∈ V are eigenvectors with the same eigenvalue λ, then for every c1 , c2 ∈ R
the vector c1 v1 + c2 v2 , if it is non-zero, is also an eigenvector with eigenvalue λ.
Definition G.4. From the above proposition it follows that for each λ ∈ R the set
Vλ(φ) = { v ∈ V : φ(v) = λv }
is a vector subspace of V, called the eigenspace for the eigenvalue λ. For a matrix A ∈ Matn×n(R) the
eigenspace for the eigenvalue λ is defined to be the subspace Vλ(A) := Vλ(φA) in Rn.
Proposition G.6. If every v ∈ V \ {0} is an eigenvector of φ then there exists λ ∈ R such that φ = λ IdV .
• For each real root λ of the characteristic polynomial PA, the matrix A − λIn has rank r < n and therefore
the homogeneous system (A − λIn) · x = 0 has nontrivial solutions. The space of solutions is the eigenspace
Vλ(A).
• If the sum of the dimensions of the eigenspaces found by letting λ vary over the roots of PA is
equal to n then A is diagonalizable (by Theorem G.12).
• If dim(V) is odd, then the characteristic polynomial has odd degree and so has at least one real
root. Thus, every linear map V → V on an odd dimensional real vector space has at least one
eigenvalue, and so at least one eigenvector.
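As a small worked example (with a matrix chosen only for illustration), take
\[
A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
P_A(\lambda) = \det(A - \lambda I_2) = (2-\lambda)^2 - 1 = (\lambda - 1)(\lambda - 3),
\]
with eigenspaces V1(A) = { (t, −t) : t ∈ R } and V3(A) = { (t, t) : t ∈ R }; their dimensions add up to 2 = n, so A is diagonalizable.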
Definition G.14. Let V be a finite dimensional R-vector space and let φ : V → V be linear admitting
λ as eigenvalue. The number dim(Vλ (φ)) is called the geometric multiplicity of λ for φ. The algebraic
multiplicity of λ for φ is instead the multiplicity of λ as a root of the characteristic polynomial Pφ ; this
is denoted by hφ (λ).
One always has dim(Vλ(φ)) ≤ hφ(λ), that is, the geometric multiplicity is not larger than the algebraic multiplicity.
APPENDIX H
1. We prove Sylvester’s theorem (Theorem H.7) which shows that the properties in Proposition
4.15 define scalar products (Corollary H.10).
2. From the proof of Sylvester’s theorem one obtains a simple algorithm for the affine classification
of quadrics (see Chapter 10).
3. We prove the Spectral theorem (Theorem H.14) which is used in the isometric classification of
quadratic surfaces (see Chapter 10).
Bilinear forms are the natural context for the proofs of both theorems.
Definition H.1. A map φ : V × V → R is called a bilinear form if it is linear in both arguments, i.e.
φ(au + bv, w) = aφ(u, w) + bφ(v, w) and φ(u, av + bw) = aφ(u, v) + bφ(u, w). (H.1)
With respect to a basis B = (e1, . . . , en) of V, the values of a bilinear form φ are determined by its
values on the basis vectors. These values are the entries in the Gram matrix GB(φ) of φ relative to B,
which is defined by (GB(φ))ij = φ(ei, ej). This matrix determines φ since
φ(v, w) = [v]TB · GB(φ) · [w]B.
Moreover, if B′ is another basis of V, then
GB′(φ) = M_{B,B′}^{T} · GB(φ) · M_{B,B′}.   (H.2)
In particular, the rank of GB (φ) does not depend on B. It depends only on φ. We denote it by rank(φ)
and call it the rank of φ.
Definition H.2. A bilinear form φ : Vn × Vn → R is called symmetric if its Gram matrix GB(φ) with
respect to some basis B is a symmetric matrix. It follows from (H.2) that this definition doesn't depend
on the basis B.
Definition H.3. Let φ be a bilinear form on the vector space V. The quadratic form associated to φ is
the map
qφ : V → R defined by qφ (v) = φ(v, v).
Proposition H.4. Let φ be a symmetric bilinear form on the vector space V. The quadratic form qφ
associated to φ satisfies
qφ(λv) = λ² qφ(v)
2φ(v, w) = qφ(v + w) − qφ(v) − qφ(w)
for every λ ∈ R and every v, w ∈ V.
Proof. The first property is an immediate consequence of (H.1). Further,
qφ (v + w) − qφ (v) − qφ (w) = φ(v + w, v + w) − φ(v, v) − φ(w, w)
= φ(v, w) + φ(w, v)
= 2φ(v, w).
Remark. In particular, Proposition H.4 implies that the quadratic form qφ determines uniquely the
symmetric bilinear form φ, i.e. the correspondence φ ↔ qφ is a bijection between symmetric bilinear
forms and quadratic forms.
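For example (for a quadratic form on R2 chosen only for illustration), if qφ(x1, x2) = x1² + 6x1x2 + 2x2², then the identity above recovers the symmetric bilinear form:
\[
\varphi(e_1, e_2) = \tfrac{1}{2}\big(q_\varphi(e_1 + e_2) - q_\varphi(e_1) - q_\varphi(e_2)\big)
= \tfrac{1}{2}(9 - 1 - 2) = 3,
\qquad
G_B(\varphi) = \begin{pmatrix} 1 & 3 \\ 3 & 2 \end{pmatrix},
\]
where B is the standard basis.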
Definition H.5. Let φ be a symmetric bilinear form on the vector space V. A diagonalizing basis for
φ is a basis B of V such that GB (φ) is a diagonal matrix.
Theorem H.6. Let φ be a symmetric bilinear form on the vector space V. Then there exists a diago-
nalizing basis for φ.
Proof. We take the proof from [19, p.232]. The proof is ‘by completing the squares' and induction on
n. If n = 1 there is nothing to prove. Suppose therefore that n ≥ 2, and that every symmetric bilinear
form on a space of dimension less than n has a diagonalizing basis. Choose a basis B = (b1, . . . , bn)
of V. If φ is the zero form, then B is diagonalizing and there is nothing to prove. Otherwise we can
obtain from B a second basis B′ = (c1, . . . , cn) for which φ(c1, c1) ≠ 0. Indeed, if there is some i for
which φ(bi, bi) ≠ 0 then it suffices to exchange b1 and bi. If, on the other hand, φ(bi, bi) = 0 for all i,
then there are i ≠ j for which φ(bi, bj) ≠ 0 (otherwise φ is the zero form), and again we can exchange
these with b1 and b2 so that φ(b1, b2) ≠ 0. The new basis
B′ = (b1 + b2, b2, . . . , bn)
has the required property. In the basis B′, the quadratic form qφ associated to φ has the form
q(v(y1, . . . , yn)) = h11 y1² + 2 \sum_{i=2}^{n} h1i y1 yi + \sum_{i,j=2}^{n} hij yi yj,
where hij = φ(ci, cj). Since h11 = φ(c1, c1) ≠ 0 we can rewrite the above equation as
q(v(y1, . . . , yn)) = h11 ( y1 + h11^{-1} \sum_{i=2}^{n} h1i yi )² + (terms not involving y1),
which corresponds to a change of basis from B′ to B″ = (d1, . . . , dn) given by d1 = c1 and, for i > 1,
di = ci − h11^{-1} h1i c1. In these coordinates, qφ has the form
q(v(z1, . . . , zn)) = h11 z1² + q′(z2, . . . , zn),
where q′ is the quadratic form of a symmetric bilinear form on the span of d2, . . . , dn. By the inductive
hypothesis this span has a basis diagonalizing that form, and together with d1 it gives a diagonalizing
basis of V for φ.
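To see the completing-the-square step on a concrete example (the quadratic form from the example above): for qφ(x, y) = x² + 6xy + 2y² on R2 one gets
\[
q_\varphi(x, y) = (x + 3y)^2 - 7y^2 ,
\]
so in the basis d1 = e1, d2 = e2 − 3e1 the form reads z1² − 7z2² and its Gram matrix is already diagonal, with entries 1 and −7.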
Theorem H.7. (Sylvester's1 Theorem) Let φ be a symmetric bilinear form of rank r on the vector
space V. Then there is an integer p depending only on φ, and a basis B of V such that the Gram
matrix GB(φ) has the form
\[
\begin{pmatrix} I_p & 0 & 0 \\ 0 & -I_{r-p} & 0 \\ 0 & 0 & 0 \end{pmatrix}
\tag{H.3}
\]
where 0 stands for zero matrices of appropriate sizes.
Proof. We take the proof from [19, p.232]. By Theorem H.6 there is a basis B = (b1, . . . , bn) for which
qφ(v) = b11 y1² + · · · + bnn yn²
for each v = y1 b1 + · · · + yn bn ∈ V. The number of non-zero coefficients bii is equal to the rank of the
quadratic form qφ, and so, depends only on φ. After possibly reordering the basis, we can suppose
that the first p coefficients are positive, the next r − p are negative, and the remaining n − r are zero.
Then one has
qφ(v) = β1² y1² + · · · + βp² yp² − βp+1² yp+1² − · · · − βr² yr²
for appropriate β1, . . . , βr ∈ R, which can be taken to be positive. Then, with respect to the basis
c1 = b1/β1, . . . , cr = br/βr, cr+1 = br+1, . . . , cn = bn,
the matrix of φ is that given in (H.3), and so the quadratic form qφ is
qφ(v) = y1² + · · · + yp² − yp+1² − · · · − yr².
Theorem H.8. (Sylvester's Theorem - matrix formulation) Let A be a symmetric matrix of rank r.
Then there is an integer p depending only on A, and an invertible matrix P such that
\[
P^{T} A P = \begin{pmatrix} I_p & 0 & 0 \\ 0 & -I_{r-p} & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]
Proof. Let φ be the symmetric bilinear form whose Gram matrix with respect to a basis B is A. By
Theorem H.7 there is a basis B′ such that GB′(φ) has the above form, and by (H.2) we have
GB′(φ) = P^{T} A P, where P = M_{B,B′}. Since p depends only on φ, it does not depend on B and B′, i.e. it depends only on
A and the claim follows.
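Continuing the example qφ(x, y) = x² + 6xy + 2y² = (x + 3y)² − 7y² from above: here r = 2 and p = 1, and rescaling the second diagonalizing vector by 1/√7 gives an invertible matrix P with
\[
P = \begin{pmatrix} 1 & -3/\sqrt{7} \\ 0 & 1/\sqrt{7} \end{pmatrix},
\qquad
P^{T} \begin{pmatrix} 1 & 3 \\ 3 & 2 \end{pmatrix} P = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\]
in agreement with Theorem H.8.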
Corollary H.9. Let φ be a positive definite symmetric bilinear form on an n-dimensional vector space
V. There is a basis B of V such that GB (φ) = In .
Proof. By Theorem H.7 there is a basis B = (b1, . . . , bn) such that the Gram matrix of φ has the form
(H.3). Since φ is positive definite, i.e. φ(v, v) > 0 for all non-zero vectors v, its Gram matrix must
equal the identity matrix In (otherwise φ(bn, bn) equals 0 or −1, a contradiction).
Corollary H.10. Assume that the dimension of the space V of geometric vectors is n and assume that
a unit segment has been chosen. Then, the properties in Proposition 4.15 define the scalar product.
More concretely, there is a unique positive definite symmetric bilinear form φ which recognizes right
angles and unit lengths.
Proof. The first three properties in Proposition 4.15 assert that the scalar product is a positive definite
symmetric bilinear form on the space Vn of geometric vectors. By Corollary H.9 there is a basis B
such that the Gram matrix of the scalar product equals In. Therefore, by property (SP4), B is an
orthonormal basis.
Let φ be another positive definite symmetric bilinear form. Since φ is symmetric, the Gram
matrix GB (φ) is symmetric. If φ satisfies property (SP4), i.e. if it recognizes right angles and unit
lengths, then we must have GB (φ) = In since B is orthonormal. Hence φ is the scalar product.
Let M ∈ Matn×n(R) and let B be a basis of Vn. Define a linear map φ : Vn → Vn by [φ(v)]B = M · [v]B
and a bilinear form ψ : Vn × Vn → R by ψ(v, w) = [v]TB · M · [w]B.
We say that φ is the linear map associated to M in the basis B and ψ is the bilinear form associated to
M in the basis B. The other way around, given a linear map φ : Vn → Vn, it has an associated matrix
MB,B(φ) with respect to the basis B. Similarly, given a bilinear form ψ : Vn × Vn → R we associate to
it the Gram matrix GB(ψ) (see Section H.1). Then, for any other basis B′,
M_{B′,B′}(φ) = M_{B,B′}^{-1} · M_{B,B}(φ) · M_{B,B′} and G_{B′}(ψ) = M_{B,B′}^{T} · G_B(ψ) · M_{B,B′}.   (H.4)
In this setting we add the following assumptions: we consider the scalar product ⟨ , ⟩ on Vn, we
let B be an orthonormal basis and we assume that M is a symmetric matrix, i.e. M = MT. Then, we
let φ be the linear map associated to M in the basis B and we let ψ be the bilinear form associated to
M in the basis B. Since the Gram matrix of the scalar product with respect to B is In, we have
ψ(v, w) = [v]TB · M · [w]B = [v]TB · [φ(w)]B = ⟨v, φ(w)⟩.
Since M is symmetric, we also have
[v]TB · M · [w]B = (MT · [v]B)T · [w]B = (M · [v]B)T · [w]B = [φ(v)]TB · [w]B
and therefore
⟨φ(v), w⟩ = ψ(v, w) = ⟨v, φ(w)⟩.   (H.5)
Definition H.11. A linear map φ : Vn → Vn is called symmetric (relative to the scalar product) if (H.5)
holds for all v, w ∈ Vn .
The proof of the Spectral Theorem uses the concept of orthogonal complement to a vector.
Definition H.12. Let v ∈ Vn . The orthogonal complement, denoted by v⊥ , is the set of all vectors in Vn
which are orthogonal to v. So
v⊥ = { w ∈ Vn : ⟨v, w⟩ = 0 }.
Since the scalar product is bilinear, the map fv : Vn → R defined by fv(w) = ⟨v, w⟩ is linear and we
notice that v⊥ = ker(fv ). Thus, if v is non-zero, v⊥ is an (n − 1)-dimensional vector subspace of Vn .
The proof of the Spectral Theorem also uses the following linear algebra fact.
Lemma H.13. The characteristic polynomial of a symmetric matrix M ∈ Matn×n (R) has only real
roots. Equivalently, the eigenvalues of a symmetric linear map are all real.
Proof. The equivalence of the two statements is obvious, by considering the linear map associated to
M. We take the proof of the first claim from [19, p.232].
We can consider M as a matrix over C (that is, with complex entries), and so we may view the map
x ↦ Mx as a linear map Cn → Cn. Let λ ∈ C be a root of the characteristic polynomial of M, and let
x = (x1, . . . , xn) ∈ Cn be a corresponding eigenvector. Then
Mx = λx.
Taking the complex conjugates of both sides (and using that M has real entries) gives
M\overline{x} = \overline{λ}\,\overline{x}.
Consider the scalar \overline{x}^T M x. Writing it in two different ways using the above equations gives
\overline{x}^T M x = \overline{x}^T (Mx) = \overline{x}^T (λx) = λ\,\overline{x}^T x   (H.6)
\overline{x}^T M x = (\overline{x}^T M) x = (M\overline{x})^T x = (\overline{λ}\,\overline{x})^T x = \overline{λ}\,\overline{x}^T x   (H.7)
Note that
\overline{x}^T x = \overline{x1}\,x1 + · · · + \overline{xn}\,xn
is a strictly positive real number, since x ≠ 0. We can therefore deduce from (H.6) and (H.7) that
\overline{λ} = λ, that is that λ is real.
Theorem H.14 (Spectral Theorem). Let φ : Vn → Vn be a symmetric linear map. Then, there exists
an orthonormal basis B such that MB,B (φ) is a diagonal matrix.
Equivalently, let ψ : Vn ×Vn → R be a symmetric bilinear form. Then, there exists an orthonormal
basis B such that GB (ψ) is a diagonal matrix.
Proof. We take the proof of the first claim from [19, p.232]. The proof is by induction on n = dim(Vn).
If n = 1 there is nothing to prove. Suppose therefore that n ≥ 2, and that the theorem holds for spaces
of dimension n − 1. Since φ is symmetric, it has only real eigenvalues, by Lemma H.13. Thus φ has an
eigenvalue λ; let e1 be a corresponding eigenvector, which we can take to be of length 1. Let U = e1⊥,
the orthogonal complement to e1. For each u ∈ U,
⟨φ(u), e1⟩ = ⟨u, φ(e1)⟩ = λ⟨u, e1⟩ = 0,
and so φ(u) ∈ U, that is φ induces a map φU : U → U. Since φU(u) = φ(u) for every u ∈ U, the map φU
is a symmetric linear map. By the inductive hypothesis, U has an orthonormal basis (e2, . . . , en) which
diagonalizes φU. Thus B = (e1, e2, . . . , en) is an orthonormal basis of Vn which diagonalizes φ.
For the second claim we use (H.5). Let φ be the symmetric linear map associated to the Gram
matrix GB ′ (ψ) of ψ for some basis B ′ . Let B = (e1 , e2 , . . . , en ) be the diagonalizing orthonormal basis
for φ obtained in the first part of the proof. By construction, it consists of eigenvectors of φ. Then,
by (H.5), we have
ψ(ei, ej) = ⟨φ(ei), ej⟩ = ⟨λi ei, ej⟩ = λi⟨ei, ej⟩ = λi if i = j and 0 if i ≠ j.
Theorem H.15 (Spectral Theorem - matrix formulation). Let M ∈ Matn×n(R) be a symmetric matrix.
There exists an orthogonal matrix P such that
\[
P^{-1} M P = P^{T} M P =
\begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_n
\end{pmatrix}
\]
where λ1, . . . , λn are the eigenvalues of M.
Proof. Let B′ be an orthonormal basis of Vn, let φ be the symmetric linear map associated to M in the
basis B′ (so that λ1, . . . , λn are also the eigenvalues of φ), and let B be the orthonormal basis which
diagonalizes φ, given by Theorem H.14. Then we may choose P = M_{B′,B} since
P^{-1} M P = M_{B′,B}^{-1} M_{B′,B′}(φ) M_{B′,B} = M_{B,B}(φ)
by (H.4). Moreover, since B and B′ are both orthonormal, it follows that P is an orthogonal matrix, i.e.
P^{-1} = P^{T}, and the proof is finished. Indeed, since B′ is orthonormal and since P = M_{B′,B} is the matrix
whose columns are the components with respect to B′ of the vectors of the orthonormal basis B,
we have P^{T} · P = In, hence P^{-1} = P^{T}.
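As a worked example (with a matrix chosen only for illustration): the symmetric matrix M = ( 2 1; 1 2 ) has eigenvalues 1 and 3 with orthonormal eigenvectors (1, −1)/√2 and (1, 1)/√2, so we may take
\[
P = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix},
\qquad
P^{T} M P = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix},
\]
and P is indeed orthogonal, P^{T} P = I2.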
APPENDIX I
Trigonometric functions
Let a and b be two vectors in V2. In Section 4.1 we noticed that there is a bijection between the
set W of unoriented angles ∡(a, b) and a semicircle (one half of the unit circle S1), as well as a bijection between the set Wor
of oriented angles ∡or(a, b) and the unit circle S1. It is clearly uncomfortable to do calculations in
this setting. For a more comfortable manipulation, one attaches numbers to angles (we parametrize
angles). The standard way to do this is to define a measure of an angle, a numerical value which we
may use instead of describing the angle as two rays emanating from a point. The standard measure
is that of radians, which can be introduced as follows.
Definition I.1. Let a = \vec{OA} and b = \vec{OB} be two non-zero vectors in V2. The interior of the angle
∡AOB is the set of points P with the property that there are scalars α, β ≥ 0 such that \vec{OP} = αa + βb.
The sector S(AOB) is the set of points P in the interior of the angle and in the interior of the circle of
radius 1 centered in O, i.e. d(O, P) ≤ 1.
The measure of the angle ∡(a, b), denoted by m(∡(a, b)), is
m(∡(a, b)) = 2 · Area(S(AOB)) if a and b are linearly independent, m(∡(a, b)) = 0 if a and b have the same direction, and m(∡(a, b)) = π if a and b have opposite directions,
where π is by definition the area of a disc of radius 1. The definition does not depend on the representatives.
Theorem I.2. The properties of the area function translate into properties of angles. For two non-
zero vectors a and b we have
1. 0 ≤ m(∡(a, b)) ≤ π
4. m(∡(a, b)) = m(∡(c, d)) if and only if ⟨a, b⟩/(|a| · |b|) = ⟨c, d⟩/(|c| · |d|).
Definition I.3. The measure of the angle ∡or(a, b), denoted by m(∡or(a, b)), is
m(∡or(a, b)) = m(∡(a, b)) if [a, b] ≥ 0 and m(∡or(a, b)) = −m(∡(a, b)) if [a, b] < 0.
Proposition I.4. The measure of an oriented angle satisfies the following properties
2. 0 < m(∡or (a, b)) < π if and only if (a, b) is right oriented.
3. m(∡or(a, J(a))) = π/2
This allows us to view sin, cos, tan as functions (−π, π] → R. Notice also that if θ is an oriented angle
then its measure may be negative; in that case the sine changes sign while the cosine does not, i.e. we
have the familiar sign rules sin(−θ) = −sin(θ) and cos(−θ) = cos(θ).
In what follows we work out main properties of the trigonometric functions. The bijections (I.1)
allow us to simplify notation: we write ∡or (a, b) instead of m(∡or (a, b)).
Proposition I.5. If a and b are two non-zero vectors in the oriented Euclidean plane E2 and θ = ∡or(a, b), then
cos(θ) = ⟨a, b⟩/(|a| · |b|) and sin(θ) = [a, b]/(|a| · |b|).
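For example, for a = (1, 0) and b = (1, 1) in an orthonormal frame one has ⟨a, b⟩ = 1, [a, b] = 1 and |a||b| = √2, so
\[
\cos(\theta) = \frac{1}{\sqrt{2}}, \qquad \sin(\theta) = \frac{1}{\sqrt{2}}, \qquad \theta = \frac{\pi}{4}.
\]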
Proposition I.6. If a is a non-zero vector in the oriented Euclidean plane E2, and if c² + s² = 1, then
there is a vector b such that cos(∡or(a, b)) = c and sin(∡or(a, b)) = s.
Instead of viewing ∡or (a, b) as a value in (−π, π] it is more convenient to view it as an equivalence
class of R modulo 2π, i.e. θ ∈ R represents the angle ∡or (a, b) if θ = ∡or (a, b) + 2kπ for some integer
k. With this convention we have.
Proposition I.7. If a, b, c are three non-zero vectors in the oriented Euclidean plane E2, then ∡or(a, b) + ∡or(b, c) = ∡or(a, c) (modulo 2π).
tan(α/2) = sin(α)/(1 + cos(α)),    2 cos²(α/2) = 1 + cos(α),    2 sin²(α/2) = 1 − cos(α).
For 0 ≤ θ < π/2 we have sin(θ) ≤ θ ≤ tan(θ).
Corollary I.13. The sine and cosine functions are differentiable and the derivatives are sin′ = cos and cos′ = −sin.
APPENDIX J
In this section we collect classical theorems in Euclidean geometry. The theorems in Appendix F are
purely affine but they also belong to this list.
Theorem J.1 (Euclid's first theorem). Let ABC be a right triangle with \vec{BA} ⊥ \vec{BC}, and let
H ∈ [AC] be such that \vec{BH} ⊥ \vec{AC}. Then |AB|² = |AH| · |AC| and |BC|² = |CH| · |CA|.
Theorem J.2 (Pythagoras' theorem). Let the triangle ABC be right-angled in B. Then |AC|² = |AB|² + |BC|².
Theorem J.3 (Theorem of heights). Let the triangle ABC be right-angled in B and let H be the foot of
the altitude on the hypotenuse [AC]. Then |BH|² = |AH| · |HC|.
Theorem J.4 (Thales’ circle theorem). If A, B, C are points on a circle, then [AC] is a diameter if and
only if ∡ABC is a right angle.
Theorem J.5 (Central angle theorem). Let ABC be a triangle and consider the circumcircle with
center O. We say that the angle ϕ = ∡(ACB) is subtended by the chord [AB] and that the angle
ψ = ∡AOB is the corresponding central angle. We have ψ = 2ϕ.
Theorem J.6 (Ptolemy’s theorem). A quadrilateral is called inscribed if its vertices lie on a circle. A
quadrilateral is inscribed if and only if the sum of the products of the lengths of its two pairs of
opposite sides is equal to the product of the lengths of its diagonals.
Theorem J.7 (Euler's line). The orthocenter H, the centroid G and the circumcenter U of a triangle
are collinear. Moreover,
\vec{HG} = 2 \vec{GU}.
Theorem J.8 (Feuerbach Circle). In every triangle, the three midpoints of the sides, the three feet
of the altitudes, and the three midpoints of the segments joining the vertices to the orthocenter lie on
a circle. Moreover, the center of this circle lies on Euler's line, half way between the orthocenter and
the circumcenter.
APPENDIX K
Definition K.1. Denote the standard basis of R4 by 1, i, j, k and consider the bilinear map
· : R4 × R4 → R4
given on the basis elements by the following multiplication table (the entry in row x and column y is x · y):
·  | 1   i   j   k
1 | 1   i   j   k
i  | i   −1  k   −j
j  | j   −k  −1  i
k | k   j   −i  −1
We denote R4 with the above multiplication by H. The elements of H are called quaternions. The
product is the Hamilton product.
1. The product of two arbitrary quaternions p = a1 + b1i + c1j + d1k and q = a2 + b2i + c2j + d2k is
pq = (a1a2 − b1b2 − c1c2 − d1d2) + (a1b2 + a2b1 + c1d2 − c2d1)i + (a1c2 + a2c1 − b1d2 + b2d1)j + (a1d2 + a2d1 + b1c2 − b2c1)k.   (K.1)
3. H is not commutative; for instance i · j = k = −j · i (see also the example after this list).
5. C = R · 1 + R · i is a subfield of H.
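As a quick check of formula (K.1) and of the non-commutativity in point 3:
\[
(1 + i)(1 + j) = 1 + i + j + ij = 1 + i + j + k,
\qquad
(1 + j)(1 + i) = 1 + i + j + ji = 1 + i + j - k .
\]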
Proposition K.3. A quaternion is real if and only if it commutes with all quaternions, i.e. the center
of H is R.
Proposition K.4. A quaternion is purely imaginary if and only if its square is real and non-positive.
The conjugate of a quaternion q = a + bi + cj + dk is
\overline{q} = a − bi − cj − dk = ℜ(q) − ℑ(q) ∈ H.
1. \overline{p + q} = \overline{p} + \overline{q}
2. \overline{ap} = a\,\overline{p} for a ∈ R
3. \overline{\overline{p}} = p
4. \overline{p · q} = \overline{q} · \overline{p}
5. p ∈ R ⇔ \overline{p} = p
6. p is purely imaginary ⇔ \overline{p} = −p
7. ℜ(p) = (1/2)(p + \overline{p})
8. ℑ(p) = (1/2)(p − \overline{p})
Proposition K.7 (Compare this with the similar statements for C and E2). For p, q ∈ H we have
2. ⟨p, p⟩ = p\overline{p}
3. |p| = \sqrt{p\overline{p}}
5. ⟨p, p⟩ = −p² if p is purely imaginary
6. |p| = \sqrt{−p²} if p is purely imaginary
7. ⟨p, q⟩ = 0 ⇔ pq = −qp, for purely imaginary p and q.
Definition K.8. With our identification, quaternions are vectors in V4 ≅ D(H) ≅ H and the norm |q|
of a quaternion q equals (q\overline{q})^{1/2}. If |q| = 1 we say that q is a unit quaternion.
Every non-zero quaternion q is invertible, with inverse
q^{-1} = \overline{q} / |q|².
Proposition K.11. Let q1, q2 be two quaternions with ai = ℜ(qi), vi = ℑ(qi). Making use of the scalar
product and the cross product in E3, we have
q1 q2 = a1 a2 − ⟨v1, v2⟩ + a1 v2 + a2 v1 + v1 × v2.
With this description of the product one can check that conjugation by a unit quaternion acts as a rotation of E3:
the rotation of angle θ about the axis with unit direction vector v sends a purely imaginary quaternion p to
p′ = q p q^{-1}, where q = cos(θ/2) + sin(θ/2) v.
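For example, for a rotation of angle θ = π about the axis k we get q = cos(π/2) + sin(π/2) k = k, with q^{-1} = −k, and conjugation acts on the imaginary units as expected:
\[
k\,i\,(-k) = j\,(-k) = -i, \qquad
k\,j\,(-k) = (-i)(-k) = -j, \qquad
k\,k\,(-k) = k,
\]
i.e. i ↦ −i, j ↦ −j and the axis k is fixed, which is exactly the rotation by π about the k-axis.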
Bibliography
[1] C. Alsina, B. Nelsen, A mathematical space odyssey. Solid geometry in the 21st century. The
Dolciani Mathematical Expositions, 50. Mathematical Association of America, Washington, DC,
2015.
[2] J. Benítez, A unified proof of Ceva and Menelaus' theorems using projective geometry. J. Geom.
Graph. 11 (2007), no. 1, 39–44.
[3] M. de Berg, O. Cheong, M. van Kreveld, M. Overmars, Computational geometry. Algorithms
and applications. Third edition. Springer-Verlag, Berlin, 2008.
[4] M. Berger, Geometry I. Translated from the 1977 French original by M. Cole and S. Levy. Fourth
printing of the 1987 English translation Universitext. Springer-Verlag, Berlin, 2009.
[5] P.A. Blaga, Geometrie liniară. Cu un ochi către grafica pe calculator, Vol. I, Presa Universitară
Clujeană, 2022.
[6] C.B. Boyer, A history of mathematics. Second edition. Edited and with a preface by Uta C.
Merzbach. John Wiley & Sons, Inc., New York, 1989.
[7] H.S.M. Coxeter, Regular polytopes. Methuen, London, 1948.
[8] H.S.M. Coxeter, S.L. Greitzer, Geometry revisited. New Mathematical Library, 19. Random
House, Inc., New York, 1967.
[9] S. Crivei, Basic Linear Algebra. Presa Universitară Clujeană, 2022.
[10] Euclid. The thirteen books of Euclid’s Elements translated from the text of Heiberg. Vol. I: In-
troduction and Books I, II. Vol. II: Books III–IX. Vol. III: Books X–XIII and Appendix. Translated
with introduction and commentary by Thomas L. Heath. 2nd ed. Dover Publications, Inc., New
York, 1956.
[11] A.E. Fekete, Real linear algebra. Monographs and Textbooks in Pure and Applied Mathematics,
91. Marcel Dekker, Inc., New York, 1985.
[12] I. Grattan-Guinness, The search for mathematical roots, 1870–1940. Logics, set theories and
the foundations of mathematics from Cantor through Russell to Gödel. Princeton Paperbacks.
Princeton University Press, Princeton, NJ, 2000.
[14] D. Hilbert, Foundations of Geometry. Second English edition. Translated by Unger L. from 10th
ed. Revised and Enlarged by Bernays P. Open Court, Illinois, 1971.
[17] C.C. Pugh, Real mathematical analysis. Second edition. Undergraduate Texts in Mathematics.
Springer, Cham, 2015.
[18] W. Rudin, Principles of mathematical analysis. Third edition. International Series in Pure and
Applied Mathematics. McGraw-Hill Book Co., New York-Auckland-Düsseldorf, 1976.
[19] E. Sernesi, Linear Algebra. A geometric approach. Translated by Montaldi J., Chapman &
Hall/CRC, 1993.
[20] Handbook of discrete and computational geometry. Second edition. Edited by Jacob E. Good-
man and Joseph O’Rourke. Discrete Mathematics and its Applications (Boca Raton). Chapman
& Hall/CRC, Boca Raton, FL, 2004.
[22] The mathlib Community: The Lean mathematical library. In: Proceedings of the 9th ACM SIG-
PLAN International Conference on Certified Programs and Proofs. p. 367–381. CPP 2020, New
York, NY, USA (2020).