Introductory Lectures
Linear Algebra
Four Pillars of Machine Learning
[Figure: Machine Learning rests on four pillars (Regression, Dimensionality Reduction, Density Estimation, Classification), supported by Vector Calculus, Probability and Distributions, and Optimization, on a foundation of Linear Algebra, Analytic Geometry, and Matrix Decomposition.]
Example from Machine Learning
           Area  Age   Bedrooms  Bathrooms  Mainroad  Parking  Stories  ...  Feature M | Price
Sample 1   x11   x12   x13       x14        x15       x16      x17      ...  x1M       | y1
Sample 2   x21   x22   x23       x24        x25       x26      x27      ...  x2M       | y2
Sample 3   x31   x32   x33       x34        x35       x36      x37      ...  x3M       | y3
...
Sample N   xN1   xN2   xN3       xN4        xN5       xN6      xN7      ...  xNM       | yN
What should be the price of the house in the real estate market? (Go back to past data)
The data clearly indicate some sort of relation between the price of the house and the M features:
$y_N = f(x_{N1}, x_{N2}, x_{N3}, \dots, x_{NM})$. The task is to find $f$!
Machine Learning - Set Up
Some sort of relation exists between the price of the house and the M features:
$y_N = f(x_{N1}, x_{N2}, x_{N3}, \dots, x_{NM})$
What should be the structure of $f$?
$f$ can be a linear combination of the features:
$y_1 = a_1 x_{11} + a_2 x_{12} + a_3 x_{13} + \dots + a_M x_{1M}$
$y_N = a_1 x_{N1} + a_2 x_{N2} + a_3 x_{N3} + \dots + a_M x_{NM}$
The combination can also be a complex one:
$y_1 = a_1 x_{11} + b_{12} x_{11} x_{12} + a_3 x_{13} + b_{13} x_{11} x_{13} + b_{23} x_{13} x_{12} + c_{123} x_{11} x_{12} x_{13} + \dots + c_M x_{1M}$
Connection to Linear Algebra
Sample 1: $y_1 = a_1 x_{11} + a_2 x_{12} + a_3 x_{13} + \dots + a_M x_{1M}$
Sample 2: $y_2 = a_1 x_{21} + a_2 x_{22} + a_3 x_{23} + \dots + a_M x_{2M}$
Sample 3: $y_3 = a_1 x_{31} + a_2 x_{32} + a_3 x_{33} + \dots + a_M x_{3M}$
Sample 4: $y_4 = a_1 x_{41} + a_2 x_{42} + a_3 x_{43} + \dots + a_M x_{4M}$
…
Sample N: $y_N = a_1 x_{N1} + a_2 x_{N2} + a_3 x_{N3} + \dots + a_M x_{NM}$
Goal: to find the coefficients $a_1, \dots, a_M$ which best explain the relationship between the features $x_{ij}$ and the prices $y_i$.
Connection to Linear Algebra
Matrix × Vector = Vector:

$$\begin{bmatrix}
x_{11} & x_{12} & x_{13} & \cdots & x_{1M} \\
x_{21} & x_{22} & x_{23} & \cdots & x_{2M} \\
x_{31} & x_{32} & x_{33} & \cdots & x_{3M} \\
\vdots & \vdots & \vdots & & \vdots \\
x_{N1} & x_{N2} & x_{N3} & \cdots & x_{NM}
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_M \end{bmatrix}
=
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_N \end{bmatrix}$$

In compact form: $Xa = y$, with $X \in \mathbb{R}^{N \times M}$, $a \in \mathbb{R}^{M}$, $y \in \mathbb{R}^{N}$.
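As a quick illustration, here is a minimal NumPy sketch of this matrix–vector form; the numbers are made up for illustration, not real housing data:

```python
import numpy as np

# Toy version of X a = y: N = 3 samples, M = 4 features per sample.
X = np.array([[1200.0, 10.0, 3.0, 2.0],
              [1500.0,  5.0, 4.0, 2.0],
              [ 900.0, 20.0, 2.0, 1.0]])      # N x M feature matrix
a = np.array([50.0, -300.0, 4000.0, 6000.0])  # M coefficients, one per feature

y = X @ a   # matrix-vector product: one predicted price per sample
print(y)
```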
Connection to Linear Algebra
Linear equation / linear transformation: $Xa = y$, where $X$ is a matrix and $a$ is a vector.
When the transformation merely scales a vector, $Xv = \lambda v$ with $\lambda$ a scalar, we arrive at the concept of eigenvalues and eigenvectors.
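A minimal NumPy sketch of the eigenvalue equation, using an arbitrary 2×2 symmetric matrix for illustration:

```python
import numpy as np

# Eigenvalue equation X v = lambda v for a small square matrix.
X = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, eigvecs = np.linalg.eig(X)

v = eigvecs[:, 0]                    # first eigenvector (a column of eigvecs)
lam = eigvals[0]                     # matching eigenvalue
print(np.allclose(X @ v, lam * v))   # True: X v equals lambda v
```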
Geometric Vectors
Vectors are arrows pointing in space: the length of the arrow represents the magnitude, and the direction of the arrow represents the direction.
In the more general approach, vectors are special objects that can be added together and multiplied by scalars to produce another object of the same kind. Any object that satisfies these two properties can be considered a vector.
[Figure: a vector $\vec{v} = (175, 64)$ plotted with Height (cm) on the X-axis and Weight (kg) on the Y-axis.]
Vector Operations – Scalar Multiplication
Multiplying a vector $\vec{v}$ by a scalar scales its length and can flip its direction: e.g. $-0.5\,\vec{v}$ points in the direction opposite to $\vec{v}$ and is 0.5 times as long as $\vec{v}$ (see the code sketch after the vector-addition slide).
Vector Operations – Vector Addition
Vector addition is the operation of adding two or more vectors together into a vector sum, $\vec{v} + \vec{w}$.
It is performed simply by adding the corresponding components of the vectors:
$\vec{v} + \vec{w} = (v_1 + w_1, v_2 + w_2, \dots, v_n + w_n)$
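A minimal NumPy sketch of both operations above, with arbitrary example vectors:

```python
import numpy as np

v = np.array([3.0, 4.0])
w = np.array([1.0, -2.0])

print(-0.5 * v)   # scalar multiplication: flips direction, halves length
print(v + w)      # vector addition, component-wise: [4., 2.]
```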
Vector Operations – Scalar/Dot/Inner Product
Used to quantify the similarity or alignment between two vectors; it results in a scalar:
$\vec{v} \cdot \vec{w} = v_1 w_1 + v_2 w_2 + \dots + v_n w_n$
It gives a measure of the alignment of the vectors:
• if the dot product is positive, the vectors are more aligned
• if it is negative, they are pointing in opposite directions
• if it is zero, they are orthogonal (perpendicular)
It can be used to calculate the angle between two vectors: $\cos\theta = \dfrac{\vec{v} \cdot \vec{w}}{|\vec{v}|\,|\vec{w}|}$
Vector Operations – Scalar/Dot/Inner Product
The dot product between two vectors is based on the projection of one vector onto another.
How much of $\vec{v}$ is pointing in the same direction as $\vec{w}$?
It depends on the direction of $\vec{w}$, not its magnitude.
Define the unit vector: $\hat{w} = \dfrac{\vec{w}}{|\vec{w}|}$, so that $\vec{v} \cdot \hat{w}$ is the length of the projection of $\vec{v}$ onto $\vec{w}$.
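A short NumPy sketch of the dot product, the angle formula, and the projection onto a unit vector, with arbitrary example vectors:

```python
import numpy as np

v = np.array([3.0, 4.0])
w = np.array([4.0, 0.0])

dot = np.dot(v, w)                          # scalar: positive -> aligned
cos_theta = dot / (np.linalg.norm(v) * np.linalg.norm(w))
theta = np.degrees(np.arccos(cos_theta))    # angle between v and w

w_hat = w / np.linalg.norm(w)               # unit vector along w
proj_len = np.dot(v, w_hat)                 # how much of v points along w
print(dot, theta, proj_len)
```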
Matrix
A matrix is a rectangular array of numbers, symbols, or expressions arranged in rows and columns; it can be viewed as a collection of row vectors or as a collection of column vectors.
A matrix of size m × n has m rows and n columns:
m = n : square matrix
m ≠ n : rectangular matrix
The trace of a square matrix is the sum of its diagonal elements.
Matrix Addition and Multiplication
Sum of two matrices $A_{m \times n}$ and $B_{m \times n}$ is defined as the element-wise sum: $(A + B)_{ij} = a_{ij} + b_{ij}$
Product of two matrices $A_{m \times n}$ and $B_{n \times k}$ results in $C_{m \times k}$ with the elements $c_{ij} = \sum_{l=1}^{n} a_{il} b_{lj}$
Hadamard product of two matrices $A_{m \times n}$ and $B_{m \times n}$ (same shape) is the element-wise multiplication with the elements $(A \odot B)_{ij} = a_{ij} b_{ij}$ (see the sketch below)
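A minimal NumPy sketch contrasting the three products on arbitrary 2×2 matrices; note that `@` is the matrix product while `*` is element-wise:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])   # 2 x 2
B = np.array([[5, 6], [7, 8]])   # 2 x 2 (same shape as A)

print(A + B)   # element-wise sum
print(A @ B)   # matrix product: c_ij = sum_l a_il * b_lj
print(A * B)   # Hadamard (element-wise) product: a_ij * b_ij
```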
Types of Matrices
Square Matrix: number of rows = number of columns
Symmetric Matrix: $A = A^T$, i.e. the top-right triangle equals the bottom-left triangle ($a_{ij} = a_{ji}$)
Triangular Matrix: a square matrix in which all entries below (upper triangular) or above (lower triangular) the main diagonal are zero
Diagonal Matrix: all entries outside the main diagonal are zero
Identity Matrix: a diagonal matrix with ones on the main diagonal, denoted $I$
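A short NumPy sketch constructing these matrix types with arbitrary values; it also demonstrates the trace from the previous slide:

```python
import numpy as np

D = np.diag([1, 2, 3])          # diagonal matrix
I = np.eye(3)                   # identity matrix
U = np.triu(np.ones((3, 3)))    # upper triangular matrix
S = np.array([[1, 7], [7, 2]])  # symmetric matrix

print(np.allclose(S, S.T))      # True: S equals its transpose
print(np.trace(D))              # trace = 1 + 2 + 3 = 6
```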
Determinant
Every square matrix is associated with a specific scalar quantity, known as the determinant of the matrix, denoted $|X|$ or $\det(X)$.
Properties of Determinant
The determinant $|X|$ is a real number. If $|X| = 0$, the matrix $X$ is not invertible (singular).
If a matrix contains either a row of zeros or a column of zeros, the determinant equals zero.
If two rows or two columns are identical or linearly dependent, the determinant equals zero.
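A minimal NumPy sketch of these properties, using arbitrary matrices:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
print(np.linalg.det(A))    # 5.0: nonzero, so A is invertible

B = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # second row = 2 x first row
print(np.linalg.det(B))    # ~0: linearly dependent rows -> singular
```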
Orthogonal Matrix
Two vectors are orthogonal if their dot product is 0: $x^T y = 0$
Two vectors are orthonormal if their dot product is 0 and their lengths are both 1: $x^T y = 0$, $|x| = |y| = 1$
An orthogonal matrix $Q$ is a square matrix whose rows are mutually orthonormal and whose columns are mutually orthonormal: $Q^T Q = Q Q^T = I$, which implies $Q^{-1} = Q^T$
Orthogonal Matrix
Transformations by orthogonal matrices are special because the length of a vector is not changed when transforming it by an orthogonal matrix $Q$:
$\|Qx\|^2 = (Qx)^T (Qx) = x^T Q^T Q x = x^T I x = x^T x = \|x\|^2$
The angle between any two vectors $x, y$, as measured by their inner product, is also unchanged when transforming both of them by an orthogonal matrix $Q$:
$\cos\theta = \dfrac{(Qx)^T (Qy)}{\|Qx\|\,\|Qy\|} = \dfrac{x^T Q^T Q y}{\|x\|\,\|y\|} = \dfrac{x^T y}{\|x\|\,\|y\|}$
Examples: rotation about the origin; permutation of the coordinate axes.
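A short NumPy sketch verifying both properties for a 45° rotation matrix, an arbitrary choice of orthogonal matrix:

```python
import numpy as np

theta = np.pi / 4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation about the origin

x = np.array([3.0, 4.0])
print(np.allclose(Q.T @ Q, np.eye(2)))           # Q^T Q = I
print(np.linalg.norm(x), np.linalg.norm(Q @ x))  # lengths match: 5.0, 5.0
```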
Matrix with a Vector
$$\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1M} \\
x_{21} & x_{22} & \cdots & x_{2M} \\
\vdots & \vdots & & \vdots \\
x_{N1} & x_{N2} & \cdots & x_{NM}
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_M \end{bmatrix}
=
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}
\qquad \Leftrightarrow \qquad Xa = y$$
Particular and General Solution
Two equations and four unknowns: we generally expect infinitely many solutions.
We can find one specific solution to $Xa = Y$, called a particular solution or special solution.
We can also form the general solution:
general solution = (particular solution to $Xa = Y$) + (all solutions to $Xa = 0$)
However, these are not unique solutions.
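A minimal NumPy sketch of this decomposition for an underdetermined system with made-up coefficients; it assumes $X$ has full row rank, so that the trailing rows of $V^T$ from the SVD span the null space:

```python
import numpy as np

# Underdetermined system: 2 equations, 4 unknowns (made-up numbers).
X = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 2.0]])
y = np.array([3.0, 2.0])

# One particular solution to X a = y via the pseudo-inverse.
a_particular = np.linalg.pinv(X) @ y

# All solutions to X a = 0: rows of Vt beyond the rank span the null space
# (here X has full row rank 2, so the last two rows of Vt qualify).
_, s, Vt = np.linalg.svd(X)
null_basis = Vt[len(s):]

# General solution = particular solution + any null-space combination.
a_general = a_particular + 1.7 * null_basis[0] - 0.3 * null_basis[1]
print(np.allclose(X @ a_general, y))   # True: still solves X a = y
```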
Matrix Method – I
$Xa = y$, where $a$ is the unknown [needs to be solved]: $a = X^{-1} y$
Only possible if $X$ is a square matrix and invertible.
Provided $|X| \neq 0$, the system of equations is consistent and has a unique solution.
For a non-square $X$: $Xa = y \Rightarrow X^T X a = X^T y \Rightarrow a = (X^T X)^{-1} X^T y$
The matrix $(X^T X)^{-1} X^T$ is known as the Moore–Penrose pseudo-inverse.
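A minimal NumPy sketch of both routes for an arbitrary overdetermined system; the numbers are made up for illustration:

```python
import numpy as np

# Overdetermined system: more samples (rows) than features (columns).
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([6.0, 5.0, 7.0, 10.0])

# Normal equations: a = (X^T X)^{-1} X^T y
a_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Same answer via the Moore-Penrose pseudo-inverse.
a_pinv = np.linalg.pinv(X) @ y
print(np.allclose(a_normal, a_pinv))   # True
print(a_normal)                        # least-squares coefficients
```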
Matrix Method – II (QR Decomposition)
Any matrix $X$ of size $N \times M$ can be decomposed as $X = QR$, where $Q$ is a matrix with orthonormal columns ($Q^T Q = I$, so $Q^T$ acts as $Q^{-1}$) of size $N \times M$, and $R$ is an upper triangular matrix of size $M \times M$.
$Xa = y \Rightarrow QRa = y \Rightarrow Ra = Q^T y \Rightarrow a = R^{-1} Q^T y$
Note that $R$ is invertible because a triangular matrix is invertible if its diagonal entries are nonzero; when $X$ has full column rank, they can be chosen strictly positive.
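The same toy system solved via QR, a minimal NumPy sketch; note that np.linalg.qr returns the reduced factorization by default:

```python
import numpy as np

X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([6.0, 5.0, 7.0, 10.0])

Q, R = np.linalg.qr(X)            # reduced QR: Q is 4x2, R is 2x2
a = np.linalg.solve(R, Q.T @ y)   # solve R a = Q^T y (back-substitution)
print(a)                          # matches the normal-equation solution
```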
Detour
$$Xa = y, \qquad X \in \mathbb{R}^{N \times M}, \quad a \in \mathbb{R}^{M}, \quad y \in \mathbb{R}^{N}$$
What should be the nature of $X$ and $y$ for which the solution to the above equation, $a = X^{-1} y$ or $a = (X^T X)^{-1} X^T y$, exists?
Vector Space
A vector space consists of a set of vectors (V) and a field of scalars (F), along with
defined operations for vector addition and scalar multiplication.
• Vector addition takes two vectors $u, v \in V$ and produces a third vector $u + v \in V$
• Scalar multiplication takes a scalar $c \in F$ and a vector $v \in V$ and produces a new vector $cv \in V$
Vector Space
Axioms:
Associativity of vector addition: $(u + v) + w = u + (v + w)$, for all $u, v, w \in V$
Existence of a zero vector: $v + 0 = v$, for all $v \in V$
Existence of negatives: for every $v \in V$, there is $-v \in V$, such that $v + (-v) = 0$
Associativity of multiplication: $a(bv) = (ab)v$, for any $a, b \in F$ and $v \in V$
Distributivity: $a(u + v) = au + av$, and $(a + b)v = av + bv$, for all $a, b \in F$ and $u, v \in V$
Unitarity: $1v = v$, for all $v \in V$
Vector Space – e.g.
Euclidean Space ℝ³: consider the three-dimensional Euclidean space, denoted ℝ³. This space consists of all ordered triples of real numbers (x, y, z), where x, y, and z are real numbers.
Vector Addition: (1, 2, 3) + (4, -1, 2) = (1 + 4, 2 + (-1), 3 + 2) = (5, 1, 5)
Scalar Multiplication: 3 * (2, -1, 0) = (3 * 2, 3 * (-1), 3 * 0) = (6, -3, 0)
Additive Identity: (1, 2, 3) + (0,0,0) = (1,2,3)
Additive Inverse: (1,2,3) + (-1,-2,-3) = (0,0,0)
What will happen if we remove (0,0,0) from ℝ³?
• We get a subset of ℝ³ where the axioms of a vector space no longer hold.
• Scalar Multiplication: 0 × (2, −1, 0) = (0,0,0) ∉ V, so the set is not closed under scalar multiplication.
• Is there any subset of ℝ³ that still satisfies the requirements of a vector space?
Subspace
A subspace is a non-empty subset of a vector space which still satisfies all the requirements of a vector space.
Note: the zero vector {0} by default forms a subspace.
Original vector space: $V = \mathbb{R}^2$. Consider $\vec{v} = (2,2)$.
For a line passing through the origin, all the multiples of $\vec{v}$: $0, \vec{v}, 2\vec{v}, \dots$ (including negative and fractional multiples) form a subspace of $V$.
Possible subspaces of ℝ³: {0}, lines through the origin, planes through the origin, and ℝ³ itself.
Independent Vectors
Independent Vectors are a set of vectors within a vector space that do not have any redundant
relationships among them.
No vector in the set can be expressed as a linear combination of the others.
Formally, for vectors $\vec{v}_1, \dots, \vec{v}_k$: $c_1 \vec{v}_1 + c_2 \vec{v}_2 + \dots + c_k \vec{v}_k = 0$ only if $c_1 = c_2 = \dots = c_k = 0$.
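A minimal NumPy sketch of an independence check via the matrix rank, using arbitrary vectors with a deliberately redundant third vector:

```python
import numpy as np

# Stack candidate vectors as rows; they are independent iff the rank
# equals the number of vectors.
vectors = np.array([[1.0, 0.0, 1.0],
                    [0.0, 1.0, 1.0],
                    [1.0, 1.0, 2.0]])   # third row = first + second

print(np.linalg.matrix_rank(vectors))   # 2 < 3 -> linearly dependent
```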
Basis
A basis is a set of vectors that are linearly independent and span the entire vector space.
Any vector within the vector space can be represented as a unique linear combination of the basis vectors.
The basis {(1,0), (0,1)} spans ℝ²; the basis {(1,0,0), (0,1,0), (0,0,1)} spans ℝ³.
The same vector space can have multiple bases, but every basis has the same number of vectors: that number is the dimension of the vector space.
Does {(1,1), (2,3)} form a basis of ℝ²? Does {(1,1,1), (1,2,3), (3,4,1)} form a basis of ℝ³? (See the sketch below.)
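A minimal NumPy sketch answering both questions via the determinant test: n vectors form a basis of ℝⁿ exactly when the matrix they build is invertible.

```python
import numpy as np

B2 = np.array([[1.0, 1.0],
               [2.0, 3.0]])        # {(1,1), (2,3)}
B3 = np.array([[1.0, 1.0, 1.0],
               [1.0, 2.0, 3.0],
               [3.0, 4.0, 1.0]])   # {(1,1,1), (1,2,3), (3,4,1)}

print(np.linalg.det(B2))   #  1.0 -> nonzero, so a basis of R^2
print(np.linalg.det(B3))   # -4.0 -> nonzero, so a basis of R^3
```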
Conclusion
• Data Representation: Linear algebra provides the foundation for representing data as
matrices and vectors
• Feature Engineering: Transformation of data and creation of new features often involve
matrix operations, like normalization and dimensionality reduction
• Model Representation: ML/DL models are often represented as systems of linear
equations or matrix-vector multiplications
Conclusions
• Principal Component Analysis (PCA): PCA, a dimensionality reduction technique, uses eigenvectors and eigenvalues from linear algebra to capture important features
• Singular Value Decomposition (SVD): SVD is employed in data compression, collaborative filtering, and image processing, enabling better understanding of data structures
• Optimization: many optimization algorithms in ML/DL leverage linear algebra, like gradient descent for model training