Lecture 10: Matrices Review
Isabella Zhu
11 March 2025
§1 Last Lecture Wrapup
We will wrap up the proof from lecture 9.
Theorem 1.1
Assume INC($k$) with $k$ equal to the sparsity of $\theta^*$ (i.e. $k = |\theta^*|_0$). Fix
\[ 2\tau = 8\sigma\sqrt{\log(2d)/n} + 8\sigma\sqrt{\log(1/\delta)/n}. \]
Then the MSE of the lasso estimator is at most
\[ \mathrm{MSE}(X\hat{\theta}^L) \le 32k\tau^2 \lesssim \frac{\sigma^2|\theta^*|_0}{n}\log(d/\delta). \]
Moreover,
\[ |\hat{\theta} - \theta^*|_2^2 \le 2\,\mathrm{MSE}(X\hat{\theta}^L), \]
all happening with probability at least $1 - \delta$.
Proof. For the five hundred millionth time, we start with the good ole basic inequality
\[ |X\hat{\theta} - X\theta^*|_2^2 \le 2\langle \epsilon, X\hat{\theta} - X\theta^* \rangle + 2n\tau|\theta^*|_1 - 2n\tau|\hat{\theta}|_1. \]
We bound the inner product term using Hölder's inequality:
\[ 2\langle \epsilon, X\hat{\theta} - X\theta^* \rangle \le 2|X^T\epsilon|_\infty \cdot |\hat{\theta} - \theta^*|_1. \]
Next we bound the largest column norm of $X$. We have
\[ |X_j|_2^2 = (X^T X)_{jj} \le n + \frac{n}{32k} \le 2n \]
by the incoherence property. Therefore, on the event where $|X^T\epsilon|_\infty \le 2n \cdot \frac{\tau}{4}$ (which holds with probability at least $1 - \delta$ by the choice of $\tau$), we get
\[ 2\langle \epsilon, X\hat{\theta} - X\theta^* \rangle \le 2|X^T\epsilon|_\infty \cdot |\hat{\theta} - \theta^*|_1 \le 2 \cdot 2n \cdot \frac{\tau}{4} \cdot |\hat{\theta} - \theta^*|_1 = n\tau|\hat{\theta} - \theta^*|_1. \]
To summarize, we’ve proved so far that
\[ |X\hat{\theta} - X\theta^*|_2^2 \le n\tau|\hat{\theta} - \theta^*|_1 + 2n\tau|\theta^*|_1 - 2n\tau|\hat{\theta}|_1. \]
We add $n\tau|\hat{\theta} - \theta^*|_1$ to both sides:
\[ |X\hat{\theta} - X\theta^*|_2^2 + n\tau|\hat{\theta} - \theta^*|_1 \le 2n\tau|\hat{\theta} - \theta^*|_1 + 2n\tau|\theta^*|_1 - 2n\tau|\hat{\theta}|_1. \]
Now we take the support $S$ of $\theta^*$ into account. Since $\theta^*_{S^c} = 0$, the off-support terms cancel:
\[ |\hat{\theta}|_1 = |\hat{\theta}_S|_1 + |\hat{\theta}_{S^c}|_1 \implies |\hat{\theta} - \theta^*|_1 - |\hat{\theta}|_1 = |\hat{\theta}_S - \theta^*|_1 - |\hat{\theta}_S|_1. \]
Putting it together, and using the triangle inequality $|\theta^*|_1 - |\hat{\theta}_S|_1 \le |\hat{\theta}_S - \theta^*|_1$ in the last step,
\[ |X\hat{\theta} - X\theta^*|_2^2 + n\tau|\hat{\theta} - \theta^*|_1 \le 2n\tau\Big[ |\hat{\theta}_S - \theta^*|_1 + |\theta^*|_1 - |\hat{\theta}_S|_1 \Big] \le 4n\tau|\hat{\theta}_S - \theta^*|_1. \]
Dropping the nonnegative first term, we have
\[ |\hat{\theta} - \theta^*|_1 \le 4|\hat{\theta}_S - \theta^*|_1 \iff |\hat{\theta}_{S^c} - \theta^*_{S^c}|_1 \le 3|\hat{\theta}_S - \theta^*_S|_1, \]
which is exactly the cone condition! Everything below this is kinda suspicious because I was playing Squardle instead of paying attention. So, combining INC($k$) with the cone condition, we get the lower bound
\[ \frac{2|X(\hat{\theta} - \theta^*)|_2^2}{n} \ge |\hat{\theta} - \theta^*|_2^2. \]
By Cauchy-Schwarz and the lower bound above,
\[ |\hat{\theta}_S - \theta^*|_1 \le \sqrt{k}\,|\hat{\theta}_S - \theta^*|_2 \le \sqrt{k}\,|\hat{\theta} - \theta^*|_2 \le \sqrt{\frac{2k}{n}}\,|X\hat{\theta} - X\theta^*|_2. \]
Therefore, we get
\[ |X\hat{\theta} - X\theta^*|_2^2 \le 4n\tau\sqrt{\frac{2k}{n}}\,|X\hat{\theta} - X\theta^*|_2, \]
from which we divide both sides by $|X\hat{\theta} - X\theta^*|_2$ and square to get $|X\hat{\theta} - X\theta^*|_2^2 \le 32nk\tau^2$, i.e. $\mathrm{MSE}(X\hat{\theta}^L) \le 32k\tau^2$, the desired result.
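As a sanity check on the scaling, here is a minimal simulation sketch (not from lecture) using scikit-learn's Lasso on synthetic data. The problem sizes $n, d, k, \sigma, \delta$ are made up, and setting sklearn's alpha equal to $\tau$ assumes the penalized objective in lecture uses the penalty $2n\tau|\theta|_1$.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Minimal simulation (not from lecture): check that the lasso prediction error
# stays below the 32*k*tau^2 bound of Theorem 1.1 on synthetic data.
rng = np.random.default_rng(0)
n, d, k, sigma, delta = 200, 500, 5, 1.0, 0.1      # made-up problem sizes

X = rng.standard_normal((n, d))
theta_star = np.zeros(d)
theta_star[:k] = 1.0                                # k-sparse truth
y = X @ theta_star + sigma * rng.standard_normal(n)

# tau from the theorem: 2*tau = 8*sigma*(sqrt(log(2d)/n) + sqrt(log(1/delta)/n))
tau = 4 * sigma * (np.sqrt(np.log(2 * d) / n) + np.sqrt(np.log(1 / delta) / n))

# sklearn minimizes (1/2n)|y - Xw|_2^2 + alpha*|w|_1, i.e. |y - Xw|_2^2 + 2n*alpha*|w|_1,
# so alpha = tau matches a penalty of 2*n*tau*|theta|_1 (an assumption about the
# scaling used in lecture).
theta_hat = Lasso(alpha=tau, fit_intercept=False).fit(X, y).coef_

mse = np.mean((X @ (theta_hat - theta_star)) ** 2)
print(f"MSE(X theta_hat) = {mse:.3f}, bound 32*k*tau^2 = {32 * k * tau**2:.3f}")
```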
§2 Matrix Estimation
We will go over some linear algebra "basics" that we will need for later lectures. Apparently this lecture will be "boring to death" (not my words).
§2.1 SubGaussian Sequence Model
Our subGaussian sequence model is of the form $Y = \theta^* + \epsilon \in \mathbb{R}^d$. We can make this a matrix problem by just reshaping each vector into a matrix. If $\theta^*$ is sparse, then we can just use the hard-thresholding estimator $\hat{\theta}^{\mathrm{HARD}}$, so we aren't actually utilizing any matrix structure.
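For concreteness, here is a tiny sketch (my own, with one standard but assumed threshold choice) of what hard thresholding looks like on a reshaped observation:

```python
import numpy as np

# Tiny sketch (not from lecture): hard thresholding in the sequence model
# Y = theta* + eps, with the d = 16 vector reshaped into a 4 x 4 matrix.
rng = np.random.default_rng(0)
sigma, d = 0.5, 16

theta_star = np.zeros(d)
theta_star[:3] = [3.0, -2.0, 1.5]                 # sparse signal
Y = theta_star + sigma * rng.standard_normal(d)

tau = 2 * sigma * np.sqrt(2 * np.log(d))          # one standard threshold choice (assumption)
theta_hard = np.where(np.abs(Y) > tau, Y, 0.0)    # zero out small entries

# Reshaping changes nothing for an entrywise estimator like this one.
print(theta_hard.reshape(4, 4))
```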
§2.2 An Aside: Netflix Prize 2006
Aka how Netflix got half the academic community to work for them for free. The problem is the following: consider a matrix $M$ with $n$ users and $m$ movies, such that $M_{ij}$ is how the $i$-th user rated the $j$-th movie.
Clearly, the matrix is very sparsely observed: in fact, only 1% of the entries were filled. The goal was to fill in the rest of the matrix.
§2.2.1 A Simple Model
Consider the case where $M_{ij}$ only has two effects, a user effect and a movie effect. So
\[ M_{ij} = u_i \cdot v_j + \text{noise}, \qquad \text{i.e.} \qquad M = uv^T + \text{noise}. \]
For this simple model, we reduce the number of parameters from $nm$ to $n + m$. The rank of $uv^T$ is 1. More generally, if the rank of $M$ is $r$, we can write
\[ M = \sum_{j=1}^r u^{(j)} (v^{(j)})^T. \]
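Here is a quick NumPy sketch (synthetic data with made-up sizes, not the actual Netflix matrix) of the rank-1 model: generate $M = uv^T + \text{noise}$ and recover the user/movie effects from the top singular pair.

```python
import numpy as np

# Synthetic rank-1 ratings model: M = u v^T + noise, recovered via the top
# singular pair of M.
rng = np.random.default_rng(1)
n, m = 100, 80                                    # users, movies

u = rng.standard_normal(n)                        # user effects
v = rng.standard_normal(m)                        # movie effects
M = np.outer(u, v) + 0.1 * rng.standard_normal((n, m))

U, s, Vt = np.linalg.svd(M, full_matrices=False)
M1 = s[0] * np.outer(U[:, 0], Vt[0, :])           # best rank-1 approximation

rel_err = np.linalg.norm(M1 - np.outer(u, v)) / np.linalg.norm(np.outer(u, v))
print("relative error of the rank-1 fit:", rel_err)
```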
§3 Matrix Redux
§3.1 Eigenvalues and Eigenvectors
Let $A \in \mathbb{R}^{n \times n}$ be a square matrix. An eigenvalue $\lambda$ and eigenvector $u$ of $A$ satisfy $Au = \lambda u$.
Fact 3.1. If A is symmetric, then all eigenvalues are real.
In this class, we will assume that all eigenvectors have norm 1.
Fact 3.2. If $u_1, \dots, u_n$ are eigenvectors of a symmetric matrix $A$, they can be chosen to form an orthogonal basis for the column span of $A$. We will call this the eigenbasis.
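A quick numerical illustration of Facts 3.1 and 3.2 (my own check, on a random symmetric matrix):

```python
import numpy as np

# Check Facts 3.1-3.2 on a random symmetric matrix: the eigenvalues are real and
# the unit-norm eigenvectors returned by eigh form an orthonormal (eigen)basis.
rng = np.random.default_rng(2)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                                  # symmetrize

eigvals, eigvecs = np.linalg.eigh(A)               # eigh is for symmetric matrices
print("eigenvalues:", eigvals)                     # all real
print("orthonormal:", np.allclose(eigvecs.T @ eigvecs, np.eye(5)))
print("A u = lambda u:", np.allclose(A @ eigvecs, eigvecs * eigvals))
```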
§3.2 Singular Value Decomposition
Let $A \in \mathbb{R}^{m \times n}$. The SVD of $A$ is $A$ written as
\[ A = UDV^T, \qquad U \in \mathbb{R}^{m \times r}, \; V \in \mathbb{R}^{n \times r}, \; D \in \mathbb{R}^{r \times r}, \]
where $r$ is the rank of $A$, $U^T U = I_r$, $V^T V = I_r$, and $D$ is diagonal with positive entries. This implies that $u_1, \dots, u_r$ span $\mathrm{colspan}(A)$ and $v_1, \dots, v_r$ span $\mathrm{rowspan}(A)$.
The vector form of this is
\[ A = \sum_{j=1}^r \lambda_j u_j v_j^T. \]
Remark 3.3. We have $AA^T u_j = \lambda_j^2 u_j$ and $A^T A v_j = \lambda_j^2 v_j$.
Consider the special case when $A$ is positive semidefinite. The eigenvalues are nonnegative and equal to the singular values, and $U$ and $V$ become the same matrix. In this case,
\[ \|A\|_{\mathrm{op}} = \max_{x \in B_2} |Ax|_2 = \lambda_{\max}(A). \]
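A small NumPy check of this subsection (my own sketch on a random rectangular matrix): the relations in Remark 3.3, and the fact that the operator norm equals the largest singular value.

```python
import numpy as np

# Check Remark 3.3 and the operator norm on a random 6 x 4 matrix.
rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # thin SVD: A = U diag(s) Vt
u1, v1, lam1 = U[:, 0], Vt[0, :], s[0]

print(np.allclose(A @ A.T @ u1, lam1**2 * u1))     # A A^T u_1 = lambda_1^2 u_1
print(np.allclose(A.T @ A @ v1, lam1**2 * v1))     # A^T A v_1 = lambda_1^2 v_1
print(np.isclose(np.linalg.norm(A, ord=2), s.max()))  # ||A||_op = largest singular value
```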
§3.3 Vector Norms and Inner Products
Let $A$ and $B$ be matrices. The (entrywise) $q$-norm is defined as
\[ |A|_q = \Big( \sum_{ij} |A_{ij}|^q \Big)^{1/q}. \]
Remark 3.4. Note that $|A|_\infty = \max_{ij} |A_{ij}|$ and $|A|_0$ is the number of nonzero entries. We also have $|A|_2 = \sqrt{\mathrm{Tr}(A^T A)} = \sqrt{\mathrm{Tr}(AA^T)} = \|A\|_F$.
Then we can define the inner product
\[ \langle A, B \rangle = \mathrm{Tr}(A^T B) = \mathrm{Tr}(AB^T). \]
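A quick check of the entrywise norms and the trace inner product (my own sketch):

```python
import numpy as np

# Check the entrywise q-norm, the Frobenius identity in Remark 3.4, and the
# two equivalent forms of the trace inner product.
rng = np.random.default_rng(4)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 4))

q = 3
print("|A|_3 =", (np.abs(A) ** q).sum() ** (1 / q))                      # entrywise q-norm
print(np.isclose(np.sqrt(np.trace(A.T @ A)), np.linalg.norm(A, "fro")))  # |A|_2 = ||A||_F
print(np.isclose(np.trace(A.T @ B), np.trace(A @ B.T)))                  # <A, B> both ways
print(np.isclose(np.trace(A.T @ B), (A * B).sum()))                      # = sum_ij A_ij B_ij
```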
§3.4 Spectral Norms
Let $A$ have singular values $\lambda_1, \dots, \lambda_r$, and consider the vector $\lambda = (\lambda_1, \dots, \lambda_r)$. The Schatten $q$-norm is defined as
\[ \|A\|_q = |\lambda|_q. \]
When $q = 2$, we have
\[ \|A\|_2^2 = |\lambda|_2^2 = \|A\|_F^2 = |A|_2^2, \]
which can be derived trivially by plugging the SVD into $\mathrm{Tr}(A^T A)$. When $q = 1$, we call this the nuclear/trace norm:
\[ \|A\|_1 = |\lambda|_1 = \sum_j \lambda_j = \|A\|_*. \]
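And a quick numerical check of the $q = 2$ and $q = 1$ cases (my own sketch, using NumPy's built-in Frobenius and nuclear norms):

```python
import numpy as np

# Check ||A||_2 = ||A||_F and ||A||_1 = sum of singular values (nuclear norm).
rng = np.random.default_rng(5)
A = rng.standard_normal((5, 3))
s = np.linalg.svd(A, compute_uv=False)             # singular values of A

print(np.isclose(np.sqrt((s ** 2).sum()), np.linalg.norm(A, "fro")))   # Schatten 2-norm
print(np.isclose(s.sum(), np.linalg.norm(A, "nuc")))                   # Schatten 1-norm
```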
§3.5 Matrix Inequalities
Let A and B be positive semidefinite. Order their eigenvalues in decreasing order.
Theorem 3.5
Weyl. We have
\[ \max_j |\lambda_j(A) - \lambda_j(B)| \le \|A - B\|_{\mathrm{op}}. \]
Theorem 3.6
Hoffman-Wielandt. We have
\[ \sum_j |\lambda_j(A) - \lambda_j(B)|^2 \le \|A - B\|_F^2. \]
Theorem 3.7
Hölder. For $\frac{1}{p} + \frac{1}{q} = 1$, we have
\[ \langle A, B \rangle \le \|A\|_p \|B\|_q. \]
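Here is a small numerical sanity check of all three inequalities (my own sketch; for Hölder it uses the pair $p = 1$, $q = \infty$, i.e. nuclear versus operator norm):

```python
import numpy as np

# Check Weyl, Hoffman-Wielandt, and trace Hölder (p = 1, q = inf) on random PSD matrices.
rng = np.random.default_rng(6)

def random_psd(n):
    G = rng.standard_normal((n, n))
    return G @ G.T                                  # PSD by construction

A, B = random_psd(5), random_psd(5)
lamA = np.sort(np.linalg.eigvalsh(A))[::-1]         # eigenvalues, decreasing
lamB = np.sort(np.linalg.eigvalsh(B))[::-1]

print(np.max(np.abs(lamA - lamB)) <= np.linalg.norm(A - B, 2))              # Weyl
print(np.sum((lamA - lamB) ** 2) <= np.linalg.norm(A - B, "fro") ** 2)      # Hoffman-Wielandt
print(np.trace(A.T @ B) <= np.linalg.norm(A, "nuc") * np.linalg.norm(B, 2)) # Hölder
```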
§3.6 Eckart-Young
Also known as best rank-k approximation.
Lemma 3.8
Let matrix $A$ be of rank $r$. Look at the SVD $A = \sum_{j=1}^r \lambda_j u_j v_j^T$ and assume the singular values are in decreasing order. For any $k \le r$, define the truncated SVD
\[ A_k = \sum_{j=1}^k \lambda_j u_j v_j^T. \]
This matrix has rank $k$. Then, we have
\[ \|A - A_k\|_F^2 = \inf_{\mathrm{rank}(B) \le k} \|A - B\|_F^2 = \sum_{j=k+1}^r \lambda_j^2. \]
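A quick NumPy verification of the lemma (my own sketch): the truncated SVD's Frobenius error equals the sum of the squared discarded singular values.

```python
import numpy as np

# Verify Eckart-Young numerically: ||A - A_k||_F^2 = sum_{j > k} lambda_j^2.
rng = np.random.default_rng(7)
A = rng.standard_normal((8, 6))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]        # rank-k truncated SVD

err_sq = np.linalg.norm(A - A_k, "fro") ** 2
print(np.isclose(err_sq, np.sum(s[k:] ** 2)))
```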