Solution: It is not hard to see that the set of 3 points with coordinates
(1, 0), (0, 1), and (−1, 0) can be shattered by axis-aligned squares: e.g.,
to label two of these points positively, use a square admitting those two
points as corners and lying on the side that avoids the third point. Thus,
the VC-dimension is at least 3. No set of 4 points can be fully shattered.
To see this, let PT be the highest point, PB the lowest, PL the leftmost,
and PR the rightmost, assuming for now that these can be defined in a
unique way (no ties); the cases where there are ties can be treated in a
simpler fashion. Assume without loss of generality that the difference dBT
of y-coordinates between PT and PB is greater than the difference dLR of
x-coordinates between PL and PR. Then, PT and PB cannot be labeled
positively while PL and PR are labeled negatively: any axis-aligned square
containing PT and PB has side length at least dBT > dLR, so its horizontal
extent is an interval of length greater than dLR containing the x-coordinate
of PT, and it must therefore cover the x-coordinate of PL or that of PR;
since its vertical extent covers all y-coordinates between those of PB and
PT, the square contains PL or PR. Thus, the VC-dimension of axis-aligned
squares in the plane is 3.
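As a quick sanity check, the lower-bound claim can be verified by brute force. The following Python sketch (the grid of candidate squares is an ad hoc choice, not part of the solution) enumerates axis-aligned squares and collects the labelings they induce on the 3 points above.

```python
from itertools import product

# The 3 points claimed to be shattered by axis-aligned squares.
points = [(1, 0), (0, 1), (-1, 0)]

def in_square(p, left, bottom, side):
    """Closed axis-aligned square with lower-left corner (left, bottom)."""
    x, y = p
    return left <= x <= left + side and bottom <= y <= bottom + side

# Grid of candidate squares (arbitrary, but fine enough for these points).
steps = [i / 4 for i in range(-8, 9)]   # -2.0, -1.75, ..., 2.0
sides = [i / 4 for i in range(1, 13)]   # 0.25, 0.5, ..., 3.0
labelings = set()
for left, bottom, side in product(steps, steps, sides):
    labelings.add(tuple(in_square(p, left, bottom, side) for p in points))

# All 2^3 = 8 labelings should be realized.
print(len(labelings))  # expected: 8
```

Of course, such a search can only confirm the lower bound; the upper bound requires the geometric argument above.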
2. Consider right triangles in the plane with the sides adjacent to the
right angle both parallel to the axes and with the right angle in the
lower left corner. What is the VC-dimension of this family?
Solution: It is not hard to see that a set of 4 points, for instance the
points with coordinates (0, 4), (4, 0), (2, 3), and (3, 2), can be shattered
by such triangles: the leftmost point can be excluded alone by the vertical
side, the lowest point by the horizontal side, and each of the two remaining
points by a hypotenuse of suitable slope; the other labelings are obtained
by combining these three sides. (Note that collinear points cannot be used
here: on a line, these triangles trace out intervals.) To see that no set of 5
points can be shattered, the same argument as for axis-aligned rectangles
can be used: since these triangles are convex regions, labeling all points
positively except the one lying in the interior of the convex hull is not
possible (the degenerate cases where no point lies in the interior of the
convex hull can be handled similarly, each side of the triangle being able
to cut off only an extremal or contiguous group of points). Thus, the
VC-dimension of this family of triangles is 4.
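The same brute-force check applies to this family. The following self-contained Python sketch (the parameter grid is again an ad hoc choice, not part of the solution) verifies that the 4 points above are shattered.

```python
from itertools import product

# The 4 points claimed to be shattered by lower-left right triangles.
points = [(0, 4), (4, 0), (2, 3), (3, 2)]

def in_triangle(p, a, b, w, h):
    """Closed right triangle with right angle at (a, b) and axis-parallel
    legs of lengths w (horizontal) and h (vertical)."""
    x, y = p
    return x >= a and y >= b and (x - a) / w + (y - b) / h <= 1

# Grid of candidate triangles (arbitrary, but fine enough here).
corners = [i / 4 for i in range(-8, 21)]  # -2.0, -1.75, ..., 5.0
legs = list(range(1, 13))                 # 1, 2, ..., 12
labelings = set()
for a, b, w, h in product(corners, corners, legs, legs):
    labelings.add(tuple(in_triangle(p, a, b, w, h) for p in points))

# All 2^4 = 16 labelings should be realized.
print(len(labelings))  # expected: 16
```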
B. Growth function bound
Solution: Following the proof given in class and using Jensen's inequality
(at the last step), we can write:
\begin{align*}
\mathfrak{R}_m(G) &= \mathbb{E}_{S,\sigma}\left[\sup_{g \in G} \frac{1}{m}\sum_{i=1}^{m}\sigma_i\, g(z_i)\right] \\
&\leq \mathbb{E}_S\left[\frac{\sqrt{m}\,\sqrt{2\log\left|\{(g(z_1),\ldots,g(z_m)) : g \in G\}\right|}}{m}\right] && \text{(Massart's Lemma)} \\
&= \mathbb{E}_S\left[\frac{\sqrt{m}\,\sqrt{2\log \Pi(G,S)}}{m}\right] \\
&\leq \frac{\sqrt{m}\,\sqrt{2\log \mathbb{E}_S[\Pi(G,S)]}}{m} = \sqrt{\frac{2\log \mathbb{E}_S[\Pi(G,S)]}{m}}.
\end{align*}
Here, Massart's Lemma applies since the vectors $(g(z_1), \ldots, g(z_m))$ have norm at most $\sqrt{m}$, and the last inequality follows from Jensen's inequality, since $x \mapsto \sqrt{2\log x}$ is concave for $x \geq 1$.
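As an aside, combining this bound with Sauer's lemma, which gives $\Pi(G,S) \leq (em/d)^d$ for $m \geq d$ where $d$ is the VC-dimension, makes the bound easy to evaluate numerically. A small Python sketch (the sample values of m and d are arbitrary):

```python
from math import e, log, sqrt

def rademacher_bound(m, d):
    """Evaluate sqrt(2 log E[Pi(G,S)] / m) after bounding the growth
    function via Sauer's lemma: Pi(G, S) <= (e*m/d)**d for m >= d."""
    assert m >= d >= 1
    log_growth = d * log(e * m / d)  # log of (em/d)^d
    return sqrt(2 * log_growth / m)

# Illustrative values only: the bound vanishes as m grows.
for m in [100, 1000, 10000]:
    print(m, rademacher_bound(m, d=10))
```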
The output of the neural network for a given input vector $(x_1, \ldots, x_n)$
is obtained as follows. First, each of the n input nodes is labeled with the
corresponding value $x_i \in \mathbb{R}$. Next, the value at a node u of the
intermediate layer labeled with a concept c is obtained by applying c to the
values of the input nodes admitting an edge ending in u. Note that since c
takes values in {0, 1}, the value at u is in {0, 1}. The value at the top or
output node is obtained similarly, by applying the corresponding concept to
the values of the nodes admitting an edge to the output node.
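To make this evaluation concrete, here is a minimal Python sketch under our own representation choices (concepts as {0,1}-valued functions, wiring as tuples of input indices), none of which are prescribed by the problem:

```python
# Each internal node applies a {0,1}-valued concept to the inputs it is
# wired to; the output node applies its concept to the internal values.

def evaluate(x, internal_nodes, output_concept):
    """x: input vector; internal_nodes: list of (concept, input_indices)
    pairs; output_concept: {0,1}-valued function of the internal values."""
    hidden = [c(tuple(x[i] for i in idx)) for c, idx in internal_nodes]
    return output_concept(tuple(hidden))

# Example with threshold-style concepts (cf. question 3 below).
sgn01 = lambda v: 1 if v >= 0 else 0
node1 = (lambda z: sgn01(z[0] - z[1]), (0, 1))  # fires if x0 >= x1
node2 = (lambda z: sgn01(z[0] + z[1]), (1, 2))  # fires if x1 + x2 >= 0
out = lambda h: sgn01(h[0] + h[1] - 2)          # AND of the two nodes

print(evaluate((3.0, 1.0, -0.5), [node1, node2], out))  # -> 1
```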
1. Let H denote the set of all neural networks defined as above with
k ≥ 2 internal nodes. Show that the growth function $\Pi_H(m)$ can be
upper bounded in terms of the product of the growth functions of the
hypothesis sets defined at each intermediate layer.
3. Let C be the concept class defined by threshold functions
$C = \{\operatorname{sgn}(\sum_{j=1}^{r} w_j x_j) : w \in \mathbb{R}^r\}$. Give an upper bound on the
VC-dimension of H in terms of k and r.
Figure 1: A neural network with one intermediate layer.