
Mehryar Mohri

Foundations of Machine Learning 2014


Courant Institute of Mathematical Sciences
Homework assignment 2
October 3, 2014
Due: October 17, 2014

A. VC-dimension of axis-aligned squares or triangles

1. What is the VC-dimension of axis-aligned squares in the plane?

Solution: It is not hard to see that the set of 3 points with coordinates
(1, 0), (0, 1), and (−1, 0) can be shattered by axis-aligned squares: e.g.,
to label exactly two of these points positively, use the axis-aligned square
that has those two points as corners. Thus, the VC-dimension is at least 3.
No set of 4 points can be fully shattered. To see this, let
PT be the highest point, PB the lowest, PL the leftmost, and PR the
rightmost, assuming for now that these are uniquely defined (no ties);
the cases with ties can be treated in a similar, simpler fashion.
Assume without loss of generality that the difference dBT of
y-coordinates between PT and PB is greater than the difference dLR
of x-coordinates between PL and PR. Any square containing PT and PB has
side length at least dBT > dLR, so its horizontal extent must reach the
x-coordinate of PL or that of PR; since the y-coordinates of PL and PR lie
between those of PB and PT, such a square necessarily contains PL or PR.
Hence PT and PB cannot be labeled positively while PL and PR are both
labeled negatively. Thus, the VC-dimension of axis-aligned squares in the
plane is 3.
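As a sanity check (not a proof), the lower bound can be verified by brute force. The short Python sketch below enumerates axis-aligned squares over a coarse grid (the grid ranges and step are arbitrary choices of this sketch, not part of the exercise) and confirms that all 8 labelings of the 3-point set are realized.

```python
# Brute-force check that {(1,0), (0,1), (-1,0)} is shattered by
# axis-aligned squares. Grid ranges/steps are arbitrary; enlarging
# them can only help the search find more labelings.
from itertools import product

points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]

def in_square(p, a, b, s):
    """Closed axis-aligned square with lower-left corner (a, b) and side s."""
    return a <= p[0] <= a + s and b <= p[1] <= b + s

corners = [0.25 * i for i in range(-12, 13)]   # a, b in [-3, 3]
sides = [0.25 * i for i in range(1, 25)]       # s in (0, 6]

achieved = set()
for a, b, s in product(corners, corners, sides):
    achieved.add(tuple(in_square(p, a, b, s) for p in points))

print(len(achieved) == 2 ** len(points))       # True: all 8 labelings occur
```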

2. Consider right triangles in the plane with the sides adjacent to the
right angle both parallel to the axes and with the right angle in the
lower left corner. What is the VC-dimension of this family?

Solution: It is not hard to see that the set of 4 points with coordinates
(1, 0), (0, 1), (−1, 0), and (0, −1) can be shattered by such triangles:
the vertical leg can cut off the leftmost point, the horizontal leg the
lowest point, and the hypotenuse, whose negative slope can be chosen
freely, can cut off (1, 0), (0, 1), or both. Thus, the VC-dimension is at
least 4. To see that no five points can be shattered, the same argument
as for axis-aligned rectangles can be used: labeling all points positively
except the one lying in the interior of the convex hull of the others is
not possible, since triangles are convex sets (the degenerate cases where
no point lies in the interior of the convex hull can be treated
separately). Thus, the VC-dimension of this family of triangles is 4.
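The same kind of brute-force enumeration (again a sketch, not a proof; the grid choices are arbitrary assumptions) confirms that all 16 labelings of the 4-point set are realized by such triangles:

```python
# Brute-force check that {(1,0), (0,1), (-1,0), (0,-1)} is shattered by
# right triangles with axis-parallel legs and the right angle at the
# lower-left corner. ~10^6 candidate triangles; takes a few seconds.
from itertools import product

points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]

def in_triangle(p, a, b, w, h):
    """Closed triangle with right angle at (a, b), legs w (along x) and h (along y)."""
    return p[0] >= a and p[1] >= b and (p[0] - a) / w + (p[1] - b) / h <= 1

corners = [0.25 * i for i in range(-12, 13)]   # a, b in [-3, 3]
legs = [0.25 * i for i in range(1, 41)]        # w, h in (0, 10]

achieved = set()
for a, b in product(corners, corners):
    for w, h in product(legs, legs):
        achieved.add(tuple(in_triangle(p, a, b, w, h) for p in points))

print(len(achieved) == 2 ** len(points))       # True: all 16 labelings occur
```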

B. Growth function bound

1. Consider the family H of threshold functions over R^N defined by
H = {x = (x_1, ..., x_N) ↦ sgn(x_i − θ) : i ∈ [1, N], θ ∈ R}, where
sgn(z) = +1 if z ≥ 0 and sgn(z) = −1 otherwise. Give an explicit upper
bound on the growth function Π_H(m) of H that is in O(m^N).

Solution: For each feature, there are at most m + 1 distinct labelings
induced by the choice of threshold (one for each position between two
consecutive feature values, plus one below and one above all values).
Since a hypothesis is specified by one of the N features together with a
threshold, the total number of distinct labelings of a sample of size m
is at most N(m + 1) ≤ (m + 1)^N. Thus, the growth function is upper
bounded by (m + 1)^N, which is in O(m^N).
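The counting argument can be checked empirically. The sketch below (the dimension, sample size, and random sample are arbitrary illustration choices) enumerates the labelings that single-feature thresholds realize on a sample and compares the count with both bounds.

```python
# Count the distinct labelings realized by H = { x -> sgn(x_i - theta) }
# on a random sample and compare with N*(m+1) and (m+1)^N.
import random

random.seed(0)
N, m = 3, 8                                    # arbitrary illustration values
S = [[random.random() for _ in range(N)] for _ in range(m)]

labelings = set()
for i in range(N):
    vals = sorted(x[i] for x in S)
    # one threshold below all values, one between each consecutive pair,
    # and one above all values: at most m + 1 labelings per feature
    thresholds = [vals[0] - 1] + [(u + v) / 2 for u, v in zip(vals, vals[1:])] + [vals[-1] + 1]
    for t in thresholds:
        labelings.add(tuple(1 if x[i] - t >= 0 else -1 for x in S))

print(len(labelings), "<=", N * (m + 1), "<=", (m + 1) ** N)
```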

2. In class, we gave a bound on the Rademacher complexity of a family
G in terms of the growth function (Lecture 3, slide 18). Show that
a finer upper bound on the Rademacher complexity can be given in
terms of E_S[Π(G, S)], where Π(G, S) is the number of ways to label
the points in the sample S.

Solution: Following the proof given in class and using Jensen's inequality
at the last step (the function t ↦ √(2 log t) is concave for t ≥ 1), we can write:

\[
\mathcal{R}_m(G)
= \mathop{\mathbb{E}}_{S,\sigma}\left[\sup_{g \in G} \frac{1}{m}
\begin{pmatrix} \sigma_1 \\ \vdots \\ \sigma_m \end{pmatrix} \cdot
\begin{pmatrix} g(z_1) \\ \vdots \\ g(z_m) \end{pmatrix}\right]
\le \mathop{\mathbb{E}}_{S}\left[\frac{\sqrt{m}\,\sqrt{2 \log \left|\{(g(z_1), \ldots, g(z_m)) : g \in G\}\right|}}{m}\right]
\quad \text{(Massart's lemma)}
\]
\[
= \mathop{\mathbb{E}}_{S}\left[\frac{\sqrt{m}\,\sqrt{2 \log \Pi(G, S)}}{m}\right]
\le \frac{\sqrt{m}\,\sqrt{2 \log \mathbb{E}_S[\Pi(G, S)]}}{m}
= \sqrt{\frac{2 \log \mathbb{E}_S[\Pi(G, S)]}{m}}.
\]
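As an illustration of the finer bound (a sketch only; the class, distribution, sample size, and trial count are arbitrary assumptions of this example), one can estimate the Rademacher complexity of one-dimensional threshold functions by Monte Carlo and compare it with √(2 log E_S[Π(G, S)] / m):

```python
# Monte Carlo comparison of the Rademacher complexity of 1-D thresholds
# g: z -> sgn(z - t) with the bound sqrt(2 log E_S[Pi(G,S)] / m).
import math
import random

random.seed(1)
m, trials = 20, 2000                           # arbitrary illustration values

def labelings(sample):
    """All distinct vectors (g(z_1), ..., g(z_m)) over thresholds t."""
    vals = sorted(sample)
    cuts = [vals[0] - 1] + [(u + v) / 2 for u, v in zip(vals, vals[1:])] + [vals[-1] + 1]
    return {tuple(1 if z - t >= 0 else -1 for z in sample) for t in cuts}

rad_sum, pi_sum = 0.0, 0.0
for _ in range(trials):
    S = [random.random() for _ in range(m)]
    labs = labelings(S)
    pi_sum += len(labs)
    sigma = [random.choice((-1, 1)) for _ in range(m)]
    # the supremum over G reduces to a maximum over the finitely many labelings
    rad_sum += max(sum(s * g for s, g in zip(sigma, lab)) for lab in labs) / m

rad_est = rad_sum / trials                     # estimate of R_m(G)
bound = math.sqrt(2 * math.log(pi_sum / trials) / m)
print(rad_est, "<=", bound)
```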

C. VC-dimension of neural networks

Let C be a concept class over R^r with VC-dimension d. A C-neural network
with one intermediate layer is a concept defined over R^n that can be
represented by a directed acyclic graph such as that of Figure 1, in which
the input nodes are those at the bottom and in which each other node is
labeled with a concept c ∈ C.

[Figure 1: A neural network with one intermediate layer.]

The output of the neural network for a given input vector (x1 , . . . , xn )
is obtained as follows. First, each of the n input nodes is labeled with the
corresponding value xi ∈ R. Next, the value at a node u in the higher layer
and labeled with c is obtained by applying c to the values of the input nodes
admitting an edge ending in u. Note that since c takes values in {0, 1}, the
value at u is in {0, 1}. The value at the top or output node is obtained
similarly by applying the corresponding concept to the values of the nodes
admitting an edge to the output node.
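To make this computation concrete, here is a minimal sketch of how such a network evaluates an input; the particular concepts used are hypothetical examples, not part of the exercise:

```python
# Minimal sketch of a C-neural network with one intermediate layer:
# each internal node applies a {0,1}-valued concept to the values of
# the nodes feeding into it.
def network_output(x, hidden_concepts, output_concept):
    hidden = [c(x) for c in hidden_concepts]   # values of the intermediate nodes
    return output_concept(hidden)              # value of the output node

# Hypothetical example concepts: two halfplane indicators feeding an AND.
h1 = lambda x: 1 if x[0] - 0.5 >= 0 else 0
h2 = lambda x: 1 if x[1] - 0.5 >= 0 else 0
out = lambda v: 1 if v[0] + v[1] >= 2 else 0   # fires iff both hidden nodes fire
print(network_output([0.7, 0.9], [h1, h2], out))   # -> 1
```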

1. Let H denote the set of all neural networks defined as above with
k ≥ 2 internal nodes. Show that the growth function Π_H(m) can be
upper bounded in terms of the product of the growth functions of the
hypothesis sets defined at the individual nodes.

Solution: Let Π_u(m) denote the growth function at a node u of the
intermediate layer. For a fixed set of values at the intermediate layer,
the output node, labeled with a concept from C, can generate at most
Π_C(m) distinct labelings. There are at most ∏_u Π_u(m) possible sets of
values at the intermediate layer since, by definition, for a sample of
size m, at most Π_u(m) distinct value assignments are possible at each
node u. Thus, at most Π_C(m) · ∏_u Π_u(m) labelings can be generated by
the neural network, and Π_H(m) ≤ Π_C(m) ∏_u Π_u(m).

2. Use this to upper bound the VC-dimension of C-neural networks
(hint: you can use the implication m = 2x log₂(xy) ⇒ m > x log₂(ym),
valid for m ≥ 1 and x, y > 0 with xy > 4).

Solution: For any intermediate node u, Π_u(m) = Π_C(m). Thus,
Π_H(m) ≤ Π_C(m)^k. By Sauer's lemma, Π_C(m) ≤ (em/d)^d, thus
Π_H(m) ≤ (em/d)^(dk). Let m = 2dk log₂(ek). Applying the inequality
given by the hint with x = dk and y = e/d (so that xy = ek > 4), this
choice of m implies m > dk log₂(em/d), that is 2^m > (em/d)^(dk) ≥ Π_H(m),
so no set of m points can be fully shattered. Thus, the VC-dimension of H
is less than

2dk log₂(ek).
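As a quick numeric spot-check (the values of d and k below are illustrative only), one can verify that at m = 2dk log₂(ek) we indeed have 2^m > (em/d)^(dk):

```python
# Check 2^m > (em/d)^(dk) at m = 2*d*k*log2(e*k) for a few (d, k) pairs,
# comparing logarithms to avoid overflow.
import math

for d in (1, 2, 5, 10):
    for k in (2, 3, 8):
        m = 2 * d * k * math.log2(math.e * k)
        log_lhs = m * math.log(2)                   # log of 2^m
        log_rhs = d * k * math.log(math.e * m / d)  # log of (em/d)^(dk)
        assert log_lhs > log_rhs, (d, k)
print("2^m > (em/d)^(dk) holds at m = 2dk*log2(ek) for all pairs tried")
```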

3. Let C be the concept class of threshold functions
C = {sgn(∑_{j=1}^{r} w_j x_j) : w ∈ R^r}. Give an upper bound on the
VC-dimension of H in terms of k and r.


Solution: For these threshold functions, the VC-dimension of C is r;
thus, the VC-dimension of H is upper bounded by

2kr log₂(ek).
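For instance (values chosen purely for illustration), a network with k = 10 internal nodes over R^5 (so r = 5) gives the bound 2 · 10 · 5 · log₂(10e) ≈ 100 × 4.76 ≈ 476.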
