Generalization Bounds and Stability: 9.520 Class 14, 03 April 2006 Sasha Rakhlin
Plan
• Generalization Bounds
• Stability
• Generalization Bounds Using Stability
Algorithms
Hypothesis Space
\[
\mathbb{P}_S\left( \sup_{f \in \mathcal{H}} \left| I_S[f] - I[f] \right| > \varepsilon \right) < \delta
\]
Michel Talagrand:
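\[
2\exp\left( -\frac{2\varepsilon^2}{\sum_{i=1}^{n} \left( \frac{b-a}{n} \right)^2} \right)
= 2\exp\left( -\frac{2n\varepsilon^2}{(b-a)^2} \right).
\]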
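The closed form 2 exp(−2nε²/(b − a)²) has the shape of Hoeffding's bound on the deviation of an empirical mean of n independent variables taking values in [a, b]. A minimal sketch of a numerical sanity check follows. The choice of Uniform[0, 1] samples, the function names, and the values of n, ε, and the trial count are all assumptions made for illustration and do not come from the lecture.

```python
import math
import random


def hoeffding_bound(n, eps, a=0.0, b=1.0):
    """Two-sided bound 2 * exp(-2 * n * eps^2 / (b - a)^2)."""
    return 2.0 * math.exp(-2.0 * n * eps ** 2 / (b - a) ** 2)


def empirical_tail(n, eps, trials=5000, seed=0):
    """Fraction of trials in which the mean of n Uniform[0, 1] draws
    deviates from its expectation (0.5) by more than eps."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(trials):
        avg = sum(rng.random() for _ in range(n)) / n
        if abs(avg - 0.5) > eps:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, eps = 50, 0.1  # illustrative values only
    print("bound    :", hoeffding_bound(n, eps))
    print("empirical:", empirical_tail(n, eps))
```

For n = 50 and ε = 0.1 the bound evaluates to 2e⁻¹ ≈ 0.736. The simulated frequency should come out well below that, which is expected: the bound holds for every distribution on [a, b] and is therefore loose for any particular one.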
Inequality
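\[
\begin{aligned}
\mathbb{E}_S D[f_S]
&= \mathbb{E}_S \left[ I_S[f_S] - I[f_S] \right] \\
&= \mathbb{E}_{S,z} \left[ \frac{1}{n} \sum_{i=1}^{n} V(f_S(x_i), y_i) - V(f_S(x), y) \right] \\
&= \mathbb{E}_{S,z} \left[ \frac{1}{n} \sum_{i=1}^{n} V(f_{S^{i,z}}(x), y) - V(f_S(x), y) \right] \\
&\le \beta
\end{aligned}
\]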
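\[
\begin{aligned}
\left| D[f_S] - D[f_{S^{i,z}}] \right|
&= \left| I_S[f_S] - I[f_S] - I_{S^{i,z}}[f_{S^{i,z}}] + I[f_{S^{i,z}}] \right| \\
&\le \left| I[f_S] - I[f_{S^{i,z}}] \right| + \left| I_S[f_S] - I_{S^{i,z}}[f_{S^{i,z}}] \right| \\
&\le \beta + \frac{1}{n} \left| V(f_S(x_i), y_i) - V(f_{S^{i,z}}(x), y) \right|
   + \frac{1}{n} \sum_{j \ne i} \left| V(f_S(x_j), y_j) - V(f_{S^{i,z}}(x_j), y_j) \right| \\
&\le \beta + \frac{M}{n} + \beta \\
&= 2\beta + \frac{M}{n}
\end{aligned}
\]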
Applying McDiarmid’s Inequality
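\[
\begin{aligned}
\mathbb{P}\left( \left| D[f_S] - \mathbb{E} D[f_S] \right| > \varepsilon \right)
&\le 2\exp\left( -\frac{2\varepsilon^2}{\sum_{i=1}^{n} \left( 2\left( \beta + \frac{M}{n} \right) \right)^2} \right) \\
&= 2\exp\left( -\frac{\varepsilon^2}{2n\left( \beta + \frac{M}{n} \right)^2} \right) \\
&= 2\exp\left( -\frac{n\varepsilon^2}{2(n\beta + M)^2} \right)
\end{aligned}
\]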
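Note that, since \(\mathbb{E} D[f_S] \le \beta\),
\[
\begin{aligned}
\mathbb{P}\left( D[f_S] > \beta + \varepsilon \right)
&\le \mathbb{P}\left( D[f_S] - \mathbb{E} D[f_S] > \varepsilon \right) \\
&\le \mathbb{P}\left( \left| D[f_S] - \mathbb{E} D[f_S] \right| > \varepsilon \right)
\end{aligned}
\]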
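Hence,
\[
\mathbb{P}\left( D[f_S] > \beta + \varepsilon \right) \le 2\exp\left( -\frac{n\varepsilon^2}{2(n\beta + M)^2} \right).
\]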
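If we define
\[
\delta \equiv 2\exp\left( -\frac{n\varepsilon^2}{2(n\beta + M)^2} \right),
\]
then solving for \(\varepsilon\) gives
\[
\varepsilon = (n\beta + M) \sqrt{\frac{2\ln(2/\delta)}{n}}.
\]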
By varying \(\delta\) (and \(\varepsilon\)), we can say that for any \(\delta \in (0, 1)\), with probability at least \(1 - \delta\),
\[
I[f_S] \le I_S[f_S] + \beta + (n\beta + M) \sqrt{\frac{2\ln(2/\delta)}{n}}.
\]
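To get a feel for the bound, the gap term β + (nβ + M)·sqrt(2 ln(2/δ)/n) can be evaluated for different rates of β. The sketch below is illustrative only: the function names, the constant M = 1, and the two example rates β = 1/n and β = 1/√n are assumptions chosen for the demonstration, not values from the lecture.

```python
import math


def eps_from_delta(delta, beta, M, n):
    """Invert delta = 2 exp(-n eps^2 / (2 (n beta + M)^2)) for eps."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return (n * beta + M) * math.sqrt(2.0 * math.log(2.0 / delta) / n)


def delta_from_eps(eps, beta, M, n):
    """Confidence parameter delta = 2 exp(-n eps^2 / (2 (n beta + M)^2))."""
    return 2.0 * math.exp(-n * eps ** 2 / (2.0 * (n * beta + M) ** 2))


def stability_gap(beta, M, n, delta):
    """Right-hand side of the bound minus I_S[f_S]: beta + eps(delta)."""
    return beta + eps_from_delta(delta, beta, M, n)


if __name__ == "__main__":
    M, delta = 1.0, 0.05  # illustrative constants
    for n in (100, 10_000, 1_000_000):
        fast = stability_gap(1.0 / n, M, n, delta)             # beta = 1/n
        slow = stability_gap(1.0 / math.sqrt(n), M, n, delta)  # beta = 1/sqrt(n)
        print(f"n={n:>9}  beta=1/n: {fast:.4f}   beta=1/sqrt(n): {slow:.4f}")
```

With β = 1/n the gap shrinks at rate O(1/√n). With β = 1/√n the product (nβ + M)·sqrt(2 ln(2/δ)/n) stays above sqrt(2 ln(2/δ)) for every n, so the bound does not tend to zero. In other words, the bound is informative only when β decays faster than 1/√n.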
Fast Convergence
An algorithm \(A : Z^n \to F\) is
\[
\forall^{\delta} S \in Z^n,\ \forall i,\ u \in Z, \quad \left| V(f_S, u) - V(f_{S^{i,u}}, u) \right| \le \beta.
\]
Thoughts on stability and open questions