Nonex Kap3
Characterization of best
approximations
(a) w = P_C(x).
(b) ⟨x − w|y − w⟩ ≤ 0 for all y ∈ C.
(c) ‖x − w‖² + ‖y − w‖² ≤ ‖x − y‖² for all y ∈ C.
Proof:
(a) =⇒ (b) Let y ∈ C and t ∈ (0, 1] . Then ty + (1 − t)w ∈ C and we have
Kolmogorov's criterion in Hilbert spaces, i.e. the equivalence (a) ⇐⇒ (b), says
that the best approximation w to x is such that x − w forms an obtuse angle with every
direction y − w from w into the set C. Another way of looking at this criterion is to
define, for x ∈ H\C, the hyperplane
H_x := {y ∈ H : ⟨x − P_C(x)|y⟩ = 0} .
One then notes that the translate H_x + P_C(x) supports the convex set C at P_C(x). Later on,
we will present Kolmogorov's criterion also for Banach spaces. There the formulation and
the proof are much more delicate.
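Kolmogorov's criterion can be checked numerically in H = R³. The following Python sketch is an illustration only: it assumes that C is the box [0, 1]³, for which the metric projection is componentwise clipping, and the helper names (project_box, kolmogorov_holds, strong_uniqueness_holds) are hypothetical. It tests conditions (b) and (c) on finitely many sample points of C, so it can refute the criterion for a candidate w but never prove it.

```python
import random

def project_box(x, lo, hi):
    """Metric projection of x onto the box [lo, hi]^n (componentwise clipping)."""
    return [min(max(xi, lo), hi) for xi in x]

def inner(u, v):
    """Euclidean inner product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def sub(u, v):
    """Componentwise difference u - v."""
    return [ui - vi for ui, vi in zip(u, v)]

def kolmogorov_holds(x, w, samples, tol=1e-12):
    """Condition (b): <x - w | y - w> <= 0 for every sampled y in C."""
    return all(inner(sub(x, w), sub(y, w)) <= tol for y in samples)

def strong_uniqueness_holds(x, w, samples, tol=1e-12):
    """Condition (c): |x - w|^2 + |y - w|^2 <= |x - y|^2 for every sampled y in C."""
    d = sub(x, w)
    return all(
        inner(d, d) + inner(sub(y, w), sub(y, w)) <= inner(sub(x, y), sub(x, y)) + tol
        for y in samples
    )

rng = random.Random(0)                      # fixed seed, reproducible samples
x = [2.0, -0.5, 0.3]
w = project_box(x, 0.0, 1.0)                # candidate best approximation
samples = [[rng.uniform(0.0, 1.0) for _ in range(3)] for _ in range(1000)]
samples.append([1.0, 0.0, 0.0])             # a corner of the box, added on purpose

print(w, kolmogorov_holds(x, w, samples), strong_uniqueness_holds(x, w, samples))
```

A point of C that is not the best approximation, for instance (1, 0.5, 0.3), violates (b) at the sample (1, 0, 0).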
The equivalence (a) ⇐⇒ (c) leads to the so called strong uniqueness of PC (x):
Theorem 3.2. Let H be a Hilbert space and let C be a nonempty closed convex subset of
H . Then the metric projection PC has the following properties:
(2) ‖P_C(x) − P_C(x′)‖² + ‖(I − P_C)(x) − (I − P_C)(x′)‖² ≤ ‖x − x′‖² for all x, x′ ∈ H.
(3) P_C : H −→ C is nonexpansive.
(6) PC is positively homogeneous, i.e. PC (tx) = tPC (x) for all t > 0, x ∈ H, if C is a
cone.
Proof:
Ad (1) C is a Chebyshev set; see Theorem 2.17.
Ad (2) Let w = P_C(x), w′ = P_C(x′). Then
Theorem 3.3. Let H be a Hilbert space and let U be a closed subspace of H . Then the
metric projection P_U has the following properties:
(4) For each x ∈ H we have: w = P_U(x) if and only if ⟨x − w|u⟩ = 0 for all u ∈ U.
Proof:
Ad (1), (2), (3) These are clear because U is a Chebyshev set; see Theorem 2.17.
Ad (4) Let w = P_U(x). Applying (b) in Theorem 3.1 with y = w + u and y = w − u, which both belong to U, we conclude ⟨x − P_U(x)|u⟩ = 0 for all u ∈ U.
Ad (5) Follows from (4).
Ad (6) Using (5) we obtain
⟨P_U(x)|x⟩ = ⟨P_U(P_U(x))|x⟩ = ⟨P_U(x)|P_U(x)⟩ = ‖P_U(x)‖² ≥ 0 .
This implies x0 + P_U(x − x0) = P_{x0+U}(x). Let
Since P_U is a linear continuous mapping (see (8), (9)), L is linear and continuous too.
Let x ∈ H and let x = w−lim_n x_n. Then
This shows w−lim_n Lx_n = Lx and therefore w−lim_n P_{x0+U}(x_n) = P_{x0+U}(x). □
Theorem 3.4 (Reduction principle). Let H be a Hilbert space, let C be a convex subset and
let U ⊂ H be a linear subspace which is a Chebyshev set. Suppose that C ⊂ U. Then we
have:
(a) P_C ◦ P_U = P_C = P_U ◦ P_C.
(b) dist(x, C)² = dist(x, U)² + dist(P_U(x), C)² for all x ∈ H.
Proof:
Notice that P_C(x) is empty or a singleton due to the fact that C is a convex subset of a
Hilbert space. Moreover, P_U(x) is a singleton for each x ∈ H since U is a Chebyshev set.
Ad (a) P_U ◦ P_C = P_C since C ⊂ U. Let x ∈ H, y ∈ C. Then y ∈ U and x − P_U(x) ∈ U⊥
due to (5) in Theorem 3.3. Since P_U(x) − y ∈ U, the Pythagorean identity gives
‖x − y‖² = ‖x − P_U(x)‖² + ‖P_U(x) − y‖² . (3.1)
We conclude that y minimizes u ⟼ ‖x − u‖² over C iff y minimizes v ⟼ ‖P_U(x) − v‖² over C.
This shows that P_C(x) exists iff P_C(P_U(x)) exists, and then P_C(x) = P_C(P_U(x)) holds true. In
particular, P_C(x) ≠ ∅ iff P_C(P_U(x)) ≠ ∅. This implies (a).
Ad (b) Follows from (3.1) by taking the infimum over y ∈ C. □
The interpretation of the reduction principle is the following: To find the best approx-
imation to x from C we first project x onto U and then project PU (x) onto C ; the result
is PC (x) .
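The reduction principle can be illustrated in H = R³. The Python sketch below is an assumption-laden toy example, not a general implementation: U is taken to be the coordinate plane {(a, b, 0)}, C the square [0, 1] × [0, 1] × {0} inside U, and the function names are hypothetical. For these particular sets both projections have closed forms, so the identities (a) and (b) of Theorem 3.4 can be evaluated directly.

```python
def project_subspace(x):
    """P_U for the assumed subspace U = {(a, b, 0)}: drop the third coordinate."""
    return [x[0], x[1], 0.0]

def project_set(x):
    """P_C for the assumed set C = [0, 1] x [0, 1] x {0}, a closed convex subset of U."""
    def clip(t):
        return min(max(t, 0.0), 1.0)
    return [clip(x[0]), clip(x[1]), 0.0]

def dist_sq(u, v):
    """Squared Euclidean distance between u and v."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

x = [3.0, -2.0, 4.0]
pu = project_subspace(x)      # P_U(x)
pc = project_set(x)           # P_C(x)

# (a): projecting first onto U and then onto C gives the same point as P_C(x).
print(project_set(pu) == pc, project_subspace(pc) == pc)
# (b): dist(x, C)^2 = dist(x, U)^2 + dist(P_U(x), C)^2.
print(dist_sq(x, pc), dist_sq(x, pu) + dist_sq(pu, pc))
```

The practical point is the one made in the text: the distance from x to U is paid once, and the remaining minimization takes place entirely inside U.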
Corollary 3.5. Let H be a Hilbert space and let C ⊂ H be a nonempty closed convex subset
of H. Then P_C : H −→ C satisfies
Proof:
Let x, y ∈ H . From Kolmogorov’s criterion we have:
The property (3.2) of metric projections in Hilbert spaces is called firm nonexpansivity.
We shall come back to this property in a general context in Chapter 7. Notice
that we obtain from (3.2) the nonexpansivity of P_C.
3.2 Duality mapping
Let X be a Banach space. We set:
(6) The duality map is surjective if and only if every functional λ ∈ X ∗ attains its maxi-
mum on S1 .
therefore weak∗ compact (Theorem of Banach-Alaoglu).
Ad (5) Let x, y ∈ X and λ ∈ J(x), µ ∈ J(y). Then
⟨λ − µ, x − y⟩ = ⟨λ, x⟩ − ⟨λ, y⟩ − ⟨µ, x⟩ + ⟨µ, y⟩
≥ ‖x‖² + ‖y‖² − (‖λ‖‖y‖ + ‖µ‖‖x‖) = (‖x‖ − ‖y‖)² . (3.4)
Now, we read off the monotonicity.
Ad (6) Let λ ∈ X∗ attain its maximum on S1, i.e. there exists x ∈ S1 with ⟨λ, x⟩ = ‖λ‖; see the preliminaries in
the preface. Let y := ‖λ‖x. Then λ ∈ J(y) since ‖λ‖ = ‖y‖ and
⟨λ, y⟩ = ⟨λ, ‖λ‖x⟩ = ‖λ‖⟨λ, x⟩ = ‖λ‖² = ‖y‖² .
Conversely, let λ ∈ X∗ and let y ∈ X with λ ∈ J(y). If y = θ then λ = θ and nothing has to be
proved. Let y ≠ θ. Then
⟨λ, y‖y‖⁻¹⟩ = ‖y‖⁻¹⟨λ, y⟩ = ‖y‖ = ‖λ‖ ,
which implies that λ attains its norm at ‖y‖⁻¹y ∈ S1.
Ad (7) This is the theorem of Bishop-Phelps; see [7].
Ad (8) This follows from (5) by using the Theorem of James which says that a Banach
space is reflexive iff every functional λ ∈ X ∗ attains its maximum on S1 .
Ad (9) Let λ ∈ J(x). By the Riesz representation theorem we have y ∈ X with ⟨y|x⟩ =
⟨λ, x⟩ = ‖x‖², ‖λ‖ = ‖y‖ = ‖x‖. By the parallelogram identity it is easily seen that
y = x. □
Example 3.7. Let 1 < p < ∞ and let X := L^p(Ω) be the Banach space of p-integrable
functions on the measurable set Ω ⊂ R^n. In this case, L^p(Ω)∗ = L^q(Ω) with 1/p + 1/q = 1
and one has
J(x) = {‖x‖_p^{2−p} |x|^{p−1} sign(x)} , x ≠ θ .
Later on, we find a way of computing the result above. □
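The two defining properties of the duality map, ⟨J(x), x⟩ = ‖x‖² and ‖J(x)‖ = ‖x‖, can be tested for this formula in a finite-dimensional analogue. The Python sketch below is a hedged illustration: it assumes X = R^n with the l^p-norm (the L^p space of the counting measure on n points) and uses the formula J(x) = ‖x‖_p^{2−p} |x|^{p−1} sign(x) componentwise; the function names are hypothetical.

```python
def p_norm(x, p):
    """l^p norm of a finite sequence x, 1 < p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def sign(t):
    """Sign of a real number as -1, 0 or 1."""
    return (t > 0) - (t < 0)

def duality_map(x, p):
    """Componentwise J(x) = ||x||_p^(2-p) * |x|^(p-1) * sign(x); J(0) = 0."""
    n = p_norm(x, p)
    if n == 0.0:
        return [0.0 for _ in x]
    return [n ** (2.0 - p) * abs(t) ** (p - 1.0) * sign(t) for t in x]

p = 3.0
q = p / (p - 1.0)                       # conjugate exponent, 1/p + 1/q = 1
x = [1.0, -2.0, 0.5]
jx = duality_map(x, p)

pairing = sum(a * b for a, b in zip(jx, x))
print(pairing, p_norm(x, p) ** 2)       # <J(x), x> versus ||x||_p^2
print(p_norm(jx, q), p_norm(x, p))      # dual norm of J(x) versus ||x||_p
```

For p = 2 the factor ‖x‖^{2−p} equals 1 and the map reduces to the identity, in agreement with the Hilbert space case.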
Lemma 3.8. Let X be a Banach space with duality mapping J_X. Let x ∈ X and let (x_n)n∈N be a sequence in X
such that
lim_n ⟨J_X(x_n) − J_X(x), x_n − x⟩ = 0 .
Then x = lim_n x_n.
Proof:
See [5]. □
Lemma 3.9. Let X be a Banach space with duality mapping J_X. Let (s_n)n∈N be a non-increasing
sequence of positive numbers with lim s_n = ∞ and let (x_n)n∈N be a sequence in X. Assume
⟨s_n J_X(x_n) − s_m J_X(x_m), x_n − x_m⟩ ≤ 0 , m, n ∈ N .
Then (‖x_n‖)n∈N is a nondecreasing sequence. If it is bounded then (x_n)n∈N is convergent.
Proof:
See [4]. □
Example 3.10. Consider X := R² endowed with the l1-norm and let U be the one-dimensional
subspace R × {0}. Then µ : U ∋ (x, y) ⟼ x ∈ R is a linear continuous
functional on U with ‖µ‖ = 1 = ⟨µ, (1, 0)⟩. Each extension of µ to a continuous
functional λ on R² has the representation λ : R² ∋ (x, y) ⟼ αx + βy ∈ R. It is easy
to verify that such a functional λ has the properties λ|U = µ and ‖λ‖ = 1 if α = 1 and |β| ≤ 1; notice
that the dual norm on R² is the l∞-norm. Thus, for each such λ we have
Example 3.11. Let K be a compact topological space and let C(K) be the space of continuous
functions f from K into R. C(K) is a Banach space under the supremum norm
‖·‖∞ : ‖f‖∞ := sup_{ξ∈K} |f(ξ)|. Its dual space C(K)∗ can be represented in different ways.
In the case K = [0, 1] the representation as the space BV[0, 1] of functions of bounded
variation is appropriate. Each function g of bounded variation leads to a Stieltjes integral
on C(K), and the elements of C(K)∗ have the following representation:
∀ λ ∈ C(K)∗ ∃ g ∈ BV[0, 1] (⟨λ, f⟩ = ∫_K f dg) .
In the general case, we have to consider the set of signed Borel measures µ on K.
Each such measure µ has a decomposition µ = µ+ − µ− where µ+, µ− are nonnegative
measures; |µ| := µ+ + µ− is called the total variation of µ. Now, the dual space C(K)∗
can be described as follows: For each λ ∈ C(K)∗ there exists a uniquely determined finite
and regular signed Borel measure µ with
⟨λ, f⟩ = ∫_K f dµ , f ∈ C(K) .
"Finite" means that the range of µ is contained in R and "regular" means that the measure
µ(A) of each Borel subset A of K can be approximated from interior by the measure of
open subsets. The norm in C(K)∗ under this representation is given by the total variation.
Examples of functionals in C(K)∗ are Dirac's point measures δ_ξ, ξ ∈ K : ⟨δ_ξ, f⟩ = f(ξ).
Now the duality map J on C(K) is given as follows:
J(f) = {µ : µ finite, regular, signed measure on K, ∫_K f dµ = |µ|‖f‖∞, |µ| = ‖f‖∞} .
If we choose the function f ≡ 1 then J(f) is the set of probability measures on the Borel
sigma algebra of K. □
The property (4) in Lemma 3.6 has a nice consequence concerning the continuity of J
on the unit sphere.
Lemma 3.12. Let X be a Banach space with the duality mapping J and let x ∈ S1 with
#J(x) = 1. Suppose we have sequences (x_n)n∈N, (λ_n)n∈N with ‖x_n‖ = 1, n ∈ N, lim_n x_n =
x, λ_n ∈ J(x_n), n ∈ N. Then w∗−lim_n λ_n = J(x).
Proof:
We have ‖λ_n‖ = ‖x_n‖ = ⟨λ_n, x_n⟩ = 1, n ∈ N. Since the unit ball in X∗ is σ(X∗, X)-compact
(Theorem of Banach-Alaoglu), the sequence (λ_n)n∈N possesses σ(X∗, X)-convergent
subsequences. Let (λ_{n_k})k∈N be such a subsequence of (λ_n)n∈N; λ := w∗−lim_k λ_{n_k}. Then
‖λ‖ ≤ 1 and
By taking k → ∞ this implies ⟨λ, x⟩ = 1, ‖λ‖ = 1 and therefore λ ∈ J(x). Since #J(x) = 1
we conclude that λ = w∗−lim_n λ_n = J(x). □
In several cases, when one works in specific Banach spaces like l^p, L^p(Ω), W^{1,p}(Ω), 1 <
p < ∞, a generalization of the normalized duality map is appropriate. In a more general
consideration, one defines a duality mapping as
The relation between the normalized duality map and the generalized duality map J_h is
easily seen to be
J_h(x) = (h(‖x‖)/‖x‖) J(x) , x ∈ X\{θ} .
Duality mappings will be used as a main tool for studying the properties of the metric
projection. We restrict ourselves to a consideration using a normalized duality mapping
only.
Lemma 3.13. Let X be a Banach space such that the dual space X ∗ is a strictly convex
Banach space. Then the duality map J is a single-valued mapping.
Proof:
Let x ∈ X. If x = θ then J(x) = {θ}. Suppose that x ≠ θ. Let λ, µ ∈ J(x). Then
‖λ‖ = ‖µ‖ = ‖x‖ and
From this follows 2‖λ‖ ≤ ‖λ + µ‖ and therefore, since ‖λ‖ = ‖µ‖, ‖λ‖ + ‖µ‖ ≤ ‖λ + µ‖. This
implies ‖λ‖ + ‖µ‖ = ‖λ + µ‖. By the strict convexity of X∗ we get λ = µ; see Lemma
2.14. □
Lemma 3.14. Let X be a Banach space with duality mapping J . Then the following
conditions are equivalent:
(a) X is strictly convex.
(b) For all λ ∈ X∗\{θ} there exists at most one x ∈ S1 with ‖λ‖ = ⟨λ, x⟩.
(c) J is strictly monotone, i.e.
Proof:
(a) =⇒ (b) Let λ ∈ X∗\{θ}. We may assume ‖λ‖ = 1. Let x, y ∈ X with
Proof:
Let λ ∈ X∗. If λ = θ then λ is in the range of J since J(θ) = θ. Assume now λ ≠ θ and
consider µ := λ‖λ‖⁻¹. Since X is reflexive there exists z ∈ X with ‖z‖ = 1, ⟨µ, z⟩ = 1.
Then ‖µ‖ = ‖z‖, ⟨µ, z⟩ = ‖µ‖‖z‖ and we conclude that µ ∈ J(z). Since X∗ is strictly
convex, J is single-valued and we have J(z) = µ. With x := ‖λ‖z we have J(x) = J(‖λ‖z) =
‖λ‖J(z) = ‖λ‖µ = λ. Now, the proof of surjectivity is complete.
Assume that J is not injective. Then there exist u, v ∈ X, u ≠ v, with λ := J(u) = J(v). We
have ‖λ‖ = ‖u‖, ⟨λ, u⟩ = ‖λ‖‖u‖, ‖λ‖ = ‖v‖, ⟨λ, v⟩ = ‖λ‖‖v‖. For t ∈ (0, 1) we set
x := tu + (1 − t)v. Then ⟨λ, x⟩ = ‖λ‖(t‖u‖ + (1 − t)‖v‖) and therefore ‖λ‖(t‖u‖ + (1 −
t)‖v‖) = |⟨λ, x⟩| ≤ ‖λ‖‖x‖, i.e. t‖u‖ + (1 − t)‖v‖ ≤ ‖x‖. Obviously, ‖x‖ ≤ t‖u‖ + (1 − t)‖v‖.
Thus, we have
‖tu + (1 − t)v‖ = t‖u‖ + (1 − t)‖v‖ .
We set w := u‖u‖⁻¹, z := v‖v‖⁻¹. Since ‖u‖ = ‖v‖ = ‖λ‖ we have
This shows that the segment [w, z] is contained in S_{1,X∗}. This contradicts the
assumption that X∗ is strictly convex.
Now we have that J is bijective. From the definition of J and J∗ we conclude J∗ ◦ J =
I, J ◦ J∗ = I. □
3.3 Smoothness
Let us begin with a helpful result concerning convex real functions.
Lemma 3.16. Let X be a vector space and let g : X −→ R be a convex function which
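satisfies g(θ) = 0. Let y ∈ X. Then the mapping (0, ∞) ∋ t ⟼ g(ty)/t ∈ R is
monotone non-decreasing.
Proof:
Let 0 < s ≤ t and set a := s/t ∈ (0, 1]. Then, by convexity and g(θ) = 0,
g(sy) = g(a(ty) + (1 − a)θ) ≤ a g(ty) + (1 − a) g(θ) = (s/t) g(ty) ,
and dividing by s gives g(sy)/s ≤ g(ty)/t. □
ν : X ∋ x ⟼ ‖x‖ ∈ R , x ∈ X ,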
Lemma 3.17. Let X be a Banach space with norm-mapping ν . Let x ∈ X \{θ} . Then we
have:
(3) ν′₊(x, ay) = aν′₊(x, y), ν′₋(x, ay) = aν′₋(x, y) for all y ∈ X, a ≥ 0.
(4) ν′₊(x, u + w) ≤ ν′₊(x, u) + ν′₊(x, w), ν′₋(x, u + w) ≥ ν′₋(x, u) + ν′₋(x, w), u, w ∈ X.
(5) |ν′₊(x, y)| ≤ ‖y‖, |ν′₋(x, y)| ≤ ‖y‖ for all y ∈ X.
Proof:
Ad (1) Let x, y ∈ X. We apply Lemma 3.16 to g(y) := ‖x + y‖ − ‖x‖ and obtain that
d_{x,y} : R\{0} ∋ t ⟼ (‖x + ty‖ − ‖x‖)/t
is monotone non-decreasing in (0, ∞). In the same
way, d_{x,−y} is monotone non-decreasing in (0, ∞). This implies by a simple argument
that d_{x,y} is monotone non-decreasing in R\{0}. We conclude that
ν′₊(x, y) = lim_{t↓0} (‖x + ty‖ − ‖x‖)/t
exists. A similar argument shows that ν′₋(x, y) exists.
Ad (2) See the proof of (1).
Ad (3), (4) This is easy to show.
Ad (5) Follows from
−‖y‖ ≤ ν′₋(x, y) ≤ ν′₊(x, y) ≤ ‖y‖ , y ∈ X .
Ad (6) Obvious. □
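The difference quotients d_{x,y}(t) used in this proof are easy to evaluate numerically. The Python sketch below is an illustration under stated assumptions: X = R² with either the l1-norm or the Euclidean norm, x = (1, 0), y = (0, 1), and step sizes t = 2⁻ᵏ; the function names are hypothetical. For the l1-norm the quotients from the right and from the left stay at 1 and −1, so the one-sided derivatives differ, while for the Euclidean norm the quotients decrease monotonically towards 0 as t ↓ 0.

```python
import math

def l1_norm(x):
    """l1 norm on R^n."""
    return sum(abs(t) for t in x)

def l2_norm(x):
    """Euclidean norm on R^n."""
    return math.sqrt(sum(t * t for t in x))

def diff_quotient(norm, x, y, t):
    """d_{x,y}(t) = (||x + t y|| - ||x||) / t for t != 0."""
    shifted = [a + t * b for a, b in zip(x, y)]
    return (norm(shifted) - norm(x)) / t

x = [1.0, 0.0]
y = [0.0, 1.0]
ts = [2.0 ** (-k) for k in range(1, 20)]   # step sizes decreasing to 0

right_l1 = [diff_quotient(l1_norm, x, y, t) for t in ts]    # t > 0
left_l1 = [diff_quotient(l1_norm, x, y, -t) for t in ts]    # t < 0
right_l2 = [diff_quotient(l2_norm, x, y, t) for t in ts]    # t > 0

print(right_l1[-1], left_l1[-1])   # approximations of the two one-sided derivatives
print(right_l2[0], right_l2[-1])   # Euclidean quotients shrink as t decreases
```

Since the step sizes are powers of two, the l1 quotients are computed without rounding error in this particular example.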
Definition 3.18. Let X be a Banach space with norm mapping ν . Then the norm in X
is called Gateaux differentiable in x if ν′₊(x, y) = ν′₋(x, y) for all y ∈ X. If the norm
is Gateaux differentiable in x we set ν′(x, y) := ν′₊(x, y), y ∈ X, and we call ν′(x, y) the
Gateaux derivative of the norm in x in direction y. □
Clearly, a norm cannot be Gateaux differentiable in θ .
Lemma 3.19. Let X be a Banach space with norm-mapping ν and let the norm be
Gateaux differentiable in x ∈ X \{θ} . Then we have:
(1) ν′(x, ·) is linear.
(2) |ν′(x, y)| ≤ ‖y‖ for all y ∈ X.
(3) ν′(x, ·) ∈ X∗, ν′(x, x) = ‖x‖ and ‖ν′(x, ·)‖ = 1.
Proof:
Ad (1) Follows from (2), (3), (4) in Lemma 3.17.
Ad (2) See (5) in Lemma 3.17.
Ad (3) We know from (1), (2) that ν′(x, ·) ∈ X∗ and ‖ν′(x, ·)‖ ≤ 1. Since ν′(x, x) = ‖x‖
we have ‖ν′(x, ·)‖ = 1. □
Lemma 3.20. Let X be a Banach space with norm-mapping ν and let the norm be
Gateaux differentiable in x ∈ X\{θ}. Then we have ‖x‖ν′(x, ·) ∈ J(x).
Proof:
Due to Lemma 3.19 we have for x ∈ X\{θ}
ν′(x, ·) ∈ {λ ∈ X∗ : ‖λ‖ = 1, ⟨λ, x⟩ = ‖x‖} .
□
We set
Σ(x) := {λ ∈ X∗ : ⟨λ, x⟩ = ‖x‖, ‖λ‖ = 1} , x ∈ X\{θ} .
Since ‖x‖⁻¹J(x) = Σ(x) for x ≠ θ, we know that Σ(x) is nonempty for all x ∈ X\{θ}.
Above we have seen ν′(x, ·) ∈ Σ(x) if ν is Gateaux differentiable in x.
Lemma 3.21. Let X be a Banach space with norm-mapping ν . Let x ∈ S1 and let λ ∈ X ∗ .
Then the following statements are equivalent:
(a) λ ∈ Σ(x) .
(b) ν′₋(x, y) ≤ ⟨λ, y⟩ ≤ ν′₊(x, y) for all y ∈ X.
Proof:
(a) =⇒ (b) Let y ∈ X, t > 0. Then
ν′₋(x, y) ≤ ⟨λ, y⟩ ≤ ν′₊(x, y) for all y ∈ X, ⟨λ, z⟩ = s, ‖λ‖ = 1, ⟨λ, x⟩ = 1. (3.5)
Proof:
We shall construct λ by the Hahn-Banach theorem. Set Z := span({z}) and define µ on
Z by
⟨µ, az⟩ := as , a ∈ R .
Using the properties (2), (3) in Lemma 3.17 it is easy to prove
We know from (4) in Lemma 3.17 that ν′₊(x, ·) is subadditive, i.e. ν′₊(x, u + w) ≤
ν′₊(x, u) + ν′₊(x, w) for all u, w ∈ X. Hence, by the Hahn-Banach theorem we may
extend µ from Z to the space X and obtain a linear functional λ ∈ X′ such that
⟨λ, u⟩ = ⟨µ, u⟩ for all u ∈ Z and ⟨λ, y⟩ ≤ ν′₊(x, y) for all y ∈ X.
Therefore λ ∈ X∗ and ‖λ‖ ≤ 1. Since ν′₋(x, x) = ν′₊(x, x) = 1 we have ‖λ‖ = 1. □
Clearly, a Hilbert space H is Gateaux differentiable. The Gateaux derivative ν′(x, ·)
of the norm in x ∈ H\{θ} is given as follows:
ν′(x, y) = ⟨x/‖x‖ | y⟩ , y ∈ H .
Gateaux differentiability of a space is an analytic property of the norm-mapping. Now,
we want to show that this property is related to the observation that in a Euclidean space
each boundary point of the unit ball admits a unique line through this point which
is tangent to the sphere. We know ν′(x, ·) ∈ Σ(x). Each λ ∈ Σ(x) defines a family of
hyperplanes
H_{λ,a} := {z ∈ X : ⟨λ, z⟩ = a} , a ∈ R .
The hyperplane H_{λ,a} for a := ‖x‖ touches the ball B_{‖x‖} in X and we say that it supports
B_{‖x‖} in x.
Definition 3.24. Let X be a Banach space.
(a) X is smooth at x ∈ S1 (or x ∈ S1 is point of smoothness of X ) if #Σ(x) = 1 .
(b) X is smooth if X is smooth at every x ∈ S1 .
□
Clearly, a Hilbert space H is smooth since we already know that #Σ(x) = 1 for all
x ∈ H\{θ} .
Theorem 3.25. Let X be a Banach space. Then we have:
(a) If X ∗ is strictly convex then X is smooth.
(b) If X ∗ is smooth then X is strictly convex.
Proof:
Ad (a) Assume that X is not smooth. Then there exists x0 ∈ S1 and λ, µ ∈ Σ(x0) with
λ ≠ µ. We obtain
1 = ⟨λ, ½(x + y)⟩ = ½⟨λ, x⟩ + ½⟨λ, y⟩ ≤ ½(‖x‖ + ‖y‖) = 1 .
Therefore we have ⟨λ, x⟩ = ⟨λ, y⟩ = 1. Consider now x, y as elements in X∗∗. Then we
conclude that X∗ is not smooth at λ. □
Theorem 3.26. Let X be a Banach space. Let x ∈ X \{θ} . Then the following statements
are equivalent:
(a) The duality map is single-valued.
(b) X is smooth at x.
(c) The norm is Gateaux differentiable in x.
Proof:
(a) ⇐⇒ (b) This follows from the fact that X is smooth at x ∈ S1 iff #Σ(x) = 1.
(b) =⇒ (c) If the norm is not Gateaux differentiable at x then there exists a z ∈ X with
ν′₋(x, z) < ν′₊(x, z). Then by Lemma 3.22 there exist at least two different functionals
λ1, λ2 ∈ Σ(x). Therefore x is not a point of smoothness.
(c) =⇒ (b) Since ν′₋(x, y) = ν′₊(x, y) for all y ∈ X, the only functional λ ∈ Σ(x) is the
functional λ with ⟨λ, y⟩ = ν′(x, y) = ν′₊(x, y) for all y ∈ X. □
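A concrete non-smooth point helps to see what #Σ(x) > 1 means. The Python sketch below is a toy illustration under the assumption X = R² with the l1-norm, whose dual norm is the l∞-norm; the helper names are hypothetical. It tests membership in Σ(x) for a small family of candidate functionals λ = (1, β) at the corner x = (1, 0) of the unit ball and at the point x = (1/2, 1/2) in the interior of an edge.

```python
def l1_norm(x):
    """Norm of the assumed space X = (R^2, l1)."""
    return sum(abs(t) for t in x)

def linf_norm(lam):
    """Dual norm on X* = (R^2, l-infinity)."""
    return max(abs(t) for t in lam)

def pair(lam, x):
    """Dual pairing <lam, x>."""
    return sum(a * b for a, b in zip(lam, x))

def in_sigma(lam, x, tol=1e-12):
    """lam belongs to Sigma(x) iff ||lam|| = 1 and <lam, x> = ||x||."""
    return abs(linf_norm(lam) - 1.0) <= tol and abs(pair(lam, x) - l1_norm(x)) <= tol

# Candidate functionals (1, beta) with beta in {-1, -0.75, ..., 1}.
candidates = [[1.0, b / 4.0] for b in range(-4, 5)]

corner = [1.0, 0.0]       # a corner of the l1 unit ball
edge = [0.5, 0.5]         # a point in the interior of an edge

at_corner = [lam for lam in candidates if in_sigma(lam, corner)]
at_edge = [lam for lam in candidates if in_sigma(lam, edge)]
print(len(at_corner), at_edge)
```

All nine candidates support the ball at the corner, so the space is not smooth there, whereas only (1, 1) among the candidates survives at the edge point.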
Theorem 3.27. Let X be a reflexive Banach space with the duality map J . Then the
following conditions are equivalent:
(a) J is single-valued.
(b) X ∗ is strictly convex.
(c) X is Gateaux differentiable.
(d) X is smooth.
Proof:
(a) =⇒ (b) We know #Σ(x) = 1 for all x ∈ S1 . Let λ, µ ∈ X ∗ with
By Lemma 3.22 there exist λ1, λ2 ∈ X∗ with ⟨λ1, z⟩ ≠ ⟨λ2, z⟩, ‖λ1‖ = ‖λ2‖ = 1, and
ν′₋(x, y) ≤ ⟨λ1, y⟩ ≤ ν′₊(x, y) , ν′₋(x, y) ≤ ⟨λ2, y⟩ ≤ ν′₊(x, y) , for all y ∈ X .
Then
‖λ1 + λ2‖ ≥ ⟨λ1 + λ2, x⟩ = ⟨λ1, x⟩ + ⟨λ2, x⟩ ≥ ν′₋(x, x) + ν′₋(x, x) = 2
and we conclude that X∗ is not strictly convex by Lemma 3.14.
(c) ⇐⇒ (d) See Theorem 3.26.
(d) =⇒ (a) If #J(x) = #Σ(x) ≥ 2 for x ∈ S1 then X cannot be smooth at x. □
Without reflexivity the implication (d) =⇒ (b) in Theorem 3.27 is not true. One
obtains a counterexample by renorming l1 such that the resulting space l1_r is strictly
convex and (l1)∗ = l∞ = (l1_r)∗; see for instance [14].
3.4 Characterization of best approximations in
Banach spaces
Now, we want to derive similar results as in Section 3.1 for the case that the approximation
problem is formulated in a Banach space. First of all, we introduce the tangential and
normal cone of a nonempty closed convex set.
Definition 3.28. Let X be a Banach space, let C be a nonempty closed convex subset of
X and let z ∈ C .
Clearly, T(z, C), N(z, C) are nonempty closed convex cones. One can easily see that
T(z, C) is the smallest closed convex cone which contains all vectors u − z, u ∈ C. The
normal cone is the polar cone of the tangential cone. For interior points z of C both cones
are not very interesting: T(z, C) = X, N(z, C) = {θ}.
As we know from Kolmogorov's criterion, if x ∈ X and w ∈ C then
This form of Kolmogorov's criterion will now be transferred to the Banach space case. The
way we do that is to use optimality conditions for convex programs.
Let X be a Banach space, let C be a nonempty closed convex subset of X and let
x ∈ X . The approximation problem may be reformulated as follows:
such that
epi(G) ⊂ H+ := {(y, s) : ⟨λ, y − y0⟩ + G(x0) ≤ s} .
This leads us to the subdifferential of a convex function.
Definition 3.29. Let X be a Banach space and let F : X −→ R be convex. If y0 ∈
dom(F) := {y : F(y) < ∞} then the set
∂F : X ⇒ X ∗ .
Let us come back to the minimization problem (3.7). Suppose that w ∈ C belongs to
dom(g) , g := f + δC . Then we have the obvious observation that
Clearly,
⟨µ, h⟩ ≤ ‖x − w‖ + ‖h‖ − ‖x − w‖ = ‖h‖ for all h ∈ X
and
⟨µ, x − w⟩ ≤ ‖x − x‖ − ‖x − w‖
which implies ‖µ‖ = 1 since x − w ≠ θ. Setting h := w − x in (10.27) we obtain
Theorem 3.32. Let X be a Banach space and let C be a nonempty closed convex subset
of X . Let x ∈ X , w ∈ C . Then the following conditions are equivalent:
(a) w ∈ PC (x) .
(b) J(x − w) ∩ N(w, C) ≠ ∅.
Proof:
If x − w = θ then w ∈ P_C(x) and λ := θ ∈ J(x − w) ∩ N(w, C). If x − w ≠ θ then we have
shown above w ∈ P_C(x) iff Σ(x − w) ∩ N(w, C) ≠ ∅. But as we know, J(x − w) = {‖x − w‖µ :
µ ∈ Σ(x − w)} and therefore J(x − w) ∩ N(w, C) ≠ ∅ iff Σ(x − w) ∩ N(w, C) ≠ ∅. □
In Lemma 3.20 we have shown that in a Banach space X the Gateaux derivative ν′(x, ·)
at the point x ∈ X\{θ} belongs to Σ(x) (when it exists). Hence, ‖x‖ν′(x, ·) ∈ J(x).
Theorem 3.33 (Asplund, 1966). Let X be a Banach space and let j : X ∋ x ⟼ ½‖x‖² ∈
R. Then
∂j(x) = J(x) for all x ∈ X . (3.16)
Proof:
We follow [1]. Notice that ∂j(x) is nonempty for all x ∈ X since j is continuous. Moreover,
for x = θ nothing has to be proved. Let x ∈ X\{θ}.
Let λ_x ∈ ∂j(x). Choose any y ∈ S_{‖x‖}. Then ⟨λ_x, y − x⟩ ≤ 0, i.e. ⟨λ_x, y⟩ ≤ ⟨λ_x, x⟩. This
implies
‖λ_x‖‖x‖ = sup_{y∈S_{‖x‖}} ⟨λ_x, y⟩ ≤ ⟨λ_x, x⟩ .
Since ⟨λ_x, x⟩ ≤ ‖λ_x‖‖x‖, we obtain
‖λ_x‖‖x‖ = ⟨λ_x, x⟩ for all λ_x ∈ ∂j(x), for all x ∈ X\{θ} . (3.17)
Let z ∈ S1 and let t > 0, s > 0. Let λ_{sz} ∈ ∂j(sz). Then using (3.17) we obtain
½t² − ½s² ≥ ⟨λ_{sz}, tz⟩ − ‖λ_{sz}‖‖sz‖
= ⟨λ_{sz}, tz⟩ − ‖λ_{sz}‖‖sz‖ + ‖λ_{sz}‖‖tz‖ − ‖λ_{sz}‖‖tz‖
= ‖λ_{sz}‖(t − s) + (⟨λ_{sz}, tz⟩ − ‖λ_{sz}‖‖tz‖)
= ‖λ_{sz}‖(t − s) + t(⟨λ_{sz}, sz⟩ − ‖λ_{sz}‖‖sz‖)/s
≥ ‖λ_{sz}‖(t − s)
This shows the property λ_w ∈ J(w) on the ray from θ through z. From this we conclude
that λ_x ∈ J(x) for all λ_x ∈ ∂j(x).
Conversely, suppose that λ_x ∈ J(x). Then by the definition of J(x)
½‖x‖² + ⟨λ_x, y − x⟩ ≤ ½‖x‖² + ‖λ_x‖(‖y‖ − ‖x‖)
and
½‖x‖² + ‖λ_x‖(‖y‖ − ‖x‖) ≤ ½‖x‖² + ‖λ_x‖‖y‖ − ‖x‖² ≤ ½‖x‖² + ½‖λ_x‖² + ½‖y‖² − ‖x‖² = ½‖y‖² .
This shows λ_x ∈ ∂j(x). □
Now, we know the subdifferential of the norm mapping ν : X ∋ x ⟼ ‖x‖ ∈ R and of
j : X ∋ x ⟼ ½‖x‖² ∈ R:
∂ν(θ) = B1 (3.18)
∂ν(x) = Σ(x) if x ≠ θ (3.19)
∂j(x) = J(x) (3.20)
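The inclusion J(x) ⊂ ∂j(x) amounts to the subgradient inequality j(y) ≥ j(x) + ⟨λ, y − x⟩ for λ ∈ J(x), which can be probed numerically. The Python sketch below is a hedged illustration: it assumes X = R³ with the l^p-norm for p = 1.5, uses the componentwise formula J(x) = ‖x‖_p^{2−p} |x|^{p−1} sign(x), and evaluates the gap j(y) − j(x) − ⟨J(x), y − x⟩ at randomly sampled points y. The function names are hypothetical, and random sampling can only fail to find a violation, not prove the inequality.

```python
import random

def p_norm(x, p):
    """l^p norm of a finite sequence, 1 < p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def sign(t):
    """Sign of a real number as -1, 0 or 1."""
    return (t > 0) - (t < 0)

def duality_map(x, p):
    """Componentwise J(x) = ||x||_p^(2-p) * |x|^(p-1) * sign(x); J(0) = 0."""
    n = p_norm(x, p)
    if n == 0.0:
        return [0.0 for _ in x]
    return [n ** (2.0 - p) * abs(t) ** (p - 1.0) * sign(t) for t in x]

def j(x, p):
    """j(x) = 0.5 * ||x||_p^2."""
    return 0.5 * p_norm(x, p) ** 2

def subgradient_gap(x, y, p):
    """j(y) - j(x) - <J(x), y - x>; nonnegative if J(x) is a subgradient of j at x."""
    jx = duality_map(x, p)
    return j(y, p) - j(x, p) - sum(a * (b - c) for a, b, c in zip(jx, y, x))

rng = random.Random(1)                   # fixed seed, reproducible samples
p = 1.5
x = [0.7, -1.2, 2.0]
gaps = [subgradient_gap(x, [rng.uniform(-3.0, 3.0) for _ in x], p) for _ in range(2000)]
print(min(gaps) >= -1e-9, subgradient_gap(x, x, p))
```

The gap vanishes at y = x, where the supporting affine function touches the graph of j.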
Definition 3.34. Let X be a real vector space⁴. A mapping [·|·] : X × X −→ R is called
a semi-inner product if the following properties are satisfied:
is a norm on X .
Proof:
This can easily be shown. For convenience we verify the triangle inequality. Let x, y ∈ X
with ‖x + y‖ ≠ 0; the case ‖x + y‖ = 0 is trivial. Then
λ(y) := [y|x] ∈ R
is a functional in X∗ with ‖λ‖ = ‖x‖ when X is endowed with the norm [·|·]^{1/2} (see Corollary
3.35).
Proof:
The linearity follows from the linearity of [·|·] in the first argument. From |λ(y)| = |[y|x]| ≤
[x|x]^{1/2} [y|y]^{1/2} we conclude ‖λ‖ ≤ [x|x]^{1/2} = ‖x‖ and hence, λ is continuous. On the other
hand, we have [x|x] = |λ(x)| ≤ ‖λ‖[x|x]^{1/2}, i.e. ‖λ‖ ≥ [x|x]^{1/2} = ‖x‖. □
Definition 3.37. Let X be a Banach space with norm ‖·‖ and endowed with a semi-inner
product [·|·]. Then we say that [·|·] generates the norm if ‖x‖ = [x|x]^{1/2} for all x ∈ X. □
Definition 3.38. Let X be a Banach space and let J : X ⇒ X∗ be the duality map. Then
a mapping J̃ : X −→ X∗ with J̃(x) ∈ J(x), x ∈ X, is called a section of J. Such a section
J̃ is called a homogeneous section if the mapping x ⟼ J̃(x) is homogeneous, i.e.
J̃(ax) = aJ̃(x), x ∈ X, a ∈ R. □
⁴ Semi-inner products can be defined for vector spaces with complex scalars. The homogeneity property
has to be adapted.
Lemma 3.39. Let X be a Banach space with duality map J . Then the following statements
hold:
Proof:
Ad (a) Let z ∈ S1. Choose (exactly one) λ_z ∈ X∗ with ‖λ_z‖ = 1, ⟨λ_z, z⟩ = 1. Then
λ_z ∈ Σ(z). Now, if x = az, a ∈ R, z ∈ S1, set λ_x := aλ_z.
Let x, y ∈ X, x ≠ θ, r ∈ R, r ≠ 0. Then for r > 0
Moreover,
J̃(x) = λx , x ∈ X .
Ad (b) From the proof in (a) it is clear that homogeneous sections are not uniquely
determined if #J(x) ≥ 2 for some x ∈ X \{θ} .
Ad (c) Let J̃ be a homogeneous section of J . Define the mapping
Then [·|·] is linear in the first argument and homogeneous in each argument. We have
Then
⟨J̃(x), x⟩ = [x|x] = ‖x‖² , ‖J̃(y)‖ = ‖x‖ , x, y ∈ X ,
Consequently, we have
Lemma 3.39 says that every Banach space can be endowed with a semi-inner product.
Corollary 3.40. Let X be a smooth Banach space. Then the duality map is single-valued
and the uniquely determined semi-inner product which generates the norm is given by
Proof:
We know that in a smooth Banach space the duality map is single-valued; see Theorem
3.27. Therefore by Lemma 3.39 the semi-inner product is uniquely determined. □
Example 3.41. Consider the Banach space X := R³ endowed with the l1-norm. It is
easy to check that by setting
[x|y] := ‖y‖₁ Σ_{k=1, y_k≠0}^{3} x_k y_k / |y_k| , x = (x1, x2, x3), y = (y1, y2, y3) ,
we obtain a semi-inner product which generates the norm.
With the help of a semi-inner product which generates the norm in a Banach space we
can reformulate the property of strict convexity.
Theorem 3.42. Let X be a Banach space and let [·|·] be a semi-inner product which
generates the norm in X . Then the following statements are equivalent:
Proof:
(a) =⇒ (b) Let J̃ be a homogeneous section of J which generates the semi-inner product.
Let x, y ∈ X\{θ} with [x|y] = ‖x‖‖y‖. Then [x|y] = ⟨J̃(y), x⟩ = ‖x‖‖y‖. This implies
⟨J̃(y), x/‖x‖⟩ = ‖y‖ = ‖J̃(y)‖ , ⟨J̃(y), y/‖y‖⟩ = ‖y‖ = ‖J̃(y)‖ .
Since X is strictly convex, every continuous linear functional attains its norm at at most
one point of S1; see Lemma 3.14. This implies
x/‖x‖ = y/‖y‖
and we have x = ay with a = ‖x‖‖y‖⁻¹.
(b) =⇒ (a) Let x, y ∈ X\{θ} with ‖x + y‖ = ‖x‖ + ‖y‖. Then
Suppose that in (3.23) one of the inequalities above is strict. Then, by addition, we obtain
Let x, y ∈ X, t > 0, x ≠ θ, x + ty ≠ θ.
(d) The duality map of X is single-valued.
(e) limt→0 [y|x + ty] exists for all x ∈ X \{θ}, y ∈ X .
Additionally, if one of the above statements is true, the limit in (e) is given by [y|x].
Proof:
Let J be the duality map of X and let J̃ be a homogeneous section which generates
[·|·].
(a) ⇐⇒ (b) Theorem 3.26.
(a) =⇒ (c) If X is smooth, #J(x) = 1 for all x ∈ X; see Theorem 3.26. Therefore there
exists exactly one homogeneous section of J which generates the semi-inner product.
(c) =⇒ (d) Obviously, there exists only one homogeneous section of J and by Lemma
3.39, #J(x) = 1 for all x ∈ X.
(d) =⇒ (e) Let x ∈ X\{θ}, y ∈ X.
Let (t_n)n∈N be a sequence in (0, ∞) with lim_n t_n = 0. Then (J̃((x + t_n y)‖x + t_n y‖⁻¹))n∈N
is a sequence in the unit ball of X∗. By Lemma 3.12, w∗−lim_n J̃((x + t_n y)‖x + t_n y‖⁻¹) =
J̃(x‖x‖⁻¹). This shows
□
Theorem 3.46. Let X be a Banach space and let [·|·] : X × X −→ R be a semi-inner
product on X which generates the norm. Let C be a nonempty closed convex set and
x ∈ X , w ∈ C . Then the following statements are equivalent:
(a) w ∈ PC (x) .
(b) [u − w|x − w − t(u − w)] ≤ 0 for all u ∈ C and t ∈ [0, 1].
(c) [u − w|x − w] ≤ 0 for all u ∈ C.
Proof:
(a) =⇒ (b) Assume by contradiction that [u − w|x − w − t(u − w)] > 0 for some u ∈ C
and t ∈ [0, 1]. Clearly, u ≠ w. Set z := tu + (1 − t)w. Then z ∈ C and we have
This is a contradiction to the best approximation property of w.
(b) =⇒ (c) Take t := 0 in (b).
(c) =⇒ (a) We have for all u ∈ C:
‖x − w‖² = [x − w|x − w]
= [x − u|x − w] + [u − w|x − w]
≤ [x − u|x − w]
≤ ‖x − u‖‖x − w‖ .
Hence ‖x − w‖ ≤ ‖x − u‖ for all u ∈ C, i.e. w ∈ P_C(x). □
Proof:
See Dragomir [9]. □
[·, ·]+, [·, ·]− may be used to develop Kolmogorov-like results for best approximations in
Banach spaces. This is done in [15].
A semi-inner product [·|·] in a real vector space X may be used to introduce an orthogonality
relation via "x orthogonal to y" if and only if [y|x] = 0. This is not a symmetric
relation and transitivity of the orthogonality of vectors is not guaranteed in general.
Example 3.50. Consider the space R³ endowed with the l1-norm; see Example 3.41.
Then the semi-inner product
[x|y] := ‖y‖₁ Σ_{i=1, y_i≠0}^{3} |y_i|⁻¹ x_i y_i
generates the norm. Consider x := (−2, 1, 0), y := (1, 1, 0). Then we have [y|x] = 0,
[x|y] = −2 and we see that x is orthogonal to y but y is not orthogonal to x. □
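The numbers in this example are quickly reproduced. The following Python sketch is a minimal illustration, assuming the semi-inner product [x|y] = ‖y‖₁ Σ_{y_i≠0} x_i y_i/|y_i| on R³ with the l1-norm as stated in the example; the function name sip is hypothetical.

```python
def l1_norm(x):
    """l1 norm on R^n."""
    return sum(abs(t) for t in x)

def sip(x, y):
    """Semi-inner product [x|y] = ||y||_1 * sum over y_i != 0 of x_i * y_i / |y_i|."""
    return l1_norm(y) * sum(xi * yi / abs(yi) for xi, yi in zip(x, y) if yi != 0)

x = [-2.0, 1.0, 0.0]
y = [1.0, 1.0, 0.0]

print(sip(y, x))                        # [y|x]
print(sip(x, y))                        # [x|y]
print(sip(x, x), l1_norm(x) ** 2)       # [x|x] versus ||x||_1^2
```

The first two values differ, which is exactly the lack of symmetry of the orthogonality relation, and the last line confirms that the semi-inner product generates the l1-norm at x.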
There are various other concepts of orthogonality in normed spaces. In general, they
are modeled along a property which holds in inner product spaces. Let (X, ⟨·|·⟩) be an
inner product space. Here are three properties in an inner product space which may serve
as a model for orthogonality in normed spaces:
(2) x⊥y ⇐⇒ ‖x + y‖ = ‖x − y‖
x is orthogonal to y iff ⟨J(y), x⟩ = 0 ,
where we suppose that J is single-valued. This is the concept which corresponds to semi-inner
products; see above. We denote this orthogonality as J-orthogonality.
Orthogonality in the sense of Birkhoff in normed spaces is derived from (3):
In general, Birkhoff-orthogonality is not a symmetric property either, i.e. x⊥y does not
imply y⊥x. Nevertheless, Birkhoff-orthogonality may be used to generalize orthogonal
projections from Hilbert spaces to general Banach spaces; see for instance [9, 12].
3.6 Appendix: Convexity II
We want to consider the following optimization problem:
We set
f∗∗(x) := sup_{λ∈X∗} (⟨λ, x⟩ − f∗(λ)) , x ∈ X,
and call f∗∗ the double conjugate of f . One has the remarkable duality (see [2])
The most important theorem concerns the Fenchel-Rockafellar duality which is the
content of the following theorem. To formulate this duality we associate to the primal
problem (10.33) a dual problem using the conjugates of f and g:
and we set
p := inf_{x∈X} (f(x) + g(x)) , d := sup_{λ∈X∗} (−(f∗ + g∗)(−λ)) ;
p ∈ R̄, d ∈ R̄ are the values of the primal and dual problem, respectively.
Corollary 3.52 (Weak duality). Let X be a Banach space and let f, g : X −→ (−∞, ∞]
be convex. Then p ≥ d .
Remark 3.53. In case of d < p we say that there exists a duality gap. Without further
assumptions we cannot guarantee that there is no duality gap. If there is no duality gap
the identity p = d can be used to find a lower bound for p since we have
Theorem 3.54 (Fenchel, 1949, Rockafellar, 1966). Let X be a Banach space and let
f, g : X −→ (−∞, ∞] be convex. We assume that one of the functions f, g is continuous in
some point x0 ∈ dom(f) ∩ dom(g). Then
Remark 3.55. The duality asserts that in the presence of a so-called primal constraint
qualification such as "f (or g) is continuous at a point x0 ∈ dom(f) ∩ dom(g)" one has
p := inf_{x∈X} (f(x) + g(x)) = d := sup_{λ∈X∗} (−(f∗ + g∗)(−λ))
and the supremum for d is a maximum. If we add a dual constraint qualification such
as "f∗ (or g∗) is continuous at a point λ0 ∈ dom(f∗) ∩ dom(g∗)" one has that the infimum
for p is a minimum and we have that the primal and the dual problem are solvable and
that there exists no duality gap. □
One can show directly from the definition, using the Hahn-Banach theorem, that J(x) is a
nonempty, weak∗-closed convex subset of X∗ for each x ∈ X.
Lemma 3.57. Let X be a Banach space and let j : X −→ R be the function defined by
x ⟼ ½‖x‖². Then
∂j(x) = J(x) , x ∈ X . (3.33)
Proof:
Let µ ∈ ∂j(x). It suffices to consider the case x ≠ θ since ∂j(θ) = {θ} and J(θ) = {θ}.
Then
½‖y‖² ≥ ½‖x‖² + ⟨µ, y − x⟩ , y ∈ X . (3.34)
Let t > 0 and u ∈ X be arbitrary, and replace in (3.34) y by x + tu:
t⟨µ, u⟩ ≤ ½‖x + tu‖² − ½‖x‖² ≤ t‖x‖‖u‖ + ½t²‖u‖² .
Dividing through by t and passing to the limit t ↓ 0 we obtain ⟨µ, u⟩ ≤ ‖x‖‖u‖. Since u
was arbitrary this implies ‖µ‖ ≤ ‖x‖. On the other hand, let y = tx in (3.34):
½(t² − 1)‖x‖² ≥ (t − 1)⟨µ, x⟩ . (3.35)
For 0 < t < 1 we obtain from (3.35) ½(t + 1)‖x‖² ≤ ⟨µ, x⟩. Letting t → 1 we have
⟨µ, x⟩ ≥ ‖x‖², completing the proof that µ ∈ J(x).
For the converse, assume now that µ ∈ J(x). Let us estimate the right-hand side in
(3.34) using the properties of µ:
⟨µ, y⟩ − ⟨µ, x⟩ + ½‖x‖² ≤ ‖x‖‖y‖ − ‖x‖² + ½‖x‖² ≤ ½‖y‖² .
□
Definition 3.58. Let X be a Banach space and let C be a subset of X. Then the set
□
Lemma 3.60. Let X be a Banach space and let f, g ∈ Γ0(X), h := f □ g. Then
(a) (f + g)∗ = f∗ □ g∗.
(b) For all λ ∈ X∗ there exists µ ∈ X∗ with (f + g)∗(λ) = f∗(µ) + g∗(λ − µ).
(c) ∂(f + g) = ∂f + ∂g.
Proof:
11.) Let X be a Banach space with single-valued duality mappings J : X ⇒ X∗, J∗ :
X∗ ⇒ X. If J is strongly monotone, i.e.
cos(φ) = (g(x, y) + g(y, x)) / (2‖x‖‖y‖)
(c) If X is a Hilbert space H then the angle above is the usual angle in inner product
spaces.
13.) Let H be a Hilbert space and let C be a nonempty closed convex subset of H .
Suppose that U : H −→ H is a surjective linear isometry. Then U(C) is a
nonempty closed convex subset of H and PU(C) = U ◦ PC ◦ U∗ .
14.) Let H be a Hilbert space and let C be a nonempty closed convex cone of H . The
set C° := {x ∈ H : ⟨x|y⟩ ≤ 0 for all y ∈ C} is called the polar cone of C. Verify
P_{C°} = I − P_C.
15.) Show x = PC (x + z) if z ∈ N(x, C) .
16.) Let H be a Hilbert space and let S ⊂ H be a nonempty set. S is called a sun if
for all x ∈ H the following assertion holds:
Let C ⊂ H be a nonempty Chebyshev set. Prove the equivalence
of the following conditions:
(a) C is convex.
(b) C is a sun.
(c) PC is a nonexpansive mapping.
17.) Let X be a Banach space and let [·|·] be a semi-inner product which generates the
norm in X . Then the following statements are equivalent:
(a) X is smooth.
(b) lim_{t→0} ([x|x + ty] − ‖x‖²)/t exists for all x ∈ X\{θ}, y ∈ X.
18.) Give an elementary proof of the fact that each convex function f : R −→ R is
continuous.
Bibliography
[1] E. Asplund. Positivity of duality mappings. Amer. Math. Soc., 73:200–203, 1966.
[2] J. Baumeister. Funktionalanalysis, 2013. Skriptum einer Vorlesung, Goethe–Universität
Frankfurt/Main.
[3] A. Beurling and A.E. Livingston. A theorem on duality mappings in Banach spaces. Ark.
Mat., 4:405–411, 1962.
[4] H. Brézis, M.G. Crandall, and A. Pazy. Perturbations of nonlinear maximal monotone sets
in Banach spaces. Comm. Pure Appl. Math., 23:123–144, 1970.
[5] F. Browder. Nonlinear operators and nonlinear equations of evolution in Banach spaces.
Proc. Symp. Pure and Appl. Math. 18. Providence, 1976.
[6] E.W. Cheney. Introduction to approximation theory, 2nd ed. AMS publishing, Providence,
1982.
[7] I. Cioranescu. Geometry of Banach Spaces, Duality Mappings and Nonlinear Problems.
Kluwer, Dordrecht, 1990.
[8] F. Deutsch. Best approximation in inner product spaces. Springer, New York, 2001.
[9] S.S. Dragomir. Semi-inner products and applications. Nova Science, Hauppauge, 2004.
[10] J.R. Giles. Classes of semi-inner-product spaces. Trans. of Amer. Math. Soc., 129:436–446,
1967.
[13] G. Lumer. Semi-inner product spaces. Trans. Amer. Math. Soc., 100:29–43, 1961.
[14] R.E. Megginson. An introduction to Banach space theory. Springer, New York, 1998.
[15] J.-P. Penot and R. Ratsimahalo. Characterizations of metric projections in Banach spaces
and applications. 1998. Preprint.
[16] I. Rosca. Semi-produits scalaires et représentation du type Riesz pour les fonctionnelles
linéaires et bornées sur les espaces normés. C. R. Acad. Sci. Paris, 283 (19), 1976.
[17] I. Singer. The theory of best approximation and functional analysis, volume 13 of Series in
applied mathematics. SIAM, Philadelphia, 1974.