
4468 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 10, OCTOBER 2012
Fig. 2. Hierarchical structure formed by three pattern levels (i.e., point,
subspace, and manifold) in face recognition.
by a large number of samples). See Fig. 2 for an illustration.
In some sense, the core of pattern classification is the distance
computation among these representations. The distances over
point and subspace have been well studied in the literature;
whereas very few studies have been done on the distance
related to manifold.
A. Distances Over Point and Subspace
Hereinafter we always denote points by x
i
, y
i
, subspaces
by S
i
, and manifolds by M
i
. The distances over point and
subspace include the following three ones:
1) Point to Point Distance (PPD): denote by d(x
1
, x
2
) the
distance from point x
1
to x
2
. The most commonly used PPD
is the Euclidean distance as follows:
d(x
1
, x
2
) =
x
1
− x
2
. (1)
2) Point to Subspace Distance (PSD): denote by d(x, S) the
distance from point x to subspace S. It is generally defined as
the so-called L2-Hausdorff distance:
d(x, S) = min
y∈S
x − y
=
x − x
. (2)
In fact, x
is the projection of x in the subspace S,alsothe
nearest point to x in S. Thus, the PSD is actually the PPD
from x to its projection x
in S. It is also known as “distance-
from-feature-space” (DFFS) in [23].
3) Subspace to Subspace Distance (SSD): denote by
d(S
1
, S
2
) the distance between two subspaces S
1
and S
2
.
While there is not a unified definition yet to measure the SSD,
the concept of principal angles [18], [19] is perhaps the most
commonly exploited one due to its favorable performance.
Recently, another SSD is proposed in [24], which can be
regarded as utilizing the sum of DFFS between the bases of
two subspaces.
As known in linear algebra, the single point x
i
spans a
special linear subspace, i.e., the trivial zero subspace L{0},
which is centered on x
i
and of zero dimensional. In this sense,
both PPD and PSD are special cases of SSD.
B. Distances Over Manifold
Our main motivation arises from the fact that local lin-
earity holds everywhere on a globally nonlinear manifold.
Thus, a manifold can be modeled by a collection of local
linear models, each depicted by a subspace [25]. In general,
manifold can be viewed as extending subspace to account
for more general and complex data variations. The distances
associated with manifold are then related to those defined on
subspace. Formally, we denote the i-th component subspace
of a manifold M by C
i
, and express M as a set containing all
the C
i
:
M ={C
i
: i = 1, 2,...,m}={C
1
, C
2
,...,C
m
}. (3)
where m is the number of local linear subspaces.
1) Point to Manifold Distance (PMD): denote by d(x, M)
the distance from point x to manifold M. Similar to PSD, one
can define this distance by finding the closest point to x in M
as follows:
d(x, M)= min
C
i
∈M
d(x, C
i
)= min
C
i
∈M
min
y∈C
i
x−y
=
x−x
. (4)
In analogy to x
in the PSD, here we call x
the projection of
x in the manifold M.
2) Subspace to Manifold Distance (SMD): denote by
d(S, M) the distance from subspace S to manifold M. It can
be defined by seeking the closest subspace to S in manifold
M:
d(S, M) = min
C
i
∈M
d(S, C
i
). (5)
It comes that SMD is reduced to SSD in a simple manner
similar to that from PSD to PPD.
3) Manifold to Manifold Distance (MMD): denote by
d(M
1
, M
2
) the distance between two manifolds M
1
and
M
2
. With the local linear model representation in (3), MMD
can be converted to integrating the distances between pair of
subspaces respectively from one of the involved manifolds.
See Fig. 3 for a conceptual illustration.
Formally, given two manifolds M
1
={C
i
: i =
1, 2,...,m}, M
2
={C
j
: j = 1, 2,...,n}, we formulate
MMD as follows:
d(M
1
, M
2
) =
m
i=1
n
j=1
f
ij
d
C
i
, C
j
,
s.t.
m
i=1
n
j=1
f
ij
= 1, f
ij
≥ 0. (6)
In this general formulation, MMD comes in the form of a
weighted average of pairwise SSDs, i.e., d(C
i
, C
j
).
It has been figured out that point is a special case of
subspace. Similarly, subspace can be viewed as a special case
of manifold under the formulation in (3). Therefore, the three
pattern levels form a hierarchical structure and all the six
distances can be formulated in a general multi-level MMD
framework.
III. M
ANIFOLD–MANIFOLD DISTANCE
From Fig. 3 and (6), one can find that there are three key
ingredients in MMD: (i) local linear model construction, i.e.,
the component subspaces C
i
, C
j
, (ii) local model distance
measure, i.e., the SSD d(C
i
, C
j
), and (iii) global integration
of local distances, i.e., the choice of the weights f
ij
.Inthis
section, we present details of these ingredients and extensive
investigations on their various configurations.