Biomedical Image Segmentation Guide
TOPICS IN BIOMEDICAL ENGINEERING
INTERNATIONAL BOOK SERIES
Series Editor: Evangelia Micheli-Tzanakou
Rutgers University
Piscataway, New Jersey
Handbook of Biomedical
Image Analysis
Volume II: Segmentation Models Part B
Edited by
Jasjit S. Suri
Department of Biomedical Engineering
Case Western Reserve University
Cleveland, Ohio
David L. Wilson
Department of Biomedical Engineering
Case Western Reserve University
Cleveland, Ohio
and
Swamy Laxminarayan
Institute of Rural Health
Idaho State University
Pocatello, Idaho
Acknowledgments
This book is the result of a collective endeavor by several noted engineers,
computer scientists, mathematicians, medical doctors, physicists, and
radiologists. The editors are indebted to all of them for their efforts and
outstanding scientific contributions. The editors are particularly grateful to
Drs. Petia Reveda, Alex Falco, Andrew Laine, David Breen, David Chopp, C. C. Lu,
Gary Christensen, Dirk Vandermeulen, Aly Farag, Alejandro Frangi, Gilson Antonio
Giraldi, Gabor Szekely, Pierre Hellier, Gabor Herman, Ardeshir Coshtasby, Jan
Kybic, Jeff Weiss, Jean-Claude Klein, Majid Mirmehdi, Maria Kallergi, Yangming
Zhu, Sunanda Mitra, Sameer Singh, Alessandro Sarti, Xioping Shen, Calvin R.
Maurer, Jr., Yoshinobu Sato, Koon-Pong Wong, Avdhesh Sharma, Rakesh Sharma, and
Chun Yuan and their team members for working with us so closely in meeting all
of the deadlines of the book. We would like to express our appreciation to Kluwer
Publishers for helping create this invitational handbook. We are particularly
thankful to Aaron Johnson, the acquisition editor, and Shoshana Sternlicht for
their excellent coordination of the book at every stage.
Dr. Suri thanks Philips Medical Systems, Inc., for the MR datasets and
encouragement during his experiments and research. Special thanks are due to
Dr. Larry Kasuboski and Dr. Elaine Keeler from Philips Medical Systems, Inc., for
their support and motivation. Thanks are also due to his past Ph.D. committee
research professors, particularly Professors Linda Shapiro, Robert M. Haralick,
Dean Lytle, and Arun Somani, for their encouragement.
We extend our appreciation to Dr. Ajit Singh, Siemens Medical Systems;
Dr. George Thoma, chief, Imaging Science Division, National Institutes
of Health; and Dr. Sameer Singh, University of Exeter, UK, for their motivation.
Preface
presence of poor image quality or image artifacts. A novel method for automatic
dilation assessment based on a global image analysis strategy is presented. We
model interframe arterial dilation as a superposition of a rigid motion model
and a scaling factor perpendicular to the artery. Rigid motion can be interpreted
as a global compensation for patient and probe movements, an aspect that has
not been sufficiently studied before. The scaling factor explains arterial dilation.
The ultrasound (US) sequence is analyzed in two phases using image registra-
tion to recover both transformation models. Temporal continuity in the regis-
tration parameters along the sequence is enforced with a Kalman filter since
the dilation process is known to be a gradual physiological phenomenon. Com-
paring automated and gold standard measurements we found a negligible bias
(0.04(1.14measurements (bias = 0.47 better reproducibility (CV = 0.46
Chapter 6 starts from the observation that the assessment of onset and
progression of diseases from images of various modalities is critically dependent
on identification of lesions or changes in structures and regions of interest.
Mathematical modeling of such discrimination among regions, as well as
identification of changes in anatomical structures in an image, results from the
process of segmentation.
For clinical applications of segmentation, a compromise between the accuracy
and computational speed of segmentation techniques is needed. Optimal seg-
mentation processes based on statistical and adaptive approaches and their
applicability to clinical settings have been addressed using diverse modalities
of images. Current drawbacks of automated segmentation methodologies stem
mostly from nonuniform illumination, inhomogeneous structures, and the pres-
ence of noise in acquired images. The effect of preprocessing on the accuracy of
segmentation has been discussed. The superior performance of advanced clus-
tering algorithms based on statistical and adaptive approaches over traditional
algorithms in medical image segmentation has been presented.
Chapter 7 presents the automatic analysis of color fundus images and
its application to the diagnosis of diabetic retinopathy, a severe and frequent eye
disease. We give an overview of computer assistance in this domain and describe
in detail some algorithms developed within this framework: the detection of
main features in the human eye (vascular tree, the optic disc, and the macula)
and the detection of retinal lesions like microaneurysms and hard exudates.
Chapter 8 addresses advanced atherosclerotic plaque, which can lead to diseases
such as vessel lumen stenosis, thrombosis, and embolization, the leading causes
of death and major disability among adults in the United States. Previous
studies have shown that plaque constituents are important determinants for
plaque vulnerability and stenosis risk assessment. To identify and
quantitatively measure the composition of atherosclerotic lesions in carotid
arteries, plaque segmentation techniques are discussed in this chapter. First, to
extract the lumen contour and outer wall boundary of the carotid artery
accurately, active-contour-based boundary detection methods are discussed,
including the design of the energy terms and optimizations of the searching
process. The second part covers region-based image segmentation techniques, such
as Markov random fields, and their application to image sequence processing. In
recent studies, plaque component identification with multiple-contrast-weighted
MR images has shown more promising results than with single-contrast-weighted
images. The third part introduces multiple-contrast-weighted MR image
segmentation methods and their validation by comparison with histology images.
Finally, a software package developed specifically for the quantitative
analysis of atherosclerotic plaque by MRI, the quantitative vascular analysis
system (QVAS), is presented.
Chapter 9 starts from the observation that pre- and postcontrast Gd-DTPA¹ MR
images of any body organ hold diagnostic utility in medicine, particularly for
breast lesion characterization. The chapter reviews the state-of-the-art tools and
techniques for lesion characterization, such as uptake curve estimation
(functional segmentation), image subtraction, velocity thresholding, differential
characteristics of lesions (such as maximum derivative of the image sequence,
steep slope, and washout), fuzzy clustering, Markov random fields, and interactive
deformable models such as Live-Wire. The first part discusses the MRI system
and breast coils along with the MR breast data acquisition protocol for spatial and
temporal MR data collection. Then the perfusion analysis tools for staging breast
tumors are discussed. Here, the rate of absorption of the contrast agent (Gad) is
used to stage a breast lesion. Differences in contrast enhancement have been
shown to help differentiate between benign and malignant lesions.
Thus, oncologists, radiologists, and internists have shown great interest in such
classification by examining the quantitative characteristics of the tissue signal
enhancement. Two other major sets of tools for breast lesion characterization are
then discussed. The first set is based on pixel-classification algorithms and
the second on user-based deformable models such as Live-Wire.
Finally, the chapter presents the user-friendly Marconi Medical System’s
real-time MR Breast Perfusion² Software Analysis System (BPAS), based on Motif
using C/C++ and X window libraries, which runs on Digital Unix and XP1000
workstations supporting Unix and Linux operating systems, respectively. This
software was tested on 20 patient studies from data collected at two major
sites in the United States and Europe.
¹ Gadolinium-Diethylene-Triamine-Penta-Acetate; we will refer to it as Gad from now on.
Chapter 10 presents methods for the enhancement and segmentation of 3D
local structures, that is, line-like shapes such as blood vessels, sheet-like shapes
such as articular cartilage, and blob-like shapes such as nodules in medical
volume data. Firstly, a method for enhancement of 3D local structures with
various widths is presented. Multiscale Gaussian filters and the eigenvalues of
Hessian matrix of the volume function are combined to effectively enhance
various widths of structures. The characteristics of multiscale filter responses
are analysed to clarify the guidelines for the filter design. Secondly, methods
for description and quantification of the 3D local structures are presented. Me-
dial axis/surface elements are locally determined based on the second-order
approximations of local intensity structures. Diameter/thickness quantification
is performed based on detected medial axis/surface elements. Limits on the
accuracy of thickness quantification from 3D MR data are analyzed based on
mathematical models of imaged structures, MR imaging, and thickness measurement
processes. The utility of the methods is demonstrated by examples using
3D CT and MR data of various parts of the body.
Chapter 11 presents work in the area of CAD design. Research into the
computer-aided detection (CAD) of breast lesions from digitised mammograms
has been extensive over the past 15 years. The large number of computer al-
gorithms for mammogram contrast enhancement, segmentation, and region
discrimination reflects the nontrivial nature of the problem of detecting cancer
during breast screening. In addition, due to the enormous variability in the
mammographic appearance of the breast, engineering a single CAD solution is
formidable. Mammographic CAD must provide a high level of sensitivity for the
detection of breast lesions, while maintaining a low number of false-positive
regions for each image. This chapter describes an adaptive knowledge-based
framework for the detection of breast cancer masses from digitised mammo-
grams. The proposed framework accommodates a set of distinct contrast
² Or Breast Uptake.
several modules including image segmentation and classification steps that are
based on multiresolution methods and artificial neural networks. Morphology,
distribution, and demographics are the domains from which features are determined
for the classification task. As a result, the segmentation and feature selection
processes of the algorithm are critical to its performance and are the
areas we focus on in this chapter. We look particularly into the segmentation
aspects of the implementation and the impact that multiresolution filtering has on
feature estimation and classification. General aspects of algorithm evaluation
and, particularly, segmentation validation are presented using results of experi-
ments conducted for the evaluation of our CADiagnosis scheme as the basis of
discussion.
Chapter 14 presents research in the area of neuro segmentation.
Chapter 1
1.1 Introduction
1 Department of Radiology, Helsinki University Central Hospital, Finland
2 Medical Image Computing (Radiology-ESAT/PSI), Faculties of Medicine and Engineering, Katholieke Universiteit Leuven, Belgium
Leemput et al.
Following Warfield et al. [13] and Collins et al. [14], who developed the idea
that the inherent limitations of intensity-based classification can be alleviated by
combining it with an elastically deformable atlas, we propose here to combine
photometric and geometric models in a single framework. Tissues are classified
using an unsupervised parametric approach by using a mixture of Gaussians as
the feature space distribution model in an iterative expectation-maximization
loop. Classifications are further initialized and constrained by iconic matching of
the image to a digital atlas containing spatial maps of prior tissue probabilities.
Section 1.1.2 presents the standard intensity model and optimization ap-
proach that we will use throughout this chapter. Section 1.1.3 discusses the
basic geometric model (a brain atlas) and its iconic matching to image data us-
ing linear and nonlinear registration algorithms. Section 1.1.4 summarizes the
changes made to the basic intensity model in order to model the MR imaging
process more faithfully and to make the segmentation procedure more robust
in the presence of abnormalities.
The overall probability density for image $Y$ given the model parameters $\Phi$ is
then given by
\[
f(Y \mid \Phi) = \sum_{L} f(Y \mid L, \Phi)\, f(L) = \prod_{j} f(y_j \mid \Phi)
\quad \text{with} \quad
f(y_j \mid \Phi) = \sum_{k} f(y_j \mid l_j = k, \Phi) \cdot \pi_k \tag{1.3}
\]
Equation (1.3) is the well-known mixture model (see Fig. 1.1). It models the
histogram of image intensities as a sum of normal distributions, each distribution
weighted with its prior probability $\pi_k$.
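As a minimal sketch of Eq. (1.3) for the unispectral case, the following Python code evaluates the mixture density of a single intensity and the log-likelihood of a list of intensities. The function names and the three-class parameter values are illustrative assumptions, not values taken from the chapter.

```python
import math

def gaussian_pdf(y, mean, var):
    """Univariate normal density with the given mean and variance."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_density(y, priors, means, variances):
    """f(y_j | Phi) = sum_k f(y_j | l_j = k, Phi) * pi_k, as in Eq. (1.3)."""
    return sum(p * gaussian_pdf(y, m, v)
               for p, m, v in zip(priors, means, variances))

def image_log_likelihood(intensities, priors, means, variances):
    """log f(Y | Phi) = sum_j log f(y_j | Phi), using voxel independence."""
    return sum(math.log(mixture_density(y, priors, means, variances))
               for y in intensities)

# Hypothetical parameters for CSF, gray matter, and white matter.
priors = [0.20, 0.45, 0.35]
means = [30.0, 80.0, 120.0]
variances = [100.0, 64.0, 49.0]
print(mixture_density(80.0, priors, means, variances))
print(image_log_likelihood([28.0, 82.0, 118.0], priors, means, variances))
```

The product over voxels in Eq. (1.3) becomes a sum of logarithms here, which avoids numerical underflow for images with many voxels.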
Image segmentation aims to reconstruct the underlying tissue labels L based
on the image Y. If an estimation of the model parameters is somehow available,
then each voxel can be assigned to the tissue type that best explains its inten-
sity. Unfortunately, the result depends largely on the model parameters used.
Typically, these are estimated by manually selecting representative points in
the image of each of the classes considered. However, once all the voxels are
classified, the model parameter estimation can in its turn automatically be im-
proved based on all the voxels instead of on the subjectively selected ones alone.
Figure 1.1: The mixture model fitted to a T1-weighted brain MR image: (a) the
intracranial volume and (b) its intensity histogram with a mixture of normal
distributions overlaid (curves: histogram, white matter, gray matter, CSF, and
total mixture model). The normal distributions correspond to white matter,
gray matter, and CSF.
Intuitively, both the segmentation and the model parameters can be estimated
in a more objective way by interleaving the segmentation with the estimation of
the model parameters.
The expectation-maximization (EM) algorithm [18] formalizes this intuitive
approach. It estimates the maximum likelihood (ML) parameters $\hat{\Phi}$
by iteratively filling in the unknown tissue labels $L$ based on the current
parameter estimate $\Phi$, and recalculating the $\Phi$ that maximizes the
likelihood of the so-called complete data $\{Y, L\}$. More specifically, the
algorithm interleaves two steps:
Expectation step: compute $Q(\Phi \mid \Phi^{(m-1)}) = E_L[\log f(Y, L \mid \Phi) \mid Y, \Phi^{(m-1)}]$
Maximization step: find $\Phi^{(m)} = \arg\max_{\Phi} Q(\Phi \mid \Phi^{(m-1)})$
with $m$ the iteration number. It has been shown that the log-likelihood
$\log f(Y \mid \Phi)$ is guaranteed not to decrease at any iteration of an EM
algorithm [19].
Model-Based Brain Tissue Classification 7
With the image model described above, the expectation step results in a
statistical classification of the image voxels
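\[
f(l_j \mid Y, \Phi^{(m-1)}) =
\frac{f(y_j \mid l_j, \Phi^{(m-1)}) \cdot \pi_{l_j}}
     {\sum_k f(y_j \mid l_j = k, \Phi^{(m-1)}) \cdot \pi_k} \tag{1.4}
\]
and the maximization step re-estimates the distribution parameters; for the
class means,
\[
\mu_k^{(m)} =
\frac{\sum_j f(l_j = k \mid Y, \Phi^{(m-1)}) \cdot y_j}
     {\sum_j f(l_j = k \mid Y, \Phi^{(m-1)})} \tag{1.5}
\]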
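The interleaving of statistical classification and parameter re-estimation can be sketched in a few lines of Python for the unispectral case. This is an illustrative sketch only: the classification and mean update follow Eqs. (1.4) and (1.5), while the variance and prior updates are the usual Gaussian mixture formulas and are an assumption here, as are all function names.

```python
import math

def gaussian_pdf(y, mean, var):
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def e_step(ys, priors, means, variances):
    """Eq. (1.4): f(l_j = k | Y, Phi) for every voxel j and class k."""
    posteriors = []
    for y in ys:
        joint = [p * gaussian_pdf(y, m, v)
                 for p, m, v in zip(priors, means, variances)]
        total = sum(joint)  # assumed nonzero: some class must explain y
        posteriors.append([q / total for q in joint])
    return posteriors

def m_step(ys, posteriors, n_classes, min_var=1e-6):
    """Eq. (1.5) for the means; variance and prior updates are assumptions."""
    priors, means, variances = [], [], []
    for k in range(n_classes):
        w = [post[k] for post in posteriors]
        w_sum = sum(w)
        mu = sum(wj * y for wj, y in zip(w, ys)) / w_sum
        var = sum(wj * (y - mu) ** 2 for wj, y in zip(w, ys)) / w_sum
        priors.append(w_sum / len(ys))
        means.append(mu)
        variances.append(max(var, min_var))  # guard against collapse
    return priors, means, variances

def em(ys, priors, means, variances, n_iter=50):
    """Alternate classification and parameter estimation for n_iter rounds."""
    for _ in range(n_iter):
        posteriors = e_step(ys, priors, means, variances)
        priors, means, variances = m_step(ys, posteriors, len(means))
    return priors, means, variances, e_step(ys, priors, means, variances)

# Two well-separated hypothetical intensity clusters.
ys = [0.0, 0.2, -0.1, 0.1, 10.0, 10.2, 9.9, 10.1]
priors, means, variances, posteriors = em(ys, [0.5, 0.5], [1.0, 8.0], [4.0, 4.0])
print(means)
```

A fixed iteration count keeps the sketch short; a practical implementation would more likely stop when the log-likelihood no longer changes appreciably.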
Figure 1.3: Digital brain atlas with spatially varying a priori probability maps
for (a) white matter, (b) gray matter, and (c) CSF. High intensities indicate high
a priori probabilities. The atlas also contains a T1 template image (d), which
is used for registration of the study images to the space of the atlas. (Source:
Ref. [23].)
Figure 1.4: Top row (from left to right): Original T1 MPRAGE patient image;
CSF segmented by atlas-guided intensity-based tissue classification using affine
registration to atlas template; CSF segmented after nonrigid matching of the
atlas. Middle row: Atlas priors for gray matter, white matter, and CSF affinely
matched to patient image. Bottom row: The same priors after nonrigid matching.
Figure 1.5: Illustration of the statistical models for brain MR images used in
this study. The mixture model (a) is first extended with an explicit parametric
model for the MR bias field (b). Subsequently, an improved spatial model is used
that takes into account that tissues occur in clustered regions in the images
(c). Then the presence of pathological tissues, which are not included in the
statistical model, is considered (d). Finally, a downsampling step introduces
partial voluming in the model (e).
is illustrated in Fig. 1.5(c), which shows a sample of the total resulting image
model. The MRF brings general spatial and anatomical constraints into account
during the classification, facilitating discrimination between tissue types with
similar intensities such as brain and nonbrain tissues.
The method is further extended in section 1.4 in order to quantify lesions
or disease-related signal abnormalities in the images (see Fig. 1.5(d)). Adding
an explicit model for the pathological tissues is difficult because of the wide
variety of their appearance in MR images, and because not every individual
scan contains sufficient pathology for estimating the model parameters. These
problems are circumvented by detecting lesions as voxels that are not well ex-
plained by the statistical model for normal brain MR images. Based on principles
borrowed from the robust statistics literature, tissue-specific voxel weights are
introduced that reflect the typicality of the voxels in each tissue type. Inclusion
of these weights results in a robustized algorithm that simultaneously detects
lesions as model outliers and excludes these outliers from the model parameter
estimation. In section 1.5, this outlier detection scheme is applied for fully
automatic segmentation of MS lesions from brain MR scans. The method is validated
by comparing the automatic lesion segmentations to manual tracings by human
experts.
Thus far, the intensity model assumes that each voxel belongs to one single
tissue type. Because of the complex shape of the brain and the finite resolution
of the images, a large part of the voxels lies on the border between two or more
tissue types. Such border voxels are commonly referred to as partial volume (PV)
voxels as they contain a mixture of several tissues at the same time. To segment
major tissue classes accurately and to detect subtle signal abnormalities, such
as those in MS, the model for normal brain MR images
can be further refined by explicitly taking this PV effect into account. This is
accomplished by introducing a downsampling step in the image model, adding
up the contribution of a number of underlying subvoxels to form the intensity
of a voxel. In voxels where not all subvoxels belong to the same tissue type, this
causes partial voluming, as can be seen in Fig. 1.5(e). The derived EM algorithm
for estimating the model parameters provides a general framework for partial
volume segmentation that encompasses and extends existing techniques. A full
presentation of this PV model is outside the scope of this chapter, however. The
reader is referred to [27] for further details.
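The downsampling idea can be illustrated with a noise-free, one-dimensional toy example in Python. This is only a sketch of the principle described above, not the PV model of [27]: the function names are hypothetical, and it is assumed that each subvoxel contributes its class mean divided by the number of subvoxels per voxel, so that a pure voxel ends up with the class mean.

```python
def downsample_intensities(subvoxel_labels, class_means, factor):
    """Add up the contributions of `factor` consecutive subvoxels per voxel."""
    if len(subvoxel_labels) % factor != 0:
        raise ValueError("number of subvoxels must be a multiple of factor")
    voxels = []
    for start in range(0, len(subvoxel_labels), factor):
        group = subvoxel_labels[start:start + factor]
        voxels.append(sum(class_means[k] / factor for k in group))
    return voxels

def tissue_fractions(subvoxel_labels, n_classes, factor):
    """Fraction of each tissue class inside every downsampled voxel."""
    fractions = []
    for start in range(0, len(subvoxel_labels), factor):
        group = subvoxel_labels[start:start + factor]
        fractions.append([group.count(k) / factor for k in range(n_classes)])
    return fractions

# Hypothetical boundary between class 0 (mean 80) and class 1 (mean 120).
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
means = [80.0, 120.0]
print(downsample_intensities(labels, means, 4))
print(tissue_fractions(labels, 2, 4))
```

The middle voxel straddles the boundary, so its intensity falls between the two class means even though every subvoxel belongs to exactly one tissue type.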
Figure 1.6: The MR bias field in a proton density-weighted image. (a) Two axial
slices; (b) the same slices after intensity thresholding.
intensities of these points. They also presented a slightly different version where
the reference points are obtained by an intermediate classification operation, us-
ing the estimated bias field for final classification. Meyer et al. [32] also estimated
the bias field from an intermediate segmentation, but they allowed a region
of the same tissue type to be broken up into several subregions, which creates
additional but sometimes undesired degrees of freedom.
Wells et al. [4] described an iterative method that interleaves classification
with bias field correction based on ML parameter estimation using an EM algo-
rithm. However, for each set of similar scans to be processed, their method, as
well as its refinement by other authors [5, 6], needs to be supplied with specific
tissue class conditional intensity models. Such models are typically constructed
by manually selecting representative points of each of the classes considered,
which may result in segmentations that are not fully objective and reproducible.
In contrast, the method presented here (and in [23]) does not require such
a preceding training phase. Instead, a digital brain atlas is used with a priori
probability maps for each tissue class to automatically construct intensity mod-
els for each individual scan being processed. This results in a fully automated
algorithm that interleaves classification, bias field estimation, and estimation of
class-conditional intensity distribution parameters.
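\[
f(\mathbf{y}_j \mid l_j, \Phi) = G_{\Sigma_{l_j}}(\mathbf{u}_j - \boldsymbol{\mu}_{l_j}),
\qquad
\mathbf{u}_j = \mathbf{y}_j - [\mathbf{b}_1 \cdots \mathbf{b}_C]^t
\begin{bmatrix} \phi_1(x_j) \\ \vdots \\ \phi_P(x_j) \end{bmatrix}
\]
In the unispectral case, the bias field parameters are re-estimated as the
weighted least-squares solution
\[
\mathbf{b}^{(m)} = (A^t W^{(m)} A)^{-1} A^t W^{(m)} R^{(m)}
\]
with
\[
A = \begin{bmatrix}
\phi_1(x_1) & \phi_2(x_1) & \phi_3(x_1) & \cdots \\
\phi_1(x_2) & \phi_2(x_2) & \phi_3(x_2) & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}
\]
\[
W^{(m)} = \mathrm{diag}\big(w_j^{(m)}\big), \qquad
w_j^{(m)} = \sum_k w_{jk}^{(m)}, \qquad
w_{jk}^{(m)} = \frac{f(l_j = k \mid Y, \Phi^{(m-1)})}{\sigma_k^2}
\]
\[
R^{(m)} = \begin{bmatrix} y_1 - \tilde{y}_1^{(m)} \\ y_2 - \tilde{y}_2^{(m)} \\ \vdots \end{bmatrix},
\qquad
\tilde{y}_j^{(m)} = \frac{\sum_k w_{jk}^{(m)} \mu_k^{(m)}}{\sum_k w_{jk}^{(m)}}
\]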
This can be interpreted as follows (see Fig. 1.8). Based on the current
classification and distribution estimation, a prediction
$\{\tilde{y}_j, j = 1, 2, \ldots, J\}$ of the MR image without the bias field is
constructed (Fig. 1.8(b)). A residue image $R$ (Fig. 1.8(c)) is obtained by
subtracting this predicted signal from the original image (Fig. 1.8(a)). The bias
(Fig. 1.8(e)) is then estimated as a weighted least-squares fit through the
residue image using the weights $W$ (Fig. 1.8(d)), each voxel’s weight being
inversely proportional to the variance of the class that voxel belongs to. As
can be seen from Fig. 1.8(d), the bias field is therefore computed primarily
from voxels that belong to classes with a narrow intensity distribution, such as
white and gray matter. The smooth spatial model extrapolates the bias field from
these regions, where it can be confidently estimated from the data, to regions
where such an estimate is ill-conditioned (CSF, nonbrain tissues).
3 For more general expressions for the multispectral case we refer to [23]; in the
unispectral case $\mathbf{y}_j = y_j$, $\boldsymbol{\mu}_k = \mu_k$, and $\Sigma_k = \sigma_k^2$.
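The weighted least-squares bias estimate described above can be sketched in plain Python for a one-dimensional, unispectral toy problem. The polynomial basis $\phi_p(x) = x^p$, the function names, and the test values are assumptions made for illustration; the text itself only requires a smooth parametric model.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def estimate_bias_coefficients(ys, xs, posteriors, means, variances, n_basis):
    """Weighted least-squares fit of the bias through the residue image."""
    basis = [[x ** p for p in range(n_basis)] for x in xs]  # rows of A
    weights, residues = [], []
    for y, post in zip(ys, posteriors):
        w_jk = [p / var for p, var in zip(post, variances)]
        w_j = sum(w_jk)
        predicted = sum(w * mu for w, mu in zip(w_jk, means)) / w_j
        weights.append(w_j)            # diagonal entry of W
        residues.append(y - predicted)  # entry of R
    ata = [[sum(w * a[p] * a[q] for w, a in zip(weights, basis))
            for q in range(n_basis)] for p in range(n_basis)]
    atr = [sum(w * a[p] * r for w, a, r in zip(weights, basis, residues))
           for p in range(n_basis)]
    return solve_linear_system(ata, atr)

def bias_at(x, coefficients):
    """Evaluate the fitted bias field at position x."""
    return sum(b * x ** p for p, b in enumerate(coefficients))
```

Voxels whose posterior mass lies on a high-variance class receive a small weight, so they influence the fit far less than voxels in narrow classes, which mirrors the behavior described for Fig. 1.8(d).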
brain, consisting of several different tissues, and which may cause severe errors
in the residue image and the bias field estimation [6, 34]. Guillemaud and Brady
[6] proposed to model nonbrain tissues by a single class with a uniform distribu-
tion, artificially assigning the nonbrain tissue voxels a zero weight for the bias
estimation. This is not necessary with our algorithm: the class distribution pa-
rameters are updated at each iteration from all voxels in the image and classes
consisting of different tissue types are automatically assigned a large variance.
Since the voxel weights for the bias correction are inversely proportional to
the variance of the class each voxel is classified to, such tissues are therefore
automatically assigned a low weight for the bias estimation.
A number of authors have proposed bias correction methods that do not use
an intermediate classification. Styner, Brechbühler et al. [33, 35] find the bias field
for which as many voxels as possible have an intensity in the corrected image
that lies close to that of a number of predefined tissue types. Other approaches
search for the bias field that makes the histogram of the corrected image as
sharp as possible [36–38]. The method of Sled et al. [36] for instance is based
on deconvolution of the histogram of the measured signal, assuming that the
histogram of the bias field is Gaussian, while Mangin [37] and Likar et al. [38] use
entropy to measure the histogram sharpness. Contrary to our approach, these
methods treat all voxels alike for bias estimation. This looks rather unnatural,
since it is obvious that the white matter voxels, which have a narrow intensity
histogram, are much more suited for bias estimation than, for instance, the
tissues surrounding the brain or ventricular CSF. As argued above, our algorithm
takes this explicitly into account by the class-dependent weights assigned to
each voxel. Furthermore, lesions can be so large in a scan of an MS patient that
the histogram of the corrected image may be sharpest when the estimated bias
field follows the anatomical distribution of the lesions. As will be shown in
section 1.4, our method can be made robust to the presence of such pathologic
tissues in the images, estimating the bias field only from normal brain tissue.
As a consequence of the assumption that the tissue type of each voxel is inde-
pendent from the tissue type of the other voxels, each voxel is classified inde-
pendently based on its intensity. This yields acceptable classification results as
long as the different classes are well separated in intensity feature space, i.e.
have a clearly discernible associated intensity distribution. Unfortunately, this
is not always true for MR images of the brain, especially not when only one MR
channel is available. While white matter, gray matter, and CSF usually have a
characteristic intensity, voxels surrounding the brain often show an MR intensity
that is very similar to brain tissue. This may result in erroneous classifications
of small regions surrounding the brain as gray matter or white matter.
We therefore extend the model with a Markov random field (MRF) prior,
introducing general spatial and anatomical constraints during the
classification [39]. The MRF is designed to facilitate discrimination between brain and
nonbrain tissues while preserving the detailed interfaces between the various
tissue classes within the brain.
is used:
\[
U(L \mid \Phi) = \frac{1}{2} \sum_j \Big[ \sum_{j' \in N_j^p} \xi_{l_j l_{j'}}
+ \sum_{j' \in N_j^o} \nu_{l_j l_{j'}} \Big]
\]
Here, $N_j^p$ and $N_j^o$ are the neighbors of voxel $j$ in the plane and out of
the plane, and the MRF parameters $\xi_{kk'}$ and $\nu_{kk'}$ denote the cost
associated with transition from class $k$ to class $k'$ among neighboring voxels
in the plane and out of the plane, respectively. If these costs are higher for
neighbors belonging to different classes than for neighbors of the same tissue
class, the MRF favors configurations of $L$ where each tissue is spatially
clustered. An example of this is shown in Fig. 1.5(c).
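The energy above can be evaluated directly for a small label volume. The Python sketch below assumes a neighborhood of four in-plane and two out-of-plane neighbors and a label volume indexed as `labels[z][y][x]`; both the neighborhood size and the function name are illustrative assumptions.

```python
def mrf_energy(labels, xi, nu):
    """U(L) = 1/2 * sum_j [in-plane xi costs + out-of-plane nu costs]."""
    nz, ny, nx = len(labels), len(labels[0]), len(labels[0][0])
    in_plane = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # assumed 4-neighborhood
    energy = 0.0
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                k = labels[z][y][x]
                for dy, dx in in_plane:
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < ny and 0 <= xx < nx:
                        energy += xi[k][labels[z][yy][xx]]
                for dz in (-1, 1):  # assumed 2 out-of-plane neighbors
                    zz = z + dz
                    if 0 <= zz < nz:
                        energy += nu[k][labels[zz][y][x]]
    # Every neighbor pair is visited from both sides, hence the factor 1/2.
    return 0.5 * energy

# Hypothetical two-class costs: no cost for equal labels, a cost otherwise.
xi = [[0.0, 1.0], [1.0, 0.0]]
nu = [[0.0, 2.0], [2.0, 0.0]]
clustered = [[[0, 0], [0, 0]], [[1, 1], [1, 1]]]
scattered = [[[0, 1], [1, 0]], [[1, 0], [0, 1]]]
print(mrf_energy(clustered, xi, nu), mrf_energy(scattered, xi, nu))
```

With these costs the spatially clustered configuration has the lower energy, which is the sense in which the MRF favors clustered tissue labels.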
It has been described in the literature that fine structures, such as the
interface between white matter and gray matter in brain MR images, can
be erased by the Potts/Ising MRF model [5, 42]. The MRF may overregularize
such subtle borders and tend to produce overly smooth interfaces. Therefore,
a modification is used that penalizes anatomically impossible combinations
such as a gray matter voxel surrounded by nonbrain tissues, while at the
same time preserving edges between tissues that are known to border each
other. We impose that a voxel surrounded by white matter and gray matter
voxels must have the same a priori probability for white matter as for gray
matter by adding appropriate constraints on the MRF transition costs $\xi_{kk'}$
and $\nu_{kk'}$. As a result, voxels surrounded by brain tissues have a low
probability for CSF and other nonbrain tissues, and a high but equal probability
for white and gray matter. The actual decision between white and gray matter is
therefore based only on the intensity, so that the interface between white and
gray matter is unaffected by the MRF. Similar constraints are applied for other
interfaces as well.
Since the voxel labels are not independent with this model, the expectation
step of the EM algorithm no longer yields the classification of Eq. (1.4). Because
of the interaction between the voxels, the exact calculation of
$f(l_j \mid Y, \Phi^{(m-1)})$ involves calculation of all the possible
realizations of the MRF, which is not computationally feasible. Therefore, an
approximation was adopted that was proposed by Zhang [43] and Langan et al. [44],
based on the mean field theory from statistical mechanics. This mean field
approach suggests an approximation to $f(l_j \mid Y, \Phi^{(m-1)})$ based on the
assumption that the influence of $l_{j'}$ of all other voxels $j' \neq j$ in the
calculation of $f(l_j \mid Y, \Phi^{(m-1)})$ can be approximated by the
influence of their classification $f(l_{j'} \mid Y, \Phi^{(m-2)})$ from the
previous iteration.
This yields
\[
f(l_j \mid Y, \Phi^{(m-1)}) \approx
\frac{f(y_j \mid l_j, \Phi^{(m-1)}) \cdot \pi_j^{(m-1)}(l_j)}
     {\sum_k f(y_j \mid l_j = k, \Phi^{(m-1)}) \cdot \pi_j^{(m-1)}(k)} \tag{1.7}
\]
The difference with Eq. (1.4) is that the a priori probability that
a voxel belongs to a specific tissue class now depends on the classification of
its neighboring voxels.
With the addition of the MRF, the subsequent maximization step in the EM
algorithm not only involves updating the intensity distributions and recalculating
the bias field, but also estimating the MRF parameters $\{\xi_{kk'}\}$ and
$\{\nu_{kk'}\}$. As a result, the total iterative scheme now consists of four
steps, shown in Fig. 1.11.
Figure 1.11: The extension of the model with an MRF prior results in a four-step
algorithm that interleaves classification, estimation of the normal distributions,
bias field correction, and estimation of the MRF parameters.
The calculation of the MRF parameters poses a difficult problem for which a
heuristic, noniterative approach is used. For each neighborhood configuration
$(N^p, N^o)$, the number of times that the central voxel belongs to class $k$ in
the current classification is compared to the number of times it belongs to class
$k'$, for every couple of classes $(k, k')$. This results in an overdetermined
linear system of equations that is solved for the MRF parameters
$(\xi_{kk'}, \nu_{kk'})$ using a least squares fit procedure [40].
1.3.2 Example
Figure 1.12 demonstrates the influence of each component of the algorithm on
the resulting segmentations of a T1-weighted image. First, the method of
section 1.2, where each voxel is classified independently, was used without the
bias correction step (Fig. 1.12(b)). It can be seen that white matter at the top
of the brain is misclassified as gray matter. This was clearly improved when the
bias field correction step was added (Fig. 1.12(c)). However, some tissues
surrounding the brain have intensities that are similar to brain tissue and are
wrongly classified as gray matter. With the MRF model described in section 1.3.1,
a better distinction is obtained between brain tissues and tissues surrounding
the brain (Fig. 1.12(d)). This is most beneficial in case of single-channel MR
data, where it is often difficult to differentiate such tissues based only on
their intensity. The MRF cleans up the segmentations of brain tissues, while
preserving the detailed interface between gray and white matter, and between gray
matter and CSF. Figure 1.13 depicts a 3-D volume rendering of the gray matter
segmentation map when the MRF is used.
Figure 1.12: Example of how the different components of the algorithm work.
From left to right: (a) T1-weighted image, (b) gray matter segmentation without
bias field correction and MRF, (c) gray matter segmentation with bias field
correction but without MRF, (d) gray matter segmentation with bias field
correction and MRF, and (e) gray matter segmentation with bias field correction
and MRF without constraints. (Source: Ref. [39].)
To demonstrate the effect of the constraints on the MRF parameters $\xi_{kk'}$ and
$\nu_{kk'}$ described in section 1.3.1, the same image was processed without such
constraints (Fig. 1.12(e)). The resulting segmentation shows nicely distinct regions,
but small details, such as small ridges of white matter, are lost. The MRF prior
has overregularized the segmentation and should therefore not be used in this
form.
Figure 1.13: A 3-D volume rendering of the gray matter segmentation of the
data of Fig. 1.12 with bias field correction and MRF. (Source: Ref. [39].)
$$\frac{2V_{12}^{k}}{V_{1}^{k} + V_{2}^{k}} \tag{1.8}$$
where $V_{12}^{k}$ denotes the volume of the voxels classified as tissue k by both raters,
and $V_{1}^{k}$ and $V_{2}^{k}$ the volume of class k assessed by each of the raters separately.
This metric, first described by Dice [46] and recently re-introduced by Zijdenbos
et al. [47], attains the value of 1 if both segmentations are in full agreement,
and 0 if there is no overlap at all. For all the simulated data, it was found that
the total brain volume was accurately segmented, but the segmentation of gray
matter and white matter individually generally did not attain the same accuracy.
This was caused by misclassification of the white matter–gray matter interface,
where PV voxels do not belong to either white matter or gray matter, but are
really a mixture of both.
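As an illustration of Eq. (1.8), the following minimal Python sketch computes the similarity index for one tissue class from two label maps given as flat sequences. The function name `dice_index` and the flat-list representation are assumptions made for this example, and voxel counts stand in for volumes, which presumes that all voxels have the same size.

```python
def dice_index(labels_a, labels_b, k):
    """Similarity index of Eq. (1.8) for tissue class k.

    labels_a, labels_b: equally long sequences of class labels, one per voxel,
    produced by two raters (or by a rater and an automated method).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both label maps must cover the same voxels")

    # Voxel counts are used in place of volumes (equal voxel size assumed).
    v1 = sum(1 for a in labels_a if a == k)
    v2 = sum(1 for b in labels_b if b == k)
    v12 = sum(1 for a, b in zip(labels_a, labels_b) if a == k and b == k)

    if v1 + v2 == 0:
        raise ValueError("class k does not occur in either label map")
    return 2.0 * v12 / (v1 + v2)
```

With this definition, two identical label maps give an index of 1 and two maps that never agree on class k give 0, matching the limiting cases described above.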
The automated method was also validated by comparing its segmentations
of nine brain MR scans of children to the manual tracings by a human expert.
The automated and manual segmentations showed an excellent similarity index
of 95% on average for the total brain, but a more moderate similarity index of
83% for gray matter. Figure 1.14 depicts the location of misclassified gray matter
voxels for a representative dataset. It can be seen that the automatic algorithm
segments the gray matter–CSF interface in more detail than the manual tracer.
Some tissue surrounding the brain is still misclassified as gray matter, although
this error is already reduced compared to the situation where no MRF prior
is used. However, by far most misclassifications are due to the classification
of gray–white matter PV voxels to gray matter by the automated method. The
human observer has segmented white matter consistently as a thicker structure
than the automatic algorithm.
Finally, Park et al. [48] tested the method presented in this section on a
database of T1-weighted MR scans of 20 normal subjects. These data along with
expert manual segmentations are made publicly available on the WWW by the
4 http://www.cma.mgh.harvard.edu/ibsr.
So far, only the segmentation of MR images of normal brains has been addressed.
In order to quantify MS lesions or other neuropathology related signal abnormal-
ities in the images, the method needs to be further extended. Adding an explicit
model for the pathological tissues is difficult because of the wide variety of
their appearance in MR images, and because not every individual scan contains
sufficient pathology for estimating the model parameters. These problems can
be circumvented by detecting lesions as voxels that are not well explained by
the statistical model for normal brain MR images [52]. The method presented
here simultaneously detects the lesions as model outliers, and excludes these
outliers from the model parameter estimation.
1.4.1 Background
Suppose that J samples $y_j$, j = 1, 2, . . . , J are drawn independently from a
multivariate normal distribution with mean µ and covariance matrix Σ that are
grouped in Φ = {µ, Σ} for notational convenience. Given these samples, the ML
parameters can be assessed by maximizing
$$\sum_j \log f(y_j \mid \Phi) \tag{1.9}$$
which yields
$$\mu = \frac{\sum_j y_j}{J}, \qquad \Sigma = \frac{\sum_j (y_j - \mu)(y_j - \mu)^t}{J} \tag{1.10}$$
In most practical applications, however, the assumed normal model is only
an approximation to reality, and estimation of the model parameters should
not be severely affected by the presence of a limited amount of model outliers.
Considerable research efforts in the field of robust statistics [53] have resulted in
a variety of methods for robust estimation of model parameters in the presence
of outliers, of which the so-called M-estimators [53] are the most popular
family.
Considering Eq. (1.9), it can be seen that the contribution to the loglikelihood
of an observation that is atypical of the normal distribution is high, since
$\lim_{f(y \mid \Phi) \to 0} \log f(y \mid \Phi) = -\infty$. The idea behind M-estimators is to alter Eq. (1.9)
slightly in order to reduce the effect of outliers. A simple way to do this, which
has recently become very popular in image processing [54] and medical image
processing [6, 16, 17, 55], is to model a small fraction of the data as being drawn
from a rejection class that is assumed to be uniformly distributed. It can be
shown that assessing the ML parameters is now equivalent to maximizing
$$\sum_j \log\left(f(y_j \mid \Phi) + \lambda\right), \quad \lambda \ge 0 \tag{1.11}$$
$$t(y_j \mid \Phi^{(m-1)}) = \frac{f(y_j \mid \Phi^{(m-1)})}{f(y_j \mid \Phi^{(m-1)}) + \lambda} \tag{1.12}$$
down-weights observations that are atypical for the normal distribution, making
the parameter estimation more robust against such outliers.
$$1 - t(y_j \mid \Phi) \tag{1.14}$$
reflects the belief that it is a model outlier. The ultimate goal in our application
is to identify these outliers as they are likely to indicate pathological tissues.
However, the dependence of Eq. (1.14) through $t(y_j \mid \Phi)$ on the determinant of
the covariance matrix Σ prevents its direct interpretation as a true outlier belief
value.
In statistics, an observation y is said to be abnormal with respect to a given
normal distribution if its so-called Mahalanobis distance
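$$d = \sqrt{(y - \mu)^t \Sigma^{-1} (y - \mu)}$$
$$t(y_j \mid \Phi^{(m-1)}) = \frac{f(y_j \mid \Phi^{(m-1)})}{f(y_j \mid \Phi^{(m-1)}) + \frac{1}{\sqrt{(2\pi)^{C} |\Sigma^{(m-1)}|}} \exp\left(-\frac{1}{2}\kappa^2\right)}$$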
where |Σ| is explicitly taken into account and where λ is reparameterized using
the more easily interpretable κ. This κ ≥ 0 is an explicit Mahalanobis distance
threshold that specifies a statistical significance level, as illustrated in Fig. 1.15.
The lower κ is chosen, the more readily voxels are considered outliers. On the other
hand, choosing κ = ∞ results in $t(y_j \mid \Phi^{(m-1)}) = 1, \forall j$, which causes no outliers
to be detected at all.
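The role of κ can be made concrete with a small Python sketch of robust estimation in the unispectral case (one channel, Σ = σ²). The function names, the fixed number of iterations, and in particular the weighted mean and variance update inside the loop are assumptions of this sketch, since the update equations are not reproduced in the text above. It should be read as an illustration of the down-weighting idea rather than as the authors' implementation.

```python
import math


def gaussian_pdf(y, mu, var):
    """Univariate normal density f(y | mu, var)."""
    return math.exp(-0.5 * (y - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def typicality(y, mu, var, kappa):
    """Weight t for one sample, with lambda expressed through the
    Mahalanobis distance threshold kappa (unispectral case)."""
    f = gaussian_pdf(y, mu, var)
    lam = math.exp(-0.5 * kappa ** 2) / math.sqrt(2.0 * math.pi * var)
    return f / (f + lam)


def robust_fit(samples, kappa, iterations=20):
    """Iteratively re-estimate mean and variance with typicality weights.

    Assumed update: weighted mean and weighted variance, with weights
    computed from the previous iteration's parameters.
    """
    n = len(samples)
    # Start from the ordinary ML estimates of Eq. (1.10).
    mu = sum(samples) / n
    var = sum((y - mu) ** 2 for y in samples) / n
    if var <= 0.0:
        raise ValueError("samples must not all be identical")

    for _ in range(iterations):
        weights = [typicality(y, mu, var, kappa) for y in samples]
        total = sum(weights)
        mu = sum(w * y for w, y in zip(weights, samples)) / total
        var = sum(w * (y - mu) ** 2 for w, y in zip(weights, samples)) / total
    return mu, var
```

In this parameterization a sample whose Mahalanobis distance equals κ receives a weight of exactly one half, so κ marks the distance beyond which a sample is believed more likely to be an outlier than a regular observation.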
Figure 1.15: The threshold κ defines the Mahalanobis distance at which the
belief that a voxel is a model outlier exceeds the belief that it is a regular sample
(this figure depicts the unispectral case, where Σ = σ²).
$$t(y_j \mid l_j, \Phi^{(m-1)}) = \frac{f(y_j \mid l_j, \Phi^{(m-1)})}{f(y_j \mid l_j, \Phi^{(m-1)}) + \frac{1}{\sqrt{(2\pi)^{C} |\Sigma_{l_j}^{(m-1)}|}} \exp\left(-\frac{1}{2}\kappa^2\right)} \tag{1.15}$$
reflects the degree of typicality of voxel j in tissue class $l_j$. Since
$\sum_k f(l_j = k \mid Y, \Phi^{(m-1)}) \cdot t(y_j \mid l_j = k, \Phi^{(m-1)})$ is not constrained to be unity, model
outliers can have a small degree of membership in all tissue classes simultaneously.
Therefore, observations that are atypical for each of the K tissue types have a
reduced weight on the parameter estimation, which makes the EM procedure more robust.
Upon convergence of the algorithm, the belief that voxel j is a model outlier is
given by
$$1 - \sum_k f(l_j = k \mid Y, \Phi) \cdot t(y_j \mid l_j = k, \Phi) \tag{1.16}$$
Section 1.5 discusses the use of this outlier detection scheme for fully auto-
mated segmentation of MS lesions from brain MR images.
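For a single voxel, Eq. (1.16) reduces to a short computation once the class posteriors and the per-class typicality weights are available. The Python sketch below assumes both are passed in as plain lists with one entry per tissue class. The function name `outlier_belief` and the input check are hypothetical choices for illustration.

```python
def outlier_belief(posteriors, typicalities):
    """Belief that a voxel is a model outlier, following Eq. (1.16).

    posteriors:   f(l_j = k | Y, Phi) for each class k (should sum to 1)
    typicalities: t(y_j | l_j = k, Phi) for each class k (each in [0, 1])
    """
    if len(posteriors) != len(typicalities):
        raise ValueError("one posterior and one typicality per class expected")
    if abs(sum(posteriors) - 1.0) > 1e-6:
        raise ValueError("class posteriors must sum to 1")

    # Total degree of membership over all tissue classes; it can be
    # smaller than 1 because the typicalities down-weight atypical voxels.
    membership = sum(p * t for p, t in zip(posteriors, typicalities))
    return 1.0 - membership
```

A voxel that is fully typical of its classes gets a belief near 0, while a voxel that is atypical for every class gets a belief near 1 regardless of how its posterior is distributed.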
In [52], the outlier detection scheme of section 1.4 was applied for fully automatic
segmentation of MS lesions from brain MR scans that consist of T1-, T2-, and PD-
weighted images. Unfortunately, outlier voxels also occur outside MS lesions.
This is typically true for partial volume voxels that, contrary to the
assumptions made, do not belong to one single tissue type but are rather a
mixture of more than one tissue. Since they are perfectly normal brain tissue,
they are prevented from being detected as MS lesion by introducing constraints
on intensity and context on the weights $t(y_j \mid l_j, \Phi)$ calculated in Eq. (1.15).
• Since around 90–95% of the MS lesions are white matter lesions, the
contextual constraint is added that MS lesions should be located in the vicinity
of white matter. In each iteration, the normal white matter is fused with
the lesions to form a mask of the total white matter. Using an MRF as in
section 1.3, a voxel is discouraged from being classified as MS lesion in
the absence of neighboring white matter. Since the MRF parameters are
estimated from the data in each iteration as in section 1.3, these contextual
constraints automatically adapt to the voxel size of the data.
Figure 1.16: The complete method for MS lesion segmentation iteratively inter-
leaves classification of the voxels into normal tissue types, MS lesion detection,
estimation of the normal distributions, bias field correction, and MRF parameter
estimation.
1.5.2 Validation
As part of the BIOMORPH project [57], we analyzed MR data acquired dur-
ing a clinical trial in which 50 MS patients were repeatedly scanned with an
interval of approximately 1 month over a period of about 1 year. The serial
image data consisted at each time point of a PD/T2-weighted image pair and
Figure 1.17: A 3-D rendering of (a) gray matter, (b) white matter, and (c) MS
lesion segmentation maps. (Color slide)
[Figure 1.18 plots. (a) Vertical axis: similarity index; horizontal axis: Mahalanobis distance. (b) Horizontal axis: scan number.]
Figure 1.18: (a) Similarity index between the automatic and the expert lesion
delineations on 20 images for varying κ, with and without the bias field correc-
tion component enabled in the automated method. (b) The 20 automatic total
lesion load measurements for κ = 3 shown along with the expert measurements.
(Source: Ref. [52].)
0.45 means that less than half of the voxels labeled as lesion by the expert were
also identified by the automated method and vice versa.
For illustration purposes, the expert TLLs of the 20 scans are depicted along
with the automatic ones for κ = 3 in Fig. 1.18(b). A paired t test did not reveal
a significant difference between the manual and these automatic TLL ratings
( p = 0.94). Scans 1 and 2 are two consecutive scans from one patient, 3 and 4
from the next and so on. Note that in 9 out of 10 cases, the two ratings agree over
the direction of the change of the TLL over time. Figure 1.19 displays the MR
data of what is called scan 19 in Fig. 1.18(b) and the automatically calculated
classification along with the lesion delineations performed by the human expert.
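The agreement over the direction of change can be checked with a few lines of Python. The sketch below assumes that the total lesion loads are stored so that entries 1 and 2 belong to the first patient, 3 and 4 to the next, and so on, as in Fig. 1.18(b). The function name and the numbers in the usage comment are hypothetical and are not the measurements of the study.

```python
def direction_agreement(expert, automatic):
    """Count patients for whom both ratings agree on the direction of the
    TLL change between two consecutive scans.

    expert, automatic: TLL values ordered as (scan 1, scan 2) of patient 1,
    (scan 1, scan 2) of patient 2, and so on.
    Returns (number of agreeing patients, number of patients).
    """
    if len(expert) != len(automatic) or len(expert) % 2 != 0:
        raise ValueError("expected two scans per patient in both ratings")

    agree = 0
    for i in range(0, len(expert), 2):
        d_expert = expert[i + 1] - expert[i]
        d_auto = automatic[i + 1] - automatic[i]
        # Same direction: both increase, both decrease, or both unchanged.
        if (d_expert > 0) == (d_auto > 0) and (d_expert < 0) == (d_auto < 0):
            agree += 1
    return agree, len(expert) // 2


# Hypothetical example with three patients (values are made up):
# direction_agreement([10, 12, 5, 4, 7, 9], [11, 13, 6, 5, 8, 7])
```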
Figure 1.19: Automatic classification of one of the 20 serial datasets that were
also analyzed by a human expert. (a) T1-weighted image; (b) T2-weighted image;
(c) PD-weighted image; (d) white matter classification; (e) gray matter
classification; (f) CSF classification; (g) MS lesion classification; (h) expert delineation
of the MS lesions. (Source: Ref. [52].)
1.5.3 Discussion
Most of the methods for MS lesion segmentation described in the literature are
semiautomated rather than fully automated methods, designed to facilitate the
tedious task of manually outlining lesions by human experts, and to reduce
the inter- and intrarater variability associated with such expert segmentations.
Typical examples of user interaction in these approaches include accepting or
rejecting automatically computed lesions [58] or manually drawing regions of
pure tissue types for training an automated classifier [58–61]. While these
methods have proven to be useful, they remain impractical when hundreds of scans
need to be analyzed as part of a clinical trial, and the variability of manual
tracings is not totally removed. In contrast, the method presented here is fully
automated as it uses a probabilistic brain atlas to train its classifier. Furthermore,
the atlas provides spatial information that prevents nonbrain voxels from being
classified as MS lesion, making the method work without the often-used tracing
of the intracranial cavity in a preprocessing step [58–63].
A unique feature of our algorithm is that it automatically adapts its intensity
models and contextual constraints when analyzing images that were acquired
with a different MR pulse sequence or voxel size. Zijdenbos et al. described
[64] and validated [65] a fully automated pipeline for MS lesion segmentation
based on an artificial neural network classifier. Similarly, Kikinis, Guttmann et al.
[62, 66] have developed a method with minimal user intervention that is built
on the EM classifier of Wells et al. [4] with dedicated pre- and postprocessing
steps. Both methods use a fixed classifier that is trained only once and that
is subsequently used to analyze hundreds of scans. In clinical trials, however,
interscan variations in cluster shape and location in intensity space cannot be
excluded, not only because of hardware fluctuations of MR scanners over a
period of time, but also because different imagers may be used in a multicenter
trial [66]. In contrast to the methods described above, our algorithm retrains its
classifier on each individual scan, making it adaptive to such contrast variations.
Often, a postprocessing step is applied to automatically segmented MS le-
sions, in which false positives are removed based on a set of experimentally
tuned morphologic operators, connectivity rules, size thresholds, etc. [59, 60, 62].
Since such rules largely depend on the voxel size, they may need to be retuned
for images with a different voxel size. Alternatively, images can be resampled
to a specific image grid before processing, but this introduces partial volum-
ing that can reduce the detection of lesions considerably, especially for small
lesion loads [66]. To avoid these problems, we have added explicit contextual
constraints on the iterative MS lesion detection that automatically adapt to the
voxel size. Similar to other methods [59, 61, 63, 64], we exploit the knowledge
that the majority of MS lesions occur inside white matter. Our method fuses
the normal white matter with the lesions in each iteration, producing, in com-
bination with MRF constraints, a prior probability mask for white matter that
is automatically updated during the iterations. Since the MRF parameters are
Epilepsy is the most frequent serious primary neurological illness. Around 30%
of the epilepsy patients have epileptic seizures that are not controlled with
medication. Epilepsy surgery is by far the most effective treatment for these
patients. The aim of any presurgical evaluation in epilepsy surgery is to delineate
the epileptogenic zone as accurately as possible. The epileptogenic zone is that
part of the brain that has to be removed surgically in order to render the patient
seizure-free.
We applied the framework presented in this chapter to detect structural
abnormalities related to focal cortical dysplasia (FCD) epileptic lesions in the
cortical and subcortical grey matter in high-resolution MR images of the human
brain. FCD is characterized by a focal thickening of the cerebral cortex, loss
of definition between the gray and the white matter at the site of the lesion,
Figure 1.21: Left–right hemisphere grey matter and white matter separation
on high quality high resolution data, consisting of a single sagittal T1-weighted
image (Siemens Vision 1.5 T scanner, 3D MPRAGE, 256 × 256 matrix, 1.25 mm slice
thickness, 128 slices, FOV = 256 mm, TR = 11.4 ms, TE = 4.4 ms) with good
contrast between grey matter, white matter, and the tissues surrounding the
brain. (a): Coronal sections through the grey matter segmentation map before
(top row) and after left–right and cerebrum–cerebellum separation. (b) and
(c): 3D Rendering of the segmented white and grey matter respectively. (Source:
Ref. [75].)
between both hemispheres to be planar. We refer the reader to [75] for further
details and discussion of the results obtained.
1.8 Conclusion
Questions
1. What is the parametric model of the MR bias field proposed in this chapter?
6. What are the drawbacks of using such an atlas of priors and how to deal
with it?
7. What is the Markov random field model used in brain tissue classification?
Bibliography
[2] Laidlaw, D. H., Fleischer, K. W., and Barr, A. H., Partial-volume Bayesian
classification of material mixtures in MR volume data using voxel his-
tograms, IEEE Trans. Med. Imaging, Vol. 17, No. 1, pp. 74–86, 1998.
[3] Choi, H. S., Haynor, D. R., and Kim, Y., Partial volume tissue classifi-
cation of multichannel magnetic resonance images—A mixel model,
IEEE Trans. Med. Imaging, Vol. 10, No. 3, pp. 395–407, 1991.
[4] Wells, W., III, Grimson, W., Kikinis, R., and Jolesz, F., Adaptive segmenta-
tion of MRI data, IEEE Trans. Med. Imaging, Vol. 15, No. 4, pp. 429–442,
1996.
[5] Held, K., Kops, E. R., Krause, B. J., Wells, W. M., III, Kikinis, R., and
Müller-Gärtner, H. W., Markov random field segmentation of brain MR
images, IEEE Trans. Med. Imaging, Vol. 16, No. 6, pp. 878–886, 1997.
[6] Guillemaud, R. and Brady, M., Estimating the bias field of MR Images,
IEEE Trans. Med. Imaging, Vol. 16, No. 3, pp. 238–251, 1997.
[7] Kass, M., Witkin, A., and Terzopoulos, D., Snakes: Active contour mod-
els, Int. J. Comput Vision, Vol. 1, No. 4, pp. 321–331, 1988.
[9] Lötjönen, J., Reissman, P.-J., Mangin, I., and Katila, T., Model extraction
from magnetic resonance volume data using the deformable pyramid,
Med. Image Anal., Vol. 3, No. 4, pp. 387–406, 1999.
[10] Zeng, X., Staib, L., Schultz, R., and Duncan, J., Segmentation and mea-
surement of the cortex from 3D MR images using coupled surfaces
propagation, IEEE Trans. Med. Imaging, Vol. 18, No. 10, pp. 927–937,
1999.
[11] González Ballester, M., Zisserman, A., and Brady, M., Segmentation and
measurement of brain structures in MRI including confidence bounds,
Med. Image Anal., Vol. 4, pp. 189–200, 2000.
[12] Xu, C., Pham, D., Rettmann, M., Yu, D., and Prince, J., Reconstruction
of the human cerebral cortex from magnetic resonance images, IEEE
Trans. Med. Imaging, Vol. 18, No. 6, pp. 467–480, 1999.
[13] Warfield, S., Kaus, M., Jolesz, F., and Kikinis, R., Adaptive, template
moderated, spatially varying statistical classification, Med. Image Anal.,
Vol. 4, No. 1, pp. 43–55, 2000.
[14] Collins, D. L., Zijdenbos, A. P., Barr, W. F. C., and Evans, A. C.,
ANIMAL+INSECT: Improved cortical structure segmentation, In: Pro-
ceedings of the Annual Symposium on Information Processing in Med-
ical Imaging, Kuba, A., Samal, M., and Todd-Pokropek, A., eds., Lecture
Notes in Computer Science, Vol. 1613, Springer, Berlin, pp. 210–223,
1999.
[15] Liang, Z., MacFall, J. R., and Harrington, D. P., Parameter estimation
and tissue segmentation from multispectral MR images, IEEE Trans.
Med. Imaging, Vol. 13, No. 3, pp. 441–449, 1994.
[16] Schroeter, P., Vesin, J.-M., Langenberger, T., and Meuli, R., Robust pa-
rameter estimation of intensity distributions for brain magnetic reso-
nance images, IEEE Trans. Med. Imaging, Vol. 17, No. 2, pp. 172–186,
1998.
[17] Wilson, D. and Noble, J., An adaptive segmentation algorithm for time-
of-flight MRA data, IEEE Trans. Med. Imaging, Vol. 18, No. 10, pp. 938–
945, 1999.
[18] Dempster, A. P., Laird, N. M., and Rubin, D. B., Maximum likelihood
from incomplete data via the EM algorithm, J. R. Stat. Soc., Vol. 39, pp.
1–38, 1977.
[19] Wu, Z., Chung, H.-W., and Wehrli, F., A Bayesian approach to subvoxel
tissue classification in NMR microscopic images of trabecular bone,
MRM, Vol. 31, pp. 302–308, 1994.
[20] Evans, A., Collins, D., Mills, S., Brown, E., Kelly, R., and Peters, T.,
3D statistical neuroanatomical models from 305 MRI volumes, In: Pro-
ceeding of the IEEE Nuclear Science Symposium and Medical Imaging
Conference, pp. 1813–1817, 1993.
[21] Maes, F., Collignon, A., Vandermeulen, D., Marchal, G., and Suetens,
P., Multi-modality image registration by maximization of mutual in-
formation, IEEE Trans. Med. Imaging, Vol. 16, No. 2, pp. 187–198,
1997.
[22] Maes, F., Vandermeulen, D., and Suetens, P., Medical image registration
using mutual information, Proc. IEEE, Vol. 91, No. 10, pp. 1699–1722,
2003.
[23] Van Leemput, K., Maes, F., Vandermeulen, D., and Suetens, P., Au-
tomated model-based bias field correction of MR images of the
brain, IEEE Trans. Med. Imaging, Vol. 18, No. 10, pp. 885–896,
1999.
[24] D’Agostino, E., Maes, F., Vandermeulen, D., and Suetens, P., A vis-
cous fluid model for multimodal non-rigid image registration using
mutual information, Med. Image Anal., Vol. 7, No. 4, pp. 565–575,
2003.
[25] Simmons, A., Tofts, P., Barker, G., and Arridge, S., Sources of intensity
nonuniformity in spin echo images at 1.5 T, Magn. Reson. Med., Vol. 32,
pp. 121–128, 1994.
[27] Van Leemput, K., Maes, F., Vandermeulen, D., and Suetens, P., A unifying
framework for partial volume segmentation of brain MR images, IEEE
Trans. Med. Imaging, Vol. 22, No. 1, pp. 105–119, 2003.
[28] Tincher, M., Meyer, C., Gupta, R., and Williams, D., Polynomial modeling
and reduction of RF body coil spatial inhomogeneity in MRI, IEEE
Trans. Med. Imaging, Vol. 12, No. 2, pp. 361–365, 1993.
[29] Moyher, S. E., Vigneron, D. B., and Nelson, S. J., Surface coil MR imag-
ing of the human brain with an analytic reception profile correction,
J. Magn. Reson. Imaging, Vol. 5, No. 2, pp. 139–144, 1995.
[31] Dawant, B. M., Zijdenbos, A. P., and Margolin, R., Correction of intensity
variations in MR images for computer-aided tissue classification, IEEE
Trans. Med. Imaging, Vol. 12, No. 4, pp. 770–781, 1993.
[32] Meyer, C., Bland, P., and Pipe, J., Retrospective correction of MRI am-
plitude inhomogeneities, In: Proceedings of the First International Con-
ference on Computer Vision, Virtual Reality, and Robotics in Medicine,
CVRMED’95, Ayache, N., ed., Lecture Notes in Computer Science, Vol.
905, Springer, Nice, France, pp. 513–522, 1995.
[33] Brechbühler, C., Gerig, G., and Székely, G., Compensation of spatial
inhomogeneity in MRI based on a parametric bias estimate, In: Pro-
ceedings of Visualization in Biomedical Computing, VBC’96, Lecture
Notes in Computer Science, Vol. 1131, Springer, Berlin, pp. 141–146,
1996.
[34] Sled, J. G., Zijdenbos, A. P., and Evans, A. C., A comparison of retro-
spective intensity non-uniformity correction methods for MRI, In: Pro-
ceedings of XVth International Conference on Information Processing
in Medical Imaging, IPMI’97, Lecture Notes in Computer Science, Vol.
1230, Springer, Berlin, pp. 459–464, 1997.
[35] Styner, M., Brechbühler, C., Székely, G., and Gerig, G., Parametric es-
timate of intensity inhomogeneities applied to MRI, IEEE Trans. Med.
Imaging, Vol. 19, No. 3, pp. 153–165, 2000.
[36] Sled, J. G., Zijdenbos, A. P., and Evans, A. C., A Nonparametric method
for automatic correction of intensity nonuniformity in MRI Data, IEEE
Trans. Med. Imaging, Vol. 17, No. 1, pp. 87–97, 1998.
[38] Likar, B., Viergever, M., and Pernus, F., Retrospective correction of
MR intensity inhomogeneity by information minimization, In: Proceed-
ings of Medical Image Computing and Computer-Assisted Intervention,
MICCAI 2000, Lecture Notes in Computer Science, Vol. 1935, Springer,
Berlin, pp. 375–384, 2000.
[39] Van Leemput, K., Maes, F., Vandermeulen, D., and Suetens, P., Auto-
mated model-based tissue classification of MR images of the brain, IEEE
Trans. Med. Imaging, Vol. 18, No. 10, pp. 897–908, 1999.
[40] Li, S., Markov Random Field Modeling in Computer Vision, Computer
Science Workbench Series, Springer, Berlin, 1995.
[41] Ising, E., Beitrag zur theorie des ferromagnetismus, Zeitschrift für
Physik, Vol. 31, pp. 253–258, 1925.
[42] Descombes, X., Mangin, J.-F., Pechersky, E., and Sigelle, M., Fine struc-
ture preserving markov model for image processing, In: Proceedings
of the 9th Scandinavian Conference on Image Analysis, SCIA’95, pp.
349–356, 1995.
[43] Zhang, J., The mean-field theory in EM procedures for Markov random
fields, IEEE Trans. Signal Process., Vol. 40, No. 10, pp. 2570–2583, 1992.
[44] Langan, D. A., Molnar, K. J., Modestino, J. W., and Zhang, J., Use of
the mean-field approximation in an EM-based approach to unsuper-
vised stochastic model-based image segmentation, In: Proceedings of
ICASSP’92, San Francisco, CA, March 1992, Vol. 3, pp. 57–60.
[45] Kwan, R. K.-S., Evans, A. C., and Pike, G. B., MRI simulation-based eval-
uation of image-processing and classification methods, IEEE Trans.
Med. Imaging, Vol. 18, No. 11, pp. 1085–1097, 1999. Available at
http://www.bic.mni.mcgill.ca/brainweb/.
[47] Zijdenbos, A. P., Dawant, B. M., and Margolin, R. A., Intensity correction
and its effect on measurement variability in the computer-aided analysis
of MRI, In: Proceedings of 9th International Symposium and Exhibition
on Computer Assisted Radiology, CAR’95, Springer, Berlin, pp. 216–221,
June 1995.
[48] Park, J., Gerig, G., Chakos, M., Vandermeulen, D., and Lieberman, J.,
Neuroimaging of psychiatry disease: Reliable and efficient automatic
brain tissue segmentation for increased sensitivity, Schizophrenia Res.,
Vol. 49, p. 163, 1994.
[50] Marroquin, J. L., Vemuri, B. C., Botello, S., Calderon, F., and Fernandez-
Bouzas, A., An accurate and efficient Bayesian method for automatic
segmentation of brain MRI, IEEE Trans. Med. Imaging, Vol. 21, No. 8,
pp. 934–945, 2002.
[51] Niessen, W., Vincken, K., Weickert, J., ter Haar Romeny, B., and
Viergever, M., Multiscale segmentation of three-dimensional MR brain
images, Int. J. Comput. Vision, Vol. 31, No. 2/3, pp. 185–202, 1999.
[52] Van Leemput, K., Maes, F., Vandermeulen, D., Colchester, A., and
Suetens, P., Automated segmentation of multiple sclerosis lesions by
model outlier detection, IEEE Trans. Med. Imaging, Vol. 20, No. 8,
pp. 677–688, 2001.
[53] Huber, P., Robust Statistics, Wiley series in Probability and Mathemati-
cal Statistics, Wiley, New York, 1981.
[54] Zhuang, X., Huang, Y., Palaniappan, K., and Zhao, Y., Gaussian mixture
density modeling, decomposition, and applications, IEEE Trans. Image
Process., Vol. 5, No. 9, pp. 1293–1302, 1996.
[56] Hoaglin, D., Mosteller, F., and Tukey, J., eds., Understanding Robust and
Explanatory Data Analysis, Wiley series in Probability and Mathemati-
cal Statistics, Wiley, New York, 1983.
[58] Udupa, J., Wei, L., Samarasekera, S., Miki, Y., van Buchem, M., and
Grossman, R., Multiple sclerosis lesion quantification using fuzzy-
connectedness principles, IEEE Trans. Med. Imaging, Vol. 16, No. 5,
pp. 598–609, 1997.
[59] Johnston, B., Atkins, M., Mackiewich, B., and Anderson, M., Segmen-
tation of multiple sclerosis lesions in intensity corrected multispectral
MRI, IEEE Trans. Med. Imaging, Vol. 15, No. 2, pp. 154–169, 1996.
[60] Zijdenbos, A., Dawant, B. M., Margolin, R. A., and Palmer, A. C., Mor-
phometric analysis of white matter lesions in MR images: Method and
validation, IEEE Trans. Med. Imaging, Vol. 13, No. 4, pp. 716–724, 1994.
[61] Kamber, M., Shinghal, R., Collins, D., Francis, G., and Evans, A., Model-
based 3-D segmentation of multiple sclerosis lesions in magnetic res-
onance brain images, IEEE Trans. Med. Imaging, Vol. 14, No. 3, pp.
442–453, 1995.
[62] Kikinis, R., Guttmann, C., Metcalf, D., Wells, W., III, Ettinger, G., Weiner,
H., and Jolesz, F., Quantitative follow-up of patients with multiple scle-
rosis using MRI: Technical aspects, J. Magn. Reson. Imaging, Vol. 9,
No. 4, pp. 519–530, 1999.
[63] Warfield, S., Dengler, J., Zaers, J., Guttmann, C., Wells, W., III, Ettinger,
G., Hiller, J., and Kikinis, R., Automatic identification of grey matter
structures from MRI to improve the segmentation of white matter le-
sions, J. Image Guided Surg., Vol. 1, No. 6, pp. 326–338, 1995.
[64] Zijdenbos, A., Evans, A., Riahi, F., Sled, J., Chui, J., and Kollokian, V., Au-
tomatic quantification of multiple sclerosis lesion volume using stereo-
taxic space, In: Proceedings of Visualization in Biomedical Computing,
VBC’96, Lecture Notes in Computer Science, Springer, Berlin, Vol. 1131,
pp. 439–448, 1996.
[65] Zijdenbos, A., Forghani, R., and Evans, A., Automatic quantification of
MS lesions in 3D MRI brain data sets: Validation of INSECT, In: Proceed-
ings of Medical Image Computing and Computer-Assisted Intervention,
MICCAI’98, Lecture Notes in Computer Science, Vol. 1496, Springer,
Berlin, pp. 439–448, 1998.
[66] Guttmann, C., Kikinis, R., Anderson, M., Jakab, M., Warfield, S., Killiany,
R., Weiner, H., and Jolesz, F., Quantitative follow-up of patients with
multiple sclerosis using MRI: Reproducibility, J. Magn. Reson. Imaging,
Vol. 9, No. 4, pp. 509–518, 1999.
[67] Evans, A., Frank, J., Antel, J., and Miller, D., The role of MRI in clinical
trials of multiple sclerosis: Comparison of image processing techniques,
Ann. Neurol., Vol. 41, No. 1, pp. 125–132, 1997.
[68] Filippi, M., Horsfield, M., Tofts, P., Barkhof, F., Thompson, A., and
Miller, D., Quantitative assessment of MRI lesion load in monitoring
the evolution of multiple sclerosis, Brain, Vol. 118, pp. 1601–1612,
1995.
[69] Antel, S. B., Bernasconi, A., Bernasconi, N., Collins, D. L., Kearney, R. E.,
Shinghal, R., and Arnold, D. L., Computational models of MRI character-
istics of focal cortical dysplasia improve lesion detection, NeuroImage,
Vol. 17, No. 4, pp. 1755–1760, 2002.
[70] Antel, S. B., Collins, D. L., Bernasconi, N., Andermann, F., Shinghal, R.,
Kearney, R. E., Arnold, D. L., and Bernasconi, A., Automated detection
of focal cortical dysplasia lesions using computational models of their
MRI characteristics and texture analysis, NeuroImage, Vol. 19, No. 4,
pp. 1748–1759, 2003.
[71] Ashburner, J., Friston, K., Holmes, A., and Poline, J.-B., Statis-
tical Parametric Mapping, The Wellcome Department of Cogni-
tive Neurology, University College London, London. Available at
http://www.fil.ion.ucl.ac.uk/spm/.
[72] Srivastava, S., Vandermeulen, D., Maes, F., Dupont, P., van Paesschen,
W., and Suetens, P., An automated 3D algorithm for neo-cortical thick-
ness measurement, In: Proceedings of Medical Image Computing and
[73] DeLisi, L., Tew, W., Shu-Hong, X., Hoff, A. L., Sakuma, M., Kushner,
M., Lee, G., Shedlack, K., Smith, A. M., and Grimson, R., A prospec-
tive follow-up study of brain morphology and cognition in first-episode
schizophrenic patients: Preliminary findings, Biol. Psychiatry, Vol. 38,
No. 2, pp. 349–360, 1995.
[74] Crow, T., Ball, J., Bloom, S., Brown, R., Bruton, C. J., Colter, N., Firth,
C. D., Johnstone, E. C., Owens, D. E., and Roberts, G. W., Schizophre-
nia as an anomaly of development of cerebral asymmetry, Arch. Gen.
Psychiatry, Vol. 46, pp. 1145–1150, 1989.
[75] Maes, F., Van Leemput, K., DeLisi, L. E., Vandermeulen, D., and Suetens,
P., Quantification of cerebral grey and white matter asymmetry from
MRI, In: Proceedings of Medical Image Computing and Computer-
Assisted Intervention, MICCAI’99, Lecture Notes in Computer Science,
Vol. 1679, Springer, Berlin, pp. 348–357, 1999.
[76] Marais, P., Guillemaud, R., Sakuma, M., Zisserman, A., and Brady, M.,
Visualising cerebral asymmetry, In: Visualization in Biomedical Com-
puting, Vol. 1131 of Lecture Notes in Computer Science, Höhne, K. H.
and Kikinis, R. eds., Homburg, Germany, Springer, pp. 411–416, 1999.
[77] Liu, Y., Collins, R. T., Rothfus, W. E., Automatic Bilateral symmetry
(midsagittal) plane extraction from pathological 3D neuroradiologi-
cal images, In: Medical Imaging 1998: Image Processing, Vol. 3338 of
Proc. SPIE, Hanson, K. M. ed., San Diego, CA, USA, pp. 1528–1539,
1998.
[78] Prima, S., Thirion, J.-P., Subsol, G., and Roberts, N., Automatic anal-
ysis of normal brain dissymmetry of males and females in MR im-
ages, In: Medical Image Computing and Computer-Assisted Intervention
(MICCAI’98), Vol. 1496 of Lecture Notes in Computer Science, Wells,
W. M., Colchester, A., and Delp, S., eds., Cambridge, MA, USA, Springer,
pp. 770–779, 1998.
Model-Based Brain Tissue Classification 55
[80] Pohl, K. M., Wells, W. M., III, Guimond, A., Kasai, K., Shenton, M. E.,
Kikinis, R., Grimson, W. E. L., and Warfield, S. K., Incorporating non-
rigid registration into expectation maximization algorithm to segment
MR images, In: Proceedings of the 5th International Conference on
Medical Image Computing and Computer-Assisted Intervention, Part I,
Springer-Verlag, Berlin, pp. 564–571, 2002.
[81] Wyatt, P. P. and Noble, J. A., MAP MRF joint segmentation and regis-
tration of medical images, Med. Image Anal., Vol. 7, No. 4, pp. 539–552,
2003.
[82] D’Agostino, E., Maes, F., Vandermeulen, D., and Suetens, P., An infor-
mation theoretic approach for non-rigid image registration using voxel
class probabilities, In: Proceedings of the Second International Work-
shop on Biomedical Image Registration, WBIR 2003, Lecture Notes in
Computer Science, Vol. 2717, pp. 122–131, Springer, 2003.
[83] Holmes, C., Hoge, R., Collins, L., Woods, R., Toga, A., and Evans, A.,
Enhancement of MR images using registration for signal averaging, J.
Comput Tomography, Vol. 22, 1998.
[84] Kochunov, P., Lancaster, J., Thompson, P., Toga, A., Brewer, P., Hardies,
J., and Fox, P., An optimized individual target brain in the Talairach
coordinate system, NeuroImage, Vol. 17, 2002.
[85] Vandermeulen, D., Descombes, X., Suetens, P., and Marchal, G., Un-
supervised regularized classification of multi-spectral MRI, Technical
Report KUL/ESAT/MI2/9608, Katholieke Universiteit Leuven, Feb. 1996.
Chapter 2
2.1 Introduction
Computer Vision Center, Universitat Autònoma de Barcelona, Campus UAB, Edifici O,
08193 Bellaterra (Barcelona), Spain
58 Pujol and Radeva
IVUS image. It is generally accepted that three kinds of plaque tissue are
distinguishable in IVUS images. Calcium formation is characterized by very
high echoreflectivity and absorption of the pulse emitted by the transducer.
This behavior produces a deep shadowing effect behind calcium plaques. In the
figure, calcium formation can be seen at three o’clock and from five to seven
o’clock. Fibrous plaque has medium echoreflectivity, resembling that of the
adventitia. This tissue has a good transmission coefficient, which allows the
pulse to travel through the tissue and therefore provides a wider range of
visualization. This kind of tissue can be observed from three o’clock to five
o’clock. Soft plaque, or fibro-fatty plaque, is the least echoreflective of the
three kinds of tissue. It also has a good transmission coefficient, so the
structures behind this kind of plaque remain visible. In the figure, a soft
plaque configuration is displayed from seven o’clock to three o’clock.
Because manual classification is time consuming and its outcome depends on the
specialist, the medical community has a growing interest in automatic tissue
characterization procedures. This interest is accentuated because tissue
classification by physicians requires the manual analysis of IVUS images.
The problem of automatic tissue characterization has been widely studied in
different medical fields. Because gray-level-only methods do not discriminate
reliably among the different kinds of tissue, more complex measures are
needed, usually based on texture analysis. Texture analysis has played a
prominent role in computer vision for solving tissue characterization problems
in medical imaging [2–9].
Several research groups have reported different approaches to characterizing
the tissue in IVUS images.
Supervised Texture Classification for Intravascular Tissue Characterization 59
Vandenberg in [10] base their contribution on reducing the noise of the image
to obtain a clear representation of the tissue. The noise reduction is achieved
by averaging sets of images when the least variance in diameter of the IVUS
occurs. At the end, a fuzzy-logic-based expert is used to discriminate among
the tissues.
Nailon and McLaughlin have devoted considerable effort to IVUS tissue
characterization. In [11] they use classic Haralick texture statistics to
discriminate among tissues. In [12] the authors propose the use of
co-occurrence matrix texture analysis and fractal texture analysis to
characterize intravascular tissue. Thirteen features plus the fractal dimension
derived from Brownian motion are used for this task. The conclusion is that
fractal dimension is unable to discriminate between calcium and fibrous plaque
but helps in separating fibrous from lipidic plaque. Co-occurrence matrices, on
the other hand, are well suited for the overall classification. In [13], it is
stated that the discriminative power of fractal dimension is poor when trying
to separate fibrotic tissue, lipidic tissue, and foam cells. The method used is
based on fractal dimension estimation techniques (box-counting, Brownian
motion, and frequency domain).
Spencer in [14] center their work on spectral analysis. Different features are
compared: mean power, maximum power, spectral slope, and 0 Hz intercept.
The work concludes that the 0 Hz spectral slope is the most discriminative
feature.
Dixon in [15] use co-occurrence matrices and discriminant analysis to
evaluate the different kinds of tissue in IVUS images.
Ahmed and Leyman in [16] use a radial transform and correlation for pattern
matching. The features used are higher order statistics such as kurtosis,
skewness, and cumulants up to fourth order. The results provided appear to
have a fairly good visual recognition rate.
The work of de Korte and van der Steen [17] opens a new line of research based
on assessing the local strain of the atherosclerotic vessel wall to identify
different plaque components. This very promising technique, called
elastography, estimates the radial strain by performing cross-correlation
analysis on pairs of IVUS images at a certain intracoronary pressure.
Probably one of the most interesting works in this field is the one by Zhang
and Sonka [18]. This work is much more complex, as it tries to evaluate the
full morphology of the vessel by detecting the plaque and adventitia borders
and characterizing the different kinds of tissue. The tissue discrimination is
done using a combination of well-known techniques previously reported in the
To introduce the texture feature extraction methods we divide them into two
groups. The first group, which forms the statistics-related methods, comprises
co-occurrence matrix measures, accumulation local moments, fractal dimension,
and local binary patterns. All these methods are in some way related to
statistics. Co-occurrence matrix measures are second-order measures associated
with the probability density function estimate provided by the co-occurrence
matrix. Accumulation local moments are directly related to statistics. Fractal
dimension is an approximation of the roughness of a texture. Local binary
patterns provide a measure of the local inhomogeneity based on an “averaging”
process. The second group, which forms the analytic kernel-based extraction
techniques, comprises the Gabor bank of filters, derivatives of Gaussian
filters, and wavelet decomposition. These three methods are derived from
analytic functions that are sampled to form a set of filters, each focused on
the extraction of a certain feature.
In 1962 Julesz [29] showed the importance of texture segregation using
second-order statistics. Since then, different tools have been used to exploit
this issue. The gray-level co-occurrence matrix is a well-known statistical
tool for extracting second-order texture information from images [22]. In the
co-occurrence method, the relative frequencies of gray-level pairs of pixels at
a certain relative displacement are computed and stored in a matrix, the
co-occurrence matrix P. The co-occurrence matrix can be thought of as an
estimate of the joint probability density function of gray-level pairs in an
image. For G gray levels in the image, P will be of size G × G. If G is large,
the number of pixel pairs contributing to each element $p_{i,j}$ of P is low,
and the statistical significance is poor. On the other hand, if the number of
gray levels is low, much of the texture information may be lost in the image
quantization. The element values in the matrix, when normalized, lie in the
interval [0, 1], and the sum of all element values is equal to 1.
where I(l, m) is the image at pixel (l, m), D is the distance between pixels,
and θ is the angle. It has been shown by other researchers [21, 30] that the
nearest neighbor pairs at distance D at orientations θ = {0°, 45°, 90°, 135°}
are the minimum set needed to describe the second-order statistical texture
measures. Figure 2.2 illustrates the method graphically.
The main idea is to create a “histogram” of the occurrences of two pixels of
certain gray levels at a given distance and a fixed angle. In practice, we add
one to the cell of the matrix indexed by the gray levels of two pixels (the
gray level of one pixel gives the row and that of the other the column of the
matrix) that lie at the predefined distance and angle.
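The counting procedure just described can be sketched in a few lines of Python.
This is only an illustrative sketch, not the implementation behind the results
reported in this chapter: the function name `cooccurrence_matrix`, the toy
4 × 4 image, and the convention that the displacement at angle θ is D times the
unit offsets (0, 1), (−1, 1), (−1, 0), (−1, −1) for 0°, 45°, 90°, 135° are our
assumptions.

```python
# Unit (row, column) offsets per orientation; rows grow downward (assumed).
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def cooccurrence_matrix(image, levels, distance, angle_deg):
    """Normalized gray-level co-occurrence matrix for one (D, theta) pair."""
    dr, dc = (distance * step for step in OFFSETS[angle_deg])
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Gray level of the first pixel selects the row,
                # gray level of the second pixel selects the column.
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    if total == 0:
        return [[0.0] * levels for _ in range(levels)]
    return [[n / total for n in row] for row in counts]

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
P = cooccurrence_matrix(image, levels=4, distance=1, angle_deg=0)
```

With this toy image, 12 horizontal pixel pairs contribute to P, so each cell
holds a multiple of 1/12 and the cells sum to 1.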
Once the matrix is computed, several characterizing measures are extracted.
Many of these features are derived by weighting each of the matrix element
values and then summing these weighted values to form the feature value. The
weight applied to each element is based on a feature-weighting function, so by
varying this function, different texture information can be extracted from the
matrix. We present here some of the most important measures that characterize
the co-occurrence matrices: energy, entropy, inverse difference moment, shade,
inertia, and prominence [30]. Let us introduce some notation for the definition
of the features:
Hence, we create a feature vector for each of the pixels by assigning each fea-
ture measure to a component of the feature vector. Given that we have four
different orientations and the six measures for each orientation, the feature
vector is a 24-dimensional vector for each pixel and for each distance. Since we
have used two distances D = 2 and D = 3, the final vector is a 48-dimensional
vector.
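As a hedged illustration of how the 48-dimensional vector is assembled, the
sketch below computes six measures from a normalized co-occurrence matrix. The
formulas are the definitions commonly given in the texture literature for
these measures and are assumed here rather than taken from this chapter; the
function name `glcm_features` and the toy matrix are hypothetical.

```python
import math

def glcm_features(P):
    """Six co-occurrence measures of a normalized matrix P (assumed formulas)."""
    G = len(P)
    cells = [(i, j, P[i][j]) for i in range(G) for j in range(G)]
    mu_x = sum(i * p for i, j, p in cells)   # mean gray level along rows
    mu_y = sum(j * p for i, j, p in cells)   # mean gray level along columns
    return {
        "energy": sum(p * p for _, _, p in cells),
        "entropy": -sum(p * math.log(p) for _, _, p in cells if p > 0),
        "inverse_difference_moment":
            sum(p / (1 + (i - j) ** 2) for i, j, p in cells),
        "shade": sum((i + j - mu_x - mu_y) ** 3 * p for i, j, p in cells),
        "inertia": sum((i - j) ** 2 * p for i, j, p in cells),
        "prominence": sum((i + j - mu_x - mu_y) ** 4 * p for i, j, p in cells),
    }

features = glcm_features([[0.5, 0.0], [0.0, 0.5]])

# One block of six measures per orientation and per distance.
orientations, distances = (0, 45, 90, 135), (2, 3)
vector_length = len(features) * len(orientations) * len(distances)
```

Here `vector_length` evaluates to 6 × 4 × 2 = 48, matching the dimension
stated in the text.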
Figure 2.3 shows responses for different measures on the co-occurrence
matrices. Although a straightforward interpretation of the feature extraction
response is not easy, some deductions can be made by observing the figures.
Figure 2.3(b) shows the shade measure; as its name indicates, it is related to
the shadowed areas in the image and thus localizes the shadowing behind the
calcium plaque. Figure 2.3(c) shows the inverse difference moment response;
this measure seems to be related to the first derivative of the image,
enhancing contours. Figure 2.3(d) depicts the output for the inertia measure,
which seems to have some relationship with the local homogeneity of the image.
off errors and small perturbations in the input data [31], the reverse accumulation
moments are recommended.
The reverse accumulation moment of order (k − 1, l − 1) of matrix $I_{ab}$ is
the value of $I_{ab}[1, 1]$ after accumulating its columns bottom-up k times
(i.e., after applying k times the assignment $I_{ab}[a − i, j] \leftarrow
I_{ab}[a − i, j] + I_{ab}[a − i + 1, j]$, for i = 1 to a − 1, and for j = 1 to
b), and accumulating the resulting first row from right to left l times (i.e.,
after applying l times the assignment $I_{ab}[1, b − j] \leftarrow
I_{ab}[1, b − j] + I_{ab}[1, b − j + 1]$, for j = 1 to b − 1). The reverse
accumulation moment matrix is defined so that $R_{mn}[k, l]$ is the reverse
accumulation moment of order (k − 1, l − 1).
Consider the matrix in the following example:
\[
\begin{pmatrix} 0 & 1 & 2 \\ 1 & 1 & 1 \\ 4 & 2 & 3 \end{pmatrix}
\]
The reverse accumulation moment of order (1, 2) of this matrix, obtained with
k = 2 column accumulations followed by l = 3 accumulations of the first row,
is 119.
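The accumulation procedure can be checked with a short sketch. The function
name `reverse_accumulation_moment` is hypothetical, and we assume that the
bottom-up accumulation starts at the second-to-last row, since the last row has
no row below it to add.

```python
def reverse_accumulation_moment(matrix, k, l):
    """Value of I[1, 1] after k bottom-up column accumulations and
    l right-to-left accumulations of the first row, i.e. the reverse
    accumulation moment of order (k - 1, l - 1)."""
    m = [row[:] for row in matrix]          # work on a copy
    a, b = len(m), len(m[0])
    for _ in range(k):
        for i in range(a - 2, -1, -1):      # second-to-last row up to the top
            for j in range(b):
                m[i][j] += m[i + 1][j]
    first_row = m[0]
    for _ in range(l):
        for j in range(b - 2, -1, -1):      # second-to-last column leftward
            first_row[j] += first_row[j + 1]
    return first_row[0]

example = [[0, 1, 2],
           [1, 1, 1],
           [4, 2, 3]]
moment = reverse_accumulation_moment(example, k=2, l=3)   # order (1, 2)
```

For the example matrix, the two column accumulations leave the first row as
(14, 9, 13), and the three row accumulations then give 119.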
The set of moments alone is not sufficient to obtain good texture features in
certain images. Some iso-second-order texture pairs, which are preattentively
discriminable by humans, would have the same average energy over finite
regions. However, their distribution would be different for the different
textures. One solution suggested by Caelli is to introduce a nonlinear
transducer that maps moments to texture features [32]. Several functions have
been proposed in the literature: logistic, sigmoidal, power function, or
absolute deviation of feature vectors from the mean [23]. The function we have
chosen is the hyperbolic tangent function, which is logistic in shape. Using
the accumulation moments image $I_m$ and the nonlinear operator
$|\tanh(\sigma (I_m - \bar{I}_m))|$, an “average” is performed throughout the
region of interest. The parameter σ controls the shape of the logistic
function. Therefore each textural feature is the result of applying the
nonlinear operator to the computed moments. If n = k · l moments are computed
over the image, then the dimension of the feature vector is n. Hence, an
n-dimensional point is associated with each pixel of the image.
Figure 2.4: Accumulation local moments response. (a) Original image. (b)
Accumulation local moment of order (3,1).
Figure 2.4 shows the response of moment (3,1) on an IVUS image. In this
figure, the response seems to have a smoothing and enhancing effect, clearly
resembling diffusion techniques.
Another classic tool for texture description is fractal analysis [13, 33],
characterized by the fractal dimension. Roughly speaking, we talk about
fractal structures when a geometric shape can be subdivided into parts, each of
which is approximately a reduced copy of the whole (this property is also
referred to as self-similarity). The introduction of fractals by Mandelbrot
[27] allowed a characterization of complex structures that could not be
described by a single measure using Euclidean geometry. This measure is the
fractal dimension, which is related to the degree of irregularity of the
surface texture.
The fractal structures can be divided into two subclasses: deterministic
fractals and random fractals. Deterministic fractals are strictly
self-similar, that is, they appear identical over a range of magnification
scales. Random fractals, on the other hand, are statistically self-similar:
the similarity between two scales of the fractal is governed by a statistical
relationship.
The fractal dimension represents the disorder of an object. The higher the
dimension, the more complex the object is. Contrary to the Euclidean dimension,
the fractal dimension is not constrained to integer values.
The concept of fractals can be easily extended to image analysis if we
consider the image as a three-dimensional surface in which the height at each
point is given by the gray value of the pixel.
Different approaches have been proposed to compute the fractal dimension
of an object. Herein we consider only three classical approaches: box-counting,
Brownian motion, and Fourier analysis.
\[ N r^{D} = 1 \]
where E is the mean and H the Hurst coefficient. The fractal dimension is
related to H in the following way: D = 3 − H. As in the former method for
calculating the fractal dimension, the mean difference of intensities is
calculated for different scales (each scale given by the Euclidean distance
between two pixels), and the slope of the regression line between
$\log E(|I(p_1) - I(p_2)|)$ and $\log \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
gives the Hurst parameter.
Figure 2.5: Fractal dimension from box-counting response. (a) Original image.
(b) Fractal dimension response with neighborhoods of 10 × 10.
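The regression just described can be sketched as follows. This is a minimal
illustration under stated assumptions: only horizontal and vertical pixel
pairs are sampled (so each scale is an integer distance d), the regression is
ordinary least squares on log-log values, and the function name
`hurst_exponent` is hypothetical.

```python
import math

def hurst_exponent(image, max_scale):
    """Slope of log E|I(p1) - I(p2)| against log distance (assumed sampling)."""
    rows, cols = len(image), len(image[0])
    xs, ys = [], []
    for d in range(1, max_scale + 1):
        diffs = []
        for r in range(rows):
            for c in range(cols):
                if c + d < cols:                      # horizontal pair
                    diffs.append(abs(image[r][c + d] - image[r][c]))
                if r + d < rows:                      # vertical pair
                    diffs.append(abs(image[r + d][c] - image[r][c]))
        mean_diff = sum(diffs) / len(diffs)
        if mean_diff > 0:
            xs.append(math.log(d))
            ys.append(math.log(mean_diff))
    if len(xs) < 2:
        raise ValueError("need at least two scales with nonzero differences")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# A smooth intensity ramp: differences grow linearly with distance.
ramp = [[float(c) for c in range(8)] for _ in range(8)]
H = hurst_exponent(ramp, max_scale=4)
D = 3 - H
```

For the ramp, the mean difference at distance d is d/2, so the slope is 1 and
D = 2, the dimension of a smooth surface. Rougher textures give smaller H and
therefore a larger D.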
Local binary patterns [28] are a feature extraction operator used for
detecting “uniform” local binary patterns in circular neighborhoods at any
quantization of the angular space and at any spatial resolution. The operator
is derived from a circularly symmetric neighbor set of P members on a circle
of radius R. It is denoted by $LBP^{riu2}_{P,R}$. Parameter P controls the
quantization of the angular space, and R determines the spatial resolution of
the operator. Figure 2.6 shows typical neighborhood sets. To achieve
gray-scale invariance, the gray value of the center pixel ($g_c$) is
subtracted from the gray values of the circularly symmetric neighborhood
$g_p$ ($p = 0, 1, \ldots, P - 1$), and each neighbor is assigned a value of 1
if the difference is nonnegative and 0 if it is negative:
\[ s(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
\[ LBP_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c) \cdot 2^{p} \]
To achieve rotation invariance, the pattern set is rotated as many times as
necessary to achieve a maximal number of the most significant bits, starting
always from the same pixel. The last stage of the operator consists of keeping
the information of “uniform” patterns while filtering out the rest. This is
achieved using a transition count function U, which counts the number of
transitions between 0 and 1 along the circular pattern:
\[ U(LBP_{P,R}) = |s(g_{P-1} - g_c) - s(g_0 - g_c)| + \sum_{p=1}^{P-1} |s(g_p - g_c) - s(g_{p-1} - g_c)| \]
Therefore,
\[ LBP^{riu2}_{P,R} = \begin{cases} LBP^{ri}_{P,R} & \text{if } U(LBP_{P,R}) \le 2 \\ P + 1 & \text{otherwise.} \end{cases} \]
Figure 2.7: Local binary pattern response. (a) Original image. (b) Local binary
pattern output with parameters R = 3, P = 24.
Figure 2.7 shows an example of an IVUS image filtered using a uniform rotation
invariant local binary pattern with values P = 24, R = 3. The feature extraction
image displayed in the figure looks like a discrete response focussed on the
structure shape and homogeneity.
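A compact sketch of the operator for a single pixel is given below. It
follows a widely used convention in the local binary pattern literature in
which a uniform pattern is labelled by its number of 1 bits, which makes the
label independent of rotation; this labelling, the function name `lbp_riu2`,
and the assumption that the P neighbor gray values have already been sampled
in circular order are ours.

```python
def lbp_riu2(neighbors, center):
    """Rotation-invariant uniform LBP label for one pixel.

    neighbors: the P gray values g_0..g_{P-1} in circular order.
    """
    bits = [1 if g - center >= 0 else 0 for g in neighbors]   # s(g_p - g_c)
    P = len(bits)
    # bits[p - 1] wraps to bits[P - 1] when p = 0, which closes the circle
    # and reproduces the first term of the transition count U.
    transitions = sum(abs(bits[p] - bits[p - 1]) for p in range(P))
    return sum(bits) if transitions <= 2 else P + 1

uniform_label = lbp_riu2([5, 5, 5, 1, 1, 1, 1, 1], center=3)
rotated_label = lbp_riu2([1, 1, 5, 5, 5, 1, 1, 1], center=3)
nonuniform_label = lbp_riu2([5, 1, 5, 1, 5, 1, 5, 1], center=3)
```

The first two neighborhoods are rotations of each other and receive the same
label, while the alternating pattern has eight transitions and falls into the
single "nonuniform" bin P + 1.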
\[ L(\cdot\,; t) = g(\cdot\,; t) * f \]
\[ g(x; t) = \frac{1}{(2\pi t)^{N/2}}\, e^{-x^{T} x/(2t)} = \frac{1}{(2\pi t)^{N/2}}\, e^{-\sum_{i=1}^{N} x_i^{2}/(2t)}, \qquad x \in \mathbb{R}^{N},\; x_i \in \mathbb{R} \]
The square root of the scale parameter, $\sigma = \sqrt{t}$, is the standard
deviation of the kernel g and is a natural measure of spatial scale in the
smoothed signal at scale t. From this scale-space representation, multiscale
spatial derivatives can be defined by
F = {{∂ n, G n}, n ∈ }
Figure 2.8 shows some of the responses for the derivative of Gaussian bank of
filters for σ = 2. Figures 2.8(b), 2.8(c), and 2.8(d) display the first, second, and
third derivatives of Gaussian, respectively.
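To make the filter bank concrete, the following sketch samples a
one-dimensional Gaussian and its first three derivatives and applies them by
convolution. It is a simplified illustration, not the chapter's
implementation: the truncation radius of 4σ, the edge replication at the
borders, and the function names are assumptions, and a two-dimensional bank
would apply such kernels along each image axis.

```python
import math

def gaussian_derivative_kernel(sigma, order):
    """Sampled 1D Gaussian (order 0) or its derivative of order 1 to 3."""
    t = sigma ** 2                        # scale parameter, sigma = sqrt(t)
    radius = int(math.ceil(4 * sigma))    # truncation radius (assumed)
    kernel = []
    for x in range(-radius, radius + 1):
        g = math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)
        factor = {0: 1.0,
                  1: -x / t,
                  2: (x * x - t) / t ** 2,
                  3: (3 * t * x - x ** 3) / t ** 3}[order]
        kernel.append(factor * g)
    return kernel

def convolve(signal, kernel):
    """Convolution with edge replication; output has the input length."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = min(max(n - (k - radius), 0), last)
            acc += weight * signal[idx]
        out.append(acc)
    return out

step = [0.0] * 20 + [1.0] * 20            # an ideal edge at index 20
smoothed = convolve(step, gaussian_derivative_kernel(2.0, 0))
edge = convolve(step, gaussian_derivative_kernel(2.0, 1))
```

The zeroth-order kernel smooths the step, while the first-derivative response
peaks at the edge, which is the contour-enhancing behavior visible in the
figure.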
2.2.2.2 Wavelets
The multiresolution analysis (MRA) tries to build orthonormal bases for a
dyadic grid, where $a_0 = 2$, $b_0 = 1$, which in addition have compact
support. Finally, we can regard the coefficients $d_{m,n}$ of the discrete
wavelet transform as samples of the convolution of the signal f(t) with
different filters $\psi_m(-t)$, where $\psi_m(t) = a_0^{-m/2}\, \psi(a_0^{-m} t)$:
\[ y_m(t) = \int f(s)\, \psi_m(s - t)\, ds, \qquad d_{m,n} = y_m(n a_0^{m}) \]
Figure 2.9 shows the dual effect of the mother wavelet shrinking as the
frequency increases and the translation step decreasing as the frequency
increases. The mother wavelet keeps its shape, but if high-frequency analysis
is desired the spatial support of the wavelet has to decrease. On the other
hand, if the whole real line has to be covered by translations of the mother
wavelet, then as the spatial support of the wavelet decreases, the number of
translations needed to cover the real line increases. This is unlike the
Fourier transform, where the translations of the analysis are at the same
distance for all frequencies.
The choice of a representation of the wavelet transform leads us to define
the concept of a frame. A frame is a complete set of functions that, though able
to span $L^2(\mathbb{R})$, is not a basis because it lacks the property of
linear independence.
MRA, proposed in [26], is another representation in which the signal is
decomposed into an approximation at a certain level L plus L detail terms of
higher resolutions. The representation is an orthonormal decomposition instead
of a redundant frame, and therefore the number of samples that defines a signal
is the same as the number of coefficients of its transform. An MRA consists
where
\[ \langle f, g \rangle = \int_{-\infty}^{+\infty} f(t)\, g(t)\, dt \]
Earlier we pointed out the nesting condition of the $V_j$ spaces,
$V_j \subset V_{j-1}$. Now, any $f \in V_{j-1}$ can be split into a component
in $V_j$ and a component orthogonal to all the functions in $V_j$; that is, we
divide $V_{j-1}$ into two parts, $V_j$ and another space $W_j$, such that if
$f \in V_j$ and $g \in W_j$, then $f \perp g$. $W_j$ is the orthogonal
complement of $V_j$ in $V_{j-1}$:
\[ V_{j-1} = V_j \oplus W_j \]
From these equations some conclusions can be drawn. First, the projection of a
signal f onto a space $V_j$ gives a new signal $P_j f$, an approximation of
the initial signal. Second, since we have a hierarchy of spaces, $P_{j-1} f$
is a better (more reliable) approximation than $P_j f$. Since $V_{j-1}$ can be
divided into two subspaces $V_j$ and $W_j$, if $V_j$ is an approximation space
then $W_j$, its orthogonal complement, is the detail space. The smaller j is,
the finer the details.
This can be viewed as a decomposition tree (see Fig. 2.10). The approximation
can be seen at the top-left side of the image, surrounded by the successive
details. The further away a detail is located, the finer the information it
provides. So, the details at the bottom and at the right side of the image
carry information about the finest details and the smallest structures of the
decomposed image. Therefore, we have a feature vector composed of the
different details and the approximation for each of the pixels.
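The splitting of each approximation space into a coarser approximation plus a
detail space can be illustrated on a one-dimensional signal. The sketch below
uses the Haar wavelet purely because it is the simplest orthonormal case; the
chapter does not state which wavelet is used, and the function names are
hypothetical.

```python
import math

def haar_step(signal):
    """Split a signal of even length into approximation and detail halves."""
    s = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / s for a, b in pairs]   # projection onto V_j
    detail = [(a - b) / s for a, b in pairs]   # projection onto W_j
    return approx, detail

def haar_decompose(signal, levels):
    """Approximation at the coarsest level plus one detail band per level."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)                 # details[0] is the finest band
    return approx, details

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_decompose(signal, levels=3)
n_coefficients = len(approx) + sum(len(d) for d in details)
```

The decomposition yields exactly as many coefficients as there are samples
(1 + 4 + 2 + 1 = 8 here) and preserves the signal energy, as expected from an
orthonormal decomposition rather than a redundant frame.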
Gabor filters represent another multiresolution technique that relies on scale and
direction of the contours [25,37]. The Gabor filter consists of a two-dimensional
sinusoidal plane wave of a certain orientation and frequency that is modulated in
amplitude by a two-dimensional Gaussian envelope. The spatial representation
of the Gabor filter is as follows:
\[ h(x, y) = \exp\left\{ -\frac{1}{2} \left[ \frac{x^{2}}{\sigma_x^{2}} + \frac{y^{2}}{\sigma_y^{2}} \right] \right\} \cos(2\pi u_0 x + \phi) \]
where $u_0$ and $\phi$ are the frequency and phase of the sinusoidal plane wave
along the x axis, and $\sigma_x$ and $\sigma_y$ are the space constants of the
Gaussian envelope along the x and y axes, respectively. Filters at different
orientations can be created by rigid rotation of the x–y coordinate system.
An interesting property of this kind of filter is its frequency and
orientation selectivity. This is best displayed in the frequency domain.
Figure 2.11 shows the filter area in the frequency domain. We can observe
that each of the filters has a certain domain defined by one of the leaves of
the Gabor “rose.” Thus, each filter responds to a certain orientation and at a
certain detail level. The wider the range of orientations, the smaller the
spatial dimensions of the filter and the smaller the details it captures, as
bandwidth in the frequency domain is inversely related to filter extent in the
space domain. Therefore, Gabor filters provide a trade-off between
localization or resolution in the spatial and the spatial-frequency domains.
As mentioned, different filters emerge from rotating the x–y coordinate
system. For practical purposes one can use four angles $\theta_0$ = 0°, 45°,
90°, 135°. For an image array with a width of $N_c$ pixels (with $N_c$ a power
of 2), the following values of $u_0$ are suggested [25, 37]:
\[ 1\sqrt{2},\; 2\sqrt{2},\; 4\sqrt{2},\; \ldots,\; \text{and } (N_c/4)\sqrt{2} \]
cycles per image width. Therefore, the orientations and bandwidths of such
filters vary by 45° and 1 octave. These parameters are chosen because there is
physiological evidence that the frequency bandwidth of simple cells in the
visual cortex is about 1 octave, and Gabor filters try to mimic part of the
human perceptual system.
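A small sketch may help fix the parameters of such a bank. It samples the
spatial Gabor function on a square grid, rotates the coordinates to obtain the
other orientations, and lists radial frequencies spaced one octave apart up to
a quarter of the image width. The function names, the kernel size, and the
conversion from cycles per image width to cycles per pixel are illustrative
assumptions.

```python
import math

def gabor_kernel(size, u0, sigma_x, sigma_y, theta_deg=0.0, phi=0.0):
    """Sampled Gabor function; u0 is in cycles per pixel, size is odd."""
    half = size // 2
    theta = math.radians(theta_deg)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rigid rotation of the x-y coordinate system.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-0.5 * (xr ** 2 / sigma_x ** 2
                                        + yr ** 2 / sigma_y ** 2))
            row.append(envelope * math.cos(2 * math.pi * u0 * xr + phi))
        kernel.append(row)
    return kernel

def radial_frequencies(image_width):
    """Octave-spaced frequencies in cycles per image width."""
    freqs, f = [], 1
    while f <= image_width // 4:
        freqs.append(f * math.sqrt(2))
        f *= 2
    return freqs

width = 256
freqs = radial_frequencies(width)
bank = [gabor_kernel(15, u0 / width, 4.0, 4.0, theta)
        for u0 in freqs for theta in (0, 45, 90, 135)]
```

For a 256-pixel-wide image this gives seven radial frequencies and, with four
orientations, a bank of 28 filters.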
The Gabor function is an approximation to a wavelet. However, though
admissible, it does not result in an orthogonal decomposition, and therefore a
transformation based on Gabor filters is redundant. On the other hand, Gabor
filtering is designed to be nearly orthogonal, reducing the amount of overlap
between filters.
Figure 2.12 shows different responses for different filters of the spectrum.
Figures 2.12(a) and 2.12(b) correspond to the inner filters with reduced
frequency bandwidth displayed in Fig. 2.11. It can be seen that they deliver
only coarse information about the structure, and the borders are far from
their original location. In the same way, Figs. 2.12(c) and 2.12(d) correspond
to filters located on an outer ring, and therefore respond to details in the
image.
Figure 2.12: Gabor filter bank example responses. (a) Gabor vertical energy of
a coarse filter response. (b) Gabor horizontal energy of a coarse filter response.
(c) Gabor vertical energy of a detail filter response. (d) Gabor horizontal energy
of a detail filter response.
It can be observed that the feature extraction process is a transformation
of the original two-dimensional image domain to a feature space that will
probably have a different dimension. In some cases, the dimension of the
feature space remains low, as with fractal dimension and local binary
patterns, which try to describe the texture present in the image with very few
features. However, several feature spaces require higher dimensions, such as
accumulation local moments, co-occurrence matrix measures, or derivatives of
Gaussian. Table 2.1 shows the dimensionality of the different spaces generated
by the feature extraction process in our texture-based IVUS analysis.
The next step after the feature extraction is the classification process.
Because of the disparity in the dimensionality of the feature spaces, we have
to choose a classification scheme that is able to deal with high-dimensional
feature data.
\[ k_j = \max\{k_1, \ldots, k_L\} \;\rightarrow\; X \in \omega_j \]
\[ k_1 + \cdots + k_L = k \]
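The voting rule above can be sketched as a k-nearest neighbor classifier. The
Euclidean distance, the function name `knn_classify`, and the toy
two-dimensional feature points with "soft" and "hard" labels are assumptions
made for illustration only.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, x, k):
    """Assign x to the class with the most votes among its k nearest samples."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], x))
    # votes[j] plays the role of k_j; the counts add up to k.
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["soft", "soft", "soft", "hard", "hard", "hard"]
label_a = knn_classify(points, labels, (0.5, 0.5), k=3)
label_b = knn_classify(points, labels, (5.5, 5.5), k=3)
```

Because every query requires distances to all training samples, the cost grows
with both the number of samples and the dimension of the feature space.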
\[ P(w_j \mid x) = P(x \mid w_j) \]
Figure 2.14: (a) Graphic example of the maximum likelihood classification as-
suming an underlying density model. (b) Unknown probability density function
estimation by means of a 2 Gaussian mixture model. (c) Resulting approximation
of the unknown density function.
\[ p_i(x \mid \Theta) = \sum_{k=1}^{C} p_k(x \mid \theta_k)\, P_k \tag{2.2} \]
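As an illustration of Eq. (2.2), the sketch below evaluates a one-dimensional
Gaussian mixture and uses the resulting class-conditional densities for a
maximum likelihood decision. The univariate components, the class names "A"
and "B", the parameter values, and the function names are all hypothetical;
estimating the mixture parameters is outside the scope of this sketch.

```python
import math

def gaussian_pdf(x, mean, std):
    """Univariate normal density."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def mixture_pdf(x, components):
    """Sum over components of prior P_k times component density p_k(x)."""
    return sum(prior * gaussian_pdf(x, mean, std)
               for prior, mean, std in components)

def classify(x, class_models):
    """Maximum likelihood decision: pick the class with the largest density."""
    return max(class_models, key=lambda name: mixture_pdf(x, class_models[name]))

# Each component is (prior, mean, standard deviation); priors sum to 1.
class_models = {
    "A": [(1.0, 0.0, 1.0)],
    "B": [(0.5, -2.0, 1.0), (0.5, 2.0, 1.0)],
}
decision_center = classify(0.0, class_models)
decision_side = classify(2.0, class_models)
```

Class "B" is bimodal, so a single Gaussian would describe it poorly, while the
two-component mixture captures both modes.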
\[ y_k = W^{T} x_k, \qquad k = 1, \ldots, M \]
where M is the number of samples and µ is the mean vector of all samples.
Applying the linear transformation $W^T$, the scatter of the transformed
feature vectors is $W^T S_T W$. PCA is defined so as to maximize the
determinant of the scatter of the transformed feature vectors:
\[ y = A^{t} (x - \mu) \]
analysis [38, 39] seeks a transformation matrix W such that the ratio of the
between-class scatter and the within-class scatter is maximized. Let the
between-class scatter $S_B$ be defined as follows:
\[ S_B = \sum_{i=1}^{c} N_i (\mu_i - \mu)(\mu_i - \mu)^{T} \tag{2.3} \]
where $\mu_i$ is the mean value of class $X_i$, µ is the mean value of the
whole data, c is the number of classes, and $N_i$ is the number of samples in
class $X_i$. Let the within-class scatter be
\[ S_W = \sum_{i=1}^{c} \sum_{x_{k,i} \in X_i} (x_{k,i} - \mu_i)(x_{k,i} - \mu_i)^{T} \tag{2.4} \]
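Equations (2.3) and (2.4) translate directly into code. The sketch below
computes both scatter matrices for a toy two-class, two-dimensional data set
using plain lists; the helper names and the data are hypothetical.

```python
def mean_vector(points):
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def add_scaled(A, B, scale=1.0):
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scatter_matrices(classes):
    """Between-class (S_B) and within-class (S_W) scatter matrices."""
    dim = len(classes[0][0])
    mu = mean_vector([p for cls in classes for p in cls])
    S_B = [[0.0] * dim for _ in range(dim)]
    S_W = [[0.0] * dim for _ in range(dim)]
    for cls in classes:
        mu_i = mean_vector(cls)
        diff = [a - b for a, b in zip(mu_i, mu)]
        S_B = add_scaled(S_B, outer(diff, diff), scale=len(cls))   # N_i term
        for x in cls:
            d = [a - b for a, b in zip(x, mu_i)]
            S_W = add_scaled(S_W, outer(d, d))
    return S_B, S_W

classes = [[(0.0, 0.0), (2.0, 0.0)],
           [(4.0, 2.0), (6.0, 2.0)]]
S_B, S_W = scatter_matrices(classes)
```

In this example the classes differ along both axes but vary internally only
along the first, so $S_W$ has a single nonzero entry.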
• For t = 1..T
  – Normalize weights:
\[ w_{t,i} \leftarrow \frac{w_{t,i}}{\sum_{j=1}^{n} w_{t,j}} \]
Figure 2.17: Error rates associated with the AdaBoost process. (a) Weak single
classification error. (b) Strong classification error on the training data. (c) Test
error rate.
threshold function. This scheme is easy to embed in the AdaBoost process since
it relies on the weights to make the classification.
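To show how a threshold weak classifier fits into boosting, here is a minimal
sketch of textbook discrete AdaBoost on a single feature. It is not the
chapter's procedure: the exponential weight update, the coefficient
α = ½ ln((1 − ε)/ε), the exhaustive threshold search, the function names, and
the toy data are all assumptions chosen for brevity.

```python
import math

def stump_predict(x, threshold, polarity):
    """Threshold weak classifier with labels in {-1, +1}."""
    return polarity if x >= threshold else -polarity

def train_adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        total = sum(weights)
        weights = [w / total for w in weights]            # normalize weights
        best = None
        for threshold in sorted(set(xs)):
            for polarity in (1, -1):
                # Weighted error: total weight of misclassified samples.
                err = sum(w for w, x, y in zip(weights, xs, ys)
                          if stump_predict(x, threshold, polarity) != y)
                if best is None or err < best[0]:
                    best = (err, threshold, polarity)
        err, threshold, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)             # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the others.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
    return ensemble

def adaboost_predict(ensemble, x):
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = train_adaboost(xs, ys, rounds=3)
predictions = [adaboost_predict(ensemble, x) for x in xs]
```

No single threshold separates this toy data, yet the weighted vote of three
thresholds classifies every training sample correctly.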
Another approach to consider is to model the feature points as Gaussian
distributions. This allows us to define a simple scheme by calculating the
weighted mean and weighted covariance of the classes at each step t of the
process:
\[ \mu_t^{j} = \sum_i w_{i,t}\, x_i^{j}, \qquad \Sigma_t^{j} = \sum_i w_{i,t}\, (x_i^{j} - \mu_t^{j})^{2} \]
for each point $x_i^{j}$ in class $C_j$, where $w_{i,t}$ are the weights of the
data points.
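For a single feature, the weighted statistics of one class can be computed as
in the sketch below. We assume that the weights of the class are renormalized
to sum to one before use; the function name and the numbers are hypothetical.

```python
def weighted_gaussian(points, weights):
    """Weighted mean and variance of the scalar feature values of one class."""
    total = sum(weights)
    w = [wi / total for wi in weights]        # renormalize within the class
    mean = sum(wi * x for wi, x in zip(w, points))
    var = sum(wi * (x - mean) ** 2 for wi, x in zip(w, points))
    return mean, var

mean, var = weighted_gaussian([1.0, 2.0, 3.0], [1.0, 1.0, 2.0])
```

Doubling the weight of the last sample pulls the mean from 2.0 toward 3.0,
which is how the boosting weights steer the Gaussian model toward the samples
that are currently hard to classify.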
If feature selection is desired, this scheme is highly constrained to the N
features of the N-dimensional feature space. If N is not large enough, the
procedure may not improve its performance.
The feature extraction and the classification processes are the central parts
of the tissue characterization framework. The next section explains the
different frameworks in which these processes are applied for tissue
characterization of IVUS images and provides quantitative results of their
performance.
the physicians. These results have been used to validate against the manually
segmented plaque regions (Fig. 2.18(b)).
Therefore, though we are concerned with tissue characterization, we cannot
forget the segmentation of the plaque. A brief review of how to segment the
plaque is given in the next section.
We begin our process of tissue characterization by taking the IVUS image and
transforming it to Cartesian coordinates (Fig. 2.20(a)). Once the Cartesian
transformation is done, artifacts are removed from the image (Fig. 2.20(b)).
There are three main artifacts in an IVUS image: the transducer wrapping, which
creates a first halo at the center of the image (in the Cartesian image the
echo is shown at the top of the image); the guide-wire effect, which produces
an echo reverberation near the transducer wrapping; and the calibration grid,
which consists of markers at fixed locations that allow the physicians to
evaluate quantitatively the morphology and the lesions in the vessel. With the
artifacts removed, we proceed to identify the intima and adventitia using the
process described in the former section. At this point, we have the plaque
located and we are concerned with tissue identification (Fig. 2.20(c)). The
tissue classification process is divided into three stages. In the first
stage, the soft–hard classification (Figs. 2.20(d) and 2.20(e)), the soft
plaque is separated from the hard plaque and calcium. In the second stage, the
calcium is separated from the hard plaque (Figs. 2.20(f) and 2.20(g)). In the
last stage, the information is fused and the characterization is completed. We
will refer later to this diagram to explain some parts of the process. Recall
that the plaque is the area between the intima and the adventitia. With both
borders located we can focus on the tissue of that area.
For this task, the three-stage scheme described above is used. In the first
stage of the process, a classification is performed on the feature space.
At this point, a feature space and a classifier must be selected. To help
choose which feature space and which classifier to use, we try each of the
feature spaces with a general-purpose classifier, the k-nearest neighbors
method, used as a ground truth classifier. Regardless of the classifier used,
the information provided at the output of the system is a pixel
classification. Using these data we can further process the classification
result, incorporating region information from the classification process, and
obtain clear and smoother borders of the soft and the mixed plaques. Different
processes can be applied to achieve this goal; two possible approaches are
region-based area filtering and classification by density filtering. In
region-based area filtering, the least significant regions in terms of size
are removed from the classification. The other method relies on keeping the
regions that have a high density of classification responses. As the
classification exclusively aims to distinguish between soft and hard plaque, a
separate process is added to separate hard plaque from calcium.
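Region-based area filtering can be sketched as a connected-component pass over
the binary classification mask. The 4-connectivity, the `min_area` threshold,
and the function name are assumptions; the chapter does not specify how the
regions are delimited.

```python
from collections import deque

def remove_small_regions(mask, min_area):
    """Clear connected regions of 1s that contain fewer than min_area pixels."""
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one 4-connected region.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) < min_area:
                for y, x in region:
                    out[y][x] = 0
    return out

mask = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
filtered = remove_small_regions(mask, min_area=3)
```

The isolated pixel is discarded while the 2 × 2 block survives, which removes
spurious single-pixel responses from the classification map.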
Once soft and hard plaque are distinguished, we proceed to identify which
part of the hard plaque corresponds to calcium. One might ask why a third class
is not simply included in the previous classifier. The reason is that experts
identify calcium plaques by context: they use the shadowing produced by the
absorption of the echoes behind a highly echo-reflective area to label that
area as calcium. We take the same approach. Moreover, including a third class
would only hinder the decision process and increase the classifier complexity.
Therefore, calcium is identified by finding the shadowing areas behind hard
plaque. Those areas are easily identified because the soft–hard classification
also provides this information (Fig. 2.20): shadowing areas are classified as
nontissue. We can thus see a plausible way of finding calcified areas.
Figure 2.20(f) shows the classification result under the adventitia border
of the “hard” tissue. Dark gray areas are regions with soft plaque and,
therefore, provide no information about the calcium composition of the plaque.
We use one of the previously classified images, either the soft–hard
classification or the blood–plaque classification. The regions of tissue under
the adventitia border in the area of interest are displayed in white.
Figure 2.20(g) shows in light gray the areas of shadowing and, therefore, the
areas labeled as calcium.
Supervised Texture Classification for Intravascular Tissue Characterization 97
To end the process, the last stage recasts the resulting classification into
its original polar domain by means of a simple coordinate transformation.
Table 2.3: Feature space performance discriminating hard plaque from soft
plaque using k-nearest neighbor
the qualitative evaluation shown in Table 2.4, where we observe that the same
feature spaces are the ones that perform best. Analyzing each of the feature
spaces in terms of FP and FN rates, we can deduce that the co-occurrence
feature space has good discrimination power and a “symmetric” nature, in which
the FP and FN rates are comparable. In the same sense, we can deduce that
the overlap of both classes is similar. The derivative-of-Gaussian filter space
tends to over-classify hard plaque. The classes in this feature space are not
very well defined, as hard plaque must have a higher scatter than soft plaque.
The Gabor filter bank gives a good description of both classes, as they have
similar false rates. However, the two classes overlap heavily, which makes
classification difficult. In the wavelet feature space the overlap between
classes is extremely high; therefore, it describes each of the classes poorly.
Accumulation local moments have descriptive power similar to that of the Gabor
filter bank; however, the different responses of the two allow much better
postprocessing with accumulation local moments. This suggests that the
classification error points in the image domain are much more scattered and
have very low local density. Local binary patterns have good descriptive power
and also give a sparser pattern of FP and FN in the image domain.

Figure 2.21: Tissue pixel classification data using the 7-nearest neighbors
method on different feature spaces. (a) Original image in cartesian
coordinates. (b) Expert manual classification of tissue. (c) Co-occurrence
feature space. (d) Gabor feature space. (e) Wavelets feature space.
(f) Derivative of Gaussian feature space. (g) Accumulation local moments
feature space. (h) Local binary patterns feature space.

Figure 2.21 provides a graphical example of the performance of the 7-nearest
neighbors method applied to several feature spaces. Observing the images, we
see that the scale-space processes (derivative of Gaussian, Gabor filters, and
wavelets) have poor to acceptable discrimination power and are therefore not
suitable for the task of tissue discrimination. On the other hand, the
statistic-based and structure-based feature spaces have acceptable to good
performance.
Table 2.4 details the conclusions drawn from the experiment. The qualitative
speed nomenclature (fast/slow) indicates the viability of including the feature
space technique in a real-time or near-real-time process. A “fast” scheme
denotes a method over 10 times faster than a “slow” one. Because the results
are obtained using prototypes and not a full application, no absolute time
measure is provided. Note also that the images displayed are pixel-based
classification results with no further smoothing postprocessing. To develop our
discussion further, we take only the three feature spaces that perform best
after postprocessing: co-occurrence matrix measures, accumulation local
moments, and local binary patterns. Up to this point we have taken into account
neither the complexity of the methods nor time issues. However, these are
critical parameters in real applications, so we consider them in the following
discussion.

Table 2.5: Feature space performance using FLD and Mahalanobis distance
Feature space RAW error FP FN Post error FP FN
Once the feature space is selected, the next decision is to find the most
suitable classifier, taking into account our problem constraints, if any. We
are concerned with speed, so simple but powerful classifiers are required.
Because of the high dimensionality of two of the selected feature spaces
(co-occurrence matrix measures have about 24 features per distance and
accumulation local moments have 81 features), a dimensionality reduction step
is desirable. PCA is the first obvious choice, but because of the large amount
of overlapping data its results are worse than those of Fisher’s linear
discriminant (FLD) analysis, which focuses on finding the most discriminative
axes for our given set of data. The result of this experiment is shown in
Table 2.5. We use maximum likelihood (ML) combined with a Fisher linear
discriminant analysis reduction. As local binary patterns do not need
dimensionality reduction owing to the small number of features computed (three
features), the comparison with this method is done by classifying with the ML
method alone. As expected, the raw results are much worse with this kind of
classifier. Co-occurrence matrix measures fare worst, doubling their error
rate. However, local binary patterns, though their error rate is also worse
with ML, manage to be the most discriminative of the three methods. This is
also seen after postprocessing, where local binary patterns still have the
lowest error rate. Co-occurrence matrix measures regain their discrimination
power after postprocessing.
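The reduction-plus-classification step can be sketched as follows for the
two-class case. This is only an illustrative sketch: it assumes a two-class
Fisher linear discriminant followed by a one-dimensional Gaussian maximum
likelihood rule on the projected data, and the function names and the synthetic
data are hypothetical rather than taken from the experiments reported here.

```python
import numpy as np


def fit_fld_ml(X0, X1):
    """Fit a two-class Fisher direction and 1-D Gaussian class models.

    Hypothetical helper. X0 and X1 hold one feature vector per row.
    """
    X0, X1 = np.asarray(X0, dtype=float), np.asarray(X1, dtype=float)
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter matrix (sum of the two class scatters).
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Fisher direction w = Sw^{-1} (m1 - m0); a tiny ridge guards against
    # a singular scatter matrix.
    w = np.linalg.solve(Sw + 1e-9 * np.eye(Sw.shape[0]), m1 - m0)
    params = []
    for X in (X0, X1):
        z = X @ w
        params.append((z.mean(), z.var() + 1e-12))
    return w, params


def classify_ml(X, w, params):
    """Assign each row of X to the class with the larger Gaussian likelihood."""
    z = np.asarray(X, dtype=float) @ w
    loglik = [-0.5 * np.log(2 * np.pi * var) - (z - mu) ** 2 / (2 * var)
              for mu, var in params]
    return np.argmax(np.stack(loglik), axis=0)


# Synthetic, well-separated stand-in for two tissue classes.
rng = np.random.default_rng(0)
X0 = rng.normal(0.0, 1.0, size=(200, 5))
X1 = rng.normal(3.0, 1.0, size=(200, 5))
w, params = fit_fld_ml(X0, X1)
labels = classify_ml(np.vstack([X0, X1]), w, params)
truth = np.r_[np.zeros(200, dtype=int), np.ones(200, dtype=int)]
accuracy = float((labels == truth).mean())
```

After projection, each pixel is classified from a single scalar, which is why
such a scheme is attractive when speed matters.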
Therefore, using one of the fastest classifiers, ML, one achieves a
classification rate of over 87% at least (with accumulation local moments). If
the selected feature space is local binary patterns, the scheme is the fastest
possible, since local binary patterns are computationally efficient and the ML
classifier does not transform the data into another feature space. This scheme
is well suited for real-time or near-real-time applications because of both its
time efficiency and its reliability in classification. It is, however, by no
means the only near-real-time configuration available, since accumulation local
moments are computationally as fast as local binary patterns. However, the FLD
dimensionality reduction hinders the process owing to the complexity of the
data in its original feature space. To overcome this problem, other classifiers
can be used.
The need for reliable and fast classifiers leads us to boosting techniques.
Boosting allows a fast and simple classification procedure to improve its
performance while maintaining part of its speed. To illustrate this, Fig. 2.22
shows the evolution of the classification as more classifiers are added to the
strong classifier. Figure 2.22(a) shows the expected hand classification by a
physician. Figure 2.22(b) shows the base classification of a single “weak”
classifier. Figure 2.22(c) illustrates the result of the classification using
an ensemble of five classifiers. Figure 2.22(d) shows the resulting
classification after the addition of 10 weak classifiers to the ensemble. The
error rates at different stages of the process are shown in Table 2.6. These
results are computed using an ML method as a weak classifier on the
accumulation local moments space. The numbers show how the error rate improves
and, though the raw classification error rate hardly changes, there is a great
change in the distribution of the classification data points in the image
domain, since the FP and FN rates change drastically. The postprocessing error
rate gives a better description of what is happening: the error points in the
classification image are sparser and unrelated to their neighborhood, allowing
better postprocessing and classification rates. In this case, the
classification rate is over 92% with a classifier as fast as applying a
threshold 10 times. Therefore, using accumulation local moments and boosting
techniques, we have another fast and highly accurate scheme for real-time or
near-real-time tissue characterization.

Figure 2.22: Boosting procedure for tissue characterization at different stages
of its progress. (a) Expected hand classification by an expert. (b) First stage
of the boosting procedure. (c) Classification with an ensemble of five
classifiers. (d) Classification with an ensemble of 10 weak classifiers.

Table 2.6: Error rates using boosting methods with maximum likelihood with
the accumulation local moments space
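The idea of building a strong classifier from a few thresholds can be sketched
as follows. This sketch is an assumption-laden illustration, not the scheme
evaluated above: it uses discrete AdaBoost with single-feature threshold
classifiers (decision stumps) as weak learners, whereas the reported results
use an ML method as the weak classifier. All function names are hypothetical,
and class labels are assumed to be coded as -1 and +1.

```python
import numpy as np


def stump_predict(X, j, thr, sign):
    """Weak classifier: +1 where sign * (feature j - thr) >= 0, else -1."""
    return np.where(sign * (X[:, j] - thr) >= 0, 1, -1)


def fit_stump(X, y, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                err = weights[stump_predict(X, j, thr, sign) != y].sum()
                if err < best[0]:
                    best = (err, j, thr, sign)
    return best


def adaboost(X, y, rounds):
    """Discrete AdaBoost; returns a list of (alpha, feature, threshold, sign)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    weights = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(rounds):
        err, j, thr, sign = fit_stump(X, y, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by 0
        alpha = 0.5 * np.log((1 - err) / err)   # vote of this weak classifier
        pred = stump_predict(X, j, thr, sign)
        # Misclassified samples gain weight, correct ones lose weight.
        weights *= np.exp(-alpha * y * pred)
        weights /= weights.sum()
        ensemble.append((alpha, j, thr, sign))
    return ensemble


def predict(ensemble, X):
    """Strong classifier: sign of the weighted vote of all weak classifiers."""
    X = np.asarray(X, dtype=float)
    score = sum(a * stump_predict(X, j, t, s) for a, j, t, s in ensemble)
    return np.where(score >= 0, 1, -1)


X = [[0.0], [1.0], [2.0], [3.0]]
y = [-1, -1, 1, 1]
ensemble = adaboost(X, y, rounds=10)
pred = predict(ensemble, X)
```

Evaluating the strong classifier costs one comparison per weak classifier,
which matches the remark that a 10-member ensemble is as fast as applying a
threshold 10 times.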
Up to this point, we have discussed the reliability of the soft plaque versus
hard plaque discrimination process, which is our main concern, since the
identification of calcium reduces to finding the part of the hard plaque with a
large shadowing area. Using the method described in the previous section, 99%
of the calcium plaque is correctly identified. Figure 2.23 shows some results
of the tissue characterization process. Figures 2.23(a) and 2.23(b) show the
characterization of a soft plaque. In Figs. 2.23(c) and 2.23(d), two different
kinds of plaque are detected: calcium (gray region) and soft plaque (white
region). Figures 2.23(e) and 2.23(f) show the characterization of the three
kinds of plaque: fibrotic (light gray region), soft plaque (white region), and
calcium (dark gray region).
2.4.3 Conclusions
Tissue characterization in IVUS images is a crucial problem for physicians
studying vascular diseases. However, this task is complex and suffers from
multiple drawbacks (slow manual process, subjective interpretation, etc.).
Therefore, automatic plaque characterization is a highly desirable tool.
However, automatic tissue characterization is a problem of high complexity.
First of all, we need a unique and powerful description of the tissues to be
classified. This is provided by the feature extraction process; to obtain a
complete and meaningful description, the image features should be based on
texture. Thus, a study of the most representative feature spaces was carried
out. After analyzing the experimental results, we conclude that co-occurrence
matrix measures, local binary patterns, and accumulation local moments are good
descriptors of the different kinds of plaque tissue. Moreover, local binary
patterns and accumulation local moments are also fast in terms of processing
time. On the other hand, the classification of the feature data is a critical
step. Different approaches to the classification problem are described and
proposed as candidates in our framework. We proved that

Figure 2.23: Tissue characterization results. (a), (c), and (e) Original
images. (b), (d), and (f) Characterization results: white marks soft plaque,
dark gray marks the location of calcium plaques, and light gray marks hard
plaque.
Questions
10. Which are the most reliable frameworks for real-time classification?
Bibliography
[2] Arul, P. and Amin, V., Characterization of beef muscle tissue using tex-
ture analysis of ultrasonic images, In: Proceedings of the Twelfth South-
ern Biomedical Engineering Conference, 1993, pp. 141–143.
[4] Jin, X. and Ong, S., Fractal characterization of kidney tissue sections,
In: Engineering in Medicine and Biology Society, 1994. Engineering Ad-
vances: New Opportunities for Biomedical Engineers, Proceedings of
the 16th Annual International Conference of the IEEE, Vol. 2, pp. 1136–
1137, 1994.
[6] Mavromatis, S. and Boi, J., Medical image segmentation using texture
directional features, In: Engineering in Medicine and Biology Society,
2001. Proceedings of the 23rd Annual International Conference of the
IEEE, Vol. 3, pp. 2673–2676, 2001.
[8] Donohue, K. and Forsberg, F., Analysis and classification of tissue with
scatterer structure templates, IEEE Trans. Ultrasonics, Ferroelect. Fre-
quency Control, Vol. 46, No. 2, pp. 300–310, 1999.
[16] Ahmed, M. and Leyman, A., Tissue characterization using radial trans-
form and higher order statistics, In: Nordic Signal Processing Sympo-
sium, 2000, pp. 13–16.
[20] Pujol, O. and Radeva, P., Near real time plaque segmentation of ivus, In:
Proceedings of Computers in Cardiology, 2003, pp. 159–168.
[22] Haralick, R., Shanmugam, K., and Dinstein, I., Textural features for
image classification, IEEE Trans. System, Man, Cybernetics, Vol. 3,
pp. 610–621, 1973.
[28] Ojala, T., Pietikainen, M., and Maenpaa, T., Multiresolution gray-scale
and rotation invariant texture classification with local binary patterns,
IEEE Trans. Pattern Anal. Machine Intell., Vol. 24, No. 7, pp. 971–987,
2002.
[29] Julesz, B., Visual pattern discrimination, IRE Trans. Inf. Theory,
Vol. IT-8, pp. 84–92, 1962.
[30] Ohanian, P. and Dubes, R., Performance evaluation for four classes of
textural features, Pattern Recogn., Vol. 25, No. 8, pp. 819–833, 1992.
[32] Caelli, T. and Oguztoreli, M. N., Some tasks and signal dependent rules
for spatial vision, Spatial Vision, No. 2, pp. 295–315, 1987.
[33] Chaudhuri, B. and Sarkar, N., Texture segmentation using fractal di-
mension, IEEE Trans. Pattern Anal. Machine Intell., Vol. 17, No. 1,
pp. 72–77, 1995.
[34] Lindeberg, T., Scale-space theory: A basic tool for analysing structures
at different scales, J. Appl. Stat., Vol. 21, No. 2, pp. 225–270, 1994.
[35] Rao, R. and Ballard, D., Natural basis functions and topographic mem-
ory for face recognition, In: Proceedings of International Joint Confer-
ence on Artificial Intelligence, 1995, pp. 10–17.
[41] Viola, P. and Jones, M., Rapid object detection using a boosted cascade
of simple features, In: Conference on Computer Vision and Pattern
Recognition, 2001, pp. 511–518.
[42] Duda, R. and Hart, P., Pattern Classification, Wiley InterScience, New
York, 2001. Second Edition.
[45] Klingensmith, J., Shekhar, R., and Vince, D., Evaluation of three-
dimensional segmentation algorithms for identification of luminal and
medial-adventitial borders in intravascular ultrasound images, IEEE
Trans. Med. Imaging, Vol. 19, No. 10, pp. 996–1011, 2000.
Koon-Pong Wong
3.1 Introduction
1 Department of Electronic and Information Engineering, Hong Kong Polytechnic
University, Hung Hom, Kowloon, Hong Kong
112 Wong
Figure 3.1: The steps and the ultimate goal of medical image analysis in a
clinical environment. (Labels shown in the figure: Diagnosis, Staging,
Treatment; Clinical Knowledge; Clinical Information; Diagnostic Features.)
and their diagnostic features embedded within the multidimensional image data
that can guide and monitor interventions after the disease has been detected
and localized, ultimately leading to knowledge for clinical diagnosis, staging,
and treatment of disease. These processes can be represented diagrammatically
as a pyramid, as illustrated in Fig. 3.1. At the bottom level of the pyramid
are the medical image data obtained from a specific imaging modality; the
ultimate goal (the top level of the pyramid) is to use the extracted
information to form a body of clinical knowledge that can lead to clinical
diagnosis and treatment of a specific disease. The question now is how to reach
this goal. The goal of the imaging study is clear, but the solution is not. At
each level of the pyramid, specific techniques are required to process the
data, extract the information, and label and represent the information at a
high level of abstraction for knowledge mining or to form clinical knowledge
from which medical diagnoses and decisions can be made. Huge multidimensional
datasets, ranging from a few megabytes to several gigabytes, remain a
formidable barrier to our capability to manipulate, visualize, understand, and
analyze the data. Effective management, processing, visualization, and analysis
of these datasets cannot be accomplished without a high-performance computing
infrastructure composed of high-speed processors, storage, networks, image
display units, and software programs. Recent advances in computing technology,
such as the development of application-specific parallel processing
architectures and dedicated image processing hardware, have partially resolved
most of the limiting factors. Yet, extraction of useful information and
features from the multidimensional data is still a formidable task that
requires specialized and sophisticated techniques. Development and
implementation of these techniques requires
segmentation is a broad field and because the goal of segmentation varies ac-
cording to the aim of the study and the type of the image data, it is impossible
to develop only one standard method that suits all imaging applications. This
chapter focuses on the segmentation of data obtained from functional imaging
modalities such as PET, SPECT, and fMRI. In particular, segmentation based on
cluster analysis, which has great potential in classification of functional imaging
data, will be discussed in great detail. Techniques for segmentation of data ob-
tained with structural imaging modalities have been covered in depth by other
chapters of this book, and therefore, they will only be described briefly in this
chapter for the purpose of completeness.
delineated for each imaging study. Needless to say, manual ROI delineation is
also operator dependent and the selected regions are subject to large intra- and
interrater variability [8, 9]. Because of scatter and partial volume effects
(PVEs) [10], the position, size, and shape of the ROI need careful considera-
tion. Quantitative measurement inaccuracies exhibited by small positional dif-
ferences are expected to be more pronounced for ROI delineation in the brain,
which is a very heterogeneous organ and contains many small structures of ir-
regular shape that lie adjacent to other small structures of markedly differing
physiology [11]. Small positional differences can also confound the model
fitting results [12, 13]. To minimize errors due to PVEs, the ROI should be
chosen as small as possible, but the trade-off is an increase in noise levels
within the ROI, which may be more susceptible to anatomical imprecision. On
the other hand, a larger region offers a better signal-to-noise ratio, but
changes that occur only within a small portion of the region may be obscured,
and the extracted TAC does not represent the temporal behavior of the ROI alone
but a mixture of activities from adjacent overlapping tissue structures.
Likewise, an irregular ROI that conforms to the shape of the structure or
region where an abnormality has occurred will detect this change with much
higher sensitivity than a geometrically regular ROI that may not conform well.
In addition, manual ROI delineation requires software tools with sophisticated
graphical user interfaces to facilitate drawing ROIs and displaying images.
Methodologies that permit semiautomated or, ideally, fully automated ROI
segmentation offer obvious advantages over manual ROI delineation.
Semiautomated or fully automated segmentation in anatomical imaging such
as CT and MR is very successful, especially in the brain, as there are many well-
developed schemes proposed in the literature (see surveys in [14]). This may
be because these imaging modalities provide very high resolution images in
which tiny structures are visible even in the presence of noise, and that four
general tissue classes, gray matter, white matter, cerebrospinal fluid (CSF), and
extracranial tissues such as fat, skin, and muscles, can be easily classified with
different contrast measures. For instance, the T1- and T2-weighted MR images
provide good contrast between gray matter and CSF, while T1 and proton den-
sity (PD) weighted MR images provide good contrast between gray matter and
white matter. In contrast to CT and MRI, PET and SPECT images lack the ability
to yield accurate anatomical information. The segmentation task is further com-
plicated by poor spatial resolution and counting statistics, and patient motion
Medical Image Segmentation 117
during scanning. Therefore, segmentation in PET and SPECT has not attracted
much interest over the last two decades, even though there has been remarkable
progress in image segmentation during the same period. Defining ROIs manually
remains normal practice.
Although the rationale for applying automatic segmentation to dynamic PET
and SPECT images is questionable due to the above difficulties, the application
of automatic segmentation as an alternative to manual ROI delineation has at-
tracted interest recently with the improved spatial resolution of PET and SPECT
systems. Automatic segmentation has the advantages that subjectivity is reduced
and that the time spent on manual ROI delineation is saved. It may therefore
provide more consistent and reproducible results, as less human intervention
is involved, while shortening the overall time for data analysis and thereby
improving efficiency, which is particularly important in busy clinical
settings.
• Reproducible
• Accurate
• Independent of tracer employed
• Independent of instrument spatial resolution
• Independent of ancillary imaging techniques
• Minimizes subjectivity and investigator bias
• Reasonable in cost
• Equally applicable in both clinical and research settings
• Time efficient for both data acquisition and analysis
These criteria are not specific to the functional analysis of the brain; they
are equally applicable to other organs and imaging applications with minor
modifications, in spite of the fundamental differences between imaging
modalities.
• Thresholding
• Edge-based segmentation
• Region-based segmentation
• Pixel classification
3.4.1 Thresholding
Semiautomatic methods can partially remove the subjectivity in defining ROIs
by human operators. The most commonly used method is thresholding, because of
its simplicity of implementation and intuitive properties. In this technique, a
predefined value (threshold) is selected and an image is divided into groups of
pixels having values greater than or equal to the threshold and groups of
pixels with values less than the threshold. The most intuitive approach is
global thresholding, which is best suited to bimodal images. When only a single
threshold is selected for a given image, the thresholding is global to the
entire image. For example, let $f(x, y)$ be an image with maximum pixel value
$I_{\max}$, and suppose $\gamma$ denotes the percent threshold of the maximum
pixel value above which the pixels will be selected; then pixels with value
$\rho$ given by
$$\frac{\gamma}{100} I_{\max} \le \rho \le I_{\max} \tag{3.1}$$
can be grouped and a binary image $g(x, y)$ is formed:
$$g(x, y) = \begin{cases} 1 & \text{if } f(x, y) \ge \rho \\ 0 & \text{otherwise} \end{cases} \tag{3.2}$$
in which pixels with value 1 correspond to the ROI, while pixels with value 0
correspond to the background.
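A minimal sketch of global percent thresholding is given below. It assumes the
percent threshold is passed as a parameter named `gamma` (a name chosen for
this sketch) and that the cutoff is that percentage of the image maximum, as in
Eqs. (3.1) and (3.2); the function name is hypothetical.

```python
import numpy as np


def percent_threshold(image, gamma):
    """Binary mask of pixels at or above `gamma` percent of the image maximum."""
    image = np.asarray(image, dtype=float)
    cutoff = gamma / 100.0 * image.max()
    return (image >= cutoff).astype(np.uint8)


f = np.array([[10, 20, 30],
              [40, 50, 60],
              [70, 80, 100]])
g = percent_threshold(f, gamma=50)
```

With a maximum of 100 and a 50% threshold, the pixels with values 50 and above
form the ROI and all others are assigned to the background.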
Global thresholding is simple and computationally fast. It performs well if
the images contain objects with homogeneous intensity or if the contrast
between the objects and the background is high. However, it may not lend itself
to full automation and may fail when two or more tissue structures have
overlapping intensity levels. The accuracy of the ROI is also questionable
because it is separated from the data based on a single threshold value, which
may be subject to very large statistical fluctuations. As the number of regions
or the noise level increases, or when the contrast of the image is low,
threshold selection becomes more difficult.
Apart from global thresholding, there are several thresholding methods that
can be classified as local thresholding and dynamic thresholding. These
techniques may be useful when a threshold value cannot be determined from a
histogram for the entire image or when a single threshold cannot give good
segmentation results. A local threshold can be determined by using local
statistical properties such as the mean value of the local intensity
distribution or some other statistic such as the mean of the maximum or minimum
values [21] or the local gradient [22], or by splitting the image into
subimages and calculating threshold values for the individual subimages [23].
Some variants of the above two methods can be found in Refs. [17, 18].
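The subimage variant of local thresholding can be sketched as follows. This is
a simplified illustration under stated assumptions: the image is split into
square, non-overlapping blocks, and each block is thresholded at its own mean
intensity. The block size, the choice of the mean as the local statistic, and
the function name are assumptions of this sketch rather than the methods of the
cited references.

```python
import numpy as np


def blockwise_threshold(image, block):
    """Threshold each block-by-block subimage at its own mean intensity."""
    image = np.asarray(image, dtype=float)
    out = np.zeros(image.shape, dtype=np.uint8)
    rows, cols = image.shape
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            sub = image[r:r + block, c:c + block]
            # Each subimage gets its own threshold (here, the local mean).
            out[r:r + block, c:c + block] = sub > sub.mean()
    return out


# Two small bright spots on a background whose level differs between halves.
img = np.array([[10, 20, 110, 120],
                [10, 10, 110, 110]])
local_mask = blockwise_threshold(img, block=2)
global_mask = (img > img.mean()).astype(np.uint8)
```

In this toy image a single global threshold selects the whole brighter half,
whereas the block-wise thresholds pick out the two locally bright pixels.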
Detailed discussion on other edge operators such as Canny, Kirsch, Prewitt, and
Robinson can be found elsewhere [1, 20].
An edge magnitude image can be formed by combining the gradient components
$\delta f_x$ and $\delta f_y$ at every pixel location using Eq. (3.4). As the
squares and square roots in Eq. (3.4) are computationally expensive, an
approximation with absolute values is frequently used instead:
After the edge magnitude image has been formed, a thresholding operation is
then performed to determine where the edges are.
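The following sketch illustrates the edge magnitude computation and the
subsequent thresholding. It rests on two assumptions that are not taken from
the text: the gradient components are estimated with central differences
(`numpy.gradient`) rather than with a specific edge operator, and the
absolute-value approximation is taken to be the commonly used sum
$|\delta f_x| + |\delta f_y|$. The function name and the threshold value are
hypothetical.

```python
import numpy as np


def edge_map(image, threshold):
    """Return exact and approximate edge magnitudes and a thresholded edge mask."""
    image = np.asarray(image, dtype=float)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    gy, gx = np.gradient(image)
    exact = np.sqrt(gx ** 2 + gy ** 2)      # square-root form
    approx = np.abs(gx) + np.abs(gy)        # cheaper absolute-value form
    return exact, approx, approx > threshold


# Vertical step edge between the third and fourth columns.
step = np.tile([0, 0, 0, 10, 10, 10], (4, 1))
exact, approx, edges = edge_map(step, threshold=2)
```

The absolute-value form never underestimates the square-root form, and the two
coincide whenever one of the gradient components is zero, as on this step edge.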
The first-order derivatives of the image $f(x, y)$ have local minima and
maxima at the edges because of the large intensity variations. Accordingly,
the second-order derivatives have zero crossings at the edges, which can
also be used for edge detection, and the Laplacian is frequently employed in
practice. The Laplacian ($\nabla^2$) of a two-dimensional function $f(x, y)$ is
defined as
$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \tag{3.7}$$
There are several ways to realize the Laplacian operator in the discrete
domain. For a 3 × 3 region, the following two realizations are commonly used:
$$\begin{bmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix}$$
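As a sketch of how the two 3 × 3 realizations might be applied, the code below
filters an image with each mask and marks zero crossings as sign changes
between horizontally or vertically adjacent responses. The border handling
(edge replication), the zero-crossing rule, and all names are assumptions of
this example.

```python
import numpy as np

LAPLACIAN_4 = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], dtype=float)
LAPLACIAN_8 = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=float)


def filter3x3(image, kernel):
    """Apply a 3 x 3 mask to every pixel, replicating the border pixels."""
    image = np.asarray(image, dtype=float)
    padded = np.pad(image, 1, mode="edge")
    out = np.zeros_like(image)
    rows, cols = image.shape
    # Accumulate the nine shifted, weighted copies of the image.
    for dr in range(3):
        for dc in range(3):
            out += kernel[dr, dc] * padded[dr:dr + rows, dc:dc + cols]
    return out


def zero_crossings(response):
    """Mark pixels whose right or lower neighbor has the opposite sign."""
    zc = np.zeros(response.shape, dtype=bool)
    zc[:, :-1] |= response[:, :-1] * response[:, 1:] < 0
    zc[:-1, :] |= response[:-1, :] * response[1:, :] < 0
    return zc


step = np.tile([0, 0, 0, 10, 10, 10], (4, 1))
resp4 = filter3x3(step, LAPLACIAN_4)
resp8 = filter3x3(step, LAPLACIAN_8)
edges = zero_crossings(resp4)
```

Both masks are symmetric, so correlation and convolution give the same result.
On the step image the response is zero in flat areas and changes sign across
the edge, which is where the zero crossing is marked.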
scope of this chapter and they can be found in Refs. [30, 31]. There are several
more powerful edge tracking/linking techniques such as graph searching [32,33]
and dynamic programming [34, 35] that perform well in the presence of noise.
As might be expected, these paradigms are considerably more complicated and
computationally expensive than the methods discussed so far.
where $L(\cdot)$ is a logical predicate. The original image can be exactly
assembled by putting all regions together (Eq. 3.9) and there should be no
overlap between any two regions $R_i$ and $R_j$ for $i \neq j$ (Eq. 3.10). The
logical predicate $L(\cdot)$ contains a set of rules (usually a set of
homogeneity criteria) that must be satisfied by all pixels within a given
region (Eq. 3.11), and it fails for the union of two regions, since merging two
distinct regions will result in an inhomogeneous region (Eq. 3.12).
The simplest region-based segmentation technique is region growing, which is
used to extract a connected region of similar pixels from an image [36]. The
region growing algorithm requires a similarity measure that determines the
inclusion of pixels in the region and a stopping criterion that terminates the
growth of the region. Typically, it starts with a pixel (or a collection of
pixels), called a seed, that belongs to the target ROI. The seed can be chosen
by the operator or determined by an automatic seed-finding algorithm. The
neighborhood of each seed is then inspected, and pixels similar enough to the
seed are added to the seed’s region, so that the region grows and its shape
changes. The growing process is repeated until no more pixels can be added to
any region. Some pixels may remain unlabeled when the growing process stops.
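The procedure can be sketched for a single seed as follows. The similarity
measure used here (absolute intensity difference from the seed value, within a
tolerance `tol`), the 4-connected neighborhood, and the function name are
assumptions of this sketch; growth stops when no neighboring pixel satisfies
the similarity measure.

```python
from collections import deque

import numpy as np


def region_grow(image, seed, tol):
    """Grow a 4-connected region from `seed` = (row, col).

    A pixel joins the region if its value differs from the seed value
    by at most `tol` (an assumed similarity measure).
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    seed_value = image[seed]
    region = np.zeros(image.shape, dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:  # stops when no more pixels can be added
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols and not region[nr, nc]
                    and abs(image[nr, nc] - seed_value) <= tol):
                region[nr, nc] = True
                queue.append((nr, nc))
    return region


img = [[1, 1, 9, 1],
       [1, 2, 9, 1],
       [9, 9, 9, 1]]
roi = region_grow(img, seed=(0, 0), tol=1)
```

The pixels in the right-hand column are similar to the seed but are not
connected to it, so they remain unlabeled when the growth stops.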
Hebert et al. [37] investigated the use of region growing for automated
delineation of the blood pool with computer simulations and applied the method
to three gated SPECT studies using Tc-99m pertechnetate, with promising
results. Kim et al. [38] also investigated an integrated approach of region
growing and cluster analysis (to be described later) to segment a dynamic
[$^{18}$F]fluorodeoxyglucose (FDG) PET dataset. Although qualitatively
reasonable segmentation results were obtained, much more work is needed to
overcome the formation of odd segments, possibly due to spillover at region
boundaries, and to evaluate the quantitative accuracy of the segmentation
results using kinetic parameter estimation.
Region splitting methods take the opposite strategy to region growing. These
methods start from the entire image and examine the homogeneity criteria. If
the criteria are not met, the image (or subimage) is split into two or more
subimages. The region splitting process continues until all subimages meet the
homogeneity criteria. Region splitting can be implemented by quadtree
partitioning. The image is partitioned into four subimages that are represented
by nodes in a quadtree, which is a data structure used for efficient storage
and representation. The partition procedure is applied recursively to each
subimage until every subimage meets the homogeneity criteria.
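Quadtree region splitting can be sketched recursively as below. The
homogeneity criterion used here (intensity range within a block not exceeding a
tolerance `tol`) is only one possible choice and is an assumption of this
sketch, as are the function name and the representation of each leaf as a
(row, column, height, width) tuple.

```python
import numpy as np


def quadtree_split(image, tol, r0=0, c0=0, rows=None, cols=None):
    """Return the homogeneous blocks as (row, col, height, width) tuples.

    A block is treated as homogeneous when max - min <= tol (assumed
    criterion). `tol` is expected to be non-negative.
    """
    image = np.asarray(image)
    if rows is None:
        rows, cols = image.shape
    block = image[r0:r0 + rows, c0:c0 + cols]
    if block.max() - block.min() <= tol or rows * cols == 1:
        return [(r0, c0, rows, cols)]
    # Split each dimension in two (a dimension of size 1 is left unsplit).
    row_parts = ([(r0, rows)] if rows == 1 else
                 [(r0, rows // 2), (r0 + rows // 2, rows - rows // 2)])
    col_parts = ([(c0, cols)] if cols == 1 else
                 [(c0, cols // 2), (c0 + cols // 2, cols - cols // 2)])
    leaves = []
    for rs, rn in row_parts:
        for cs, cn in col_parts:
            leaves.extend(quadtree_split(image, tol, rs, cs, rn, cn))
    return leaves


img = np.zeros((4, 4), dtype=int)
img[3, 3] = 9
leaves = quadtree_split(img, tol=0)
```

Only the quadrant containing the bright pixel is split further, so the result
has three 2 × 2 blocks and four single-pixel blocks.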
The major drawback of region splitting is that the final image may contain
adjacent regions $R_i$ and $R_j$ that are jointly homogeneous, i.e.
$L(R_i \cup R_j) = \text{TRUE}$, and ideally these regions should be merged.
This leads to another technique called split-and-merge, which adds a merging
step to the splitting stage: an inhomogeneous region is split until homogeneous
regions are formed, and each newly created homogeneous region is checked
against its neighboring regions and merged with one or more of them if they
possess identical properties. However, this strategy does not necessarily
produce a quadtree partitioning of the image. If quadtree partitioning is used,
an additional step may be added to merge adjacent regions (nodes) that meet the
homogeneity criterion.
$$d\{x_i, x_j\} = \left( \sum_{k=1}^{n} |x_{ik} - x_{jk}|^p \right)^{1/p} \tag{3.13}$$
where $x_i \in \mathbb{R}^n$ and $x_j \in \mathbb{R}^n$ are the two vectors in
the feature space and $x_{ik}$ denotes the $k$th component of $x_i$. It can be
seen that the above measure corresponds to the Euclidean distance when $p = 2$
and to the Manhattan (city-block) distance when $p = 1$. Another commonly used
distance measure is the normalized inner product between two vectors, given by
$$d\{x_i, x_j\} = \frac{x_i^T x_j}{\|x_i\| \cdot \|x_j\|} \tag{3.14}$$
where $T$ denotes the transpose operation. The above measure is simply the
cosine of the angle between the vectors $x_i$ and $x_j$ in the feature space.
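The two measures of Eqs. (3.13) and (3.14) can be evaluated with a few lines of
code. The function names below are chosen for this sketch only.

```python
import numpy as np


def minkowski_distance(xi, xj, p):
    """Eq. (3.13): p-th root of the sum of |difference|^p over all components."""
    xi, xj = np.asarray(xi, dtype=float), np.asarray(xj, dtype=float)
    return float(np.sum(np.abs(xi - xj) ** p) ** (1.0 / p))


def normalized_inner_product(xi, xj):
    """Eq. (3.14): inner product divided by the product of the vector norms."""
    xi, xj = np.asarray(xi, dtype=float), np.asarray(xj, dtype=float)
    return float(xi @ xj / (np.linalg.norm(xi) * np.linalg.norm(xj)))


d2 = minkowski_distance([0, 0], [3, 4], p=2)   # straight-line distance
d1 = minkowski_distance([0, 0], [3, 4], p=1)   # sum of absolute differences
same_direction = normalized_inner_product([1, 0], [2, 0])
orthogonal = normalized_inner_product([1, 0], [0, 1])
```

For the vectors (0, 0) and (3, 4), the measure gives 5 for p = 2 and 7 for
p = 1. The normalized inner product is 1 for vectors pointing in the same
direction and 0 for orthogonal vectors, regardless of their lengths.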
Each cluster is represented by its centroid (or mean) and variance, which
indicates the compactness of the objects within the cluster, and the formation
of clusters is optimized according to a cost function that typically takes the
similarity within individual clusters and the dissimilarity between clusters
into account. Many clustering techniques have been proposed in the literature
(see Ref. [39]). The best-known clustering techniques are K-means [40], fuzzy
c-means [41], ISODATA [42], hierarchical clustering with the average linkage
method [43], and the Gaussian mixture approach [44].
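As an illustration of the first of these techniques, the sketch below
implements the basic K-means iteration: each feature vector is assigned to its
nearest centroid (Euclidean distance), and each centroid is then recomputed as
the mean of its members. The fixed number of iterations, the user-supplied
initial centroids, and the function name are assumptions of this sketch; each
row of `points` stands for the feature vector of one pixel.

```python
import numpy as np


def kmeans(points, centroids, n_iter=20):
    """Basic K-means with given initial centroids; returns (labels, centroids)."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float).copy()
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (N, K).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each non-empty cluster's centroid to the mean of its members.
        for k in range(len(centroids)):
            if np.any(labels == k):
                centroids[k] = points[labels == k].mean(axis=0)
    return labels, centroids


points = [[0, 0], [0, 1], [10, 10], [10, 11]]
labels, centers = kmeans(points, centroids=[[0, 0], [10, 10]])
```

In practice the result depends on the initial centroids and on the number of
clusters, both of which must be chosen by the user in this simple form.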
As we will see later in this chapter, the idea of pixel classification in two-
dimensional image segmentation using clustering techniques can be extended to
multidimensional domain where the images convey not only spatial information
of the imaged structures but also their temporal variations, for which clustering
plays a pivotal role in identification of different temporal kinetics present in
the data, extraction of blood and tissue TACs, ROI delineation, localization of
abnormality, kinetic modeling, characterization of tissue kinetics, smoothing,
and fast generation of parametric images.
Functional imaging with PET, SPECT, and/or dynamic MRI provides in vivo
quantitative measurements of physiologic parameters of biochemical pathways
and physiology in a noninvasive manner. A critical component is the extraction
of physiological data, which requires accurate localization/segmentation of the
appropriate ROIs. A common approach is to identify the anatomic structures
by placing ROIs directly on the functional images, and the underlying tissue
TACs are then extracted for subsequent analysis. This ROI analysis approach,
although widely used in clinical and research settings, is operator-dependent and
thus prone to reproducibility errors and it is also time-consuming. In addition,
this approach is problematic when applied to small structures because of the
PVEs due to finite spatial resolution of the imaging devices.
Methods discussed so far can be applied to almost all kinds of image segmentation
problems because they do not require any model (i.e., they are model-free) that
guides or constrains the segmentation process. However, segmenting structures
of the object shape, while the continuity, connectivity, and smoothness of the
models can compensate for the irregularities and noise in the object boundaries.
Model-based approaches treat the problem of finding object boundaries as an
optimization problem of searching the best fit for the image data to the model. In
the case of boundary finding via optimization in image space, a fairly extensive
review on various deformable model methods can be found in Ref. [49].
Mykkänen et al. [50] investigated automatic delineation of brain structures
in FDG-PET images using generalized snakes with promising results. Chiao
et al. [51] proposed a model-based approach for segmenting dynamic cardiac
PET or SPECT data. The object model consists of two parts: a heart and the
rest of the body. The heart is geometrically modeled using a polygonal model [52]
and the myocardial boundaries are parameterized by the endocardial radii and
a set of angular thicknesses. Kinetic parameters in the compartment model
and the endocardial and epicardial radii are estimated by maximizing a joint
log-likelihood function using nonlinear parameter estimation. Tissue and blood
TACs are extracted simultaneously with estimated kinetic parameters. Chiao
et al. [51] proposed that some forms of regularization can be applied, including
auxiliary myocardial boundary measurements obtained by MRI or CT and reg-
istering the auxiliary measurements with the emission tomographic data, if the
kinetic parameter estimation failed.
are used to segment the organ boundaries. The difference lies in the definition
of the model, which is described by a computerized anatomy atlas or a stereo-
taxic coordinate system—a reference that the functional images are mapped
onto by either linear or nonlinear transformation. A number of transformation
techniques have been developed for this process [59]. The ROIs defined on the
template are then available to the functional image data.
Similarly, functional (PET and SPECT) images and structural (CT and MR)
images obtained from individual subjects can be fused (coregistered), allowing
precise anatomical localization of activity on the functional images [60, 61].
Precise alignment between the anatomic/template and PET images is necessary
for these methods. Importantly, methods that use registration to a standard
coordinate system are problematic when patients with pathological processes
(e.g., tumors, infarction, and atrophy) are studied.
time-intensity curves for different pixels or the mean of the pixel values av-
eraged over a selected ROI. Suppose a ROI is drawn on a reference region in the
dynamic sequence of images and its time course is extracted
The similarity between the reference time-intensity curve r and the
time-intensity curves for all pixels can then be calculated, and a similarity map,
an image in which the value of each pixel shows the temporal similarity to
the reference curve, can be constructed.
Since the time instants do not affect the computation of cross correlation
between two time-intensity curves as pixel intensity values in one frame are
measured at the same time, xi in Eq. (3.15) and r in Eq. (3.16) can be rewritten
in a time-independent form as
where Xi, j ≡ Xi (t j ) is the pixel value of the ith element evaluated in the jth
frame of X, and
\[
r = [r_1, r_2, \ldots, r_N]^{T} \tag{3.18}
\]
\[
\bar{r} = \frac{1}{N} \sum_{j=1}^{N} r_j \tag{3.19}
\]
The similarity map R based on normalized cross correlation can be defined for
each pixel i as
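\[
R_i = \frac{\sum_{j=1}^{N} \left( X_{i,j} - \bar{X}_i \right) \left( r_j - \bar{r} \right)}
           {\sqrt{\sum_{j=1}^{N} \left( X_{i,j} - \bar{X}_i \right)^{2} \, \sum_{j=1}^{N} \left( r_j - \bar{r} \right)^{2}}} \tag{3.20}
\]
where
\[
\bar{X}_i = \frac{1}{N} \sum_{j=1}^{N} X_{i,j} \tag{3.21}
\]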
is the mean value of the time sequence for pixel i. The normalized cross correla-
tion has values in the range of −1 to +1. Regions of identical temporal variation
have a coefficient of +1, with the exception that xi or r are extracted from
regions of constant pixel intensity (e.g., background). In this case, the denominator
of Eq. (3.20) equals zero. Therefore, the following restrictions have to be
imposed:
\[
\sum_{j=1}^{N} \left( X_{i,j} - \bar{X}_i \right)^{2} \neq 0 \quad \text{and} \quad \sum_{j=1}^{N} \left( r_j - \bar{r} \right)^{2} \neq 0 \tag{3.22}
\]
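As a rough illustration of Eqs. (3.20) to (3.22), the similarity map can be computed for all pixels at once. This is a minimal sketch under assumptions: the function name, the `eps` tolerance, and the choice of marking pixels that violate Eq. (3.22) as NaN are not taken from the text.

```python
import numpy as np

def similarity_map(X, r, eps=1e-12):
    """Normalized cross correlation between every row of X and the reference r.

    X : (M, N) array, one time-intensity curve per pixel.
    r : (N,) reference time-intensity curve.
    Pixels whose denominator is zero (Eq. 3.22 violated) are returned as NaN.
    """
    X = np.asarray(X, dtype=float)
    r = np.asarray(r, dtype=float)
    Xc = X - X.mean(axis=1, keepdims=True)   # subtract the per-pixel mean, Eq. (3.21)
    rc = r - r.mean()                        # subtract the reference mean, Eq. (3.19)
    num = Xc @ rc
    den = np.sqrt((Xc ** 2).sum(axis=1) * (rc ** 2).sum())
    R = np.full(X.shape[0], np.nan)
    valid = den > eps
    R[valid] = num[valid] / den[valid]
    return R

# Hypothetical reference curve and three pixel curves.
r = np.array([0.0, 1.0, 4.0, 3.0, 2.0])
X = np.vstack([
    2.0 * r + 5.0,     # same temporal shape as r
    -r,                # inverted temporal shape
    np.full(5, 7.0),   # constant background
])
R = similarity_map(X, r)
```

Reshaping `R` back to the image grid gives the similarity map; how undefined pixels are displayed (NaN, zero, or masked) is a design choice left to the application.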
the data, whereas lower order components are unimportant as they mainly con-
tain noise, which can be discarded without causing too much loss of information
of the original data. Therefore, dimensionality reduction (or data compression)
can be achieved using PCA technique. Separation of tissue types characterized
by different features can also be accomplished by careful inspection of the PCs.
This is because each PC contains only the representative feature that is specific
to that PC and cannot be found elsewhere (theoretically) owing to orthogonality
among PCs.
Let the dynamic sequence of images be represented by a matrix X that
has M rows and N columns. Each column represents a time frame of image
data and each row represents a pixel vector, i.e., a tissue TAC or a dixel [63],
which is a time series xi as in Eqs. (3.15) and (3.17). Note that there is no ex-
plicit assumption on the probability density of the measurements xi as long
as the first-order and second-order statistics are known or can be estimated
from the available measurements. Each of xi can be considered as a random
process
x = [x1 , x2 , . . . , xN ]T (3.23)
If the measurements (or random variables) x j are correlated, their major vari-
ations can be accurately approximated with less than N parameters using the
PCA. The mean of x is given by
\[
\bar{x} = E\{x\} \tag{3.24}
\]
the equation
\[
C_x e_k = \lambda_k e_k \tag{3.26}
\]
\[
C_x = U \Lambda V^{T} \tag{3.27}
\]
\[
y = \Omega (x - \bar{x}) \tag{3.28}
\]
which defines a linear transformation for the random vector x through the
orthogonal basis, and $\bar{x}$ is calculated from Eq. (3.24). The kth PC of x is given
by
\[
y_k = e_k^{T} (x - \bar{x}) \tag{3.29}
\]
which has zero mean. The PCs are also orthogonal (uncorrelated) to one another
because
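\[
E\{y_k y_l\} = E\left\{ \left[ e_k^{T} (x - \bar{x}) \right] \left[ e_l^{T} (x - \bar{x}) \right] \right\} = e_k^{T} C_x e_l = 0, \quad k \neq l \tag{3.30}
\]
\[
x = \Omega^{T} y + \bar{x} \tag{3.31}
\]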
which indicates that the variances of the PCs are given by the eigenvalues of
Cx . As the PCs have zero means, a very small eigenvalue (variance) λk implies
that the corresponding PC is too small to contribute appreciably to the
total variance present in the data. Since the eigenvalue sequence {λk } is
monotonically decreasing and typically the sequence drops rapidly, it is possible to
determine a limit below which the eigenvalues (and PCs) can be discarded with-
out causing significant error in reconstruction of the original dataset using only
the retained PCs. Thus, data compression (or dimensionality reduction) can be
achieved and this is an important application of PCA. Instead of using all eigen-
vectors of the covariance matrix Cx , the random vector x can be approximated
by the highest few basis vectors of the orthogonal basis. If only the
first K rows (eigenvectors) of Ω are selected to form a K × N matrix, $\Omega_K$, a
transformation similar to Eqs. (3.28) and (3.31) can be derived:
\[
\tilde{y} = \Omega_K (x - \bar{x}) \tag{3.33}
\]
and
\[
\hat{x} = \Omega_K^{T} \tilde{y} + \bar{x} \tag{3.34}
\]
\[
E\{\|\hat{x} - x\|^{2}\} = \sum_{k=K+1}^{N} \lambda_k \tag{3.35}
\]
The practical issue here is the choice of K beyond which the PCs are insignificant.
The gist of the problem lies in how “insignificant” is defined and how
much error one can tolerate in using fewer PCs to approximate the
original data. Sometimes, a small number of PCs is sufficient to give an accurate
approximation to the observed data. A commonly used strategy is to plot
the eigenvalues against the number of PCs and detect a cut-off beyond which
the eigenvalues level off. Another approach is to discard the PCs with
eigenvalues lower than a specified fraction of the first (largest) eigenvalue. There
is no simple answer, and one has to trade off the approximation error against the
number of PCs retained, which is the primary concern when
PCA is used for data compression.
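The truncated transformation of Eqs. (3.33) and (3.34), together with the eigenvalue-fraction rule for choosing K, can be sketched as follows. The synthetic data, the 1% threshold, and all variable names are assumptions made for illustration; the covariance is estimated from the samples rather than known exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: M pixel vectors with N time frames, built from two
# underlying curves plus a small amount of noise.
M, N = 500, 8
t = np.linspace(0.0, 1.0, N)
curves = np.vstack([1.0 - np.exp(-5.0 * t), np.exp(-3.0 * t)])
X = rng.random((M, 2)) @ curves + 0.01 * rng.standard_normal((M, N))

x_bar = X.mean(axis=0)                    # sample estimate of Eq. (3.24)
Cx = (X - x_bar).T @ (X - x_bar) / M      # sample covariance matrix

# eigh returns eigenvalues in ascending order for symmetric matrices,
# so reorder them to be monotonically decreasing.
lam, E = np.linalg.eigh(Cx)
order = np.argsort(lam)[::-1]
lam, E = lam[order], E[:, order]

# Keep the PCs whose eigenvalue exceeds a fraction of the largest one.
fraction = 0.01                           # assumed tolerance
K = int(np.sum(lam > fraction * lam[0]))

Omega_K = E[:, :K].T                      # K x N matrix, rows are eigenvectors
Y = (X - x_bar) @ Omega_K.T               # Eq. (3.33) applied to every pixel
X_hat = Y @ Omega_K + x_bar               # Eq. (3.34)

mse = np.mean(np.sum((X_hat - X) ** 2, axis=1))   # left-hand side of Eq. (3.35)
discarded = lam[K:].sum()                          # right-hand side of Eq. (3.35)
```

With the covariance normalized by M as above, `mse` and `discarded` agree to rounding error, which is the sample counterpart of Eq. (3.35).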
PCA has been applied to analyze functional images including nuclear medicine
[72–77] and dynamic MRI [78, 79], where data visualization, structural and
functional classification, localization of diseases, and detection of activation
patterns are of primary interest. Moeller and Strother [72] applied PCA to analyze
functional activation patterns in brain activation experiments. Strother et al. [75]
revealed an intra- and intersubject subspace in data and demonstrated that the
activation pattern is usually contained in the first PC. A later study conducted
by Ardekani et al. [76] further demonstrated that the activation pattern may
spread across several PCs rather than lie only on the first PC, particularly when
the number of subjects increases and/or multicenter data are used. PCA was
also applied to aid interpretation of oncologic PET images. Pedersen et al. [74]
applied PCA to aid analysis of dynamic FDG-PET liver data. Anzai et al. [77]
investigated the use of PCA in detection of tumors in head and neck, also using
dynamic FDG-PET imaging. It was found that the first few highest order compo-
nent images often contained tumors whereas the last several components were
simply noise.
Absolute quantification of dynamic PET or SPECT data requires an invasive
procedure where a series of blood samples are taken to form an input function
for kinetic modeling (Chapter 2 of Handbook of Biomedical Image Analysis:
Segmentation, Volume I). Sampling blood at the radial artery or from an arte-
rialized vein in a hand is the currently recognized method to obtain the input
function. However, arterial blood sampling is invasive and carries several potential
risks for both the patient and the personnel who perform the
blood sampling [80]. Drawing ROIs around vascular structures (e.g., left ventricle
in the myocardium [81] and internal carotid artery in the brain [82]) has been
proposed as a noninvasive method that obviates the need for frequent blood sampling.
Delineation of larger vascular structures in the myocardium is relatively
straightforward. In contrast, delineation of internal carotid artery in the head
and neck is not trivial. A potential application of PCA is the extraction of the input
function from dynamic images in which vascular structures are present.
Figure 3.2 shows a sequence of dynamic neurologic FDG-PET images sampled at the
level of the head where the internal carotid arteries
are covered. Figure 3.3 shows the highest 12 PC images. The signal-to-noise ratio
(SNR) of the first PC image is very high compared with the original
image sequence. PC images beyond the second simply represent the
remaining variability that the first two PC images cannot account for, and they
are dominated by noise. The internal carotid arteries can be seen in the second
PC image, from which they can be extracted by thresholding, as mentioned
in sections 3.4.1 and 3.4.4. Figure 3.4 shows a plot of the percent contribution to the
total variance for individual PCs. As can be seen from the figure, the first and
the second PCs contribute about 90% and 2% of the total variance, respectively, while the
remaining PCs each contribute less than 0.6% of the total variance.
Figure 3.3: From left to right, the figure shows the first six principal component
(PC) images (top row), and the 7th to 12th PC images (bottom row) scaled to
their own maxima. All but the first two PC images are dominated by noise. The
higher order PC images (not shown) look very similar to PC images 3–12.
This means that a large amount of information (about 92%) is preserved in only
the first two PCs, and the original images can be approximated using
only the first one or two PCs.
Unlike model-led approaches such as compartmental analysis, where
the physiological parameters in a hypothesized mathematical model are esti-
mated by fitting the model to the data under certain possibly invalid assump-
tions, PCA is data-driven, implying that it does not rely on a mathematical model.
Figure 3.4: The percent variance distribution of the principal component (PC)
images.
X = CF + η (3.36)
where C contains factor coefficients for each pixel and it is of size M × K with
K being the number of factors; F is a K × N matrix which contains underlying
tissue TACs. The additive term η in Eq. (3.36) represents measurement noise
in X.
Similar to the mathematical analysis detailed before for similarity mapping
and PCA, we define xi as the ith pixel vector in X, and fk being the kth underlying
factor curve (TAC), and cki being the factor coefficient that represents contribu-
tion of the kth factor curve to xi . Let Y = CF and yi be a vector which represents
the ith row of Y, then
\[
y_i = \sum_{k=1}^{K} c_{ki} f_k \tag{3.37}
\]
and
\[
x_i = y_i + \eta_i \tag{3.38}
\]
Typically, FADS proceeds by first identifying an orthogonal basis for the se-
quence of dynamic images followed by an oblique rotation. Identification of the
orthogonal basis can be accomplished by PCA discussed previously. However,
the components identified by PCA are not physiologically meaningful because
some components must contain negative values in order to satisfy the orthog-
onality condition. The purpose of oblique rotation is to impose nonnegativity
constraints on the extracted factors (TACs) and the extracted images of factor
coefficients [63].
As mentioned in section 3.2, careful ROI selection and delineation are very
important for absolute quantification, but manual delineation of ROIs is not easy
because of the high noise levels present in the dynamic images. Owing to scatter and
partial volume effects, the selected ROI may represent “lumped” activities from
different adjacent overlapping tissue structures rather than the “pure” temporal
behavior of the selected ROI. FADS, on the other hand, can separate partially
overlapping regions that have different kinetics, which makes it possible to extract
the TACs corresponding to those overlapping regions.
x = c1 f1 + c2 f2 (3.40)
with some constant α. It can be seen that Eqs. (3.40) and (3.41) are equivalent for
describing the measured TAC, x, as long as f1 + αf2 and c2 − αc1 are nonnegative
if nonnegativity constraints have to be satisfied. In other words, there is no
difference between representing x using factors f1 and f2 with factor coefficients c1 and c2 ,
and using factors f1 + αf2 and f2 with factor coefficients c1 and c2 − αc1 . Therefore,
further constraints such as a priori information of the data being analyzed are
required [84–87].
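The nonuniqueness described above is easy to check numerically. The sketch below uses made-up factor curves, coefficients, and a made-up value of α; it only illustrates that the two decompositions reproduce the same measured TAC while both remain nonnegative.

```python
import numpy as np

# Hypothetical factor curves (TACs) sampled at five frames.
f1 = np.array([0.0, 4.0, 3.0, 2.0, 1.0])
f2 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
c1, c2 = 0.3, 0.7                     # factor coefficients of one pixel

x = c1 * f1 + c2 * f2                 # Eq. (3.40)

alpha = 0.5                           # assumed constant keeping all terms nonnegative
f1_alt = f1 + alpha * f2              # modified first factor
c2_alt = c2 - alpha * c1              # modified second coefficient
x_alt = c1 * f1_alt + c2_alt * f2     # alternative decomposition of the same TAC

same_curve = bool(np.allclose(x, x_alt))
nonnegative = bool(np.all(f1_alt >= 0) and c2_alt >= 0)
```

Because both sets of factors fit the data equally well, the data alone cannot select between them, which is why additional constraints are needed.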
FADS has been successfully applied to extract the time course of blood ac-
tivity in left ventricle from PET images by incorporating additional information
about the input function to be extracted [88, 89]. Several attempts have also
been made to overcome the problem of nonuniqueness [90, 91]. It was shown
that these improved methods produced promising results in a patient planar
99mTc-MAG3 renal study and dynamic SPECT imaging of 99mTc-teboroxime in
Cluster analysis has been described briefly in section 3.4.4. One of the major
aims of cluster analysis is to partition a large number of objects according to
certain criteria into a smaller number of clusters that are mutually exclusive and
exhaustive such that the objects within a cluster are similar to one another, while
objects in different clusters are dissimilar. Cluster analysis is of potential value
in classifying PET data, because the cluster centroids (or centers) are derived
from many objects (tissue TACs) and an improved SNR can be achieved [92].
It has been applied to segment dynamic [11C]flumazenil PET data [92] and
dynamic [123I]iodobenzamide SPECT images [93]. In the following, a clustering
algorithm is described. Its application to automatic segmentation of dynamic
FDG-PET data for tumor localization and detection is demonstrated in the next
section. An illustration showing how to apply the algorithm to generate ROIs
automatically for noninvasive extraction of physiological parameters will also
be presented.
The segmentation method is based on cluster analysis. Our aim is to classify
a number of tissue TACs according to their shape and magnitude into a smaller
number of distinct characteristic classes that are mutually exclusive so that the
tissue TACs within a cluster are similar to one another but are dissimilar to
those drawn from other clusters. The clusters (or clustered ROIs) represent the
locations in the images where the tissue TACs have similar kinetics. The kinetic
curve associated with a cluster (i.e. cluster centroid) is the average of TACs in
the cluster. Suppose that there exist k characteristic curves in the dynamic PET
data matrix, X, which has M tissue TACs and N time frames with k ≪ M, and that
any tissue TAC belongs to only one of the k curves. The clustering algorithm then
segments the dynamic PET data into k curves automatically based on a weighted
least-squares distance measure, D, which is defined as
\[
D\{x_i, \mu_j\} = \sum_{j=1}^{k} \sum_{i=1}^{M} \|x_i - \mu_j\|_{W}^{2} \tag{3.42}
\]
where $x_i \in \mathbb{R}^{N}$ is the ith tissue TAC in the data, $\mu_j \in \mathbb{R}^{N}$ is the centroid of cluster
$C_j$, and $W \in \mathbb{R}^{N \times N}$ is a square matrix containing the weighting factors on the
diagonal and zero for the off-diagonal entries. The weighting factors were used
to boost the degree of separation between any TACs that have different uptake
patterns but have similar least-squares distances to a given cluster centroid.
They were chosen to be proportional to the scanning intervals of the experiment.
Although this is not necessarily an optimal weighting, reasonably good clustering
results can be achieved.
There is no explicit assumption on the structure of data and the clustering
process proceeds automatically in an unsupervised manner. The minimal as-
sumption for the clustering algorithm is that the dynamic PET data can be rep-
resented by a finite number of kinetics. As the number of clusters, k, for a given
dataset is usually not known a priori, k is usually determined by trial and error.
where $x_l \in \mathbb{R}^{N}$ is the lth tissue TAC in X; $\mu_i \in \mathbb{R}^{N}$ and $\mu_j \in \mathbb{R}^{N}$ are the ith and
jth cluster centroids, respectively; and $C_i$ represents the ith cluster set. The
centroids in the clusters are updated based on Eq. (3.43) so that Eq. (3.42) is
minimized. The above allocation and updating processes are repeated for all
tissue TACs until moving a tissue TAC from one cluster to another yields no
further reduction in cost. On convergence, the cluster centroids are mapped back to the
original data space for all voxels. An improved SNR can be achieved because
each voxel in the mapped data space is represented by one of the cluster cen-
troids each of which possesses a higher statistical significance than an individual
TAC.
Convergence to a global minimum is not always guaranteed because the
final solution is not known a priori unless certain constraints are imposed on
the solution that may not be feasible in practice. In addition, there may be
several local minima in the solution space when the number of clusters is large.
Restarting the algorithm with different initial cluster centroids is necessary to
identify the best possible minimum in the solution space.
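The allocate, update, and restart cycle can be sketched as a K-means-type loop with a diagonal weight matrix. This is not the exact algorithm of the text (in particular, it reassigns all TACs at once instead of testing individual moves for a cost reduction); the function name, the random initialisation, and the synthetic TACs and frame durations are all assumptions.

```python
import numpy as np

def weighted_kmeans(X, k, w, n_restarts=10, max_iter=100, seed=0):
    """K-means-type clustering of TACs with diagonal weights.

    X : (M, N) array of tissue TACs; w : (N,) diagonal entries of W.
    Returns (labels, centroids, cost) for the restart with the lowest cost.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    rng = np.random.default_rng(seed)
    best = (None, None, np.inf)
    for _restart in range(n_restarts):
        # Initialise the centroids with k distinct TACs chosen at random.
        mu = X[rng.choice(len(X), size=k, replace=False)].copy()
        labels = np.full(len(X), -1)
        for _iteration in range(max_iter):
            # Weighted squared distance of every TAC to every centroid.
            d = (((X[:, None, :] - mu[None, :, :]) ** 2) * w).sum(axis=2)
            new_labels = d.argmin(axis=1)
            if np.array_equal(new_labels, labels):
                break                      # no TAC changed cluster
            labels = new_labels
            for j in range(k):
                members = X[labels == j]
                if len(members):           # keep the old centroid if a cluster empties
                    mu[j] = members.mean(axis=0)
        cost = (((X - mu[labels]) ** 2) * w).sum()
        if cost < best[2]:
            best = (labels.copy(), mu.copy(), cost)
    return best

# Hypothetical example: two kinetic patterns, 50 noisy TACs each.
rng = np.random.default_rng(1)
t = np.array([0.5, 1.5, 3.0, 6.0, 12.0, 25.0])    # assumed mid-frame times (min)
w = np.array([1.0, 1.0, 2.0, 4.0, 8.0, 18.0])     # assumed frame durations (min)
w = w / w.sum()
true_curves = np.vstack([1.0 - np.exp(-0.3 * t), 2.0 * np.exp(-0.2 * t)])
truth = np.repeat([0, 1], 50)
X = true_curves[truth] + 0.05 * rng.standard_normal((100, t.size))
labels, centroids, cost = weighted_kmeans(X, k=2, w=w)
```

Replacing each TAC by `centroids[labels]` corresponds to mapping the cluster centroids back to the original data space.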
The algorithm is similar to the K -means type Euclidean clustering algo-
rithm [40]. However, the K -means type Euclidean clustering algorithm requires
that the data are normalized and it does not guarantee that the within-cluster
cost is minimized since no testing is performed to check whether there is any
cost reduction if an object is moved from one cluster to another.
The work presented in this section builds on our earlier research in which we
applied the proposed clustering algorithm to tissue classification and segmenta-
tion of phantom data and a cohort of dynamic oncologic PET studies [94]. The
study was motivated by our on-going work on a noninvasive modeling approach
for quantification of FDG-PET studies where several ROIs of distinct kinetics are
required [95, 96]. Manual delineation of ROIs restrains the reproducibility of the
proposed modeling technique; therefore, other semiautomated and
automated methods have been investigated, and clustering appears to be a promising
alternative for automatically segmenting ROIs of distinct kinetics. The results
indicated that the kinetic and physiological parameters obtained with cluster
analysis are similar to those obtained with manual ROI delineation, as we will
see in the later sections.
muscle, spleen, stomach, a large and small tumor in the liver (see Fig. 3.5). A
dynamic sequence of sinograms was obtained by forward projecting the images
into 3.13 mm bins on a 192 × 256 grid. Attenuation was included in the sim-
ulations for the purpose of obtaining the correct scaling of the noise. Poisson
noise and blurring were added to simulate realistic sinograms. Noisy dynamic
images were then reconstructed using FBP (Hann filter cut-off at the Nyquist
frequency). Figure 3.6 shows the metabolite-corrected arterial blood curve and
noisy 2-[11C]thymidine kinetics in some representative tissues.
Figure 3.7: A slice of the Hoffman brain phantom. A tumor in white matter
(white spot) and an adjacent hypometabolic region (shaded region) are shown.
A dynamic FDG-PET study was simulated using a slice of the numerical Hoffman
brain phantom [99] that was modified using a template consisting of five different
kinetics (gray matter, white matter, thalamus, tumor in white matter, and
an adjacent hypometabolic region in left middle temporal gyrus), as shown in
Fig. 3.7. The activities in gray matter and white matter were generated using a
five-parameter three-compartment FDG model [100] with a measured arterial
input function obtained from a patient (constant infusion of 400 MBq of FDG
over 3 min). The kinetics present in the hypometabolic region, thalamus, and
tumor were set to 0.7, 1.1, and 2.0 times the activity in gray matter. The kinetics
were then assigned to each brain region and a dynamic sequence of sinograms
(22 frames, 6 × 10 sec, 4 × 30 sec, 1 × 120 sec, 11 × 300 sec) was obtained by for-
ward projecting the images into 3.13 mm bins on a 192 × 256 grid. Poisson noise
and blurring were also added to simulate realistic sinograms. Dynamic images
were reconstructed using FBP with Hann filter cut-off at the Nyquist frequency.
The noisy FDG kinetics are shown in Fig. 3.8 and some of the kinetics are similar
to each other because of the added noise and Gaussian blurring, although their kinetics
are different in the absence of noise and blurring. This is illustrated in the
white matter and the hypometabolic region, and the gray matter and thalamus.
based on the given dataset. In this study, a model-based approach was adopted for
cluster validation based on two information-theoretic criteria, namely, Akaike
information criterion (AIC) [101] and Schwarz criterion (SC) [102], assuming
that the data can be modeled by an appropriate probability distribution function
(e.g. Gaussian). Both criteria determine the optimal model order by penalizing
the use of a model that has a greater number of clusters. Thus, the number of
clusters that yields the lowest value for AIC and/or SC is selected as the opti-
mum. The use of AIC and SC has some advantages compared to other heuristic
approaches such as the “bootstrap” resampling technique which requires a large
amount of stochastic computation. This model-based approach is relatively flex-
ible in evaluating the goodness-of-fit and a change in the probability model of
the data does not require any change in the formulation except the modeling
assumptions. It is noted, however, that both criteria may not indicate the same
model as the optimum [102].
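To make the penalization idea concrete, the sketch below uses the standard least-squares forms of the two criteria under a Gaussian error assumption. The exact expressions used in the study are not reproduced in the text, so these formulas, the parameter count of k × N, and the residual values are all assumptions for illustration.

```python
import numpy as np

def aic_sc(rss, n_obs, n_params):
    """AIC and Schwarz criterion for a least-squares fit with Gaussian errors.

    Standard textbook forms with additive constants dropped (assumed here).
    """
    fit = n_obs * np.log(rss / n_obs)          # goodness-of-fit term
    aic = fit + 2.0 * n_params                 # AIC penalty: 2 per parameter
    sc = fit + n_params * np.log(n_obs)        # SC penalty: ln(n) per parameter
    return aic, sc

# Hypothetical residual sums of squares for k = 2..6 clusters fitted to
# M TACs of N frames each; each cluster is assumed to cost N parameters.
M, N = 1000, 20
rss_by_k = {2: 900.0, 3: 500.0, 4: 300.0, 5: 299.0, 6: 298.5}
scores = {k: aic_sc(rss, M * N, k * N) for k, rss in rss_by_k.items()}

best_k_aic = min(scores, key=lambda k: scores[k][0])
best_k_sc = min(scores, key=lambda k: scores[k][1])
```

In this made-up example the heavier penalty of SC selects a smaller k than AIC does, which mirrors the remark that the two criteria may not indicate the same model as the optimum.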
The validity of clusters is also assessed visually and by thresholding
the average mean squared error (MSE) across clusters, which is defined
as
\[
\mathrm{MSE} = \frac{1}{k} \sum_{j=1}^{k} \sum_{i=1}^{M} \|x_i - \mu_j\|_{W}^{2}. \tag{3.44}
\]
Both approaches are subjective but they can provide an insight into the “correct”
number of clusters.
Patient 1: The FDG-PET scan was done in a female patient, 6 months after
resection of a malignant primary brain tumor in the right parieto-occipital
lobe. The scan was done to determine if there was evidence for tumor
recurrence. A partly necrotic hypermetabolic lesion was found in the
right parieto-occipital lobe that was consistent with tumor recurrence.
As they are unnecessary for clustering and the subsequent analysis, low
count areas such as the background (where the voxel values should be zero the-
oretically) and streaks (which are due to reconstruction errors) were excluded
by zeroing voxels whose summed activity was below 5% of the mean pixel in-
tensity of the integrated dynamic images. A 3 × 3 closing followed by a 3 × 3
erosion operation was then applied to fill any “gap” inside the intracranial/body
region to which cluster analysis was applied. Parametric images of the physiological
parameter K, defined as $K = k_1^{*} k_3^{*} / (k_2^{*} + k_3^{*})$ [104],
were generated by fitting all voxels inside the intracranial/body region using
the Patlak graphical approach [105]. The resultant parametric images obtained for
the raw dynamic images and dynamic images after cluster analysis were as-
sessed visually. Compartmental model fitting using the three-compartment FDG
model [104] was also performed on the tissue TACs extracted manually and by
cluster analysis to investigate whether there is any disagreement between the
parameter estimates.
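The low-count exclusion step described above can be sketched as a simple threshold on the time-integrated image. The function name and the toy data are assumptions, and the morphological closing and erosion that follow in the text are not shown here.

```python
import numpy as np

def mask_low_counts(dynamic, fraction=0.05):
    """Zero voxels whose summed activity is below a fraction of the mean
    intensity of the integrated (time-summed) image.

    dynamic : (T, H, W) array of dynamic frames.
    Returns (masked copy of the frames, boolean keep-mask of shape (H, W)).
    """
    dynamic = np.asarray(dynamic, dtype=float)
    integrated = dynamic.sum(axis=0)                  # integrated dynamic image
    keep = integrated >= fraction * integrated.mean()
    return dynamic * keep, keep                       # mask broadcasts over frames

# Hypothetical 4-frame, 3 x 3 study.
frames = np.zeros((4, 3, 3))
frames[:, 1, 1] = 10.0      # a voxel inside the body
frames[:, 0, 0] = 0.01      # a faint streak in the background
masked, keep = mask_low_counts(frames)
```

Only voxels where `keep` is true would then be passed on to cluster analysis and to the voxel-wise fitting.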
3.6.4 Results
3.6.4.1 Simulated [11C]Thymidine PET Study
Figure 3.9 shows the segmentation results using different numbers of clusters,
k, in the clustering algorithm. The number of clusters was actually varied from 3
to 13 but only some representative samples are shown. In each of the images in
Figs. 3.9(a)–3.9(f), different gray levels are used to represent the cluster loca-
tions. Figure 3.9 shows that when the number of clusters is small, segmentation
Figure 3.9: Tissue segmentation obtained with different number of clusters.
(a) k = 3, (b) k = 5, (c) k = 7, (d) k = 8, (e) k = 9, and (f) k = 13. (Color
Slide)
of the data is poor. With k = 3, the liver, marrow, and spleen merge to form a
cluster and the other regions merge to form a single cluster. With 5 ≤ k ≤ 7, the
segmentation results improve because the blood vessels and stomach are visu-
alized. However, the hepatic tumors are not seen and the liver and spleen are
classified into the same cluster. With k = 8, the tumors are visualized and almost
all of the regions are correctly identified (Fig. 3.9(d)). Increasing the value of
k to 9 gives nearly the same segmentation as in the case of k = 8 (Fig. 3.9(e)).
Further increasing the value of k, however, may result in poor segmentation be-
cause the actual number of tissues present in the data is less than the specified
number of clusters. Homogeneous regions are therefore fragmented to satisfy
the constraint on the number of clusters (Fig. 3.9(f)). Thus, 8 or 9 clusters ap-
pear to provide reasonable segmentation of tissues in the slice and this number
agrees with the various kinetics present in the data.
Figure 3.10 plots the average MSE across clusters as a function of k. The average
MSE decreases monotonically, dropping rapidly for k < 8 before reaching
a plateau for k ≥ 10. From the trend of the plot, there is no significant reduction
in the average MSE with k > 12. Furthermore, the decrease in the average
MSE is nearly saturated with k ≥ 8. These results confirm the findings of the
images in Fig. 3.9, suggesting 8 or 9 as the optimal number of clusters for this
dataset.
Table 3.1 tabulates the results of applying AIC and SC to determine the op-
timum number of clusters which is the one that gives the minimum value for
Table 3.1: Computed values for AIC and SC with different choices of the
value of k

                                    Number of clusters, k
Criterion       3       4       5       6       7       8       9      10      11      12      13
AIC         99005   95354   93469   90904   88851   86967a  89769   93038   91994   90840   89807
SC          98654   94888   92887   90206   88038   86038a  88725   91878   90719   89450   88301
Figure 3.11: Single slice of simulated 2-[11C]thymidine PET study. Top row
shows the original reconstructed images at (a) 15 sec, (b) 75 sec, (c) 135 sec,
(d) 285 sec, (e) 1020 sec, and (f) 2850 sec postinjection. Bottom row shows same
slice at identical time points after cluster analysis. Individual images are scaled
to their own maximum.
Five cluster images were generated by applying the clustering algorithm to the
noisy simulated dynamic images. The number of clusters k was actually varied
from 3 to 10 and the optimal k was determined by inspecting the change of
average MSE and the visual quality of the cluster images. Figure 3.12 shows the
cluster images for k = 5, which was found to be the optimum number of clusters
for this dataset. The tumor could not be located when k was small
(k < 4). The tumor was located by gradually increasing the number of clusters.
However, there was a deteriorated segmentation of all regions when k was large
(k > 7).
Figure 3.12: Five cluster images obtained from the noisy simulated dynamic
FDG-PET data. The images correspond to (a) ventricles and scalp, (b) white
matter and left middle temporal gyrus (hypometabolic zone), (c) partial volume
between gray matter and white matter, (d) gray matter, deep nuclei, and outer
rim of tumor, and (e) tumor.
Although the tumor was small in size, cluster analysis was still able to locate
it because of its abnormal temporal characteristics as compared to the other
regions. Cluster analysis also performed well in extracting underlying tissue
kinetics in gray matter and white matter because of their distinct kinetics. On
the contrary, the kinetics in the thalamus and the hypometabolic region were
not separated from those in gray and white matter but this was not unexpected
since their kinetics were very similar.
Owing to the partial volume effects (PVEs), there were some vague regions
whose kinetics were indeterminate (Fig. 3.12(c)) and did not approach gray or
white matter. The algorithm was unlikely to assign such kinetics to the cluster
corresponding to white matter or to the cluster corresponding to gray matter
since the overall segmentation results would deteriorate. Thus, a cluster
was formed to account for the indeterminate kinetics.
Segmentation results are shown for dynamic neurologic (Fig. 3.13) and lung
(Fig. 3.14) FDG-PET studies. The clusters are represented by differing gray
scales and slices were sampled at the level where the lesions were seen on
the original reconstructed data. Since there is no a priori knowledge about the
optimum number of clusters, the value of k was varied in order to determine
the optimal segmentation using the AIC and SC as in the phantom study. For
Fig. 3.13, eight clusters were found to give the optimal segmentation for these
datasets. The locations of the tumors and the rim of increased glucose uptake
are identified correctly by the clustering algorithm with the optimal number of
clusters.
Figure 3.13: Tissue segmentation obtained from Patient 1 at (a) slice 10,
(b) slice 13, and (c) slice 21; and Patient 2 at (d) slice 21, (e) slice 24, and
(f) slice 26. The number of clusters used is eight. The locations of the solid
hypermetabolic portions of the tumors (arrows) and the small rim of increased
glucose uptake (arrowheads) identified by cluster analysis are shown.
For Fig. 3.14, the number of clusters was varied from 3 to 13 and only some
representative results are shown. Similar to the simulation study, the segmen-
tation results are poor when the number of clusters is small (k = 3), while the
segmentation is gradually improved by increasing the number of clusters. Based
on the AIC and SC, the optimum numbers of clusters for the selected slices (4,
19, and 24) were found to be 8, 8, and 9, respectively. It is not surprising that
the optimum number of clusters is different for different slices because of the
differing number of anatomical structures contained in the plane and the het-
erogeneity of tracer uptake in tissues. Nevertheless, the tumor (slice 4), right
lung and muscle (slices 4, 19, and 24), blood pool (slices 4, 19, and 24), separate
foci of increased FDG uptake (slices 19 and 24), and the injection site (slices 4,
19, and 24) are identifiable with the optimum number of clusters.
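The sweep over k described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation used in the study: plain k-means with Euclidean distance stands in for the clustering algorithm, the TACs are synthetic, and the AIC and SC are taken in their common least-squares forms, N ln(RSS/N) + 2P and N ln(RSS/N) + P ln N, which may differ from the definitions used earlier in the chapter. All function and variable names are hypothetical.

```python
import math
import random

def sqdist(a, b):
    """Squared Euclidean distance between two TACs."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(curves, k, iters=50):
    """Plain k-means on TACs with deterministic farthest-point initialization."""
    centers = [list(curves[0])]
    while len(centers) < k:
        # Next center: the curve farthest from all centers chosen so far.
        far = max(curves, key=lambda c: min(sqdist(c, m) for m in centers))
        centers.append(list(far))
    labels = [0] * len(curves)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sqdist(c, centers[j])) for c in curves]
        for j in range(k):
            members = [c for c, lab in zip(curves, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    rss = sum(sqdist(c, centers[lab]) for c, lab in zip(curves, labels))
    return labels, centers, rss

def criteria(rss, n_values, n_params):
    """Assumed least-squares forms of the AIC and SC (lower is better)."""
    fit = n_values * math.log(max(rss, 1e-12) / n_values)
    return fit + 2 * n_params, fit + n_params * math.log(n_values)

# Synthetic TACs: three hypothetical kinetic shapes, 20 noisy curves each.
rng = random.Random(0)
times = [1, 2, 4, 8, 16, 32]
shapes = [
    [1.0 - math.exp(-t / 4.0) for t in times],
    [3.0 * math.exp(-t / 8.0) for t in times],
    [2.0 + 0.05 * t for t in times],
]
curves = [[v + rng.gauss(0.0, 0.05) for v in s] for s in shapes for _ in range(20)]

n_values = len(curves) * len(times)
scores = {}
for k in range(1, 7):
    _, _, rss = kmeans(curves, k)
    # Each cluster contributes one centroid value per frame as a free parameter.
    scores[k] = criteria(rss, n_values, k * len(times))

best_k_aic = min(scores, key=lambda k: scores[k][0])
best_k_sc = min(scores, key=lambda k: scores[k][1])
```

Both criteria trade goodness of fit against the number of free parameters, so too few clusters are penalized through a large residual and too many through the parameter count. The two criteria need not agree, since the SC penalizes additional clusters more heavily than the AIC.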
Figure 3.14: Tissue segmentation of the dynamic lung FDG-PET data from
Patient 3 in three selected slices: 4 (top row), 19 (middle row), and 24 (bottom
row) with different numbers of clusters. (a) k = 4, (b) k = 7, (c) k = 8, (d) k = 9,
(e) k = 10, and (f) k = 12. (I = injection site; B = blood pool; L = lung; T =
tumor).
Figure 3.15 shows the measured blood TAC at the pulmonary artery and
the extracted tissue TACs for the tumor (from slice 4), lung and muscle (from
slice 19), foci of increased FDG uptake (from slice 24), and the blood pool (from
slice 19) using the corresponding optimal number of clusters.
TAC = time-activity curve; ROI = region of interest. Values are given as estimate ± %CV.
The extracted tissue TACs obtained by cluster analysis and by manual ROI
delineation were fitted to the three-compartment FDG model using the nonlinear
least-squares method, and the results obtained for the tumor tissue TAC (Patient 2)
are summarized in Table 3.2. There was close agreement between the parameter
estimates for the tissue TACs obtained by the two methods, in terms of both the
estimate and the coefficient of variation (CV), which is defined as the ratio of
the standard deviation of the parameter estimate to the value of the estimate.
Similar results were also found for the other regions.
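As a small illustration of the CV defined above, the sketch below converts parameter estimates and their covariance matrix into percentage CVs. It assumes that the fitting routine returns a covariance matrix whose diagonal holds the variances of the estimates; the numbers are hypothetical and are not taken from Table 3.2, and the absolute value is an added safeguard for negative estimates.

```python
import math

def percent_cv(estimates, covariance):
    """Return the %CV of each parameter: 100 * SD / |estimate|."""
    cvs = []
    for i, p in enumerate(estimates):
        sd = math.sqrt(covariance[i][i])  # SD from the diagonal of the covariance
        cvs.append(100.0 * sd / abs(p))
    return cvs

# Hypothetical estimates and covariance matrix (illustrative values only).
estimates = [0.10, 0.20, 0.05]
covariance = [
    [1.0e-4, 0.0, 0.0],
    [0.0, 4.0e-4, 0.0],
    [0.0, 0.0, 2.5e-5],
]
cv = percent_cv(estimates, covariance)
```

Two sets of estimates can then be compared on both the estimate and its %CV, as in the comparison between clustered and manually drawn ROIs.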
Table 3.3: Comparison between the estimated input functions obtained using
different numbers of manually drawn ROIs and clustered ROIs, and the
measured input functions
Number of ROIs
2 3 4 5
MSE = mean square error between the estimated and the measured input functions; AUC = area under
the blood curve; r = coefficient of correlation; ROI = region of interest.
FDG-PET studies. Table 3.3 summarizes the results for the estimation of the
input functions by the proposed modeling approach for both manually drawn
ROIs and clustered ROIs. The MSEs between the estimated and the measured input
functions are tabulated. In addition, the results of linear regression analysis on
the areas under the curves (AUCs) of the measured and the estimated
input functions are listed for comparison. Regression lines with slopes close to
unity and intercepts close to zero were obtained in all cases for manually drawn
ROIs and for clustered ROIs.
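The three comparison measures mentioned above (MSE, AUC, and linear regression between AUCs) can be computed as sketched below. This is a minimal illustration with hypothetical curves and AUC values; the trapezoidal rule is assumed for the AUC, and none of the numbers come from Table 3.3.

```python
import math

def mse(estimated, measured):
    """Mean square error between two curves sampled at the same times."""
    return sum((e - m) ** 2 for e, m in zip(estimated, measured)) / len(measured)

def auc(times, values):
    """Area under a sampled curve by the trapezoidal rule."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )

def linear_regression(x, y):
    """Ordinary least-squares slope, intercept, and correlation coefficient r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)

# Hypothetical measured and estimated input functions for one study.
times = [0.0, 1.0, 2.0, 4.0]
measured = [0.0, 10.0, 6.0, 2.0]
estimated = [0.0, 12.0, 6.0, 2.0]
error = mse(estimated, measured)
auc_measured = auc(times, measured)
auc_estimated = auc(times, estimated)

# Hypothetical AUC pairs from several studies, regressed against each other.
slope, intercept, r = linear_regression([10.0, 20.0, 30.0], [11.0, 21.0, 31.0])
```

A slope near unity with an intercept near zero indicates that the estimated input functions reproduce the measured AUCs without a proportional or constant bias.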
Figure 3.16 plots the measured input function together with the input functions
estimated from manually drawn ROIs and from clustered ROIs. The estimated
input functions were obtained by simultaneously fitting three ROIs of
distinct kinetics. There was very good agreement between the estimated input
functions and the measured blood curve in terms of the shape and the time
at which the peak occurs, despite the overestimation of the
peak value. Thus, cluster analysis may be useful as a preprocessing step before
our noninvasive modeling technique.
Figure 3.16: Plot of the measured arterial input function, the estimated input
function from manually drawn regions of interest (ROIs), and clustering based
ROIs. The estimated input functions were obtained by simultaneously fitting
with three ROIs of distinct kinetics.
Figure 3.17: Parametric images on a pixel-by-pixel basis of K obtained from
Patient 1: (a) slice 10; (b) slice 13; (c) slice 21. Top row shows the images obtained
from the raw dynamic images and bottom row shows the images obtained from
dynamic images after cluster analysis. The images have been smoothed slightly
for better visualization.
on the noise levels inherent in the data which affect, in addition to meaningful
parameter estimation, the time required to converge as well as the convergence.
Clustering may be useful as a preprocessing step before fast generation of
parametric images, since only a few characteristic curves, which have high
statistical significance, need to be fitted, whereas conventional pixel-by-pixel
parametric image generation requires many thousands of very noisy tissue TACs
to be analyzed. The computational advantage and time savings for generation
of parametric images (fitting several curves rather than many thousands of
kinetic curves) are apparent.
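The saving can be made concrete with a short sketch: the model is fitted once per cluster centroid and the result is copied to every pixel carrying that cluster label. The fitting routine here is only a stand-in (a least-squares slope of the TAC), not the kinetic model used in this chapter, and all names and data are hypothetical.

```python
def fit_slope(times, tac):
    """Stand-in for a kinetic model fit: least-squares slope of the TAC."""
    n = len(times)
    mt = sum(times) / n
    my = sum(tac) / n
    sxx = sum((t - mt) ** 2 for t in times)
    sxy = sum((t - mt) * (y - my) for t, y in zip(times, tac))
    return sxy / sxx

def parametric_image(labels, centroids, times, fit=fit_slope):
    """Build a parametric image from a label map with one fit per cluster."""
    params = [fit(times, c) for c in centroids]  # k fits instead of one per pixel
    return [[params[k] for k in row] for row in labels]

# Hypothetical 2 x 2 label map with two clusters and three time frames.
times = [0.0, 1.0, 2.0]
centroids = [[0.0, 1.0, 2.0], [0.0, 3.0, 6.0]]
labels = [[0, 1], [1, 0]]
image = parametric_image(labels, centroids, times)
```

The number of fits equals the number of clusters regardless of image size, which is where the time saving comes from.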
Figure 3.17 shows the parametric images of the physiological parameter, K,
obtained from the neurologic study for Patient 1 in the three selected slices. The
top and bottom rows correspond to the results obtained from pixel-by-pixel
fitting of the TACs in the raw dynamic PET data and in the data after cluster
analysis, respectively. The K images obtained from the raw data are relatively
noisy compared with those obtained after cluster analysis because the high noise
levels of PET data hamper reliable parametric image generation. The visual
quality of the K images improves markedly with cluster analysis as a result of the
increased SNR of the dynamic images. Low-pass filtering of the original parametric
images may improve the SNR, but clustering should produce better results because
it averages the tissue TACs with similar temporal characteristics. In contrast,
low-pass filtering uses only spatial information (adjacent pixels), which
further degrades the spatial resolution. The feasibility of
using the kinetic curves extracted by cluster analysis for noninvasive quantification
of physiological parameters and parametric imaging has been investigated,
and some preliminary data have been reported [109]. Other recent studies
can be found elsewhere [110–115].
Ideally, after corrections for physical artifacts (e.g. scatter and attenuation)
and calibration, the reconstructed PET image should accurately represent the
radiopharmaceutical distribution in absolute units of radioactivity concentration
throughout the field of view of the PET scanner. However, this only holds
for organs or structures with dimensions greater than twice the spatial resolution
of the scanner, which is characterized by the full width at half maximum (FWHM)
of the image of a line source. When the object or structure being imaged is smaller
than this, the apparent activity concentration is diluted. The degree of dilution
varies with the size of the structure being imaged and with its
radioactivity concentration relative to that of its surrounding
structures [10]. This phenomenon is known as the partial volume effect (PVE),
which is caused solely by the limited spatial resolution of the PET scanner.
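The dependence of the dilution on object size can be illustrated with a one-dimensional sketch. It assumes a Gaussian point spread function specified by its FWHM, a uniform object in a cold (zero-activity) background, and sizes expressed in samples; these are simplifying assumptions for illustration, and the function names are hypothetical.

```python
import math

def gaussian_kernel(fwhm):
    """Normalized Gaussian kernel with the given FWHM (in samples)."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half = int(math.ceil(4.0 * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, kernel):
    """Convolve with a symmetric kernel; samples outside the array count as zero."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += signal[idx] * w
        out.append(acc)
    return out

def recovery_coefficient(width, fwhm, n=201):
    """Apparent peak activity divided by the true activity of a uniform object."""
    start = (n - width) // 2
    profile = [1.0 if start <= i < start + width else 0.0 for i in range(n)]
    return max(smooth(profile, gaussian_kernel(fwhm)))

fwhm = 6.0
rc_small = recovery_coefficient(width=3, fwhm=fwhm)   # object of half the FWHM
rc_large = recovery_coefficient(width=24, fwhm=fwhm)  # object of four times the FWHM
```

With these settings the narrow object recovers well under its true activity, whereas the wide object recovers essentially all of it. With a warm background the apparent value would also include spill-in from the surroundings, which is why the dilution depends on the contrast as well as on the size.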
A number of approaches have been proposed to correct for or minimize the
PVE, including resolution recovery before or during image reconstruction, and
incorporation of side information provided by anatomical imaging modalities
such as CT and MRI. One popular approach that incorporates MRI
segmentation for partial volume correction is the method proposed by
Müller-Gärtner et al. [54], although the method is only applicable to brain imaging.
PET images are first spatially co-registered with MR images obtained from the same
subject. The MR images are then segmented into gray matter, white matter, and CSF
regions, represented in three separate images. These images are then convolved
spatially with a smoothing kernel derived from the point spread function
of the PET scanner. The convolved white matter image is then normalized to
the counts in a white matter ROI drawn on the PET image and subtracted from the
PET image, so that the spillover of white matter activity into gray matter regions
is removed. Finally, the resultant image is divided by the smoothed gray matter
image so that signals in small structures, which were smoothed severely, are
enhanced.
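The sequence of steps above can be followed on a noise-free, one-dimensional example. This is a sketch under simplifying assumptions rather than the published implementation: the tissue masks are binary and perfectly registered, CSF activity is taken to be zero, and a short hypothetical kernel stands in for the scanner point spread function. All names and values are illustrative.

```python
def smooth(signal, kernel):
    """Convolve with a symmetric kernel; samples outside the array count as zero."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += signal[idx] * w
        out.append(acc)
    return out

kernel = [0.1, 0.2, 0.4, 0.2, 0.1]  # hypothetical stand-in for the scanner PSF

# Segmented "MR" masks: white matter, a thin gray matter strip, then CSF.
n = 40
wm = [1.0 if i < 20 else 0.0 for i in range(n)]
gm = [1.0 if 20 <= i < 24 else 0.0 for i in range(n)]

# Simulated PET profile: true activities blurred by the kernel (CSF assumed zero).
true_gm, true_wm = 4.0, 1.0
pet = smooth([true_gm * g + true_wm * w for g, w in zip(gm, wm)], kernel)

# Steps 1 and 2: convolve the tissue masks with the kernel.
wm_s = smooth(wm, kernel)
gm_s = smooth(gm, kernel)

# Step 3: scale the smoothed white matter mask to a white matter ROI on the PET data.
roi = range(5, 15)  # ROI well inside white matter, away from gray matter
wm_level = sum(pet[i] for i in roi) / sum(wm_s[i] for i in roi)

# Steps 4 and 5: subtract white matter spillover, then divide by the smoothed
# gray matter mask; values are reported inside the gray matter mask only.
corrected = [
    (pet[i] - wm_level * wm_s[i]) / gm_s[i] if gm[i] else 0.0
    for i in range(n)
]
```

In this idealized setting the corrected gray matter values return to the true activity, while the uncorrected profile underestimates it, most strongly at the edge next to CSF. With real data the result also depends on registration and segmentation accuracy and on noise.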
In this chapter, a number of segmentation techniques used in, but not specific
to, functional imaging have been detailed. In particular, tissue segmentation and
classification in functional imaging are of primary interest for dynamic imaging,
for which cluster analysis is a valuable asset for data analysis because the
identified characteristic curves are in the same space as the original data. This
has advantages over PCA in terms of the interpretation of the identified PCs,
and over FADS, where the factor components are rotated, leading to possibly
nonunique factor explanation and interpretation. This chapter focuses on
functional segmentation, and a clustering technique has been presented and
discussed in detail. The proposed technique is an attempt to overcome some of the
limitations of the commonly used manual ROI delineation, which is
labor intensive and time-consuming. The clustering technique described is able
to provide statistically meaningful clusters because the entire sequence of
images is analyzed and the different kinetic behaviors and their associated regions
are extracted from the dataset, as long as there is a finite number of kinetics in
the data. Once the segmentation process is completed, the extracted TACs, i.e. the
cluster centroids, are mapped back to the original data space for all voxels.
Thus, an improved SNR can be achieved because each voxel in the mapped
data space is represented by one of the cluster centroids, each of which
possesses a higher statistical significance than an individual TAC in the same spatial
type may have inherent heterogeneity in it. A typical example is tumors, which are
naturally heterogeneous. Activity concentration in a small tissue structure can
be underestimated or overestimated due to partial volume effects, which cause
the structure being imaged to mix with adjacent structures of possibly markedly
different kinetics within the image volume, resulting in mixed kinetics of the
structures involved. As a finite number of clusters is assumed to be present
in the raw PET data, the clustering algorithm will automatically look for the
cluster centers that best represent the dataset without any a priori knowledge
about the data and without violating the specified number of clusters. Therefore,
certain regions that are indeterminate but have similar kinetics may
be grouped together owing to the constraint on the number of clusters, resulting
in the formation of vague clusters. Further studies are required to investigate
tissue heterogeneity in cluster analysis.
In earlier work, O’Sullivan [110] used cluster analysis as an intermediate step
to extract “homogeneous” TACs from data containing a heterogeneous mix of
kinetics resulting from spillover and partial volume effects for parametric
mapping. This method is very similar to FADS, but there is one main difference
between them: FADS extracts physiological factors (TACs) that could
(theoretically) be found in the original data, whereas the set of “homogeneous” TACs
does not necessarily correspond to the underlying physiology. In the present
work, cluster analysis is used to extract kinetic data with different temporal
characteristics as well as for parametric mapping. This is important for data
analysis because data with different temporal behavior are better characterized
by the extracted features seen in a spatial map, which is simpler to
interpret than the original multidimensional data. Like
O’Sullivan’s approach [110], our method is data driven and is independent of
the properties of the tracer, which may be required by other methods [111]. Thus,
the clustering approach can be applied to a wide range of tracer studies.
It is anticipated that cluster analysis has a great deal of potential in PET data
analysis for various neurodegenerative conditions (e.g. dementias) or diseases
such as multiple system atrophy, Lewy body disease, and Parkinson’s disease,
where numerical values for glucose metabolism and the patterns of glucose
hypometabolism may aid in the diagnosis and the assessment of disease
progression. Localization of seizure foci in patients with refractory extratemporal
epilepsy is also important, as this patient group poses a difficult management
problem for surgical epilepsy programs. Functional segmentation may be
. . . at any moment we are prisoners caught in the framework of our theories; our
expectations; our language. But . . . if we try, we can break out of our framework at
any time. Admittedly, we shall find ourselves again in a framework, but it will be a
better and roomier one; and we can at any moment break out of it again.
There is no magic method that suits all problems. One has to recognize the strengths
and limitations of a technique and understand what kind of information the
technique provides; careful definition of the goals of segmentation is essential.
It is also important to remember that new ideas and techniques may bring us
something valuable that we are eager to see, but something else may be overlooked
or missed in the meantime because we are bound by the framework of
those ideas or techniques, just like a prisoner, as Popper said. We can only
hope that the new idea or technique, i.e. the prison, will be a better
and roomier one, out of which we can break again at any time if there is a
need!
Acknowledgment
This work was partially supported by the Hong Kong Polytechnic University
under Grant G-YX13. Some of the results presented in this chapter were obtained
in the period 1999 to 2002 during which the author was sustained financially by
the National Health and Medical Research Council (NHMRC) of Australia.
Questions
6. What are the similarities and differences between principal component
analysis (PCA) and factor analysis of dynamic structures (FADS)?
7. What are the major advantages of cluster analysis over other multivariate
analysis approaches such as PCA and FA?
8. What are the major advantages of using clustering for characterizing tissue
kinetics?
Bibliography
[4] Brzakovic, D., Luo, X. M., and Brzakovic, P., An approach to automated
detection of tumors in mammograms, IEEE Trans. Med. Imaging, Vol. 9,
pp. 233–241, 1990.
[5] Liang, Z., MacFall, J. R., and Harrington, D. P., Parameter estimation
and tissue segmentation from multispectral MR images, IEEE Trans.
Med. Imaging, Vol. 13, pp. 441–449, 1994.
[6] Ardekani, B. A., Braun, M., Hutton, B. F., Kanno, I., and Iida, H., A
fully automatic multimodality image registration algorithm, J. Comput.
Assist. Tomogr., Vol. 19, pp. 615–623, 1995.
[7] Bankman, I. N., Nizialek, T., Simon, I., Gatewood, O. B., Weinberg, I. N.,
and Brody, W. R., Segmentation algorithms for detecting microcalcifi-
cations in mammograms, IEEE Trans. Inform. Technol. Biomed., Vol. 1,
pp. 141–149, 1997.
[8] Small, G. W., Stern, C. E., Mandelkern, M. A., Fairbanks, L. A., Min,
C. A., and Guze, B. H., Reliability of drawing regions of interest for
positron emission tomographic data, Psych. Res., Vol. 45, pp. 177–185,
1992.
[9] White, D. R., Houston, A. S., Sampson, W. F., and Wilkins, G. P., Intra-
and interoperator variations in region-of-interest drawing and their
effect on the measurement of glomerular filtration rates, Clin. Nucl.
Med., Vol. 24, pp. 177–181, 1999.
[10] Hoffman, E. J., Huang, S. C., and Phelps, M. E., Quantitation in positron
emission computed tomography, 1: Effect of object size, J. Comput.
Assist. Tomogr., Vol. 3, pp. 299–308, 1979.
[11] Mazziotta, J. C., Phelps, M. E., Plummer, D., and Kuhl, D. E., Quantitation
in positron emission computed tomography, 5: Physical-anatomical
effects, J. Comput. Assist. Tomogr., Vol. 5, pp. 734–743, 1981.
[12] Hutchins, G. D., Caraher, J. M., and Raylman, R. R., A region of interest
strategy for minimizing resolution distortions in quantitative myocar-
dial PET studies, J. Nucl. Med., Vol. 33, pp. 1243–1250, 1992.
[13] Welch, A., Smith, A. M., and Gullberg, G. T., An investigation of the
effect of finite system resolution and photon noise on the bias and
precision of dynamic cardiac SPECT parameters, Med. Phys., Vol. 22,
pp. 1829–1836, 1995.
[14] Bezdek, J., Hall, L., and Clarke, L., Review of MR image segmentation
techniques using pattern recognition, Med. Phys., Vol. 20, pp. 1033–
1048, 1993.
[16] Mazziotta, J. C., Pelizzari, C. A., Chen, G. T., Bookstein, F. L., and
Valentino, D., Region of interest issues: The relationship between struc-
ture and function in the brain, J. Cereb. Blood Flow Metab., Vol. 11,
pp. A51–A56, 1991.
[21] Castleman, K. R., Digital Image Processing, Prentice Hall, Upper Saddle
River, NJ, 1996.
[22] Kittler, J., Illingworth, J., and Foglein, J., Threshold based on a simple
image statistics, Comp. Vision Graph. Image Proc., Vol. 30, pp. 125–147,
1985.
[23] Chow, C. K. and Kaneko, T., Automatic boundary detection of the left
ventricle from cineangiograms, Comput. Biomed. Res., Vol. 5, pp. 388–
410, 1972.
[24] Marr, D. and Hildreth, E., Theory of edge detection, Proc. Roy. Soc.
London, Vol. 207, pp. 187–217, 1980.
[25] Sun, Y., Lucariello, R. J., and Chiaramida, S. A., Directional low-pass
filtering for improved accuracy and reproducibility of stenosis quan-
tification in coronary arteriograms, IEEE Trans. Med. Imaging, Vol. 14,
pp. 242–248, 1995.
[26] Faber, T. L., Akers, M. S., Peshock, R. M., and Corbett, J. R., Three-
dimensional motion and perfusion quantification in gated single-
photon emission computed tomograms, J. Nucl. Med., Vol. 32,
pp. 2311–2317, 1991.
[27] Hough, P. V. C., A method and means for recognizing complex patterns,
US Patent 3069654, 1962.
[28] Deans, S. R., The Radon Transform and Some of Its Applications, Wiley,
New York, 1983.
[29] Radon, J., Über die Bestimmung von Funktionen durch ihre
Integralwerte längs gewisser Mannigfaltigkeiten, Berichte Sächsische
Akad. Wissenschaften (Leipzig), Math. Phys. Klasse, Vol. 69, pp. 262–
277, 1917.
[30] Kalviainen, H., Hirvonen, P., Xu, L., and Oja, E., Probabilistic and non-
probabilistic Hough transforms: Overview and comparisons, Image
Vision Comput., Vol. 13, pp. 239–252, 1995.
[31] Kassim, A., Tan, T., and Tan, K., A comparative study of efficient gen-
eralized Hough transforms techniques, Image Vision Comput., Vol. 17,
pp. 737–748, 1999.
[32] Martelli, A., Edge detection using heuristic search methods, Comp.
Graph. Image Proc., Vol. 1, pp. 169–182, 1972.
[34] Geiger, D., Gupta, A., Costa, A., and Vlontzos, J., Dynamic program-
ming for detecting, tracking, and matching deformable contours, IEEE
Trans. Patt. Anal. Mach. Intell., Vol. 17, pp. 294–302, 1995.
[36] Zucker, S., Region growing: Childhood and adolescence, Comp. Graph.
Image Proc., Vol. 5, pp. 382–399, 1976.
[37] Hebert, T. J., Moore, W. H., Dhekne, R. D., and Ford, P. V., Design of
an automated algorithm for labeling the cardiac blood pool in gated
SPECT images of radiolabeled red blood cells, IEEE Trans. Nucl. Sci.,
Vol. 43, pp. 2299–2305, 1996.
[38] Kim, J., Feng, D. D., Cai, T. W., and Eberl, S., Automatic 3D temporal
kinetics segmentation of dynamic emission tomography image using
adaptive region growing cluster analysis, In: Proceedings of 2002 IEEE
Medical Imaging Conference, Vol. 3, IEEE, Norfolk, VA, pp. 1580–1583,
2002.
[41] Bezdek, J. C., Ehrlich, R., and Full, W., FCM: The fuzzy c-means clus-
tering algorithm, Comp. Geosci., Vol. 10, pp. 191–203, 1984.
[45] Kass, M., Witkin, A., and Terzopoulos, D., Snakes: Active contour mod-
els, Int. J. Comput. Vis., Vol. 1, pp. 321–331, 1987.
[50] Mykkänen, J. M., Tohka, J., and Ruotsalainen, U., Automated delin-
eation of brain structures with snakes in PET, In: Physiological Imag-
ing of the Brain with PET, Gjedde, A., Hansen, S. B., Knudsen, G., and
Paulson, O. B., eds., Academic Press, San Diego, pp. 39–43, 2001.
[51] Chiao, P. C., Rogers, W. L., Fessler, J. A., Clinthorne, N. H., and Hero,
A. O., Motion-based estimation with boundary side information or
boundary regularization, IEEE Trans. Med. Imaging, Vol. 13, pp. 227–
234, 1994.
[52] Chiao, P. C., Rogers, W. L., Clinthorne, N. H., Fessler, J. A., and Hero,
A. O., Model-based estimation for dynamic cardiac studies using ECT,
IEEE Trans. Med. Imaging, Vol. 13, pp. 217–226, 1994.
[53] Meltzer, C. C., Leal, J. P., Mayberg, H. S., Wagner, H. N., and Frost, J. J.,
Correction of PET data for partial volume effects in human cerebral
cortex by MR imaging, J. Comput. Assist. Tomogr., Vol. 14, pp. 561–570,
1990.
[54] Müller-Gärtner, H. W., Links, J. M., Price, J. L., Bryan, R. N., McVeigh, E.,
Leal, J. P., Davatzikos, C., and Frost, J. J., Measurement of radiotracer
concentration in brain gray matter using positron emission tomogra-
phy: MRI-based correction for partial volume effects, J. Cereb. Blood
Flow Metab., Vol. 12, pp. 571–583, 1992.
[55] Fox, P. T., Perlmutter, J. S., and Raichle, M. E., A stereotactic method of
anatomical localization for positron emission tomography, J. Comput.
Assist. Tomogr., Vol. 9, pp. 141–153, 1985.
[56] Talairach, J., Tournoux, P., and Rayport, M., Co-planar Stereotaxic At-
las of the Human Brain, Thieme, Inc., New York, 1988.
[58] Bremner, J. D., Bronen, R. A., De Erasquin, G., Vermetten, E., Staib,
L. H., Ng, C. K., Soufer, R., Charney, D. S., and Innis, R. B., Development
and reliability of a method for using magnetic resonance imaging for
the definition of regions of interest for positron emission tomography,
Clin. Pos. Imag., Vol. 1, pp. 145–159, 1998.
[61] Woods, R. P., Mazziotta, J. C., and Cherry, S. R., MRI-PET registration
with automated algorithm, J. Comput. Assist. Tomogr., Vol. 17,
pp. 536–546, 1993.
[62] Rogowska, J., Similarity methods for dynamic image analysis, In: Pro-
ceedings of International AMSE Conference on Signals and Systems,
Vol. 2, Warsaw, Poland, 15–17 July 1991, pp. 113–124.
[65] Bandettini, P. A., Jesmanowicz, A., Wong, E. C., and Hyde, J. S.,
Processing strategies for time-course datasets in functional MRI of the
human brain, Magn. Res. Med., Vol. 30, pp. 161–173, 1993.
[66] Rogowska, J., Preston, K., Hunter, G. J., Hamberg, L. M., Kwong, K. K.,
Salonen, O., and Wolf, G. L., Applications of similarity mapping in
dynamic MRI, IEEE Trans. Med. Imaging, Vol. 14, pp. 480–486, 1995.
[67] Jolliffe, I., Principal Component Analysis, Springer, New York, 1986.
[70] Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P.,
Numerical Recipes in C. The Art of Scientific Computing, Cambridge
University Press, New York, 1992.
[71] Golub, G. H. and Van Loan, C. F., Matrix Computations, 3rd edn., Johns
Hopkins University Press, Baltimore, 1996.
[73] Friston, K. J., Frith, C. D., Liddle, P. F., and Frackowiak, R. S., Func-
tional connectivity: The principal component analysis of large (PET)
data sets, J. Cereb. Blood Flow Metab., Vol. 13, pp. 5–14, 1993.
[74] Pedersen, F., Bergström, M., and Långström, B., Principal component
analysis of dynamic positron emission tomography images, Eur. J.
Nucl. Med., Vol. 21, pp. 1285–1292, 1994.
[75] Strother, S. C., Anderson, J. R., Schaper, K. A., Sidtis, J. S., and Rotten-
berg, D. A., Linear models of orthogonal subspaces and networks from
functional activation PET studies of the human brain, In: Information
Processing in Medical Imaging, Bizais, Y., Barillot, C., and Di Paola, R.,
eds., Kluwer, Dordrecht, The Netherlands, pp. 299–310, 1995.
[76] Ardekani, B. A., Strother, S. C., Anderson, J. R., Law, I., Paulson, O. B.,
Kanno, I., and Rottenberg, D. A., On the detection of activation patterns
using principal components analysis, In: Quantitative Functional Brain
Imaging with Positron Emission Tomography, Carson, R. E., Daube-
Witherspoon, M. E., and Herscovitch, P., eds., Academic Press, San
Diego, pp. 253–257, 1998.
[77] Anzai, Y., Minoshima, S., Wolf, G. T., and Wahl, R. L., Head and neck can-
cer: Detection of recurrence with three-dimensional principal compo-
nents analysis at dynamic FDG PET, Radiology, Vol. 212, pp. 285–290,
1999.
[78] Andersen, A. H., Gash, D. M., and Avison, M. J., Principal component
analysis of the dynamic response measured by fMRI: A generalized
linear systems framework, Mag. Res. Imag., Vol. 17, pp. 795–815, 1999.
[79] Baumgartner, R., Ryner, L., Richter, W., Summers, R., Jarmasz, M., and
Somorjai, R., Comparison of two exploratory data analysis methods
for fMRI: Fuzzy clustering vs. principal component analysis, Mag. Res.
Imag., Vol. 18, pp. 89–94, 2000.
[80] Correia, J., A bloody future for clinical PET? [editorial], J. Nucl. Med.,
Vol. 33, pp. 620–622, 1992.
[81] Iida, H., Rhodes, C. G., De Silva, R., Araujo, L. I., Bloomfield,
P. M., Lammertsma, A. A., and Jones, T., Use of the left ventricu-
lar time-activity curve as a non-invasive input function in dynamic
Oxygen-15-Water positron emission tomography, J. Nucl. Med., Vol. 33,
pp. 1669–1677, 1992.
[82] Chen, K., Bandy, D., Reiman, E., Huang, S. C., Lawson, M., Feng,
D., Yun, L. S., and Palant, A., Noninvasive quantification of the
cerebral metabolic rate for glucose using positron emission tomography,
18F-fluoro-2-deoxyglucose, the Patlak method, and an image-derived
input function, J. Cereb. Blood Flow Metab., Vol. 18, pp. 716–723, 1998.
[83] Houston, A. S., The effect of apex-finding errors on factor images
obtained from factor analysis and oblique transformation, Phys. Med.
Biol., Vol. 29, pp. 1109–1116, 1984.
[84] Nijran, K. S. and Barber, D. C., Factor analysis of dynamic function
studies using a priori physiological information, Phys. Med. Biol., Vol. 31,
pp. 1107–1117, 1986.
[85] Šámal, M., Kárný, M., Sůrová, H., and Dienstbier, Z., Rotation to
simple structure in factor analysis of dynamic radionuclide studies, Phys.
Med. Biol., Vol. 32, pp. 371–382, 1987.
[86] Buvat, I., Benali, H., Frouin, F., Bazin, J. P., and Di Paola, R., Target
apex-seeking in factor analysis on medical sequences, Phys. Med. Biol.,
Vol. 38, pp. 123–128, 1993.
[87] Sitek, A., Di Bella, E. V. R., and Gullberg, G. T., Factor analysis with
a priori knowledge—Application in dynamic cardiac SPECT, Phys.
Med. Biol., Vol. 45, pp. 2619–2638, 2000.
[88] Wu, H. M., Hoh, C. K., Buxton, D. B., Schelbert, H. R., Choi, Y., Hawkins,
R. A., Phelps, M. E., and Huang, S. C., Factor analysis for extraction of
blood time-activity curves in dynamic FDG-PET studies, J. Nucl. Med.,
Vol. 36, pp. 1714–1722, 1995.
[89] Wu, H. M., Huang, S. C., Allada, V., Wolfenden, P. J., Schelbert, H. R.,
Phelps, M. E., and Hoh, C. K., Derivation of input function from FDG-
PET studies in small hearts, J. Nucl. Med., Vol. 37, pp. 1717–1722,
1996.
[90] Sitek, A., Di Bella, E. V. R., and Gullberg, G. T., Factor analysis of dy-
namic structures in dynamic SPECT imaging using maximum entropy,
IEEE Trans. Nucl. Sci., Vol. 46, pp. 2227–2232, 1999.
[91] Sitek, A., Gullberg, G. T., and Huesman, R. H., Correction for ambiguous
solutions in factor analysis using a penalized least squares objective,
IEEE Trans. Med. Imaging, Vol. 21, pp. 216–225, 2002.
[92] Ashburner, J., Haslam, J., Taylor, C., Cunningham, V. J., and Jones,
T., A cluster analysis approach for the characterization of dynamic
PET data, In: Quantification of Brain Function using PET, Myers, R.,
Cunningham, V., Bailey, D., and Jones, T., eds., Academic Press, San
Diego, pp. 301–306, 1996.
[93] Acton, P. D., Pilowsky, L. S., Costa, D. C., and Ell, P. J., Multivariate clus-
ter analysis of dynamic iodine-123 iodobenzamide SPET dopamine D2
receptor images in schizophrenia, Eur. J. Nucl. Med., Vol. 24, pp. 111–
118, 1997.
[94] Wong, K. P., Feng, D., Meikle, S. R., and Fulham, M. J., Segmentation
of dynamic PET images using cluster analysis, IEEE Trans. Nucl. Sci.,
Vol. 49, pp. 200–207, 2002.
[95] Wong, K. P., Feng, D., Meikle, S. R., and Fulham, M. J., Simultaneous es-
timation of physiological parameters and the input function—In vivo
PET data, IEEE Trans. Inform. Technol. Biomed., Vol. 5, pp. 67–76,
2001.
[96] Wong, K. P., Meikle, S. R., Feng, D., and Fulham, M. J., Estimation
of input function and kinetic parameters using simulated annealing:
Application in a flow model, IEEE Trans. Nucl. Sci., Vol. 49, pp. 707–
713, 2002.
[98] Zubal, I. G., Harrell, C. R., Smith, E. O., Rattner, Z., Gindi, G., and Hof-
fer, P. B., Computerized three-dimensional segmented human anatomy,
Med. Phys., Vol. 21, pp. 299–302, 1994.
[99] Hoffman, E. J., Cutler, P. D., Digby, W. M., and Mazziotta, J. C., 3-D
phantom to simulate cerebral blood flow and metabolic images for
PET, IEEE Trans. Nucl. Sci., Vol. 37, pp. 616–620, 1990.
[100] Hawkins, R. A., Phelps, M. E., and Huang, S. C., Effects of temporal
sampling, glucose metabolic rates, and disruptions of the blood-brain
barrier on the FDG model with and without a vascular compartment:
Studies in human brain tumors with PET, J. Cereb. Blood Flow Metab.,
Vol. 6, pp. 170–183, 1986.
[101] Akaike, H., A new look at the statistical model identification, IEEE
Trans. Automatic Control, Vol. AC-19, pp. 716–723, 1974.
[102] Schwarz, G., Estimating the dimension of a model, Ann. Stat., Vol. 6,
pp. 461–464, 1978.
[103] Hooper, P. K., Meikle, S. R., Eberl, S., and Fulham, M. J., Validation of
post injection transmission measurements for attenuation correction
in neurologic FDG PET studies, J. Nucl. Med., Vol. 37, pp. 128–136,
1996.
[104] Huang, S. C., Phelps, M. E., Hoffman, E. J., Sideris, K., Selin, C., and
Kuhl, D. E., Noninvasive determination of local cerebral metabolic rate
of glucose in man, Am. J. Physiol., Vol. 238, pp. E69–E82, 1980.
[105] Patlak, C. S., Blasberg, R. G., and Fenstermacher, J., Graphical evalu-
ation of blood-to-brain transfer constants from multiple-time uptake
data, J. Cereb. Blood Flow Metab., Vol. 3, pp. 1–7, 1983.
[109] Wong, K. P., Feng, D., Meikle, S. R., and Fulham, M. J., Non-invasive
determination of the input function in PET by a Monte Carlo approach
and cluster analysis, J. Nucl. Med., Vol. 42, No. 5(Suppl.), p. 183P, 2001.
[111] Kimura, Y., Hsu, H., Toyama, H., Senda, M., and Alpert, N. M., Im-
proved signal-to-noise ratio in parametric images by cluster analysis,
NeuroImage, Vol. 9, pp. 554–561, 1999.
[113] Kimura, Y., Senda, M., and Alpert, N. M., Fast formation of statistically
reliable FDG parametric images based on clustering and principal com-
ponents, Phys. Med. Biol., Vol. 47, pp. 455–468, 2002.
[114] Zhou, Y., Huang, S. C., Bergsneider, M., and Wong, D. F., Improved para-
metric image generation using spatial-temporal analysis of dynamic
PET studies, NeuroImage, Vol. 15, pp. 697–707, 2002.
[115] Bal, H., DiBella, E. V. R., and Gullberg, G. T., Parametric image forma-
tion using clustering for dynamic cardiac SPECT, IEEE Trans. Nucl.
Sci., Vol. 50, pp. 1584–1589, 2003.
[116] Toyama, H., Takazawa, K., Nariai, T., Uemura, K., and Senda, M., Visu-
alization of correlated hemodynamic and metabolic functions in cere-
brovascular disease by a cluster analysis with PET study, In: Phys-
iological Imaging of the Brain with PET, Gjedde, A., Hansen, S. B.,
Knudsen, G. M., and Paulson, O. B., eds., Academic Press, San Diego,
pp. 301–304, 2001.
[117] Koh, W. J., Rasey, J. S., Evans, M. L., Grierson, J. R., Lewellen, T. K.,
Graham, M. M., Krohn, K. A., and Griffin, T. W., Imaging of hypoxia
in human tumors with [F-18]fluoromisonidazole, Int. J. Radiat. Oncol.
Biol. Phys., Vol. 22, pp. 199–212, 1992.
[119] Huang, S. C., Hoffman, E. J., Phelps, M. E., and Kuhl, D. E., Quantitation
in positron emission computed tomography, 2: Effects of inaccurate
attenuation correction, J. Comput. Assist. Tomogr., Vol. 3, pp. 804–814,
1979.
[121] Huang, S. C., Carson, R. E., Phelps, M. E., Hoffman, E. J., Schelbert,
H. R., and Kuhl, D. E., A boundary method for attenuation correction
in positron computed tomography, J. Nucl. Med., Vol. 22, pp. 627–637,
1981.
[122] Xu, M., Luk, W. K., Cutler, P. D., and Digby, W. M., Local threshold for
segmented attenuation correction of PET imaging of the thorax, IEEE
Trans. Nucl. Sci., Vol. 41, pp. 1532–1537, 1994.
[123] Meikle, S. R., Dahlbom, M., and Cherry, S. R., Attenuation correction
using count-limited transmission data in positron emission tomogra-
phy, J. Nucl. Med., Vol. 34, pp. 143–144, 1993.
[124] Papenfuss, A. T., O’Keefe, G. J., and Scott, A. M., Segmented attenuation
correction in whole body PET using neighbourhood EM clustering,
In: 2000 IEEE Medical Imaging conference, IEEE Publication, Lyon,
France, 2000.
[125] Bettinardi, V., Pagani, E., Gilardi, M. C., Landoni, C., Riddell, C.,
Rizzo, G., Castiglioni, I., Belluzzo, D., Lucignani, G., Schubert, S., and
Fiazio, F., An automatic classification technique for attenuation cor-
rection in positron emission tomography, Eur. J. Nucl. Med., Vol. 26,
pp. 447–458, 1999.
[126] Ogawa, S., Lee, T. M., Kay, A. R., and Tank, D. W., Brain magnetic
resonance imaging with contrast dependent on blood oxygenation,
Proc. Natl. Acad. Sci. USA, Vol. 87, pp. 9868–9872, 1990.
[129] Moser, E., Diemling, M., and Baumgartner, R., Fuzzy clustering of
gradient-echo functional MRI in the human visual cortex. Part II: Quan-
tification, J. Magn. Reson. Imaging, Vol. 7, pp. 1102–1108, 1997.
[130] Goutte, C., Toft, P., Rostrup, E., Nielsen, F. Å., and Hansen, L. K., On
clustering fMRI time series, NeuroImage, Vol. 9, pp. 298–310, 1999.
[131] Fadili, M. J., Ruan, S., Bloyet, D., and Mazoyer, B., A multistep un-
supervised fuzzy clustering analysis of fMRI time series, Hum. Brain
Mapping, Vol. 10, pp. 160–178, 2000.
[132] Schmidt, K., Lucignani, G., Moresco, R. M., Rizzo, G., Gilardi, M. C.,
Messa, C., Colombo, F., Fazio, F., and Sokoloff, L., Errors introduced by
tissue heterogeneity in estimation of local cerebral glucose utilization
with current kinetic models of the [18 F]fluorodeoxyglucose method, J.
Cereb. Blood Flow Metab., Vol. 12, pp. 823–834, 1992.
[133] Popper, K. R., Normal science and its dangers, In: Criticism and the
Growth of Knowledge, Lakatos, I. and Musgrave, A., eds., Cambridge
University Press, Cambridge, pp. 51–58, 1970.
Chapter 4
4.1 Introduction
Pancreatic cancer is the fourth leading cause of cancer deaths in the United
States but only the tenth site for new cancer cases (estimated at 30,300 in 2002)
[1, 2]. The reason for this major difference is that there are no clear early symp-
toms of pancreatic cancer and no screening procedures or screening policy for
this disease. It is usually diagnosed at a late stage and has a poor prognosis
with a 1-year survival rate of 20% and a 5-year survival rate of less than 5% [3].
Complete surgical resection is the only way to significantly improve prognosis
and possibly lead to a cure. Unfortunately, only 15–20% of the patients can un-
dergo resection with median survival rates from 12 to 19 months and a 5-year
survival rate of 15–20%. The large majority of patients with pancreatic cancer receive
palliative care or follow a therapeutic approach whose impact
is currently quite limited and difficult to assess quantitatively [3].
Pancreatic cancer is a disease that is not extensively studied and is poorly
understood. A significant risk factor associated with pancreatic cancers is age
(frequency increases linearly after 50 years of age) with median age of diagno-
sis of 71 years. In addition to aging, other probable risk factors include fam-
ily history, cigarette smoking, long-standing diabetes, and chronic and hered-
itary pancreatitis [2]. Studies have also implicated, without any consistency,
1
Department of Radiology, H. Lee Moffitt Cancer Center & Research Institute, University
of South Florida, Tampa, FL 33612
184 Kallergi, Hersh, and Manohar
Figure 4.1: Human anatomy indicating the location of the pancreas and
major neighboring organs (reprinted from http://pathology2.jhu.edu/pancreas/pancreas1.cfm).
Figure 4.2: Detailed structure and main parts of the pancreas where H = head,
B = body, and T = tail (reprinted from http://pathology2.jhu.edu/pancreas/pancreas1.cfm).
pancreatic cancers occur in the head or neck of the pancreas, about 15% occur
in the body of the pancreas, about 5% in the tail, and about 20% are diffused
through the gland [7]. Pancreatic cancer metastasizes rapidly even when the
primary tumors are less than 2 cm. Metastasis most commonly occurs to re-
gional lymph nodes, then to liver, and less commonly to the lungs. The tumors
display a high degree of resistance to conventional chemotherapy and radiation
therapy.
The National Cancer Institute (NCI) created a Review Group in 2001 to de-
fine an agenda for action for pancreatic cancer and the research priorities that
could reduce morbidity and mortality from this difficult disease [2]. The research
priorities set by NCI spanned a wide range of areas including
• tumor biology
• risk, prevention, detection, and diagnosis
• therapy
• health sciences research
In each of the areas above, research priorities were defined that could lead to
the advancement of our knowledge of the disease and improved health care for
the patients. Our interest in this chapter is on the role of imaging and computer
1. Computed tomography (CT); standard [11, 12], helical [7, 13, 14], and mul-
tidetector [8, 15] with the latter becoming the dominant modality of choice.
modality include the desired accuracy of the procedure for providing staging
information, or its ability to perform simultaneous biopsy of the tumor, or its
capacity to facilitate therapeutic procedures. Detection usually starts with trans-
abdominal sonography to identify causes of pain. After sonography, CT is used
as the primary modality for diagnosis and staging. MRI is also used for staging.
MRCP and ERCP imaging provide additional information on the level of ob-
struction of the biliary or pancreatic ductal systems. Fine-needle aspiration of
suspected pancreatic lesions can be done with EUS for increased biopsy speci-
ficity. Specificity is a problem with all imaging modalities as they do not make it
possible to distinguish between pancreatic cancer and other pancreatic pathol-
ogy, e.g., chronic pancreatitis, mucinous cystadenoma, and intraductal papillary
mucinous neoplasms [10].
Today the most common modality for pancreatic imaging is helical CT, which
has significantly improved outcomes relative to the standard CT or the other
imaging modalities. Standard abdominal CT scans can help detect 70–80% of pan-
creatic carcinomas [3, 13]. But 40–50% of tumors smaller than 3 cm are missed,
and these are the tumors most likely to be resectable. Helical CT significantly
improved on the resolution of conventional CT for pancreatic tumor imaging
[25]. Helical CT has also impacted staging and treatment monitoring procedures
and is now probably the most useful imaging technique for such investigations.
This chapter therefore focuses on helical CT.
Priorities set by the NCI Review Group for pancreatic cancer imaging include
[2] (a) increase specificity of current imaging modalities, (b) increase sensitivity
of current imaging modalities for small invasive and preinvasive lesions in both
normal and abnormal pancreas, (c) develop and test molecular imaging tech-
niques, (e) develop and test screening and surveillance imaging protocols for
high-risk patients, and (f) develop and test noninvasive techniques to accurately
define the effect of treatment. Computer aided diagnosis (CAD) schemes are
computer algorithms that could assist the physician (radiologist or oncologist)
in the interpretation of the radiographic images and the evaluation of the disease.
CAD could play several roles in the above imaging priorities and contribute in
several recommendations and research directions. CAD using CT scans seems
to be the logical first step in the development of computer tools for pancreatic
cancer because of the major role of CT in this area, the large amount of infor-
mation available in CT scans, and the considerable potential for improvements
that could have significant clinical impact independent of magnitude.
Figure 4.3: Schematic diagram of the helical CT set-up and operation principle
including cross-section (slice) plane (x,y), and z axis orientation.
Figure 4.4: Contrast enhanced, helical CT scan through the abdomen and the
head of the pancreas obtained with a reconstruction width of 8 mm (equal to
slice thickness). A = liver; B = head of pancreas with tumor; C = bowel; D =
spleen; E = right adrenal; F = aorta.
(a) Slice thickness (mm): This is equal to the collimation of the X-ray beam.
Maximum slice thickness depends on the size of the detectors and is
typically about 8–10 mm. Thick slices are used for general abdominal
scans and they usually have better contrast resolution than do thin slices,
which in turn have better spatial resolution and require higher radiation
dose. Thin slices are used for small organs or to evaluate and review a
region of interest in more detail. Pancreatic scans may be performed in
either thick (8 mm) scans or thin (less than 4 mm) scans or combinations
depending on the case and/or the protocol requirements. Different slice
thicknesses result in different image properties and noise characteristics
so they have to be carefully considered in computer applications including
registration, reconstruction, and segmentation.
(c) Resolution: The field of view is related to the spatial resolution of the CT
slices or the in-plane resolution. Based on the field of view and the size
of the 2-D matrix (X and Y dimensions in pixels (see Fig. 4.3)) the spatial
resolution or pixel size can be determined. The dynamic resolution or
pixel depth is determined by the characteristics of the detector.
(d) Exposure: kVp and mAs are the two parameters that influence exposure
with kVp defining the beam quality or the average intensity and mAs
defining the quantity of the beam. For larger patients, the mAs may be
increased.
(e) Pitch: This is the ratio of the distance traveled by the table during one full
rotation of the detector (gantry) to the beam collimation. For example,
a pitch of 1.0 corresponds to a scan with a beam collimation of 10 mm,
1-sec duration of a 360 degree rotation, and a rate of table movement of
10 mm/sec. By doubling the rate of table translation, the pitch is increased
proportionally. In pancreatic screening scans the table speed is on the
order of 15 mm/rotation and the pitch is 6. In diagnostic high-resolution
scans the table speed is on the order of 6 mm/rotation with a pitch of 6. The higher
the pitch the shorter the scan time. For multislice helical CT, pitch is also
defined as the ratio of the table travel per gantry rotation to the nominal
slice thickness. This is a more ambiguous definition, not applicable to
single-slice helical CT, but often used by the manufacturers of multislice
CT scanners [30].
(f) Contrast type and amount: For pancreatic scans, a contrast material
(e.g., Omnipaque 320 or 350) is administered intravenously prior to imaging
at a volume of 100–120 cc. Water, Gastrografin, or barium is usually
given as oral contrast. There is a 50–60 sec scan delay to allow for optimum
imaging of the pancreas after the administration of the contrast material.
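The resolution and pitch definitions in the list above reduce to two simple ratios. The following is a minimal sketch in Python; the function names and the 360 mm field of view are illustrative assumptions, not values taken from the chapter or from any scanner protocol.

```python
def pixel_size_mm(field_of_view_mm: float, matrix_size: int) -> float:
    """In-plane pixel size: field of view divided by the matrix dimension."""
    return field_of_view_mm / matrix_size


def pitch(table_travel_mm_per_rotation: float, collimation_mm: float) -> float:
    """Single-slice pitch: table travel per gantry rotation over beam collimation."""
    return table_travel_mm_per_rotation / collimation_mm


# Example from the text: 10 mm collimation, 1-sec rotation, 10 mm/sec table speed.
travel_per_rotation = 10.0 * 1.0               # (mm/sec) * (sec/rotation)
print(pitch(travel_per_rotation, 10.0))        # 1.0
print(pitch(2 * travel_per_rotation, 10.0))    # doubling the table speed gives 2.0

# Hypothetical 360 mm field of view on a 512 x 512 matrix (assumed values).
print(pixel_size_mm(360.0, 512))               # 0.703125
```

For the multislice definition, the same ratio is taken against the nominal slice thickness instead of the beam collimation, so only the second argument changes.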
because it is the most common type of pancreatic cancer and, hence, the one bet-
ter understood. Similar imaging and evaluation procedures are initially followed
for all pancreatic tumor types.
All pancreatic tumors are better visualized when intravenous contrast ma-
terial is used. Only necrotic tumors and very large tumors can be identified
without contrast enhancement. Endocrine tumors often have associated calci-
fications and are less likely to have central necrosis than do adenocarcinomas.
Automatic Segmentation of Pancreatic Tumors in Computed Tomography 193
They also enhance more than normal tissue during the initial phases of contrast
administration [36]. Cystic neoplasms have a variety of appearances. They can
appear solid secondary to the multiple tiny nonvisible cysts or they can appear
as multiple small cysts or as “multilocular-appearing mass” with thin septations
[37]. Alterations in the bowel, blood vessels, or ducts within or adjacent to the
pancreas may be caused by all types of pancreatic tumors and are important
features in the identification of pancreatic abnormalities [38].
Once diagnosed, pancreatic tumors are surgically removed or treated. The
resection of pancreatic tumors is based on the identified tumor size and the
presence or absence of additional abnormal signs on the abdominal CT scans.
Resection is determined by three imaging criteria:
• Tumor size (less than 4 cm usually); tumors greater than 5 cm are resectable
in less than 10% of the cases.
Clinical and demographic characteristics play a role in feature selection for clustering
and classification. In the past, few CAD applications incorporated image
and nonimage characteristics in algorithm design. New directions in medical
image analysis and processing clearly demonstrate the need to consider the pa-
tient as a whole and integrate information from a variety of sources to achieve
high performances.
Figure 4.5: General algorithm design for CT image processing. Processing may
include a segmentation, a classification, a registration, a reconstruction step, or
any combinations of these.
the goals of the development. Herein we focus on issues related to 2-D CT pro-
cessing and, hence, registration and reconstruction will not be discussed other
than to mention that significant work exists in the area of CT slice registration
and reconstruction but is not necessarily focused on pancreatic imaging [39,
40]. We should also note that registration is necessary to the evaluation of serial
(temporal) images of the same patient. For example, in the case of segmentation
of the pancreas in multiple, serial scans of a patient that undergoes treatment,
registration of CT images obtained at different times may be necessary prior
to the assessment of changes from one scan to the next. In the following para-
graphs, we will examine each module of the CT image processing algorithm
(shown in Fig. 4.5) in more detail.
Figure 4.6: (a) Original helical, contrast enhanced CT slice with a tumor at the
head of the pancreas indicated by black arrow. (b) Region based segmentation
using ITK software on Fig. 4.6(a).
were some of the techniques tested for the segmentation of the pancreas and
pancreatic tumors. Initial results suggested that region growing was the best
approach because most of the other techniques clustered the majority of the
structures in the image together not allowing separation of the pancreas from
the other organs. But even with region growing, the pancreas and associated
tumor could not be separated from the liver if the pancreatic structures were
to remain in the segmented image; separation occurred at the expense of los-
ing most of the information from the gland and associated tumor. Representa-
tive segmentation outputs from the region growing approach of ITK are shown
in Figs. 4.6(b) and 4.7(b) for two CT slices that contain a mass at the head
(Fig. 4.6(a)) and tail (Fig. 4.7(a)) of the pancreas respectively. It should be noted
that, although not fully optimized for this application, the tools included in ITK
are not likely to yield, by themselves, the desired segmentation outcome be-
cause of the low contrast differences between adjacent organs and the way
region growing operates. The initial problems we identified in the application of
conventional segmentation techniques on CT images of the pancreas include the
following:
Figure 4.7: (a) Original helical, contrast enhanced CT slice with a pancreatic
tumor at the tail of the pancreas indicated by white arrow. (b) Region based
segmentation using ITK software on Fig. 4.7(a).
2. The shape of the various organs in the CT slices is not always well defined or
consistent from slice to slice. So, it is difficult to select generally applicable
characteristics. CAD development is likely to require an adaptive process
to deal with this variability.
organs including the pancreatic areas. This could make the job of subsequent
segmentation steps easier and more successful.
4.3.4 Processing—Classification
Very few methodologies have been developed or reported for the classification of pancreatic
tumors, e.g., the differentiation between benign and malignant disease or even
the differentiation between normal and abnormal pancreas or pancreatic areas.
One application used several classification schemes to differentiate between
pancreatic ductal adenocarcinoma and mass-forming pancreatitis. The
methods included artificial neural network classifiers, Bayesian analysis, and
Hayashi’s quantification method II [45]. The approach used radiologist-extracted
CT features for the classification and no automatic segmentation or feature
identification was performed. Results indicated that all computer techniques
performed similarly to expert radiologists and offered no significant benefit [45].
The classification task adds another level of difficulty to the segmentation. It is
reasonable to hypothesize that classification may be successful if automated feature
extraction is performed or when image and nonimage features are merged
in the feature set.
Figure 4.8: Block diagram of CAD algorithm developed for the clustering and
classification of pancreatic tumors on helical CT scans.
the design and implementation of the algorithm. The designed CAD scheme
follows the general principles presented in Fig. 4.5 but includes additional
steps for postprocessing and validation that will be discussed in more detail
below.
1. Collect both image and nonimage data and generate complete cases.
possible, prioritize and focus on selected groups that define the most im-
portant clinical problems.
9. Obtain all available reports, e.g., radiology, pathology, clinical reports, that
can assist the researcher in case documentation and evaluation of the
database contents.
For our development and preliminary study, data were collected retrospec-
tively from the patient files of the H. Lee Moffitt Cancer Center & Research
Institute. Approximately 100 patients undergo a pancreatic CT exam annually
at the center. About 2/3 of these patients are diagnosed with pancreatic cancer
and about 1/3 with a benign pancreatic mass or cyst. Abdominal scans are also
performed for staging patients diagnosed with other cancer types, e.g., breast
cancer, that may turn out to be negative for metastatic disease or any disease.
Figure 4.9 shows a database design for pancreatic cancer imaging applications.
Figure 4.9: Image database design for pancreatic cancer research and CAD
development. Diagram labels: CT Images (X+Y+Z); Mass; Cysts; Tumor size < 4 cm (50%), > 4 cm (50%).
The contents of the database, e.g., numbers X, Y, and Z, are determined based
on (a) the aims of the project, (b) the clinical characteristics of the pancreatic
cancer and benign pancreatic masses, (c) the disease statistics, (d) the demo-
graphic characteristics both nationally and locally, (e) the imaging protocols
implemented at the Institution, and (f) the requirements of the algorithm design
as discussed earlier. Imaging protocols and surveillance procedures may differ
among institutions and, hence, CAD goals may differ to accommodate specific
clinical practices and requirements. HLMCC’s imaging protocol for abdominal
helical CT scans of patients diagnosed with or suspected of pancreatic cancer
includes three imaging series:
• Series #1: An initial abdominal scan is done with a relatively thick slice
(8–10 mm) prior to the administration of contrast material; approximately
5 slices from this series contain information on the pancreas.
• Series #2: An enhanced abdominal scan follows with the same slice thickness
as in Series #1 shortly after the intravenous administration of contrast
material (a second enhanced scan after a short period of time may also be
acquired if requested by the physician). Similar to the first series, approximately
5 slices in this series contain information on the pancreas.
• Series #4: A renal delay scan that acquires images through the kidneys only.
This series includes partial information on the pancreas.
Series #1 and #4 are not likely to be of value at least in the initial algo-
rithm development because pancreatic tumors are clinically evaluated in con-
trast enhanced scans, i.e., Series #2, and insufficient information is present in
Series #4.
In addition to the CT images and imaging parameters, the following infor-
mation was also collected or generated: (a) radiology reports, (b) pathology
reports, (c) demographic information, (d) other nonimage information includ-
ing lab tests, and (e) electronic ground truth files. All data were entered in
a relational database that links image and nonimage information. All patient
identifiers were removed prior to any research and processing to meet confi-
dentiality requirements.
Figure 4.10: (a) Two manual outlines of the pancreatic tumor shown in
Fig. 4.6(a). (b) Two manual outlines of the pancreatic tumor shown in Fig. 4.7(a).
head and tail of the pancreas respectively (original images shown in Figs. 4.6(a)
and 4.7(a)). These outlines are part of the ground truth files generated for the
CT slices and used for segmentation validation. Variations in the outlines such as the
ones seen in Fig. 4.10 are expected and inevitable between experts and could
make segmentation validation a strenuous task. Often there is no right or wrong
answer and it is our recommendation that both outlines be considered in an evaluation
process.
Measures can and should be taken to increase the accuracy of this informa-
tion and at a minimum remove external sources of variability or error. These
measures include the following:
(i) Establish optimum and standard viewing and outline conditions in terms
of monitor display and calibration, ambient light, manual segmentation
tool(s), and image manipulation options.
(ii) Use all individual manual outlines for evaluation as a possible way to
account for expert variability. For example, use both outlines shown in
Fig. 4.10 for segmentation validation. Alternative options are to deter-
mine the union or overlap of outlines or use a panel of experts to obtain
a consensus on one outline per image.
(iii) Provide all available information to the expert before he/she generates
the truth file.
(iv) Have experts perform initial outlines independently to avoid bias (any
joint outlines are done in addition to the originals).
(v) Review the expertise of the “experts” and their physical condition prior
to the initiation of the process (number of cases read within a certain
time frame, familiarity with computer tools, training, fatigue).
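Measure (ii) above mentions the union or overlap of expert outlines. A minimal sketch of how two outlines could be combined is given below, assuming each outline has already been filled into a boolean mask of the same size as the slice. The Dice coefficient is added as one common overlap measure; it is an illustrative choice here, not a measure prescribed by the chapter, and the function name is hypothetical.

```python
import numpy as np


def combine_outlines(mask_a: np.ndarray, mask_b: np.ndarray):
    """Return the union, the overlap, and the Dice coefficient of two masks."""
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    union = a | b                     # pixels marked by at least one expert
    overlap = a & b                   # pixels marked by both experts
    total = a.sum() + b.sum()
    dice = 2.0 * overlap.sum() / total if total > 0 else 1.0
    return union, overlap, dice


# Two toy 6 x 6 "expert" masks that disagree by a one-pixel shift.
expert_1 = np.zeros((6, 6), dtype=bool)
expert_2 = np.zeros((6, 6), dtype=bool)
expert_1[1:4, 1:4] = True             # 9 pixels
expert_2[2:5, 2:5] = True             # 9 pixels, shifted diagonally

union, overlap, dice = combine_outlines(expert_1, expert_2)
print(union.sum(), overlap.sum(), round(dice, 3))   # 14 4 0.444
```

Either the union or the overlap mask could then serve as a single reference, or both original masks could be kept and the algorithm scored against each.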
Ground truth files are generated for all cases in the designed database but
for a selected number of image series and slices to reduce physician effort.
Specifically, in the cases where there is no high-resolution series (#3), ground
truth files are generated for the 5 enhanced CT slices of Series #2 that contain
the pancreas. In the cases where the high-resolution series (#3) is available,
ground truth files are generated for both the 8 mm slices of the pancreas in
Series #2 and every other slice in Series #3 (10 slices total); “ground truth”
for the intermediate slices of Series #3 is obtained by interpolation. (Ground
truth could be extrapolated from the 4 mm slices to the 8 mm slices but slice
registration would be required prior to this process.)
Truth files are images of the same size as the original slice and include (a)
the location of the pancreas and outline of its shape; (b) the location of the
pancreatic tumor(s), masses, or cysts, and their shape outline(s) (Fig. 4.10);
(c) the location and identification of neighboring organs and their shape out-
lines; (d) the identification of any vascular invasion and metastases sites and
outlines. Truth files are generated using a computer mouse to outline the areas
of interest on CT slices that are displayed on a high resolution (2048 × 2560
pixels) and high luminance computer monitor. Pixels in the ground truth files
are assigned a specific value that corresponds to an outlined organ or structure,
e.g., a gray value of 255 is assigned to the pixels that correspond to the outline
of the pancreatic tumor(s), a gray value of 200 to the pixels that correspond to
the outline of the normal pancreas, etc.
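The truth-file encoding described above can be sketched as a label image. In this minimal example only the gray values 255 and 200 come from the text; the toy 8 × 8 size, the constant names, and the helper function are assumptions for illustration.

```python
import numpy as np

# Gray values for outlined structures. 255 and 200 are taken from the text;
# codes for other organs would be project-specific.
TUMOR_OUTLINE = 255
PANCREAS_OUTLINE = 200


def structure_pixels(truth: np.ndarray, gray_value: int) -> np.ndarray:
    """Boolean mask of the truth-file pixels that carry one structure's gray value."""
    return truth == gray_value


# Toy 8 x 8 truth file, same size as its (toy) slice; background is 0.
truth = np.zeros((8, 8), dtype=np.uint8)
truth[1, 1:7] = truth[6, 1:7] = PANCREAS_OUTLINE   # top and bottom edges
truth[1:7, 1] = truth[1:7, 6] = PANCREAS_OUTLINE   # left and right edges
truth[3, 3:5] = truth[4, 3:5] = TUMOR_OUTLINE      # small 2 x 2 tumor outline

print(structure_pixels(truth, PANCREAS_OUTLINE).sum())  # 20
print(structure_pixels(truth, TUMOR_OUTLINE).sum())     # 4
```

Because each structure has its own gray value, a single truth image can be split into per-structure masks for validation without storing separate files.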
Figure 4.11: External signal segmentation for the slice of Fig. 4.6(a).
4.4.4 Preprocessing—Enhancement
Our enhancement approach aimed at increasing the image contrast between
the pancreas and organs in close proximity. A histogram equalization approach
was implemented for this purpose and yielded satisfactory results (Gaussian
and Wiener filters seemed to benefit these images as well) [64]. Wavelet-based
enhancement was also considered as an alternative option for removing un-
wanted background information and better isolating the signals of interest [65].
The method was promising but may present an issue when used in combination
with registration or reconstruction processes.
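As an illustration of the histogram equalization step, the sketch below implements the textbook global form with NumPy. It assumes the CT slice has already been windowed to a non-negative integer range (8-bit here); the chapter does not give its exact implementation, so the function name and parameters are hypothetical.

```python
import numpy as np


def equalize_histogram(image: np.ndarray, levels: int = 256) -> np.ndarray:
    """Global histogram equalization of an integer image with values in [0, levels)."""
    hist = np.bincount(image.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]              # CDF at the lowest occupied gray level
    denom = image.size - cdf_min
    if denom == 0:                         # constant image: nothing to stretch
        return image.copy()
    lut = np.round((cdf - cdf_min) / denom * (levels - 1))
    lut = np.clip(lut, 0, levels - 1).astype(image.dtype)
    return lut[image]                      # map every pixel through the lookup table


# Toy low-contrast "slice": gray levels squeezed into 100..103.
slice_ = np.array([[100, 100, 101, 101],
                   [101, 102, 102, 103]], dtype=np.uint8)
out = equalize_histogram(slice_)
print(out.min(), out.max())                # 0 255
```

If registration follows, the same lookup table would have to be applied to every slice in the series to keep the enhancement uniform, which is the concern raised in the text.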
Enhancement generally benefits CAD algorithms but in 3-D imaging modali-
ties like CT, it may have an adverse effect on the registration of the 2-D data, if
it is not uniformly done across slices. Wavelet-based enhancement may worsen
Figure 4.12: External signal segmentation for the slice of Fig. 4.7(a).
the situation since it operates in the frequency domain and may not necessar-
ily preserve the spatial features of the CT images as needed for registration. A
standardization or normalization method may offer a solution in regaining all
spatial information when transforming from the frequency back to the spatial
domain. However, no such method was established for this application or is
readily available. If registration is not part of the process, the enhancement step
could significantly benefit subsequent clustering and classification on the CT
images [66].
\[
J(U, V) = \sum_{k=1}^{n} \sum_{i=1}^{c} (u_{ik})^{m} \, \lVert x_k - v_i \rVert_A^{2}
\]
Note that $U$ is a $(c \times n)$ matrix of the $u_{ik}$ values and $\lVert x_k - v_i \rVert_A^{2} = (x_k - v_i)^{T} A (x_k - v_i)$. The steps followed in the implementation of the FCM algorithm
are as follows [47]:
3. Initialize $U_0 \in M_{fcn}$ randomly
5. For $t = 1, 2, \ldots, T$ do
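The step list is only partly reproduced here, so the following is a minimal sketch of the textbook FCM alternating updates rather than the authors' implementation. It assumes the Euclidean norm (A equal to the identity matrix); the stopping tolerance `eps`, the random seed, and the function name are illustrative assumptions.

```python
import numpy as np


def fcm(X, c, m=2.0, T=100, eps=1e-5, seed=0):
    """Fuzzy c-means with the Euclidean norm (A = identity).

    X: (n, p) data, c: number of clusters, m > 1: fuzzifier, T: max iterations.
    Returns U with shape (c, n) and cluster centers V with shape (c, p).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)              # each column sums to 1
    for _ in range(T):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)              # center update
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)   # (c, n) squared distances
        d2 = np.fmax(d2, 1e-12)                    # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)              # membership update
        converged = np.abs(U_new - U).max() < eps
        U = U_new
        if converged:                              # stop when memberships settle
            break
    return U, V


# Toy 1-D "gray level" data with two obvious groups.
X = np.array([[10.0], [11.0], [12.0], [50.0], [51.0], [52.0]])
U, V = fcm(X, c=2)
print(np.sort(V.ravel()))     # centers close to 11 and 51
print(U.argmax(axis=0))       # hard labels taken from the fuzzy memberships
```

For a CT slice, `X` would hold one row per pixel (gray level, possibly with extra features), and the hard labels would be reshaped back to the image grid.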
The problematic aspects of a supervised approach are the need for high-quality
labeled data and the variability introduced by human intervention. An alterna-
tive to the supervised and unsupervised versions of FCM is the semisupervised
method where a small set of the data is labeled but the majority of the data is
unlabeled. In ssFCM, we let
\[
X = \{\, \underbrace{x_1^1, x_2^1, \ldots, x_{n_1}^1,\; x_1^2, x_2^2, \ldots, x_{n_2}^2,\; \ldots,\; x_1^c, x_2^c, \ldots, x_{n_c}^c}_{\text{labeled data}},\; \underbrace{x_1^u, x_2^u, \ldots, x_{n_u}^u}_{\text{unlabeled data}} \,\}
\]
denote partially labeled data; superscripts denote the class label and $n_i$ is the
number of samples having the same label vector in the partition matrix $U$. Using
the labeled set of data we find the center of the clusters iteratively until the
terminating condition is satisfied. The unlabeled data are then introduced for
finding the cluster centers. This method is more stable as the centers are well
defined from the labeled data used for training. The clusters are also well defined
with the correct physical data labels.
The cluster centers for the labeled data are calculated as
\[
v_{i,0} = \frac{\sum_{k=1}^{n_d} (u_{ik,0}^{d})^{m} \, x_k^{d}}{\sum_{k=1}^{n_d} (u_{ik,0}^{d})^{m}}, \qquad 1 \le i \le c \tag{4.3}
\]
where the subscript d denotes design or training data. Having the labels for the
where the subscript u is now used to denote the unlabeled data contribution.
In practice, $n_d$ is much smaller than $n_u$. For example, an abdominal CT image
is usually 512 × 512 or 262,144 pixels. A pancreatic mass of about 4 cm in
maximum diameter may cover approximately 1200 pixels in an image with a
resolution of 1 mm/pixel while the pancreas itself may be up to 5000 pixels.
Neighboring major organs and structures may be up to 20,000 pixels depending
on the slice. A very small percentage of these pixels will be labeled in the ssFCM
approach. For example, for $c = 4$, a quarter of the pixels in each class per slice
are labeled, which corresponds to approximately $n_d = 8000$ and $n_u = 254,144$.
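The pixel counts quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes a circular mass, which is an idealization introduced only for this check.

```python
import math

pixels_per_slice = 512 * 512                  # 262,144 pixels
pixel_mm = 1.0                                # resolution stated in the text
mass_diameter_mm = 40.0                       # "about 4 cm in maximum diameter"

# Area of a disk of that diameter, in pixels (circular shape is an assumption).
mass_pixels = math.pi * (mass_diameter_mm / 2) ** 2 / pixel_mm ** 2

n_d = 8000                                    # labeled pixels, from the text
n_u = pixels_per_slice - n_d                  # remaining unlabeled pixels

print(pixels_per_slice, round(mass_pixels), n_u)   # 262144 1257 254144
print(f"labeled fraction: {n_d / pixels_per_slice:.1%}")
```

The disk area of about 1257 pixels agrees with the "approximately 1200 pixels" quoted above, and the labeled set is roughly 3% of the slice.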
To reduce potential bias that may be introduced by large differences between
$n_d$ and $n_u$ as well as between the different tissue classes, one more modification
of c-means is introduced that allows us to weight the fewer labeled samples more
heavily than their unlabeled counterparts. Furthermore, such weighting allows
us to assign much larger weights to small clusters to better separate them from
larger ones. This is done by introducing weights $W = (w_1, w_2, \ldots, w_{n_d})$ in
Eq. (4.5) as
\[
v_{i,t} = \frac{\sum_{k=1}^{n_d} w_k (u_{ik,t}^{d})^{m} \, x_k^{d} + \sum_{k=1}^{n_u} (u_{ik,t}^{u})^{m} \, x_k^{u}}{\sum_{k=1}^{n_d} w_k (u_{ik,t}^{d})^{m} + \sum_{k=1}^{n_u} (u_{ik,t}^{u})^{m}}, \qquad 1 \le i \le c, \quad t = 1, 2, \ldots, T \tag{4.6}
\]
In general, $W$ is a vector of positive real numbers and when $w_{n_d} = 1$ then ssFCM
becomes standard unsupervised FCM [47].
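The weighted center update of Eq. (4.6) can be sketched as follows. This is a minimal NumPy illustration with toy data; the function name, the array layout, and the weight value of 50 are assumptions made for the example, not values from the chapter.

```python
import numpy as np


def weighted_centers(Xd, Ud, w, Xu, Uu, m=2.0):
    """Cluster centers from labeled (d) and unlabeled (u) data, following Eq. (4.6).

    Xd: (n_d, p), Ud: (c, n_d), w: (n_d,) positive weights,
    Xu: (n_u, p), Uu: (c, n_u). Returns V with shape (c, p).
    """
    Wd = w[None, :] * Ud ** m          # w_k (u^d_ik)^m
    Wu = Uu ** m                       # (u^u_ik)^m
    num = Wd @ Xd + Wu @ Xu
    den = Wd.sum(axis=1, keepdims=True) + Wu.sum(axis=1, keepdims=True)
    return num / den


# Toy case: 2 classes, one labeled sample per class, four unlabeled samples.
Xd = np.array([[0.0], [10.0]])
Ud = np.array([[1.0, 0.0],
               [0.0, 1.0]])            # crisp labels for the training data
Xu = np.array([[1.0], [2.0], [8.0], [9.0]])
Uu = np.array([[0.9, 0.8, 0.2, 0.1],
               [0.1, 0.2, 0.8, 0.9]])

V_unit = weighted_centers(Xd, Ud, np.ones(2), Xu, Uu)         # all weights equal to 1
V_heavy = weighted_centers(Xd, Ud, np.full(2, 50.0), Xu, Uu)  # labeled samples dominate
print(V_unit.ravel())    # centers at 1.0 and 9.0
print(V_heavy.ravel())   # centers pulled toward the labeled samples at 0 and 10
```

With unit weights the two labeled pixels count no more than any unlabeled pixel; raising the weights moves each center toward its labeled sample, which is the effect the weighting is meant to achieve when $n_d$ is much smaller than $n_u$.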
The implementation of an ssFCM algorithm involves the following steps:
area for a second time using the same parameters. The output of the second run
is shown in Figs. 4.15 and 4.16. Although not immediately evident in a grayscale
representation, four pixel clusters were identified in these images. One of these
clusters (encompassed by a white outline) corresponds to the pancreatic tumors
indicated by arrows in Figs. 4.6(a) and 4.7(a) and the pancreas.
Several issues remain to be addressed in this FCM application including
(a) what is the appropriate number of clusters to differentiate between tu-
mor and nontumor pancreatic areas, (b) how many labeled data are needed for
Figure 4.15: Segmentation results obtained from applying the FCM algorithm
to the area defined by class 1 pixels in Fig. 4.13. The white outline indicates the
cluster that corresponds to the pancreas and tumor pixels.
Figure 4.16: Segmentation results obtained from applying the FCM algorithm
to the area defined by class 1 pixels in Fig. 4.14. The white outline indicates the
cluster that corresponds to the pancreas and tumor pixels.
4.4.6 Validation
Our pilot study on pancreatic cancer did not include a validation step because
only a small number of images have been tested to date. However, the evaluation
of the clustering and segmentation outputs is expected to be a major part of this
application. Hence, we will close our algorithm description with a few remarks on
segmentation validation issues and a summary of the measures proposed for
this purpose.
Validation requires a gold standard segmentation that represents the “abso-
lute truth” on the size and shape of the object of interest. The lack of a gold
standard or absolute ground truth in most medical imaging applications does
not allow an absolute quantitative evaluation of the segmentation output. The
best, and often the only, option available is to use segmentations generated by
expert observers, which may be biased and may also exhibit significant inter- and
intraobserver variability. In some cases, alternatives to the direct evaluation of
segmentation results are the use of simulation or phantom studies [69], the use
of relative performance measures, or the use of classification outcomes [70].
The goal of validation in our application is to demonstrate that the automatic
methods proposed for the segmentation of pancreatic tumors will lead to stan-
dardized and more reproducible tumor measurements than the manual and vi-
sual estimates performed traditionally by experts. Tumor size, area, and volume
are parameters currently used to determine tumor resectability and response
to treatment. Greater accuracy, less variability, and greater reproducibility in
these measurements are expected to have a significant impact on the diagnosis
and treatment of pancreatic cancer [71].
As indicated in Fig. 4.8, a postprocessing step is usually applied to the clus-
tered data prior to validation in order to generate smooth contours of the or-
gans and tumors that can then be compared to those in the truth files (see, for
example, the truth files in Fig. 4.10 and the FCM segmentations (white outlines)
of Figs. 4.15 and 4.16). From the measures available for segmentation valida-
tion [72], we have selected and implemented those that are recommended for
medical imaging applications and are particularly suited for the comparison of
1. The Hausdorff distance h(A, B) between two contours of the same ob-
ject (tumor), one generated by an expert (A) and one generated by the
computer (B).
Let A = {a1 , a2 , . . . , am} and B = {b1 , b2 , . . . , bm} be the set of points on
the two contours (each point representing a pair of x and y coordinates)
then the distance of a point ai to the closest point on curve B is defined as
$$d(a_i, B) = \min_{j} \lVert b_j - a_i \rVert$$
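The point-to-contour distance defined above, and a Hausdorff distance built on it, can be sketched as follows. The symmetric form used here (the larger of the two directed distances) is the standard textbook definition and is an assumption of this example; the function names are ours, and contours are taken to be lists of (x, y) pairs.

```python
import math


def point_to_contour(a, contour):
    """d(a, B): distance from point a to the closest point on the contour."""
    return min(math.dist(a, b) for b in contour)


def directed_hausdorff(contour_a, contour_b):
    """Largest point-to-contour distance over all points of contour_a."""
    return max(point_to_contour(a, contour_b) for a in contour_a)


def hausdorff(contour_a, contour_b):
    """Symmetric Hausdorff distance (standard definition, assumed here)."""
    return max(directed_hausdorff(contour_a, contour_b),
               directed_hausdorff(contour_b, contour_a))


# Toy contours: an "expert" outline A and a "computer" outline B.
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (4.0, 0.0)]
h_ab = hausdorff(A, B)
```

Because the measure is driven by the single worst-matched point, one outlying contour point is enough to produce a large value.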
The first two measures above are sensitive to the size and shape of the
segmented objects and also depend on the image spatial resolution. The third
measure is independent of object size and image resolution and preferred if
images from different sources are to be compared.
As an alternative to custom-made routines, the publicly available VALMET
segmentation validation software tool could be used to generate these metrics
in 2D and 3D [73]. Tools such as VALMET and ITK may offer the standardization
missing from the validation of segmentation algorithms and reduce variability.
Currently, there is no agreement on the “best method” or “best methods” for
analyzing and validating segmentation results. The need for standardized mea-
sures that are widely acceptable is significant as is the need for establishing
conventions on how to use expert-generated ground truth data in the evaluation
process.
As a final note, the reader is reminded that a statistical analysis that measures
the agreement between the measured parameters from different segmentation
algorithms or the agreement between computer and observer performances
should be part of the validation process. Computer and expert data are compared
with a variety of statistical tools. The most frequently reported ones include
(a) linear regression analysis to study the relationship of the means in the various
segmentation sets [76, 77], (b) paired t test to determine agreement between the
computer method(s) and the experts [76, 77], (c) Williams index to measure
interobserver or interalgorithm variability in the generation of manual outlines
[74], and (d) receiver operating characteristic analysis and related methods to
obtain sensitivity and specificity indices by estimating the true positive and false
positive fractions detected by the algorithm and/or the observer [78].
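As an illustration of item (d), the true positive and false positive fractions can be computed pixel by pixel from a binary truth mask and a binary segmentation. The sketch below assumes flattened masks of equal length with both classes present; the function name is hypothetical.

```python
def tp_fp_fractions(truth, predicted):
    """Return (TPF, FPF) for two equal-length binary masks.

    TPF is the sensitivity; FPF equals 1 - specificity.
    Assumes the truth mask contains both object and background pixels.
    """
    tp = fp = tn = fn = 0
    for t, p in zip(truth, predicted):
        if t and p:
            tp += 1          # object pixel correctly detected
        elif p:
            fp += 1          # background pixel labeled as object
        elif t:
            fn += 1          # object pixel missed
        else:
            tn += 1          # background pixel correctly rejected
    return tp / (tp + fn), fp / (fp + tn)


truth = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
tpf, fpf = tp_fp_fractions(truth, predicted)
sensitivity, specificity = tpf, 1.0 - fpf
```

Repeating the computation while varying a decision threshold of the algorithm yields the operating points from which an ROC curve is drawn.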
4.5 Conclusions
issues for medical imaging and CAD applications. It was also used in an effort to
open the pancreatic cancer imaging area to more research and discussion,
considering that it is relatively under-investigated and unknown despite
its significant toll on health care.
The current state-of-the-art in CAD methodologies for CT and pancreatic
cancer was reviewed and limitations were discussed that led to the development
of a novel, fuzzy logic-based algorithm for the clustering and classification of
pancreatic tumors on helical CT scans. This algorithm was presented here and
its pilot application on selected CT images of patients with pancreatic tumors
was used as the basis to discuss issues associated with tumor segmentation and
validation of the results.
The problems and difficulties encountered today by the radiologists and
oncologists dealing with pancreatic carcinoma are numerous, and they are often
associated with the limitations of the current imaging modalities, observer
biases, and inter- and intraobserver variability. Among the most striking
weaknesses are the inability to detect small tumors, the inability to consistently
differentiate between pancreatic tumors and benign conditions of the pancreas
(which puts the patient through several imaging procedures and medical tests),
and the inability to accurately measure tumor size and treatment effects.
Computer tools could play a diverse role in pancreatic cancer imaging. The
primary goal of the system presented here was the automated segmentation
of the normal and abnormal pancreas and associated pancreatic tumors from
CT images. However, these tools could have a broader and more diverse role
in the detection, diagnosis, and management of this disease that could change
the current standard of care. Among other applications, CAD methodologies
could provide objective measures of pancreatic tumor size and response to ther-
apy that will allow (a) accurate and timely assessment of tumor resectability,
(b) accurate and timely estimates of tumor size as a function of time and treat-
ment, and (c) standardized evaluation and interpretation of tumor size and re-
sponse to treatment. CAD techniques could further lead to 3-D reconstructions
of the pancreas and tumors and impact surgery and radiation treatment.
Validation is and should be a major part of CAD development and implemen-
tation. Medical imaging applications, however, present unique problems to CAD
validation, e.g., lack of an absolute gold standard, lack of standardized statistical
analysis and evaluation criteria, time-consuming and costly database generation
procedures, and others. Yet, CAD researchers are asked to find ways to overcome
limitations and properly validate medical CAD algorithms including those that
involve segmentation or clustering. Several options have been proposed in this
chapter for this purpose. As we learn more about this area, however, we find
that it may be possible to define a new family of validation criteria better suited
for medical imaging applications. These criteria are likely to link algorithm per-
formance to actual clinical outcomes. We could use, for example, classification
results as a measure of segmentation performance.
4.6 Acknowledgments
Questions
1. What are the physical characteristics of helical CT scans that may impact
CAD algorithm design and performance?
2. What are the clinical characteristics of pancreatic cancer that may impact
CAD algorithm design?
5. What is FCM and when is it used for image segmentation? List any ad-
vantages over classical segmentation techniques.
6. What are the differences between FCM and ssFCM? List advantages and
disadvantages of the two techniques.
7. List methods that can be used for the optimization of FCM for image
segmentation.
8. What are the metrics used for the validation of a segmentation output?
9. What are the major limitations and problems associated with the valida-
tion of segmentation algorithms in medical imaging applications?
10. What are the statistical tools used for the analysis of the segmentation
results, including tools to determine the agreement between different algo-
rithms and observers or within groups?
Bibliography
[1] Jemal, A., Thomas, A., and Murray, T., Cancer statistics, 2002, CA Cancer
J. Clin., Vol. 52, pp. 23–47, 2002.
[2] Kern, S., Tempero, M., and Conley, B., (Co-Chairs), Pancreatic cancer:
An agenda for action, Report of the Pancreatic Cancer Progress Group,
National Cancer Institute, February 2001.
[4] Lin, Y., Tamakoshi, A., Kawamura, T., Inaba, Y., Kikuchi, S., Motohashi,
Y., Kurosawa, M., and Ohno, Y., An epidemiological overview of envi-
ronmental and genetic risk factors of pancreatic cancer, Asian Pacific
J. Cancer Prev., Vol. 2, pp. 271–280, 2001.
[5] Li, D. and Jiao, L., Molecular epidemiology of pancreatic cancer, Int.
J. Gastrointest. Cancer, Vol. 33, No. 1, pp. 3–14, 2003.
[6] Ghadirian, P., Lynch, H. T., and Krewski, D., Epidemiology of pancreatic
cancer: an overview, Cancer Detect Prev., Vol. 27, No. 2, pp. 87–93,
2003.
[8] Yeo, T. P., Hruban, R. H., Leach, S. D., Wilentz, R. E., Sohn, T. A., Kern,
D. E., Iacobuzio-Donahue, C. A., Maitra, A., Goggins, M., Canto, M. I.,
Abrams, R. A., Laheru, D., Jaffee, E. M., Hidalgo, M., and Yeo, C. J.,
Pancreatic cancer, Curr. Prob. Cancer, Vol. 26, No. 4, pp. 176–275, 2002.
[9] Tamm, E. P., Silverman, P. M., Charnsangavej, C., and Evans, D. B.,
Diagnosis, staging, and surveillance of pancreatic cancer, AJR, Vol. 180,
pp. 1311–1323, 2003.
[10] Clark, L. R., Jaffe, M. H., Choyke, P. L., Grant, E. G., and Zeman, R. K.,
Pancreatic imaging, Radiol. Clin. North Am., Vol. 23, No. 3, pp. 489–501,
1985.
[11] Haaga, J. R., Alfide, R. J., Zelch, M. G., Meany, T. F., Boller, M., Gonzalez,
L., and Jelden, G. L., Computed tomography of the pancreas, Radiology,
Vol. 120, pp. 589–595, 1976.
[12] Haaga, J. R., Alfide, R. J., Harvilla, T. R., Tubbs, R., Gonzalez, L., Meany,
T. F., and Corsi, M. A., Definitive role of CT scanning of the pancreas:
The second year’s experience, Radiology, Vol. 124, pp. 723–730, 1977.
[13] Sheth, S., Hruban, R. K., and Fishman, E. K., Helical CT of islet cell
tumors of the pancreas: Typical and atypical manifestations, AJR,
Vol. 179, pp. 725–730, 2002.
[16] Winston, C. B., Mitchell, D. G., Outwater, E. K., and Ehrlich, S. M.,
Pancreatic signal intensity on T1-weighted fat saturation MR images:
Clinical correlation, J. Magn. Reson. Imaging, Vol. 5, pp. 267–271,
1995.
[17] Ragozzino, A. and Scaglione, M., Pancreatic head mass: What can be
done? Diagnosis: Magnetic resonance imaging, J. Pancreas, Vol. 1,
pp. 100–107, 2000.
[18] Barish, M. A., Yucel, E. K., and Ferrucci, J. T., Magnetic resonance
cholangiopancreatography, NEJM, Vol. 341, pp. 258–264, 1999.
[20] Adamek, H. E., Albert, J., Breer, H., Weitz, M., Schilling, D., and Riemann,
J. F., Pancreatic cancer detection with magnetic resonance cholan-
giopancreatography and endoscopic retrograde cholangiopancreatog-
raphy: a prospective controlled study, Lancet, Vol. 356, pp. 190–193,
2000.
[21] Mertz, H. R., Sechopoulos, P., Delbeke, D., and Leach, S. D., EUS,
PET, and CT scanning for evaluation of pancreatic adenocarcinoma,
Gastrointest. Endosc., Vol. 52, pp. 367–371, 2000.
[23] Kalra, M. K., Maher, M. M., Boland, G. W., Saini, S., and Fischman,
A. J., Correlation of positron emission tomography and CT in evalu-
ating pancreatic tumors: Technical and clinical implications, AJR, Vol.
181, No. 2, pp. 387–393, 2003.
[24] Koyama, K., Okamura, T., Kawabe, J., Nakata, B., Hirakawa-Chung, K.
Y. S., Ochi, H., and Yamada, R., Diagnostic usefulness of FDG PET for
pancreatic mass lesions, Ann. Nuclear Med., Vol. 15, No. 3, pp. 217–224,
2001.
[25] Dupuy, D. E., Costello, P., and Ecker, C. P., Spiral CT of the pancreas,
Radiology, Vol. 183, pp. 815–818, 1992.
[26] DiChiro, G. and Brooks, R. A., The 1979 Nobel prize in physiology and
medicine, Science, Vol. 206, No. 30, pp. 1060–1062, 1979.
[31] Sheedy, P. F., II., Stephens, D. H., Hattery, R. R., MacCarty, R. L., and
Williamson, B., Jr., Computer tomography of the pancreas, Radiol. Clin.
North Am., Vol. 15, No. 3, pp. 349–366, 1977.
[32] Dendy, P. P. and Heaton, B., Physics for Diagnostic Radiology, 2nd ed.,
Medical Science Series, Institute of Physics Publishing, Bristol, 1999.
[34] Love, L., (guest ed.), Symposium on abdominal imaging, Radiol. Clin.
North Am., Vol. 17, No. 1, 1979.
[35] Frank Miller, H., (guest ed.), Radiology of the pancreas, gallbladder, and
biliary tract, Radiol. Clin. North Am., Vol. 40, No. 6, 2002.
[37] Stanley, R. J. and Semelka, R. C., Pancreas, In: Computed Body Tomog-
raphy with MRI Correlation, Lee, J. K. T., Sagel, S. S., Stanley, R. J., and
Heiken, J. P., eds., Lippincott Raven, pp. 915–936, 1998.
[38] Sheedy, P. F., II, Stephens, D. H., Hattery, R. R., MacCarty, R. L., and
Williamson, B., Jr., Computed Tomography of the Pancreas: Whole Body
Computed Tomography, Radiol. Clin. North Am., Vol. 15, No. 3, pp. 349–
366, 1977.
[39] Masero, V., Leon-Rojas, J. M., and Moreno, J., Volume reconstruction
for health care: A survey of computational methods, Ann. N Y Acad.
Sci., Vol. 980, pp. 198–211, 2000.
[44] Schiemann, T., Michael, B., Tiede, U., and Hohne, K. H., Interactive 3D-
segmentation, SPIE, Vol. 1808, pp. 376–383, 1992.
[45] Ikeda, M., Shigeki, I., Ishigaki, T., and Yamauchi, K., Evaluation of a
neural network classifier for pancreatic masses based on CT findings,
Comput. Med Imaging Graphics, Vol. 21, No. 3, pp. 175–183, 1997.
[46] Clarke, L. P., Velthuizen, R. P., Camacho, M. A., Heine, J. J., Vaidyanathan,
M., Hall, L. O., Thatcher, R. W., and Silbiger, M. L., Review of MRI seg-
mentation: Methods and applications, Magn. Reson. Imaging, Vol. 13,
No. 3, pp. 343–368, 1995.
[47] Bensaid, A. M., Improved Fuzzy Clustering for Pattern Recognition with
Applications to Image Segmentation, Ph.D. Dissertation, Department
of Computer Science, University of South Florida, 1994.
[48] Bezdek, J. C., Pattern Recognition with Fuzzy Objective Function Algo-
rithms, Plenum Press, New York, 1981.
[49] Bensaid, A. M., Bezdek, J. C., Hall, L. O., and Clarke, L. P., A partially
supervised fuzzy c-means algorithm for segmentation of MR images,
SPIE, Vol. 1710, pp. 522–528, 1992.
[50] Bensaid, A. M., Hall, L. O., Bezdek, J. C., Clarke, L. P., Silbiger, M. L., Ar-
rington, J. A., and Murtagh, R. F., Validity-guided (re)clustering with
application to image segmentation, IEEE Trans. Fuzzy Sys., Vol. 4,
No. 2, pp. 112–123, 1996.
[51] Clark, M. C., Hall, L. O., Goldgof, D. B., Clarke, L. P., Velthuizen, R. P.,
and Silbiger, M. S., MRI segmentation using fuzzy clustering techniques,
IEEE Eng. Med. Biol. Magazine, Vol. 13, No. 5, pp. 730–742, 1994.
[54] Velthuizen, R. P., Clarke, L. P., Phuphanich, S., Hall, L. O., Bensaid, A.
M., Arrington, J. A., Greenberg, H. M., and Silbiger, M. L., Unsupervised
measurement of brain tumor volume on MR images, J. Magn. Reson.
Imaging, Vol. 5, No. 5, pp. 594–605, 1995.
[55] Velthuizen, R. P., Hall, L. O., and Clarke, L. P., An initial investigation of
feature extraction with genetic algorithms for fuzzy clustering, Biomed.
Eng., Appl., Basis Commun., Vol. 8, No. 6, pp. 496–517, 1996.
[57] Li, L., Zheng, Y., Kallergi, M., and Clark, R. A., Improved method for
automatic identification of lung regions on chest radiographs, Acad.
Radiol., Vol. 8, pp. 629–638, 2001.
[58] Kallergi, M., Carney, G., and Gaviria, J., Evaluating the performance of
detection algorithms in digital mammography, Med. Phy., Vol. 26, No. 2,
pp. 267–275, 1999.
[59] Kallergi, M., Clark, R. A., and Clarke, L. P., Medical image databases for
CAD applications in digital mammography: Design issues, In: Medical
Informatics Europe ’97, Pappas, C., Maglaveras, N., and Scherrer, J. R.,
eds., IOS Press, Amsterdam, pp. 601–605, 1997.
[60] Harrell, F. E., Lee, K. L., and Mark, D. B., Tutorial in biostatistics. Mul-
tivariate prognostic models: Issues in developing models, evaluating
assumptions and adequacy, and measuring and reducing errors, Stat.
Med., Vol. 15, pp. 361–387, 1996.
[62] Li, L., Zheng, Y., Kallergi, M., and Clark, R. A., Improved method for
automatic identification of lung regions on chest radiographs, Acad.
Radiol., Vol. 8, pp. 629–638, 2001.
[63] Pavlidis, T., Algorithms for Graphics and Image Processing, Computer
Science Press, Rockville, MD, 1982.
[64] Greenberg, S., Aladjem, M., Kogan, D., and Dimitrov, I., Fingerprint im-
age enhancement using filtering techniques, In: International Confer-
ence on Pattern Recognition, Vol. 3, Barcelona, Spain, Sept. 3–8, 2000.
[65] Heine, J. J., Deans, S. R., Cullers, D. K., Stauduhar, R., and Clarke, L. P.,
Multiresolution statistical analysis of high resolution digital mammo-
grams, IEEE Trans. Med. Imaging, Vol. 16, pp. 503–15, 1997.
[66] Weaver, J. B., Xu, Y. S., Healy, D. M., Jr., and Cromwell, L. D., Filtering
noise from images with wavelet transforms, Magn. Reson. Med., Vol.
21, No. 2, pp. 288–295, 1991.
[67] Hall, L. O., Bensaid, A. M., Clarke, L. P., Velthuizen, R. P., Silbiger, M. L.,
and Bezdek, J., A Comparison of neural network and fuzzy clustering
techniques in segmenting magnetic resonance images of the brain, IEEE
Trans. Neural Networks, Vol. 3, No. 5, pp. 672–682, 1992.
[68] Phillips, W. E., Velthuizen, R. P., Phuphanich, S., Hall, L. O., Clarke, L. P.,
and Silbiger, M. L., Application of fuzzy c-means segmentation technique
for tissue differentiation in MR images of a hemorrhagic glioblastoma
multiforme, Magn. Reson. Imaging, Vol. 13, No. 2, pp. 277–290, 1995.
[69] Kallergi, M., Gavrielides, M. A., He, L., Berman, C. G., Kim, J. J., and
Clark, R. A., A simulation model of mammographic calcifications based
on the ACR BIRADS, Acad. Radiol., Vol. 5, pp. 670–679, 1998.
[70] Kallergi, M., He, L., Gavrielides, M., Heine, J. J., and Clarke, L. P., Resolu-
tion effects on the morphology of calcifications in digital mammograms,
In: Proceedings of VIII Mediterranean Conference on Medical and
[71] Zhang, Y. J., A review of recent evaluation methods for image segmenta-
tion, In: Proceedings of International Symposium on Signal Processing
and its Applications, Malaysia, August 13–16, 2001.
[73] Gerig, G., Jomier, M., and Chakos, M., Valmet: A new validation tool for
assessing and improving 3D object segmentation, MICCAI, Vol. 2208,
pp. 516–528, 2001.
[75] Kelemen, A., Székely, G., and Gerig, G., Elastic model-based segmenta-
tion of 3-D neuroradiological data sets, IEEE Trans. Med. Imaging, Vol.
18, No. 10, pp. 828–839, 1999.
[76] Motulsky, H., Intuitive Biostatistics, Oxford University Press, USA, 1995.
5.1 Introduction
1 Computer Vision Lab, Aragon Institute of Engineering Research, University of Zaragoza,
Zaragoza, Spain
2 Department of Technology, Pompeu Fabra University, Barcelona, Spain
3 Lozano Blesa University Clinical Hospital, Aragon Institute of Health Sciences, Zaragoza,
Spain
230 Frangi, Laclaustra, and Yang
All these methods present some common limitations. First, edge detection
techniques are undermined by important error sources like speckle noise or the
varying image quality typical of US sequences. Second, most methods require
expert intervention to manually guide or correct the measurements and are thus
prone to introducing operator-dependent variability. Also, almost no method per-
forms motion compensation to correct for patient and probe position changes.
This could easily lead to measuring arterial dilation using wrong anatomical cor-
respondences. Temporal continuity is another aspect that has not been exploited
enough in previous work. Two consecutive frames have a high correlation, and
only Newey and Nassiri [20] and Fan et al. [15] take advantage of this feature
during edge detection. Finally, there is a general lack of large-scale validation
studies in most of the techniques presented so far.
In this chapter a method is proposed that is based on a global strategy to
quantify flow-mediated vasodilation. We model interframe arterial vasodilation
as a superposition of a rigid motion (translation and rotation) and a scaling factor
normal to the artery. Rigid motion can be interpreted as a global compensation
for patient and probe movements. The scaling factor explains arterial vasodi-
lation. The US sequence is analyzed in two phases using image registration to
recover both rigid motion and vasodilation. Image registration uses normalized
mutual information [21] and a multiresolution framework [22]. Temporal conti-
nuity of registration parameters along the sequence is enforced with a recursive
filter. Application of constraints on the vasodilation dynamics is a natural step,
since the dilation process is known to be a gradual and continuous physiological
phenomenon.
Once a vasodilation curve is obtained, clinical measurements must be ex-
tracted from it. Classically, FMD is quantified by measuring the peak vasodilation
diameter relative to the basal diameter level, both of which are usually identified
manually in the curve. Extracting these two parameters automatically is not
trivial: a mere mean of several basal frames and a simple search for a maximum
in the curve are not sufficient, given that the curve may also include artifacts.
We examined the use of a robust principal component analysis to derive intuitive
morphological parameters from the curves and relate them to classical FMD
indexes and cardiovascular (CVD) risk factors.
The chapter is organized as follows: Section 5.2 describes the system for
image acquisition and the protocol for a typical FMD study. It also describes
the population used to evaluate our technique. Section 5.3 introduces the
proposed method to assess FMD. The validation of the technique is reported
Computerized Analysis and Vasodilation Parameterization of FMD 231
5.2 Materials
5.2.1 Subjects
A total of 195 sequences of varying image quality were studied, corresponding to
195 male volunteers of the Spanish Army (age range, 34–36 years). This sample is
part of the AGEMZA Study, a national cohort study of cardiovascular risk factors
in young adults and includes subjects with a wide range of clinical characteristics
(body weight: 62.3–111.8 kg; body mass index: 20.59–35.36 kg/m²; hypertension:
9%; hypercholesterolemia: 20%; smokers: 24%).
Each sequence has about 1200 frames and a duration of around 20 min; each
second, the most recently held telediastolic frame is acquired. This provides
a fixed sampling rate irrespective of heart rate, which is a substantial benefit
for clinical interpretation, as different stimuli are applied on a time basis during
the clinical test. Because the dynamics of endothelial vasodilation are much
slower than the changes that occur between cardiac cycles, missing information
from one heartbeat or using one heartbeat twice at this sampling rate does not
affect the results in practice.
FMD is the vasodilation response to hyperemia after a transitory distal
ischemia induced in the forearm using a pneumatic cuff distal to the probe
(Fig. 5.1). The dilation mediated by a chemical vasodilator, nitroglycerin
(nitroglycerin-mediated dilation, NMD), is also recorded. Accordingly, five
phases of the medical test can be distinguished in each sequence (see Fig. 5.2).
• Rest baseline (B1). Initial rest state preceding distal ischemia. Presents
the best image quality in the whole sequence and lasts for about 1 min.
• Distal ischemia (DI). The cuff is inflated and, therefore, the image quality
is usually the worst in the sequence. It takes approximately 5 min. This
phase ends when the cuff pressure is released.
Figure 5.2: A whole typical examination can be divided into several segments
along the sequence: rest baseline (B1), distal ischemia (DI), flow-mediated
dilation (FMD), post-FMD baseline (B2), and nitroglycerin-mediated dilation
(NMD).
Figure 5.3: Overview of the proposed two-stage method: after motion com-
pensation is carried out by recovering a rigid motion model, the vasodilation is
measured by computing the scaling factor along the normal to the artery that
best matches the two analyzed frames.
A reference frame is selected from the beginning of the sequence. All the
other frames are registered to this reference frame. Changes in the relative
position between the patient and the transducer are quite common during a
whole examination, which may take up to 20 min. To avoid wrong anatomical
correspondences, motion compensation becomes necessary, and a rigid image
registration technique is used to this end.
Structures surrounding the artery in the image may be important to resolve
potential ambiguities in the longitudinal alignment between two frames, which
occur because of the linear nature of the arterial walls. On the other hand, ex-
traluminal structures introduce artifacts when measuring the vasodilation since
they do not necessarily deform in the same way as the artery does. Therefore,
they should be taken into account when retrieving the global rigid motion infor-
mation of the model, while arterial vasodilation estimation should only consider
the artery deformation.
Our technique proceeds in two phases as summarized in Fig. 5.3: motion
compensation and dilation assessment. The first phase uses the original frames
and rigid image registration to recover a rigid motion model. Translation and
rotation parameters are used to initialize the subsequent phase of vasodilation
estimation. This second stage employs an affine registration model. To avoid
artifacts when measuring arterial vasodilation, it is convenient to remove back-
ground extraluminal structures by padding them out from the reference frame.
Preprocessing of this frame also requires repositioning it so that the artery is
normal to the scaling direction, that is to say, aligned with the horizontal axis,
since our model searches for a vertical scaling factor (see Fig. 5.4). Both op-
erations are performed manually on the reference frame. Manual masking only
requires roughly drawing two lines in the reference frame, and repositioning only
requires aligning a line with the direction of the artery; the whole process is
simple and takes only a few seconds per image sequence.
Figure 5.4: Preprocessing applied to the reference frame before the phase of
vasodilation assessment. (a) Original reference frame and (b) the reference
frame after alignment to the horizontal axis and padding out of background
structures.
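Our motion model between the original (x, y) and transformed (x', y') coordi-
nates is a rigid transformation of the form
$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \cdot \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} t_x \\ t_y \end{pmatrix} \qquad (5.2)$$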
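The rigid motion model of Eq. (5.2) amounts to a rotation by θ followed by a translation (tx, ty). A minimal sketch is given below; the function name is ours, and the angle is assumed to be expressed in radians.

```python
import math


def rigid_transform(x, y, theta, tx, ty):
    """Apply Eq. (5.2): rotate (x, y) by theta (radians), then translate."""
    c, s = math.cos(theta), math.sin(theta)
    x_new = c * x - s * y + tx
    y_new = s * x + c * y + ty
    return x_new, y_new


# A quarter-turn rotation followed by a shift of (2, 3).
p = rigid_transform(1.0, 0.0, 1.5707963267948966, 2.0, 3.0)
q = rigid_transform(0.0, 0.0, 1.5707963267948966, 2.0, 3.0)
# Rigid motion preserves distances between points.
dist_after = math.dist(p, q)
```

Since distances are preserved, this stage can compensate for patient and probe movement but cannot, by itself, account for any change in arterial diameter.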
As we are imaging only a small and roughly straight vessel segment, the
vasodilation can be assumed to be normal to the artery and, therefore, it can be
modeled by only a scaling factor in that direction. Then, the vasodilation model
The entropies are computed from the image histograms where pi is an ap-
proximation of the probability of occurrence of intensity value i. Similarly, the
joint entropy is computed from the joint histogram where pi, j is the approxima-
tion of the probability of the occurrence of corresponding intensity pairs (i, j).
Linear interpolation is used to obtain intensities at noninteger pixel positions to
build the joint histograms.
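The histogram-based entropies described above can be sketched as follows. The example assumes that the two images have already been resampled onto the same grid, so that intensities are available as two equal-length sequences of corresponding values; the interpolation step is not shown. The ratio (H(A) + H(B)) / H(A, B) is one common definition of normalized mutual information and is used here as an assumption; function names are ours.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in nats) of a histogram given as bin counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def normalized_mutual_information(a, b):
    """(H(A) + H(B)) / H(A, B) for corresponding intensity sequences a and b."""
    h_a = entropy(Counter(a).values())            # marginal histogram of a
    h_b = entropy(Counter(b).values())            # marginal histogram of b
    h_ab = entropy(Counter(zip(a, b)).values())   # joint histogram of pairs
    if h_ab == 0.0:
        raise ValueError("joint entropy is zero (constant images)")
    return (h_a + h_b) / h_ab


identical = normalized_mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
unrelated = normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

The value is largest when one image predicts the other perfectly and drops toward 1 as the intensity pairs become statistically independent, which is why it can serve as a similarity measure during registration.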
where x(n) denotes the parameter vector output after registering the nth frame.
Note that there is some implicit delay between x̂(n) and x(n) since they refer
to parameter vectors before and after the registration process. Equation (5.7)
introduces a systematic inertia to changes in the parameter values through the
constant γ . This filtering tries to avoid falling into local minima in the parameter
space during registration, which would not be temporally consistent with pre-
vious history of arterial motion. On the other hand, it might also slow down the
ability to track sudden transitions coming from true motion artifacts. A value
of γ = 0.1 has been empirically shown to be a good compromise between these
two competing goals and it was used throughout our experiments.
In this stage we assume that the translation and rotation parameters were cor-
rectly recovered at the motion compensation stage. Therefore, only the scale fac-
tor, sy, will be tracked over time using a simplified Kalman filtering scheme [23].
Figure 5.5: Kalman gain. (a) Estimated σw (n) used for the computation of K (n)
during vasodilation assessment. It contains the expected vasodilation dynamics
along the sequence. Instants with higher value of σw (n) correspond to higher
uncertainty about the chosen dynamic model and, consequently, where it has to
be relaxed to accommodate possibly sudden transitions. (b) Corresponding
Kalman gain for three different values of the measurement noise, σv(n), correspond-
ing to the minimum, median, and maximum noise levels, respectively, in our
sequence database.
where w(n) is white noise with variance σw , and 0 < α < 1 is the coefficient of
the AR(1) model, which was chosen as α = 0.95. The scaling factor has a non-
stationary behavior and, therefore, σw (n) actually changes widely over time (cf.
Fig. 5.5).
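To illustrate how an AR(1) state model with time-varying noise drives a Kalman gain, a minimal scalar Kalman filter is sketched below. It is a textbook formulation under stated assumptions, not a transcription of our implementation: the measurement is assumed to be the state plus white noise of variance `var_v[n]`, the process noise variance is supplied per frame as `var_w[n]`, and the function name, initial state, and initial variance are hypothetical.

```python
def kalman_ar1(measurements, var_w, var_v, alpha=0.95, s0=0.0, p0=1.0):
    """Scalar Kalman filter for the state model s(n+1) = alpha * s(n) + w(n).

    measurements : observed values, assumed to be state + noise (assumption)
    var_w, var_v : per-frame process and measurement noise variances
    Returns the filtered estimates and the gain K(n) used at each frame.
    """
    s_est, p_est = s0, p0
    estimates, gains = [], []
    for n, y in enumerate(measurements):
        # Prediction from the AR(1) dynamics.
        s_pred = alpha * s_est
        p_pred = alpha * alpha * p_est + var_w[n]
        # The gain balances model uncertainty against measurement noise.
        k = p_pred / (p_pred + var_v[n])
        # Correction with the new measurement.
        s_est = s_pred + k * (y - s_pred)
        p_est = (1.0 - k) * p_pred
        estimates.append(s_est)
        gains.append(k)
    return estimates, gains


est, gains = kalman_ar1([1.0, 1.0, 1.0], [0.01] * 3, [0.04] * 3)
trusting, _ = kalman_ar1([2.0], [0.0], [1e-12])
```

A larger process variance or a smaller measurement variance pushes the gain toward 1, so the filter follows the measurements more closely at those frames.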
Let the measurement model be
where K (n) is the Kalman filter gain. Owing to the standardized acquisition
protocol, the vasodilation time series has a characteristic temporal evolution,
5.4.1 Examples
Image registration between two frames searches for the transformation that puts
them into correspondence. To visually illustrate the algorithm performance, four
examples are shown in Fig. 5.6. In four sequences, the reference frame has been
aligned with the frame showing maximum FMD.
5.4.2 Evaluation
Three properties of the proposed method are analyzed: accuracy (agreement
with the gold standard), reproducibility (repeatability), and robustness (de-
gree of automation of the measurement without apparent failure). To evaluate
$$\mathrm{CV} = \frac{\mathrm{SD}_{\%\mathrm{FMD}}}{m_{\%\mathrm{FMD}}} \qquad (5.12)$$
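The coefficient of variation of Eq. (5.12) is the ratio of the standard deviation to the mean of a set of %FMD values. The sketch below assumes repeated %FMD measurements are given as a list and uses the sample standard deviation (n − 1 in the denominator); both the function name and that choice are assumptions of the example.

```python
import statistics


def coefficient_of_variation(fmd_values):
    """CV = SD / mean for a list of repeated %FMD measurements."""
    mean = statistics.mean(fmd_values)
    sd = statistics.stdev(fmd_values)   # sample SD (assumed choice)
    return sd / mean


cv = coefficient_of_variation([4.0, 6.0])
```

Because the mean appears in the denominator, the CV becomes unstable when the mean %FMD is close to zero.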
(i) Accuracy. Figure 5.7 shows the Bland–Altman plots comparing the inter-
session average measurement for each observer and the gold-standard
measurements. The biases and standard deviation of the differences for
the three observers are given in Table 5.3. Standard deviations are cor-
rected to take into account repeated measurements according to the
method proposed by Bland and Altman [26].
Manual (%) 0.95 ± 0.5 1.20 ± 0.4 0.71 ± 0.6 1.35 ± 0.6 1.04 ± 0.6
Computerized (%) 0.23 ± 0.1 0.26 ± 0.1 0.32 ± 0.3 0.84 ± 0.4 0.40 ± 0.3
Figure 5.8: Bland–Altman plots comparing the two manual sessions of dilation
measurements of each observer. The horizontal and vertical axes indicate the
average %FMD and the difference %FMD of the two sessions, respectively.
using Analyse-it v 1.68 (Analyse-it Software Ltd, Leeds, UK). The two-way
ANOVA was controlled by observer and measurement frame as fixed fac-
tors and by the session number as random factor (Table 5.5). From this
analysis, the inter- and intraobserver within-frame %FMD standard devi-
ations were 1.20% and 1.13%, respectively.
The scaling factor in the direction normal to the vessel axis that relates each
frame to the reference frame constitutes the vasodilation parameter output by
the automatic method. As a consequence, the measurements are normalized to
the arterial diameter of the reference frame. This normalization is different from
that of the gold-standard dilation measurements, which, as described before,
were normalized for each sequence to the grand-average diameter over phase
B1. To make the computerized measurements comparable to the gold standard,
SSq: Sum of squares; DOF: degrees of freedom; MSq: mean squares; F: F of Snedecor;
p: Snedecor test significance.
Bias and difference SD in the comparison between the gold standard measures and the automatic
dilation obtained with different similarity measures. Values reported correspond to %FMD values.
NMI: Normalized mutual information; MI: mutual information; GCC: gradient image cross correlation;
JE: joint entropy; CC: cross correlation; SSD: sum of squared differences.
(ii) Accuracy. Figure 5.9 shows a Bland–Altman plot comparing the auto-
mated versus the gold-standard measurements. The SD of the differences
is 1.05%. The dilation curves obtained by the proposed method are super-
imposed to the gold-standard measurements in Fig. 5.10 where we also
include the 95% confidence interval of the gold-standard measurements
for comparison [26].
(iii) Robustness. The whole set of 195 sequences were processed with the
proposed method (more than 280,000 frames). The overall result was
ranked according to the ability to recover the clinically relevant infor-
mation from the corresponding vasodilation curve. The results were clas-
sified as good, useful, and bad, depending on the amount and severity
of the artifacts present in the curve. When, in the opinion of an expert
Figure 5.10: FMD curves obtained by the proposed automated method (–) and
by the gold-standard measurements (•). Error bars show the 95% confidence
interval of the gold-standard measurements for comparison.
In the last few years there has been a growing interest in understanding the link
between endothelial function and several aspects of cardiovascular diseases
(CVD). It is known that impaired endothelial function is associated with a num-
ber of disease states, including CVD and its major risk factors [28–31]. Also,
endothelial dysfunction seems to precede by many years other more manifest
symptoms and may itself be a potentially modifiable CVD risk factor. Therefore,
it promises to have not only diagnostic value but also use as an instrument for
patient monitoring during treatment.
Once that ongoing research establishes the role and value of FMD in clini-
cal practice, and that computerized tools like the one presented in this chapter
[Figure: two panels, (a) and (b), plotting FMD [%] against time n [s]. Annotations mark ncuff, the segment of analysis, the distance d, and the levels proportional to ∅basal, ∅r, and ∅max.]
A subset of n = 161 subjects from the military male population described in sec-
tion 5.2.1 was analyzed. This population had normal to mildly unfavorable lipid
levels. Also, the subset of subjects used in the following analysis corresponds
with the one used in computing the EigenFMD and EigenD analyses of the
previous section, whose FMD curves were obtained with our computerized
technique.
Fasting serum samples were obtained from the subjects and were analyzed
at the Lozano Blesa University Clinical Hospital (Zaragoza, Spain). Analyses
were performed for total cholesterol, triglycerides, and HDL cholesterol by stan-
dard enzymatic laboratory techniques. LDL cholesterol was calculated using the
Friedewald formula [36,37] in subjects whose triglyceride levels were less than
400 mg/dl.
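As a worked illustration, the Friedewald estimate in its usual mg/dl form is LDL = total cholesterol − HDL − triglycerides/5. The sketch below assumes that form and the 400 mg/dl cutoff quoted above; the function name and the choice to return `None` above the cutoff are illustrative assumptions.

```python
def friedewald_ldl(total_chol, hdl, triglycerides):
    """Estimate LDL cholesterol (mg/dl) with the Friedewald formula.

    All inputs are in mg/dl. Returns None when triglycerides are not
    below 400 mg/dl, where the estimate is not applied.
    """
    if triglycerides >= 400:
        return None
    # Triglycerides / 5 approximates VLDL cholesterol in mg/dl.
    return total_chol - hdl - triglycerides / 5.0
```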
In each absolute diameter curve, a reference baseline diameter, ∅basal, was
established before vasodilation in a region free from motion artifacts selected
just after cuff pressure release (Fig. 5.11(a)). This diameter was similar in most
cases to that measured before cuff inflation, ∅r (Fig. 5.11(a)). Peak vasodilation
was identified in the curve to calculate FMDc (Eq. 5.13).
Statistically significant correlations are indicated with an asterisk. HDL-C: High-density lipoprotein
cholesterol; LDL-C: low-density lipoprotein cholesterol.
5.6 Discussion
noise, poor quality edge definition and acoustic shadows. In our opinion, these
techniques are based on low-level features with a poor reliability.
Our method, on the contrary, deals with the images in a more global
manner. We model vasodilation as a scaling factor between frames that can
be recovered by means of image registration techniques. The effect of low-
level artifacts is therefore minimized as the registration measure is computed
using all the information present in the whole image, and not just at the
edges.
Results obtained with the automated method were better than those mea-
sured manually by medical experts. The proposed method presents a negligi-
ble bias (0.05 %FMD) whereas the bias of the manual measurements depends
on the observer (range −0.16 to +0.34 %FMD). The standard deviation of the
differences between the automated and the gold-standard measurements is
1.05 %FMD, which is slightly lower than the intra- and interobserver variabilities
of manual measurements (1.13 %FMD and 1.20 %FMD, respectively). From the
dilation CV, the proposed method has also been shown to offer better
reproducibility (CV = 0.40%) than the manual procedure (CV = 1.04%).
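The bias and the standard deviation of the differences quoted above are the basic Bland–Altman agreement statistics. The sketch below shows how they could be computed for paired measurements. It is a simplified illustration: it does not apply the repeated-measurements correction used in the evaluation, and the function name and the 1.96 multiplier for the limits of agreement are assumptions of this example.

```python
import numpy as np

def bland_altman(method_a, method_b):
    """Return bias, SD of the differences, and 95% limits of agreement.

    method_a and method_b are paired measurements of the same quantity
    (for example, %FMD from the automated method and the gold standard).
    """
    a = np.asarray(method_a, dtype=float)
    b = np.asarray(method_b, dtype=float)
    diff = a - b
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))  # sample standard deviation
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, sd, limits
```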
The method is reasonably fast. Our experiments were carried out on a PC
(Pentium III @ 600 MHz) under RedHat 7.2 Linux operating system. The reg-
istration algorithm and the Kalman filtering are both coded in C++ without a
thorough code optimization since the implementation of the registration method
is a general-purpose software not specifically devised for this application. Under
these constraints, the mean execution times per frame are 6.4 sec (SD = 0.8 sec)
and 4.0 sec (SD = 1.2 sec) for the first and second phase, respectively. This time
also incorporates outputting of progress information. From our experience with
the software, we think that these figures could still be cut down substantially by
customizing and further optimizing several parts of the code.
The vasodilation model used in this approach also has some potential
limitations. Here, dilation is recovered by means of a reduced similarity transformation
between each frame and the reference one. However, this implicitly assumes
that the wall thickness dilates in the same way that the artery does, while it may
remain constant or even thin during lumen dilation. The unstable presence of the
lumen–intima boundary (LIB) could potentially affect the registration results.
Finally, structures stuck to the outer part of the arterial wall may introduce
errors in the vasodilation measurements since they make it more difficult to
adequately pad the reference frame. The results obtained in this chapter seem
to indicate, however, that the vasodilation model outlined in this work is a rea-
sonable simplification.
Motion compensation is necessary to avoid potential sources of bias in the
subsequent estimation of vasodilation and to ensure that vasodilation is
measured by comparing the same artery segment in two different frames.
Nevertheless, stable motion references are required to succeed in motion recovery and
to avoid indeterminacy of the correct alignment in the longitudinal direction
of the artery. Moreover, only 2-D information is available in the image to correct
a problem that is intrinsically 3-D in nature.
Another advantage of motion compensation is that it makes unnecessary
the manual [12, 13] or automatic [15] tracking of a region of interest (ROI) in
the image sequence. This ROI tracking is required for the edge detection of
some of the methods proposed in the literature. Our technique requires a simple
preprocessing of only the reference frame. The interaction required is minimal
(only rough delineation of two lines) and introduces a small variability (it is
included in the CV of 0.40% obtained in the reproducibility study).
ratio has to be interpreted in consonance with the eigenmode plots of Figs. 5.12
and 5.13. On the contrary, the effect of risk factors on the direction of variation
of the EigenFMD and EigenD curves is not arbitrary and, therefore, they will be
discussed in the sequel.
Among the EigenD modes in Fig. 5.13, the first mode, which broadly
represents baseline diameter, significantly correlates with triglycerides, while EigenD
mode number 2 correlates with LDL cholesterol. This latter mode graphically appears
to represent dilation and diameter decay (the lower the LDL cholesterol level,
the higher the peaks and the quicker the vessel recovery).
EigenFMD modes are also related to cholesterol fractions. Interestingly, this
analysis highlights that each fraction exerts a different influence on the curve
shape. While HDL cholesterol covaries with the first mode, which could be
interpreted as a classical measure of FMD peak (the higher the HDL cholesterol
level, the higher the peaks), LDL cholesterol is significantly associated with
the second mode, which has a form similar to EigenD 2 (the higher the LDL
cholesterol levels, the lower the peaks and the slower the decays), and EigenD
3, interpretable as response velocity or time-to-peak (the higher LDL-cholesterol
values, the later the peaks).
The classical measurement of FMD, FMDc, correlates significantly with almost
all of the absolute and relative eigenmodes shown. The mode with
the highest correlation is EigenFMD 1 (r = −0.786), which can be argued visually
to capture the maximum dilation peak. Other modes with correlation
coefficients over 0.300 are also present, but their visual interpretation is more subtle.
EigenD 1 is almost equivalent to the baseline diameter. This is promising,
as this manual measurement could be potentially estimated on an observer-
independent basis. Baseline diameter also showed significant correlations with
EigenFMD 2 and 4 although the interpretation of this fact is not so evident.
5.7 Conclusions
Each manual measurement requires fitting of a cubic spline to the edge of each
arterial wall. The distance between both splines is the arterial diameter. The line
is placed at the inner edge of the media as shown in Fig. 5.14.
The distance is calculated as follows (see Fig. 5.15):
2. The mean orientation of both splines is calculated using the bisector of the
two calculated regression lines.
3. Perpendicular lines to this bisector are traced, every 10 pixels, finding the
intersection points with the two splines.
4. The average distance between all pairs of points found is the arterial di-
ameter.
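The steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: each wall is assumed to be given as sampled (x, y) points along its spline, a regression line is assumed to be fitted to each wall before the bisector is taken, and intersections are found by linear interpolation between samples. The function name `mean_diameter` and its parameters are hypothetical.

```python
import numpy as np

def mean_diameter(wall_a, wall_b, step=10.0):
    """Average wall-to-wall distance measured perpendicular to the bisector.

    wall_a, wall_b: arrays of shape (N, 2) with (x, y) samples of each wall,
    assumed to run roughly along the x axis (not vertical).
    step: spacing in pixels between perpendicular lines along the bisector.
    """
    a = np.asarray(wall_a, dtype=float)
    b = np.asarray(wall_b, dtype=float)
    # Regression line through each wall; only the slopes are needed.
    slope_a = np.polyfit(a[:, 0], a[:, 1], 1)[0]
    slope_b = np.polyfit(b[:, 0], b[:, 1], 1)[0]
    # Mean orientation: the bisector of the two regression lines.
    theta = 0.5 * (np.arctan(slope_a) + np.arctan(slope_b))
    c, s = np.cos(theta), np.sin(theta)
    # Rotate by -theta so the bisector direction becomes the x' axis;
    # perpendiculars to the bisector are then lines of constant x'.
    rot = np.array([[c, s], [-s, c]])
    a_r = a @ rot.T
    b_r = b @ rot.T
    a_r = a_r[np.argsort(a_r[:, 0])]
    b_r = b_r[np.argsort(b_r[:, 0])]
    # Sample perpendiculars only where both walls are defined.
    lo = max(a_r[0, 0], b_r[0, 0])
    hi = min(a_r[-1, 0], b_r[-1, 0])
    positions = np.arange(lo, hi + 1e-9, step)
    ya = np.interp(positions, a_r[:, 0], a_r[:, 1])
    yb = np.interp(positions, b_r[:, 0], b_r[:, 1])
    # Mean distance between the paired intersection points.
    return float(np.mean(np.abs(ya - yb)))
```

For two parallel walls the result reduces to the perpendicular distance between them, regardless of how the vessel is tilted in the image.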
5.9 Acknowledgments
We express our gratitude to Dr. D. Rueckert for providing us with his implemen-
tation of Studholme’s algorithm. We thank M.L. Gimeno, MD, and A.G. Frangi,
MD, for providing manual measurements for the evaluation study, and S. Olmos
and M. Bossa for their help and discussions. We also thank P. Lamata for his
contribution in the initial phases of this work. AFF is supported by a Ramón y
Cajal Research Fellowship from the Spanish ministry of Science and Technology.
Figure 5.15: Average diameter estimation. The diameter is the mean length of
the segments (thin continuous gray lines) perpendicular to the bisector (con-
tinuous black line) between the regression lines (dashed black lines) of both
splines. Nonparallelism and tortuosity of the splines representing the arterial
walls have been exaggerated to clarify the example.
This research was partially supported by grant TIC2002-04495-C02 from the same
ministry. The AGEMZA Project is supported by a grant (FIS 99/0600) from the
Spanish Ministry of Health and Consumption. The clinical research was also
partially supported by a grant of the Diputación General de Aragón (P58/98).
Questions
2. The motion model of section 5.3.2.1 assumes that there is in-plane motion
only. Can you comment on this?
3. The Kalman filter is causal, which means that its output value is a function
of only the inputs that came earlier in time (could also be only later).
Comment on the use of noncausal tracking strategies such as, for instance,
noncausal Wiener filtering.
Bibliography
[2] Teragawa, H., Kato, M., Kurokawa, J., Yamagata, T., Matsuura, H., and
Chayama, K., Endothelial dysfunction is an independent factor respon-
sible for vasospastic angina, Clin. Sci. (London), Vol. 101, pp. 707–713,
2001.
[3] Cooke, J. P., Rossitch, E., Jr, Andon, N. A., Loscalzo, J., and Dzau, V. J.,
Flow activates an endothelial potassium channel to release an endoge-
nous nitrovasodilator, J. Clin. Invest., Vol. 88, pp. 1663–1671, 1991.
[4] Sinoway, L. I., Hendrickson, C., Davidson, W. R., Jr, Prophet, S., and
Zelis, R., Characteristics of flow-mediated brachial artery vasodilation
in human subjects, Circ. Res., Vol. 64, pp. 32–42, 1989.
[6] Hardie, K. L., Kinlay, S., Hardy, D. B., Wlodarczyk, J., Silberberg, J. S., and
Fletcher, P. J., Reproducibility of brachial ultrasonography and flow-
mediated dilatation (FMD) for assessing endothelial function, Aust. N.Z.
J. Med., Vol. 27, pp. 649–652, 1997.
[7] Touboul, P. J., Prati, P., Scarabin, P. Y., Adrai, V., Thibout, E., and
Ducimetiere, P., Use of monitoring software to improve the measure-
ment of carotid wall thickness by B-mode imaging, J. Hypertens. Suppl.,
Vol. 10, pp. S37–S41, 1992.
[8] Gariepy, J., Massonneau, M., Levenson, J., Heudes, D., Simon, A., and
Groupe de Prevention Cardio-vasculaire en Medecine du Travail, Evi-
dence for in vivo carotid and femoral wall thickening in human hyper-
tension, Hypertension, Vol. 22, pp. 111–118, 1993.
[9] Selzer, R. H., Hodis, H. N., Kwong-Fu, H., Mack, W. J., Lee, P. L., Liu,
C. R., and Liu, C. H., Evaluation of computerized edge tracking for
quantifying intima-media thickness of the common carotid artery from
B-mode ultrasound images, Atherosclerosis, Vol. 111, pp. 1–11, 1994.
[11] Gustavsson, T., Liang, Q., Wendelhag, I., and Wikstrand, J., A dynamic
programming procedure for automated ultrasonic measurement of the
carotid artery, In: IEEE Computers Cardiology, IEEE Computer Society,
pp. 297–300, 1999.
[12] Sonka, M., Liang, W., and Lauer, R. M., Flow-mediated dilatation in
brachial arteries: Computer analysis of ultrasound image sequences,
CVD Preven., Vol. 1, pp. 147–55, 1998.
[13] Liang, Q., Wendelhag, I., Wikstrand, J., and Gustavsson, T., A multiscale
dynamic programming procedure for boundary detection in ultrasonic
artery images, IEEE Trans. Med. Imaging, Vol. 19, pp. 127–142, 2000.
[14] Preik, M., Lauer, T., Heiss, C., Tabery, S., Strauer, B. E., and Kelm, M.,
Automated ultrasonic measurement of human arteries for the determi-
nation of endothelial function, Ultraschall. Med., Vol. 21, pp. 195–198,
2000.
[15] Fan, L., Santago, P., Jiang, H., and Herrington, D. M., Ultrasound mea-
surement of brachial flow-mediated vasodilator response, IEEE Trans.
Med. Imaging, Vol. 19, pp. 621–631, 2000.
[16] Fan, L., Santago, P., Riley, W., and Herrington, D. M., An adaptive
template-matching method and its application to the boundary detec-
tion of brachial artery ultrasound scans, Ultrasound Med. Biol., Vol. 27,
pp. 399–408, 2001.
[18] Woodman, R. J., Playford, D. A., Watts, G. F., Cheetham, C., Reed, C.,
Taylor, R. R., Puddey, I. B., Beilin, L. J., Burke, V., Mori, T. A., and
Green, D., Improved analysis of brachial artery ultrasound using a novel
edge-detection software system, J. Appl. Physiol., Vol. 91, pp. 929–937,
2001.
[19] Sonka, M., Liang, W., and Lauer, R. M., Automated analysis of brachial
ultrasound image sequences: Early detection of cardiovascular dis-
ease via surrogates of endothelial function, IEEE Trans. Med. Imaging,
Vol. 21, pp. 1271–1279, 2002.
[21] Studholme, C., Hill, D. L. G., and Hawkes, D. J., An overlap invariant
entropy measure of 3D medical image alignment, Patt. Recogn., Vol. 32,
No. 1, pp. 71–86, 1999.
[22] Studholme, C., Hill, D., and Hawkes, D., Automated three-dimensional
registration of magnetic resonance and positron emission tomography
brain images by multiresolution optimization of voxel similarity mea-
sures, Med. Phys., Vol. 24, No. 1, pp. 25–35, 1997.
[23] Kalman, R. E., A new approach to linear filtering and prediction
problems, Trans. ASME, J. Basic Eng., Vol. 82 (Series D), pp. 35–45, 1960.
[24] Hayes, M., Statistical Digital Signal Processing and Modelling, Wiley,
New York, 1996.
[25] Altman, D., Practical Statistics for Medical Research, Chapman & Hall,
Boca Raton, FL, 1991.
[27] Bland, J. and Altman, D., Statistical methods for assessing agreement
between two methods of clinical measurement, Lancet, Vol. 1, No. 8476,
pp. 307–310, 1986.
[28] Vita, J. A. and Keaney, J. F., Jr, Endothelial function: A barometer for
cardiovascular risk?, Circulation, Vol. 106, No. 6, pp. 640–642, 2002.
[29] Faulx, M. D., Wright, A. T., and Hoit, B. D., Detection of endothelial
dysfunction with brachial artery ultrasound scanning, Am. Heart J.,
Vol. 145, No. 6, pp. 943–951, 2003.
[30] Widlansky, M. E., Gokce, N., Keaney, J. F., Jr, and Vita, J. A., The clinical
implications of endothelial dysfunction, J. Am. Coll. Cardiol., Vol. 42,
No. 7, pp. 1149–1160, 2003.
[31] Gokce, N., Keaney, J. F., Jr, Hunter, L. M., Watkins, M. T., Nedeljkovic,
Z. S., Menzoian, J. O., and Vita, J. A., Predictive value of noninvasively de-
termined endothelial dysfunction for long-term cardiovascular events
in patients with peripheral vascular disease, J. Am. Coll. Cardiol., Vol. 41,
No. 10, pp. 1769–1775, 2003.
[32] Jolliffe, I., Principal Component Analysis, 2nd edn., Springer Series in
Statistics, Springer-Verlag, New York, 2002.
[33] Hubert, M., Rousseeuw, P., and van den Branden, K., ROBPCA: A
new approach to robust principal component analysis, Technical Re-
port, Department of Mathematics, Katholieke Universiteit Leuven,
2003.
[36] Friedewald, W. T., Levy, R. I., and Fredrickson, D. S., Estimation of the
concentration of low-density lipoprotein cholesterol in plasma, without
use of the preparative ultracentrifuge, Clin. Chem., Vol. 18, pp. 499–502,
1972.
[38] Kuvin, J. T., Patel, A. R., Sidhu, M., Rand, W. M., Sliney, K. A., Pan-
dian, N. G., and Karas, R. H., Relation between high-density lipoprotein
cholesterol and peripheral vasomotor function, Am. J. Cardiol., Vol. 92,
pp. 275–279, 2003.
[39] Aggoun, Y., Bonnet, D., Sidi, D., Girardet, J. P., Brucker, E., Polak, M.,
Safar, M. E., and Levy, B. I., Arterial mechanical changes in children
with familial hypercholesterolemia, Arterioscler Thromb. Vasc. Biol.,
Vol. 20, pp. 2070–2075, 2000.
[40] Toikka, J. O., Ahotupa, M., Viikari, J. S., Niinikoski, H., Taskinen, M.,
Irjala, K., Hartiala, J. J., and Raitakari, O. T., Constantly low HDL-
cholesterol concentration relates to endothelial dysfunction and in-
creased in vivo LDL-oxidation in healthy young men, Atherosclerosis,
Vol. 147, pp. 133–138, 1999.
[41] Holubkov, R., Karas, R. H., Pepine, C. J., Rickens, C. R., Reichek, N.,
Rogers, W. J., Sharaf, B. L., Sopko, G., Merz, C. N., Kelsey, S. F., McGorray,
S. P., and Reis, S. E., Large brachial artery diameter is associated with
angiographic coronary artery disease in women, Am. Heart J., Vol. 143,
pp. 802–807, 2002.
Chapter 6
6.1 Introduction
1 Department of Electrical and Computer Engineering, Texas Tech University, Lubbock,
TX 79409-3102
268 Yang and Mitra
images [1–5], while color image segmentation techniques were developed
much later than their gray-level counterparts because of the computational
complexity involved with color data. However, the availability of fast digital
processors in recent times allows easy implementation of such complex algorithms.
Most of the segmentation techniques applied to gray-level images can also be
extended to color images [6, 7].
Clustering is a pattern recognition technique that has been frequently used in
image segmentation [8, 9]. Similar to the variety of approaches in image segmen-
tation, there are numerous clustering techniques based on statistics, fuzzy logic
[10, 11], neural networks, or an integration of these [12]. This chapter applies
two recently developed advanced clustering algorithms, namely, deterministic
annealing (DA) [13] and adaptive fuzzy leader clustering (AFLC) [14], to the
segmentation of medical images. DA is designed on a statistical framework,
while AFLC has a neural network structure embedded with fuzzy optimization.
The performances of these two algorithms are compared with those of classical
clustering techniques such as k-means [15] and fuzzy C-means (FCM) [16].
These clustering algorithms have been applied to segment a few diverse types of
medical images. All operations are performed on images in the spatial domain,
i.e., pixel intensity is used as the only feature. For gray-scale images, such
as MRI, the feature is 1D, while for color images, such as the retinal image,
the feature is 3D (red, green, and blue components for each pixel).
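To make the feature definition concrete, the following sketch turns an image into the sample matrix a clustering algorithm would consume: one row per pixel, one column for gray-scale intensity or three columns for the red, green, and blue components. The function name `pixel_features` is an illustrative assumption.

```python
import numpy as np

def pixel_features(image):
    """Flatten an image into a (num_pixels, num_features) sample matrix.

    A 2-D gray-scale image gives one feature per pixel; a 3-D color image
    of shape (rows, cols, channels) gives one feature per channel.
    """
    img = np.asarray(image, dtype=float)
    if img.ndim == 2:
        return img.reshape(-1, 1)
    if img.ndim == 3:
        return img.reshape(-1, img.shape[2])
    raise ValueError("expected a 2-D gray-scale or 3-D color image")
```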
The major advantage of using clustering for medical image segmentation is
that these unsupervised data-partitioning techniques do not require a training
set, which is hard to obtain for most clinical datasets. The two clustering
techniques whose effectiveness and accuracy in medical image segmentation we
investigate, AFLC and DA, can be regarded as optimization processes. Neither
AFLC nor DA requires an initial guess of the actual number of clusters present
in a dataset, and thus neither suffers from the instability inherent to
traditional, well-known clustering algorithms such as k-means.
Several types of medical images are selected and used as examples of clus-
tering application. The first modality we used is MRI. We compared the seg-
mentation of anatomical structures such as gray matter, white matter, and cere-
brospinal fluid from simulated MRI. Pathological segmentation of multiple scle-
rosis with both simulated and real MRI is also performed. The second imaging
Statistical and Adaptive Approaches for Optimal Segmentation 269
Image segmentation can be defined as separating the image into similar
constituent parts. Given an image I, segmentation of I is a partition P of I into a
set of N regions R_n, n = 1, . . . , N, such that $\bigcup_{n=1}^{N} R_n = I$. The separated regions
should be homogeneous and meaningful to the application intended. According
to Pham et al. [1] image segmentation techniques can be classified into several
categories, such as thresholding, region growing, classifiers, clustering, Markov
random field, artificial neural network, fuzzy logic, deformable models, and
atlas-guided approaches. The performance efficiency of each approach, how-
ever, varies and is dependent on specific application and image modality. When
a practical application is concerned, sometimes integration of these techniques
is needed to achieve better performance. A number of review papers on image
segmentation in general and specifically on medical image segmentation are al-
ready available [1–7, 9]. In this chapter, we have focused on the impact of recent
advanced clustering algorithms in precise segmentation of medical images.
Most of the common medical images such as MRI, positron emission to-
mography (PET), computed tomography (CT), and ultrasound images are
algorithm determines the classes C and assigns every sample xi into one of
the classes. For hard clustering, a sample belongs to only one class, meaning
$C_k \cap C_j = \emptyset$, $\forall k \neq j$. For fuzzy clustering, a sample can be classified into more
than one class with different membership values (a degree of similarity) [11].
The sum of all membership values of one sample is unity. Categorization and
summary of most clustering techniques can be found in [12, 18, 19].
k-means (or c-means) and its fuzzy version FCM are two well-known classi-
cal clustering algorithms used for image segmentation. A comparative study of
k-means and FCM is presented in [20]. Application of these algorithms and their
variations on image segmentation can be found in [21–25].
The clustering techniques discussed below, namely, AFLC, DA, k-means,
and FCM, can be regarded as optimization processes that seek to reduce mis-
classification by minimizing specific cost functions or system energy functions.
Contrary to the classical Bayesian classifier that needs training, these cluster-
ing techniques are unsupervised. The complexity of these algorithms, however,
varies. k-means and FCM are relatively simple and easier to implement but not
as effective when compared to DA and AFLC, as will be demonstrated in the
next section. The main problem inherent to both k-means and FCM is that the
initial guess of the actual number of clusters present in a dataset is crucial to
the convergence of the algorithms.
Figure 6.1: Adaptive fuzzy leader clustering (AFLC) structure. [Diagram: a comparison layer receives the input X j = {Xj1, Xj2, ..., Xjp} and is connected through the weights b11, ..., bpc to a recognition layer with nodes V1, V2, ..., Vc; a control logic block issues the reset signal.]
$$ R = \frac{\lVert x_j - v_i \rVert}{\dfrac{1}{N_i} \sum_{k=1}^{N_i} \lVert x_k - v_i \rVert} \qquad (6.2) $$
When this ratio is higher than a user-defined threshold, the test fails and a new
cluster is created, taking the sample as the initial centroid and assigning an
initial cluster distance value to this new cluster (which has only one sample
coinciding with the centroid; it is necessary to assign an initial distance value
so that the vigilance test can be performed when the next sample is presented).
Otherwise, the sample is officially classified into this cluster, and then its centroid
and the fuzzy membership values are updated with the following optimization
parameters:
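$$ v_i = \frac{1}{\sum_{j=1}^{p} (u_{ij})^m} \sum_{j=1}^{p} (u_{ij})^m x_j, \qquad i = 1, 2, \ldots, c \qquad (6.3) $$

$$ u_{ij} = \frac{\left( 1 / \lVert x_j - v_i \rVert^2 \right)^{1/(m-1)}}{\sum_{k=1}^{c} \left( 1 / \lVert x_j - v_k \rVert^2 \right)^{1/(m-1)}}, \qquad i = 1, 2, \ldots, c, \quad j = 1, 2, \ldots, p \qquad (6.4) $$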
Figure 6.2: Adaptive fuzzy leader clustering (AFLC) implementation flow chart.
The above nonlinear relationships between the ith centroid and the membership
value of the jth sample to the ith cluster are obtained by minimizing the fuzzy
objective function in Eq. (6.18). Figure 6.2 shows the flow chart for AFLC im-
plementation. AFLC has been successfully applied to image restoration, image
noise removal, image segmentation, and compression [13, 31, 32].
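The vigilance ratio of Eq. (6.2) and the updates of Eqs. (6.3) and (6.4) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the AFLC implementation itself: the function names, the default m = 2, and the small floor on squared distances (to avoid division by zero when a sample coincides with a centroid) are choices of this example.

```python
import numpy as np

def vigilance_ratio(x, centroid, members):
    """Eq. (6.2): distance of x to the centroid over the mean member distance.

    members has shape (N_i, d). A cluster whose only member coincides with
    the centroid needs an initial distance value instead, as noted in the text.
    """
    members = np.asarray(members, dtype=float)
    mean_dist = np.mean(np.linalg.norm(members - centroid, axis=1))
    return float(np.linalg.norm(np.asarray(x, dtype=float) - centroid) / mean_dist)

def update_memberships(samples, centroids, m=2.0):
    """Eq. (6.4): membership u[i, j] of sample j in cluster i."""
    d2 = ((samples[None, :, :] - centroids[:, None, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # assumed floor, avoids division by zero
    inv = (1.0 / d2) ** (1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

def update_centroids(samples, u, m=2.0):
    """Eq. (6.3): centroids as membership-weighted means of the samples."""
    w = u ** m
    return (w @ samples) / w.sum(axis=1, keepdims=True)
```

A sample would be accepted into the winning cluster when `vigilance_ratio` stays at or below the user-defined threshold, after which the two update functions are applied.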
$$ p_i = \frac{e^{-E_i / k_B T}}{Z} \qquad (6.5) $$
F = E − TH (6.6)
is the “free energy” of the system. We can see that as T approaches zero, F ap-
proaches the average energy E. From the principle of minimal free energy, Gibbs
distribution collapses on the global minima of E when this happens. SA [33] is
an optimization algorithm based on Metropolis algorithm [34] that captures this
idea. However, SA moves randomly on the energy surface and converges to a
configuration of minimal energy very slowly, if the control parameter T is low-
ered no faster than logarithmically. DA improves the speed of convergence; the
effective energy is deterministically optimized at successively reduced T while
maintaining the annealing process aiming at global minimum.
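The collapse of the Gibbs distribution onto the minimum-energy state as T decreases can be checked numerically. The sketch below evaluates Eq. (6.5) with the Boltzmann constant absorbed into T (an assumption of this example) and subtracts the minimum energy before exponentiating for numerical stability, which leaves the normalized probabilities unchanged.

```python
import numpy as np

def gibbs_distribution(energies, temperature):
    """Gibbs probabilities p_i proportional to exp(-E_i / T), with k_B = 1."""
    e = np.asarray(energies, dtype=float)
    # Shifting by the minimum energy cancels in the normalization by Z.
    weights = np.exp(-(e - e.min()) / temperature)
    return weights / weights.sum()
```

At high temperature the states are nearly equiprobable; at low temperature almost all probability mass sits on the lowest-energy state, which is the behavior annealing exploits.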
In our clustering problem, we would like to minimize the expected distor-
tion of all the samples x’s given a set of centroids y’s. Let D be the average
distortion,
$$ D = \sum_{x} \sum_{y} p(x, y)\, d(x, y) = \sum_{x} p(x) \sum_{y} p(y \mid x)\, d(x, y) \qquad (6.7) $$
F = D − T H, (6.9)
is equivalent to the free energy of the system. The temperature T here is the
Lagrange multiplier, or simply the pseudotemperature. Rose [13] described a
probabilistic framework for clustering by randomization of the encoding rule,
in which each sample is associated with a particular cluster with a certain prob-
ability. When F is minimized with respect to the association probability p(y | x),
Using the explicit expression for p(y | x) into the Lagrangian F in Eq. (6.9), we
have the new Lagrangian
$$ F^{*} = -T \sum_{x} p(x) \log \sum_{y} \exp\left( -\frac{d(x, y)}{T} \right). \qquad (6.11) $$
With p(x, y) = p(x) p(y | x), where p(x) is given by the source and p(y | x) is
also known (as given above.) The centroid values of y that minimize F* can be
computed by an iteration that starts at a large value of T, tracking the minimum
while decreasing T. The centroid rule is given by
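$$ y_i = \frac{\sum_{x} x\, p(x)\, p(y_i \mid x)}{p(y_i)} \qquad (6.13) $$

where

$$ p(y_i \mid x) = \frac{p(y_i)\, e^{-(x - y_i)^2 / T}}{\sum_{j=1}^{k} p(y_j)\, e^{-(x - y_j)^2 / T}} \qquad (6.14) $$

$$ p(y_i) = \sum_{x} p(x)\, p(y_i \mid x) \qquad (6.15) $$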
It is obvious that the parameter T controls the entire iterative process of deriv-
ing final centroids. As the number of clusters increases, the distortion, or the
covariance between samples x and centroids yi will be reduced. Thus, when T
is lowered, existing clusters split and the number of clusters will increase while
maintaining minimum distortion. When T reaches a value at which the clusters
split, it corresponds to a phase transition in the physical system.
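At a fixed temperature, the centroid rule of Eq. (6.13) and the probabilities of Eqs. (6.14) and (6.15) can be iterated to a fixed point. The sketch below shows that inner loop only, without cooling or cluster splitting. The function name, the fixed iteration count, and the floor applied before taking the logarithm are assumptions of this example.

```python
import numpy as np

def da_fixed_temperature(x, px, y, py, temperature, n_iter=50):
    """Iterate Eqs. (6.13)-(6.15) at one temperature.

    x: samples, shape (n, d); px: sample probabilities p(x), shape (n,)
    y: initial centroids, shape (k, d); py: initial p(y_i), shape (k,)
    """
    x = np.asarray(x, dtype=float)
    px = np.asarray(px, dtype=float)
    y = np.asarray(y, dtype=float).copy()
    py = np.asarray(py, dtype=float).copy()
    for _ in range(n_iter):
        # Squared Euclidean distortion between every sample and centroid.
        d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
        # Eq. (6.14) in log form, stabilized by subtracting the row maximum.
        logits = np.log(np.maximum(py, 1e-300))[None, :] - d2 / temperature
        logits -= logits.max(axis=1, keepdims=True)
        cond = np.exp(logits)
        cond /= cond.sum(axis=1, keepdims=True)  # p(y_i | x)
        # Eq. (6.15): marginal cluster probabilities.
        py = px @ cond
        # Eq. (6.13): probability-weighted centroid update.
        y = (cond * px[:, None]).T @ x / py[:, None]
    return y, py
```

In a full annealing loop this routine would be called once per temperature, with the returned centroids used as the starting point for the next, lower temperature.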
The exact value of T, at which a splitting will occur, is given by
Tc = 2λmax (6.16)
where Tc is known as the critical temperature and λmax is the maximum eigen-
value of the covariance matrix of the posterior distribution p(x | y) of the cluster
corresponding to centroid y:
$$ C_{x \mid y} = \sum_{x} p(x \mid y)\, (x - y)(x - y)^{t} \qquad (6.17) $$
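The critical temperature of Eq. (6.16) follows directly from the covariance of Eq. (6.17). The sketch below assumes the posterior is obtained by Bayes' rule, p(x | y) = p(x) p(y | x) / p(y), and that the association probabilities p(y | x) for the cluster are already available; the function name and argument layout are illustrative.

```python
import numpy as np

def critical_temperature(x, px, cond_y, y):
    """Tc = 2 * lambda_max of the cluster covariance C_{x|y}.

    x: samples, shape (n, d); px: p(x), shape (n,)
    cond_y: p(y | x) for this cluster, shape (n,); y: centroid, shape (d,)
    """
    x = np.asarray(x, dtype=float)
    # Posterior p(x | y) by Bayes' rule, normalized over the samples.
    post = np.asarray(px, dtype=float) * np.asarray(cond_y, dtype=float)
    post = post / post.sum()
    diff = x - np.asarray(y, dtype=float)
    cov = (post[:, None] * diff).T @ diff  # Eq. (6.17)
    # The covariance is symmetric, so eigvalsh applies.
    return 2.0 * float(np.linalg.eigvalsh(cov).max())
```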
In mass-constrained DA, the constraint $\sum_i p_i = 1$ is applied. Here the pi's
are the centroids that coincide in the same cluster i at position yi. We call these
the “repeated” centroids. This is because when a cluster splits, the annealing
might result in multiple centroids in each effective cluster, depending on the
initial perturbation. Below is a simple description of the implementation of DA:
4. If K > K max , stop and output centroids and sample assignments. Other-
wise, for all clusters, check if T > Tc (i), if yes, go to step 3; otherwise, split
the cluster.
Figure 6.3 gives the flow chart for implementing mass-constrained DA. As can
be seen from the flow chart, a couple of parameters govern the annealing
process, each of which exerts its influence on the outcome, particularly the
temperature cooling step parameter α. Theoretically, if the temperature is reduced
sufficiently slowly, local minima of the cost function can be skipped and a global
minimum can be reached. However, annealing can be very time consuming if T is
reduced too slowly.
The centroids yi and the encoding rule p(yi | x) are illustrated in Fig. 6.4.
[Figure 6.3: Flow chart for mass-constrained DA. Initialization sets K max, T = Tc, K = 1, py = 1, and the parameters a, d, R, and r. While K < K max, the temperature is cooled as T = aT and Tc(i) is updated for each cluster i, with Tc = 2λmax. The centroids yi and the probabilities p(yi) and p(yi | x) are updated with Eqs. (6.13)–(6.15) until the relative change ||yi − yi_old|| / ||yi_old|| no longer exceeds R. Each cluster i is tested against its critical temperature Tc(i) and split when required: K_current++, py K_current = py i/2, py i = py i/2, y K_current = yi + d. Repeated centroids, i.e., centroids that coincide at the same location in one cluster, are eliminated. The procedure stops by outputting the centroids yi and the sample assignment.]
3. Keep iterating over the above two steps until the sum of squared error of each
cluster can no longer be reduced.
The initial centroids can be random; however, the choice of initial centroids
is crucial, and a poor choice may result in incorrect partitioning. The iteration
drives the objective function toward a minimum. The resultant grouping of the objects is
geometrically as compact as possible around the centroids in each group.
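The iteration can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the two steps referred to above are the usual k-means steps (assign each sample to its nearest centroid, then move each centroid to the mean of its members); the function names and toy data are hypothetical.

```python
import numpy as np

def assign(x, centroids):
    """Nearest-centroid labels and the resulting sum of squared error."""
    d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    return labels, d2[np.arange(len(x)), labels].sum()

def kmeans(x, init_centroids, max_iter=100):
    centroids = np.array(init_centroids, dtype=float)
    labels, sse = assign(x, centroids)
    for _ in range(max_iter):
        # Move each centroid to the mean of the samples assigned to it.
        for k in range(len(centroids)):
            members = x[labels == k]
            if len(members):              # leave an empty cluster where it is
                centroids[k] = members.mean(axis=0)
        # Reassign the samples and check whether the error still decreases.
        labels, new_sse = assign(x, centroids)
        converged = new_sse >= sse
        sse = new_sse
        if converged:
            break
    return centroids, labels, sse

# Toy data (hypothetical): two well-separated pairs of points.
x = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centroids, labels, sse = kmeans(x, init_centroids=[[0.0, 0.0], [10.0, 10.0]])
```

With a poor initialization (for example, both initial centroids inside the same group) the same loop can stop at a worse partition, which is the sensitivity noted above.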
where vi is the centroid of the ith cluster; uij is the membership value
of the ith class for the jth sample; dij is the Euclidean distance between the
centroid of the ith class and sample xj; c and n denote the number of classes to be
clustered and the total number of samples, respectively; and m is a weighting
exponent on each fuzzy membership with 1 ≤ m < ∞. The FCM algorithm can
be described as follows:
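A minimal sketch of the standard FCM iteration is given here for orientation. It assumes the usual alternating updates (centroids as membership-weighted means, memberships from inverse distance ratios) with m > 1; the function names, the explicit initial centroids, and the stopping tolerance are illustrative assumptions rather than details taken from this chapter.

```python
import numpy as np

def memberships(x, v, m):
    """u[i, j] = 1 / sum_k (d_ij / d_kj)^(2 / (m - 1)), requires m > 1."""
    # d[i, j]: Euclidean distance between centroid i and sample j.
    d = np.linalg.norm(x[None, :, :] - v[:, None, :], axis=2)
    d = np.maximum(d, 1e-12)              # guard against a zero distance
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

def fcm(x, init_centroids, m=2.0, max_iter=100, tol=1e-6):
    v = np.array(init_centroids, dtype=float)
    u = memberships(x, v, m)
    for _ in range(max_iter):
        um = u ** m
        # Centroid update: v_i = sum_j u_ij^m x_j / sum_j u_ij^m.
        v = (um @ x) / um.sum(axis=1, keepdims=True)
        new_u = memberships(x, v, m)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return v, u

# Toy data (hypothetical): two groups on a line, c = 2 classes.
x = np.array([[0.0], [1.0], [10.0], [11.0]])
v, u = fcm(x, init_centroids=[[2.0], [8.0]])
```

Each column of u sums to one, so a hard segmentation can be obtained by taking the class with the largest membership for each sample.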
6.4 Results
We have chosen three different modalities of images, namely, MRI, stereo fun-
dus images, and color cervix images to demonstrate the effectiveness of the
advanced clustering algorithms over the traditional ones in segmenting medical
images of various modalities.
fluid (CSF), as well as MS lesions have been developed [5, 8, 9, 35–38]. A good
survey in applying pattern recognition techniques to MR image segmentation
is available in [8]. Clark et al. [38] give a comparative study of fuzzy clustering
approaches, including FCM and hard c-means versus supervised feedforward
back-propagation computational neural network in MRI segmentation. These
techniques are found to provide broadly similar results, with fuzzy algorithms
showing better segmentation.
MS is a disease that affects the central nervous system. It affects more than
400,000 people in North America. Patients with MS experience a range of symptoms
depending on where the inflammation and demyelination are situated in
the central nervous system. Symptoms range from blurred vision, pain, and an
impaired sense of touch to loss of muscle strength in the arms and legs. About 95%
of MS lesions occur in the white matter of the brain [39]. MR imaging is usually used to
monitor the progression of the disease and the effect of drug therapy. Clinical
analysis or grading of MS lesions is mostly performed visually or qualitatively
by experienced raters. Such manual segmentation suffers
from inconsistency between raters and from inaccuracy. Computer-aided automatic
or semiautomatic segmentation of MS lesions in MR images is important for
enhancing the accuracy of the measurement and facilitating quantitative analysis of
the disease [35, 36, 39–43].
Many regular image segmentation techniques can be employed in MS lesion
segmentation, such as edge detection, thresholding, region growing, and
model-based approaches. However, because of MR field inhomogeneities and
the partial volume effect, most of the methods are integrated in nature: pre-
and postprocessing are involved to correct these effects and remove noise, or
a priori knowledge of the anatomical location of brain tissues is used [36, 39,
41]. Johnston et al. [35] used a stochastic-relaxation-based method, a modified
iterated conditional modes (ICM) algorithm in 3D [6], on PD- and T2-weighted MR
images. Inhomogeneities in multispectral MR images are corrected by applying
homomorphic filtering in the preprocessing step. After the initial segmentation is
obtained, a mask containing only the white matter and the lesions is generated by
applying multiple steps of morphological filtering and thresholding, on which a
second pass of ICM is performed to produce the final segmentation. Zijdenbos et al.
[36] applied a back-propagation neural network for segmentation on T1-,
T2-, and PD-weighted images. Intensity inhomogeneities are corrected by using
a thin-plate spline surface fitted to user-supplied reference points.
The intensity level and contrast can be very different for T1-, T2-, or PD-weighted
MR images. Segmentation of gray matter, white matter, or CSF in the spatial
domain depends highly on the contrast of the image intensity; therefore, T1-
weighted MRI is more suitable than T2- or PD-weighted MRI. In order to validate
the performance of the clustering algorithms, synthetic MRI [46, 47] is used,
because the existence of an objective truth model allows quantitative
analysis of a segmentation technique without introducing human error.
The synthetic images used in this example are obtained from a simulated
brain database [46, 47] provided by McConnell Brain Imaging Center, Montréal
Neurological Institute, McGill University. It includes databases for normal brain
and MS lesion brain. Three modalities are provided, T1-, T2-, and PD-weighted
MRI. Simulations such as noise and intensity nonuniformity are also available.
The image in this example is #90 of 1-mm thick slices with 3% noise and 0%
intensity nonuniformity. Figures 6.6–6.9 compare the segmentation results from
DA, AFLC, FCM, and k-means. Misclassification, using the computer-generated
truth model as the reference, is considered as the performance evaluation crite-
rion following the traditional trend. Misclassification on each segmented cate-
gory is calculated as the percentage of the total number of misclassified pixels in
the segmented image divided by the total number of pixels in the corresponding
truth model. For example, for the CSF, let class csf be the binary segmented
image and csf model be the binary CSF truth model image,
Figure 6.5: Noisy MRI and the corresponding truth model for CSF, gray matter,
and white matter. (a) T11mm30_90.raw; (b) CSF gray-level truth model; (c) gray
matter gray-level truth model; (d) white matter gray-level truth model.
The truth models are originally fuzzy models (Figs. 6.5(b)–6.5(d)). Since all
results produced from the algorithms are hard clustering, the fuzzy truth models
are converted into hard models by classifying a pixel to the category in which it
has the largest pixel value.
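The two evaluation steps described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, and "misclassified pixels" is taken here to mean pixels where the binary segmentation and the binary truth model disagree, which is one plausible reading of the criterion.

```python
import numpy as np

def harden(fuzzy_models):
    """Turn stacked fuzzy truth models (C, H, W) into C binary masks by
    giving each pixel to the category with the largest value."""
    winner = np.argmax(fuzzy_models, axis=0)
    return np.stack([winner == c for c in range(fuzzy_models.shape[0])])

def misclassification_percent(segmented, truth):
    """Disagreeing pixels as a percentage of the pixels in the truth model."""
    mismatched = np.logical_xor(segmented, truth).sum()
    return 100.0 * mismatched / truth.sum()

# Toy 2x2 fuzzy models (hypothetical values) for CSF, gray and white matter.
fuzzy = np.array([
    [[0.7, 0.2], [0.1, 0.0]],   # CSF
    [[0.2, 0.5], [0.3, 0.1]],   # gray matter
    [[0.1, 0.3], [0.6, 0.9]],   # white matter
])
hard = harden(fuzzy)

# A hypothetical white matter segmentation with one extra pixel.
class_wm = np.array([[False, True], [True, True]])
error_wm = misclassification_percent(class_wm, hard[2])
```

Here the white matter truth model has two pixels and the segmentation disagrees on one pixel, giving a misclassification of 50%.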
Misclassification results (in Figs. 6.6–6.9) show that DA and AFLC perform
better than k-means and FCM, demonstrating that the advanced
algorithms are more noise tolerant.
[Figure panels (segmentation results): (a) classified CSF; (b) classified gray matter; (c) classified white matter.]
[Figure 6.10 panels: (d) T1-weighted MRI without extraneous parts; (e) T2-weighted MRI without extraneous parts; (f) PD-weighted MRI without extraneous parts; (g) synthesized image (T1 − (PD − T2)); (h) fuzzy MS lesion truth model; (i) hard MS lesion truth model.]
is applied, the lesions will be classified either with gray matter in Figs. 6.10(a)
and 6.10(c) or CSF in Fig. 6.10(b) since the intensities of the lesions are similar
to those tissues. It is common practice to combine the information embedded in
multichannel MR images to extract MS lesions [35, 36]. In this example,
a synthesized image is created from the three images: synthesized
image = T1 − (PD − T2). Figure 6.10(g) shows the synthesized image built from slice
#90 of the T1-, T2-, and PD-weighted MR images. It can be seen that Fig. 6.10(g)
provides distinct intensity levels among gray matter, white matter, CSF,
and MS lesions. The synthesized image is then fed into the four clustering algorithms.
Segmentation results are shown in Figs. 6.10(j)–6.10(l). Figure 6.10(h) is the
fuzzy MS lesion truth model. The hard model in Fig. 6.10(i) is created by keeping
a pixel as lesion only where the fuzzy lesion model has the largest value among all
tissues at that pixel. It can be observed that DA provides the closest result to the truth model.
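The channel combination T1 − (PD − T2) is simple, but with 8-bit images the intermediate differences can go negative. The sketch below, with hypothetical function names and pixel values, converts to floating point first and then rescales for display; the rescaling step is an assumption, not something stated in the text.

```python
import numpy as np

def synthesize(t1, t2, pd):
    """Synthesized image = T1 - (PD - T2), computed in floating point so
    that unsigned 8-bit inputs cannot wrap around."""
    t1, t2, pd = (np.asarray(a, dtype=np.float64) for a in (t1, t2, pd))
    return t1 - (pd - t2)

def rescale_to_uint8(img):
    """Linearly map the image range onto 0..255 (assumed display step)."""
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros(img.shape, dtype=np.uint8)
    return np.round(255.0 * (img - lo) / (hi - lo)).astype(np.uint8)

# Two hypothetical pixels from co-registered T1, T2 and PD slices.
t1 = np.array([[10, 200]], dtype=np.uint8)
t2 = np.array([[50, 20]], dtype=np.uint8)
pd = np.array([[200, 30]], dtype=np.uint8)

synth = synthesize(t1, t2, pd)
display = rescale_to_uint8(synth)
```

The first pixel evaluates to 10 − (200 − 50) = −140, a value that would be lost if the arithmetic were done directly on unsigned 8-bit arrays.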
Segmentation of MS lesions from clinical MRI: Clinical MRI is much more
complicated than simulated MRI in noise, clarity, and intensity inhomogeneity.
The MR images to be segmented in the following example come from clinical
data. The T2-weighted MR images are obtained from [33]. Four images are
extracted from an MPEG movie showing chronic progression of MS lesions
(Figs. 6.11(a)–6.11(d)). The images have been compressed, so they show poor
resolution and reduced quality (with observable blocking artifacts). The
segmentation results for these images are summarized in Figs. 6.11(e)–6.11(h), with
labeling of the individual lesions shown in Fig. 6.11(e). The segmentation process
is explained below. For images (a) and (b), intermediate results are shown
in Figs. 6.12 and 6.13. For images (c) and (d), intermediate results are skipped
and only the white matter masks and final segmentations are illustrated in
Figs. 6.14 and 6.15.
Segmenting MS lesions from a single modality image such as the one used in
this example is difficult as has been explained in the previous section. However,
when images in other modalities are not available, background knowledge can
be used. As most of the MS lesions occur in the white matter, Johnston et al.
[35] suggested creating a white matter mask to confine the segmentation area so that
segmentation accuracy can be enhanced. In this case, segmentation is a two-
pass process. The image is first roughly segmented into four categories with
DA: the background, the white matter, gray matter, and CSF and other tissues
(Figs. 6.12(b)–6.12(e)). Not surprisingly, the MS lesions cannot form a class of
their own. They are classified either as gray matter or other categories. Then a
white matter mask is generated by morphological filtering of the white matter.
This is shown in Fig. 6.12(f). Using this mask, a tailored image containing only
the masked area is obtained (Fig. 6.12(g)) and used as input image in the second
pass DA clustering. Final segmentation for Fig. 6.12(g) is shown in Fig. 6.12(h).
The result is superposed on the original image in Fig. 6.12(i).
Besides being affected by the image quality, the above segmentation is influ-
enced by the mask. The misclassification of gray matter into MS lesion on the
right-hand side of the image (indicated by the red arrow in Fig. 6.12) in this ex-
ample is caused by misclassification of gray matter into the white matter mask.
One goal of segmenting the lesions is to provide quantitative analysis.
Once the lesions are segmented and labeled, the change in size of each lesion can
be tracked in chronological order. Table 6.1 summarizes the changes in size (number
of pixels) of all lesions that exist through all four MR images.
[Figure 6.12 panels: (c) first pass segmentation: CSF and other; (d) first pass segmentation: gray matter; (e) first pass segmentation: white matter; (f) white matter mask; (g) input image for second pass clustering; (h) MS lesion segmented from the second pass.]

Table 6.1: Size (number of pixels) of each lesion in the four MR images

Image   Lesion 1   Lesion 2   Lesion 3   Lesion 4   Lesion 5   Lesion 6
(a)     4          4          28         2          70         36
(b)     2          15         29         1          401        32
(c)     10         31         63         313a (lesions 4 and 5)   24
(d)     0          11         45         14         41         16
a Lesions 4 and 5 merge in (c).

[Figure panels: (c) first pass segmentation: CSF and other; (d) first pass segmentation: gray matter; (e) white matter mask; (f) input image for second pass segmentation.]
can be expressed as the cup-to-disk ratio in diameter (2D) or volume (3D), for
which segmentation of the optic cup/disk in 2D or 3D is necessary. The cup-to-
disk ratios obtained from 3-D visualization of the optic cup/disk have been found
to match closely with those provided by physicians [48]. Semiautomated methods
for finding the contours of the optic nerve head (ONH) by digital image
analysis attempt to find the disparities of pixels between the fundus stereo pairs
in a region including the ONH. Recent studies [48–50] describe in detail the algo-
rithms developed for feature extraction, registration, correlation, and dynamic
programming leading to computing disparities based on a nonconvergent stereo
entropy in relation to the green channel and therefore are not taken into ac-
count. The registration process removes all vertical displacements leaving only
the horizontal shifts arising from the different positions of the camera while tak-
ing the stereo fundus images. A good registration is crucial to obtain accurate
disparity maps. A power cepstrum-based registration that uses Fourier spec-
trum properties to correct rotational errors that may be present in the stereo
pair is employed. This process begins by extracting the most relevant features
such as the blood vessels in both images. These features are extracted by sub-
tracting a filtered version of the original stereo pair from the original (unsharp
masking). After binarizing this new stereo pair, multiple passes of a median filter
are used to eliminate some of the resulting noise in the images. Compensation
for rotational differences is also performed via zero-mean normalized
cross-correlation (ZNCC) [53] of the Fourier spectra of the images.
ZNCC is expressed as follows:
$$C(i, j) = \frac{\operatorname{cov}_{i,j}(f, g)}{\sigma_{i,j}(f) \times \sigma_{i,j}(g)} \tag{6.19}$$

$$\operatorname{cov}_{i,j}(f, g) = \frac{1}{(2K + 1)(2L + 1) - 1} \sum_{m=i-K}^{i+K} \sum_{n=j-L}^{j+L} \left(f_{m,n} - \bar{f}_{i,j}\right)\left(g_{m,n} - \bar{g}_{i,j}\right) \tag{6.20}$$

where f and g are the windows of pixels to be measured and $\bar{f}$ and $\bar{g}$ are the
corresponding average values. K and L define the size of those windows, (i, j) is
the pixel on which a window is centered, and m and n index the pixels within the
window. σ(f) and σ(g) are the
square roots of the covariances cov(f, f) and cov(g, g), respectively.
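A direct NumPy transcription of Eqs. (6.19) and (6.20) is sketched below. The helper names and the test image are hypothetical, and returning 0 for a flat window (zero standard deviation) is an assumption made to avoid division by zero.

```python
import numpy as np

def window(img, i, j, K, L):
    """(2K+1) x (2L+1) window of img centered on pixel (i, j)."""
    return img[i - K:i + K + 1, j - L:j + L + 1]

def zncc(f, g):
    """Zero-mean normalized cross-correlation of two equal-size windows."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    n = f.size                               # (2K+1)(2L+1)
    fd = f - f.mean()
    gd = g - g.mean()
    cov_fg = (fd * gd).sum() / (n - 1)       # Eq. (6.20)
    sigma_f = np.sqrt((fd ** 2).sum() / (n - 1))
    sigma_g = np.sqrt((gd ** 2).sum() / (n - 1))
    denom = sigma_f * sigma_g
    return cov_fg / denom if denom > 0 else 0.0   # Eq. (6.19)

rng = np.random.default_rng(0)
img = rng.random((9, 9))                     # hypothetical test image
f = window(img, 4, 4, K=2, L=2)              # 5 x 5 window
```

Because the mean is removed and the result is normalized by both standard deviations, the measure is unchanged by a gain or offset applied to either window, which is useful when the two images differ in illumination.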
According to the inherent Fourier spectrum properties, a rotation in the
spatial image results in the same amount of rotation of its spectrum. Thus it is
possible to find the angle of rotation of one image in the stereo pair with respect
to the other by performing step-by-step rotations and cross correlating their
Fourier transforms. The actual angle of rotation will be the one with the highest
cross-correlation obtained. Rotational compensation is applied once the angle
of rotation has been found.
After the rotational correction, a cepstrum transformation is applied to the
sum of the binary-featured stereo pair images. The power cepstrum P is defined
as in [52]:
where F represents the Fourier transform operation. Let w(x, y) be the reference
image, w(x + x0, y + y0) be the shifted image, and i(x, y) = w(x, y) + w(x + x0,
y + y0). Then the power cepstrum of the sum of both images is given as
where δ(x, y) is the Kronecker delta and A, B, and C are the first three coef-
ficients for this power cepstrum expansion series [54]. Equation (6.22) shows
that the displacement between images results in the sum of the power cepstrum
of the original image w(x, y) plus a multitude of delta functions. Each delta is
separated from the others by an integer multiple of the actual displacement we
are looking for. In order to enhance the cepstral peaks, the cepstrum of the refer-
ence must be subtracted from the cepstrum of the stereo pair. With this in mind,
a fixed number of deltas are chosen from the resulting cepstrum. Each delta
represents a translational shift, or an integer multiple of the shift, of a pixel in
the image shifted from the corresponding pixel in the reference image. All points
are tested by cross correlating the reference image with the other image shifted
by the number of pixels (in the vertical and horizontal directions) indicated by
the current point being tested. The highest correlation will correspond to the
most probable relative translation between both images.
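The shift-detection idea can be illustrated with a small NumPy sketch. It assumes the commonly used definition of the power cepstrum, P = |F(ln |F(i)|²)|², a purely horizontal circular shift, and a random test image. The function names and the small constant eps are illustrative choices, and the subtraction of the reference cepstrum is omitted for brevity.

```python
import numpy as np

def power_cepstrum(img, eps=1e-12):
    """|F(ln |F(img)|^2)|^2; eps keeps the logarithm finite."""
    spectrum = np.abs(np.fft.fft2(img)) ** 2
    return np.abs(np.fft.fft2(np.log(spectrum + eps))) ** 2

def strongest_peak(cep):
    """Location of the largest cepstral value away from the origin."""
    c = cep.copy()
    c[0, 0] = 0.0                       # ignore the peak at the origin
    return np.unravel_index(np.argmax(c), c.shape)

rng = np.random.default_rng(0)
w = rng.random((63, 63))                # hypothetical reference image
shifted = np.roll(w, 5, axis=1)         # same image shifted 5 pixels horizontally
cep = power_cepstrum(w + shifted)
row, col = strongest_peak(cep)
```

The peak appears at a horizontal offset of 5 pixels or at its mirror position 63 − 5 = 58, since the cepstrum gives the size of the shift but not its sign. This is why each candidate shift still has to be checked in both directions by cross correlation.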
Before disparity mapping is carried out, salient features of the image, such as
the blood vessels, have to be extracted. The blood vessels are segmented through
a series of traditional local operations, such as unsharp masking, thresholding, and
median filtering. This process is illustrated in Fig. 6.16.
The algorithm developed for the search of disparities first divides both images
into square windows of a given size (multiple of two), say N by N. ZNCC is
performed between the windows in one image with the windows in the other
image. If cross correlation is larger than a certain threshold, it is assumed that
the windows at that position in the image are similar, so the cepstrum is applied
to those windows to check for possible shifts. Otherwise, if the cross correlation
is smaller than or equal to the threshold mentioned, zero disparity is assigned to
every pixel in the window. Only a specified number of horizontal points shown
in the cepstrum are taken into account for analysis. Let’s say that, for an N by N
window, only N/4 horizontal points are chosen for analysis in the cepstral plane.
This is because for an N by N window the maximum horizontal displacement
that can be detected is N/2 (either to the right or to the left, making a full range
Figure 6.16: Blood vessel extraction for registration and disparity mapping.
of N pixels), so checking all N/2 points for right and left shifts will be very time
consuming. Instead, only the most probable N/4 horizontal shifts found by the
cepstrum will be tested using the cross correlation technique. One of the images
of the stereo pair is considered as the reference image and the other is the test
image. Then, for every point chosen (from the cepstrum), cross correlation is
applied between the reference window (in the reference image) and the other
window (in the test image) shifted by the number of pixels determined by the
cepstral shift. Since the cepstrum can detect only the amount of the shifts but not
their direction (the cepstrum is symmetric about the origin), each point should
be tested for left and right shifts. So when checking N/4 cepstral points, actually
N/2 positions are analyzed. The highest value in the cross correlation will be the
most probable shift that will be assigned to all elements in the window currently
being tested for disparity. The number of cepstral points is not a fixed parameter
and can be modified. This modification will affect the processing time and the
accuracy of the disparity map. Once all disparities have been calculated with a
window size of N by N pixels, the size of the window is reduced by a factor of two
and the whole process is repeated until the windows reach a predetermined size.
Each disparity map (calculated at a given resolution) is accumulated by adding
it to the previous disparity map. At the end of the process, the final disparity map
is the total accumulated disparity map. Usually the starting window size is 64 by
64 and the stopping size is 8 by 8. Smaller sizes of a window may not be worth
computing because of the much longer time required and the small impact of it
on the final disparity map. Also, since the window is so small, chances are that
noise becomes a serious issue. The cepstrum is, in fact, a very noise tolerant
technique that is suitable for finding disparities in chosen regions [55], while
cross correlation is noise-sensitive and finds disparities using a procedure in a
pixel-by-pixel fashion. A combination of both techniques results in an accurate
and noise tolerant algorithm. In order to get an accurate 3-D representation
from a stereo pair of images, disparities must be known for each point (pixel)
of one image with respect to the other. Since the disparity search algorithm
finds only disparities for the features or regions, disparities of all individual
pixels are not known. The interpolation used here gives an estimate of the other
missing disparities. Cubic B-spline is the interpolation technique applied to the
sparse matrix resulting from the disparity search. It can be shown that the cubic
B-spline can be modeled by three successive convolutions with a constant mask
[54, 56]. In this case, a mask consisting of all ones of size 32 by 32 or 64 by 64 is
used. After filtering the original sparse disparity matrix three times with the mask
described above, a smooth representation results. This is the final 3-D surface
of the ONH. With this surface, measures such as the disk and cup volume can be
made. Figure 6.17(a) shows the optic disk/cup segmentation obtained from the
Figure 6.17: Segmentation of optic disk/cup from 3-D disparity map. Panels (a)
and (b); axis ticks run from 50 to 400 on both axes. (color slide)
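The smoothing step described above, three successive passes of a constant mask over the sparse disparity matrix, can be sketched as follows. The function names and the toy input are hypothetical, and normalizing the mask so that it sums to one is an assumption; the text only states that the mask consists of all ones.

```python
import numpy as np

def box_filter(a, size):
    """Separable size x size constant mask, normalized to sum to one."""
    kernel = np.ones(size) / size
    rows = np.apply_along_axis(np.convolve, 1, a, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")

def smooth_disparity(sparse, size=32, passes=3):
    """Filter the sparse disparity matrix `passes` times with the mask."""
    out = np.asarray(sparse, dtype=float)
    for _ in range(passes):
        out = box_filter(out, size)
    return out

# Toy sparse disparity map (hypothetical): a single known disparity value.
sparse = np.zeros((16, 16))
sparse[8, 8] = 1.0
smooth = smooth_disparity(sparse, size=4)
```

Each pass spreads the isolated value over a wider neighborhood, so after three passes the single known disparity has become a smooth bump. The mask size must not exceed the image size for this particular convolution call to keep the output shape.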
Some of the above approaches have been applied to retinal blood vessel
segmentation. Chaudhuri et al. [60] used a tortuosity measurement technique
and matching filter for blood vessel extraction and reported 91% of blood vessel
segments and 95% of vessel network. Wood et al. [61] extracted blood vessels
in the retinal image for registration by first equalizing the image with local aver-
aging and subtraction, and then nonlinear morphological filtering was used to
locate blood vessel segments. Hoover et al. [62] proposed an automated method
to locate blood vessels in images of the ocular fundus. They used both local and
global vessel features and studied the matched filter response using a probing
technique. Zana and Klein [63] segmented vessels in retinal angiography images
based on mathematical morphology and linear processing. FCM and tracking-
based methods can also be used in retinal vessel segmentation [64]. Zhou et al.
[65] proposed an algorithm that relies on a matched filtering approach coupled
with a priori knowledge about retinal vessel properties to automatically detect
the vessel boundaries, track the midline of the vessel, and extract useful param-
eters of clinical interest.
We can see from the previous section that accurate blood vessel segmen-
tation is very important for registration and disparity mapping. The segmenta-
tion method described previously involves Gaussian filtering, unsharp masking,
thresholding, and median filtering all of which can be easily affected by image
illumination or contrast of the image. Here, we used advanced clustering
approaches for the same task and compared them with the Gaussian filtering method.
Among the three color channels, the green channel provides better contrast
for edge information. Therefore, it alone can be used to accelerate subsequent
segmentation processes. The challenge of blood vessel segmentation lies in
distinguishing the blood vessel edges. However, most images are noisy and
nonuniformly illuminated. In the optic disk area, blood vessel edges are smeared by
reflectance. Simple segmentation techniques such as the one described above
can produce noisy and inaccurate results. As is illustrated in Fig. 6.19(a), edges
of the optic disk are mistakenly classified as blood vessel edges.
DA clustering addresses this problem. Figure 6.18(c) shows the segmented
result. The image resulting directly from DA segmentation is still affected by the noise
in the original image, since single-pixel intensity is used as the feature. However,
the noise in the segmented image can be easily removed through morphological
filtering. The expansion or shrinking of blood vessel edges caused by morphological
erosion and dilation can be corrected using edges detected by a simple edge
detector, such as a Canny edge detector. Figures 6.18(d)–6.18(f) show the segmented
optic disk, optic cup, and blood vessels in the ONH, respectively. The optic
disk/cup segmentation is very similar to the manual segmentation in Fig. 6.18(b).
Blood vessels thus segmented are then used for registration and disparity
mapping in the 3-D optic disk/cup segmentation, as is shown in Fig. 6.19(b).
Three-dimensional visualization of the optic disk/cup with and without DA seg-
mentation is comparatively demonstrated in Figs. 6.19(c) and 6.19(d). We can
see that with DA segmentation, more accuracy is achieved.
[Figure 6.18 panels: (a) fundus image (left stereo image); (b) optic disk/cup manually segmented by an ophthalmologist on the right stereo image; (c) DA segmentation of (b); (d) DA-segmented optic disk; (e) DA-segmented optic cup; (f) DA-segmented blood vessels; (g) final segmentation of optic disk; (h) final segmentation of optic cup; (i) final segmentation of blood vessels.]
around 230,000 deaths each year. Cervical cancer develops slowly and has a
detectable and treatable precursor condition known as dysplasia. It can be pre-
vented through screening at-risk women and treating women with precancerous
and cancerous lesions. In many western countries, cervical cancer screening
programs have reduced cervical cancer incidence and mortality by as much as
90%. Analysis and interpretation of cervix images are important in early detection
Figure 6.19: Disparity maps generated with and without DA feature extraction.
(c) Disparity map obtained from DA blood vessel segmentation; (d) disparity map
obtained from general edge detection for blood vessel segmentation.
6.5 Conclusions
6.6 Acknowledgments
This work has been partially supported by funds from the Advanced Technology
Program (ATP) (Grant No. 003644-0280-1999), Technology Development and
Transfer Program (TDT) (Grant No. 003644-0217-2001) of the state of Texas,
Kestrel Corporation, the NEI Grant No. 1 R43 EY14090-01 and the NSF Grant EIA-
9980296. We acknowledge Young I. Kim, M.D., and Mary Lucy M. Pereira, M.D.,
of Young H. Kwon’s (M.D., Ph.D.) team from University of Iowa Hospitals and
Clinics for their contributions to manual segmentation of the stereo optic disk
images. The authors are grateful to Daron Ferris, M.D., from the Medical College
of Georgia for providing us with the cervigram images from the Guanacaste
Project, Costa Rica, sponsored by the National Cancer Institute of USA.
Questions
10. Judging from the examples given in the chapter, what are the performance
differences among AFLC, DA, FCM, and k-means?
12. How is clustering in retinal optic disk/cup and blood vessel segmentation
better than regular edge detection techniques?
Bibliography
[1] Pham, D. L., Xu, C. J., and Prince, L., A survey of current methods in
medical image segmentation, Ann. Rev. Biomed. Eng., Jan 1998.
[5] Suri, J. S., Setarehdan, S. K., and Singh, S., eds., Advanced Algorithmic
Approaches to Medical Image Segmentation, Springer-Verlag, London,
2002.
[8] Bezdek, J. C., Hall, L. O., and Clarke, L. P., Review of MR image seg-
mentation techniques using pattern recognition, Med. Phys., Vol. 20,
pp. 1033–1048, 1993.
[9] Clarke, L. P., Camacho, R. P., Velthuizen, M. A., Heine, J. J., Vaidyanathan,
M., Hall, L. O., Thatcher, R. W., and Silbiger, M. L., Review of MRI seg-
mentation: Methods and applications, Magn. Reso. Imaging, Vol. 13,
No. 3, pp. 343–368, 1995.
[10] Zadeh, L. A., Fuzzy sets, Inf. Control Theory, Vol. 8, pp. 338–353, 1965.
[12] Jain, A. K., Murty, M. N., and Flynn, P. J. Data clustering: A review, ACM
Comput Surveys, Vol. 31, No. 3, pp. 264–323, 1999.
[14] Newton, S. C., Pemmaraju, S., and Mitra, S., Adaptive fuzzy leader clus-
tering of complex data sets in pattern recognition, IEEE Trans. Neural
Networks, Vol. 3, pp. 794–800, 1992.
[16] Bezdek, J., Pattern Recognition with Fuzzy Objective Function Algo-
rithms, Plenum Press, New York, 1981.
[18] Marroquin, J. L. and Girosi, F., Some extensions of the k-means al-
gorithm for image segmentation and pattern recognition, Technical
Report, MIT AI Lab., AI Memo 1390, Jan 1993.
[19] Fraley, C. and Raftery, A. E., How many clusters? Which clustering
method? Answers via model-based cluster analysis, Technical Report
No. 329, University of Washington, 1998.
[20] Ray, S., Turi, R. H., and Tischer, P. E., Clustering-based color image
segmentation: An evaluation study, In: Proceedings of Digital Image
Computing: Techniques and Applications, Brisbane, Qld., Australia,
6–8 Dec. 1995, pp. 86–92.
[21] Park, S. H., Yun, I. D., and Lee, S. U., Color image segmentation based on
3-D clustering: Morphological approach, Patt. Recogn., Vol. 31, No. 8,
pp. 1061–1076, 1998.
[22] Weeks, A. R., and Hague, G. E., Color segmentation in the HSI
color space using the k-means algorithm, In: Proceedings of the
SPIE—Nonlinear Image Processing VIII, San Jose, CA, Feb, 10–11, 1997,
pp. 143–154.
[23] Wu, J., Yan, H., and Chalmers, A. N., Color image segmentation using
fuzzy clustering and supervised learning, J. Electron. Imaging, pp. 397–
403, 1994.
[31] Mitra, S. and Yang, S. Y., High fidelity adaptive vector quantization at
very low bit rates for progressive transmission of radiographic images,
J. Electron Imaging, Vol. 11, No. 4(Suppl. 2), pp. 24–30, 1998.
[32] Mitra, S., Castellanos, R., Yang, S. Y., and Pemmaraju, S., Adaptive
clustering for image segmentation and vector quantization, In: Soft-
Computing for Image Processing, Pal, S. K., Ghosh, A., and
Kundu, M. K., eds., Springer-Verlag, New York, 1999.
[33] Kirkpatrick, S., Gelatt, C. D., and Vecchi, M. P., Optimization by simu-
lated annealing, Science, Vol. 220, pp. 671–680, 1983.
[34] Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and
Teller, E., Equations of state calculations by fast computing machines,
J. Chem. Phys., Vol. 21, No. 6, pp. 1087–1091, 1953.
[35] Johnston, B., Atkins, M. S., Mackiewich, B., and Anderson, M., Segmen-
tation of multiple sclerosis lesions in intensity corrected multispectral
MRI, IEEE Trans. Med. Imaging, Vol. 15, pp. 154–169, 1996.
[36] Zijdenbos, A. P., Dawant, B. M., Margolin, R. A., and Palmer, A. C.,
Morphometric analysis of white matter lesions in MR images: Method
and validation, IEEE Trans. Med. Imaging, Vol. 13, No. 4, pp. 716–724,
1994.
[37] Hall, L. O., Bensaid, A. M., Clarke, L. P., Velthuizen, R. P., Silbiger, M. S.,
and Bezdek, J. C., A comparison of neural network and fuzzy clustering
techniques in segmenting magnetic resonance images of the brain, IEEE
Trans. Neural Networks, Vol. 35, pp. 672–682, 1992.
[38] Clark, M. C., Hall, L. O., Goldgof, D. B., et al., MRI segmentation us-
ing fuzzy clustering-techniques, IEEE Eng. Med. Biol., Vol. 13, No. 5,
pp. 730–742, 1994.
[39] Kamber, M., Collins, D. L., Shinghal, R., Francis, G. S., and Evans,
A. C., Model-based 3-D segmentation of multiple sclerosis lesions in
dual-echo MRI data, Proc. SPIE Visual. Biomed. Comput., Vol. 1808,
pp. 590–600, 1992.
[40] Jackson, E. F., Narayana, P. A., Wolinsky, J. S., and Doyle, T. J., Accuracy
and reproducibility in volumetric analysis of multiple sclerosis lesions,
J. Comput. Assisted Tomog., Vol. 17, No. 2, pp. 200–205, 1993.
[42] Mitchell, J. R., Karlik, S. J., Lee, D. H., and Fenster, A., Automated detec-
tion and quantification of multiple sclerosis lesions in MR volumes of
the brain, Proc. SPIE Med. Imag. VI: Image Process., Vol. 1652, pp. 99–
106, 1992.
[43] Johnston, B. G., Atkins, M. S., and Booth, K. S., Partial volume segmen-
tation in 3-D of lesions and tissues in magnetic resonance images, Proc.
SPIE Med. Imaging, Vol. 2167, pp. 28–39, 1994.
[44] Gerig, G., Kübler, O., Kikinis, R., and Jolesz, F. A., Nonlinear anisotropic
filtering of MRI data, IEEE Trans. Med. Imaging, Vol. 11, pp. 221–232,
1992.
[45] Leemput, K. V., Maes, F., Vandermeulen, D., Colchester, A., and Suetens,
P., Automated segmentation of multiple sclerosis lesions by model out-
lier detection, IEEE Trans. Med. Imaging, Vol. 20, No. 8, pp. 677–688,
2001.
[46] Kwan, R. K.-S., Evans, A. C., and Pike, G. B., MRI simulation-based
evaluation of image-processing and classification methods, IEEE Trans.
Med. Imaging. Vol. 18, No. 11, pp. 1085–97, 1999.
[47] Cocosco, C. A., Kollokian, V., Kwan, R. K.-S., and Evans, A. C., BrainWeb:
Online Interface to a 3D MRI Simulated Brain Database, available at:
http://www.bic.mni.mcgill.ca/brainweb/.
[48] Corona, E., Mitra, S., Wilson, M., and Soliz, P., Digital stereo optic disc
image analyzer for monitoring progression of glaucoma, Proc. SPIE,
Vol. 4684, pp. 82–93, 2002.
[49] Corona, E., Mitra, S., Wilson, M., Krile, T., Kwon, Y. H., and Soliz, P.,
Digital stereo image analyzer for generating automated 3-D measures
of optic disc deformation in glaucoma, IEEE Trans. Med. Imaging, Vol. 21,
No. 10, pp. 1244–1253, 2002.
[50] Yang, S., King, P., Corona, E., Wilson, M., Aydin, K., Mitra, S., Soliz, P.,
Nutter, B., and Kwon, Y. H., Feature extraction and segmentation in med-
ical images by statistical optimization and point operation approaches,
Proc. SPIE, Vol. 5032, pp. 1676–1684, 2003.
[51] Mitra, S., Nutter, B. S., and Krile, T. F., Automated method for fundus
image registration and analysis, Appl. Optics, Vol. 27, pp. 1107–1112,
1988.
Statistical and Adaptive Approaches for Optimal Segmentation 313
[52] Lee, D. J., Krile, T. F., and Mitra, S., Power cepstrum and spectrum
techniques applied to image registration, Appl. Optics, Vol. 27, pp. 1099–
1106, 1988.
[53] Sun, C., A fast stereo matching method, In: Proceedings of Digital Image
Computing: Techniques and Applications, Massey University, Auckland,
New Zealand, December 10–12, 1997, pp. 95–100.
[54] Ramirez, J., Mitra, S., and Morales, J., Visualization of the three di-
mensional topography of the optic nerve head through a passive
stereo vision model, J. Electron. Imaging, Vol. 8, No. 1, pp. 92–97,
1999.
[56] Pratt, W. K., Digital Image Processing, 2nd edn., Wiley-Interscience, New
York, pp. 112–117, 1991.
[57] Kirbas, C. and Quek, F. K. H., Vessel extraction techniques and algorithms:
A survey, In: 3rd Symposium on Bioinformatics and BioEngineering,
Bethesda, Maryland, March 2003, pp. 238–245.
[58] Kass, M., Witkin, A., and Terzopoulos, D., Snakes: Active contour models,
Int. J. Comp. Vision, Vol. 1, pp. 321–331, 1988.
[59] Osher, S. and Sethian, J. A., Fronts propagating with curvature dependent
speed: Algorithms based on Hamilton-Jacobi formulation, J. Comput. Phys.,
Vol. 79, pp. 12–49, 1988.
[60] Chaudhuri, S. C., Katz, N., Nelson, M., and Goldbaum, M., Detection
of blood vessels in retinal images using two dimensional blood vessel
filters, IEEE Trans. Med. Imaging, Vol. 8, pp. 263–269, 1989.
[61] Wood, S. L., Qu, G., and Roloff, L. W., Detection and labeling of retinal
vessels for longitudinal studies, In: IEEE International Conference on
Image Processing, 1995, Vol. 3, pp. 164–167.
[62] Hoover, A., Kouznetsova, V., and Goldbaum, M., Locating blood vessels
in retinal images by piecewise threshold probing of a matched filter
response, IEEE Trans. Med. Imaging, Vol. 19, pp. 203–210, 2000.
[63] Zana, F. and Klein, J. C., Robust segmentation of vessels from retinal
angiography, In: IEEE International Conference on Digital Signal Processing,
1997, Vol. 2, pp. 1087–1090.
[64] Tolias, Y. and Panas, S. M., A fuzzy vessel tracking algorithm for retinal
images based on fuzzy clustering, IEEE Trans. Med. Imaging, Vol. 17,
pp. 263–273, 1998.
[65] Zhou, L., Rzeszotarski, M. S., Singerman, L. J., and Chokreff, J. M., The
detection and quantification of retinopathy using digital angiograms,
IEEE Trans. Med. Imaging, Vol. 13, pp. 619–626, 1994.
[67] Wang, W., Sun, C., and Chao, H., Color image segmentation and under-
standing through connected components, In: Proceedings of 1997 IEEE
Int’l Conf. on Systems, Man, and Cybernetics, Orlando, FL, Oct. 12–15,
1997, Vol. 2, pp. 1089–1093.
[68] Castellanos, R., Castillo, H., and Mitra, S., Performance of nonlinear
methods in medical image restoration, SPIE Proc. Nonlinear Image Pro-
cess., Vol. 3646, 1999.
[69] Jain, A. K. and Dubes, R. C., Algorithms for Clustering Data, Prentice
Hall, Englewood Cliffs, NJ, 1988.
[70] Johnson, K. A. and Becker, J. A., The Whole Brain Atlas, available at:
http://www.med.harvard.edu/AANLIB/.
Chapter 7
7.1 Introduction
Medical image processing is the meeting of two sciences that behave in
completely different ways. Medicine is a science in which experience plays a
major role and whose practical use is evident, whereas image processing, as a
derivative of applied mathematics, is a more theoretical discipline. Hence, the
conditions of this meeting need to be analyzed carefully: not everything that
can be implemented is useful, and not everything useful can be implemented.
In this introductory part, we describe the biomedical context and the clinical
motivation of the methods presented in this chapter.
1 Centre of Mathematical Morphology, Paris School of Mines, Fontainebleau, France
316 Walter and Klein
[Figure: anatomy of the eye. Labels: pupil, lens, macula, retina, optic disk, pigment epithelium, choroid, sclera.]
• Retina: The innermost layer of the eyeball. It is the place where the image
created by the lens is focused and transformed into nerve impulses, which are
then sent to the brain via the optic nerve.
The retina itself can be divided into two layers: the photoreceptor layer and
the pigment epithelium (the latter is sometimes presented not as a part
of the retina but as a layer of its own). The pigment epithelium has metabolic
functions; it lies between the photoreceptor layer and the choroid and is
densely packed with pigment granules.
The cells responsible for the transformation of light into nerve impulses are
the rods and the cones. They are not distributed uniformly on the retina: The
concentration of the cones—responsible for daylight vision—is maximal in the
macula, the center of vision.
Once the light has been transformed into a nerve impulse, the information has
to be transported to the brain. This is done by the optic nerve, which enters the
eye at the optic disk (or papilla). The papilla does not contain any
photoreceptors; it is also called the blind spot.
The retinal vessels that nourish the retinal tissue also enter by the optic disk:
Only the macula is exclusively nourished by the choroidal layer and is therefore
not vascularized.
Analysis of Color Fundus Photos and Its Application to Diabetic Retinopathy 317
These three features—the macula, the papilla, and the vascular tree—are the
main anatomical features of the retina.
The retina can be seen as an exterior part of the brain; it is highly specialized
and complicated. There are many diseases that can affect this part of the eye
and one of the most important is diabetic retinopathy.
[Figure: fundus image with labeled features: microaneurysm, exudate, optic disk, retinal vessel, hemorrhage, choroidal vessel.]
retina. These new vessels are normally weak and may cause vitreous hem-
orrhages, which are one of the main reasons for irreparable vision impair-
ment and blindness due to diabetic retinopathy.
Color images are becoming increasingly important for the diagnosis of diabetic
retinopathy; their acquisition is cheap, noninvasive, and easy to perform. Only
in the last decade, owing to considerable technical improvements in their
acquisition, have they become really important for the diagnosis of this
disease. Before that, fluorescein angiography was used for years. Although
angiographies still allow detection of microaneurysms (the lesion characteristic
of diabetic retinopathy) with greater sensitivity, they are invasive and costly
and therefore not suited to screening purposes.
In this section, we discuss the color content of fundus images, and we deduce
a color representation which is adapted for automatic treatment.
[Figure: molar extinction coefficient (10000 L/Moles/mm) of hemoglobin and melanin as a function of wavelength; wavelength axis from 400 to 700.]
one to exploit this interpretation of the reflected spectrum. A color image and its
decomposition into the three channels, red, green, and blue, is shown in Fig. 7.4.
Of course, the given interpretation holds only approximately: The red channel
is not the spectral response to red illumination, but the red part of the spectral
response to illumination with white light.
The RGB representation of a color fundus image is shown in Fig. 7.4. The red
channel is relatively bright and the vascular structure of the choroid is visible.
The retinal vessels are also visible but less contrasted than in the green channel
(compare Fig. 7.4(b) with Fig. 7.4(c)). The blue channel is noisy and contains
little information: the vessels are hardly visible and the dynamic range is very
poor.
This phenomenon can be observed in all retinal images we have studied
(about 200), with one difference: Sometimes the blue channel contains
information and sometimes it does not. Indeed, the quality of the blue channel of the RGB
color space depends on the age of the patient and on the yellowing of the cornea;
the cornea can be understood as a filter that screens out ultraviolet radiation.
With age, the cut-off frequency moves toward the blue, and even blue light
can no longer pass.
This interpretation of the color content of fundus images favors the use
of the RGB color space, because its channels have a physical meaning. We have
qualitatively compared the green channel of 30 fundus images with channels
of other color spaces (HSV, HLS, Lab, Luv, principal components), and for
all images the green channel was better contrasted than any other channel (at
least concerning all blood-containing features). Another advantage of the use
of the green channel is that the choroidal vessels do not appear at all, whereas
they do appear in the luminance channel, for instance, because it is a combination
of the three channels R, G, and B. This is why we work mainly on the green
channel.
Figure 7.5: Erosion and dilation with a circular SE of a retinal image (detail).
The gray-level dilation (erosion) replaces the value $f(x)$ by the maximum
(minimum) of $f$ over all the pixels contained in the translated SE $B_x$:
$$[\varepsilon^{B}(f)](x) = \min_{b \in B} f(x + b)$$
$$[\delta^{B}(f)](x) = \max_{b \in B} f(x + b) \tag{7.1}$$
In Fig. 7.5, the effect of these operations is shown. We see that the erosion
enlarges dark details and reduces bright ones, and the dilation enlarges bright
details and reduces dark ones.
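The following is a minimal sketch of Eq. (7.1) in plain Python, not the authors' implementation. It assumes the image is a list of rows and the SE is a list of (dy, dx) offsets containing the origin; the names `erode`, `dilate`, and `SQUARE_3`, as well as the border rule (offsets falling outside the image are ignored), are assumptions made for illustration.

```python
def _apply(image, offsets, reduce_fn):
    """Replace each pixel by reduce_fn over the pixels covered by the translated SE."""
    rows, cols = len(image), len(image[0])
    out = []
    for y in range(rows):
        row = []
        for x in range(cols):
            # Assumption: offsets that fall outside the image are skipped.
            values = [image[y + dy][x + dx]
                      for dy, dx in offsets
                      if 0 <= y + dy < rows and 0 <= x + dx < cols]
            row.append(reduce_fn(values))
        out.append(row)
    return out


def erode(image, offsets):
    """Gray-level erosion: minimum of f over the translated SE."""
    return _apply(image, offsets, min)


def dilate(image, offsets):
    """Gray-level dilation: maximum of f over the translated SE."""
    return _apply(image, offsets, max)


# Hypothetical 3 x 3 square SE, given as (dy, dx) offsets including the origin.
SQUARE_3 = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

image = [
    [5, 5, 5, 5, 5],
    [5, 5, 5, 5, 5],
    [5, 5, 1, 5, 5],  # one dark pixel
    [5, 5, 5, 5, 5],
    [5, 5, 5, 9, 5],  # one bright pixel
]
eroded = erode(image, SQUARE_3)    # the dark pixel grows, the bright one vanishes
dilated = dilate(image, SQUARE_3)  # the bright pixel grows, the dark one vanishes
```

For real fundus images an array library would normally replace these nested loops; the sketch only makes the min/max definition concrete.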
$$\gamma^{B}(\cdot) = \delta^{\breve{B}}\,\varepsilon^{B}(\cdot)$$
$$\phi^{B}(\cdot) = \varepsilon^{\breve{B}}\,\delta^{B}(\cdot) \tag{7.2}$$
Figure 7.6: Opening and closing of a retinal image (detail) with a circular SE.
local minima (“sources”). The flooding level $s$ is the same for the whole image;
all pixels with a gray-level value lower than $s$ therefore belong to a “lake” (see
Fig. 7.7(a)). When two lakes meet, a “wall” is built between them, i.e.,
the pixel where the two lakes meet becomes part of the watershed line $WS(f)$
(see Fig. 7.7(b)). The whole image is flooded in this way, giving an image that
contains the watershed line $WS(f)$ and as many regions as there are local minima in the
original image $f$ (see Fig. 7.7(c)). These regions are called catchment basins
$CB_i$, in analogy to their topographic interpretation.
The presence of many minima due to the noise in real images results
in over-segmentation. The number of minima can be reduced before calculating
the watershed transformation by means of morphological reconstruction.
To this end, we calculate a marker image $m$, which takes the value $f(x)$ for all
the “marked pixels” and $t_{\max}$ elsewhere (see Fig. 7.8(a)). Then, we calculate the
reconstruction by erosion $R^{*}_{f}(m)$, i.e., we remove (“fill”) all unmarked minima
(Fig. 7.8(a)). For this modified image, the watershed transformation gives a more
pertinent result (Fig. 7.8(b)).
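The filling of unmarked minima can be illustrated on a one-dimensional gray-level profile. The sketch below is a simplification under stated assumptions: it works on a 1-D list with a two-neighbor connectivity, iterates elementary geodesic erosions until stability, and uses the hypothetical names `reconstruct_by_erosion`, `T_MAX`, and `marked`. It is not the authors' implementation.

```python
def reconstruct_by_erosion(f, marker):
    """Reconstruction by erosion of `marker` above `f` (1-D, neighbors i-1 and i+1).

    Requires marker[i] >= f[i] for all i.
    """
    n = len(f)
    current = list(marker)
    while True:
        # Elementary erosion of the marker, then pointwise maximum with f.
        nxt = [max(min(current[max(0, i - 1):min(n, i + 2)]), f[i])
               for i in range(n)]
        if nxt == current:
            return current
        current = nxt


f = [9, 2, 7, 5, 8, 1, 9, 4, 9]   # local minima at indices 1, 3, 5, 7
T_MAX = 255                        # assumed maximal gray level
marked = {1, 5}                    # hypothetical marked pixels
marker = [f[i] if i in marked else T_MAX for i in range(len(f))]

filled = reconstruct_by_erosion(f, marker)
# The minima at indices 3 and 7 are filled; those at 1 and 5 are kept.
```

Only the two marked minima survive, so a subsequent flooding would start from two sources instead of four.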
[Figure: an image f, a marker, and the resulting reconstruction.]
There are three systematic problems that occur in nearly all segmentation tasks
of color fundus images:
• Nonuniform illumination
• Poor contrast
• Noise
For the attenuation of noise, we cannot propose filters that are applicable in
general, because the requirements on such filters depend on the segmentation task.
If big features are to be detected (e.g., the macula), strong filters can be used,
whereas algorithms dedicated to the detection of small details (e.g., microaneurysms)
must rely on filters that preserve even small dark details.
In this section, we present an algorithm for contrast enhancement and shade
correction. First, we propose a simple global contrast enhancement operator.
Applying this operator locally enhances the contrast and corrects nonuniform
illumination in one step.
$$\Gamma : T \rightarrow U, \qquad u = \Gamma(t)$$
$$\tau = t - \mu_t$$
$$\nu = u - \tfrac{1}{2}(u_{\min} + u_{\max}) \tag{7.5}$$
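$$\Gamma^{*}(\tau_{\min}) = \nu_{\min}$$
$$\lim_{\tau \to 0^{-}} \Gamma^{*}(\tau) = 0$$
$$\lim_{\tau \to 0^{+}} \Gamma^{*}(\tau) = 0$$
$$\Gamma^{*}(\tau_{\max}) = \nu_{\max} \tag{7.7}$$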
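$$a_1 = \frac{-\nu_{\min}}{(-\tau_{\min})^{r}} = \frac{\tfrac{1}{2}(u_{\max} - u_{\min})}{(\mu_t - t_{\min})^{r}}$$
$$a_2 = \frac{-\nu_{\max}}{(-\tau_{\max})^{r}} = \frac{-\tfrac{1}{2}(u_{\max} - u_{\min})}{(\mu_t - t_{\max})^{r}}$$
$$b_1 = \nu_{\min} = \tfrac{1}{2}(u_{\min} - u_{\max})$$
$$b_2 = \nu_{\max} = \tfrac{1}{2}(u_{\max} - u_{\min}) \tag{7.8}$$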
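and finally, for $u = \Gamma(t)$:
$$u = \Gamma(t) = \begin{cases} \dfrac{\tfrac{1}{2}(u_{\max} - u_{\min})}{(\mu_t - t_{\min})^{r}} \cdot (t - t_{\min})^{r} + u_{\min} & \text{if } t \le \mu_t \\[2ex] \dfrac{-\tfrac{1}{2}(u_{\max} - u_{\min})}{(\mu_t - t_{\max})^{r}} \cdot (t - t_{\max})^{r} + u_{\max} & \text{if } t > \mu_t \end{cases} \tag{7.9}$$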
The corresponding graph is shown in Fig. 7.9 for different $\mu_t$. The resulting
transformation is not symmetric to the point $(\mu_t, \tfrac{1}{2}(u_{\max} + u_{\min}))$.
With $r$, we can control the strength of the contrast enhancement. For
$\mu_t = \tfrac{1}{2}(t_{\min} + t_{\max})$, we obtain a linear contrast stretching operator for $r = 1$. For
$r \to \infty$, we obtain a threshold operation with the threshold $\mu_t$.
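As a hedged illustration of Eq. (7.9), the transformation can be written as a small function. The name `gray_level_transform` and the sample gray-level ranges are assumptions; the upper branch is written with positive bases (an algebraically equivalent form) so that non-integer values of r do not produce complex numbers. The sketch assumes t_min < mu_t < t_max.

```python
def gray_level_transform(t, mu_t, t_min, t_max, u_min, u_max, r):
    """Piecewise polynomial gray-level transformation in the form of Eq. (7.9)."""
    half_range = 0.5 * (u_max - u_min)
    if t <= mu_t:
        return half_range * ((t - t_min) / (mu_t - t_min)) ** r + u_min
    # Same value as -half_range * (t - t_max)**r / (mu_t - t_max)**r + u_max,
    # because numerator and denominator are both negative before the power.
    return -half_range * ((t_max - t) / (t_max - mu_t)) ** r + u_max


# With mu_t at the middle of [t_min, t_max] and r = 1, the result is a linear stretch.
linear_low = gray_level_transform(50, 100, 0, 200, 0, 255, 1)
linear_high = gray_level_transform(150, 100, 0, 200, 0, 255, 1)

# With r = 2, values below the mean are pushed down and values above are pushed up.
strong_low = gray_level_transform(50, 100, 0, 200, 0, 255, 2)
strong_high = gray_level_transform(150, 100, 0, 200, 0, 255, 2)
```

Both branches give the value (u_max + u_min)/2 at t = mu_t, so the mapping is continuous at the mean.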
If this operator is applied to the whole image as a global contrast operator, the
result is not satisfactory because of the nonuniform illumination. In fact, the proposed
gray-level transformation does not enhance the contrast for every subset of $T$,
but only for subsets on which $\partial u / \partial t > 1$. For instance, the contrast of a dark detail
situated in a dark region may even be attenuated.
Figure 7.9: The graph of the gray-level transformation for different $\mu_t$ (horizontal axis: $t$ from $t_{\min}$ to $t_{\max}$, with $\mu_t$ marked; vertical axis: $u$ from $u_{\min}$ to $u_{\max}$, with $\tfrac{1}{2}(u_{\max} + u_{\min})$ marked).
In the corrected image, the gray-level values depend only on the difference
between the original value and the background approximation.
The local contrast enhancement operator: In order to obtain a shade correction
operator that also enhances the contrast, we apply the gray-level
transformation from Eq. (7.9) locally, i.e., we substitute the global mean $\mu_t$ by a
local background approximation.
One possibility is to calculate the mean value of $f$ within a window $W$ centered
on the pixel $x$:
$$\mu_t^{W}(x) = \frac{1}{N_W} \sum_{\xi \in W(x)} f(\xi) \tag{7.11}$$
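A minimal sketch of the local mean of Eq. (7.11) follows. The square window of half-size `half_size`, the clipping of the window at the image border (so that N_W shrinks near the border), and the function name `local_mean` are assumptions made for illustration.

```python
def local_mean(image, half_size):
    """Mean of f within a square window centered on each pixel.

    Assumption: the window is clipped at the image border, so N_W is the
    number of pixels that actually fall inside the image.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            window = [image[j][i]
                      for j in range(max(0, y - half_size), min(rows, y + half_size + 1))
                      for i in range(max(0, x - half_size), min(cols, x + half_size + 1))]
            out[y][x] = sum(window) / len(window)
    return out


image = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]
background = local_mean(image, 1)
```

The resulting background value at each pixel would then take the place of the global mean in the gray-level transformation.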
The results obtained by the application of this operator are shown in Fig. 7.10
and Fig. 7.11.
(a) Detail of the green channel of a fundus image containing hard exudates.
(b) The shade correction operator with the local mean value as background approximation.
(c) The final shade correction and contrast enhancement operator.
As explained in section 7.1, the main anatomical features in fundus photographs
are the vascular tree, the optic disk, and the macula. In the
following subsections we present methods for the detection of the vessels and
the papilla. An algorithm for the localization of the macula can be found in [9].
7.5.1.1 Motivation
Detecting the vascular tree is essential for the analysis of fundus images. The
structure of the vascular tree gives useful information for other feature or lesion
detection algorithms (e.g., optic disk, macula, hemorrhages). Moreover,
it delivers landmarks for image registration.
7.5.1.2 Properties
Vessels are elongated features, much longer than they are thick, reddish, and darker
than the background. They enter the retina at the optic disk and spread over the
whole retina, forming the vascular tree (see footnote 2). With increasing distance from the optic disk,
the vessels become thinner and their contrast decreases. Contrast and color of
vessels vary considerably from one image to another. Even in the same image,
there may be color differences, as color depends on the vessel type (artery
or vein), its diameter (the amount of hemoglobin that is transported), and the
illumination of the retinal region where the vessel is situated.
The width of the thickest vessels is almost constant for all images taken with
the same angle and the same resolution; we can state that all vascular structures
in fundus images are thinner than a parameter λ (which depends on resolution
and angle of the image).
As we have seen in section 7.2, vessels appear best contrasted in the green
channel fg of the color images; our algorithm for vessel detection is exclusively
based on the use of this channel. The main difficulties we have to deal with are
as follows:
• The vascular tree may be interrupted by the presence of lesions (as shown
in Fig. 7.12(b)) or noise.
2 The vascular tree, as it appears in color images, is not a “tree” in the topological sense, as
veins and arteries usually cross each other. It is more like a “net” of piecewise linear structures.
(a) Two vessels corrupted by noise.
(b) A part of a vessel disconnected from the rest of the vascular tree by an exudate.
In this section, we present a new method for the detection of vessels in fun-
dus images. The main idea is to detect thin structures in gray-scale images by
evaluating the local contrast along watershed lines. This algorithm can also be
applied to other problems where thin structures are to be found.
Prefiltering: As we can see in Fig. 7.13(a), spaces between hard exudates
are a systematic source of false positives for vessel detection algorithms. In
order to remove small exudates, the prefiltered image $p$ is calculated as follows:
$$p = \gamma^{a}_{\lambda} f_g \tag{7.13}$$
with $f_g$ the green channel and $\gamma^{a}_{\lambda}$ the area opening with the parameter $\lambda$. The
result of this prefiltering step is shown in Fig. 7.14(b). Note that this
filter is not very restrictive: the borders of the different features present in the
image are not altered, but the small exudates are removed.
Extraction of dark details: Vessels appear as dark features in the green
channel of a color image; their maximal width is known and does not vary with
the image (as long as the resolution is the same). As we have seen in section 7.3,
vessels can be removed from this image by means of a morphological closing
with an appropriate size $s_1$ (see also Fig. 7.6(c)). Calculating the difference from
the original gives all the dark details that cannot contain the SE:
$$\vartheta_p = \phi^{s_1 B}(p) - p \tag{7.14}$$
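The top-hat by closing of Eq. (7.14) can be sketched on a one-dimensional gray-level profile taken across a vessel. This is a simplified illustration, not the authors' code: the SE is a flat segment of half-width `half_width`, windows are clipped at the ends of the profile, and the names `closing` and `closing_top_hat` are assumptions.

```python
def _window_op(signal, half_width, fn):
    """Apply fn (min or max) over a sliding window clipped at the signal ends."""
    n = len(signal)
    return [fn(signal[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def closing(signal, half_width):
    """Morphological closing: dilation followed by erosion with the same flat SE."""
    return _window_op(_window_op(signal, half_width, max), half_width, min)


def closing_top_hat(signal, half_width):
    """Closing minus original: keeps dark details too narrow to contain the SE."""
    closed = closing(signal, half_width)
    return [c - s for c, s in zip(closed, signal)]


# A narrow dark "vessel" (2 pixels wide) and a wide dark region (6 pixels wide).
profile = [50, 50, 50, 20, 20, 50, 50, 50, 50, 10, 10, 10, 10, 10, 10, 50, 50]
top_hat = closing_top_hat(profile, 2)   # SE of length 5
```

The narrow dark detail appears as a bright response in `top_hat`, whereas the wide dark region, which can contain the SE, is left at zero.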
In the top-hat image $\vartheta_p$ (shown in Fig. 7.15(a)), vessels appear as bright
features, elongated and connected. However, because of contrast differences between
retinal images and between different vessels in one image, only a rough
approximation of the vascular tree can be found by means of simple threshold
techniques, as shown in Fig. 7.15(b). In our example, the vessels are obtained
by an area threshold $T_K$, proposed in [12]: The threshold is chosen in such a way
that the resulting binary image contains at least $K$ pixels.
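One possible reading of the area threshold is sketched below: the highest gray level t such that at least K pixels have a value greater than or equal to t. Whether [12] resolves ties and chooses the level exactly this way is an assumption, as is the function name `area_threshold`.

```python
def area_threshold(image, k):
    """Return (t, mask), with t the highest level such that at least k pixels are >= t."""
    values = sorted((v for row in image for v in row), reverse=True)
    if not 1 <= k <= len(values):
        raise ValueError("k must lie between 1 and the number of pixels")
    t = values[k - 1]                      # k-th largest gray level
    mask = [[v >= t for v in row] for row in image]
    return t, mask


top_hat_image = [
    [0, 3, 1],
    [7, 2, 7],
    [5, 0, 4],
]
t3, mask3 = area_threshold(top_hat_image, 3)
t1, mask1 = area_threshold(top_hat_image, 1)
count3 = sum(v for row in mask3 for v in row)
count1 = sum(v for row in mask1 for v in row)   # ties can give more than k pixels
```

Because the threshold adapts to the gray-level distribution of each image, the number of detected pixels stays comparable between images of different contrast.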
Extraction of the crest lines: Considering the image shown in Fig. 7.15(a) as a
topographic surface, we can notice that the vessels correspond to the crest lines
in this image. An excellent tool for finding the crest lines in a gray-scale image
has been presented in section 7.3: the watershed transformation. The strategy is
to first find a good marker, then calculate the watershed transformation, and in
the final step apply a contrast criterion in order to distinguish real vessels from
false detection.
The usual technique to obtain a good segmentation result using the water-
shed transformation is to use markers (see section 7.3), i.e., the image is flooded
only from “important” minima, the others are filled by means of the morpholog-
ical reconstruction. Here, the markers must be chosen in such a way that the
watershed line coincides with the vessels. It is, therefore, very important that
we mark all the “valleys” that are completely or partially surrounded by the crest
lines. Such a marker is shown in Fig. 7.16.
In order to obtain such a marker, we determine the points having maximal
distance from the approximation shown in Fig. 7.15(b). In a first step, we fill the
small “holes” of the thresholded image by an area closing of small size, i.e., we
remove all “holes” having fewer than 5 pixels; then we invert the result and
determine the local maxima of the distance function:
$$m_1 = \left[\phi^{a}_{\lambda}\, T_K(\vartheta_p)\right]^{c}$$
$$m(x) = \begin{cases} f(x) & \text{if } x \in \operatorname{Max}\{D(m_1)\} \\ t_{\max} & \text{if not} \end{cases} \tag{7.15}$$
(a) The distance image of the inverted approximation.
(b) The marker image (here superposed on the green channel of the original image).
(a) The watershed line and the catchment basins.
(b) The application of the contrast criterion.
Figure 7.18: The watershed line and the result of the application of the contrast
criterion.
The distance function is shown in Fig. 7.17(a), its maxima superposed on the
original image in Fig. 7.17(b). Of course, the presence of dark noise and features
in the original image may produce a lot of spurious objects in the approximation
image. As a consequence, there are more markers than necessary, but the number
of minima has been significantly reduced, and the watershed line can now be
determined:
• bifurcation points (BIF): all points of the WSL that have more than two
neighbors.
Figure 7.19: Two catchment basins $BV_i$ and $BV_j$ and the frontier $F_{i,j}$ between
them.
In the top-hat image, vessels appear brighter than the background (brighter than
the adjacent regions) and changes in gray level along the vessels are slow. Let us
now consider two catchment basins $CB_i$ and $CB_j$ (written $BV_i$ and $BV_j$ in
Fig. 7.19 and in Eq. (7.17)) and the frontier $F_{i,j}$ between
them. If $F_{i,j}$ corresponds to a vessel, the mean gray-level
value of the top-hat image on $F_{i,j}$ must be higher than the mean gray level on
the two catchment basins. Let $\vartheta_p$ be the top-hat image and $\#A$ the number of
pixels of the set $A$. We can then write the first criterion $c_1$:
$$\mu_{F_{i,j}} = \frac{1}{\#\{F_{i,j}\}} \sum_{x \in F_{i,j}} \vartheta_p(x)$$
$$\mu_{BV_i} = \frac{1}{\#\{BV_i\}} \sum_{x \in BV_i} \vartheta_p(x)$$
$$c_1(F_{i,j}) = \frac{1}{2}\left[(\mu_{F_{i,j}} - \mu_{BV_i}) + (\mu_{F_{i,j}} - \mu_{BV_j})\right] \tag{7.17}$$
Evaluating the contrast criterion $c_1$ removes all the false branches that do not
coincide with a dark detail extracted by the top-hat. However, the result is not yet
satisfactory, because there are still false positives due to small, unconnected
dark details, such as hemorrhages close to vessels, which also produce a quite
high value of $c_1$. In order to remove these false positives from the segmentation
result, we have to take into consideration the local gray-level variation on the
branch:
$$\sigma_{F_{i,j}} = \frac{1}{\#\{F_{i,j}\} - 1} \sum_{x \in F_{i,j}} \left(\vartheta_p(x) - \mu_{F_{i,j}}\right)^{2}$$
$$c_2(F_{i,j}) = c_1(F_{i,j}) - \alpha \cdot \sigma_{F_{i,j}} \tag{7.18}$$
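The two criteria of Eqs. (7.17) and (7.18) can be sketched for one branch as follows. The inputs are plain lists of top-hat values sampled on the frontier and on the two adjacent basins; the function name `contrast_criteria`, the sample values, and the value of alpha are hypothetical, since the source does not give alpha. The dispersion term follows Eq. (7.18) as printed, i.e., the sum of squared deviations divided by the number of frontier pixels minus one.

```python
from statistics import mean, variance


def contrast_criteria(frontier, basin_i, basin_j, alpha):
    """Return (c1, c2) for one watershed branch, following Eqs. (7.17) and (7.18).

    The frontier must contain at least two pixels for the dispersion term.
    """
    mu_f = mean(frontier)
    c1 = 0.5 * ((mu_f - mean(basin_i)) + (mu_f - mean(basin_j)))
    sigma_f = variance(frontier)   # squared deviations divided by (n - 1)
    c2 = c1 - alpha * sigma_f
    return c1, c2


# Hypothetical top-hat values: a bright, homogeneous branch between two dark basins.
c1_vessel, c2_vessel = contrast_criteria(
    frontier=[30, 32, 28, 30], basin_i=[2, 4, 3, 3], basin_j=[5, 5, 6, 4], alpha=0.5)

# A branch with the same mean but strong gray-level variation is penalized by c2.
c1_spot, c2_spot = contrast_criteria(
    frontier=[60, 0, 60, 0], basin_i=[2, 4, 3, 3], basin_j=[5, 5, 6, 4], alpha=0.5)
```

Both branches obtain the same value of c1, and only c2 separates the homogeneous branch from the irregular one.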
The result $V_1$ is shown in Fig. 7.18(b). We see that there are still small false
positives. In fact, they are so small that the criterion $c_2$ has no meaning. Therefore,
we remove all the connected components of $V_1$ that contain fewer than $\lambda$ pixels
(we chose $\lambda = 30$).
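Removing small connected components from a binary result can be sketched with a breadth-first labeling. The 8-connectivity, the function name `remove_small_components`, and the tiny example (which uses a size limit of 3 instead of 30 to stay readable) are assumptions.

```python
from collections import deque


def remove_small_components(mask, min_size):
    """Keep only the 8-connected components of `mask` with at least `min_size` pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[False] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x] or seen[y][x]:
                continue
            # Collect one connected component by breadth-first search.
            component, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) >= min_size:
                for cy, cx in component:
                    out[cy][cx] = True
    return out


mask = [
    [1, 1, 1, 1, 0, 0],   # a 4-pixel branch
    [0, 0, 0, 0, 0, 1],   # an isolated pixel
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],   # another isolated pixel
]
cleaned = remove_small_components(mask, 3)
```

Only the 4-pixel branch survives; the two isolated pixels are discarded.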
With this technique, we obtain very satisfactory results if the images do not
contain larger exudates that have not been removed by the prefiltering step.
Indeed, the spaces between exudates form small channels that are quite similar
to vessels. One possibility is to calculate the mean gray level of the branches in
the shade-corrected image $SC_{\mathrm{norm}}$ and to use it as complementary information:
A branch is accepted only if its mean gray level is lower than a certain threshold.
In this way, many false positives due to exudates can be removed.
7.5.1.5 Results
The algorithm has been tested on sixty 640 × 480 fundus photographs taken
with a Sony color video 3CCD camera on a Topcon TRC 50 IA retinograph.
These images have not been used for the development of the algorithm. We
asked an ophthalmologist to mark false detections and missed vessels on the
result (a posteriori evaluation). We obtained a sensitivity of 83% and a predictive
value of 97%; an example is shown in Fig. 7.20.
This kind of evaluation is certainly not the best method, as the expert is
influenced by the result of the algorithm. However, vessels are clearly visible
and an expert will always be able to mark them; the same holds for false positives.
Moreover, if an expert marks all vessels by hand, it is far from certain that
none will be missed, because this is a tedious and time-consuming task.
The optic disk (or papilla) is one of the main features of the retina. Its detection is
essential for a system of automatic analysis of retinal images; it is a prerequisite
for other segmentation algorithms (exudates, macula).
In the context of the diagnosis of glaucoma, the detection of the optic disk and
measurements on it may also be of great importance. Hence, an algorithm for the
automatic detection of the optic disk is required.
7.5.2.2 Properties
The optic disk is the entrance of the optic nerve and the vessels into the retina.
It is situated on the nasal side of the macula and it does not contain any photo-
receptor: It is also called the blind spot. In color fundus photographs, the optic
disk appears as a big bright spot of circular or elliptical shape, interrupted by
the outgoing vessels. Its size varies from patient to patient, but its diameter
always lies between 40 and 60 pixels in 640 × 480 images. The optic disk
is characterized by a strong contrast between outgoing vessels and the bright
color of the optic disk itself.
Unfortunately, this description is not valid for all images: Sometimes, the
contours are not clearly visible, the color tends more to a pale white, and there
may be other regions in the image which are as bright or even brighter than the
optic disk (due to nonuniform illumination or the presence of exudates).
In [13], the authors localize the optic disk using the high contrast between the
papilla and the outgoing vessels. This method fails if there are exudates in the
image.
In [14], the authors use an area threshold for the localization of the papilla and the
Hough transform for the detection of its contours. The Hough transform is also
used in [15]. The main problems that have been reported are low contrast and the
case where the shape of the optic disk does not correspond to a circle (for example,
if the optic disk is situated on the border of the image).
In [16], the authors use a template matching approach for the localization
of the optic disk. The problem with this approach is the size variability of the
papilla between different images and the presence of large accumulations of
exudates.
The presented algorithm can be subdivided into two parts: the localization and
the detection of the contours of the optic disk. First versions of this algorithm
have been presented in [17, 18].
Localization: As the optic disk belongs to the brightest parts of the image,
applying an area threshold in order to find at least a part of the optic
disk may work well, provided there are no large accumulations of exudates or
other bright regions. The atrophy in Fig. 7.21(a), for example, corresponds to
a yellow spot whose size and shape are comparable to those of the optic
disk. Before we can apply a threshold to the image, it is therefore necessary to
remove these bright features. This can be done using the vascular tree we have
already detected: As the optic disk is the entrance of the vessels into the retina,
It is recommended not to use the complete vascular tree V , but only the main
branches that can be extracted easily by applying a stronger contrast criterion
in the algorithm presented in section 7.5.1.
The effect of this filtering is shown in Fig. 7.21: The atrophy present in
image (a) is removed in (b), while the optic disk remains almost unchanged by
the reconstruction. With the methods presented in [14, 17, 18], the localization
algorithm would have failed in this case.
Now, we can assume that the optic disk belongs to the brightest elements
of the image, and the application of an area threshold should give a part of the
optic disk:
$L_1$ normally contains more than one connected component: a part of the optic
disk, some noise, and possibly other bright features connected to the vascular
tree. The latter are normally exudates of small size. Hence, it is sufficient
to choose the connected component with the largest area to obtain a part of
the optic disk:
The center of the (only) connected component of $L$ can be seen as the
approximate center $c$ of the optic disk and is used for the detection of the contours
described in the following paragraph.
Detection of the contours: The contours of the optic disk appear with the
best contrast in the red channel $f_r$ of the color image. Unfortunately, the red
channel is sometimes saturated and cannot be used. In this case, we propose
to work on the luminance channel $f_l$. The first step is to determine whether the red
channel is saturated. Let $c$ be the approximate center determined in the
localization step of the algorithm, $f_r$ a subimage of the red channel centered on
$c$, and $t_{\max}(f_r)$ the maximal gray-level value within this subimage. We define the
gray-level saturation $S_\alpha$ (see footnote 3):
$$S_\alpha = \frac{\#T_{[t_{\max}(f_r) - \alpha,\; t_{\max}(f_r)]}(f_r)}{\#T_{[0,\; t_{\max}(f_r)]}(f_r)} \tag{7.24}$$
(a) The luminance channel. (b) The biggest particle of the thresholded image. (c) The distance function of the particle.
(d) The gradient image with the superposed marker. (e) The result of the watershed algorithm. (f) The segmentation result.
Figure 7.22: The steps of the algorithm for the detection of the contours.
This measure gives the fraction of pixels in the subimage whose gray
level is at least $t_{\max}(f_r) - \alpha$. If this fraction is too high, the channel is
saturated and does not contain any exploitable information. We use the red
channel if $S_\alpha < 0.5$ for $\alpha = 30$ (a value found experimentally); otherwise, the
luminance channel is used. In the following, we call the selected channel $f_c$.
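The saturation test of Eq. (7.24) amounts to counting the pixels of the subimage that lie within alpha gray levels of its maximum. A minimal sketch follows, with a hypothetical function name `gray_level_saturation` and made-up subimage values; the denominator of Eq. (7.24) is simply the number of pixels of the subimage, since every value lies in [0, t_max].

```python
def gray_level_saturation(subimage, alpha):
    """Fraction of pixels whose gray level lies in [t_max - alpha, t_max]."""
    values = [v for row in subimage for v in row]
    t_max = max(values)
    near_max = sum(1 for v in values if v >= t_max - alpha)
    return near_max / len(values)


# Hypothetical 3 x 3 subimage of the red channel around the approximate center c.
red_subimage = [
    [255, 250, 240],
    [200, 100, 230],
    [255, 90, 50],
]
s_alpha = gray_level_saturation(red_subimage, alpha=30)
use_red_channel = s_alpha < 0.5   # rule stated in the text for alpha = 30
```

In this example five of the nine pixels lie within 30 gray levels of the maximum, so the red channel would be considered saturated and the luminance channel would be used instead.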
For finding the contours of the optic disk, we shall make use of the watershed
transformation applied to the gradient image of a filtered version of the channel
fc (see also Fig. 7.22).
3 The gray-level saturation $S_\alpha$ should not be confused with the color saturation.
First, we attenuate the noise in the image using a Gaussian filter $G$ (the type and
parameters of the filter are not crucial; we used a 9 × 9 filter with $\sigma = 4$). Then,
the vessels interrupting the circular shape of the optic disk are filled using a
morphological closing:
$$p_1 = \phi^{(s_1 B)}(G * f_c) \tag{7.25}$$
with $s_1$ such that the largest vessels are filled (as explained in the previous
section). In order to remove irregularities within the papillary region that may
also produce a high gradient value, we apply an opening by reconstruction:
$$p_2 = R_{p_1}\left(\varepsilon^{(s_2 B)}(p_1)\right) \tag{7.26}$$
$s_2 = 15$ has been found to be a good value for 640 × 480 images. This is
a big opening, but thanks to the reconstruction, the contours of $p_1$ are
preserved.
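The opening by reconstruction of Eq. (7.26) can be illustrated on a one-dimensional profile: the signal is eroded, and the eroded signal is then reconstructed by dilation under the original. The sketch is a 1-D simplification with clipped windows and a two-neighbor connectivity; the function names and the sample profile are assumptions.

```python
def erode(signal, half_width):
    """Erosion with a flat segment of the given half-width (window clipped at the ends)."""
    n = len(signal)
    return [min(signal[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def reconstruct_by_dilation(mask, marker):
    """Reconstruction by dilation of `marker` under `mask` (1-D, neighbors i-1 and i+1)."""
    n = len(mask)
    current = list(marker)
    while True:
        # Elementary dilation of the marker, then pointwise minimum with the mask.
        nxt = [min(max(current[max(0, i - 1):min(n, i + 2)]), mask[i])
               for i in range(n)]
        if nxt == current:
            return current
        current = nxt


def opening_by_reconstruction(signal, half_width):
    """Erode, then rebuild every structure that survived the erosion."""
    return reconstruct_by_dilation(signal, erode(signal, half_width))


# A narrow bright irregularity (index 2) and a wide bright plateau (indices 5 to 9).
profile = [10, 10, 40, 10, 10, 30, 30, 30, 30, 30, 10, 10]
p2 = opening_by_reconstruction(profile, 1)
```

The narrow peak is removed, while the plateau that survives the erosion is restored with its original borders.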
Then, the morphological gradient is calculated:
7.5.2.5 Results
The algorithm has been tested on 60 color fundus photographs (640 × 480)
taken with a Sony color video 3CCD camera on a Topcon TRC 50 IA
retinograph. These images have not been used for the development of the
algorithm.
The optic disk has been localized correctly in 57 of these 60 images. In the other 3
images, there were very large accumulations of exudates, which
prevented a correct localization of the optic disk. The accuracy of the detection
of the contours has been assessed qualitatively by a human grader; there were 48
images for which the segmentation result was satisfactory, with no or few pixels
missed or falsely detected (e.g., see Fig. 7.23). In eight images, some
parts were missing because of the very poor contrast of the original images, but the
result still contained more than 75% of the optic disk. In one image, the result was
not satisfactory, once again because of low contrast: Indeed, the contour was hardly
visible, even for a human.
Pathology detection is certainly the most important part of analysis of retinal im-
ages. In diabetic retinopathy, there are three types of lesions indicating different
stages of the disease that can be detected using color fundus images: microa-
neurysms, exudates, and hemorrhages. In this section, we present automatic
algorithms for the detection of microaneurysms and exudates. An algorithm for
the detection of hemorrhages can be found in [9].
7.6.1.2 Properties
Microaneurysms are tiny dilations of the capillaries. They appear as small red-
dish isolated patterns of circular shape in color fundus images of the human
retina [1]. Their diameter normally lies between 10 and 100 µm, but it is always
smaller than 125 µm. As they come from capillaries, and as capillaries are not
visible in color fundus images, they appear as isolated patterns, i.e. disconnected
from the vascular tree.
Microaneurysms are sometimes hard to detect: Their contrast is often very
low and sometimes, they are hardly visible and difficult to distinguish from noise.
Their reddish color can hardly be used for their detection, because it is far from
being constant in different images (see Fig. 7.24).
The first algorithm for the detection of microaneurysms was presented by Laÿ
[19]. The author introduced the radial opening $\gamma^{\sup} = \bigvee_i \gamma^{L_i}$, i.e. the supremum
$$p = G * SC_{norm}(f_g) \tag{7.30}$$
The detection of dark isolated details by means of the diameter closing: The
next step is to find the “candidates,” i.e., all features that may possibly correspond
to microaneurysms. Microaneurysms are characterized by their diameter; in the
green channel of a color image, they correspond to dark details—“holes”—with
a maximal diameter of λ (with λ depending on the image resolution).
As in the top-hat transformation used for vessel detection in section 7.5.1,
the main idea is to first construct a closing φ that removes the details from
the image and then calculate the difference to the original image. However,
a morphological closing cannot be used in our case because it fills not only
the holes but also the ditches (vessels). One possibility to fill only the holes
without filling the ditches is to determine the infimum of closings with linear
structuring elements in different directions, because these elements fit into the
vessels in at least one direction. However, this is only an approximate solution
to the problem; a tortuous line, for example, will be closed as well. We will now
present the diameter closing $\phi^{\circ}_{\lambda}$, which removes all dark details of a diameter
smaller than λ.
First, we define the diameter α of a connected set X as its maximal extension,
i.e. the maximal distance between two points of the set:
$$\alpha(X) = \bigvee_{x, y \in X} d(x, y) \tag{7.31}$$
with d(x, y) the distance between two points x and y. For simplicity, we use
the block distance: If $x = (x_1, x_2)$ and $y = (y_1, y_2) \in \mathbb{Z}^2$ are two points with
coordinates $x_1, x_2$ and $y_1, y_2$, respectively, the block distance can be written as
$d(x, y) = |x_1 - y_1| \vee |x_2 - y_2|$.
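As a small illustration of Eq. (7.31), the following sketch computes the diameter of a point set by brute force, taking the distance exactly as written above (the larger of the two absolute coordinate differences). The function names and the sample point set are hypothetical and chosen only for this example.

```python
from itertools import combinations


def block_distance(x, y):
    """Distance from the text: the larger of the two coordinate differences."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))


def diameter(points):
    """Maximal extension of a set: largest distance between two of its points."""
    points = list(points)
    if len(points) < 2:
        return 0
    return max(block_distance(x, y) for x, y in combinations(points, 2))


# A short, slightly bent segment of four pixels (hypothetical data).
segment = [(0, 0), (1, 1), (2, 2), (3, 2)]
print(diameter(segment))  # 3
```

The brute-force search is quadratic in the number of points; with this particular distance the same value can be read directly from the bounding box of the set.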
Figure 7.27: The diameter opening of a binary image: all connected components
with a diameter smaller than 15 pixels are removed.
With this definition of the diameter of a set, we can define a trivial opening.
Let X be an arbitrary binary image and $X_i$ its connected components, i.e.
$X = \bigcup_i X_i$ and $X_i \cap X_j = \emptyset$ for $i \neq j$. The diameter opening is the union of all
connected components $X_i$ with a diameter greater than or equal to λ (see Fig. 7.27):
$$\gamma^{\circ}_{\lambda}(X) = \bigcup_{\alpha(X_i) \ge \lambda} X_i \tag{7.32}$$
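The binary diameter opening can be sketched in a few lines: label the connected components, measure each diameter, and keep only the components whose diameter is at least λ. The sketch below assumes 8-connectivity (the text does not state the connectivity) and represents the image as a list of rows; all names are hypothetical.

```python
from collections import deque


def connected_components(image):
    """Yield the 8-connected components of a binary image as lists of (row, col)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, component = deque([(r, c)]), []
                while queue:
                    i, j = queue.popleft()
                    component.append((i, j))
                    for di in (-1, 0, 1):
                        for dj in (-1, 0, 1):
                            ni, nj = i + di, j + dj
                            if (0 <= ni < rows and 0 <= nj < cols
                                    and image[ni][nj] and not seen[ni][nj]):
                                seen[ni][nj] = True
                                queue.append((ni, nj))
                yield component


def diameter(component):
    """With d(x, y) = |x1 - y1| v |x2 - y2|, the diameter is the larger bounding-box extent."""
    rs = [p[0] for p in component]
    cs = [p[1] for p in component]
    return max(max(rs) - min(rs), max(cs) - min(cs))


def diameter_opening(image, lam):
    """Keep only the connected components whose diameter is at least lam."""
    out = [[0] * len(image[0]) for _ in image]
    for component in connected_components(image):
        if diameter(component) >= lam:
            for i, j in component:
                out[i][j] = 1
    return out


image = [
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1],
]
opened = diameter_opening(image, 3)
print(opened)  # the short vertical component (diameter 1) is removed
```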
[Figure 7.28 (labels: flooding level s; $C_x[X_s^-(f)]$; $X_s^-(f)$)]
is greater than or equal to λ:
$$\phi^{\circ}_{\lambda}(X) = X \cup \Bigl( \bigcup_{\alpha(X_i^c) < \lambda} X_i^c \Bigr) = \bigwedge_{\alpha(B) \ge \lambda} \phi^{B}(X) \tag{7.34}$$
We have now defined the diameter opening and closing for the binary case. In
order to pass from binary to gray-level images, we can apply the binary operator
to all level sets (the results of threshold operations for all gray levels t ∈ T). Let
$C_x(X)$ be the connected opening, i.e., the connected component of X containing
x if x ∈ X and the empty set if x ∉ X. Furthermore, let $X_t^+(f)$ be the section of
f at level t, i.e., the set of all pixels for which f(x) ≥ t, and $X_t^-(f)$ the section
of the background (the “lakes,” see Fig. 7.28):
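Then, the gray scale diameter opening and closing can be defined respectively:
$$[\gamma^{\circ}_{\lambda}(f)](x) = \sup \bigl\{ s \le f(x) \mid \alpha\bigl(C_x[X_s^+(f)]\bigr) \ge \lambda \bigr\}$$
$$[\phi^{\circ}_{\lambda}(f)](x) = \inf \bigl\{ s \ge f(x) \mid \alpha\bigl(C_x[X_s^-(f)]\bigr) \ge \lambda \bigr\} \tag{7.36}$$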
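To make the gray scale definition concrete, here is a deliberately naive sketch of the diameter closing that follows the level-set formulation literally: for increasing levels s, each pixel receives the first level at which the "lake" containing it reaches a diameter of at least λ. It assumes that the background section is $X_s^-(f) = \{x \mid f(x) \le s\}$, that components are 8-connected, and that pixels whose lake never reaches λ keep the maximal gray level; these are assumptions of this example, and the names are hypothetical.

```python
from collections import deque


def components(mask):
    """Yield the 8-connected components of a boolean mask as lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, comp = deque([(r, c)]), []
                while queue:
                    i, j = queue.popleft()
                    comp.append((i, j))
                    for di in (-1, 0, 1):
                        for dj in (-1, 0, 1):
                            ni, nj = i + di, j + dj
                            if (0 <= ni < rows and 0 <= nj < cols
                                    and mask[ni][nj] and not seen[ni][nj]):
                                seen[ni][nj] = True
                                queue.append((ni, nj))
                yield comp


def diameter(comp):
    rs = [p[0] for p in comp]
    cs = [p[1] for p in comp]
    return max(max(rs) - min(rs), max(cs) - min(cs))


def diameter_closing(f, lam):
    """Naive gray scale diameter closing by scanning all gray levels."""
    rows, cols = len(f), len(f[0])
    levels = sorted({v for row in f for v in row})
    out = [[None] * cols for _ in range(rows)]
    for s in levels:
        lower = [[f[r][c] <= s for c in range(cols)] for r in range(rows)]
        for comp in components(lower):
            if diameter(comp) >= lam:
                for r, c in comp:
                    if out[r][c] is None:  # first (smallest) admissible level
                        out[r][c] = s
    top = levels[-1]
    return [[top if v is None else v for v in row] for row in out]


# Bright background with one dark "hole" and one dark "ditch" (hypothetical data).
f = [[10] * 5 for _ in range(5)]
f[2][2] = 2          # isolated dark pixel: should be filled
f[4] = [3] * 5       # dark line of diameter 4: should be preserved
closed = diameter_closing(f, 3)
top_hat = [[closed[r][c] - f[r][c] for c in range(5)] for r in range(5)]
```

In this toy image the isolated dark pixel is raised to the background level while the dark line is left untouched, so the associated top-hat responds only to the hole. The cost grows with the number of gray levels times the image size, which is why such a direct evaluation is not practical.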
Of course, Eq. (7.36) cannot be used for implementation of this algorithm be-
cause it would be highly inefficient. Instead of calculating the diameter opening
(a) The prefiltered and shade-corrected image. (b) The diameter closing. (c) The
associated top-hat transformation.
The automatic threshold: The threshold can be seen as the minimal contrast
a detail must have in order to be considered as a candidate.
If the threshold is chosen manually, we lose the main advantages of an
entirely automatic analysis. If a fixed threshold is applied, we have to deal with
a lot of false positives or with poor sensitivity, because the contrast of
microaneurysms may be very different from one image to another. If the threshold
depends exclusively on the histogram of the top-hat image, we implicitly assume
that the image contains microaneurysms. Hence, we have to find a compromise
between a fixed and a histogram-dependent threshold.
In order to find a method for determining the threshold automatically, we have
analyzed 10 retinal images. For all these images, we have
chosen an “optimal” threshold using ROC-analysis, i.e., a threshold that gives
the best compromise between sensitivity and number of false positives.
This optimal threshold has then been compared to statistical properties of
the top-hat image (standard deviation, amount of noise, volume of the top-hat
image, etc.). The most obvious relation has been found between the volume of
the top-hat image and the optimal threshold. This relation is shown in Fig. 7.30.
This result is not really surprising. The volume of the top-hat image depends
on two image properties: the contrast and the amount of noise. On the one hand,
the better the contrast is, the higher the threshold can be chosen. On the other
hand, the higher the amount of noise is, the higher the threshold must be chosen.
[Figure 7.30: optimal threshold (vertical axis, 13 to 18) versus volume of the
top-hat image (horizontal axis, 80000 to 140000).]
However, some “fixed” information must be incorporated by using lower and
upper bounds for the threshold:
$$t_{vol}(V) = \begin{cases} 13 & \text{if } V < 80000 \\ 10^{-4} \cdot V + 5 & \text{if } 80000 \le V \le 130000 \\ 18 & \text{if } V > 130000 \end{cases} \tag{7.37}$$
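The bounded, volume-dependent threshold of Eq. (7.37) translates directly into a small function. The sketch below only restates the equation; the function name is hypothetical. Note that the linear part meets both bounds at the breakpoints (10⁻⁴ · 80000 + 5 = 13 and 10⁻⁴ · 130000 + 5 = 18), so the threshold is continuous in V.

```python
def t_vol(volume):
    """Threshold as a function of the volume of the top-hat image, Eq. (7.37)."""
    if volume < 80000:
        return 13.0
    if volume > 130000:
        return 18.0
    return 1e-4 * volume + 5.0


for v in (50000, 100000, 200000):
    print(v, t_vol(v))
```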
$$CA = R_{CA_2}(CA_1) \tag{7.38}$$
This is not the case for candidates situated on the vessels. We can write the
modified candidate image CA as
$$CA = \bigl\{ x \in CA \mid [\vartheta_{\phi^{(sB)}}(p)](x) \le 2 \cdot [\vartheta_{\phi^{\circ}_{\lambda}}(p)](x) \bigr\} \tag{7.40}$$
Candidates situated on the optic disk can be easily removed using the segmen-
tation result from section 7.5.2.
• The surface: Fundus images are often corrupted by noise (high frequency
gray level variations). Hence, there are many small “holes” and “peaks” in
the image; therefore, the surface of the candidate regions is an important
feature:
• The outer mean value: It is also important to take into consideration the
absolute gray level values on the outside of the candidate. The mean on
the external gradient can help to find false positives due to exudates or
hemorrhages (see Fig. 7.31):
$$Ex(C_i) = \delta^{3B} C_i \setminus \delta^{B} C_i$$
$$\mu_{ext} = \frac{1}{\#Ex(C_i)} \sum_{x \in Ex(C_i)} p(x) \tag{7.44}$$
• The contrast measure: The maximal value of the top-hat image is a contrast
measure: It is the difference between the local minimum and the level
for which the flooding stops. Another contrast measure is the difference
between the mean value on the external gradient of the candidate region
and the mean value on the candidate region itself:
$$\mu_{int}(f) = \frac{1}{\#C_i} \sum_{x \in C_i} f(x)$$
$$\mu_{ext}(f) = \frac{1}{\#Ex(C_i)} \sum_{x \in Ex(C_i)} f(x)$$
$$contr_f(C_i) = \mu_{ext}(f) - \mu_{int}(f) \tag{7.45}$$
Figure 7.31: Two types of false positives that can be identified using the mean
value of the prefiltered image on the external gradient of the candidate.
• The color: We have already seen in section 5.2 that the green channel
contains the most important information about blood-containing elements
in the retina and this is why it is used for the detection of microaneurysms.
However, there is also some information in the red, and sometimes in the
blue channel. We have studied a lot of color features; the most efficient are
the following two:
1. Color Contrast in the Luv color space: In the Luv color space, the
Euclidean distance can be seen as the “true” distance, i.e. the perceptible
distance. We used, therefore, the Euclidean distance between the
color on the candidate region and the color on its external gradient:
$$contr_{Luv}(C_i) = \bigl[ (\mu_{ext}(L) - \mu_{int}(L))^2 + (\mu_{ext}(u) - \mu_{int}(u))^2 + (\mu_{ext}(v) - \mu_{int}(v))^2 \bigr]^{1/2} \tag{7.46}$$
These two features do not depend strongly on each other. They help to
identify some false positives, but their efficiency is limited.
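The contrast features of Eqs. (7.44) to (7.46) all rely on the same ingredient: the mean of a channel on the candidate region and on its external ring $\delta^{3B} C_i \setminus \delta^{B} C_i$. The sketch below assumes that B is the 3 × 3 square, so that B and 3B correspond to square dilations of radius 1 and 3; this choice, the helper names and the toy data are assumptions made for illustration only.

```python
import math


def dilate(region, radius, shape):
    """Dilate a set of (row, col) pixels by a square of the given radius."""
    rows, cols = shape
    out = set()
    for i, j in region:
        for di in range(-radius, radius + 1):
            for dj in range(-radius, radius + 1):
                if 0 <= i + di < rows and 0 <= j + dj < cols:
                    out.add((i + di, j + dj))
    return out


def external_ring(region, shape):
    """Ex(C) = dilation by 3B minus dilation by B."""
    return dilate(region, 3, shape) - dilate(region, 1, shape)


def mean_on(image, pixels):
    return sum(image[i][j] for i, j in pixels) / len(pixels)


def contrast(image, region):
    """Mean on the external ring minus mean on the region, as in Eq. (7.45)."""
    shape = (len(image), len(image[0]))
    return mean_on(image, external_ring(region, shape)) - mean_on(image, region)


def contrast_luv(L, u, v, region):
    """Euclidean distance between outer and inner mean color, as in Eq. (7.46)."""
    return math.sqrt(sum(contrast(ch, region) ** 2 for ch in (L, u, v)))


def constant(value, size=9):
    return [[value] * size for _ in range(size)]


candidate = {(4, 4)}                      # a one-pixel candidate (toy data)
green = constant(100); green[4][4] = 60   # dark detail on a bright background
L = constant(50); L[4][4] = 47
u = constant(10); u[4][4] = 6
v = constant(0)

print(contrast(green, candidate))         # 40.0
print(contrast_luv(L, u, v, candidate))   # 5.0
```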
7.6.1.5 Results
The algorithm has been tested on 57 images and the results have been compared
to the ones obtained by two human graders: As for the training set, the specialists
graded the images independently, then they compared and discussed the results.
The result of this procedure was considered the gold standard.
The comparison with the automatic method gave a mean sensitivity of 88.1%
and a predictive value of 83.8% (2.3 FP per image). In Fig. 7.32, an example is
shown.
7.6.2.2 Properties
Exudates appear as bright patterns in color fundus images [1]. They are charac-
terized by a strong contrast; their shape and size are completely variable, and
their contours mostly irregular.
However, they are not the only bright features in retinal images; the optic disk
and possibly over-exposed regions have similar gray levels. Regions surrounded
by vessels may also be bright and well contrasted.
In [25], the authors propose shade correction and image enhancement tech-
niques. Then, a threshold is manually chosen in order to detect the exudates.
We think that a full automation of exudate detection is possible and useful.
In [26], a method based on image enhancement, shade correction, and a
combination of local and global thresholding is proposed and validated.
The method proposed in [28] is based on shade correction and advanced
classification methods.
Our algorithm can be subdivided into two parts: First, we find candidate regions,
i.e. regions that possibly contain exudates. In a second step, we determine the
contours of the exudates. This algorithm has been published and discussed
in [18]; here we give a sketch of it.
Figure 7.33: (a) The luminance channel of a color image of the human retina.
(b) The closing of the luminance channel. (c) The local standard variation in a
sliding window. (d) Candidate region.
Finding the candidate regions: Regions containing exudates are characterized
by a high contrast and a high gray level. If we use the local contrast to
determine regions that contain exudates, the problem is that bright
regions surrounded by dark vessels may also produce a high local contrast. As
shown in section 7.3, vessels can be removed by means of a morphological
closing (see Fig. 7.33(b)):
$$e_1 = \phi^{(s_1 B)}(f_g) \tag{7.48}$$
On this image we calculate the local variation for each pixel x within a
window W(x) (see Fig. 7.33(c)) centered on x:
$$e_2(x) = \frac{1}{N - 1} \sum_{\xi \in W(x)} \bigl( e_1(\xi) - \mu_{e_1}(x) \bigr)^2 \tag{7.49}$$
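The local variation of Eq. (7.49) is the sample variance of the closed image inside a sliding window. The sketch below assumes a square window of odd side length 2h + 1 that is truncated at the image border, with N taken as the number of pixels actually inside the window; the border handling and the names are assumptions of this example, not details given in the text.

```python
def local_variation(e1, half_width):
    """Sample variance of e1 in a (2 * half_width + 1)-sided window around each pixel."""
    rows, cols = len(e1), len(e1[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [e1[i][j]
                      for i in range(max(0, r - half_width), min(rows, r + half_width + 1))
                      for j in range(max(0, c - half_width), min(cols, c + half_width + 1))]
            n = len(window)
            if n < 2:
                continue  # variance undefined for a single pixel; keep 0.0
            mean = sum(window) / n
            out[r][c] = sum((value - mean) ** 2 for value in window) / (n - 1)
    return out


# A single bright pixel on a dark background (toy data).
e1 = [[0, 0, 0],
      [0, 9, 0],
      [0, 0, 0]]
e2 = local_variation(e1, 1)
print(e2[1][1])  # 9.0
```

A flat region gives a local variation of zero, so thresholding this image keeps only neighborhoods in which the gray level changes noticeably.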
Finding the contours: In order to find the contours of the exudates, we set
all the candidate regions to 0 in the original image (see Fig. 7.34(a)):
$$m(x) = \begin{cases} 0 & \text{if } ca(x) \neq 0 \\ f_g(x) & \text{if } ca(x) = 0 \end{cases} \tag{7.52}$$
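The marker image of Eq. (7.52) simply copies the original image outside the candidate regions and writes 0 inside them. A minimal sketch, with hypothetical names and toy data, and with images stored as lists of rows:

```python
def mask_candidates(f_g, ca):
    """Set every pixel of f_g that belongs to a candidate region (ca != 0) to 0."""
    return [[0 if ca_value != 0 else f_value
             for f_value, ca_value in zip(f_row, ca_row)]
            for f_row, ca_row in zip(f_g, ca)]


f_g = [[40, 42, 41],
       [43, 90, 44],
       [41, 40, 42]]
ca = [[0, 0, 0],
      [0, 1, 0],
      [0, 0, 0]]
m = mask_candidates(f_g, ca)
print(m[1])  # [43, 0, 44]
```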
Figure 7.34: (a) The candidate regions set to 0 in the original image. (b) The
morphological reconstruction.
This algorithm has three parameters: the size of the window W and the two
thresholds α1 and α2. The choice of the size of W is not crucial, and we have
found good results for a window size of 10 × 10. If the window size is very large,
small isolated exudates are not detected. From a medical point of view, this is
not really problematic. The first threshold α1 determines the minimal variation
value within the window that is suspected to be a result of the presence of
exudates. If α1 is chosen too low, the number of false positives increases; if it
is set too high, sensitivity decreases. The parameter α2 is a contrast parameter:
It determines the minimal amount by which a candidate must differ from its
surrounding background to be classified as an exudate.
7.6.2.5 Results
We have tested the algorithm on an image database of 30 digital images (640 × 480)
taken with a Sony color video 3CCD camera on a Topcon TRC 50 IA retinograph.
These images have not been used for the development of the algorithm. Fifteen
of these images did not contain exudates, and in 13 of these 15 no exudates were
found by our algorithm. In two images, a few false positives were found (fewer
than 20 pixels).
We asked an ophthalmologist to mark the exudates in the remaining 15 images and
compared the results obtained by the algorithm to his. The comparison was done
pixel-wise (with 1 pixel tolerance), because exudates cannot be counted; it is
their surface and position rather than their number which can
be used for diagnostic purposes.
We obtained a mean sensitivity of 92.8% and a predictive value of 92.4%. In
Fig. 7.35, an example for the automatic detection of exudates is shown (see also
Figs. 7.36 and 7.37).
Figure 7.36: (a) A detail of the green channel of a color image containing exu-
dates. (b) The segmentation result.
Figure 7.37: (a) A detail of the green channel of a color image containing exu-
dates. (b) The segmentation result.
In this chapter, we have seen different ways in which computers can assist the
diagnosis of diabetic retinopathy, which is a very frequent and severe eye disease:
image enhancement, mass screening, and monitoring. Different algorithms within
this framework have been presented and evaluated with encouraging results.
However, there are still improvements to be made. The first one is to use
high-resolution images. We worked on images already used in centers of
ophthalmology, but it is clear that acquisition techniques also improve and that in
the coming years high-resolution images will become the clinical standard. Future
segmentation algorithms can make use of this high resolution (e.g. there will be
more features for microaneurysm detection).
Another possible research axis is the inclusion of patient data into the
algorithms. This a priori knowledge about the patient is used by physicians; it
could also be used by automatic methods. For instance, we have observed that the
color of black people’s eyes is quite different from the color of white people’s,
and that the color of a child’s retina is different from the color of an adult’s.
This is valuable information that could be used in order to enhance the
performance of lesion detection algorithms.
Even if there is still progress to be made, the presented algorithms work
well; a clinical trial is envisaged.
• True Positive (TP): The patient suffers from the disease and the test was
positive.
• False Positive (FP): The patient does not suffer from the disease, but the
test was positive.
• True Negative (TN): The patient does not suffer from the disease, and the
test was negative.
• False Negative (FN): The patient suffers from the disease, but the test was
negative.
$$\text{sensitivity} = \frac{TP}{TP + FN}$$
$$\text{specificity} = \frac{TN}{TN + FP} \tag{7.54}$$
$$pv = \frac{TP}{TP + FP} \tag{7.55}$$
This is the probability that an object (or pixel) classified as positive is really
positive.
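The three measures follow directly from the four counts. The sketch below uses made-up counts purely to show the arithmetic; they are not results from this chapter.

```python
def sensitivity(tp, fn):
    """Fraction of truly positive cases that the test detects."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of truly negative cases that the test rejects."""
    return tn / (tn + fp)


def predictive_value(tp, fp):
    """Fraction of positive test results that are truly positive."""
    return tp / (tp + fp)


# Hypothetical counts, for illustration only.
tp, fn, fp, tn = 90, 10, 30, 870
print(sensitivity(tp, fn))       # 0.9
print(predictive_value(tp, fp))  # 0.75
print(specificity(tn, fp))
```

When objects or pixels are counted instead of patients, TN is typically dominated by the very large number of background elements, which pushes the specificity close to 1 regardless of the algorithm; the predictive value does not involve TN at all.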
7.9 Acknowledgment
First of all, the authors thank the ophthalmology department of the Lariboisière
Hospital in Paris for their excellent collaboration, their hearty and competent
support, for having supplied all images, and for having evaluated the
performance of all algorithms presented in this chapter.
This work has been supported by the French Ministry of Education and
Research (MENRT) in the program Dépistage automatique de la rétinopathie
diabétique (00 B 0100 01).
Questions
4. How can dark details in a gray scale image be extracted using mathemat-
ical morphology?
6. How does the use of markers in the watershed transformation work and
what is their influence on the result?
7. How can the watershed transformation be used for the detection of thin
dark lines in a gray scale image?
8. How can the watershed transformation be used for the detection of object
contours?
10. In the analysis of fundus images, specificity cannot be used for an assess-
ment of the quality of a pathology detection algorithm. Why?
Bibliography
[1] Massin, P., Erginay, A., and Gaudric, A., Rétinopathie Diabétique,
Editions scientifiques et médicales Elsevier SAS, Paris, 2000.
[2] Lee, S. C., Lee, E. T., Kingsley, R. M., Wang, Y., Russell, D., Klein, R.,
and Warn, A., Comparison of diagnosis of early retinal lesions of dia-
betic retinopathy between a computer system and human experts, Arch.
Ophthalmol., Vol. 119, pp. 509–515, 2001.
[3] Delori, F. C. and Pflibsen, K. P., Spectral Reflectance of the Ocular Fun-
dus, Appl. Optics, Vol. 28, pp. 1061–1071, 1989.
[4] Preece, S. J. and Claridge, E., Monte Carlo modelling of the spectral
reflectance of the human eye, Phys. Med. Biol., Vol. 47, pp. 2863–2877,
2001.
[7] Beucher, S. and Meyer, F., The morphological approach to image seg-
mentation: The watershed transformation, In: Mathematical Morphol-
ogy in Image Processing, Dougherty, E. R., ed., Marcel Dekker, New
York, pp. 433–481, 1992.
[8] Vincent, L., Morphological area openings and closings for grayscale
images, In: NATO Shape in Picture Workshop, Driebergen, 1992,
pp. 197–208.
[10] Chaudhuri, S., Chatterjee, S., Katz, N., Nelson, M., and Goldbaum, M.,
Detection of blood vessels in retinal images using two-dimensional
matched filters, IEEE Trans. Med. Imaging, Vol. 8, No. 3, pp. 263–269,
1989.
[12] Sahoo, P. K., Soltani, S., Wong, A. K. C., and Chen, Y. C., A survey of
Thresholding Techniques, Comput. Vision, Graphics, Image Process.,
Vol. 41, pp. 233–260, 1988.
[13] Sinthanayothin, C., Boyce, J. F., Cook, H. L., and Williamson, T. H.,
Automated localisation of the optic disc, fovea and retinal blood vessels
from digital colour fundus images, Br. J. Ophthalmol., Vol. 83, No. 8,
pp. 231–238, 1999.
[15] Pinz A., Prantl, M., and Datlinger P., Mapping the human retina, IEEE
Trans. Med. Imaging, Vol. 1, No. 1, pp. 210–215, 1998.
[16] Osareh, A., Mirmehdi, M., Thomas, B., and Markham, R., Colour mor-
phology and snakes for optic disc localisation, In: The 6th Medical Image
Understanding and Analysis Conference, 2002, pp. 21–24.
[17] Walter, T. and Klein, J.-C., Segmentation of color fundus images of the
human retina: Detection of the optic disc and the vascular tree us-
ing morphological techniques, In: Lecture Notes in Computer Science,
Vol. 2199, Crespo, J., Maojo, V., and Martin, F., eds., Springer-Verlag,
Berlin, pp. 282–287, 2001.
[18] Walter, T. and Klein, J.-C., A contribution of image processing to the di-
agnosis of diabetic retinopathy—Detection of exudates in color fundus
images of the human retina, IEEE Trans. Med. Imaging, Vol. 21, No. 10,
pp. 1236–1244, 2002.
[20] Spencer, T., Phillips, R. P., Sharp, P. F., and Forrester, J. V., Automated
detection and quantification of microaneurysms in fluorescein
angiograms, Graefe’s Arch. Clin. Exp. Ophthalmol., Vol. 230, pp. 36–41,
1991.
[21] Mendonça, A. M., Campilho, A. J., and Nunes, J. M., Automatic segmen-
tation of microaneurysms in retinal angiograms of diabetic patients,
In: Proceedings of IEEE International Conference of Image Analysis
Applications (ICIAP 99), 1999, pp. 728–733.
[23] Duda, R. O. and Hart, P. E., Pattern Recognition and Scene Analysis,
Wiley-Interscience, New York, London, Sydney, Toronto, 1973.
[25] Ward, N. P., Tomlinson, S., and Taylor, C., Image analysis of fundus
photographs—The detection and measurement of exudates associated
with diabetic retinopathy, Ophthalmology, Vol. 96, pp. 80–86, 1989.
[26] Phillips, R., Forrester, J., and Sharp, P., Automated detection and
quantification of retinal exudates, Graefe’s Arch. Clin. Exp. Ophthalmol.,
Vol. 231, pp. 90–94, 1993.
[27] Moreno Barriuso, E., Laser Ray Tracing in the Human Eye: Measurement
and Correction of the Aberrations by Means of Phase Plates, Ph.D.
Thesis, Institute of Optics, CSIC, Spain, June 2000.
[28] Osareh, A., Mirmehdi, M., Thomas, B., and Markham, R., Automatic
recognition of exudative maculopathy using fuzzy c-means clustering
and neural networks, In: Proceedings of Medical Image Understanding
and Analysis Conference, July 2001, pp. 49–52.
Chapter 8
8.1 Overview
1 Department of Radiology, BOX 357115
2 Department of Bioengineering, University of Washington, Seattle, WA 98195
370 Xu et al.
The second direction is trying to identify the tissue type distribution in plaque
which is the only way to distinguish vulnerable plaques from stable plaques of
similar size.
The motivation to study the constituents within the carotid vessel wall is that
evidence suggests different plaque tissue types yield different vulnerabilities to
plaque rupture. Also, the location of plaque tissues, such as the distance to the
lumen, may play a role in plaque rupture. Thus, imaging and analysis techniques
are needed that are sensitive to plaque tissue types and can subdivide a plaque
into its constituent components. This chapter presents the postprocessing
techniques developed for the identification of plaque constituents. In our
laboratory, these techniques have been used to study the characteristics of the
human carotid lesions that caused neurological symptoms [7] and of high
cholesterolemia patients. Technically, magnetic resonance (MR) images obtained
from advanced lesions in human carotid arteries present unique challenges:
1. Small size of artery wall: The carotid artery is usually less than 1 cm
in diameter. Even if high-resolution imaging methods are used, practical
limitations of MR scanners require the field-of-view of the image to be at
least 13 by 13 cm, with a resolution of at most 512 by 512 pixels. Within
these image dimensions, the carotid artery normally occupies only about
40 by 40 to 100 by 100 pixels. The comparatively small number of pixels
makes the processing and analysis very challenging.
3. Difficulties in tissue separation: Many studies have shown that any
individual MR image can only distinguish between a limited number of
plaque tissues, regardless of contrast weighting [8]. Therefore, a need
exists to integrate the information obtained from several different contrast
weightings, such as T1W, T2W, PDW, and TOF, so as to provide a single
representation of all plaque constituents. To achieve this, multiple spectrum
data segmentation is critical in this study.
blood vessel wall. Therefore, it is the critical feature in predicting the oc-
currence of rupture and monitoring the stability of patients’ diseases. As a
result, specialized segmentation techniques aimed specifically at charac-
terizing the fibrous cap must be considered.
generally difficult to obtain ideal segmentation results when they are applied
to other types of images. In recent years, some studies have combined
several algorithms to improve the segmentation performance,
such as the wavelet MRF method [28], the region competition [29], and the Fuzzy
Snake [32]. In this study, we develop our solutions following this
problem-solving approach.
Early research on segmentation techniques focused on single-frame
gray-level or monochrome images, according to the survey provided
by Haralick and Shapiro [10]. One category of these algorithms is region-based
segmentation, which includes region growing, splitting, and merging techniques.
They generally use the intensity smoothness or similarity among neighboring
pixels to find the regions. Another category of approaches finds regions based
on the discontinuities or edges in the image. Since a closed contour is usually
hard to obtain from the edge map generated by various edge detectors, such
as gradient operators or the Canny edge operator [15], a tedious and more
challenging linking procedure has to be employed to find a closed region boundary.
Active contour model (ACM)-based algorithms are a category of segmentation
methods that search for the contour of a particular object by minimizing a
curve energy function. Bayesian inference based algorithms are another
category of segmentation methods. They usually define the segmentation result
image (a label matrix) as a sample of a 2-dimensional random field and find the
optimal solution by performing maximum a posteriori probability (MAP)
estimation. The label matrix is usually modeled as a Markov random field (MRF)
[21, 22] and computed by means of clique potential functions according to its
duality with Gibbs random fields (proved by the Hammersley–Clifford theorem
[33]). Most recent publications focus on
accurate model description [20], the design of optimized energy minimization
algorithms [19], and performance improvement by introducing new
models [28].
The study of image sequence segmentation can be regarded as an extended
application area of single-frame segmentation approaches. In addition
to the segmentation results on each frame of the sequence, the correlation
between adjacent frames is often considered, which makes automatic or
semiautomatic processing possible. Even though some approaches have been
proposed before [34–37], the design of an algorithm is usually determined by the
detailed correlated features in the application. In our study on the atherosclerotic
Segmentation Issues in Carotid Artery Atherosclerotic Plaque 373
Section 8.5 introduces the specific segmentation methods that we use in fibrous
cap analysis.
8.2.1 Introduction
In this section, we will discuss the segmentation techniques for gray-level images.
This is because the subjects in our study, the MR images, are gray-level intensity
based, with a pixel intensity range of $2^{12}$ to $2^{16}$ levels, depending on the MR
scanner manufacturer. In addition, the methods for gray-level images are usually
the basis for processing an MR image sequence and multiple contrast images.
Gray-level image segmentation techniques have been studied for years.
Among the existing algorithms in the literature, some are based on the pixel
intensity distribution or histogram [49–52], some use region-based splitting/merging
approaches [11–14], and some are derived from morphological operations [53,
54]. They have been successfully employed in many applications. However, the
drawback of these algorithms is their poor performance in noisy environments.
Some Bayesian inference based segmentation techniques [19–22, 55], which use the
MRF as image model to improve robustness to noise, have been
proposed in recent years and have become very popular.
This section will focus on the MRF model and its application to gray-level
image segmentation. An enhanced version of the Highest Confidence First
algorithm is introduced.
Figure 8.1: Illustration of MRF neighborhood and edge constraint. s and g are
non-edge pixels belonging to different regions and h is an edge pixel within the s
neighborhood.
reduce the complexity of the image modeling and provides a convenient and
consistent way of describing the observed images.
8.2.2.1 Definition
1. pixel (i, j) ∉ $N_{i,j}$, and
(i) $P(X_{i,j} \mid X_{p,q},\ \text{all } (p, q) \neq (i, j)) = P(X_{i,j} \mid X_{p,q},\ \text{all } (p, q) \in N_{i,j})$ (8.1.a)
(ii) $P(X = x) > 0$ for all $x \in \Omega$ (8.1.b)
Condition (i) is called the Markovian property that describes the statistical
dependency of any pixel in the random field on its neighboring pixels. Under
this constraint, only a small number of pixels within Xi, j ’s neighborhood, Ni, j ,
instead of the whole image needs to be considered. Thereby, it reduces the
2. for t ≠ s, t ∈ C, s ∈ C ⇒ t ∈ $N_s$.
Figure 8.2: Illustration of all the possible clique types associated with a 3 by
3 pixel neighborhood. (a) One-pixel clique. (b) Two-pixel cliques. (c) Four-pixel
clique. (d) Three-pixel cliques.
It is a summation of the clique energies of all pixels over the whole image.
$V_s^C(x)$ is called the clique energy function.
The assignment of clique energy is completely application dependent [33].
In our study, in order to obtain a precise model description, clique energy is
calculated as a summation of two parts, pixel constraint and edge constraint
[19]. The expression of the clique energy function is written as
where $V_s^P(x)$ is the energy function derived by considering the spatial
constraint of pixel s and its neighboring pixels. It is defined as
$$V_s^P(x) = \sum_{h \in N_s} V_s^P(s, h), \tag{8.5}$$
with
$$V_s^P(s, h) = \begin{cases} -\beta_1 & \text{if } x_s = x_h, \\ +\beta_1 & \text{if } x_s \neq x_h, \end{cases} \tag{8.6}$$
where $x_s$ and $x_h$ are the labels at location s and h. $V_s^E(x)$ is the energy function
with an edge constraint:
$$V_s^E(x) = \sum_{h \in N_s} V_s^E(s, h). \tag{8.7}$$
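The pixel constraint of Eqs. (8.5) and (8.6) rewards neighbors that share the label of pixel s with −β1 and penalizes neighbors with a different label with +β1. The following sketch evaluates this sum for one pixel, assuming that $N_s$ is the 8-neighborhood inside a 3 by 3 window and that neighbors outside the image are ignored; both points, as well as the names, are assumptions of this example.

```python
def pixel_energy(labels, s, beta1=1.0):
    """Sum of pairwise pixel-constraint energies between s and its 8 neighbors."""
    rows, cols = len(labels), len(labels[0])
    r, c = s
    energy = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # s is not its own neighbor
            hr, hc = r + dr, c + dc
            if 0 <= hr < rows and 0 <= hc < cols:
                same = labels[hr][hc] == labels[r][c]
                energy += -beta1 if same else beta1
    return energy


labels = [[0, 0, 1],
          [0, 0, 1],
          [0, 0, 1]]
print(pixel_energy(labels, (1, 1)))  # -2.0: five agreeing and three disagreeing neighbors

flipped = [row[:] for row in labels]
flipped[1][1] = 1
print(pixel_energy(flipped, (1, 1)))  # 2.0: the minority label costs more energy
```

Lower energy corresponds to higher probability, so this term favors labelings in which neighboring pixels agree.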
Assume h is an edge pixel within the neighborhood of pixel s (see Fig. 8.1),
$$V_s^E(s, h) = \begin{cases} +\beta_2 & \text{if } x_s = x_g,\ h \in N_s, \text{ and } s, g \text{ are on different sides of an edge,} \\ -\beta_2 & \text{if } x_s \neq x_g,\ h \in N_s, \text{ and } s, g \text{ are on different sides of an edge,} \\ 0 & \text{otherwise,} \end{cases} \tag{8.8}$$
be written as:
$$P(X = x) = \frac{1}{Z} \exp\Bigl( -\sum_{s} \sum_{c \in N_s} \frac{V_s^P(x) + V_s^E(x)}{T} \Bigr) \tag{8.9}$$
$$P(X \mid Y) = \frac{P(Y \mid X) P(X)}{P(Y)}, \tag{8.10}$$
where P(Y | X) is the conditional probability of the observed image given the
scene segmentation. The goal of the maximum a posteriori probability (MAP)
criterion is to find an optimal estimate of X, $X_{opt}$, given the observed image Y.
Since P(Y) is not a function of X, the maximization only applies to
the numerator of Eq. (8.10), P(Y | X)P(X). More precisely, given the observed
image, the target of solving an MRF is to find the optimal state $X_{opt}$ that
maximizes the a posteriori probability and to take that state as the optimal image
segmentation solution.
In this study, the conditional density is modeled as a Gaussian process, with
mean $\mu_s$ and variance $\sigma^2$ for the region that s belongs to in the image domain.
Thus, the intensity of each region can be regarded as a signal $\mu_s$ plus additive
zero mean Gaussian noise with variance $\sigma^2$, and the conditional density can be
expressed as
$$P(Y \mid X) \propto \exp\Bigl( -\sum_{s} \frac{(y_s - \mu_s)^2}{2\sigma^2} \Bigr). \tag{8.11}$$
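Taking the negative logarithm of the Gaussian conditional density gives, up to an additive constant, a data energy equal to the sum of squared deviations from the region means divided by 2σ². The sketch below evaluates this term for a given labeling; the per-label means, the value of σ and the function name are hypothetical inputs chosen for the example.

```python
def data_energy(y, labels, means, sigma):
    """Sum over pixels of (y_s - mu_s)^2 / (2 sigma^2), mu_s being the mean of s's region."""
    total = 0.0
    for y_row, label_row in zip(y, labels):
        for y_s, label in zip(y_row, label_row):
            total += (y_s - means[label]) ** 2
    return total / (2.0 * sigma ** 2)


y = [[10, 12],
     [30, 29]]
labels = [[0, 0],
          [1, 1]]
means = {0: 11.0, 1: 30.0}
print(data_energy(y, labels, means, sigma=1.0))  # 1.5
```

A labeling that assigns each pixel to a region whose mean is close to its intensity has a low data energy, and therefore a high conditional probability.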
By substituting Eqs. (8.9) and (8.11) into (8.10), the general form of the a
posteriori probability can be written as
The discussion in section 8.2.2.2 shows that in an MRF the optimal segmentation
solution comes from finding the maximum of the a posteriori probability in Eq.
(8.14), which is equivalent to minimizing the energy function of Eq. (8.16).
However, because of the high dimensionality of the random variable X and the
possible existence of local minima, it is fairly difficult to find the global
optimum. Given an image of size I by J with Ng gray levels per pixel, the
random field solution space has Ng^(I×J) elements, so an exhaustive search
requires a huge amount of computation. For example, an MR image of the carotid
artery is usually 256 by 256 pixels with 2^12 gray levels per pixel, so the
solution set has (2^12)^(256×256) = 2^786432 elements, which is prohibitive
for most interactive applications.
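To make the size of this solution space concrete, the short sketch below recomputes the exponent and estimates how many decimal digits the count has. The function name is illustrative only, and the digit count is obtained from a logarithm rather than by building the integer.

```python
import math

def log2_solution_space(rows: int, cols: int, bits_per_pixel: int) -> int:
    """Return log2 of Ng^(I*J) when Ng = 2**bits_per_pixel."""
    return bits_per_pixel * rows * cols

# 256 x 256 image, 2^12 gray levels per pixel (the example in the text).
exponent = log2_solution_space(256, 256, 12)

# Number of decimal digits of 2**exponent, computed via log10.
decimal_digits = math.floor(exponent * math.log10(2)) + 1

print(f"solution space size = 2^{exponent}")
print(f"that is a number with about {decimal_digits} decimal digits")
```

Even enumerating a vanishing fraction of such a space is out of reach, which is why the rest of this section turns to stochastic and deterministic relaxation schemes.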
Some algorithms have been proposed to solve this problem in literature.
Generally speaking, they can be classified into two categories. One category
Based on this, there are occasional energy ascents in the “cooling” process so
as to help the algorithm escape from local minima.
Suppose that the temperature parameter is at a certain value T and the random
field X starts from an arbitrary initial state, X(0). By applying a random
perturbation to the current state X(n), a new realization of the random field,
X(n+1), is generated. The implementation of this perturbation varies between
optimization schemes. For example, in the Gibbs sampler only one pixel is
changed in each scan, while all the other pixels are kept unchanged. Generally
speaking, the perturbation is very small, so that X(n+1) is in the neighborhood
of its previous state, X(n). The probability of accepting this perturbation is
decided by two factors:
(ii) Temperature T.
Perturbations that lower the energy are always accepted. When the energy
increases, however, the temperature parameter T controls the acceptance
probability: for the same energy change ΔE, the acceptance probability is
higher when T is relatively high than when T is relatively low. Since this
probability is based on the overall energy change, it does not depend on the
scanning sequence as long as all the pixels have been visited. In each
iteration, this perturbing-accepting process continues until equilibrium is
assumed to have been approached (in practice this is controlled by a maximum
number of iterations). The temperature parameter T is then reduced according
to an annealing schedule, and the algorithm repeats the equilibrium search
described above at the reduced temperature.
This annealing process continues until the temperature falls below a
predefined minimum temperature. The system is then frozen and the state with
the lowest energy is reached.
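The perturb, accept, and cool loop described above can be sketched as follows. This is a minimal illustration, not the chapter's implementation: it assumes the standard Metropolis acceptance rule exp(−ΔE/T) for energy ascents and a geometric cooling factor, and the names `accept`, `anneal`, `cooling`, and `sweeps`, as well as the toy energy, are hypothetical.

```python
import math
import random

def accept(delta_e: float, temperature: float, rng: random.Random) -> bool:
    """Assumed Metropolis rule: descents always pass, ascents pass with exp(-dE/T)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)

def anneal(energy, perturb, state, t_start, t_min, cooling, sweeps, seed=0):
    """Run the perturb/accept loop at each temperature, then cool."""
    rng = random.Random(seed)
    cur, cur_e = state, energy(state)
    best, best_e = cur, cur_e
    t = t_start
    while t > t_min:
        for _ in range(sweeps):            # stand-in for "reaching equilibrium"
            cand = perturb(cur, rng)
            cand_e = energy(cand)
            if accept(cand_e - cur_e, t, rng):
                cur, cur_e = cand, cand_e
                if cur_e < best_e:         # remember the lowest-energy state seen
                    best, best_e = cur, cur_e
        t *= cooling                       # geometric schedule (an assumption)
    return best, best_e

def toy_energy(x: int) -> float:
    """Toy 1-D energy with many local minima (illustrative only)."""
    return (x % 7) + abs(x - 20) / 10.0

def toy_perturb(x: int, rng: random.Random) -> int:
    """Small perturbation: move to a neighboring state."""
    return x + rng.choice((-1, 1))

best, best_e = anneal(toy_energy, toy_perturb, state=0,
                      t_start=5.0, t_min=0.01, cooling=0.9, sweeps=50)
print(best, best_e)
```

At high temperature the walk crosses the energy barriers between local minima freely; as T falls, ascents become rare and the state settles into a basin.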
The annealing schedule is usually application-dependent, since it is crucial
both to the amount of computation in the stochastic relaxation process and to
the accuracy of the final result. Geman and Geman proposed a
temperature-reducing schedule that is expressed as a function of the iteration
number:
\[
T = \frac{\tau}{\ln(k + 1)} \tag{8.18}
\]
where k is the iteration cycle and τ is a constant. Even though this schedule
can guarantee a global minimum solution, it is normally too slow for practical
applications. Other annealing methods [58] have been proposed to reduce the
computational burden; unfortunately, they are no longer guaranteed to reach
the global minimum.
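A small calculation shows why the logarithmic schedule of Eq. (8.18) is slow. The sketch below inverts the schedule to find how many cycles are needed to reach a target temperature; the function names and the values τ = 4 and T = 0.5 are chosen only for illustration, and k is assumed to start at 1 so that ln(k + 1) is nonzero.

```python
import math

def geman_temperature(k: int, tau: float) -> float:
    """Temperature at iteration cycle k >= 1 under T = tau / ln(k + 1)."""
    return tau / math.log(k + 1)

def cycles_to_reach(t_target: float, tau: float) -> int:
    """Smallest k with tau / ln(k + 1) <= t_target, i.e. k >= exp(tau / t_target) - 1."""
    return math.ceil(math.exp(tau / t_target) - 1)

k = cycles_to_reach(0.5, 4.0)
print(k, geman_temperature(k, 4.0))
```

With these illustrative values, 2980 cycles are needed to cool from the initial temperature to T = 0.5, and the count grows exponentially in τ/T, which is the practical cost of the convergence guarantee.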
where
\[
p(Y_s \mid X_s) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(Y_s - \mu_s)^2}{2\sigma^2} \right\} \tag{8.20}
\]
(ii) The labeling of pixel s, Xs , depends only on the labels of its local neigh-
borhood as
where Ns is the neighborhood of pixel s and Cs is the set containing all the cliques
within Ns . This equation shows that the local conditional probability depends
only on Xs , Ys , and Ns .
Based on these relations, the ICM iteratively decreases the energy by visiting
and updating the pixels in a raster-scan order. For each pixel s, given the
observed image Y and the current labels of all the other pixels (in practice
only those in the neighborhood of pixel s), the label Xs is replaced with the
one that maximizes the conditional probability as
Starting from the initial state, this algorithm keeps running the procedure
introduced above until either the predefined number of iterations is reached
or the labels of X no longer change. A local minimum is then regarded as
having been reached.
In contrast to the acceptance probability in the SA method, only changes that
decrease the energy are accepted in the ICM algorithm. This can be regarded as
the special case T = 0, because SA never accepts a positive energy change at
zero temperature. This is why ICM is often referred to as the “instant
freezing” case of simulated annealing.
Even though ICM converges much faster than stochastic relaxation based
methods, its solutions are likely to reach only local minima, and there is no
guarantee that a global minimum of the energy function can be obtained. The
initial state and the pixel visiting sequence also affect the search result.
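The raster-scan update can be sketched in a few lines. This is a simplified illustration rather than the chapter's exact model: the local energy combines the Gaussian data term with a Potts-style neighbor term (−β for an equal 4-neighbor label, +β otherwise), which is an assumed stand-in for the clique potentials, and `local_energy`, `icm`, `beta`, and `sigma2` are hypothetical names.

```python
def local_energy(y, labels, means, r, c, label, sigma2=0.5, beta=1.0):
    """Data term plus an assumed Potts-style penalty over the 4-neighborhood."""
    data = (y[r][c] - means[label]) ** 2 / (2.0 * sigma2)
    prior = 0.0
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < len(y) and 0 <= cc < len(y[0]):
            prior += -beta if labels[rr][cc] == label else beta
    return data + prior

def icm(y, labels, means, max_iter=10, **kw):
    """Greedy label updates in raster-scan order until nothing changes."""
    labels = [row[:] for row in labels]     # work on a copy
    for _ in range(max_iter):
        changed = False
        for r in range(len(y)):
            for c in range(len(y[0])):
                best = min(range(len(means)),
                           key=lambda k: local_energy(y, labels, means, r, c, k, **kw))
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:                     # local minimum reached
            break
    return labels

# A flat background with one noisy pixel that thresholding mislabels.
image = [[0, 0, 0], [0, 6, 0], [0, 0, 0]]
initial = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(icm(image, initial, means=[0.0, 10.0], beta=5.0))
```

With a strong neighbor weight the isolated label is absorbed into the background; with a weak one (for example `beta=1.0`) the data term wins and the pixel keeps its label, which illustrates how the result depends on the parameters and the starting state.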
\[
E_s(X_s) = \frac{(Y_s - \mu_s)^2}{2\sigma^2} + \frac{1}{T} \sum_{c \in C_s} U_c(X) \tag{8.24}
\]
In Eq. (8.26), Lmin is the label that minimizes the energy function at site s
among all the elements of the label set L except 0. When a site is
uncommitted, cs(x) is always nonnegative. The label of the site with the
highest confidence is changed to Lmin. The purpose of this label updating
process is to pick the site where an appropriate label assignment lowers the
energy of the whole image the most. The new label of this site in turn
influences the energy and confidence of its neighbors, whose confidences need
to be updated before the start of the next iteration. The confidence can also
be regarded as an indication of how stable the segmentation will be when the
label at s is changed: the more stable the label-updated image is after the
change, the more confident we are that the label at s should be updated.
The HCF algorithm initially assigns label 0 to all the sites in the
segmentation matrix. In each iteration, the algorithm updates only the site
with the maximum confidence, giving it the label that minimizes Es. In the
calculation of cs(x), only neighbors that are committed can affect the clique
potentials. Therefore, once the label of a site changes, either from the
uncommitted state to a committed state or from one committed label to another,
the confidence of its neighbors has to be recalculated. The HCF algorithm
stops when all the sites are committed and their confidence becomes negative.
Although it is claimed [46] that no initial estimate is needed in HCF, some
parameters, such as the mean value of each site, have to be provided in
advance so that the confidence can be calculated while all the sites are still
uncommitted. For implementation, Chou's design suggests a heap structure,
which makes the search for the site with the highest confidence faster.
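The role of the heap can be illustrated with a small priority queue. The sketch below is one possible way to organize it and is not taken from Chou's design: it uses Python's `heapq` (a min-heap, hence the negated confidence) with lazy invalidation, so that recalculating a neighbor's confidence is a push rather than an in-place heap update. The class and method names are hypothetical, and the confidence values are assumed to be supplied by the caller.

```python
import heapq

class ConfidenceQueue:
    """Max-priority queue over sites; stale entries are skipped when popped."""

    def __init__(self):
        self._heap = []       # entries: (-confidence, version, site)
        self._version = {}    # latest valid version number per site

    def update(self, site, confidence):
        """Insert a site or replace its confidence after a neighbor changed."""
        version = self._version.get(site, 0) + 1
        self._version[site] = version
        heapq.heappush(self._heap, (-confidence, version, site))

    def pop_max(self):
        """Return (site, confidence) with the highest current confidence, or None."""
        while self._heap:
            neg_conf, version, site = heapq.heappop(self._heap)
            if self._version.get(site) == version:
                self._version[site] = version + 1   # invalidate older entries
                return site, -neg_conf
        return None

queue = ConfidenceQueue()
queue.update((0, 0), 1.5)
queue.update((0, 1), 3.0)
queue.update((0, 1), 0.2)     # a neighbor was relabeled, confidence recalculated
first = queue.pop_max()
second = queue.pop_max()
print(first, second)
```

In an HCF loop, each popped site would be relabeled and `update` called for its neighbors; the loop ends when the best remaining confidence is negative. Each operation costs O(log n) instead of a full image scan.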
Even though both the ICM and the HCF algorithms can only reach local minima of
the energy function, the results obtained with HCF are generally better than
those obtained with ICM [19, 46] because of its better-ordered label-updating
scheme. In HCF, the optimization result is independent of the scanning order
of the sites. Even though this increases the amount of computation and the
implementation complexity, the HCF algorithm is still regarded as a very
practical solution given the rapid growth of processor power.
(i) There is no need to predefine the number of classes because the Quad-
Tree algorithm can dynamically decide the partitions based on its split-
ting criterion.
The Quad-Tree procedure initially divides the whole image into four equal-size
blocks. Each block is then evaluated, by checking the value of its region
criterion (RC) Vrc, to decide whether it qualifies as an individual region.
The RC for each block Bi is defined as
(i) The selection of initial sites is based on the consideration of pixels in the
surrounding region due to the spatial constraint.
(ii) The points within the same region are not used for initialization repeat-
edly so that unnecessary computations can be avoided.
(iii) The iterative comparisons can be simplified during the HCF labeling
process and the problem of “unlabeled small region” in [19] can be solved.
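[Flowchart of the label-updating loop (cf. Fig. 8.3). Decision box: “Any pixel
has negative confidence?” with branches Yes and No. Action box: “Select the
pixel with highest confidence, and label it to get minimum energy.” Terminal
box: “End of segmentation.”]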
After the Quad-Tree initialization, in each region the pixel whose value is
closest to the mean region intensity is selected as the representative and
assigned a unique label; all the other pixels are uncommitted and labeled 0.
The algorithm then starts to update labels according to the procedure given in
Fig. 8.3 until the energy of the whole image becomes stable. In this label
updating process, pixels are permitted to change only from one committed state
to another or from the uncommitted state to a committed state. They are not
allowed to return from a committed state to the uncommitted state.
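The initialization can be sketched as a recursive split followed by seed selection. Several details here are assumptions made for illustration: the split criterion is the standard deviation of block intensity (the choice reported in the experiments below), the root block is tested before the first split, and `min_size` (at least 1) is a hypothetical stopping parameter.

```python
from statistics import mean, pstdev

def quadtree_regions(img, r0, c0, h, w, threshold, min_size=1):
    """Split a block into four while its intensity std. dev. exceeds threshold."""
    block = [img[r][c] for r in range(r0, r0 + h) for c in range(c0, c0 + w)]
    if pstdev(block) <= threshold or h <= min_size or w <= min_size:
        return [(r0, c0, h, w)]
    h2, w2 = h // 2, w // 2
    regions = []
    for rr, cc, hh, ww in ((r0, c0, h2, w2), (r0, c0 + w2, h2, w - w2),
                           (r0 + h2, c0, h - h2, w2),
                           (r0 + h2, c0 + w2, h - h2, w - w2)):
        regions += quadtree_regions(img, rr, cc, hh, ww, threshold, min_size)
    return regions

def seed_labels(img, regions):
    """Commit one representative pixel per region; all others stay 0 (uncommitted)."""
    labels = [[0] * len(img[0]) for _ in img]
    for label, (r0, c0, h, w) in enumerate(regions, start=1):
        pixels = [(r, c) for r in range(r0, r0 + h) for c in range(c0, c0 + w)]
        mu = mean(img[r][c] for r, c in pixels)
        # representative = pixel whose intensity is closest to the region mean
        r, c = min(pixels, key=lambda p: abs(img[p[0]][p[1]] - mu))
        labels[r][c] = label
    return labels

image = [[0, 0, 100, 100] for _ in range(4)]
regions = quadtree_regions(image, 0, 0, 4, 4, threshold=5)
seeds = seed_labels(image, regions)
print(regions)
print(seeds)
```

Each returned tuple is (row, column, height, width). The seeds carry unique labels 1, 2, and so on, and the remaining zeros mark the uncommitted pixels from which the HCF label updating starts.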
To simplify the calculation, we assume a variance of σ² = 1/2. The local energy
is normalized as
representing the degree of confidence with which we can update the label of
pixel s. Unlike the definition in Eq. (8.26), for a committed site the range
of the label search is reduced to the labels existing within its neighborhood,
which reduces the cost of the confidence computation. For the uncommitted
pixels in each Quad-Tree region, once any of their neighboring pixels is
labeled, their energy and confidence are affected, and the confidence has to
be updated by applying Eq. (8.29).
After getting the new confidence of these sites, we search the whole image
and select the one with the largest confidence as the next candidate to be up-
dated. Any candidate site has one of three relations with its neighboring pixels:
isolated, surrounded, or partially surrounded. When isolated (all neighbor-
ing pixels are uncommitted), the candidate pixel is given a new label different
from existing labels. When surrounded (all neighboring pixels are committed),
a unique label for this pixel becomes unlikely. We therefore select one label for
this pixel from the labels owned by its neighbors according to
making the energy of the candidate pixel minimal. When the candidate is
partially surrounded (only some of the neighboring pixels are committed), we
first treat it as a surrounded case, and if the selected label xk cannot
satisfy Es(xk) ≤ max{Es(0) − Es(Lmin)}, then a new label is assigned to this
pixel. With this updating strategy, each region is entitled to have a unique
label. This is different from the original HCF, in which disjoint regions may
share the same label. The advantage of the proposed QHCF is that the estimate
of each region's mean value, µs, is decided solely by the sites in this
region, resulting in more precise estimates during the label updating process.
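The three neighbor relations used in this updating strategy can be made explicit with a small helper. The sketch assumes a 4-neighborhood and counts only neighbors that lie inside the image; the chapter's neighborhood system may differ, and the function name is hypothetical.

```python
def neighbor_relation(labels, r, c):
    """Classify a candidate site by how many of its 4-neighbors are committed."""
    rows, cols = len(labels), len(labels[0])
    neighbors = [labels[rr][cc]
                 for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                 if 0 <= rr < rows and 0 <= cc < cols]
    committed = [x for x in neighbors if x != 0]   # label 0 means uncommitted
    if not committed:
        return "isolated"              # candidate receives a brand-new label
    if len(committed) == len(neighbors):
        return "surrounded"            # candidate takes one of its neighbors' labels
    return "partially surrounded"      # try neighbors' labels first, else a new label

grid = [[0, 0, 0],
        [0, 0, 2],
        [1, 1, 2]]
print(neighbor_relation(grid, 0, 0))   # no committed neighbors
print(neighbor_relation(grid, 1, 1))   # some committed neighbors
```

Separating the classification from the label choice keeps the energy comparison for the partially surrounded case in one place.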
As shown in Fig. 8.3, the segmentation procedure stops when all the pixels
possess negative confidence, at which point any change of a single pixel's
label would increase the overall energy of the image. However, this does not
guarantee that the image energy converges to a global minimum. The tradeoff
is processing speed: the adopted HCF algorithm always finishes in finite time
[21], providing a feasible solution for practical applications.
8.2.3.2 Experiments
Two experiments were designed to evaluate the performance of the QHCF
algorithm. The first experiment used a phantom image to determine segmentation
accuracy. In this study, a phantom was used as the ground truth, shown in
Fig. 8.4(a). By applying additive Gaussian noise, a test image was created as
shown in Fig. 8.4(b). It was segmented with the adaptive ICM [20] and the QHCF
algorithms individually. The results are shown in Figs. 8.4(c) and 8.4(d).
Comparison of the segmented images with the original phantom image yielded the
difference images in Figs. 8.4(e) and 8.4(f). QHCF had 92 pixel errors, while
adaptive ICM had 120 pixel errors. Over ten phantom image comparisons, the
QHCF algorithm produced 24.7% fewer edge pixel errors than the ICM algorithm.
We can also see from the shape of the ICM-segmented object that the two parts
have merged, while the proposed QHCF algorithm preserves the separation. This
is due to the edge constraint in the QHCF energy function, which makes it more
sensitive at boundaries. However, in the difference image (Fig. 8.4(f)) most
errors occur at the boundary of the segments, creating a rough contour.
Further work is needed to solve this problem.

Figure 8.4: A phantom study of the MRF segmentation problem with adaptive
ICM and QHCF. (a) Original phantom image. (b) Phantom is processed with
additive Gaussian random noise. (c) Segmented image with Pappas’ adaptive
ICM algorithm. (d) Segmented result with proposed QHCF algorithm. (e) The
difference image between (a) and (c). Number of dark pixels is 120. (f) The
difference image between (a) and (d). Number of dark pixels is 92.
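The error counts quoted for Figs. 8.4(e) and 8.4(f) amount to counting the pixels where a segmentation differs from the ground truth. The sketch below shows that computation and the relative reduction for the single pair of counts reported above; it assumes the label values of the two images have already been matched, and the function names are illustrative.

```python
def count_label_errors(truth, result):
    """Number of pixels whose segmented label differs from the ground truth."""
    return sum(t != s
               for row_t, row_s in zip(truth, result)
               for t, s in zip(row_t, row_s))

def relative_reduction(errors_baseline, errors_new):
    """Fraction by which the new method reduces the baseline error count."""
    return (errors_baseline - errors_new) / errors_baseline

truth = [[0, 0, 1], [0, 1, 1]]
segmented = [[0, 1, 1], [0, 1, 0]]
print(count_label_errors(truth, segmented))

# Counts reported for the phantom in Fig. 8.4: adaptive ICM 120, QHCF 92.
print(f"{relative_reduction(120, 92):.1%}")
```

For this one phantom the reduction is 28/120, about 23.3%, which is of the same order as the 24.7% figure reported over the ten comparisons.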
The second experiment was designed to evaluate the sensitivity of the
segmentation result to differing initial conditions. We compared QHCF and
uniform grid initialized HCF [19] (UGHCF) on human carotid MR images with a
size of 128 pixels by 128 pixels. As UGHCF needs a predefined grid size, we
chose 10, 20, and 30 pixels, respectively. For the QHCF algorithm, the
standard deviation of the Quad-Tree region's intensity was used as the RC Vrc,
and the threshold was set to 5, 10, and 20 intensity levels, respectively.
Other constraint parameters, such as β1 and β2, have the same values for the
two algorithms. Figure 8.5 is an example of the segmentation results obtained
with the above initial conditions. Although the overall performance of the two
algorithms seems quite similar for the various input RC values, QHCF gives
more consistent results than UGHCF (for example, the partitioned regions
within the white dotted line circles are stable in QHCF under differing
initialization).
even with the same input image. Therefore, for a specific type of image,
some empirical selections of parameters can be adopted. The parameters
for two categories of images have been analyzed: one is lumen
segmentation in T1W MR images; the other is frame segmentation in a
videoconference clip. Table 8.1 shows the typical values of the
parameters in the two applications for the QHCF algorithm.
Figure 8.6 is an example of the segmentation with different parameter
combinations on T1W MR images. It shows that the parameter combination
Trc = 10, Tmin = 10, β1 = 600, and β2 = 100 performs better than the
others for lumen segmentation because all the typical regions, including
the lumens and the blood vessel wall, are partitioned correctly. To
further fine-tune the results, we increase the minimum region threshold
to Tmin = 30, and an even “cleaner” result is obtained, as shown in
Fig. 8.6(g).

Table 8.1: Typical parameter values for the QHCF algorithm in the two
applications.

Parameter   Lumen segmentation (T1W MR)   Videoconference frame segmentation
Trc         10                            10
β1          600                           400
β2          1000                          600
Tmin        10                            10
8.3.1 Introduction
As discussed in section 8.1, most segmentation algorithms perform well only on
certain types of practical images, because each model is limited in its
applicability or ease of use. In this section, we introduce a flexible and
powerful framework for general-purpose image segmentation. In the course of
our investigation, we use the following assumption: a successful segmentation
is an optimal local contour detection based on an accurate global
understanding of the whole image. This assumption stems from the fact that the
global information of an image is generally crucial in local object
identification, automatic search initialization, and energy optimization.
Therefore, we focused our work on three parts:
(i) Region segmentation of the whole image: This provides a reliable basis
for decision-making and subsequent processing.
(ii) Local object boundary tracking: This will optimally fine-tune the contour
of the desired object region.
(iii) Flexible identification mechanism: This will bridge parts (i) and (ii) sys-
tematically and also be extendable to allow additional control functions
or prior knowledge.
points or curves must be specified near the object's boundary initially. When
the algorithm is applied, the Snake “moves” gradually, under certain
constraints, toward the positions where the object's contour is located. This
deformation process is generally conducted by iteratively searching for a
local minimum of an energy function. However, a well-known problem of the
classical Snake model is that it may be trapped in local minima caused by
noise or poor initialization [62].
Another kind of active model is the geometric active contour model, first
proposed by Caselles in 1993 [63]. It uses a geometric approach for the Snake
modeling and applies level set theory in the optimal curve search. During the
deformation process, the object contour evolves and expands in the normal
direction under certain constraints. Heuristic procedures are used to stop the
evolution process when an edge is reached. The experiments presented in
[63, 64] demonstrate better results than those obtained with the classical
Snake model [65, 66]. In 1995, Caselles further improved the geometric model
and transformed the object boundary detection problem into a search for a path
of minimal weighted length. This enhanced version is also known as the
“geodesic model,” which experimentally outperforms both the classical Snake
model and the geometric model [67].
The minimum path approach (MPA), proposed by Cohen and Kimmel in 1996, is a
state-of-the-art solution in active contour modeling. It interprets the Snake
curve as a path with an associated cost. The main feature of this method is
that, given two prespecified end points, the global minimal path can be
obtained. The energy optimization process is based on a numerical method
proposed by Sethian [23] to find the “shortest path” in terms of the global
minimum of the energy among all paths joining the two end points. Compared
with earlier active contour models, MPA has the following advantages:
(ii) It simplifies the initialization: only two initial end points are needed.
In the rest of this section, we review the classical Snake model and the
minimal path approach, since they represent, respectively, the essential
spirit of this kind of model and the state of the art in optimization
algorithm design.
(ii) Initialization of Snake model: To track the contour of the desired object,
some initial points or a closed curve are generally placed near the object's
boundary in advance. This usually requires an interactive mechanism with
human involvement, like Snake pit [71], to provide a reliable
initialization. Otherwise, either a wrong target boundary may be tracked
or the algorithm converges poorly [23]. To reduce the model's sensitivity
and simplify its tedious initialization process, some methods have been
proposed, such as the “Snake growing” method by Berger et al. [72] and the
fuzzy logic based framework by Eugene [73]. Cohen also introduced another
method, called balloon force, which pushes the Snake curve outward from
the center [74] and, by using the finite-element method, shows greater
stability and faster convergence [75]. However, the limitations of this
method are also obvious; for example, the initial points must be located
inside the desired object. Also, the optimal design of the balloon force
remains an open problem.
The minimal path approach (MPA) is a recent version of the active contour model,
proposed by Cohen et al. [23] in 1997. Compared with its predecessors, the main
improvements of this model are in the following two areas:
Similar to other active contour models, the contour evolving process of MPA
is also based on the minimization of an energy function. A potential field P is
created based on the edge map of the original image, and the energy function is
expressed as the integration over P along the curve. The goal of the optimization
process is to find a curve whose energy function is minimum.
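A discrete analogue of this optimization is a shortest-path search on the pixel grid. The sketch below uses Dijkstra's algorithm on a 4-connected grid as a simple stand-in; it is not the numerical method cited in the text, and it only approximates a continuous minimal path. The cost of entering a pixel is assumed to be a positive constant `w` plus the nonnegative potential at that pixel, and all names are hypothetical.

```python
import heapq

def minimal_path(potential, start, goal, w=1.0):
    """Dijkstra on a 4-connected grid; entering pixel (r, c) costs w + potential[r][c]."""
    rows, cols = len(potential), len(potential[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist[node]:
            continue                       # stale heap entry
        r, c = node
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= rr < rows and 0 <= cc < cols:
                nd = d + w + potential[rr][cc]
                if nd < dist.get((rr, cc), float("inf")):
                    dist[(rr, cc)] = nd
                    prev[(rr, cc)] = node
                    heapq.heappush(heap, (nd, (rr, cc)))
    path = [goal]                          # walk back from goal to start
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1], dist[goal]

# Low potential marks the edge to follow; the 9s form a costly barrier.
field = [[0, 9, 0],
         [0, 9, 0],
         [0, 0, 0]]
path, cost = minimal_path(field, start=(0, 0), goal=(0, 2))
print(path, cost)
```

The path goes around the high-potential column rather than across it, and because the search is exhaustive over the grid graph the result is the global minimum for the two given end points.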
The energy function is defined as follows: given a pair of control points p0
and p1, the energy of the curve C connecting this pair of points is
\[
E(C) = \int \left\{ w \left\| \frac{\partial C}{\partial s}(s) \right\|^2 + P(C(s)) \right\} ds
     = w L(C) + \int P(C(s))\, ds \tag{8.33}
\]
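For a curve given as a polyline, the second form of Eq. (8.33) can be evaluated numerically. The sketch below approximates the integral of P with the trapezoidal rule on each segment; the callable `potential(x, y)` and the function name are assumptions made for illustration.

```python
import math

def path_energy(points, potential, w):
    """Approximate E(C) = w * L(C) + integral of P along a polyline."""
    length = 0.0
    potential_integral = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        ds = math.hypot(x1 - x0, y1 - y0)          # segment length
        length += ds
        # trapezoidal rule for the potential along this segment
        potential_integral += 0.5 * (potential(x0, y0) + potential(x1, y1)) * ds
    return w * length + potential_integral

# Straight segment of length 5 in a constant potential field P = 2.
energy = path_energy([(0, 0), (3, 4)], lambda x, y: 2.0, w=1.0)
print(energy)
```

Here the length term contributes 5 and the potential term 10. The weight w therefore acts as a regularizer: a larger w favors shorter, smoother paths over paths that chase every dip in the potential.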
Another drawback of MPA is that it lacks the ability to handle topology
changes. For some applications within our study, such as carotid artery lumen
contour tracking in MRI sequences, the topology of the blood vessel lumen in
the cross-sectional images may change due to bifurcation, and it is impossible
to apply MPA directly even when the initial points can be provided precisely.
Therefore, a mechanism is needed to track the topology changes for automatic
image processing.
In the segmentation result obtained by the QHCF algorithm, the following
information is available for further processing: region distribution, region
intensity properties (such as mean and standard deviation), and region
boundaries. Although an MRF model takes into account the intensity continuity
among neighboring pixels during segmentation, it imposes no constraint along
the contour direction. Because the QHCF method has no curve continuity
constraint on the object's contour during optimization, the segmented contour
is easily distorted by noise. The experimental results in Fig. 8.5 show this
drawback (a rugged object boundary), which is unacceptable in some practical
applications, such as quantitative medical image analysis and measurement.
In the proposed framework, the region boundary is further fine-tuned by
employing the MPA contour model [23]. To obtain an accurate initialization,
the control points must first be found automatically.
As mentioned previously, the labeling process of each pixel in an MRF model
is decided by the MAP, max{ p(xi | y), i = 1, 2, . . . , N, where N is the number of
labels}. Based on this segmentation, the contour of an object can be easily found
by searching region’s boundary points. However, the experimental analysis of the
Step 1. Locate the boundary points of the desired region based on the QHCF
segmentation.
Step 3. For each section, select the most reliable boundary point as a control
point for MPA.
The selection process of Step 3 is crucial to the success of the algorithm.
Assume the boundary of the object of interest is divided into M sections and
section m, 0 < m ≤ M, contains im points in total. To simplify the problem
formulation, we consider only the boundary points that have one adjacent
region (that is, points bordering a single other region). Suppose a particular
boundary point is labeled p and its adjacent region's label is q; the
a posteriori probabilities of this point with label p and with label q can be
expressed, respectively, as
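\[
p(x_s = p \mid y) \propto \exp\left\{ -(y_{i_m} - \mu_p)^2 - \sum_{s \in S_{i_m}} \left[ U^N(x_s = p) + U^E(x_s = p) \right] \right\},
\tag{8.34}
\]
\[
p(x_s = q \mid y) \propto \exp\left\{ -(y_{i_m} - \mu_q)^2 - \sum_{s \in S_{i_m}} \left[ U^N(x_s = q) + U^E(x_s = q) \right] \right\}.
\]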
If this point belongs to the region with label p, its a posteriori probability
with label p should ideally always be higher than that with its adjacent
region's label q. In a real image, such as an MR or ultrasound image, noise
affects the capturing process in boundary regions and can make this assumption
invalid. Distortion due to noise can blur edges and reduce the separation
between the a posteriori probabilities at the true “edge points.” To obtain a
good measure of the probability difference, we introduce the reliability of
boundary points as
The value of the reliability is within the range [0, 1]. If s from the segmented
The above criterion can be applied directly to the boundary points obtained
with the QHCF algorithm because of the location and shape accuracy of the
found object region. A further advantage of this accuracy is the solid founda-
tion from which to do further work. This foundation is similar to the manual
outline provided by the human operator for the traditional Snake algorithm.
Consequently, use of the MRF-based segmentation result and the MAP criterion
allows for an automatic initialization process that is relatively free of traps due
to noise and spurious edges and has consistent reproducibility.
Step 2 addresses the selection of the number of sections M and the size of
each section. Image quality and the confidence of the contour points are the
determining factors. For example, in our carotid lumen segmentation of MR
images shown in Fig. 8.18, typical images generally needed 3–6 sections for
contour tracking, while low-quality images required 8–10 sections to track the
whole blood vessel boundary. When the object boundary is corrupted by noise,
more splits are needed to attain higher accuracy. The size of the object is
also an important factor. Most of our studies contain objects enclosed within
a square of 128 by 128 pixels. Division of the contour is accomplished by
equal-length splitting. A more dynamic approach can be used for a contour with
noisier pixels, giving more control points for the active contour model (ACM).
The resulting curve will be noise-resistant and reliable. In addition,
processing speed will be increased.
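Equal-length splitting followed by the choice of one control point per section can be sketched as below. The reliability values are taken as given inputs in [0, 1], since their definition is not reproduced here; "equal length" is approximated by an equal number of ordered boundary points per section, and the function name is hypothetical.

```python
def select_control_points(boundary, reliability, num_sections):
    """Split an ordered closed boundary into sections of (nearly) equal point
    count and return the most reliable point of each section."""
    n = len(boundary)
    if num_sections < 1 or num_sections > n:
        raise ValueError("need 1 <= num_sections <= number of boundary points")
    control = []
    for m in range(num_sections):
        lo = m * n // num_sections          # first index of section m
        hi = (m + 1) * n // num_sections    # one past the last index
        best = max(range(lo, hi), key=lambda i: reliability[i])
        control.append(boundary[best])
    return control

contour = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1)]
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7]
print(select_control_points(contour, scores, num_sections=3))
```

Raising `num_sections` for a noisy contour yields more control points, so each minimal-path segment is shorter and less likely to be pulled away by spurious edges.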
Step 1 is the most flexible and application-dependent of the three steps. It
can be skipped entirely when the target regions are already known. In most
situations, however, the exact location of the object is not known in advance,
and only limited knowledge of the object's properties is available to serve as
an additional constraint during the segmentation process. In that case, an
identification process can be designed on top of the QHCF-segmented regions to
extract the boundary of the region of interest, which is then used in further
contour fine-tuning. An example is lumen segmentation in a sequence of MR
images: the lumen is often almost circular in shape and has a dark intensity,
so in Step 1 a decision tree can be designed to distinguish the dark lumen
from the other regions in the QHCF segmentation.
In summary, the search for control points is the crucial step of this pro-
posed framework. This step provides the bridge between the MRF segmenta-
tion algorithm and the active contour model. It decides the initialization ac-
curacy, the key to the success of the minimal path approach. Finally, Step 1,
being flexible, allows space for prior knowledge and integration of target
constraints.
After all the control points in the M sections along the closed boundary have
been found, they outline the desired object in the same way as the initial
points entered by a human operator in the classical active contour models.
Compared with human input, however, the identified control points are much
more efficient and objective, especially when a large number of image
sequences need to be processed.
Based on these control points, the complete contour can be found by applying
the MPA algorithm to every pair of adjacent control points. To improve the
performance of the optimization process, we dynamically restrict the path
search range instead of searching the whole image. This avoids irrelevant
sites in the image and hence reduces the computation. Given a pair of control
points P0 and P1, the search range in our implementation is defined as a
square containing P0 and P1, as illustrated in Fig. 8.8, in which P0 and P1
are the midpoints of the edges. The shape of the search range is decided by
two factors:
(i) It must guarantee that the minimal path goes through this reduced search
region.
(ii) The boundary control of this region must be simple to implement, so that
it does not involve excessive computation.
[Figure 8.8: square search region, with the control points P0 and P1 at the
midpoints of its edges.]
involve a lot of calculation to control its boundary. For simplicity and
generality, we chose the square region to limit the search range. Another
benefit of this restriction is that it works as a control on the overall
object shape and prevents the occurrence of “wild divergence” caused by noise.
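The square search region can be constructed directly from the two control points. The sketch below assumes that P0 and P1 are the midpoints of two opposite edges, so the side length equals the distance between them and the square is aligned with the segment P0P1; this reading of Fig. 8.8 and the function names are assumptions.

```python
import math

def search_square(p0, p1):
    """Corners of the square whose opposite edges have p0 and p1 as midpoints."""
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = x1 - x0, y1 - y0
    d = math.hypot(dx, dy)
    if d == 0:
        raise ValueError("control points must be distinct")
    nx, ny = -dy / d, dx / d               # unit normal to the segment p0 -> p1
    half = d / 2.0
    return [(x0 + half * nx, y0 + half * ny), (x1 + half * nx, y1 + half * ny),
            (x1 - half * nx, y1 - half * ny), (x0 - half * nx, y0 - half * ny)]

def inside_search_square(point, p0, p1):
    """True if point lies in that square (boundary included)."""
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = x1 - x0, y1 - y0
    d2 = dx * dx + dy * dy
    px, py = point[0] - x0, point[1] - y0
    along = (px * dx + py * dy) / d2       # 0..1 from the p0 edge to the p1 edge
    across = (py * dx - px * dy) / d2      # -0.5..0.5 across the square
    return 0.0 <= along <= 1.0 and abs(across) <= 0.5

print(search_square((0, 0), (4, 0)))
print(inside_search_square((2, 1), (0, 0), (4, 0)))
```

A path search such as the minimal path computation only needs the membership test: sites for which `inside_search_square` is false are simply never expanded.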
The procedure of region boundary splitting, control point searching, and
curve fine-tuning is illustrated in Fig. 8.9.
[Figure 8.9, panels (a)–(c): boundary sections numbered 1–6; legend entries:
control point, section separator.]
Figure 8.9: Illustration of the procedure used to apply the MRF-based active
contour model on object boundary tracking. (a) The QHCF segmented region
with the boundary divided into six sections. (b) In each section, a control point
is searched based on maximum reliability criterion. (c) The final fine-tuned
contour is found by linking the curves between each two adjacent control points,
which are searched with minimal path approach.
First, a typical carotid MR image is shown in Fig. 8.10(a). Because of noise
or artifacts during the imaging process, the intensity of the lumen area is
not uniform and there are some isolated bright spots inside. The QHCF
algorithm was applied to this image and segmented it into many regions, as
shown in Fig. 8.10(c). From the result, we can see that the lumen segmentation
is not affected by the bright spots inside the lumen area and that most of the
noise in the background has been suppressed. This is better than the result
obtained with the adaptive ICM algorithm, shown in Fig. 8.10(b). By tracking
the boundary of the lumen region based purely on the QHCF segmentation, we
obtained the contour points of the target region shown in Fig. 8.10(d). There
are some sharp corners on the top-left part of the contour, and the bottom
part is also not very smooth; this conflicts with the normal anatomical
appearance of the lumen. In the next step, this contour was split into six
equal sections and we searched for the control points (see Fig. 8.10(e)) with
the maximum reliability criterion. The MPA algorithm was then used to track
the whole contour, and the result is shown in Fig. 8.10(f). Comparing the
contours in Figs. 8.10(d) and 8.10(f) demonstrates the effect of the
smoothness constraint in the MPA algorithm. The two rough parts in
Fig. 8.10(c) have also been fine-tuned.
Most existing active contour model based algorithms require the topology of
the object to be known before tracking starts. Unfortunately, this requirement
is difficult to satisfy in some practical scenarios, since the topology is
often difficult to predict in advance. For example, in our study of the
carotid artery, the lumen bifurcates from one common carotid artery into the
internal and external carotid arteries at a certain location along the image
Figure 8.10: An example of lumen segmentation with MRF-based active con-
tour framework. (a) The original MR image. (b) Segmentation results with adap-
tive ICM. (c) Segmentation results with the QHCF algorithm. (d) Rough lumen
contour based on the QHCF segmentation result. (e) The six selected control
points for MPA model initialization. (f) The fine-tuned lumen contour achieved
by applying the MPA algorithm. Comparing (d) and (f) shows that contour
tracking under the proposed framework gives better smoothness control than the
MRF-based solution.
searching procedure. The bulk of our work is represented by the second block
in the processing diagram shown in Fig. 8.7.
First, we model the MR images with the MRF model and segment each of them
into many regions by applying the QHCF algorithm. Since the number of lumen
regions may vary due to the bifurcation of the carotid artery along the image
sequence, a lumen identification process is indispensable before further
plaque analysis. For each image, lumen identification is achieved by passing
all the segmented regions through a knowledge-based decision tree and
selecting the lumen region(s) of interest. The decision criteria are obtained
by analyzing the statistical distribution of lumen region features based on
prior knowledge in the test dataset. In the atherosclerotic blood vessel
study, the following features are regarded as critical for lumen
identification:
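(2) Region average intensity:
\[ C_{\text{Intensity}} = \frac{1}{N}\sum_{n=1}^{N} I_n \tag{8.37} \]
(3) Region circularity:
\[ C_{\text{Circular}} = \frac{4\pi\,C_{\text{Area}}}{L_{\text{Contour}}}, \tag{8.38} \]
where $L_{\text{Contour}}$ is the length of the region contour;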
The basic structure of the decision tree is shown in Fig. 8.11. For the criteria
CArea , CIntensity , and CCircularity , statistical analysis of the training MR image data
indicated that two standard deviations is a satisfactory scale to make
sure that most of the variation range is covered. CLocation reflects the
maximum distance by which the lumen center in the current slice may lie away from
the center of the lumen in the previous slice. To reduce the computation in the
identification process, the most distinctive feature of the target region is always
analyzed first so as to decrease the number of candidates in the subsequent criteria
checks. In our study, the sequence is arranged as CArea , CIntensity , CLocation , and
CCircularity .
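The sequential checking described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the dictionary keys, the `Range` helper, the acceptance rule (mean plus or minus two standard deviations), and every numeric value are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Range:
    """Acceptance range built from training statistics (mean +/- k * SD)."""
    mean: float
    sd: float
    k: float = 2.0  # two standard deviations, as stated in the text

    def accepts(self, value: float) -> bool:
        return abs(value - self.mean) <= self.k * self.sd


def identify_lumen(regions, ranges, prev_center, max_shift):
    """Keep the regions that pass every criterion, checked in the order
    area -> intensity -> location -> circularity."""
    accepted = []
    for region in regions:
        if not ranges["area"].accepts(region["area"]):
            continue
        if not ranges["intensity"].accepts(region["intensity"]):
            continue
        dx = region["center"][0] - prev_center[0]
        dy = region["center"][1] - prev_center[1]
        if (dx * dx + dy * dy) ** 0.5 > max_shift:
            continue
        if not ranges["circularity"].accepts(region["circularity"]):
            continue
        accepted.append(region)
    return accepted


# Hypothetical training statistics and candidate regions.
ranges = {
    "area": Range(mean=120.0, sd=20.0),
    "intensity": Range(mean=40.0, sd=10.0),
    "circularity": Range(mean=0.85, sd=0.05),
}
regions = [
    {"id": 1, "area": 115.0, "intensity": 45.0, "center": (64.0, 60.0), "circularity": 0.88},
    {"id": 2, "area": 400.0, "intensity": 42.0, "center": (63.0, 61.0), "circularity": 0.90},
    {"id": 3, "area": 125.0, "intensity": 38.0, "center": (10.0, 10.0), "circularity": 0.90},
]
lumen = identify_lumen(regions, ranges, prev_center=(62.0, 61.0), max_shift=8.0)
lumen_ids = [region["id"] for region in lumen]
```

Region 2 is rejected at the area check and region 3 at the location check, so the later criteria are never evaluated for them. This is the saving that motivates testing the most distinctive feature first.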
From the above identification procedure, it can be seen that the accuracy of
the low-level region segmentation plays a very important role in the topology
detection. This accuracy can be achieved by using the QHCF algorithm. For the lumen
[Figure 8.11 diagram nodes, in order: CIntensity, CArea, CLocation, CCircularity, each with No/Yes branches; outcomes: REJECTED, ACCEPTED.]
Figure 8.11: Diagram of the decision tree structure for lumen identification in
MR image sequences.
For lumen contour tracking, prior use of the MPA model has provided more
satisfactory results than MRF segmentation alone. Figure 8.12 is an example of
the lumen segmentation procedure. A typical carotid artery MR image is shown
in Fig. 8.12(a). Because of noise and artifacts during the imaging process, the
intensity of the lumen area is not uniform and contains isolated bright spots. The
QHCF algorithm was applied to this image with the result shown in Fig. 8.12(c)
(note that lumen segmentation was not affected by bright spots in the lumen
or background noise). The contour of the region of interest is shown in Fig. 8.12(d).
The left part of the contour clearly contains some sharp corners, and the
bottom part is also not very smooth. Figure 8.12(e) shows the control points,
and the final MPA fine-tuned contour is shown in Fig. 8.12(f). Figure 8.13
is an example of blood vessel tracking in an MR image sequence, with lumen
bifurcation included.
Even though the experimental results demonstrate good performance of the
proposed framework, analysis of the cases with erroneous lumen identification
shows that the decision tree needs to be further enhanced to overcome
the disturbance caused by random imaging artifacts in the lumen region. Moreover,
additional criteria and optimal decision strategies should also be considered in
future research.
8.3.5 Conclusion
In this section, we discussed a framework, the MRF-based active contour
model, for precise image segmentation and automatic contour tracking of image
sequence data. It combines some of the most attractive features of random field
Figure 8.12: An example of the MRF-based active contour framework. (a) The
original T1W MR image of the carotid artery lumen. (b) Edge map from the Canny
edge detector. (c) Segmentation result of QHCF algorithm with Trc = 10, β1 =
400, β2 = 1000, Tmin = 20. (d) Lumen contour based on the QHCF algorithm.
(e) Six selected control points. (f) Fine-tuned contour by applying MPA.
segmentation and ACM models. In addition, it is also very flexible and can easily
include prior knowledge from various applications. An example of blood ves-
sel tracking and lumen segmentation in magnetic resonance image sequences
is studied and the experimental results have demonstrated very satisfactory
performance.
8.4.1 Introduction
It is well known that an object viewed through multiple channels will generally
convey more information than a single-channel observation [78–80]. A very
successful application is remote sensing. Various sensors are designed to capture
signals reflected from the surface of the earth in different bands. Since different
objects on the earth have different spectral profiles, more details are usually
detected by integrating the multiband information than by viewing a single
band. Similarly, in carotid plaque studies, different imaging contrast weightings are
often employed to detect the composition within the blood vessel wall [81], and these
multiple contrast weighting (MCW) techniques play an increasingly important
role in finding the different tissue types in the studied subject and generally
provide a more comprehensive view.
To achieve the goal of image segmentation and also to take advantage of the
information in multichannel data, a multidimensional MRF (mMRF)-based
solution is discussed first in this section; it integrates the information
from all the channels with a dynamic weighting. However, because of
the prohibitive amount of computation involved in the optimization process and
the intrinsic interspectral independence requirements of the mMRF model, this
technique is unsuitable for interactive MR image analysis
applications. As a compromise, a robust cluster-based segmentation algorithm with
faster segmentation speed is then put forward in our study.
I1 I2 Id
y1 y2 yd
are observed in d channels as illustrated in Fig. 8.14 and the label matrix of
segmentation result is X. Then a posterior probability can be expressed in Eq.
(8.39).
Another way to express the input data is to view them as a d-dimensional vector
Ys = [ys,1 , ys,2 , . . . , ys,d ]T , where the value of the ith dimension, ys,i , represents the
intensity at site s in image Ii . The following relation holds:
where S is the total number of pixels in each image. Based on Bayes’ theorem,
the posterior probability has the form
$N_{s,1}, N_{s,2}, \ldots, N_{s,d}$
are the pixels in the neighborhood. The prior probability of the whole image is
\[ p(X) = \frac{1}{Z}\exp\Big\{-\sum_{s} V_s(x)\Big\} = \frac{1}{Z}\exp\Big\{-\sum_{s}\big[V_s^{N}(s) + V_s^{E}(s)\big]\Big\}. \tag{8.43} \]
Similar to the energy definition of monochrome image, the clique energy Vs (x)
for mMRF model at location s is a summation of spatial constraint energy VsN (x)
and edge constraint energy VsE (s) in all channels. Their definitions are given in
the following equations, respectively:
\[ V_s^{N}(s) = \sum_{i=1}^{d} V_s^{N,i}(s), \tag{8.44} \]
\[ V_s^{E}(s) = \sum_{i=1}^{d} V_s^{E,i}(s), \tag{8.45} \]
where VsN,i (s) and VsE,i (s) represent the components in the ith dimension.
Compared to the energy function in traditional mMRF models [46, 82, 83], an
additional edge constraint VsE is added, which preserves the details of each
dimension in the probability description and provides a more accurate
description of the energy function. This makes the label updating process
more sensitive at region boundaries.
Based on the above definition, the a posteriori probability is
\[ p(X \mid Y) \propto p(Y \mid X)\,p(X) \propto \exp\Big\{-\sum_{s}\sum_{i=1}^{d}\frac{1}{2\sigma_{s,i}^{2}}\,(y_{s,i} - \mu_{s,i})^{2} - \sum_{s}\sum_{i=1}^{d}\big(V_s^{N,i}(s) + V_s^{E,i}(s)\big)\Big\}, \tag{8.46} \]
[Flowchart labels: I1, I2, ..., Id; QT (one per input image); Initial Region Merging; Energy Function E Update; Confidence Update; QHCF; Segmentation Result.]
1. Complexity factor (CF): It measures the amount of detail that each channel
provides at a certain location. In the region surrounding each location,
we assume that the complexity is proportional to the number of edges.
The more edge points are detected, the more detail the channel can
provide. Since the Canny edge detector [15] has been used successfully in the
energy calculation, it is used in our implementation to generate the edge
map. Depending on the requirements of segmentation performance, two ways
are proposed to evaluate the complexity factor.
(i) Local CF: The number of edge points within a local neighboring
region in each channel.
(ii) Global CF: The number of edge points in the whole image in each
channel (equivalent to the local CF with the neighboring range set to the
whole image).
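Both complexity factors reduce to counting edge points, which can be sketched as below. This is an illustrative sketch only: it assumes a binary edge map is already available for each channel (for example from a Canny detector), uses a square window of half-width `half_width` as the "local neighboring region", and clips windows at the image border. The function names and the toy edge maps are hypothetical.

```python
import numpy as np


def global_cf(edge_map):
    """Global CF: number of edge points in the whole image."""
    return int(np.count_nonzero(edge_map))


def local_cf(edge_map, half_width):
    """Local CF: number of edge points in the square window of side
    2 * half_width + 1 centred on each pixel (clipped at the border)."""
    e = (np.asarray(edge_map) != 0).astype(np.int64)
    h, w = e.shape
    # Summed-area table with a leading row and column of zeros.
    sat = np.zeros((h + 1, w + 1), dtype=np.int64)
    sat[1:, 1:] = e.cumsum(axis=0).cumsum(axis=1)
    rows = np.arange(h)
    cols = np.arange(w)
    r0 = np.clip(rows - half_width, 0, h)
    r1 = np.clip(rows + half_width + 1, 0, h)
    c0 = np.clip(cols - half_width, 0, w)
    c1 = np.clip(cols + half_width + 1, 0, w)
    return sat[r1][:, c1] - sat[r0][:, c1] - sat[r1][:, c0] + sat[r0][:, c0]


# Hypothetical binary edge maps for d = 2 channels (1 marks an edge point).
channel_a = np.zeros((5, 5), dtype=int)
channel_a[2, 2] = channel_a[2, 3] = 1
channel_b = np.zeros((5, 5), dtype=int)
channel_b[0, 0] = 1

local_a = local_cf(channel_a, half_width=1)
local_b = local_cf(channel_b, half_width=1)
```

With a window large enough to cover the image, `local_cf` returns the global count at every pixel, matching the equivalence noted in item (ii).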
and the clique energy at each location, Vs (x) in Eq. (8.9), can be computed
as
\[ V_s(x) = \sum_{i=1}^{d} WF_i\,\big[V_s^{N,i}(x) + V_s^{E,i}(x)\big]. \tag{8.51} \]
Dell Precision 410 Intel PIII 600 MHz 256 MB Windows NT 4.0
(i) Low processing speed: From the experimental results in Table 8.4, the
time used for multiple contrast weighting MR images is much longer than
that for single contrast weighting ones, which is intolerable for practical
interactive MR image analysis systems.
8.4.3.1 Introduction
In the first step, the cluster centers or the expression of the density function are
usually estimated so that the distribution of the dataset can be clearly described.
The second step assigns the elements of the dataset to clusters based on
certain criteria such as greatest similarity, shortest distance, maximum likelihood
(ML), or K-nearest neighbor.
region with highest density. The MVE searching methods normally use randomly
selected elements in V as the ellipsoids’ initial centers. After inflating each of
these ellipsoids until h elements are covered, the one with minimum volume is
selected as a searched mode. The elements associated with
this mode are then removed from dataset V and the search is repeated until all the
cluster centers are found. Although many approaches based on this multivariate
location estimator have been successful in various applications [88], experimental
results indicate that the performance of MVE decreases when the number of modes is
greater than 10 [86]. The reason for this phenomenon is that the density definition
in MVE presumes a multivariate normal distribution. Therefore, in the
case of multiple modes, where the data do not follow a mixture of Gaussian
distributions, the MVE model will not be able to give an accurate description.
Another type of cluster searching technique is nonparametric estimation.
Its advantage is that it requires no prior knowledge of the form of
the density function in the search process, so it can be applied to datasets with
arbitrary distributions. There are two main categories of methods for nonparametric
density estimation: Parzen window and k-nearest neighbors. For the Parzen
window approach, the kernel type needs to be given before it is applied. In the
k-nearest neighbor method, the number of neighbors must be assigned in advance.
Therefore, both require additional prior information. In addition, they are
hard to initialize optimally.
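The two nonparametric estimators can be sketched as follows to make their required inputs concrete: the Parzen estimate needs a kernel and a bandwidth, and the k-nearest neighbor estimate needs k. This is an illustrative sketch, not code from the chapter; the Gaussian kernel, the bandwidth `h`, and the function names are assumptions.

```python
import math

import numpy as np


def parzen_density(x, data, h):
    """Parzen-window estimate at x with an isotropic Gaussian kernel of
    bandwidth h. `data` has shape (n, d)."""
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    sq_dist = np.sum((data - np.asarray(x, dtype=float)) ** 2, axis=1)
    norm = (2.0 * math.pi * h * h) ** (d / 2.0)
    return float(np.mean(np.exp(-sq_dist / (2.0 * h * h))) / norm)


def knn_density(x, data, k):
    """k-nearest neighbor estimate at x: k / (n * volume of the smallest
    sphere around x that contains k samples)."""
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    dist = np.sort(np.linalg.norm(data - np.asarray(x, dtype=float), axis=1))
    r_k = dist[k - 1]
    unit_ball = math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)
    return float(k / (n * unit_ball * r_k ** d))


# Hypothetical 2-D samples drawn around the origin.
rng = np.random.default_rng(0)
samples = rng.standard_normal((500, 2))
dense = parzen_density((0.0, 0.0), samples, h=0.5)
sparse = parzen_density((3.0, 3.0), samples, h=0.5)
```

Both functions return a larger value near the origin than in the tail for these samples, but the numbers depend directly on the chosen `h` and `k`, which is the prior information the text refers to.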
To overcome these problems and provide more robust cluster estimation,
a framework was proposed by Dorin [83]. It is based on the mean-shift algorithm,
a nonparametric procedure for estimating density gradients. Although this
method claims to avoid the drawbacks of most existing approaches, it still has
the following weaknesses: (1) the initialization scheme cannot guarantee that
all the cluster centers are considered in the search process, because of
its random tessellation selection, and (2) because of the static size of the search
sphere, convergence may be slowed and the accuracy of the cluster center
estimate may be affected.
In this study, the data to be processed are MCW MR images, in which the
distribution of the data can vary arbitrarily between subjects. Since it is impossible to
obtain the form of the underlying density function, a nonparametric technique
has to be employed for multivariate location estimation. To overcome the
problems in Dorin’s method, we first apply a preestimation of the cluster distribution to
guarantee that all the typical cluster centers are considered in the initial center
8.4.3.1.3 Mean Shift Density Estimator. The mean shift algorithm was
proposed by Fukunaga and Hostetler in 1975 as a “very intuitive” [89] estimator
for data density. In 1995, Cheng generalized this algorithm and conducted a more
rigorous study [90]. In this section, we will review its estimation of the density
gradient in a unimodal situation.
Assume f (v) is the probability density function of a d-dimensional variable
v. Suppose that a sphere, Sv , with radius r is established, centered at vector v.
For any given vector y within this sphere, the expected distance to the sphere center
v is
\[ E[(y - v) \mid S_v] = \int_{S_v} (y - v)\, f(y \mid S_v)\, dy = \int_{S_v} (y - v)\,\frac{f(y)}{f(y \in S_v)}\, dy \tag{8.52} \]
With Taylor expansion, f (v) can be expressed as
\[ f(y \in S_v) = f(v)\,V_{S_v}, \tag{8.54} \]
where $V_{S_v} = c \cdot r^{D}$ represents the sphere volume. Based on Eqs. (8.53) and (8.54),
Eq. (8.52) becomes [83, 89]:
\[ E[(y - v) \mid S_v] = \int_{S_v} \frac{(y - v)(y - v)^{T}\,\nabla f(v)}{V_{S_v}\, f(v)}\, dy = \frac{r^{2}}{D + 2}\,\frac{\nabla f(v)}{f(v)} \tag{8.55} \]
Expanding the left-hand side of Eq. (8.55), the expected center of the sphere,
E[v | v ∈ Sv ], and v have the following relation:
\[ E[v \mid v \in S_v] - v = \frac{r^{2}}{D + 2}\,\frac{\nabla f(v)}{f(v)}. \tag{8.56} \]
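In practice the left-hand side of Eq. (8.56) is estimated from samples as the mean of the data points inside the sphere minus the sphere center. The sketch below compares that sample estimate with the right-hand side for a case where the density is known. It is an illustration under stated assumptions: the data are drawn from a standard normal density, for which ∇f(v)/f(v) = −v, and the function name and all numeric values are hypothetical.

```python
import numpy as np


def mean_shift_vector(data, v, r):
    """Sample estimate of E[y | y in S_v] - v for the sphere of radius r
    centred at v. Returns a zero vector if the sphere is empty."""
    data = np.asarray(data, dtype=float)
    v = np.asarray(v, dtype=float)
    inside = np.linalg.norm(data - v, axis=1) <= r
    if not np.any(inside):
        return np.zeros_like(v)
    return data[inside].mean(axis=0) - v


# Hypothetical check against Eq. (8.56) with D = 2 and a standard normal
# density, for which grad f(v) / f(v) = -v.
rng = np.random.default_rng(0)
samples = rng.standard_normal((200_000, 2))
v = np.array([1.0, 0.0])
r = 0.5
observed = mean_shift_vector(samples, v, r)
predicted = (r ** 2) / (2 + 2) * (-v)
```

The observed vector points toward the mode at the origin and is close to the first-order prediction; the two are not expected to agree exactly, because Eq. (8.56) comes from a truncated Taylor expansion and the sample is finite.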
For a given sphere, Eq. (8.56) shows that the difference vector between the
local estimate of the cluster center and the sphere center is proportional to ∇ f (v),
the gradient of the density function at v. It is also inversely proportional to f (v).
When the search approaches the mode, f (v) is generally large and ∇ f (v) is small
because of the slow increase, so a small mean-shift vector is applied. Compared
to traditional density gradient searching techniques [86], in which only ∇ f (v)
is considered, the dynamically adjusted step size used in this method is more
In the density estimation process, the mean-shift vector is very important to the
search speed and the accuracy of the result. In previous methods [83], the size of the
local estimation sphere is always fixed. This may not work well in the following
two situations: (1) at a location far from the mode, where the density
distribution is relatively uniform, the mean-shift vector may be misled if the
sphere is not large enough; and (2) when the search is close
to the mode, the mean-shift vector needs to be sensitive enough to catch the
local change. Therefore, if the sphere volume is too large, the local information
cannot play the determining role in the vector calculation, which may affect
the accuracy of the center search.
To solve these problems, we propose a dynamic search range with q levels:
r1 , . . . , rq , where ri < ri+1 . The values of r1 and rq come from prior analysis
of MCW MR image data. The search starts with the largest radius,
rq . If the magnitude of the mean-shift vector exceeds the stopping threshold
Tstop , the search moves to the next location with the same sphere radius as at the
previous position. If the magnitude is lower than Tstop , the next smaller
radius is used to calculate the mean-shift vector again. Once the mode is found, a
small perturbation is applied [83] and the procedure is repeated to avoid a local
maximum.
Given the initial center of a sphere at location v and starting with sphere
radius index i = q, the proposed dynamic mean-shift algorithm is implemented
in the following procedures:
3. If perturbation has not been applied, add a small vector vpert to the con-
verged result and repeat steps 1 and 2, where |vpert | = |r1 | and its direction
is randomly selected.
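The dynamic search described in this section can be sketched as below. It is a minimal sketch built from the prose description only, since the first two steps of the procedure are not reproduced here. Assumptions: the mean-shift vector is the mean of the samples inside the sphere minus the center, the search restarts from the largest radius after the single perturbation, and `max_iter`, the radii, and the test data are hypothetical.

```python
import numpy as np


def mean_shift_vector(data, v, r):
    """Mean of the samples inside the sphere of radius r at v, minus v."""
    inside = np.linalg.norm(data - v, axis=1) <= r
    if not np.any(inside):
        return np.zeros_like(v)
    return data[inside].mean(axis=0) - v


def dynamic_mean_shift(data, start, radii, t_stop, rng, max_iter=1000):
    """Mode search with radius levels radii[0] < ... < radii[-1]."""
    data = np.asarray(data, dtype=float)

    def converge(v):
        i = len(radii) - 1                # start from the largest radius r_q
        for _ in range(max_iter):
            shift = mean_shift_vector(data, v, radii[i])
            if np.linalg.norm(shift) > t_stop:
                v = v + shift             # move, keeping the same radius
            elif i > 0:
                i -= 1                    # switch to the next smaller radius
            else:
                break                     # converged at the smallest radius
        return v

    v = converge(np.asarray(start, dtype=float))
    # One perturbation of length r_1 in a random direction, then search again.
    direction = rng.standard_normal(v.shape)
    direction /= np.linalg.norm(direction)
    return converge(v + radii[0] * direction)


# Hypothetical data: one cluster centred at (5, 5).
rng = np.random.default_rng(0)
data = rng.normal(loc=(5.0, 5.0), scale=0.5, size=(2000, 2))
mode = dynamic_mean_shift(data, start=(3.5, 4.0), radii=[0.5, 1.0, 2.0, 3.0],
                          t_stop=0.01, rng=rng)
```

The large initial radius lets the first step jump most of the way to the cluster, while the small final radius keeps the last steps sensitive to local structure, which is the trade-off the two situations above describe.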
2. For each point in the center set, apply the dynamic mean-shift search de-
scribed in section 8.4.3.2 to find the candidate cluster centers in the d-
dimensional data space.
3. Partition the center set elements into subsets in which the distances be-
tween points are within Tsub . Merging each subset by calculating the mean
of elements in it, we arrive at a new center set Y = {yi , i = 1, . . . , p}.
4. To validate the cluster centers, the constraint on the valley between every
two elements in the center set, yp and yq , is applied [83]. Each point yr ,
taken at a fixed interval along the line linking yp and yq , is checked
and the corresponding density f (yr ) is estimated with the Epanechnikov
kernel [91]:
\[ K_E(y) = \begin{cases} \frac{1}{2}\,c_d^{-1}\,(d + 2)\,(1 - y^{T} y) & \text{if } y^{T} y < 1 \\ 0 & \text{otherwise} \end{cases}. \tag{8.57} \]
Whenever
\[ \frac{\min[f(y_p), f(y_q)]}{f(y_r)} \ge T_{\text{valley}}, \tag{8.58} \]
a valley is detected. If no valley is detected between yp and yq , the one
with lower density will be removed.
[Plot: Time (Seconds), 0 to 600, versus Image Resolution (128 × 128, 256 × 256, 512 × 512) for SCW and MCW.]
5. Using the elements in the center set as the cluster centers, the data in
the d-dimensional space can be decomposed with the k-nearest neighbor
approach.
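Steps 4 and 5 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the density is a kernel estimate with bandwidth `h`, c_d is taken as the volume of the unit d-dimensional sphere, Eq. (8.58) is tested as `f_min >= t_valley * f_r` to avoid dividing by zero, and step 5 is simplified to assigning each sample to its nearest validated center. The bandwidth, threshold, and data are hypothetical.

```python
import math

import numpy as np


def epanechnikov_density(y, data, h):
    """Kernel density estimate at y using the kernel of Eq. (8.57)."""
    n, d = data.shape
    c_d = math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)
    u2 = np.sum(((data - y) / h) ** 2, axis=1)
    kernel = np.where(u2 < 1.0, 0.5 * (d + 2) * (1.0 - u2) / c_d, 0.0)
    return float(kernel.sum() / (n * h ** d))


def has_valley(yp, yq, data, h, t_valley, steps=20):
    """Test Eq. (8.58) at evenly spaced points on the segment yp-yq."""
    f_min = min(epanechnikov_density(yp, data, h),
                epanechnikov_density(yq, data, h))
    for t in np.linspace(0.0, 1.0, steps + 1)[1:-1]:
        yr = (1.0 - t) * yp + t * yq
        if f_min >= t_valley * epanechnikov_density(yr, data, h):
            return True
    return False


def validate_centers(centers, data, h, t_valley):
    """Drop the lower-density center of every pair without a valley."""
    centers = [np.asarray(c, dtype=float) for c in centers]
    density = [epanechnikov_density(c, data, h) for c in centers]
    keep = [True] * len(centers)
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if keep[i] and keep[j] and not has_valley(
                    centers[i], centers[j], data, h, t_valley):
                keep[i if density[i] < density[j] else j] = False
    return [c for c, k in zip(centers, keep) if k]


def assign_to_centers(data, centers):
    """Label each sample with the index of its nearest center."""
    diff = data[:, None, :] - np.asarray(centers)[None, :, :]
    return np.linalg.norm(diff, axis=2).argmin(axis=1)


# Hypothetical data: two well-separated clusters and three candidate centers.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal((0.0, 0.0), 0.5, (300, 2)),
                  rng.normal((6.0, 0.0), 0.5, (300, 2))])
candidates = [(0.0, 0.0), (0.4, 0.1), (6.0, 0.0)]
kept = validate_centers(candidates, data, h=1.0, t_valley=2.0)
labels = assign_to_centers(data, kept)
```

The two candidates inside the first cluster have no valley between them, so only one of them survives, while the candidate at (6, 0) is separated from both by a region of near-zero density and is kept.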
(8.59)
Figure 8.19: Processing flow chart of the proposed MCW MR image segmenta-
tion algorithm.
in which Li and µLi are assumed to be the label and mean of the neighboring region.
UN and UE are the clique energy functions [92]. A MAP criterion is applied to
find the most reliable neighboring region to merge into.
[Figure 8.20 panels: (a) T1W, (b); (c) PDW, (d); (e) TOF, (f); (g). Arrows labeled 1, 2, and 3 appear in the panels.]
Figure 8.20: An example of MCW MR image segmentation. (a), (c), (e) The
original contrast weighting images at the same location. (b), (d), (f) The
corresponding single contrast weighting segmentations. (g) (color) Result with the
proposed multiple contrast weighting segmentation algorithm.
respectively, and (b), (d), and (f) are the corresponding segmentation results
using single contrast weighting only. Three distinctive differences are labeled in
the images. Arrow 1 points to a region poorly segmented in the single T1W and
TOF images. Arrow 2 shows a region that loses all detail in the TOF segmentation.
Arrow 3 points to an area that shows no detail in the PDW segmentation.
However, the MCW-segmented image (Fig. 8.20(g)) retrieves and reveals these
details by considering all contrast weightings in the segmentation process. Of
20 cases analyzed, 17 showed distinct differences at more than two
locations between the MCW and SCW methods. Lower image quality (blurring,
poor contrast, and imaging artifacts) was responsible for the poor segmentation
result in the three remaining cases.
The second category of experiments was used to verify the partitioning of
typical tissues of interest in the atherosclerotic plaque. Again we applied the
above algorithm to ex vivo MR images of endarterectomy specimens. Using
histology as the gold standard, we compared each segmented MR image to the
best corresponding histology section. The carotid bifurcation was used as a
landmark for location registration. On sections distant from the bifurcation, lumen
size and shape and distinctive regions of calcification were used for matching.
A coordinate system of eight segments was generated and applied to the
matched histology and MR images. Figure 8.21 contains a pair of sample images.
Figure 8.21: An example of MCW MR images verification (CA: calcium, LM:
loose matrix, NEC: necrotic debris). (a) (color) Segmentation of MCW MR im-
ages. (b) (color) Outlined corresponding histology section.
Calcium 69 2 2.8
Calcium (speckled) 18 2 10.0
Necrotic core 16 2 11.1
Necrotic core (mixed) 41 3 6.8
Foam cells 18 2 10.0
Fibrous plaque (dense) 165 8 4.6
Figure 8.21(a) is an MCW-segmented MR image overlaid with the eight-sector
coordinate system. Figure 8.21(b) is the corresponding histology section stained
with Mallory’s Trichrome. Tissues of interest were outlined and labeled prior to
matching.
In a preliminary study, 22 matched MRI-histology sections from eight patients
were analyzed, with the results shown in Table 8.5. Typical tissues of the
atherosclerotic plaque such as calcification, fibrous matrix, and mixed necrotic cores
appear to have good agreement with histology. For the improperly matched
cases, besides the inaccuracy of the segmentation algorithm, the following factors may
also have affected the comparison results: (i) low image quality
(noise involved in the imaging process) and (ii) deformation of the plaques during
the preparation of histology sections, including shrinkage. These are beyond the
scope of our research. However, further refinement of our technique may allow
for better detection of the less discrete tissues such as loose matrix, speckled
calcification, and intraplaque hemorrhage.
8.4.4 Conclusion
In this section, we investigated segmentation algorithms based on the
multidimensional MRF model and on clustering, and introduced an effective
approach for MCW MR image segmentation. This technique is based on
the mean-shift density estimation algorithm and was carefully designed to overcome
the drawbacks of other existing methods. Experimental comparisons with
histology sections have demonstrated its successful performance.
For the processing speed of the proposed DMC-based approach, the same
50 multiple contrast weighting MR images with different image sizes were also
used for testing. The comparison of the average segmentation times for the DMC and
mMRF approaches is shown in Table 8.6, which indicates that DMC uses much
less time than mMRF.
In the validation experiments with histology sections, examining the cases
that showed little agreement with histology also shows
that poor image quality can reduce the accuracy of the proposed method. One
hypothesis for this problem is that poor separation of the data in the vector
space V makes the segmentation method unable to distinguish the different
clusters.
Detection of fibrous cap status is crucial for understanding the disease status
and prognosis of atherosclerosis. At the same time, fibrous cap segmentation
is difficult because of resolution issues, registration issues, and the presence of
artifacts. Hence a different approach is required to implement semiautomatic
detection of fibrous cap status.
cap and subsequent thinning are prior events [94]. Thin FCs have been shown
to be associated with symptomatic carotid vascular disease [95]. Studies of
endarterectomy or postmortem histology identify such associations retrospectively,
but methods of in vivo observation of FC status would enable prospective
studies and lead to a better understanding of the pathogenesis. High-resolution
MR imaging has shown promise in this regard. T2 [96], 3D TOF [7, 97, 98], and
gadolinium-enhanced MR [99] have been used for FC imaging. Examination of
multicontrast MRI with black blood (BB) sequences (T1, T2, PD) alongside 3D
TOF has been shown to identify three different cap states: thick, thin, and rup-
tured [97]. A thick FC is considered to be stable while thin and ruptured caps
are indicative of vulnerability. The presence of ruptured caps in MRI is highly
associated with recent TIA [7]. MRI has shown a high sensitivity and specificity
in identifying the three classes of FCs [98].
Figure 8.22: Thick cap—presence of dark rim on 3D TOF due to a thick cap
(arrow). The site of apparent rupture on histology (color) is artifactual and
caused by surgical incision (arrow).
Figure 8.23: Thin cap—absence of dark rim on 3D TOF due to a thin cap (ar-
row). The apparent focal contour abnormality on T1 (arrow) is due to calcium
and not a real contour abnormality.
Figure 8.24: Ruptured cap—absence of dark rim on 3D TOF and focal con-
tour abnormality on T1 due to a ruptured cap (arrows). The site of rupture on
histology (color) is indicated by an arrow.
2. Focal contour abnormality best observed in black blood images and im-
plying a rupture or erosion of the fibrous cap.
The above two characteristics are used by the algorithm since they are the
primary distinguishing characteristics. Absence of a dark rim is taken to indicate
a thin or ruptured cap. Ruptured caps can be differentiated from thin caps by the
presence of a focal contour abnormality. Other factors such as the presence of
calcium near the lumen surface [98], flow abnormalities [7, 98], and intraplaque
hemorrhage [97] may affect the correspondence between FC status and the
dark rim but are not currently taken into account by the algorithm. To perform
the FC evaluation, matched 3D TOF images and one black blood weighting are
used by the algorithm to identify plaque status (Figure 8.25). Parameters for
the dark rim are measured from the TOF image and those for focal contour
[Figure 8.26 panels: TOF (Gradient: gradient averaged three pixels wide along the normal) and BB (Curvature: local curvature).]
Figure 8.26: Feature vector calculation: Gradient along the normal to the TOF
lumen contour and ratio of local curvature to global curvature on the registered
BB lumen contour.
\[ c = \frac{\dot{x}\,\ddot{y} - \dot{y}\,\ddot{x}}{(\dot{x}^{2} + \dot{y}^{2})^{3/2}} \tag{8.60} \]
is calculated for a small segment of the lumen and its ratio to the average curva-
ture for the whole lumen is assigned to the point in the center of that segment. In
order to obtain gradient and curvature parameters for the same point, the TOF
contour and BB contour are brought into correspondence by registering the
centroids of their convex hulls. With this definition any sharp change in curva-
ture is detected. It becomes significant only when associated with the absence
of a dark rim. However, it has to be noted that this could lead to some false
classifications especially around the bifurcation.
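The curvature feature of Eq. (8.60) can be sketched for a sampled closed contour as follows. This is an illustration only: it assumes the contour is given as ordered (x, y) samples, uses periodic central differences over a three-point stencil as the "small segment", and takes the absolute curvature before forming the ratio to the contour average. The function names and the test contours are hypothetical.

```python
import numpy as np


def contour_curvature(x, y):
    """Signed curvature of Eq. (8.60) at each sample of a closed contour,
    using periodic central differences with respect to the sample index."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = (np.roll(x, -1) - np.roll(x, 1)) / 2.0
    dy = (np.roll(y, -1) - np.roll(y, 1)) / 2.0
    ddx = np.roll(x, -1) - 2.0 * x + np.roll(x, 1)
    ddy = np.roll(y, -1) - 2.0 * y + np.roll(y, 1)
    return (dx * ddy - dy * ddx) / (dx ** 2 + dy ** 2) ** 1.5


def curvature_ratio(x, y):
    """Ratio of local absolute curvature to the contour average."""
    c = np.abs(contour_curvature(x, y))
    return c / c.mean()


theta = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)

# A circle of radius 10 should have curvature close to 1 / 10 everywhere.
circle_c = contour_curvature(10.0 * np.cos(theta), 10.0 * np.sin(theta))

# Hypothetical contour with one sharp focal bump at theta = pi (sample 200).
radius = 10.0 + 2.0 * np.exp(-((theta - np.pi) / 0.1) ** 2)
ratio = curvature_ratio(radius * np.cos(theta), radius * np.sin(theta))
peak_index = int(np.argmax(ratio))
```

On the circle the ratio is 1 everywhere, whereas on the second contour it peaks sharply at the bump, which is the kind of focal abnormality the feature is meant to flag.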
8.5.7 Classification
The parameters for dark rim and focal abnormality were measured from a set
of images identified by radiologists and confirmed by histology. Two sets each
were used for thick, thin, and ruptured caps. Several measurements along the
contour were thus available for each set. The mean and covariance of each
parameter for thick, thin, and ruptured caps were then calculated. These templates
were used to classify a candidate point by its feature distance from
the template for each of the thick, thin, and ruptured classes. The Mahalanobis
distance of the dark rim parameter was used to differentiate thick caps from the other
two classes. The thin and ruptured classes were then differentiated among the
remaining points based on the curvature parameter, again using the Mahalanobis
distance metric,
\[ r^{2} = (x - m)^{T}\, C^{-1}\, (x - m) \tag{8.61} \]
where m and C are the mean and covariance matrix, respectively. This decision is
based on the observation that neither thin nor ruptured caps have a dark rim,
but ruptured caps can be differentiated by the presence of a focal contour
abnormality. Figure 8.27 shows an example of the algorithm’s classification compared
to ground truth by histology.
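The two-stage decision can be sketched as below. This is an illustration under stated assumptions rather than the published classifier: each stage assigns the nearest template by Mahalanobis distance, the features are treated as scalars with a variance in place of a full covariance matrix, and the template values are invented for the example.

```python
import numpy as np


def mahalanobis_sq(x, m, C):
    """Squared Mahalanobis distance of Eq. (8.61); scalars are treated as
    one-dimensional vectors with a 1 x 1 covariance."""
    diff = np.atleast_1d(np.asarray(x, dtype=float) - np.asarray(m, dtype=float))
    C = np.atleast_2d(np.asarray(C, dtype=float))
    return float(diff @ np.linalg.solve(C, diff))


def classify_cap(rim, curvature, templates):
    """Stage 1 separates thick caps using the dark rim parameter;
    stage 2 separates thin from ruptured using the curvature parameter."""
    rim_dist = {name: mahalanobis_sq(rim, m, C)
                for name, (m, C) in templates["rim"].items()}
    if min(rim_dist, key=rim_dist.get) == "thick":
        return "thick"
    curv_dist = {name: mahalanobis_sq(curvature, m, C)
                 for name, (m, C) in templates["curvature"].items()}
    return min(curv_dist, key=curv_dist.get)


# Hypothetical templates: (mean, variance) per class and per feature.
templates = {
    "rim": {"thick": (10.0, 4.0), "thin": (2.0, 1.0), "ruptured": (2.5, 1.0)},
    "curvature": {"thin": (1.0, 0.25), "ruptured": (4.0, 1.0)},
}
results = [classify_cap(9.0, 1.0, templates),
           classify_cap(2.0, 1.2, templates),
           classify_cap(2.4, 3.8, templates)]
```

Using `np.linalg.solve` instead of forming the inverse of C explicitly is a common numerical choice and gives the same distance.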
8.5.8 Postprocessing
The classified pixels are then postprocessed to remove isolated classifications.
These pixels are merged into the surrounding classifications by a morphological
opening operation (an element size of 5 was used). This makes the classification
similar to what is outlined by a human operator, so that classification by the
algorithm can be compared to the ground truth outlined by a pathologist.
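The effect of the opening step can be sketched for one class mask as follows. This is a sketch under assumptions that the text does not state: the classification is treated as a one-dimensional sequence of points along the closed lumen contour, the structuring element is a flat window of 5 points, and the contour wraps around. The function name and the example mask are hypothetical.

```python
import numpy as np


def binary_opening_1d(mask, size=5):
    """Opening (erosion followed by dilation) of a circular 1-D binary
    mask with a flat structuring element of the given odd size."""
    mask = np.asarray(mask, dtype=bool)
    half = size // 2
    shifts = range(-half, half + 1)
    eroded = np.all([np.roll(mask, s) for s in shifts], axis=0)
    return np.any([np.roll(eroded, s) for s in shifts], axis=0)


# Hypothetical mask of points classified as "thin" along a 30-point contour:
# an isolated run of 2 points and a genuine run of 10 points.
thin_mask = np.zeros(30, dtype=bool)
thin_mask[3:5] = True
thin_mask[10:20] = True
opened = binary_opening_1d(thin_mask, size=5)
```

Runs shorter than the element size are removed, while longer runs are returned unchanged, so isolated classifications disappear without shrinking the regions that remain.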
8.5.9 Validation
A pathologist outlined the contour classifications from six endarterectomy
patients. These sections were centered on the bifurcation with an average of 8–9
slices per patient. Fifty-three of these sections with matched MR image slices
were chosen for analysis. The classification by the algorithm was then compared
to the ground truth by histology. For each cap status, the point-by-point
comparison of classification accuracy in each slice was used to calculate Pearson’s
correlation coefficients. The algorithm performs well in classifying thick and thin
caps, with correlation coefficients of 0.64 (significant with p value < 0.0001) and
0.62 (significant with p value < 0.0001) as shown in Figs. 8.28 and 8.29,
respectively. The correlation coefficient for the ruptured cap is lower (0.34, p value
of 0.014) due to more false negatives and false positives. The correlation might be
improved if specimen shrinkage [101] can be accounted for in matching
correspondence between true and classified points. Differential shrinkage of the
endarterectomy specimen during histological processing can cause twisting of the
specimen around the arterial axis, thus increasing classification error.
[Plot "Thick cap": Classified (0–300) versus True (0–300), R² = 0.415.]
[Plot "Thin cap": Classified (0–300) versus True (0–100), R² = 0.39.]
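Pearson's correlation coefficient between true and classified point counts can be computed as in the sketch below. The counts are invented for illustration and are not the study data. For a simple linear fit the coefficient of determination equals the square of Pearson's r, which is consistent with the reported r values of 0.64 and 0.62 and the R² values of 0.415 and 0.39 shown on the plots.

```python
import numpy as np


def pearson_r(true_counts, classified_counts):
    """Pearson's correlation coefficient between two equal-length series."""
    t = np.asarray(true_counts, dtype=float)
    c = np.asarray(classified_counts, dtype=float)
    t = t - t.mean()
    c = c - c.mean()
    return float((t @ c) / np.sqrt((t @ t) * (c @ c)))


# Hypothetical per-slice point counts for one cap class.
true_counts = [10, 50, 120, 200, 260]
classified_counts = [30, 40, 150, 160, 250]
r = pearson_r(true_counts, classified_counts)
r_squared = r ** 2
```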
8.5.10 Conclusion
This preliminary algorithm shows promise in separating stable (thick) and
unstable (thin) fibrous caps. Future work is aimed at improving the detection of
ruptured caps and differentiating them from thin caps. Actual identification of
ruptured
tured caps is a more complicated issue involving multicontrast MRI with up to
5 weightings (3D TOF, T1, T2, PD, and contrast enhanced T1). A human expert
also uses presence of juxtaluminal calcification, intraplaque hemorrhage, and
thrombus to detect a ruptured cap. An algorithm that takes into account all the
above weightings and factors would be more likely to differentiate ruptured
caps from thin caps.
8.6 Conclusions
Questions
2. Why is the study of constituents within carotid vessel wall very important?
3. Technically, what are the unique challenges in MRI obtained from ad-
vanced lesions in human carotid arteries?
6. What are the criteria used in selecting the control points for active contour
model?
10. What are the primary image features used in automatic fibrous cap
detection?
Bibliography
[1] Yuan, C., Mitsumori, L. M., Beach, K. W., and Maravilla, K.M., Carotid
atherosclerotic plaque: Noninvasive MR characterization and identifi-
cation of vulnerable lesions, Radiology, Vol. 221, No. 2, pp. 285–300,
2001.
[2] Davies, M. J. and Thomas, A. C., Plaque fissuring: The cause of acute
myocardial infarction, sudden ischaemic death, and crescendo angina,
Br. Heart J., Vol. 53, pp. 363–373, 1985.
[3] Falk, E., Stable versus unstable atherosclerosis: Clinical aspects, Am.
Heart J., Vol. 138, No. 5(Pt.2), pp. 421–425, 1999.
[4] Davies, M. J., Richardson, P. D., Woolf, N., Katz, D. R., and Mann, J., Risk
of thrombosis in human atherosclerotic plaques: Role of extracellular
lipid, macrophage, and smooth muscle cell content, Br Heart J., Vol. 69,
pp. 377–381, 1993.
[5] Fuster, V., Stein, B., Ambrose, J. A., Badimon, L., Badimon, J. J., and
Chesebro, J. H., Atherosclerotic plaque rupture and thrombosis, evolv-
ing concepts, Circulation, Vol. 82, pp. 1147–1159, 1990.
[7] Yuan, C., Zhang, S., Polissar, N. L., Echelard, D., Ortiz, G., Davis, J. W.,
Ellington, E., Ferguson, M. S., and Hatsukami, T. S., Identification of
fibrous cap rupture with magnetic resonance imaging is highly associ-
ated with recent transient ischemic attack or stroke, Circulation, Vol.
105, pp. 181–185, 2002.
[8] Toussaint et al., MRI lipid, fibrous, calcified, hemorrhagic, and
thrombotic components of human atherosclerosis in vivo, Circulation, Vol. 94,
pp. 932–938, 1996.
[9] Fu, K. S. and Mui, J. K., A survey on image segmentation, Patt. Recogn.,
Vol. 13, pp. 3–16, 1981.
[12] Leclerc, Y. G., Region growing using the MDL principle, In: DARPA
Image Understanding Workshop, 1990.
[14] Cpong, T., Shapiro, L. G., Watson, L. T., and Haralick, R. M., Experiments
in segmentation using a facet model region grower, Comput. Vision
Graph. Image Process, Vol. 1, pp. 360–372, 1972.
[16] Zhou, Y. T., Venkateswar, V., and Chellappa, R., Edge detection and linear
feature extraction using a 2-D random field model, IEEE Trans. PAMI,
Vol. 11, pp. 84–95, 1989.
[17] Haralick, R. M., Digital step edges from zero crossing of second direc-
tional derivatives, IEEE Trans. PAMI, Vol. 6, pp. 58–68, 1984.
[18] Reichenbach, S. E., Park, S. K., and Gartenberg, R. A., Optimal, small
kernels for edge detection, In: Proceedings of 10th ICPR, 1990, pp. 57–
63.
[19] Meier, T., Ngan, K. N., and Crebbin, G., A robust Markovian segmentation
based on highest confidence first, In: IEEE International Conference on
Image Processing, Santa Barbara, Oct. 1997.
[21] Chou, P. B. and Brown, C. M., The theory and practice of Bayesian image
labeling, Int. J. Comput. Vision, Vol. 4, pp. 185–210, 1990.
Segmentation Issues in Carotid Artery Atherosclerotic Plaque 443
[23] Cohen, L. D. and Kimmel, R., Global minimum for active contour model:
A minimal path approach, Int. J. Comput. Vision, Vol. 24, pp. 57–78,
1997.
[24] Wang, H. and Ghosh, B. K., Geometric deformable model and segmenta-
tion, In: IEEE International Conference on Image Processing, Chicago,
USA, 1998.
[25] Vieren, C., Cabestaing, F., and Postaire, J. G., Catching motion objects
with snakes for moving tracking, Patt. Recogn. Lett., Vol. 16, pp. 679–
685, 1995.
[26] Bertalmio, M., Sapiro, G., and Randall, G., Morphing active contours:
A geometric approach to topology-independent image segmentation
and tracking, In: IEEE International Conference on Image Processing,
Chicago, USA, 1998.
[27] Cohen, L. D., On active contour models and ballons, CVGIP: Image
Understand., Vol. 53, No. 2, pp. 211–218, 1991.
[30] Lumia, R., Shapiro, G., and Zuniga, O., A new connected component
algorithm for virtual memory computers, Comput. Vision, Graphics Im-
age Process., Vol. 22, pp. 287–300, 1983.
[31] Malladi, R., Sethian, J. A., and Vemuri, B. C., A topology independent
shape modeling scheme, SPIE Geomet. Meth. Comput. Vision II, Vol.
2031, pp. 246–258, 1993.
444 Xu et al.
[32] Lin, E., A Fuzzy Global Minimum Snake Model for Contour Detection,
Ph.D. Dissertation, University of Washington, 1999.
[33] Besag, J., Spatial interaction and the statistical analysis of lattice sys-
tems, J. R. Stat. Soc. B, Vol. 36, No. 2, pp. 192–236, 1974.
[34] Yemez, Y., Sankur, B., and Anarim, E., Region growing motion segme-
nation and estimation in object-oriented video coding, ICIP, Vol. 2, pp.
521–524, 1996.
[35] Zhang, J. and Gao, J., Image sequence segmentation using curve evo-
lution, In: 33th Annual Asilomar Conference on Signals, Systems and
Computers, Oct. 1999.
[36] Wilson, R., Meulemans, P., Calway, A., and Krüger, S., Image sequence
analysis and segmentation using G-blobs, ICIP, 1998.
[37] Alatin, A. A., Onural, L., Wollborn, M., Mech, R., Tuncel, E., and
Sikora, T., Image sequence analysis for emerging interactive multime-
dia services—The European cost 211 framework, IEEE Trans. CSVT,
Vol. 8, No. 7, pp. 802–813, 1998.
[38] Allen, J. T. and Huntsberger, T., Comparing color edge detection and seg-
mentation methods, In: Proceedings of IEEE Southeaster Conference,
1989, pp. 722–728.
[41] Taylor, R. I. and Lewis, P. H., Color image segmentation using boundary
relaxation, In: Proceedings of 11th IAPR International Conference on
Pattern Recognition, Den Hague, Netherlands, Aug 30–Sept 2, 1992, Vol.
III, pp. 721–724.
[42] Schettini, R., A segmentation algorithm for color images, Patt. Recogn.
Lett., Vol. 14, pp. 499–506, 1993.
Segmentation Issues in Carotid Artery Atherosclerotic Plaque 445
[43] Bonsiepen, L. and Coy, W., Stable segmentation using color information,
In: Proceedings of CAIP’91, Dresden, Sept 17–19, 1991, pp. 77–84.
[44] Ferri, F. and Vidal, E., Color image segmentation and labeling through
multiedit-condensing, Patt. Recogn. Lett., Vol. 13, No. 8, pp. 561–568,
1992.
[45] Umbaugh, S. E., Moss, R. H., Stoecker, W. V., and Hance, G. A., Automatic
color segmentation algorithms with applications to skin tumor feature
identification, IEEE Eng. Med. Biol., Vol. 12, No. 3, pp. 75–82, 1993.
[46] Chang, M. M., Patti, A. J., Sezan, M. I., and Tekalp, A. M., Adaptive
Bayesian approach for color image segmentation, In: SPIE Conference
on Visual Communication and Image Processing, Boston, MA, Nov 1993.
[47] Wright, W. A., Markov random field approach to data fusion and color
segmentation, Image vision comput., Vol. 7, pp. 144–150, 1989.
[48] Comaniciu, D. and Meer, P., Robust analysis of feature space: color
image segmentation, In: IEEE Conference Computer Vision and Pattern
Recognition, Puerto Rico, 1997, pp. 750–755.
[49] Taxt, T., Flynn, P. J., and Jain, A. K., Segmentation of document images,
IEEE Trans. PAMI., Vol. 11, No. 12, pp. 1322–1329, 1989.
[55] Li, S. Z., Markov Random Field Modeling in Computer Vision, Springer-
Verlag, Berlin, 1995.
[56] Cerny, V., A thermo dynamical approach to the traveling salesman prob-
lem: An efficient simulation algorithm, J. Optimization Theory Appl.,
Vol. 45, pp. 41–51, 1985.
[57] Kirkpatric, S., Gellate, C. D., and Vecchi, M., Optimization by simulated
annealing, Science, Vol. 220, pp. 671–680, 1983.
[58] Murray, D. W., Kashko, A., and Buxton, B. F., An approach to the picture
restoration algorithm of Gemen and Geman on an SIMD machine, Image
Vision Comput., Vol. 4, pp. 133–142, 1986.
[61] Kass, M., Witkin, A., and Terzopoulos, D., Snakes: Active contour mod-
els, Int. J. Comput. Vision, pp. 321–331, 1988.
[63] Caselles, V., Catte, F., Coll, T., and Dibos, F., A geometric model for
active contours, Numerische Mathematik, Vol. 66, pp. 1–31, 1993.
[64] Miller, J. V., Breen, D. E., Lorensen, W. E., O’Bara, R. M., and Wozny, M. J.,
Geometrically deformed models: A method to extract closed geometric
models from volume data, Comput. Graph., Vol. 25, No. 4, pp. 217–226,
1991.
[65] Osher, S. and Sethian, J. A., Fronts propagating with curvature depen-
dent speed: Algorithms based on Hamiltion–Jacobi formulation, J. Com-
put. Phy., Vol. 79, pp. 12–49, 1988.
Segmentation Issues in Carotid Artery Atherosclerotic Plaque 447
[67] Caselles, V., Kimmel, R., and Shapiro, G., Geodesic active contours, In:
Proceedings of the Fifth International Conference on Computer Vision,
Boston, MA, 1995, pp. 694–699.
[68] Amini, A. A., Tehrani, S., and Weymouth, T. E., Using dynamic program-
ming for minimizing the energy of active contours in the presence of
hard constraints, In: Second International Conference on Computer Vi-
sion, 1990, pp. 95–99.
[69] Geiger, D., Gupta, A., Costa, L. A., and Vlontzos, J., Dynamic pro-
gramming for detecting, tracking, and matching deformable contours,
IEEE Trans. Patt. Anal. Machine Intell., Vol. 17, No. 3, pp. 294–302,
1995.
[71] Kass, M. et al., Snakes: Active contour models, Int. J. Comput. Vision,
pp. 321–331, 1987.
[72] Berger, M. O. and Mohr, R., Towards, autonomy in active contour mod-
els, In: Proceedings of 10th International Conference on Pattern Recog-
nition, Atlantic City, NJ, USA, June 1990, Vol. 1, pp. 847–851.
[73] Yuan, C., Lin, E., and Hwang, J. N., Closed contour edge detection of
blood vessel lumen and outer wall boundaries in black-blood MR im-
ages, Magn. Reson. Imaging, Vol. 17, No. 2, pp. 257–266, 1999.
[74] Cohen, L. D., On active contour models and balloons, CVGIP: Image
Understand., Vol. 53, No. 2, pp. 211–218, 1991.
[75] Cohen, L. D. and Cohen, I., Finite-element methods, for active contour
models and balloons for 2-D and 3-D images, IEEE Trans. Patt. Anal.
Machine Intell., Vol. 15, No. 11, pp. 1131–1147, 1993.
448 Xu et al.
[76] Lin, E., Hwang, J.-N., and Yuan, C., Measurements of blood vessel wall
areas in black-blood MR images using global minimum snake algorithm,
In: IEEE International Conference on Acoustic, Speech and Signal Pro-
cessing, Phoenix, AZ, March 1999, Vol. 6, pp. 3409–3412.
[77] Yokoyama, T., Yagi, Y., and Yachida, M., Active contour model for ex-
tracting human faces, In: Fourteenth International Conference on Pat-
tern Recognition, Brisbane, Qld., Australia, Aug 1998, Vol. 1, pp. 673–
676.
[78] Wyszecki, G. and Stiles, W. S., Color Science: Concepts and Methods,
Quantitative Data and Formulae, 2nd edn., Wiley, New York, pp. 113,
1982.
[82] Panjwani, D. K., and Healey, G., Markov random field models for un-
supervised segmentation of textured color images, IEEE Trans. PAMI,
Vol. 17, No. 10, pp. 939–954, 1995.
[83] Comaniciu, D. and Meer, P., Mean shift analysis and applications, In:
IEEE International Conference Computer Vision (ICCV’99), Kerkyra,
Greece, 1999, pp. 1197–1203.
[85] Rousseeuw, P. J. and Leroy, A., Robust Regression and Outlier Detec-
tion, Wiley, New York, Section 7.1, 1987.
[86] Jain, A. K., Murty, M., Narasimha, and Flynn, P. J., Data clustering: A
review, ACM Comput. Surv., Vol. 31, No. 3, pp. 264–323, 1999.
Segmentation Issues in Carotid Artery Atherosclerotic Plaque 449
[87] Rousseeuw, P. J., Least median of squares regression, J. Am. Stat. Assoc.,
Vol. 79, pp. 871–880, 1984.
[88] Jolion, J. M., Meer, P., and Bataouche, S., Robust clustering with appli-
cations in computer vision, IEEE Trans. Patt. Anal. Machine Intell., Vol.
13, pp. 791–802, 1991.
[90] Cheng, Y., Mean shift, mode seeking, and clustering, IEEE Trans. Patt.
Anal. Machine Intell., Vol. 17, pp. 790–799, 1995.
[91] Silverman, B. W., Density Estimation for Statistics and Data Analysis,
Chapman and Hall, New York, 1986.
[92] Xu, D. and Hwang, J.-N., A topology independent active contour track-
ing, In: IEEE NNSP’99, Madison, USA, Aug 23–25, 1999, pp. 164–167.
[94] Willeit, J. and Kiechl, S., Biology of arterial atheroma, Cerebrovas. Dis.,
Vol. 10, pp. 1–8, 2000.
[95] Dhume, A. S., Soundararajan, K., Hunter, W. J., and Agrawal, D. K.,
Comparison of vascular smooth muscle cell apoptosis and fibrous cap
morphology in symptomatic and asymptomatic carotid artery disease,
Ann. Vas. Surg., Vol. 17, pp. 1–8, 2003.
[96] Winn, W. B., Schmiedl, U. P., Reichenbach, D. D., Beach, K. W., Nghiem,
H., Dimas, C., Daniel, E., Maravilla, K. R., and Yuan, C., Detection and
characterization of atherosclerotic fibrous caps with T2-weighted MR,
Am. J. Neuroradiol., Vol. 19, pp. 129–134, 1998.
[97] Hatsukami, T. S., Ross, R., Polissar, N. L., and Yuan, C., Visualization
of fibrous cap thickness and rupture in human atherosclerotic carotid
plaque in-vivo with high resolution magnetic resonance imaging, Circu-
lation, Vol. 102, pp. 959–964, 2000.
450 Xu et al.
[98] Mitsumori, L. M., Hatsukami, T. S., Ferguson, M. S., Kerwin, W. S., Cai,
J. C., and Yuan, C., In vivo accuracy of multisequence MR imaging for
identifying unstable fibrous caps in advanced human carotid plaques,
J. Magn. Reson. Imaging, Vol. 17, pp. 410–420, 2003.
[99] Wasserman, B. A., Smith, W. I., Trout, H. H., Cannon, R. O., Balaban,
R. S., and Arai, A. E., Carotid artery atherosclerosis: In vivo morpho-
logic characterization with gadolinium-enhanced double-oblique MR
imaging—initial results, Radiology, Vol. 223, pp. 566–573, 2002.
[100] Han, C., Hwang, J. N., and Yuan, C., A fast minimal path active contour
model, IEEE Trans. Image Process., Vol. 6, pp. 865–873, 2001.
9.1 Introduction
1 Biomedical Engineering Department, Case Western Reserve University, Cleveland, OH, USA
2 Biomedical Engineering Department, Idaho State University, Pocatello, ID, USA
3 Department of Radiology, Case Western Reserve University, Cleveland, OH, USA
452 Suri et al.
2. Tunica media: This is the middle layer made up mainly of smooth muscle
cells. In arteries it is the thickest layer.
[Table residue: research groups and methods, including the Wilhjelm group
(Technical University of Denmark), the Ladak group (John P. Robarts Research
Institute, Canada), a mesh-based 3-step method, and a manual method.]
using the lumen wall boundary for shape guidance. The external elastic lamina
boundary was then obtained using the internal elastic lamina for shape guidance.
Kim et al. [11] imaged the proximal coronary artery vessel wall with high-
resolution 3-D cardiovascular MRI. The proximal vessel wall and lumen bound-
aries were obtained using an automated edge detection tool. Comparison of
vessel wall thickness and luminal diameter between healthy subjects and
patients with coronary artery disease showed increased wall thickness but no
significant difference in luminal diameter in patients. This was due to positive
arterial remodeling in the patients, known as the Glagov effect.
Wilhjelm et al. [12] spatially compounded ultrasound images of formalin-
fixed carotid atherosclerotic plaques to reduce angle-dependence and speckle
noise—two problems prominent in ultrasound. A digital off-line ultrasound scan-
ner for multi-angle compound imaging (MACI) produced arterial image slices
that were compared to the corresponding anatomical slices. Compared to B-
mode ultrasound images, the MACI images had a better definition of outlines
and a more uniform representation of tissue parameters, which can aid in the
diagnosis of atherosclerotic disease.
Lumen Identification, Detection, and Quantification in MR Plaque Volumes 455
the balloon using inflation forces, and automatic localization of the balloon to
the arterial wall using image-based forces. The balloon model was represented
by a triangular mesh. Two thresholds were used in defining when a triangle in
the mesh should be split into two triangles; the larger threshold was used while
the balloon was inflating to the arterial wall, and the smaller threshold was
used while the balloon was being refined to fit the arterial wall. Surface tension
was used to reduce the effect of noise. A maximum error corresponding to the
maximum separation of the two registered phantom arteries was reported to be
0.3 mm.
Zhang et al. [35] showed that images produced by different MRI imaging
modalities give similar results when measuring lumen and vessel wall areas,
provided that the quality of the images is high and comparable. A five-level
image quality rating criterion was developed. Ten patients were imaged with
four MRI modalities (time of flight, T1-, T2-, and PD-weighted), and the
image sets of a patient were studied only if all of the different images were
above the third level of image quality. Lumen and outer wall boundaries were
measured semiautomatically using a program called the quantitative vascular
analysis tool (QVAT). Since flow artifacts were better suppressed on double
inversion T1-weighted images, those images were recommended for measurement
when they have the highest image quality. Mean differences between lumen area
measurements of the three black blood imaging techniques were shown to be not
statistically significant. In measurements of lumen area, outer wall boundary
area, and wall area, the PD- and T2-weighted images showed the best agreement.
Yuan et al. [33] studied whether using a gadolinium-based contrast agent
in high-resolution MRI provided additional information that helped in
characterizing atherosclerotic plaques. High-risk atherosclerotic plaques were
characterized by thinning and rupture of the fibrous cap overlying the
thrombogenic lipid core of the artery. The study was done on patients scheduled
for carotid endarterectomy and on volunteers. High-resolution cross-sectional
MR images of bilateral carotid arteries were obtained with a phased array
carotid coil on a 1.5-T GE SIGNA Horizon Echo Speed 5.8 MR scanner using a pre-
and postcontrast-enhanced double inversion recovery T1-weighted fast spin-echo
imaging protocol with TR/TE/TI = 800/10/650 msec, echo train length = 8, slice
thickness = 2 mm, FoV = 13 × 9 cm, and matrix = 512 × 512 with zero-filled
Fourier reconstruction. TOF images were also obtained to aid in the
classification of
plaque tissues. The precontrast-enhanced images were used to identify regions
of interest (ROIs) in which the constituents were classified as fibrous tissue,
necrotic core, or calcification. These ROIs were then matched in the
postcontrast-enhanced images, and a percent signal intensity change was
calculated for each ROI. After the endarterectomy, the plaques were
histologically classified. Results were analyzed using statistical techniques
such as single-factor analysis of variance (ANOVA), the Tukey test, and
Student's t test. It was found that the use of the gadolinium-based contrast
agent in MRI significantly aids the classification of necrotic core, fibrotic
tissue, and especially neovasculature of atherosclerotic plaques.
Yuan et al. [34] further showed that identification of a ruptured fibrous cap
in in vivo human carotid atherosclerosis using high-resolution MRI is highly
associated with a recent transient ischemic attack (TIA) or stroke. A multiple
contrast-weighted MR protocol was used to obtain the images. The fibrous caps
were reviewed and classified as intact and thick, intact and thin, or
ruptured. Patients were classified as symptomatic or asymptomatic depending
on recent history of TIA or stroke. It was observed that while 9% of patients
with thick fibrous caps were symptomatic, 50% and 70% of those with thin caps
and ruptured caps, respectively, were symptomatic. Statistical analysis showed
a highly significant trend of increasing percentage of symptomatic patients as
cap deterioration increases.
Naghavi et al. [22, 23] discussed a new classification system for identifying
patients having a risk of cardiac disease and related events. They defined three
areas of vulnerability: plaque, blood, and myocardium. They defined a vulnera-
ble plaque with a set of major and minor criteria, and techniques for detection of
each of these criteria. Many markers in blood that were associated with coagu-
lation were described, as were conditions that are associated with a vulnerable
myocardium. A new risk assessment strategy that was based on the three areas
of vulnerability, called the Cumulative Vulnerability Index, was proposed.
Fayad et al. [24] discussed the use of electron-beam computed tomography
(EBCT) to quantitatively detect the amount of calcium deposited in the coronary
arteries. Using a multidetector-row CT (MDCT) system to detect calcium offers
higher spatial resolution and SNR, but has more motion artifacts. Additionally,
using a contrast agent with MDCT can classify plaques into soft, intermediate,
or calcified. EBCT angiography results were found to be similar to MDCT an-
giography results. Coronary MR angiography (CMRA) was still less sensitive
and specific than EBCT and MDCT angiographies; both spatial and temporal
resolution were lower, and the time needed to acquire an image requires that
imaging take place over multiple heart beats. MRI has been shown to usefully
image plaques at various locations. It has also been used to monitor experimen-
tal studies on plaque [25]. The combination of CT and MRI for use in detecting
dangerous plaques was promising.
Corti et al. [27] used high-resolution MRI to follow the effects of
simvastatin, a statin that stabilizes plaques by lowering the lipid content, on
human atherosclerotic plaques. Results showed that after 12 months there was
a significant decrease in vessel wall area and maximal wall thickness, but there
was not a significant change in the lumen area.
Fuster et al. [26] discussed the biological events that lead to acute coronary
syndromes (ACS). Plaques of types IV and V (vulnerable) and type VI
(complicated) were most likely to lead to ACS. An atherosclerotic lesion
starts with lipoprotein transport and development of the extracellular
matrix. The disruption of plaques was made up of passive and active phenomena.
Inflammatory cells at the plaque site would weaken the fibrous cap through
lytic processes, which was a step in arterial remodeling. Tissue factor (TF) was
associated with macrophages and was involved in coagulation, haemostasis,
and thrombosis. It was recognized that MRI is a promising tool for noninvasive
plaque characterization.
system detects and identifies the two different left and right lumen boundaries
and quantifies them. The lumen is complicated to classify, since the blood in
the lumen flows parabolically. Blood in the center of the lumen flows at a higher
speed than the blood near the edges of the lumen. In an MRI image this difference
in flow rates causes the center of the lumen to appear brighter than the edges of
the lumen. When classifying the image, the classifier will fail to identify the
entire lumen as one class; instead, it will identify multiple classes inside
the lumen. We used three different segmentation methods for the classification
of the lumen region in our system: the Markov random field (MRF) method, the
Fuzzy C means (FCM) method, and the graph segmentation method (GSM). The MRF
method uses Bayes' rule to segment the image into a given number of classes; it
relies on the expectation-maximization (EM) algorithm and is based on maximum
likelihood. The FCM method is based on clustering: it computes a fuzzy
membership function and associates it with each pixel in the image. The GSM
method treats the image as a graph in which the pixels are nodes and the edges
are the connections between pairs of pixels. It calculates the edge weights and
uses a decision criterion to determine whether there should be a boundary
between the connected pixels. After the image is classified using one of the
three segmentation methods, it is binarized to isolate the left and the right
lumens. Since the lumens may contain multiple classes, the binarization
process merges these classes when necessary. The carotid arteries bifurcate in
the middle of the volume, and the region of interest (ROI) of the lumens
changes from circular to elliptical. The binarization process
uses both circular and elliptical masks. Once the boundaries of the left and right
lumens were obtained, they were compared to traced ground truth boundaries
using two methods of error computation between boundaries. We computed
the error using the shortest distance method (SDM) and the polyline distance
method (PDM). The PDM computes a lower error than does the SDM. We tested
the system for the three different classifying methods, first on synthetic data
and then on real patient volumes.
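The gap between the two error measures can be illustrated with a minimal
Python sketch. It assumes (the definitions are not given in this passage) that
the SDM measures, for each estimated boundary point, the distance to the
nearest ground truth vertex, while the PDM measures the distance to the nearest
point on the ground truth polyline segments. The function names and the toy
square boundary are hypothetical.

```python
import math

def point_to_segment(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p on the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def shortest_distance_error(estimated, truth):
    """Assumed SDM: mean distance to the nearest ground truth vertex."""
    return sum(min(math.hypot(px - qx, py - qy) for qx, qy in truth)
               for px, py in estimated) / len(estimated)

def polyline_distance_error(estimated, truth):
    """Assumed PDM: mean distance to the nearest segment of the closed polyline."""
    segments = list(zip(truth, truth[1:] + truth[:1]))
    return sum(min(point_to_segment(p, a, b) for a, b in segments)
               for p in estimated) / len(estimated)

# Toy example: a square ground truth and a slightly larger estimated boundary.
truth = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
estimated = [(5.0, -1.0), (11.0, 5.0), (5.0, 11.0), (-1.0, 5.0)]
sdm = shortest_distance_error(estimated, truth)
pdm = polyline_distance_error(estimated, truth)
```

Under these assumed definitions, every vertex lies on the polyline, so the
segment distance can never exceed the vertex distance. This is consistent with
the PDM reporting the lower error.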
We created a model of images of the carotid arteries for validation. To
simulate noise, we created images with variance from 0 to 100 for a small
noise protocol, and we created images with variance from 100 to 1000 for
a large noise protocol. For each variance we created images with left and
right lumens having two classes in eight different orientations. Each proto-
col had about 24,000 total boundary points. Using MRF, the average error for
a variance of 500 pixels squared was 5.97 pixels with standard deviation of
0.13 pixels; using FCM the average error was 1.54 pixels with standard deviation
of 0.05 pixels.
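As an illustration of the noise protocols, the following sketch adds zero-mean
Gaussian noise of a chosen variance to a synthetic two-lumen image. The image
size, intensities, lumen positions, and the Gaussian noise model are
assumptions made for illustration; the passage does not specify them, and the
two-class lumens and eight orientations are not modeled here.

```python
import random

def make_phantom(size=64, radius=6, background=50.0, lumen=200.0):
    """Synthetic slice with two bright circular lumens (hypothetical layout)."""
    centers = [(size // 2, size // 4), (size // 2, 3 * size // 4)]
    image = [[background] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            if any((r - cr) ** 2 + (c - cc) ** 2 <= radius ** 2
                   for cr, cc in centers):
                image[r][c] = lumen
    return image

def add_gaussian_noise(image, variance, seed=0):
    """Return a copy of the image with zero-mean Gaussian noise added."""
    rng = random.Random(seed)
    sigma = variance ** 0.5  # standard deviation is the square root of the variance
    return [[pixel + rng.gauss(0.0, sigma) for pixel in row] for row in image]

clean = make_phantom()
# Small-noise protocol: variances from 0 to 100; large-noise protocol: 100 to 1000.
noisy_small = add_gaussian_noise(clean, variance=100)
noisy_large = add_gaussian_noise(clean, variance=1000)
```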
We ran the system using each of the three different classifying methods
on real patient data. Ground truth boundaries of the walls of the carotid
artery were traced for 15 patients. Overall the number of boundary points was
roughly 22,500 points. A pixel was equivalent to 0.25 mm. Using MRF, the
average error was 0.61 pixels; using FCM, the average error was 0.62 pixels;
using GSM, the average error was 0.74 pixels.
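Since one pixel corresponds to 0.25 mm, the reported errors can be restated in
millimeters. The short sketch below only performs that multiplication; the
variable names are illustrative.

```python
PIXEL_SIZE_MM = 0.25  # stated above: one pixel is equivalent to 0.25 mm

# Average boundary errors in pixels, as reported for the real patient data.
errors_px = {"MRF": 0.61, "FCM": 0.62, "GSM": 0.74}
errors_mm = {method: err * PIXEL_SIZE_MM for method, err in errors_px.items()}

for method, err in errors_mm.items():
    print(f"{method}: {err:.4f} mm")
```

This gives approximately 0.153 mm for MRF, 0.155 mm for FCM, and 0.185 mm for
GSM.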
What is new in this chapter? Readers will find the following new contributions
to plaque imaging: (a) Application of three different sets of classifiers for
lumen region classification in plaque MR protocols. These classifiers are
applied in a multiresolution framework: subregions are chosen and
subclassifiers are applied to compute the accuracy of the pixel values
belonging to a class. (b) Region merging for subclasses in the lumen region to
compute an accurate lumen region and lumen boundary in cross-sectional images.
(c) Rotational effect of the ROI in bifurcation zones for accurate lumen region
identification and boundary estimation.
The following are the challenges for the lumen wall (inner) and vessel wall
(outer) estimation processes (see also Fig. 9.4):
1. Multiple classes in the lumen region due to laminar blood flow: The lumen
region consists of multiple classes: a core class (the central part of the
lumen), an adjoining class (due to slow-moving blood, as seen in Fig. 9.4), and
sometimes border pixels in the fibrous cap region that give different classes.
So, the lumen region can be a C1, C1 + C2, or C1 + C2 + C3 class region.
2. Lumen shape variation: The shape of the cross section of the artery lumen
is "circular" for some slices and "elliptical" near the bifurcation. So, the
ROI can also change from slice to slice. If one uses a circular ROI on an
elliptical region, then a large number of pixels will be missed along the
major axis of the elliptical region. The elliptical region can be seen on slices
before the bifurcation zone, while the circular regions can be seen on slices
far from the bifurcation zone.
Figure 9.4: Top left: The class C0 for the lumen, the class C1 for the
low-intensity flow region, and the outer wall of the vessel. Top right: Classes
C1, C2, and C3 are the regions produced by the classification process because
of the weak distribution of pixels in the boundary region. Bottom left:
Parabolic flow of the blood, with the highest velocity in the central region of
the vessel. Bottom right: Flow of blood in the bifurcation zone.
5. Partial volume effect: The partial volume effect at the edge of the lumen
can lead to a misleading lumen wall boundary estimate.
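The lumen shape variation challenge can be made concrete with a small sketch
that counts how many pixels of an elliptical lumen a circular ROI misses. The
semi-axes, the choice of a circle whose radius equals the minor semi-axis, and
the function names are assumptions for illustration only.

```python
def elliptical_mask(size, center, a, b):
    """Boolean mask of an axis-aligned ellipse with semi-axes a (columns), b (rows)."""
    cr, cc = center
    return [[((c - cc) / a) ** 2 + ((r - cr) / b) ** 2 <= 1.0
             for c in range(size)] for r in range(size)]

def count(mask):
    """Number of pixels set in a boolean mask."""
    return sum(sum(row) for row in mask)

size, center = 41, (20, 20)
ellipse = elliptical_mask(size, center, a=12, b=6)   # lumen near the bifurcation
circle = elliptical_mask(size, center, a=6, b=6)     # circular ROI (a == b)

# Pixels that belong to the elliptical lumen but fall outside the circular ROI.
missed = sum(e and not c
             for e_row, c_row in zip(ellipse, circle)
             for e, c in zip(e_row, c_row))
missed_fraction = missed / count(ellipse)
```

In this toy configuration, roughly half of the elliptical region lies outside
the circular ROI, and the missed pixels sit along the major axis.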
Figure 9.5: Abnormal ground truth overlays. The top row in each row pair is
the left carotid artery overlaid with the ground truth tracing of the inner lumen
wall, and the bottom row is the corresponding image of the right carotid artery.
Some lumens have a circular shape, while others have an elliptical shape.
Figure 9.6: Abnormal ground truth overlays. The top row in each row pair is
the left carotid artery overlaid with the ground truth tracing of the inner lumen
wall, and the bottom row is the corresponding image of the right carotid artery.
Some lumens have a circular shape, while others have an elliptical shape.
lumens that are more circular than those of the abnormal images, and there is no
constriction of the arteries. Ground truth tracing was done using the MATLAB
program MRI GUI ver 1.2. For each lumen boundary the image was zoomed in
and points were plotted by the user around the inner wall boundary. The points
were spline-fitted to 20 points.
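One possible way to resample user-clicked boundary points to a fixed 20 points
is sketched below. The source does not say which spline was used; a closed
Catmull-Rom spline is swapped in here purely as an assumption, and the function
names and the toy clicks are hypothetical.

```python
import math

def catmull_rom(p0, p1, p2, p3, t):
    """Point on the uniform Catmull-Rom segment between p1 and p2, t in [0, 1]."""
    return tuple(
        0.5 * (2 * b + (c - a) * t
               + (2 * a - 5 * b + 4 * c - d) * t ** 2
               + (3 * b - a - 3 * c + d) * t ** 3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )

def resample_closed_contour(points, n_out=20):
    """Sample n_out points, uniformly in spline parameter, on a closed spline."""
    n = len(points)
    samples = []
    for i in range(n_out):
        u = i * n / n_out            # global parameter in [0, n)
        seg = int(u)                 # control point that starts the segment
        t = u - seg                  # local parameter within that segment
        p0, p1, p2, p3 = (points[(seg + k - 1) % n] for k in range(4))
        samples.append(catmull_rom(p0, p1, p2, p3, t))
    return samples

# Hypothetical user clicks around a roughly circular lumen of radius 10 pixels.
clicks = [(10 * math.cos(2 * math.pi * k / 8), 10 * math.sin(2 * math.pi * k / 8))
          for k in range(8)]
boundary = resample_closed_contour(clicks, n_out=20)
```

The spline interpolates the clicked points, so the resampled boundary passes
through every user click while staying smooth between them.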
The imaging parameters (TR/TE/TI/NEX/thickness/FOV/ETL) are as follows:
T1W: 1R-R/7.1 ms/500 ms/2/3 mm/12–14 cm/21; PDW: 2R-R/7.1 ms/600 ms/
Figure 9.7: Abnormal ground truth overlays. The top row in each row pair is
the left carotid artery overlaid with the ground truth tracing of the inner lumen
wall, and the bottom row is the corresponding image of the right carotid artery.
Some lumens have a circular shape, while others have an elliptical shape.
2/3 mm/12–14 cm/31; T2W: 2R-R/68 ms/600 ms/2/3 mm/12–14 cm/31. The matrix for
all images was 256 × 192. Voxel size was 0.5 × 0.5 × 3 mm³. For bright blood
3-D TOF images: TR/TE/flip angle/thickness/FOV = 20 ms/3.4 ms/25/2.0 mm/18.
Zero-filled Fourier transform was used to create voxels of 0.35 × 0.35 × 1 mm³
for 3-D TOF imaging. The factors that affect MR plaque image quality are
(a) random patient motion, (b) obese patients who may have deeper carotid
arteries, (c) incomplete flow suppression, and (d) artery wall pulsation.
Figure 9.8: Abnormal ground truth overlays. The top row in each row pair is
the left carotid artery overlaid with the ground truth tracing of the inner lumen
wall, and the bottom row is the corresponding image of the right carotid artery.
Some lumens have a circular shape, while others have an elliptical shape.
Figure 9.9: Normal ground truth overlays. The top row in each row pair is
the left carotid artery overlaid with the ground truth tracing of the inner
lumen wall, and the bottom row is the corresponding image of the right carotid
artery. Some lumens have a circular shape, while others have an elliptical
shape.
example, when $K = 3$, $z_{i,j} = [0, 1, 0]^T$ means we assign the pixel at $(i, j)$ to
class 2.
Using the notation introduced above, the segmentation problem can be
formulated as the following MAP (maximum a posteriori) inference problem:
$$\hat{z} = \arg\max_{z} \left[ \log p(y \mid z, \Theta) + \log p(z \mid \Phi) \right], \tag{9.1}$$
where $\Theta$ and $\Phi$ were model parameters. In this work, we assume that the pixels
in $y$ are conditionally independent given $z$, i.e.,
$$\log p(y \mid z, \Theta) = \sum_{i,j} \log p(y_{i,j} \mid z_{i,j}, \Theta). \tag{9.2}$$
Furthermore, we assume that conditioned on $z_{i,j}$, the pixel $y_{i,j}$ has a multivariate
Gaussian density, i.e., for $k = 1, 2, \ldots, K$,
$$p(y_{i,j} \mid z_{i,j} = e_k, \Theta) = \frac{\exp\left(-\frac{1}{2}(y_{i,j} - m_k)^T C_k^{-1} (y_{i,j} - m_k)\right)}{(2\pi)^{3/2} |C_k|^{1/2}}, \tag{9.3}$$
where $e_k$ is a $K$-dimensional binary indicator vector with the $k$th component
being 1. From this, $\Theta = \{m_k, C_k\}_{k=1}^{K}$ contained the mean vectors and covariance
matrices for the $K$ image classes. For $z$, we have adopted an MRF model with a
Gibbs' distribution [149]:
$$p(z \mid \Phi) = \frac{1}{Z} e^{-\beta E(z)}, \tag{9.4}$$
where
$$E(z) = \frac{1}{2} \sum_{i,j} \sum_{(k,l) \in N_{i,j}} \left(1 - 2 z_{i,j}^{T} z_{k,l}\right). \tag{9.5}$$
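To see how the Gaussian likelihood of Eq. (9.3) and the energy of Eq. (9.5)
combine, the sketch below evaluates the log-likelihood minus β times the energy
for two candidate labelings of a tiny image. It is an illustration under stated
assumptions, not the authors' implementation: a 4-connected neighborhood is
assumed for N_{i,j} (not defined in this passage), labels are stored as integer
class indices rather than indicator vectors, the constant log Z is dropped
because it does not depend on z, and all names and numbers are hypothetical.

```python
import math

def cholesky(C):
    """Lower-triangular L with L * L^T = C, for a symmetric positive definite C."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = C[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def gaussian_log_density(y, m, C):
    """Log of the multivariate Gaussian density with mean m and covariance C."""
    d = len(y)
    L = cholesky(C)
    diff = [yi - mi for yi, mi in zip(y, m)]
    w = []  # solve L w = diff by forward substitution
    for i in range(d):
        w.append((diff[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i])
    mahalanobis = sum(wi * wi for wi in w)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(d))  # log |C|
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + mahalanobis)

def mrf_energy(labels):
    """E(z): half the sum over neighbor pairs of (1 - 2 * [labels agree])."""
    rows, cols = len(labels), len(labels[0])
    total = 0
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # assumed 4-neighbors
                k, l = i + di, j + dj
                if 0 <= k < rows and 0 <= l < cols:
                    total += 1 - 2 * (labels[i][j] == labels[k][l])
    return total / 2

def map_objective(image, labels, means, covs, beta):
    """Sum of pixel log-likelihoods minus beta * E(z); log Z is omitted."""
    log_lik = sum(
        gaussian_log_density(image[i][j], means[labels[i][j]], covs[labels[i][j]])
        for i in range(len(labels)) for j in range(len(labels[0])))
    return log_lik - beta * mrf_energy(labels)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
means = [[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]]
covs = [identity, identity]
dark, bright = means[0], means[1]
image = [[dark, dark], [dark, bright]]
fitted = map_objective(image, [[0, 0], [0, 1]], means, covs, beta=1.0)
uniform = map_objective(image, [[0, 0], [0, 0]], means, covs, beta=1.0)
```

In this toy case the uniform labeling has the lower energy, so the prior favors
it, but its poor fit to the bright pixel makes the fitted labeling score higher
overall.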
1. E step: Compute
$$Q(\Theta \mid \hat{\Theta}^{(p)}) = \left\langle \log p(y \mid z, \Theta) + \log p(z \mid \Phi) \mid y, \hat{\Theta}^{(p)} \right\rangle.$$
Here $\langle \cdot \rangle$ represents the expectation, or mean, and the superscript $p$ denotes
the $p$th iteration. This translated into the following formulas for updating the
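parameter estimates:
$$\langle z_{i,j}^{(p)} \rangle = \sum_{z_{i,j}} z_{i,j} f(z_{i,j}), \tag{9.6}$$
$$\hat{m}_k^{(p+1)} = \frac{\sum_{i,j} \langle z_{i,jk}^{(p)} \rangle \, y_{i,j}}{\sum_{i,j} \langle z_{i,jk}^{(p)} \rangle},$$
$$\hat{C}_k^{(p+1)} = \frac{\sum_{i,j} \langle z_{i,jk}^{(p)} \rangle \left(y_{i,j} - \hat{m}_k^{(p+1)}\right)\left(y_{i,j} - \hat{m}_k^{(p+1)}\right)^T}{\sum_{i,j} \langle z_{i,jk}^{(p)} \rangle}, \tag{9.7}$$
where $k = 1, 2, \ldots, K$, $z_{i,jk}$ is the $k$th component of $z_{i,j}$, and $f(z_{i,j})$ is a "mean field"
probability distribution (see Zhang [149]).
These formulas, in addition to providing the estimate of $\Theta$, also produced a
segmentation. Specifically, at each iteration, $\langle z_{i,jk}^{(p)} \rangle$ was interpreted as the
probability that $y_{i,j}$ was assigned to class $k$. Hence, after a sufficient number of
iterations, we can obtain the segmentation $z$ for each $(i, j) \in L$ by assigning
the pixel to its most probable class:
$$\hat{z}_{i,j} = e_{\hat{k}}, \qquad \hat{k} = \arg\max_{k} \langle z_{i,jk}^{(p)} \rangle.$$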
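The parameter updates above amount to probability-weighted means and
covariances. A minimal plain-Python sketch follows. The class probabilities are
taken as given (the mean field computation of Eq. (9.6) is not reproduced),
assigning each pixel to its most probable class is assumed as the final
labeling rule, and all names and toy values are hypothetical.

```python
def update_class_parameters(pixels, posteriors, k):
    """Weighted mean and covariance of class k, in the spirit of Eq. (9.7).

    pixels: list of d-dimensional pixel vectors
    posteriors: list of per-pixel class probability lists
    """
    d = len(pixels[0])
    weights = [p[k] for p in posteriors]
    total = sum(weights)
    mean = [sum(w * y[a] for w, y in zip(weights, pixels)) / total
            for a in range(d)]
    cov = [[sum(w * (y[a] - mean[a]) * (y[b] - mean[b])
                for w, y in zip(weights, pixels)) / total
            for b in range(d)] for a in range(d)]
    return mean, cov

def segment(posteriors):
    """Assign each pixel to the class with the largest probability."""
    return [max(range(len(p)), key=lambda k: p[k]) for p in posteriors]

# Toy data: four two-channel pixels and hypothetical probabilities for K = 2.
pixels = [[1.0, 2.0], [3.0, 2.0], [10.0, 8.0], [12.0, 8.0]]
posteriors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
mean0, cov0 = update_class_parameters(pixels, posteriors, k=0)
labels = segment(posteriors)
```

With hard (0 or 1) probabilities, as in this toy input, the update reduces to
the ordinary per-class sample mean and covariance.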
Pros and Cons of MRF with Scale Space. The major advantage of MRF-based
classification is the addition of the Gibbs' model: Kapur et al. model the a
priori assumptions about the hidden variables as a Gibbs' random field. Thus the
prior probability is modeled using the following physics-based analogous Gibbs'
equation: $P(f) = \frac{1}{Z} \exp\left(-\frac{1}{\kappa T} E(f)\right)$, where $Z = \sum_{f} \exp\left(-\frac{E(f)}{\kappa T}\right)$ and $P(f)$ is the
probability of the configuration $f$, $T$ is the temperature of the system, $\kappa$ is the
Boltzmann constant, and $Z$ is the normalizing constant. The major disadvantages
of MRF-based classification include the following: (1) The computation
time would be large if the number of classes is large. In such cases, one needs
to use multiresolution techniques to speed up the computation. (2) The
positions of the initial clusters are critical for the convergence of the MRF
model. Here, the initial clusters were computed using the K-means algorithm,
which was a good starting point. However, a more robust method is desirable.
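For the initialization step, a one-dimensional K-means on gray levels is
sketched below. This is a generic K-means, not the authors' implementation; the
evenly spaced starting centers, the iteration cap, and the toy intensities are
assumptions.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain K-means on scalar intensities; returns sorted cluster centers."""
    lo, hi = min(values), max(values)
    # Assumed initialization: centers evenly spaced over the intensity range.
    centers = [lo + (hi - lo) * (c + 0.5) / k for c in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old center if a cluster received no pixels.
        new_centers = [sum(g) / len(g) if g else centers[c]
                       for c, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)

# Hypothetical gray levels: dark background, intermediate wall, bright lumen.
intensities = [10, 12, 11, 9, 100, 102, 98, 200, 205, 195]
initial_means = kmeans_1d(intensities, k=3)
```

The resulting centers could then serve as the initial means passed to the MRF
classifier.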
Figure 9.10 shows the main MRF classification system implementation. Input is
a perturbed gray scale image with left and right lumens. The number of classes,
the initial mean, and the error threshold are inputs given to the system. The result
is a classified image with multiple class regions in it, including multiple classes
in the lumen region. Figure 9.11 shows the MRF system in more detail. Given
the initial center, the number of classes K , the Markov parameters of mean,
variance, and covariance, and the perturbed image, the current cluster center is
calculated. Using the EM algorithm, new parameters are solved and new cluster
Figure 9.10: Markov random fields (MRF) classification process. Input is a gray
scale perturbed image with left and right lumens. The number of classes, the
initial mean, and the error threshold are inputs given to the system. The result
is a classified image with multiple class regions in it.
Figure 9.11: The Markov random fields (MRF) segmentation system. Given the
initial center, the number of classes K , the Markov parameters of mean, variance,
and covariance, and the perturbed image, the current cluster center is calculated.
Using the expectation-maximization (EM) algorithm, new parameters are solved
and new cluster centers are computed. The previous cluster center and the
newly calculated cluster center are compared, and the process
is repeated if the error between them is not less than the error threshold. After
the iterative process is finished, the output is a segmented image.
centers are computed. The previous cluster center and the
newly calculated cluster center are compared, and the process is repeated
if the error between them is not less than the error threshold. After the iterative
process is finished, the output is a segmented image.
algorithms for computing membership functions, and one of the most efficient
ones is Fuzzy C means (FCM) based on the clustering technique. Because of its
ease of implementation for spectral data, it is preferred over other pixel
classification techniques. Mathematically, we express the FCM algorithm below,
but for complete details, readers are advised to see Bezdek and Hall [180] and
Hall and Bensaid [181]. The FCM algorithm computes a measure of membership
termed the fuzzy membership function. Suppose the observed pixel
intensities in a multispectral image at a pixel location j are given as
where j takes the pixel location, and N is the total number of pixels in the data
set.⁴ In FCM (see Figs. 9.12 and 9.13) the algorithm iterates between computing
the fuzzy membership function and the centroid of each class. This membership
function is defined at each pixel location for each class (tissue type), and its
value lies between 0 and 1. The membership
function represents the degree of similarity between the pixel vector at
a pixel location and the centroid of the class (tissue type); for example, if the
membership function has a value close to 1, then the pixel vector at that location
is close to the centroid of that particular class. The algorithm
can be presented in the following four steps. If $u_{jk}^{(p)}$ is the membership value
at location j for class k at iteration p, then $\sum_{k=1}^{3} u_{jk}^{(p)} = 1$. As defined before, $y_j$
is the observed pixel vector at location j and $v_k^{(p)}$ is the centroid of class k at
iteration p. Thus, the FCM steps for computing the fuzzy membership values
are as follows:
1. Choose the number of classes (K) and the error threshold $\epsilon_{th}$, and set the
initial guess for the centroids $v_k^{(0)}$, where the iteration number p = 0.
$$u_{jk}^{(p)} = \frac{\| y_j - v_k^{(p)} \|^{-2}}{\sum_{l=1}^{K} \| y_j - v_l^{(p)} \|^{-2}} \tag{9.10}$$
where j = 1, . . . , M and k = 1, . . . , K.
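The iteration between memberships and centroids can be sketched as follows. The membership update follows Eq. (9.10); the centroid update (a mean weighted by the squared memberships, i.e., standard FCM with fuzzifier 2) and the stopping rule on the largest centroid change are assumptions, since those steps are not reproduced in this excerpt. The function name `fcm` and the toy data are hypothetical.

```python
import numpy as np

def fcm(pixels, init_centroids, err_threshold=1e-4, max_iter=100):
    """pixels: (M, N) array of M pixel vectors; init_centroids: (K, N) array."""
    y = np.asarray(pixels, dtype=float)
    v = np.asarray(init_centroids, dtype=float)
    for _ in range(max_iter):
        # Squared distances ||y_j - v_k||^2, shape (M, K); floor avoids division by zero.
        d2 = ((y[:, None, :] - v[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)
        # Eq. (9.10): inverse squared distances normalized over the K classes.
        inv = 1.0 / d2
        u = inv / inv.sum(axis=1, keepdims=True)
        # Assumed centroid update (standard FCM, fuzzifier 2): mean weighted by u^2.
        w = u ** 2
        v_new = (w.T @ y) / w.sum(axis=0)[:, None]
        err = np.abs(v_new - v).max()
        v = v_new
        if err < err_threshold:  # stop once the centroids have settled
            break
    return u, v

# Toy single-channel data: three dark and three bright pixels, K = 2.
gray = np.array([[10.0], [12.0], [11.0], [200.0], [205.0], [198.0]])
u, v = fcm(gray, init_centroids=[[0.0], [255.0]])
labels = u.argmax(axis=1)  # hard class per pixel from the fuzzy memberships
```

Taking the arg-max of the memberships is one simple way to obtain a hard classification; the memberships themselves sum to 1 over the classes at every pixel.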
⁴ This is not the N used in the derivation in section 9.4.1.
[Figure residue (flowchart blocks; caption not recovered): Image Volume; Membership Function; New Centroid; Compute Error?; Stop.]
Figure 9.14: Graph segmentation method (GSM). The input image is smoothed
given a smoothing parameter. The image is treated as a graph, with each pixel
treated like a vertex. An edge is a pair of pixels. Using a weight function w(e),
the weights of the edges are computed and the edges are listed by weight in a
nondecreasing order. Initially, each pixel is segmented into its own component.
For each edge in the list, a decision criterion D is applied and the components
are merged accordingly. Input constant k determines the size preference of the
components by changing the threshold function. The result is a segmented image
made up of the final merged components.
Figure 9.15: Decision criterion D for the graph segmentation method (GSM).
After the list of edge weights is sorted and each pixel is segmented into its own
component, the decision criterion D is applied to each edge. The constant
k is used in determining the threshold function. First the difference between
the two components to which the two pixels making up the edge belong is
computed. Then the minimum internal difference among those two components
is computed. If the difference between the two components is greater than the
minimum internal difference among them, then D applied to the two components
is true, and the two components are not merged because there is evidence
for a boundary between them. Otherwise, if the difference between the two
components is less than or equal to the minimum internal difference, then the D
applied to the two components is false, and the two components are merged into
one component which contains both pixels of the edge. This decision criterion
is applied to all the edges of the list, and the final result is a segmentation of the
pixels into components.
$$\tau(C) = \frac{k}{|C|}, \tag{9.15}$$
where k is the input constant and |C| is the size of the component C.
Figure 9.16: Graph segmentation method (GSM) equations. The internal differ-
ence of a component is the maximum edge weight of the edges in its minimum
spanning tree. The difference between two components is the minimum edge
weight of the edges formed by two pixels, one belonging to each component. The
threshold function of a component is the constant k divided by the size of that
component, where the size of a component is the number of pixels it contains.
The minimum internal difference among two components is the minimum value
of the sum of the internal difference and the value of the threshold function of
each component.
If the difference between the two components is greater than the mini-
mum internal difference among the two components, then the two compo-
nents are not merged. Otherwise, the two components are merged into one
component.
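The merge rule above can be sketched with a union-find structure. This is a minimal sketch under stated assumptions: the graph is a 4-connected pixel grid, the weight function w(e) is the absolute intensity difference, and the smoothing step is omitted. The function name `graph_segment` is hypothetical.

```python
import numpy as np

def graph_segment(image, k):
    """Greedy component merging on a 4-connected grid graph.
    Edge weight w(e) = absolute intensity difference (an assumed choice)."""
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape
    n = rows * cols
    parent = list(range(n))   # union-find forest: each pixel starts as its own component
    size = [1] * n            # |C| stored at each component root
    internal = [0.0] * n      # internal difference: largest MST edge weight inside C

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    edges = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.append((abs(img[r, c] - img[r, c + 1]), i, i + 1))
            if r + 1 < rows:
                edges.append((abs(img[r, c] - img[r + 1, c]), i, i + cols))
    edges.sort(key=lambda e: e[0])  # nondecreasing weight

    for w, a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        # Minimum internal difference: min over both components of Int(C) + k / |C|.
        m_int = min(internal[ra] + k / size[ra], internal[rb] + k / size[rb])
        if w <= m_int:  # D is false: no evidence for a boundary, so merge
            parent[rb] = ra
            size[ra] += size[rb]
            internal[ra] = max(internal[ra], internal[rb], w)
    return np.array([find(i) for i in range(n)]).reshape(rows, cols)

toy = [[10, 10, 200, 200],
       [10, 11, 200, 201]]
segments = graph_segment(toy, k=5)
```

Because edges are visited in nondecreasing order, the first edge that joins two components is also the minimum-weight edge between them, which is the "difference between two components" used by the criterion. A larger k raises the threshold for small components and therefore favors larger segments.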
where $\eta(x, y) \sim N(0, \sigma^2)$, $\sigma^2$ is the variance of the noise, and N denotes the
Gaussian distribution. The output synthetic image produced by the Gaussian image
generation process is shown in Fig. 9.17.
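A minimal sketch of the perturbation step is given below. It assumes that the noise η(x, y) is added to the ideal gray scale image pixel by pixel; the additive form, the function name `perturb`, and the constant test image are assumptions, because the generating equation itself is not reproduced in this excerpt.

```python
import numpy as np

def perturb(image, variance, seed=None):
    """Return image + eta, with eta(x, y) drawn independently from N(0, variance)."""
    rng = np.random.default_rng(seed)
    clean = np.asarray(image, dtype=float)
    noise = rng.normal(loc=0.0, scale=np.sqrt(variance), size=clean.shape)
    return clean + noise

# Constant gray image perturbed at the variance used in the large noise example.
clean = np.full((200, 200), 100.0)
noisy = perturb(clean, variance=500, seed=0)
```

Note that `rng.normal` takes the standard deviation, so the variance is passed through a square root.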
Figure 9.18 shows eight different directions that the core class of the lumen
can be with respect to the crescent moon class. The darkest region is the core
Figure 9.18: $\sigma^2 = 500$, all directions, large noise protocol. With respect to the
center of the lumen area, the core class is shown in eight different orientations.
In the top row, from right to left: east, northeast, north; in the second row:
northwest, southeast, south; in the third row: southwest, west, west.
Figure 9.19: Images with 10 different variances using large noise protocol. The
gray scale model is perturbed with variance ($\sigma^2$) varying from 100 to 1000. In
the top row, from right to left: $\sigma^2$ = 100 and 200; in the second row: $\sigma^2$ = 300
and 400; in the third row: $\sigma^2$ = 500 and 600; in the fourth row: $\sigma^2$ = 700 and 800;
in the fifth row: $\sigma^2$ = 900 and 1000.
lumen, and the next lightest region that surrounds the core is the crescent moon
class. Figure 9.19 shows the core class and the crescent moon class of the lumen
with perturbation. The darkest region is the core lumen, and the next lightest
region that surrounds the core is the crescent moon class. The variance ($\sigma^2$)
was varied from 100 to 1000.
Figure 9.20: Block diagram of the system. A gray scale image is generated,
with parameters being the number of lumens, location of lumens, the number
of classes K , and a Gaussian perturbation with mean and variance. The re-
sult is an image with multiple lumens with noise. This image is then processed
by the lumen detection and quantification system (LDQS). This system includes
many steps, including classification, binarization, connected component analysis
(CCA), boundary detection, overlaying, and measuring the error. The final
results are the lumen errors and overlays.
generation process takes in the noise parameters: the mean and variance, the
locations of the lumens, the number of lumens and the class intensities of the
lumen core, the crescent moon, and the background.
Step two consists of the lumen detection and quantification system (LDQS) (see
Fig. 9.20). The major block is the classification system discussed in section 9.3.
Next comes the binarization unit, which converts the classified input
into a binarized image and also performs the region merging. The system also has a
connected component analysis (CCA) block, which serves as an input to the LDQS. We
also need the region-to-boundary estimator, which gives the boundary of the left
and right lumens. Finally, we have the quantification system (called Ruler), which
is used to measure the boundary error.
The LDQS consists of the lumen detection system and the lumen quantification
system. The lumen detection system (LDS) is shown in Fig. 9.21. The detection
process is done by the classification system, while the identification is done
by the CCA system. There are three classification systems that we have used in our
Figure 9.21: Block diagram of lumen detection system (LDS). The gray scale
image with multiple lumens is first classified by one of the classifiers. The result
is a classified image with multiple lumens, with each lumen having multiple class
regions. Within each lumen, these multiple regions are merged in the binariza-
tion process, given the number of classes K . They are labeled using connected
component analysis (CCA). The LDS detects and labels each lumen.
processes (see section 9.3). The parameter used is the number of classes (K),
as shown in Fig. 9.21. The CCA block also takes the number
of classes, K, as input.
The lumen detection and identification is further detailed in
Fig. 9.22. The detection system takes the classified image as input and outputs the
binary regions of the lumen. Because of boundary classes and plaque diffusion in
the lumen area, there are multiple classes as well. We merge these classes to generate
the complete lumen, and the final detection of the lumen takes place as shown in
Fig. 9.22. Finally, the system identifies the left and right lumens
using the CCA.
[Figure 9.22 (caption not recovered). Diagram blocks: Detection: Class Merging & Binarization, with inputs K (classes) and ROI; Left and Right Lumen Identification Using CCA.]
due to the bifurcations in the arteries of the plaqued vessels (see sections 9.6.1
and 9.6.2). Figure 9.23 illustrates the region merging algorithm. The input image
has lumens which have one, two, or more classes. If the number of classes in
the ROI is one class, then that class is selected; if two classes are in the ROI,
then the minimum class is selected; and if there are three or more classes in
the ROI, then the minimum two classes are selected. The selected classes are
merged by assigning all the pixels of the selected classes one level value. This
process results in the binarization of the left and right lumens.
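The selection rule for region merging can be sketched as below. It assumes that the classes inside the ROI are integer labels and that "minimum class" means the lowest label value (for example, the darkest class); both the label convention and the function name `binarize_lumen` are assumptions.

```python
import numpy as np

def binarize_lumen(roi_classes):
    """roi_classes: 2D integer array of class labels inside one lumen ROI.
    Returns a binary mask (1 = selected lumen classes, 0 = everything else)."""
    roi = np.asarray(roi_classes)
    present = np.unique(roi)      # class labels in the ROI, sorted ascending
    if present.size <= 2:
        selected = present[:1]    # one class: take it; two classes: take the minimum
    else:
        selected = present[:2]    # three or more classes: take the two minimum classes
    # Merge the selected classes by assigning them a single level value.
    return np.isin(roi, selected).astype(np.uint8)

three_class = binarize_lumen([[0, 1, 2], [1, 0, 2]])
two_class = binarize_lumen([[0, 3], [3, 0]])
one_class = binarize_lumen([[2, 2]])
```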
The binary region labeling process is shown in Fig. 9.24. The process uses
the CCA approach of top to bottom and left to right. Input is an image in which
the lumen regions are binarized. The CCA first labels the image from the top
to the bottom, and then from the left to the right. The result is an image that is
labeled from the left to the right.
The ID assignment process of the CCA for each pixel is shown in Fig. 9.25. In the
CCA, each white pixel of the input binary image is assigned a unique ID. The label
propagation process then results in connected components. The propagation of
Figure 9.23: Region detection: region merging algorithm. The input image has
lumens that have 1, 2, or more classes. If the number of classes in the ROI is
one class, then that class is selected; if two classes are in the ROI, then the
minimum class is selected; and if there are three or more classes in the ROI,
then the minimum two classes are selected. The selected classes are merged
by assigning all the pixels of the selected classes one level value. This process
results in the left and right lumen being binarized.
[Figures 9.24 and 9.25 (captions not recovered). Diagram labels: Top Down Labeling; Binary Image; Assign each white pixel an ID; Each white pixel has unique ID; Label Propagation; Connected Components; Assigning unique IDs.]
the region from left to right is shown in Fig. 9.26. This is the first pass of the
label-propagation process. Every row of the image is scanned from top to bottom, left
to right, pixel by pixel. If the pixel has an ID, then the pixels to the left of and
above the pixel are checked for IDs, and if either one has an ID, then the pixel’s
value is reassigned to the lowest value among the neighbor pixels and the pixel itself.
This process is repeated for all pixels in the row and for all rows. The result
is a binary image with some label propagation. The propagation of the region
from top to bottom is shown in Fig. 9.27. This is the second pass of the
label-propagation process. Every row of the image is scanned from bottom to top,
right to left, pixel by pixel. If the pixel has an ID, then the pixels to the right of
and below the pixel are checked for IDs, and if either one has an ID, then the
pixel’s value is reassigned to the lowest value among the neighbor pixels and
the pixel itself. This process is repeated for all pixels in the row and for all rows.
The result is a binary image with some label propagation. Finally, the region
assignment is summarized in Fig. 9.28. The top-left image is a binary image with
a value of 1 assigned to each of the white pixels. Each of the white pixels is
assigned a unique value in the top-right image. The left-to-right and top-to-bottom
label propagation propagates the labels of value 1 and 3, and the result is the
Figure 9.26: Region identification: propagation. This is the first pass of the label
propagation process. Given the binary image having unique IDs for each white
pixel, every row of the image is scanned from top to bottom, left to right, pixel
by pixel. If the pixel has an ID, then the pixels to the left of and above the pixel are
checked for IDs, and if either one has an ID, then the pixel’s value is reassigned
to the lowest value among the neighbor pixels and the pixel itself. This process
is repeated for all pixels in the row and for all rows. The result is a binary image with
reassigned pixel values.
bottom-left image. Then, the right to left and bottom to top label propagation
propagates the label value of 1 to the pixels having a label value of 3. The result
is the bottom-right image, in which the connected white pixels all have the same
label value of 1. This is the basic algorithm of the process; the CCA we used employs
look-up tables in order to assign regions efficiently in two passes. The results of
CCA on a binary image with 4 lumens are shown in Fig. 9.29. The input image
has the lumens detected, but they are all of the same color. CCA identifies the
lumens by labeling each with a different color. The process to generate a color
image is shown in Fig. 9.30. The first input is a gray scale image. The second
input is the ideal boundary image. This image is dilated and converted to a red
color, resulting in a red ideal boundary image. The third input is the estimated
boundary image. This image is dilated and converted to a green color, resulting
Figure 9.27: Region identification: propagation. This is the second pass of the
label propagation process. Given the binary image having unique IDs for each
white pixel, every row of the image is scanned from bottom to top, right to left,
pixel by pixel. If the pixel has an ID, then the pixels to the right of and below the
pixel are checked for IDs, and if either one has an ID, then the pixel’s value is
reassigned to the lowest value among the neighbor pixels and the pixel itself. This
process is repeated for all pixels in the row and for all rows. The result is a binary
image with reassigned pixel values.
in a green estimated boundary image. These three images are fused to produce a
color overlay image.
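The label propagation used by the CCA (unique ID assignment, a forward pass, then a backward pass) can be sketched as follows. This sketch does not use the look-up tables mentioned in the text; instead it simply repeats the forward and backward passes until no label changes, which is an assumption made so that shapes needing more than one sweep are still labeled correctly. The function name `label_components` and the 4-connectivity are also assumptions.

```python
import numpy as np

def label_components(binary):
    """Label 4-connected white regions by minimum-ID propagation."""
    mask = np.asarray(binary).astype(bool)
    rows, cols = mask.shape
    ids = np.zeros((rows, cols), dtype=int)
    # Assign each white pixel a unique ID in raster order (background stays 0).
    ids[mask] = np.arange(1, int(mask.sum()) + 1)

    def sweep(row_order, col_order, dr, dc):
        """One pass: each white pixel takes the lowest ID among itself and
        its horizontal neighbour (offset dc) and vertical neighbour (offset dr)."""
        changed = False
        for r in row_order:
            for c in col_order:
                if not mask[r, c]:
                    continue
                best = ids[r, c]
                rr, cc = r + dr, c + dc
                if 0 <= cc < cols and mask[r, cc]:
                    best = min(best, ids[r, cc])
                if 0 <= rr < rows and mask[rr, c]:
                    best = min(best, ids[rr, c])
                if best != ids[r, c]:
                    ids[r, c] = best
                    changed = True
        return changed

    while True:
        # First pass: top to bottom, left to right; check left and upper neighbours.
        forward = sweep(range(rows), range(cols), -1, -1)
        # Second pass: bottom to top, right to left; check right and lower neighbours.
        backward = sweep(range(rows - 1, -1, -1), range(cols - 1, -1, -1), 1, 1)
        if not (forward or backward):
            break
    return ids

u_shape = [[1, 1, 0, 1, 1],
           [1, 1, 0, 1, 1],
           [1, 1, 1, 1, 1]]
labeled = label_components(u_shape)
two_blobs = label_components([[1, 0, 1], [1, 0, 1]])
```

For the U-shaped example, the first pass leaves the right arm with label 3 and the second pass brings it down to 1, which mirrors the behavior described for Fig. 9.28.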
Figure 9.28: Region identification: ID Propagation. The top-left image is a binary
image with a value of 1 assigned to each of the white pixels. Each of the white
pixels is assigned a unique value in the top-right image. The left-to-right and
top-to-bottom label propagation propagates the labels of value 1 and 3, and the
result is the bottom-left image. Then, the right-to-left and bottom-to-top label
propagation propagates the label value of 1 to the pixels having a label value
of 3. The result is the bottom-right image, in which the connected white pixels
all have the same label value of 1.
Figure 9.29: Region identification: CCA. The input image has the lumens de-
tected, but they are all the same color. Connected component analysis (CCA)
identifies the lumens by labeling each with a different color.
Figure 9.30: Color overlay block. The first input is a gray scale image. The
second input is the ideal boundary image. This image is dilated and converted
to a red color, resulting in a red ideal boundary image. The third input is the
estimated boundary image. This image is dilated and converted to a green color,
resulting in a green estimated boundary image. These three images are fused to
produce a color overlay image.
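The fusion of the three inputs can be sketched as below. The 3 × 3 structuring element for the dilation, the drawing order (green drawn last, so it wins where the two boundaries overlap), the 8-bit gray range, and the function names `dilate` and `color_overlay` are all assumptions.

```python
import numpy as np

def dilate(mask):
    """3x3 binary dilation built from shifted copies of the zero-padded mask."""
    m = np.asarray(mask).astype(bool)
    rows, cols = m.shape
    padded = np.pad(m, 1)  # pads with False
    out = np.zeros_like(m)
    for dr in range(3):
        for dc in range(3):
            out |= padded[dr:dr + rows, dc:dc + cols]
    return out

def color_overlay(gray, ideal_boundary, estimated_boundary):
    """Fuse a gray image with a red ideal boundary and a green estimated boundary."""
    g = np.asarray(gray).astype(np.uint8)
    rgb = np.stack([g, g, g], axis=-1)                # gray replicated into R, G, B
    rgb[dilate(ideal_boundary)] = (255, 0, 0)         # ideal boundary in red
    rgb[dilate(estimated_boundary)] = (0, 255, 0)     # estimated boundary in green
    return rgb

gray = np.full((5, 5), 100)
ideal = np.zeros((5, 5), dtype=bool)
ideal[1, 1] = True
estimated = np.zeros((5, 5), dtype=bool)
estimated[3, 3] = True
overlay = color_overlay(gray, ideal, estimated)
```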
classification system. In the second row the right image shows the binarization
of the image after selecting only the core class for binarization (K = 1). In the
third row the left image shows the binarization of the image after selecting the
core class and the edge classes for binarization (K > 1). In the third row the
right image shows the image (K = 1) after the labeling of CCA. In the fourth row
the left image shows the image (K > 1) after the labeling of CCA. In the fourth
row the right image shows the image (K = 1) after the labeling of assign ID. In
the fifth row the left image shows the image (K > 1) after the labeling of assign
ID. In the fifth row the right image shows the computer-estimated boundary
of the image (K = 1) using the region-to-boundary algorithm. In the sixth row
the left image shows the computer-estimated boundary of the image (K > 1)
using the region-to-boundary algorithm. In the sixth row the right image shows
the original image on which is overlaid the ideal ground truth boundary, the
artifacted boundary (K = 1), and the corrected boundary (K > 1).
Figure 9.32 shows all the steps of the MRF classification system for the left
and right lumen detection, identification, and boundary estimation process in
the synthetic images. We look at the large noise protocol as an example below, with
Figure 9.31: Results on synthetic image with noise variance $\sigma^2 = 500$ using
the FCM method. Row 1, left: Synthetically generated image. Row 1, right: After
Perona–Malik smoothing. Row 2, left: After FCM classification system. Row 2, right:
Binarization with only C0 class (K = 1). Row 3, left: Binarization with merging
C0, C1, and C2 classes (K > 1). Row 3, right: Binarization after CCA (K = 1).
Row 4, left: Binarization after CCA (K > 1). Row 4, right: After assign ID (K = 1).
Row 5, left: After assign ID (K > 1). Row 5, right: After region to boundary
(K = 1). Row 6, left: After region to boundary (K > 1). Row 6, right: Overlay
generation with and without crescent moon.
Figure 9.32: Results on synthetic image with noise variance $\sigma^2 = 500$ using
the MRF method. Row 1, left: Synthetically generated image. Row 1, right: After MRF
classification system. Row 2, left: Binarization with only C0 class (K = 1). Row 2,
right: Binarization with merging C0, C1, and C2 classes (K > 1). Row 3, left:
Binarization after CCA (K = 1). Row 3, right: Binarization after CCA (K > 1).
Row 4, left: After assign ID (K = 1). Row 4, right: After assign ID (K > 1). Row 5,
left: After region to boundary (K = 1). Row 5, right: After region to boundary
(K > 1). Row 6, left: Overlay generation with and without crescent moon. Row 6,
right: Overlay generation with and without crescent moon.
noise level $\sigma^2 = 500$. In the first row the left image shows the synthetically
generated image. In the first row the right image shows the classified image after
the image has gone through the MRF classification system. In the second row
the left image shows the binarization of the image after selecting only the core
class for binarization (K = 1). In the second row the right image shows the bi-
narization of the image after selecting the core class and the edge classes for
binarization (K > 1). In the third row the left image shows the image (K = 1)
after the labeling of CCA. In the third row the right image shows the image
(K > 1) after the labeling of CCA. In the fourth row the left image shows the
image (K = 1) after the labeling of assign ID. In the fourth row the right im-
age shows the image (K > 1) after the labeling of assign ID. In the fifth row
the left image shows the computer-estimated boundary of the image (K = 1),
using the region-to-boundary algorithm. In the fifth row the right image shows
the computer-estimated boundary of the image (K > 1), using the region-to-
boundary algorithm. In the sixth row the left image shows the original image
on which is overlaid the ideal ground truth boundary, the artifacted boundary
(K = 1), and the corrected boundary (K > 1). In the sixth row the right image
shows the original image on which is overlaid the ideal ground truth boundary,
the artifacted boundary (K = 1), and the corrected boundary (K > 1).
Figures 9.33 and 9.34 show all the steps of the GSM classification system
for the left and right lumen detection, identification, and boundary estimation
process in the synthetic images. We look at the large noise protocol as an example
below, with noise level $\sigma^2 = 500$. In Fig. 9.33, in the first row, the left image
shows the synthetically generated image. In the first row the right image shows
the image after it has been smoothed by the Perona–Malik smoothing function.
In the second row the left image shows the image after its frequency peaks of
pixel values have been merged. In the second row the right image shows the
classified image after the image has gone through the GSM classification sys-
tem. In the third row the left image shows the binarization of the image after
selecting only the core class for binarization (K = 1). In the third row the right
image shows the binarization of the image after selecting the core class and the
edge classes for binarization (K > 1). In the fourth row the left image shows
the image (K = 1) after the labeling of CCA. In the fourth row the right image
shows the image (K > 1) after the labeling of CCA. In Fig. 9.34, the first row
the left image shows the image (K = 1) after the labeling of assign ID. In the
first row the right image shows the image (K > 1) after the labeling of assign
Figure 9.33: Results on synthetic image with noise variance $\sigma^2 = 500$ using
the GSM method. Row 1, left: Synthetically generated image. Row 1, right: After peak
merger. Row 2, left: After Perona–Malik Smoothing. Row 2, right: After GSM
classification system. Row 3, left: Binarization with only C0 class (K = 1). Row 3,
right: Binarization with merging C0, C1, and C2 classes (K > 1). Row 4, left:
Binarization after CCA (K = 1). Row 4, right: Binarization after CCA (K > 1).
ID. In the second row the left image shows the computer-estimated bound-
ary of the image (K = 1), using the region-to-boundary algorithm. In the sec-
ond row the right image shows the computer-estimated boundary of the image
(K > 1), using the region-to-boundary algorithm. In the third row the left image
shows the original image on which is overlaid the ideal ground truth boundary,
the artifacted boundary (K = 1), and the corrected boundary (K > 1). In
the third row the right image shows the original image on which is overlaid the
ideal ground truth boundary, the artifacted boundary (K = 1), and the corrected
boundary (K > 1).
Figure 9.34: Results on synthetic image with noise variance $\sigma^2 = 500$ using
the GSM method. Row 5, left: After assign ID (K = 1). Row 5, right: After assign ID
(K > 1). Row 6, left: After region to boundary (K = 1). Row 6, right: After region
to boundary (K > 1). Row 7, left: Overlay generation with and without crescent
moon. Row 7, right: Overlay generation with and without crescent moon.
where
$$d_1 = \sqrt{(x_0 - x_1)^2 + (y_0 - y_1)^2}$$
$$d_2 = \sqrt{(x_0 - x_2)^2 + (y_0 - y_2)^2}$$
The distance $d_{vb}(B_1, B_2)$ between the vertices of polygon $B_1$ and the sides of
polygon $B_2$ is defined as the sum of the distances from the vertices of the polygon
$B_1$ to the closest side of $B_2$:
$$d_{vb}(B_1, B_2) = \sum_{v \in \text{vertices } B_1} d(v, B_2)$$
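The vertex-to-boundary distance can be sketched as follows. The point-to-segment rule (perpendicular distance when the projection falls on the segment, otherwise the smaller of the distances d1 and d2 to the two end points) and the symmetric normalization by the total number of vertices in `polyline_distance` are assumptions based on the usual form of the polyline distance, since those definitions are not reproduced in full in this excerpt. Polygons are assumed to be closed and given as lists of (x, y) tuples; all function names are hypothetical.

```python
import math

def point_segment_distance(v, a, b):
    """Distance from vertex v = (x0, y0) to the segment from a = (x1, y1) to b = (x2, y2)."""
    (x0, y0), (x1, y1), (x2, y2) = v, a, b
    dx, dy = x2 - x1, y2 - y1
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:                      # degenerate segment: a single point
        return math.hypot(x0 - x1, y0 - y1)
    # Position of the projection of v along the segment (0 at a, 1 at b).
    lam = ((x0 - x1) * dx + (y0 - y1) * dy) / seg_len2
    if lam < 0.0 or lam > 1.0:
        d1 = math.hypot(x0 - x1, y0 - y1)    # distance to end point a
        d2 = math.hypot(x0 - x2, y0 - y2)    # distance to end point b
        return min(d1, d2)
    # Perpendicular distance from v to the line through a and b.
    return abs((x0 - x1) * dy - (y0 - y1) * dx) / math.sqrt(seg_len2)

def d_vb(b1, b2):
    """Sum, over the vertices of b1, of the distance to the closest side of b2."""
    sides = list(zip(b2, b2[1:] + b2[:1]))   # consecutive vertex pairs, closed polygon
    return sum(min(point_segment_distance(v, a, b) for a, b in sides) for v in b1)

def polyline_distance(b1, b2):
    """Assumed symmetric form: (d_vb(B1, B2) + d_vb(B2, B1)) / total number of vertices."""
    return (d_vb(b1, b2) + d_vb(b2, b1)) / (len(b1) + len(b2))

inner = [(0, 0), (2, 0), (2, 2), (0, 2)]
outer = [(-1, -1), (3, -1), (3, 3), (-1, 3)]
ds = polyline_distance(inner, outer)
```

For the two concentric squares, each inner vertex is 1 unit from the nearest outer side, while each outer vertex is √2 from the nearest inner corner, which shows why the measure is computed in both directions.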
9.5.1 Mean Error ($e_{NFP}^{poly}$)
Using the definition of the polyline distance between two polygons, we can now
compute the mean error of the overall system. It is denoted by $e_{NFP}^{poly}$ and defined
by
$$e_{NFP}^{poly} = \frac{2 \times \sum_{t=1}^{F} \sum_{n=1}^{N} D_s(G_{nt}, C_{nt})}{F \times N} \tag{9.22}$$
where $D_s(G_{nt}, C_{nt})$ is the polyline distance between the ground truth polygon $G_{nt}$
and the computer-estimated polygon $C_{nt}$ for patient study n and slice number t.
Using the definition of the polyline distance between two polygons, the standard
deviation can be computed as
$$\sigma_{NFP}^{poly} = \sqrt{\frac{\sum_{t=1}^{F} \sum_{n=1}^{N} \left\{ \sum_{v \in \text{vertices } G_{nt}} \left( d_b(v, C_{nt}) - e_{NFP}^{poly} \right)^2 + \sum_{v \in \text{vertices } C_{nt}} \left( d_b(v, G_{nt}) - e_{NFP}^{poly} \right)^2 \right\}}{N \times F \times (\#\,\text{vertices} \in B_1 + \#\,\text{vertices} \in B_2)}} \tag{9.23}$$
Figure 9.35 (left and right) shows the performance of the synthetic system for
the small noise protocol using polyline (see section 9.5) and shortest distance
methods. Figure 9.35 (left) compares the mean error curves of the MRF vs.
FCM (with smoother) using the PDM, while Fig. 9.35 (right) compares the mean
error curves of the MRF vs. FCM (with smoother) using the shortest distance
method. Using PDM, as the variance of the noise ($\sigma^2$) increases from 0 to 100,
the mean error in both methods increases gradually. The mean error for the FCM
(with smoother) remains under 1.6 pixels, while the mean error for MRF ranges
between 1.6 and 1.8 pixels. The same pattern is observed using SDM method
(see Fig. 9.35, right). It is also seen in the two graphs that FCM using PDM has
a lower error compared to FCM using SDM.
In another protocol, we run the same PDM and SDM for the FCM method, but
with and without the Perona–Malik smoothing process. This can be seen in
Fig. 9.36 (left and right). The PDE-based smoothing system improves
Figure 9.35: Results of MRF vs. FCM using PDM and SDM methods for small
noise protocol. Left: MRF vs. FCM using PDM method. Right: MRF vs. FCM using
SDM method.
the error over the non-PDE-based system at large noise and thus is more robust in
the identification and detection process. It is also seen in the two graphs that FCM
(with and without smoother) using PDM has a lower error compared to FCM
(with and without smoother) using SDM.
In another protocol, we compare the MRF vs. FCM (without PDE smoother),
and this can be seen in Fig. 9.37 (left and right). Using PDM, as the variance of
the noise ($\sigma^2$) increases from 0 to 100, the mean error in both methods increases
gradually. The mean error for the FCM (without smoother) remains under 1.6
pixels, while the mean error for MRF ranges between 1.6 and 1.8 pixels. The
same pattern is observed using SDM method (see Fig. 9.37, right). It is also seen
Figure 9.36: Effect of PDE-smoother process on the overall system. Left: FCM
using PDM method (for small noise protocol). Right: FCM using SDM method
(for small noise protocol).
Figure 9.37: MRF vs. FCM. Left: MRF using PDM method (for small noise pro-
tocol). Right: FCM using PDM method (for small noise protocol).
in the two graphs that FCM (without smoother) and MRF using PDM have a
lower error compared to FCM (without smoother) and MRF using SDM.
Figure 9.38: MRF vs. FCM for large noise protocol. Left: MRF using PDM
method. Right: FCM using PDM method. Note that the range of the mean er-
rors is less than 0.36 pixels using MRF and less than 0.05 pixels using FCM with
smoother.
mean error for MRF increases more rapidly compared to that of FCM. The mean
error for the FCM (with smoother) remains close to 1.6 pixels, while the mean
error for MRF ranges between 1.7 ($\sigma^2 = 100$) and 2.1 pixels ($\sigma^2 = 1000$). The
same pattern is observed for the MRF vs. FCM using SDM (see Fig. 9.38, right).
Figure 9.39: The bias error is compared between the MRF and the FCM with
smoother. The mean errors are plotted against consecutive points around the
contour. Large noise protocol.
Figure 9.40: PDM vs. SDM methods. Left: MRF: PDM vs. SDM. Right: FCM:
PDM vs. SDM. The length of the range of the mean errors is less than 0.36 pixels,
and the difference between the two curves is about 0.03 pixels.
Figure 9.41: Sampling protocol test: Left: Shape optimization test. Right: Con-
centric shape decagon test. Two circle contours each of radius 60 pixels have
their centers separated by 60 pixels. The mean errors given by the PDM and the
SDM are plotted against the number of points on the circular contours. As the
number of points on each of the contours increases, the difference in the mean
errors decreases, and both errors approach an actual value.
points increases on the boundary, the boundary becomes smoother, but we
do not know how many points are necessary on the boundary to represent
the best lumen shape. Figure 9.41 (left and right) demonstrates the mean error
around the boundary versus the number of points on the lumen boundary. As
the number of points increases from 10 to 120, the mean error drops rapidly
using PDM and SDM methods. Using PDM, the mean error drops rapidly when
the number of boundary points increases from 10 to 30 and reaches a stage of
convergence when the number of points is 50. The same pattern is observed
using the SDM method and the mean error falls rapidly from points 10 to 50 and
reaches a stage of convergence when the number of points on the boundary
is 80. The stage of convergence here means that there is no more change in
the mean error, if the number of points increases beyond a certain limit. Lastly,
the fall of the errors as the number of points increases is more rapid for SDM
compared to that of PDM, and the starting error (when total points are 10) in
SDM is much larger compared to that of PDM. A similar experiment was done
synthetically when the boundaries are concentric shapes. We took a simple
shape of a concentric decagon (with radius 20 and 50 pixels) and increasing
the number of points from 10 to 120. Since the boundaries were concentric, the
500 Suri et al.
point of convergence was same (70 points) for both PDM and SDM (see Fig. 9.41,
right).
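The dependence of the two rulers on the number of boundary points can be
reproduced with a small numerical sketch. It rests on assumptions not stated in
this passage: SDM is taken as the distance from each vertex to the nearest
vertex of the other contour, PDM as the distance from each vertex to the
nearest segment of the other contour, and both are averaged symmetrically over
the two contours. The function names are illustrative only.

```python
import numpy as np

def circle(cx, cy, r, n):
    """Sample n evenly spaced vertices on a circle (closed contour)."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.column_stack((cx + r * np.cos(t), cy + r * np.sin(t)))

def point_to_points(p, contour):
    """Assumed SDM ruler: distance from p to the nearest contour vertex."""
    return np.min(np.hypot(*(contour - p).T))

def point_to_polyline(p, contour):
    """Assumed PDM ruler: distance from p to the nearest contour segment."""
    a = contour
    b = np.roll(contour, -1, axis=0)          # next vertex, wrapping around
    ab = b - a
    # Projection of p onto each segment, clamped to the segment ends.
    t = np.einsum("ij,ij->i", p - a, ab) / np.einsum("ij,ij->i", ab, ab)
    t = np.clip(t, 0.0, 1.0)
    closest = a + t[:, None] * ab
    return np.min(np.hypot(*(closest - p).T))

def mean_error(c1, c2, dist):
    """Symmetric mean distance between two contours under a given ruler."""
    d12 = [dist(p, c2) for p in c1]
    d21 = [dist(p, c1) for p in c2]
    return (sum(d12) + sum(d21)) / (len(c1) + len(c2))

if __name__ == "__main__":
    # Two circles of radius 60 with centers 60 pixels apart, as in Fig. 9.41.
    for n in (10, 30, 50, 80, 120):
        c1, c2 = circle(0, 0, 60, n), circle(60, 0, 60, n)
        print(n, mean_error(c1, c2, point_to_polyline),
              mean_error(c1, c2, point_to_points))
```

Under these assumptions the polyline ruler can never exceed the vertex ruler,
because every vertex lies on the polyline, and the gap between the two shrinks
as the sampling becomes denser.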
Figure 9.42: Effect of fitting splines over the estimated boundaries. Top left:
MRF, PDM, with and without splines. Top right: MRF, SDM, with and without
splines. Bottom left: FCM, PDM, with and without splines. Bottom right: FCM,
SDM, with and without splines.
fitting splines over the estimated boundaries. The figure has four parts
showing the effect of splines on two classification systems, each measured with
two distance methods: (a) MRF using PDM, (b) MRF using SDM, (c) FCM using PDM,
and (d) FCM using SDM. All four subprotocols behave the same way: the
spline-fitted mean errors are lower than the nonspline-fitted mean errors. The
standard deviation of the error is also very consistent across all four
subprotocols. We also ran the lumen shape optimization on the spline-fitted
shapes, as shown in Fig. 9.43 (left and right). As the number of points on the
boundary increases, the mean error drops and converges.
To determine the average area of the entire lumen from the ground truth bound-
aries, the area by triangles computation is used. The center point of the ROI is
the user input, and is equivalent to the center of gravity (CG). The area of the
enclosed region is obtained by summing the areas of the triangles formed by the
CG and each pair of neighboring points on the boundary.
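The area-by-triangles computation can be sketched as follows. This is a minimal
illustration, not the authors' code: the function name and the representation
of the boundary as an ordered list of (x, y) points are assumptions. Summing
unsigned triangle areas is only valid when the region is star-shaped with
respect to the chosen center, which is assumed here.

```python
def area_by_triangles(boundary, center):
    """Sum the areas of the triangles (center, p_i, p_{i+1}) around a closed
    boundary given as an ordered list of (x, y) points."""
    cx, cy = center
    n = len(boundary)
    total = 0.0
    for i in range(n):
        x1, y1 = boundary[i]
        x2, y2 = boundary[(i + 1) % n]   # wrap around to close the contour
        # Triangle area is half the magnitude of the 2-D cross product.
        total += abs((x1 - cx) * (y2 - cy) - (x2 - cx) * (y1 - cy)) / 2.0
    return total

# A 4 x 3 rectangle traced around its center.
rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
rectangle_area = area_by_triangles(rectangle, center=(2.0, 1.5))
```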
In the scan-line method, we count the number of pixels along each scan line
that lie inside the ROI. This is done for every line that intersects the ROI
region. The entry and exit points are found by counting how many times the scan
line has intersected the boundary, which gives an odd or even number. At an odd
intersection (the first), begin counting pixels; at an even intersection (the
second), stop counting. This gives the total number of pixels along the line.
The process stops when there are no more intersections. In a 384 × 512 image,
the average area for the left and right lumen is 500 square pixels.
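The odd/even counting rule can be illustrated with a short sketch. It assumes
the boundary is available as a closed polygon of (x, y) vertices and that a
pixel is counted when its center lies between an entry and an exit crossing.
These conventions, and the function name, are assumptions made for the example.

```python
import math

def scanline_area(boundary, height):
    """Count pixels inside a closed polygon by scanning rows 0..height-1."""
    n = len(boundary)
    count = 0
    for row in range(height):
        y = row + 0.5                      # scan through the pixel centers
        crossings = []
        for i in range(n):
            x1, y1 = boundary[i]
            x2, y2 = boundary[(i + 1) % n]
            # Half-open test so a vertex is not counted twice;
            # horizontal edges never satisfy it.
            if (y1 <= y < y2) or (y2 <= y < y1):
                crossings.append(x1 + (y - y1) * (x2 - x1) / (y2 - y1))
        crossings.sort()
        # Odd crossings start counting, even crossings stop counting.
        for x_in, x_out in zip(crossings[0::2], crossings[1::2]):
            count += max(0, math.ceil(x_out - 0.5) - math.ceil(x_in - 0.5))
    return count

# A 10 x 5 rectangle inside a 10-row image.
rectangle = [(2, 1), (12, 1), (12, 6), (2, 6)]
rectangle_pixels = scanline_area(rectangle, height=10)
```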
The select class package takes as one of its inputs the number of classes
produced by the segmentation method. Using this as the size of an array of the
different classes C0 through Cn, the program checks each pixel in the ROI and
stores the number of times that each of the different pixel values occurs. The
program then sorts these class values by their frequency.
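The counting and sorting step might look like the sketch below. It assumes the
segmentation output is an integer label image and the ROI is a boolean mask of
the same shape. The sort direction is not stated in the text, so descending
frequency is an assumption, as is the function name.

```python
import numpy as np

def class_frequencies(labels, roi_mask, n_classes):
    """Return (class, count) pairs for the pixels inside the ROI,
    sorted by count (most frequent first, an assumed ordering)."""
    counts = np.bincount(labels[roi_mask], minlength=n_classes)
    order = np.argsort(-counts, kind="stable")
    return [(int(c), int(counts[c])) for c in order]

labels = np.array([[0, 0, 1],
                   [2, 1, 1],
                   [1, 2, 0]])
roi = np.ones_like(labels, dtype=bool)
frequencies = class_frequencies(labels, roi, n_classes=3)
```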
Using the average ground truth contour area, a difference threshold, Td, is
determined. We set Td = 75. If the difference between the average ground truth
contour area and the number of pixels of C0 in the ROI is less than the
difference threshold, then only C0 is selected. If the difference is greater
than the difference threshold, then both C0 and C1 are selected and merged, and
a binary image is made.
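The selection rule can be written as a short sketch. It assumes that classes C0
and C1 carry the integer labels 0 and 1 in the label image, that the difference
is taken as an absolute value, and that a difference exactly equal to Td falls
into the merge branch. The text does not specify any of these, and the function
name is hypothetical.

```python
import numpy as np

def select_lumen_classes(labels, roi_mask, avg_gt_area, td=75):
    """Choose C0 alone or C0 merged with C1, and build the binary image."""
    c0_pixels = int(np.count_nonzero((labels == 0) & roi_mask))
    if abs(avg_gt_area - c0_pixels) < td:
        selected = [0]            # C0 alone already matches the expected area
    else:
        selected = [0, 1]         # merge C0 and C1
    binary = (np.isin(labels, selected) & roi_mask).astype(np.uint8)
    return selected, binary

# Synthetic example: 440 pixels of C0 and 60 pixels of C1 in a 40 x 40 ROI.
labels = np.full((40, 40), 2)
labels[0:20, 0:22] = 0
labels[20:25, 0:12] = 1
roi = np.ones(labels.shape, dtype=bool)
selected, binary = select_lumen_classes(labels, roi, avg_gt_area=500)
```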
The GSM does not use a select class package; it uses a region growing method
instead. The GSM usually merges the C0 and C1 classes, so the region growing
captures both C0 and C1 classes.
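A generic region growing pass over a label image is sketched below for
illustration. The seed choice, the 4-connectivity, and the function name are
assumptions; the text does not describe the specific region growing variant
used with the GSM.

```python
from collections import deque

import numpy as np

def grow_region(labels, seed, accepted=(0, 1)):
    """Grow a 4-connected region from seed over pixels whose label is in
    accepted (here the assumed labels of classes C0 and C1)."""
    height, width = labels.shape
    grown = np.zeros((height, width), dtype=bool)
    if labels[seed] not in accepted:
        return grown
    grown[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < height and 0 <= nc < width
                    and not grown[nr, nc] and labels[nr, nc] in accepted):
                grown[nr, nc] = True
                queue.append((nr, nc))
    return grown

labels = np.full((6, 6), 2)
labels[1:3, 1:3] = 0      # C0 blob containing the seed
labels[1:3, 3] = 1        # adjacent C1 pixels, captured with C0
labels[4:6, 4:6] = 0      # separate C0 blob, not connected to the seed
region = grow_region(labels, seed=(1, 1))
```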
5
We ran the system using each of the three classification methods on real patient
data. Ground truth boundaries of the walls of the carotid artery were traced for 15 patients,
giving roughly 22,500 boundary points in total. One pixel was equivalent to
0.25 mm. Using MRF, the average error was 0.61 pixels; using FCM, the average error was
0.62 pixels; using GSM, the average error was 0.74 pixels.
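Since one pixel corresponds to 0.25 mm, the pixel errors above convert to
millimeters by a single multiplication. The snippet below only restates that
arithmetic; the variable names are illustrative.

```python
PIXEL_SIZE_MM = 0.25  # one pixel is 0.25 mm, as stated above

mean_error_px = {"MRF": 0.61, "FCM": 0.62, "GSM": 0.74}

# Convert each average boundary error from pixels to millimeters.
mean_error_mm = {method: round(err * PIXEL_SIZE_MM, 4)
                 for method, err in mean_error_px.items()}

for method, err_mm in mean_error_mm.items():
    print(f"{method}: {err_mm} mm")
```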
Table 9.1: Mean errors as computed using polyline and shortest distance
methods when the classification system is FCM based
Patient No. Artifacted (PDM) Corrected (PDM) Artifacted (SDM) Corrected (SDM)
method. Column 1 shows the error when the estimated boundary is not cor-
rected (artifacted), using the PDM ruler. Column 2 shows the error when the
estimated boundary is corrected by merging multiple classes of the lumen, using
the PDM ruler. Column 3 shows the error when the estimated boundary is not
corrected (artifacted), using the SDM ruler. Column 4 shows the error when the
estimated boundary is corrected by merging multiple classes of the lumen, using
the SDM ruler. As seen in the table, column 2 shows the least error and is
significantly improved over the artifacted boundaries. Table 9.2 shows the error
between the computer-estimated boundary and ground truth boundary using
MRF-based method. Column 1 shows the error when the estimated boundary is
not corrected (artifacted), using the PDM ruler. Column 2 shows the error when
the estimated boundary is corrected by merging multiple classes of the lumen,
using the PDM ruler. Column 3 shows the error when the estimated boundary is
not corrected (artifacted), using the SDM ruler. Column 4 shows the error when
the estimated boundary is corrected by merging multiple classes of the lumen,
using the SDM ruler. As seen in the table, column 2 shows the least error and is
significantly improved over the artifacted boundaries. Table 9.3 shows the error
between the computer-estimated boundary and ground truth boundary using
GSM-based method. Column 1 shows the error when the estimated boundary is
not corrected (artifacted), using the PDM ruler. Column 2 shows the error when
the estimated boundary is corrected by merging multiple classes of the lumen,
Table 9.2: Mean errors as computed using polyline and shortest distance
methods when the classification system is MRF based
Patient No. Artifacted (PDM) Corrected (PDM) Artifacted (SDM) Corrected (SDM)
Table 9.3: Mean errors as computed using polyline and shortest distance
methods when the classification system is GSM based
Patient No. Artifacted (PDM) Corrected (PDM) Artifacted (SDM) Corrected (SDM)
using the PDM ruler. Column 3 shows the error when the estimated boundary is
not corrected (artifacted), using the SDM ruler. Column 4 shows the error when
the estimated boundary is corrected by merging multiple classes of the lumen,
using the SDM ruler. As seen in the table, column 2 shows the least error and is
significantly improved over the artifacted boundaries.
9.7 Conclusions
in most cases is the farthest distance from the center to a point on the contour.
The ROI is the circle given by this center and this radius. The center and radius
are sometimes adjusted after seeing the result of the pipeline’s first run.
9.8 Acknowledgments
The authors thank the Department of Radiology for the MR datasets. Thanks
also to the students of Biomedical Imaging Laboratory at the Department of
Biomedical Engineering, Case Western Reserve University for cooperating on
sharing the calibrated machines for tracing the ground truth on plaque volumes.
Questions
3. Discuss the three types of algorithms used in this chapter for lumen
estimation.
Bibliography
[1] Rogers, W. J., Prichard, J. W., Hu, Y. L., Olson, P. R., Benckart, D. H.,
Kramer, C. M., Vido, D. A., and Reichek, N., Characterization of sig-
nal properties in atherosclerotic plaque components by intravascular
MRI, Arterioscler. Thromb. Vasc. Biol., Vol. 20, No. 7, pp. 1824–1830,
2000.
[4] Pietrzyk, U., Herholz, K., and Heiss, W. D., Three-dimensional align-
ment of functional and morphological tomograms, J. Comput. Assist.
Tomogr., Vol. 14, No. 1, pp. 51–59, 1990.
[5] Coombs, B. D., Rapp, J. H., Ursell, P. C., Reily, L. M., and Saloner,
D., Structure of plaque at carotid bifurcation: High-resolution MRI
with histological correlation, stroke, Vol. 32, No. 11, pp. 2516–2521,
2001.
[6] Brown, B. G., Hillger, L., Zhao, X. Q., Poulin, D., and Albers, J. J., Types
of changes in coronary stenosis severity and their relative importance
in overall progression and regression of coronary disease: Observa-
tions from the FATS TRial: Familial Atherosclerosis Treatement Study,
Ann. N.Y. Acad. Sci., Vol. 748, pp. 407–417, 1995.
[7] Helft, G., Worthley, S. G., Fuster, V., Fayad, Z. A., Zaman, A. G.,
Corti, R., Fallon, J. T., and Badimon, J. J., Progression and regression
of atherosclerotic lesions: Monitoring with serial noninvasive MRI,
Circulation, Vol. 105, pp. 993–998, 2002.
[8] Hayes, C. E., Hattes, N., and Roemer, P. B., Volume imaging with MR
phased arrays, Magn. Reson. Med., Vol. 18, No. 2, pp. 309–319, 1991.
Lumen Identification, Detection, and Quantification in MR Plaque Volumes 511
[9] Gill, J. D., Ladak, H. M., Steinman, D. A., and Fenster, A., Segmentation
of ulcerated plaque: A semi-automatic method for tracking the pro-
gression of carotid atherosclerosis, In: Proceedings of 22nd Annual
EMBS International Conference, 2000, pp. 669–672.
[10] Yang, F., Holzapfel, G., Schulze-Bauer, Ch. A. J., Stollberger, R., The-
dens, D., Bolinger, L., Stolpen, A., and Sonka, M., Segmentation of wall
and plaque in in vitro vascular MR images, Int. J. Cardiovasc. Imaging,
Vol. 19, No. 5, pp. 419–428, 2003.
[11] Kim, W. Y., Stuber, M., Boernert, P., Kissinger, K. V., Manning, W. J.,
and Botnar, R. M., Three-dimensional black-blood cardiac magnetic
resonance coronary vessel wall imaging detects positive arterial re-
modeling in patients with nonsignificant coronary artery disease, Cir-
culation, Vol. 106, No. 3, pp. 296–299, 2002.
[13] Jespersen, S. K., Gro/nholdt, M.-L. M., Wilhjelm, J. E., Wiebe, B.,
Hansen, L. K., and Sillesen, H., Correlation between ultrasound B-
mode images of carotid plaque and histological examination, IEEE
Proc. Ultrason. Symp., Vol. 2, pp. 165–168, 1996.
[14] Quick, H. H., Debatin, J. F., and Ladd, M. E., MR imaging of the vessel
wall, Euro. Radiol., Vol. 12, No. 4, pp. 889–900, 2002.
[15] Corti, R., Fayad, Z. A., Fuster, V., Worthley, S. G., Helft, G., Chesebro, J.,
Mercuri, M., and Badimon, J. J., Effects of lipid-lowering by sim-
vastatin on human atherosclerotic lesions: A longitudinal study by
high-resolution, noninvasive magnetic resonance imaging, Circula-
tion, Vol. 104, No. 3, pp. 249–252, 2001.
[17] Helft, G., Worthley, S. G., Fuster, V., Zaman, A. G., Schechter, C.,
Osende, J. I., Rodriguez, O. J., Fayad, Z. A., Fallon, J. T., and Badimon,
J. J., Atherosclerotic aortic component quantification by noninvasive
magnetic resonance imaging: An in vivo study in rabbits, J. Am. Coll.
Cardiol., Vol. 37, No. 4, pp. 1149, 2001.
[18] Shinnar, M., Fallon, J. T., Wehrli, S., Levin, M., Dalmacy, D., Fayad, Z. A.,
Badimon, J. J., Harrington, M., Harrington, E., and Fuster, V., The di-
agnostic accuracy of ex vivo MRI for human atherosclerotic plaque
characterization, Arterioscler. Thromb., Vasc. Biol., Vol. 19, No. 11,
pp. 2756–2761, 1999.
[19] Toussaint, J. F., LaMuraglia, G. M., Southern, J. F., Fuster, V., and
Kantor, H. L., Magnetic resonance images lipid, fibrous, calcified, hem-
orrhagic, and thrombotic components of human atherosclerosis in
vivo, Circulation, Vol. 94, No. 5, pp. 932–938, 1996.
[20] Worthley, S. G., Helft, G., Fuster, V., Fayad, Z. A., Rodriguez, O. J.,
Zaman, A. G., Fallon, J. T., and Badimon, J. J., Noninvasive in vivo
magnetic resonance imaging of experimental coronary artery lesions
in a porcine model, Circulation, Vol. 101, No. 25, pp. 2956–2961,
2000.
[21] Toussaint, J. F., NMR sequences for biochemical analysis and imaging
of vascular diseases, Int. J. Cardiovasc. Imaging, Vol. 17, No. 3, pp. 187–
194, 2001.
[22] Naghavi, M., Libby, P., Falk, E., Casscells, S. W., Litovsky, S., Rum-
berger, J., Badimon, J. J., Stefanadis, C., Moreno, P., Pasterkamp, G.,
Fayad, Z., Stone, P. H., Waxman, S., Raggi, P., Madjid, M.,
Zarrabi, A., Burke, A., Yuan, C., Fitzgerald, P. J., Siscovick, D. S.,
de Korte, C. L., Aikawa, M., Airaksinen, K. E., Assmann, G., Becker,
C. R., Chesebro, J. H., Farb, A., Galis, Z. S., Jackson, C., Jang, I.-K.,
Koening, W., Lodder, R. A., March, K., Demirovic, J., Navab, M., Pri-
ori, S. G., Rekhter, M. D., Bahr, R., Grundy, S. M., Mehran, R., Colombo,
A., Boerwinkle, E., Ballantyne, C., Insull, W., Jr., Schwartz, R. S.,
Vogel, R., Serruys, P. W., Hansson, G. K., Faxon, D. P., Kaul, S., Drexler,
H., Greenland, P., Muller, J. E., Virmani, R., Ridker, P. M., Zipes, D. P.,
Lumen Identification, Detection, and Quantification in MR Plaque Volumes 513
[23] Naghavi, M., Libby, P., Falk, E., Casscells, S. W., Litovsky, S., Rumberger,
J., Badimon, J. J., Stefanadis, C., Moreno, P., Pasterkamp, G., Fayad, Z.,
Stone, P. H., Waxman, S., Raggi, P., Madjid, M., Zarrabi, A., Burke, A.,
Yuan, C., Fitzgerald, P. J., Siscovick, D. S., de Korte, C. L., Aikawa, M.,
Airaksinen, K. E., Assmann, G., Becker, C. R., Chesebro, J. H., Farb, A.,
Galis, Z. S., Jackson, C., Jang, I.-K., Koenig, W., Lodder, R. A., March,
K., Demirovic, J., Navab, M., Priori, S. G., Rekhter, M. D., Bahr, R.,
Grundy, S. M., Mehran, R., Colombo, A., Boerwinkle, E., Ballantyne,
C., Insull, W., Jr., Schwartz, R. S., Vogel, R., Serruys, P. W., Hansson,
G. K., Faxon, D. P., Kaul, S., Drexler, H., Greenland, P., Muller, J. E.,
Virmani, R., Ridker, P. M., Zipes, D. P., Shah, P. K., and Willerson, J. T.,
From vulnerable plaque to vulnerable patient: A call for new deinitions
and risk assessment strategies, Part II, Circulation, Vol. 108, No. 15,
pp. 1772–1778, 2003.
[24] Fayad, Z. A., Fuster, V., Nikolaou, K., and Becker, C., Computed to-
mography and magnetic resonance imaging for noninvasive coronary
angiography and plaque imaging: Current and potential future con-
cepts, Circulation, Vol. 106, No. 15, pp. 2026–2034, 2002.
[25] Fuster, V., Fayad, Z. A., and Badimon, J. J., Acute coronary syndromes:
Biology, Lancet, Vol. 353, pp. SII5–SII9, 1999.
[26] Cai, J. M., Hatsukami, T. S., Ferguson, M. S., Small, R., Polissar, N. L.,
and Yuan, C., Classification of human carotid atherosclerotic lesions
with in vivo multicontrast magnetic resonance imaging, Circulation,
Vol. 106, No. 11, pp. 1368–1373 2002.
[27] Xu, D., Hwang, J.-N., and Yuan, C., Atherosclerotic blood vessel track-
ing and lumen segmentation in topology changes situations of MR
image sequences, In: Proceedings of the International Conference on
Image Processing (ICIP), 2000, Vol. 1, pp. 637–640.
[28] Xu, D., Hwang, J.-N., and Yuan, C., Atherosclerotic plaque segmen-
tation at human carotid artery based on multiple contrast weighting
514 Suri et al.
[29] Hatsukami, T. S., Ross, R., Polissar, N. L., and Yuan, C., Visalization
of fibrous cap thickness and rupture in human atherosclerotic carotid
plaque in vivo with high-resolution magnetic resonance imaging, Cir-
culation, Vol. 102, No. 9, pp. 959–964, 2000.
[30] Yuan, C., Lin, E., Millard, J., and Hwang, J. N., Closed contour edge
detection of blood vessel lumen and outer wall boundaries in black-
blood MR images, Magn. Reson. Imaging, Vol. 17, No. 2, pp. 257–266,
1999.
[31] Yuan, C., Kerwin, W. S., Ferguson, M. S., Polissar, N., Zhang, S., Cai, J.,
and Hatsukami, T. S., Contrast-enhanced high resolution MRI for
atherosclerotic carotid artery tissue characterization, J. Magn. Reson.
Imaging, Vol. 15, No. 1, pp. 62–67, 2002.
[32] Yuan, C., Zhang, S.-X., Polissar, N. L., Echelard, D., Ortiz, G., Davis,
J. W., Ellington, E., Ferguson, M. S., and Hatsukami, T. S., Identifica-
tion of fibrous cap rupture with magnetic resonance imaging is highly
associated with recent transient ischemic attack or stroke, Circula-
tion, Vol. 105, No. 2, pp. 181–185, 2002.
[33] Zhang, S., Hatsukami, T. S., Polissar, N. L., Han, C., and Yuan, C., Com-
parison of carotid vessel wall area measurements using three different
contrast-weighted black blood MR imaging techniques, Magn. Reson.
Imaging, Vol. 19, No. 6, pp. 795–802, 2001.
[34] Zhao, X. Q., Yuan, C., Hatsukami, T. S., Frechette, E. H., Kang, X. J.,
Maravilla, K. R., and Brown, B. G., Effects of prolonged intensive
lipid-lowering therapy on the characteristics of carotid atherosclerotic
plaques in vivo by MRI: A case-control study, Arterioscler. Thromb.
Vasc. Biol., Vol. 21, No. 10, pp. 1623–1629, 2001.
[35] Kerwin, W. S., Han, C., Chu, B., Xu, D., Luo, Y., Hwang, J.-N.,
Hatsukami, T. S., and Yuan, C., A quantitative vascular analysis sys-
tem for evaluation of atherosclerotic lesions by MRI, In: Proceed-
ings of the International Conference on Medical Image Computing
Lumen Identification, Detection, and Quantification in MR Plaque Volumes 515
[36] Han, C., Hatsukami, T. S., and Yuan, C., A multi-scale method for au-
tomatic correction of intensity non-uniformity in MR images, J. Magn.
Reson. Imaging, Vol. 13, No. 3, pp. 428–436, 2001.
[37] Zhang, Q., Wendt, M., Aschoff, A. J., Lewin, J. S., and Duerk, J. L., A
multielement RF coil for MRI guidance of interventional devices, J.
Magn. Reson. Imaging, Vol. 14, No. 1, pp. 56–62, 2001.
[38] Goldin, J. G., Yoon, H. C., Greaser, L. E., III, et al., Spiral versus electron-
beam CT for coronary artery calcium scoring, Radiology, Vol. 221,
pp. 213–221, 2001.
[39] Becker, C. R., Kleffel, T., Crispin, A., et al., Coronary artery calcium
measurement: Agreement of multirow detector and electron beam CT,
Am. J. Roentgenol., Vol. 176, pp. 1295–1298, 2001.
[41] Haberl, R., Becker, A., Leber, A., et al., Correlation of coronary cald-
ification and angiographically documented stenoses in patients with
suspected coronary artery disease: Results of 1,764 patients, J. Am.
Coll. Cardio., Vol. 37, pp. 451–457, 2001.
[42] Leber, A. W., Knez, A., Mukherjee, R., et al., Usefulness of calcium
scoring using electron beam computed tomography and noninvasive
coronary angiography in patients with suspected coronary artery dis-
ease, Am. J. Cardiol., Vol. 88, pp. 219–223, 2001.
[45] Sevrukov, A., Jelnin, V., and Kondos, G. T., Electron beam CT of the
coronary arteries: Cross-sectional anatomy for calcium scoring, Am.
J. Roentgenol., Vol. 177, pp. 1437–1445, 2001.
[46] Bond, J. H., Colorectal cancer screening: The potential role of virtual
colonoscopy, J. Gastroenterol., Vol. 37, No. 13, pp. 92–96, 2002.
[47] Chaoui, A. S., Blake, M. A., Barish, M. A., et al., Virtual colonoscopy
and colorectal cancer screening, Abdom. Imaging, Vol. 25, pp. 361–367,
2000.
[49] Fenlon, H. M., Nunes, D. P., Clarke, P. D., et al., Colorectal neoplasm
detection using virtual colonoscopy: A feasibility study, Gut, Vol. 43,
pp. 806–811, 1998.
[50] Fenlon, H. M., Nunes, D. P., Schroy, P. C., III, et al., A comparison of
virtual and conventional colonoscopy for the detection of colorectal
polyps, N. Engl. J. Med., Vol. 341, pp. 1496–1503, 1999.
[52] Harvey, C. J., Renfrew, I., Taylor, S., et al., Spiral CT pneumocolon:
Applications, status and limitations, Euro. Radiol., Vol. 11, pp. 1612–
1625, 2001.
[56] Budoff, M. J., Oudiz, R. J., Zalace, C. P., et al., Intravenous three di-
mensional coronary angiography using contrast enhanced electron
beam computed tomography, Am. J. Cardiol., Vol. 83, pp. 840–845,
1999.
[57] Lu, B., Budoff, M. J., Zhuang, N., et al., Causes of interscan variability
of coronary artery calcium measurements at electron-beam CT, Acad.
Radiol., Vol. 9, pp. 654–661, 2002.
[58] Mao, S., Budoff, M. J., Bakhsheshi, H., Liu, S. C., Improved repro-
ducibility of coronary artery calcium scoring by electron beam tomog-
raphy with a new electrocardiographic trigger method, Invest. Radiol.,
Vol. 36, pp. 363–367, 2001.
[59] Bielak, L. F., Rumberger, J. A., Sheedy, P. F., II, et al., Probabilis-
tic model for prediction of angiographically defined obstructive
coronary artery disease using electron beam computed tomogra-
phy calcium score strata, Circulation, Vol. 102, No. 4, pp. 380–385,
2000.
[61] Pannu, H. K., Flohr, T. G., Corl, F. M., and Fishman, E. K., Current
concepts in multi-detector row CT evaluation of the coronary ar-
teries: Principles, Techniques, and Anatomy, Radiographics, Vol. 23,
No. 90001, pp. S111–S125, 2003.
[64] Suri, J. S., An algorithm for time-of-flight black blood vessel detection.
In: Proceedings of the 4th IASTED International Conference in Signal
Processing, 2002, pp. 560–564.
518 Suri et al.
[65] Suri, J. S., Artery–Vein detection in very noisy TOF angiographic vol-
ume using dynamic feedback scale-space ellipsoidal filtering, In: Pro-
ceedings of the 4th IASTED International Conference in Signal Pro-
cessing, 2002, pp. 565–571.
[67] Suri, J. S., Wilson, D. L., and Laxminarayan, S. N., Handbook of Med-
ical Image Analysis: Segmentation and Registration Models, Marcel
Dekker, New York, 2004.
[69] Suri, J. S., Computer vision, pattern recognition and image processing
in left ventricle segmentation: The last 50 years, Int. J. Patt. Anal. Appl.,
Vol. 3, No. 3, pp. 209–242, 2000.
[70] Suri, J. S., Kamaledin, S., and Singh, S., Advanced Algorithmic Ap-
proaches to Medical Image Segmentation: State-of-the-Art Applica-
tions in Cardiology, Neurology, Mammography and Pathology, 2001.
[71] Suri, J. S. and Laxminarayan, S. N., PDE and Level Sets: Algorithmic
Approaches to Static and Motion Imagery, Kluwer Academic/Plenum
Publishers, 2002.
[72] Salvado, O., Hillenbrand, C., Zhang, S., Suri, J. S., and Wilson, D.,
MR signal inhomogeneity correction for visual and computerized
atherosclerosis lesion assessment, In: IEEE International Sympo-
sium on Biomedical Imaging: From Nano to Macro (ISBI), Arlington,
VA, 2004.
[73] Yabushita, H., Bouma, B. E., Houser, S. L., Aretz, H. T., Jang, I.-K.,
Schlendorf, K. H., Kauffman, C. R., Shishkov, M., Kang, D.-H., Halpern,
E. F., and Tearney, G. J., Characterization of human atherosclerosis by
optical coherence tomography, Circulation, Vol. 106, No. 13, pp. 1640–
1645, 2002.
Lumen Identification, Detection, and Quantification in MR Plaque Volumes 519
[74] Nair, A., Kuban, B. D., Obuchowski, N., and Vince, D. G., Assessing
spectral algorithms to predict atherosclerotic plaque composition with
normalized and raw intravascular ultrasound data, Ultrasound Med.
Biol., Vol. 27, No. 10, pp. 1319–1331, 2001.
[75] Wink, O., Niessen, W. J., and Viergever, M. A., Fast delineation and
visualization of vessels in 3-D angiographic images, IEEE Trans. Med.
Imaging, Vol. 19, No. 4, pp. 337–346, 2000.
[76] Wink, O., Niessen, W. J., and Viergever, M. A., Fast quantification of
abdominal aortic aneurysms from CTA volumes, In: Proceedings of
Medical Image Computing and Computer Assisted Intervention, 1998,
pp. 138–145.
[78] Udupa, J. K., Odhner, D., Tian, J., Holland, G., and Axel, L., Automatic
clutter free volume rendering for MR angiography using fuzzy con-
nectedness, SPIE Proc., Vol. 3034, pp. 114–119, 1997.
[81] Saha, P. K., Udupa, J. K., and Odhner, D., Scale-based fuzzy
connected image segmentation: theory, algorithm, and validation,
Comput Vis. Image Understanding, Vol. 77, No. 2, pp. 145–174,
2000.
[82] Udupa, J. K. and Odhner, D., Shell rendering, IEEE Comput Graph.
Appl., Vol. 13, No. 6, pp. 58–67, 1993.
520 Suri et al.
[83] Lei, T., Udupa, J. K., Saha, P. K., and Odhner, D., MR angiographic
visualization and artery-vein separation, Proc. of SPIE, Int. Soc. Opt.
Eng., Vol. 3658, pp. 58–66, 1999.
[84] Sato, Y., Nakajima, S., Shiraga, N., Atsumi, H., Yoshida, S., Koller, T.,
Gerig, G., and Kikinis, R., Three-dimensional multi-scale line filter for
segmentation and visualization of curvilinear structures in medical
images, Med. Image Anal., Vol. 2, No. 2, pp. 143–168, 1998.
[85] Sato, Y., Chen, J., Harada, N., Tamura, S., and Shiga, T., Automatic ex-
traction and measurements of leukocyte motion in micro vessels us-
ing spatiotemporal image analysis, IEEE Trans. Biomed. Eng., Vol. 44,
No. 4, pp. 225–236, 1997.
[86] Sato, Y., Nakajima, S., Atsumi, H., Koller, T., Gerig, G, Yoshida, S., and
Kikinis, R., 3-D multi-scale line filter for segmentation and visualiza-
tion of curvilinear structures in medical images, In: Proceedings on
CVRMed and MRCAS (CVRMed/MRCAS), 1997, pp. 213–222.
[87] Sato, Y., Araki, T., Hanayama, M., Naito, H., and Tamura, S., A view-
point determination system for stenosis diagnosis and quantification in
coronary angiographic image acquisition, IEEE Trans. Med. Imaging,
Vol. 17, No. 1, pp. 121–137, 1998.
[88] Frangi, A. F., Niessen, W. J., Hoogeveen, R. M., van Walsum, Th., and
Viergever, M. A., Model-based quantification of 3-D magnetic reso-
nance angiographic images, IEEE Trans. Med. Imaging, Vol. 18, No. 10,
pp. 946–956, 1999.
[89] Berliner, J. A., Navab, M., Fogelman, A. M., Frank, J. S., Demer, L. L.,
Edwards, P. A., Watson, A. D., and Lusis, A. J., Atherosclerosis: Ba-
sic mechanismsm Oxidation, inflammation, and genetics, Circulation,
Vol. 91, No. 9, pp. 2488–2496, 1995.
[90] Botnar, R. M., Stuber, M., Kissinger, K. V., Kim, W. Y., Spuentrup, E.,
and Manning, W. J., Noninvasive coronary vessel wall and plaque imag-
ing with magnetic resonance imaging, Circulation, Vol. 102, No. 21,
pp. 2582–2587, 2000.
Lumen Identification, Detection, and Quantification in MR Plaque Volumes 521
[91] Breen, M. S., Lancaster, T. L., Lazebnik, R., Aschoff, A. J., Nour S.
G., Lewin J. S., and Wilson, D. L., Three dimensional correlation of
MR images to muscle tissue response for interventional MRI thermal
ablation, Proc. SPIE Med. Imaging, Vol. 5029, pp. 202–209, 2001.
[92] Carrillo, A., Wilson, D. L., Duerk, J. L., and Lewin, J. S., Semi-automatic
3D image registration and applied to interventional MRI liver can-
cer treatment, IEEE Trans. Med. Imaging, Vol. 19, No. 3, pp. 175–185,
2003.
[95] Correia, L. C. L., Atalar, E., Kelemen, M. D., Ocali, O., Hutchins, G.
M., Fleg, J. L., Gerstenblith, G., Zerhouni, E. A., and Lima, J. A. C.,
Intravascular magnetic resonance imaging of aortic atherosclerotic
plaque composition, Arterioscler. Thromb. Vasc. Biol., Vol. 17, No. 12,
pp. 3626–3632, 1997.
[97] Hajnal, J. V., Saeed, N., Soar, E. J., Oatridge, A., Young, I. R., and Bydder,
G., A registration and interpolation procedure for subvoxel matching
of serially acquired MR images, J. Comput. Assist. Tomography, Vol. 19,
No. 2, pp. 289–296, 1995.
[98] Hurst, G. C., Hua, J., Duerk, J. L., and Cohen, A. M., Intravascular
(catheter) NMR receiver probe: Preliminary design analysis and ap-
plication to canine iliofemoral imaging, Magn. Reson. Med., Vol. 24,
No. 2, p. 343, 1992.
522 Suri et al.
[99] Klingensmith, J. D., Shekhar, R., and Vince, D. G., Evaluation of three-
dimensional segmentation algorithms for the identification of lumi-
nal and medial-adventitial borders in intravascular ultrasound images,
IEEE Trans. Med. Imaging, Vol. 19, No. 10, pp. 996–1011, 2000.
[100] Ladak, H. M., Thomas, J. B., Mitchell, J. R., Rutt, B. K., and Steinman,
D. A., A semi-automatic technique for measurement of arterial wall
from black blood MRI, Med. Phy., Vol. 28, No. 6, p. 1098, 2001.
[102] Lazebnik, R., Lancaster, T. L., Breen, M. S., Lewin J. S., and Wilson,
D. L., Volume registration using needle paths and point landmarks
for evaluation of interventional MRI treatments, IEEE Trans. Med.
Imaging, Vol. 22, No. 5, pp. 659–660, 2003.
[103] Lorigo, L. M., Faugeras, O., Grimson, W. E. L., Keriven, R., Kikinis,
R., Nabavi, A., and Westin, C. F., Codimension-two geodesic active
contours for the segmentation of tubular structures, Vol. 1, No. 13–15,
pp. 444–451, 2000.
[104] Maes, F., Collignon, A., Vandermeulen, D., Marchal, G., and Suetens,
P., Multimodality image registration by maximization of mutual infor-
mation, IEEE Trans. Med. Imaging, Vol. 16, No. 2, pp. 187–198, 1997.
[105] Merickel, M. B., Carman, C. S., Watterson, W. K., Brookeman, J. R., and
Ayers, C. R., Multispectral pattern recognition of MR imagery for the
noninvasive analysis of atherosclerosis, In: 9th International Confer-
ence on Pattern Recognition, 1988, pp. 1192–1197.
[106] Pallotta, S., Gilardi, M. C., Bettinardi, B., Rizzo, G., Landoni, C., Striano,
G., Masi, R., and Fazio, F., Application of a surface matching image
registration technique to the correlation of cardiac studies in position
emission tomography (PET) by transmission images, Phy. Med. Biol.,
Vol. 40, No. 10, pp. 1695–1708, 1995.
[108] Pietrzyk, U., Herholz, K., Fink, G., Jacobs, A., Mielke, R., Slansky,
I., Michael, W., and Heiss, W. D., An interactive tecnique for three-
dimensional image registration: Validation for PET, SPECT, MRI and
CT brain studies, J. Nuclear Med., Vol. 35, No. 12, pp. 2011–2018,
1994.
[109] Rioufol, G., Finet, G., Ginon, I., Andre, F., XRossi, R., Vialle, E., Desjoy-
aux, E., Convert, G., Huret, J. F., and Tabib, A., Multiple atherosclerotic
plaque rupture in acute coronary syndrome: A three-vessel intravas-
cular ultrasound study, Circulation, Vol. 106, No. 7, p. 804, 2002.
[112] Thieme, T., Wernecke, K. D., Meyer, R., Brandenstein, E., Habedank,
D., Hinz, A., Felix, S. B., Baumann, G., and Kleber, F. X., Angioscopic
evaluation of atherosclerotic plaques: Validation by histomorphologic
analysis and association with stable and unstable coronary syndromes,
J. Am. Coll. Cardiol., Vol. 28, No. 1, pp. 1–6, 1996.
[113] Trouard, T. P., Altbach, M. I., Hunter, G. C., Eskelson, C. D., and Gmitro,
A. F., MRI and NMR spectroscopy of the lipids of atherosclerotic plaque
in rabbits and humans, Magn. Reson. Med., Vol. 38, No. 1, pp. 19–26,
1997.
[114] van den Elsen, P. A., Pol, E. J. D., and Viergever, M. A., Medical image
matching—A review with classification, IEEE Eng. Med. Biol., Vol. 12,
No. 1, pp. 26–39, 1993.
[117] West, J., Fitzpatrick, M., Wang, M. Y., Dawant, B. M., Maurer, C. R.,
Kessler, M. L., Maciunas, R. J., Barillot, C., Lemoine, D., Collignon, A.,
Maes, F., Suetens, P., Vandermeulen, D., van den Elsen, P. A., Napel,
S., Sumanaweera, T. S., Harkness, B. A., Hemler, P. F., Hill, D. L. G.,
Hawkes, D. J., Studholme, C., Maintz, J. B., Viergever, M. A., Malandain,
G., Pennec, X., Noz, M. E., Maguire, G. Q., Pollack, M., Pelizzari, C. A.,
Robb, R. A., Hanson, D., and Woods, R. P., Comparison and evaluation
of retrospective intermodality brain image registration techniques, J.
Comput. Assist. Tomography, Vol. 21, No. 4, pp. 554–566, 1997.
[118] Breen, M. S., Lancaster T. L., Lazebnik, R., Nour S. G., Lewin J. S., and
Wilson, D. L., Three dimensional method for comparing in vivo inter-
ventional MR images of thermally ablated tissue with tissue response,
J. Magn. Reson. Imaging, Vol. 18, No. 1, pp. 90–102, 2003.
[119] Wilson, D. L., Carrillo, A., Zheng, L., Genc, A., Duerk, J. L., and Lewin,
J. S., Evaluation of 3D image registration as applied to MR-guided
thermal treatment of liver cancer, J. Magn. Reson. Imaging, Vol. 8,
No. 1, pp. 77–84, 1998.
[120] Wink, O., Fast delineation and visualization of vessels in 3-D angio-
graphic images, IEEE Trans. Med. Imaging, Vol. 19, No. 4, pp. 337–346,
2000.
[121] Yu, J. N., Fahey, F. H., Gage, H. D., Eades, C. G., Harkness, B. A.,
Pelizzari, C. A., and Keyes, J. W., Intermodality, retrospective image
registration in the thorax, J. Nuclear Med., Vol. 36, No. 12, pp. 2333–
2338, 1995.
[122] Draney, M. T., Herfkens, R. J., Hughes, T. J. R., Plec, N. J., Wedding, K.
L., Zarins, C. K., and Taylor, C. A., Quantification of vessel wall cyclic
strain using cine phase contrast magnetic resonance imaging, Ann.
Biomed. Eng., Vol. 30, No. 8, pp. 1033–1045, 2002.
[123] MacNeill, B. D., Lowe, H. C., Takano, M., Fuster, V., and Jang, I.-K.,
Intravascular modalities for detection of vulnerable plaque: current
Lumen Identification, Detection, and Quantification in MR Plaque Volumes 525
status, Arterioscler. Thromb. Vasc. Biol., Vol. 23, No. 8, pp. 1333–1342,
2003.
[124] Ziada, K., Tuzcu, E. M., Nissen, S. E., Ellis, S., Whitlow, P. L.,
and Franco, I., Prognostic importance of various intravascular ul-
trasound measurements of lumen size following coronary stenting
(submitted).
[125] Ziada, K., Kapadia, S., Tuzcu, E. M., and Nissen, S. E., The current status
of intravascular ultrasound imaging, Curr. Prob. Cardiol., Vol. 24, No. 9,
pp. 541–616, 1999.
[126] Nair, A., Kuban, B. D., Tuzcu, E. M., Schoenhagen, P., Nissen, S. E.,
and Vince, D. G., Coronary plaque classification with intravascular
ultrasound radiofrequency data analysis, Circulation, Vol. 106, No. 17,
pp. 2200–2206, 2002.
[128] Woods, R. P., Cherry, S. R., and Mazziotta, J. C., Rapid automated al-
gorithm for aligning and reslicing PET images, J. Comput. Assist. To-
mography, Vol. 16, No. 4, pp. 620–633, 1992.
[129] Woods, R. P., Mazziotta, J. C., and Cherry, S. R., MRI-PET registration
with automated algorithm, J. Comput. Assist. Tomography, Vol. 17,
No. 4, pp. 536–546, 1993.
[130] Fei, B. W., Boll, D. T., Duerk, J. L., and Wilson, D. L., Image registration
for interventional MRI-guided minimally invasive treatment of prostate
cancer, In: The 2nd Joint Meeting of the IEEE Engineering in Medicine
and Biology Society and the Biomedical Engineering Society, 2002,
Vol. 2, p. 1185.
[131] Fei, B. W., Duerk, J. L., Boll, D. T., Lewin, J. S., and Wilson, D. L., Slice
to volume registration and its potential application to interventional
MRI guided radiofrequency thermal ablation of prostate cancer, IEEE
Trans. Med. Imaging, Vol. 22, No. 4, pp. 515–525, 2003.
[132] Fei, B. W., Duerk, J. L., and Wilson, D. L., Automatic 3D registration
for interventional MRI-guided treatment of prostate cancer, Comput.
Aided Surg., Vol. 7, No. 5, pp. 257–267, 2002.
[133] Fei, B. W., Frinkley K., and Wilson, D. L., Registration algorithms for
interventional MRI-guided treatment of the prostate cancer, Proc. SPIE
Med. Imaging, Vol. 5029, pp. 192–201, 2003.
[134] Fei, B. W., Kemper, C., and Wilson, D. L., A comparative study of
warping and rigid body registration for the prostate and pelvic MR
volumes, Comput. Med. Imaging Graph., Vol. 27, No. 4, pp. 267–281,
2003.
[135] Fei, B. W., Kemper, C., and Wilson, D. L., Three-dimensional warping
registration of the pelvis and prostate, In: Proceedings of SPIE Medical
Imaging on Image Processing, Sonka, M. and Fitzpatrick, J. M., eds.,
Vol. 4684, pp. 528–537, 2002.
[136] Fei, B. W., Wheaton, A., Lee, Z., Duerk, J. L., and Wilson, D. L., Au-
tomatic MR volume registration and its evaluation for the pelvis and
prostate, Phy. Med. Biol., Vol. 47, No. 5, pp. 823–838, 2002.
[137] Fei, B. W., Wheaton, A., Lee, Z., Nagano, K., Duerk, J. L., and Wilson, D.
L., Robust registration algorithm for interventional MRI guidance for
thermal ablation of prostate cancer, In: Proceedings of SPIE Medical
Imaging on Visualization, Display, and Image-Guided Procedures, Ki
Mun, S., ed., Vol. 4319, pp. 53–60, 2001.
[139] Studholme, C., Hill, D. L. G., and Hawkes, D. J., Automated 3D regis-
tration of MR and CT images of the head, Med. Image Anal., Vol. 1,
No. 2, pp. 163–175, 1996.
[140] Studholme, C., Hill, D. L. G., and Hawkes, D. J., Automated three-
dimensional registration of magnetic resonance and positron emission
tomography brain images by multiresolution optimization of voxel
similarity measures, Med. Phy., Vol. 24, No. 1, pp. 25–35, 1997.
[141] Song, C. Z. and Yuille, A., Region competition: Unifying snakes, re-
gion growing, and Bayes/MDL for multiband image segmentation,
IEEE Trans. Patt. Anal. Machine Intell., Vol. 18, No. 9, pp. 884–900,
1996.
[142] Stary, H. C., Chandler, A. B., Glagov, S., Guyton, J. R., Insull, W. J.,
Rosenfeld, M. E., Schaffer, S. A., Schwartz, C. J., Wagner, W. D., and
Wissler, R. W., A definition of initial, fatty streak, and intermediate
lesions of atherosclerosis: A report from the Committee on Vascular
Lesions of the Council on Arteriosclerosis, American Heart Associa-
tion, Arterioscler. Thromb., Vol. 14, No. 5, pp. 840–856, 1994.
[143] Hemler, P. F., Napel, S., Sumanaweera, T. S., Pichumani, R., van den
Elsen, P. A., Martin, D., Drace, J., Adler, J. R., and Perkash, I., Reg-
istration error quantification of a surface-based multimodality image
fusion system, Med. Phy., Vol. 22, No. 7, pp. 1049–1056, 1995.
[144] Suri, J. S., White matter/Gray matter boundary segmentation using geo-
metric snakes: A fuzzy deformable model, In: International Conference
in Application in Pattern Recognition (ICAPR), Rio de Janeiro, Brazil,
March 11–14, 2001.
[145] Zhang, J., The mean field theory in EM procedures for Markov random
fields, IEEE Trans. Signal Process., Vol. 40, No. 10, 1992.
[146] Kapur, T., Model Based Three Dimensional Medical Image Segmen-
tation, Ph.D. Thesis, Artificial Intelligence Laboratory, Massachusetts
Institute of Technology, Cambridge, MA, May 1999.
[147] Li, S., Markov Random Field Modeling in Computer Vision, Springer
Verlag, Berlin, 1995. ISBN 0-387-701-451.
[148] Held, K., Rota Kopps, E., Krause, B., Wells, W., Kikinis, R., and Muller-
Gartner, H., Markov random field segmentation of brain MR images,
IEEE Trans. Med. Imaging, Vol. 16, No. 6, pp. 878–887, 1998.
[150] Koenderink, J. J., The structure of images, Biol. Cyb., Vol. 50, pp. 363–
370, 1984.
[151] Koller, T. M., Gerig, G., Székely, G., and Dettwiler, D., Multiscale de-
tection of curvilinear structures in 2-D and 3-D image data, In: IEEE
International Conference on Computer Vision (ICCV), 1995, pp. 864–
869.
[153] Gerig, G., Koller, M. Th., Székely, Brechbuhler, C., and Kubler, O., Sym-
bolic description of 3-D structures applied to cerebral vessel tree ob-
tained from MR angiography volume data, In: Proceedings of IPMI,
Series Lecture Notes in Computer Science, Vol. 687, Barett, H. H. and
Gmitro, A. F., eds., Springer-Verlag, Berlin, pp. 94–111, 1993.
[154] Thirion, J. P. and Gourdon, A., The 3-D marching lines algorithm,
Graph. Models Image Process., Vol. 58, No. 6, pp. 503–509, 1996.
[155] Lindeberg, T., Scale-space for discrete signals, IEEE Patt. Anal. Ma-
chine Intell., Vol. 12, No. 3, pp. 234–254, 1990.
[156] Lindeberg, T., On scale selection for differential operators, In: Proceed-
ings of the 8th Scandinavian Conference on Image Analysis (SCIA),
1993, pp. 857–866.
[157] Lindeberg, T., Detecting salient blob-like image structures and their
scales with a scalespace primal sketch: A method for focus of attention,
Int. J. Comput. Vision, Vol. 11, No. 3, pp. 283–318, 1993.
[158] Lindeberg, T., Edge detection and ridge detection with automatic scale
selection, In: Proceedings of Computer Vision and Pattern Recogni-
tion, 1996, pp. 465–470.
[159] Alyward, S., Bullitte, E., Pizer, S., and Eberly, D., Intensity ridge and
widths for tubular object segmentation and description, In: Proceed-
ings of Workshop Mathematical Methods Biomedical Image Analysis
(WMMBIA), Amini, A. A. and Bookstein, F. L., eds., pp. 131–138, 1996.
[160] Lorenz, C., Carlsen, I.-C., Buzug, T. M., Fassnacht, C., and Wesse, J.,
Multi-scale line segmentation with automatic estimation of width, con-
trast and tangential direction in 2-D and 3-D medical images, In: Pro-
ceedings of Joint Conference on CVRMed and MRCAS, 1997, pp. 233–
242.
[161] Fidrich, M., Following features lines across scale, In: Proceedings
of Scale-Space Theory in Computer Vision, Series Lectures Notes in
Computer Science, Vol. 1252, ter Haar Romeny, B., Florack, L., Loeen-
derink, J., and Viergever, M., eds., Springer-Verlag, Berlin, pp. 140–151,
1997.
[163] Prinet, V., Monga, O., and Rocchisani, J. M., Vessels Representation in
2D and 3D Angiograms, International Congress Series (ICS), Vol. 1134,
pp. 240–245, 1998. ISSN 0531-5131.
[164] Prinet, V., Monga, O., Ge, C., Loa, X. S., and Ma, S., Thin network extrac-
tion in 3-D images: Application of medial angiograms, In: International
Conference on Pattern Recognition, Aug. 1996.
[165] Griffin, L., Colchester, A., and Robinson, G., Scale and segmentation of
images using maximum gradient paths, Image Vision Comput., Vol. 10,
No. 6, pp. 389–402, 1992.
[166] Koenderink, J. and van Doorn, A., Local features of smooth shapes:
Ridges and course, In: SPIE Proceedings on Geometric Methods in
Computer Vision-II, 1993, Vol. 2031, pp. 2–13.
[168] Majer, P., A statistical approach to feature detection and scale selec-
tion in images, Ph.D. Thesis, University of Göttingen, Gesellschaft für
wissenschaftliche Datenverarbeitung mbH Göttingen, Germany, July
2000.
[169] Wells, W. M., III, Grimson, W. E. L., Kikinis, R., and Jolesz, F. A., Adaptive
segmentation of MRI data, IEEE Trans. Med. Imaging, Vol. 15, No. 4,
pp. 429–442, 1992.
[170] Gerig, G., Kubler, O., and Jolesz, F. A., Nonlinear anisotropic filtering
of MRI data, IEEE Trans. Med. Imaging, Vol. 11, No. 2, pp. 221–232,
1992.
[171] Joshi, M., Cui, J., Doolittle, K., Joshi, S., Van Essen, D., Wang, L., and
Miller, M. I., Brain segmentation and the generation of cortical sur-
faces, Neuroimage, Vol. 9, No. 5, pp. 461–476, 1999.
[172] Dempster, A. D., Laird, N. M., and Rubin, D. B., Maximum likelihood
from incomplete data via the EM algorithm, J. R. Stat. Soc., Vol. 39,
pp. 1–37, 1977.
[173] Kao, Y.-H., Sorenson, J. A., Bahn, M. M., and Winkler, S. S., Dual-
Echo MRI segmentation using vector decomposition and probability
technique: A two tissue model, Magn. Reson. Med., Vol. 32, No. 3,
pp. 342–357, 1994.
Yoshinobu Sato
Division of Interdisciplinary Image Analysis, Osaka University Graduate School of Medicine, 2-2-D11 Yamada-oka, Suita, Osaka 565-0871, Japan

10.1 Introduction
that underlies the discrete sample data. Second-order local structures around a
point of interest in the underlying continuous function can be fully represented
using up to second derivatives at the point, that is, the gradient vector and Hessian matrix. To reduce noise and to handle second-order local structures of “various sizes,” isotropic Gaussian smoothing with different standard deviation (SD) values is combined with derivative computation. Gaussian smoothing has a further benefit: derivatives of the Gaussian-smoothed version of the underlying “continuous” function can be computed accurately by convolution operations within a size-limited local window.
In this chapter, the following topics are discussed:
• Multiscale enhancement filtering of second-order local structures, that is, line, sheet, and blob structures [5, 7, 11] in volume data.
• Analysis of filter responses for line structures using mathematical line models [7].
• Description and quantification (width and orientation measurement) of these local structures [10, 12].
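Structure   Eigenvalue condition    Decomposed conditions                 Example(s)
Sheet       λ3 ≪ λ2 ≃ λ1 ≃ 0        λ3 ≪ 0;  λ3 ≪ λ2 ≃ 0;  λ3 ≪ λ1 ≃ 0    Cortex, cartilage
Line        λ3 ≃ λ2 ≪ λ1 ≃ 0        λ3 ≪ 0;  λ3 ≃ λ2;  λ2 ≪ λ1 ≃ 0        Vessel, bronchus
Blob        λ3 ≃ λ2 ≃ λ1 ≪ 0        λ3 ≪ 0;  λ3 ≃ λ2;  λ2 ≃ λ1            Nodule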
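in which γst controls the sharpness of selectivity for the conditions of each local structure (Fig. 10.1(a)), and ω is written as
\[
\omega(\lambda_s;\lambda_t) =
\begin{cases}
\left(1 + \dfrac{\lambda_s}{|\lambda_t|}\right)^{\gamma_{st}}, & \lambda_t \le \lambda_s \le 0,\\[2ex]
\left(1 - \alpha\,\dfrac{\lambda_s}{|\lambda_t|}\right)^{\gamma_{st}}, & \dfrac{|\lambda_t|}{\alpha} > \lambda_s > 0,\\[2ex]
0, & \text{otherwise,}
\end{cases}
\tag{10.4}
\]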
Hessian-Based Multiscale Enhancement and Segmentation 535
[Figure 10.1: weight function ω. Left: weight versus λ_s/λ_t (0 to 1) for γ = 0.5 and γ = 1.0. Right: weight versus λ_s/|λ_t| (0 to 4) for γ = 0.5 and γ = 1.0 with α = 0.25.]
Therefore, when λ1 is positive, we make the weight decrease less sharply with the deviation from the λ1 ≃ 0 condition, so that a stenosis-like shape still gives a high response. We typically used α = 0.25 and γst = 0.5 (or 1) in our experiments.
Extensive analysis of the line measure, including the effects of parameters γst
and α, can be found in [7].
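The weight function of Eq. (10.4) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `omega` and the handling of the degenerate cases (λt ≥ 0 or α ≤ 0, which Eq. (10.4) does not cover) are assumptions made here, not part of the original formulation.

```python
def omega(lam_s: float, lam_t: float, gamma: float = 1.0, alpha: float = 0.25) -> float:
    """Weight function of Eq. (10.4) for a pair of eigenvalues (lam_s, lam_t)."""
    if alpha <= 0.0:
        raise ValueError("this sketch assumes alpha > 0")
    if lam_t >= 0.0:
        # Assumption: the weight is only meaningful for lam_t < 0; return no response.
        return 0.0
    abs_t = abs(lam_t)
    if lam_t <= lam_s <= 0.0:
        # Negative domain: weight falls from 1 (lam_s = 0) to 0 (lam_s = lam_t).
        return (1.0 + lam_s / abs_t) ** gamma
    if 0.0 < lam_s < abs_t / alpha:
        # Positive domain: slower decrease, controlled by alpha.
        return (1.0 - alpha * lam_s / abs_t) ** gamma
    return 0.0


if __name__ == "__main__":
    for lam_s in (-2.0, -1.0, 0.0, 1.0, 4.0, 8.0):
        print(lam_s, omega(lam_s, lam_t=-2.0, gamma=1.0, alpha=0.25))
```

With α = 0.25 the weight reaches zero in the positive domain only at λs = 4|λt|, which is the less sharp decrease used to keep a response at stenosis-like shapes.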
[Figure 10.2: schematic relationships between the eigenvalue conditions (sheet, line, blob, stenosis, groove, and pit cases, drawn with the eigenvector directions e1, e2, e3) and the regions where the weight functions ω and ψ tend to 0. (a) Line measure. (b) Blob measure. (c) Sheet measure.]
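\[
\lambda_{\min 23} =
\begin{cases}
\min(-\lambda_2, -\lambda_3) = -\lambda_2, & \lambda_2 < 0 \text{ and } \lambda_3 < 0,\\
0, & \text{otherwise,}
\end{cases}
\tag{10.5}
\]
and
\[
\lambda_{\text{g-mean}\,23} =
\begin{cases}
\sqrt{\lambda_2 \lambda_3}, & \lambda_2 < 0 \text{ and } \lambda_3 < 0,\\
0, & \text{otherwise,}
\end{cases}
\tag{10.6}
\]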
Figures 10.2(b) and 10.2(c) show the relationships between the eigenvalue con-
ditions and weight functions in the blob and sheet measures.
the derivative computation for the gradient vector and the Hessian matrix is
combined with Gaussian convolution. By adjusting the standard deviation of
Gaussian convolution, local structures with a specific range of widths can be
enhanced. The Gaussian function is known as a unique distribution optimizing
localization in both the spatial and frequency domains [20]. Thus, convolution
operations can be applied within local support (due to spatial localization) with
minimum aliasing errors (due to frequency localization).
We denote the local structure filtering for a volume blurred by Gaussian
convolution with a standard deviation σ f as
Sξ { f ; σ f }, (10.11)
where ξ ∈ {sheet, line, blob}. The filter responses decrease as σ f in the Gaussian
convolution increases unless appropriate normalization is performed [21–23].
In order to determine the normalization factor, we consider a Gaussian-shaped
model of sheet, line, and blob with variable scales.
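Sheet, line, and blob structures with variable widths are modeled as
\[
\mathrm{sheet}(\vec{x};\sigma_r) = \exp\left(-\frac{x^2}{2\sigma_r^2}\right), \tag{10.12}
\]
\[
\mathrm{line}(\vec{x};\sigma_r) = \exp\left(-\frac{x^2 + y^2}{2\sigma_r^2}\right), \tag{10.13}
\]
and
\[
\mathrm{blob}(\vec{x};\sigma_r) = \exp\left(-\frac{x^2 + y^2 + z^2}{2\sigma_r^2}\right), \tag{10.14}
\]
respectively, where σr controls the width of the structures.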
We determine the normalization factor so that $S_\xi\{\xi(\vec{x};\sigma_r);\sigma_f\}$ satisfies the following condition:
• $\max_{\sigma_r} S_\xi\{\xi(\vec{0};\sigma_r);\sigma_f\}$ is constant, irrespective of σf, where $\vec{0} = (0, 0, 0)$.
The above condition can be satisfied when the Gaussian second derivatives are computed by multiplying by σf² as the normalization factor. That is, the normalized Gaussian derivatives are given by
\[
f_{x^p y^q z^r}(\vec{x};\sigma_f) = \sigma_f^2 \cdot \left\{ \frac{\partial^2}{\partial x^p\,\partial y^q\,\partial z^r}\,\mathrm{Gauss}(\vec{x};\sigma_f) \right\} * f(\vec{x}), \tag{10.15}
\]
where p, q, and r are non-negative integer values satisfying p + q + r = 2, and $\mathrm{Gauss}(\vec{x};\sigma)$ is an isotropic 3D Gaussian function with a standard deviation σ given by $(\sqrt{2\pi}\,\sigma)^{-1}\exp(-|\vec{x}|^2/(2\sigma^2))$ (see the Questions section at the end of this chapter for the derivation). Figure 10.3 shows the normalized response of $S_\xi\{\xi(\vec{0};\sigma_r);\sigma_f\}$ (where $\sigma_f = \sigma_i = s^{i-1}\sigma_1$, σ1 = 1, s = √2, and i = 1, 2, 3, 4) for ξ ∈ {sheet, line, blob} when σr is varied.

[Figure 10.3: normalized filter responses plotted against σ_r (0 to 6). See the Questions section at the end of this chapter for the theoretical derivations of the response curves. (a) Response of the line filter for the line model (ξ = line). (b) Response of the blob filter for the blob model (ξ = blob). (c) Response of the sheet filter for the sheet model (ξ = sheet). (© 2004 IEEE)]
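The single-scale width response curves of Fig. 10.3 can be checked numerically with a short sketch. It assumes a unit-integral Gaussian kernel and ideal Gaussian models (Eqs. (10.12)–(10.14)), so that the filter response at the center of the structure equals the magnitude of the normalized second derivative across it; under those assumptions the center response has the closed form σf² (σr²/(σr² + σf²))^(d/2) / (σr² + σf²), with d = 1, 2, 3 for sheet, line, and blob. The function name `center_response` is introduced here for illustration only.

```python
import math


def center_response(structure: str, sigma_r: float, sigma_f: float) -> float:
    """Normalized filter response at the center of an ideal Gaussian structure.

    Assumes a unit-integral Gaussian of SD sigma_f and the sigma_f**2
    normalization of Eq. (10.15).
    """
    dims = {"sheet": 1, "line": 2, "blob": 3}[structure]  # Gaussian cross-section dims
    total_var = sigma_r ** 2 + sigma_f ** 2
    # Peak amplitude of the model after Gaussian smoothing.
    amplitude = (sigma_r ** 2 / total_var) ** (dims / 2.0)
    # Magnitude of the second derivative at the peak, times sigma_f**2.
    return sigma_f ** 2 * amplitude / total_var


if __name__ == "__main__":
    sigma_f = 2.0
    print("line :", center_response("line", sigma_f, sigma_f))
    print("sheet:", center_response("sheet", sigma_f / math.sqrt(2.0), sigma_f))
    print("blob :", center_response("blob", math.sqrt(1.5) * sigma_f, sigma_f))
```

Changing `sigma_f` in the usage example leaves the three peak values unchanged, which is the purpose of the σf² normalization.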
In the line case, the maximum of the normalized response $S_{\mathrm{line}}\{\mathrm{line}(\vec{0};\sigma_r);\sigma_f\}$ is 1/4 (= 0.25) when σr = σf [7]. That is, $S_{\mathrm{line}}\{f;\sigma_f\}$ is regarded as being tuned to line structures with a width σr = σf. A line filter with a single scale gives a high response in only a narrow range of widths. We call the curves shown in Fig. 10.3 width response curves; they represent filter characteristics in the same way as frequency response curves. The width response curve of the line filter can be adjusted and widened using multiscale integration of filter responses given by
\[
M_{\mathrm{line}}\{f\} = \max_{1 \le i \le n} S_{\mathrm{line}}\{f;\sigma_i\}, \tag{10.16}
\]
where $\sigma_i = s^{i-1}\sigma_1$, in which σ1 is the smallest scale, s is a scale factor, and n is the number of scales [7]. The width response curve of multiscale integration using the four scales consists of the maximum values among the four single-scale width response curves, and gives nearly uniform responses in the width range between σr = σ1 and σr = σ4 when s = √2 (Fig. 10.3(a)). While the width response curve can be perfectly uniform if continuously varying values are used for σf, the deviation from the continuous case is less than 3% using discrete values for σf with s = √2 [7]. Similarly, in the cases of $S_{\mathrm{sheet}}\{\mathrm{sheet}(\vec{0};\sigma_r);\sigma_f\}$ and $S_{\mathrm{blob}}\{\mathrm{blob}(\vec{0};\sigma_r);\sigma_f\}$, the maximum of the normalized response is $2/(\sqrt{3})^3$ (≈ 0.385) when $\sigma_r = \sigma_f/\sqrt{2}$ (Fig. 10.3(b)), and $\frac{2}{3}\left(\sqrt{3/5}\right)^5$ (≈ 0.186) when $\sigma_r = \sqrt{3/2}\,\sigma_f$ (Fig. 10.3(c)), respectively (see the Question section at the end of this chapter for the derivation). For the second-order cases, the width response curve can be adjusted and widened using the multiscale integration method given by
Our 3-D local structure filtering methods described above assume that volume
data with isotropic voxels are used as input data. However, voxels in medical
volume data are usually anisotropic since they generally have lower resolution
along the third direction, i.e., the direction orthogonal to the slice plane, than
within slices. Rotationally invariant feature extraction becomes more intuitive in a
space where the sample distances are uniform. That is, structures of a particular
size can be detected on the same scale independent of the direction when the
signal sampling is isotropic. We therefore introduce a preprocessing procedure
for 3-D local structure filtering in which we perform interpolation to make each
voxel isotropic. Linear and spline-based interpolation methods are often used,
but blurring is inherently involved in these approaches. Because, as noted above,
the original volume data is inherently blurrier in the third direction, further
degradation of the data in that direction should be avoided. For this reason, we
opted to employ sinc interpolation so as not to introduce any additional blurring.
After Gaussian-shaped slopes are added at the beginning and end of each profile
in the third direction to avoid unwanted Gibbs ringing, sinc interpolation is
performed by zero-filled expansion in the frequency domain [24, 25].
The method for sinc interpolation without Gibbs ringing is described below. The sinc interpolation along the third (z-axis) direction is performed by zero-filled expansion in the frequency domain. Let f(i) (i = 0, 1, ..., n − 1) be the profile in the third direction. In the discrete Fourier transform of f(i), f(i) is regarded as cyclic, so f(n − 1) and f(0) are effectively adjacent. Unwanted Gibbs ringing occurs in the interpolated profile due to the discontinuity between f(n − 1) and f(0). Thus, Gaussian-shaped slopes are added at the beginning and end of f(i) before the sinc interpolation to avoid unwanted ringing. Let f′(i) (i = −3σ, ..., 0, 1, ..., n − 1, n, ..., 3σ + n) be the modified profile, which is given by
\[
f'(i) =
\begin{cases}
\exp\left(-\dfrac{i^2}{2\sigma^2}\right) \cdot f(0), & i = -3\sigma, \ldots, 0,\\[2ex]
f(i), & i = 0, \ldots, n-1,\\[1ex]
\exp\left(-\dfrac{(i-n+1)^2}{2\sigma^2}\right) \cdot f(n-1), & i = n, \ldots, 3\sigma + n.
\end{cases}
\tag{10.18}
\]
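The procedure above (Gaussian-shaped slopes as in Eq. (10.18), followed by zero-filled expansion in the frequency domain) can be sketched as follows. This is a minimal, dependency-free illustration and not the implementation used in the experiments: a naive O(n²) DFT is used for clarity where an FFT would be used in practice, the slope length is rounded to an integer number of samples, and all function names are introduced here for illustration.

```python
import cmath
import math


def pad_with_gaussian_slopes(profile, sigma):
    """Extend the profile with Gaussian-shaped slopes at both ends (Eq. (10.18))."""
    n = len(profile)
    reach = int(round(3 * sigma))  # assumption: slope length rounded to whole samples
    head = [math.exp(-(i * i) / (2.0 * sigma ** 2)) * profile[0]
            for i in range(-reach, 0)]
    tail = [math.exp(-((i - n + 1) ** 2) / (2.0 * sigma ** 2)) * profile[-1]
            for i in range(n, n + reach + 1)]
    return head + list(profile) + tail, reach


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def zero_fill_interpolate(samples, factor):
    """Sinc interpolation by inserting zeros at the high frequencies."""
    n, m = len(samples), len(samples) * factor
    spectrum = dft(samples)
    padded = [0j] * m
    half = n // 2
    if n % 2 == 1:
        for k in range(half + 1):
            padded[k] = spectrum[k]
        for k in range(1, half + 1):
            padded[m - k] = spectrum[n - k]
    else:
        for k in range(half):
            padded[k] = spectrum[k]
        for k in range(1, half):
            padded[m - k] = spectrum[n - k]
        # Split the Nyquist bin so that the original samples are preserved.
        padded[half] += spectrum[half] / 2
        padded[m - half] += spectrum[half] / 2
    return [factor * value.real for value in idft(padded)]


def sinc_interpolate_profile(profile, factor, sigma):
    """Interpolate one z-profile and crop the added slopes away again."""
    padded, reach = pad_with_gaussian_slopes(profile, sigma)
    dense = zero_fill_interpolate(padded, factor)
    start = reach * factor
    stop = (reach + len(profile) - 1) * factor
    return dense[start:stop + 1]


if __name__ == "__main__":
    print(sinc_interpolate_profile([1.0, 2.0, 4.0, 3.0, 5.0], factor=2, sigma=2.0))
```

Every `factor`-th output sample coincides with an original sample, so the interpolation adds intermediate values without modifying the measured data.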
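The computation of the Gaussian derivatives in the Hessian matrix and the gradient vector (needed in the later chapters) can be implemented using three separate convolutions with 1-D kernels as represented by
\[
\begin{aligned}
f_{x^p y^q z^r}(\vec{x};\sigma_f)
&= \left\{ \frac{\partial^2}{\partial x^p\,\partial y^q\,\partial z^r}\,\mathrm{Gauss}(\vec{x};\sigma_f) \right\} * f(\vec{x}) \\
&= \left\{ \frac{d^p}{dx^p}\,\mathrm{Gauss}(x;\sigma_f) \right\} * \left\{ \left\{ \frac{d^q}{dy^q}\,\mathrm{Gauss}(y;\sigma_f) \right\} * \left\{ \left\{ \frac{d^r}{dz^r}\,\mathrm{Gauss}(z;\sigma_f) \right\} * f(\vec{x}) \right\} \right\}.
\end{aligned}
\tag{10.19}
\]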
interest based on the width response curves as shown in Fig. 10.3. We confirmed
that the results were quite stable for different images obtained under similar
conditions once suitable values have been determined.
The detailed analyses of the effects of parameter values on the filter re-
sponses, which are the bases of the above guidelines, are discussed for the line
case in the next section, and more thorough analyses of them are found in [7].
10.2.4 Examples
10.2.4.1 Neurovascular Visualization from 3-D MR Data
Figure 10.4: Neurovascular visualization from 3-D MR data. (a) Original (left)
and line-filtered (right) cross-sectional images. (b) Original (left) and line-filtered
(right) volume rendered images. (c) DSA (digital subtraction angiography) image
at a vein phase.
Figure 10.5: Liver vessel (portal vein) segmentation from abdominal CT im-
ages. (a) Original cross-sectional images (left) and segmented liver region
(right). (b) Original (left) and line-filtered (right) surface-rendered images.
(CT arterial portography)²; the portal veins had high CT values due to the injection of contrast material. A region of 400 × 400 pixels, which included the whole liver, was trimmed from each slice (the left frame of Fig. 10.5(a)), and the image size was further reduced to half using the Laplacian pyramid [31] to bring the amount of computation down to a practical level. The liver regions were roughly hand-segmented by a radiology specialist and used as a mask (the right frame of Fig. 10.5(a)). The CT values were converted so that the image intensity f was zero for values less than fmin, fmax − fmin for values greater than fmax, and f − fmin for values between fmin and fmax (where fmin = 1000 and fmax = 1300). Line filtering was applied to the sinc-interpolated images using γ23 = γ12 = 1.0, α = 0.25, σ1 = 0.8 pixels, s = 1.5, and n = 2. We multiplied the mask images with the line-filtered images, thresholded the masked line-filtered images using an appropriate threshold value, and removed small connected components whose size was less than 10 voxels.

² The CT data were obtained by a helical CT scanner with a 20-sec delay following the administration of contrast material using a catheter inserted in the SMA (superior mesenteric artery). This method of portal vein imaging is called CTAP (CT arterial portography).
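The intensity conversion and the post-processing steps described above (masking, thresholding, and removal of connected components smaller than 10 voxels) can be sketched as follows. The data layout (dictionaries keyed by `(z, y, x)` voxel coordinates), the use of 6-connectivity, and all function names are assumptions made for this illustration; the text does not specify the connectivity or the threshold value that was used.

```python
from collections import deque


def window_intensity(ct_value, f_min=1000.0, f_max=1300.0):
    """Map CT values to 0 below f_min, f_max - f_min above f_max, f - f_min between."""
    return min(max(ct_value, f_min), f_max) - f_min


def remove_small_components(voxels, min_size=10):
    """Keep only connected components with at least min_size voxels.

    voxels: set of (z, y, x) tuples. 6-connectivity is an assumption.
    """
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    remaining = set(voxels)
    kept = set()
    while remaining:
        seed = remaining.pop()
        component = {seed}
        queue = deque([seed])
        while queue:  # breadth-first flood fill of one component
            z, y, x = queue.popleft()
            for dz, dy, dx in neighbours:
                nb = (z + dz, y + dy, x + dx)
                if nb in remaining:
                    remaining.remove(nb)
                    component.add(nb)
                    queue.append(nb)
        if len(component) >= min_size:
            kept |= component
    return kept


def segment_vessels(line_response, mask, threshold, min_size=10):
    """Mask the line-filter response, threshold it, and drop small components."""
    above = {p for p, value in line_response.items()
             if mask.get(p, 0) * value > threshold}
    return remove_small_components(above, min_size)


if __name__ == "__main__":
    vessel = {(0, 0, x): 1.0 for x in range(12)}          # 12 connected voxels
    speckle = {(5, 5, 5): 1.0, (5, 5, 6): 1.0, (5, 6, 6): 1.0}
    response = {**vessel, **speckle}
    mask = {p: 1 for p in response}
    print(len(segment_vessels(response, mask, threshold=0.5)))
```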
In Fig. 10.5(b), the left frame gives the rendered result of the original binary
images, and the right frame shows a combination of the line-filtered binary
images for small-vessel detection and the original binary images using relatively
high threshold values for large-vessel detection. The two binary images were
combined by taking the union of them. The CT data were scanned when the
contrast material in the portal vein began to be absorbed by the liver tissues,
as seen in the lower part of Fig. 10.5(b). Such a condition is quite common in
CTAP for portal vein imaging. In the original images, the small vessels appear
buried due to the contrast material absorbed by the liver tissue. In the combined
result of the original and line-filtered images, not only is the nonuniformity of
the contrast material canceled out, but also the recovery of small vessels is
significantly improved over the entire liver area.
A single-scale sheet filter was applied to pelvic CT images for bone cortex en-
hancement. The purpose was to visualize the distribution of bone tumors and
localize them in relation to the pelvic structure for biopsy planning as well as
diagnosis [32]. Healthy bone cortex tissues and bone tumors have similar origi-
nal CT values. However, bone cortices are sheet-like in structure, while tumors
are not. Thus, enhanced bone cortices using sheet enhancement filtering are
expected to be discriminated from tumors which are not enhanced.
The CT dataset consisted of 40 slices with a 512 × 512 matrix (Fig. 10.6(a));
the pixel dimensions were 0.82 × 0.82 mm². The slice thickness and reconstruction
pitch were 5 mm. The matrix was reduced to half in the xy-plane, and thus the
pixel interval was 1.64 mm. Sheet filtering was applied to the sinc-interpolated
images using σ f = 1.0 pixel, γ23 = γ13 = 0.5, and α = 0.25.
Figure 10.6(b) shows the color volume renderings of bone tumors (pink) and cortices (white). In the left frame, the opacity and color functions were adjusted using only CT values of the original images. In the right frame, both the original and sheet-filtered images were used, where voxels having high intensities in both the sheet-filtered and original images were assigned as cortices (white), and those having high intensities in the original but low intensities in the sheet-filtered images were assigned as tumors (pink). Figure 10.6(c) shows the rendered color image generated from the tumor regions manually traced by a radiology specialist, which is regarded as an ideal visualization. The color rendering of the right frame of Fig. 10.6(b) was well correlated with Fig. 10.6(c) (the “ideal” image), and the bone tumors were visualized considerably better than when only the original CT images were used. However, nontumor regions around articular spaces were also detected, mainly due to the partial volume effect (caused by the large slice thickness).

Figure 10.6: Visualization of pelvic bone tumors from CT data. (a) Original CT slice image. (b) Volume rendered images of bone tumors and cortices. Left: Using only original images. Right: Using original and sheet-filtered images. (c) Manually traced tumor regions. (© 2004 IEEE). A color version of this figure will appear on the CD that accompanies the volume.
Multiscale blob and line filters were applied to chest CT images for nodule enhancement and vessel enhancement to detect early-stage lung cancers and visualize them in relation to peripheral vessels [33–35]. The CT dataset consisted of 60 slices with a 512 × 512 matrix (Fig. 10.7(a)); the pixel dimensions were 0.39 × 0.39 mm². The slice thickness and reconstruction pitch were 2 mm and 1 mm, respectively. The matrix was reduced to half in the xy-plane, and thus the pixel interval was 0.78 mm. The data were then interpolated along the z-axis using sinc interpolation so that the voxel was isotropic. While nodules, vessels, and other soft tissues have similar CT values in original images, the nodules and vessels have blob and line structures, respectively. Multiscale blob filtering was applied to the interpolated images using γ23 = γ12 = 0.5, α = 0.25, σ1 = 2.0 pixels, s = √2, and n = 3. Multiscale line filtering was applied using γ23 = γ12 = 1.0, α = 0.25, σ1 = 1.0 pixels, s = √2, and n = 3.
Figure 10.7(b) shows the color volume renderings of nodules (green), vessels (red), lung (violet), and bone tissues (white). In the left frame, the opacity and color functions were adjusted using only CT values of the original images. In the right frame, the original, blob-filtered, and line-filtered images were used, where voxels having high intensities in the blob-filtered images were assigned as nodules (green), those having high intensities in the line-filtered images as vessels (red), those having low intensities in the original and two filtered images as lung tissues (violet), and those having high intensities in the original but low intensities in the two filtered images as bone tissues (white). The nodules and vessels were clearly depicted with different colors using blob and line enhancement filtering, while it was difficult to discriminate soft tissues into different categories using only original intensity values.
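The assignment rule described above can be written as a simple per-voxel decision. This sketch is only an approximation of the idea: the renderings in Fig. 10.7 were produced by adjusting opacity and color functions, not by hard thresholds, and the threshold parameters, the precedence of the blob test over the line test, and the function name are all assumptions introduced here.

```python
def classify_voxel(original, blob, line, t_orig, t_blob, t_line):
    """Assign a tissue label from the original and the two filtered intensities."""
    if blob >= t_blob:
        return "nodule"   # high blob-filter response
    if line >= t_line:
        return "vessel"   # high line-filter response
    if original >= t_orig:
        return "bone"     # high original intensity, low filter responses
    return "lung"         # low in the original and in both filtered images


if __name__ == "__main__":
    thresholds = dict(t_orig=200.0, t_blob=0.1, t_line=0.1)  # hypothetical values
    print(classify_voxel(150.0, 0.15, 0.02, **thresholds))
    print(classify_voxel(150.0, 0.01, 0.20, **thresholds))
    print(classify_voxel(400.0, 0.01, 0.02, **thresholds))
    print(classify_voxel(20.0, 0.01, 0.02, **thresholds))
```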
The measures of similarity to the local structures have been introduced based
on the ideal local structures with an isotropic Gaussian cross section shown
in Eqs. (10.12)–(10.14). To examine the effects of parameters involved in the
Figure 10.7: Visualization of lung nodule and vessel from CT data (Color Slide). (a) Original CT slice image. (b) Volume rendered images of nodules (green), vessels (red), lung (violet), and bone (white) tissues. Left: Using only original images. Right: Using original, blob-filtered, and line-filtered images. (© 2004 IEEE).
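\[
\lambda_{\text{a-mean}\,23} = -\frac{\lambda_2 + \lambda_3}{2}. \tag{10.20}
\]
To compare these three measures, let us consider a 3-D line image with elliptic (nonisotropic Gaussian) cross sections given by
\[
\mathrm{elliptic}(\vec{x};\sigma_x,\sigma_y) = \exp\left\{-\left(\frac{x^2}{2\sigma_x^2} + \frac{y^2}{2\sigma_y^2}\right)\right\}. \tag{10.21}
\]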
Figure 10.8: Responses of eigenvalues and 3-D line filters to $\mathrm{elliptic}(\vec{x};\sigma_x,\sigma_y)$ along the x-axis (x coordinate from −25 to 25). The eigenvalues and filter responses are normalized so that |λ2| and |λ3| are one at x = y = 0 when σx = σy = σf = 4. (a) λg−mean 23, λa−mean 23, λmin 23, and the original profile for the ideal line case (σx = σy = 4 in $\mathrm{elliptic}(\vec{x};\sigma_x,\sigma_y)$, σf = 4). (b) λg−mean 23, λa−mean 23, λmin 23, and the original profile for the sheet-like case (σx = 20 and σy = 3 in $\mathrm{elliptic}(\vec{x};\sigma_x,\sigma_y)$, σf = 4). (c) Eigenvalues for the ideal line case. (d) Eigenvalues for the sheet-like case.
the line (x = y = 0) when σx and σy in Eq. (10.21) are varied. While λmin 23 and λg−mean 23 decrease with deviations from the conditions σx ≈ σf and σy ≈ σf, λa−mean 23 gives high responses if $\sigma_x \approx \sigma_f/\sqrt{2}$ or $\sigma_y \approx \sigma_f/\sqrt{2}$. Thus, λa−mean 23 gives relatively high responses to sheet-like structures while λmin 23 (γ23 = 1 in Eq. (10.3)) and λg−mean 23 (γ23 = 0.5 in Eq. (10.3)) are able to discriminate line structures from sheet-like structures.
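The behavior of the three measures at the center of the elliptic line model can be reproduced with a short sketch. It assumes a unit-integral Gaussian kernel with the σf² normalization, for which the two cross-sectional eigenvalues at x = y = 0 have a closed form; the function names are introduced here for illustration. The two parameter sets are the ideal-line and sheet-like cases of Fig. 10.8.

```python
import math


def elliptic_center_eigenvalues(sigma_x, sigma_y, sigma_f):
    """Normalized cross-sectional Hessian eigenvalues at the center of Eq. (10.21)."""
    var_x = sigma_x ** 2 + sigma_f ** 2
    var_y = sigma_y ** 2 + sigma_f ** 2
    amplitude = sigma_x * sigma_y / math.sqrt(var_x * var_y)  # peak after smoothing
    lam_x = -sigma_f ** 2 * amplitude / var_x
    lam_y = -sigma_f ** 2 * amplitude / var_y
    return lam_x, lam_y  # the eigenvalue along the line direction is 0


def line_measures(lam_a, lam_b):
    """Return (lambda_min23, lambda_g-mean23, lambda_a-mean23), Eqs. (10.5), (10.6), (10.20)."""
    lam_2, lam_3 = max(lam_a, lam_b), min(lam_a, lam_b)  # ordering lam_2 >= lam_3
    if lam_2 < 0 and lam_3 < 0:
        lam_min = min(-lam_2, -lam_3)
        lam_gmean = math.sqrt(lam_2 * lam_3)
    else:
        lam_min = lam_gmean = 0.0
    lam_amean = -(lam_2 + lam_3) / 2.0
    return lam_min, lam_gmean, lam_amean


if __name__ == "__main__":
    for label, sx, sy in (("ideal line", 4.0, 4.0), ("sheet-like", 20.0, 3.0)):
        measures = line_measures(*elliptic_center_eigenvalues(sx, sy, sigma_f=4.0))
        print(label, [round(m / 0.25, 3) for m in measures])  # relative to the ideal peak
```

In the sheet-like case the arithmetic mean stays at a large fraction of the ideal response, whereas the minimum and the geometric mean drop much further, in line with the comparison above.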
[Figure 10.9: contour plots over σx (horizontal axis, 0 to 10) and σy (vertical axis, 0 to 10), panels (a) to (c), with contour levels 0.4, 0.6, 0.8, and 0.9.]
We define the height measure of the multiscale filter response as the peak response $h_M(\sigma_r) = M_{\mathrm{line}}\{\mathrm{line}(0, 0, z;\sigma_r)\}$. Since the filter response is normalized, hM(σr) is constant regardless of σr. That is,
\[
h_M(\sigma_r) = h_{M_c}, \tag{10.23}
\]
where hMc = 0.25 (see the “Brain Storming Question” at the end of this chapter for the derivation). We define the width measure wM(σr) of the multiscale filter response as the distance $\sqrt{x_0^2 + y_0^2}$ from the z-axis to the circular locus where $M_{\mathrm{line}}\{\mathrm{line}(x_0, y_0, z;\sigma_r)\}$ gives half of the peak response, that is, hMc/2. Consider the ratio of the observed width wM(σr) to σr. This width ratio is constant regardless of σr, that is,
\[
\frac{w_M(\sigma_r)}{\sigma_r} = w_{M_c}, \tag{10.24}
\]
where wMc ≈ 1.0 when γ23 = 1 in the formulation of Eq. (10.2). Similarly, we define the height measure hR(σr; σf) of the single-scale filter response as the peak response $h_R(\sigma_r;\sigma_f) = S_{\mathrm{line}}\{\mathrm{line}(0, 0, z;\sigma_r);\sigma_f\}$ and the width measure wR(σr; σf) as the distance $\sqrt{x_0^2 + y_0^2}$ from the z-axis to the circular locus where $S_{\mathrm{line}}\{\mathrm{line}(x_0, y_0, z;\sigma_r);\sigma_f\}$ gives half of the maximum response, hMc/2. To compare the widths of the filter response and the original profile, we also introduce the width measure wL(σr) of the original line image as the distance $\sqrt{x_0^2 + y_0^2}$ from the z-axis to the circular locus where line(x0, y0, z; σr) gives half of line(0, 0, z; σr). While σr is introduced for the convenience of generating line profiles, wL(σr) is for the convenience of comparing the widths of various profile shapes.
Figure 10.10(a) and 10.10(b) show the variations in the height and width
measures. Figure 10.10(a) gives the plots of hR (σr ; σ f ) at three values of σ f and
hM (σr ), and Fig. 10.10(b) shows the plots of wR (σr ; σ f ) at three values of σ f ,
wM(σr), and wL(σr). The width measure of the multiscale response is proportional to that of the original line image. In the case of the line image $\mathrm{line}(\vec{x};\sigma_r)$ with a Gaussian cross section, wL(σr) ≈ 0.9wM(σr). Although the filter responses
make the lines a little thinner than the original lines, the multiscale line-filter can
be designed so that the width of its responses becomes approximately propor-
tional to the original one.
Figure 10.10: Height and width measures of filter responses in multiscale integration with $\sigma_i = s^{i-1}\sigma_1$ (σ1 = 1.5, s = 1.5, i = 1, 2, 3). The height measure is normalized so that hMc is one. (a) Height measures hM(σr) = hMc and hR(σr; σi) for $\mathrm{line}(\vec{x};\sigma_r)$. (b) Width measures wM(σr) = wMc, wL(σr), and wR(σr; σi) for $\mathrm{line}(\vec{x};\sigma_r)$ with γ23 = 1. (c) Height measures for $\mathrm{elliptic}(\vec{x};\sigma_x,\sigma_y)$ with γ23 = 1. Solid lines denote the height measures for the discrete scales. Dashed lines denote the height measures for the continuous scales from σ1 to σ3. (d) Height measures for $\mathrm{elliptic}(\vec{x};\sigma_x,\sigma_y)$ with γ23 = 0.5.
Similarly, the multiscale filter response using the discrete samples of σf for the line image $\mathrm{line}(\vec{x};\sigma_r)$ is given by
Table 10.2: Height measure hM(σr) and width ratio wM(σr)/σr minima in multiscale integration at discrete scales using typical scale factors, and σr where the minima are taken

Scale factor s   Min. height hMmin    at σr = kp σi   Min. width ratio wMmin   at σr = kw σi
s → 1            hMmin → hMc          kp → 1          wMmin → wMc              kw → 0.65
s = 1.2          hMmin ≈ 0.99hMc      kp ≈ 0.92       wMmin ≈ 0.99wMc          kw ≈ 0.59
Table 10.2. When s = 1.5, hMmin ≈ 0.96hMc , which means that the deviation from
the continuous case is less than 4%.
With regard to the width measure of the filter response, given the discrete samples of $\sigma_f$ and the assumption of the profile shape, the accuracy of this approximation can also be estimated. Given the scale factor $s$, we can determine $w_{M\min}$ and $k_w$ satisfying
where $\sigma_i = s^{i-1}\sigma_1$ ($i = 1, 2, \ldots, n$), $w_{M\min}$ is the minimum of the ratio of $w_M(\sigma_r)$ to $\sigma_r$ within the range $k_w\sigma_1 \le \sigma_r \le k_w s^n \sigma_1$, and the minimum is taken at $\sigma_r = k_w s^i \sigma_1$ ($i = 0, 1, 2, \ldots, n$) (Fig. 10.10(b)). $w_{M\min}$ can be regarded as a function of the scale factor $s$. $w_M(\sigma_r)$ should be sufficiently close to $w_{Mc}$ within the width range of interest. The values of $w_{M\min}$ and $k_w$ at typical scale factors are summarized in Table 10.2. When $s = 1.5$, $w_{M\min} \approx 0.96\, w_{Mc}$. When the parameters for the discrete scales of $\sigma_f$ are $s = 1.5$ and $n = 3$, the ranges of deviation within 4% for the height and the width measures are $0.82\sigma_1 \le \sigma_r \le 2.77\sigma_1$ and $0.55\sigma_1 \le \sigma_r \le 1.86\sigma_1$, respectively. The range of deviation within 4% for the width measure is thus shifted to a smaller $\sigma_r$ than that for the height measure. As a result, the range of deviation of less than 4% for both the height and the width measures is $0.82\sigma_1 \le \sigma_r \le 1.86\sigma_1$.
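The arithmetic behind these ranges can be checked with a short sketch. It assumes that both ranges have the form $k\sigma_1 \le \sigma_r \le k s^n \sigma_1$ stated above; the two factors k = 0.82 and k = 0.55 are not given explicitly for s = 1.5 in the surviving text and are inferred here from the quoted bounds, and the function names are illustrative only.

```python
def discrete_scales(sigma1, s, n):
    """Filter scales sigma_i = s**(i-1) * sigma1 for i = 1..n."""
    return [sigma1 * s ** (i - 1) for i in range(1, n + 1)]


def tolerance_range(k, sigma1, s, n):
    """Range k*sigma1 <= sigma_r <= k * s**n * sigma1 (form used in the text)."""
    return (k * sigma1, k * s ** n * sigma1)


def intersect(a, b):
    """Overlap of two closed intervals, or None if they do not overlap."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


sigma1, s, n = 1.0, 1.5, 3                      # sigma_r in units of sigma1
scales = discrete_scales(sigma1, s, n)          # scales relative to sigma1
range_a = tolerance_range(0.82, sigma1, s, n)   # k inferred from quoted bounds
range_b = tolerance_range(0.55, sigma1, s, n)   # k inferred from quoted bounds
common = intersect(range_a, range_b)
print(scales, range_a, range_b, common)
```

With these inferred factors the upper bounds come out near 2.77 and 1.86, and the overlap of the two ranges is the interval in which both measures stay within the 4% deviation.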
We now extend the experimental analysis of the multiscale integration to the response to $\mathrm{elliptic}(\vec{x}; \sigma_x, \sigma_y)$ shown in Eq. (10.21). We define the height measure of $\mathrm{elliptic}(\vec{x}; \sigma_x, \sigma_y)$ as $h_{R\,\mathrm{elliptic}}(\sigma_x, \sigma_y; \sigma_f) = S_{\mathrm{line}}\{\mathrm{elliptic}(0, 0, z; \sigma_x, \sigma_y); \sigma_f\}$. Figures 10.10(c) and 10.10(d) show the multiscale integration of the responses at continuous and discrete scales with $\sigma_1 = 1.5$, $s = 1.5$, and $n = 3$ for $\gamma_{23} = 1$ and $\gamma_{23} = 0.5$, respectively.
In the previous sections, the enhancement of the local structures based on the
eigenvalues of the Hessian matrix was discussed. In this section, we further
combine the gradient vector with the Hessian matrix to perform explicit detection, localization, and description of the local structures. In particular, we focus on line and sheet structures. The methods are formulated as a 3-D extension of the 2-D line description presented in [36]. The 3-D line model consists of the medial
axes of lines and the cross-sectional shape associated with each point on these
axes, while the 3-D sheet model consists of the medial surfaces of sheets and
the width associated with each point on these surfaces. The medial axes and
medial surfaces are detected and localized by fully utilizing formal analyses of
3-D second-order local intensity structures based on the gradient vector and the
Hessian matrix.
The following is an overview of the method:
Step 1: Existing filtering techniques for line and sheet enhancement are
used to extract the initial regions, which should include all potential
medial axes and surfaces [7, 11]. These are then used as initial values for
the subsequent subvoxel edge localization. The candidate regions, which
should include all potential line and sheet regions, are also extracted.
Step 2: The medial axes and surfaces are extracted using local second-
order approximation given by the gradient vector and Hessian matrix.
The eigenvectors of the Hessian matrix define the moving frames on
medial axes/surfaces. After this, the moving frames are embedded in a
3-D image such that each point within the candidate regions is directly
related to its corresponding moving frame.
In the following, we begin with a description of Step 2 of the method, since Step 1 has already been described in the previous sections.
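The gradient vector of the Gaussian smoothed volume $f(\vec{x}; \sigma)$ is given by
\[
\nabla f(\vec{x}; \sigma) = \left( f_x(\vec{x}; \sigma),\ f_y(\vec{x}; \sigma),\ f_z(\vec{x}; \sigma) \right)^{\top}, \tag{10.30}
\]
where partial derivatives of $f(\vec{x}; \sigma)$ are represented as $f_x(\vec{x}; \sigma) = \frac{\partial}{\partial x} f(\vec{x}; \sigma)$, $f_y(\vec{x}; \sigma) = \frac{\partial}{\partial y} f(\vec{x}; \sigma)$, and $f_z(\vec{x}; \sigma) = \frac{\partial}{\partial z} f(\vec{x}; \sigma)$.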
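The Hessian matrix of the Gaussian smoothed volume $f(\vec{x}; \sigma)$ is given by
\[
\nabla^2 f(\vec{x}; \sigma) =
\begin{bmatrix}
f_{xx}(\vec{x}; \sigma) & f_{xy}(\vec{x}; \sigma) & f_{xz}(\vec{x}; \sigma) \\
f_{yx}(\vec{x}; \sigma) & f_{yy}(\vec{x}; \sigma) & f_{yz}(\vec{x}; \sigma) \\
f_{zx}(\vec{x}; \sigma) & f_{zy}(\vec{x}; \sigma) & f_{zz}(\vec{x}; \sigma)
\end{bmatrix}, \tag{10.31}
\]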
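To make the structure of Eqs. (10.30) and (10.31) concrete, the following sketch evaluates a gradient vector and a Hessian matrix numerically. It is only an illustration: it uses central finite differences on an analytic test function, which is a stand-in for the Gaussian-derivative filtering of a sampled volume used in this chapter, and the test image, step size h, and function names are all assumptions made for the example.

```python
import math


def gradient(f, p, h=1e-3):
    """Central-difference estimate of (f_x, f_y, f_z) at point p."""
    g = []
    for a in range(3):
        hi, lo = list(p), list(p)
        hi[a] += h
        lo[a] -= h
        g.append((f(*hi) - f(*lo)) / (2 * h))
    return g


def hessian(f, p, h=1e-3):
    """Central-difference estimate of the symmetric 3x3 Hessian at point p."""
    H = [[0.0] * 3 for _ in range(3)]
    for a in range(3):
        for b in range(a, 3):
            if a == b:
                hi, lo = list(p), list(p)
                hi[a] += h
                lo[a] -= h
                val = (f(*hi) - 2 * f(*p) + f(*lo)) / (h * h)
            else:
                total = 0.0
                for sa in (1, -1):
                    for sb in (1, -1):
                        q = list(p)
                        q[a] += sa * h
                        q[b] += sb * h
                        total += sa * sb * f(*q)
                val = total / (4 * h * h)
            H[a][b] = H[b][a] = val
    return H


SIGMA_R = 2.0  # hypothetical width of a synthetic line running along z


def line_image(x, y, z):
    """Line with a Gaussian cross section, constant along the z-axis."""
    return math.exp(-(x * x + y * y) / (2 * SIGMA_R ** 2))


grad0 = gradient(line_image, (0.0, 0.0, 0.0))
hess0 = hessian(line_image, (0.0, 0.0, 0.0))
```

On the axis of this synthetic line the gradient vanishes, the two cross-sectional second derivatives are both close to $-1/\sigma_r^2$, and the second derivative along the line is zero, which is the eigenvalue pattern expected for a bright line.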
Figure 10.11: Line and sheet models with the eigenvectors of the Hessian matrix ($\vec{e}_1$, $\vec{e}_2$, $\vec{e}_3$). (a) Line. (b) Sheet.
medial axis/surface detection, and we assume that the width range of structures
of interest is around the width at which the filter with σ f gives the peak response
(see [7] and [11] for detailed discussions).
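We assume that the tangential direction is given by $\vec{e}_1$ at the voxel around the medial axis. The 2-D intensity function, $c(\vec{u})$ ($\vec{u} = (u, v)^{\top}$), on the cross-sectional plane of $f(\vec{x}; \sigma_f)$ orthogonal to $\vec{e}_1$, should have its peak on the medial axis. The second-order approximation of $c(\vec{u})$ is given by
\[
c(\vec{u}) = f(\vec{x}_0; \sigma_f) + \vec{u}^{\top} \nabla c_0 + \frac{1}{2}\, \vec{u}^{\top} \nabla^2 c_0\, \vec{u}, \tag{10.32}
\]
where $u\vec{e}_2 + v\vec{e}_3 = \vec{x} - \vec{x}_0$, $\nabla c_0 = (\nabla f \cdot \vec{e}_2,\ \nabla f \cdot \vec{e}_3)^{\top}$ ($\nabla f$ is the gradient vector, that is, $\nabla f(\vec{x}_0; \sigma_f)$), and
\[
\nabla^2 c_0 = \begin{bmatrix} \lambda_2 & 0 \\ 0 & \lambda_3 \end{bmatrix}. \tag{10.33}
\]
$c(\vec{u})$ should have its peak on the medial axis of the line. The peak is located at the position satisfying
\[
\frac{\partial}{\partial u} c(\vec{u}) = 0 \quad \text{and} \quad \frac{\partial}{\partial v} c(\vec{u}) = 0. \tag{10.34}
\]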
By solving Eq. (10.34), we have the offset vector, $\vec{p} = (p_x, p_y, p_z)^{\top}$, of the peak position from $\vec{x}_0$ given by
\[
\vec{p} = s\,\vec{e}_2 + t\,\vec{e}_3, \tag{10.35}
\]
where $s = -\frac{\nabla f \cdot \vec{e}_2}{\lambda_2}$ and $t = -\frac{\nabla f \cdot \vec{e}_3}{\lambda_3}$. For the medial axis to exist at the voxel $\vec{x}_0$, the peak of $c(\vec{u})$ needs to be located in the territory of voxel $\vec{x}_0$. Thus, the medial axis is detected only if $|p_x| \le \frac{1}{2}$, $|p_y| \le \frac{1}{2}$, and $|p_z| \le \frac{1}{2}$. By combining the voxel position $\vec{x}_0$ and offset vector $\vec{p}$, the medial axis is localized at subvoxel resolution.
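The detection rule for a medial axis point can be sketched in a few lines. This is a minimal illustration under stated assumptions: the gradient, the eigenvectors, and the eigenvalues at the voxel are taken as already computed, the requirement that both cross-sectional eigenvalues be negative (a bright line has a maximum in its cross section) is added here and is not spelled out in the text above, and the function name is hypothetical.

```python
def dot(a, b):
    """Inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def line_axis_offset(grad, e2, e3, lam2, lam3):
    """Offset p = s*e2 + t*e3 of the cross-sectional peak from the voxel centre.

    Returns (p, detected). `detected` is True only if every component of p
    lies within half a voxel, i.e. the peak falls inside this voxel.
    """
    # Assumption: a peak (not a valley or saddle) requires negative curvature
    # along both cross-sectional eigenvectors.
    if lam2 >= 0 or lam3 >= 0:
        return None, False
    s = -dot(grad, e2) / lam2
    t = -dot(grad, e3) / lam3
    p = [s * a + t * b for a, b in zip(e2, e3)]
    detected = all(abs(c) <= 0.5 for c in p)
    return p, detected


# Axis-aligned example: line along x, cross section spanned by y and z.
p_in, ok_in = line_axis_offset((0.0, 0.2, -0.1), (0, 1, 0), (0, 0, 1), -1.0, -0.5)
p_out, ok_out = line_axis_offset((0.0, 0.9, 0.0), (0, 1, 0), (0, 0, 1), -1.0, -0.5)
```

In the first call the peak lies 0.2 voxel away along each cross-sectional direction, so the voxel is accepted and the subvoxel axis position is the voxel position plus the offset. In the second call the peak lies 0.9 voxel away, so it belongs to a neighbouring voxel and no axis point is reported here.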
We assume that the direction of the surface normal is given by $\vec{e}_3$ at the voxel around the medial surface. The 1-D intensity function, $c(v)$, which is the profile of $f(\vec{x}; \sigma_f)$ along $\vec{e}_3$, should have its peak on the medial surface. The second-order approximation of $c(v)$ is given by
\[
c(v) = f(\vec{x}_0; \sigma_f) + v\, c_0' + \frac{1}{2} v^2 c_0'', \tag{10.36}
\]
where $v\vec{e}_3 = \vec{x} - \vec{x}_0$, $c_0' = \nabla f \cdot \vec{e}_3$, and $c_0'' = \lambda_3$. $c(v)$ should have its peak on the medial surface of the sheet. The peak is located at the position satisfying
\[
\frac{d}{dv} c(v) = 0. \tag{10.37}
\]
By solving Eq. (10.37), we have the offset vector, $\vec{p}$, of the peak position from $\vec{x}_0$ given by
\[
\vec{p} = t\,\vec{e}_3, \tag{10.38}
\]
where $t = -\frac{\nabla f \cdot \vec{e}_3}{\lambda_3}$. The medial surface is detected only if $|p_x| \le \frac{1}{2}$, $|p_y| \le \frac{1}{2}$, and $|p_z| \le \frac{1}{2}$.
The moving frame is defined by the voxel position x"0 , the offset vectors p", and
the eigenvectors e"1 , e"2 , e"3 at each detected point of a medial axis or surface. In
order to perform the subsequent processes based on moving frames, each voxel
within the candidate regions obtained in Step 1 needs to be related to the moving
frame. First, we find the correspondences between each voxel and one of the
detected points of a medial axis or surface. Once these correspondences are
found, each voxel is directly related to its corresponding moving frame. To find
the correspondences, we use the Voronoi tessellation of the detected points.
The territory of the detected point in the Voronoi tessellation can be regarded as
the set of voxels to which each discrete moving frame is applied. This process
is identical in both the line and sheet cases.
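Assigning each candidate voxel to the Voronoi territory of a detected point amounts to a nearest-neighbor search. The sketch below does this by brute force; it is only meant to show the correspondence step, not an efficient implementation, and it assumes that each frame origin is the subvoxel position (voxel position plus offset vector) of a detected point. The names are illustrative.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_to_frames(voxels, frame_origins):
    """For each voxel, return the index of the nearest frame origin.

    The set of voxels sharing an index is the Voronoi territory of that
    detected point, so the index identifies the moving frame to apply.
    """
    return [
        min(range(len(frame_origins)),
            key=lambda i: squared_distance(v, frame_origins[i]))
        for v in voxels
    ]


# Two detected points (subvoxel positions) and three candidate voxels.
origins = [(10.2, 5.0, 3.1), (10.1, 5.0, 8.4)]
voxels = [(11, 5, 2), (9, 6, 7), (10, 5, 6)]
labels = assign_to_frames(voxels, origins)
```

For large candidate regions a spatial index or a distance transform with label propagation would normally replace the double loop implied here; the result is the same labeling.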
\[
f''_{\mathrm{line}}(\vec{x}; \sigma_e) = \vec{r}(\vec{x})^{\top}\, \nabla^2 f(\vec{x}; \sigma_e)\, \vec{r}(\vec{x}), \tag{10.39}
\]
where $\vec{r}(\vec{x})$ is the unit vector whose direction is parallel to the perpendicular from the voxel position $\vec{x}$ to the straight line defined by the origin and the medial axis direction of the moving frame. The foot of the perpendicular can be regarded as the corresponding axis position. The origin is given by the voxel position of the medial axis point and the offset vector $\vec{p}$. $\sigma_e$ is the filter scale used in the edge localization; it is desirable that $\sigma_e$ be small compared to the line width for accurate edge localization.
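The geometry behind Eq. (10.39) can be sketched as follows: project the voxel onto the axis line of its moving frame to obtain the foot of the perpendicular, normalize the remaining component to obtain the radial unit vector, and evaluate the quadratic form with the Hessian. The Hessian is taken as given, the axis direction is assumed to have unit length, and the function names are illustrative.

```python
import math


def dot(a, b):
    """Inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def perpendicular_direction(x, origin, axis_dir):
    """Foot of the perpendicular from x onto the axis line, and the unit
    vector r pointing from that foot to x.

    axis_dir must have unit length. Returns (foot, None) when x lies on the
    axis, where the radial direction is undefined.
    """
    d = [xi - oi for xi, oi in zip(x, origin)]
    along = dot(d, axis_dir)
    foot = [oi + along * ai for oi, ai in zip(origin, axis_dir)]
    r = [xi - fi for xi, fi in zip(x, foot)]
    norm = math.sqrt(dot(r, r))
    if norm == 0.0:
        return foot, None
    return foot, [ri / norm for ri in r]


def directional_second_derivative(hess, r):
    """Quadratic form r^T H r for a 3x3 Hessian given as nested lists."""
    return dot(r, [dot(row, r) for row in hess])


# Axis along z through the origin; voxel at (3, 4, 5).
foot, r = perpendicular_direction((3.0, 4.0, 5.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
hess = [[-2.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]
d2 = directional_second_derivative(hess, r)
```

Because the direction depends on the voxel position relative to the axis, the derivative direction varies from voxel to voxel even within the territory of a single moving frame.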
After the adaptive derivatives have been calculated at all the voxels, subvoxel edge localization is carried out at every voxel in the candidate regions. Let $\vec{o}_a$ be the foot of the perpendicular on the axis. Let $\vec{r}_a$ be the direction from $\vec{o}_a$ to the voxel position $\vec{x}_a$. For each voxel, we reconstruct the profiles originating from $\vec{o}_a$ in the directions $\vec{r}_a$ and $-\vec{r}_a$ for $f''_{\mathrm{line}}(\vec{x}; \sigma_e)$ and the initial regions (which we specify as $b_{\mathrm{line}}(\vec{x})$) obtained in Step 1. The edges are then localized in both directions and the width is calculated as the distance between the two edge locations. The profile is reconstructed at subvoxel resolution by using trilinear interpolation for $f''_{\mathrm{line}}(\vec{x}; \sigma_e)$ and nearest-neighbor interpolation for $b_{\mathrm{line}}(\vec{x})$.
Let $f''(s)$ be the profile of $f''_{\mathrm{line}}(\vec{x}; \sigma_e)$ along the direction $\vec{r}_a$ from $\vec{o}_a$. Let $b(s)$ be the profile of $b_{\mathrm{line}}(\vec{x})$. Here, $s$ denotes the position from the foot of the perpendicular on the axis. The localization of edges consists of two steps: finding the initial point for the subsequent search using $b(s)$, and then searching for the zero-crossing of $f''(s)$. The initial point, $p_0$, is given by $s$ of the first encountered point satisfying $b(s) = 0$, starting the search from $s = 0$, that is, the axis point, in the direction $\vec{r}_a$. Given the initial point of the search, if $f''(p_0) < 0$, search outbound from the axis point along the profile for the zero-crossing position $p$; otherwise, search inbound. After the zero-crossing position $q$ in the opposite direction $-\vec{r}_a$ is similarly determined, the width (diameter) is given by $|p - q|$.
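The two-step search can be sketched on sampled 1-D profiles. The sketch assumes that the profiles have already been resampled at a uniform step ds starting from the axis point, that the zero-crossing is refined by linear interpolation between neighboring samples (an assumption of this example), and that q is treated as a signed coordinate, so that an edge found along the opposite direction is negated before the width is formed. All names are illustrative.

```python
import math


def first_background_index(b):
    """Index of the first sample where the initial-region profile b is 0."""
    for i, value in enumerate(b):
        if value == 0:
            return i
    return None


def find_zero_crossing(f2, start, step):
    """Walk from `start` in direction `step` (+1 or -1) to the first sign
    change of f2; return its fractional index (linear interpolation)."""
    i = start
    while 0 <= i + step < len(f2):
        a, c = f2[i], f2[i + step]
        if a == 0:
            return float(i)
        if (a < 0) != (c < 0):
            return i + step * a / (a - c)
        i += step
    return None


def edge_position(f2, b, ds):
    """Edge location along one profile, as a distance from the axis point."""
    i0 = first_background_index(b)
    if i0 is None:
        return None
    step = 1 if f2[i0] < 0 else -1   # f'' < 0: still inside the line, go outbound
    idx = find_zero_crossing(f2, i0, step)
    return None if idx is None else idx * ds


# Synthetic check: a Gaussian line profile with SD 1.7 has f''(s) = 0 at s = 1.7.
sd, ds = 1.7, 0.25
s_values = [i * ds for i in range(25)]
f2 = [(s * s / sd ** 2 - 1) / sd ** 2 * math.exp(-s * s / (2 * sd ** 2))
      for s in s_values]
b = [1 if s < 2.5 else 0 for s in s_values]   # crude initial region
p = edge_position(f2, b, ds)                  # edge along +r_a
q = -edge_position(f2, b, ds)                 # edge along -r_a (symmetric here)
width = abs(p - q)
```

In this synthetic case the initial region is too wide, so the search runs inbound from its boundary and stops at the sign change of the second derivative, giving a width close to twice the profile SD.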
At every voxel within the candidate regions, the directional second derivative is taken orthogonal to both $\vec{e}_1$ and $\vec{e}_2$, that is, along $\vec{e}_3$, in its corresponding moving frame. This spatially variable directional derivative is written as
\[
f''_{\mathrm{sheet}}(\vec{x}; \sigma_e) = \vec{r}(\vec{x})^{\top}\, \nabla^2 f(\vec{x}; \sigma_e)\, \vec{r}(\vec{x}), \tag{10.40}
\]
where $\vec{r}(\vec{x})$ is the unit vector whose direction is parallel to the medial surface normal of the moving frame. Using a method analogous to that employed in the line case, the profiles $f''(s)$ and $b(s)$ of $f''_{\mathrm{sheet}}(\vec{x}; \sigma_e)$ and $b_{\mathrm{sheet}}(\vec{x})$, respectively, are reconstructed for the directions $\vec{r}$ and $-\vec{r}$. These profiles are then used to determine the edge locations $p$ and $q$ in the two directions $\vec{r}$ and $-\vec{r}$, respectively, and finally the width (thickness) is obtained as $|p - q|$.
integration of scales appropriate for the line diameter [7, 11], the candidate regions were extracted by thresholding and extracting large connected components. The medial axis points were detected within these regions using the procedures described in Section 10.4.1 with two values for the filter scale $\sigma_f$, $\sqrt{2}$ (≈ 1.4) and 4.0 voxels. The same candidate regions were used for both values of $\sigma_f$.
Figure 10.12 shows the volume rendering of the synthesized 3-D images and
typical axis detection results. The detection was successful using appropriate
combinations of line diameter D and filter scale σ f (Fig. 10.12(b)). Many axis
points are overlooked when the filter scale is larger than appropriate, while
a number of false detections are made when the filter scale is smaller than
appropriate (Fig. 10.12(c)).
Figure 10.13: Performance evaluation of medial axis detection. See text for the definitions of true and false detections. All panels are plotted against line diameter (2, 2.82, 4, 5.66, 8, and 11.31 voxels). (a) False positive detection ratio, which is the ratio of the number of false detections to all the detections. The ratio was zero for all the line diameters at $\sigma_f = 4.0$ voxels. (b) True positive detection ratio, which is the ratio of the number of true detections to all the analytically determined points. (c) Average position error (voxels) of axis points regarded as true detections. The distance between detected points and analytically determined points was used as the error. (d) Average angle error of the directions of axis points regarded as true detections.
Figure 10.13 shows the performance evaluation results. Detected axis points
were evaluated by comparing them with analytically determined axis points. We
regarded a detected point as a true detection if the distance between its position
and one of the analytically determined points was within two voxels; otherwise,
detected points were regarded as false. The false and true positive detection
ratios are shown in Figs. 10.13(a) and 10.13(b); the plots verify the observations
in Fig. 10.12. The positions and directions of the detected axis points regarded as
true detections were compared with analytically determined ones (Figs. 10.13(c)
and 10.13(d)). These graphs clarify the effect of σ f on accurate and reliable axis
detection.
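The evaluation criteria can be written down compactly. The sketch follows the definitions literally: a detection is true if it lies within two voxels of some analytically determined point, the false positive ratio is taken over all detections, and the true positive ratio divides the number of true detections by the number of analytically determined points. How multiple detections near the same reference point were counted is not stated, so the literal count used here is an assumption, as are the function and key names.

```python
import math


def evaluate_detections(detected, reference, tol=2.0):
    """Compare detected axis points with analytically determined ones.

    detected, reference: lists of (x, y, z) positions in voxels.
    tol: distance threshold (voxels) for a detection to count as true.
    """
    errors = []
    for d in detected:
        nearest = min(math.dist(d, r) for r in reference)
        if nearest <= tol:
            errors.append(nearest)
    n_true = len(errors)
    return {
        "false_positive_ratio":
            (len(detected) - n_true) / len(detected) if detected else 0.0,
        "true_positive_ratio": n_true / len(reference),
        "mean_position_error": sum(errors) / n_true if n_true else None,
    }


reference = [(0.0, 0.0, float(z)) for z in range(5)]       # straight axis
detected = [(0.3, 0.0, 0.0), (0.0, 0.4, 2.0), (5.0, 5.0, 1.0)]
result = evaluate_detections(detected, reference)
```

Here two of the three detections fall within the tolerance, so one third of the detections are false, two of the five reference points are matched, and the mean position error of the true detections is 0.35 voxel.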
10.4.4 Examples
10.4.4.1 Bronchi Diameter Quantification from 3-D CT Data
The line width quantification method was applied to chest CT images taken by
a helical CT scanner to determine the diameters of bronchi. The original voxel
dimensions were 0.29 × 0.29 × 1.0 mm³. In order to make the voxels isotropic,
sinc interpolation was applied along the z-direction. The volume size used in the
experiment was 90 × 70 × 80 (voxels) after interpolation.
Figure 10.14(a) shows the original CT images. After the initial region extraction by thresholding the line-filtered images, the medial axis was detected using $\sigma_f = \sqrt{2}$, $2$, and $2\sqrt{2}$ voxels. Figure 10.14(b) shows the results of axis detection at the three different scales. Note that the axis points of thin structures were detected only at the smaller two scales while those of large structures (the right segment) were stably extracted at the larger two scales. Figure 10.14(c) shows the results of diameter estimation using $\sigma_e = 1.2$ voxels based on the medial axes at these three scales.
The sheet width quantification method was applied to MR images of a hip joint
[37, 38] to determine the thickness of hip joint cartilage. The original voxel
dimensions were 0.62 × 0.62 × 1.5 mm³. Sinc interpolation was applied along the z-direction to make the voxels isotropic, and then further applied along all three directions to double the resolution. The resultant sampling pitch was 0.31 mm in all three directions. The volume size used in the experiment was 256 × 256 × 100 voxels after interpolation.
As shown in Fig. 10.15(a), cartilages are thin structures; thickness distri-
butions are considered to be particularly important in the diagnosis of joint
diseases. The initial cartilage regions were extracted from the enhanced images
by the sheet filter. The medial surfaces were extracted using σ f = 1.4 voxels.
Figures 10.15(b) and 10.15(c) show the results of thickness distribution esti-
mated using σe = 1.2 voxels. We also obtained the thickness distributions using
σe = 1.0 voxel for comparison purposes. The average thickness estimated using
σe = 1.2 voxels was Tf = 4.24 voxels and Ta = 3.50 voxels for the femoral and
\[
s_0(\vec{x}; \tau) = \mathrm{Bar}(x; \tau), \tag{10.41}
\]
in which $\tau$ represents the thickness (width) of the sheet. $L_0$, $L_-$, and $L_+$ are the MR signal intensities of the sheet and both sides of backgrounds, respectively (Fig. 10.16(a)). Let $(\theta, \phi)$ be a pair of latitude and longitude which represents the normal orientation of the sheet given by
\[
s(\vec{x}; \tau, \vec{r}_{\theta,\phi}) = s_0(\vec{x}'; \tau), \tag{10.44}
\]
where $\vec{x}' = R_{\theta,\phi}\,\vec{x}$, in which $R_{\theta,\phi}$ denotes a 3 × 3 matrix representing the rotation that makes the normal orientation of the sheet $s_0(\vec{x}; \tau)$, i.e., the x-axis, correspond to $\vec{r}_{\theta,\phi}$ (Fig. 10.16(b)).
Figure 10.16: Modeling 3-D sheet structures. (a) Bar profile of MR values along the sheet normal direction with thickness $\tau$. $L_0$, $L_-$, and $L_+$ denote sheet object, left-side, and right-side background levels, respectively. (b) 3-D sheet structures with thickness $\tau$ and normal orientation $\vec{r}_{\theta,\phi}$. (© 2004 IEEE)
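[40] by
\[
m(x; \Delta_x) = \mathrm{Sinc}\!\left(x; \frac{1}{\Delta_x}\right), \tag{10.46}
\]
where
\[
\mathrm{Sinc}(x; w) = \frac{\sin(\pi w x)}{\pi w x}. \tag{10.47}
\]
The 3-D PSF is given by
\[
m(\vec{x}; \Delta_x, \Delta_y, \Delta_z) = m(x; \Delta_x) \cdot m(y; \Delta_y) \cdot m(z; \Delta_z), \tag{10.48}
\]
where $\Delta_x$, $\Delta_y$, and $\Delta_z$ are sampling intervals along the x-axis, y-axis, and z-axis, respectively.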
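The separable sinc PSF is easy to evaluate directly. The sketch below reads Eq. (10.46) as a sinc whose width parameter is the reciprocal of the sampling interval, which is an interpretation of the printed equation; the function names and the sample values are illustrative.

```python
import math


def sinc(x, w):
    """Sinc(x; w) = sin(pi*w*x) / (pi*w*x), with the limit value 1 at x = 0."""
    a = math.pi * w * x
    return 1.0 if a == 0 else math.sin(a) / a


def psf_1d(x, delta):
    """1-D PSF for sampling interval `delta`: Sinc(x; 1/delta)."""
    return sinc(x, 1.0 / delta)


def psf_3d(x, y, z, dx, dy, dz):
    """Separable 3-D PSF as the product of the three 1-D PSFs."""
    return psf_1d(x, dx) * psf_1d(y, dy) * psf_1d(z, dz)


center = psf_3d(0.0, 0.0, 0.0, 0.625, 0.625, 1.5)
first_zero = psf_1d(0.625, 0.625)              # one sampling interval away
half_step_z = psf_3d(0.0, 0.0, 0.75, 1.0, 1.0, 1.5)
```

The 1-D PSF equals one at the origin and vanishes at every nonzero multiple of the sampling interval, so a coarser interval along z spreads the PSF proportionally along that axis.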
In actual MR imaging, the magnitude operator is applied to the complex
number obtained at each voxel by FFT reconstruction, whose effects are not
negligible [41]. Thus, the MR image of the sheet structure with orientation r"θ,φ
and thickness τ is given by
In this chapter, we restrict the scope of our investigation to the sheet model
described in section 10.5.1 (Fig. 10.16), that is, a sheet structure with constant
thickness τ and orientation r"θ,φ . We define the thickness measured from the
MR imaged sheet structure as the distance between both sides of image edges
along the sheet normal vector. As long as the sheet model shown in Fig. 10.16
is considered, other definitions of measured thickness, for example, the short-
est distance between both sides of the image edges, generally give the same
thickness value. We define the image edges as the zero-crossings of the second
directional derivatives along the sheet normal vector, which is equivalent to
the Canny edge detector [42]. Gaussian blurring is typically combined with the
second directional derivatives to adjust scale as well as reduce noise.
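The partial second derivative combined with Gaussian blurring for the MR image $f(\vec{x})$, for example, is given by
\[
f_{xx}(\vec{x}; \sigma) = g_{xx}(\vec{x}; \sigma) * f(\vec{x}), \tag{10.50}
\]
where
\[
g_{xx}(\vec{x}; \sigma) = \frac{\partial^2}{\partial x^2}\, \mathrm{Gauss}(\vec{x}; \sigma), \tag{10.51}
\]
in which $\mathrm{Gauss}(\vec{x}; \sigma)$ is the isotropic 3-D Gaussian function with SD $\sigma$. The second directional derivative along $\vec{r}_{\theta,\phi}$ is represented as
\[
f''(\vec{x}; \sigma, \vec{r}_{\theta,\phi}) = g_{xx}(\vec{x}'; \sigma) * f(\vec{x}), \tag{10.52}
\]
where $\vec{x}' = R_{\theta,\phi}\,\vec{x}$, in which $R_{\theta,\phi}$ denotes a 3 × 3 matrix representing the rotation that makes the normal orientation of the sheet $s_0(\vec{x}; \tau)$, i.e., the x-axis, correspond to $\vec{r}_{\theta,\phi}$ (Fig. 10.16(b)). Similarly, the first directional derivative along $\vec{r}_{\theta,\phi}$ is represented as
\[
f'(\vec{x}; \sigma, \vec{r}_{\theta,\phi}) = g_{x}(\vec{x}'; \sigma) * f(\vec{x}). \tag{10.53}
\]
In terms of the gradient vector and the Hessian matrix, the first and second directional derivatives are written as
\[
f'(\vec{x}; \sigma, \vec{r}_{\theta,\phi}) = \vec{r}_{\theta,\phi}^{\top}\, \nabla f(\vec{x}; \sigma), \tag{10.54}
\]
and
\[
f''(\vec{x}; \sigma, \vec{r}_{\theta,\phi}) = \vec{r}_{\theta,\phi}^{\top}\, \nabla^2 f(\vec{x}; \sigma)\, \vec{r}_{\theta,\phi}, \tag{10.55}
\]
respectively.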
Thickness of sheet structures can be determined by analyzing the 1-D profiles of $f'(\vec{x}; \sigma, \vec{r}_{\theta,\phi})$ and $f''(\vec{x}; \sigma, \vec{r}_{\theta,\phi})$ along the straight line parallel to the sheet normal $\vec{r}_{\theta,\phi}$, from which the 1-D profiles $f'(s)$ and $f''(s)$ (Eqs. (10.57) and (10.58)) are derived, respectively. Figure 10.17(a) shows a schematic diagram for the 1-D profile processing. Both sides of the boundaries for sheet structures can be defined as the points with the maximum and minimum values of $f'(s)$ among those satisfying the condition given by $f''(s) = 0$. Let $f'(s)$ have its maximum and minimum values at $s = p$ and $s = q$, respectively. The measured thickness, $T$, is defined as the distance between the two detected boundary points, which is given by
\[
T = |p - q|. \tag{10.59}
\]
The procedures for thickness determination from a volume dataset are described in the previous section.
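The thickness definition can be tried out on a simplified 1-D model. The sketch below blurs an ideal bar profile with a Gaussian only, so the derivatives are available in closed form; it deliberately omits the sinc PSF and the magnitude operator of MR imaging, and therefore illustrates the measurement rule rather than the full simulation of this chapter. Unit contrast, the sampling step, and the function names are assumptions.

```python
import math


def gauss(x, sigma):
    """1-D Gaussian density with standard deviation sigma."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))


def d1(s, tau, sigma):
    """First derivative of a unit-contrast bar of thickness tau after blurring."""
    return gauss(s + tau / 2, sigma) - gauss(s - tau / 2, sigma)


def d2(s, tau, sigma):
    """Second derivative of the same blurred bar."""
    a, b = s + tau / 2, s - tau / 2
    return (-a * gauss(a, sigma) + b * gauss(b, sigma)) / (sigma * sigma)


def measured_thickness(tau, sigma, ds=0.01):
    """T = |p - q|, where p and q are the zero-crossings of f'' at which f'
    takes its maximum and minimum values."""
    half = tau / 2 + 5 * sigma
    n = int(round(2 * half / ds))
    grid = [-half + i * ds for i in range(n + 1)]
    crossings = []
    for s0, s1 in zip(grid, grid[1:]):
        f0, f1 = d2(s0, tau, sigma), d2(s1, tau, sigma)
        if (f0 < 0) != (f1 < 0):
            crossings.append(s0 + ds * f0 / (f0 - f1))   # linear interpolation
    if len(crossings) < 2:
        return None
    p = max(crossings, key=lambda s: d1(s, tau, sigma))
    q = min(crossings, key=lambda s: d1(s, tau, sigma))
    return abs(p - q)


t_thick = measured_thickness(4.0, math.sqrt(2) / 2)
t_thin = measured_thickness(1.0, 1.0)
```

For a sheet that is thick relative to the blur, the two edges do not interact and the measured thickness matches the true one. For a sheet thinner than the blur, the zero-crossings are pushed apart and the thickness is overestimated, approaching twice the Gaussian SD as the sheet becomes very thin in this Gaussian-only model.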
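The Fourier transform of the 3-D sheet structure orthogonal to the x-axis, $s_0(\vec{x}; \tau)$, is given by
\[
S_0(\vec{\omega}; \tau) = \mathcal{F}\{\mathrm{Bar}(x; \tau)\} \cdot \delta(\omega_y) \cdot \delta(\omega_z), \tag{10.60}
\]
where $\mathcal{F}$ represents the Fourier transform, $\delta(\omega)$ denotes the unit impulse, and $\vec{\omega} = (\omega_x, \omega_y, \omega_z)$. Note that $\mathcal{F}\{\mathrm{Bar}(x; \tau)\} = \tau \cdot \mathrm{Sinc}(\omega_x; \tau)$ when $L_+ = L_- = 0$ and $L_0 = 1$ in $\mathrm{Bar}(x; \tau)$. The Fourier transform of the 3-D sheet structure whose normal is $\vec{r}_{\theta,\phi}$, $s(\vec{x}; \tau, \vec{r}_{\theta,\phi})$, is given by
\[
S(\vec{\omega}; \tau, \vec{r}_{\theta,\phi}) = S_0(\vec{\omega}'; \tau), \tag{10.61}
\]
where $\vec{\omega}' = R_{\theta,\phi}\,\vec{\omega}$, in which $R_{\theta,\phi}$ denotes a 3 × 3 matrix representing the rotation that makes the $\omega_x$-axis correspond to $\vec{r}_{\theta,\phi}$ (Fig. 10.18(a)).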
\[
\vec{\omega} = \omega_s \cdot \vec{r}_{\theta,\phi}, \tag{10.62}
\]
where S(ωs ) represents energy distribution along Eq. (10.62). Analysis of the
degradation of 1-D distribution, S(ωs ), is sufficient to examine the effects of MR
imaging and postprocessing parameters in the subsequent processes. It should
be noted that S(ωs ) is the 1D sinc function when L − = L + .
Thus, the Fourier transform of MR image of the sheet structure, F(ωs ) is given
by
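\[
G''(\vec{\omega}; \sigma, \vec{r}_{\theta,\phi}) = G_{xx}(\vec{\omega}'; \sigma), \tag{10.69}
\]
where $\vec{\omega}' = R_{\theta,\phi}\,\vec{\omega}$, in which $R_{\theta,\phi}$ denotes a 3 × 3 matrix representing the rotation that makes the $\omega_x$-axis correspond to $\vec{r}_{\theta,\phi}$. The one-dimensional frequency component of $G''(\vec{\omega}; \sigma, \vec{r}_{\theta,\phi})$ affecting $S(\omega_s)$ is given by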
and
respectively.
The 1-D profiles along the sheet normal direction of the Gaussian derivatives
of MR imaged sheet structures (Eqs. (10.57) and (10.58)) are obtained by inverse
Fourier transform of Eqs. (10.71) and (10.72), and then thickness is determined
according to the procedure shown in Fig. 10.17(a). While simulating the MR imag-
ing and Gaussian derivative computation described in section 10.5.1 essentially
requires 3-D convolution in the spatial domain, only 1-D computation is neces-
sary in the frequency domain, which drastically reduces computational cost. In
the following sections, we examine the effects of various parameters, which are
involved in the sheet model, MR imaging resolution, and thickness determina-
tion processes, on measurement accuracy. Efficient computational methods of
simulating MR imaging and postprocessing thickness determination processes
are essential, and thus simulating the processes by 1-D signal processing in the
frequency domain is regarded as the key to comprehensive analysis.
\[
\Delta_{xy} = 1, \tag{10.73}
\]
The unit of dimension for the following simulation results was Δxy, i.e., Δxy = 1 was assumed as described in the previous section. Thus, other parameters (τ, T, Δz, σ) were normalized by Δxy, and voxel anisotropy was represented as Δz (= Δz/Δxy = Δz/1). In the simulation, we used $L_0 = 200$ and $L_- = L_+ = 100$ for the bar profile. These parameter values were determined so that the bar profile was symmetric and the magnitude operator in Eq. (10.49) did not affect the results. Table 10.3 summarizes the parameter values used in the numerical simulations described below.
Figure 10.19 shows the effects of the standard deviation (SD), σ, in Gaussian blurring. In Fig. 10.19(a), the relations between true thickness τ and measured thickness T are shown for three σ values (1/2, √2/2, 1) when Δz is equal to 1, i.e., in the case of isotropic voxels. The relation is regarded as ideal when T = τ, which is the diagonal in the plots of Fig. 10.19(a). For each σ value, the relations were plotted using two values of sheet normal orientation θ (0°, 45°), while φ was fixed to 0°. Strictly speaking, voxel shape is not perfectly isotropic even when Δz is equal to 1 because the shape is not spherical. Thus, slight dependence on θ was observed.
In order to observe the deviation from T = τ more clearly, we defined the error as E = T − τ. Figure 10.19(b) shows the plots of error E instead of T. With σ = 1/2, considerable ringing was observed for error E. With σ = 1, error magnitude |E| was significantly large for small τ (around τ = 2). With σ = √2/2, however, ringing became small and error magnitude |E| was sufficiently small around τ = 2. σ = √2/2 gave a good compromise in the trade-off between reducing the ringing and improving the accuracy for small τ. Specifically, error magnitude |E| is guaranteed to satisfy |E| < 0.1 for τ > 2.0 with σ = √2/2, whereas |E| < 0.1 holds only for τ > 3.2 with σ = 1/2 and for τ > 2.9 with σ = 1. Based on this result, we used σ = √2/2 in the following experiments unless otherwise specified.
[Figure 10.19: (a) Measured thickness T and (b) error E = T − τ plotted against true thickness τ for σ = 1/2, √2/2, and 1, each at θ = 0° and θ = 45°; the ideal relation T = τ is indicated in (a).]
Figure 10.20(a) shows the effects of sheet normal orientation θ and voxel anisotropy Δz on measured thickness T. The relations between measured thickness T and sheet normal orientation θ for six values of true thickness τ (1, 2, 3, 4, 5, 6) were plotted when three different values of voxel anisotropy Δz (1, 2, 4) were used. The relations were regarded as ideal when T = τ for any θ, which is the horizontal line in the plots of Fig. 10.20(a). When Δz = 1, the relations were very close to the ideal for τ > 2. When Δz = 2 and Δz = 4, significant deviations from the ideal were observed for θ > 15° and θ > 30°, respectively. Figure 10.20(b) shows the plots of the maximum θ at which error magnitude |E| is guaranteed to satisfy |E| < 0.1, |E| < 0.2, and |E| < 0.4 for τ = 2 with varied voxel anisotropy Δz. These plots clarify the range of θ where the deviation from the ideal is sufficiently small. There was no significant difference between the plots for τ = 2 and different values of τ (for τ > 2).
\[
g_{xx}(\vec{x}; \sigma_{xy}, \sigma_z) = \frac{\partial^2}{\partial x^2}\, \mathrm{Gauss}(x, y; \sigma_{xy})\, \mathrm{Gauss}(z; \sigma_z), \tag{10.75}
\]
where $\sigma_z$ and $\sigma_{xy}$ are determined so as to satisfy $\frac{\sigma_z}{\sigma_{xy}} = \frac{\Delta_z}{\Delta_{xy}}$, and thus $\sigma_z = \Delta_z \sigma_{xy}$ because we assumed $\Delta_{xy} = 1$. Figure 10.20(c) shows plots of measured thickness obtained using anisotropic Gaussian blurring when $\Delta_z = 2$ and $\sigma_{xy} = \frac{\sqrt{2}}{2}$. The plots using anisotropic Gaussian blurring were closer to the ideal for τ ≥ 4 and any θ, while those using the isotropic one were closer for τ ≥ 2 and θ < 30°.
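The relation between the two standard deviations can be checked numerically. The helper below is a hypothetical convenience function that only encodes the stated proportionality; the values are those quoted for Fig. 10.20(c).

```python
import math


def anisotropic_sigmas(sigma_xy, delta_z, delta_xy=1.0):
    """Return (sigma_xy, sigma_z) such that sigma_z / sigma_xy = delta_z / delta_xy."""
    return sigma_xy, sigma_xy * delta_z / delta_xy


sigma_xy, sigma_z = anisotropic_sigmas(math.sqrt(2) / 2, 2.0)
samples_xy = sigma_xy / 1.0    # SD expressed in in-plane sampling intervals
samples_z = sigma_z / 2.0      # SD expressed in slice intervals
```

With this choice the SD spans the same number of sampling intervals along every axis, so the blurring is isotropic in voxel units while anisotropic in physical units.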
[Figure 10.20: (a) Measured thickness T versus sheet normal orientation θ (deg) for τ = 1 to 6 at Δz = 1, 2, and 4. (b) Voxel anisotropy Δz versus sheet normal orientation θ (deg) for |E| < 0.1, |E| < 0.2, and |E| < 0.4. (c) Measured thickness T versus sheet normal orientation θ (deg) for τ = 1 to 6 at Δz = 2.]
Figure 10.21: Acrylic plate phantom and its MR images. (a) Physical appearance. (b) MR images for θ = 0° and φ = 0°, and for θ = 45° and φ = 0°. The horizontal and vertical axes of the images correspond to the x-axis and z-axis, respectively. The voxel size was Δxy = 0.625 mm and Δz = 1.5 mm. As can be easily observed by the naked eye, the acrylic plate with τ = 1 mm appears to be imaged slightly thicker at θ = 45° and φ = 0° than at θ = 0° and φ = 0°. (© 2004 IEEE)
The phantom was submerged in a water bath so that the background (water) showed higher intensity in contrast to the low-intensity objects (acrylic plates). Three-dimensional MR images (TR/TE/flip angle/matrix/FOV/slice thickness: 12.8 ms/5.6 ms/5/256×256/160 mm/1.5 mm) of the phantom were obtained using a fast spoiled gradient-echo sequence (FSPGR). The voxel size was Δxy = 0.625 (= 160/256) mm and Δz = 1.5 mm. Thus,
\[
\text{voxel anisotropy} = \frac{\Delta_z}{\Delta_{xy}} = \frac{1.5}{0.625} = 2.4. \tag{10.76}
\]
Thirteen datasets of 3D MR images were acquired with different normal posi-
tions of the phantom plates, eight with variable θ (θ = 0, 15, 25, 35, 45, 60, 75,
and 90 degrees) and fixed φ (φ = 0), and five with variable φ (φ = 0, 15, 25, 35,
and 45 degrees) and fixed θ (θ = 0). In the obtained MR images, we observed
L − = L + = 40 and L 0 = 0. Figure 10.21(b) shows examples of the MR images.
We compared actually measured thickness from the real MR data with the com-
putational thickness calculated by the numerical simulations.
Figure 10.22 shows the averages and the SDs of the actually measured (in vitro) thickness from the MR data of the phantom imaged with different θ and φ and the plots of the simulated thickness representing the dependences on sheet normal orientation θ and φ. Figures 10.22(a) and 10.22(b) show the plots of the dependences on θ and φ with σ = (√2/2)Δxy, respectively. Good agreement between the simulated and the in vitro thicknesses was observed in both cases, although the in vitro thicknesses were slightly greater than the simulated thicknesses. The biases, i.e., the differences between the simulated thickness and the average of the in vitro thickness, were predominantly around 0.1 mm or less (except for θ = 75° of τ = 3 mm), and the SDs of the in vitro thickness were mostly within 0.1 mm (except for θ = 45° of τ = 2 mm and θ ≥ 35° of τ = 3 mm). It should be noted that the dependence on φ is theoretically equivalent to the dependence on θ when the anisotropy is Δz/Δxy = 1.
[Figure 10.22: Measured thickness T (mm) versus sheet normal orientation (a) θ (deg) and (b) φ (deg) for τ = 1, 1.5, 2, and 3 mm.]
10.7 Acknowledgments
The author thanks Dr. Ron Kikinis and Dr. Shin Nakajima at Harvard Medical
School and Brigham and Women’s Hospital for providing MR data of a brain,
Dr. Hironobu Ohmatsu of the National Cancer Center, Japan, for providing CT
data of a chest, Dr. Nobuyuki Shiraga at Keio University for providing abdominal
CT data, Dr. Shigeyuki Yoshida at Osaka University for providing CT data of a chest, and Dr. Katsuyuki Nakanishi, Dr. Hisashi Tanaka, Dr. Nobuhiko Sugano, and Dr. Takashi Nishii at Osaka University for providing hip joint MR data and phantom MR data. The author also thanks all the above researchers and Prof. Shinichi Tamura at Osaka University for fruitful discussion.
Questions
2. Explain the parameters involved in the procedures and discuss how to select
these parameters.
4. Discuss the effect of the anisotropic resolution (voxel shape) of input vol-
ume data on multiscale enhancement filtering.
Bibliography
[1] Knutsson, H., Representing local structure using tensors, In: Proceed-
ings of 6th Scandinavian Conference on Image Analysis, 1989, pp. 244–
251.
[3] Koller, T. M., Gerig, G., Szekely, G., and Dettwiler, D., Multiscale detection of curvilinear structures in 2-D and 3-D image data, In: Proceedings of Fifth International Conference on Computer Vision, 1995, pp. 864–869.
[4] Aylward, S., Bullitt, E., Pizer, S., and Eberly, D., Intensity ridge and
widths for tubular object segmentation and description, In: Proceed-
ings of IEEE Workshop on Mathematical Methods in Biomedical Image
Analysis, 1996, pp. 131–138.
[5] Sato, Y., Nakajima, S., Atsumi, H., Koller, T., Gerig, G., Yoshida, S., and Kikinis, R., 3D multi-scale line filter for segmentation and visualization of curvilinear structures in medical images, In: Lecture Notes in Computer Science, Vol. 1205, pp. 213–222, 1997. Proceedings of CVRMed-MRCAS'97, Grenoble, France.
[6] Lorenz, C., Carlsen, I.-C., Buzug, T. M., Fassnacht, C., and Weese, J., Multi-scale line segmentation with automatic estimation of width, contrast and tangential direction in 2D and 3D medical images, In: Lecture Notes in Computer Science, Vol. 1205, pp. 233–242, 1997. Proceedings of CVRMed-MRCAS'97, Grenoble, France.
[7] Sato, Y., Nakajima, S., Shiraga, N., Atsumi, H., Yoshida, S., Koller, T., Gerig, G., and Kikinis, R., Three-dimensional multi-scale line filter for segmentation and visualization of curvilinear structures in medical images, Med. Image Anal., Vol. 2, No. 2, pp. 143–168, 1998.
[8] Frangi, A., Niessen, W., Vincken, K., and Viergever, M., Multiscale vessel enhancement filtering, In: Lecture Notes in Computer Science, Vol. 1426, pp. 130–137, 1998. Proceedings of MICCAI'98, Boston, Massachusetts.
[9] Westin, C.-F., Warfield, S., Bhalerao, A., Mui, L., Richolt, J., and
Kikinis, R., Tensor controlled local structure enhancement of CT im-
ages for bone segmentation, In: Lecture Notes in Computer Science,
Vol. 1426, pp. 1205–1212, 1998. Proceedings of MICCAI’98, Boston,
Massachusetts.
[10] Sato, Y., Kubota, T., Nakanishi, K., Sugano, N., Nishii, T., Ohzono, K.,
Nakamura, H., Ochi, T., and Tamura, S., Three-dimensional reconstruc-
tion and quantification of hip joint cartilages from magnetic resonance
images, In: Lecture Notes in Computer Science, Vol. 1679, pp. 338–347,
1999. Proceedings of MICCAI’99, Cambridge, UK.
[11] Sato, Y., Westin, C.-F., Bhalerao, A., Nakajima, S., Shiraga, N., Tamura,
S., and Kikinis, R., Tissue classification based on 3D local intensity
structures for volume rendering, IEEE Trans. Visual. Comput. Graphics,
Vol. 6, No. 2, pp. 160–180, 2000.
[12] Sato, Y. and Tamura, S., Detection and quantification of line and sheet
structures in 3-D images, In: Lecture Notes in Computer Science,
Vol. 1935, pp. 164–165, 2000. Proceedings of MICCAI 2000, Pittsburgh,
Pennsylvania.
[13] Aylward, S. R. and Bullitt, E., Initialization, noise, singularities, and scale
in height ridge traversal for tubular object centerline extraction, IEEE
Trans. Med. Imaging, Vol. 21, No. 2, pp. 61–75, 2002.
[14] Suri, J. S., Liu, K., Reden, L., and Laxminarayan, S. N., White and
black blood volumetric angiographic filtering: Ellipsoidal scale-space
approach, IEEE Trans. Inform. Tech. Biomed., Vol. 6, No. 2, pp. 142–158,
2002.
[15] Suri, J. S., Liu, K., Reden, L., and Laxminarayan, S. N., A review on
MR vascular image processing algorithms: Acquisition and prefiltering,
Part I, IEEE Trans. Inform. Tech. Biomed., Vol. 6, No. 4, pp. 324–337,
2002.
[16] Suri, J. S., Liu, K., Reden, L., and Laxminarayan, S. N., A review on
MR vascular image processing algorithms: Skeleton versus nonskeleton
approaches, Part II, IEEE Trans. Inform. Tech. Biomed., Vol. 6, No. 4,
pp. 338–350, 2002.
[17] Sato, Y., Nakanishi, K., Tanaka, H., Nishii, T., Sugano, N., Nakamura,
H., Ochi, T., and Tamura, S., Limits to the accuracy of 3D thickness
measurement in magnetic resonance images, In: Lecture Notes in Com-
puter Science, Vol. 2208, pp. 803–810, 2001. Proceedings of MICCAI2001,
Utrecht, The Netherlands.
[18] Sato, Y., Tanaka, H., Nishii, T., Nakanishi, K., Sugano, N., Kubota, T.,
Nakamura, H., Yoshikawa, H., Ochi, T., and Tamura, S., Limits on the ac-
curacy of 3D thickness measurement in magnetic resonance images—
Effects of voxel anisotropy, IEEE Trans. Med. Imaging, Vol. 22, No. 9,
pp. 1076–1088, 2003.
[19] Haralick, R. M., Watson, L. T., and Laffey, T. J., The topographic primal
sketch, Int. J. Robotic Res., Vol. 2, No. 1, pp. 50–72, 1983.
[20] Marr, D., Vision—A Computational Investigation into the Human Rep-
resentation and Processing of Visual Information, W. H. Freeman, New
York, 1982.
[21] Lindeberg, T., On scale selection for differential operators, Proc. 8th
Scandinavian Conference on Image Analysis, pp. 857–866, 1993.
[22] Lindeberg, T., Feature detection with automatic scale selection, Int.
J. Comput. Vision, Vol. 30, No. 2, pp. 77–116, 1998.
[23] Lindeberg, T., Edge Detection and ridge detection with Automatic Scale
Selection, Int. J. Comput. Vision, Vol. 30, No. 2, pp. 117–154, 1998.
[24] Hylton, N. M., Simovsky, I., Li, A. J., and Hale, J. D., Impact of section
doubling on MR angiography, Radiology, Vol. 185, No. 3, pp. 899–902,
1992.
[25] Du, Y. P., Parker, D. L., Davis, W. L., and Cao, G., Reduction of partial-
volume artifacts with zero-filled interpolation in three-dimensional
MR angiography, J. Magn. Reson. Imaging, Vol. 4, No. 5, pp. 733–741,
1995.
[26] Kikinis, R., Gleason, P. L., Moriarty, T. M., Moore, M. R., Alexander, E.,
III, Stieg, P. E., Matsumae, M., Lorensen, W. E., Cline, H. E., Black, P. M.,
Jolesz, F. A., Computer-assisted interactive three-dimensional planning
[27] Nakajima, S., Atsumi, H., Kikinis, R., Moriarty, T. M., Metcalf, D. C.,
Jolesz, F. A., and Black, P. M., Use of cortical surface vessel registration
for image-guided neurosurgery, Neurosurgery, Vol. 40, No. 6, pp. 1201–
1210, 1997.
[28] Levoy, M., Display of surfaces from volume data, IEEE Comput. Graph-
ics Appl., Vol. 8, No. 3, pp. 29–37, 1988.
[29] Lacroute, P. and Levoy, M., Fast volume rendering using a shear-
warp factorization of the viewing transform, In: Proceedings of SIG-
GRAPH’94, 1994, pp. 451–458.
[31] Burt, P. J. and Adelson, E. H., The Laplacian pyramid as a compact image
code, IEEE Trans. Commun., Vol. 31, No. 4, pp. 532–540, 1983.
[32] Shiraga, N., Sato, Y., Kohda, E., Okada, Y., Sato, K., Hasebe, T., Hira-
matsu, K., Kikinis, R., and Jolesz, F. A., Three dimensional display of
the osteosclerotic lesion by volume rendering method, Nippon Acta
Radiol., Vol. 58, No. 2, p. S84, 1998.
[33] Shimizu, A., Hasegawa, J., and Toriwaki, J., Minimum directional dif-
ference filter for extraction of circumscribed shadows in chest X-ray
images and its characteristics, IEICE Trans., Vol. J-76D-II, No. 2, pp. 241–
249, 1993.
[34] Giger, M. L., Bae, K. T., and MacMahon, H., Computerized detection of
pulmonary nodules in computed tomography images, Invest. Radiol.,
Vol. 24, No. 4, pp. 459–465, 1994.
[35] Kanazawa, K., Kubo, M., Niki, N., Satoh, H., Ohmatsu, H., Eguchi, K.,
and Moriyama, N., Computer aided screening system for lung cancer
based on helical CT images, In: Lecture Notes in Computer Science,
Vol. 1131, pp. 223–228, 1996. Proceedings of Visualization in Biomedical
Computing, Hamburg, Germany.
[37] Nakanishi, N., Tanaka, H., Nishii, T., Masuhara, K., Narumi, Y.,
and Nakamura, H., MR evaluation of the articular cartilage of the
femoral head during traction, Acta Radiol., Vol. 40, No. 1, pp. 60–63,
1999.
[38] Nakanishi, K., Tanaka, H., Sugano, N., Sato, Y., Ueguchi, T., Kubota, T.,
Tamura, S., and Nakamura, H., MR-based three-dimensional presenta-
tion of cartilage thickness in the femoral head, Euro. Radiol., Vol. 11,
No. 11, pp. 2178–2183, 2001.
[39] Parker, D. L., Du, Y. P., and Davis, W. L., The voxel sensitivity func-
tion in Fourier transform imaging: applications to magnetic resonance
angiography, Magn. Reson. Med., Vol. 33, No. 2, pp. 156–162, 1995.
[40] Hoogeveen, R. M., Bakker, C. J. G., and Viergever, M. A., Limits to the
accuracy of vessel diameter measurement in MR angiography, J. Magn.
Reson. Imaging, Vol. 8, No. 6, pp. 1228–1235, 1998.
[41] Steckner, M. C., Drost, D. J., and Prato, F. S., Computing the modulation
transfer function of a magnetic resonance imager, Med. Phys., Vol. 21,
No. 3, pp. 483–489, 1994.
A Knowledge-Based Scheme
for Digital Mammography
11.1 Introduction
1
Pann Research, Department of Computer Science, University of Exeter, Exeter EX4 4QF,
UK
2
Met Office, Fitzroy Road, Exeter EX1 3PB, UK
592 Singh and Bovis
based on the properties of the image under consideration, predict the single
best algorithm to be applied at each layer from this set. We demonstrate that
this scheme of work has significant advantages over a nonadaptive structure
(where only one algorithm is available per layer and it is fixed for all images in
the dataset).
We aim to answer the following questions: (a) What is a knowledge-based
framework? We discuss the components of this framework in section 11.2
putting it in the context of previous research. (b) How does the image enhance-
ment layer work in this framework? This is detailed in section 11.3 where we
discuss measures of image viewability based on enhancement, and demonstrate
the role of good enhancement in image segmentation. We also propose two new
mapping schemes that can map the image features to chosen enhancement meth-
ods. (c) How does the image segmentation layer work within the knowledge-
based framework? In section 11.4 we detail the implementation of sophisticated
Gaussian mixture models in both supervised and unsupervised modes, with
an expert combination framework and compare them on overlap measures.
(d) What are the different strategies for reducing false positives? In sec-
tion 11.5 we discuss several postprocessing steps that are aimed at reduc-
ing the number of false positives per image. (e) Is the adaptive knowledge-
based framework superior to a nonadaptive scheme that uses the same al-
gorithms across all images uniformly? We discuss our results on this is-
sue in section 11.6 where we show the relative superiority of the adaptive
framework.
Methodology
part = image grouping; ann = multistage neural networks; usr = user interaction; cbr = case-based
reasoning.
[Figure: block diagram of the knowledge-based framework in training and testing
modes. Recoverable labels: image preprocessing; expert image contrast enhancement
(enhancement methods 1 to n, component knowledge); expert image segmentation
(trained models 1 to n, segmentation decisions F1 to Fn, component knowledge);
optimal combination. Training inputs: independent image training set, mammogram
grouping, optimal contrast enhancements, optimal segmentations. Testing input:
testing image. Parts II, III, and V are marked on the diagram.]
Using the method for labeling the Target (T) and Background (B) regions, it
is possible to plot the overlap of the density functions for the gray scales com-
prising these two regions. In mammography, this is representative of the over-
lap found between a breast cancer lesion and its background border. A good
enhancement technique should ideally reduce the overlap. In particular, it is
anticipated that the enhancement technique should help reduce the spread of
the target distribution and shift its mean gray-scale level to a higher value thus
separating the two distributions and reducing their overlap. The best decision
boundary for the original image between the two classes, assuming both classes
have a multivariate normal distribution with equal covariances, is given using
[21] as
Similarly, the best decision boundary for the original image after enhance-
ment is given as
where $\mu_B^O$, $\sigma_B^O$, $\mu_T^O$, and $\sigma_T^O$ are the mean and standard deviation of the gray scales
comprising the background and target area, respectively, of the original image
before enhancement. Similarly, $\mu_B^E$, $\sigma_B^E$, $\mu_T^E$, and $\sigma_T^E$ correspond to the mean and
standard deviation of the gray scales after the enhancement. An alternative
approximation to $D_1$ and $D_2$ can be found using the cutting score [22]. If the
groups are assumed to be representative of the population, a weighted average
of the group centroids will provide an optimal cutting score where Eq. (11.1) is
rewritten as
where $N_B^O$ and $N_T^O$ are the number of samples in the background and target
prior to enhancement, and $N_B^E$ and $N_T^E$ the respective sample numbers after
the enhancement. Again this approximation assumes that the two distributions
are normal and that the group dispersion structures are known. By combining
the above two equations it is possible to compute a distance measure between
the decision boundaries and the means of the targets and background, before
and after enhancement. This measure is termed the distribution separation
measure (DSM), and it is a measure of the quality of enhancement. It is defined
as
$$\mathrm{DSM} = \left\{ |D_2 - \mu_B^E| + |D_2 - \mu_T^E| \right\} - \left\{ |D_1 - \mu_B^O| + |D_1 - \mu_T^O| \right\} \tag{11.5}$$
Ideally the measurement should be greater than zero; the greater the DSM value,
the better the quality of enhancement. For comparing any two enhancement
techniques, choose the technique that gives a higher value on the DSM measure.
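The DSM of Eq. (11.5) can be sketched in a few lines of Python. The function `dsm` follows Eq. (11.5) directly. The helper `cutting_score` is an assumption: it uses the common discriminant-analysis form in which each centroid is weighted by the size of the other group, which may differ from the exact expressions in the original equations. All numbers are made up for illustration.

```python
def cutting_score(mu_b, mu_t, n_b, n_t):
    """Weighted cutting score between background and target centroids.

    Assumption: each centroid is weighted by the size of the *other*
    group, as in the usual discriminant-analysis cutting score.
    """
    return (n_b * mu_t + n_t * mu_b) / (n_b + n_t)


def dsm(d1, d2, mu_ob, mu_ot, mu_eb, mu_et):
    """Distribution separation measure, Eq. (11.5)."""
    after = abs(d2 - mu_eb) + abs(d2 - mu_et)    # enhanced image
    before = abs(d1 - mu_ob) + abs(d1 - mu_ot)   # original image
    return after - before


# Illustrative gray-level statistics (hypothetical values).
d1 = cutting_score(mu_b=100.0, mu_t=110.0, n_b=500, n_t=500)  # original
d2 = cutting_score(mu_b=90.0, mu_t=130.0, n_b=500, n_t=500)   # enhanced
score = dsm(d1, d2, mu_ob=100.0, mu_ot=110.0, mu_eb=90.0, mu_et=130.0)
print(score)  # a positive value indicates better class separation
```

When two enhancement methods are compared, the one with the larger `score` would be preferred under this measure.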
where the mean and standard deviation of the gray scales comprise the target
and background before and after the enhancement. Assuming that the target has
a smaller mean before and after enhancement compared to the background, it
is expected that as a result of enhancement, this measure should give a value
greater than zero.
The enhancement method giving the smallest value of D is selected as the best
enhancement method for this image.
$$\mathrm{AVS}(\omega_1, \omega_2) = \frac{1}{n_1 n_2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} d(x_i, y_j), \qquad x_i \in \omega_1,\ y_j \in \omega_2 \tag{11.9}$$
for all pairs of points such that a single point is drawn from each region, target
ω1 and background ω2 with n1 and n2 pixels in total respectively. A large value
of AVSdiff will result if the enhanced image has a greater intergroup dissimilarity
for gray scales in the target and background region compared with that of the
original. This increased value of AVSenhanced indicates that the enhancement has
maximized the Euclidean distance of the confused pixels thereby resulting in
an improved contrast enhancement.
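A minimal sketch of the AVS measure of Eq. (11.9) for scalar gray levels follows. For scalars the Euclidean distance d(x, y) reduces to |x − y|. The definition of the difference as "enhanced minus original" is an assumption inferred from the surrounding text, and the pixel values are hypothetical.

```python
def avs(target, background):
    """Average pairwise distance between target and background gray levels."""
    total = sum(abs(x - y) for x in target for y in background)
    return total / (len(target) * len(background))


def avs_diff(orig_t, orig_b, enh_t, enh_b):
    """Assumed definition: AVS of the enhanced image minus AVS of the original."""
    return avs(enh_t, enh_b) - avs(orig_t, orig_b)


# Hypothetical gray levels for a tiny target and background region.
orig_t, orig_b = [100, 104], [98, 102]
enh_t, enh_b = [140, 150], [90, 100]

print(avs(orig_t, orig_b))                      # original separation
print(avs(enh_t, enh_b))                        # enhanced separation
print(avs_diff(orig_t, orig_b, enh_t, enh_b))   # larger is better
```

The double loop costs n1 × n2 distance evaluations, so for large regions a sampled subset of pixel pairs may be preferable.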
[Figure 11.3: training and testing phases, each drawing on component knowledge.]
This section describes the mixture of experts framework and it is laid out as
follows. Section 11.3.2.1 reviews the contrast enhancement experts used to build
the framework. Then the segmentation algorithm used to evaluate the enhanced
images is briefly described together with quantitative measures of segmentation
performance. In section 11.3.2.2 results are presented for applying the dif-
ferent image enhancement methods to DDSM images and for the resulting
segmentations. Section 11.3.2.3 discusses the features that can be extracted from
the mammograms to be fed into a mapping scheme (e.g., neural networks) that
maps features to optimal enhancement methods. Finally, section 11.3.2.4 dis-
cusses a machine learning system for this mapping. A neural network is used
in two different modes: double network mapping and a single direct mapping
scheme.
1. The overlap area between the target and the actual regions is less than $T_{min}$.
3. $(A_{area} \cap T_{area}) < T_{min}$
[Figure 11.4: panels (a) and (b)]
to be detected. In each case the target region is shown as darker color and the
actual region, following segmentation, is shown as lighter color overlapping.
Figure 11.4(a) shows the TP outcome where the target and actual region over-
lap is greater than Tmin = 0.5 and conversely, Fig. 11.4(b) where the overlap of
the target region is less than Tmin = 0.5, the SUBTP outcome.
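The overlap test can be illustrated with a short sketch. The following assumptions are made because the full outcome table is not reproduced here: regions are sets of pixel coordinates, the overlap is normalised by the target area, an overlap exactly equal to Tmin counts as TP, and the label "NONE" for zero overlap is hypothetical.

```python
def overlap_fraction(actual, target):
    """Fraction of the target region covered by the segmented (actual) region.

    Both regions are sets of (row, col) pixel coordinates. Normalising by
    the target area is an assumption.
    """
    if not target:
        raise ValueError("target region must not be empty")
    return len(actual & target) / len(target)


def outcome(actual, target, t_min=0.5):
    """Label a segmented region as TP or SUBTP from its overlap with the target."""
    frac = overlap_fraction(actual, target)
    if frac == 0.0:
        return "NONE"  # hypothetical label for no overlap at all
    return "TP" if frac >= t_min else "SUBTP"


target = {(0, 0), (0, 1), (1, 0), (1, 1)}
good = {(0, 0), (0, 1), (1, 0), (2, 2)}   # covers 3 of 4 target pixels
weak = {(1, 1), (2, 2), (3, 3)}           # covers 1 of 4 target pixels

print(outcome(good, target))
print(outcome(weak, target))
```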
This section presents the results obtained from the segmentation of 200 mam-
mograms from the DDSM. The aim of the experiment is to identify the optimal
contrast enhancement expert for each of the 200 abnormal mammograms. Each
mammogram image has been grouped according to its target breast type. There
are 50 images per breast type grouping and results will be presented on a per
breast type basis. Each mammogram is contrast enhanced using each enhance-
ment method identified in section 11.3.2.1.1 and segmented using the unsuper-
vised HMRFU segmentation method. The sensitivity in the detection of breast
lesions following segmentation of the enhanced images is quantified using the
outcomes given in Table 11.2 and the ground truth definition.
From a set of M enhancement methods (E1 , . . . , E M ) for a given mammo-
gram, the target contrast enhancement, Em where m ∈ {1, . . . , M} is the enhance-
ment method giving the largest value of (TPT + SUBTPT ) following segmenta-
tion using HMRFU . The target contrast enhancement expert Em is identified as
(11.10)
The target contrast enhancement Em is found for every mammogram from all
M enhancement methods, m ∈ {1, . . . , M}, keeping the segmentation method
and associated initialization parameters constant. Having identified each of the
target enhancement experts, the following important observations can be made
(see sections 11.3.2.2.1–11.3.2.2.6).
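The selection of the target enhancement expert described above amounts to an argmax over the enhancement methods. A minimal sketch, assuming each method's segmentation has already been scored as a (TP, SUBTP) pair; the counts are hypothetical, and breaking ties in favour of the first method listed is an assumption.

```python
def target_enhancement(outcomes):
    """Return the enhancement method with the largest TP + SUBTP.

    `outcomes` maps a method name to a (tp, subtp) pair obtained after
    segmenting the image enhanced with that method. Ties go to the first
    method in insertion order (an assumption).
    """
    return max(outcomes, key=lambda m: outcomes[m][0] + outcomes[m][1])


# Hypothetical outcome counts for one mammogram.
outcomes = {
    "ACE": (1, 0),
    "DWCE": (0, 1),
    "FUZZY": (1, 1),
    "HISTEQ": (1, 0),
}
best = target_enhancement(outcomes)
print(best)
```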
methods for fatty breasts but is noticeably less effective in the segmentation of
dense breasts (types 3 and 4).
(a) ACE
1 −0.24 0.00 −0.21
2 0.17 −0.55 −0.10
3 0.10 −0.25 0.09
4 −0.07 0.00 0.00
Mean −0.01 −0.20 −0.06
(b) DWCE
1 0.12 0.00 0.10
2 −0.22 −0.27 −0.21
3 0.05 0.25 0.13
4 −0.13 1.00 0.06
Mean −0.05 0.24 0.02
(c) FUZZY
1 −0.12 2.50 0.24
2 0.17 1.00 0.48
3 0.05 5.50 1.04
4 0.33 8.00 1.38
Mean 0.11 4.25 0.79
(d) HISTEQ
1 −0.28 0.00 −0.24
2 0.50 −0.55 0.07
3 0.00 −0.50 −0.04
4 0.20 1.00 0.38
Mean 0.11 −0.01 0.04
(e) ACELE
1 −0.24 0.75 −0.10
2 0.06 −0.45 −0.17
3 0.10 −0.25 0.09
4 −0.20 0.50 −0.06
Mean −0.07 0.14 −0.06
(f) ACELFD
1 −0.24 0.25 −0.17
2 −0.11 −0.55 −0.28
3 0.05 0.25 0.09
4 −0.07 0.50 0.06
Mean −0.09 0.11 −0.07
This approach to feature extraction extracts a set of F features from pixels com-
prising a suspicious ROI target, T, thus FROI = { f1 , f2 , . . . , f F }. A surrounding
region labelled background, B, of the same area is constructed encircling the
ROI, but comprising normal pixels. From pixels that comprise the target (T) and
background (B) region, the following gray-level statistics are extracted: mean,
standard deviation, entropy, skewness, and kurtosis. These are transformed
into feature values by determining the ratio of the target value to background
value (T/B) for each gray-scale statistic. The features reflect the mathematical
composition of the quantitative measure of contrast enhancement previously
proposed.
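The target-to-background ratio features can be sketched as follows. Several details are assumptions, since the text does not fix them: entropy is the Shannon entropy (base 2) of the gray-level histogram, skewness and kurtosis are the standardised third and fourth population moments, and a ratio with a zero denominator is reported as NaN.

```python
import math
from collections import Counter


def gray_stats(pixels):
    """Gray-level statistics of one region (a flat list of gray levels)."""
    n = len(pixels)
    mean = sum(pixels) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Standardised moments; set to 0.0 for a constant region (std == 0).
    skew = sum((p - mean) ** 3 for p in pixels) / n / std ** 3 if std else 0.0
    kurt = sum((p - mean) ** 4 for p in pixels) / n / std ** 4 if std else 0.0
    return {"mean": mean, "std": std, "entropy": entropy,
            "skewness": skew, "kurtosis": kurt}


def roi_features(target, background):
    """Ratio T/B of each statistic; NaN where the background value is zero."""
    t, b = gray_stats(target), gray_stats(background)
    return {k: (t[k] / b[k] if b[k] else float("nan")) for k in t}


# Hypothetical gray levels for a target ROI and its surrounding border.
feats = roi_features([10, 10, 20, 20], [5, 5, 10, 10])
print(feats)
```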
The double network mapping (DNM) method is used to predict the target con-
trast enhancement using two ANNs for each enhancement method. The aim
is to learn a mapping based on a set of gray-scale features FROI from a given
mammogram, with a quantitative measure of segmentation performance, S. The
segmentation performance is quantified following contrast enhancement, for
each enhancement method m, where 1 ≤ m ≤ M from a set of M enhancement
methods. The two submappings are detailed below:
1. $\mathrm{ANN}^m_{DNMenh}$: For a mammogram I, enhanced using enhancement method m,
this ANN learns the mapping between the set of F gray-scale input features
$F_{ROI} = (f_1, f_2, \dots, f_F)$ extracted from a suspicious ROI, and a set of P
quantitative measures $Q = (q_1, q_2, \dots, q_P)$ of enhancement performance
as described previously in section 11.3.1.
2. $\mathrm{ANN}^m_{DNMseg}$: For a mammogram I, enhanced using enhancement method
m, the ANN learns the mapping between the set of quantitative measures
$Q = (q_1, q_2, \dots, q_P)$ of enhancement performance and a set of R measures
quantifying the performance of lesion segmentation $S = (s_1, s_2, \dots, s_R)$
identified in Table 11.2.
A diagrammatic overview of the mappings learnt is given in Fig. 11.6 and the
training and testing phases are described in more detail below. To evaluate the
strategy, a fivefold cross-validation approach is used to reduce bias and ensure
that a test result is produced for each mammogram image.
11.3.4.1.1 Training the DNM Approach. Using this strategy, $\mathrm{ANN}^m_{DNMenh}$
and $\mathrm{ANN}^m_{DNMseg}$ are trained independently for each enhancement method, Em
where m ∈ {1, . . . , M}. For a training image, a border comprising normal pixels
of the same area as the target ROI is constructed around it. The set of gray-
scale input features FROI are extracted from the target ROI and background
regions as described in section 11.3.3.1. Each training mammogram is contrast
enhanced with each method and a set of quantitative measures of enhancement
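Q are calculated from the target ROI and border. Thus $\mathrm{ANN}^m_{DNMenh}$ learns the
mappings:
$$F_{ROI} \xrightarrow{\ \mathrm{ANN}^m_{DNMenh}\ } Q \qquad \forall m \in \{1, \dots, M\} \tag{11.11}$$
$$Q \xrightarrow{\ \mathrm{ANN}^m_{DNMseg}\ } S \qquad \forall m \in \{1, \dots, M\} \tag{11.12}$$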
11.3.4.1.2 Testing the DNM Approach. The first step in determining the
optimal contrast enhancement method for a test mammogram I is to locate
a suspicious ROI. To do this, the HMRFU segmentation algorithm is used to
segment the test image and it results in several candidate regions. Regions with an
Euler number > 1 (i.e., those that totally enclose a smaller region) are removed and from the
remaining regions, the most likely suspicious regions are selected on the basis
of area and morphological tests using a previously trained ANN. For the single
suspicious ROI identified, a surrounding border is constructed of equal area, and
the set of input gray-scale features FROI are extracted. These are used as inputs to
$\mathrm{ANN}^m_{DNMenh}$ for each enhancement method Em where m ∈ {1, . . . , M}. The output
of these networks is then supplied as input to $\mathrm{ANN}^m_{DNMseg}$ for each enhancement
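$$\text{assign } E_m \rightarrow m \ \text{ if } \ \mathrm{ANN}^m_{DNMseg} = \operatorname*{arg\,max}_{m = 1, \dots, M} \mathrm{ANN}^m_{DNMseg} \qquad \forall m \in \{1, \dots, M\} \tag{11.13}$$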
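The test-time chain of the two mappings can be sketched as below. This is only an illustration of the control flow: the trained networks are replaced by hypothetical stand-in functions, and reducing the predicted segmentation measures S to a single score by summing them is an assumption.

```python
def select_enhancement(features, enh_nets, seg_nets, score=sum):
    """Chain F_ROI -> Q -> S for every method and return the best method.

    `enh_nets[m]` and `seg_nets[m]` stand in for the two trained ANNs of
    method m; `score` reduces the predicted measures S to one number.
    """
    best_m, best_score = None, float("-inf")
    for m in enh_nets:
        q = enh_nets[m](features)   # predicted enhancement measures Q
        s = seg_nets[m](q)          # predicted segmentation measures S
        value = score(s)
        if value > best_score:
            best_m, best_score = m, value
    return best_m


# Hypothetical stand-ins for trained networks of two enhancement methods.
enh_nets = {1: lambda f: [sum(f)], 2: lambda f: [2 * sum(f)]}
seg_nets = {1: lambda q: [q[0], 0.0], 2: lambda q: [0.5 * q[0], 0.1]}

chosen = select_enhancement([0.2, 0.3], enh_nets, seg_nets)
print(chosen)
```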
11.3.4.1.4 DNM Framework Results. Table 11.6 shows the mean percent-
age improvement in segmentation performance compared with the segmenta-
tion of the unenhanced original image, using the predicted expert enhancement
for each breast type. The DNM strategy results are significantly poorer than
those obtained using the target expert contrast enhancement methods reported
in Table 11.4. They are also inferior to the use of the FUZZY method on all breasts
as shown in Table 11.5 (part c).
The second strategy used for learning the expert contrast enhancement for a
mammogram is the breast profile mapping (BPM) strategy. For a mammogram
I, enhanced using enhancement method Em, where m ∈ {1, . . . , M}, the BPM
strategy learns the mapping between the set of N gray-scale input features $F_{BP_N}$
detailed in section 11.3.3.1 and a label l ∈ {1, . . . , L} that indicates the target contrast
enhancement for a training mammogram. Both feature sets ($F_{BP316}$ and $F_{BP26}$) are
evaluated separately for their utility in learning the expert contrast enhance-
ment. The expert l is based on a set of R measures quantifying the performance
of lesion segmentation S = {s1 , s2 , . . . , sR } described in Table 11.2. The expert l
is identified as the one maximising the sum of TPT and SUBTPT outcomes for
each enhancement method Em where m ∈ {1, . . . , M} as defined previously in
Eq. (11.10).
Unlike the DNM strategy, this method utilizes a single classifier to pre-
dict the target contrast enhancement method. The k-nearest neighbor (k-
NN) classifier has been shown to be effective at learning nonparametric map-
pings with a small sample size [27] and for this reason it is employed in the
knowledge-based contrast enhancement expert. To evaluate the strategy, a
fivefold cross-validation is used to reduce bias and provide a test decision for each
mammogram.
11.3.4.2.1 Training the BPM Approach. To train the BPM strategy, the
set of gray-scale input features $F_{BP_N} = (f_1, f_2, \dots, f_N)$, where N identifies the
original and PCA feature sets (N ∈ {316, 26}), is extracted from the segmented
breast profile. Each training mammogram is contrast enhanced with each en-
hancement method. The quantitative measures of segmentation are calculated
for the target ROI. For each enhancement method, the winning predicted en-
hancement method identified by the label l is used to learn the mapping between
$F_{BP_N}$ and l with the k-NN classifier.
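The k-NN mapping from breast-profile features to an enhancement label can be sketched as a plain majority vote among the k nearest training vectors. The Euclidean metric, the two-dimensional toy features, and the labels are assumptions made for illustration; the study's feature sets have 316 or 26 dimensions.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Predict the label of `query` by majority vote of its k nearest neighbours."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: feature vectors and their target enhancement labels.
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]
train_y = ["FUZZY", "FUZZY", "FUZZY", "HISTEQ", "HISTEQ"]

print(knn_predict(train_x, train_y, (0.2, 0.2), k=3))
print(knn_predict(train_x, train_y, (5.0, 5.5), k=3))
```

In a fivefold cross-validation, this prediction would be repeated with each fold held out in turn so that every mammogram receives one test decision.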
1. Feature set FBP316 : Using an optimized value of k = 23, Table 11.7 shows
the percentage improvement in segmentation performance when using the
predicted enhancement method, compared with that obtained with
the unenhanced original from the FBP316 set. These results show that the
segmentation improvement obtained over the unenhanced image, when
segmenting an image enhanced using an enhancement method predicted
by the BPM strategy, is greater than that obtained using the DNM strategy
predicted enhancement method. However, segmenting the BPM strategy’s
predicted enhanced image results in inferior performance to that using the
target enhancement method identified in Table 11.4. The result for breast
type 4, the densest breast type, shows a small improvement over using
the FUZZY method, shown in Table 11.5 (part c), for all mammograms of that
type.
2. Feature set FBP26 : Using an optimized value of k = 19, Table 11.8 shows the
percentage improvement in segmentation performance, relative to segmenting
the unenhanced image, when segmenting the image enhanced using the
enhancement method predicted by the optimized BPM strategy with the FBP26
feature set. These results indicate better performance than the DNM strategy but are
still inferior to the segmentation using the target expert enhancement
method shown in Table 11.4. The results for breast types 1–3 show an im-
provement over using the FUZZY method in Table 11.5 (part c) for all mammo-
grams of those types. Comparing the results from the evaluation of the two
feature sets, FBP26 and FBP316 from the BPM strategy, the results indicate
that the feature set FBP26 is better suited to processing mammograms with
breast types 1–3, whereas the feature set FBP316 gives better performance
on the densest breast type, i.e., type 4. Interestingly, for both feature sets,
the performance improvement is worse for the fattiest breast type, type
1, compared with the densest, type 4. This is because of the variability
of the optimal enhancement method for the fatty breast types, whereas the
denser breasts tend to be optimally enhanced by the FUZZY method more
often.
of the target optimal values from Table 11.4. Additionally, the table shows the
result obtained by applying the FUZZY method to all images (given in Table
11.5 (part c)) over all four breast types. The last row in Table 11.9 shows the result
of using the prediction from the BPM strategy with feature set FBP26 on breast
types 1–3 and feature set FBP316 on type 4. From these results the following key
observations are made:
2. Target experts: Figure 11.5 highlighted that given a set of contrast enhance-
ment methods, different methods can be identified as target enhancement
experts for different mammograms. This observation is the motivation for
learning the optimal expert.
4. The superior BPM approach: The modified BPM strategy based on breast
type gives better performance than simply using the FUZZY method. The
result is inferior to the target contrast enhancement baseline performance,
indicating that learning the expert enhancement is a nontrivial problem.
In implementing the modified BPM strategy, a mechanism for predicting
the breast type is required.
5. Use of mammogram grouping knowledge: The BPM approach has been de-
veloped to utilize a priori knowledge describing the mammogram group-
ing indicating the mammographic breast density type. This knowledge is
used to determine the feature extraction method to be used, either FBP26 for
breast types 1–3 or FBP316 for type 4. In the experimental results presented
above, the target breast type was used.
The GMM approach does not consider the spatial arrangement of class la-
bels in an image, which can be quite useful for relaxation labeling [28]. Markov
random fields (MRF) have been shown as a powerful class of techniques [29–31]
for modeling the spatial arrangement of class labels. MRF can be expressed in
terms of a probabilistic framework and they can be combined with a statistical
observed model of the mammogram. An MRF can increase the homogeneity of
the formed regions that leads to a reduction in the false positives.
In this study we propose a Weighted Gaussian Mixture Model (WGMM) for
both supervised ($\mathrm{WGMM}_S$) and unsupervised ($\mathrm{WGMM}_U$) data analysis. A set of
GMMs is constructed, each modeling a particular class distribution and capable
of being combined into a single unconditional density. We combine the WGMM
model with an MRF hidden model and propose two approaches that work for
supervised ($\mathrm{WGMM}_S^{MRF}$) and unsupervised ($\mathrm{WGMM}_U^{MRF}$) modes. The four models
or experts ($\mathrm{WGMM}_S$, $\mathrm{WGMM}_U$, $\mathrm{WGMM}_S^{MRF}$, and $\mathrm{WGMM}_U^{MRF}$) each produce a label
for the test pixel. We use a number of different features, each forming the basis of
a different expert and relying on one of the above four models for segmentation.
The expert outputs can be combined using well-known expert combination
methods. In this chapter we propose an adaptive weighted model (AWM) for
the combination of four experts and show that this new method of combination
outperforms other popular methods.
If Ymn = 1, then data point xn will only be considered when setting the param-
eters of class ωl modeled by component m. Using the labelled training data, a
maximum likelihood (ML) estimate of all component parameters and mixing
coefficients can be found.
We first describe the two modes of test image segmentation, supervised and
unsupervised, in section 11.4.2. We then detail our weighted GMM/MRF models
in section 11.4.3.
such that the mixing coefficients $f(m)$ satisfy the following constraints:
$$\sum_{m=1}^{M} f(m) = 1 \quad \text{and} \quad 0 \le f(m) \le 1.$$
The $l$th GMM estimates the class-conditional pdf $p(x \mid \omega_l, \Theta_l)$, which is itself
another mixture model, for each data point for each class $\{\omega_l\}_{l=1}^{L}$. The vector
$\Theta_l$ is defined as the $M$ component Gaussian parameters of the $l$th GMM as
$\Theta_l = \{P_l(m), \mu_{lm}, \Sigma_{lm}\}$, $\forall m \in \{1, \dots, M\}$. Each estimate of the class-conditional
pdf is mixed to model the overall unconditional density $p(x)$, using a mixing
coefficient $p(\omega_l)$, identifying the contribution of the $l$th class density in the
unconditional pdf.
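The two-level mixture can be illustrated with a small sketch: each class density is a Gaussian mixture, and the class densities are mixed again by the class coefficients. Univariate Gaussians are used for brevity (an assumption; the chapter's features are generally multivariate), and all parameter values are hypothetical.

```python
import math


def gauss(x, mu, sigma):
    """Univariate normal density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def class_pdf(x, components):
    """Class-conditional pdf: mixture of (weight, mu, sigma) components.

    The weights play the role of the mixing coefficients and must sum to 1.
    """
    return sum(w * gauss(x, mu, s) for w, mu, s in components)


def unconditional_pdf(x, class_models, class_priors):
    """Unconditional density: class pdfs mixed with the class coefficients."""
    return sum(p * class_pdf(x, comps)
               for p, comps in zip(class_priors, class_models))


# Hypothetical two-class model: background (two components), lesion (one).
class_models = [[(0.6, 50.0, 5.0), (0.4, 70.0, 8.0)], [(1.0, 120.0, 10.0)]]
class_priors = [0.7, 0.3]
print(unconditional_pdf(60.0, class_models, class_priors))
```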
If it is assumed that a complete dataset $X \equiv \{x_1, \dots, x_N\}$ of points $x_n$
is drawn independently from the distribution $f(x \mid \theta)$, then the joint
occurrence of the whole dataset can be conveniently expressed as the log like-
lihood as follows:
$$\log \zeta(\Theta) = \sum_{n=1}^{N} \log p(x_n \mid \Theta) = \sum_{n=1}^{N} \log \sum_{l=1}^{L} \gamma_{nl}\, p(\omega_l)\, p(x_n \mid \omega_l, \Theta_l) \tag{11.16}$$
in Eq. (11.13) are themselves mixture models. In the EM algorithm, the update
equations for mixing coefficients do not depend on the functional particulars
of the component densities. Hence, the mixing coefficients of the WGMM are
updated according to
$$P^{new}(\omega_l) = \frac{1}{N} \sum_{n=1}^{N} p^{old}(\omega_l \mid x_n, \Theta_l^{old}) \tag{11.17}$$
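The update of Eq. (11.17) averages, over all data points, the posterior probability of each class under the old parameters. A minimal sketch, assuming the class-conditional likelihoods have already been evaluated and that the posteriors are obtained with Bayes' rule; the numbers are hypothetical.

```python
def posteriors(likelihoods, priors):
    """Posterior class probabilities for every data point.

    likelihoods[n][l] is the class-conditional density of point n under
    class l with the old parameters; priors[l] is the old mixing coefficient.
    """
    result = []
    for row in likelihoods:
        joint = [p * lik for p, lik in zip(priors, row)]
        total = sum(joint)
        result.append([j / total for j in joint])
    return result


def update_priors(post):
    """Eq. (11.17): new mixing coefficient = mean posterior over the data."""
    n = len(post)
    return [sum(row[l] for row in post) / n for l in range(len(post[0]))]


# Hypothetical likelihoods for two points and two classes.
likelihoods = [[0.2, 0.2], [0.6, 0.2]]
old_priors = [0.5, 0.5]

post = posteriors(likelihoods, old_priors)
new_priors = update_priors(post)
print(new_priors)
```

Because each row of posteriors sums to one, the updated coefficients also sum to one, so the constraint on the mixing coefficients is preserved without extra normalisation.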
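The M-step involves maximizing the auxiliary function with respect to the pa-
rameters $\{\Theta_l\}_{l=1}^{L}$. The auxiliary function can be written as
$$Q(\Theta^{new}, \Theta^{old}) = \sum_{n=1}^{N} \sum_{l=1}^{L} p^{old}(\omega_l \mid x_n, \Theta_l^{old}) \log \left[ P^{new}(\omega_l)\, p^{new}(x_n \mid \omega_l, \Theta_l^{new}) \right] \tag{11.18}$$
where
$$p^{new}(x_n \mid \omega_l, \Theta_l^{new}) = \sum_{m=1}^{M} p^{new}(m_l)\, p^{new}(x_n \mid m_l, \Theta_{ml}^{new}) \tag{11.19}$$
Writing $\gamma_{nl} = p^{old}(\omega_l \mid x_n, \Theta_l^{old})$, the auxiliary function can be written as the sum
of L auxiliary functions, one for each mixture model:
$$Q(\Theta^{new}, \Theta^{old}) = \sum_{n=1}^{N} \sum_{l=1}^{L} \gamma_{nl} \log \left[ P^{new}(\omega_l)\, p^{new}(x_n \mid \omega_l, \Theta_l^{new}) \right] \tag{11.20}$$
$$Q(\Theta^{new}, \Theta^{old}) = \sum_{l=1}^{L} \hat{Q}_l(\Theta_l^{new}, \Theta_l^{old}) \tag{11.21}$$
$$\text{where} \quad \gamma_{nl} = p(\omega_l \mid x_n, \Theta_l) = \frac{p(x_n \mid \omega_l, \Theta_l)\, P(\omega_l)}{\sum_{j=1}^{L} p(x_n \mid \omega_j, \Theta_j)\, P(\omega_j)} \tag{11.22}$$
$$\text{and} \quad \hat{Q}_l(\Theta_l^{new}, \Theta_l^{old}) = \sum_{n=1}^{N} \gamma_{nl} \log \left[ P^{new}(\omega_l)\, p^{new}(x_n \mid \omega_l, \Theta_l^{new}) \right] \tag{11.23}$$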
2. Iterate outer E-step and outer M-step until the change in the auxiliary func-
tion (Eq. 11.18) between iterations is less than some convergence thresh-
old $\mathrm{WGMM}_{converge}$.
3. Outer EM E-step:
4. Outer EM M-step:
(b) Find new values for the WGMM mixing coefficients, $P^{new}(\omega_l)$, that max-
imize the auxiliary function given in step 3(b) above.
Finally, we combine our WGMM model with MRF in the same manner as
Zhang et al. [24] combined GMM with MRF. The $\mathrm{WGMM}^{MRF}$ model is based on
Eq. (11.15) except that the mixing coefficients $p(\omega_l)$ are replaced with an MRF-
MAP estimate $p(y_n = \omega_l \mid \aleph_n)$ using the ICM algorithm [29]. The auxiliary function
given in Eq. (11.18) is rewritten to include the MRF hidden model as follows:
$$Q(\Theta^{new}, \Theta^{old}) = \sum_{n=1}^{N} \sum_{l=1}^{L} p^{old}(\omega_l \mid x_n, \Theta_l^{old}) \log \left( p(y_n = \omega_l \mid \aleph_n)\, p^{new}(x_n \mid \omega_l, \Theta_l^{new}) \right) \tag{11.24}$$
The update equations for the mean and covariances in the GMM-EM algorithm
remain unchanged. The MRF-MAP estimate is combined in the conditional den-
sity function pold (ωl | xn, θlold ) as
covariances lm from each of the M component Gaussians, m ∈ {1, . . . , M}, for
each class ωl ∈ {1, . . . , L}. On segmentation of an image, the rth expert provides
an estimate of the a posteriori probability of a feature vector associated with a
pixel xn, belonging to a given class ωl as p( ŷn = ωl | xn, θr ), for ∀n = (1, . . . , N).
In order to combine the decisions of different experts, the joint probability of
all segmentation decisions is required. Using the Bayes rule, the combined a
posteriori probability can be computed from the segmentation experts for class
ωl as follows:
p( ŷ = ωl | xn, θ1 . . . θ R ) p(ωl )
p( ŷ = ωl | xn, θ1 . . . θ R ) = (11.26)
p(xn, θ1 , . . . , θ R )
where p(ωl) is the prior probability (assumed to be set equally for all classes as
1/R) for each class ωl, and p(xn, θ1, . . . , θR) is the unconditional joint probability
defined as
$$p(x_n, \theta_1, \ldots, \theta_R) = \sum_{k=1}^{L} p(\hat{y} = \omega_k \mid x_n, \theta_1, \ldots, \theta_R)\, p(\omega_k) \tag{11.27}$$
On the basis of this nomenclature and equal priors for each class, in the
following two sections we detail the “ensemble-based combination rules” (section
4.4.2) and then propose a novel strategy for combining results, called the
“adaptive weighted model (AWM)” (section 4.4.3).
Kittler [33] proposed a set of very popular rules for combining probability out-
puts from a number of experts. These rules are stated as follows:
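Product (Prod):
$$p(\hat{y} = \omega_l \mid x_n, \theta_1 \ldots \theta_R) = \frac{\prod_{r=1}^{R} p(\hat{y} = \omega_l \mid x_n, \theta_r)}{\sum_{j=1}^{L} \prod_{r=1}^{R} p(\hat{y} = \omega_j \mid x_n, \theta_r)}$$
Sum (Sum):
$$p(\hat{y} = \omega_l \mid x_n, \theta_1 \ldots \theta_R) = \frac{1}{R} \sum_{r=1}^{R} p(\hat{y} = \omega_l \mid x_n, \theta_r)$$
Max (Max):
$$p(\hat{y} = \omega_l \mid x_n, \theta_1 \ldots \theta_R) = \max_{r=1}^{R}\, p(\hat{y} = \omega_l \mid x_n, \theta_r)$$
Min (Min):
$$p(\hat{y} = \omega_l \mid x_n, \theta_1 \ldots \theta_R) = \min_{r=1}^{R}\, p(\hat{y} = \omega_l \mid x_n, \theta_r)$$
Majority Voting (Mv):
$$p(\hat{y} = \omega_l \mid x_n, \theta_1 \ldots \theta_R) = \frac{1}{R} \sum_{r=1}^{R} \Delta_{lr}$$
where
$$\Delta_{lr} = \begin{cases} 1 & \text{if } p(\hat{y} = \omega_l \mid x_n, \theta_r) = \max_{j=1}^{L} p(\hat{y} = \omega_j \mid x_n, \theta_r) \\ 0 & \text{otherwise} \end{cases}$$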
The above combination rules have been used in several studies and form the
basis of our baseline comparison.
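The five rules can be illustrated for a single pixel with a minimal Python sketch. The function name `combine`, the rule labels, and the input layout (one list of class probabilities per expert) are assumptions made for this illustration and are not taken from the chapter; ties in the vote are broken in favor of the lowest class index, which is also an assumption.

```python
from math import prod

def combine(expert_probs, rule):
    """Combine per-expert class probabilities for one pixel.

    expert_probs[r][l] holds p(y = w_l | x_n, theta_r) for expert r and class l.
    Returns one combined score per class.
    """
    n_experts = len(expert_probs)
    n_classes = len(expert_probs[0])
    # Regroup so that per_class[l] lists every expert's probability for class l.
    per_class = [[expert_probs[r][l] for r in range(n_experts)]
                 for l in range(n_classes)]
    if rule == "prod":
        raw = [prod(p) for p in per_class]
        total = sum(raw)  # assumed non-zero in this sketch
        return [v / total for v in raw]
    if rule == "sum":
        return [sum(p) / n_experts for p in per_class]
    if rule == "max":
        return [max(p) for p in per_class]
    if rule == "min":
        return [min(p) for p in per_class]
    if rule == "mv":
        votes = [0] * n_classes
        for probs in expert_probs:
            # Each expert votes for its most probable class (first index on ties).
            votes[probs.index(max(probs))] += 1
        return [v / n_experts for v in votes]
    raise ValueError(f"unknown rule: {rule}")

# Three hypothetical experts, two classes.
experts = [[0.6, 0.4], [0.7, 0.3], [0.2, 0.8]]
for name in ("prod", "sum", "max", "min", "mv"):
    print(name, combine(experts, name))
```

Note that only the product, sum, and voting rules return values that sum to one across classes; the max and min rules are used here only to rank classes.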
$$p(\hat{y}_n \mid x_n, \Theta) = \sum_{r=1}^{R} p(r)\, p(\hat{y}_n \mid r, x_n) \tag{11.29}$$
given that the mixing coefficients satisfy the following constraints:
$\sum_{r=1}^{R} p(r) = 1$ and 0 ≤ p(r) ≤ 1. If we treat the weighted contribution
of each expert in the unconditional distribution as probabilities, then
statistical models such as the mixture of experts (MOE) framework [34] can be
trained to learn the individual classifier and weight contribution
distributions. For this we propose using the GMM with the EM algorithm. We now
present a method for identifying the weights in a probabilistic manner motivated
by the MOE framework. Our proposed approach is, however, different from the
conventional MOE method in two ways: (i) first, the a posteriori pdf from each
segmentation expert remains fixed, having been generated during segmentation;
(ii) second, the mixing coefficients for
A Knowledge-Based Scheme for Digital Mammography 633
For simplicity, the above likelihood function can be rewritten and expressed as
a log likelihood as follows:
$$\log \zeta(\Theta) = \sum_{n=1}^{N} \log p(\hat{y}_n \mid \Theta) \equiv \sum_{n=1}^{N} \log \sum_{r=1}^{R} p(r)\, p(\hat{y}_n \mid r, x_n) \tag{11.31}$$
For the above equation, it is not possible to find the ML estimate of the
parameter values directly because of the inability to solve
$\partial \zeta / \partial \Theta = 0$ [23]. Our approach to maximising the
likelihood log ζ(Θ) is based on the EM algorithm proposed in the context of
missing data estimation [35].
It should be noted that the a posteriori estimate p( ŷn | r, xn) for the nth data
point from the rth segmentation expert remains fixed. The conditional density
function pold (r | ŷn) is computed using the Bayes rule as
$$p^{old}(r \mid \hat{y}_n) = \frac{p(\hat{y}_n \mid r, x_n)\, p(r)}{\sum_{j=1}^{R} p(\hat{y}_n \mid j, x_n)\, p(j)} \tag{11.33}$$
In order to maximize the estimate of the likelihood function given by the
auxiliary function, update equations are required for the mixing coefficients.
These can be obtained by differentiating the auxiliary function with respect to
the parameters and setting the result equal to zero. For the AWM, the update
equations are taken from [27]. For the rth segmentation expert
$$p^{new}(r) = \frac{1}{N} \sum_{n=1}^{N} p^{old}(r \mid \hat{y}_n) \tag{11.34}$$
1. The new estimate of the segmentation expert weightings for the rth
component, p^new(r), is given by Eq. (11.34).
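As an illustration of how Eqs. (11.33) and (11.34) alternate, the following Python sketch estimates the expert weights p(r) from fixed expert outputs. The function name `awm_weights`, the uniform initialization, and the stopping rule (largest change in any weight below `tol`) are assumptions of this sketch; the chapter's own convergence criterion is not reproduced here. All expert outputs are assumed to be strictly positive.

```python
def awm_weights(expert_probs, tol=1e-9, max_iter=1000):
    """Estimate mixing weights p(r) by EM with fixed expert outputs.

    expert_probs[n][r] holds p(y_n | r, x_n), which stays fixed throughout.
    """
    n_points = len(expert_probs)
    n_experts = len(expert_probs[0])
    weights = [1.0 / n_experts] * n_experts  # assumed uniform start
    for _ in range(max_iter):
        accum = [0.0] * n_experts
        for row in expert_probs:
            joint = [row[r] * weights[r] for r in range(n_experts)]
            total = sum(joint)
            for r in range(n_experts):
                # E-step, Eq. (11.33): responsibility of expert r for point n.
                accum[r] += joint[r] / total
        # M-step, Eq. (11.34): average responsibility over all points.
        new_weights = [a / n_points for a in accum]
        change = max(abs(a - b) for a, b in zip(new_weights, weights))
        weights = new_weights
        if change < tol:
            break
    return weights

# Hypothetical outputs of two experts on three pixels.
print(awm_weights([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]))
```

In this toy input the first expert assigns the higher probability at every pixel, so its weight is driven toward one.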
case, there is no concept of training and testing and each image is treated
individually.
explanation for this phenomenon could be based on the model order selection
where m = 1 for the abnormal class of the fatty breast types. A more sophisti-
cated approach to determining model order might improve the segmentation of
these breast types. Without the hidden MRF model, the supervised strategy is
inferior to the unsupervised approach on the denser breasts.
We now present the results on 200 test mammograms that contain lesions.
The details of the training and testing scheme are the same as detailed in
section 11.4.2. As we mentioned earlier, each breast is classified as one of the
four types (1, predominantly fat; 2, fat with fibroglandular tissue; 3,
heterogeneously dense; and 4, extremely dense) and the results are presented for
data from each type. Table 11.11 shows the test results on sensitivity of the
Table 11.11: Mean sensitivity for each testing strategy for DDSM image
database
Results are shown for all breast types. Winning segmentation experts are shown in bold per breast type.
Table 11.12: Mean sensitivity for each combination strategy for DDSM
database
Results are shown for all breast types. Winning combination method shown in bold per breast type.
(b)
1    WGMM_S^{MRF}    AWM    0.575    .25
2    WGMM_U^{MRF}    AWM    0.667    .26
3    WGMM_S^{MRF}    AWM    0.727    .38
4    WGMM_S^{MRF}    AWM    0.680    .37
We next compare the ensemble combination rules with the AWM expert
combination strategy on the four breast type data testing. The results are shown
in Table 11.12. The key results can be summarized as follows: (a) The AWM
method always gives the best overall result compared with all ensemble
combination rules on all breast types. (b) The AWM results are best with the
WGMM_S^{MRF} segmentation method on breast types 1, 3, and 4, and best with
WGMM_U^{MRF} on breast type 2. (c) The combination methods Max and Prod never
win. (d) Segmentation models that use an MRF are better than those that do not.
In Table 11.13 we compare the single best experts with the best combination
of experts for the four breast types. The results show that only on breast type
1 does the single best expert, WGMM_S with laws1 features, outperform all
other experts and combinations of experts (sensitivity of 0.74). For the
remaining three breast types, the AWM expert combination method is the best. For
breast types 3 and 4 (dense breasts), the supervised learning based models with
MRF are better, whereas for fatty breasts of type 2, the unsupervised learning
model with MRF is the best.
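[Figure 11.7: schematic overview of the false-positive reduction approach. Recoverable stage labels: segmented images, region prefiltering, feature extraction, PCA.]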
This section describes the approach used within the adaptive knowledge-based
model for the reduction of false-positive regions. Figure 11.7 shows a schematic
overview of the approach adopted. Using the actual breast type grouping pre-
dicted by the breast classification component, a segmented mammogram is di-
rected to one of four process flows. Each process flow, shown in Fig. 11.7,
comprises the same functionality. This is discussed in more detail in the follow-
ing subsections.
Using a labelled training set, an ANN classifier can be trained using supervised
learning algorithms to discriminate between normal and abnormal regions.
Features from representative training samples are provided during supervised
learning, and the weights of the ANN are updated until the generalization
ability of the classifier, measured on a separate validation set, starts to
decrease. Implementation in the adaptive knowledge-based model results in the
construction of a separate ANN classifier for each breast type grouping. Only
regions from mammograms of the same breast type will be considered for each ANN.
Each ANN is a three-layer feed-forward network comprised of a different num-
ber of hidden nodes and two output nodes (normal, abnormal). The optimal
number of hidden nodes is determined for each ANN individually. To ensure
an unbiased result and that every sample is used at least once in training and
testing, a 10-fold cross validation strategy [32] is employed. No sample appears
simultaneously in training and test. Additionally a validation set is used (com-
prising 10% of the training samples) to prevent over-fitting of the ANN to the
training set. The feed-forward ANN is trained using back-propagation with
momentum (learning rate η = 0.01, momentum µ = 0.5) together with a softmax
activation function, and is used at test time to give an estimate of the
a posteriori probability of each pattern for each class.
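The two ingredients named above can be sketched in a few lines of Python. This is a generic illustration of a softmax output and the standard momentum weight update, using the quoted values η = 0.01 and µ = 0.5 as defaults; the function names and the flat-list weight layout are assumptions of the sketch, not the authors' implementation.

```python
from math import exp

def softmax(activations):
    """Map output-node activations to values that are positive and sum to one."""
    shift = max(activations)  # subtracting the maximum avoids overflow in exp
    exps = [exp(a - shift) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]

def momentum_step(weights, gradients, prev_deltas, eta=0.01, mu=0.5):
    """One weight update: delta = -eta * gradient + mu * previous delta."""
    deltas = [-eta * g + mu * d for g, d in zip(gradients, prev_deltas)]
    new_weights = [w + d for w, d in zip(weights, deltas)]
    return new_weights, deltas

# Two output nodes (normal, abnormal) with hypothetical activations.
print(softmax([1.2, -0.3]))
```

The softmax outputs can then be read as estimates of the a posteriori class probabilities for a region.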
After expert segmentation with WGMM_S^{MRF} combined using AWM; after region prefiltering using T_area = 122;
after false-positive (FP) reduction using classifier operating point, by breast type.
a lower one. Using the unbiased PCA strategy described above only eigenvalues
≥ 1.0 are considered, resulting in a 37-dimensional feature vector.
To optimize the number of hidden nodes, using 10-fold cross validation, different
ANN models are evaluated. For the evaluated ANN model, performance in dis-
criminating between abnormal and normal regions is determined using receiver
operating characteristic (ROC) analysis [40]. By calculating the area under the
ROC curve (AZ ), a quantitative measure of performance can be determined.
Table 11.15 summarizes the results from applying the false-positive reduction
strategy to 200 abnormal segmented DDSM mammograms. Three sets of results
are shown for each stage in the false-positive approach described for each breast
type grouping.
The first column shows the sensitivity and average number of false positives
per image following mammogram segmentation. The segmentation was obtained by
combining 10 segmentation expert outcomes using the AWM described earlier. Each
expert was constructed using the WGMM constrained with an MRF utilizing a
supervised learning approach, WGMM_S^{MRF}.
The second column shows the sensitivity and average number of false-positive
regions per image obtained after applying the region prefiltering. These
results demonstrate the utility of the region prefiltering stage. The average
number of false-positive regions per image has dropped from approximately 147 to
just 9 when testing on the complete dataset of 200 abnormal mammograms.
This result has been obtained at the cost of a reduction in the sensitivity to
the detection of breast lesions, from 0.63 to 0.60, over all breast types.
The final column shows the results obtained after classifying each region
passing the prefiltering using an optimized ANN based on the 37-dimensional
PCA feature vector for each sample. Using ROC analysis, the threshold for the
detection of positive cases is set using the operating point of each ANN [40].
From these results it can be seen that the sensitivity is reduced still further to
just 0.54 for all 200 abnormal mammograms, with a reduced average number
of false-positive regions per image of 3.84. The results indicate that the biggest
drop in sensitivity is obtained for the fatty breasts, breast types 1 and 2. This
may be attributed to the increased variability of breast lesions in these breast
types compared with that of the denser breasts.
This section presents the results from evaluating the optimal contrast en-
hancement and segmentation knowledge-based components of the adaptive
knowledge-based model on a dataset of 200 DDSM mammograms containing
abnormalities. The 200 mammograms comprise 50 images from each of four dif-
ferent breast types. To obtain a testing result for each mammogram, knowledge-
based components utilize separate training and testing folds such that no image
from a test fold exists in a corresponding training fold. Training data for the
abnormal mammograms is based on redefined DDSM ground truth boundaries.
Figure 11.8 shows the configuration of the adaptive knowledge-based model
for contrast enhancement and mammogram segmentation used for performance
evaluation. Enhancement and segmentation experts are identified in the black
boxes. Knowledge-based components, providing optimal enhancement and op-
timal segmentation, are identified in dotted boxes. Associated with each expert
and knowledge-based component in Fig. 11.8 is a table with four rows, one for
each breast type. The right-hand column of the table identifies the performance
of the associated expert or knowledge-based component for all mammograms
of the predicted breast type. This performance measure is computed differently
for contrast enhancement and segmentation components as follows:
Knowledge-based component OPTIMAL ENHANCE (mammogram enhancement):
Breast type   Performance
1             0.28
2             0.85
3             0.38
4             0.89
Mean          0.60

Experts:
Breast type   GREYS  DWT.1  DWT.2  DWT.3  ENHANCED  LAWS.1  LAWS.2  LAWS.3  LAWS.4  LAWS.5
1             0.65   0.62   0.61   0.61   0.62      0.58    0.60    0.56    0.59    0.57
2             0.63   0.60   0.58   0.59   0.60      0.59    0.60    0.57    0.60    0.55
3             0.66   0.61   0.62   0.61   0.63      0.58    0.59    0.56    0.60    0.55
4             0.69   0.64   0.65   0.65   0.67      0.60    0.61    0.59    0.65    0.58
Mean          0.65   0.61   0.61   0.61   0.63      0.58    0.60    0.57    0.61    0.56

Knowledge-based component OPTIMAL SEGMENT (mammogram segmentation):
Breast type   Performance
1             0.71
2             0.71
3             0.72
4             0.75
Mean          0.72
Figure 11.8: Evaluation of a given configuration of the adaptive knowledge-based model. Performance shown for each breast
type for each component is interpreted as a percentage.
The following paragraphs briefly review the contrast enhancement and seg-
mentation of digitized mammograms described in previous sections.
Contrast enhancement: The trained contrast enhancement knowledge-
based component selects the optimal contrast enhancement method for a test
mammogram, as one from a subset of six selected enhancement methods. Each
of the enhancement methods has been described in section 11.3.2.1.1. The BPM
strategy is used to implement the knowledge-based contrast enhancement com-
ponent, and following training predicts the optimal enhancement method for
a testing mammogram on the basis of an extracted feature vector. A different
feature vector is used depending on the predicted breast type. A feature vector
comprising a selected number of principal components FBP26 is used for mam-
mograms of breast types 1–3. For breast type 4, the complete feature vector
FBP316 is used.
Segmentation: To segment a mammogram, the semisupervised WGMM constrained
with a WGMM_S^{MRF} strategy is used. Ten different segmentation experts
are trained and each one gives a segmentation decision for the test mammogram.
The 10 experts have been trained to operate on specific groupings of input fea-
ture spaces. The experts for this configuration of the adaptive knowledge-based
model are described in section 11.4.4. The decision of each expert is combined
using a knowledge-based segmentation component implemented using the AWM
described earlier. The AWM will predict the optimal blend of expert decisions
to maximize the segmentation performance.
11.6.1.2 Results
This section presents the results from contrast enhancement and segmentation
using 200 abnormal images, such that the image processing pipeline is con-
structed on the basis of the predicted breast type.
Knowledge-based contrast enhancement: From the results presented in
Fig. 11.8, it can be seen that the best performing expert over all breast types
is the FUZZY contrast enhancement method. The average improvement in
segmentation performance is 54% for all 200 abnormal images. Using the predicted optimal
enhancement method from the knowledge-based contrast enhancement compo-
nent, the average improvement in segmentation performance increases to 60%.
The knowledge-based contrast enhancement component is determining the op-
timal enhancement based on component knowledge learnt during supervised
The results in the previous section show that the performance obtained following
ROC analysis of the knowledge-based segmentation component is greater than
that obtained from the best performing segmentation expert. By thresholding
each probability image using a ROC operating point following optimal expert
combination, region boundaries can be identified. In general, the ROC operating
point [40] can be selected for each individual mammogram by associating a cost
for a false positive, CFP , and a false negative, CFN . In this chapter, the operating
point cannot be determined using this method. This is because the ground truth
knowledge cannot be used during testing.
To determine an estimate of the operating point, the mean operating point is
calculated from all mammograms contained within a training fold. Only
mammograms that, following segmentation, give lesion detection with an operating
point greater than 0.95 are considered. The mean operating point is calculated
from each training fold, for each breast type. To compute each operating point,
the relative cost of a false positive is chosen as CFP = 1 and that of a false
negative as CFN = 20. In addition, the probability of a positive outcome,
P(D+) = 0.03, is computed as the mean percentage of abnormal pixels in all
training mammograms.
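To make the role of the costs concrete, the sketch below works through the quoted numbers under an assumption: that the operating point is chosen with the standard cost-based slope criterion of ROC analysis, slope = (CFP / CFN) × (1 − P(D+)) / P(D+), with zero cost for correct decisions. The chapter cites [40] for the operating point but does not spell out the formula in this passage, so both the formula and the helper names are assumptions of this illustration, and the ROC points are invented.

```python
def operating_slope(c_fp, c_fn, p_pos):
    """Slope of the ROC curve at the minimum-expected-cost operating point.

    Assumes correct decisions cost nothing, so only c_fp and c_fn matter.
    """
    return (c_fp / c_fn) * (1.0 - p_pos) / p_pos

def pick_operating_point(roc_points, slope):
    """Choose the (fpr, tpr) pair that maximizes tpr - slope * fpr."""
    return max(roc_points, key=lambda pt: pt[1] - slope * pt[0])

slope = operating_slope(c_fp=1, c_fn=20, p_pos=0.03)
print(round(slope, 4))

# Hypothetical empirical ROC points (false-positive rate, true-positive rate).
roc = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.85), (1.0, 1.0)]
print(pick_operating_point(roc, slope))
```

With CFP = 1, CFN = 20 and P(D+) = 0.03 the slope is about 1.62, so the low prevalence of abnormal pixels largely offsets the heavy penalty on false negatives.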
In this section, the adaptive knowledge-based model is evaluated using the same
configuration as described in the previous section and using exactly the same
strategy for determining the segmentation operating point from an independent
training set. The dataset is extended to include 200 normal images from four
different breast types, 50 normal images drawn from each. The use of normal
mammograms will demonstrate the specificity levels of the adaptive
knowledge-based model. Table 11.17 shows the frequency of predicted breast
groupings for normal and abnormal classes following breast type classification.
The adaptive knowledge-based model is evaluated in its ability to provide an
optimal segmentation for all the normal and abnormal mammograms.
1       53    54
2       20    20
3       28    36
4       99    90
Total   200   200
Using overlap analysis, both sensitivity and the average number of false
positives per image can be determined for each predicted breast group. The
results from overlap analysis are shown in Table 11.18. From this table, it can
be seen that the average number of false positives over all breast types has
risen slightly with the inclusion of the 200 normal mammograms compared with the
results presented in Table 11.16. The aim of the false-positive reduction
knowledge-based component described is to reduce the false-positive count, while
maintaining sensitivity in the detection of lesions. The next section describes
how this is achieved in this configuration of the adaptive knowledge-based model.
Breast type   Sensitivity   FP/i
1             0.79          207.26
2             0.80          162.68
3             0.96          161.86
4             0.80          136.45
Mean          0.84          167.01
False positives are initially reduced by removing regions with an area less than
a predefined threshold T_area. We choose T_area = 122 pixels, so that any region
less than 5 mm in diameter is removed. From the remaining suspicious regions,
features are extracted, and a trained ANN classifier labels each region as
abnormal or normal.
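The area test can be sketched as a connected-component filter. The following Python example is a minimal illustration that assumes a binary mask stored as nested lists and 4-connectivity between pixels; neither the data layout nor the connectivity is specified in the text, and the function name `remove_small_regions` is hypothetical.

```python
from collections import deque

def remove_small_regions(mask, t_area):
    """Return a copy of a binary mask keeping only regions of at least t_area pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search collects one 4-connected region.
            region = []
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and mask[rr][cc] and not seen[rr][cc]):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            # Regions with an area below the threshold are dropped.
            if len(region) >= t_area:
                for r, c in region:
                    out[r][c] = 1
    return out

# Toy mask: a 4-pixel region and an isolated pixel, filtered with a threshold of 2.
toy = [[1, 1, 0, 0],
       [1, 1, 0, 1],
       [0, 0, 0, 0]]
print(remove_small_regions(toy, 2))
```

For the setting described above, the call would use `t_area=122` on the thresholded probability image of each mammogram.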
Feature extraction: For those regions that remain following the application
of the area test, the 316-dimensional feature vector described in [42] is
extracted using the pixels comprising the region. To improve classifier
generalization [32], unbiased PCA is used to map the 316-dimensional feature
vector into a lower dimensional feature space. PCA is used on a per breast type
basis, so that the number of principal components is selected independently for
each breast type. Using this approach, the number of principal components
selected for each predicted breast type is as follows: type 1, 37 components;
type 2, 33 components; type 3, 35 components; type 4, 41 components. From these
values it can be seen that the highest dimensional feature space results from
the densest breast type (type 4), which is generally the hardest to interpret by
an expert radiologist [37].
Model order selection: In order to maximize the performance of each ANN
for each predicted breast type, model order selection of the ANN classifier is
performed. By varying the number of hidden nodes and performing a
classification on all suspicious regions, ROC analysis can be performed and the
area under the ROC curve (A_Z) computed. The optimal number of hidden nodes is
determined as that maximising the A_Z value.
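The selection step can be illustrated with a short Python sketch. It estimates the area under the ROC curve with the rank-based (Mann-Whitney) statistic, which is one common empirical estimator and is an assumption here, since the chapter does not say how A_Z is computed. The function names and all numerical values are hypothetical.

```python
def auc(scores_abnormal, scores_normal):
    """Empirical area under the ROC curve from classifier scores.

    Counts the fraction of (abnormal, normal) pairs ranked correctly,
    with ties counted as half.
    """
    wins = 0.0
    for a in scores_abnormal:
        for n in scores_normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(scores_abnormal) * len(scores_normal))

def best_hidden_nodes(az_by_nodes):
    """Return the hidden-node count whose model achieved the largest A_Z."""
    return max(az_by_nodes, key=az_by_nodes.get)

# Hypothetical classifier outputs for abnormal and normal regions.
print(auc([0.9, 0.8, 0.4], [0.5, 0.3]))

# Hypothetical A_Z values measured for three candidate network sizes.
print(best_hidden_nodes({5: 0.81, 10: 0.86, 15: 0.84}))
```

In practice each A_Z value would come from the cross-validated ROC analysis described above, repeated once per candidate number of hidden nodes and per breast type.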
11.6.2.4 Results
This section presents the results from applying the false-positive reduction
methodology on the suspicious regions resulting from the knowledge-based
Table 11.19: Sensitivity and average number of false-positives per image for
200 abnormal and 200 normal images
Values after segmentation, after region prefiltering, and after false-positive reduction using optimized
classifier at ROC operating point, each by breast type (FP/i = average number of false positives per image,
OP = operating point).
11.7 Conclusions
Questions
2. What are the two main areas used in this chapter when it comes to X-ray
breast imaging?
3. What are the different measures used for X-ray breast “contrast enhance-
ment”? Discuss each of them.
4. Show how the knowledge-based system works for the contrast enhance-
ment.
Bibliography
[3] Li, L., Qian, W., and Clarke, L. P., Digital mammography: Computer
assisted diagnosis method for mass detection with multiorientation
and multiresolution wavelet transforms, Acad. Radiol., Vol. 4, No. 11,
pp. 724–731, 1997.
[9] Zheng, B., Chang, Y., and Gur, D., Adaptive computer-aided diagnosis
scheme of digitised mammograms, Acad. Radiol., vol. 3, pp. 806–814,
1996.
[11] Lai, S. and Fang, M., Adaptive medical image visualisation based on
hierarchical neural networks and intelligent decision fusion, In: Pro-
ceedings of IEEE Signal Processing Society Workshop, pp. 438–447,
1998.
[12] Lai, S. and Fang, M., A hierarchical neural network algorithm for robust
and automatic windowing of MR images, Artif. Intell. Med., Vol. 19,
pp. 97–119, 2000.
[13] Pitiot, A., Toga, A. W., Ayache, N., and Thompson, P., Texture based MRI
segmentation with a two stage hybrid neural classifier, In: Proceedings
of IEEE IJCNN Conference, 2002, Vol. 3, pp. 2053–2058.
[16] Perner, P., An architecture for a CBR image segmentation system, Eng.
Appl. Artif. Intell., Vol. 12, No. 6, pp. 749–759, 1999.
[17] Guan, L., Anderson, J. A., and Sutton, J. P., A network of networks pro-
cessing model for image regularisation, IEEE Trans. Neural Networks,
Vol. 8, No. 1, pp. 169–174, 1997.
[19] Singh, S. and Bovis, K. J., Digital mammography segmentation, In: Ad-
vanced Algorithmic Approach to Medical Image Segmentation: State-
of-the-Art Application in Cardiology, Neurology, Mammography and
Pathology, Suri, J., Setarehdan, S. K., and Singh, S., eds., Springer-Verlag,
Berlin, pp. 440–540, 2001.
[22] Hair, J., Anderson, R., and Tatham, R., Multivariate data analysis, 1998.
[24] Zhang, Y., Brady, M., and Smith, S., Segmentation of brain MR images
through a hidden Markov random field model and the expectation-maximization
algorithm, IEEE Trans. Med. Imaging, Vol. 20, No. 1, pp. 45–57,
2001.
[25] Kallergi, M., Carney, G. M., and Gaviria, J., Evaluating the performance
of detection algorithms in digital mammography, Med. Phys., Vol. 26,
No. 2, pp. 267–275, 1999.
[27] Duda, R. O., Hart, P. E., and Stork, D. G., Pattern Classification, Wiley,
New York, 2001.
[28] Sonka, M., Hlavac, V., and Boyle, R., Image Processing, Analysis and
Machine Vision, PWS Publishing, 1999.
[29] Besag, J., On the statistical analysis of dirty pictures, J. Roy. Stat. Soc. B,
Vol. 48, No. 3, pp. 259–302, 1986.
[30] Dubes, R. C. and Jain, A. K., Random field models in image analysis, J.
Appl. Stat., Vol. 16, No. 2, pp. 131–163, 1989.
[34] Jacobs, R. A. et al., Adaptive mixtures of local experts, Neural Comput.,
Vol. 3, pp. 79–87, 1991.
[35] Dempster, A. P., Laird, N. M., and Rubin, D. B., Maximum likelihood
from incomplete data via the EM algorithm, J. Roy. Stat. Soc. B, Vol. 39,
pp. 1–38, 1977.
[40] Metz, C. E., Basic principles of ROC analysis, Sem. Nuclear Med., Vol.
8, No. 4, pp. 283–298, 1978.
[42] Bovis, K. J. and Singh, S., Learning the optimal contrast enhancement of
mammographic breast masses, In: Proceedings of the 6th International
Workshop on Digital Mammography, Bremen, Germany, June, 22–25,
2002, Springer, Berlin, pp. 179–181, 2002.
Chapter 12
12.1 Introduction
1 Doctoral Program in Computer Science, The Graduate Center, CUNY, 365 5th Avenue,
New York, NY
2 Department of Computer and Information Science, University of Pennsylvania, Levine
Hall, 3330 Walnut Street, Philadelphia, PA
662 Herman and Carvalho
algorithms perform the whole segmentation process without any user interven-
tion, usually by obtaining all information necessary to perform the segmentation
from prior knowledge about the class of problems to which the segmentation at
hand belongs.
Algorithms can also be classified according to how they solve the segmentation
problem. Point-based algorithms make a local decision about a point’s
membership to an object. This decision can be based solely on the point’s bright-
ness value or on the brightness values of a small neighborhood surrounding the
point. A widely used and very simple point-based segmentation algorithm is
thresholding, where a user selects one or two brightness values that are in-
terpreted as lower and/or upper values of the brightness of the object to be
segmented. Then, all pixels whose values are in the specified brightness range
are considered to be part of the object. It is easy to see that algorithms of this
type are very sensitive to noise and to inhomogeneous illumination, and are not
appropriate for segmenting textured objects.
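Thresholding as described above amounts to a single comparison per pixel. The Python sketch below is a minimal illustration; the function name `threshold_segment` and the nested-list image layout are assumptions made for the example.

```python
def threshold_segment(image, lower=None, upper=None):
    """Label a pixel 1 if its brightness lies within [lower, upper], else 0.

    Either bound may be omitted to apply a one-sided threshold.
    """
    def inside(value):
        above = lower is None or value >= lower
        below = upper is None or value <= upper
        return above and below

    return [[1 if inside(v) else 0 for v in row] for row in image]

# Toy 2 x 3 image with brightness values in 0..255.
toy = [[10, 120, 200],
       [90, 130, 250]]
print(threshold_segment(toy, lower=100, upper=210))
```

Because each pixel is judged in isolation, a single noisy pixel inside the object falls out of the range and leaves a hole, which is the sensitivity to noise noted in the text.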
Edge-based segmentation algorithms usually work in two steps by first de-
tecting edges in the image and then grouping or linking them into boundaries of
objects based on the orientation of the edges and on prior knowledge regarding
the expected shape of objects. Common edge detection procedures include the
use of gradient operators, Laplacians or the Canny edge detector [1], while edge
linking can be performed locally by searching small local pixel neighborhoods
or globally by making use of the Hough Transform [2], for example. Other edge-
based segmentation algorithms use active contour models, such as snakes [3]
or balloons [4]. Snakes are energy-minimizing splines guided by external con-
straint forces and pushed by image forces (edges) toward image features, while
balloons use image forces to stop their inflated curve models on image fea-
tures. There are also global optimization algorithms [5–7] that segment images
by minimizing various energy functions defined in terms of pixel labels and prior
knowledge.
Region-based algorithms are subdivided into region growing and split-and-
merge algorithms. Region growing algorithms, as the name suggests, start with
preselected seed points forming the initial regions that grow according to some
predefined rules until the whole image is labeled. Split-and-merge algorithms
begin by subdividing an image into arbitrary disjoint regions, and then split
and/or merge them repeatedly until some preset conditions are satisfied. The
methods of balloons [4, 8] and level sets [9, 10] can also be considered region
Simultaneous Fuzzy Segmentation of Medical Images 663
growing methods since they make use of contour models that inflate from an
initial position to segment objects in a scene.
In this work we present a multiseeded fuzzy segmentation algorithm, which
is a greedy semiautomatic region growing algorithm based on the fuzzy segmen-
tation algorithm of [11] but is capable of efficiently segmenting multiple objects
simultaneously.
If what distinguishes objects in an image are not the exact values assigned to
the pixels but rather some textural property (as is the case for images
containing random noise and/or shading), then fuzzy connectedness can be usefully
employed to achieve segmentation (see [12–17] and their references). Fuzzy
connectedness was explicitly introduced by Rosenfeld [18], but it had been
foreshadowed earlier (for example by the “Minimum Method” in [13]). Our ap-
proach is based on that advocated in [11], but is generalized to arbitrary digital
spaces [19].
A digital space is a pair (V, π ), where V is a set and π is a symmetric binary
relation on V such that V is connected under π . A picture over this digital space
is a triple (V, π, f ), where f maps V into the real numbers. Because of the
nature of the applications that we have in mind, we refer to elements of V as
spels, which is short for spatial elements [19]. In this paper we assume that π
is antireflexive (i.e., that, for all c ∈ V, (c, c) ∉ π) and we use N(c) to denote
the neighborhood of c that consists of c itself and all d ∈ V, such that (c, d) ∈ π.
If (c, d) ∈ π , we say that c and d are adjacent. The spels can be pixels of an
image (as in [11, 12, 14, 16–18, 20]), but they can also be dots in the plane (as
in [21, 22]), or any variety of other things. The theory and algorithm presented
here will be independent of the specifics of the application area. They are in
particular applicable to data clustering [23] in general, and so their range of
usefulness goes far beyond just image segmentation and includes such distant
areas of endeavor as psychology [13] and statistics [24].
The basic concept that we are generalizing here is that of fuzzy connect-
edness: to every ordered pair (c, d) of spels, it assigns a real number not less
than 0 and not greater than 1. This indeed is an example of a fuzzy set (as it is
normally defined in the literature [25]): the fuzzy set in question is “the set of
connected pairs” and the grade of membership of (c, d) in this set is the fuzzy
connectedness of c to d. In the approach used below, fuzzy connectedness is
defined in the following general manner.
We call a sequence of distinct spels a chain; its links are the ordered pairs of
consecutive spels in the sequence. We define the ψ-strength of a link to be the
appropriate value of a fuzzy spel affinity function ψ : V² → [0, 1], i.e., a function
that assigns a value between 0 and 1 to every pair of spels in V . For example, if
the set of spels V is a finite set of dots in the plane, we may define the strength of
the link from one dot to another as the reciprocal of the distance between them
(we need to make the unit of distance such that all distinct dots are at least one
unit from each other). A chain is formed by one or more links and the ψ-strength
of a chain is the ψ-strength of its weakest link; the ψ-strength of a chain with
only one spel in it is 1 by definition. A set U (⊆ V ) is said to be ψ-connected if,
for every pair of spels in U , there is a chain in U of positive ψ-strength from
the first to the second spel of the pair. As we will see later, for the purpose of
fuzzy segmentation of images, the strength of any link of one pixel to another
can often be automatically defined based on statistical properties of the links
within regions identified by the user as belonging to the object of interest.
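The definitions of link and chain strength can be illustrated with the dots-in-the-plane example from the text. In the following Python sketch the function names are hypothetical, and the dots are assumed to be coordinate pairs that lie at least one unit apart, so that the reciprocal of the distance stays within [0, 1].

```python
from math import dist

def affinity(p, q):
    """Strength of the link between two distinct dots: reciprocal of their distance."""
    return 1.0 / dist(p, q)

def chain_strength(chain, psi):
    """Strength of a chain: that of its weakest link, or 1 for a single spel."""
    if len(chain) == 1:
        return 1.0
    return min(psi(a, b) for a, b in zip(chain, chain[1:]))

# A chain of three dots with link strengths 1/2 and 1/4.
print(chain_strength([(0, 0), (2, 0), (2, 4)], affinity))
```

The chain in the example has strength 0.25, set by its longer second link.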
We associate with the fuzzy spel affinity function ψ a fuzzy connectedness
function µψ : V² → [0, 1] defined by
$$\mu_\psi(c, d) = \max_{\langle c^{(0)}, \ldots, c^{(K)} \rangle,\; c^{(0)} = c,\; c^{(K)} = d} \;\; \min_{1 \le k \le K} \psi\left(c^{(k-1)}, c^{(k)}\right),$$
i.e., the ψ-strength of the strongest chain from c to d. We then define the
ψ-connectedness map f of a set V for a seed spel o by the fuzzy connectedness
values of o to c (f(c) = µψ(o, c)), for all c ∈ V. A hard object C is then
defined based on the ψ-connectedness map by selecting a threshold t and
associating with C all spels c for which f(c) is at least the threshold, i.e.,
C = {c | c ∈ V, f(c) ≥ t}.
The algorithm proposed in [11] for obtaining a fuzzy connectedness map uses
the concept of dynamic programming and has the characteristic that a single
spel can be put into a spel queue O (that holds the spels waiting to be considered
in the search for optimal chains) many times. This seemed to us an unnecessary
inefficiency. In [12] we investigated the use of so-called greedy algorithms [26]
for computing the fuzzy connectedness map. We observed that if we treat the set
V as a connected graph and we consider the cost of the arc (c, d) to be 1 − ψ(c, d),
some of the graph algorithms for finding shortest paths could be applied to this
problem. We showed that both Dijkstra’s and Prim’s algorithms can be used for
computing the fuzzy connectedness map of an image faster than the previously
used dynamic programming algorithm. In the experiments reported in [12] we
achieved an average speedup of 8.2 times (over the algorithm of [11]) when using
Dijkstra’s or Prim’s Algorithms for computing the connectedness maps for a set
of images with the same size as the image shown in Fig. 12.1 (|V | = 10,621).
To obtain a version of Dijkstra’s algorithm for computing the fuzzy connect-
edness map we only have to make two changes to the algorithm of [11]. First,
we make O a set instead of a queue, and second, when we remove a spel from
O , we remove the spel d for which f (d) is maximal (greedy step). If a spel c is
already in O it is not reinserted since O is now a set. This is the reason why this
greedy algorithm is more efficient than the dynamic programming algorithm,
in which a spel may be inserted into O many times. In order to implement effi-
ciently the removal of d with the maximal f (d), we make use of a priority queue,
in our case a binary heap, that maintains a partial ordering of the elements in
O [26].
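The greedy computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of [12]: the dictionary `links`, the function names, and the three-spel example are hypothetical, and because Python's `heapq` offers no key-update operation the sketch pushes a fresh entry and skips stale ones instead of keeping O as a true set.

```python
import heapq

def connectedness_map(links, seed):
    """Strength of the strongest chain from `seed` to every spel.

    `links[c][d]` is the affinity psi(c, d) in [0, 1]; missing entries mean 0.
    """
    f = {c: 0.0 for c in links}
    f[seed] = 1.0                      # a one-spel chain has strength 1
    heap = [(-1.0, seed)]              # heapq is a min-heap, so keys are negated
    done = set()
    while heap:
        _, c = heapq.heappop(heap)     # greedy step: spel with maximal f
        if c in done:
            continue                   # stale entry left by lazy deletion
        done.add(c)
        for d, psi in links[c].items():
            strength = min(f[c], psi)  # weakest link of the extended chain
            if d not in done and strength > f[d]:
                f[d] = strength
                heapq.heappush(heap, (-strength, d))
    return f

def hard_object(f, t):
    """Spels whose connectedness to the seed is at least the threshold t."""
    return {c for c, value in f.items() if value >= t}

# Hypothetical three-spel example (not taken from the chapter).
links = {
    "a": {"b": 0.9, "c": 0.2},
    "b": {"a": 0.9, "c": 0.5},
    "c": {"a": 0.2, "b": 0.5},
}
f = connectedness_map(links, "a")
```

In this example the direct link from "a" to "c" has strength 0.2, but the chain through "b" has strength 0.5, so the map assigns 0.5 to "c".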
In order to apply the algorithms mentioned above to image segmentation,
we have to define the fuzzy spel affinity ψ. Usually this is done by a computer
program, based on some minimal information supplied by a user [11,12,19]. The
underlying idea is that, even though the user most likely will not be able to define
mathematically the characteristics of the object of interest, it is quite easy for
him/her to select a spel belonging to it. The program will then compute some
statistics based on the neighborhood of the selected spel and use these statistics
to compute the fuzzy spel affinity ψ. We now describe a sample methodology we
have been using to achieve this.
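For a picture (V, π, I) and selected spel o, we define ψ by
\[
\psi(c, d) =
\begin{cases}
\dfrac{g_1\bigl(I(c) + I(d)\bigr) + g_2\bigl(|I(c) - I(d)|\bigr)}{2} & \text{if } (c, d) \in \pi, \\[1ex]
0 & \text{otherwise,}
\end{cases}
\tag{12.2}
\]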
The values for mi and σi are computed using the spels in the neighborhood
of o: m1 and σ1 are defined as the mean and standard deviation, respectively, of
I(c) + I(d) over all adjacent spels c and d in N(o) and m2 and σ2 are defined to be
the mean and standard deviation, respectively, of |I(c) − I(d)| over all adjacent
spels c and d in N(o). This means that for any pair (c, d) of adjacent spels, their
fuzzy spel affinity will be large if both I(c) + I(d) and |I(c) − I(d)| have values
similar to those in the neighborhood of the selected spel. This definition reflects
the fact that in many applications both the values assigned by I to spels and the
differences between the values assigned to neighboring spels are important for
distinguishing objects in an image.
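A minimal sketch of this construction follows. It rests on assumptions that the surviving text does not confirm: g1 and g2 are taken to be Gaussian-shaped functions with peak value 1, the standard deviations are population standard deviations, and only pairs of distinct spels are counted. The names `gaussian`, `make_affinity`, and the four-pixel example are hypothetical.

```python
import math
from statistics import mean, pstdev

def gaussian(x, m, s):
    """Assumed form of g_i: a Gaussian-shaped function with peak value 1."""
    if s == 0:
        return 1.0 if x == m else 0.0
    return math.exp(-((x - m) ** 2) / (2.0 * s * s))

def make_affinity(I, adjacent, neighborhood):
    """Build psi from the statistics of adjacent pairs inside N(o).

    I: dict spel -> gray value; adjacent(c, d) -> bool;
    neighborhood: the spels of N(o).
    """
    pairs = [(c, d) for c in neighborhood for d in neighborhood
             if c != d and adjacent(c, d)]
    sums = [I[c] + I[d] for c, d in pairs]
    diffs = [abs(I[c] - I[d]) for c, d in pairs]
    m1, s1 = mean(sums), pstdev(sums)
    m2, s2 = mean(diffs), pstdev(diffs)

    def psi(c, d):
        if not adjacent(c, d):
            return 0.0
        return 0.5 * (gaussian(I[c] + I[d], m1, s1)
                      + gaussian(abs(I[c] - I[d]), m2, s2))
    return psi

# Hypothetical one-dimensional picture: three similar pixels and one outlier.
I = {0: 10, 1: 12, 2: 11, 3: 50}
adjacent = lambda c, d: abs(c - d) == 1
psi = make_affinity(I, adjacent, neighborhood=[0, 1, 2])
```

With these values, links inside the neighborhood of the seed receive a clearly positive affinity, while the link from pixel 2 to the bright pixel 3 receives an affinity that is numerically zero.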
As an example, the top-left image in Fig. 12.1 was mathematically defined on
the hexagonal grid (each element of V is a hexagon and all of them are arranged
on an enclosing hexagon with |V | = 10,621), and the user was asked to select
a spel that is located inside the object to be segmented. The object in question
is the rectangular region in the upper half of the image with slowly increasing
brightness from left to right. In this example, two hexagons are considered to be
adjacent if, and only if, they share an edge (thus the neighborhood of any interior
hexagon consists of seven hexagons), and I(c) is the gray value assigned to the
hexagon c.
The image on the top-right of Fig. 12.1 is the result of thresholding the original
image at some level. Note that because of the brightness variation inside the
object that we wish to segment (the horizontal stripe near the top of the image)
there is no threshold level that can successfully segment it from the background.
When using the fuzzy segmentation algorithm, the user chose a point belonging to
the object (the brightest point in the lower-left image) that is used to identify the
neighborhood over which information is collected regarding the characteristics
of the object, to be used in Eqs. (12.2) and (12.3). The resulting fuzzy spel affinity
ψ is then used to produce the connectedness map f shown in the lower-left
image (note that (V, π, f ) is also a picture over the digital space (V, π )), which
is then thresholded to produce the successful final segmentation shown in the
lower-right image (the hexagons belonging to the resulting hard object are shown
white).
As in the method presented in the last section, we rely on the user of our
method to identify seed spels that definitely belong to the various objects into
which we desire to segment the images, and we suggest (as other advocates
of segmentation based on fuzzy connectedness have done before us) that the
user-selected seed spels can be used for automatic calculation of the definitions
of the strengths of links in each one of the objects. Since our choice implies
that the output of our algorithm is user-dependent, we report on experiments
(in which five users segmented five images, each five times) that validate the
accuracy and robustness of our approach.
12.3.1 Theory
For a positive integer M, an M-semisegmentation of a set V of spels is a
function σ that maps each c ∈ V into an (M + 1)-dimensional vector
$\sigma^c = (\sigma_0^c, \sigma_1^c, \ldots, \sigma_M^c)$, such that
then for 1 ≤ m ≤ M
\[
\sigma_m^c =
\begin{cases}
s_m^c & \text{if } s_m^c \ge s_n^c \text{ for } 1 \le n \le M, \\
0 & \text{otherwise,}
\end{cases}
\tag{12.5}
\]
“claim” that c belongs to it if, and only if, $s_m^c$ is maximal and is greater than 0.
This is indeed how things get sorted out in Eq. (12.5): $\sigma_m^c$ has a positive value
only for such objects. Furthermore, this is a localized property in the following
sense: for a fixed spel c we can work out the values of the $s_n^c$ using Eq. (12.4)
and what we request is that, at that spel c, Eq. (12.5) be satisfied. What
Theorem 1.1 says is that there is one, and only one, M-semisegmentation that satisfies
this property simultaneously everywhere, and that this M-semisegmentation
is in fact an M-segmentation, provided that the seeded M-fuzzy graph is
connectable.
We now illustrate Theorem 1.1 for the seeded two-fuzzy graph $(V, \Psi, \mathcal{V})$
defined by $V = \{(-1), (0), (1)\}$ and $\Psi = (\psi_1, \psi_2)$, where $\psi_1$ and $\psi_2$ are the
reflexive and symmetric fuzzy spel affinity functions (i.e., $\psi_m(c, c) = 1$ and
$\psi_m(c, d) = \psi_m(d, c)$ for $1 \le m \le 2$ and $c, d \in V$) defined by the additional conditions
$\psi_1((-1), (0)) = 0.5$, $\psi_1((0), (1)) = 0.25$, $\psi_1((-1), (1)) = 0$, $\psi_2((-1), (0)) = 0.5$,
$\psi_2((0), (1)) = 0.5$, $\psi_2((-1), (1)) = 0$, and $\mathcal{V} = (\{(0)\}, \{(-1)\})$. The
two-segmentation σ of V that satisfies Theorem 1.1 is given by $\sigma^{(-1)} = (1, 0, 1)$,
$\sigma^{(0)} = (1, 1, 0)$, and $\sigma^{(1)} = (0.25, 0.25, 0)$. To test this suppose, for example, that
we have been informed that $\sigma^{(-1)} = (1, 0, 1)$ and $\sigma^{(0)} = (1, 1, 0)$ and we wish
to use Theorem 1.1 to determine $\sigma^{(1)}$. We find that $s_1^{(1)} = 0.25$ (obtained by
the choice d = (0)) and $s_2^{(1)} = 0$ (if we choose in Eq. (12.4) d to be (−1), then
$\psi_2((-1), (1)) = 0$; if we choose it to be (0), then $\mu_{\sigma,2,\{(-1)\}}((0)) = 0$ since there is no
σ2-chain containing (0), because $\sigma_2^{(0)} = 0$). Hence Eq. (12.5) tells us that indeed
$\sigma^{(1)} = (0.25, 0.25, 0)$. Note that there is a chain ⟨(−1), (0), (1)⟩ of ψ2-strength 0.5
from the seed spel (−1) of Object 2 to (1), while the maximal ψ1-strength of
any chain from the seed spel (0) of Object 1 to (1) is only 0.25; nevertheless, (1)
is assigned to Object 1 by Theorem 1.1, since the fact that (0) is a seed spel of
Object 1 prevents it (for the given Ψ) from being also in Object 2, and so the
chain ⟨(−1), (0), (1)⟩ is “blocked” from being a σ2-chain.
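The two chain strengths quoted in this example can be checked with a short sketch. The helper names `symmetric_affinity` and `chain_strength` are hypothetical, and the code only evaluates the weakest link of a given chain; it does not decide whether the chain is a σ-chain, which is exactly the "blocking" issue discussed above.

```python
def symmetric_affinity(values):
    """Reflexive, symmetric affinity built from a table of unordered pairs."""
    def psi(c, d):
        if c == d:
            return 1.0
        return values.get((c, d), values.get((d, c), 0.0))
    return psi

def chain_strength(psi, chain):
    """Strength of a chain: its weakest link, or 1 for a one-spel chain."""
    return min((psi(c, d) for c, d in zip(chain, chain[1:])), default=1.0)

# Affinities of the three-spel example; spels are written as plain integers.
psi1 = symmetric_affinity({(-1, 0): 0.5, (0, 1): 0.25, (-1, 1): 0.0})
psi2 = symmetric_affinity({(-1, 0): 0.5, (0, 1): 0.5, (-1, 1): 0.0})

strength_object2 = chain_strength(psi2, [-1, 0, 1])  # from the seed of Object 2
strength_object1 = chain_strength(psi1, [0, 1])      # from the seed of Object 1
```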
An intuitive picture of the inductive definition of the M-semisegmentation of
Theorem 1.1 is given by the following description of a military exercise. There
are M armies (one corresponding to each object) competing for the control of N
castles that are connected by one-way roads between them. Initially all armies
have full strength (defined to be 1) and they occupy their respective castles (seed
spels). All armies try to increase their respective territories, but moving from
one castle to another reduces the strength of the soldiers to the minimum of
their strength at the previous castle and the affinity (for that army or object) of
the road connecting the castles. The affinities of the roads for the various armies
are fixed for the duration of the exercise.
All through the exercise each castle will also have a strength assigned
to it, which is a real value in the range [0, 1]. The strength of a castle may increase
as the exercise proceeds. Also, at any time, each castle may be occupied by one
or more of the armies.
The objective of the exercise is to see how the final configuration of occupied
castles depends on the initial castle assignment to the various armies. Since we
are describing an algorithm here, the individual armies have to follow fixed rules
of engagement, which are the following.
The exercise starts by distributing the soldiers of the armies into some of
the castles and assigning to those castles that have soldiers in them (and to the
soldiers themselves) the strength 1 and to all other castles the strength 0. We
say that this distribution of armies and strengths describes the situation at the
start of Iteration 1. At any given time, a castle will be occupied by the soldiers
of the armies that were not weaker than any other soldiers who reached that
castle by that time.
The exercise proceeds in discrete iterative steps and the total number of
iterations (NI) is determined by the number of distinct affinity values for all armies
and roads. These values are put into a strictly decreasing order and the strength
of the iteration i (IS(i)) is defined as the ith number of this sequence. The
following gets done during Iteration i. Those soldiers (and only those soldiers) that
occupy a castle of strength IS(i) will try to increase the territory occupied by
their army. They will send units from their castle toward all the other castles.
When these units arrive at another castle, their strength will be defined as the
minimum of IS(i) and the affinity for their army of the road from the originally
occupied castle to the new one. If the strength of the new castle is greater than
the strength of any of the armies arriving at it, the strength and occupancy of the
castle will not change. If no arriving army has greater strength than the strength
of the new castle, then the strength of the new castle does not change, but it
will get occupied also by those arriving armies whose strength matches that of
the castle (but not by any of the others). If some of the arriving armies have
greater strength than the strength of the castle, then the castle will be taken
over by those (and only those) arriving armies that have the greatest strength,
and the strength of the castle is set to the strength of the new occupiers. This
describes what happens in one iterative step except for one detail: if an army
gets to occupy a new castle because its strength is IS(i) (this can only happen
if the affinity for this army of the road to this castle is at least IS(i)), then that
army is allowed to send out units from this new castle as well. (This cannot lead
to an infinite loop, since there are only finitely many castles and so it can only
happen finitely many times that an army gets to occupy a new castle because its
strength is IS(i).)
The exercise stops at the end of Iteration NI. The output of the exercise
provides, for each castle, the strength of the castle and the armies that occupy
it at the end of the exercise.
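(B) For $c \in V$, $1 \le m \le M$, and $2 \le i \le |R|$, ${}^{|R|}\sigma_m^c = {}^{i}r$ if, and only if, there is a
chain $\langle c^{(0)}, \ldots, c^{(K)} \rangle$ of $\psi_m$-strength ${}^{i}r$ such that $c^{(0)} \in {}^{(i-1)}U$, ${}^{|R|}\sigma_m^{c^{(0)}} > 0$,
$c^{(K)} = c$, and, for $1 \le k \le K$, $c^{(k)} \notin {}^{(i-1)}U$.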
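Let $c, d \in V$. We say that $(c, d)$ is consistent if, for $1 \le m \le M$, ${}^{|R|}\sigma_m^c = {}^{|R|}\sigma_0^c$
implies that one of the following is true:
\[ c = d; \tag{12.10} \]
\[ {}^{|R|}\sigma_0^d > \min\bigl({}^{|R|}\sigma_0^c, \psi_m(c, d)\bigr); \tag{12.11} \]
\[ {}^{|R|}\sigma_0^d = \min\bigl({}^{|R|}\sigma_0^c, \psi_m(c, d)\bigr) \ \text{and} \ {}^{|R|}\sigma_m^d = {}^{|R|}\sigma_0^d. \tag{12.12} \]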
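Suppose, to the contrary, that $(c, d)$ is not consistent. Then there is an $m$,
$1 \le m \le M$, such that ${}^{|R|}\sigma_m^c = {}^{|R|}\sigma_0^c$, $c \ne d$, and one of the following holds:
\[ {}^{|R|}\sigma_0^d < \min\bigl({}^{|R|}\sigma_0^c, \psi_m(c, d)\bigr); \tag{12.13} \]
\[ {}^{|R|}\sigma_0^d = \min\bigl({}^{|R|}\sigma_0^c, \psi_m(c, d)\bigr) \ \text{and} \ {}^{|R|}\sigma_m^d \ne {}^{|R|}\sigma_0^d. \tag{12.14} \]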
We may assume that ${}^{|R|}\sigma_0^c > 0$ and that $\psi_m(c, d) > 0$, for otherwise one of
Eqs. (12.11) or (12.12) clearly holds. Hence ${}^{|R|}\sigma_m^c = {}^{|R|}\sigma_0^c = {}^{i}r$, for some
$1 \le i \le |R|$. From Eqs. (12.13) and (12.14) it follows that ${}^{|R|}\sigma_0^d \le {}^{i}r$. It then follows
from Eq. (12.9) that if $i \ge 2$, then neither c nor d is in ${}^{(i-1)}U$.
If $i = 1$, then by A there is a chain of $\psi_m$-strength 1 from a seed in $V_m$
to c. If $i \ge 2$, then by B there is a chain $\langle c^{(0)}, \ldots, c^{(K)} \rangle$ of $\psi_m$-strength ${}^{i}r$
such that $c^{(0)} \in {}^{(i-1)}U$, ${}^{|R|}\sigma_m^{c^{(0)}} > 0$, $c^{(K)} = c$, and, for $1 \le k \le K$,
$c^{(k)} \notin {}^{(i-1)}U$.
In both cases, either d is already in the chain or we can extend the chain
without losing its just stated property to d, and so A or B implies that
${}^{|R|}\sigma_m^d = {}^{i}r$. It follows that if $\psi_m(c, d) \ge {}^{i}r$ Eq. (12.12) holds, a
contradiction. So assume that $\psi_m(c, d) = {}^{j}r$ for some $j > i$. Since Eq. (12.13) or
(12.14) holds, we get that $d \notin {}^{(j-1)}U$. But $c \in {}^{(j-1)}U$, and so, applying B to
the chain $\langle c, d \rangle$, we get that ${}^{|R|}\sigma_m^d = {}^{j}r$. This implies that Eq. (12.12) holds.
This final contradiction completes our proof that, for all $c, d \in V$, $(c, d)$ is
consistent.
Next we show that, for all $c \in V$ and $1 \le m \le M$,
\[ {}^{|R|}\sigma_m^c = \mu_{{}^{|R|}\sigma, m, V_m}(c). \tag{12.15} \]
To simplify the notation, we use in this proof s to abbreviate ${}^{|R|}\sigma_m^c$. Recall that
$\mu_{{}^{|R|}\sigma, m, V_m}(c)$ denotes the maximal $\psi_m$-strength of an ${}^{|R|}\sigma m$-chain from a seed in
$V_m$ to c. Note that we can assume that $s \in R$, for the alternative is that $s = 0$ in
which case there can be no ${}^{|R|}\sigma m$-chain that includes c and so the right-hand
side of Eq. (12.15) is also 0 by definition. Our proof will be in two stages: first
we show that there is an ${}^{|R|}\sigma m$-chain from a seed in $V_m$ to c of $\psi_m$-strength s and
then we show that there is no ${}^{|R|}\sigma m$-chain from a seed in $V_m$ to c of $\psi_m$-strength
greater than s.
To show the existence of an ${}^{|R|}\sigma m$-chain from a seed in $V_m$ to c of $\psi_m$-strength
s, we use an inductive argument. If $s = {}^{1}r = 1$, then the desired result is assured
by A. Now let $i > 1$ and $s = {}^{i}r$. Assume that, for $1 \le j < i$, whenever a spel d
is such that ${}^{|R|}\sigma_m^d = {}^{j}r$, then there is an ${}^{|R|}\sigma m$-chain from a seed in $V_m$ to d of
$\psi_m$-strength ${}^{j}r$.
By B there is a chain $\langle c^{(0)}, \ldots, c^{(K)} \rangle$ of $\psi_m$-strength s such that $c^{(0)} \in {}^{(i-1)}U$,
${}^{|R|}\sigma_m^{c^{(0)}} > 0$, $c^{(K)} = c$, and, for $1 \le k \le K$, $c^{(k)} \notin {}^{(i-1)}U$. We are now going to
show that $\langle c^{(0)}, \ldots, c^{(K)} \rangle$ is an ${}^{|R|}\sigma m$-chain by showing that, for $1 \le k \le K$,
${}^{|R|}\sigma_m^{c^{(k)}} = s$. Otherwise, consider the smallest $k \ge 1$ that violates this equation.
Then we have that ${}^{|R|}\sigma_m^{c^{(k-1)}} \ge s$ and ${}^{|R|}\sigma_m^{c^{(k)}} < s$ (recall that $c^{(k)} \notin {}^{(i-1)}U$ and so
${}^{|R|}\sigma_0^{c^{(k)}} \le s$). This combined with the fact that $\psi_m(c^{(k-1)}, c^{(k)}) \ge s$ violates the
consistency of $(c^{(k-1)}, c^{(k)})$. Since $c^{(0)} \in {}^{(i-1)}U$ and ${}^{|R|}\sigma_m^{c^{(0)}} > 0$, ${}^{|R|}\sigma_m^{c^{(0)}} = {}^{j}r$ for
some $1 \le j < i$ and, by the induction hypothesis, there is an ${}^{|R|}\sigma m$-chain from a
seed in $V_m$ to $c^{(0)}$ of $\psi_m$-strength ${}^{j}r > s$. Appending $\langle c^{(0)}, \ldots, c^{(K)} \rangle$ to this chain
we obtain an ${}^{|R|}\sigma m$-chain from a seed in $V_m$ to c of $\psi_m$-strength s. (Just
appending may not result in a chain, since a chain is defined as a sequence of distinct
spels. However, this is easily remedied by removing, for a repeated spel in the
sequence, all the spels between the repetitions and one of the repetitions.)
Now we show that there is no ${}^{|R|}\sigma m$-chain from a seed in $V_m$ to c of
$\psi_m$-strength greater than s. This is clearly so if $s = 1$. Suppose now that $s < 1$ and
that $\langle c^{(0)}, \ldots, c^{(K)} \rangle$ is an ${}^{|R|}\sigma m$-chain from a seed in $V_m$ of $\psi_m$-strength $t > s$. We
now show that, for $0 \le k \le K$, ${}^{|R|}\sigma_m^{c^{(k)}} \ge t$. From this it follows that $c^{(K)}$ cannot
be c and we are done. Since $c^{(0)}$ is a seed in $V_m$, ${}^{|R|}\sigma_m^{c^{(0)}} = 1 \ge t$. For $k > 0$,
induction that makes use of the consistency of $(c^{(k-1)}, c^{(k)})$ leads to the desired
result.
To show that $\sigma = {}^{|R|}\sigma$ satisfies the property stated in Theorem 1.1(i), we first
make two preliminary observations:
(A) For any $c \in V$ and $1 \le n \le M$, if $\sigma_n^c > 0$, then $s_n^c = \sigma_n^c = \sigma_0^c$. (The first
equality follows from Eqs. (12.14) and (12.15), and the second from the
definition of M-semisegmentation.)
(B) For any $c \in V$ and $1 \le n \le M$, if $\sigma_n^c = 0$ and $\sigma_0^c > 0$, then $s_n^c < \sigma_0^c$.
(Assume the contrary. It cannot be that $s_n^c$ is defined by the first line
of Eq. (12.4), for then $c \in V_n$ and by A we would have that $\sigma_n^c = 1$.
Hence $s_n^c$ is defined by the second line of Eq. (12.4) using some d
such that $\min\bigl(\mu_{\sigma,n,V_n}(d), \psi_n(d, c)\bigr) = s_n^c \ge \sigma_0^c > 0$. Hence, by Eq. (12.15)
$\sigma_n^d \ge \sigma_0^c > 0$ and so $\sigma_0^d = \sigma_n^d \ge \sigma_0^c$. Interchanging c and d in the
definition of consistency, we see that Eq. (12.10) cannot hold since $\sigma_n^d > 0$
and $\sigma_n^c = 0$, (12.11) cannot hold since $\sigma_0^c \le \sigma_0^d$, and (12.12) cannot hold
since $\sigma_n^c = 0$ and $\sigma_0^c > 0$. This contradiction with the consistency of $(d, c)$
proves B.)
12.3.2 Algorithm
In this subsection, we present an algorithm that produces the M-
semisegmentations whose existence and uniqueness are guaranteed by
Theorem 1.1. In designing the algorithm we aimed at making it efficient: as
is illustrated in the next subsection, our implementation of it allowed us to find
3-segmentations of images with over 10,000 spels in approximately a tenth of a
second.
As the algorithm proceeds, it maintains (and repeatedly changes) an M-
semisegmentation σ . The claim is that at the time when the algorithm terminates,
σ satisfies the property of Theorem 1.1(i).
The algorithm makes use of a priority queue H of spels c, with associated
keys σ0c [26]. Such a priority queue has the property that the key of the spel at
its head is maximal (its value is denoted by Maximum-Key(H), which is defined
to be 0 if H is empty). As the algorithm proceeds, each spel is inserted into H
exactly once (using the operation H ← H ∪ {c}) and is eventually removed from
H using the operation Remove-Max(H), which removes the spel c from the head
of the priority queue. At the time when a spel c is removed from H, the vector σ c
has its final value. Spels are removed from H in a nonincreasing order of the final
value of σ0c . We use the variable l to store the current value of Maximum-Key(H).
Algorithm 1 shows a detailed specification using the conventions adopted in [26].
 1. for c ∈ V
 2.     do for m ← 0 to M
 3.            do σ_m^c ← 0
 4. H ← ∅
 5. for m ← 1 to M
 6.     do U_m ← V_m
 7.        for c ∈ U_m
 8.            do if σ_0^c = 0 then H ← H ∪ {c}
 9.               σ_0^c ← σ_m^c ← 1
10. l ← 1
11. while l > 0
12.     for m ← 1 to M
13.         do while U_m ≠ ∅
14.                do remove a spel d from U_m
15.                   C ← {c ∈ V | σ_m^c < min(l, ψ_m(d, c)) and σ_0^c ≤ min(l, ψ_m(d, c))}
16.                   while C ≠ ∅
17.                       do remove a spel c from C
18.                          t ← min(l, ψ_m(d, c))
19.                          if l = t and σ_m^c < l then U_m ← U_m ∪ {c}
20.                          if σ_0^c < t then
21.                              if σ_0^c = 0 then H ← H ∪ {c}
22.                              for n ← 1 to M
23.                                  do σ_n^c ← 0
24.                          σ_0^c ← σ_m^c ← t
25.     while Maximum-Key(H) = l
26.         Remove-Max(H)
27.     l ← Maximum-Key(H)
28.     for m ← 1 to M
29.         U_m ← {c ∈ H | σ_m^c = l}

The process is initialized (Steps 1–10) by first setting σ_m^c to 0, for each
spel c and 0 ≤ m ≤ M. Then, for every seed spel c ∈ V_m, c is put into U_m and H and
both σ_0^c and σ_m^c are set to 1. Following this, l is also set to 1. At the end of the
initialization, the following conditions are satisfied.
(i) σ is an M-semisegmentation of V .
(iii) l = Maximum-Key(H).
The initialization is followed by the main loop of the algorithm. At the beginning
of each execution of this loop, conditions (i) to (iv) above are satisfied. The
main loop is repeatedly performed for decreasing values of l until l becomes
0, at which time the algorithm terminates (Step 11). There are two parts to the
main loop, each of which has a very different function.
The first part of the main loop (Steps 12–24) is the essential part of the
algorithm. It is here that we update our best guess so far of the final values
of the $\sigma_m^c$. A current value is replaced by a larger one if it is found that there is a
σm-chain from a seed spel in Vm to c of ψm-strength greater than the old value
(the previously maximal ψm-strength of the known σm-chains of this kind) and
it is replaced by 0 if it is found that (for an n ≠ m) there is a σn-chain from a
seed spel in Vn to c of ψn-strength greater than the old value of $\sigma_m^c$.
The purpose of the second part of the main loop (Steps 25–29) is to restore
the satisfaction of conditions (iii) and (iv) above for a new (smaller) value of l.
To help with the understanding of why this algorithm performs as desired,
we comment that just prior to entering its main loop (Steps 11–29), there are
four kinds of spels. There are those spels d that have previously been put into
and have subsequently been removed from H; for these spels not only does the
vector σ d have its final value, but also we have already put into H (and possibly
even have already removed from H) every spel c such that ψm(d, c) > 0, for at
least one m. (For spels of this first kind, σ0d > l.) Secondly, there are the spels d
that are in at least one of the Um; for these spels the vector σ d has its final value,
but we may not have yet put into H every spel c such that ψm(d, c) > 0 for at
least one m. (For spels of this second kind, σ0d = σmd = l.) This will get done in
the next execution of Steps 13–21, while Steps 22–24 will insure that the σ c get
updated appropriately. Consequently, the spels c which are in H but not in any
of the Um are those for which there is, for some 1 ≤ m ≤ M, a σ m-chain (for the
current σ ) from a seed spel in Vm to c; for the rest of the spels (those which
have not as yet been put into H) there is no m for which there is a σ m-chain (for
the current σ ) from a seed spel in Vm to c. (For spels c of these third and fourth
kinds, 0 < σ0c < l and σ0c = 0, respectively.)
One tricky aspect of the algorithm is that a spel of the third kind may become
a spel of the second kind and a spel of the fourth kind may become a spel of
the third (or even of the second) kind during the execution of the main loop.
That the description of the four kinds of spels remains as given in the previous
paragraph is insured by Steps 19 and 21. (Step 21 also insures that condition
(ii) stated above remains satisfied. To see this, observe that Step 15 guarantees
that if c is put into C, then 0 < min(l, ψm(d, c)) and consequently the t defined
in Step 18 and used in Step 24 is also positive. That condition (i) stated above
remains satisfied is obvious from Steps 20–24.)
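The steps of Algorithm 1 can be followed in a compact Python sketch. It is written under simplifying assumptions and is not the authors' implementation: the priority queue H is emulated with `heapq` plus a membership set and lazy deletion of stale entries, Step 15 scans all of V instead of a precomputed list of neighbors, and Step 29 scans H. The function name and argument layout are hypothetical.

```python
import heapq

def multi_seeded_segmentation(spels, psis, seeds):
    """Return {c: [sigma_0, sigma_1, ..., sigma_M]} following Algorithm 1.

    spels: list of orderable spels; psis: list of M functions psi_m(d, c);
    seeds: list of M seed sets V_m.
    """
    M = len(psis)
    sigma = {c: [0.0] * (M + 1) for c in spels}          # Steps 1-3
    in_H, heap = set(), []                               # Step 4
    U = [set() for _ in range(M + 1)]                    # U[m] for m = 1..M

    def maximum_key():
        # Discard entries that are stale (removed spel or outdated key).
        while heap:
            neg_key, c = heap[0]
            if c in in_H and -neg_key == sigma[c][0]:
                return -neg_key
            heapq.heappop(heap)
        return 0.0

    for m in range(1, M + 1):                            # Steps 5-9
        U[m] = set(seeds[m - 1])
        for c in U[m]:
            if sigma[c][0] == 0:
                in_H.add(c)
                heapq.heappush(heap, (-1.0, c))
            sigma[c][0] = sigma[c][m] = 1.0
    l = 1.0                                              # Step 10
    while l > 0:                                         # Step 11
        for m in range(1, M + 1):                        # Step 12
            while U[m]:                                  # Step 13
                d = U[m].pop()                           # Step 14
                for c in spels:                          # Steps 15-17
                    t = min(l, psis[m - 1](d, c))        # Step 18
                    if not (sigma[c][m] < t and sigma[c][0] <= t):
                        continue                         # c is not in C
                    if l == t:                           # Step 19
                        U[m].add(c)
                    if sigma[c][0] < t:                  # Step 20
                        if sigma[c][0] == 0:             # Step 21
                            in_H.add(c)
                        for n in range(1, M + 1):        # Steps 22-23
                            sigma[c][n] = 0.0
                        heapq.heappush(heap, (-t, c))    # new key for c
                    sigma[c][0] = sigma[c][m] = t        # Step 24
        while maximum_key() == l:                        # Steps 25-26
            _, c = heapq.heappop(heap)
            in_H.discard(c)
        l = maximum_key()                                # Step 27
        for m in range(1, M + 1):                        # Steps 28-29
            U[m] = {c for c in in_H if sigma[c][m] == l}
    return sigma

def _sym(values):
    """Reflexive, symmetric affinity from a table of unordered pairs."""
    return lambda c, d: 1.0 if c == d else values.get(
        (c, d), values.get((d, c), 0.0))

# The three-spel example of Section 12.3.1, with spels written as integers.
psi1 = _sym({(-1, 0): 0.5, (0, 1): 0.25, (-1, 1): 0.0})
psi2 = _sym({(-1, 0): 0.5, (0, 1): 0.5, (-1, 1): 0.0})
result = multi_seeded_segmentation([-1, 0, 1], [psi1, psi2], [{0}, {-1}])
```

Run on the three-spel example of Section 12.3.1, the sketch reproduces the two-segmentation stated there: spel (1) ends up in Object 1 with grade 0.25 even though a stronger ψ2-chain reaches it through the seed of Object 1.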
We complete this subsection with a brief discussion of our implementation
of Algorithm 1. As suggested in [26], we use a heap to implement the priority
queue H. This provides us with efficient implementations of the operations of
insertion into (H ← H ∪ {c}) and removal from (Remove-Max(H)) the priority
queue, as well as of Step 29. In applications it is typically the case that, for every
spel d, there is a fixed number of spels c such that $\sum_{m=1}^{M} \psi_m(d, c) > 0$ and a list of
all such c is inexpensive to produce. In such a case the cost of executing Step 15
becomes proportional to a constant (four, six, or twelve in the examples shown
below and in Section 12.4) independent of the size of V . Using L to denote this
constant, the computational complexity of Algorithm 1 is the following: since
each spel can belong to multiple objects there can be at most M|V| executions
of the loop 13–24, while the loop 16–24 can be executed at most L times within
each of them. Steps 19 and 24 have O(log |V|) operations while Steps 22–23 have
O(M) operations, so the loop 16–24 has O(M log |V|) operations. Since this loop
can be executed at most ML|V| times, the time complexity of Algorithm 1 is
O(M²L|V| log |V|).
12.3.3 Experiments
Now we demonstrate the usage of Algorithm 1 on mathematically-defined as well
as on real images. As in the example shown in Section 12.2, the appropriate
fuzzy spel affinities were automatically defined by a computer program,
based on some minimal information supplied by a user. However, this is not
the only option: for example, if sufficient prior knowledge about the class of
Figure 12.3: A mathematically defined image (top left) including both back-
ground variation and noise, and the corresponding 3-segmentation (top right
and bottom row).
Figure 12.4: A mathematically defined image (top left) including both back-
ground variation and noise, and the corresponding 3-segmentation (top right
and bottom row).
Figure 12.5: A mathematically defined image (top left) including both back-
ground variation and noise, and the corresponding 3-segmentation (top right
and bottom row).
Figure 12.6: A mathematically defined image (top left) including both back-
ground variation and noise, and the corresponding 3-segmentation (top right
and bottom row).
Figure 12.7: A mathematically defined image (top left) including both back-
ground variation and noise, and the corresponding 3-segmentation (top right
and bottom row).
Figure 12.8: Two images obtained using magnetic resonance imaging (MRI) of
heads of patients (left) and the corresponding 3-segmentations (right). (Color
slide.)
to be segmented, in the three objects being pairwise disjoint as well.) The right
column of Fig. 12.8 shows the resulting maps of the σm by assigning the color
(r, g, b) = 255 × (σ1c , σ2c , σ3c ) to the spel c. Note that not only the hue, but also
the brightness of the color is important: the less brightly red areas for the last
two images correspond to the ventricular cavities in the brain, correctly reflect-
ing a low grade of membership of these spels in the object that is defined by
seed spels in brain tissue. The seed sets Vm consist of the brightest spels. The
times taken to calculate these 3-segmentations using our algorithm on a 1.7 GHz
Intel® Xeon™ personal computer were between 90 and 100 ms for each of the
seven images (average = 95.71 ms). Since these images contain 10,621 spels, the
execution time is less than 10 µs per spel. The same was true for all the other
2-D image segmentations that we tried, some of which are reported in what
follows.
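The color coding and the per-spel timing mentioned above amount to simple arithmetic, sketched below. The function name is hypothetical and the rounding to 8-bit integers is an assumption; the text only states the scaling by 255.

```python
def membership_to_rgb(grades):
    """Map the grades (sigma_1, sigma_2, sigma_3) in [0, 1] to 8-bit RGB."""
    return tuple(int(round(255 * g)) for g in grades)

full_red = membership_to_rgb((1.0, 0.0, 0.0))   # full membership in Object 1
dim_red = membership_to_rgb((0.25, 0.0, 0.0))   # low grade in the same object

# Average reported time divided by the number of spels, in microseconds.
microseconds_per_spel = 95.71e-3 / 10621 * 1e6
```

A low grade of membership therefore shows up as a darker shade of the object's color, and the reported average of 95.71 ms for 10,621 spels corresponds to roughly 9 µs per spel.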
To show the generality of our algorithm and to permit comparisons with
other algorithms, we also applied it to a selection of real images that appeared
in the recent image segmentation literature. Since in all these images V consists
of squares inside a rectangular region, the π of Eq. (12.2) is selected to
Figure 12.9: An SAR image of trees and grass (left) and its 2-segmentation
(center and right).
Figure 12.11: An indoor image of a living room (top left) and its 5-segmentation.
living room image. The 6-segmentation of the room shown in Fig. 12.12 includes
a new object corresponding to the base and arm of one of the sofas.
It is stated in [6] that the times needed for the segmentations reported in
that paper “are in the range of less than five seconds” (on a Sun UltraSparc™).
Our CPU time to obtain the segmentations shown in Figs. 12.10–12.12 is
window are assigned to the same object. As opposed to this, in our segmen-
tations any pixel can be assigned to any object. Another way of putting this is
that we could also make our spels to be the 8 × 8 windows of [6] and thereby
reduce the size of V to a 64th of what it is currently. This should result in
a two-orders-of-magnitude speedup in the performance of our segmentation
algorithm (at the cost of a loss of resolution in the segmentations to the level used
in [6]).
are somewhat ad hoc and so the fact that they yield similar results indicates that
the reported figures of merit are not over-sensitive to the precise nature of the
definition of accuracy. The slightly larger mean for the membership accuracy is
due to the fact that misclassified spels tend to have smaller than average grade
of membership values.
The average error (defined as “100 less point accuracy”) over all segmenta-
tions is less than 3%, comparing quite favorably with the state of the art: in [6]
the authors report that a “mean segmentation error rate as low as 6.0 percent
was obtained.”
The robustness of our procedure was defined based on the similarity of two
segmentations. The point similarity of two segmentations is defined as the
number of spels which are assigned to the same object in the two segmenta-
tions divided by the total number of spels multiplied by 100. The membership
similarity of two segmentations is defined as the sum of the grades of member-
ships (in both segmentations) of all the spels which are assigned to the same
object in the two segmentations divided by the total sum of the grades of mem-
bership (in both segmentations) of all the spels multiplied by 100. (For both
these measures of similarity, identical segmentations will be given the value 100
and two segmentations in which every spel is assigned to a different object will
be given the value 0.)
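The two similarity measures can be written down directly from these definitions. The sketch below assumes, for illustration only, that each segmentation is stored as a dictionary mapping a spel to a pair (object label, grade of membership) and that every spel is assigned to exactly one object; the function names and the four-spel example are hypothetical.

```python
def point_similarity(seg_a, seg_b):
    """Percentage of spels assigned to the same object in both segmentations."""
    same = sum(1 for c in seg_a if seg_a[c][0] == seg_b[c][0])
    return 100.0 * same / len(seg_a)

def membership_similarity(seg_a, seg_b):
    """Like point similarity, but each spel is weighted by its grades."""
    agree = sum(seg_a[c][1] + seg_b[c][1]
                for c in seg_a if seg_a[c][0] == seg_b[c][0])
    total = sum(seg_a[c][1] + seg_b[c][1] for c in seg_a)
    return 100.0 * agree / total

# Hypothetical segmentations of four spels; they disagree on spel 3 only.
seg_a = {0: (1, 1.0), 1: (1, 0.8), 2: (2, 0.5), 3: (2, 0.2)}
seg_b = {0: (1, 1.0), 1: (1, 0.6), 2: (2, 0.5), 3: (1, 0.2)}
p = point_similarity(seg_a, seg_b)
q = membership_similarity(seg_a, seg_b)
```

Here the point similarity is 75, while the membership similarity is higher (about 91.7) because the only disputed spel has a low grade of membership in both segmentations.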
Since each user segmented each image five times, there are 10 possible
ways of pairing these segmentations, so we had 50 pairs of segmentations per
user and a total of 250 pairs of segmentations. Because the results for point
and membership similarity were so similar for every user and image (for de-
tailed information, see [29]) we decided to use only one of them, the point
similarity, as our intrauser consistency measure. The results are quite
satisfactory, with an average intrauser consistency of 96.88 and a standard
deviation of 5.56.
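The pair counts quoted above follow from elementary counting, as the short check below illustrates. The variable names are hypothetical.

```python
from itertools import combinations

users, images, repeats = 5, 5, 5

# Unordered pairs among the five segmentations one user made of one image.
pairs_per_image = len(list(combinations(range(repeats), 2)))
pairs_per_user = images * pairs_per_image
total_pairs = users * pairs_per_user
```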
In order to report on the consistency between users (interuser consistency)
we selected, for each user and each image, the most typical segmentation by
that user of that image. This is defined as that segmentation for which the sum
of membership similarities between it and the other four segmentations by that
user of that image is maximal. Thus, we obtained five segmentations for each
image that were paired with one another into 10 pairs, resulting in a total of
50 pairs of segmentations. The average and standard deviation of the interuser
consistency (98.71 and 1.55, respectively) were even better than the intrauser
consistency, mainly because the selection of the most typical segmentation for
each user eliminated the influence of relatively bad segmentations.
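Selecting the most typical segmentation is a small maximization, sketched here. The similarity function is passed in as a parameter; the `agreement` function used in the example is a simplified stand-in for the membership similarity of the text, ties are broken in favor of the first candidate (an assumption), and all names and label lists are hypothetical.

```python
def most_typical(segmentations, similarity):
    """Index of the segmentation most similar, in total, to all the others."""
    def total(i):
        return sum(similarity(segmentations[i], other)
                   for j, other in enumerate(segmentations) if j != i)
    return max(range(len(segmentations)), key=total)

def agreement(a, b):
    """Stand-in similarity: percentage of positions with equal labels."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

# Five hypothetical segmentations of a four-spel image by one user.
attempts = [
    [1, 1, 2, 2],
    [1, 1, 2, 1],
    [1, 1, 2, 2],
    [2, 2, 1, 1],   # a relatively bad segmentation
    [1, 1, 2, 2],
]
typical = most_typical(attempts, agreement)
```

The outlying fourth attempt has the smallest total similarity and is never selected, which is the effect described above.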
Finally, we did some calculations of the sensitivity of our approach to M
(the predetermined number of objects in the image). The distinction between
the objects represented in the top right and bottom left images of Fig. 12.5 and
between the objects represented in the bottom images of Fig. 12.7 is artificial; the
nature of the regions assigned to these objects is the same. The question arises:
if we merge these two objects into one do we get a similar 2-segmentation to
what would be obtained by merging the seed points associated with the two
objects into a single set of seed points and then applying our algorithm? (This
is clearly a desirable robustness property of our approach.) The average and
standard deviation of the point similarity under object merging for a total of
50 readings by our five users on the top-left images of Figs. 12.5 and 12.7 were
99.33 and 1.52, respectively.
where δ denotes the grid spacing. From the definitions above, the fcc and bcc
grids can be seen either as one sc grid without some of its grid points or as a
union of shifted sc grids, four in the case of the fcc and two in the case of the bcc.
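Both views of the fcc and bcc grids can be checked on a small block of grid points. The defining formulas are not reproduced in the text above, so this sketch assumes a common parity-based convention, with integer coordinates in units of δ: a point of the sc grid belongs to the fcc grid if the sum of its coordinates is even, and to the bcc grid if all three coordinates have the same parity. The function names are hypothetical.

```python
from itertools import product

def sc_block(n):
    """Points of the sc grid (spacing 1) in the block [0, 2n)^3."""
    return set(product(range(2 * n), repeat=3))

def is_fcc(p):
    return sum(p) % 2 == 0                      # assumed fcc convention

def is_bcc(p):
    return p[0] % 2 == p[1] % 2 == p[2] % 2     # assumed bcc convention

def shifted_sc(n, shift):
    """Points of an sc grid with spacing 2, translated by `shift`."""
    return {(2 * a + shift[0], 2 * b + shift[1], 2 * c + shift[2])
            for a, b, c in product(range(n), repeat=3)}

n = 2
block = sc_block(n)
fcc = {p for p in block if is_fcc(p)}           # sc grid with points removed
bcc = {p for p in block if is_bcc(p)}

# The same sets obtained as unions of shifted, coarser sc grids.
fcc_union = set().union(*(shifted_sc(n, s) for s in
                          [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]))
bcc_union = set().union(*(shifted_sc(n, s) for s in
                          [(0, 0, 0), (1, 1, 1)]))
```

Under these assumptions the fcc grid keeps half of the sc grid points and the bcc grid keeps a quarter of them, and each coincides with the corresponding union of four or two shifted sc grids.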
We now generalize the notion of a voxel to an arbitrary grid. Let G be any
set of points in R3 , then the Voronoi neighborhood of an element g of G is
692 Herman and Carvalho
Figure 12.13: Three grids with the Voronoi neighborhood of one of their grid
points. From left to right: the simple cubic (sc) grid, the face-centered cubic
(fcc) grid, and the body-centered cubic (bcc) grid.
defined as the set of all points of R3 that are at least as close to g as to every other element of G.
In Fig. 12.13, we can see the sc, the fcc and the bcc grids and the Voronoi
neighborhoods of the front-lower-left grid points.
Why should one choose grids other than the ubiquitous simple cubic grid?
The fcc and bcc grids are superior to the sc grid because they sample the 3-D
space more efficiently, with the bcc being the most efficient of the three. This
means that both the bcc and the fcc grid can represent a 3-D image with the
same accuracy as that of the sc grid but using fewer grid points [33].
We decided to use the fcc grid for 3-D images instead of the bcc grid for
reasons that will become clear in a moment; now we discuss one additional
advantage of using the fcc grid over the sc grid. If we have an object that is
a union of Voronoi neighborhoods of the fcc grid, then for any two faces on
the boundary between this object and the background that share an edge, the
normals of these faces make an angle of 60◦ with each other. This results in a less
blocky image than if we used a surface based on the cubic grid with voxels of
the same size. This can be seen in Fig. 12.14, where we display approximations
to a sphere based on different grids. Note that the display based on the fcc grid
(center) has a better representation than the one based on the sc grid with the
same voxel volume (left) and is comparable with the representation based on
cubic grid with voxel volume equal to one eighth of the fcc voxel volume (right).
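The 60° figure can be verified from the geometry of one rhombic dodecahedron. In the sketch below the 12 face normals are taken to be the directions from a grid point to its 12 nearest fcc neighbors (permutations of (±1, ±1, 0)); this parametrization and the function names are our assumptions.

```python
import math
from itertools import combinations, permutations

def fcc_face_normals():
    """Directions from an fcc grid point to its 12 nearest neighbors.

    Each direction is normal to one rhombic face of the Voronoi neighborhood.
    """
    normals = set()
    for a in (1, -1):
        for b in (1, -1):
            normals.update(permutations((a, b, 0)))
    return sorted(normals)

def angle_deg(u, v):
    """Angle between two vectors, in degrees."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u) * sum(y * y for y in v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

normals = fcc_face_normals()
# Distinct angles (rounded to whole degrees) between pairs of face normals.
angles = sorted({round(angle_deg(u, v)) for u, v in combinations(normals, 2)})
```

The smallest angle, 60°, belongs to faces that share an edge; for a cubic voxel the corresponding angle is 90°, which is why the cubic surface looks blockier.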
The main advantage of the bcc grid over the fcc grid is that it needs fewer
grid points to represent an image with the same accuracy. However, in the bcc
Figure 12.14: Computer graphic display of a sphere using different grids (reproduced
from [19]). (a) Display based on an sc grid with voxels of the same
volume as those of the fcc grid used for (b). The image (c) corresponds
to a display based on an sc grid with voxels of volume equal to one eighth of the
voxel volume in the other two images.
grid, grid points whose Voronoi neighborhoods share a face can be at one of
two distances from each other, depending on the kind of face they share (see
Fig. 12.13), a characteristic that may not be desirable. The Voronoi neighbor-
hoods of an fcc grid Fδ are rhombic dodecahedra (polyhedra with 12 identical
rhombic faces), as can be seen in Fig. 12.13. We can define the adjacency relation
β for the grid Fδ by: for any pair (c, d) of grid points in Fδ ,
\[ (c, d) \in \beta \iff \|c - d\| = \sqrt{2}\,\delta. \tag{12.20} \]
Each grid point c ∈ Fδ has 12 β-adjacent grid points in Fδ . In fact, two grid points
in Fδ are adjacent if, and only if, their associated Voronoi neighborhoods share a
face. In practice these definitions give rise to a digital space (V, π ) by using a
V that is a finite subset of Fδ and a π that is the β of Eq. (12.20) restricted to V .
(Note that since Eq. (12.2) ignores distance, a similar approach applied to the
bcc grid would have the undesirable consequence of having a fuzzy spel affinity
that does not incorporate the difference in distances between adjacent spels.)
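The neighbor counts behind this discussion can be reproduced with a brute-force search. The sketch assumes δ = 1 and the parity definitions of the fcc and bcc grids (coordinate sum even, and all coordinates of equal parity, respectively), which are not restated in this excerpt; the helper names are ours.

```python
from itertools import product

def neighbors_at(point, grid_test, squared_distance, reach=2):
    """Grid points at the given squared distance from `point` (delta = 1)."""
    found = []
    for offset in product(range(-reach, reach + 1), repeat=3):
        if sum(o * o for o in offset) != squared_distance:
            continue
        q = tuple(p + o for p, o in zip(point, offset))
        if grid_test(q):
            found.append(q)
    return found

def in_fcc(p):
    """Assumed fcc membership test: coordinate sum is even."""
    return sum(p) % 2 == 0

def in_bcc(p):
    """Assumed bcc membership test: all coordinates share one parity."""
    return p[0] % 2 == p[1] % 2 == p[2] % 2

origin = (0, 0, 0)
fcc_adjacent = neighbors_at(origin, in_fcc, 2)  # distance sqrt(2) * delta
bcc_near = neighbors_at(origin, in_bcc, 3)      # distance sqrt(3) * delta
bcc_far = neighbors_at(origin, in_bcc, 4)       # distance 2 * delta
```

The fcc grid point has 12 face-sharing neighbors, all at the single distance of Eq. (12.20), whereas the bcc grid point has 8 at one distance and 6 at another, which is the property noted above as potentially undesirable.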
Experiments with segmentations using this approach on 3-D images were
reported in [34]. Here we present two more recent experiments from [35] of
multiple object fuzzy segmentation of 3-D images on the fcc grid.
The first experiment was performed on a computerized tomography (CT)
reconstruction that assigned values to a total of (298 × 298 × 164)/2 = 7,281,928
(see Eq. (12.17)) fcc grid points. We selected seeds for four objects, the intestine
(red object), other soft tissues (green object), the bones (blue object) and the
Figure 12.15: Two axial slices from a CT volume placed on the fcc grid and
the corresponding 4-segmentations. (All four images were interpolated to the sc
grid for display purposes.)
Second, the memory saving approach used in implementing the 3-D version of
our algorithm slowed down the execution. (Since the goal was to segment vol-
umes that could have as many as 512 × 512 × 512 spels, we chose to implement
a “growing” heap, where a new level of the heap was added or an old one was
removed as the program was executed depending on the number of spels cur-
rently in the heap, so that the memory usage was kept as low as possible. Note
that, besides the heap, both the M-segmentation map and the original volume
are accessed simultaneously by Algorithm 1). Finally, our program was devel-
oped with the goal of being able to segment images placed on the sc, fcc, and bcc
grids, and this generality also contributed to the longer execution time of the
algorithm; as opposed to the approach taken in the 2-D case, where we use two
programs to produce the segmentations shown in subsection 12.3.3 (one for the
images on the hexagonal grid and another for the images on the square grid).
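One way to picture such a "growing" heap is a binary heap stored level by level, where level k holds 2^k slots and is allocated or released as the heap crosses level boundaries. The class below is a minimal sketch under that assumption, with a max-heap ordering chosen by us; it is not the authors' implementation.

```python
class GrowingMaxHeap:
    """Binary max-heap stored as a list of levels; level k holds 2**k slots.

    A level is allocated when the heap grows into it and released when it
    becomes empty, so memory follows the current number of items.
    """

    def __init__(self):
        self.levels = []
        self.size = 0

    @staticmethod
    def _locate(i):
        level = (i + 1).bit_length() - 1
        return level, i + 1 - (1 << level)

    def _get(self, i):
        level, offset = self._locate(i)
        return self.levels[level][offset]

    def _set(self, i, item):
        level, offset = self._locate(i)
        self.levels[level][offset] = item

    def _swap(self, i, j):
        a, b = self._get(i), self._get(j)
        self._set(i, b)
        self._set(j, a)

    def push(self, item):
        level, _ = self._locate(self.size)
        if level == len(self.levels):       # heap grows into a new level
            self.levels.append([None] * (1 << level))
        i = self.size
        self.size += 1
        self._set(i, item)
        while i > 0 and self._get((i - 1) // 2) < self._get(i):  # sift up
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2

    def pop(self):
        if self.size == 0:
            raise IndexError("pop from empty heap")
        top = self._get(0)
        self.size -= 1
        last = self._get(self.size)
        self._set(self.size, None)
        if self._locate(self.size)[1] == 0:  # deepest level is now empty
            self.levels.pop()
        if self.size > 0:
            self._set(0, last)
            i = 0
            while True:                      # sift down
                largest = i
                for child in (2 * i + 1, 2 * i + 2):
                    if child < self.size and self._get(child) > self._get(largest):
                        largest = child
                if largest == i:
                    break
                self._swap(i, largest)
                i = largest
        return top
```

The extra index arithmetic in `_locate` on every access illustrates why saving memory in this way costs execution time.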
In order to have a better idea of the quality of the segmentation produced by
our algorithm on this volume, we created a 3-D model of the segmented intestines
(the red object of Fig. 12.15) using the software OpenDX [36], which can be seen
in Fig. 12.16. Since OpenDX can work with arbitrary grids, we used the fcc grid:
the surface shown on Fig. 12.16 consists of faces of rhombic dodecahedra (fcc
Voronoi neighborhoods).
Figure 12.16: A 3-D view of the segmented intestines shown in Fig. 12.15.
Figure 12.17: An axial slice from the VB dataset volume placed on the fcc grid
and three maps of a 4-segmentation of it. (All four images were interpolated to
the sc grid for display purposes.)
Figure 12.18: An axial slice from the VB dataset volume placed on the fcc grid
and three maps of a 4-segmentation of it. (All four images were interpolated to
the sc grid for display purposes.)
display window settings) as those in the bronchi and trachea, the placement
of seeds in the areas of the lung close to bronchial junctions stops the leakage
of the trachea–bronchi object (top right) into the lung object (bottom right).
Figure 12.19 shows a 3-D view of the segmented trachea and bronchial tubes.
12.5 Conclusion
Figure 12.19: A 3-D view of the segmented trachea and bronchi shown in
Figs. 12.17 and 12.18.
characteristics make the proposed method a valuable tool for interactive seg-
mentation, since a low-quality segmentation (as judged by the user) can be
corrected by the removal or introduction of new seed spels and a series of seg-
mentations can be produced until a satisfactory one is achieved. The method
can also be transformed into a fully automatic one if sufficient prior information
is available pertaining to the objects to be segmented.
12.6 Acknowledgements
We thank T. Yung Kong for his contributions to this work. This research has been
supported by NIH grant HL70472 (GTH and BMC) and CAPES-BRAZIL (BMC).
Questions
Figure 12.20: Three examples of a set of spels V ; in each case the spels are dots
in the plane and V = T ∪ L ∪ R ∪ B ∪ {o}, where T contains the top three dots,
L contains the five (a), four (b), or three (c) horizontally centered dots on the
left, R contains the three horizontally centered dots on the right, B contains the
three vertically centered dots on the bottom, and o is the dot on the bottom-right.
and
\[ \bar{\bar{\psi}}(c, d) = \begin{cases} 1/3 & \text{if } \|c - d\| \le 4, \\ 0 & \text{otherwise.} \end{cases} \tag{12.23} \]
4. Consider the seeded 2-fuzzy graph (V, Ψ, V) where V is the set (a) of
Fig. 12.20, Ψ = (ψ, ψ), V1 contains the leftmost spel of V and V2 contains
the rightmost spel of V . Compute the 2-segmentation σ using Theorem 1.1.
=
5. Does the 2-segmentation σ change if we use = (ψ, ψ)?
−
6. Is (V, (ψ, ψ), V), where V1 contains the leftmost spel of V and V2 contains
the rightmost spel of V , a connectable 2-fuzzy graph for any of the sets of
Fig. 12.20?
8. Why should one use the fcc grid instead of the traditional sc (cubic) grid?
9. Suppose that the fuzzy spel affinities defined for a specific application
can only assume values from a small set (around 1000 elements). Dis-
cuss an alternative data structure for implementing the algorithm more
efficiently.
3
The definitions of RFC and IRFC of [27] are restricted to 2-fuzzy graphs where ψ1 = ψ2
with a single seed spel per object.
10. Consider the seeded 2-fuzzy graph (V, Ψ, V) where V is the set (c) of
Fig. 12.20, Ψ = (ψ, ψ), V1 contains the leftmost spel of V and V2 contains
the rightmost spel of V . Compute the 2-segmentations σ using Theorem 1.1
and RFC and compare them.
11. Consider the seeded 2-fuzzy graph (V, Ψ, V) where V is the set (a) of
Fig. 12.20, Ψ = (ψ, ψ), V1 contains the leftmost spel of V and V2 contains
the bottommost spel of B. Compute the 2-segmentations σ using
Theorem 1.1 and IRFC and compare them.
Bibliography
[3] Kass, M., Witkin, A., and Terzopoulos, D., Snakes: Active contour models,
Int. J. Comput. Vision, Vol. 1, pp. 321–331, 1988.
[4] Cohen, L. D., On active contour models and balloons, CVGIP: Image
Understanding, Vol. 53, pp. 211–218, 1991.
[5] Geman, D., Geman, S., Graffigne, C., and Dong, P., Boundary detection
by constrained optimization, IEEE Trans. Pattern Anal. Mach. Intell.,
Vol. 12, pp. 609–628, 1990.
[6] Hofmann, T., Puzicha, J., and Buhmann, J. M., Unsupervised texture seg-
mentation in a deterministic annealing framework, IEEE Trans. Pattern
Anal. Mach. Intell., Vol. 20, pp. 803–818, 1998.
[8] Ronfard, R., Region-based strategies for active contour models, Int. J.
Comput. Vision, Vol. 13, pp. 1374–1387, 1994.
[9] Malladi, R., Sethian, J. A., and Vemuri, B. C., Shape modelling with front
propagation: A level set approach, IEEE Trans. Pattern Anal. Mach. Intell.,
Vol. 17, pp. 158–175, 1995.
[10] Tsai, A., Yezzi, A., Wells, W., Tempany, C., Tucker, D., Fan, A., Grimson,
W. E., and Willsky, A., A shape-based approach to the segmentation
of medical imagery using level sets, IEEE Trans. Med. Imag., Vol. 22,
pp. 137–154, 2003.
[12] Carvalho, B. M., Gau, C. J., Herman, G. T., and Kong, T. Y., Algorithms
for fuzzy segmentation, Pattern Anal. Appl., Vol. 2, pp. 73–81, 1999.
[15] Rice, B. L. and Udupa, J. K., Clutter-free volume rendering for magnetic
resonance angiography using fuzzy connectedness, Int. J. Imag. Syst.
Tech., Vol. 11, pp. 62–70, 2000.
[16] Saha, P. K., Udupa, J. K., and Odhner, D., Scale-based fuzzy connected
image segmentation: Theory, algorithms and validation, Comput. Vision
Image Understanding, Vol. 77, pp. 145–174, 2000.
[17] Udupa, J. K., Wei, L., Samarasekera, S., Miki, Y., van Buchem, M. A.,
and Grossman, R.I., Multiple sclerosis lesion quantification using fuzzy-
connectedness principles, IEEE Trans. Med. Imag., Vol. 16, pp. 598–609,
1997.
[18] Rosenfeld, A., Fuzzy digital topology, Inform. Control, Vol. 40, pp. 76–87,
1979.
[20] Dellepiane, S. G., Fontana, F., and Vernazza, G. L., Nonlinear image
labeling for multivalued segmentation, IEEE Trans. Image Process.,
Vol. 5, pp. 429–446, 1996.
[21] Ahuja, N., Dot pattern processing using Voronoi neighborhoods, IEEE
Trans. Pattern Anal. Mach. Intell., Vol. 3, pp. 336–343, 1982.
[23] Jain, A. K., Murty, M. N., and Flynn, P. J., Data clustering: A review, ACM
Comput. Surveys, Vol. 31, pp. 264–323, 1999.
[24] Gower, J. C. and Ross, G. J. S., Minimum spanning trees and single
linkage cluster analysis, Appl. Statist., Vol. 18, pp. 54–64, 1969.
[26] Cormen, T. H., Leiserson, C. E., and Rivest, R. L., Introduction to Algo-
rithms, MIT Press, Cambridge, MA, 1990.
[27] Udupa, J. K., Saha, P. K., and Lotufo, R. A., Fuzzy connected
object definition in images with respect to co-objects, In: Proc.
SPIE, Bellingham, WA, Vol. 3661: Image Processing, Hanson, K. M., ed.,
pp. 236–245, 1999.
[31] Pollak, I., Willsky, A. S., and Krim, H., Image segmentation and edge
enhancement with stabilized inverse diffusion equations, IEEE Trans.
Image Process., Vol. 9, pp. 256–266, 2000.
[32] Koepfler, G., Lopez, C., and Morel, J.-M., A multiscale algorithm for
image segmentation by variational method, SIAM J. Numer. Anal., Vol.
31, pp. 282–299, 1994.
[34] Carvalho, B. M., Garduño, E., and Herman, G. T., Multiseeded fuzzy
segmentation on the face centered cubic grid, In: Advances in Pat-
tern Recognition: Second International Conference, ICAPR 2001, Rio
de Janeiro, Brazil, 2001. LNCS Vol. 2013, Singh, S., Murshed, N., and
Kropatsch, W., eds., Springer-Verlag, pp. 339–348, 2001.
Computer-Aided Diagnosis of
Mammographic Calcification Clusters:
Impact of Segmentation
13.1 Introduction
Medical image analysis is an area that has always attracted the interest of en-
gineers and basic scientists. Research in the field has been intensified in the
last 15–20 years. Significant work has been done and reported for breast can-
cer imaging with particular emphasis on mammography. The reasons for the
impressive volume of work in this field include
(a) increased awareness and education of women on the issues of early breast
cancer detection and mammography,
(b) the potential for significant improvements both in the fields of imaging
and management, and
(c) the multidisciplinary aspects of the problems and the challenge presented
to both engineers and basic scientists.
1
Department of Radiology, H. Lee Moffitt Cancer Center & Research Institute, University
of South Florida, Tampa, FL 33612-4799
708 Kallergi, Heine, and Tembey
arguments that support an opposite view [3]. It should be noted that breast
cancer was the second major cause of death for women in 2003 and mammography
has been responsible for a mortality reduction of 20–40% [4, 5]. Despite
its success, mammography still has a false negative rate of 10–30% and great
variability [6].
Calcifications are one of the main and earliest indicators of cancer in mam-
mograms. They are present in 50–80% of all mammographically detected cancers
but pathologic examinations reveal an even greater percentage [7]. Most of the
minimal cancers and in-situ carcinomas are detected by the presence of calci-
fications [7]. A review of the literature on missed breast cancers indicates that
calcifications are not commonly found among the missed lesions [8]. Although
perception errors are not excluded, particularly in the case of microcalcifica-
tions (size < 1 mm), the technique of screen/film mammography (SFM) has been
significantly improved over the years offering high-contrast and high-resolution
mammograms that make calcification perception relatively easy. A greater and
continuing problem for radiologists, with a major impact on the specificity of
the diagnosis, is the mammographic differentiation between benign and ma-
lignant clustered calcifications. Almost all cases with calcifications are recom-
mended for biopsy but only about 15–34% of these prove to be malignant [9]. The
biopsies necessary to make the determination between benign and malignant
disease represent the largest category of induced costs of mammography screen-
ing and a major source of concern for radiologists, surgeons, and patients. The
advent of full field direct digital mammography will probably amplify this prob-
lem by providing more details and revealing breast abnormalities at very early
stages [10].
In the last 20 years, researchers have developed various computer schemes
for analyzing mammograms with calcifications, masses, and other breast abnor-
malities in an effort to improve mammography and breast cancer detection and
diagnosis [11]. Computer algorithms can be divided in three groups depending
on their final goal: detection, diagnosis, and prognosis methodologies [12]. The
majority of the effort to-date has been focused on the development of detection
tools, namely tools that point out to the primary reader suspicious areas associated
with calcification clusters or masses that may warrant further review. The
outcome of the intensive research on detection has led to three commercial,
FDA approved systems for computer-aided detection (CADetection) of calcifi-
cations and masses; two more manufacturers were in the process of applying
for FDA approval as of this writing [2]. The commercial CADetection systems
play the role of a virtual “second reader” by highlighting suspicious areas for
further review and evaluation by the human observer [2].
Research on computer tools for diagnosis has been lagging behind but is
now gaining momentum. The goal of a CADiagnosis system is to aid in the
differentiation between benign and malignant lesions identified previously by a
human observer [13] or a CADetection technique [14]. Such systems are not fully
tested yet for clinical efficacy but are very promising and may provide significant
aid to the mammographer in the form of a “second opinion.”
Finally, computer-aided prognosis (CAP) tools are appearing on the horizon and are
beginning to be explored as the next step in computer applications for breast
imaging. Certainly, the variety of problems encountered in the detection, management,
treatment, and follow-up of breast cancer patients leaves several unexplored
areas where computer applications could be clinically useful with major
benefits to health care delivery and patient management.
Although the goals of automated detection and diagnosis are different, the
actual detection and classification tasks are not always separate [15]. Almost
all modern detection algorithms contain modules that discriminate true from
false signals, calcification-like or mass-like artifacts from true calcifications or
masses, isolated or single calcifications from clustered ones, even benign from
malignant lesions in order to point only to malignant ones [16]. Most of the
computer-based diagnosis techniques rely on the human observer to provide
the detection step and/or the classification features [13, 17]. Few, however, in-
corporate segmentation and detection with pattern recognition processes in
order to provide an automated, seamless approach that yields detection as well
as likelihood of malignancy [18].
CADiagnosis methodologies that aim at the automatic differentiation of be-
nign from malignant calcifications use a variety of mathematical descriptors
that represent or correlate with one or more clinical findings, demographic in-
formation, or purely technical image characteristics. Reported algorithms usu-
ally employ combinations of morphological, texture, and intensity features as
well as patient-related, demographic information [13, 14, 17, 19]. A valuable,
comprehensive summary of reported techniques is given by Chan et al. [18].
Most of these methods are successful, particularly when compared with the diagnostic
performance of human readers, whose positive predictive value
is relatively low. Jiang et al. [14] have reported one of the first ROC
The diagram in Fig. 13.1 shows the major components of a CADiagnosis algo-
rithm aiming at the differentiation between benign and malignant lesions. Based
on this diagram, one may distinguish two major pathways to algorithm develop-
ment:
[Flowchart; box labels: CADetection input, radiologist's input, characterization, feature selection, classification, benign/malignant.]
[Diagram of data types; labels: digitized film, direct digital, multimodality; single view, two views, multiple views; full image, region of interest (ROI); 2-D image, 3-D image.]
[Flowchart; box labels: thresholding, leave-one-out resampling, likelihood of malignancy.]
Figure 13.3: Flowchart of the CADiagnosis algorithm developed for the dif-
ferentiation of benign from malignant microcalcification clusters in digitized,
screen/film mammography [20].
The terms segmentation and detection may be confusing for the reader not so
familiar with the medical imaging vernacular. In some instances these terms may
be used interchangeably, but other times not. We might consider segmentation
as being a more refined or specialized type of detection. For instance, we may
gate a receiver for some time increment and make a decision as to whether or
not a signal of interest was present within the total time duration, but not care
about exactly where the signal is within the time window; this may be defined
as a detection task with a binary output of yes or no. Segmentation takes this
a step further. With respect to the image processing, the detection task makes
a decision if the abnormality is present, which in this case is a calcification. If,
in addition, the detection provides some reasonable estimate as to the spatial
location and extent of the abnormality, then we would say that the calcification
this signal or any 2-D signal (image). Specifically, this signal is purely a DC signal
in the y-direction when considering its Fourier composition and a sinc-type
function in the other direction. Consequently, the transform is a sinc function
along the fx coordinate axis and about zero elsewhere; this may be deduced
by considering the ( fx , fy) coordinates and noting that the fx component is
zero everywhere except at fy = 0. The following may be observed: (1) Linear
structures in the vertical direction are likely to give rise to Fourier signatures in
the fx direction. The narrower the width the more spread out the contribution is
in the Fourier fx direction and the wider in width, the more contracted along the
fx direction. (2) Linear structures in the horizontal direction are likely to have
significant Fourier signatures in the fy direction and less in the fx direction.
(3) Taking this a step further, spots give rise to components in both coordinate
directions. These examples are idealizations that may inspire the newcomer to
Fourier analysis to observe what exactly the Fourier Transform is telling the user.
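Observation (1) can be checked on a toy image. The sketch below uses a direct (slow) 2-D discrete Fourier transform written with the standard library only; the 16 × 16 size, the stripe position, and the function name `dft2` are our choices for illustration.

```python
import cmath

def dft2(image):
    """Direct 2-D discrete Fourier transform of a small square image.

    Returns F[fy][fx]. Fine for illustration, far too slow for real images.
    """
    n = len(image)
    w = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]
    return [
        [
            sum(
                image[y][x] * w[(fx * x + fy * y) % n]
                for y in range(n)
                for x in range(n)
            )
            for fx in range(n)
        ]
        for fy in range(n)
    ]

n = 16
# Vertical stripe: constant along y, a narrow rectangular pulse along x.
stripe = [[1.0 if 6 <= x < 10 else 0.0 for x in range(n)] for y in range(n)]
spectrum = dft2(stripe)

# Energy on the fx axis (the fy = 0 row) versus everywhere else.
energy_on_fx_axis = sum(abs(v) ** 2 for v in spectrum[0])
energy_elsewhere = sum(abs(v) ** 2 for row in spectrum[1:] for v in row)
```

Up to rounding error, all of the energy sits on the fx axis, as stated for a vertical linear structure; transposing the stripe moves it to the fy axis.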
Filtering can be applied to set the stage for detection or segmentation. The
basic idea is that there is some structure that we define as the signal of inter-
est, which in our case is the localized calcified areas in mammograms termed
“calcifications.” These signals are surrounded by (or embedded in) other signals
(in this case normal breast tissue) that may interfere with the ability to
automatically detect them. In the best scenario, the signal of interest will have
a frequency signature that is somewhat different than the background. If this is
the case, filtering the image will pass the signal of interest (perhaps not intact)
and block a portion of the background tissue. If this is successful, the filtered
image will show a relatively more pronounced calcified area and a somewhat
subdued background when compared with the raw image.
A simple, somewhat contrived example of the usefulness of filtering is
provided here. Suppose we have a white 2-D (n × n) noise field with variance σ²
and filter it with a perfect band pass filter. Can we say anything about the re-
sulting noise power? The answer is yes. White noise by definition is a flat power
spectrum (more correctly a constant power spectral density). For illustration,
we will apply a perfect half-band filter to this field and calculate the resulting
noise power. In the Fourier domain, the half band filter looks like a square
box centered about zero of unit height with its sidewalls intersecting the fre-
quency coordinates and the midway point. Fourier components within the box
are passed when filtering and everything outside is blocked. Thus the total area in
the Fourier domain is n², the pass-band area is (n/2)², and the blocked portion of
The reverse argument applies when the wavelet is most contracted implying
better spatial location but spread out in frequency. These ideas are fundamental
to understanding both Fourier and wavelet analysis. For the purpose of this
discussion, a very simple wavelet interpretation is developed and presented
in the following paragraphs.
The wavelet expansion may be considered as a band pass filter network. The
intact signal (raw image) is put into the mill and out come many filtered versions
of the image. The orthogonal wavelet gives an expansion of the form:
\[ F_0 = d_1 + d_2 + d_3 + d_4 + \cdots + d_j + F_j \tag{13.1} \]
where $F_0$ is the raw image, the $d_j$ represent band pass versions of the raw image,
and $F_j$ is a blurred version of the raw image, which contains the DC and slow
varying image attributes. These images are linearly independent, which amounts
to perfect reconstruction and is one of the great strengths of wavelet analysis
compared with just any band pass filter network. Each of these expansion
images may be divided further into three complementary components, expressed
as vertical, horizontal, and diagonal components, which are also not correlated.
Roughly speaking, the ds represent an octave sectioning (or fine to coarse image
representation) of the frequency domain information. This can be observed by
taking the Fourier transform of each expansion component individually and
noting where each has appreciable Fourier amplitudes. Figure 13.4 shows the
idealized division of the Fourier domain relative to the image expansion images.
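The additive structure of Eq. (13.1) can be illustrated in one dimension with the simplest orthogonal wavelet, the Haar wavelet. This choice, the block-averaging construction, and the sample values are ours for illustration and are not the wavelet used in the chapter; the signal length must be divisible by 2 raised to the number of levels.

```python
def block_average(signal, block):
    """Replace each run of `block` samples by its mean (a blurred copy)."""
    out = []
    for start in range(0, len(signal), block):
        chunk = signal[start:start + block]
        out.extend([sum(chunk) / len(chunk)] * len(chunk))
    return out

def expand(signal, levels):
    """Return ([d1, ..., dj], Fj) such that signal = d1 + ... + dj + Fj."""
    details, current = [], list(signal)
    for k in range(1, levels + 1):
        blurred = block_average(signal, 2 ** k)
        # Band pass image at level k: what the extra blurring removed.
        details.append([c - b for c, b in zip(current, blurred)])
        current = blurred
    return details, current

f0 = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
details, fj = expand(f0, 3)

# Perfect reconstruction: summing all expansion images restores f0.
rebuilt = [sum(parts) for parts in zip(*details, fj)]
# Orthogonality of two expansion images (inner product is zero).
inner = sum(a * b for a, b in zip(details[0], details[1]))
```

Here `fj` is the constant (DC) image, and each `details[k]` holds progressively coarser structure, mirroring the fine-to-coarse sectioning described above.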
Figure 13.4: Idealized graphical representation of the first four band pass splittings
(d1 to d4) of the raw image.
There are many orthogonal wavelet coefficients to choose from, and the band
pass nature is not generally sharp indicating that the expansion components will
have some shared frequency attributes; these cancel when performing the addi-
tion. Here is a simple rule of thumb: the shorter the wavelet filter kernel the less
sharp the cutoffs are in the Fourier domain and the longer the support length the
sharper the cutoffs. The strength of the two-channel or quadrature wavelet filter
is the orthogonality of the expansion images. The price paid for orthogonality
is the fixed way the associated information is divided (octave sectioning).
Figure 13.7: Detection output from the dual wavelet expansion image approach
of Fig. 13.5. The binary mask has been projected into the sum of the first five d j
images, a process that gives better detail for further processing.
there were two thresholds associated with each detection stage that were varied
independently. Figure 13.7 shows the output of this process for the cluster shown
in Fig. 13.5.
As we shall see below, the classification algorithm we developed requires
the analysis of calcification attributes that were not fully present in the bi-
nary detection representation of Fig. 13.7. Two options could lead to the de-
sired representation: (a) The binary detection output may be used as a mask
that points back to the calcification location in the raw image, or, more gen-
erally, to any other data representation. For example, classification analysis
could be done on any combination of the d images in Eq. (13.1). (b) Perform
an additional segmentation operation on the binary detection output that would
extract the shape and distribution of the detected cluster(s) and allow their
shape analysis necessary for the classification step. The second option was se-
lected in this application and calcifications were segmented in the detection
Figure 13.8: Slices through the donut filter in the Fourier domain for l = 2
(dashed curve) and l = 4 (solid curve) versions. Note the difference in the cut-
off behavior of the two versions.
Figure 13.9: Original ROI (512 × 512 pixels) with a calcification cluster associated
with cancer. The ROI was obtained from a screen/film mammogram digitized
at 60 µm and 16 bits/pixel.
Figure 13.10: Output of the symmlet wavelet filter for the ROI of Fig. 13.9.
Strong edge effects are present with this filter that often remain in the final
segmentation step (see section 13.3.4) and interfere with classification.
Figure 13.11: Output of the donut filter for the ROI of Fig. 13.9. Smaller edge
effects are observed in this case and improved edge definition of the objects of
interest.
Figure 13.12: Main section of an FFDM image of the left breast of a patient
with benign clustered calcifications enclosed in a white box. Image was acquired
with GE’s Senographe 2000D digital system at a resolution of 100 µm and
16 bits/pixel. The insert shows the region with the calcifications after filtering
with the new donut filter in combination with a prewhitening approach. Note
that the background is subdued (gray value information is removed) and calci-
fications remain as outlines that can be easily extracted.
periodic wrap around present in the discrete Fourier Transform. Basically, the
filter kernel slides off one side of the image and appears on the other. Thus, for
all practical purposes, the kernel slides over what appears as a discontinuity.
The artifact appears more pronounced in the wavelet-filtered than the donut-
filtered image, which may be due to the iterative convolution inherent to its
application.
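The wrap-around effect is easy to reproduce in one dimension. The sketch below compares circular convolution (what multiplication of discrete Fourier transforms implies) with a zero-padded version; the sample row and the two-tap kernel are hypothetical.

```python
def circular_convolve(signal, kernel):
    """Convolution as implied by multiplying DFTs: indices wrap modulo n."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[(i - k) % n] for k in range(len(kernel)))
        for i in range(n)
    ]

def zero_padded_convolve(signal, kernel):
    """Same sum, but samples outside the signal count as zero."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[i - k] for k in range(len(kernel)) if 0 <= i - k < n)
        for i in range(n)
    ]

row = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 9.0, 9.0]  # bright structure at the right edge
smooth = [0.5, 0.5]                              # two-tap averaging kernel

wrapped = circular_convolve(row, smooth)
padded = zero_padded_convolve(row, smooth)
```

Only the first output sample differs: with wrap-around, the bright structure at the right edge leaks into the left edge, which is the kind of edge artifact discussed above.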
In all fairness, we have not discussed the characteristics of the actual mammograms,
digitized or direct digital. In what follows, we give a short description
that will assist the reader in understanding the difficulties in processing mammograms
either automatically (computer vision) or manually (human vision).
Evidence indicates that mammograms, regardless of resolution, obey an inverse
power law with respect to their power spectral density [51–53]. Specifically,
the power spectrum of a particular image drops off as 1/f^γ, with γ on the
order of three. This indicates that the images are predominantly low-frequency
fields with long-range, although not well defined, spatial correlation. Power laws
are inherently termed self-similar and often the term fractal is used. This implies
that there are no preferred scales as with the human voice for example. There
are debates as to whether an anatomical structure like the breast could be truly
fractal. But, it is reasonable to say that the image statistics will quite often vary
widely across the image from region-to-region due to this spectral characteristic.
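An exponent such as γ is commonly estimated by fitting a straight line to the spectrum on log-log axes. The sketch below does this for a synthetic spectrum with γ = 3; the data are generated, not measured from mammograms, and the function name and the use of an ordinary least-squares fit are our assumptions.

```python
import math

def fit_power_law(freqs, power):
    """Least-squares line through (log f, log P).

    Returns (gamma, amplitude) for the model P(f) = amplitude / f**gamma.
    """
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in power]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return -slope, math.exp(my - slope * mx)

freqs = [0.01 * k for k in range(1, 51)]
power = [2.0 / f ** 3 for f in freqs]  # synthetic spectrum with gamma = 3
gamma, amplitude = fit_power_law(freqs, power)
```

On real images the spectrum would first have to be estimated (for example by radial averaging), and the fitted γ would vary from region to region, consistent with the variability noted above.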
As an aside, wavelet expansion images may be considered as multiresolution
derivatives (derivatives with respect to scale) in a loose sense. Effectively, the
differencing produces images (the expansion image) that are not as irregular
as the raw images. The reader interested in this line of reasoning could consult
Heine et al. [52, 54] and the references therein.
According to the flowchart of Fig. 13.3, the steps following the detection and
segmentation of the calcifications involve shape analysis of the segmented
Figure 13.14: Adaptive thresholding of the donut filter output of Fig. 13.11. As
in Fig. 13.13, both true and false calcifications were isolated and outlined. No
edge effects were generated in this case. Furthermore, more calcifications were
preserved in the segmentation stage at the expense of a slightly higher number
of false signals.
objects and selection of the feature set to be used as input to the classifier.
For this stage, we took advantage of prior art in the field of classification and
our experience in mammographic features [56]. Our starting point was the im-
plementation of the four shape features developed by Shen et al. [57] for indi-
vidual calcifications and their modification to apply to calcification clusters. We
expanded this initial set with two more shape descriptors of individual calcifica-
tions [20]. To represent the clusters, we added the standard deviations of the six
shape descriptors and a distribution feature. To represent the patient and link
the demographic data to the images, we added a demographic feature [58]. The
final result was a set of fourteen features for cluster classification in mammography.
Table 13.2 lists the selected feature set and the physical interpretation
of each feature [59]. Specific definitions and details may be found in the listed
references.
734 Kallergi, Heine, and Tembey
Table 13.2: Feature set selected from the shape analysis of the segmented
individual calcifications and clusters and demographic data^a
^a Features are limited to morphological and distributional characteristics (with the exception of “age”)
in order to reproduce the visual analysis system and indirectly use the classification as a measure of
segmentation.
[Diagram: input features F1–F14 feed input units I1–I14, which connect to hidden units H1–H12 and a single output unit O in the output layer, giving the percent likelihood of malignancy.]
Figure 13.15: Diagram of the NevProp4 artificial neural network (ANN) used
for cluster classification. This is a standard three-layer, feed-forward ANN where
F1–F14 are the input features, I1–I14 are the input units, H1–H12 are the hidden
units, and O is the output layer [20, 59, 60].
target output for the cancer cases. This value could be interpreted as a percent
likelihood for a cluster to be malignant.
The generalization error of the ANN classifier was estimated by the “leave-
one-out” resampling method [61, 62]. Leave-one-out is a method generally rec-
ommended for the validation of pattern recognition algorithms using small
datasets. The use of this approach usually leads to a more realistic index of
performance and eliminates database problems such as small size and not fully
representative contents and problems associated with the mixing of training
and testing datasets [61, 63]. In the leave-one-out validation process, the net-
work was trained on all but one of the cases in the set for a fixed number
of iterations and then tested on the one excluded case. The excluded case
was then replaced, the network weights were reinitialized, and the training
was repeated by excluding a different case until every case had been excluded
once. For N cases, each exclusion of one case resulted in N−1 training cases,
1 testing case, and a unique set of network weights. As the process was repeated
over all N cases, there were N(N−1) training outputs and N testing outputs
from which the training and testing mean square errors (MSE) were, respectively,
determined.
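The leave-one-out procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ANN is replaced by a hypothetical nearest-class-mean classifier on a single feature (`train_nearest_mean` and `predict_nearest_mean` are stand-ins, not NevProp4), and the data are invented toy values.

```python
def leave_one_out(cases, labels, train, predict):
    """Train on all cases but one, test on the excluded case, repeat for every case."""
    test_outputs = []
    for k in range(len(cases)):
        train_x = cases[:k] + cases[k + 1:]
        train_y = labels[:k] + labels[k + 1:]
        model = train(train_x, train_y)          # fresh model: weights are reinitialized
        test_outputs.append(predict(model, cases[k]))
    return test_outputs


def mean_square_error(outputs, targets):
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


# Hypothetical stand-in for the classifier: nearest class mean on one feature.
def train_nearest_mean(xs, ys):
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means


def predict_nearest_mean(means, x):
    return min(means, key=lambda label: abs(x - means[label]))


# Toy data (invented): one feature value per case, 0 = benign, 1 = malignant.
cases = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
labels = [0, 0, 0, 1, 1, 1]

outputs = leave_one_out(cases, labels, train_nearest_mean, predict_nearest_mean)
test_mse = mean_square_error(outputs, labels)
```

Each of the N passes yields exactly one testing output, so `outputs` has N entries; the training MSE would be accumulated in the same loop from the N−1 training outputs of each pass.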
In addition to the leave-one-out method, other resampling approaches have
been proposed for CADiagnosis algorithm training that could yield unbiased
Several applications of the algorithm described here have been reported in the
literature [20, 28, 29]. In this chapter, we summarize the most important ones,
report on new experiments that are linked to segmentation issues and reveal
some of the open questions remaining in this area.
[Plot: TPF (vertical axis) versus FPF (horizontal axis), both from 0 to 1. Inset: Symmlet Wavelet, Az = 0.86, SE = 0.02; Donut Filter, Az = 0.89, SE = 0.02.]
Figure 13.16: Computer ROC plots of the TPF and FPF pairs obtained from
the classification of 260 clusters. The dashed curve corresponds to the results
obtained with the symmlet wavelet filter and the solid curve corresponds to
the results obtained with the donut filter. The estimated area indices Az and
corresponding SE values are included in the inset.
ing that cluster classification using the donut filter in the detection and seg-
mentation stage was significantly better than classification using the symmlet
wavelet.
[Plot: TPF (vertical axis) versus FPF (horizontal axis), both from 0 to 1. Inset: Symmlet Wavelet, Az = 0.93, SE = 0.02; Donut Filter, Az = 0.96, SE = 0.02.]
Figure 13.17: Computer ROC plots of the TPF and FPF pairs obtained from the
classification of 101 clusters using two-view information for feature estimation.
The dashed curve corresponds to the results obtained with the symmlet wavelet
filter and the solid curve corresponds to the results obtained with the donut
filter. The estimated area indices Az and corresponding SE values are included
in the inset.
(a) The 512 × 512 pixel ROIs of the 260 clusters were automatically reduced
to 200 × 200 pixels. As a result, several of the edge effects and associated
false signals were eliminated concentrating the analysis on the center
of the region where the signal of interest (cluster) should normally be
present. Both algorithm versions were applied to the reduced-size ROIs.
Results suggested that the classification of both the benign and malignant
cases might be improved by up to 15% for the algorithm using the symmlet
wavelet filter and up to 10% for the algorithm using the donut filter. The
smaller improvement in the latter case was expected because the donut
filter did not show major edge effects as the symmlet wavelet did in the
original ROIs (see Figs. 13.13 and 13.14). This seemed to be an easy and
fast remedy to the problem of FP signals with one downside. Namely, if
the clusters were off-center in the initial ROI either due to their physical
location in the breast (e.g., close to the chest wall or the skin area) or
due to the initial ROI selection, then important information was lost and
classification could not be done successfully.
(b) In a second experiment, all FP clusters and all single, isolated false sig-
nals that were outside the boundaries of the true cluster were manually
eliminated from the 512 × 512 pixel ROIs. This manual elimination was
done for a subset of 30 cases that contained small calcification clusters
(3–10 calcifications per cluster). The original and FP-free ROIs were then
used for feature estimation and classification. The elimination of the FP
signals significantly improved the classification of both benign and
malignant calcification clusters for both algorithm versions.
Classification errors were reduced by up to 30% for the benign cases and by up to 50% for
the malignant cases. Further analysis of these results revealed that the
presence of very small false objects in the segmentation output degraded
classification performance more than large false objects such as those
originating from the edge artifacts.
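The ROI reduction of experiment (a) can be illustrated with a simple centered crop. The sketch below assumes that the 200 × 200 region is taken from the exact center of the 512 × 512 ROI, which the text suggests but does not state explicitly; `center_crop` is a hypothetical helper name and the image is a synthetic array.

```python
def center_crop(roi, size):
    """Return the centered size x size sub-image of a 2-D list of pixel rows."""
    rows, cols = len(roi), len(roi[0])
    if size > rows or size > cols:
        raise ValueError("crop size exceeds ROI dimensions")
    top = (rows - size) // 2
    left = (cols - size) // 2
    return [row[left:left + size] for row in roi[top:top + size]]


# Synthetic 512 x 512 ROI whose pixel value encodes its own position.
roi = [[r * 512 + c for c in range(512)] for r in range(512)]
cropped = center_crop(roi, 200)
```

As noted in (a), a crop of this kind discards everything outside the central window, so an off-center cluster would be partly or wholly lost.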
13.7 Conclusions
CADiagnosis is an area that merges the fields of signal processing and pattern
recognition for the creation of tools that can have a significant impact in health
care delivery and patient management. CADiagnosis algorithms usually involve
several modules that need to be separately optimized and validated for an overall
optimum performance. In this chapter, we presented a CADiagnosis methodol-
ogy for the differentiation between benign and malignant breast calcification
clusters in mammograms. We specifically looked into one of the aspects of the
algorithm, namely the impact of segmentation in the overall classification pro-
cess, and the role of multiresolution analysis in the segmentation process.
Our classification approach was based primarily on morphological and
distributional features of mammographic calcifications and, hence, the role of
segmentation was particularly important in the overall implementation and
performance. Knowing the limitations of image segmentation techniques, which were
further exacerbated by our additional requirement to preserve morphology and
distribution, we developed two multiresolution filters that were able to yield
successful and clinically promising results. Although the segmentations were far
from perfect, the symmlet wavelet and the donut filter adequately preserved the
characteristics of the calcifications as required by the overall algorithm’s design. A
new filter, labeled as the “donut filter,” was introduced for mammogram
processing that seems to offer a robust solution to the problems associated with the
detection and segmentation of mammographic images. The new filter was not
utilized to its full potential and several implementation pathways remain to be
explored. Its initial testing, however, yielded promising results and its usefulness
could go beyond mammography to other medical imaging applications.
An important question at the end of the experiments presented here
is whether similar classification performance can be achieved, either with
the symmlet wavelet or the donut filter, for images generated from various
sources, for example, by different film digitizers (laser-based
vs. charge-coupled-device-based systems), by different imaging systems
(screen/film vs. direct digital systems), or with different resolution
characteristics (pixel size and bit depth). Preliminary work with different data types
suggests that similar classification results may be obtained if a standardization
process is applied to the images prior to processing. As long as pixel size and depth
are within acceptable ranges for CADiagnosis applications in mammography,
13.8 Acknowledgments
The authors acknowledge the valuable assistance of Angela Salem in the gener-
ation of the image databases used for algorithm development and testing, and
of Joseph Murphy in the processing of the data.
Questions
4. What property exists between wavelet expansion images that does not exist
in an arbitrary filter bank output?
5. What can you say about a time series signal that is nothing but a spike at
t = t0 with respect to its Fourier properties?
7. Explain what a band pass filter is and what it may be used for.
10. What are the criteria for database design as needed for the evaluation of
CADiagnosis algorithms?
12. What is the difference between computer ROC and observer ROC? How can
we correlate the ROC indices of performance to clinically used indices of
sensitivity and specificity?
Bibliography
[1] Huo, Z., Giger, M. L., Vyborny, C. J., and Metz, C. E., Breast cancer: Effec-
tiveness of computer-aided diagnosis-observer study with independent
database of mammograms, Radiology, Vol. 224, pp. 560–568, 2002.
[6] Bird, R. E., Wallace, T. W., and Yankaskas, B. C., Analysis of cancers
missed at screening mammography, Radiology, Vol. 184, No. 3, pp. 613–
617, 1992.
[7] Millis, R. R., Davis, R., and Stacey, A. J., The detection and significance
of calcifications in the breast: A radiological and pathological study, Br.
J. Radiol., Vol. 49, pp. 12–26, 1976.
[8] Reintgen, D., Berman, C., Cox, C., Baekey, P., Nicosia, S., Greenberg, H.,
Bush, C., Lyman, G. H., and Clark, R. A., The anatomy of missed breast
cancer, Surg. Oncol., Vol. 2, pp. 65–75, 1993.
[9] Kopans, D. B., The positive predictive value of mammography, AJR, Vol.
158, No. 3, pp. 521–526, 1992.
[10] Lewin, J. M., Hendrick, R. E., D’Orsi, C. J., Isaacs, P. K., Moss, L. J.,
Karellas, A., Sisney, G. A., Kuni, C. C., and Cutter, G. R., Comparison
of full-field digital mammography with screen-film mammography for
cancer detection: Results of 4,945 paired examinations, Radiology, Vol.
218, pp. 873–880, 2001.
[13] Floyd, C. E., Lo, J. Y., Yun, A. J., Sullivan, D. C., and Kornguth, P. J., Pre-
diction of breast cancer malignancy using an artificial neural network,
Cancer, Vol. 74, No. 11, pp. 2944–2948, 1994.
[14] Jiang, Y., Nishikawa, R. M., Schmidt, R. A., Metz, C. E., Giger, M. L.,
and Doi, K., Improving breast cancer diagnosis with computer-aided
diagnosis, Acad. Radiol., Vol. 6, pp. 22–33, 1999.
[15] Giger, M. L., Huo, Z., Kupinski, M. A., and Vyborny, C. J., Computer-
aided diagnosis in mammography, In: Handbook of Medical Imaging,
Volume 2, Medical Image Processing and Analysis, Sonka, M. and
Fitzpatrick, M. J., eds., SPIE Press, Bellingham, WA, pp. 915–1004,
2000.
[16] Li, L., Zheng, Y., Zhang, L., and Clark, R. A., False-positive reduction in
CAD mass detection using a competitive strategy, Med. Phys., Vol. 28,
No. 2, pp. 250–258, 2001.
[17] Wu, Y., Giger, M. L., Doi, K., Vyborny, C. J., Schmidt, R. A., and Metz, C.
E., Artificial neural networks in mammography: Application to decision
making in the diagnosis of breast cancer, Radiology, Vol. 187, pp. 81–87,
1993.
[18] Chan, H. P., Sahiner, B., Kam, K. L., Petrick, N., Helvie, M. A., Good-
sitt, M. M., and Adler, D. D., Computerized analysis of mammographic
microcalcifications in morphological and texture feature spaces, Med.
Phys., Vol. 25, No. 10, pp. 2007–2019, 1998.
[23] Hall, F. M., Storella, J. M., Silverstone, D. Z., and Wyshak, G., Nonpalpa-
ble breast lesions: Recommendations for biopsy based on suspicion of
carcinoma at mammography, Radiology, Vol. 167, pp. 353–358, 1988.
[24] Olson, S. L., Fam, B. W., Winter, P. F., Scholz, F. J., Lee, A. K., and Gordon,
S. E., Breast calcifications: Analysis of imaging properties, Radiology,
Vol. 169, pp. 329–332, 1988.
[25] Muir, B. B., Lamb, J., Anderson, T. J., and Kirkpatrick, A. E., Microcal-
cification and its relationship to cancer of the breast: Experience in a
screening clinic, Clin. Radiol., Vol. 34, pp. 193–200, 1983.
[27] Liberman, L., Abramson, A. F., Squires, F. B., Glassman, J. R., Morris,
E. A., and Dershaw, D. D., The breast imaging reporting and data system:
Positive predictive value of mammographic features and final assess-
ment categories, AJR, Vol. 171, pp. 35–40, 1998.
[28] Kallergi, M., Gavrielides, M. A., He, L., Berman, C. G., Kim, J. J., and
Clark, R. A., A simulation model of mammographic calcifications based
on the ACR BIRADS, Acad. Radiol., Vol. 5, pp. 670–679, 1998.
[29] Gavrielides, M. A., Kallergi, M., and Clarke, L. P., Automatic shape anal-
ysis and classification of mammographic calcifications, In: SPIE, Vol.
3034, pp. 869–876, 1997.
[30] Tolstov, G. P., Fourier Series, Dover Publications, New York, 1962.
[31] Bracewell, R. N., The Fourier Transform and Its Applications, 2nd edn.
revised, McGraw-Hill, New York, 1988.
[32] Brigham, E. O., The Fast Fourier Transform and Its Applications, Pren-
tice Hall, Englewood Cliffs, NJ, 1988.
[39] Strang, G. and Nguyen, T., Wavelets and Filter Banks, Wellesley-
Cambridge Press, Wellesley, MA, 1996.
[41] Vetterli, M. and Kovacevic, J., Wavelets and Subband Coding, Prentice
Hall, Englewood Cliffs, NJ, 1995.
[42] Daubechies, I., Ten Lectures on Wavelets, Society for Industrial and
Applied Mathematics, Philadelphia, PA, 1992.
[43] Peitgen, H. O., Jurgens, H., and Saupe, D., Chaos and Fractals: New
Frontiers of Science, Springer-Verlag, New York, 1992.
[44] Wornell, G. W., Signal Processing with Fractals: A Wavelet Based Ap-
proach, Prentice Hall, Upper Saddle River, NJ, 1996.
[45] Turner, M. J., Blackledge, J. M., and Andrews, P. R., Fractal Geometry
in Digital Imaging, Academic Press, San Diego, CA, 1998.
[46] Heine, J. J., Deans, S. R., Cullers, D. K., Stauduhar, R., and Clarke, L. P.,
Multiresolution statistical analysis of high-resolution digital mammo-
grams, IEEE Trans. Med. Imag., Vol. 16, No. 5, pp. 503–604, 1997.
[47] Heine, J. J., Deans, S. R., Cullers, D. K., Stauduhar, R., and Clarke,
L. P., Multiresolution probability analysis of gray scaled images, J. Opt.
Soc. Am. A, Vol. 15, pp. 1048–1058, 1998.
[48] Heine, J. J., Deans, S. R., and Clarke, L. P., Multiresolution probabil-
ity analysis of random fields, J. Opt. Soc. Am. A, Vol. 16, pp. 6–16,
1999.
[52] Heine, J. J., Deans, S. R., Velthuizen, R. P., and Clarke, L. P., On the
statistical nature of mammograms, Med. Phys., Vol. 26, pp. 2254–2265,
1999.
[53] Burgess, A. E., Jacobson, F. L., and Judy, P. F., Human observer detection
experiments with mammograms and power-law noise, Med. Phys., Vol.
28, No. 4, pp. 419–437, 2001.
[54] Heine, J. J., Deans, S. R., Gangadharan, D., and Clarke, L. P., Multireso-
lution analysis of two dimensional 1/f processes: Approximations, Opt.
Eng., Vol. 38, pp. 1505–1516, 1999.
[55] Freedman, M., Pe, E., Zuurbier, R., Katial, R., Jafroudi, H., Nelson, M.,
Lo, S. C. B., and Mun, S. K., Image processing in digital mammography,
SPIE, Vol. 2164, pp. 537–554, 1994.
[56] Woods, K., Automated Image Analysis Techniques for Digital Mammog-
raphy, Ph.D. Dissertation, Department of Computer Science and Engi-
neering, College of Engineering, University of South Florida, 1994.
[57] Shen, L., Rangayyan, R. M., and Desautels, J. E. L., Application of shape
analysis to mammographic calcifications, IEEE Trans. Med. Imag., Vol.
13, No. 2, pp. 263–274, 1994.
[58] Jemal, A., Thomas, A., Murray, T., and Thun, M., Cancer statistics 2002,
CA Cancer J. Clin., Vol. 52, pp. 23–47, 2002.
[60] Burke, H. B., Goodman, P. H., Rosen, D. B., Henson, D. E., Weinstein,
J. N., Harrell, F. E., Marks, J. R., Winchester, D. P., and Bostwick, D.
G., Artificial neural networks improve the accuracy of cancer survival
prediction, Cancer, Vol. 79, pp. 857–862, 1997.
[61] Efron, B., The Jackknife, the Bootstrap, and Other Resampling Plans,
Society for Industrial and Applied Mathematics, Philadelphia, PA, 1982.
[62] Tourassi, G. D. and Floyd, C. E., The effect of data sampling on the per-
formance evaluation of artificial neural networks in medical diagnosis,
Med. Decis. Making, Vol. 17, pp. 186–192, 1997.
[63] Harrell, F. E., Lee, K. L., and Mark, D. B., Tutorial in biostatistics, mul-
tivariate prognostic models: Issues in developing models, evaluating
assumptions and adequacy, and measuring and reducing errors, Stat.
Med., Vol. 15, pp. 361–387, 1996.
[64] Chen, D. R., Kuo, W. J., Chang, R. F., Moon, W. K., and Lee, C. C., Use
of the bootstrap technique with small training sets for computer-aided
diagnosis in breast ultrasound, Ultrasound Med. Biol., Vol. 28, No. 7, pp.
897–902, 2002.
[68] Dorfman, D. D., Berbaum, K. S., and Lenth, R. V., Multireader, multicase
receiver operating characteristic methodology: A bootstrap analysis,
Acad. Radiol., Vol. 2, pp. 626–633, 1995.
[69] Nishikawa, R. M., Giger, M. L., Doi, K., Metz, C. E., Yin, F. F.,
Vyborny, C. J., and Schmidt, R. A., Effect of case selection on the
performance of computer-aided detection schemes, Med. Phys., Vol. 21, No. 2,
pp. 265–269, 1994.
[70] Kallergi, M., Carney, G., and Gaviria, J., Evaluating the performance
of detection algorithms in digital mammography, Med. Phys., Vol. 26,
No. 2, pp. 267–275, 1999.
[72] Kallergi, M., Gavrielides, M. A., Gross, W. W., and Clarke, L. P., Evaluation
of a CCD-based film digitizer for digital mammography, SPIE, Vol. 3032,
pp. 282–291, 1997.
Computer-Supported Segmentation
of Radiological Data
14.1 Introduction
1
Computer Vision Laboratory, ETH-Zurich, Switzerland
754 Cattin et al.
becomes questionable not only because of the amount of work, but also with
regard to the poor reproducibility of the results.
Because of the above reasons, computer-assisted segmentation is a very im-
portant problem to be solved in medical image analysis. During the past decades
a huge body of literature has emerged, addressing all facets of the related
scientific and algorithmic problems. A reasonably comprehensive review of all
relevant efforts is clearly beyond the scope of this chapter. Instead, we
analyze the underlying problems and principles and concisely summarize the
most important research results achieved by several generations
of PhD students at the Computer Vision Laboratory of the Swiss Federal
Institute of Technology during the past 20 years.
Figure 14.1: Spin-echo MR image pair (an early echo is shown on the left, a
late echo on the right). In the middle the two-dimensional intensity distribution
(i.e., the frequency of the occurrence of intensities I1 and I2 in the left and right
images) is given.
Figure 14.2: Segmentation of the dual-echo MR image using training. The left
image shows user-defined training regions for the different tissue classes. The
corresponding tessellation of the feature space (spanned by the joint intensity
distribution) is shown in the middle, resulting in the segmentation on the right.
Even the most sophisticated pre- and postprocessing techniques cannot, how-
ever, overcome the inherent limitation of the basically intensity-based methods,
namely the assumption that segmentation can be carried out solely based on
information provided by the actual image. This assumption is fundamentally
wrong, and the radiologist uses a broad range of related knowledge from the fields
of anatomy, pathology, physiology, and radiology in order to arrive at a
reasonable image interpretation. The incorporation of such knowledge into the
algorithms used is therefore indispensable for automatic image segmentation.
Different procedures have been proposed in the literature to approach the
problem of representation and usage of prior knowledge for image analysis.
Because of the enormous complexity of the necessary prior information,
classical methods of artificial intelligence, such as the use of expert systems [12, 13], can
offer only limited support to solve this problem.
be achieved based on the fact that the single components of the vector p are
usually highly correlated. A simplified characterization of the probability density
is possible based on the first- and second-order moments of the distribution (for
a multivariate Gaussian distribution this description is exact). The resulting
descriptors are
\[ \bar{p} = \frac{1}{N} \sum_{j=1}^{N} p_j, \tag{14.1} \]
where the training set consists of the N examples described by the parameter
vectors p_j;
\[ \Sigma = \frac{1}{N-1} \sum_{j=1}^{N} (p_j - \bar{p}) \cdot (p_j - \bar{p})^T \tag{14.2} \]
Figure 14.4: (a) The corpus callosum from an anatomical atlas and (b) the
corresponding region of interest in a midsagittal MR image. On the left image
the connecting line between the anterior commissure (AC) and the posterior
commissure (PC), which is used for normalization, is also shown.
the largest 12 eigenvalues (defined by the 400 original parameters) already rea-
sonably represent the variability (covering about 95% of the full variance).
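The statistical description can be sketched numerically as follows. The code computes the mean of Eq. (14.1), the covariance matrix of Eq. (14.2), and the eigenmodes needed to cover a chosen fraction of the total variance. It is a minimal illustration only: the training data are synthetic (71 examples with 20 parameters instead of 400), and the function name `build_shape_model` and the NumPy-based implementation are assumptions, not the authors' code.

```python
import numpy as np


def build_shape_model(P, variance_fraction=0.95):
    """P holds one parameter vector per row (N rows)."""
    N = P.shape[0]
    p_mean = P.mean(axis=0)                       # Eq. (14.1)
    D = P - p_mean
    Sigma = D.T @ D / (N - 1)                     # Eq. (14.2)
    eigvals, eigvecs = np.linalg.eigh(Sigma)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals = np.clip(eigvals[order], 0.0, None)  # remove tiny negative round-off
    eigvecs = eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cumulative, variance_fraction)) + 1
    return p_mean, eigvals[:k], eigvecs[:, :k]


# Synthetic training set: three dominant modes of variation plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(71, 3)) * np.array([5.0, 3.0, 1.0])
basis = np.linalg.qr(rng.normal(size=(20, 3)))[0]
P = latent @ basis.T + 0.01 * rng.normal(size=(71, 20))

p_mean, lambdas, modes = build_shape_model(P)
```

A new shape is then approximated as `p_mean + modes @ b`, where the coefficient vector `b` has one entry per retained eigenmode.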
This statistical description can easily be used as a parametric deformable
model allowing the fully automatic segmentation of previously unseen images
(apart from the definition of the stereotactic reference system). Based on the
Figure 14.5: Building the active shape model for the corpus callosum. (a) The
71 outlines of the training set normalized in the anatomical coordinate system
defined by the anterior and posterior commissures (AC/PC). The eigenvalues
resulting from the principal component analysis are plotted in (b), while the
eigenvectors corresponding to the three largest eigenvalues are illustrated in
(c), (d), and (e). The deformations which correspond to the eigenmodes cover the
range $-\sqrt{2\lambda_k}$ (light gray) to $+\sqrt{2\lambda_k}$ (dark gray).
Figure 14.6: Segmentation of the corpus callosum. The top-left image shows
the initialization, resulting from the average model and a subsequent match in
the subspace of the largest four deformation modes. The other images (top
right, lower left, and lower right) illustrate the deformation of this model during
optimization using all selected deformation modes, allowing fine adjustments.
The black contour is the result of a manual expert segmentation.
concept of deformable contour models or snakes [34] (see section 14.4.3), the
corpus callosum outline can be searched in the subspace spanned by the selected
number of largest eigenmodes using the usual energy minimization scheme as
illustrated in Fig. 14.6. The efficiency of the fit can be vastly increased by in-
corporating information about the actual appearance of the organ on the radio-
logical image, for example, in the form of intensity profiles along its boundary,
as illustrated in Fig. 14.7(a), leading ultimately to the usage of integrated active
appearance models [35] incorporating the shape and gray-level appearance of
the anatomy in a coherent manner.
The illustrated ideas generalize conceptually very well to three dimensions,
as illustrated on the anatomical model of the basal ganglia of the human brain
shown in Fig. 14.8. The corresponding active shape model has been successfully
applied for the segmentation of neuroradiological MR volumes [36]. Remaining
interactions needed for the establishment of the anatomical coordinate system
Figure 14.7: Intensity profiles along the boundary of a (a) 2-D and a (b) 3-D
object.
[Figure labels: putamen, globus pallidus, thalamus, hippocampus.]
Figure 14.8: Three-dimensional model of the basal ganglia of the human brain.
On the left an individual anatomy from the training set is shown, while the
average model is presented in the right image.
Figure 14.9: Prediction of the position of the prostate for a known bladder
shape.
particular the relative positions with respect to the origin of a common coor-
dinate system of the combined organs must be modeled. There are different
possibilities to choose this reference coordinate system. One possible and intu-
itive choice for a reference coordinate system would be the center of gravity of
one of the organs.
Figure 14.9 shows that the position of the prostate depending on the shape
of the bladder is modeled reliably. Here, the mean bladder–prostate model is
shown on the left. On the right, the first 10 eigenmodes were added so that they
best approximate the bladder. As can be seen in the figure, the position of the
prostate is also approximately found, although no information on the prostate
was included.
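One way to reproduce this kind of prediction is to fit the eigenmode coefficients to the observed organ only and then evaluate the full combined model. The sketch below assumes a linear least-squares fit, which the text does not specify; the model, the index split between "bladder" and "prostate" parameters, and the function name `predict_unobserved` are all hypothetical.

```python
import numpy as np


def predict_unobserved(p_mean, modes, observed_idx, observed_values):
    """Fit mode coefficients to the observed parameters, then rebuild the full vector."""
    A = modes[observed_idx, :]                    # rows of the modes for the observed organ
    rhs = observed_values - p_mean[observed_idx]
    b, *_ = np.linalg.lstsq(A, rhs, rcond=None)   # least-squares coefficients
    return p_mean + modes @ b, b


# Hypothetical combined model: 12 parameters, the first 8 describe the bladder,
# the last 4 the prostate; 3 orthonormal eigenmodes.
rng = np.random.default_rng(1)
modes = np.linalg.qr(rng.normal(size=(12, 3)))[0]
p_mean = rng.normal(size=12)
b_true = np.array([2.0, -1.0, 0.5])
shape_true = p_mean + modes @ b_true

bladder_idx = np.arange(8)
shape_pred, b_est = predict_unobserved(p_mean, modes, bladder_idx, shape_true[bladder_idx])
prostate_pred = shape_pred[8:]
```

Because the modes couple both organs, coefficients estimated from the bladder alone already determine the prostate parameters. On real data the prediction is only approximate, as the figure indicates.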
It should be noted that the establishment of correspondence is still a major
matter of concern while the training set is created, which further complicates
the generation of suitable data collections for training. The intensive manual
work needed is, however, limited to the training phase, while the actual seg-
mentation of the unseen data is fully automatic. The correspondences including
the spatial variability and radiological appearance of the anatomical landmarks
are integrated into the statistical model and will be transferred to the new images
during the fitting process.
path array are then updated iteratively until convergence, i.e., until no more
values in the array are changed in one iteration. The values of the ith row of P
are first adjusted from left to right by the following rule
\[ P(i, j) = \min \begin{pmatrix} P(i-1, j-1) + C(i, j) \\ P(i-1, j) + C(i, j) \\ P(i-1, j+1) + C(i, j) \\ P(i, j-1) + C(i, j) \\ P(i, j) \end{pmatrix}, \tag{14.3} \]
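The left-to-right update of Eq. (14.3) can be sketched as below. Only the rule quoted above is implemented. The initialization (zero cost at a seed pixel, infinity elsewhere) and the function names are assumptions made for illustration, and any complementary sweeps used by the full method are not shown.

```python
import math


def sweep_left_to_right(P, C):
    """Apply Eq. (14.3) once to every row; return True if any value changed."""
    rows, cols = len(C), len(C[0])
    changed = False
    for i in range(rows):
        for j in range(cols):
            candidates = [P[i][j]]
            if i > 0:
                for dj in (-1, 0, 1):             # three neighbors in the row above
                    if 0 <= j + dj < cols:
                        candidates.append(P[i - 1][j + dj] + C[i][j])
            if j > 0:                             # left neighbor in the same row
                candidates.append(P[i][j - 1] + C[i][j])
            best = min(candidates)
            if best < P[i][j]:
                P[i][j] = best
                changed = True
    return changed


def path_array(C, seed):
    """Iterate the sweep until no value in the path array changes."""
    P = [[math.inf] * len(C[0]) for _ in C]
    P[seed[0]][seed[1]] = 0.0                     # assumed initialization at the seed
    while sweep_left_to_right(P, C):
        pass
    return P


# Toy cost map with one expensive pixel in the middle.
C = [[1, 1, 1],
     [1, 9, 1],
     [1, 1, 1]]
P = path_array(C, seed=(0, 0))
```

In this toy example the cheapest accumulated cost to the lower-right pixel bypasses the expensive center pixel.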
shortest path can be computed online while the user drags the mouse, thus
providing direct feedback.
In contrast to Snakes, which will be described next, the selected path is a
global optimum of the cost function defined over the complete image, whereas
Snakes iteratively adapt their contours based on local information, which will
very likely represent a local optimum of their respective target function. A global
optimum is a desirable mathematical property in optimization; it is, however, not
obvious that the best segmentation is equivalent to this definition of an optimum.
14.4.2 Snakes
The second class of algorithms presented intends to overcome some of the
limitations of the graph-based approaches. This class allows the segmentation
line (or surface) to have individual properties that are not related to the
image, but rather to physical properties of some material. The segmentation
process is no longer solely based on the image, but is regularized by the constraints
imposed by the physical model. This model introduces some generic knowledge
of the general shape of organs, encoded in the elasticity of the material. The algorithms
of this section belong to the field of physically based deformable models. In this
section, the basics of Snakes are described first, followed by the presentation of
two extensions that have been proposed during the last few years.
Traditional Snakes used for image segmentation are polygonal curves to
which an objective function is associated. This function combines an “image
term,” which measures either edge strength or edge proximity, and a regular-
ization term, which accounts for tension and curvature. The curve is deformed
in order to optimize a score and, as a result, to match the image edges. The
optimization typically is global, i.e., it takes edge information into account along
the whole curve simultaneously, but only considers local image information, i.e.
image intensities close to the curve.
Snakes were originally introduced by Kass et al. [44] and are modeled as
time-dependent 2-D curves defined parametrically as
\[ v(s, t) = (x(s, t), y(s, t)), \]
where s is proportional to the arc length, t the current time, and x and y the
curve’s image coordinates. The Snake deforms itself as time progresses so as to
minimize an energy term that combines image, internal, and external energies:
\[ E(v) = E_{image}(v) + E_{int}(v) + E_{ext}(v). \]
These energies are derived by integration along the curve. The forces resulting
from the minimization of the image energy Eimage guide the Snake to match
the desired image features. This image energy is derived by integrating over a
potential P(v(s, t)) from an image feature map, i.e.
\[ E_{image}(v) = -\int_0^1 P(v(s, t)) \, ds \tag{14.7} \]
A typical choice is to take P(v(s, t)) to be equal to the magnitude of the image
gradient, that is
\[ P(v(s, t)) = |\nabla I(v(s, t))|, \tag{14.8} \]
where I is either the image itself or the image convolved by a Gaussian kernel. As
for the Live-Wire cost function, many different feature maps have been suggested
in the past [45–47], yet the results are comparable.
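As an illustration of such a feature map, the following sketch computes the gradient magnitude of an optionally Gaussian-smoothed image. The separable smoothing, the kernel radius of about three standard deviations, and the function names are implementation choices assumed here, not taken from the text.

```python
import numpy as np


def gaussian_kernel_1d(sigma):
    radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def smooth(image, sigma):
    """Separable Gaussian smoothing with edge replication at the borders."""
    kernel = gaussian_kernel_1d(sigma)
    pad = len(kernel) // 2
    padded = np.pad(image, pad, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def edge_potential(image, sigma=None):
    """Gradient magnitude of the image, or of its Gaussian-smoothed version."""
    I = np.asarray(image, dtype=float)
    if sigma:
        I = smooth(I, sigma)
    gy, gx = np.gradient(I)                       # derivatives along rows and columns
    return np.hypot(gx, gy)


# Synthetic image with a vertical step edge between columns 15 and 16.
image = np.zeros((32, 32))
image[:, 16:] = 100.0
P = edge_potential(image, sigma=1.5)
```

The potential is largest along the step edge and vanishes in the flat regions, so minimizing the negative integral of P along the curve pulls the Snake toward the edge.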
The internal energy term Eint models the physical behavior of the material
describing the Snake. Using the elastic rod model, Eint is taken to be
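\[ E_{int}(v) = \frac{1}{2} \int_0^1 \left( \alpha(s) \left| \frac{\partial v(s, t)}{\partial s} \right|^2 + \beta(s) \left| \frac{\partial^2 v(s, t)}{\partial s^2} \right|^2 \right) ds. \tag{14.9} \]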
The parameters α(s) and β(s) are arbitrary functions that regulate the curve’s
tension and rigidity and are commonly supplied by the user. It has been shown
that they can be chosen in a fairly image-independent way [46]. Generally α(s)
and β(s) are defined as constant along the curve, i.e. α(s) = α and β(s) = β.
Other authors have proposed to dynamically adjust the values of α and β along
the curve by a feed-back strategy [48].
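For a polygonal Snake, the internal energy of Eq. (14.9) can be approximated with finite differences. The sketch below assumes a closed curve sampled uniformly in s on [0, 1] and constant α and β; the discretization and the function name `internal_energy` are illustrative choices.

```python
import numpy as np


def internal_energy(v, alpha, beta):
    """Discrete version of Eq. (14.9) for a closed polygon v with one vertex per row."""
    v = np.asarray(v, dtype=float)
    n = len(v)
    h = 1.0 / n                                    # parameter spacing, s in [0, 1]
    d1 = (np.roll(v, -1, axis=0) - v) / h          # first derivative (tension term)
    d2 = (np.roll(v, -1, axis=0) - 2 * v + np.roll(v, 1, axis=0)) / h ** 2  # curvature term
    integrand = alpha * (d1 ** 2).sum(axis=1) + beta * (d2 ** 2).sum(axis=1)
    return 0.5 * integrand.sum() * h


# Circle of radius 3 sampled at 400 vertices.
n, radius = 400, 3.0
s = np.arange(n) / n
circle = radius * np.column_stack([np.cos(2 * np.pi * s), np.sin(2 * np.pi * s)])
energy = internal_energy(circle, alpha=0.5, beta=0.01)
```

For a circle of radius R the integral can be evaluated in closed form as ½[α(2πR)² + β(2π)⁴R²], which gives a convenient check of the discretization.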
The segmentation process performed with Snakes is governed by the
minimization of the term $\int E(v) \, dt$. This amounts to using Hamilton’s principle in
variational calculus to derive the Euler–Lagrange equations of motion. The re-
sulting equation of motion for the basic Snake described here can be written
as
where v stands for either (x0 , . . . , xn−1 ) or (y0 , . . . , yn−1 ). The stiffness matrix
K is pentadiagonal and singular, thus Eq. (14.12) cannot be solved by direct
inversion of K. To be able to solve the Snake Eq. (14.10), two different methods
will be described. First an iterative solution is presented, which stems from
the original Snakes framework. Ziplock Snakes, which will be introduced in
section 14.5.1.1, incorporate boundary conditions into Eq. (14.12), so that the
equation can be solved by inversion of K. In the original approach, additional
terms are incorporated into Eq. (14.10) that introduce a temporal development
of the Snake.
Dynamic terms have been introduced to account for kinetic energy and velocity-
dependent friction, leading to a more physically reasonable movement of the
Snake [49]. The kinetic energy term EK (v) is set to
\[ E_K(v) = \frac{1}{2} \int_0^1 \mu(s) \left| \frac{\partial v(s, t)}{\partial t} \right|^2 ds, \tag{14.13} \]
where µ(s) and γ (s) are considered constants along the curve. Forward differ-
ences are used to approximate the time derivatives, resulting in a linear system
of equations which can be formulated in matrix notation as
The role of the damping becomes evident, as the condition of the stiffness matrix
K is improved through the damping term (µ + γ)I. This term has to be selected
in a reasonable manner to prevent the Snake from being “glued,” which would be
the case for |(µ + γ)I| ≫ |K|. With appropriate selections for µ and γ, a
well-conditioned linear system results, so that the term (µ + γ)I + K can be inverted
and Eq. (14.16) solved for v [t] yielding an iterative algorithm to solve the Snake
equation of motion.
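The iteration can be illustrated with a minimal NumPy sketch. It is not the chapter's Eq. (14.16): it assumes the equation of motion µ v_tt + γ v_t + K v = F(v) with constant µ and γ, a closed Snake with constant α and β, and one plausible finite-difference discretization in time. The names `stiffness_matrix`, `snake_step`, and `force` are hypothetical.

```python
import numpy as np

def stiffness_matrix(n, alpha, beta):
    """Pentadiagonal (circulant) stiffness matrix K of a closed Snake with n vertices.

    Assumes constant elasticity alpha and rigidity beta along the curve.
    """
    stencil = {-2: beta, -1: -alpha - 4.0 * beta, 0: 2.0 * alpha + 6.0 * beta,
               1: -alpha - 4.0 * beta, 2: beta}
    K = np.zeros((n, n))
    rows = np.arange(n)
    for offset, weight in stencil.items():
        K[rows, (rows + offset) % n] += weight
    return K

def snake_step(v_prev, v_prev2, K, force, mu, gamma):
    """One time step for a single coordinate vector (x or y) of the Snake.

    Assumed discretization of mu*v_tt + gamma*v_t + K*v = F(v):
      ((mu + gamma) I + K) v[t] = (2 mu + gamma) v[t-1] - mu v[t-2] + F(v[t-1])
    """
    n = K.shape[0]
    A = (mu + gamma) * np.eye(n) + K      # damped, hence invertible
    rhs = (2.0 * mu + gamma) * v_prev - mu * v_prev2 + force(v_prev)
    return np.linalg.solve(A, rhs)

# Example: a circle without image forces contracts under its internal energy.
n = 40
K = stiffness_matrix(n, alpha=1.0, beta=0.5)
theta = 2.0 * np.pi * np.arange(n) / n
x0 = np.cos(theta)
x1 = snake_step(x0, x0, K, lambda v: np.zeros_like(v), mu=0.1, gamma=1.0)
```

Each row of K sums to zero, which is why K alone is singular; the term (µ + γ)I shifts all eigenvalues away from zero. For long contours a banded or sparse solver would be the more natural choice than the dense `np.linalg.solve` used here.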
The Lagrangian formalism makes it possible to unify forces from very different types of
sources. The target function is extended to incorporate more energy terms so
that the final energy to be minimized is of the form $E_{tot} = \sum_i E_i$. Some basic
extensions that have been presented are summarized below.
User interaction is introduced through external forces, so that the Snake can
be modified manually [49]. Two types of forces are commonly used to attract the
Snake to, or repel it from, the current mouse position. Attraction can be modeled
by introducing a virtual spring connecting the mouse position m with a point
p on the Snake, which adds a term $k(p - m)^2$ to the external energy $E_{ext}$, where
the spring constant k is a parameter. To push the Snake away from an undesired
local energy minimum, a “volcano” is introduced by adding an energy function
proportional to $1/r^2$ to the external energy, where r is the distance of a point from
the volcano center. Special care is required to avoid instabilities near
r = 0.
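As a rough illustration, the two interaction energies can be written in a few lines. The function names, the `strength` factor, and the clamping radius `r_min` are assumptions made for this sketch; clamping is only one simple way to handle the singularity at r = 0.

```python
import numpy as np

def spring_energy(p, m, k):
    """Spring term k * |p - m|^2 attracting Snake point p to mouse position m."""
    d = np.asarray(p, dtype=float) - np.asarray(m, dtype=float)
    return k * float(d @ d)

def volcano_energy(p, center, strength, r_min=1e-3):
    """Repulsive term proportional to 1 / r^2 around the volcano center.

    The distance is clamped at r_min (an assumed safeguard) so that the
    energy stays finite when a point coincides with the center.
    """
    r = np.linalg.norm(np.asarray(p, dtype=float) - np.asarray(center, dtype=float))
    r = max(r, r_min)
    return strength / r ** 2

e_spring = spring_energy((3.0, 4.0), (0.0, 0.0), k=2.0)
e_volcano = volcano_energy((0.0, 2.0), (0.0, 0.0), strength=4.0)
e_center = volcano_energy((0.0, 0.0), (0.0, 0.0), strength=1.0)
```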
Similar to the balloon forces, a gravitational force can be defined [51]. Interpreting
the image intensity as the z-dimension, the energy term is defined as
$$E_G(v) = \int_0^1 g(s)\,ds, \qquad (14.18)$$
where g(s) is the gravitation vector (0, 0, −g). The Snake is then accelerated in
the negative z-direction, so that the model seems to “fall” onto an object.
scale, while keeping small details on a finer scale intact. Tamed Snakes combine
the hierarchical modeling and Snake-like edge delineation. They adhere to the
concept of hierarchical shape representations with several scales of resolution
to provide the necessary interactive modeling capabilities while being suitable
for numerical simulations.
Hierarchical modeling consists of (a) an iterative refinement of the geome-
try, which defines a hierarchy of representations and (b) a local detail encoding,
which represents the details on a finer level with respect to the next coarser one.
Subdivision curves are best suited for such hierarchical modeling, as their representation
implicitly comprises a hierarchy of refined shapes. These curves are
constructed using univariate subdivision schemes defined as the iterative application
of an operator that maps a given polygon $P_l = [v_i^{(l)}]$ to a refined polygon
$P_{l+1} = [v_i^{(l+1)}]$, where l denotes the level of the hierarchy. Such an operator is
given by two rules for computing the new so-called even vertices $P_{l+1} = \{v_{2i}^{(l+1)}\}$
and the new odd vertices $P'_{l+1} = \{v_{2i+1}^{(l+1)}\}$.
The Tamed Snakes as introduced by Hug [52] employ the DLG-subdivision
scheme [53], given by
$$v_{2i}^{(l+1)} = v_i^{(l)}, \qquad v_{2i+1}^{(l+1)} = \left(\tfrac{1}{2} + \omega\right)\left(v_i^{(l)} + v_{i+1}^{(l)}\right) - \omega\left(v_{i-1}^{(l)} + v_{i+2}^{(l)}\right). \qquad (14.23)$$
As the even vertices remain unchanged, the subdivision operator has interpolating
behavior. The free tension parameter ω has to be chosen inside the interval
$0 < \omega < \tfrac{1}{8}$
to obtain a limit curve that has a continuous tangent vector.
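One refinement step of Eq. (14.23) for a closed polygon can be sketched as follows. The function name `dlg_subdivide` and the default ω = 1/16 are choices made for this illustration only.

```python
import numpy as np

def dlg_subdivide(vertices, omega=1.0 / 16.0):
    """One step of the 4-point interpolatory scheme on a closed polygon.

    vertices: array of shape (n, dim). Returns an array of shape (2n, dim).
    """
    v = np.asarray(vertices, dtype=float)
    v_prev = np.roll(v, 1, axis=0)     # v_{i-1}
    v_next = np.roll(v, -1, axis=0)    # v_{i+1}
    v_next2 = np.roll(v, -2, axis=0)   # v_{i+2}
    odd = (0.5 + omega) * (v + v_next) - omega * (v_prev + v_next2)
    refined = np.empty((2 * len(v),) + v.shape[1:])
    refined[0::2] = v                  # even vertices are kept (interpolating)
    refined[1::2] = odd                # odd vertices are newly inserted
    return refined

square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
level1 = dlg_subdivide(square)
level2 = dlg_subdivide(level1)
```

For the unit square, the first inserted vertex lies at (0.5, −0.125): the new point is pushed slightly outward from the edge midpoint, which is what rounds the corners over successive levels.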
Local details, i.e., transformations of the vertices $v_i^{(l)}$ from their given position,
are encoded by establishing a local coordinate system $f_i^{(l)}$ in each vertex $v_i^{(l)}$
and by representing the details with respect to this frame.
The subdivision scheme suggests starting the segmentation process with a
reasonably coarse model and to iteratively adjust and refine the control vertices
of the resulting curve. Since the subdivision scheme is interpolating, only the odd
vertices of the current level must be adjusted to align with a correct boundary
position before proceeding to the next finer level. In doing so, the prediction
of the refinement operator improves continuously with respect to the vertex
position on the next finer level and converges to the correct boundary position.
The traditional Snake energy has to be modified to combine the described
coarse-to-fine approach with the Snake-like edge tracking. Tamed Snakes replace
the elastic rod model term (Eq. 14.9) by a spring energy similar to the
external energy term introduced earlier for mouse interaction with Snakes. The
springs are attached to each odd vertex $v_i \in P'_l$, so that the vertices $v_i$ snap
to the correct boundary position within the vicinity of their starting positions
$v_i(0) = v_i|_{t=0}$. Assuming a good initialization, restricting the
search space to the local neighborhood is reasonable. The spring constant $k^{(l)}$
can be increased with each subdivision step to further restrict the search area,
as the error of the subdivision operator’s prediction tends to decrease.
Besides the spring energy, Tamed Snakes incorporate a kinetic energy EK , an
image energy Eimage and a Rayleigh dissipation functional D(vt ). The resulting
Euler–Lagrange equation of motion for Tamed Snakes describing the motion of
all odd mass points vi at time t is
In order to prevent the control vertices from drifting along the boundary, the
gradient of the potential is projected onto the normal direction of the curve,
denoted by ∇⊥ in the previous equation.
The segmentation process using Tamed Snakes is depicted in Fig. 14.11.
The initialization has a strong impact on the additional manual editing required
in finer levels. For the presented case, user interaction was required on a few
points in the first and second subdivision level. Because of the limited number
of vertices in these levels and the ability to better predict new positions at finer
levels, Tamed Snakes prove able to ease the interactive segmentation
task. In the case of clear boundaries, though, the segmentation is not as fast and
elegant as with traditional Snakes.
$$U(v_i) = \frac{1}{n_i} \sum_{j \in N(i)} u_j - v_i, \qquad (14.25)$$
where $n_i$ is the valence of the vertex $v_i$. This operator clearly does not consider
the geometric constellation of the neighborhood of vi , but results in a sim-
ple computation of the Laplace operator with reasonable accuracy for regular
meshes.
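A direct transcription of Eq. (14.25) is short. The sketch below assumes the mesh connectivity is available as one list of neighbor indices per vertex; the function name `umbrella` and this data layout are illustrative choices.

```python
import numpy as np

def umbrella(vertices, neighbors):
    """Umbrella operator: mean of the 1-ring neighbors minus the vertex itself.

    vertices:  array of shape (n, 3)
    neighbors: neighbors[i] lists the indices of the vertices adjacent to i
    """
    v = np.asarray(vertices, dtype=float)
    result = np.empty_like(v)
    for i, ring in enumerate(neighbors):
        result[i] = v[list(ring)].mean(axis=0) - v[i]
    return result

# A small pyramid: apex 0 above a square base formed by vertices 1..4.
verts = np.array([[0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
                  [-1.0, 0.0, 0.0], [0.0, -1.0, 0.0]])
rings = [[1, 2, 3, 4], [0, 2, 4], [0, 1, 3], [0, 2, 4], [0, 1, 3]]
lap = umbrella(verts, rings)
```

At the apex the operator returns (0, 0, −1), i.e., a vector pointing toward the centroid of its ring, which is the smoothing direction for that vertex.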
The approximation of differential operators on arbitrary, discrete 2-D man-
ifolds poses a complex problem. In contrast to the 1-D situation, where the
adjacent vertices are always the left and right neighbors of the current vertex,
there exists no such fixed relationship for 2-D manifolds. Many different meth-
ods for the computation of discrete operators have been proposed in the past
few years [58, 59].
At this point it has to be noted that practical implementations of 3-D snakes
pose additional challenges that have to be considered. In general, the 3-D
14.5.1 Background
An essential prerequisite of interactive segmentation, which affects overall ac-
curacy as well as efficiency of the method, is the sound initialization of the
underlying model. On the one hand, the initial guess has a critical impact on
the quality of the segmentation outcome. On the other hand, tedious and time-
consuming manual initialization procedures forfeit possible time savings of the
segmentation phase.
Although these are well-known facts, emphasis in the literature is usually
placed on extensions of the deformable models, while an initial position rela-
tively close to the desired solution is assumed. Nevertheless, the determination
of such an initial guess with mouse-based interfaces, especially in 3D, poses a
problem.
In the following, two approaches to aid a user in the fast generation of an
initialization for a deformable model are described. In the first method, a priori
shape knowledge is used for efficient initialization, thus reducing the amount of
user interaction. In the second approach, the human–computer interface itself
is enhanced by using multimodal interaction metaphors stemming from virtual
reality techniques.
Ziplock Snakes focus on improving the result obtained from the user’s
initialization [60]. They reduce the requirements on the initialization while
The user interaction closely resembles the Live-Wire approach: start- and
endpoints of single segments have to be specified and the complete contour
is assembled from several segments. The potential discontinuities arising at
the connecting vertices are compensated by the fact that these vertices were
selected on salient edges with clear directions.
Ziplock Snakes improve the overall convergence properties of Snakes and
the probability of getting trapped in an undesirable local minimum is consid-
erably reduced in most cases. However, gaps in object boundaries, misleading
edges, and object outlines with low contrast represent insuperable obstacles
that are quite usual in medical imagery.
The 3-D analogs of Ziplock Snakes are called Velcro surfaces, as their behav-
ior mimics a piece of Velcro that is progressively clamped onto the surface of
interest.
Following a natural extension from 1- to 2-D manifolds, points become
lines. In the case of the Snakes under scrutiny this observation states that the
initialization of 3-D models requires the specification of lines as boundary conditions.
This conclusion compromises the original goal of the Ziplock framework, which is to
reduce the user interaction. From the end-user’s perspective, the specification
of point landmarks for the initialization of the surface models is more desirable
as it can be provided faster and more reliably. Velcro surfaces aim at such a
landmark based initialization.
Assuming a set of anchor points and surface normals are given, a solution for
the homogeneous equation (thin plate problem without external forces, τ = 0,
see Eq. (14.22))
$$\Delta^2 v = 0 \qquad (14.26)$$
$$K^* \vec{v}^* = F^*_{\vec{v}^*}, \qquad (14.27)$$
where $\vec{v}^*$ stands for either $\vec{v}^*_{(1)}$, $\vec{v}^*_{(2)}$, or $\vec{v}^*_{(3)}$, the reduced vectors of the three
coordinate functions, and $K^*$ for an (N − M) × (N − M) sparse matrix that is
now invertible, so that the system can be solved using a sparse linear solver. Closed 3-D objects
can be initialized by selecting at least four non-coplanar points. Of course, since
$F^*_{\vec{v}^*}$ depends on the surface’s current position, Eq. (14.27) cannot, in general, be
solved in closed form.
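How fixing M anchor values turns a singular stiffness matrix into an invertible reduced system can be shown on a small example. The sketch uses a closed 1-D curve with a bending-only stiffness matrix as a stand-in for the surface case, and a dense solver in place of a sparse one; `bending_matrix` and `solve_with_anchors` are hypothetical names.

```python
import numpy as np

def bending_matrix(n):
    """Singular stiffness matrix D^T D of a closed curve (D = second difference)."""
    eye = np.eye(n)
    D = -2.0 * eye + np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1)
    return D.T @ D

def solve_with_anchors(K, anchor_idx, anchor_vals, force=None):
    """Solve K v = F for the free entries of v with the anchor entries prescribed.

    The rows and columns of the M anchors are removed, which leaves an
    (N - M) x (N - M) system; the anchor columns move to the right-hand side.
    """
    n = K.shape[0]
    anchor_idx = np.asarray(anchor_idx)
    anchor_vals = np.asarray(anchor_vals, dtype=float)
    free = np.setdiff1d(np.arange(n), anchor_idx)
    f = np.zeros(n) if force is None else np.asarray(force, dtype=float)
    K_red = K[np.ix_(free, free)]
    rhs = f[free] - K[np.ix_(free, anchor_idx)] @ anchor_vals
    v = np.empty(n)
    v[anchor_idx] = anchor_vals
    v[free] = np.linalg.solve(K_red, rhs)
    return v

K = bending_matrix(12)
anchors = [0, 3, 6, 9]
v = solve_with_anchors(K, anchors, [1.0, 0.0, -1.0, 0.0])
v_const = solve_with_anchors(K, anchors, [2.0, 2.0, 2.0, 2.0])
```

With a position-dependent force the right-hand side changes in every iteration, so in practice this solve would sit inside a loop rather than being applied once.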
The algorithm for the approximation of the underlying image data is analo-
gous to the 2-D case. Starting from the initial shape that is approximately correct
in the neighborhood of the selected anchor points, the image potential is taken
into account progressively for all surface vertices.
orthonormalization χ :
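The covariance matrix $\tilde{\Sigma}$ and the resulting PCA given by the eigensystem of $\tilde{\Sigma}$
can subsequently be calculated according to:
$$\tilde{\Sigma} = \frac{1}{N-1} \sum_{i=1}^{N} \tilde{p}_i \tilde{p}_i^T \overset{\mathrm{PCA}}{=} \tilde{U} \Lambda \tilde{U}^T, \qquad \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_{N-1}) \qquad (14.29)$$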
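The eigensystem in Eq. (14.29) can be obtained numerically from a singular value decomposition of the centered shape vectors. The sketch below is a simplification: it works directly on the shape vectors rather than on coordinates with respect to the orthonormal basis used in the text, and it uses random data as a stand-in for a training population. The name `shape_pca` is hypothetical.

```python
import numpy as np

def shape_pca(shapes):
    """PCA of N shape vectors (rows of `shapes`).

    Returns the mean shape, the eigenvectors as columns of U, and the
    eigenvalues of the covariance matrix in descending order. At most
    N - 1 modes carry variance after centering, so only those are kept.
    """
    P = np.asarray(shapes, dtype=float)
    N = P.shape[0]
    mean = P.mean(axis=0)
    centered = P - mean
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    keep = min(N - 1, len(s))
    eigvals = s[:keep] ** 2 / (N - 1)
    return mean, vt[:keep].T, eigvals

rng = np.random.default_rng(0)
shapes = rng.normal(size=(6, 20))      # 6 synthetic instances, 10 points each
mean, U, lam = shape_pca(shapes)
cov = (shapes - mean).T @ (shapes - mean) / (shapes.shape[0] - 1)
```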
The principal components defining the eigenmodes in shape space are then given
by back projecting the eigenvectors Ũ :
After the statistical analysis of the anatomical shape, this information can be
used to progressively eliminate variation by point-wise fixation of control points.
After defining the coordinate system with the AC–PC line, the initialization starts
with the average model $\bar{p}$ (Fig. 14.13(a)). Additional boundary conditions are
then introduced by moving control vertices to approximately correct positions
on the object border. In the next step, given the a priori shape knowledge and
these constraints, the most natural initialization outline should be chosen. In
the context of PCA, this means choosing the model with minimal Mahalanobis
distance $D_m$.
Figure 14.13: (a) Initial average model and correct segmentation: boundary conditions
for an initial outline are established by prescribing a position for each coarse control
vertex. (b) Basis vectors $R_j$: shape variations caused by adding the basis vectors
defining the x- and y-translation of one point to the average model. The various shapes
are obtained by evaluating $\bar{p} + \omega U r_k$ with ω ∈ {−2, . . . , 2} and k ∈ {x_j, y_j}.
The solution to this task is to find two vectors in variation space describing
decoupled x- and y-translations of a given point j in object space with minimal
overall variations. Once these vectors are found, all possible boundary condi-
tions can be satisfied by adding these appropriately weighted vectors to the
mean shape.
Let rxj and r yj denote the two unknown basis vectors causing unit x- and y-
translation of the point j respectively. The Dm of these two vectors is then given
by
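$$D_m(r_k) = (\tilde{U} r_k)^T \tilde{\Sigma}^{-1} \tilde{U} r_k = r_k^T \Lambda^{-1} r_k = \sum_{e=1}^{N-1} \frac{\left(r_k^{[e]}\right)^2}{\lambda_e}, \qquad k \in \{x_j, y_j\} \qquad (14.31)$$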
Taking into account that x j and yj depend only on two rows of U , we define the
submatrix U j according to the following expression:
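$$\begin{bmatrix} x_j \\ y_j \end{bmatrix} = \begin{bmatrix} \bar{x}_j \\ \bar{y}_j \end{bmatrix} + \begin{bmatrix} u_{2j-1\,\circ} \\ u_{2j\,\circ} \end{bmatrix} b = \begin{bmatrix} \bar{x}_j \\ \bar{y}_j \end{bmatrix} + U_j b, \qquad u_{j\,\circ} := j\text{th row of } U \qquad (14.32)$$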
yj yj u2 j ◦ yj
The vectors lxj and l yj denote the required Lagrange multipliers. To find the
optimum of L(rk , lk ), we calculate the derivatives with respect to all elements
of rxj , r yj , lxj , and l yj and set them equal to zero:
$$\frac{\delta}{\delta r_{x_j}} L(r_{x_j}, l_{x_j}) \overset{!}{=} 0 \;\wedge\; \frac{\delta}{\delta l_{x_j}} L(r_{x_j}, l_{x_j}) \overset{!}{=} 0,$$
$$\frac{\delta}{\delta r_{y_j}} L(r_{y_j}, l_{y_j}) \overset{!}{=} 0 \;\wedge\; \frac{\delta}{\delta l_{y_j}} L(r_{y_j}, l_{y_j}) \overset{!}{=} 0 \qquad (14.34)$$
$$\begin{bmatrix} 2\Lambda^{-1} & -U_j^T \\ U_j & 0 \end{bmatrix} \begin{bmatrix} r_{x_j} & r_{y_j} \\ l_{x_j} & l_{y_j} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ e_{x_j} & e_{y_j} \end{bmatrix}, \qquad 2\Lambda^{-1} = \mathrm{diag}\left(\tfrac{2}{\lambda_1}, \ldots, \tfrac{2}{\lambda_{N-1}}\right) \qquad (14.34')$$
If the basis vectors and the Lagrange multipliers are combined according to
$R_j = [r_{x_j}\; r_{y_j}]$ and $L_j = [l_{x_j}\; l_{y_j}]$, Eq. (14.34′) can be rewritten as two linear matrix
equations:
$$2\Lambda^{-1} R_j = U_j^T L_j \qquad (14.35)$$
$$U_j R_j = I \qquad (14.36)$$
The two basis vectors $r_{x_j}$ and $r_{y_j}$ (resulting from simple algebraic operations on
Eqs. (14.35) and (14.36)) are then given by
$$R_j = [r_{x_j}\; r_{y_j}] = \Lambda U_j^T \left[ U_j \Lambda U_j^T \right]^{-1} \qquad (14.37)$$
While rxj describes the translation of x j by one unit with constant yj and minimal
shape variation, r yj alters yj correspondingly. The resulting effect caused by
adding these shape-based basis vectors to the average model is illustrated in
Fig. 14.13(b). The most probable shape $\check{p}$ given the displacement $[\Delta x_j, \Delta y_j]^T$
for the control vertex j is consequently determined by
$$\check{p} = \bar{p} + U R_j \begin{bmatrix} \Delta x_j \\ \Delta y_j \end{bmatrix}. \qquad (14.38)$$
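Equations (14.37) and (14.38) translate almost literally into NumPy. The following sketch uses synthetic shape vectors stored as (x_0, y_0, x_1, y_1, ...) with 0-based point indices, so the two rows of U belonging to point j are 2j and 2j + 1; this layout, the random data, and the function names are assumptions made for the example.

```python
import numpy as np

def translation_basis(U, lam, j):
    """Return U_j and R_j = Lambda U_j^T (U_j Lambda U_j^T)^{-1} for point j."""
    U_j = U[2 * j: 2 * j + 2, :]
    Lam = np.diag(lam)
    R_j = Lam @ U_j.T @ np.linalg.inv(U_j @ Lam @ U_j.T)
    return U_j, R_j

def most_probable_shape(mean, U, R_j, displacement):
    """Shape with minimal Mahalanobis distance that moves point j as requested."""
    return mean + U @ R_j @ np.asarray(displacement, dtype=float)

# Synthetic population: 8 instances of a 6-point outline (12 coordinates).
rng = np.random.default_rng(1)
shapes = rng.normal(size=(8, 12))
mean = shapes.mean(axis=0)
_, s, vt = np.linalg.svd(shapes - mean, full_matrices=False)
U = vt[:7].T                     # 7 = N - 1 eigenmodes
lam = s[:7] ** 2 / 7.0           # corresponding eigenvalues

j = 2
U_j, R_j = translation_basis(U, lam, j)
delta = np.array([1.0, -0.5])
p_check = most_probable_shape(mean, U, R_j, delta)
```

Because U_j R_j = I, the selected point moves by exactly the prescribed displacement, while all other points follow according to the correlations captured in the statistic.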
After obtaining the most probable shape for a given control vertex, we now
have to ensure that subsequent modifications do not alter the adjusted vertex.
Therefore, we remove the components from the statistic that cause a displacement
of the point. The first step is to subtract the basis vectors $R_j$, weighted
by the example-specific displacement $U_j b_i = [\Delta x_j, \Delta y_j]_i^T$, from the parameter
representation $b_i$ of each instance i:
$$b_i^{\hat{j}} = b_i - R_j U_j b_i = \left(I - R_j U_j\right) b_i \qquad \forall i \in \{1, \ldots, N\} \qquad (14.39)$$
Doing so for all instances, we obtain a new description of our population $b_i^{\hat{j}}$
which is invariant with respect to point j. An example of the removal of the
variation is visualized in Fig. 14.14.
Figure 14.14: The first five one-point invariant eigenmodes after the subtraction
of the first principal landmark. The various shapes are obtained by evaluating
$\bar{p} + \omega \sqrt{\lambda_k^{\hat{j}}}\, u_k^{\hat{j}}$ with ω ∈ {−2, . . . , 2} and k ∈ {1, . . . , 5}.
In order to further improve the point-wise elimination process, the control
point selection strategy has to be optimized. This can be done by choosing con-
trol vertices, or principal landmarks, which carry as much shape information
as possible.
We define the reduction potential of a vertex jk being a candidate to serve
as the kth principal landmark by
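$$P(j_k) = -\sum_{l=1}^{N-1-2(k-1)} \left(\tilde{\sigma}_l^{\hat{s}_k}\right)^2 = -\mathrm{tr}\left(\tilde{\Sigma}^{\hat{s}_k}\right) = -\mathrm{tr}\left(\Lambda^{\hat{s}_k}\right), \qquad (14.40)$$
with sequence $\hat{s}_k = \{\hat{j}_1, \ldots, \hat{j}_k\}$ denoting the set of the k point-indices of the
principal landmarks that have been removed from the statistic in the given order,
and the superscript $\circ^{\hat{s}_k}$ indicating the value of ∘ if the principal landmarks $\hat{s}_k$ have
been removed.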
In order to remove as much variation as possible, we consequently choose
as the first principal landmark the point that holds the largest reduction
potential: $\hat{j}_1 = \arg\max_j [P(j)]$. This selection strategy was applied to obtain the
eigenmodes shown in Fig. 14.14. Further application of the selection strategy
to the example yields the optimal second and third principal landmarks
(Fig. 14.15).
Figure 14.15: Remaining variability after vertex elimination of (a) two and (b)
three principal landmarks.
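The elimination step of Eq. (14.39) and the selection of the first principal landmark can be combined in a short sketch. It assumes synthetic data, shape vectors stored as (x_0, y_0, x_1, y_1, ...) with 0-based point indices, and evaluates the reduction potential as the negative trace of the sample covariance of the reduced parameter vectors; all function names are hypothetical.

```python
import numpy as np

def eliminate_point(B, U, lam, j):
    """Apply (I - R_j U_j) to every parameter vector b_i (columns of B)."""
    U_j = U[2 * j: 2 * j + 2, :]
    Lam = np.diag(lam)
    R_j = Lam @ U_j.T @ np.linalg.inv(U_j @ Lam @ U_j.T)
    return (np.eye(len(lam)) - R_j @ U_j) @ B

def reduction_potential(B, U, lam, j):
    """Negative total variance that remains after point j has been fixed."""
    B_hat = eliminate_point(B, U, lam, j)
    cov = B_hat @ B_hat.T / (B.shape[1] - 1)
    return -np.trace(cov)

def first_principal_landmark(B, U, lam):
    """Index of the point with the largest reduction potential, plus all potentials."""
    n_points = U.shape[0] // 2
    potentials = [reduction_potential(B, U, lam, j) for j in range(n_points)]
    return int(np.argmax(potentials)), potentials

# Synthetic population: 8 instances of a 6-point outline.
rng = np.random.default_rng(2)
shapes = rng.normal(size=(8, 12))
centered = shapes - shapes.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)
U = vt[:7].T
lam = s[:7] ** 2 / 7.0
B = U.T @ centered.T               # one parameter vector b_i per column

j_best, potentials = first_principal_landmark(B, U, lam)
B_hat = eliminate_point(B, U, lam, j_best)
```

Repeating the procedure on the reduced statistic would give further landmarks; that step requires recomputing the eigensystem of the reduced population and is left out here.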
The described framework can now be used for efficient initialization of de-
formable models. Examples of the initialization process are shown in Fig. 14.16.
The left image shows how the initial average model converges toward a
sound approximation by adding control vertices. The right image depicts four
additional examples with adjusted principal landmarks. Generally speaking,
Figure 14.16: (a) Generation of an initial outline for segmentation. Shape in-
stance in black and fitted initializations in gray with an increasing number of
fitted principal landmarks. (b) Initial shapes with four adjusted principal land-
marks for the segmentation of four randomly chosen instances.
14.6.1 Background
Extensive research has been invested in recent years into improving interac-
tive segmentation algorithms. It is, however, striking that the human–computer
interface, a substantial part of an interactive setup, is usually not investigated.
Although the need for understanding the influence of human–computer inter-
action on interactive segmentation is recognized, only very little research has
been done in this direction.
In order to improve information flow and to achieve optimal cooperation be-
tween interactive image analysis algorithms and human operators, we evaluated
closed-loop systems utilizing new man–machine interfacing paradigms [62].
The mouse-based, manual initialization of deformable models in two dimen-
sions represents a major bottleneck in interactive segmentation. In order to
overcome the limitations of 2-D viewing and interaction the usage of direct
3-D interaction is inevitable. However, adding another dimension to user interaction
causes several problems. Editing, controlling, and interacting in three
dimensions often overwhelm the perceptual powers of a human operator. Furthermore,
today’s desktop metaphors are based on 2-D interaction and cannot
easily be extended to the volumetric case. Finally, the visual channel of the
human sensory system is not suitable for the perception of volumetric data.
However, these major drawbacks are valid only in terms of interactive sys-
tems that are based on 2-D Window–Icon–Mouse–Pointer (WIMP) interfaces
that solely rely on the visual sense of the human operator. In order to alleviate
the limitations of visual-only systems we may try to enhance the interaction
process with additional sensory feedback. The fundamental challenge here is
to find efficient ways for information flow between user and computer. Several
sensory channels could be addressed, but due to the 3-D nature of the problem,
the most obvious choice is the haptic channel. As an example, a multimodal sys-
tem using visual and haptic volumetric rendering will be described, which was
successfully applied to the segmentation of the intestinal system (Fig. 14.17).
for each (x, y, z) ∈ W, where d denotes the Euclidean distance from a voxel that
is part of the tubular structure to a voxel of the surrounding tissue $\bar{W} = V \setminus W$.
In the next step we negate the 3-D distance map and approximate the gradi-
ents by central differences. Moreover, to ensure the smoothness of the computed
forces, we apply a 5 × 5 × 5 binomial filter. This force map is precomputed be-
fore the actual interaction to ensure a stable force-update. Because the obtained
forces are located at discrete voxel positions, we have to do a trilinear interpo-
lation to obtain the continuous gradient force map needed for stable haptic
interaction. Furthermore, we apply a low-pass filter in time to further suppress
instabilities. The computed forces can now be utilized to guide a user on a path
close to the centerline of the tubular structure. In the optimal case of good data
quality, the user falls through the dataset guided along the 3-D ridge created by
the forces. However, if the 3-D ridge does not exactly follow the centerline the
user can guide the 3-D cursor by exerting a gentle force on the haptic device to
leave the precalculated curve.
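The precomputation of the guiding forces can be sketched with NumPy alone. Several points are assumptions of this illustration rather than details taken from the text: the distance map is computed by brute force (adequate only for tiny volumes), the force is taken as the negative gradient of the negated distance map so that it points toward the centerline, and the temporal low-pass filter is omitted. All names are hypothetical.

```python
import numpy as np

BINOMIAL_5 = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def distance_map(inside):
    """Euclidean distance of every inside voxel to the nearest outside voxel."""
    outside_pts = np.argwhere(~inside).astype(float)
    dist = np.zeros(inside.shape)
    for idx in np.argwhere(inside):
        dist[tuple(idx)] = np.sqrt(((outside_pts - idx) ** 2).sum(axis=1)).min()
    return dist

def force_map(inside):
    """Smoothed force field of shape (X, Y, Z, 3) on the voxel grid."""
    potential = -distance_map(inside)            # negated distance map
    # np.gradient uses central differences in the interior of the volume.
    force = -np.stack(np.gradient(potential), axis=-1)
    for axis in range(3):                        # separable 5 x 5 x 5 binomial filter
        force = np.apply_along_axis(
            lambda a: np.convolve(a, BINOMIAL_5, mode="same"), axis, force)
    return force

def trilinear(field, p):
    """Trilinear interpolation of a voxel-based vector field at position p."""
    upper = np.array(field.shape[:3]) - 1
    p = np.clip(np.asarray(p, dtype=float), 0, upper)
    i0 = np.minimum(np.floor(p).astype(int), upper - 1)
    t = p - i0
    result = np.zeros(field.shape[3:])
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((t[0] if dx else 1 - t[0]) *
                     (t[1] if dy else 1 - t[1]) *
                     (t[2] if dz else 1 - t[2]))
                result = result + w * field[i0[0] + dx, i0[1] + dy, i0[2] + dz]
    return result

# Straight tube of radius 3 along the z-axis in a 9 x 9 x 12 volume.
x, y, z = np.meshgrid(np.arange(9), np.arange(9), np.arange(12), indexing="ij")
tube = (x - 4) ** 2 + (y - 4) ** 2 <= 9
F = force_map(tube)
f_between = trilinear(F, (2.5, 4.0, 6.0))
```

In this example the x-component of the force is positive left of the tube axis and negative right of it, so a cursor released off-center is pulled back toward the centerline.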
While moving along the path, points near the centerline are set. These points
can be used to obtain a B-spline, which approximates the path. In the next step
this extracted centerline is used to generate a good initialization for a deformable
surface model. To do this, a tube with varying thickness is created according to
the precomputed distance map. This resulting object is then deformed subject
to a thin plate under tension model. Details of the algorithmic background of
this deformable model approach are described in section 14.4.2.
Because of the good initialization, only a few steps are needed to approxi-
mate the desired object [65]. The path initialization can be seen in Fig. 14.18(a).
Note, that the 3-D data is rendered semitransparent to visualize the path in the
lower left portion of the data. Figure 14.18(b) depicts the surface model during
deformation.
In order to further improve the interaction with complicated datasets a step-
by-step segmentation approach can be adopted by hiding already segmented
loops. This allows a user to focus attention on the parts that still have to be
extracted. For this purpose the 3-D surface model is turned back into vox-
els and removed from the dataset (Fig. 14.19). This step can be carried out in
14.7 Conclusions
In spite of the enormous research and development effort invested into finding
satisfactory solutions during the past decades, the problem of medical image
segmentation (as image segmentation in general) is still an unsolved problem
today, and no single approach is able to successfully address the whole range of
possible clinical problems. The basic reason for this rather disappointing status
lies in the difficulties in representing and using the prior information in its full
extent, which is necessary to successfully solve the underlying task in scene
analysis and image interpretation.
While first results already clearly demonstrate the power of the model-based
techniques, generic segmentation systems capable of analyzing a broad range of
radiological data even under severely pathological conditions cannot be expected
in the near future. Currently available methods, like those discussed in this chapter,
work only within a very narrow, specialized problem domain, and
fundamental difficulties have to be expected when trying to establish more generic
platforms. The practically justifiable number of examples in the training sets
can cover only very limited variations of the anatomy, and such models are usually applied
to analyzing images without large pathological changes. There is still a long way
to go before the prior knowledge involved in the interpretation of radiological
images can be represented and used by a computer at a level of complexity
sufficient to reasonably imitate the
everyday work of an experienced clinical radiologist. Accordingly, in the near
future only a well-balanced cooperation between computerized image analysis
methods and a human operator will be able to efficiently address many clinically
relevant segmentation problems. Better understanding of the perceptual and
technical principles of man–machine interaction is therefore a fundamentally
important research area, which should now receive significantly more attention than
it has in the past.
Bibliography
[1] Gerig, G., Martin, J., Kikinis, R., Kübler, O., Shenton, M., and Jolesz,
F., Automatic segmentation of dual-echo MR head data, In: Proceed-
ings of Information Processing in Medical Imaging’91, Wye, GB, 1991,
pp. 175–187.
[2] Duda, R. and Hart, P., Pattern Classification and Scene Analysis, Wiley,
New York, 1973.
[3] Shattuck, D., Sandor-Leahy, S., Schaper, K., Rottenberg, D., and Leahy,
R., Magnetic resonance image tissue classification using a partial vol-
ume model, Neuroimage, Vol. 13, pp. 856–876, 2001.
[4] S. Ruan, J. X., C. Jaggi and Bloyet, J., Brain tissue classification of mag-
netic resonance images using partial volume modeling, IEEE Trans.
Med. Imaging, Vol. 19, No. 12, pp. 172–186, 2000.
[5] Gerig, G., Kübler, O., Kikinis, R., and Jolesz, F., Nonlinear anisotropic
filtering of MRI data, IEEE Trans. Med. Imaging, Vol. 11, No. 2, pp. 221–
232, 1992.
[6] Guillemaud, R. and Brady, M., Estimating the bias field of MR images,
IEEE Trans. Med. Imaging, Vol. 16, No. 3, pp. 238–251, 1997.
[7] M. Styner, G. S., Ch. Brechbühler and Gerig, G., Parametric estimate of
intensity inhomogeneities applied to MRI, IEEE Trans. Med. Imaging,
Vol. 19, No. 3, pp. 153–165, 2000.
[8] Wells, W., Grimson, W., Kikinis, R., and Jolesz, F., Adaptive segmentation
of MRI data, IEEE Trans. Med. Imaging, Vol. 15, No. 4, pp. 429–443, 1996.
[9] Van Leemput, K., Maes, F., Bello, F., Vandermeulen, D., Colchester, A.,
and Suetens, P., Automated segmentation of MS lesions from multi-
channel MR images, In: Proceedings of Second International Confer-
ence on Medical Image Computing and Computer-Assisted Interven-
tions, MICCAI’99, Taylor, C. and Colchester, A., eds., Lecture Notes
in Computer Science, Vol. 1679, Springer-Verlag, New-York, pp. 11–21,
1999.
[10] Li, S. Z., Markov Random Field Modeling in Computer Vision, Springer-
Verlag, Tokyo, 1995.
[15] Evans, A. C., Collins, D. L., and Holmes, C. J., Toward a probabilistic atlas
of human neuroanatomy, In: Brain Mapping: The Methods, Mazziotta,
J. C. and Toga, A. W., eds., Academic Press ISBN 0126930198 pp. 343–
361, 1996.
[16] Jiang, H., Holton, K., and Robb, R., Image registration of multimodal-
ity 3-D medical images by chamfer matching, In: Proceedings of
Biomedical. Image Processing and 3D Microscopy, SPIE, Vol. 1660,
SPIE, The International Society of Optical Engineering pp. 356–366,
1992.
[17] Christensen, G., Miller, M., and Vannier, M., Individualizing neu-
roanatomical atlases using a massively parallel computer, IEEE Com-
puter, pp. 32–38, January 1996.
[18] Bookstein, F., Shape and the information in medical images: A decade
of the morphometric synthesis, Comput. Vision. Image Understand.,
Vol. 66, No. 2, pp. 97–118, 1997.
[19] Evans, A., Kamber, M., Collins, D., and MacDonald, D., An MRI-based
probabilistic atlas of neuroanatomy, In: Magnetic Resonance Scanning
and Epilepsy, Shorvon, S., ed., Plenum Press, New York, pp. 263–274,
1994.
[20] Wang, Y. and Staib, L., Elastic model based non-rigid registration in-
corporating statistical shape information, In: Proc. First Int. Conf. on
Medical Image Comp. and Comp. Assisted Interventions, Vol. 1679 of
Lecture Notes in Comp. Sci., pp. 1162–1173, Springer-Verlag, New York,
1998.
[21] Terzopoulos, D. and Metaxas, D., Dynamic 3D models with local and
global deformations: Deformable superquadrics, IEEE Trans. Pattern
Anal. Mach. Intell., Vol. 13, No. 7, pp. 703–714, 1991.
[23] Staib, L. and Duncan, J., Boundary finding with parametrically de-
formable models, IEEE Trans. Pattern Anal. Mach. Intell., Vol. 14, No. 11,
pp. 1061–1075, 1992.
[24] Brechbühler, C., Gerig, G., and Kübler, O., Parametrization of closed
surfaces for 3-D shape description, CVGIP: Image Understand., Vol. 61,
pp. 154–170, 1995.
[25] Cootes, T., Cooper, D., Taylor, C., and Graham, J., Training models
of shape from sets of examples, In: Proceedings of The British Ma-
chine Vision Conference (BMVC) Springer-Verlag, New-York, pp. 9–18,
1992.
[26] Cootes, T. and Taylor, C., Active shape models—‘Smart snakes,’ In: Pro-
ceedings of The British Machine Vision Conference (BMVC) Springer-
Verlag, New-York, pp. 266–275, 1992.
[27] Rangarajan, A., Chui, H., and Bookstein, F., The softassign pro-
crustes matching algorithm, information processing in medical imaging,
pp. 29–42, 1997. Available at http://noodle.med.yale.edu/anand/ps/
ipsprfnl.ps.gz.
[30] Kelemen, A., Szekely, G., and Gerig, G., Elastic model-based segmen-
tation of 3-d neuroradiological data sets, IEEE Trans. Med. Imaging,
Vol. 18, pp. 828–839, 1999.
[31] Staib, L. and Duncan, J., Model-based deformable surface finding for
medical images, IEEE Trans. Med. Imaging, Vol. 15, No. 5, pp. 1–12,
1996.
[32] Cootes, T. F., Taylor, C. J., Cooper, D. H., and Graham, J., Active shape
models—Their training and application, Comput. Vision Image Under-
stand., Vol. 61, No. 1, pp. 38–59, 1995.
[33] Székely, G., Kelemen, A., Brechbühler, C., and Gerig, G., Segmentation
of 2-D and 3-D objects from MRI volume data using constrained elas-
tic deformations of flexible Fourier contour and surface models, Med.
Image Anal., Vol. 1, No. 1, pp. 19–34, 1996.
[35] Cootes, T., Edwards, G., and Taylor, C., Active appearance models, In:
Proceedings of the European Conference on Computer Vision, Vol. 2,
Springer-Verlag, New-York, pp. 484–498, 1998.
[36] Kelemen, A., Szekely, G., and Gerig, G., Elastic model-based segmen-
tation of 3-d neuroradiological data sets, IEEE Trans. Med. Imaging,
Vol. 18, No. 10, pp. 828–839, 1999.
[39] Fischler, M., Tenenbaum, J., and Wolf, H., Detection of roads and linear
structures in low-resolution aerial imagery using a multisource knowledge
integration technique, Comput. Graph. Image Process., Vol. 15,
pp. 201–233, 1981.
[40] O’Donnell, L., Westin, C.-F., Grimson, W. E. L., Ruiz-Alzola, J., Shenton,
M. E., and Kikinis, R., Phase-based user-steered image segmentation, In:
International Conference on Medical Image Computing and Computer-Assisted
Intervention (MICCAI), 2001, pp. 1022–1030.
[42] Falcão, A., Udupa, J., and Miyazawa, F., An ultra-fast user-steered image
segmentation paradigm: Live wire on the fly, IEEE Trans. Med. Imaging,
Vol. 19, No. 1, pp. 55–62, 2000.
[44] Kass, M., Witkin, A., and Terzopoulos, D., Snakes: Active contour mod-
els, Int. J. Comput. Vision, Vol. 1, No. 4, pp. 321–331, 1988.
[46] Fua, P. and Leclerc, Y., Model driven edge detection, Mach. Vision Appl.,
Vol. 3, pp. 45–56, 1990.
[47] Leymarie, F. and Levine, M., Tracking deformable objects in the plane
using an active contour model, IEEE Trans. Pattern Anal. Mach. Intell.,
Vol. 15, No. 6, pp. 617–634, 1993.
[50] Cohen, L. and Cohen, I., A finite element method applied to new active
contour models and 3D reconstructions, In: Proceedings of the Third
International Conference on Computer Vision, Osaka, Japan, Dec. 1990,
pp. 587–591.
[51] Cohen, I., Cohen, L. D., and Ayache, N., Using deformable surfaces to
segment 3-D images and infer differential structures, Comput. Vision
Graph. Image Process., Vol. 56, No. 2, pp. 242–263, 1992.
[52] Hug, J., Brechbühler, C., and Székely, G., Tamed snake: A particle system
for robust semi-automatic segmentation, In: MICCAI, 1999, pp. 106–115.
[53] Dyn, N., Levin, D., and Gregory, J., A 4-point interpolatory subdivision
scheme for curve design, Comput. Aided Geomet. Design, Vol. 4, No. 4,
pp. 257–268, 1987.
[55] Dyn, N., Levin, D., and Gregory, J., A butterfly subdivision scheme for
surface interpolation with tension control, Trans. Graph., Vol. 9, No. 2,
pp. 160–169, 1990.
[56] Zorin, D., Schröder, P., and Sweldens, W., Interpolating subdivision for
meshes of arbitrary topology, In: SIGGRAPH, August 1996, pp. 189–192.
[58] Schneider, R. and Kobbelt, L., Geometric fairing of irregular meshes for
free-form surface design, Comput. Aided Geomet. Design, Vol. 18, No. 4,
pp. 359–379, 2001.
[59] Desbrun, M., Meyer, M., Schroder, P., and Barr, A., Discrete Differential-
Geometry Operators in nD, preprint, The Caltech Multi-Res Modeling
Group, 2000.
[60] Neuenschwander, W., Fua, P., Székely, G., and Kübler, O., Initializing
snakes, In: IEEE Computer Society Conference on Computer Vision
and Pattern Recognition, June 1994, pp. 658–663.
[61] Hug, J., Brechbühler, C., and Székely, G., Model-based initialisation for
segmentation, In: Proceedings of 6th European Conference on Com-
puter Vision (ECCV 2000), Part II, Vernon, D., ed., Lecture Notes in
Computer Science, Springer, Berlin, pp. 290–306, 2000.
[63] Rosenberg, L., Virtual fixtures: Perceptual tools for telerobotic manipu-
lation, In: IEEE Virtual Reality Annual International Symposium, 1993,
pp. 76–82.
[64] Sayers, C. and Paul, R., An operator interface for teleprogramming em-
ploying synthetic fixtures, Presence Teleoperat. Virtual Environ., Vol. 3,
pp. 309–320, 1994.
[65] Harders, M. and Székely, G., New paradigms for interactive 3D volume
segmentation, J. Visual. Comput. Animation, Vol. 13, pp. 85–95, 2002.
[66] Karabassi, E.-A., Papaioannou, G., and Theoharis, T., A fast depth-buffer-
based voxelization algorithm, J. Graph. Tools, Vol. 4, No. 4, pp. 5–10,
1999.
The Editors
Dr. Jasjit S. Suri received his BS in computer engineering with distinction from
Maulana Azad College of Technology, Bhopal, India, his MS in computer sciences
from the University of Illinois, Chicago, and his Ph.D. in electrical engineering
from the University of Washington, Seattle. He has been working in the field of
computer engineering/imaging sciences for 20 years. He has published more than
125 technical papers in body imaging. He is a lifetime member of the research
engineering societies Tau Beta Pi, Eta Kappa Nu, and Sigma Xi, a member of the
NY Academy of Sciences, the Engineering in Medicine and Biology Society (EMBS),
SPIE, and ACM, and a senior member of IEEE. He serves on the editorial board of,
or as a reviewer for, several international journals such as Real Time Imaging,
Pattern Analysis and Applications, Engineering in Medicine and Biology,
Radiology, Journal of Computer Assisted Tomography, IEEE Transactions on
Information Technology in Biomedicine, and IASTED Board.
in New Jersey, and the director of the Bay Networks Authorized Center in Prince-
ton. He has also served as an adjunct professor of biomedical engineering at the
New Jersey Institute of Technology, a clinical associate professor of health
informatics, a visiting professor at the University of Brno in the Czech
Republic, and an honorary professor of health sciences at Tsinghua University
in China.
As an educator, researcher, and technologist, Prof. Laxminarayan has been
involved in biomedical engineering and information technology applications in
medicine and health care for over 25 years and has published over 250 scien-
tific and technical articles in international journals, books, and conferences.
His expertise lies in the areas of biomedical information technology,
high-performance computing, digital signal and image processing, bioinformatics, and
physiological systems analysis. He is the coauthor of the book State-of-the-Art
PDE and Level Sets Algorithmic Approaches to Static and Motion Imagery Seg-
mentation published by Kluwer Publications and the book Angiography Imag-
ing: State-of-the-Art-Acquisition, Image Processing and Applications Using
Magnetic Resonance, Computer Tomography, Ultrasound and X-ray, Emerg-
ing Mobile E-Health Systems published by the CRC Press and two volumes of
Handbook of Biomedical Imaging to be published by Kluwer Publications. He
has also worked as the editor/coeditor of 20 international conferences and has
served as a keynote speaker in international conferences in 13 countries.
He is the founding editor-in-chief and editor emeritus of IEEE Transactions
on Information Technology in Biomedicine. He served as an elected member
of the administrative and executive committees in the IEEE Engineering in
Medicine and Biology Society and as the society’s vice president for 2 years. His
other IEEE roles include his appointments as program chair and general confer-
ence chair of about 20 EMBS and other IEEE conferences, an elected member of
the IEEE Publications and Products Board, member of the IEEE Strategic Plan-
ning and Transnational Committees, member of the IEEE Distinguished Lecture
Series, delegate to the IEEE USA Committee on Communications and Informa-
tion Policy (CCIP), U.S. delegate to the European Society for Engineering in
Medicine, U.S. delegate to the General Assembly of the IFMBE, IEEE delegate
to the Public Policy Commission and the Council of Societies of the AIMBE,
fellow of the AIMBE, senior member of IEEE, life member of Romanian Society
of Clinical Engineering and Computing, life member of Biomedical Engineering
Society of India, and U.S. delegate to IFAC and IMEKO Councils in TC13. He was
recently elected to the Administrative Board of the International Federation for
Index
Fuzzy leader clustering: see Adaptive fuzzy leader clustering
Fuzzy membership function, 471, 472
Fuzzy segmentation, 663–667, 697–698
  3-D, 691–697
  multiobject, 677, 678
  multiseeded, 667–668
    accuracy and robustness, 689–691
    algorithm, 677–680
    experiments, 680–689
    theory, 668–677
Fuzzy spel affinity, 664, 665, 667

Gabor filters, 77–80, 98
  responses for different filters of spectrum, 79
Gaussian blurring, anisotropic
  based on voxel anisotropy, 579–580
Gaussian derivative of MR imaged sheet structure, 575
Gaussian function, 84, 85, 122, 378, 538
  derivatives of, 71–73
Gaussian mixture model (GMM), 84; see also Weighted Gaussian mixture model
  supervised segmentation with, 626
  unsupervised segmentation with, 626
Gaussian smoothed volume, 558
Gaussian standard deviation (SD), effects of
  in postprocessing, 577–579
Geometry-based methods, 3, 4
Geometry model, 8–10, 43
Gibbs’ model, 468–469
Glagov effect, 454
Grade of membership, 664, 667
Graph segmentation method (GSM), 459, 473–476
  classification system, 490, 491
  decision criterion D for, 475
Gray-level co-occurrence matrix (GLCM), 455
Gray-level transformation, 327–329
Gray matter, 281, 282
Gray-scale features
  extracted from breast profile, 616
  extracted from suspicious ROI, 615
Ground truth files, generation of electronic, 204–206
Ground truth overlays
  abnormal, 462–465
  normal, 462–463, 466

Helical CT imaging characteristics, 188–191, 194
  of normal pancreas and pancreatic adenocarcinoma, 191–194
Helical CT imaging parameters, 188–191
Hemoglobin, extinction coefficient of, 321
Hessian matrix: see Three-dimensional (3-D) local structures
Hidden Markov random field (HMRFU), 608, 609
Highest confidence first (HCF), 380, 383–385; see also QHCF
Hip joint cartilage thickness quantification, 566–567
Hotelling transform: see Principal components analysis
Hough transform, 122–123
Human-computer interaction, improved
  background, 787
  multimodal segmentation, 788–791

Image analysis, steps and ultimate goal of
  in clinical environment, 112–114
Image enhancement, 319
Image generation process, 476–478
Image segmentation: see Segmentation
Imaging modalities, 112
Impulse response functions (IRFs), 144
Incrementation, 189–190
Initial region merging process, 416
Initialization
  deformable model, 778–787
  model-based, 781–782
Initialization process, 786–787
Insight Segmentation and Registration Toolkit (ITK), 196–198
Intensity-based automatic segmentation, 754–756
Intensity-based methods, 2–4
Intensity model, 4–7, 13, 43
Internal forces, 127
Intravascular ultrasound (IVUS) images, 57–58; see also Texture classification for intravascular tissue characterization
  response to different measures of co-occurrence matrix, 65
Ischemia, distal, 232
Ischemic attack, transient, 457
Iterated conditional modes (ICM), 380, 382–383

K-means algorithm, 271, 276–278, 282, 284
K-nearest neighbors, 81–82, 97–99, 422, 426
Kalman filtering scheme, 238–240
Karhunen-Loève transform: see Principal components analysis
Knowledge-based components in medical imaging CAD schemes, 592–593
  knowledge representation
    by image grouping on various criteria, 594–595
    learnt from user interactions, 596
Nearest neighbors, 61; see also K-nearest neighbors
Neural network-based methods (segmentation), 299
Neural networks; see also Artificial neural networks
  multistage, 595–596
Neurovascular visualization from 3-D MR data, 543–544
Nitroglycerin-mediated dilation (NMD), 232, 233
Noise protocols, small and large, 494–497
Nonparametric estimation, 422

Optic disk, detection of
  algorithm, 341
  detection of contours, 342–344
  localization, 341–342
  motivation, 340
  properties, 340
  results, 344–345
  state of the art, 340–341
Optical nerve head (ONH), 291
Over-shooting of human tracings, 461

Pancreas, 184
  imaging of, 186–191, 194
    imaging modalities, 186–187
Pancreatic cancer, 183–186
Pancreatic cancer imaging
  computer applications in, 194–195, 217–218
    external signal segmentation, 195
    image enhancement, 195
    processing–classification, 199
    processing–image segmentation, 196–199
  helical CT imaging characteristics, 191–194
Pancreatic tumor detection and classification, algorithm for, 199–200, 217–219
  electronic ground truth file generation, 204–206
  external signal segmentation, 206–207
  fuzzy clustering, 208–215
  medical image database, 200–204
  preprocessing–enhancement, 207–208
  validation, 215–217
Parametric images, fast generation of, 158–160
Partial volume effects (PVEs), 116, 153, 162, 461
Partial volume (PV) model, 13
Partial volume (PV) voxels, 26–28
Partition function, 376
Parzen window, 422
Pattern recognition techniques (segmentation), 298
PDE-based smoothing system, 494–495
Pelvic bone tumor and cortex visualization from 3-D MR data, 546–547
Periventricular leukomalacia (PVL), 9–10
Perona-Malik smoothing function, 490, 491
Pitch, 190
Pixel classification, 124–126
  algorithms, 465–476
Plaque, 457
  hard vs. soft, 95, 96
Plaque segmentation, 93, 94
Plaque segmentation techniques, 453, 454
  survey of, 453–458
Plaque tissues, 58, 91; see also Texture classification for intravascular tissue characterization
Plaque tissues classification process, 91, 92
Plaque volumes, MR
  accurate lumen identification, detection, and quantification in, 458–460; see also Lumen
    circular vs. elliptical data analysis, 501–509
    performance evaluation system (rulers and error curves), 492–501
    pixel classification algorithms, 465–476
    synthetic system design and its processing, 476–492
    system strengths, 508–509
    system weakness, 509
Point accuracy, 689
Point-based segmentation algorithms, 662
Point similarity, 690
Point spread function (PSF), 568–569
Point-wise subtraction of variation, 782–786
Polyline distance, 492, 493
Polyline distance method (PDM), 459, 494–501, 504, 505
Polynomial contrast enhancement, 327–329
Portal vein segmentation from 3-D MR data, 544–546
Positron emission tomography (PET), 167; see also FDG-PET studies
  attenuation correction in, 162–163
  partial volume correction in, 162
  segmentation in, 116–117
Positron emission tomography (PET) data, 141–142
  absolute quantification of, 136–137
Principal components analysis (PCA), 60, 81, 85–87, 132–139, 248, 249, 642
  feature extraction and, 643–644
Principal components (PCs), 132–139
Principal components (PCs) images, 137, 138
Principal landmarks, 785