
Image and Vision Computing 30 (2012) 86–99


Extended local binary patterns for texture classification ☆

Li Liu a,⁎, Lingjun Zhao a, Yunli Long a, Gangyao Kuang a, Paul Fieguth b

a School of Electronic Science and Engineering, National University of Defense Technology, 47 Yanwachi, Changsha, Hunan, 410073, China
b Department of Systems Design Engineering, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1

Article history:
Received 19 April 2011
Received in revised form 20 October 2011
Accepted 4 January 2012

Keywords:
Texture classification
Local binary pattern (LBP)
Bag-of-words (BoW)
Rotation invariance

Abstract

This paper presents a novel approach for texture classification, generalizing the well-known local binary pattern (LBP) approach. In the proposed approach, two different and complementary types of features (pixel intensities and differences) are extracted from local patches. The intensity-based features consider the intensity of the central pixel (CI) and those of its neighbors (NI), while for the difference-based features two components are computed: the radial difference (RD) and the angular difference (AD). Inspired by the LBP approach, two intensity-based descriptors, CI-LBP and NI-LBP, and two difference-based descriptors, RD-LBP and AD-LBP, are developed. All four descriptors are in the same form as conventional LBP codes, so they can be readily combined to form joint histograms to represent textured images. The proposed approach is computationally very simple: it is totally training-free, there is no need to learn a texton dictionary, and no tuning of parameters. We have conducted extensive experiments on three challenging texture databases (Outex, CUReT and KTH-TIPS2b). Outex results show significant improvements over the classical LBP approach, which clearly demonstrates the great power of the joint distributions of these proposed descriptors for gray-scale and rotation invariant texture classification. The proposed method produces the best classification results on KTH-TIPS2b, and results comparable to the state-of-the-art on CUReT.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

Texture classification is a fundamental issue in computer vision and image processing, playing a significant role in a wide range of applications that include medical image analysis, remote sensing, object recognition, document analysis, environment modeling, content-based image retrieval, etc. [1]. For four decades, texture analysis has been an area of intense research; however, analyzing real world textures has proven to be surprisingly difficult, in many cases caused by natural texture inhomogeneity, varying illumination, scale changes, and variability in surface shape.

Recently, the orderless Bag-of-Words (BoW) [5, 2, 3, 8] approach, representing texture images statistically as histograms over a discrete texton dictionary, has proven extremely popular and successful in texture classification tasks. Robust and discriminative local texture descriptors and global statistical histogram characterization have supplied complementary components toward the BoW feature extraction of texture images. The former attempts to extract a collection of powerful and distinctive appearance descriptors from local patches, while the latter exploits the fact that texture images contain self-repeating patterns by vector-quantizing (typically by k-means) the local feature vectors to form a texton dictionary, and then representing texture images statistically as compact histograms over the learned texton dictionary.

In this simple and efficient BoW framework, it is generally agreed that the local descriptors play a much more important role, and they have therefore received considerable attention [2-5, 8, 9, 6]. The approaches can be grouped into sparse and dense types, with the sparse approach using appearance descriptors at a sparse set of detected interest points. Notable sparse descriptors include SPIN, SIFT and RIFT [8, 10]. In contrast, dense approaches use appearance descriptors pixel by pixel [2-5, 9]. The sparse approach largely relies on the sparse output of local interest region detectors, which might miss important texture primitives and fail to provide enough regions for a robust statistical characterization of the texture.

Among the most popular dense descriptors are the various filter banks, such as Gabor filters [11], the filter bank of Schmid [5], the filter bank of Leung and Malik [5], the MR8 [2], the filter bank of Crosier [9] and many others [12]. The design of a filter bank is nontrivial and likely to be application dependent. Although enormous efforts have been carried out along this direction, the supremacy of filter bank-based descriptors for texture analysis has been challenged by several authors [4, 3, 7], who have demonstrated that using the intensities or differences in a small local patch directly can produce superior or comparable classification performance to filter banks with large spatial support. In [7], the authors propose sparse modeling of local texture patches; however, the sparse texton learning and sparse coding process is computationally expensive. Two particularly important works along these lines are the VZ-Joint classifier [3] and the LBP method [4]. The simple, elegant and

☆ This paper has been recommended for acceptance by Matti Pietikainen.
⁎ Corresponding author. Tel.: +86 731 84573479 807; fax: +86 731 84518730.
E-mail addresses: [email protected] (L. Liu), [email protected] (L. Zhao), [email protected] (Y. Long), [email protected] (G. Kuang), pfi[email protected] (P. Fieguth).

0262-8856/$ – see front matter © 2012 Elsevier B.V. All rights reserved.
doi:10.1016/j.imavis.2012.01.001
efficient local texture descriptor LBP may be the preferable choice over the VZ-Joint classifier, since LBP uses a pre-defined texton dictionary and does not need nearest neighbor search to obtain the texton labels, a time consuming step.

Due to its impressive computational efficiency and good texture discriminative property, the dense LBP descriptor [4] has gained considerable attention since its publication [13], and has already been used in many other applications, including visual inspection, image retrieval, dynamic texture recognition, remote sensing, biomedical image analysis, face image analysis, motion analysis, environment modeling, and outdoor scene analysis [14–16, 18, 19, 34].¹ Despite the great success of LBP in computer vision and pattern recognition, the conventional LBP operator comes with disadvantages and limitations:

1. The LBP operator produces long histograms which are sensitive to image rotation.
2. The LBP has small spatial support; in its basic form, the LBP operator cannot properly detect large-scale textural structures.
3. LBP loses local textural information, since only the signs of differences of neighboring pixels are utilized.
4. LBP is very sensitive to noise. The slightest fluctuation above or below the value of the central pixel is treated as equivalent to a major contrast between the central pixel and its surroundings.

On the basis of the above issues, researchers have proposed a variety of LBP variants. In terms of locality, the authors in [20] propose to extract global features from Gabor filter responses as a complementary descriptor. In order to recover the loss of information incurred in computing the LBP value, the local image contrast was introduced by Ojala et al. [4] as a complementary measure, and better performance was reported therein. Moreover, Guo et al. [21] propose to include the information contained in the magnitudes of local differences as complementary to the signs used by LBP, and claim better performance.

Regarding LBP robustness, especially to noise, the influential work by Ojala et al. [4] extends basic LBP to a multiresolution context, and rotation invariant patterns are introduced and successfully used in reducing the dimension of the LBP histogram and enhancing robustness and speed. Ahonen et al. introduce soft histograms [28], and Tan and Triggs [29] introduce local ternary patterns (LTP), using ternary numbers instead of binary. Noting that uniform LBPs do not necessarily occupy the major pattern proportions, Liao et al. [20] proposed dominant LBP (DLBP), which considers the most frequently occurring patterns in a texture image. Very recently, Heikkilä et al. [22] exploit circular symmetric LBP (CS-LBP) for local interest region description, and Chen et al. present the WLD descriptor, which includes orientation information as a robust descriptor [23].

The LBP approach is based on the assumption that the local differences of the central pixel and its neighbors are independent of the central pixel itself. However, in practice exact independence is not warranted: the superiority of both the VZ-Joint and VZ-MRF classifiers over LBP clearly demonstrates the benefits of explicitly including the information contained in the central pixel [3].

The fundamental questions raised here are whether explicitly modeling the joint distribution of the central pixel and its neighbors is an advantage or not, and how to effectively include the missing between-scale information so that better texture classification can be achieved. Motivated by the work of Varma and Zisserman [3] and the LBP approach studied by Ojala et al. [4], in this paper we propose a simple, yet very powerful and novel local texture descriptor to generalize the conventional LBP approach. In the proposed approach, two different but complementary types of features in a local patch, the pixel intensities and the pixel differences, are utilized by using a common concept, the LBP coding strategy. The pixel intensities are divided into two components: the intensity of the central pixel and the intensities of its neighboring pixels. For pixel differences, we study radial and angular differences.

All four descriptors (two intensity based, two difference based) are in the same form as the conventional LBP codes, thus they can be readily combined to form a joint histogram. The fusion of these descriptors will be shown to lead to significantly improved classification results on the experimental protocols designed for verifying the performance of the LBP approach in [4]. The key to our proposed approach is that it combines the advantages of VZ-Joint/VZ-MRF, whose strong performance comes from modeling a joint distribution, with those of LBP in computational efficiency.

The paper is organized as follows: we start with a brief review of the classical LBP approach in Section 2, followed by details of the derivation of the proposed descriptors and the classification scheme. In Section 3, we verify the proposed approach with extensive experiments on popular texture datasets and comparisons with various state-of-the-art texture classification techniques. Section 4 provides concluding remarks and possible extensions of the proposed method. A short, preliminary version of this work appeared in [26].

2. Proposed descriptors

This section begins by reviewing conventional LBP, followed by the new descriptors designed to address the limitations of LBP. Finally, the multiresolution analysis and classification scheme of this work are presented.

2.1. A brief review of LBP

The LBP method, first proposed by Ojala et al. [25, 4], encodes the pixel-wise information in textured images. Images are probed locally by sampling grayscale values at a central point x_{0,0} and p points x_{r,0}, …, x_{r,p−1} spaced equidistantly around a circle of radius r (the choice of which acts as a surrogate for controlling the scale of description), as shown in Fig. 1. In LBP, a "local pattern" operator describes the relationships between a pixel and its neighborhood pixels: all neighbors that have values higher than or equal to the value of the central pixel are given a value of 1, and all those lower a value of 0. The binary values associated with the neighbors are then read sequentially, clockwise, to form a binary number which may be used to characterize the local texture. Formally,

LBP_{p,r} = Σ_{n=0}^{p−1} s(x_{r,n} − x_{0,0}) 2^n,   s(x) = { 1, x ≥ 0; 0, x < 0 }.   (1)

¹ A bibliography of LBP-related research can be found at http://www.cse.oulu.fi/MVG/LBP_Bibliography/.

Fig. 1. A central pixel x_{0,0} and its p circularly and evenly spaced neighbors {x_{r,i}}_{i=0}^{p−1} on radius r.
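The coding of Eq. (1) can be illustrated with a short sketch. This is a minimal restatement of the formula, assuming the p neighbor values have already been sampled (and interpolated where necessary); the function name and example values are illustrative, not from the paper:

```python
import numpy as np

def lbp_code(neighbors, center):
    """Basic LBP of Eq. (1): threshold the p circular neighbors x_{r,n}
    at the center value x_{0,0} and pack the sign bits into an integer,
    bit n carrying weight 2^n."""
    bits = np.asarray(neighbors, dtype=float) - center >= 0
    return int(sum(int(b) << n for n, b in enumerate(bits)))

# p = 8 neighbors around a center of gray value 100:
code = lbp_code([90, 110, 120, 95, 100, 80, 130, 70], center=100)  # -> 86
```

Note that equality with the center (the neighbor of value 100) yields a 1 bit, matching the "higher than or equal to" convention of the text.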
Fig. 2. Overview of the proposed approach.

Relative to the origin at (0, 0), the coordinates of the neighbors are given by (r sin(2πn/p), r cos(2πn/p)). The gray values of neighbors which do not fall exactly in the center of a pixel are estimated by interpolation.

Given an N × M image I, let LBP_{p,r}(i, j) be the identified LBP pattern of each pixel (i, j); then the whole textured image is represented by a histogram vector h of length K:

h(k) = Σ_{i=1}^{N} Σ_{j=1}^{M} δ(LBP_{p,r}(i, j) − k)   (2)

where 0 ≤ k ≤ K − 1, K = 2^p is the number of LBP codes, and δ(·) is the Dirac delta function. This formulation has several attractive properties that favor its usage: gray-scale invariance, computational speed, few parameters, satisfactory discriminant power, and rotation invariance achieved by simple cyclic shifts.

On the other hand, the basic LBP operator produces rather long histograms (2^p distinct values), and it can become intractable to estimate h due to its overwhelming dimensionality for large p. Moreover, due to the way LBP numbers are created, they are very sensitive to noise: the slightest fluctuation above or below the value of the central pixel is treated the same way as a major contrast between the central pixel and its surroundings. One way to avoid noisy patterns is to simply ignore them: a noisy pattern, due to its randomness, will create neighbors that fluctuate above and below the value of the central pixel, with 0s and 1s frequently succeeding each other. Therefore, one improvement suggested by Ojala et al. [4] is to consider only the so-called "uniform" patterns via the LBP^{riu2}_{p,r} operator, merging nonuniform patterns directly into one pattern. The success of the LBP^{riu2}_{p,r} operator also comes from the fact that the "uniform" patterns appear to be fundamental properties of local image textures [4], representing prominent and salient local texture structures. The LBP^{riu2}_{p,r} operator is formally defined as

LBP^{riu2}_{p,r} = { Σ_{n=0}^{p−1} s(x_{r,n} − x_{0,0}),  if U(LBP_{p,r}) ≤ 2;   p + 1,  otherwise }   (3)

where

U(LBP_{p,r}) = Σ_{n=0}^{p−1} | s(x_{r,n} − x_{0,0}) − s(x_{r,mod(n+1,p)} − x_{0,0}) |   (4)

and the superscript riu2 denotes the rotation invariant "uniform" patterns that have U values of at most 2. Mapping from LBP_{p,r} to LBP^{riu2}_{p,r} therefore results in only p + 2 distinct groups of patterns, leading to a much shorter histogram representation for the whole image.

It is obvious that LBP oversimplifies local structure and loses textural information. Therefore, Ojala et al. [4] made a further important correction by including the local contrast of each pattern and proposing a complementary local descriptor called VAR_{p,r}. The use of the 2D joint histogram of LBP^{riu2}_{p,r} and VAR_{p,r}, denoted LBP^{riu2}_{p,r}/VAR_{p,r}, is demonstrated in [4].

[Fig. 3 shows two 5 × 5 example image patches and their binary pattern maps; only the caption is reproduced here.]

Fig. 3. The proposed NI-LBP preserves weak edge patterns. Two 5 × 5 example image patches are shown in (a, b). (a1) NI-LBP patterns of (a) at two resolutions with thresholds of 108 and 52.4, respectively. (b1) NI-LBP patterns of (b) at two resolutions with thresholds of 90 and 85.4, respectively. (a2, b2) are the patterns given by LBP.
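Eqs. (3)-(4) can be sketched in a few lines. This is an illustrative restatement under the definitions above (function name and sample values are my own): the uniformity measure U counts circular 0/1 transitions, and patterns with U ≤ 2 are labeled by their number of 1 bits, all others by p + 1:

```python
def lbp_riu2(neighbors, center):
    """Rotation-invariant 'uniform' LBP of Eqs. (3)-(4): if the circular
    bit string has at most two 0/1 transitions (U <= 2), the code is the
    number of '1' bits; all nonuniform patterns share the label p + 1."""
    s = [1 if x - center >= 0 else 0 for x in neighbors]
    p = len(s)
    u = sum(abs(s[n] - s[(n + 1) % p]) for n in range(p))  # Eq. (4), circular
    return sum(s) if u <= 2 else p + 1

# A step edge (half the ring above the center) is uniform: U = 2, code 4.
edge = lbp_riu2([120, 120, 120, 120, 80, 80, 80, 80], center=100)   # -> 4
# An alternating (noisy) ring has U = 8, so it gets the nonuniform label 9.
noisy = lbp_riu2([120, 80, 120, 80, 120, 80, 120, 80], center=100)  # -> 9
```

Rotating the ring cyclically leaves the code unchanged, which is the rotation invariance exploited by the riu2 mapping.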
Table 1
Comparison of the detailed mean classification accuracy for NI-LBP, LBP and MBP on test suite Outex_TC_00000. Results are obtained as the average of the 100 test groups. The patch size used is 3 × 3. The 1NN classifier is used. The distance measure is χ². Each image is normalized to have zero mean and unit standard deviation. "NI-LBP (512)" encodes the binary value of the center pixel, similar to "MBP (512)". "MBP (256)" excludes the binary value of the center pixel. Since LBP uses the value of the center pixel as the threshold, it is unnecessary to include the center pixel in that case. The numbers in brackets denote the number of bins of the histogram.

Class | NI-LBP (512) | NI-LBP (256) | LBP | MBP (512) | MBP (256)
canvas001 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas002 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas003 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas005 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas006 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas009 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas011 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas021 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas022 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas023 | 99.6% | 99.4% | 99.6% | 99.8% | 99.8%
canvas025 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas026 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas031 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas032 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas033 | 95.5% | 96.0% | 92.0% | 94.4% | 92.5%
canvas035 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
canvas038 | 95.5% | 98.2% | 100.0% | 99.7% | 99.6%
canvas039 | 99.8% | 99.5% | 100.0% | 99.6% | 99.8%
tile005 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
tile006 | 100.0% | 99.6% | 100.0% | 99.8% | 99.7%
carpet002 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
carpet004 | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%
carpet005 | 100.0% | 99.6% | 100.0% | 99.4% | 96.5%
carpet009 | 99.9% | 99.9% | 99.3% | 95.2% | 94.4%
Mean | 99.76% | 99.68% | 99.62% | 99.50% | 99.26%

In conventional LBP the central pixel is discarded (despite the implicit use of the intensity of the central pixel as the threshold to achieve local gray-scale invariance), and only the joint distribution of the neighborhood around each pixel is considered. However, in their recent extensive texture study, Zhang et al. [8] suggested that it is vital to use a combination of several detectors and descriptors. Motivated by the work of Lazebnik et al. [10] and Zhang et al. [8], in this paper we seek to propose a method which combines the strengths of complementary local features with those of LBP in computational efficiency and smaller support regions.

2.2. Intensity-based descriptors

The brightness level at a point in an image is highly dependent on the brightness levels of its neighboring points unless the image is simply random noise [24]. In MRF modeling [24], the probability of the central pixel depends only on its neighborhood:

p(I(x_c) | I(x), ∀x ≠ x_c) = p(I(x_c) | I(x), ∀x ∈ N(x_c))   (5)

where x_c is a site in the 2D integer lattice on which the image I has been defined and N(x_c) is the neighborhood of that site. The central pixel also carries discriminant information; however, its distribution is conditioned on its neighbors alone.

Inspired by such MRF models, and related to the ideas explored by Varma and Zisserman [3], we propose to use only local neighborhood distributions in our NI-LBP descriptor. We explicitly model the joint distribution of the central pixel and its neighbors, in order to test how significant this conditional probability distribution is for classification.

Next, inspired by the coding strategy of LBP, we define the following NI-LBP descriptor (see also Fig. 2):

NI-LBP_{p,r} = Σ_{n=0}^{p−1} s(x_{r,n} − μ) 2^n,   s(x) = { 1, x ≥ 0; 0, x < 0 }   (6)

where μ = (1/p) Σ_{n=0}^{p−1} x_{r,n}. Similar to LBP^{riu2}_{p,r}, the rotation invariant version of NI-LBP, denoted NI-LBP^{riu2}_{p,r}, can also be defined to achieve rotation invariant classification.

Regarding the selection of the threshold μ, although it was motivated by intuition and experimental studies, it is also selected in order to preserve LBP characteristics and to increase robustness. Hafiane et al. [17] proposed the Median Binary Pattern (MBP), which derives the localized binary pattern by thresholding the pixels against their median value over a 3 × 3 neighborhood. In MBP, the central pixel is also included in this filtering process, resulting in 2^9 binary patterns.

NI-LBP, LBP and MBP differ in the selection of the thresholding value, so their capability of encoding image configuration and pixelwise relationships may differ. For illustration purposes, Fig. 4 gives three example local texture patterns. The patterns shown in Fig. 4(a) and (b) would be classified into the same class by LBP, but the textural surfaces they represent are quite different from each other, which means they probably belong to different classes, while the other three descriptors, NI-LBP, MBP and VAR, can all tell the difference between (a) and (b). This is why Ojala et al. use the combination of LBP and VAR. However, the joint histogram of LBP and VAR cannot fully solve the problem: the classification might be misled without considering the relationships among

[Fig. 4 shows three example texture patterns and their LBP, NI-LBP and MBP code maps; only the caption is reproduced here.]

Fig. 4. Three different original texture patterns (a, b, c) and their corresponding LBPs (a1, b1, c1), NI-LBPs (a2, b2, c2), MBPs (a3, b3, c3) and VAR values. All three LBP patterns (a1, b1, c1) are the same. Patterns in (a) and (b) would be considered as the same pattern type by LBP, though the corresponding textural surfaces might be quite different from each other. By incorporating LBP with local variance information, patterns in (a) and (b) could be distinguished, while patterns in (b) and (c) would still be considered the same pattern type because of the same variance. But they are different in configuration, which is due not to rotation but to underlying textural properties. In terms of MBP, MBP can distinguish (a) and (b); however, MBP cannot distinguish (b) and (c). In contrast, all three NI-LBPs are different. Therefore, all three patterns can be distinguished by our proposed NI-LBP.
neighborhood intersection. Taking the patterns in Fig. 4(b) and (c) for example, which would be considered the same pattern type according to LBP and VAR, they are actually two patterns with different textural properties. Moreover, MBP also fails to distinguish patterns (b) and (c). Clearly, our proposed NI-LBP approach can distinguish all three different patterns, as shown in (a2), (b2) and (c2). Therefore, the proposed NI-LBP approach is more discriminative and effective.

In order to make further comparisons, we conducted texture classification on test suite Outex_TC_00000, which was used in [17]. The results are listed in Table 1. Test suite Outex_TC_00000 contains 24 texture classes, with each class having 20 monochrome texture images (128 × 128) with constant incandescent illumination and a spatial resolution of 100 dpi. The images in each class are divided into two nonoverlapping groups, with 10 images for training and the other 10 for testing. Outex_TC_00000 provides 100 couples of test and train files for this category; we have performed experiments on all 100 couples and report classification accuracy as the average over the 100 couples. Note that in [17] the authors randomly selected one couple in their evaluation (see Table 1 in [17]). From Table 1, we can clearly observe that our NI-LBP performs the best, especially for the two texture classes canvas033 and carpet009.

Furthermore, LBP thresholding at the value of the central pixel x_{0,0} tends to be sensitive to noise, particularly in near-uniform image regions, and smooths weak illumination gradients. MBP thresholding against the median value is claimed to be robust to "salt and pepper" noise [17]; however, MBP is not robust to Gaussian noise. In contrast, the proposed NI-LBP descriptor has the following advantages:

1. Thresholding at μ is equivalent to making the local neighborhood vector zero-mean, therefore resistant to local lighting effects, and specifically invariant to gray scale changes.
2. Compared with LBP, weak edges are preserved by NI-LBP, as illustrated in Fig. 3. We can clearly observe that LBP does not match the visual patterns, producing output unrelated to the peak in (a) or the edge in (b). In contrast, the proposed NI-LBP outputs more consistent patterns, owing to the better threshold μ.
3. Better noise robustness, as shown in Fig. 5.

[Fig. 5 shows classification accuracy vs. SNR curves for two panels; only the caption is reproduced here.]

Fig. 5. Comparison of the robustness to additive Gaussian noise of different signal-to-noise ratios (SNR) for the proposed NI-LBP and the conventional LBP on the Outex textures: (a) NI-LBP_{8,1} vs. LBP_{8,1}; (b) NI-LBP^{riu2}_{8,1} vs. LBP^{riu2}_{8,1}. We have used all the original texture images present in the Outex_TC_00010 training set (20 samples of illuminant "inca" and angle 0 in each of the 24 texture classes, totaling 480 images). Training is done with all 480 noise-free images and testing is done with the same images with added Gaussian noise at different SNRs. The nearest neighbor classifier is used for classification.

Recall that the local contrast measure proposed by Ojala et al. [4] is defined as follows:

VAR_{p,r} = (1/p) Σ_{n=0}^{p−1} (x_{r,n} − μ)²,   where μ = (1/p) Σ_{n=0}^{p−1} x_{r,n}.   (7)

We can see that NI-LBP_{p,r} and VAR_{p,r} capture similar types of texture information, with slight differences:

1. VAR_{p,r} achieves rotation invariance by summing up the whole variation in the circular neighborhood, whereas NI-LBP_{p,r} is rotation sensitive by default;
2. NI-LBP_{p,r} is independent of gray scale, whereas VAR_{p,r} is not;
3. Finally, VAR_{p,r} is continuous-valued and needs to be quantized.

The latter quantization step has the associated limitations of additional training to determine threshold values, and the difficulty of setting the number of bins. Too few bins will fail to provide enough discriminative information, while too many bins would make the feature size too
[Fig. 6 is a bar chart of per-class proportions; only the recoverable caption is kept here.]
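The per-class quantity plotted in Fig. 6 can be sketched directly from the uniformity measure of Eq. (4). This is an illustrative helper, assuming each pattern is given as its list of thresholded neighbor bits; the function names and sample patterns are my own:

```python
def uniformity(bits):
    """U of Eq. (4): number of circular 0/1 transitions in a binary pattern."""
    p = len(bits)
    return sum(abs(bits[n] - bits[(n + 1) % p]) for n in range(p))

def uniform_proportion(patterns):
    """Fraction of patterns with U <= 2, the proportion plotted per class in Fig. 6."""
    return sum(uniformity(b) <= 2 for b in patterns) / len(patterns)

share = uniform_proportion([[1, 1, 1, 0, 0, 0, 0, 0],    # edge-like: uniform
                            [1, 0, 1, 0, 1, 0, 1, 0]])   # alternating: nonuniform
```

On real images this proportion would be computed over every pixel's pattern in a class, one curve per coding scheme (LBP, RD-LBP, AD-LBP).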
L. Liu et al. / Image and Vision Computing 30 (2012) 86–99 91

Fig. 7. Proposed multiresolution scheme.

large. Although there are some rules to guide selection [4], it is hard to obtain an optimal number of bins in terms of accuracy and feature size. On the basis of the above discussion, we expect the proposed NI-LBP_{p,r} to be a better choice than VAR_{p,r}.

To make it consistent with the binary coding strategy, the 1D distribution of the central pixels' intensity is represented by two bins, i.e.,

CI-LBP = s(x_{0,0} − μ_I),   s(x) = { 1, x ≥ 0; 0, x < 0 }   (8)

where μ_I is the mean of the whole image.

2.3. Difference-based descriptors

As a parallel development to the intensity descriptors just developed, we also propose pixel differences in the radial and angular directions on a circular grid, different from the traditional pixel differences computed in the horizontal and vertical directions. More specifically, we propose two different descriptors, the Radial Difference Local Binary Pattern and the Angular Difference Local Binary Pattern (denoted RD-LBP and AD-LBP respectively, as illustrated in Fig. 2). We define the RD-LBP descriptor as follows:

RD-LBP_{p,r,δ} = Σ_{n=0}^{p−1} s(Δ^{Rad}_{δ,n}) 2^n,   s(x) = { 1, x ≥ 0; 0, x < 0 }   (9)

where Δ^{Rad}_{δ,n} = x_{r,n} − x_{r−δ,n} is the radial difference computed with a given integer radial displacement δ; x_{r,n} and x_{r−δ,n} are the gray values of a pair of pixels, δ rings apart, in the same radial direction.

Similarly, the AD-LBP descriptor is defined as

AD-LBP_{p,r,δ,ε} = Σ_{n=0}^{p−1} s(Δ^{Ang}_{δ,n}) 2^n,   s(x) = { 1, x ≥ ε; 0, x < ε }   (10)

where Δ^{Ang}_{δ,n} = x_{r,n} − x_{r,mod(n+δ,p)} is the angular difference computed with a given angular displacement δ(2π/p), where δ is an integer such that 1 ≤ δ ≤ p/2; x_{r,n} and x_{r,mod(n+δ,p)} are the gray values of a pair of pixels, δ positions apart, on the circle of radius r; and mod(x, y) denotes x modulo y. ε is a threshold value,

Table 2
Summary of texture datasets used in our experiments.

Experiment # 1
Texture dataset | Texture classes | Samples per class | Sample size | Test suite | Training or testing | Number of angles used | Illuminant | Samples in total
Brodatz | 16 | 8 | 180 × 180 | Contrib_TC_00001 (problems 000–009) | Training | 1 | "inca" | 16
 | | | | | Testing | 9 | "inca" | 1008
Outex | 24 | 20 | 128 × 128 | Outex_TC_00010 | Training | 1 | "inca" | 480
 | | | | | Testing | 9 | "inca" | 3840
 | | | | Outex_TC_00012 (problem 000) | Training | 1 | "inca" | 480
 | | | | | Testing | 10 | "tl84" | 4320
 | | | | Outex_TC_00012 (problem 001) | Training | 1 | "inca" | 480
 | | | | | Testing | 10 | "horizon" | 4320

Experiment # 2
Texture dataset | Dataset notation | Image rotation | Controlled illumination | Scale variation | Texture classes | Sample size | Samples per class | Samples in total
CUReT | Dc | √ | √ | | 61 | 200 × 200 | 92 | 5612
KTH-TIPS2b | DKT2b | √ | √ | | 11 | 200 × 200 | 432 | 4752
92 L. Liu et al. / Image and Vision Computing 30 (2012) 86–99

Table 3
Abbreviations for the notations of methods.

Name of proposed method | Abbreviation
LBP^riu2_{p,r} | LBP
VAR_{p,r} | VAR
LBP^riu2_{p,r}/VAR_{p,r} | LBP/VAR
CI-LBP | CI
NI−LBP^riu2_{p,r} | NI
RD−LBP^riu2_{p,r} | RD
RD−LBP^riu2_{p,r}/CI−LBP | RD/CI
NI−LBP^riu2_{p,r}/CI−LBP | NI/CI
NI−LBP^riu2_{p,r}/RD−LBP^riu2_{p,r} | NI/RD
NI−LBP^riu2_{p,r}/RD−LBP^riu2_{p,r}/CI−LBP | NI/RD/CI

and is 1% of the pixel value range in our experiments. We experimentally set ε = 0.01. We can see that when δ = p/2, our descriptor AD−LBP_{p,r,p/2,ε} is equivalent to the CS-LBP descriptor proposed by Heikkilä [22] for local interest region description.

As discussed in Section 2.1, limiting attention to the uniform binary patterns has attractive and elegant advantages over using all the binary patterns: the uniform patterns represent meaningful and fundamental characteristics of the texture, they appear to constitute the major part of all binary patterns and are relatively reliable, and they lead to a texture image representation of low dimensionality. We wish to see to what extent the proposed descriptors maintain these properties; Fig. 6 shows the proportions of the uniform patterns for three different descriptors (basic LBP, RD-LBP and AD-LBP), extracted from texture images of test suite Outex_TC_00010. It can clearly be seen that the proportions of the uniform patterns of AD-LBP are too small to provide a reliable and meaningful description of texture. Consequently, we have decided against including the AD-LBP descriptor in further experiments in this paper, and focus instead on RD-LBP.

Based on the above analysis, in order to keep the dimensionality of the histogram features acceptable, we use only the uniform patterns, motivated by the work of Ojala et al. [4]. The uniform patterns produce low dimensionality features, so that they can be conveniently used together for pattern classification. There are two ways to combine the NI-LBP and RD-LBP codes: separately, calculating the histograms independently and concatenating them, or jointly, calculating a joint two-dimensional histogram of the NI-LBP and RD-LBP codes, denoted NI-LBP/RD-LBP. Following the work of Varma and Zisserman [3] and Guo et al. [21], who showed the joint approach to produce better results, we prefer joint histogramming (shown in Fig. 2). Following [4], we use only joint distributions of operators that have the same (p, r) values, although nothing would prevent us from using joint distributions of operators computed from different neighborhoods.

2.4. Multiresolution analysis and classification

The proposed descriptors described above are extracted from a single resolution with a circularly symmetric neighbor set of p pixels placed on a circle of radius r (as in Fig. 2). Clearly, by altering (p, r) we can realize operators for any quantization of the angular space and for any spatial resolution. Motivated by the idea of [4], we conduct the multiresolution analysis by combining the information provided by multiple descriptors of varying (p, r), as illustrated in Fig. 7. The histogram feature vector for multiresolution analysis is obtained by concatenating the histograms from single resolution analyses realized with different (p, r).

To perform the actual texture classification, there are two crucial components: (i) texture feature extraction, and (ii) the classifier and the associated similarity measure used within the classifier. In this work the focus is on evaluating the discrimination properties of the proposed descriptors, so for classification we wish to make as few assumptions as possible and have chosen a non-parametric technique, since non-parametric classifiers can handle a large number of classes, avoid parameter overfitting, and require no learning/training. Of non-parametric classifiers, the k nearest neighbor (kNN) is one of the most popular and simplest methods, which we adopt with k = 1. The samples are then classified according to their normalized histogram feature vectors h_i and h_j, using the χ² distance metric

χ²(h_i, h_j) = (1/2) Σ_k [h_i(k) − h_j(k)]² / (h_i(k) + h_j(k))        (11)

the same distance metric used in [2, 3, 27, 30].

3. Experimental evaluation

In this section, we demonstrate the performance of the proposed method with comprehensive experiments on six texture datasets, summarized in Table 2, which are derived from four popular publicly

Fig. 8. 128 × 128 samples of the textures from Brodatz used in Experiment #1.

Fig. 9. 128 × 128 samples of the 24 textures from Outex used in Experiment #1.

Table 4
Classification accuracies (%) on Contrib_TC_00001, where training is done at just one rotation angle, with the average accuracy over 10 angles. The results for LBP, VAR, and LBP/VAR are quoted directly from the original paper by Ojala et al. [4].

Method | (p, r) | Bins | Rotation angle for training: 0° | 20° | 30° | 45° | 60° | 70° | 90° | 120° | 135° | 150° | Average
LBP | (16, 2) | 18 | 96.2 | 99.0 | 98.6 | 98.9 | 98.5 | 99.1 | 97.6 | 98.6 | 98.7 | 97.5 | 98.3
VAR | (16, 2) | 128 | 89.9 | 84.5 | 86.2 | 90.5 | 87.3 | 85.6 | 91.0 | 89.8 | 90.8 | 88.5 | 88.4
LBP/VAR [4] | (8, 1) + (16, 2) + (24, 3) | 864 | 100 | 99.7 | 99.5 | 99.8 | 99.6 | 99.7 | 99.8 | 99.6 | 99.8 | 99.9 | 99.7
NI | (8, 1) | 10 | 65.4 | 85.5 | 81.3 | 76.6 | 77.0 | 78.4 | 68.8 | 81.4 | 75.8 | 76.5 | 76.7
NI | (16, 2) | 18 | 87.6 | 95.2 | 92.3 | 93.6 | 89.4 | 96.0 | 88.9 | 91.3 | 93.4 | 90.1 | 91.8
NI | (24, 3) | 26 | 96.2 | 93.4 | 97.6 | 96.6 | 98.3 | 96.7 | 97.1 | 96.7 | 92.6 | 98.2 | 96.4
RD | (8, 1) | 10 | 68.8 | 86.4 | 84.4 | 76.0 | 84.9 | 84.4 | 70.2 | 84.1 | 76.1 | 84.7 | 80.0
RD | (16, 2) | 18 | 89.2 | 92.9 | 96.7 | 97.8 | 96.1 | 92.6 | 88.4 | 94.7 | 96.7 | 97.3 | 94.3
RD | (24, 3) | 26 | 87.6 | 90.6 | 98.2 | 90.8 | 96.5 | 93.8 | 89.5 | 98.6 | 89.5 | 94.2 | 92.9
RD/CI | (8, 1) | 20 | 87.1 | 84.7 | 94.3 | 88.6 | 95.9 | 95.1 | 85.8 | 94.8 | 90.3 | 95.0 | 92.2
RD/CI | (16, 2) | 36 | 92.7 | 94.6 | 96.8 | 97.3 | 98.4 | 95.6 | 91.8 | 99.4 | 96.7 | 98.6 | 96.2
RD/CI | (24, 3) | 52 | 96.9 | 95.8 | 95.6 | 92.8 | 96.5 | 94.3 | 96.9 | 99.1 | 95.3 | 95.9 | 95.9
NI/CI | (8, 1) | 20 | 74.8 | 90.4 | 86.4 | 80.3 | 82.5 | 85.2 | 74.4 | 86.2 | 80.6 | 82.2 | 82.2
NI/CI | (16, 2) | 36 | 95.6 | 99.2 | 98.8 | 98.0 | 98.2 | 99.4 | 93.8 | 98.3 | 96.9 | 97.4 | 97.6
NI/CI | (24, 3) | 52 | 99.1 | 98.7 | 99.4 | 99.4 | 100 | 100 | 99.7 | 97.5 | 97.3 | 99.1 | 99.1
NI/RD | (8, 1) | 100 | 70.2 | 88.9 | 87.0 | 80.0 | 85.2 | 85.5 | 71.9 | 87.1 | 81.6 | 84.9 | 82.2
NI/RD | (16, 2) | 324 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100
NI/RD | (24, 3) | 676 | 98.2 | 100 | 100 | 100 | 100 | 100 | 99.6 | 99.9 | 99.9 | 100 | 99.8
NI/RD/CI | (8, 1) | 200 | 78.1 | 94.5 | 92.2 | 91.1 | 93.0 | 92.0 | 76.2 | 92.4 | 91.8 | 92.6 | 89.4
NI/RD/CI | (16, 2) | 648 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100
NI/RD/CI | (24, 3) | 1352 | 98.8 | 100 | 100 | 100 | 100 | 100 | 99.8 | 100 | 99.8 | 100 | 99.8
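The "Bins" counts of 10, 18 and 26 for the individual descriptors (p = 8, 16, 24) correspond to the p + 2 rotation-invariant uniform (riu2) labels of Ojala et al. [4]: a pattern with at most two circular 0/1 transitions maps to its number of set bits, and all other patterns share a single non-uniform label. A minimal sketch (function name hypothetical):

```python
def riu2_code(pattern, p):
    """Rotation-invariant uniform (riu2) mapping: a p-bit code with at most
    two circular 0/1 transitions maps to its number of set bits (0..p);
    every other code maps to the single 'non-uniform' label p + 1."""
    bits = [(pattern >> n) & 1 for n in range(p)]
    transitions = sum(bits[n] != bits[(n + 1) % p] for n in range(p))
    return sum(bits) if transitions <= 2 else p + 1

# For p = 8 there are exactly p + 2 = 10 distinct labels, matching the
# 10 / 18 / 26 "Bins" entries for p = 8 / 16 / 24 in Table 4.
labels = {riu2_code(c, 8) for c in range(256)}
```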

Table 5
Classification accuracies (%) for the three Outex test suites, where training was done at angle 0 and testing at the remaining 9 angles. The mean accuracy is the average over the three test suites. The results for LBP, VAR, and LBP/VAR are quoted directly from the original paper by Ojala et al. [4]. The bold numbers indicate the highest classification score achieved on each dataset.

available texture databases, namely the Brodatz [31], the Outex [32], the CUReT [3], and the KTHTIPS2b [27, 30] databases. The presentation of the experimental results is divided into two groups with corresponding objectives.

Experiment #1, presented in Section 3.2, aims at investigating the proposed approach for gray scale and rotation invariant texture classification, comparing our proposed descriptors with the classical LBP and VAR descriptors proposed by Ojala et al. [4] and with other LBP based approaches [21, 33, 20]. This setup utilizes the same texture test suites and experimental setup as those used by Ojala et al. [4] (except that additional training angles were tested for Outex_TC_00010 and Outex_TC_00012).

Experiment #2, presented in Section 3.3, examines the classification performance of the proposed approach for two more realistic and challenging texture classification tasks:

1. Material classification dealing with exemplar identification, where instances are imaged from single images obtained under unknown viewpoint and illumination, using the popular CUReT database [3].
2. Material categorization, where each material consists of instances imaged from multiple different physical samples under different viewpoints, illuminations and imaging distances, using the material database KTHTIPS2b [27, 30].

In both cases, comparisons are made with state-of-the-art methods that have reported results on these datasets.

3.1. Methods tested

LBP and VAR [4]: three descriptors, the joint LBP^riu2_{p,r}/VAR_{p,r}, LBP^riu2_{p,r} and VAR_{p,r}, are used in comparison. We follow the experimental setup in [4] for these three descriptors; see [4] for details. VAR needs pretraining.

DLBP and NGF [20]: DLBP is an LBP variant extracting the dominant LBP patterns in a texture image for classification. It is suggested in [20] that DLBP in combination with another complementary Gabor based descriptor, NGF, which captures global texture information, can yield improved and robust classification results. This method also needs pretraining.

CLBP [21]: A local texture patch is represented by its center pixel and the sign and magnitude of the differences of the neighborhoods against the center pixel. CLBP is training free.

The following three state-of-the-art approaches all need a time consuming universal texton dictionary learning stage:

VZ-MR8 [2, 3]: Eight filter responses derived from the responses of 38 filters with large spatial support. A complicated anisotropic Gaussian filtering method was used to calculate the MR8 responses, a texton dictionary is learned from the MR8 feature space, and a histogram model is learned for an image by labeling each of the image pixels with the texton that lies closest to it in filter response space.

VZ-Joint [3]: VZ-Joint is identical to VZ-MR8 except for the local descriptor used: instead of a dense filter bank descriptor, the raw pixel intensities of an N × N square neighborhood around each point are taken as features.

VZ-MRF [3]: A texture image is represented using a two-dimensional histogram: one dimension for the quantized bins of the patch center pixel, the other for the learned textons from the patch with the center pixel excluded. The number of bins for the center pixel in [3] is as large as 200, and the size of the texton dictionary is 61 × 40 = 2440, resulting in an extremely high dimensionality of 2440 × 200 = 488,000.

Implementation details: To make the comparisons as meaningful as possible, we keep our experimental settings as in [4]. The descriptor

Table 6
The number of misclassified samples for each texture class and rotation angle for NI−LBP^riu2_{16,2}/RD−LBP^riu2_{16,2}/CI−LBP on test suites Outex_TC_00010 / Outex_TC_00012 "tl84" / Outex_TC_00012 "horizon" (each cell lists the three counts in that order; • denotes zero). Only texture classes with misclassified samples are shown; all other texture classes are correctly classified. This table can be compared with Table 5 from Ojala et al. [4], where however only results for Outex_TC_00010 are shown.

Texture | 0° | 5° | 10° | 15° | 30° | 45° | 60° | 75° | 90° | Total | All
canvas001 | •/•/• | •/•/• | •/•/•  | •/•/• | •/1/•  | •/•/• | •/1/•  | •/1/•  | •/•/•  | •/3/•    | 3
canvas033 | •/•/• | •/2/• | 1/3/2  | 1/3/3 | •/1/2  | 1/3/2 | •/5/4  | 3/6/5  | 3/8/4  | 9/31/22  | 62
canvas038 | •/•/1 | •/•/1 | •/•/2  | •/•/1 | •/•/2  | •/•/2 | •/•/5  | •/2/4  | 1/4/8  | 1/6/26   | 33
tile005   | •/4/1 | •/5/• | •/3/1  | •/5/• | •/2/•  | •/2/• | •/1/•  | •/•/•  | •/1/•  | •/23/2   | 25
tile006   | •/•/3 | •/1/3 | 3/3/2  | 2/•/4 | 2/2/8  | 2/4/5 | 3/4/4  | 2/4/3  | 5/4/6  | 19/22/38 | 79
carpet002 | •/•/• | •/•/• | •/•/•  | •/•/• | •/1/•  | •/1/• | •/•/•  | •/1/•  | •/1/•  | •/4/•    | 4
total     | 0/4/5 | 0/8/4 | 4/9/7  | 3/8/8 | 2/6/12 | 3/9/9 | 3/10/13 | 5/12/12 | 9/18/18 | 29/89/88 | 206

Fig. 10. Some example texture samples from tile005 (top row) and tile006 (bottom row). We can see that they look fairly similar.

abbreviations are summarized in Table 3. In all experiments, each texture sample is normalized to be zero mean and unit standard deviation. Results for the CUReT database are reported over 100 random partitions of training and testing sets. 1NN is used for classification.

3.2. Experiment #1

3.2.1. Image data and experimental setup
Contrib_TC_00001: This test suite consists of 16 texture classes from the Brodatz database [31] (a few shown in Fig. 8). This test suite was designed for rotation invariant texture classification (https://www.ee.oulu.fi/mvg/page/image_data). There are eight samples of size 180 × 180 in each class, out of which the first sample is utilized for training and the other seven for testing. Given ten rotation angles, the classifier is trained with samples artificially rotated to just one angle and tested against samples rotated to the other nine angles. In each experiment, the classifier was trained with 16 images and tested with 1008 (16 × 7 × 9) samples, 63 in each of the 16 texture classes. Following [4], each training sample is split into 121 disjoint 16 × 16 subsamples, whose histograms are then merged into one model histogram. We point out that the seven testing images in each texture class are physically different from the one designated training image.

Outex_TC_00010: 24 Outex texture classes (shown in Fig. 9), with each class having 20 samples. It was created by Ojala et al. [4], again for rotation invariant texture classification. All textures in this test suite have the same illuminant "inca". The training and testing scheme is the same as that for Contrib_TC_00001, but with nine different rotation angles. All of the 480 (24 × 20) samples rotated by one angle are adopted as the training data, and the testing data consists of all 480 samples rotated by the other 8 angles. Hence, there are 480 models for training, and 3840 (480 × 8) for validation.

Outex_TC_00012: Created by Ojala et al. [4] for rotation and illumination invariant texture classification. The texture classes are the same as Outex_TC_00010. The classifier was trained with the same training samples as Outex_TC_00010, but tested with all samples captured at all 9 rotation angles under the different illuminants "tl84" or "horizon". Due

Table 7
Classification accuracies (%) of descriptor NI/RD/CI for Outex_TC_00010 and Outex_TC_00012: training is done at just one rotation angle, with the average accuracy over 9 angles.

Test suite | (p, r) | Rotation angle for training ("inca"): 0° | 5° | 10° | 15° | 30° | 45° | 60° | 75° | 90° | Average

Outex_TC_00012 ("tl84")
(8, 1) | 90.9 | 91.6 | 92.1 | 93.0 | 91.3 | 90.8 | 88.9 | 89.0 | 84.3 | 90.2
(16, 2) | 98.0 | 98.3 | 99.1 | 98.6 | 98.4 | 98.6 | 98.6 | 97.7 | 96.8 | 98.3
(24, 3) | 97.3 | 98.3 | 98.5 | 98.7 | 97.2 | 96.4 | 93.4 | 94.2 | 94.1 | 96.5
(8, 1) + (16, 2) | 97.4 | 98.0 | 98.4 | 98.5 | 98.3 | 98.3 | 97.8 | 97.1 | 95.6 | 97.7
(8, 1) + (24, 3) | 97.7 | 98.3 | 98.7 | 98.7 | 98.5 | 97.9 | 96.4 | 96.6 | 96.4 | 97.7
(16, 2) + (24, 3) | 98.3 | 99.0 | 99.3 | 99.2 | 98.9 | 98.9 | 98.3 | 98.1 | 98.1 | 98.7
(8, 1) + (16, 2) + (24, 3) | 98.5 | 98.9 | 99.1 | 99.1 | 99.0 | 98.9 | 98.4 | 98.2 | 98.1 | 98.7

Outex_TC_00012 ("horizon")
(8, 1) | 92.7 | 92.8 | 93.3 | 93.6 | 92.7 | 91.6 | 90.3 | 91.1 | 86.6 | 91.6
(16, 2) | 98.0 | 98.0 | 98.3 | 98.4 | 97.7 | 97.9 | 98.2 | 98.3 | 98.1 | 98.1
(24, 3) | 96.2 | 97.0 | 97.0 | 97.3 | 95.5 | 95.1 | 92.7 | 93.7 | 94.1 | 95.4
(8, 1) + (16, 2) | 98.2 | 97.8 | 98.3 | 97.9 | 97.1 | 97.8 | 98.2 | 97.8 | 97.0 | 97.8
(8, 1) + (24, 3) | 97.8 | 97.5 | 97.7 | 97.7 | 96.2 | 96.1 | 95.1 | 95.2 | 95.1 | 96.3
(16, 2) + (24, 3) | 97.8 | 98.3 | 98.2 | 98.3 | 97.3 | 97.5 | 96.9 | 97.0 | 97.7 | 97.7
(8, 1) + (16, 2) + (24, 3) | 97.8 | 98.4 | 98.4 | 98.2 | 97.4 | 97.7 | 97.5 | 97.1 | 97.6 | 97.8

Outex_TC_00010 ("inca")
(8, 1) | 96.5 | 96.3 | 97.4 | 97.6 | 96.2 | 95.3 | 92.7 | 94.9 | 91.8 | 95.4
(16, 2) | 99.3 | 99.4 | 99.5 | 99.7 | 99.6 | 99.6 | 99.5 | 99.0 | 99.0 | 99.4
(24, 3) | 99.2 | 99.5 | 99.4 | 99.5 | 99.5 | 99.5 | 99.2 | 99.3 | 99.1 | 99.4
(8, 1) + (16, 2) | 99.4 | 99.4 | 99.6 | 99.6 | 99.5 | 99.4 | 99.4 | 99.0 | 98.6 | 99.3
(8, 1) + (24, 3) | 99.3 | 99.5 | 99.5 | 99.5 | 99.6 | 99.6 | 99.7 | 99.4 | 99.2 | 99.5
(16, 2) + (24, 3) | 99.6 | 99.7 | 99.8 | 99.7 | 99.7 | 99.9 | 99.8 | 99.7 | 99.5 | 99.7
(8, 1) + (16, 2) + (24, 3) | 99.7 | 99.7 | 99.7 | 99.6 | 99.6 | 99.8 | 99.9 | 99.7 | 99.4 | 99.7

The bold numbers indicate the highest classification score achieved on each dataset.

Fig. 11. Comparing the best classification scores of our approach with various state-of-the-art methods on all three test suites. All the results are as originally reported, except for those of VZ-MR8 and VZ-Joint, which are obtained by us using the exact same experimental setup as Varma and Zisserman [2, 3]. For VZ-MR8 and VZ-Joint, 40 textons per class are used for building the universal texton dictionary.

to the varying illuminants, some texture samples have a large tactile dimension, which induces significant local gray-scale distortions; therefore Outex_TC_00012 is more challenging than Outex_TC_00010.

3.2.2. Experimental results on Contrib_TC_00001
Ojala et al. [4] reported a near-perfect classification accuracy of 99.7% for the joint descriptor LBP/VAR when using two spatial resolutions (8, 1) + (24, 3) or three spatial resolutions (8, 1) + (16, 2) + (24, 3). Table 4 presents the results for our proposed descriptors, compared with the state-of-the-art methods [4].

The individual descriptors NI-LBP and RD-LBP perform similarly, with NI-LBP doing slightly better. NI−LBP^riu2_{16,2} and NI−LBP^riu2_{24,3} significantly outperformed their simpler counterpart NI−LBP^riu2_{8,1}. This is also the case with RD-LBP. Interestingly, the performance of NI-LBP increases with the neighborhood size, while for RD-LBP the best performance is achieved by RD−LBP^riu2_{16,2}. On average, among the individual descriptors, LBP performs the best and VAR the worst. The center pixel also provides useful discriminative information, since it is apparent in Table 4 that combining the center pixel CI-LBP with NI-LBP or RD-LBP can generally improve classification performance. Neglecting the center pixel clearly results in a loss of information, similar to how [3] and [21] demonstrated the benefits of explicitly including the information of the center pixel in the classifier.
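The joint histogramming of Section 2.4, which underlies the RD/CI, NI/CI and NI/RD/CI results in Table 4, can be sketched as follows. The array names are hypothetical: inputs are flat integer arrays of the per-pixel riu2-mapped NI and RD codes and the binary CI code, and the chosen bin layout of (p + 2) × (p + 2) × 2 is an illustrative assumption.

```python
import numpy as np

def joint_histogram(ni_codes, rd_codes, ci_codes, p):
    """Joint NI/RD/CI histogram: NI and RD are riu2 labels in 0..p+1,
    CI is the binary center-intensity code, giving (p+2)*(p+2)*2 bins."""
    n_riu2 = p + 2
    idx = (ni_codes * n_riu2 + rd_codes) * 2 + ci_codes  # flatten 3-D index
    hist = np.bincount(idx, minlength=n_riu2 * n_riu2 * 2).astype(float)
    return hist / hist.sum()  # normalized, as fed to the chi-square 1NN rule
```

For p = 8 this gives 200 bins, matching the NI/RD/CI entry for (8, 1) in Table 4.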

Fig. 12. One sample of each of the 61 texture classes from CUReT.

Fig. 13. The variations within each category of the new KTHTIPS2b database. Each row shows one example image from each of four samples of a category.

The fusion of NI-LBP and RD-LBP produced perfect classification results at (16, 2). The much improved classification accuracy obtained by combining NI-LBP and RD-LBP, which were on their own no better than LBP, implies that they capture truly complementary texture information.

It is evident, based on the results, that the performance of the proposed NI-LBP/RD-LBP and NI-LBP/RD-LBP/CI-LBP descriptors is superior to that of LBP/VAR. Note that here we only consider a single resolution for our descriptors. It may be argued that this test suite is too easy for texture classification; more challenging test suites follow.

3.2.3. Experimental results on Outex_TC_00010 and Outex_TC_00012
Table 5 presents results for our proposed descriptors and those proposed in [4] on test suites Outex_TC_00010 and Outex_TC_00012. The conclusions from these results are similar to those from Table 4, with the following additional observations.

First, our proposed descriptor NI−LBP^riu2_{16,2}/RD−LBP^riu2_{16,2}/CI−LBP produces consistently the best classification scores across all three test suites, a considerable improvement over the best reported results by Ojala et al. [4], especially for Outex_TC_00012 "tl84" and "horizon".

Second, among individual descriptors, although NI-LBP and RD-LBP did not outperform LBP or VAR, their combination significantly outperformed LBP/VAR. We maintain that the NI-LBP/RD-LBP strength stems from their complementarity, in that NI-LBP measures the variation of the neighboring pixels on the same circumference, while RD-LBP captures the edge information between circumferences, analogous to the combination of RIFT and SIFT used by Zhang et al. [8].

Finally, we can see that NI−LBP^riu2_{16,2}/RD−LBP^riu2_{16,2} and NI−LBP^riu2_{16,2}/RD−LBP^riu2_{16,2}/CI−LBP produce very robust classification performance in all three cases. This is in contrast to the descriptor LBP/VAR, the performance of which decreases considerably in gray scale and rotation invariant texture classification. The excellent classification results demonstrate that NI-LBP/RD-LBP/CI-LBP is more stable for texture classification irrespective of the different imaging geometries of the illuminants, which affect the appearance of local distortions caused by the tactile dimension of textures.

Table 6 shows the number of misclassified samples for each texture and rotation angle for the best descriptor NI−LBP^riu2_{16,2}/RD−LBP^riu2_{16,2}/CI−LBP for all three cases, allowing a detailed analysis of the discrimination of individual textures and the effect of rotation. Overall, NI−LBP^riu2_{16,2}/RD−LBP^riu2_{16,2}/CI−LBP classified 18 out of the 24 classes completely correctly, having most difficulties with tile006, followed by canvas033.

Incidentally, for all three test suites nearly all of the misclassified samples in tile006 and tile005 were assigned to each other. Fig. 10 shows some example textures from classes tile005 and tile006, where we can observe the high degree of perceptual similarity.

Motivated by its excellent classification performance, and in order to fully examine the classification performance of descriptor NI-LBP/RD-LBP/CI-LBP, Table 7 shows the results of extensive experiments which we conducted on the three Outex test suites by varying the training angle. We can see that the performance is very robust, especially with NI−LBP^riu2_{16,2}/RD−LBP^riu2_{16,2}/CI−LBP. We can also observe the better results obtained by multiresolution analysis over single resolution. We acknowledge that the multiresolution analysis

Table 8
Comparing classification accuracy (%) on CUReT: Ntr is the number of training samples per class used. All results are obtained by us except for VZ-Joint, which is quoted from the recent comparative study of Zhang et al. [8]. For VZ-MR8, we learn 10 textons per class.

(p, r) | (8, 1) | (16, 2) | (24, 3)
Ntr | 46 / 23 / 12 / 6 / 2 | 46 / 23 / 12 / 6 / 2 | 46 / 23 / 12 / 6 / 2
LBP/VAR | 93.76 / 88.71 / 81.80 / 71.08 / 50.43 | 94.00 / 89.76 / 81.53 / 71.09 / 52.77 | 91.90 / 85.34 / 77.12 / 66.04 / 48.64
NI/RD/CI | 95.15 / 92.00 / 86.19 / 77.97 / 57.96 | 95.63 / 92.70 / 87.12 / 79.57 / 60.89 | 92.59 / 87.85 / 80.92 / 70.33 / 52.21

(p, r) | (8, 1) + (16, 2) | (8, 1) + (24, 3) | (8, 1) + (16, 2) + (24, 3)
Ntr | 46 / 23 / 12 / 6 / 2 | 46 / 23 / 12 / 6 / 2 | 46 / 23 / 12 / 6 / 2
NI/RD | 94.78 / 91.17 / 85.73 / 76.67 / 57.28 | 95.79 / 91.17 / 86.26 / 77.25 / 55.68 | 95.75 / 91.88 / 85.41 / 75.71 / 57.80
NI/RD/CI | 96.88 / 93.55 / 89.29 / 80.18 / 61.28 | 96.66 / 93.57 / 88.41 / 79.90 / 60.52 | 96.78 / 93.45 / 88.94 / 79.69 / 62.14

Method | Neighborhood size | 46 / 23 / 12 / 6 / 2
VZ-MR8 | 19 × 19 | 96.37 / 92.34 / 86.96 / 77.17 / 54.88
VZ-Joint | 7 × 7 | 96.19 / 92.00 / 86.56 / 76.87 / 54.69

Table 9
Comparison of highest classification performance on CUReT with state-of-the-art results using the NNC classifier. The number of training and testing samples per class is equal, i.e. 46. Our score of 97.29% is obtained with NI/RD/CI at multiresolutions (8, 1) + (16, 2) + (24, 5). All the results from other methods are quoted directly from the original papers except for those of LBP/VAR, which are obtained by us.

Method | Ours | LBP/VAR | VZ-MR8 [2] | VZ-Joint [3] | VZ-MRF [3] | CLBP [21]
Neighborhood size | 11 × 11 | 5 × 5 | 49 × 49 | 19 × 19 | 11 × 11 | 7 × 7
Bins | 2200 | 416 | 2440 | 610 | 219,600 | 2200
Accuracy (%) | 97.29 | 94.00 | 97.43 | 97.17 | 98.03 | 97.39

[Fig. 14 bar chart: classification accuracy (%) versus number of training samples per material, for Our approach, LBP, VZ-MR8, VZ-Joint and CLBP.]


will increase the dimensionality of the histogram feature; however, the largest dimensionality of 2200 for three resolutions is not a big problem.

Fig. 14. Comparing the proposed approach with various state-of-the-art methods on KTHTIPS2b.
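The dimensionality of 2200 quoted above can be checked directly: each resolution contributes a joint NI/RD/CI histogram of (p + 2) × (p + 2) × 2 bins (a short sketch; the function name is hypothetical):

```python
# Joint NI/RD/CI bins per resolution: (p + 2) riu2 labels for each of
# NI-LBP and RD-LBP, times 2 for the binary CI-LBP code.
def ni_rd_ci_bins(p):
    return (p + 2) * (p + 2) * 2

sizes = [ni_rd_ci_bins(p) for p in (8, 16, 24)]  # per-resolution sizes
total = sum(sizes)  # concatenated multiresolution feature length
print(sizes, total)  # [200, 648, 1352] 2200
```

These per-resolution sizes also match the 200, 648 and 1352 bin counts for NI/RD/CI in Table 4.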
To conclude Experiment #1, Fig. 11 compares the best scores achieved by our proposed method and those reported by six other state-of-the-art methods. It is quite clear that our approach consistently outperforms all state-of-the-art methods in gray scale and rotation invariant texture classification.

It is important to emphasize that although our proposed descriptors are motivated by LBP, in practice we are extracting very different local texture information, whereas DLBP, LBPV and CLBP are all LBP-based approaches. LBP, CLBP, and our proposed approach share the advantage of being training-free and computationally simple, since they are based upon a pre-defined dictionary rather than one derived with reference to the dataset to be analyzed. In contrast, VZ-MR8 and VZ-Joint require a time-consuming universal texton dictionary learning stage, clustering local feature vectors extracted from training samples. From Fig. 11 we see that our approach has about a 5%–7% improvement over VZ-MR8 and VZ-Joint, most likely in part due to the limited training samples available for learning the universal texton dictionary, leading to a drop in accuracy for VZ-MR8 and VZ-Joint.

3.3. Experiment #2

Motivated by the excellent performance of the proposed approach demonstrated in the previous section, here we test the performance of the proposed approach for material classification and categorization, using the CUReT and KTHTIPS2b databases.

3.3.1. Image data and experimental setup
CUReT [2, 3, 35]: The original CUReT database [35] consists of 61 texture classes, shown in Fig. 12, with each class containing 205 images of a physical texture sample photographed under a (calibrated) range of viewing and lighting angles, but without significant variation in scale or in-plane rotation. CUReT is a challenging test of texture descriptors because of the large intra-class variation, including the effects of specularities, interreflections, shadowing, and other surface normal variations due to lighting geometry. Consistent with other CUReT studies [2, 3], we consider only the 92 images per class which afford the extraction of a 200 × 200 pixel foreground region of texture, the same subset of images as in [2, 3].

KTHTIPS2b [36, 27]: It is generally agreed [2, 3, 27] that the major drawback of the CUReT database is that materials are imaged at a constant scale. The acquisition procedure for KTHTIPS2b has been described in more detail in [36], with 3 viewing angles, 4 illuminants, and 9 different scales, producing 432 images per class. Fig. 13 illustrates an example of the 11 materials. Notice in particular the striking differences between samples of the same class. There is almost no intra-class variation due to in-plane rotation for this database.

For the experiments on KTHTIPS2b, we follow the training and testing scheme used in [27]. We perform experiments training on one, two, or three samples; testing is always conducted only on unseen samples.

3.3.2. Experimental results
Table 8 presents the results on CUReT. Our method consistently outperformed LBP/VAR, and the multiresolution analysis of the proposed approach helps to improve classification performance, producing slightly higher classification scores than VZ-MR8 and VZ-Joint. Moreover, it is clear that the degree of improvement of our descriptor NI-LBP/RD-LBP/CI-LBP over that of LBP/VAR increases given fewer training samples, in accordance with the findings based on the three Outex test suites.

In order to make the comparison fair, Table 9 compares the best classification scores achieved by various state-of-the-art methods on CUReT. We can see that our proposed approach is outperformed by VZ-MR8, VZ-Joint, and VZ-MRF with large neighborhood sizes and more textons. This is because VZ-MR8 and VZ-Joint are statistical approaches, and the very large number of training samples in this dataset allows those methods to find representative textons; with reduced training data the performance of the VZ methods would decrease. Nevertheless, our method, despite a small spatial support, can compete with VZ-MR8 and VZ-Joint, which have a much larger spatial support. The lesser performance, in general, of the LBP methods is due to the scale and affine variations in the CUReT database, while LBP-based approaches are proposed for rotation and gray level invariance and have limited capability to address scale and affine invariance.

We have conducted experiments with larger neighborhood sizes on CUReT, and our descriptor NI/RD/CI at multiresolutions (8, 1) + (16, 2) + (24, 5) gives classification scores of 97.29%, 94.48%, 88.96%, 80.70% for 46, 23, 12, and 6 training samples per texture class, respectively, which are slightly better than the results achieved by the CLBP approach (97.39%, 94.19%, 88.72%, and 79.88%, respectively).

Table 10
Classification results (%) of the proposed descriptors and the LBP on KTHTIPS2b. CLBP_S/M/C represents CLBP_S^riu2_{p,r}/M^riu2_{p,r}/C.

(p, r) | (8, 1) | (16, 2) | (24, 3)
Ntrain | 1 / 2 / 3 | 1 / 2 / 3 | 1 / 2 / 3
LBP | 48.1 / 54.2 / 56.8 | 50.5 / 55.8 / 59.1 | 49.9 / 54.6 / 57.8
NI | 46.1 / 48.7 / 52.3 | 37.6 / 41.2 / 44.4 | 39.0 / 42.6 / 45.4
RD | 48.1 / 54.2 / 56.9 | 44.0 / 49.4 / 52.1 | 38.5 / 42.7 / 45.2
RD/CI | 52.6 / 57.8 / 61.2 | 49.9 / 56.2 / 59.8 | 48.0 / 53.5 / 56.3
NI/CI | 47.1 / 53.1 / 56.4 | 46.5 / 50.5 / 53.0 | 44.8 / 48.8 / 51.6
NI/RD | 53.5 / 60.0 / 63.1 | 54.0 / 59.0 / 61.6 | 49.5 / 55.2 / 59.1
NI/RD/CI | 56.6 / 61.9 / 64.8 | 57.7 / 62.5 / 65.1 | 52.4 / 57.5 / 61.7

(p, r) | (8, 1) + (16, 2) | (16, 2) + (24, 3) | (8, 1) + (16, 2) + (24, 3)
Ntrain | 1 / 2 / 3 | 1 / 2 / 3 | 1 / 2 / 3
NI/RD/CI | 58.1 / 62.9 / 66.0 | 55.9 / 61.0 / 64.2 | 56.7 / 61.7 / 65.0
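All of the scores above are produced with the 1NN χ² matching described in Section 2.4 (Eq. (11)); a minimal sketch follows (function names hypothetical, with a small constant added to the denominator to avoid division by zero on empty bins):

```python
import numpy as np

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance of Eq. (11) between two normalized histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def classify_1nn(test_hist, model_hists, model_labels):
    """1NN rule used throughout the experiments: assign the label of the
    model histogram with the smallest chi-square distance."""
    dists = [chi_square(test_hist, m) for m in model_hists]
    return model_labels[int(np.argmin(dists))]
```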
(97.39%, 94.19%, 88.72%, and 79.88%, respectively).

As shown in Table 10 and Fig. 14, we also compare our method with state-of-the-art methods on the material categorization task of the KTHTIPS2b textures, with all results from other methods quoted directly from [27]. For this database, our proposed NI-LBP/RD-LBP/CI-LBP descriptor outperforms all compared state-of-the-art methods by a significant margin. We should bear in mind that the classification results of all of the methods are obtained with a 1NN classifier, since we mainly focus our attention on the effectiveness of the descriptors rather than on the capabilities of the classifier. Using a more advanced classifier (SVM, or kNN with k > 1) might improve performance significantly.

4. Conclusions and future work

This paper has proposed a novel local texture descriptor, generalizing the well-known LBP approach. Four LBP-like descriptors, the two local intensity-based CI-LBP and NI-LBP, and the two local difference-based descriptors RD-LBP and AD-LBP, were presented to extract complementary texture information from local spatial patterns. We showed that combining complementary descriptors plays an important role in texture discrimination. In addition, we found that the information contained in radial differences is more discriminative than that contained in angular differences.

The advantages of the proposed approach include its computational simplicity, the absence of training (in the feature extraction stage), and a data-independent universal texton dictionary. Extensive experimental results show that the joint distribution of CI-LBP, NI-LBP and RD-LBP significantly outperforms the conventional LBP approach and its various variants on the Outex test suites. Furthermore, results on the material database KTHTIPS2b demonstrate the best performance of the proposed approach in comparison with several state-of-the-art methods with a nearest neighbor classifier.

In the future, we plan to explore how to reduce the feature dimen-

[8] J. Zhang, M. Marszalek, S. Lazebnik, C. Schmid, Local features and kernels for classification of texture and object categories: a comprehensive study, Int. J. Comput. Vis. 73 (2) (2007) 213–238.
[9] M. Crosier, L.D. Griffin, Using basic image features for texture classification, Int. J. Comput. Vis. 88 (3) (2010) 447–460.
[10] S. Lazebnik, C. Schmid, J. Ponce, A sparse texture representation using local affine regions, IEEE Trans. Pattern Anal. Mach. Intell. 27 (8) (2005) 1265–1278.
[11] B.S. Manjunath, W.Y. Ma, Texture features for browsing and retrieval of image data, IEEE Trans. Pattern Anal. Mach. Intell. 18 (8) (1996) 837–842.
[12] T. Randen, J. Husøy, Filtering for texture classification: a comparative study, IEEE Trans. Pattern Anal. Mach. Intell. 21 (4) (1999) 291–310.
[13] T. Ojala, M. Pietikäinen, D. Harwood, A comparative study of texture measures with classification based on feature distributions, Pattern Recognit. 29 (1) (1996) 51–59.
[14] Y. Rodriguez, S. Marcel, Face authentication using adapted local binary pattern histograms, European Conference on Computer Vision (ECCV), Graz, Austria, 2006, pp. 321–332.
[15] T. Ahonen, A. Hadid, M. Pietikäinen, Face description with local binary patterns: application to face recognition, IEEE Trans. Pattern Anal. Mach. Intell. 28 (12) (2006) 2037–2041.
[16] G. Zhao, M. Pietikäinen, Dynamic texture recognition using local binary patterns with an application to facial expressions, IEEE Trans. Pattern Anal. Mach. Intell. 29 (6) (2007) 915–928.
[17] A. Hafiane, G. Seetharaman, B. Zavidovique, Median binary pattern for textures classification, Proceedings of ICIAR, 2007, pp. 387–398.
[18] M. Heikkilä, M. Pietikäinen, J. Heikkilä, A texture-based method for detecting moving objects, British Machine Vision Conference (BMVC), London, vol. 1, 2004, pp. 187–196.
[19] M. Pietikäinen, T. Nurmela, T. Mäenpää, M. Turtinen, View-based recognition of real-world textures, Pattern Recognit. 37 (2) (2004) 313–323.
[20] S. Liao, M.W.K. Law, A.C.S. Chung, Dominant local binary patterns for texture classification, IEEE Trans. Image Process. 18 (5) (2009) 1107–1118.
[21] Z. Guo, L. Zhang, D. Zhang, A completed modeling of local binary pattern operator for texture classification, IEEE Trans. Image Process. 19 (6) (2010) 1657–1663.
[22] M. Heikkilä, M. Pietikäinen, C. Schmid, Description of interest regions with local binary patterns, Pattern Recognit. 42 (3) (2009) 425–436.
[23] J. Chen, S. Shan, C. He, G. Zhao, M. Pietikäinen, X. Chen, W. Gao, WLD: a robust local image descriptor, IEEE Trans. Pattern Anal. Mach. Intell. 32 (9) (2010) 1705–1720.
[24] G.R. Cross, A.K. Jain, Markov random field texture models, IEEE Trans. Pattern Anal. Mach. Intell. 5 (1) (1983) 25–39.
[25] T. Ojala, K. Valkealahti, E. Oja, M. Pietikäinen, Texture discrimination with multi-
sion of the multiresolution CI-LBP/NI-LBP/RD-LBP. We also believe dimensional distributions of signed gray-level differences, Pattern Recognit. 34 (3)
that an in-depth investigation of the AD-LBP descriptor would be valu- (2001) 727–739.
able for local region description, looking at the parallels between [26] L. Liu, P. Fieguth, G. Kuang, Generalized local binary patterns for texture classifica-
tion, British Machine Vision Conference (BMVC2011), 2011.
AD-LBP and the CS-LBP of [22] developed for image matching. [27] B. Caputo, E. Hayman, P. Mallikarjuna, Class-specific material categorization, Inter-
national Conference on Computer Vision (ICCV), Beijing, 2005, pp. 1597–1604.
[28] T. Ahonen, M. Pietikäinen, Soft histograms for local binary patterns, Finnish Signal
References Processing Symposium, Oulu, Finland, 2007.
[29] X. Tan, B. Triggs, Enhanced local texture feature sets for face recognition under
[1] M. Tuceryan, A.K. Jain, Texture analysis, in: C.H. Chen, L.F. Pau, P.S.P. Wang (Eds.), difficult lighting conditions, IEEE Trans. Image Process. 19 (6) (2010) 1635–1650.
Handbook Pattern Recognition and Computer Vision, World Scientific, Singapore, [30] B. Caputo, E. Hayman, M. Fritz, J.-O. Eklundh, Classifying materials in the real
1993, pp. 235–276. world, Image Vis. Comput. 28 (1) (2010) 150–163.
[2] M. Varma, A. Zisserman, A statistical approach to texture classification from single [31] P. Brodatz, Texture: A Photographic Album for Artists and Designers, Dover, New
images, Int. J. Comput. Vis. 62 (1–2) (2005) 61–81. York, 1966.
[3] M. Varma, A. Zisserman, A statistical approach to material classification using [32] T. Ojala, T. Mäenpää, M. Pietikäinen, J. Viertola, J. Kyllönen, S. Huovinen, Outex–new
image patches, IEEE Trans. Pattern Anal. Mach. Intell. 31 (11) (2009) 2032–2047. frame work for empirical evaluation of texture analysis algorithm, International
[4] T. Ojala, M. Pietikäinen, T. Mäenpää, Multiresolution gray-scale and rotation in- Conference on Pattern Recognition, 2002, pp. 701–706.
variant texture classification with local binary patterns, IEEE Trans. Pattern [33] Z. Guo, L. Zhang, D. Zhang, Rotation invariant texture classification using LBP vari-
Anal. Mach. Intell. 24 (7) (2002) 971–987. ance (LBPV) with global matching, Pattern Recognit. 43 (3) (2010) 706–719.
[5] T. Leung, J. Malik, Representing and recognizing the visual appearance of materials [34] L. Nanni, A. Lumini, S. Brahnam, Local binary patterns variants as texture descriptors
using three-dimensional textons, Int. J. Comput. Vis. 43 (1) (2001) 29–44. for medical image analysis, Artif. Intell. Med. 49 (2) (2010) 117–125.
[6] L. Zhang, L. Zhang, Z. Guo, D. Zhang, Monogenic-LBP: a new approach for rotation [35] K.J. Dana, B. van Ginneken, S.K. Nayar, J.J. Koenderink, Reflectance and texture of
invariant texture classification, IEEE International Conference on Image Processing real-world surfaces, ACM Trans. Graph. 18 (1) (1999) 1–34.
(ICIP), 2010, pp. 2677–2680. [36] P. Mallikarjuna, M. Fritz, A.T. Targhi, E. Hayman, B. Caputo, J.-O. Eklundh, The
[7] J. Xie, L. Zhang, J. You, D. Zhang, Texture classification via patch-based sparse texton KTH-TIPS and KTH-TIPS2 Databases, http://www.nada.kth.se/cvap/databases/
learning, IEEE International Conference on Image Processing (ICIP), 2010, kth-tips/ 2006.
pp. 2737–2740.
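To make the descriptor family summarized in the conclusions concrete, the following is a minimal illustrative sketch (ours, not the authors' implementation) of how NI-LBP and RD-LBP codes and their joint histogram could be prototyped for P = 8 neighbors. The function names, the integer-offset circular sampling (the published operators use bilinearly interpolated sampling), the fixed 256 × 256 joint binning, and the chi-square distance are all simplifying assumptions for illustration.

```python
import numpy as np

def _circular_neighbors(img, r, margin):
    """Sample P = 8 neighbors on a circle of radius r around each interior
    pixel. Simplification: offsets are rounded to the nearest integer
    instead of being bilinearly interpolated as in the LBP literature."""
    P = 8
    ang = 2.0 * np.pi * np.arange(P) / P
    dy = np.rint(-r * np.sin(ang)).astype(int)
    dx = np.rint(r * np.cos(ang)).astype(int)
    h, w = img.shape
    return np.stack([img[margin + yy : h - margin + yy,
                         margin + xx : w - margin + xx]
                     for yy, xx in zip(dy, dx)])

# Binomial weights 2^0 .. 2^7, one per neighbor, as in classic LBP coding.
_WEIGHTS = (1 << np.arange(8)).reshape(8, 1, 1)

def ni_lbp(img, r=1):
    """NI-LBP: threshold each neighbor against the neighborhood mean
    (classic LBP thresholds against the center pixel instead)."""
    nbrs = _circular_neighbors(img.astype(float), r, margin=r)
    return ((nbrs >= nbrs.mean(axis=0)) * _WEIGHTS).sum(axis=0)

def rd_lbp(img, r=2):
    """RD-LBP: threshold the radial difference between same-direction
    neighbors at radii r and r - 1."""
    f = img.astype(float)
    outer = _circular_neighbors(f, r, margin=r)
    inner = _circular_neighbors(f, r - 1, margin=r)
    return ((outer >= inner) * _WEIGHTS).sum(axis=0)

def joint_hist(img):
    """Joint NI-LBP/RD-LBP histogram (256 x 256 bins, L1-normalized),
    illustrating how complementary codes can be fused."""
    a = ni_lbp(img, r=1)[1:-1, 1:-1]  # crop so both code maps align
    b = rd_lbp(img, r=2)
    h, _, _ = np.histogram2d(a.ravel(), b.ravel(),
                             bins=256, range=[[0, 256], [0, 256]])
    return h.ravel() / h.sum()

def chi2(p, q, eps=1e-10):
    """Chi-square histogram distance; a 1NN classifier would label a test
    image by the training histogram minimizing this distance."""
    return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))
```

With a set of training histograms, nearest-neighbor classification as used in the paper's experiments then reduces to `argmin` of `chi2` over the training set; note that the exact dissimilarity measure used by the authors may differ from this sketch.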