FEATURE HIERARCHIES FOR OBJECT CLASSIFICATION
By: Eng Wei Yong, Rui Hua, Vanya V. Valindria
OUTLINE

1. Introduction
2. Comparison with previous work
3. Algorithm
4. Experiment and Results
5. Conclusion
Feature Hierarchies for Object Classification
• Automatically extracts informative feature hierarchies for object classification
• Works in a top-down manner
• The entire hierarchy is learned during a training phase
Overview of Feature Hierarchy
• Hierarchies are significantly more informative than holistic features
• Selection of effective image features is crucial:
  – Identify common object parts
  – Combine the parts in a manner that allows variations learned from training data

• Input: a set of class and non-class images
• Output: hierarchical features with learned parameters (combination weights, geometric relations)
Previous Work

• Non-hierarchical features
• Feature hierarchies
  – Architecture of the hierarchy is pre-defined
• The advantages of both methods are combined in this paper
Construction of Feature Hierarchies

Algorithm
• Initial informative fragments are selected
• Selected fragments are used to extract the sub-features
• The parameters of the feature hierarchy are optimized
• Classification
(The first two steps are applied recursively until a level of 'atomic fragments' is reached.)
Selecting informative image fragments
• A detection threshold θi is selected for each fragment to maximize MI(fi; C)
• The search then identifies the next fragment that delivers the maximal amount of additional information with respect to the fragments already selected:

  fi = argmax over f ∈ Ki of [ min over fj ∈ Si of ( MI(f, fj; C) − MI(fj; C) ) ]

  where Ki is the pool of candidate fragments and Si is the set of fragments selected up to iteration i
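A minimal sketch of this selection step in Python, assuming each candidate fragment has already been reduced to a vector of binary detections over the training images (the function and variable names here are ours, not the paper's):

```python
import numpy as np

def mutual_information(x, labels):
    """MI (in bits) between a discrete variable x and binary class labels."""
    mi = 0.0
    for v in np.unique(x):
        for c in (0, 1):
            p_vc = np.mean((x == v) & (labels == c))
            p_v, p_c = np.mean(x == v), np.mean(labels == c)
            if p_vc > 0:
                mi += p_vc * np.log2(p_vc / (p_v * p_c))
    return mi

def best_threshold(scores, labels, candidates):
    """Pick the detection threshold theta_i that maximizes MI(fi; C)."""
    return max(candidates,
               key=lambda t: mutual_information((scores >= t).astype(int), labels))

def select_fragments(detections, labels, k):
    """Greedy max-min selection: each new fragment must add information
    with respect to every previously selected fragment.
    `detections` is a (n_fragments, n_images) array of 0/1 detections."""
    n = detections.shape[0]
    mi = lambda i: mutual_information(detections[i], labels)
    # MI of the joint pair (f, fj) with the class: encode the pair in {0..3}.
    pair_mi = lambda i, j: mutual_information(2 * detections[i] + detections[j], labels)
    selected = [max(range(n), key=mi)]
    while len(selected) < k:
        gain = lambda i: min(pair_mi(i, j) - mi(j) for j in selected)
        rest = [i for i in range(n) if i not in selected]
        selected.append(max(rest, key=gain))
    return selected
```

With this max-min criterion, a candidate similar to an already selected fragment yields a small minimum gain and is skipped, which avoids redundant fragments.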
Extracting sub-fragments
Constructing positive and negative examples
• Positive examples are fragments from class-positive images where the feature is detected or almost detected
• Negative examples are fragments from class-negative (non-class) images where the feature is detected or almost detected
(A sketch of assembling these sets follows below.)
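A small sketch of how these example sets might be assembled, assuming we have each image's maximal detection score for the parent feature; the `margin` parameter modeling "almost detected" is our illustrative choice, not the paper's exact criterion:

```python
import numpy as np

def build_example_sets(class_scores, nonclass_scores, theta, margin=0.9):
    """Indices of images whose parent-feature score is at or near the
    detection threshold theta ('detected or almost detected')."""
    near = lambda scores: np.flatnonzero(np.asarray(scores) >= margin * theta)
    positives = near(class_scores)     # from class-positive images
    negatives = near(nonclass_scores)  # from class-negative images
    return positives, negatives
```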
Extracting sub-fragments

[Diagram: a parent fragment is decomposed into child fragments. If the decomposition increases the delivered information, keep it.]
Extracting sub-fragments

[Diagram: grandparent, parent and child fragments in the decomposition tree. If a decomposition does NOT increase the delivered information, stop decomposing: the fragment is atomic.]
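The decomposition loop from the last two slides, sketched in Python; `extract_subs`, `mi_alone` and `mi_with_subs` are our placeholder hooks for the fragment extraction and mutual-information evaluation described above, not names from the paper:

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Fragment:
    patch: object                                    # the image fragment itself
    children: List["Fragment"] = field(default_factory=list)
    atomic: bool = False

def decompose(frag: Fragment,
              extract_subs: Callable[[Fragment], List[Fragment]],
              mi_alone: Callable[[Fragment], float],
              mi_with_subs: Callable[[Fragment, List[Fragment]], float]) -> None:
    """Top-down decomposition: keep a set of sub-fragments only if it
    increases the delivered information; otherwise the fragment is atomic
    (typically edges, corners or lines)."""
    subs = extract_subs(frag)
    if not subs or mi_with_subs(frag, subs) <= mi_alone(frag):
        frag.atomic = True        # stop: decomposition adds no information
        return
    frag.children = subs          # keep: recurse into each sub-fragment
    for s in subs:
        decompose(s, extract_subs, mi_alone, mi_with_subs)
```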
Optimizing ROI
• Size of the ROI:
  – ROI too small → information delivered is low
  – ROI too large → information delivered is low
• The size of the ROI should be chosen to maximize the mutual information between the fragment and the class
• Top-down manner
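A sketch of the ROI-size search, reusing `mutual_information` from the earlier fragment-selection sketch; the square-window parameterization is our simplification:

```python
import numpy as np

def max_in_roi(response_map, center, half):
    """Maximal sub-feature response inside a square ROI around `center`."""
    r, c = center
    h, w = response_map.shape
    return response_map[max(0, r - half):min(h, r + half + 1),
                        max(0, c - half):min(w, c + half + 1)].max()

def optimize_roi_size(maps, centers, labels, theta, half_sizes):
    """Sweep candidate ROI sizes and keep the one maximizing MI(f; C):
    too small and the fragment is often missed, too large and it fires
    on background, so the information is low at both extremes."""
    def mi_at(half):
        det = np.array([int(max_in_roi(m, c, half) >= theta)
                        for m, c in zip(maps, centers)])
        return mutual_information(det, labels)   # from the earlier sketch
    return max(half_sizes, key=mi_at)
```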
Classification by hierarchy
• The response of a feature combines the responses of all its sub-features: at each image position, take the maximal response of each sub-feature within its ROI and combine these responses
• The final response Sp satisfies −1 < Sp < 1
• At the top level, compare Sp with 0
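As an illustration of how a parent response with −1 < Sp < 1 can be formed: a weighted sum of the sub-features' maximal ROI responses passed through tanh. The paper's exact combination rule is not reproduced here, so treat the squashing choice as our assumption:

```python
import numpy as np

def parent_response(weights, bias, sub_max_responses):
    """Combine each sub-feature's maximal response within its ROI into a
    single parent response in (-1, 1); tanh is our illustrative squashing."""
    return np.tanh(bias + np.dot(weights, sub_max_responses))

# Top level: classify as 1 if the final response Sp > 0, otherwise 0.
```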
Classification by hierarchy
• During training, weights and positions are updated alternately
• Position step:
  – Weights fixed
  – Optimize positions
• Weight step:
  – Positions fixed
  – Optimize weights
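A generic coordinate-ascent sketch of this alternation; `position_step` and `weight_step` are our placeholders for the paper's two update rules, which are not reproduced here:

```python
def alternate_training(params, position_step, weight_step, n_rounds=10):
    """Alternate the two optimization steps: update sub-feature positions
    with the weights held fixed, then update the combination weights with
    the positions held fixed."""
    for _ in range(n_rounds):
        params = position_step(params)   # weights fixed
        params = weight_step(params)     # positions fixed
    return params
```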
Summary of algorithm
• Hierarchical Feature Construction

[Diagram: positive and negative images feed the hierarchical feature construction; each fragment's score S(f) is computed and its mutual information evaluated.]
Summary of algorithm
• Hierarchical Feature Construction

[Diagram: the hierarchy H is grown from the positive and negative images; S(f) is computed for each leaf fragment and its MI evaluated; fragments whose decomposition adds no information are marked atomic, and each ROI is optimized.]
Summary of algorithm
• Classification step

[Diagram: a novel image is cross-correlated with the leaf fragments of the hierarchy to produce response maps.]
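A direct (unoptimized) normalized cross-correlation sketch for producing a leaf fragment's response map over a novel image; a real implementation would use an FFT-based or library routine instead:

```python
import numpy as np

def response_map(image, fragment, eps=1e-8):
    """Normalized cross-correlation of `fragment` with every valid position
    of `image`; each entry lies in [-1, 1]."""
    fh, fw = fragment.shape
    f = (fragment - fragment.mean()) / (fragment.std() + eps)
    out = np.empty((image.shape[0] - fh + 1, image.shape[1] - fw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            win = image[r:r + fh, c:c + fw]
            out[r, c] = np.mean(f * (win - win.mean()) / (win.std() + eps))
    return out
```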
Summary of algorithm
• Classification step

[Diagram: sub-feature response maps are combined bottom-up into feature response maps, up to the top node's final response; a final response > 0 classifies the image as 1, otherwise 0.]
Experiment

3 object classes: faces, cows and airplanes
Experiment

[Results: ROC curves compare single holistic features with their hierarchical decompositions, and full classifiers using 50 holistic vs. 50 hierarchical features; Equal Error Probabilities show the classifier performs best at 30-40 fragments.]
Conclusions
Pros:
• The extraction of image fragments is automatic
• The hierarchies outperform holistic features
• Feature hierarchies can be used to improve the performance of classification schemes

Cons:
• Optimization of features is not quite complete: features are chosen for maximal MI and only then have their ROIs optimized, so some ultimately better features may be excluded
• The application stage is computationally expensive: a response map must be computed over the entire image for every node in the hierarchy
THANK YOU…..

Editor's Notes

  • #4 Feature Hierarchies for Object Classification is a method for automatically extracting informative feature hierarchies for object classification. It works in a top-down manner: informative top-level fragments are extracted first, and by repeated application of the same feature extraction process, the classification fragments are broken down successively into their own optimal components. The hierarchical decomposition terminates with atomic features that cannot be usefully decomposed into simpler features. The entire hierarchy, the different features and sub-features, and their optimal parameters are learned during a training phase using training examples.
  • #5 Hierarchies are significantly more informative and better for classification. Experimental evaluations show that the decomposition by our method increases the amount of information delivered by the fragments by a wide margin, improves the detection rate, and increases the tolerance for local distortions and illumination changes. The selection of effective image features is crucial for a successful classification scheme: first, they identify common object parts that characterize the different objects within the class, and second, the parts are combined in a manner that allows variations learned from training data. Output: hierarchical features with learned parameters (combination weights, geometric relations).
  • #6 The features used by these methods were non-hierarchical, that is, they were not broken down into distinct simpler sub-parts, but detected directly by comparing the fragment to the image. Their similarity can be measured by different measures, including normalized cross-correlation, affine-invariant measures [6], and the SIFT measure [7]. A number of classification schemes have also used feature hierarchies rather than holistic features. Such schemes were often based on biological modeling, motivated by the structure of the primate visual system, which has been shown to use a hierarchy of features of increasing complexity, from simple local features in the primary visual cortex to complex shapes and object views in higher cortical areas. In a number of these models [8,9], the architecture of the hierarchy (size, position and shape of features and their sub-features) is pre-defined rather than learned for different classification tasks. One study uses a network model in which both the combination weights and the convolution templates were learned from examples by backpropagation, whereas the number of hierarchy levels and the positional tolerance were pre-defined. In the present work, we combine the advantages of learning informative classification fragments with the learning of a hierarchical structure with adaptive parameters. In summary, classification features used in the past were either highly informative but non-hierarchical, or hierarchical but less informative and not as useful.
  • #7 These two steps are applied recursively until a level of 'atomic fragments' is reached.
  • #8 The process identifies fragments that deliver the maximal amount of information about the class. The mutual information is a function of the detection threshold θi. If the threshold is too low, the information delivered by the fragment about the class will be low, because the fragment will be detected with high frequency in both class and non-class images. A high threshold will also yield low mutual information, since the fragment will be seldom detected in both class and non-class images. At some intermediate value of the threshold, the mutual information reaches a maximum. The detection threshold for each fragment is selected to maximize the information MI(fi;C) between the fragment and the class. After finding the fragment with the highest mutual information score, the search identifies the next fragment that delivers the maximal amount of additional information with respect to previously selected fragments. At iteration i the fragment fi is selected to increase the mutual information of the fragment set by maximizing the minimal addition in mutual information: fi = argmax over f ∈ Ki of min over fj ∈ Si of [MI(f, fj; C) − MI(fj; C)]. Here Ki is the set of candidate fragments, Si is the set of selected fragments up to iteration i, and fi is the fragment to be selected at iteration i. The min is taken over all previously selected fj to avoid redundancy: if a candidate is similar to one of the selected fragments, this minimum will be small. The max stage then finds the candidate in the pool with the largest additional contribution. In empirical testing, this algorithm was shown to…
  • #9 For any feature, the aim is to repeat the above process using the new feature as the target, rather than the overall object. First, we need to prepare the training example sets. For the top level of features, the example sets only include detected or non-detected features. But with the hierarchical structure, features can be decomposed into many sub-features, which can detect more difficult examples.
  • #10 After the positive and negative examples are set up, we can extract sub-fragments using the same procedure as introduced by Eng. If a set of sub-features increases the information, it is added to the hierarchy. This process is easily explained using a family tree of fragments.
  • #11 After the positive and negative examples are set up, we can extract sub-fragments using the same procedure as introduced by Eng. If a set of sub-features increases the information, it is added to the hierarchy. This process is easily explained using a family tree of fragments. If a fragment is considered atomic, it cannot be decomposed again; usually it contains edges, corners or lines.
  • #15 The input is a set of positive class images (here, faces) and a set of negative non-class images. First, we initialize H as a tree with a single node. We extract a set of first-level fragments and add them as children. Then, we evaluate the mutual information between the tree and the class. For each leaf fragment f, we determine its positive and negative example sets and find the most informative set among the sub-fragments of f.
  • #16 We add these fragments as children and evaluate their mutual information again. If it does not increase (compared with the case when we do not use these fragments), we remove them and mark the leaf node as an 'atomic' fragment. Otherwise, we add these fragments to our tree. We repeat these steps until all of the fragments are marked as atomic. We should also optimize the ROI size as described before.
  • #17 For the classification stage, we compute the correlations of ALL the leaf nodes of H with the image and store these values, so that we get the response maps.
  • #18 For every node of H whose children's response maps have been computed, we compute its own response map. How? At each position of the feature within the image, we find the maximal response of each child in its ROI and combine their responses. We repeat this process from the bottom (the children) to the top (the parents) of the hierarchy. Using the response map of the top node, we take the maximal response within its ROI and compare it to 0. If it is higher, we classify the image as 1; otherwise, we set it to 0.
  • #19 For each object class, the most informative holistic feature was determined and, for comparison, a hierarchy of sub-features was extracted. Using 150 top-level fragments in the hierarchies, the ROC is better than with the holistic features. As we can see, the ROC detection curves also improve when we use a full hierarchy for classification, while the lowest ROC is obtained for a decomposition using fixed spacing and sub-fragment size. From this we conclude that optimizing the size and location of sub-fragments adds significantly to the MI. This curve shows the mean difference between the ROC curves of classifiers based on a single holistic feature and on its hierarchical decomposition.
  • #20 Next, we compare the performances of the classifiers. First, to determine the number of fragments required for the full classifier, the Equal Error Probabilities (EEP) were calculated using 50 fragments; the classifier performs best at 30-40 fragments. Then, the performances of full classifiers using 50 holistic features and 50 hierarchical features were compared. A higher ROC curve here shows the advantage of hierarchical features.
  • #21 Including the selection of the sub-features as well as their combination weights and ROIs. Both in the amount of delivered information and in recognition performance. And also to extend them to provide a fuller description of the object, with its parts and sub-parts at different levels. Features are chosen for maximum mutual information and then have their ROIs optimized: it's possible that some ultimately more optimal features are excluded. The response map must be calculated over the entire image for every node in the hierarchy. The use of actual image fragments as features seems sub-optimal.
  • #21 Including the selection of the sub-features as well as their combination weights and ROIsBoth in the amount of delivered information and recognition performanceAnd also to extend them to provide fuller description of the object: with their part and sub-part at different levels.Features are chosen for maximum mutual information, then have ROIs optimized: it&apos;s possible that some ultimately more optimal features are excludedResponse map must be calculated over entire image for every node in hierarchyUse of actual image fragments as features seems sub-optimal