Deep Network with Support Vector Machines

Abstract. Deep learning methods aim at learning features automatically at
1 Introduction
Classification results of learning algorithms are inherently limited in performance by the features extracted [1]. Deep learning is required to learn complicated functions that can represent higher-level abstractions. Deep learning architectures consist of multiple intermediate layers rather than a single hidden layer, together with an adequate learning algorithm to train those layers. In deep learning, the multiple layers are expected to replace manual domain-specific feature engineering [2]. Recent neuroscience research has also provided support for deep feature extraction [1]. Despite early attention to the importance of deep architectures [3,4], deep learning was not prevalent because there was no effective learning method applicable to existing learning machines except for a few models [5,6]. The Restricted Boltzmann Machine (RBM) is a generative stochastic neural network that can learn a probability distribution over its set of inputs; it was originally invented by Smolensky in 1986 [7]. After G. Hinton et al. proposed training the RBM with contrastive divergence [8], deep architectures using RBMs became popular in many pattern recognition and machine learning applications and began to win prizes at several pattern recognition competitions without complex manual feature engineering.

* Corresponding author.
M. Lee et al. (Eds.): ICONIP 2013, Part I, LNCS 8226, pp. 458–465, 2013.
© Springer-Verlag Berlin Heidelberg 2013

Although well-trained RBM networks show
good performance, the learning algorithm requires setting user-determined meta-parameters such as the learning rate, the momentum, the weight cost, the sparsity target, the initial values of the weights, the number of hidden units, and the size of each mini-batch [9]. Without careful consideration or optimal tuning of these parameters, training does not perform well or easily falls into the over-fitting problem, resulting in poor generalization performance.
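The contrastive-divergence training mentioned above can be sketched as follows. This is a generic CD-1 update for a binary RBM in NumPy, not code from the paper; the layer sizes, learning rate, and batch are illustrative assumptions, and the learning rate is one of the meta-parameters the text warns must be chosen carefully.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM.
    v0: batch of visible vectors; W: weights; b, c: visible/hidden biases."""
    # Positive phase: hidden probabilities driven by the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step back to a visible reconstruction.
    pv1 = sigmoid(h0 @ W.T + b)
    ph1 = sigmoid(pv1 @ W + c)
    # Approximate gradient: <v h>_data - <v h>_reconstruction.
    n = v0.shape[0]
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c

# Toy usage: 6 visible units, 4 hidden units, a batch of 8 binary vectors.
W = 0.01 * rng.standard_normal((6, 4))
b = np.zeros(6)
c = np.zeros(4)
v = (rng.random((8, 6)) < 0.5).astype(float)
W, b, c = cd1_step(v, W, b, c)
```

Even in this toy form, the update exposes several of the meta-parameters listed above (learning rate, initial weights, hidden-unit count, mini-batch size), which is precisely the tuning burden the proposed method aims to reduce.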
The Support Vector Machine (SVM) is a supervised machine learning algorithm proposed by Vladimir N. Vapnik [10]. SVM is widely adopted for classification and regression, especially with the kernel trick, which makes predictions for new inputs depend only on the kernel function evaluated at a sparse subset of the training data points [11]. SVM constructs a maximal-margin hyper-plane which discriminates different patterns efficiently, and the maximal margin enables high generalization performance since the generalization error can be bounded in terms of margins.
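The sparsity property described above can be illustrated with a minimal example; scikit-learn's SVC is an assumed implementation choice here (the paper does not specify one), and the dataset and hyper-parameters are illustrative.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two-class toy problem; an RBF-kernel SVM learns a maximal-margin
# boundary in the implicit kernel feature space.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

# Predictions depend only on the support vectors,
# a sparse subset of the 200 training points.
print(len(clf.support_vectors_), "support vectors out of", len(X))
```

Only the retained support vectors enter the decision function, which is what keeps kernel SVM prediction efficient on new inputs.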
In this paper, we propose a deep learning algorithm with support vector machines which allows layer-by-layer training. By stacking SVMs, we can extract high-order discriminative features with support vectors which maximize the margin and guarantee generalization performance. The proposed method for deep learning requires setting only a few user-determined parameters, such as the number of layers, which improves its applicability to various domains of engineering.
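One plausible sketch of such layer-by-layer stacking, under the assumption that each layer's SVM decision values are appended as learned features for the next layer, is shown below. The paper's exact stacking scheme may differ; the dataset, depth, and hyper-parameters are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.25, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Train one SVM per layer on the current representation, then append
# its decision values as an extra feature for the next layer.
n_layers = 3
layers = []
F_tr, F_te = X_tr, X_te
for _ in range(n_layers):
    svm = SVC(kernel="rbf", gamma="scale").fit(F_tr, y_tr)
    layers.append(svm)
    F_tr = np.column_stack([F_tr, svm.decision_function(F_tr)])
    F_te = np.column_stack([F_te, svm.decision_function(F_te)])

# The last layer was trained before its own output column was appended,
# so drop that final column when evaluating it.
print("test accuracy:", layers[-1].score(F_te[:, :-1], y_te))
```

Note that the only structural choice made here is the number of layers, matching the claim that the method needs few user-determined parameters.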
The rest of this paper is organized as follows. In Section 2, we describe the structure and the algorithm of the proposed model. In Section 3 we present the experimental results to evaluate the performance of the proposed method. Finally, we draw our conclusions in Section 4.
2 Proposed Model

2.1 Support Vector Machine

The support vector machine is a supervised learning method widely used in classification and regression tasks. For a linearly separable problem, SVM obtains the maximal-margin hyper-plane such that the distance from the hyper-plane to the nearest data points on each side is maximized.
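For reference, the maximal-margin property just described corresponds to the standard hard-margin optimization problem for a training set $\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$ with labels $y_i \in \{-1, +1\}$:

```latex
\min_{\mathbf{w},\, b} \; \frac{1}{2}\lVert \mathbf{w} \rVert^2
\quad \text{subject to} \quad
y_i\left(\mathbf{w}^\top \mathbf{x}_i + b\right) \ge 1, \quad i = 1, \dots, N.
```

Minimizing $\lVert \mathbf{w} \rVert$ maximizes the geometric margin $2 / \lVert \mathbf{w} \rVert$ between the two classes; the data points for which the constraint holds with equality are the support vectors.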