
Chapter 5

Linear Discriminant Functions

Pattern Recognition Soochow, Fall Semester, 2012 1


Discriminant Function
Discriminant functions g_i(x), i = 1, …, c

 A useful way to represent classifiers

 One function per category: assign x to category ω_i
  if g_i(x) > g_j(x) for all j ≠ i

Minimum risk: g_i(x) = -R(α_i | x)
Minimum-error-rate: g_i(x) = P(ω_i | x)



Discriminant Function (Cont.)
Decision region
c discriminant functions partition the feature space into
c decision regions R_1, …, R_c,

where R_i = {x : g_i(x) > g_j(x) for all j ≠ i}

Decision boundary
Surface in feature space where ties occur among the several
largest discriminant functions
Linear Discriminant Functions

g(x) = w^T x + b

 w: weight vector (d-dimensional)

 b: bias/threshold (scalar)



Linear Discriminant Functions (Cont.)
The two-category case

g(x) = g_1(x) - g_2(x) = w^T x + b
Decide ω_1 if g(x) > 0; decide ω_2 if g(x) < 0

 It suffices to consider only d+1 parameters (w and b)
  instead of 2(d+1) parameters in the two-category case
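As an illustrative sketch of this decision rule (the code and the example values of w and b are my own, not from the slides), assuming NumPy:

```python
import numpy as np

def g(x, w, b):
    """Linear discriminant g(x) = w^T x + b."""
    return np.dot(w, x) + b

def classify(x, w, b):
    """Decide category 1 if g(x) > 0, category 2 otherwise."""
    return 1 if g(x, w, b) > 0 else 2

# Only the d+1 parameters (w, b) are needed, here with d = 2
w = np.array([1.0, -2.0])  # illustrative weight vector
b = 0.5                    # illustrative bias
print(classify(np.array([3.0, 1.0]), w, b))  # g = 1.5 > 0, so category 1
```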



Two-Category Case
Training set
D = {(x_1, y_1), …, (x_n, y_n)}, with labels y_i ∈ {ω_1, ω_2}

The task
Determine (w, b) such that g(x) = w^T x + b can classify all
training examples in D correctly



Two-Category Case (Cont.)
Solution to (w, b)
Minimize a criterion/objective function J(w, b) based on the
training examples

How to minimize the criterion function J(w, b)?

Gradient Descent



Gradient Descent
Taylor Expansion

f(x + Δx) = f(x) + ∇f(x)^T Δx + O(‖Δx‖^2)

 f: a real-valued d-variate function

 x: a point in the d-dimensional Euclidean space

 Δx: a small shift in the d-dimensional Euclidean space

 ∇f(x): the gradient of f at x

 O(‖Δx‖^2): the big-oh order of ‖Δx‖^2 [appendix A.8]



Gradient Descent (Cont.)
Taylor Expansion

What happens if we set Δx to be negatively proportional
to the gradient at x, i.e. Δx = -η∇f(x)
(η being a small positive scalar)?

Substituting into the expansion gives
f(x - η∇f(x)) = f(x) - η‖∇f(x)‖^2 + O(η^2)

The O(η^2) term can be ignored when η is small, and
η‖∇f(x)‖^2 is non-negative.

Therefore, we have f(x + Δx) ≤ f(x)!
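A quick numerical check of this argument, using an illustrative function f(x) = ‖x‖^2 of my own choosing (not from the slides):

```python
import numpy as np

def f(x):
    """Illustrative d-variate function: f(x) = ||x||^2."""
    return np.dot(x, x)

def grad_f(x):
    """Gradient of f at x, here 2x."""
    return 2 * x

x = np.array([1.0, -2.0])
eta = 0.1                       # a small positive scalar
x_new = x + (-eta * grad_f(x))  # the shift Δx = -η ∇f(x)
print(f(x_new) < f(x))          # True: the step decreases f, as derived above
```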



Gradient Descent (Cont.)
Basic strategy
To minimize some d-variate function f(), the general gradient
descent technique works in the following iterative way:

1. Set learning rate η > 0 and a small threshold ε > 0
2. Randomly initialize x_0 as the starting point; set k = 0
3. do k = k + 1
4.   x_k = x_{k-1} - η∇f(x_{k-1})  (gradient descent step)
5. until ‖∇f(x_k)‖ < ε
6. Return x_k and f(x_k)
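The six steps above can be sketched as follows; a minimal implementation, with the function names and the quadratic test function being my own illustrative choices:

```python
import numpy as np

def gradient_descent(grad_f, x0, eta=0.1, eps=1e-6, max_iter=10_000):
    """Iterate x_k = x_{k-1} - eta * grad_f(x_{k-1}) until the
    gradient norm falls below the small threshold eps (steps 1-6)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):           # guard against non-convergence
        grad = grad_f(x)
        if np.linalg.norm(grad) < eps:  # stopping criterion (step 5)
            break
        x = x - eta * grad              # gradient descent step (step 4)
    return x

# Minimize f(x) = (x1 - 3)^2 + (x2 + 1)^2, whose minimum is at (3, -1)
x_min = gradient_descent(lambda x: 2 * (x - np.array([3.0, -1.0])),
                         x0=np.zeros(2))
print(np.round(x_min, 3))  # approximately [3. -1.]
```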



Gradient Descent for Two-Category
Linear Discriminant Functions
Task revisited
Determine (w, b) such that g(x) = w^T x + b can classify all
examples in D correctly

The solution
 Choose a certain criterion function J(w, b) defined
  over D [ref: slide 8]
 Invoke the standard gradient descent procedure on the
  (d+1)-variate function J(·, ·) to determine (w, b)



Gradient Descent for Two-Category
Linear Discriminant Functions (Cont.)
Two examples
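The two concrete criterion functions shown on this slide are not recoverable from this text. As one hedged illustration (my own choice, not necessarily the slides' example), a standard criterion in this setting is the perceptron criterion J_p(w, b) = Σ -y_i (w^T x_i + b) over misclassified examples, assuming labels y_i ∈ {+1, -1}; batch gradient descent on it looks like:

```python
import numpy as np

def J_p(w, b, X, y):
    """Perceptron criterion: sum of -y_i * g(x_i) over misclassified examples."""
    margins = y * (X @ w + b)
    return -margins[margins <= 0].sum()

def train(X, y, eta=0.1, max_iter=1000):
    """Batch gradient descent on J_p over the d+1 parameters (w, b)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(max_iter):
        mis = y * (X @ w + b) <= 0   # currently misclassified examples
        if not mis.any():            # all training examples correct: done
            break
        w += eta * (y[mis, None] * X[mis]).sum(axis=0)  # -dJ_p/dw
        b += eta * y[mis].sum()                         # -dJ_p/db
    return w, b

# A linearly separable toy training set (illustrative data)
X = np.array([[2.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b = train(X, y)
print(np.all(np.sign(X @ w + b) == y))  # True once training converges
```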



Summary
 Discriminant functions
 Linear discriminant functions
 The general setting: g_i(x), i = 1, …, c
 The two-category case: g(x) = w^T x + b
 Minimization of criterion/objective function J(w, b)



Summary (Cont.)
 Gradient descent
 Taylor expansion: f(x + Δx) = f(x) + ∇f(x)^T Δx + O(‖Δx‖^2)
 Key iterative gradient descent step: x_k = x_{k-1} - η∇f(x_{k-1})
 For two-category linear discriminant functions

