
Lecture Note for Numerical Analysis (10): Least-Squares Regression

1. Regression and Interpolation (Curve Fitting)

• Given n data points: $\{(x_1,y_1),(x_2,y_2),(x_3,y_3),\dots,(x_j,y_j),\dots,(x_n,y_n)\}$

• Regression: find a curve that best fits the points $(x_j,y_j),\ j=1,2,\dots,n$

• Interpolation: find a curve that best fits and passes through the points $(x_j,y_j),\ j=1,2,\dots,n$

[Figure: (a) linear regression, (b) linear interpolation, (c) nonlinear interpolation]


• Regression error

[Figure: (a) small error, (b) large error]

2. Mathematical Expression of the One-Dimensional Least-Squares Regression Problem


• Approximation function: $y = f(x;\mathbf{a})$, where $\mathbf{a}$ is a parameter vector to be determined and $x$ is an independent variable.

• Approximation error: $e_j = f(x_j;\mathbf{a}) - y_j,\ j = 1,2,\dots,n$

• Problem statement for least-squares regression as an unconstrained NLP:

find $f(x;\mathbf{a})$ such that

$$\min_{\mathbf{a}} \sum_{j=1}^{n} e_j^2 = \min_{\mathbf{a}} \sum_{j=1}^{n} \left( f(x_j;\mathbf{a}) - y_j \right)^2$$

(a) Example #1: linear regression
    $f(x; a, b) = ax + b$ (1st order)

(b) Example #2: polynomial regression
    $f(x; a, b, c) = ax^2 + bx + c$ (2nd order)

(c) Example #3: regression with a power equation
    $f(\mathbf{x}; \mathbf{a}) = \ln\!\left( a_0 x_1^{a_1} x_2^{a_2} x_3^{a_3} \cdots x_n^{a_n} \right)$
    (multi-dimensional nonlinear regression)

3. Solution of the Linear Least-Squares Regression Problem


• Given n data points: $\{(x_1,y_1),(x_2,y_2),(x_3,y_3),\dots,(x_j,y_j),\dots,(x_n,y_n)\}$

• Approximation function: $f(x; a, b) = ax + b$

• Approximation error: $e_j(a,b) = f(x_j; a, b) - y_j = ax_j + b - y_j,\ j = 1,2,\dots,n$

• Find the regression parameters $a, b$ that minimize the regression error:

$$\min_{a,b} \sum_{j=1}^{n} e_j^2 = \min_{a,b} \sum_{j=1}^{n} (ax_j + b - y_j)^2$$

• This is a 2-dimensional minimization problem for $a, b$ with the following cost function $J(a,b)$:

$$J(a,b) = \sum_{j=1}^{n} (ax_j + b - y_j)^2
= \Big(\sum_{j=1}^{n} x_j^2\Big) a^2 + n b^2 + 2\Big(\sum_{j=1}^{n} x_j\Big) ab - 2\Big(\sum_{j=1}^{n} x_j y_j\Big) a - 2\Big(\sum_{j=1}^{n} y_j\Big) b + \sum_{j=1}^{n} y_j^2$$

- The first-order optimality condition:

$$\frac{\partial J(a,b)}{\partial a} = 2\Big(\sum_{j=1}^{n} x_j^2\Big) a + 2\Big(\sum_{j=1}^{n} x_j\Big) b - 2\Big(\sum_{j=1}^{n} x_j y_j\Big) = 0$$

$$\frac{\partial J(a,b)}{\partial b} = 2 n b + 2\Big(\sum_{j=1}^{n} x_j\Big) a - 2\Big(\sum_{j=1}^{n} y_j\Big) = 0$$

$$\begin{bmatrix} \sum_{j=1}^{n} x_j^2 & \sum_{j=1}^{n} x_j \\ \sum_{j=1}^{n} x_j & n \end{bmatrix}
\begin{bmatrix} a \\ b \end{bmatrix}
= \begin{bmatrix} \sum_{j=1}^{n} x_j y_j \\ \sum_{j=1}^{n} y_j \end{bmatrix}$$

Define the mean values as

$$\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j, \qquad
\bar{y} = \frac{1}{n}\sum_{j=1}^{n} y_j, \qquad
\overline{xy} = \frac{1}{n}\sum_{j=1}^{n} x_j y_j, \qquad
\overline{x^2} = \frac{1}{n}\sum_{j=1}^{n} x_j^2$$

Dividing the normal equations by $n$,

$$\begin{bmatrix} \overline{x^2} & \bar{x} \\ \bar{x} & 1 \end{bmatrix}
\begin{bmatrix} a \\ b \end{bmatrix}
= \begin{bmatrix} \overline{xy} \\ \bar{y} \end{bmatrix}
\;\Rightarrow\;
a = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\overline{x^2} - \bar{x}^2},
\qquad
b = \bar{y} - a\bar{x}$$

4. Solution of the Least-Squares Polynomial Regression for the One-Dimensional Problem

• Given n data points: $\{(x_1,y_1),(x_2,y_2),(x_3,y_3),\dots,(x_j,y_j),\dots,(x_n,y_n)\}$

• Approximation function: $f(x;\mathbf{a}) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x + a_0$, where $\mathbf{a} = (a_m, a_{m-1}, \dots, a_1, a_0)$

• Approximation error: $e_j(\mathbf{a}) = f(x_j;\mathbf{a}) - y_j = a_m x_j^m + a_{m-1} x_j^{m-1} + \cdots + a_1 x_j + a_0 - y_j,\ j = 1,2,\dots,n$

• Find $\mathbf{a}$ such that

$$\min_{\mathbf{a}} \sum_{j=1}^{n} e_j^2 = \min_{\mathbf{a}} \sum_{j=1}^{n} \left( a_m x_j^m + a_{m-1} x_j^{m-1} + \cdots + a_1 x_j + a_0 - y_j \right)^2$$

• This is an (m+1)-dimensional minimization problem for $\mathbf{a}$ with the following cost function $J(\mathbf{a})$:

$$J(\mathbf{a}) = \sum_{j=1}^{n} \left( a_m x_j^m + a_{m-1} x_j^{m-1} + \cdots + a_1 x_j + a_0 - y_j \right)^2$$

- The first-order optimality condition:

$$\frac{\partial J(\mathbf{a})}{\partial a_k} = 2 \sum_{j=1}^{n} \left( a_m x_j^m + a_{m-1} x_j^{m-1} + \cdots + a_1 x_j + a_0 - y_j \right) x_j^k = 0, \quad (k = 0, 1, 2, \dots, m-1, m)$$

$$\sum_{j=1}^{n} \left( a_m x_j^{m+k} + a_{m-1} x_j^{m-1+k} + \cdots + a_1 x_j^{1+k} + a_0 x_j^k - y_j x_j^k \right) = 0$$

or

$$\Big(\sum_{j=1}^{n} x_j^{m+k}\Big) a_m + \Big(\sum_{j=1}^{n} x_j^{m-1+k}\Big) a_{m-1} + \cdots + \Big(\sum_{j=1}^{n} x_j^{1+k}\Big) a_1 + \Big(\sum_{j=1}^{n} x_j^{k}\Big) a_0 = \sum_{j=1}^{n} y_j x_j^k, \quad (k = 0, 1, 2, \dots, m-1, m)$$

Example: 2nd-order polynomial regression problem (m = 2)

$$k = 0:\quad \Big(\sum x_j^2\Big) a_2 + \Big(\sum x_j\Big) a_1 + n a_0 = \sum y_j$$

$$k = 1:\quad \Big(\sum x_j^3\Big) a_2 + \Big(\sum x_j^2\Big) a_1 + \Big(\sum x_j\Big) a_0 = \sum x_j y_j$$

$$k = 2:\quad \Big(\sum x_j^4\Big) a_2 + \Big(\sum x_j^3\Big) a_1 + \Big(\sum x_j^2\Big) a_0 = \sum x_j^2 y_j$$

$$\begin{bmatrix} n & \sum x_j & \sum x_j^2 \\ \sum x_j & \sum x_j^2 & \sum x_j^3 \\ \sum x_j^2 & \sum x_j^3 & \sum x_j^4 \end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
= \begin{bmatrix} \sum y_j \\ \sum x_j y_j \\ \sum x_j^2 y_j \end{bmatrix}$$
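A minimal MATLAB sketch of assembling and solving this 3-by-3 system (the data vectors x and y are illustrative column vectors):

% Assemble and solve the 3x3 normal equations for m = 2
x = [0; 0.5; 1; 1.5; 2] ;          % illustrative data
y = [1.1; 0.4; 0.2; 0.5; 1.0] ;
n   = length(x) ;
Sx  = sum(x)    ; Sx2 = sum(x.^2) ;
Sx3 = sum(x.^3) ; Sx4 = sum(x.^4) ;
A   = [ n    Sx   Sx2 ;
        Sx   Sx2  Sx3 ;
        Sx2  Sx3  Sx4 ] ;
rhs = [ sum(y) ; sum(x.*y) ; sum((x.^2).*y) ] ;
a   = A \ rhs ;                    % a = [a0; a1; a2]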

5. General Nonlinear Least-Squares Regression for the One-Dimensional Problem

• Given m data points: $\{(x_1,y_1),(x_2,y_2),(x_3,y_3),\dots,(x_j,y_j),\dots,(x_m,y_m)\}$

• Approximation function: $y = f(x;\mathbf{a}),\ \mathbf{a} \in \mathbb{R}^L$

  In general $m \ge L$, and $f(x;\mathbf{a})$ is a nonlinear function of the parameter vector $\mathbf{a}$.

• Approximation error: $e_j = f(x_j;\mathbf{a}) - y_j,\ j = 1,2,\dots,m$

• Find $\mathbf{a}$ such that

$$\min_{\mathbf{a}} \sum_{j=1}^{m} e_j^2 = \min_{\mathbf{a}} \sum_{j=1}^{m} \left( f(x_j;\mathbf{a}) - y_j \right)^2$$

• This is an L-dimensional minimization problem for $\mathbf{a} \in \mathbb{R}^L$ with the following cost function $J(\mathbf{a})$:

$$J(\mathbf{a}) = \sum_{j=1}^{m} \left( f(x_j;\mathbf{a}) - y_j \right)^2$$

- The first-order optimality condition (KKT condition: Karush-Kuhn-Tucker condition):

$$\frac{\partial J(\mathbf{a})}{\partial a_k} = 2 \sum_{j=1}^{m} \left( f(x_j;\mathbf{a}) - y_j \right) \frac{\partial f(x_j;\mathbf{a})}{\partial a_k} = 0, \quad (k = 1, 2, \dots, L-1, L)$$

• This gives L nonlinear algebraic equations for $\mathbf{a}$:

$$g_k(\mathbf{a}) = \sum_{j=1}^{m} \left( f(x_j;\mathbf{a}) - y_j \right) \frac{\partial f(x_j;\mathbf{a})}{\partial a_k} = 0, \quad (k = 1, 2, \dots, L-1, L)$$

• These equations can be solved with Newton's method. Linearizing $\mathbf{g}$ about the current iterate $\mathbf{a}^{(i)}$,

$$\mathbf{g}(\mathbf{a}) = \begin{bmatrix} g_1(\mathbf{a}) \\ g_2(\mathbf{a}) \\ g_3(\mathbf{a}) \\ \vdots \\ g_L(\mathbf{a}) \end{bmatrix} = \mathbf{0}
\;\Rightarrow\;
\mathbf{g}(\mathbf{a}^{(i)} + \mathbf{d}) \approx \mathbf{g}(\mathbf{a}^{(i)}) + \nabla_{\mathbf{a}}\mathbf{g}\,\mathbf{d} = \mathbf{0}
\;\Rightarrow\;
\mathbf{d} = -\left(\nabla_{\mathbf{a}}\mathbf{g}\right)^{-1} \mathbf{g}$$

$$\mathbf{a}^{(i+1)} = \mathbf{a}^{(i)} + \mathbf{d} = \mathbf{a}^{(i)} - \left(\nabla_{\mathbf{a}}\mathbf{g}\right)^{-1} \mathbf{g}$$

where the entries of $\nabla_{\mathbf{a}}\mathbf{g}$ follow from differentiating

$$g_k(\mathbf{a}) = \sum_{j=1}^{m} \left( f(x_j;\mathbf{a}) - y_j \right) \frac{\partial f(x_j;\mathbf{a})}{\partial a_k}, \quad (k = 1, 2, \dots, L-1, L)$$

$$\frac{\partial g_k(\mathbf{a})}{\partial a_i} = \sum_{j=1}^{m} \frac{\partial f(x_j;\mathbf{a})}{\partial a_i} \frac{\partial f(x_j;\mathbf{a})}{\partial a_k}
+ \sum_{j=1}^{m} \left( f(x_j;\mathbf{a}) - y_j \right) \frac{\partial^2 f(x_j;\mathbf{a})}{\partial a_i \partial a_k}$$

[Example 1]
• Problem statement
- given data: $\{(x_1,y_1),(x_2,y_2),(x_3,y_3),\dots,(x_j,y_j),\dots,(x_m,y_m)\}$
- regression equation: $y = f(x; a_0, a_1, a_2) = a_0 + a_1 \cos x + a_2 \sin x$

• Solution

$$\frac{\partial f}{\partial a_0} = 1, \qquad
\frac{\partial f}{\partial a_1} = \cos x, \qquad
\frac{\partial f}{\partial a_2} = \sin x, \qquad
\frac{\partial^2 f}{\partial a_i \partial a_k} = 0 \quad (i = 1,2,3,\ k = 1,2,3)$$

$$g_0(\mathbf{a}) = \sum_{j=1}^{m} \left( f(x_j;\mathbf{a}) - y_j \right) \frac{\partial f(x_j;\mathbf{a})}{\partial a_0}
= \sum_{j=1}^{m} \left( a_0 + a_1 \cos x_j + a_2 \sin x_j - y_j \right)$$

$$g_1(\mathbf{a}) = \sum_{j=1}^{m} \left( f(x_j;\mathbf{a}) - y_j \right) \frac{\partial f(x_j;\mathbf{a})}{\partial a_1}
= \sum_{j=1}^{m} \left( a_0 + a_1 \cos x_j + a_2 \sin x_j - y_j \right) \cos x_j$$

$$g_2(\mathbf{a}) = \sum_{j=1}^{m} \left( f(x_j;\mathbf{a}) - y_j \right) \frac{\partial f(x_j;\mathbf{a})}{\partial a_2}
= \sum_{j=1}^{m} \left( a_0 + a_1 \cos x_j + a_2 \sin x_j - y_j \right) \sin x_j$$

$$\nabla_{\mathbf{a}} \mathbf{g} = \left[ G_{ki} \right] = \left[ \frac{\partial g_k(\mathbf{a})}{\partial a_i} \right]
= \left[ \sum_{j=1}^{m} \frac{\partial f(x_j;\mathbf{a})}{\partial a_i} \frac{\partial f(x_j;\mathbf{a})}{\partial a_k} \right]
= \begin{bmatrix}
m & \sum \cos x_j & \sum \sin x_j \\
\sum \cos x_j & \sum \cos^2 x_j & \sum \cos x_j \sin x_j \\
\sum \sin x_j & \sum \cos x_j \sin x_j & \sum \sin^2 x_j
\end{bmatrix}$$

(all sums over $j = 1, \dots, m$)
Therefore, with the given n-th iterate $\mathbf{a}^{(n)}$, we can get the (n+1)-th iterate as

$$\begin{bmatrix} a_0^{(n+1)} \\ a_1^{(n+1)} \\ a_2^{(n+1)} \end{bmatrix}
= \begin{bmatrix} a_0^{(n)} \\ a_1^{(n)} \\ a_2^{(n)} \end{bmatrix}
- \begin{bmatrix}
m & \sum \cos x_j & \sum \sin x_j \\
\sum \cos x_j & \sum \cos^2 x_j & \sum \cos x_j \sin x_j \\
\sum \sin x_j & \sum \cos x_j \sin x_j & \sum \sin^2 x_j
\end{bmatrix}^{-1}
\begin{bmatrix}
\sum \left( a_0^{(n)} + a_1^{(n)} \cos x_j + a_2^{(n)} \sin x_j - y_j \right) \\
\sum \left( a_0^{(n)} + a_1^{(n)} \cos x_j + a_2^{(n)} \sin x_j - y_j \right) \cos x_j \\
\sum \left( a_0^{(n)} + a_1^{(n)} \cos x_j + a_2^{(n)} \sin x_j - y_j \right) \sin x_j
\end{bmatrix}$$

Since $f$ is linear in the parameters $a_0, a_1, a_2$ here, $\mathbf{g}$ is linear in $\mathbf{a}$ and the iteration converges in a single step from any initial guess.

[Example 2]
• Problem statement
- given data: $\{(x_1,y_1),(x_2,y_2),(x_3,y_3),\dots,(x_j,y_j),\dots,(x_m,y_m)\}$
- regression equation: $y = f(x; a_0, a_1) = a_0 \left( 1 - e^{-a_1 x} \right)$

• Solution

$$\frac{\partial f}{\partial a_0} = 1 - e^{-a_1 x}, \qquad
\frac{\partial^2 f}{\partial a_0 \partial a_0} = 0, \qquad
\frac{\partial^2 f}{\partial a_1 \partial a_0} = x e^{-a_1 x}$$

$$\frac{\partial f}{\partial a_1} = a_0 x e^{-a_1 x}, \qquad
\frac{\partial^2 f}{\partial a_0 \partial a_1} = x e^{-a_1 x}, \qquad
\frac{\partial^2 f}{\partial a_1 \partial a_1} = -a_0 x^2 e^{-a_1 x}$$

$$g_0(\mathbf{a}) = \sum_{j=1}^{m} \left( f(x_j;\mathbf{a}) - y_j \right) \frac{\partial f(x_j;\mathbf{a})}{\partial a_0}
= \sum_{j=1}^{m} \left[ a_0 (1 - e^{-a_1 x_j}) - y_j \right] (1 - e^{-a_1 x_j})$$

$$g_1(\mathbf{a}) = \sum_{j=1}^{m} \left( f(x_j;\mathbf{a}) - y_j \right) \frac{\partial f(x_j;\mathbf{a})}{\partial a_1}
= \sum_{j=1}^{m} \left[ a_0 (1 - e^{-a_1 x_j}) - y_j \right] a_0 x_j e^{-a_1 x_j}$$

$$\nabla_{\mathbf{a}} \mathbf{g} = \left[ G_{ki} \right]
= \left[ \sum_{j=1}^{m} \frac{\partial f(x_j;\mathbf{a})}{\partial a_i} \frac{\partial f(x_j;\mathbf{a})}{\partial a_k}
+ \sum_{j=1}^{m} \left( f(x_j;\mathbf{a}) - y_j \right) \frac{\partial^2 f(x_j;\mathbf{a})}{\partial a_i \partial a_k} \right]
= \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}$$

$$H_{11} = \sum_{j=1}^{m} \left( 1 - e^{-a_1 x_j} \right)^2$$

$$H_{12} = \sum_{j=1}^{m} \left( 1 - e^{-a_1 x_j} \right) a_0 x_j e^{-a_1 x_j}
+ \sum_{j=1}^{m} \left[ a_0 (1 - e^{-a_1 x_j}) - y_j \right] x_j e^{-a_1 x_j}$$

$$H_{21} = \sum_{j=1}^{m} \left( 1 - e^{-a_1 x_j} \right) a_0 x_j e^{-a_1 x_j}
+ \sum_{j=1}^{m} \left[ a_0 (1 - e^{-a_1 x_j}) - y_j \right] x_j e^{-a_1 x_j}$$

$$H_{22} = \sum_{j=1}^{m} \left( a_0 x_j e^{-a_1 x_j} \right)^2
- \sum_{j=1}^{m} \left[ a_0 (1 - e^{-a_1 x_j}) - y_j \right] a_0 x_j^2 e^{-a_1 x_j}$$

Therefore, with the given n-th iterate $\mathbf{a}^{(n)}$, we can get the (n+1)-th iterate as

$$\begin{bmatrix} a_0^{(n+1)} \\ a_1^{(n+1)} \end{bmatrix}
= \begin{bmatrix} a_0^{(n)} \\ a_1^{(n)} \end{bmatrix}
- \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}^{-1}_{(n)}
\begin{bmatrix}
\sum_{j=1}^{m} \left[ a_0 (1 - e^{-a_1 x_j}) - y_j \right] (1 - e^{-a_1 x_j}) \\
\sum_{j=1}^{m} \left[ a_0 (1 - e^{-a_1 x_j}) - y_j \right] a_0 x_j e^{-a_1 x_j}
\end{bmatrix}_{(n)}$$

[Example 3] Regression with the power equation in the general form

• Problem statement
- given data: $\{([x_1,\dots,x_n]_1, y_1), ([x_1,\dots,x_n]_2, y_2), \dots, ([x_1,\dots,x_n]_m, y_m)\}$
- regression equation: $y = f(\mathbf{x};\mathbf{a}) = a_0 x_1^{a_1} x_2^{a_2} x_3^{a_3} \cdots x_n^{a_n}$

$$\mathbf{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, \qquad
\mathbf{a} = \begin{bmatrix} a_0 \\ \vdots \\ a_n \end{bmatrix}$$

• Solution using linear least-squares regression

Taking the natural logarithm of each side,

$$\ln y = \ln\!\left( a_0 x_1^{a_1} x_2^{a_2} x_3^{a_3} \cdots x_n^{a_n} \right)
= \ln a_0 + \ln x_1^{a_1} + \ln x_2^{a_2} + \cdots + \ln x_n^{a_n}
= \ln a_0 + a_1 \ln x_1 + a_2 \ln x_2 + \cdots + a_n \ln x_n$$

$$\tilde{y} = \tilde{a}_0 + a_1 \ln x_1 + a_2 \ln x_2 + \cdots + a_n \ln x_n,
\qquad \tilde{y} = \ln y, \quad \tilde{a}_0 = \ln a_0$$

which is the same form as the multi-dimensional linear regression.
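A minimal MATLAB sketch of this log-linearization for two independent variables (all data values are illustrative): the power-law fit reduces to one linear solve on the transformed model.

m  = 30 ;                                  % illustrative data set
x1 = 0.5 + 2*rand(m,1) ;
x2 = 0.5 + 2*rand(m,1) ;
y  = 3.0 * x1.^1.5 .* x2.^(-0.7) .* exp(0.01*randn(m,1)) ;
Z  = [ ones(m,1) , log(x1) , log(x2) ] ;   % regression matrix
c  = Z \ log(y) ;                          % c = [ln(a0); a1; a2]
a0 = exp(c(1)) ; a1 = c(2) ; a2 = c(3) ;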

[Example 4] Regression for the Lagrange interpolation polynomial

• Problem statement
- given data: $\{ (x_j, y_j) \mid y_j = f(x_j),\ j = 0, 1, 2, \dots, m \}$

with

$$x_0 = -1, \qquad x_{j+1} = x_j + \Delta x, \quad j = 0, 1, \dots, m, \qquad \Delta x = \frac{2}{m}$$

$$f(x) = \frac{(x + 0.7)(x + 0.3)(x - 0.3)(x - 0.7)}{1.7 \times 1.3 \times 0.7 \times 0.3}, \quad \text{normalized so that } f(1) = 1$$

- Given data points for different m:

[Figure: data points sampled from f(x) on x in [-1, 1] for m = 5, 8, and 12]

- regression equation in the general polynomial form:

$$y = f(x;\mathbf{a}) = a_0 + a_1 x + \cdots + a_n x^n = \mathbf{x}^T \mathbf{a}$$

• Solution

$$\mathbf{x} = \begin{bmatrix} 1 \\ x \\ \vdots \\ x^n \end{bmatrix}, \qquad
\mathbf{a} = \begin{bmatrix} a_0 \\ \vdots \\ a_n \end{bmatrix}, \qquad
\mathbf{y} = \begin{bmatrix} y_0 \\ \vdots \\ y_m \end{bmatrix}$$

$$e_j = f(x_j;\mathbf{a}) - y_j = a_0 + a_1 x_j + \cdots + a_n x_j^n - y_j = \mathbf{x}_j^T \mathbf{a} - y_j, \quad (j = 0, 1, \dots, m)$$

$$\mathbf{e} = \begin{bmatrix} e_0 \\ e_1 \\ \vdots \\ e_m \end{bmatrix}
= \begin{bmatrix} \mathbf{x}_0^T \\ \mathbf{x}_1^T \\ \vdots \\ \mathbf{x}_m^T \end{bmatrix} \mathbf{a} - \mathbf{y}
= \mathbf{X}\mathbf{a} - \mathbf{y}
\quad \text{where} \quad
\mathbf{X} = \begin{bmatrix} 1 & x_0 & \cdots & x_0^n \\ 1 & x_1 & \cdots & x_1^n \\ \vdots & & & \vdots \\ 1 & x_m & \cdots & x_m^n \end{bmatrix}$$

- least-squares solution

$$\text{Error} = \sum_{j=0}^{m} e_j^2 = \mathbf{e}^T \mathbf{e} = (\mathbf{X}\mathbf{a} - \mathbf{y})^T (\mathbf{X}\mathbf{a} - \mathbf{y})$$

$$\mathbf{a} = \left( \mathbf{X}^T \mathbf{X} \right)^{-1} \mathbf{X}^T \mathbf{y}$$

• Homework for m = 10

(1) n = 1: $y = f(x;\mathbf{a}) = a_0 + a_1 x$
(2) n = 2: $y = f(x;\mathbf{a}) = a_0 + a_1 x + a_2 x^2$
(3) n = 3: $y = f(x;\mathbf{a}) = a_0 + a_1 x + a_2 x^2 + a_3 x^3$
(4) n = 4: $y = f(x;\mathbf{a}) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4$

6. Matrix expression for the multi-dimensional linear regression problem

• Regression formula to find the regression coefficients $\mathbf{a} = [a_0, a_1, \dots, a_n]^T$:

$$y = a_0 z_0 + a_1 z_1 + a_2 z_2 + \cdots + a_n z_n = \mathbf{z}^T \mathbf{a}$$

$\mathbf{z} = [z_0, z_1, \dots, z_n]^T$: independent variable vector
$\mathbf{a} = [a_0, a_1, \dots, a_n]^T$: regression parameter vector

• Given data (measured data) at each position $\mathbf{z}_j,\ j = 1, 2, \dots, m$:

$$(\mathbf{z}_j, y_j), \quad j = 1, 2, \dots, m$$

where
$\mathbf{z}_j,\ j = 1, 2, \dots, m$: independent variable vectors
$y_j,\ j = 1, 2, \dots, m$: dependent variables

• Approximation error

$$e_j = \mathbf{z}_j^T \mathbf{a} - y_j, \quad j = 1, 2, \dots, m
\;\Rightarrow\;
\mathbf{e} = \mathbf{Z}\mathbf{a} - \mathbf{y}
= \begin{bmatrix}
(z_0)_1 & (z_1)_1 & \cdots & (z_n)_1 \\
(z_0)_2 & (z_1)_2 & \cdots & (z_n)_2 \\
\vdots & & & \vdots \\
(z_0)_m & (z_1)_m & \cdots & (z_n)_m
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix}
- \begin{bmatrix} (y)_1 \\ (y)_2 \\ \vdots \\ (y)_m \end{bmatrix}$$

$$E = \sum_{j=1}^{m} e_j^2 = \mathbf{e}^T \mathbf{e}
= (\mathbf{Z}\mathbf{a} - \mathbf{y})^T (\mathbf{Z}\mathbf{a} - \mathbf{y})
= (\mathbf{a}^T \mathbf{Z}^T - \mathbf{y}^T)(\mathbf{Z}\mathbf{a} - \mathbf{y})
= \mathbf{a}^T \mathbf{Z}^T \mathbf{Z} \mathbf{a} - 2\mathbf{y}^T \mathbf{Z} \mathbf{a} + \mathbf{y}^T \mathbf{y}$$

• Least-squares solution

$$\min_{\mathbf{a} \in \mathbb{R}^{n+1}} E$$

Optimality condition using (see Appendices C and D)

$$y = \mathbf{x}^T \mathbf{c} = \mathbf{c}^T \mathbf{x} \;\Rightarrow\; \nabla_{\mathbf{x}} y = \mathbf{c}, \qquad
y = \mathbf{x}^T \mathbf{A} \mathbf{x} = (\mathbf{x}^T \mathbf{A})\mathbf{x} = (\mathbf{A}\mathbf{x})^T \mathbf{x} \;\Rightarrow\; \nabla_{\mathbf{x}} y = (\mathbf{A}^T + \mathbf{A})\mathbf{x}$$

$$\frac{\partial E}{\partial \mathbf{a}} = \mathbf{0} \;\Rightarrow\; 2\mathbf{Z}^T \mathbf{Z} \mathbf{a} - 2\mathbf{Z}^T \mathbf{y} = \mathbf{0}$$

$$\mathbf{a} = \left( \mathbf{Z}^T \mathbf{Z} \right)^{-1} \mathbf{Z}^T \mathbf{y}$$

where $\mathbf{Z} \in \mathbb{R}^{m \times (n+1)}$, $\mathbf{Z}^T \in \mathbb{R}^{(n+1) \times m}$, $\mathbf{Z}^T\mathbf{Z} \in \mathbb{R}^{(n+1) \times (n+1)}$, $\mathbf{y} \in \mathbb{R}^{m \times 1}$, $\mathbf{Z}^T\mathbf{y} \in \mathbb{R}^{(n+1) \times 1}$.

The matrix $\left( \mathbf{Z}^T \mathbf{Z} \right)^{-1} \mathbf{Z}^T$ is called the pseudo-inverse of $\mathbf{Z}$.

If $\mathbf{Z}$ has full column rank (all columns are linearly independent, which means the given data points are not overlapped), $\mathbf{Z}^T \mathbf{Z}$ is non-singular and the pseudo-inverse exists.
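A minimal MATLAB sketch of this matrix-form solution (Z and y below are illustrative placeholders):

m = 20 ; n = 2 ;                           % illustrative sizes
Z = [ ones(m,1) , rand(m,n) ] ;            % columns: z0 = 1, z1, z2
y = Z*[1; 2; -0.5] + 0.01*randn(m,1) ;     % synthetic measurements
a = (Z'*Z) \ (Z'*y) ;                      % a = (Z^T Z)^(-1) Z^T y
% In practice a = Z \ y is numerically preferable: it solves the same
% least-squares problem by QR factorization without forming Z'*Z.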
[Example 5] Local quadratic approximation of a multivariable function

Let's consider the data generated from the following function:

$$g(x, y) = (x-1)^4 + (y-1)^4 + 0.3(x-1)(y-1) + 3$$

If we have function values at each of the following independent variable vectors,

$$(x_j, y_j),\ j = 1, \dots, 9:\quad
\begin{Bmatrix} (0,0), & (1,0), & (2,0), \\ (0,1), & (1,1), & (2,1), \\ (0,2), & (1,2), & (2,2) \end{Bmatrix}$$

$$g_j = g(x_j, y_j),\ j = 1, \dots, 9:\quad
\begin{bmatrix} 5.3000 & 4.0000 & 4.7000 \\ 4.0000 & 3.0000 & 4.0000 \\ 4.7000 & 4.0000 & 5.3000 \end{bmatrix}$$

find the coefficients of the following approximation function using the least-squares method:

$$f(x, y) = a_0 + a_1 x + a_2 y + a_3 x^2 + a_4 xy + a_5 y^2$$

or

$$f(x, y) = \begin{bmatrix} 1 & x & y & x^2 & xy & y^2 \end{bmatrix} \mathbf{a} = \mathbf{x}^T \mathbf{a}
\quad \text{where} \quad
\mathbf{x} = \begin{bmatrix} 1 \\ x \\ y \\ x^2 \\ xy \\ y^2 \end{bmatrix}, \quad
\mathbf{a} = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \end{bmatrix}$$
[Solution]
Using the given data points, we can define the regression error as

$$e_j(x_j, y_j; \mathbf{a}) = f(x_j, y_j) - g_j = \mathbf{x}_j^T \mathbf{a} - g_j \quad (j = 1, 2, \dots, 9)$$

The least-squares method can be applied to get the coefficients $\mathbf{a}$:

$$\min_{\mathbf{a}} E(\mathbf{a}) = \sum_{j=1}^{9} e_j^2 = \sum_{j=1}^{9} \left( \mathbf{x}_j^T \mathbf{a} - g_j \right)^2$$

$$E(\mathbf{a}) = \sum_{j=1}^{9} \left( \mathbf{x}_j^T \mathbf{a} - g_j \right)^2
= \sum_{j=1}^{9} \left( \mathbf{a}^T \mathbf{x}_j - g_j \right) \left( \mathbf{x}_j^T \mathbf{a} - g_j \right)
= \sum_{j=1}^{9} \left( \mathbf{a}^T \mathbf{x}_j \mathbf{x}_j^T \mathbf{a} - 2 g_j \mathbf{x}_j^T \mathbf{a} + g_j^2 \right)$$

Then the first-order optimality condition can be derived using the results of Appendices C and D:

$$\frac{\partial}{\partial \mathbf{a}} \left( \mathbf{a}^T \mathbf{x}_j \mathbf{x}_j^T \mathbf{a} \right) = 2 \mathbf{x}_j \mathbf{x}_j^T \mathbf{a}, \qquad
\frac{\partial}{\partial \mathbf{a}} \left( \mathbf{x}_j^T \mathbf{a} \right) = \mathbf{x}_j$$

Therefore,

$$\frac{\partial E(\mathbf{a})}{\partial \mathbf{a}}
= \frac{\partial}{\partial \mathbf{a}} \sum_{j=1}^{9} \left( \mathbf{a}^T \mathbf{x}_j \mathbf{x}_j^T \mathbf{a} - 2 g_j \mathbf{x}_j^T \mathbf{a} + g_j^2 \right)
= \sum_{j=1}^{9} \left( 2 \mathbf{x}_j \mathbf{x}_j^T \mathbf{a} - 2 g_j \mathbf{x}_j \right)
= 2 \Big( \sum_{j=1}^{9} \mathbf{x}_j \mathbf{x}_j^T \Big) \mathbf{a} - 2 \sum_{j=1}^{9} g_j \mathbf{x}_j = \mathbf{0}$$

The first-order optimality condition can be expressed as

$$\Big( \sum_{j=1}^{9} \mathbf{x}_j \mathbf{x}_j^T \Big) \mathbf{a} = \sum_{j=1}^{9} g_j \mathbf{x}_j$$

Using

$$\mathbf{x}\mathbf{x}^T = \begin{bmatrix} 1 \\ x \\ y \\ x^2 \\ xy \\ y^2 \end{bmatrix}
\begin{bmatrix} 1 & x & y & x^2 & xy & y^2 \end{bmatrix}
= \begin{bmatrix}
1 & x & y & x^2 & xy & y^2 \\
x & x^2 & xy & x^3 & x^2 y & x y^2 \\
y & xy & y^2 & x^2 y & x y^2 & y^3 \\
x^2 & x^3 & x^2 y & x^4 & x^3 y & x^2 y^2 \\
xy & x^2 y & x y^2 & x^3 y & x^2 y^2 & x y^3 \\
y^2 & x y^2 & y^3 & x^2 y^2 & x y^3 & y^4
\end{bmatrix}$$
The final form of the optimality condition can be expressed by

Ja  b with J  J T
 j 9 
 j 9 j 9 j 9 j 9 j 9
2   a0 
  g j  


9  x  j  y  j  x   x y 
2
j j j   y j      j 1
   j  9

j 1 j 1 j 1 j 1 j 1 
 j 9
  x j 
j 9
2
j
j 9

 x   x y   x   x y  j j
j 9
3
j
j 9
2
j j
j 9



x j y 2j  a1   
   g x
j j  

j 1
 j 1 j 1 j 1 j 1 j 1 j 1    j  9 
 j 9
y j   a2    g j y j  
3 
j 9 j 9 j 9 j 9 j 9

  y j   x y   y   x y   x y 
 
j j
2
j
2
j j j
2
j 
 jj 19 j 1
j 9
j 1
j 9
j 1
j 9
j 1
j 9
j 1
j 9
     j 1 
  x j   x   x y   x   x y  x j y j  a3    g j x 2j  
 2 2   
j 9
2 3
j
2
j j
4
j
3
j j 
 j j91 j 1
j 9
j 1
j 9
j 1
j 9
j 1
j 9
j 1
j 9     j 1 
 x y 
  x y   x y   x y   x y   x j y j  a4    g j x j y j 
3   
2 2 3 2 2 j 9
j j j j j j j j j j
j 1 j 1 j 1 j 1 j 1 j 1
 j 9 j 9 j 9 j 9 j 9 j 9     j 1 
   y 2j   x y   y   x y   x y 
2 3 2 2 3
  y 
4 
   j 9 
  a5   g y 2  
    j j 
j j j j j j j j
 j 1 j 1 j 1 j 1 j 1 j 1
 j 1 

Results of the quadratic approximation:

$$\mathbf{a} = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \end{bmatrix}
= \begin{bmatrix} 5.3000 \\ -2.3000 \\ -2.3000 \\ 1.0000 \\ 0.3000 \\ 1.0000 \end{bmatrix}
\;\Rightarrow\;
f(x, y) = 5.3 - 2.3x - 2.3y + x^2 + 0.3xy + y^2$$

[Figure: surface plots of the exact function distribution and the approximated function distribution over (x, y) in [0, 2] x [0, 2]]

[Figure: distribution of the approximation error, and pointwise errors at the given points]

[Program List] Main program plus User defined function

1. Main program
%----------------------------------------------------------------
% (1) Data generation using the prescribed user function
%----------------------------------------------------------------
xd = 0:1:2 ; xd=xd' ;
yd = 0:1:2 ; yd=yd' ;
%
No_data = 0;
for j=1:3
    for k=1:3
        %--data --------------------------------------------------------------
        x1 = xd(j,1) ;
        y1 = yd(k,1) ;
        %--data vector for (1 x y x^2 xy y^2) ------------------------------
        No_data = No_data + 1 ;
        Nd = No_data ;
        xvec(1,1) = 1.0 ;
        xvec(2,1) = x1 ;
        xvec(3,1) = y1 ;
        xvec(4,1) = x1^2 ;
        xvec(5,1) = x1*y1 ;
        xvec(6,1) = y1^2 ;
        %
        xd_vec(1:6,Nd) = xvec(1:6,1) ;
        %--function data vector -----------------------------------------------
        gfun = User_fun(x1,y1) ;
        gvec(Nd,1) = gfun ;
    end
end
%
%----------------------------------------------------------------
% (2) Regression Analysis
%----------------------------------------------------------------
% (2-1) Build Matrix and RHS vector
%----------------------------------------------------------------
AA(1:6,1:6) = 0.0 ;
bb(1:6,1) = 0.0 ;
for j=1:No_data
    xvec(1:6,1) = xd_vec(1:6,j) ;                           % data vector
    %
    AA(1:6,1:6) = AA(1:6,1:6) + xvec(1:6,1)*xvec(1:6,1)' ;  % A = A + x*(x^t)
    bb(1:6,1)   = bb(1:6,1)   + gvec(j,1)*xvec(1:6,1) ;     % b = b + gj*x
end
%----------------------------------------------------------------
% (2-2) Solution of the coefficients
%----------------------------------------------------------------
avec(1:6,1) = AA(1:6,1:6) \ bb(1:6,1) ; % a = inv(AA)*bb
%
%----------------------------------------------------------------
% (3) Error Analysis for the given data
%----------------------------------------------------------------
for j=1:No_data
    xvec(1:6,1) = xd_vec(1:6,j) ;            % data vector
    %----------------------------------------------------------------
    % approximation
    %----------------------------------------------------------------
    fun(j,1) = xvec(1:6,1)'*avec(1:6,1) ;
    eer(j,1) = abs(fun(j,1) - gvec(j,1)) ;
end
%----------------------------------------------------------------
% plot the error
%----------------------------------------------------------------
figure(1),
Nd = 1:No_data; Nd = Nd' ;
semilogy(Nd(1:No_data,1),eer(1:No_data,1),'-or') ; hold on ;
xlabel('data point'); ylabel('error') ;
%
%
%----------------------------------------------------------------
% (4) Surface plot of functions and error
%----------------------------------------------------------------
Nx = 31 ;
xmin = 0.0 ;
xmax = 2.0 ;
Dx = (xmax - xmin)/(Nx-1) ;
%
Ny = 31 ;
ymin = 0.0 ;
ymax = 2.0 ;
Dy = (ymax - ymin)/(Ny-1) ;
%
No_xy = 0 ;
for j=1:Nx
    for k=1:Ny
        x1 = xmin + Dx*(j-1) ;
        y1 = ymin + Dy*(k-1) ;
        % exact function values
        xg(j,k) = x1 ;
        yg(j,k) = y1 ;
        gf(j,k) = User_fun(x1,y1) ;
        % approximation
        xvec(1,1) = 1.0 ;
        xvec(2,1) = x1 ;
        xvec(3,1) = y1 ;
        xvec(4,1) = x1^2 ;
        xvec(5,1) = x1*y1 ;
        xvec(6,1) = y1^2 ;
        ff(j,k) = xvec(1:6,1)'*avec(1:6,1) ;
        % error
        err(j,k) = abs(ff(j,k) - gf(j,k)) ;
    end
end
%
% surface plot of functions and error
%
figure(2)
surfc(xg,yg,gf)
colormap hsv
xlabel('x'); ylabel('y'); zlabel('exact function'); hold on ;
%
figure(3)
surfc(xg,yg,ff)
colormap hsv
xlabel('x'); ylabel('y'); zlabel('approximated function'); hold on ;
%
figure(4)
surfc(xg,yg,err)
colormap hsv
xlabel('x'); ylabel('y'); zlabel('approximation error'); hold on ;
%----------------------------------------------------------------

2. User defined function program


function fun = User_fun(x,y)
%---------------------------------------
% function #1
%---------------------------------------
% fun = (x-1)^2 + 4.0*(y-1)^2 + (x-1)*(y-1) + 3 ;
%
%---------------------------------------
% function #2
%---------------------------------------
% xm = x - 1 ;
% ym = y - 1 ;
% fun = xm^4 + 4*ym^4 + (xm^2)*(ym^2) + 5*xm*ym + 3 ;
%---------------------------------------
% function #3
%---------------------------------------
% xm = x - 1 ;
% ym = y - 1 ;
% fun = xm^4 + ym^4 + 5*xm*ym + 3 ;
%---------------------------------------
% function #4
%---------------------------------------
xm = x - 1 ;
ym = y - 1 ;
fun = xm^4 + ym^4 +0.3*xm*ym + 3 ;
%---------------------------------------

Appendix: Important Relations

[Appendix A] Definition of a vector, gradient of a function, and Hessian of a function

$$\mathbf{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, \qquad
\nabla f(\mathbf{x}) = \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} \\ \vdots \\ \dfrac{\partial f}{\partial x_n} \end{bmatrix}, \qquad
\nabla^2 f(\mathbf{x}) = \frac{\partial^2 f(\mathbf{x})}{\partial \mathbf{x}^2} =
\begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1 \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\vdots & & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n \partial x_n}
\end{bmatrix}$$

$f \in \mathbb{R}$, $\mathbf{x} \in \mathbb{R}^n$, $\nabla f \in \mathbb{R}^{n \times 1}$, $\nabla^2 f(\mathbf{x}) \in \mathbb{R}^{n \times n}$, and the Hessian matrix is symmetric.

[Appendix B] Gradient of a function vector and time derivative of a vector function of $\mathbf{x}(t)$

$$\mathbf{f}(\mathbf{x}) = \begin{bmatrix} f_1(\mathbf{x}) \\ f_2(\mathbf{x}) \\ \vdots \\ f_m(\mathbf{x}) \end{bmatrix}, \qquad
\nabla \mathbf{f}(\mathbf{x}) = \frac{\partial \mathbf{f}(\mathbf{x})}{\partial \mathbf{x}} =
\begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_1} \\
\vdots & & \vdots \\
\dfrac{\partial f_1}{\partial x_n} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{bmatrix}, \qquad
\frac{d\mathbf{x}}{dt} = \begin{bmatrix} \dfrac{dx_1}{dt} \\ \vdots \\ \dfrac{dx_n}{dt} \end{bmatrix}$$

where $\mathbf{f} \in \mathbb{R}^m$, $\mathbf{x} \in \mathbb{R}^n$, $\nabla\mathbf{f} \in \mathbb{R}^{n \times m}$

$$\frac{df(\mathbf{x})}{dt} = \frac{\partial f}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial f}{\partial x_2}\frac{dx_2}{dt} + \cdots + \frac{\partial f}{\partial x_n}\frac{dx_n}{dt}
= \left[ \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right]
\begin{bmatrix} \dfrac{dx_1}{dt} \\ \vdots \\ \dfrac{dx_n}{dt} \end{bmatrix}
= \left( \frac{\partial f}{\partial \mathbf{x}} \right)^T \frac{d\mathbf{x}}{dt}$$

$$\frac{d\mathbf{f}(\mathbf{x})}{dt} = \frac{d}{dt}\begin{bmatrix} f_1(\mathbf{x}) \\ f_2(\mathbf{x}) \\ \vdots \\ f_m(\mathbf{x}) \end{bmatrix}
= \begin{bmatrix} \left( \dfrac{\partial f_1}{\partial \mathbf{x}} \right)^T \\ \left( \dfrac{\partial f_2}{\partial \mathbf{x}} \right)^T \\ \vdots \\ \left( \dfrac{\partial f_m}{\partial \mathbf{x}} \right)^T \end{bmatrix} \frac{d\mathbf{x}}{dt}
= \left( \frac{\partial \mathbf{f}(\mathbf{x})}{\partial \mathbf{x}} \right)^T \frac{d\mathbf{x}}{dt}$$

[Appendix C] Gradient of the product of vectors and of the product of a matrix and a vector

$$y = \mathbf{x}^T \mathbf{a} = \mathbf{a}^T \mathbf{x} = a_1 x_1 + \cdots + a_n x_n, \quad y \in \mathbb{R},\ \mathbf{x}, \mathbf{a} \in \mathbb{R}^n$$

$$\nabla y = \frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} = \mathbf{a}, \quad \nabla y \in \mathbb{R}^n$$

$$\mathbf{y} = \mathbf{A}\mathbf{x} = \begin{bmatrix} a_{11} x_1 + \cdots + a_{1n} x_n \\ \vdots \\ a_{m1} x_1 + \cdots + a_{mn} x_n \end{bmatrix}, \quad \mathbf{y} \in \mathbb{R}^m,\ \mathbf{x} \in \mathbb{R}^n,\ \mathbf{A} \in \mathbb{R}^{m \times n}$$

$$\nabla \mathbf{y} = \frac{\partial (\mathbf{A}\mathbf{x})}{\partial \mathbf{x}}
= \begin{bmatrix} a_{11} & \cdots & a_{m1} \\ \vdots & & \vdots \\ a_{1n} & \cdots & a_{mn} \end{bmatrix} = \mathbf{A}^T, \quad \nabla \mathbf{y} \in \mathbb{R}^{n \times m}$$

[Appendix D] Gradient of a quadratic function

$$y = \mathbf{x}^T \mathbf{A} \mathbf{x} = (\mathbf{x}^T \mathbf{A})\mathbf{x} = (\mathbf{A}\mathbf{x})^T \mathbf{x}, \quad y \in \mathbb{R},\ \nabla y \in \mathbb{R}^n,\ \mathbf{A} \in \mathbb{R}^{n \times n}$$

Let $\mathbf{x} = \mathbf{x}_1 = \mathbf{x}_2$, so that $y = (\mathbf{x}_1^T \mathbf{A})\mathbf{x}_2 = (\mathbf{A}\mathbf{x}_2)^T \mathbf{x}_1$ and $\dfrac{d\mathbf{x}_1}{d\mathbf{x}} = \dfrac{d\mathbf{x}_2}{d\mathbf{x}} = \mathbf{I}$. Then

$$\nabla y = \frac{dy}{d\mathbf{x}} = \frac{dy}{d\mathbf{x}_1}\frac{d\mathbf{x}_1}{d\mathbf{x}} + \frac{dy}{d\mathbf{x}_2}\frac{d\mathbf{x}_2}{d\mathbf{x}}
= (\mathbf{A}\mathbf{x}_2)\mathbf{I} + (\mathbf{x}_1^T \mathbf{A})^T \mathbf{I}
= \mathbf{A}\mathbf{x} + \mathbf{A}^T \mathbf{x} = (\mathbf{A}^T + \mathbf{A})\mathbf{x}$$

If $\mathbf{A}$ is a symmetric matrix, $\nabla y = 2\mathbf{A}^T \mathbf{x} = 2\mathbf{A}\mathbf{x}$.
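As a sanity check of the Appendix C and D identities, a minimal MATLAB sketch (all values illustrative) compares the analytic gradients against central finite differences:

% Verify grad(x'*a) = a and grad(x'*A*x) = (A' + A)*x numerically
n = 4 ;
a = randn(n,1) ; A = randn(n,n) ; x = randn(n,1) ;
h = 1e-6 ; g1 = zeros(n,1) ; g2 = zeros(n,1) ;
for k = 1:n
    e = zeros(n,1) ; e(k) = h ;
    g1(k) = ( (x+e)'*a       - (x-e)'*a       ) / (2*h) ;
    g2(k) = ( (x+e)'*A*(x+e) - (x-e)'*A*(x-e) ) / (2*h) ;
end
disp(norm(g1 - a)) ;                       % should be ~ 0
disp(norm(g2 - (A'+A)*x)) ;                % should be ~ 0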
