
Numerical Analysis: Least Squares & Errors

The document covers topics in numerical analysis, including least-square solutions for non-square linear systems, floating point errors, and LU factorization. It provides exercises on solving linear systems using LU and Cholesky factorizations, as well as approximating differences using rounding and chopping methods. Additionally, it discusses the representation of machine numbers in binary and explores the smallest and largest positive numbers in floating point format.

Uploaded by

Locke Cole
Copyright
© © All Rights Reserved

MATH3230B Numerical Analysis

Tutorial 8

1 Recall:
1. Least-squares solution for general non-square linear systems:
Let $A$ be an $m \times n$ matrix with $m > n$, and consider the general linear system
$$Ax = b.$$
The least-squares solution seeks a vector $x$ that minimizes the error $Ax - b$ in the least-squares sense, that is,
$$\min_{x \in \mathbb{R}^n} \|Ax - b\|_2^2.$$
We assume that the columns of $A$ are linearly independent. We define
$$f(x) = \|Ax - b\|_2^2.$$
The minimizer of $f(x)$ satisfies the normal equation
$$A^T A x = A^T b.$$
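
As a small illustration of the normal equation, the sketch below solves a least-squares problem with two unknowns exactly, using rational arithmetic. The helper name `normal_equations_2col` and the sample data are assumptions made for this example; they do not come from the tutorial. It also checks that the residual $Ax - b$ is orthogonal to the columns of $A$, which is just another way of writing $A^T(Ax - b) = 0$.

```python
from fractions import Fraction

def normal_equations_2col(A, b):
    """Solve min ||Ax - b||_2 for an m x 2 matrix A via A^T A x = A^T b."""
    # Entries of the 2 x 2 matrix A^T A and of the vector A^T b.
    g11 = sum(row[0] * row[0] for row in A)
    g12 = sum(row[0] * row[1] for row in A)
    g22 = sum(row[1] * row[1] for row in A)
    r1 = sum(row[0] * bi for row, bi in zip(A, b))
    r2 = sum(row[1] * bi for row, bi in zip(A, b))
    # Cramer's rule; det != 0 because the columns are assumed independent.
    det = g11 * g22 - g12 * g12
    return [(r1 * g22 - g12 * r2) / det, (g11 * r2 - g12 * r1) / det]

# Hypothetical sample data: 3 equations, 2 unknowns.
A = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2)]]
b = [Fraction(6), Fraction(0), Fraction(0)]

x = normal_equations_2col(A, b)
residual = [row[0] * x[0] + row[1] * x[1] - bi for row, bi in zip(A, b)]
# A^T (Ax - b): one inner product per column of A.
orth = [sum(row[j] * ri for row, ri in zip(A, residual)) for j in range(2)]
print(x, residual, orth)
```

Cramer's rule is used only because the system is 2 × 2; for larger problems a factorization of $A^T A$ is the usual choice.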

2. Floating-point error

Recall that most computers adopt the binary system. A machine number is a string of bits whose value is decoded as the following normalized floating-point number:
$$a = (-1)^s \, q \times 2^{(-1)^p \cdot m}.$$
For the 32-bit floating-point binary storage format (also called single precision), we have
1 bit for the sign ($s$), 8 bits for the exponent (1 bit for $p$, 7 bits for $m$), and 23 bits for the mantissa.
For the 64-bit floating-point binary storage format (also called double precision), we have
1 bit for the sign ($s$), 11 bits for the exponent (1 bit for $p$, 10 bits for $m$), and 52 bits for the mantissa.
During decimal-binary conversion, small roundoff errors occur.
Rounding is also usually adopted in scientific computing. If $x$ is rounded to $\tilde{x}$ with $n$ digits after the decimal point, we have the error estimate
$$|x - \tilde{x}| \le \frac{1}{2} \times 10^{-n}.$$
In addition to rounding the input, rounding is also needed after most arithmetic operations. The relative roundoff error is bounded by $2^{-24}$ (32-bit) or $2^{-53}$ (64-bit), and this bound is called the unit roundoff error. Together these form floating-point arithmetic.

Given a real number $x \neq 0$, let $fl(x)$ be the floating-point representation of $x$, which means
$$\left| \frac{fl(x) - x}{x} \right| \le 2^{-\beta} := \epsilon_m,$$
where $\epsilon_m$ is the machine precision (machine unit roundoff error). Then we can write
$$fl(x) = x(1 + \epsilon)$$
with $|\epsilon| \le \epsilon_m$.
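
The bound can be checked numerically. The sketch below is only an illustration and assumes the machine uses IEEE 754 single and double precision, whose exponent layout differs from the format described above but whose mantissa lengths (23 and 52 bits) give the same unit roundoff values $2^{-24}$ and $2^{-53}$. The helper names `fl_single` and `relative_error` are made up for this example.

```python
import struct
from fractions import Fraction

def fl_single(x):
    """Round a Python float to the nearest single, returned as an exact Fraction."""
    return Fraction(struct.unpack("f", struct.pack("f", x))[0])

def relative_error(exact, approx):
    """|approx - exact| / |exact|, computed exactly with rationals."""
    return abs(approx - exact) / abs(exact)

u_single = Fraction(1, 2**24)  # unit roundoff, 23-bit mantissa
u_double = Fraction(1, 2**53)  # unit roundoff, 52-bit mantissa

x = Fraction(1, 10)            # the real number 1/10
x_double = Fraction(0.1)       # exact value of the double nearest to 1/10
x_single = fl_single(0.1)      # that double rounded once more, to single

err_double = relative_error(x, x_double)
err_single = relative_error(x_double, x_single)
print(float(err_double), float(err_single))
```

Both errors are nonzero because 1/10 has no finite binary expansion, yet each stays below the corresponding unit roundoff.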

2 Exercise:
1. Consider the following tridiagonal matrix
$$A = \begin{pmatrix}
b_1 & c_1 & & & \\
a_2 & b_2 & c_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & a_{n-1} & b_{n-1} & c_{n-1} \\
 & & & a_n & b_n
\end{pmatrix}$$

(a) What is the LU factorization of A?


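(b) Using the result of (a), solve the following linear system
$$\begin{pmatrix}
4 & -1 & & & \\
-1 & 4 & -1 & & \\
 & -1 & 4 & -1 & \\
 & & -1 & 4 & -1 \\
 & & & -1 & 4
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix}
=
\begin{pmatrix} 2 \\ 6 \\ 0 \\ 6 \\ 2 \end{pmatrix}$$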

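Solution. (a) By direct LU decomposition, we have
$$A = \begin{pmatrix}
1 & & & & 0 \\
l_2 & 1 & & & \\
 & l_3 & 1 & & \\
 & & \ddots & \ddots & \\
0 & & & l_n & 1
\end{pmatrix}
\begin{pmatrix}
b_1 & c_1 & & & 0 \\
 & v_2 & c_2 & & \\
 & & \ddots & \ddots & \\
 & & & v_{n-1} & c_{n-1} \\
0 & & & & v_n
\end{pmatrix},$$
where $v_1 = b_1$, $l_k = \dfrac{a_k}{v_{k-1}}$ and $v_k = b_k - l_k c_{k-1}$ for $k = 2, \dots, n$.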
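
The recurrence for $l_k$ and $v_k$ can be sketched in a few lines. This is an illustration under assumptions, not part of the tutorial: the function name `tridiagonal_lu` and the list layout are invented here, no pivoting is done, and every $v_{k-1}$ is assumed to be nonzero. The data are the diagonals of the 5 × 5 matrix from part (b), with exact rational arithmetic so the fractions can be read off directly.

```python
from fractions import Fraction

def tridiagonal_lu(a, b, c):
    """Return (l, v) for the tridiagonal matrix with sub-diagonal a,
    diagonal b and super-diagonal c.

    Lists are 0-based, so index k here plays the role of k + 1 in the text;
    a[0] and l[0] are unused padding.
    """
    n = len(b)
    l = [Fraction(0)] * n
    v = [Fraction(0)] * n
    v[0] = Fraction(b[0])
    for k in range(1, n):
        # Assumes v[k - 1] != 0, since no pivoting is performed.
        l[k] = Fraction(a[k]) / v[k - 1]
        v[k] = b[k] - l[k] * c[k - 1]
    return l, v

n = 5
a = [0] + [-1] * (n - 1)   # sub-diagonal (a[0] is padding)
b = [4] * n                # main diagonal
c = [-1] * (n - 1)         # super-diagonal
l, v = tridiagonal_lu(a, b, c)
print(l[1:])
print(v)
```

Only the two vectors `l` and `v` are stored, so the factorization costs a number of operations proportional to $n$ rather than the $n^3$ of a dense LU factorization.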
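(b) By LU factorization, we have
$$L = \begin{pmatrix}
1 & & & & \\
-\frac{1}{4} & 1 & & & \\
 & -\frac{4}{15} & 1 & & \\
 & & -\frac{15}{56} & 1 & \\
 & & & -\frac{56}{209} & 1
\end{pmatrix}$$
and
$$U = \begin{pmatrix}
4 & -1 & & & \\
 & \frac{15}{4} & -1 & & \\
 & & \frac{56}{15} & -1 & \\
 & & & \frac{209}{56} & -1 \\
 & & & & \frac{780}{209}
\end{pmatrix}$$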

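Solving
$$L \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix}
= \begin{pmatrix} 2 \\ 6 \\ 0 \\ 6 \\ 2 \end{pmatrix}$$
we have
$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix}
= \begin{pmatrix} 2 \\ \frac{13}{2} \\ \frac{26}{15} \\ \frac{181}{28} \\ \frac{780}{209} \end{pmatrix}$$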

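Then solving
$$U \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix}
= \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix}$$
we have
$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix}
= \begin{pmatrix} 1 \\ 2 \\ 1 \\ 2 \\ 1 \end{pmatrix}$$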
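
The two triangular solves can be sketched as a single routine: one sweep factors the matrix and solves $Ly = b$, and a second sweep solves $Ux = y$. The function name `solve_tridiagonal` and the list layout are assumptions for this illustration, and nonzero pivots are assumed since no pivoting is done. Exact rational arithmetic is used so the intermediate vector $y$ can be compared with the fractions in the solution.

```python
from fractions import Fraction

def solve_tridiagonal(a, b, c, rhs):
    """Solve a tridiagonal system by LU factorization without pivoting.

    a: sub-diagonal with a[0] unused, b: diagonal, c: super-diagonal.
    Returns (y, x), where L y = rhs and U x = y.
    """
    n = len(b)
    v = [Fraction(b[0])]
    y = [Fraction(rhs[0])]
    # Factor and forward-substitute in the same sweep.
    for k in range(1, n):
        lk = Fraction(a[k]) / v[k - 1]   # assumes v[k - 1] != 0
        v.append(b[k] - lk * c[k - 1])
        y.append(rhs[k] - lk * y[k - 1])
    # Back substitution on the upper bidiagonal factor.
    x = [Fraction(0)] * n
    x[n - 1] = y[n - 1] / v[n - 1]
    for k in range(n - 2, -1, -1):
        x[k] = (y[k] - c[k] * x[k + 1]) / v[k]
    return y, x

a = [0, -1, -1, -1, -1]
b = [4, 4, 4, 4, 4]
c = [-1, -1, -1, -1]
rhs = [2, 6, 0, 6, 2]
y, x = solve_tridiagonal(a, b, c, rhs)
print(y)
print(x)
```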

2. (a) Assume that the matrix $A \in \mathbb{R}^{m \times n}$ has full column rank, so that there is a least-squares solution $x^*$ which minimizes the energy
$$\frac{1}{2} \|Ax - b\|_2^2.$$
Prove that solving for $x^*$ is equivalent to solving the equation
$$A^T A x = A^T b.$$

(b) Solve the following linear system in the least-squares sense using the Cholesky factorization.
$$Ax = b, \qquad
A = \begin{pmatrix} 1 & -1 & 1 \\ -1 & 1 & 1 \\ 0 & 1 & -1 \\ 0 & 1 & -1 \end{pmatrix}, \qquad
b = \begin{pmatrix} 2 \\ 1 \\ 1 \\ -2 \end{pmatrix}$$

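Solution. (a) Let
$$F(x) = \frac{1}{2} \|Ax - b\|_2^2.$$
Then
$$\nabla F(x) = A^T A x - A^T b.$$
Since $A$ has full column rank, the Hessian $A^T A$ is positive definite, so $F$ is strictly convex and its only stationary point is the minimizer. Therefore the minimum is attained if and only if
$$\nabla F(x) = 0 \iff A^T A x = A^T b.$$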
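
For reference, the gradient formula can be obtained by expanding the squared norm. The steps below are a standard derivation written out as a sketch; they use only the symmetry of $A^T A$.

```latex
\begin{aligned}
F(x) &= \tfrac{1}{2}(Ax - b)^T (Ax - b)
      = \tfrac{1}{2}\, x^T A^T A x - x^T A^T b + \tfrac{1}{2}\, b^T b, \\
\nabla\!\left(\tfrac{1}{2}\, x^T A^T A x\right) &= A^T A x
      \quad \text{(because } A^T A \text{ is symmetric)}, \\
\nabla\!\left(x^T A^T b\right) &= A^T b, \\
\nabla F(x) &= A^T A x - A^T b .
\end{aligned}
```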

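(b) Note that
$$A^T A = \begin{pmatrix} 2 & -2 & 0 \\ -2 & 4 & -2 \\ 0 & -2 & 4 \end{pmatrix}.$$
Then
$$A^T A =
\begin{pmatrix} \sqrt{2} & -\sqrt{2} & 0 \\ 0 & \sqrt{2} & -\sqrt{2} \\ 0 & 0 & \sqrt{2} \end{pmatrix}^T
\begin{pmatrix} \sqrt{2} & -\sqrt{2} & 0 \\ 0 & \sqrt{2} & -\sqrt{2} \\ 0 & 0 & \sqrt{2} \end{pmatrix}
= R^T R.$$
First we solve $R^T y = A^T b$, where $A^T b = (1, -2, 4)^T$, and obtain
$$y = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ \frac{3}{\sqrt{2}} \end{pmatrix}.$$
Then we solve $Rx = y$ and obtain
$$x = \begin{pmatrix} \frac{3}{2} \\ 1 \\ \frac{3}{2} \end{pmatrix}.$$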
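
The same computation can be sketched in plain Python with floating-point arithmetic. The function names `cholesky_upper` and `least_squares_cholesky` are assumptions for this illustration. The sketch forms $A^T A$ and $A^T b$, computes the upper-triangular factor $R$ with $R^T R = A^T A$, and then performs a forward solve with $R^T$ followed by a back solve with $R$.

```python
import math

def cholesky_upper(G):
    """Return upper-triangular R with R^T R = G (G symmetric positive definite)."""
    n = len(G)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        s = G[i][i] - sum(R[k][i] ** 2 for k in range(i))
        R[i][i] = math.sqrt(s)
        for j in range(i + 1, n):
            t = G[i][j] - sum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = t / R[i][i]
    return R

def least_squares_cholesky(A, b):
    """Solve A^T A x = A^T b via A^T A = R^T R; returns (R, y, x)."""
    m, n = len(A), len(A[0])
    G = [[sum(A[r][i] * A[r][j] for r in range(m)) for j in range(n)]
         for i in range(n)]
    rhs = [sum(A[r][i] * b[r] for r in range(m)) for i in range(n)]
    R = cholesky_upper(G)
    # Forward substitution: R^T y = A^T b (R^T is lower triangular).
    y = [0.0] * n
    for i in range(n):
        y[i] = (rhs[i] - sum(R[k][i] * y[k] for k in range(i))) / R[i][i]
    # Back substitution: R x = y.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))) / R[i][i]
    return R, y, x

A = [[1, -1, 1], [-1, 1, 1], [0, 1, -1], [0, 1, -1]]
b = [2, 1, 1, -2]
R, y, x = least_squares_cholesky(A, b)
print(y)
print(x)
```

Forming $A^T A$ explicitly squares the condition number of the problem, so this route is best suited to small, well-conditioned examples like this one.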

3. Let p = 0.54617 and q = 0.54601. Use four-digit arithmetic to approximate r = p − q and determine the
relative errors using
(a) rounding approximation;
(b) chopping approximation.

Please also write down the number of significant digits for the approximation (a) and (b) respectively. (Hint:
consider the relative error.)

Solution. The exact value of r = p − q is r = 0.00016.


(a) Rounding approximation: $p^* = 0.5462$ and $q^* = 0.5460$.
$r^* = p^* - q^* = 0.0002$. The relative error is
$$\frac{|r - r^*|}{|r|} = \frac{|0.00016 - 0.0002|}{|0.00016|} = 0.25,$$
so the result has only one significant digit.


(b) Chopping approximation: $p^* = 0.5461$ and $q^* = 0.5460$.
$r^* = p^* - q^* = 0.0001$. The relative error is
$$\frac{|r - r^*|}{|r|} = \frac{|0.00016 - 0.0001|}{|0.00016|} = 0.375,$$
so the result has only one significant digit.
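
Both computations can be reproduced with Python's `decimal` module, which supports arithmetic with a fixed number of significant digits. The helper name `four_digit_difference` is an assumption for this sketch. Rounding is modeled with `ROUND_HALF_UP` and chopping with `ROUND_DOWN`, each in a context limited to four significant digits.

```python
from decimal import Context, Decimal, ROUND_DOWN, ROUND_HALF_UP

def four_digit_difference(p, q, rounding):
    """Approximate p - q in four-digit arithmetic; return (r_star, relative error).

    p and q are decimal strings, so no binary conversion error enters.
    """
    ctx = Context(prec=4, rounding=rounding)
    p_star = ctx.create_decimal(p)          # p stored with 4 significant digits
    q_star = ctx.create_decimal(q)
    r_star = ctx.subtract(p_star, q_star)   # subtraction in the same context
    r = Decimal(p) - Decimal(q)             # exact difference (28-digit default)
    return r_star, abs(r - r_star) / abs(r)

rounded = four_digit_difference("0.54617", "0.54601", ROUND_HALF_UP)
chopped = four_digit_difference("0.54617", "0.54601", ROUND_DOWN)
print(rounded)
print(chopped)
```

The large relative errors come from cancellation: $p$ and $q$ agree in their first three digits, so the subtraction leaves only the digits that were already affected by rounding or chopping.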

4. Recall that most computers adopt the binary system. A machine number is a string of bits whose value is decoded as the following normalized floating-point number:
$$a = (-1)^s \, q \times 2^{(-1)^p \cdot m}, \qquad (1)$$

where $s, p \in \{0, 1\}$, $m$ is the 7-bit exponent, and $q = (1.f)_2$ with $f$ being the 23-bit fractional part.
(a) Find the smallest and second smallest positive numbers of the form (1).
(b) Find the largest and second largest numbers of the form (1).
(c) Suppose that $x$ is a real number. If $x$ is rounded to $\tilde{x}$ with $n$ digits after the decimal point, show that
$$|x - \tilde{x}| \le \frac{1}{2} \times 10^{-n}.$$

Solution. (a) Put $s = 0$, $p = 1$, $f = 000\ldots000$ and $m = 1111111$;
the smallest positive number is $2^{-127}$.
Put $s = 0$, $p = 1$, $f = 000\ldots001$ and $m = 1111111$;
the second smallest positive number is
$$(1.000\ldots001)_2 \times 2^{-127} = (1 + 2^{-23}) \times 2^{-127}.$$

(b) Put $s = 0$, $p = 0$, $f = 111\ldots111$ and $m = 1111111$; the largest number is
$$(1.111\ldots111)_2 \times 2^{127} = (2 - 2^{-23}) \times 2^{127}.$$

Put $s = 0$, $p = 0$, $f = 111\ldots110$ and $m = 1111111$; the second largest number is
$$(1.111\ldots110)_2 \times 2^{127} = (2 - 2^{-22}) \times 2^{127}.$$
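
These extreme values can be checked with exact rational arithmetic. The decoder below is a sketch of format (1) only; the function name `decode` and its `frac_bits` parameter are assumptions for this illustration. As an extra sanity check, it enumerates every positive number of a scaled-down analogue (2-bit exponent, 2-bit fraction, an invented toy format) and confirms that the same bit patterns give the two smallest and two largest values.

```python
from fractions import Fraction

def decode(s, p, m, f, frac_bits=23):
    """Exact value of (-1)^s * (1.f)_2 * 2^((-1)^p * m)."""
    q = 1 + Fraction(f, 2**frac_bits)
    exponent = m if p == 0 else -m
    return (-1) ** s * q * Fraction(2) ** exponent

M_MAX = 2**7 - 1    # m = 1111111 in binary
F_MAX = 2**23 - 1   # f = 111...111 (23 ones)

smallest = decode(0, 1, M_MAX, 0)
second_smallest = decode(0, 1, M_MAX, 1)
largest = decode(0, 0, M_MAX, F_MAX)
second_largest = decode(0, 0, M_MAX, F_MAX - 1)

# Toy analogue: 2-bit exponent and 2-bit fraction, small enough to enumerate.
toy = sorted({decode(0, p, m, f, frac_bits=2)
              for p in (0, 1) for m in range(4) for f in range(4)})
toy_expected = [decode(0, 1, 3, 0, 2), decode(0, 1, 3, 1, 2),
                decode(0, 0, 3, 2, 2), decode(0, 0, 3, 3, 2)]
print(toy[:2] + toy[-2:] == toy_expected)
```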

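(c) Take $x \ge 0$; the case $x < 0$ follows by symmetry. If the $(n+1)$th digit of $x$ is 0, 1, 2, 3, or 4, then $x = \tilde{x} + \epsilon$ with $0 \le \epsilon < 0.5 \times 10^{-n}$.
If the $(n+1)$th digit of $x$ is 5, 6, 7, 8, or 9, then $\tilde{x} = \hat{x} + 10^{-n}$, where $\hat{x}$ is the number with the same first $n$ digits as $x$ and all digits beyond the $n$th equal to 0. So we have

$$x = \hat{x} + \delta \times 10^{-n}$$

with $0.5 \le \delta < 1$ and

$$\tilde{x} - x = (1 - \delta) \times 10^{-n} \le \frac{1}{2} \times 10^{-n}.$$
In both cases $|x - \tilde{x}| \le \frac{1}{2} \times 10^{-n}$, and the result follows.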
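
The bound can also be checked on a few sample values with exact decimal arithmetic. This is only a spot check under assumptions, not a proof: the helper names `round_to_n` and `within_bound` and the sample values are invented here, and ties are rounded away from zero (`ROUND_HALF_UP`), matching the case split in the argument above.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_n(x, n):
    """Round the Decimal x to n digits after the decimal point."""
    step = Decimal((0, (1,), -n))  # exactly 10**(-n)
    return x.quantize(step, rounding=ROUND_HALF_UP)

def within_bound(x, n):
    """Check |x - round_to_n(x, n)| <= 0.5 * 10**(-n)."""
    bound = Decimal("0.5") * Decimal((0, (1,), -n))
    return abs(x - round_to_n(x, n)) <= bound

samples = ["3.14159", "2.71828", "0.54617", "0.12345", "9.99995", "-1.23456"]
checks = [within_bound(Decimal(s), n) for s in samples for n in range(5)]
print(all(checks))
```

The value 0.12345 with $n = 4$ is a tie case in which the bound holds with equality.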
