Chapter 3 - Interpolation
3.1 The Interpolating Polynomial
Interpolation is the process of defining a function that
“connects the dots” between specified (data) points.
In this chapter, we focus on two closely related
interpolants, the cubic spline and the shape-preserving
cubic spline called “pchip”.
Two distinct points uniquely determine a straight line.
Restated in more mathematical terms, any pair of
points (x1, y1) and (x2, y2) with x1 ≠ x2 determine a
unique polynomial in x of degree less than two whose
graph passes through the two points.
There may be different formulas for the polynomial,
but they all describe the same straight line.
This idea generalizes to more than two points.
For example, any three points (x1, y1), (x2, y2), and
(x3, y3) with pairwise distinct x1, x2, x3 determine a unique
polynomial in x of degree less than three whose graph
passes through the three points.
In general, given n points (xk , yk ), k = 1, 2, . . . , n,
with distinct xk , there is a unique polynomial in x of
degree less than n whose graph passes through the n
points.
It is easiest to remember that the number of data
points n is also the number of polynomial coefficients.
Note: Some of the leading coefficients might be zero,
so the degree might actually be less than n − 1.
Again, there may be many different ways to express
the polynomial, but they are all equivalent algebraically,
and they all plot the same curve.
This polynomial is called the interpolating polynomial
because it passes through the given points; i.e.,
Pn(xk ) = yk , k = 1, 2, . . . , n.
Note: Sometimes other polynomials of lower degree
are used to fit data; e.g., the line of best fit.
These are not interpolating polynomials!
The most direct form in which to express Pn(x) is the
Lagrange form:
Pn(x) = Σ_{k=1}^{n} yk ∏_{j=1, j≠k}^{n} (x − xj) / (xk − xj).
Note: There are n terms in the sum and n − 1 terms
in each product ↔ a polynomial of degree less than n.
If Pn(x) is evaluated at x = xK , all the products
except the one where k = K are 0.
Furthermore, the Kth product is equal to 1
↔ the sum evaluates to yK just as it should!
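The Lagrange form translates almost directly into code. The following Python sketch (an illustration only; the course's polyinterp is a Matlab function) evaluates Pn(x) at a single point:

```python
def lagrange_eval(xk, yk, x):
    """Evaluate the Lagrange form of the interpolating polynomial at x."""
    n = len(xk)
    total = 0.0
    for k in range(n):
        # Product over j != k of (x - x_j) / (x_k - x_j)
        prod = 1.0
        for j in range(n):
            if j != k:
                prod *= (x - xk[j]) / (xk[k] - xk[j])
        total += yk[k] * prod
    return total
```

At a data point x = xK, every product with k ≠ K contains a zero factor, so the sum collapses to yK, exactly as described above.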
Consider the following data set:
 x   0    1    2    3
 y  −5   −6   −1   16
Then the Lagrange interpolating polynomial is
P4(x) = (−5) · [(x − 1)(x − 2)(x − 3)] / [(0 − 1)(0 − 2)(0 − 3)]
      + (−6) · [(x − 0)(x − 2)(x − 3)] / [(1 − 0)(1 − 2)(1 − 3)]
      + (−1) · [(x − 0)(x − 1)(x − 3)] / [(2 − 0)(2 − 1)(2 − 3)]
      + (16) · [(x − 0)(x − 1)(x − 2)] / [(3 − 0)(3 − 1)(3 − 2)].
Each term is of degree 3, so P4(x) is of degree (at most) 3.
(Because the coefficient of x^3 does not vanish, the degree is exactly 3. Verify!)
If we substitute x = 0, 1, 2, or 3, three of the terms
vanish, and the fourth produces the corresponding
value from the data set.
Polynomials are not usually written in Lagrange form.
They are usually written in power form; e.g., the
previous Lagrange polynomial can be written as
x^3 − 2x − 5.
Of course, a polynomial in Lagrange form can always
be written out in power form if you like.
But if we want to obtain the power form of an
interpolating polynomial directly,
Pn(x) = c1 x^(n−1) + c2 x^(n−2) + . . . + cn−1 x + cn,
its coefficients ck can (in principle!) be computed by
solving a system of linear equations:
⎡ x1^(n−1)  x1^(n−2)  . . .  x1  1 ⎤ ⎡ c1 ⎤   ⎡ y1 ⎤
⎢ x2^(n−1)  x2^(n−2)  . . .  x2  1 ⎥ ⎢ c2 ⎥   ⎢ y2 ⎥
⎢    ...       ...           ...    ⎥ ⎢ ...⎥ = ⎢ ...⎥
⎣ xn^(n−1)  xn^(n−2)  . . .  xn  1 ⎦ ⎣ cn ⎦   ⎣ yn ⎦
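To make this concrete, here is an illustrative Python sketch that builds the Vandermonde matrix (Matlab's decreasing-power column convention) and solves the system with a toy Gaussian-elimination routine. The function names vander and solve are chosen to mirror the idea; real code would call a library solver:

```python
def vander(x):
    """Vandermonde matrix with columns in decreasing powers (Matlab convention)."""
    n = len(x)
    return [[xi ** (n - 1 - j) for j in range(n)] for xi in x]

def solve(A, b):
    """Toy Gaussian elimination with partial pivoting, for illustration only."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    c = [0.0] * n
    for r in range(n - 1, -1, -1):
        c[r] = (M[r][n] - sum(M[r][j] * c[j] for j in range(r + 1, n))) / M[r][r]
    return c
```

For the data set above, the computed coefficients are (1, 0, −2, −5), i.e., the power form x^3 − 2x − 5.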
The coefficient matrix of this linear system has a special
structure: It is known as a Vandermonde matrix, V.
Its elements are
vij = xi^(n−j).
Note: Depending on the definition being used, the
columns of a Vandermonde matrix are sometimes
written in the opposite order.
But in Matlab, polynomial coefficient vectors are
always assumed to be in decreasing order; i.e., the
coefficient of the highest power is the first element.
The Matlab function vander automatically
generates Vandermonde matrices.
See vanderDemo.m
It can be shown that Vandermonde matrices are
nonsingular as long as the points xk are distinct.
However, it can also be shown that Vandermonde
matrices are often badly conditioned!
Consequently, using the Vandermonde matrix to find a
polynomial interpolant in power form only works well
with a few well-spaced and well-scaled data points.
It is dangerous to use as a general-purpose approach!
There are several (external) Matlab functions that
implement different interpolation algorithms.
All of them are called as follows:
P = polyinterp(xk,yk,x)
The first 2 input arguments xk and yk are vectors
of the same length that contain the data.
The third input argument x is a vector of points where
you would like the interpolant to be evaluated.
The output P is the same length as x and has elements
P(i) = polyinterp(xk,yk,x(i)).
Of course, if you evaluate P at xk, you get yk.
The interpolation function polyinterp is based on
the Lagrange interpolating polynomial.
See polyinterpDemo.m
Newton Interpolation
We have seen two extreme cases of representations of
polynomial interpolants:
1. The Lagrange form, which is complicated, but allows
you to write out Pn(x) directly.
2. The power form, which is easy to use, but
requires the solution of a typically ill-conditioned
Vandermonde linear system.
Newton interpolation provides a trade-off between
these two extremes.
The Newton interpolating polynomial takes the form

Pn(x) = c1 + c2 (x − x1) + c3 (x − x1)(x − x2) + . . . + cn ∏_{k=1}^{n−1} (x − xk).
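The coefficients ck can be computed with divided differences, and the Newton form evaluated with a Horner-like nested scheme. The following Python sketch is illustrative only (function names are hypothetical, not course code):

```python
def newton_coeffs(xk, yk):
    """Divided-difference coefficients c for the Newton form, in O(n^2) operations."""
    c = list(yk)
    n = len(xk)
    for j in range(1, n):
        # Update in place, highest index first, so lower-order differences survive.
        for i in range(n - 1, j - 1, -1):
            c[i] = (c[i] - c[i - 1]) / (xk[i] - xk[i - j])
    return c

def newton_eval(xk, c, x):
    """Evaluate the Newton form with nested multiplication (Horner-like)."""
    p = c[-1]
    for k in range(len(c) - 2, -1, -1):
        p = p * (x - xk[k]) + c[k]
    return p
```

On the example below, newton_coeffs returns (−27, 13, −4), matching the hand computation.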
Example
Interpolate the points (−2, −27), (0, −1), (1, 0) using
Newton interpolation.
Let
P3(x) = c1 + c2(x − x1) + c3(x − x1)(x − x2).
Then,
−27 = c1,
 −1 = c1 + c2 (0 + 2)  ⟹  c2 = 26/2 = 13,
  0 = c1 + c2 (1 + 2) + c3 (1 + 2)(1 − 0)  ⟹  c3 = (27 − 13(3))/3 = −4.
Thus,
P3(x) = −27 + 13(x + 2) − 4(x + 2)(x).
Once the coefficients ck , k = 1, 2, . . . , n, have been
computed, the Newton polynomial can be efficiently
evaluated using a modified version of Horner’s method.
Note also that Newton interpolation can be done
incrementally; i.e., the interpolant can be created as
points are being added.
It may be important to be able to do this depending
on the application.
So we fit a straight line to two points, then add a point
and fit a quadratic to three points, then add a point
and fit a cubic to four points, etc.
With the other two methods, if more data points are
added, the computation of the interpolants has to be
started from scratch.
Final notes:
• The coefficients ck can be obtained recursively in
O(n^2) operations using divided differences.
These computations are less prone to overflow and
underflow than the previous methods.
• In theory, any order of the interpolation points xk is
OK, but the conditioning depends on this ordering!
Left-to-right ordering is not necessarily the best!
Two better ideas are to order the points in increasing
distance from either their mean or from a specified
point at which the interpolant is to be evaluated.
Another Example
We will also be making use of the following data set in
the remainder of this chapter.
See polyinterpDemo2.m
Here, we see the primary difficulty with high-degree
polynomial interpolation at equally spaced points.
Even with only six equally spaced points, the
interpolant shows an unnatural-looking amount of
variation (overshoots, wiggles, etc.), especially in the
first and last subintervals.
Consequently, high-degree polynomial interpolation at
equally spaced points is hardly ever used for data and
curve fitting.
In this course, its primary application is in the
derivation of other numerical methods.
Parametric Interpolation
None of the techniques described so far can be used
to generate curves like the letter “S”.
That’s because the letter “S” is not a function (a
vertical line intersects “S” more than once).
One way to get around this problem is to describe the
curve in terms of a parameter t.
We connect the points (x0, y0), (x1, y1), . . . , (xn, yn)
in that order by using a parameter t ∈ [t0, tn] with
t0 < t1 < . . . < tn such that
xk = x(tk ), yk = y(tk ), k = 0, 1, . . . , n.
In other words, we create two polynomial interpolants
for two functions x(t) and y(t).
For example, suppose we have the following set of 5
data points, and we parameterize it by a parameter t
that takes on 5 equally spaced points in [0, 1].
 k    0      1      2      3      4
 tk   0      0.25   0.5    0.75   1
 xk  −1      0      1      0      1
 yk   0      1      0.5    0     −1
We construct a pair of Lagrange polynomials to
interpolate x(t) and y(t).
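As an illustration, the componentwise construction can be sketched in Python (names like curve are hypothetical; the course demo is a Matlab script):

```python
def lagrange_eval(tk, vk, t):
    """Lagrange interpolation of the values vk given at parameters tk."""
    n = len(tk)
    total = 0.0
    for k in range(n):
        prod = 1.0
        for j in range(n):
            if j != k:
                prod *= (t - tk[j]) / (tk[k] - tk[j])
        total += vk[k] * prod
    return total

# Data from the table: an equally spaced parameter on [0, 1]
tk = [0, 0.25, 0.5, 0.75, 1]
xk = [-1, 0, 1, 0, 1]
yk = [0, 1, 0.5, 0, -1]

def curve(t):
    """The parametric interpolant (x(t), y(t))."""
    return lagrange_eval(tk, xk, t), lagrange_eval(tk, yk, t)
```

By construction, curve(tk[k]) reproduces the data point (xk[k], yk[k]) for every k.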
The data and the interpolant are shown in the figure.
See parametricInterpolation.m
[Figure: the parametric interpolant plotted in the (x, y) plane, passing through the five data points.]
3.2 Piecewise Linear Interpolation
This is perhaps the most intuitive form of
interpolation, even if you’re still not sure what all
the words mean.
Piecewise linear interpolation is simply connecting data
points by straight lines.
“Linear interpolation” means to use straight-line
interpolants.
We say it is “piecewise” interpolation because you
normally need different straight lines to connect
different pairs of points.
This is the default behaviour in Matlab’s plot
routine.
If the data points are sufficiently close together,
the jaggedness associated with piecewise linear
interpolation is not noticeable.
See plotDemo.m
Here is an outline of the algorithm used to produce a
piecewise linear interpolant.
The steps of this algorithm are used as a basis for
more sophisticated piecewise polynomial interpolants
(or splines).
Suppose we have a data set consisting of points
(xk , yk ), and we wish to construct the piecewise
polynomial interpolant through them.
First, for a given x, the interval index k is determined
such that
xk ≤ x ≤ xk+1.
Second, we define a local variable s := x − xk .
Third, we compute the first divided difference
δk := (yk+1 − yk) / (xk+1 − xk).
Finally, we construct the interpolant
P(x) = yk + [(yk+1 − yk)/(xk+1 − xk)] (x − xk)
     = yk + δk s.
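The four steps above can be sketched directly in Python (illustrative only; in Matlab this is what interp1's default linear method does):

```python
import bisect

def pwlinear(xk, yk, x):
    """Piecewise linear interpolant through the points (xk, yk), evaluated at x."""
    # First: find the interval index k with xk[k] <= x <= xk[k+1] (clamped at the ends).
    k = max(0, min(bisect.bisect_right(xk, x) - 1, len(xk) - 2))
    # Second: the local variable s.
    s = x - xk[k]
    # Third: the first divided difference.
    delta = (yk[k + 1] - yk[k]) / (xk[k + 1] - xk[k])
    # Finally: the interpolant.
    return yk[k] + delta * s
```

Note that the returned function is continuous but its slope jumps from δk to δk+1 at each breakpoint.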
This formula should look familiar!
This is the Newton form of the (linear) interpolating
polynomial.
It can be generalized to higher-degree interpolants
by using higher-order divided differences; i.e., divided
differences of divided differences.
So we have constructed the straight line that passes
through (xk , yk ) and (xk+1, yk+1).
The points xk are sometimes called breakpoints.
Note: P(x) is a continuous function of x, but its first
derivative P′(x) is not!
→ P′(x) = δk on each subinterval and jumps at the
breakpoints.
3.3 Piecewise Cubic Hermite
Interpolation
Many of the most effective interpolants are based on
piecewise cubic polynomials.
Let hk := xk+1 − xk be the length of the kth
subinterval.
Then
δk = (yk+1 − yk) / hk.
Let dk := P′(xk).
Note: If P(x) is piecewise linear, then dk is not really
defined because dk = δk−1 on the left of xk, but
dk = δk on the right of xk. Usually δk−1 ≠ δk!
But for higher-order interpolants (like cubics), it is
possible to force the interpolant to be smooth at the
breakpoints.
This is done by forcing the derivative at the right end
of one piecewise cubic to agree with the derivative at
the left end of the next piecewise cubic.
Why is this possible? A cubic polynomial has 4 degrees
of freedom (i.e., the 4 coefficients c1, c2, c3, c4).
We can specify 4 pieces of information, and as long as
the 4 × 4 system of linear equations is non-singular, we
can obtain unique values for the 4 unknowns ci and
hence uniquely specify the cubic polynomial.
Note: Until now, we have specified that the 4 pieces
of information should all be function (data) values
(xk , yk ); i.e., P (x) is the unique cubic that passes
through the 4 data points.
This is not necessary!
In fact, what we will do now (in order to enforce
smoothness) is to specify function values and slopes
(first derivatives) at the endpoints of each subinterval
to define the piecewise cubic polynomial.
For example, suppose P(x) = c1 x^3 + c2 x^2 + c3 x + c4.
We can specify, say, P(0) = 0, P(1) = 3, P′(0) = 1,
and P′(1) = 2.
This leads to 4 equations:
c4 = 0,
c1 + c2 + c3 + c4 = 3,
c3 = 1,
3c1 + 2c2 + c3 = 2.
These can be solved to obtain c4 = 0, c3 = 1, c2 = 5,
and c1 = −3.
Hence, we have a cubic polynomial that agrees with
the two data points and the two slopes we specified.
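As a quick check, the computed coefficients can be verified against the four conditions. A minimal Python sketch (coefficient order follows the decreasing-power Matlab convention used above):

```python
# Power form with decreasing coefficients [c1, c2, c3, c4]
def P(x, c):
    c1, c2, c3, c4 = c
    return c1 * x**3 + c2 * x**2 + c3 * x + c4

def dP(x, c):
    """Derivative of the power-form cubic."""
    c1, c2, c3, _ = c
    return 3 * c1 * x**2 + 2 * c2 * x + c3

c = (-3, 5, 1, 0)  # the solution found above
```

Plugging in confirms P(0) = 0, P(1) = 3, P′(0) = 1, and P′(1) = 2.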
Our equations were simplified because we considered
the interval [0, 1].
Consider the following cubic polynomial on the interval
xk ≤ x ≤ xk+1 expressed in terms of local variables
s = x − xk and h = hk :
P(x) = [(3hs^2 − 2s^3)/h^3] yk+1 + [(h^3 − 3hs^2 + 2s^3)/h^3] yk
     + [s^2 (s − h)/h^2] dk+1 + [s (s − h)^2/h^2] dk.
This is a cubic polynomial in s (and hence in x)
that satisfies 4 interpolation conditions: 2 on function
values and 2 on (possibly unknown) derivative values.
P(xk) = yk,    P(xk+1) = yk+1,
P′(xk) = dk,   P′(xk+1) = dk+1.
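The cubic piece above can be coded directly in the local variable s. This Python sketch is an illustration (the sample values of yk and dk below are arbitrary, not from the course data):

```python
def hermite_piece(s, h, yk, yk1, dk, dk1):
    """Cubic Hermite piece on [x_k, x_{k+1}], written in the local variable s = x - x_k."""
    return ((3 * h * s**2 - 2 * s**3) / h**3 * yk1
            + (h**3 - 3 * h * s**2 + 2 * s**3) / h**3 * yk
            + s**2 * (s - h) / h**2 * dk1
            + s * (s - h)**2 / h**2 * dk)
```

Evaluating at s = 0 and s = h recovers yk and yk+1, and a finite-difference check of the slope at the two endpoints recovers dk and dk+1, confirming the four interpolation conditions.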
Interpolants that also fit derivative values are known
as Hermite or osculatory interpolants because of the
higher-order contact at the breakpoints.
(Osculari is Latin for “to kiss”.)
If we know both function values and first derivatives
at a set of points, then a piecewise cubic Hermite
interpolant can be fit to those data.
But if we are not given the derivative values, we need
to define the slopes dk somehow.
We now study two possible ways to do this, leading to
the functions pchip and spline in Matlab.
3.4 Shape-Preserving Cubic Spline
(pchip)
pchip stands for “piecewise cubic Hermite
interpolating polynomial.”
Unfortunately, the catchy name does not precisely
specify how the interpolant is defined – there are many
ways to have a “piecewise cubic Hermite interpolating
polynomial.”
The key features of the pchip interpolant in
Matlab are that it is shape preserving and somehow
“visually pleasing”.
The key idea is to determine the slopes dk so that the
interpolant does not oscillate too much.
We will study the pchip spline in mathematical detail
in the case studies.
3.5 Cubic Spline
The final piecewise cubic interpolating function we
consider in this course is a cubic spline.
The term “spline” originates from the name of an
instrument used in drafting.
A real spline is a thin, flexible wooden or plastic
instrument that is passed through given data points; it
is used to define a smooth curve in between the points.
Physically, the spline takes its shape by naturally
minimizing its own potential energy subject to it
passing through the data points.
Mathematically, the spline must have a continuous
second derivative (curvature) and pass through the
data points.
The breakpoints of a spline are also referred to as knots
or nodes.
Note: Splines extend far beyond the one-dimensional,
cubic, interpolatory spline we are studying.
There are multidimensional, high-order, variable-knot,
and approximating splines.
The first derivative P′(x) of our piecewise cubic
function is defined by different formulas on either side
of a knot xk.
However, both formulas are designed to give the same
value dk at xk, so P′(x) is continuous.
We have no such guarantee for the second derivative;
however, we choose continuity of the second derivative
to be a defining condition for the cubic spline.
Applying the preceding approach to each interior knot
xk , k = 2, 3, . . . , n − 1, gives n − 2 equations involving
the n unknowns dk .
A different approach is necessary near the ends of the
interval.
One effective strategy is known as “not-a-knot”.
The idea is to use a single cubic on the first two
subintervals (x1 ≤ x ≤ x3) and on the last two
subintervals (xn−2 ≤ x ≤ xn).
Thus, if there is only one cubic on each of the first and
last pairs of intervals, it is as if x2 and xn−1 are not
there – they are not treated like other knots.
For example, on the first pair of intervals, we can
pretend there are two different cubic polynomials
P1(x), P2(x) and impose the condition
P1′′′(x2+) = P2′′′(x2−).
Together with the continuity of the cubic spline and
its first two derivatives, this forces P1(x) ≡ P2(x).
With the two end conditions, we now have n linear
equations in n unknowns:
Ad = r,
where d = (d1, d2, . . . , dn)T is the vector of slopes.
The slopes can now be computed by
d = A\r.
Because most of the elements of A are zero, it is
appropriate to store A in a sparse data structure.
(In this case, it turns out that A is tridiagonal.)
The \ operator can then take advantage of the
tridiagonal structure and solve the linear equations
in time and storage proportional to n, the number of
data points.
We have also seen that a specialized algorithm such as
the Thomas algorithm can do this too, if you’re willing
and able to write your own code.
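A Thomas-algorithm sketch in Python, for illustration (the index conventions here, with a[0] and the last entry of c unused, are an assumption of this sketch):

```python
def thomas(a, b, c, r):
    """Solve a tridiagonal system in O(n) time and storage.
    a = sub-diagonal (a[0] unused), b = main diagonal,
    c = super-diagonal (last entry unused), r = right-hand side.
    No pivoting, so the matrix should be, e.g., diagonally dominant."""
    n = len(b)
    cp, rp = [0.0] * n, [0.0] * n
    # Forward sweep: eliminate the sub-diagonal.
    cp[0] = c[0] / b[0]
    rp[0] = r[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        if i < n - 1:
            cp[i] = c[i] / m
        rp[i] = (r[i] - a[i] * rp[i - 1]) / m
    # Back substitution.
    d = [0.0] * n
    d[-1] = rp[-1]
    for i in range(n - 2, -1, -1):
        d[i] = rp[i] - cp[i] * d[i + 1]
    return d
```

The spline slope system Ad = r has exactly this tridiagonal shape, which is why the solve costs only O(n).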
Bézier Curves
Applications in computer graphics and computer-aided
design (CAD) require the rapid generation of smooth
curves that can be quickly and conveniently modified.
For reasons of aesthetics and computational expense,
we do not want the entire shape of the curve to be
affected by small local changes.
This rules out interpolating polynomials or splines!
The first solution was proposed by the French
mathematician Paul de Casteljau in 1959 while working
at the French auto maker Citroën.
Bézier curves are named after Pierre Bézier, a French
engineer who used these curves in automobile body
design software for the French auto maker Renault.
The Renault software was described in several
publications by Bézier starting in 1962, and that’s how
his name has become associated with these curves.
Bernstein polynomials
To understand Bézier curves, we start with the
Bernstein polynomials of degree n on the interval [0, 1]:
B_i^(n)(x) = (n choose i) (1 − x)^(n−i) x^i,

where the binomial coefficient is

(n choose i) = n! / (i! (n − i)!).
In computer graphics, the most popular Bézier curve
is cubic (n = 3).
This leads to the polynomials
B_0^(3)(x) = (1 − x)^3,        B_1^(3)(x) = 3(1 − x)^2 x,
B_2^(3)(x) = 3(1 − x) x^2,     B_3^(3)(x) = x^3.
Construction of Bézier curves
Four points c0, c1, c2, c3 (in 2 or 3 dimensions) now
define the cubic Bézier curve:
B(x) = Σ_{i=0}^{3} ci B_i^(3)(x).
The points ci are known as control points.
B(x) starts at c0 going toward c1 and arrives at c3
coming from c2.
B(x) only interpolates c0 and c3; i.e., it does not
generally pass through c1 or c2 — these points only
provide directional information.
In fact, B(x) is tangent to the line connecting c0 and
c1 (and to the line connecting c2 and c3), but it is
“more tangent” the further c1 is away from c0 (and
the further c2 is away from c3).
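The cubic Bézier construction is short in code. A Python sketch using the Bernstein weights above, with 2-D control points as tuples (illustrative, not the course's Matlab demo):

```python
def bezier3(c0, c1, c2, c3, t):
    """Point on the cubic Bezier curve at parameter t in [0, 1]."""
    # Cubic Bernstein weights B_0^(3)(t), ..., B_3^(3)(t)
    w = [(1 - t)**3, 3 * (1 - t)**2 * t, 3 * (1 - t) * t**2, t**3]
    pts = (c0, c1, c2, c3)
    return (sum(wi * p[0] for wi, p in zip(w, pts)),
            sum(wi * p[1] for wi, p in zip(w, pts)))
```

At t = 0 the curve sits at c0 and at t = 1 at c3, while c1 and c2 pull the interior of the curve without being interpolated.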
Example
For a non-interactive demo see
bezierDemo.m
To see the effects on the curve from changing the
control points, try using nodes (0, 0), (1, 0) with
control points at
• (0.25, 0.25) and (0.75, 0.25)
• (1, 1) and (0.5, 0.5)
• (2, 2) and (2, −1)
• (0.5, 0.5) and (2, −1)
A more interactive demo program can be downloaded
from Matlab Central:
Yet another Bézier curve demo
Summary of observations
When interpolating data, there is a tradeoff
between smoothness of the interpolant and a
somewhat subjective property that we might call local
monotonicity or “shape preservation”.
At one extreme, we have the piecewise linear
interpolant: It has hardly any smoothness. It is
continuous, but there are jumps in its first derivative.
On the other hand, it preserves the local monotonicity
of the data. It never overshoots the data, and it
is increasing, decreasing, or constant on the same
intervals as the data.
At the other extreme, we have the full-degree
polynomial interpolant. It is infinitely differentiable.
But it often fails to preserve shape, particularly near
the ends of the interval.
The pchip and spline interpolants are in between
these two extremes.
The spline is smoother than pchip. The spline has
2 continuous derivatives, whereas pchip has only 1.
A discontinuous second derivative implies discontinuous
curvature. The human eye can detect large jumps in
curvature! This might be a factor in choosing spline
over pchip.
On the other hand, pchip is guaranteed to preserve
shape, whereas spline might not.
The best spline is often in the eye of the beholder!