Units1 2 3 4Block1MST2092ndedOU2008

The document outlines the structure and content of the MST209 Mathematical Methods and Models course, detailing various units covering topics such as differential equations, vector algebra, and calculus. It emphasizes the importance of understanding mathematical techniques and their applications in modeling. Additionally, it provides guidance on study strategies and the use of a computer algebra package to aid in learning.

MST209 Mathematical
methods and models

Block 1

Block 1
Unit 1 Getting started
Unit 2 First-order differential equations
Unit 3 Second-order differential equations
Unit 4 Vector algebra

Block 2
Unit 5 Statics
Unit 6 Dynamics
Unit 7 Oscillations
Unit 8 Energy and consolidation

Block 3
Unit 9 Matrices and determinants
Unit 10 Eigenvalues and eigenvectors
Unit 11 Systems of differential equations
Unit 12 Functions of several variables

Block 4
Unit 13 Modelling with non-linear differential equations
Unit 14 Modelling motion in two and three dimensions
Unit 15 Modelling heat transfer
Unit 16 Interpretation of mathematical models

Block 5
Unit 17 Damping, forcing and resonance
Unit 18 Normal modes
Unit 19 Systems of particles
Unit 20 Circular motion

Block 6
Unit 21 Fourier series
Unit 22 Partial differential equations
Unit 23 Scalar and vector fields
Unit 24 Vector calculus

Block 7
Unit 25 Multiple integrals
Unit 26 Numerical methods for differential equations
Unit 27 Rotating bodies and angular momentum
Unit 28 Planetary orbits

The Open University

ISBN 978 0 7492 5281 6
Contents

UNIT 1 Getting started
Introduction
1 Numbers, measurement and accuracy
2 Some standard functions
2.1 Functions, variables and parameters
2.2 Linear functions
2.3 Quadratic functions
2.4 Exponential and logarithm functions
2.5 Combining functions
3 Trigonometric functions
3.1 Introducing the trigonometric functions
3.2 Inverse trigonometric functions
3.3 Some useful trigonometric identities
4 Complex numbers
4.1 The arithmetic of complex numbers
4.2 Polar form
5 Differentiation
5.1 Rates of change
5.2 Differentiating combinations of functions
5.3 Investigating functions
6 Integration
6.1 Reversing differentiation
6.2 Evaluating integrals
6.3 Integration by parts and by substitution
6.4 Definite integrals
7 Computer activities
Outcomes
Solutions to the exercises

UNIT 2 First-order differential equations
Introduction
1 Some basics
1.1 Why differential equations?
1.2 Differential equations and solutions
1.3 Approximations in calculations
2 Direction fields and Euler’s method
2.1 Direction fields
2.2 Euler’s method
2.3 Finding numerical solutions on the computer
3 Finding analytic solutions
3.1 Direct integration
3.2 Separation of variables
4 Solving linear differential equations
4.1 Linear differential equations
4.2 The integrating factor method
5 Finding analytic solutions on the computer
Outcomes
Solutions to the exercises

UNIT 3 Second-order differential equations
Introduction
1 Homogeneous differential equations
1.1 First thoughts
1.2 Method of solution
1.3 The general solution
2 Inhomogeneous differential equations
2.1 General method of solution
2.2 Finding a particular integral by the method of undetermined coefficients
2.3 Exceptional cases
2.4 Combining cases
3 Initial conditions and boundary conditions
3.1 Initial-value problems
3.2 Boundary-value problems
4 The nature of solutions
4.1 Transients
4.2 Solving initial-value problems on the computer
Outcomes
Solutions to the exercises

UNIT 4 Vector algebra
Introduction
1 Describing and representing vectors
1.1 Scalars and vectors
1.2 Vector notation
1.3 Using arrows to represent vectors
1.4 Equality of vectors
1.5 Polar representation of two-dimensional vectors
2 Scaling and adding vectors
2.1 Scaling of a vector
2.2 Addition of vectors
2.3 Algebraic rules for scaling and adding vectors
3 Cartesian components of a vector
3.1 Vectors in two dimensions
3.2 Vectors in three dimensions
4 Products of vectors
4.1 The dot product
4.2 The cross product
Outcomes
Solutions to the exercises

Index

The Open University, Walton Hall, Milton Keynes, MK7 6AA.


First published 2005. Second edition 2008.
Copyright © 2005, 2008 The Open University
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
transmitted or utilised in any form or by any means, electronic, mechanical, photocopying,
recording or otherwise, without written permission from the publisher or a licence from the
Copyright Licensing Agency Ltd. Details of such licences (for reprographic reproduction) may
be obtained from the Copyright Licensing Agency Ltd, Saffron House, 6–10 Kirby Street,
London EC1N 8TS; website http://www.cla.co.uk.
Open University course materials may also be made available in electronic formats for use by
students of the University. All rights, including copyright and related rights and database rights,
in electronic course materials and their contents are owned by or licensed to The Open
University, or otherwise used by The Open University as permitted by applicable law.
In using electronic course materials and their contents you agree that your use will be solely for
the purposes of following an Open University course of study or otherwise as licensed by The
Open University or its assigns.
Except as permitted above you undertake not to copy, store in any medium (including electronic
storage or use in a website), distribute, transmit or retransmit, broadcast, modify or show in
public such electronic materials in whole or in part without the prior written consent of The
Open University or in accordance with the Copyright, Designs and Patents Act 1988.
Edited, designed and typeset by The Open University, using the Open University TeX System.
Printed and bound in the United Kingdom by The Charlesworth Group, Wakefield.

UNIT 1 Getting started

Study guide for Unit 1


This unit reviews material that you will need as a basis for your study of
MST209. You should have covered most of it in previous courses.
The only non-text study for this unit involves the use of the computer algebra
package for the course. All the computer activities appear in Section 7.
These can be studied when you have finished each section.
The time you will require to study this unit will depend on how familiar
you are with the material that it contains. Most sections begin with a short
diagnostic test. If you find that you can answer the test question(s) correctly,
then it is probably safe to progress directly to the next section — you can
always return to re-read a section if the need arises, and you can further
check your knowledge by trying the other exercises. If you choose not to
study a subsection in detail, do check for any new ideas introduced, and
make sure that you look at those. (New terms are set in bold type.)
Because this unit reviews a large amount of material, it is longer than other
units in the course; so, even if the material is not new, you may find that
there is a good deal to cover in one week. If you do not have time to
study the whole unit, make sure that you are familiar with the material
in Sections 2 and 3 and Subsections 5.1, 5.2 and 6.1, as this material is
particularly important. You can always use this unit later to revise other
topics when you find you need them.
The unit is structured so that it is natural to study the material in the order
in which it appears in the text. However, you can if you wish leave study of
Section 4 until last.


Introduction
The main purpose of this unit is to review ideas that you should have met
before, and that you will need as a basis for your study of MST209. The unit
focuses mainly on mathematical techniques, but also covers some examples
involving skills in the application of mathematics. The use of mathematics
to investigate questions arising in non-mathematical contexts is broadly
referred to as ‘mathematical modelling’. In this course, the study of
mathematical techniques will quite often be separated from their use in models,
as this enables you to practise the mathematical methods before you go on
to use them. However, the methods introduced in the course are chosen
because of their wide application in modelling.
The unit contains a number of ‘standard formulae’: for example, for the
solution of a quadratic equation, for expanding sin(a + b) and cos(a + b),
and for the derivatives and integrals of standard functions. It is helpful if
you are able to remember such formulae, but not essential; they are all given
in the course Handbook. You do need to be aware that the formulae exist,
however, and to be able to find them in the Handbook and to apply them.
The computer algebra package for the course can be used to help with much
of the work in this unit, and the unit reviews how this is done. However, the
majority of the unit concentrates on how the mathematical techniques can
be performed ‘by hand’; that is, without use of a computer (or calculator).
This is because, in the long run, a familiarity and confidence with using
common mathematical formulae and techniques by hand will speed up your
study, and because it is not always convenient to resort to a computer.
(Having said that, the computer remains a valuable tool, both for checking
hand calculations and for addressing problems too complicated or
time-consuming to be worked on by hand.)
Margin note: Generally, we advise you to first attempt all exercises without
using a computer, unless they are specifically marked as computer activities.
Section 1 starts by reviewing some basic points about numbers. Sections 2
and 3 cover a number of important standard functions: linear, quadratic,
logarithmic and exponential functions in Section 2, and trigonometric func-
tions in Section 3. All these functions occur frequently. These sections also
remind you of some important mathematical techniques: for example, for
manipulating algebraic expressions, for solving a quadratic equation, and for
manipulating expressions involving functions such as sin and cos. Section 4
covers some basic ideas about complex numbers.
Sections 5 and 6 discuss the fundamental concepts of calculus: differentiation
and integration. It is important that you understand what these ideas are,
and how they arise in models. These sections also provide plenty of exercises
on performing basic calculus operations by hand, as being able to do this
quickly will stand you in good stead for the rest of the course.
Margin note: Note, however, that the computer algebra package for the course
can be used to find most derivatives and many integrals.
The final section shows how the computer algebra package for the course
may be used in many of the techniques in the unit.

1 Numbers, measurement and accuracy


This section covers some fundamental terminology and notation relating to
numbers.


Diagnostic test 1.1


Do Exercise 1.1 on page 9, and check your answers with the solutions given
on page 53. If you are happy with your answers, you may proceed directly
to Section 2.

We distinguish various types of number. The integers are the positive and
negative whole numbers, together with zero:
. . . , −4, −3, −2, −1, 0, 1, 2, 3, 4, . . . .
We denote the set of all integers by Z. These are fine for counting, but
insufficient for measuring lengths, for example. Non-integer quantities can
sometimes be represented exactly, for example as fractions (such as 7/3) or
roots (such as √5). Often in this course, however, decimals will be used. In
decimal notation, we can sometimes express a number exactly, but frequently
we need to approximate. We may write a number correct to a particular
number of decimal places, as in, for example,
7/3 = 2.33 (to two decimal places),
√2 = 1.4142 (to four decimal places).
For large or small numbers we extend decimal notation by ‘taking out’
powers of 10, as in, for example,
3.414 × 10^6 (for 3 414 000),
3.42 × 10^−7 (for 0.000 000 342).
In this course we shall use the convention that a non-zero number is
expressed in scientific notation as
±b × 10^c,
where 1 ≤ b < 10 and c is an integer.
Margin note: Other conventions for scientific notation exist; for example,
choosing b to satisfy 0.1 ≤ b < 1.
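
The convention above can be illustrated with a short Python sketch. This is not part of the course's computer algebra package; the function name to_scientific is hypothetical, and the number is passed as a string so that Python's decimal module can represent it exactly.

```python
from decimal import Decimal

def to_scientific(value: str) -> tuple:
    """Return (sign, b, c) with value = sign * b * 10**c and 1 <= b < 10."""
    d = Decimal(value)
    if d == 0:
        raise ValueError("the convention applies only to non-zero numbers")
    sign = -1 if d < 0 else 1
    c = d.adjusted()                      # power of 10 of the leading digit
    b = abs(d).scaleb(-c).normalize()     # move the decimal point; drop trailing zeros
    return sign, b, c

# The two examples from the text.
print(to_scientific("3414000"))       # b is 3.414 and c is 6
print(to_scientific("0.000000342"))   # b is 3.42 and c is -7
```

Returning the sign separately keeps b in the range 1 ≤ b < 10 for negative numbers too, which matches the ±b × 10^c form.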

If 3.414 is quoted to an accuracy of three decimal places, then
3.414 × 10^6 = 3 414 000 almost certainly is not. It is, however, accurate to
four significant figures. That is, giving a number to ‘so many decimal
places’ indicates its absolute level of accuracy, while to ‘so many significant
figures’ indicates the accuracy relative to the size of the number itself.
Many numbers cannot be represented exactly by a decimal, but any such
number can be approximated arbitrarily closely by a decimal. For example,
taking more and more decimal places, the number π is
3.1, 3.14, 3.142, 3.1416, 3.141 59, 3.141 593, 3.141 592 7, . . . .
Any number that can be approximated arbitrarily closely by a decimal (or
is actually equal to one) is called a real number. We denote the set of all
real numbers by R. Those real numbers that can be expressed exactly as a
fraction (such as 337/1149) are referred to as rational numbers, and those that
are not equal to any such fraction (such as π and √2) are called irrational.
Margin note: Rationals have either a terminating or a recurring decimal
representation; irrationals do not. For example, 3/2 has the terminating
decimal representation 1.5, whereas 4/3 has the recurring decimal
representation 1.333 333 . . . .
Sometimes one wishes to reduce the accuracy to which a number is given.
For example, a calculator may give you the result of some calculation to
an accuracy of ten significant figures, while the assumptions on which the
calculation was based justify quoting the result only to three significant
figures.


The process of reducing the number of significant figures or decimal places
to which a number is expressed is referred to as rounding. To express
3.141 59 to three decimal places, we simply take the number expressed to
three decimal places that is closest to 3.141 59, which is 3.142. It is usual to
make clear that a number is accurate only to some specified level by writing,
for example, ‘3.142 (to three decimal places)’. The process of rounding
is straightforward except in one case. The number 4.15, for example, is
equally close to 4.1 and 4.2. To round 4.15 to one decimal place we shall
use a standard, if arbitrary, convention, in which 5 is always rounded up for
positive numbers. So 4.15 is expressed as 4.2 (to one decimal place). For
negative numbers, 5 is rounded down: for example, −4.15 is expressed as
−4.2 (to one decimal place).
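
The rounding convention just described (a final 5 is rounded away from zero) can be sketched in Python as follows. The helper name round_places is hypothetical. The sketch uses the decimal module's ROUND_HALF_UP mode, which sends ties away from zero; Python's built-in round works on binary floats and sends ties to the nearest even digit (for example, round(0.5) gives 0), so it does not follow this convention.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_places(value: str, places: int) -> Decimal:
    """Round to `places` decimal places, with ties going away from zero."""
    quantum = Decimal(1).scaleb(-places)   # places=1 gives 0.1, places=3 gives 0.001
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_places("3.14159", 3))   # 3.142
print(round_places("4.15", 1))      # 4.2
print(round_places("-4.15", 1))     # -4.2
```

Passing the number as a string matters here: the float 4.15 is not stored exactly in binary, so the tie would already be lost before rounding.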
The choice of how accurately to express some real number depends on
circumstances. If the number represents a measurement, you may know how
accurately the measurement was made. For example, using a tape measure
(see Figure 1.1) to measure a length carefully, one might hope to measure
it to the nearest millimetre, and would express the measurement as, for
example, 1.274 metres (to three decimal places).
Margin note: When considering problems arising from the real world, we shall
generally quote numbers to relatively few significant figures. When
discussing numerical methods, a larger number of significant figures will be
used. The number of significant figures used will depend on the context.

Figure 1.1 A tape measure

If we know that a real number x is 1.274 to three decimal places, then x lies
between 1.2735 and 1.2745; that is, x lies in the interval
[1.2735, 1.2745].
This ‘closed interval’ notation represents the set of real numbers between
1.2735 and 1.2745, inclusive of the endpoints. We can actually say slightly
more, for we know that x is not 1.2745, since that would round up to 1.275.
So x is in fact in the interval
[1.2735, 1.2745) (note the round bracket on the right),
which represents the set of real numbers between 1.2735 and 1.2745, with
1.2735 included, but 1.2745 not included. Other, similar, ‘round bracket’
notations are also used.
[a, b] means all real numbers x between a and b, that is, a ≤ x ≤ b
with a and b both included.
[a, b) means all real numbers x between a and b, that is, a ≤ x < b
with a included and b excluded.
(a, b] means all real numbers x between a and b, that is, a < x ≤ b
with a excluded and b included.
(a, b) means all real numbers x between a and b, that is, a < x < b
with a and b both excluded.
The interval [a, b] is sometimes referred to as a closed interval, the interval
(a, b) as an open interval, and the intervals [a, b) and (a, b] as half-open
intervals. As you will see later, interval notation is useful in expressing
domains of functions and in discussing the accuracy of calculations, and
sometimes it is important to be able to say whether or not the endpoints
are to be included.
It is common scientific practice to quote measurements with ‘error bounds’.


We might write, for example,


y = 32.62 ± 0.08 m
to indicate that the observed measurement of y is 32.62 m, but this can be
expected to be accurate only to within an error bound of 0.08 m; that is,
the actual value of y may lie anywhere between 32.62 − 0.08 = 32.54 m and
32.62 + 0.08 = 32.70 m, inclusive (i.e. anywhere in the interval [32.54, 32.70]).
Such error bounds may not fit exactly with expressing y to a particular
number of decimal places or significant figures (see Exercise 1.1(b)). We can
also express the information that y is within 0.08 m of 32.62 m by writing
−0.08 ≤ y − 32.62 ≤ 0.08,
or more succinctly by writing
|y − 32.62| ≤ 0.08.
Margin note: We refer to |x| (read as ‘mod x’) as the modulus (or magnitude or
absolute value) of x.
Remember, |x| is a non-negative number with the same magnitude as x. So,
for example, |5.72| = 5.72, while |−3.8907| = 3.8907.
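
As an illustration of the two equivalent descriptions above (an interval and a modulus condition), here is a minimal Python sketch. The function names error_interval and within_bound are hypothetical, and exact decimal arithmetic is assumed so that the endpoints print as written in the text.

```python
from decimal import Decimal

def error_interval(value: str, bound: str):
    """Return the endpoints of the closed interval [value - bound, value + bound]."""
    centre, err = Decimal(value), Decimal(bound)
    return centre - err, centre + err

def within_bound(y: str, value: str, bound: str) -> bool:
    """Test the modulus condition |y - value| <= bound."""
    return abs(Decimal(y) - Decimal(value)) <= Decimal(bound)

low, high = error_interval("32.62", "0.08")
print(low, high)                               # 32.54 32.70
print(within_bound("32.70", "32.62", "0.08"))  # True: the endpoints are included
print(within_bound("32.71", "32.62", "0.08"))  # False
```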
*Exercise 1.1
Margin note: In this course we use the convention of placing a * against
not-to-be-missed exercises.
(a) Express the following numbers in scientific notation.
(i) 64 823.5 (ii) 0.000 073
(b) Suppose that we know that a measured value y is
y = 127.683 ± 0.006 m.
(i) Give an interval in which the number y must lie.
(ii) To how many significant figures can we give y with certainty?
(c) Suppose that the number x satisfies the condition
|x − 2.763| < 5 × 10^−4.
What is the smallest interval in which this condition tells you that x
must lie? To how many decimal places can we give x with certainty?

2 Some standard functions


Functions play a central role in mathematics. After a brief look at some
general ideas about functions (in Subsection 2.1), this section reviews some
key classes of functions (linear, quadratic and exponential). These form part
of a ‘library’ of standard functions, central both to building models and
to solving more complicated mathematical problems (such as differential
equations). The section also looks briefly at how such functions may be
combined.

2.1 Functions, variables and parameters


Diagnostic test 2.1
If you are familiar with each of the following terms (and the distinction
between them), you may proceed directly to Subsection 2.2. If not, it is
advisable to read this subsection.
(a) Continuous model, Discrete model.
(b) Variable, Parameter.
(c) Domain, Image set.


Consider the following example. At midday on 1 June, a reservoir contains
2 × 10^6 cubic metres of water. Each day, 25 000 cubic metres of water are
removed from the reservoir, while only 3400 cubic metres flow into the
reservoir, and it is expected that these conditions will continue for 50 days. If
there are no other factors affecting the quantity of water in the reservoir,
then there is a daily net reduction of 21 600 cubic metres of water. Assuming
that the rate of reduction is exactly the same at all times, this means
that the quantity of water in the reservoir reduces by 21 600/(24 × 60) = 15
cubic metres each minute. Suppose that, at a time t minutes after midday
on 1 June, the reservoir contains V cubic metres of water. Then we might
use the equation
V = 2 × 10^6 − 15t (t ∈ R, 0 ≤ t ≤ 72 000) (2.1)
to model the quantity of water in the reservoir for the 50 days (= 72 000
minutes) after midday on 1 June. The letters V and t represent measurable
quantities. We call V and t variables. Here, V depends on t, and we call
V the dependent variable and t the independent variable.
A different way to approach the same problem is to form a recurrence
system as follows. Denote the volume of water in the reservoir at the end
of minute i as V_i. The initial volume of water is 2 × 10^6 cubic metres, and
we denote this as V_0. Then (using the above arithmetic) we see that
V_0 = 2 × 10^6, V_{i+1} = V_i − 15 (i = 0, 1, 2, . . . , 72 000). (2.2)
You may have met such recurrence systems before, and recognize that (2.1)
is the closed form solution.
Equations (2.2) relate two variables: volume (V_i) and time (i). However, in
(2.2) the independent variable i is constrained to take only integer values
(between 0 and 72 000), while in (2.1) the independent variable t may take
any real value (between 0 and 72 000). We call (2.1) a continuous model,
while (2.2) is a discrete model.
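
The relationship between the two models can be checked numerically. The following Python sketch is only an illustration (the function names closed_form and recurrence are hypothetical): it steps through the recurrence (2.2) and compares the result with formula (2.1) at whole-minute times.

```python
def closed_form(t: float) -> float:
    """Continuous model (2.1): volume in cubic metres t minutes after midday."""
    return 2e6 - 15 * t

def recurrence(n: int) -> int:
    """Discrete model (2.2): start from 2 000 000 and subtract 15, n times over."""
    v = 2_000_000
    for _ in range(n):
        v -= 15
    return v

# The two models agree whenever t is a whole number of minutes.
for i in (0, 1, 60, 72_000):
    assert recurrence(i) == closed_form(i)

print(recurrence(72_000))   # 920000, the volume after 50 days
print(closed_form(0.5))     # the continuous model also accepts non-integer t
```

The loop makes the difference in domain concrete: recurrence is defined only for integer step counts, whereas closed_form can be evaluated at any real t between 0 and 72 000.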
A function is a process that can be applied to each of a specified set of input
values to produce an output value. One example is ‘given t between 0 and
72 000, calculate 2 × 10^6 − 15t’. If we denote this function by f, then we can
write Equation (2.1) as V = f (t). The domain of a function is the set of
permitted input values. The function f associated with Equation (2.1) has
as domain the set of real numbers t with 0 ≤ t ≤ 72 000; that is, the interval
[0, 72 000]. If we were to associate Equations (2.2) with a function g, say,
then g would have as domain the set consisting of those integers i between
0 and 72 000. The image set of a function is the set of output values. The
function f associated with Equation (2.1) has as image set the set of values
of 2 × 10^6 − 15t for 0 ≤ t ≤ 72 000; that is, the interval [920 000, 2 000 000].
In this course we shall usually be concerned with continuous models, and
with functions whose domain is R, or a part of R such as an interval. We
may specify such a function, say f , by writing it in a form such as
f(t) = 2 × 10^6 − 15t (0 ≤ t ≤ 72 000),
where the expression 2 × 10^6 − 15t on the right-hand side gives the rule or
formula that specifies the function, and the bracketed conditions indicate
its domain.
Margin note: Since in MST209 we are almost always concerned with continuous
models, we shall usually omit ‘t ∈ R’ from the bracketed conditions.

To define a function, a process must produce a unique output value for each
allowed input.


So, for example,
f(x) = ±√x (x ≥ 0)
does not define a function f, since ±√x does not specify a unique value
(given x).
Margin note: We write ±√x to denote the positive and negative square roots
of x because, by convention, √x denotes only the positive square root.
Now consider a generalization of the situation described by Equation (2.1).
Suppose that the reservoir initially contains V_0 cubic metres of water and
that the net loss per minute is L cubic metres. Then we have
V = V_0 − Lt (0 ≤ t ≤ 72 000). (2.3)
We now have an equation involving several letters, with differing roles.
Assuming that we want to use (2.3) to describe how V will change with time t,
we continue to call t the independent variable and V the dependent
variable. The quantities V_0 and L do not depend on t. They may, however, take
different values in different uses of (2.3): in an application to a different
reservoir, for example. We call V_0 and L parameters. Whatever the values
of the parameters V_0 and L, Equation (2.3) gives a similar form of
relationship between V and t: for example, the independent variable t appears in a
similar way in any of the expressions 12 000 − 5t, 300 − 6.6t and 14 − 2t.
In this course we shall often be concerned with relationships between vari-
ables, and it will be convenient to use language and notation that blur the
abstract idea of a function. For example, we may write V = V (t), rather
than introducing a separate symbol (such as f ) for the function relating V
to t, and say ‘V is a function of t’. Then, for example, V (3) denotes the
value of V when t = 3.
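
The distinction between parameters and variables in Equation (2.3) can be sketched in Python as follows. The names make_volume_model, v0 and loss are hypothetical: the parameters are fixed once, when a particular model is built, and the resulting function then depends only on the independent variable t. For brevity the sketch does not enforce the domain 0 ≤ t ≤ 72 000.

```python
from typing import Callable

def make_volume_model(v0: float, loss: float) -> Callable[[float], float]:
    """Fix the parameters (initial volume, net loss per minute) of Equation (2.3)."""
    def volume(t: float) -> float:
        # t is the independent variable; v0 and loss do not change with t.
        return v0 - loss * t
    return volume

V = make_volume_model(2e6, 15)       # the reservoir of Equation (2.1)
print(V(3))                          # the value of V when t = 3

other = make_volume_model(12_000, 5)  # same form of relationship, other parameters
print(other(3))
```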

Exercise 2.1
Suppose that, at midday on 1 May, a reservoir is at 60% of its capacity. The
forecast for the next 50 days suggests that 30 000 m^3 of water will be removed
each day for consumption, while only 15 600 m^3 will be added to the reservoir
each day, so that on average the quantity of water in the reservoir reduces by
(30 000 − 15 600)/(24 × 60) = 10 cubic metres each minute. Crisis measures
will be introduced when the reservoir falls to 20% of its capacity.
Let V (measured in m^3) be the volume of water t minutes after midday on
1 May, and let C (measured in m^3) be the reservoir’s overall capacity. The
volume of water can be modelled using
V = V_0 − 10t (0 ≤ t ≤ 72 000),
with a suitable choice of V_0.
Determine a suitable expression for V_0, and hence use the model to obtain
an expression in terms of C for the time at which crisis measures will be
needed (according to this model).
Margin note: The notation m^3 is shorthand for ‘cubic metres’.

2.2 Linear functions


Diagnostic test 2.2
Read the description of the problem below, and do Exercise 2.2. Then do
Exercise 2.3 on page 13, and check your answers with the solutions given on
page 53. If you are happy with your answers, you may proceed directly to
Subsection 2.3.


A linear function relating y to x is one of the general form
y = mx + c,
where m and c are constants. The graph of such a function is a straight line
(hence the term ‘linear’), as in Figure 2.1. The constant c represents the
value of y at the point where the line crosses the y-axis. The gradient (or
slope) of the graph is the same everywhere, and is equal to m. That is, for
any two points (x_1, y_1) and (x_2, y_2) on the graph, we have
(y_2 − y_1)/(x_2 − x_1) = m.
Margin note: In this course, if no domain is specified for a function,
assume it to be R.

Figure 2.1 The graph of y = mx + c (a straight line through (x_1, y_1) and
(x_2, y_2), crossing the y-axis at c and making an angle θ with the x-axis,
with gradient m = tan θ = (y_2 − y_1)/(x_2 − x_1))

One situation where linear functions arise is when an object is moving in a
straight line with constant speed. Let us look at an example.
A boat, suspected of carrying contraband, passed a detector buoy 2 kilometres
from port at 11.00 pm, and is moving at a steady 5 metres per second on
a straight course directly away from port (along the line AZ in Figure 2.2).
A coastguard cutter leaves the port in pursuit at midnight, travelling at
7 metres per second. When will it catch the boat?

Figure 2.2 (the port is at A and the buoy at B, 2000 m from A, on the line AZ;
the cutter, at distance X from A, travels at 7 m s^−1; the boat, at distance Y
from A, travels at 5 m s^−1)
Margin note: The notation m s^−1 is shorthand for ‘metres per second’.

Suppose that we choose to measure time in seconds, starting from midnight,
and distance in metres, measured from A. Let X metres be the distance
of the coastguard cutter from A at time t seconds after midnight, and let
Y metres be the distance of the boat from A at the same time. We can
readily obtain an expression for X in terms of t, since X = 0 when t = 0,
and the cutter travels at a constant speed of 7 metres per second: we have
X = 7t.
We also want an expression giving Y in terms of t. We know that (at
point B) Y = 2000 at 11.00 pm, which is 1 hour, or 60^2 seconds, before
midnight and so corresponds to t = −3600. Also, as the boat is moving at
a constant speed of 5 metres per second, Y will be related to t by a linear
function of the form
Y = 5t + c.
Margin note: In this course, we shall usually use SI units. The SI units of
distance and time are metres and seconds (denoted by m and s, respectively).
Commonly used SI units are given in the Handbook.

*Exercise 2.2
(a) Find the value of c such that Y = 5t + c satisfies the condition Y = 2000
at t = −3600.
(b) (i) When will the coastguard cutter catch the boat?
(ii) In the direction that the boat is travelling, the limit of territorial
waters is 100 kilometres from A. Will the cutter catch the boat within
territorial waters?

Simultaneous linear equations
Suppose that we know the paths of two aircraft, each of which is travelling
in a straight line, and wish to know where these paths cross. This is one
of a wide variety of situations where we need to find the intersection of two
straight-line graphs, which is equivalent to the algebraic problem of solving
two simultaneous linear equations. Consider, for example, the following
linear equations (see Figure 2.3):
4x + 3y = −1, (2.4)
3x + y = 3. (2.5)
Margin note: These equations are linear, since they can be rewritten in
the form y = mx + c.

Figure 2.3 (the straight-line graphs of 4x + 3y = −1 and 3x + y = 3)
There are many ways of solving these equations: one quick method is
Gaussian elimination. Let us see how this works for (2.4) and (2.5).
Margin note: This method was known to Chinese mathematicians in about
100 BC, but not to European ones until it was discovered by the German
mathematician Carl Friedrich Gauss (1777–1855). (‘Gauss’ is pronounced as
‘gowce’.)
The aim of the method is to subtract a multiple of the first equation from
the second in order to eliminate the x terms. First, we multiply (2.4) by 3/4,
to obtain an equation with the same coefficient of x as in (2.5):
(3/4) × 4x + (3/4) × 3y = (3/4)(−1),
which simplifies to
3x + (9/4)y = −3/4. (2.6)
Now we subtract (2.6) from (2.5), to eliminate x, and obtain
y − (9/4)y = 3 − (−3/4) = 3 + 3/4 = 15/4,
that is, −(5/4)y = 15/4, so y = −3.
To find x, substitute this value of y into (2.4), to obtain
4x + 3(−3) = −1,
which gives 4x = −1 + 9 = 8, and hence x = 2.
So the solution of Equations (2.4) and (2.5) is x = 2, y = −3.
Margin note: You may like to check these values, by substitution
into (2.4) and (2.5).

*Exercise 2.3
Use Gaussian elimination to solve the equations below for u and v:
2u − 5v = 19,
3u + 4v = −29.
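
The worked elimination of Equations (2.4) and (2.5) earlier in this subsection can be reproduced with a short Python sketch. It is a minimal illustration only: the function name eliminate_2x2 is hypothetical, exact fractions are used so that the multiplier 3/4 appears as in the text, and the code assumes that the first coefficient is non-zero and that the two lines are not parallel.

```python
from fractions import Fraction

def eliminate_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by eliminating x.

    Assumes a1 != 0 and that the system has exactly one solution.
    """
    a1, b1, c1, a2, b2, c2 = map(Fraction, (a1, b1, c1, a2, b2, c2))
    m = a2 / a1                        # multiplier: 3/4 for (2.4) and (2.5)
    # Subtract m times the first equation from the second; the x terms cancel.
    y = (c2 - m * c1) / (b2 - m * b1)
    x = (c1 - b1 * y) / a1             # substitute y back into the first equation
    return x, y

x, y = eliminate_2x2(4, 3, -1, 3, 1, 3)
print(x, y)                                   # 2 -3
print(4 * x + 3 * y == -1, 3 * x + y == 3)    # check by substitution: True True
```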

2.3 Quadratic functions


Diagnostic test 2.3
Do Exercise 2.4 on page 15, and check your answers with the solutions on
page 53. If you are happy with your answers, you may proceed directly to
Subsection 2.4.


A quadratic function relating y to x is a function of the general form
\[ y = ax^2 + bx + c, \]
where a, b and c are constants and $a \neq 0$. The graph of such a quadratic
function is a parabola, and may open ‘up’ or ‘down’, depending on the sign
of a, as illustrated in Figure 2.4.

Figure 2.4  Graphs of (a) $y = 2x^2 + 3x - 4$ (where $a > 0$) and (b) $y = -x^2 + 4x + 15$ (where $a < 0$)

Where the graph opens ‘up’ ($a > 0$), values of y may become arbitrarily
large, but there is a smallest (minimum) value that y can take. If $a < 0$, then
the graph opens ‘down’, and negative values of y may become arbitrarily
large in magnitude, but there is a largest (maximum) value that y can take.
For example, suppose that a ball is thrown directly upwards at time $t = 0$,
with velocity $10\ \mathrm{m\,s^{-1}}$, and from a height of 2 metres (see Figure 2.5).

Figure 2.5  A ball thrown upwards from a height of 2 m, labelled with s, $v_0 = 10\ \mathrm{m\,s^{-1}}$ and $g = 9.81\ \mathrm{m\,s^{-2}}$

The ball moves under the influence of gravity, and the position s after t seconds is given by
\[ s = \tfrac{1}{2}(-9.81)t^2 + 10t + 2. \]
(This equation will be derived in Unit 6.)
Suppose that we want to find when the ball will hit the ground; that is, the
value of t when $s = 0$. Then we need to solve the quadratic equation
\[ \tfrac{1}{2}(-9.81)t^2 + 10t + 2 = 0. \tag{2.7} \]
You will have met before the formula for the solution of a general quadratic
equation, given below.

Solution of a quadratic equation

The quadratic equation
\[ ax^2 + bx + c = 0, \]
where a, b and c are constants and $a \neq 0$, can be solved for x using the
formula
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \tag{2.8} \]

The solutions of a quadratic equation are often referred to as its roots. (The term ‘root’ is also used for a solution to other sorts of equation, as you will see in Section 4.)
Notice that the sum of the roots is $-b/a$, which is a useful check.
Using the formula to solve (2.7) for t gives
\[ t = \frac{-10 \pm \sqrt{100 + 39.24}}{-9.81} = 2.22 \text{ or } -0.18 \text{ (to two decimal places).} \]


Here the solution t = −0.18 refers to a time before the ball is thrown, so can
be discarded. The ball hits the ground about 2.2 seconds after it is thrown.
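
The calculation above can be reproduced numerically. The following is a minimal sketch, assuming a hypothetical helper quadratic_roots that applies formula (2.8) and gives up if the quantity under the square root is negative. It also applies the sum-of-roots check mentioned earlier.

```python
import math


def quadratic_roots(a, b, c):
    """Return the real roots of a*x**2 + b*x + c = 0 using formula (2.8).

    Hypothetical helper: requires a != 0 and b**2 - 4*a*c >= 0.
    """
    if a == 0:
        raise ValueError("not a quadratic: a must be non-zero")
    under_root = b * b - 4 * a * c
    if under_root < 0:
        raise ValueError("no real roots")
    root = math.sqrt(under_root)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


# Equation (2.7): (1/2)(-9.81) t^2 + 10 t + 2 = 0
a, b, c = 0.5 * -9.81, 10.0, 2.0
t1, t2 = quadratic_roots(a, b, c)
print(round(t1, 2), round(t2, 2))      # -0.18 2.22
# Check: the sum of the roots should equal -b/a
print(math.isclose(t1 + t2, -b / a))   # True
```

Only the positive root is physically meaningful here, as explained in the text.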
In this example, the quadratic equation has two solutions. Look at the
graphs in Figure 2.4, and imagine moving them up and down (which corresponds to varying the value of c). The x-axis may meet a quadratic graph
in two places, or not at all, or it may happen just to touch the minimum
(or maximum) point of the graph (see Figure 2.6). In formula (2.8), we
need to find the square root $\sqrt{b^2 - 4ac}$. If $b^2 - 4ac > 0$, then we find a
real value, greater than 0, for this square root, and there are two different
solutions to the quadratic equation. If $b^2 - 4ac = 0$, then there is just one
solution (though, for reasons given below, this one solution is sometimes
considered as two equal solutions). If $b^2 - 4ac < 0$, then there are no (real)
solutions. (Complex numbers, which are discussed in Section 4, enable us to express square roots of negative numbers and hence to produce (complex) solutions to a quadratic equation when $b^2 - 4ac < 0$.) The quantity $b^2 - 4ac$ is often referred to as the discriminant of
the quadratic equation because it discriminates between the cases shown in
Figure 2.6.

Figure 2.6  Quadratic graphs for the three cases $b^2 - 4ac > 0$, $b^2 - 4ac = 0$ and $b^2 - 4ac < 0$

*Exercise 2.4
Solve for x the following equations.
(a) $2x^2 + 7x - 4 = 0$   (b) $x^2 + x - 6 = 0$

Sometimes, you may find that you need to solve a quadratic equation where
the coefficients are letters rather than numbers.

Exercise 2.5
Show that the solutions (for x) of
\[ mx^2 + 2kx + mw^2 = 0 \quad (m \neq 0) \]
are $x = -K \pm \sqrt{K^2 - w^2}$, where $K = k/m$.

The solutions of a quadratic equation correspond to a factorization of the
corresponding quadratic function. For example, $x^2 + x - 6 = 0$ has solutions
$x = 2$ and $x = -3$, and we have the factorization
\[ x^2 + x - 6 = (x - 2)(x + 3). \]
(You may like to check this by multiplying out $(x - 2) \times (x + 3)$.)
With experience, you may find that such factorizations provide a convenient
way of solving some quadratic equations, but the formula provides a reliable
method that can be used in all cases.


One point of caution: if you want to factorize a quadratic function, you can
do this by first solving the equation (e.g. by using formula (2.8)), but you
need to be careful to match the coefficient of $x^2$ in the original quadratic
function with that in the factorization. For example, $2x^2 + 7x - 4 = 0$ has
solutions $x = \frac{1}{2}$ and $x = -4$, but to factorize $2x^2 + 7x - 4$ we write
\[ 2x^2 + 7x - 4 = 2(x - \tfrac{1}{2})(x + 4) = (2x - 1)(x + 4), \]
where the 2 is needed to ensure that the coefficients of $x^2$ are the same on
each side.
There are some particular factorizations that it is helpful to recognize. Two
useful ones are
\[ (x + A)^2 = x^2 + 2Ax + A^2 \quad\text{and}\quad (x - A)^2 = x^2 - 2Ax + A^2. \]
(Note that if we allow A to be positive or negative, then both cases can be written as $(x + A)^2 = x^2 + 2Ax + A^2$.)
So, for example, $x^2 - 6x + 9 = (x - 3)^2$. We refer to such quadratics as
perfect squares. Perfect squares correspond to quadratic equations in
which the discriminant is $b^2 - 4ac = 0$. (You may like to check this for
yourself.) Thus equations in which the discriminant is zero can be written in
the form $(x + A)(x + A) = 0$ or $(x - A)(x - A) = 0$, and these factorizations
lead us sometimes to consider such equations as having two equal roots
$x = -A$ and $x = -A$, or $x = A$ and $x = A$, rather than just one root.
Another useful factorization is
\[ (x + A)(x - A) = x^2 - A^2. \]
So, for example, $x^2 - 16 = (x + 4)(x - 4)$. We refer to such a quadratic as
a difference of two squares.
One needs to be particularly careful when solving a quadratic equation that
involves the same letters as appear in the standard formula (2.8), but in a
different way.

Example 2.1
Solve for x the equation
\[ abx^2 - (a + b)x + 1 = 0, \]
where a and b are non-zero constants.

Solution
You need to keep a cool head here, because the letters used in formula (2.8)
are used in a different way in the given equation. In (2.8), we need
ab for a, −(a + b) for b, 1 for c.
So we obtain the solutions

\[ x = \frac{a + b \pm \sqrt{(a + b)^2 - 4ab}}{2ab}. \]
This expression gives the solutions, but it turns out to be possible to express
them in a much simpler form. We have $(a + b)^2 = a^2 + 2ab + b^2$, so the
discriminant can be written as
\[ (a + b)^2 - 4ab = (a^2 + 2ab + b^2) - 4ab = a^2 - 2ab + b^2 = (a - b)^2. \]
(You will find it advantageous in this course to be able to perform manipulations like this by hand. If you find them difficult, however, you may like to make use of the computer algebra package for the course.)
Therefore
\[ \frac{a + b \pm \sqrt{(a + b)^2 - 4ab}}{2ab} = \frac{a + b \pm \sqrt{(a - b)^2}}{2ab} = \frac{a + b \pm (a - b)}{2ab}. \]
Now $(a + b) + (a - b) = 2a$ and $(a + b) - (a - b) = 2b$, so the two solutions
are $1/a$ and $1/b$.
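
As a cross-check on Example 2.1 (an addition to the solution above, not part of it), the left-hand side of the equation can also be factorized directly, which confirms the two solutions without using formula (2.8):

```latex
\[ abx^2 - (a + b)x + 1 = (ax - 1)(bx - 1), \]
```

so the equation holds exactly when $ax = 1$ or $bx = 1$; that is, when $x = 1/a$ or $x = 1/b$ (recall that a and b are non-zero).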


2.4 Exponential and logarithm functions

Diagnostic test 2.4


Do Exercises 2.6 and 2.7 below, and check your answers with the solutions
starting on page 53. If you are happy with your answers, you may proceed
directly to Subsection 2.5.

A function relating y to x of the form $y = ba^x$ (where a and b are constants,
with $a > 0$ and $a \neq 1$) is referred to as an exponential function. Use
of such a function with domain $\mathbb{R}$ requires us to assign a meaning to $a^x$
for non-integer values of x. We start by revising the properties of integer
powers.

Powers
You will be familiar with the meaning of a positive integer power of a number, such as $10^5 = 10 \times 10 \times 10 \times 10 \times 10$. In general, $a^n$ means the product
of n copies of a (for any real number a and any positive integer n). In
particular, $a^1 = a$. (In $a^n$, a is called the base, and n may be referred to as the power, the index or the exponent.)
For positive integers m and n, we have the property
\[ a^n \times a^m = a^{n+m}, \tag{2.9} \]
since each side is the product of $n + m$ copies of a. For example,
$10^2 \times 10^5 = 10^7$.
Consequently, if we multiply m copies of $a^n$, we obtain
\[ \underbrace{a^n \times a^n \times \cdots \times a^n}_{m \text{ times}} = a^{\overbrace{n + n + \cdots + n}^{m \text{ times}}}; \]
that is,
\[ (a^n)^m = a^{n \times m}. \tag{2.10} \]
For example, $(10^2)^3 = 10^6$.
The definition of $a^n$ can be extended to cases where n is not a positive
integer by assuming that (2.9) and (2.10) hold more generally. For $a \neq 0$,
this assumption leads to the definition of $a^0$ as 1, and $a^{-n}$ as $1/a^n$; and, for
$a > 0$, to the definition of $a^{1/n}$ as the nth root of a, and $a^{m/n}$ as the nth
root of $a^m$. (Recall that the nth root of a number a is a number b such that $b^n = a$, and we write $b = \sqrt[n]{a}$.) So, for example:
\[ 10^{-4} = 1/10^4 = 0.0001; \]
\[ 27^{1/3} = \sqrt[3]{27} = 3 \quad (\text{since } 3^3 = 27); \]
\[ 4^{-3/2} = \frac{1}{4^{3/2}} = \frac{1}{\sqrt{4^3}} = \frac{1}{\sqrt{64}} = \frac{1}{8}. \]
It is conventional to take fractional powers to mean positive roots (where
there is a choice). So, for example, $9^{1/2} = \sqrt{9}$ means 3 rather than −3. (The negative square root of 5, for example, would be written as $-5^{1/2}$ or $-\sqrt{5}$.)
In general, roots of negative numbers do not necessarily exist (at least,
not as real numbers); but where they do, we use the same notation. So,
for example, $(-27)^{1/3} = \sqrt[3]{-27} = -3$ (since $(-3)^3 = -27$, and there is no
positive cube root in this case).


We can define $a^x$ for $a > 0$ and for irrational values of x by means of a
limiting process that need not concern us here. (The value of $a^x$ for any
particular $a > 0$ and x can be found using your calculator. For $a = 0$, $0^x$ is taken to equal 0; for $a < 0$, the definition of $a^x$ involves complex numbers and need not concern us here.) This definition
of $a^x$ leads to the following properties of powers that hold for all real numbers
$a > 0$ and all real exponents x and y:
\[ a^x > 0, \]
\[ a^{-x} = 1/a^x, \]
\[ a^{x+y} = a^x \times a^y, \]
\[ (a^x)^y = a^{x \times y}, \]
\[ a^x/a^y = a^{x-y}. \]
(We have not proved these properties, but we shall make use of them as necessary.)
Finally, note that for powers of a product or a quotient, we have
\[ (ab)^x = a^x b^x \quad\text{and}\quad (a/b)^x = a^x/b^x. \]
For example, $15^7 = 3^7 5^7$ and $(5/3)^4 = 5^4/3^4$.

*Exercise 2.6
Use the properties of indices to simplify each of the following.
(a) $a^3 a^5$   (b) $a^3/a^5$   (c) $(a^3)^5$   (d) $(2^{-1})^4 \times 4^3$
(e) $8^{-1/3}$   (f) $16^{3/4}$   (g) $\left(\frac{4}{9}\right)^{3/2}$   (h) $(16x^4)^{1/2}$

The exponential and logarithm functions


One function that is particularly important is $e^x$, where e is the number
2.718 28 . . . . This function arises, for example, in the solution of differential
equations. This is a consequence of its property that it is unchanged by
differentiation. (Indeed, the only functions f that are unchanged by differentiation are functions of the form $f(x) = Ae^x$, where A is a constant.) The
function $e^x$ is often referred to as the exponential function, and may also
be written exp x. Note that $e^x$ is always positive: $e^x > 0$ for all real x. (Differential equations occur throughout the course, beginning in Unit 2. Differentiation is discussed in Section 5.)
The inverse function of $e^x$ is the natural logarithm function, written ln x. (Essentially, the inverse function of a function f is one that reverses the effect of f.)
Now ln x is defined only for $x > 0$ (since the domain of the inverse function
of $e^x$ is the same as the image set of $e^x$). Since these functions are inverse
to each other, we have:
\[ \ln(\exp x) = x \quad \text{for all real } x; \]
\[ \exp(\ln x) = x \quad \text{for all real } x > 0. \]
Another way of looking at this relationship is: if $e^y = x$, then $y = \ln x$. (This can be taken as a definition of ln x.) In
particular, since $e^0 = 1$, we have $\ln 1 = 0$.

The properties of powers given above lead to corresponding results about
the logarithm function:
\[ \ln(1/u) = -\ln u, \]
\[ \ln(u \times v) = \ln u + \ln v, \]
\[ \ln(u^v) = v \times \ln u, \]
\[ \ln(u/v) = \ln u - \ln v. \]
(Another logarithm function that will be used occasionally in the course is $\log_{10}$, where if $10^y = x$, then $y = \log_{10} x$. The results given here for ln also hold for $\log_{10}$.)

Before the advent of calculators and computers, these properties were com-
monly used in calculating powers, reciprocals and products ‘using loga-
rithms’. Such applications are no longer needed, but these properties of
logarithms are still important in the manipulation of expressions involving
exponentials and logarithms.


*Exercise 2.7
Simplify each of the following (where a > 0, b > 0 and x > 0).
(a) $\ln 7 + \ln 4 - \ln 14$   (b) $\ln a + 2\ln b - \ln(a^2 b)$
(c) $e^x \times (e^y)^2 \div e^{2x}$   (d) $\ln(e^x \times e^y)$   (e) $e^{2\ln x}$   (f) $e^{-2\ln x}$
(g) $\exp(2\ln x + \ln(x + 1))$

Exercise 2.8
By taking logs of both sides of the equation
\[ a^x = e^{kx}, \]
where a > 0, show that we can find a value for k so that this equation holds
for all values of x.

In light of the equivalence established in Exercise 2.8, it is common practice
to use functions of the form $e^{kx}$, for suitable values of k, rather than exponential functions of the form $a^x$ for values of a (> 0) other than e. (This is
more convenient when doing calculus, for example.) For $k \neq 0$, the graph
of this standard exponential function takes one of the forms shown in Figure 2.7. For $k > 0$, the larger the value of k, the faster the value of $e^{kx}$
increases as x increases (and so the graph climbs more steeply). Similarly,
for $k < 0$, the larger the magnitude of k, the faster $e^{kx}$ decreases.

Figure 2.7  Graphs of $y = e^{kx}$ for $k > 0$ and for $k < 0$; each graph crosses the y-axis at 1

Log plots
Suppose that you have data on some quantity y at various times t, and you
believe that y is an exponential function of t. To test such a hypothesis, you
can plot ln y against t. If such a plot gives a straight line, then this confirms
that y is of the form $Ae^{kt}$. (Such plots are often referred to as log–linear plots.) For example, suppose that a plot of ln y against
t suggests the linear relationship
\[ \ln y = 1.47t + 3.82. \]
Then, taking exponentials of each side, we have
\[ y = \exp(1.47t + 3.82) = \exp(1.47t) \times \exp(3.82) = 45.6e^{1.47t}. \]
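
The log–linear procedure described above can be sketched in code. This is an illustration under stated assumptions rather than a prescribed method: the data are hypothetical, generated from $y = 45.6e^{1.47t}$ so that the answer is known in advance, and the straight line is fitted by ordinary least squares using a hypothetical helper fit_line.

```python
import math


def fit_line(xs, ys):
    """Least-squares straight line: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data, generated from y = 45.6 * exp(1.47 t) for illustration
ts = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [45.6 * math.exp(1.47 * t) for t in ts]

# Fit ln y against t; the slope estimates k and the intercept estimates ln A
k, ln_A = fit_line(ts, [math.log(y) for y in ys])
A = math.exp(ln_A)
print(round(k, 2), round(A, 1))  # 1.47 45.6
```

With real measurements the points would not lie exactly on a line, and the quality of the fit would indicate how well the exponential model describes the data.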
The next exercise shows how we can test data (on variables x and y) for a
different form of relationship by plotting ln y against ln x.


Exercise 2.9
(a) Suppose that
\[ \ln y = 2.83 \ln x + 0.37. \]
(Plots of ln y against ln x are called log–log plots.)
Using the properties of exp and ln, express y as a function of x.
(b) In general, suppose that
ln y = a ln x + b,
where a and b are constants. What form of relationship is there between
y and x?

From Exercise 2.9(b), we see that if a plot of ln y against ln x is linear, then
y is a power function of x; that is,
\[ y = cx^a, \]
where c and a are constants. Note that this is not an exponential function. (In an exponential function, x appears as the exponent, as in for example $2^x$, $3^x$, $2.5^x$, each of which can be expressed as $e^{kx}$ for a suitable value of k.)
Power functions of x include
\[ x^2, \quad x^3, \quad x^{5/2} \ (= x^2\sqrt{x}). \]
If a is a positive integer, then the power function $f(x) = x^a$ is defined for
all real x; but for other values of a, this power function has domain $x > 0$.

2.5 Combining functions


Diagnostic test 2.5
(a) If $f(x) = \ln x$ and $g(x) = 1/(x - 1)^2$, and $x > 1$, find the following.
(i) $f(g(x))$   (ii) $g(f(x))$
(b) Express $h(x) = (1 + e^x)^2$ as a composition of basic functions.
Now check your solutions against those given below. If you are happy with
the answers, you may proceed directly to Section 3.
Solution
(a) (i) $f(g(x)) = \ln(1/(x - 1)^2) = -2\ln(x - 1)$
(ii) $g(f(x)) = 1/(\ln x - 1)^2$ (This has no easy simplification.)
(b) We can obtain h(x) in three steps.
Step 1 Start with $e^x$.
Step 2 Add 1 to the result of Step 1.
Step 3 Apply the square function to the result of Step 2.
Then $h(x) = p(q(r(x)))$, where
\[ p(x) = x^2, \quad q(x) = x + 1, \quad r(x) = e^x. \]

Consider the motion of a stone that is thrown vertically upwards with a
velocity of $10\ \mathrm{m\,s^{-1}}$ from a point 2 m above the ground (see Figure 2.5 on
page 14). We shall see in Unit 6 that the height s of the stone in terms of
time t may be modelled by the equation
\[ s = -\tfrac{1}{2}gt^2 + 10t + 2, \tag{2.11} \]


and that its velocity is given by
\[ v = 10 - gt, \tag{2.12} \]
where $g = 9.81\ \mathrm{m\,s^{-2}}$ is the acceleration due to gravity.
Now suppose that we wish to know the velocity as a function of height
($v = v(s)$). By manipulating the two equations above, eliminating t, you
can see that
\[ v^2 = 4g + 100 - 2gs, \]
of which one solution is
\[ v = \sqrt{4g + 100 - 2gs}. \tag{2.13} \]
(Here we take the positive square root, which corresponds to the upward motion of the stone. Unit 6 will deal with both cases.)
However, there is now potential for confusion about what is meant by v. In
Equation (2.12), $v = v(t)$ is a function of time t; in Equation (2.13), $v = v(s)$
is a function of height s.
Unfortunately, this notation suggests that we are using the same ‘name’, v,
for two different functions. (The expressions on the right in (2.12) and (2.13)
are different!) In such a context, it may be necessary, for clarity, to introduce
different names for these functions. In fact, we could resolve the situation
by solving Equation (2.11) for t, giving
\[ t = \frac{10 - \sqrt{100 + 4g - 2gs}}{g} = f(s), \text{ say.} \tag{2.14} \]
(Again we must be careful about the sign of the square root, so that the equation corresponds to the upward motion of the stone.)
Now Equation (2.12) is of the form $v = g(t)$ (where $g(t) = 10 - gt$; the function g here should not be confused with the constant g). Then
Equation (2.13), which shows the dependency of v on s, can be written as
\[ v = g(f(s)). \]
In general, the function h with the rule
h(x) = g(f (x))
is called a composite function; it is the composition of the functions g
and f .
When combining functions in this way, it is important to check how the
domains affect each other. In the above example, (2.14) is valid only when
100 + 4g − 2gs ≥ 0, and we can see from (2.13) that the same condition
must hold in order to find the composite function v = h(s).
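
The composition just described, and the domain condition that comes with it, can be checked numerically. The following is a minimal sketch: the function names are hypothetical, and they are chosen to avoid reusing the letter g for both the constant and a function.

```python
import math

G = 9.81  # acceleration due to gravity in m s^-2 (value from the text)


def time_at_height(s):
    """Equation (2.14): time at which the rising stone reaches height s."""
    radicand = 100 + 4 * G - 2 * G * s
    if radicand < 0:
        raise ValueError("the stone never reaches this height")
    return (10 - math.sqrt(radicand)) / G


def velocity_at_time(t):
    """Equation (2.12): velocity of the stone at time t."""
    return 10 - G * t


def velocity_at_height(s):
    """Composite function: apply (2.14) first, then (2.12)."""
    return velocity_at_time(time_at_height(s))


s = 5.0
direct = math.sqrt(4 * G + 100 - 2 * G * s)          # Equation (2.13)
print(math.isclose(velocity_at_height(s), direct))   # True
```

Calling time_at_height with a height such as 8.0 raises an error, because $100 + 4g - 2gs$ is then negative; this is the domain restriction noted above.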
You will see later that when doing calculus it is useful to be able to form
compositions of functions, and even more useful to be able to recognize a
complicated function as the composition of simpler ones.

Example 2.2
If $f(x) = e^x$ ($x \in \mathbb{R}$) and $g(x) = 1 + x^2$ ($x \in \mathbb{R}$), what are the following?
(a) $g(f(x))$   (b) $f(g(x))$

Solution
(a) $g(f(x)) = g(e^x) = 1 + (e^x)^2 = 1 + e^{2x}$
(b) $f(g(x)) = f(1 + x^2) = e^{1 + x^2}$
(In this example, both functions f and g are defined for all x in $\mathbb{R}$, so we need not worry about domains.)
Notice that g(f (x)) and f (g(x)) are different. The function g(f (x)) is ‘apply
f first, then g’, while f (g(x)) is ‘apply g first, then f ’; the order in which f
and g are applied matters!
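
The point about order can be illustrated with a short sketch. It assumes a hypothetical helper compose, which builds the composite of two functions, and applies it to the functions f and g of Example 2.2.

```python
import math


def compose(outer, inner):
    """Return the composite function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


def f(x):
    return math.exp(x)      # f(x) = e^x


def g(x):
    return 1 + x ** 2       # g(x) = 1 + x^2


g_of_f = compose(g, f)      # apply f first, then g: 1 + e^(2x)
f_of_g = compose(f, g)      # apply g first, then f: e^(1 + x^2)

x = 1.0
print(math.isclose(g_of_f(x), 1 + math.exp(2 * x)))   # True
print(math.isclose(f_of_g(x), math.exp(1 + x ** 2)))  # True
print(g_of_f(x) == f_of_g(x))                         # False
```

At $x = 1$ the two composites take the values $1 + e^2$ and $e^2$ respectively, so they are different functions.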


Example 2.3
Express the function
\[ h(x) = \frac{1}{\left(\sqrt{1 + 2x^2}\right)^3} \tag{2.15} \]
as a composite of a quadratic function and a power function.
Solution
Note first that $1 + 2x^2$ is a quadratic function and that, writing $y = 1 + 2x^2$,
the right-hand side of (2.15) becomes $1/(\sqrt{y})^3 = y^{-3/2}$ (a power function). So
we can obtain h(x) in two steps.
Step 1 Calculate $y = 1 + 2x^2$.
Step 2 Apply $1/(\sqrt{y})^3 = y^{-3/2}$ to the result of Step 1.
So if $f(x) = 1 + 2x^2$ and $g(x) = x^{-3/2}$, then $h(x) = g(f(x))$. (Here, the domain of g is $x > 0$, but since $f(x) = 1 + 2x^2$ is always greater than 0, there is no problem.)

Exercise 2.10
(a) If $f(x) = e^{-x}$ and $g(x) = 1 - x^3$, find the following.
(i) $f(g(x))$   (ii) $g(f(x))$
(b) Express
\[ h(x) = \frac{1}{(4 + 9x^2)^4} \]
as a composite of a quadratic function and a power function.

3 Trigonometric functions
In this section, we add another class of functions to the ‘library’ developed
in Section 2. These are the trigonometric functions. They originate in the
geometry of right-angled triangles, but in this course we are equally often
concerned with their use in modelling repetitive or oscillatory behaviour. In
particular, they arise as solutions of certain differential equations.

3.1 Introducing the trigonometric functions


Diagnostic test 3.1
Do Exercise 3.1 on page 23, and check your answers with the solutions given
on page 54. If you are happy with your answers, you may proceed directly
to Subsection 3.2.

You will have met $\sin\theta = a/h$, $\cos\theta = b/h$ and $\tan\theta = a/b$ as ratios in a
right-angled triangle (see Figure 3.1). (The Greek letter θ is read as ‘theta’. The Greek alphabet is given in the Handbook.) However, these definitions of the sine,
cosine and tangent functions work only for $0 < \theta < \frac{\pi}{2}$. (Note that we shall
almost always express angles in radians in this course. Recall that $180^\circ = \pi$ radians.)


To define the sine and cosine functions for a general value of θ, we can
use Figure 3.2, which shows a circle of radius 1. Imagine that the line OA
started along the x-axis, and was then rotated anticlockwise through an
angle θ. Then the point A has coordinates (cos θ, sin θ). Here θ may be any
value, positive or negative. (A negative value of θ corresponds to a rotation
clockwise.)
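Figure 3.1  A right-angled triangle with hypotenuse h, angle θ, opposite side a and adjacent side b
Figure 3.2  A circle of radius 1 centred at the origin O, with the point A on the circle and the line OA at angle θ to the x-axis

If we rotate through 2π radians (360°), then we go round a full circle. So
rotations of θ and θ + 2π leave A in exactly the same place. This leads to
the repetitive nature of the graphs of sin and cos (see Figure 3.3). The word
periodic is used to refer to the fact that these functions repeat their values
every 2π: that is, $\sin(\theta + 2\pi) = \sin\theta$ and $\cos(\theta + 2\pi) = \cos\theta$, for any θ.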

Figure 3.3  Graphs of (a) $y = \sin\theta$ and (b) $y = \cos\theta$ for θ between −2π and 2π

Other trigonometric functions can be defined in terms of sin and cos. You
will have met $\tan\theta = \sin\theta/\cos\theta$. This is defined for all real θ except where
$\cos\theta = 0$ (i.e. at $\theta = \pm\frac{\pi}{2}, \pm\frac{3\pi}{2}$, and so on). You may also have met
\[ \sec\theta = \frac{1}{\cos\theta}, \quad \operatorname{cosec}\theta = \frac{1}{\sin\theta} \quad\text{and}\quad \cot\theta = \frac{1}{\tan\theta} = \frac{\cos\theta}{\sin\theta}. \]
(These functions may be referred to as tangent, secant, cosecant and cotangent.)
We need to restrict the domains of cosec and cot to exclude points where
$\sin\theta = 0$, and the domains of sec and tan to exclude points where $\cos\theta = 0$.

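*Exercise 3.1
(a) Find the values of sin θ and cos θ for $\theta = 0$ and $\theta = \frac{\pi}{2}$.
(b) Hence find the values of tan θ, sec θ, cosec θ and cot θ for $\theta = 0$ and
$\theta = \frac{\pi}{2}$, where they are defined.
(c) Two right-angled triangles are shown in Figure 3.4. Use these to calculate the values of sin θ, cos θ, tan θ, cosec θ, sec θ and cot θ for θ equal to
each of $\frac{\pi}{6}$, $\frac{\pi}{4}$ and $\frac{\pi}{3}$.
(d) For what values of θ is $\sin\theta = 0$? (Refer to Figure 3.3(a).)

Figure 3.4  Two right-angled triangles: one with angles $\frac{\pi}{6}$ and $\frac{\pi}{3}$ and sides 1, $\sqrt{3}$ and 2; the other with angle $\frac{\pi}{4}$ and sides 1, 1 and $\sqrt{2}$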


The function tan has the graph shown in Figure 3.5. Notice that tan ac-
tually repeats its values every π. (This is because sin(θ + π) = − sin θ and
cos(θ + π) = − cos θ, so that tan(θ + π) = tan θ.)

Figure 3.5  Graph of $y = \tan\theta$
(The graphs of sec, cosec and cot are given in the Handbook, as well as those of sin, cos and tan.)

3.2 Inverse trigonometric functions


Diagnostic test 3.2
(a) Find all the solutions of $\cos\theta = 0.5$ in the range −2π to 4π.
(b) Find all the solutions of $\tan\theta = 1/\sqrt{3}$.
Now check your solutions against those given below. If you are happy with
the answers, you may proceed directly to Subsection 3.3.
Solution
(a) In the range −π to π, we know that $\cos(\pm\frac{\pi}{3}) = \frac{1}{2} = 0.5$. Further solutions may be found by adding or subtracting multiples of 2π to these
values:
\[ -\tfrac{5\pi}{3},\ -\tfrac{\pi}{3},\ \tfrac{\pi}{3},\ \tfrac{5\pi}{3},\ \tfrac{7\pi}{3},\ \tfrac{11\pi}{3}. \]
(b) In the range 0 to $\frac{\pi}{2}$, we know that $\tan\frac{\pi}{6} = 1/\sqrt{3}$. From the graph in
Figure 3.5 we see that the required solutions are obtained by adding or
subtracting any multiple of π, so the solutions are
\[ \tfrac{\pi}{6} + n\pi \quad (n \in \mathbb{Z}). \]

Suppose that you need to solve for x the equation
\[ \cos x = \tfrac{1}{2}. \]
What solutions are there? You have seen (in Exercise 3.1(c)) that $\cos\frac{\pi}{3} = \frac{1}{2}$,
so one solution is certainly $x = \frac{\pi}{3}$. There are others, however. For instance,
since cos repeats its values every 2π, another solution is $x = \frac{\pi}{3} + 2\pi$. We
can find an infinite number of solutions by adding or subtracting multiples
of 2π to/from $\frac{\pi}{3}$. There are even more solutions. If you look at the graph of
cos in Figure 3.3(b), you can see that a horizontal line at $y = \frac{1}{2}$ would cut it
twice between 0 and 2π: we also have $\cos\frac{5\pi}{3} = \frac{1}{2}$. And more solutions can
be found by adding or subtracting multiples of 2π to/from $\frac{5\pi}{3}$.


In general, an equation of the form


cos x = y (3.1)
is solved for x by finding a value of the inverse trigonometric function
arccos:
x = arccos y.
However, we need to be careful here. Solutions of Equation (3.1) are not
unique, as we saw for $y = \frac{1}{2}$. If we reverse the roles of the axes in Figure 3.3(b), we obtain the curve shown in Figure 3.6. However, this is not
the graph of a function: a vertical line may meet the curve in many places,
reflecting the fact that, given y, Equation (3.1) may have multiple solutions x. To ensure that, given y, arccos y has a unique value, we need to
restrict the range in which values of arccos can fall. This is equivalent to
specifying a codomain for the function arccos. (The codomain of a function is a set within which its values must lie.) The codomain of arccos is
given in Table 3.1, together with codomains for two other inverse trigonometric functions, arcsin and arctan. In Figure 3.6, when the values taken by
arccos are restricted to this codomain, we obtain just the part of the curve
shown in bold, which is the graph of a function.
Figure 3.6  The graph of arccos is just the bold part of the curve
(The graphs of arcsin and arctan, as well as arccos, are given in the Handbook.)

Table 3.1
Function     Inverse         Codomain of inverse function   Domain of inverse function
y = sin x    x = arcsin y    −π/2 ≤ x ≤ π/2                 −1 ≤ y ≤ 1
y = cos x    x = arccos y    0 ≤ x ≤ π                      −1 ≤ y ≤ 1
y = tan x    x = arctan y    −π/2 < x < π/2                 R
(Some texts use $\sin^{-1}$, $\cos^{-1}$ and $\tan^{-1}$ rather than arcsin, arccos and arctan.)

Calculators and computer software can be expected to give values of the


inverse trigonometric functions drawn from suitably restricted codomains,
such as those in Table 3.1. However, in a particular model this may not give
the appropriate value, and in such a situation it is important to be alert to
the fact that an equation such as (3.1) actually has infinitely many solutions:
there are usually two solutions in the range 0 to 2π, together with infinitely
many others obtained by shifting these two by multiples of 2π. (There is another possibility: if $|y| > 1$, then Equation (3.1) has no solutions.)
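
The following sketch shows one way to recover all the solutions of Equation (3.1) in a given range from the single value that software returns. It assumes Python's math.acos, which gives values in the codomain 0 to π of Table 3.1; the function name cos_solutions is hypothetical.

```python
import math


def cos_solutions(y, lo, hi):
    """All x in [lo, hi] with cos x = y, in increasing order.

    The solutions are built from the principal value arccos y by
    changing its sign and shifting by multiples of 2*pi.
    """
    if abs(y) > 1:
        return []                          # Equation (3.1) has no solutions
    principal = math.acos(y)               # lies between 0 and pi
    found = set()
    n_min = math.floor((lo - math.pi) / (2 * math.pi))
    n_max = math.ceil((hi + math.pi) / (2 * math.pi))
    for n in range(n_min, n_max + 1):
        for base in (principal, -principal):
            x = base + 2 * math.pi * n
            if lo <= x <= hi:
                found.add(round(x, 12))    # rounding merges duplicates when y = 1 or -1
    return sorted(found)


# Solutions of cos x = 0.5 between -2*pi and 4*pi, as multiples of pi
sols = cos_solutions(0.5, -2 * math.pi, 4 * math.pi)
print([round(x / math.pi, 4) for x in sols])
# [-1.6667, -0.3333, 0.3333, 1.6667, 2.3333, 3.6667]
```

These six values agree with the solutions found in Diagnostic test 3.2(a).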
Exercise 3.2
(a) Find all the solutions of sin θ = 0.8 in the range 0 to 6π.
(b) Find all the solutions of tan θ = 1.


3.3 Some useful trigonometric identities

Diagnostic test 3.3
Do Exercise 3.3 on page 27, and check your answers with the solutions given
on page 55. If you are happy with your answers, you may proceed directly
to Section 4.

(All the trigonometric identities discussed in this subsection are included in the course Handbook for easy reference.)

Figure 3.7 shows the relation between a clockwise rotation through θ (regarded as a rotation through −θ) and an anticlockwise rotation through θ.
Notice that such rotations lead to equal x-coordinates but to y-coordinates
of opposite signs. So we have
\[ \cos(-\theta) = \cos\theta \quad\text{and}\quad \sin(-\theta) = -\sin\theta. \]

Figure 3.7  The points $A = (\cos\theta, \sin\theta)$ and $B = (\cos(-\theta), \sin(-\theta))$ on the circle of radius 1 centred at O, with X the point on the x-axis where $x = \cos\theta = \cos(-\theta)$

These relations hold for all values of θ, and are examples of trigonometric
identities. These can be useful in a variety of contexts, such as simplifying
expressions involving trigonometric functions.
To derive one particularly useful identity, apply Pythagoras’s Theorem to
the right-angled triangle OAX in Figure 3.7. This leads to
\[ \cos^2\theta + \sin^2\theta = 1. \tag{3.2} \]
(Notice that we write $(\sin\theta)^2$ as $\sin^2\theta$.)
If we divide each side of (3.2) by $\cos^2\theta$, we obtain the identity
\[ 1 + \tan^2\theta = \sec^2\theta. \]
(Strictly speaking, this and the following identities hold only where the functions tan, sec, etc. are defined.)
Similarly, if we divide each side of (3.2) by $\sin^2\theta$, we obtain
\[ \cot^2\theta + 1 = \operatorname{cosec}^2\theta. \]
You may also have met previously the identities
\[ \sin(\theta + \phi) = \sin\theta\cos\phi + \cos\theta\sin\phi, \tag{3.3} \]
\[ \cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi. \tag{3.4} \]
(These identities can be derived using transformation matrices, for example, but we shall not discuss here how this is done.)
Replacing φ by −φ in these identities (and using cos(−φ) = cos φ and
sin(−φ) = − sin φ), we obtain
sin(θ − φ) = sin θ cos φ − cos θ sin φ,
cos(θ − φ) = cos θ cos φ + sin θ sin φ.
We can obtain an identity for tan(θ + φ) by dividing those for sin(θ + φ)
and cos(θ + φ):
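\[ \tan(\theta + \phi) = \frac{\sin(\theta + \phi)}{\cos(\theta + \phi)} = \frac{\sin\theta\cos\phi + \cos\theta\sin\phi}{\cos\theta\cos\phi - \sin\theta\sin\phi}, \]
and dividing top and bottom by $\cos\theta\cos\phi$ gives
\[ \tan(\theta + \phi) = \frac{\tan\theta + \tan\phi}{1 - \tan\theta\tan\phi}. \]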
All these identities can be useful when manipulating expressions involving
trigonometric functions. One situation where such manipulations are needed
is when performing certain integrations by hand (as you will see in Section 6),
and there expressions for sin 2θ and cos 2θ can be particularly useful. We
ask you to obtain such expressions in the next exercise.
(Note that it is usual to write sin 2θ rather than sin(2θ) — the spacing
makes the meaning clear. However, when using the course computer algebra
package you will have to include the brackets. When there is any danger of
ambiguity, we shall also use brackets in the text.)


*Exercise 3.3
(a) By putting φ = θ in the expressions for sin(θ + φ) and cos(θ + φ),
establish the following identities.
(i) $\sin 2\theta = 2\sin\theta\cos\theta$   (ii) $\cos 2\theta = \cos^2\theta - \sin^2\theta$
(b) Using trigonometric identities, and particular values of sin and cos,
simplify each of the following.
(i) $\sin(2\pi - \theta)$   (ii) $\cos(2\pi - \theta)$   (iii) $\sin(\pi - \theta)$
(iv) $\cos(\pi - \theta)$   (v) $\sin(\frac{\pi}{2} - \theta)$   (vi) $\cos(\frac{\pi}{2} - \theta)$
(vii) $\cos(\frac{3\pi}{2} + x)$

The most useful identities to remember are (3.2), (3.3) and (3.4), as most
of the others can be derived from these.

4 Complex numbers

Diagnostic test 4.1


Do Exercises 4.1, 4.2, 4.3 and 4.7 below, and check your answers with the
solutions starting on page 55. If you are happy with your answers, you may
proceed directly to Section 5.

Complex numbers provide a system within which we can solve any quadratic
equation (and, indeed, any polynomial equation). They are helpful in some
of the mathematical techniques introduced in this course, although the use
we shall make of them is quite limited.
There is no real number x satisfying the equation
x2 = −1.
However, there are circumstances where it is convenient to have a system
of ‘numbers’ in which such an equation can be solved. Such a system is the
system of complex numbers. A complex number is one of the form
\[ z = a + bi \quad (\text{or, equivalently, } z = a + ib), \]
where $i = \sqrt{-1}$, and a and b are real numbers. (Engineers commonly use j to represent $\sqrt{-1}$.) We refer to a as the real part
of z, written Re(z), and to b as the imaginary part of z, written Im(z). (Note that Im(z) is the real number b; Im(z) is not equal to bi.)
A complex number of the form $a + 0i$ is, in effect, just the real number a;
so the real numbers are seen as part of (a subset of) the complex numbers.

We denote the set of all complex numbers by $\mathbb{C}$. Within $\mathbb{C}$, we can solve any
quadratic equation, since the formula will always give a solution once we
can use $\sqrt{-1}$. For example, the equation $x^2 - 2x + 2 = 0$ has the solutions
\[ x = \frac{2 \pm \sqrt{2^2 - 4 \times 2}}{2} = \frac{2 \pm \sqrt{-4}}{2} = \frac{2 \pm \sqrt{4} \times \sqrt{-1}}{2} = \frac{2 \pm 2i}{2} = 1 \pm i, \]
and the equation $x^2 = -1$ has the solutions $x = \pm i$.
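
These two examples can be checked with software that supports complex arithmetic. The sketch below assumes Python's standard cmath module, whose sqrt function accepts negative arguments; the function name is hypothetical. Note that Python follows the engineering convention and writes j for $\sqrt{-1}$.

```python
import cmath


def quadratic_roots_complex(a, b, c):
    """Both roots of a*x**2 + b*x + c = 0 (with a != 0), allowing complex values."""
    root = cmath.sqrt(b * b - 4 * a * c)   # square root taken within C
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


# x^2 - 2x + 2 = 0 has roots 1 + i and 1 - i
print(quadratic_roots_complex(1, -2, 2))
# x^2 = -1, that is x^2 + 1 = 0, has roots i and -i
print(quadratic_roots_complex(1, 0, 1))
```

The same function also handles equations with real roots, in which case the imaginary parts of the results are zero.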


An nth-order polynomial with real coefficients is a function of the form
\[ p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \]
where $a_n \neq 0$ and each coefficient $a_k$ ($k = 0, 1, \ldots, n$) is a constant in $\mathbb{R}$. (An nth-order polynomial is sometimes referred to as a polynomial of degree n.)
Within the complex numbers, any such polynomial can be written as a
product of $a_n$ and n factors of the form $x - c_k$ ($k = 1, 2, \ldots, n$), with each
$c_k$ in $\mathbb{C}$. (In fact, this result also holds if the coefficients $a_k$ are complex.) These n factors correspond to the n roots (i.e. solutions) $x = c_k$
($k = 1, 2, \ldots, n$) of the corresponding polynomial equation $p(x) = 0$; if a factor $x - c$ occurs more than once, then the root $x = c$ is a repeated root. (Repeated roots are sometimes referred to as equal roots or coincident roots.)
In Subsection 2.3 we saw that for second-order (i.e. quadratic) polynomials, repeated roots correspond to perfect squares (i.e. factorizations such as
$x^2 - 2cx + c^2 = (x - c)^2$).

4.1 The arithmetic of complex numbers


We can perform arithmetic (and algebra) with complex numbers, and this
follows all the familiar rules for real numbers, such as
u(v + w) = uv + uw and u × v = v × u.
To add, subtract or multiply complex numbers, just manipulate brackets in
the usual way, and remember that $i^2 = -1$. For example,
\[ (2 + 3i) + (4 - 7i) = 2 + 4 + 3i - 7i = 6 - 4i \]
and
\[ (2 + 3i) \times (4 - 7i) = 2 \times (4 - 7i) + 3i \times (4 - 7i) = 8 - 14i + 12i - 21i^2 = 8 + 21 - 2i = 29 - 2i. \]
Division of complex numbers is a little more complicated. The complex
conjugate of a complex number $z = a + bi$ is $\bar{z} = a - bi$, and the rule for
division is best expressed in terms of this. To simplify, for example,
\[
\frac{2 + 3i}{4 - 7i},
\]
multiply top and bottom by 4 + 7i, the complex conjugate of the denominator,
to obtain
\[
\begin{aligned}
\frac{2 + 3i}{4 - 7i} &= \frac{(2 + 3i) \times (4 + 7i)}{(4 - 7i) \times (4 + 7i)} \\
&= \frac{8 + 14i + 12i - 21}{16 + 28i - 28i + 49} \\
&= \frac{-13 + 26i}{65} \\
&= \frac{-1 + 2i}{5} \\
&= -\tfrac{1}{5} + \tfrac{2}{5} i.
\end{aligned}
\]

This process always reduces the denominator to a real number, since
$(a + bi) \times (a - bi) = a^2 + b^2$ is always real. (Note that $a^2 + b^2$ is always
positive, unless a = b = 0.) Thus, in general,
\[
\frac{c + di}{a + bi} = \frac{(c + di) \times (a - bi)}{a^2 + b^2}.
\]



The modulus of a complex number $z = a + bi$ is $\sqrt{a^2 + b^2}$, written $|z|$, so
the rule for division can be written, for complex numbers u and v, as
\[
\frac{u}{v} = \frac{u\bar{v}}{|v|^2}.
\]
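The worked arithmetic above can be checked by machine. The following is a minimal sketch in Python, used here purely for illustration (it is not the computer algebra package for the course); Python writes the imaginary unit as j rather than i, and the variable names are hypothetical.

```python
# Python's built-in complex type writes the imaginary unit as j rather than i.
u = 2 + 3j
v = 4 - 7j

# Division via the conjugate rule: u / v = u * conj(v) / |v|^2.
quotient_by_rule = u * v.conjugate() / abs(v) ** 2

# Built-in division, for comparison with the rule above.
quotient_builtin = u / v

print(u + v)             # sum
print(u * v)             # product
print(quotient_by_rule)  # quotient from the conjugate rule
print(quotient_builtin)  # quotient from the / operator
```

The two quotients should agree to within rounding error, which illustrates that multiplying top and bottom by the conjugate of the denominator is all that division requires.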

*Exercise 4.1
Let v = 3 − 4i and w = 2 − i. Evaluate each of the following.
(a) $\bar{v}$ (b) $|v|$ (c) $v - w$ (d) $vw$
(e) $w/v$ (f) $1/w$ (g) $w^2$ (h) $2w - 3v$

*Exercise 4.2
Solve (for x in C) the quadratic equation $2x^2 + 2x + 1 = 0$.

If a quadratic equation with real coefficients has complex roots (as in
Exercise 4.2), then these always form a pair of complex conjugates (of the form
a ± bi). (This follows from the formula (2.8) for the solution of a quadratic
equation.)

4.2 Polar form

Polar coordinates

Polar coordinates provide an alternative way of representing points in
the plane. Figure 4.1 shows a point A with Cartesian coordinates (x, y) and
polar coordinates $\langle r, \theta \rangle$. (We shall use angle brackets to distinguish polar
from Cartesian coordinates. This is not a universal convention.) The quantity r
is the distance from A to the origin, so r ≥ 0. The angle θ is measured
anticlockwise from the x-axis. (Negative angles correspond to measuring
clockwise from the x-axis.) It is convenient to allow θ to take any real value,
but this has the consequence that the polar representation of a point is not
unique. For example, $\langle r, \theta \rangle$ and $\langle r, \theta + 2\pi \rangle$ provide polar coordinates
of the same point. We can see from Figure 4.1 that if a point has polar
coordinates $\langle r, \theta \rangle$ and Cartesian coordinates (x, y), then
\[
x = r\cos\theta \quad\text{and}\quad y = r\sin\theta.
\]
These equations allow us to translate from polar to Cartesian coordinates.

[Figure 4.1: the point A (x, y) at distance r from the origin, with the angle θ
measured from the x-axis, x = r cos θ and y = r sin θ]

To translate from Cartesian to polar coordinates, we can use (see Figure 4.1)
\[
r = \sqrt{x^2 + y^2}, \quad \cos\theta = x/r, \quad \sin\theta = y/r \quad (r \neq 0). \tag{4.1}
\]
(If r = 0, then we can choose any value for θ.)
Equations (4.1) do not have a unique solution for θ in R, but they do have
a unique solution in the range −π < θ ≤ π.
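The two translations can be sketched in a few lines of Python (used here only for illustration). The function names to_polar and to_cartesian are hypothetical. The standard library function math.atan2 returns an angle between −π and π, which matches the range used here for ordinary inputs; a negative-zero y with negative x is an edge case that returns −π.

```python
import math

def to_polar(x, y):
    """Return (r, theta), with theta chosen between -pi and pi."""
    r = math.hypot(x, y)        # r = sqrt(x^2 + y^2)
    theta = math.atan2(y, x)    # angle measured anticlockwise from the x-axis
    return r, theta

def to_cartesian(r, theta):
    """Return (x, y) using x = r cos(theta) and y = r sin(theta)."""
    return r * math.cos(theta), r * math.sin(theta)

# A point on the negative y-axis: the angle is measured clockwise, so it is negative.
r, theta = to_polar(0.0, -2.0)
x, y = to_cartesian(r, theta)
print(r, theta)
print(x, y)
```

Converting to polar coordinates and back should return the original point, up to rounding error.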

*Exercise 4.3
What are the polar coordinates $\langle r, \theta \rangle$ of each of the following points, for θ
in the range −π < θ ≤ π?
(a) (−2, 0) (b) (1, 1) (c) (−1, −1)
(d) (4, 0) (e) (0, 4) (f) $(-\sqrt{3}, 1)$


The polar form of a complex number


A complex number x + yi can be represented geometrically on an Argand
diagram as the point with Cartesian coordinates (x, y). For example,
Figure 4.2 shows on an Argand diagram the point 3 + 2i, with real part 3 and
imaginary part 2.

[Figure 4.2: Argand diagram showing the point 3 + 2i at (3, 2)]

Combining polar coordinates and the Argand diagram leads to the polar
form of a complex number. For z = x + yi, find the polar coordinates $\langle r, \theta \rangle$
of the point with Cartesian coordinates (x, y). Then, using the relation
between polar and Cartesian coordinates, we have
\[
z = x + yi = x + iy = r\cos\theta + ir\sin\theta = r(\cos\theta + i\sin\theta).
\]
This is the polar form of z. Here, $r = \sqrt{x^2 + y^2} = |z|$ is the modulus
of z. We call θ an argument of z. As noted above, θ is not unique, but
there is a unique value of θ in the range −π < θ ≤ π. This is called the
principal value of the argument, and we write it as Arg(z). When there is
no possibility of confusion, we often write $\langle r, \theta \rangle$ as shorthand for the polar
form r(cos θ + i sin θ).

Exercise 4.4
If a complex number z has polar form $\langle 2, -\pi/4 \rangle$, what is its Cartesian form?

Exercise 4.5
Express each of the following complex numbers in polar form, choosing the
principal value of the argument.
(a) −2 (b) 1 + i (c) −1 − i
(d) 4 (e) 4i (f) $-\sqrt{3} + i$

Multiplication of complex numbers in polar form


Multiplication of complex numbers is simpler in polar than in Cartesian
form. We have
\[
\langle r_1, \theta_1 \rangle \times \langle r_2, \theta_2 \rangle = \langle r_1 r_2, \theta_1 + \theta_2 \rangle. \tag{4.2}
\]
(Note that although $\theta_1 + \theta_2$ is an argument of the product, it may not be
the principal value of the argument.)
That is, to multiply two numbers in polar form, we just multiply their moduli
and add their arguments. The equation above can be justified as follows:
\[
\begin{aligned}
& r_1(\cos\theta_1 + i\sin\theta_1) \times r_2(\cos\theta_2 + i\sin\theta_2) \\
&\quad = r_1 r_2(\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2 + i(\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2)) \\
&\quad = r_1 r_2(\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)),
\end{aligned}
\]
using trigonometric identities (3.4) and (3.3) from Section 3.
From (4.2), we can deduce a formula for division of complex numbers in
polar form:
\[
\langle r_1, \theta_1 \rangle \div \langle r_2, \theta_2 \rangle = \langle r_1/r_2, \theta_1 - \theta_2 \rangle.
\]
(The proof of this is not difficult, but we omit it for reasons of space.)
Also, if we multiply the complex number $\langle r, \theta \rangle$ by itself repeatedly, we obtain
a formula for an integer power of a complex number:
\[
\langle r, \theta \rangle^n = \langle r^n, n\theta \rangle.
\]


With r = 1, and written as
\[
(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta,
\]
this result is known as De Moivre’s Theorem. (Abraham de Moivre (1667–1754)
was born in France but spent his adult life living and working in England.)

Exercise 4.6
Find $(1 - i)^{20}$.
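The rule for integer powers in polar form can be checked numerically. The sketch below, in Python (for illustration only), uses the standard library functions cmath.polar and cmath.rect to move between Cartesian and polar form; the example number and the variable names are hypothetical choices.

```python
import cmath
import math

z = complex(math.sqrt(3), 1)      # the number sqrt(3) + i
r, theta = cmath.polar(z)         # modulus and principal argument of z

n = 3
# Power rule in polar form: modulus r**n, argument n * theta.
power_from_polar = cmath.rect(r ** n, n * theta)

# Direct repeated multiplication, for comparison.
power_direct = z ** n

print(r, theta)
print(power_from_polar)
print(power_direct)
```

Both routes should give the same complex number up to rounding error. Note that n * theta is an argument of the result but need not be its principal value.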

Complex exponentials
For a complex number z = x + iy, we can define the complex exponential
$e^z$ by the formula
\[
e^z = e^x(\cos y + i\sin y).
\]
We choose this definition because it works! That is, this complex exponential
behaves, as we would hope, like the real exponential function. In particular,
it retains the property that, for any complex numbers u and v,
\[
e^u \times e^v = e^{u+v}.
\]
In the case when x = 0, the definition of the complex exponential gives
\[
e^{iy} = \cos y + i\sin y.
\]
This equation is known as Euler’s formula. (Leonhard Euler (1707–1783) was a
prolific mathematician, making many fundamental contributions to diverse areas
of mathematics and science.)
This leads us to a third way of expressing a complex number, which is often
convenient. If z has polar form $\langle r, \theta \rangle$, then we have
\[
z = r(\cos\theta + i\sin\theta) = re^{i\theta},
\]
where $re^{i\theta}$ is referred to as the exponential form of the complex number z.
In this form, r is the modulus of z and θ is an argument of z. As with the
polar form, the value of θ is not unique, but there is a unique choice of θ in
the range −π < θ ≤ π.
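Euler’s formula and the product property of the complex exponential can both be checked numerically. This is a minimal sketch in Python (for illustration only), using the standard library function cmath.exp; the sample values of y, u and v are arbitrary assumptions.

```python
import cmath
import math

y = 0.75
lhs = cmath.exp(1j * y)                    # e^{iy}
rhs = complex(math.cos(y), math.sin(y))    # cos y + i sin y

u = 0.3 + 1.2j
v = -1.1 + 0.4j
product = cmath.exp(u) * cmath.exp(v)      # e^u times e^v
combined = cmath.exp(u + v)                # e^{u+v}

print(lhs, rhs)
print(product, combined)
```

In each pair, the two printed values should agree to within rounding error.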

*Exercise 4.7
Let $z = \langle r, \theta \rangle$. Use the exponential form of z to find $\mathrm{Re}(ze^{i\omega t})$.

5 Differentiation
The concepts and techniques of calculus are central to many of the mathe-
matical methods discussed in this course. In this section, we consider dif-
ferentiation.

5.1 Rates of change


Diagnostic test 5.1
Do Exercise 5.3 on page 34, and check your answers with the solutions given
on page 56. If you are happy with your answers, you may proceed directly
to Subsection 5.2.


Differentiation gives the rate of change of one variable with respect to
another. As in Section 2 (see page 10), suppose that a reservoir contains
V cubic metres of water at time t minutes after midday on 1 June, where
\[
V = 2 \times 10^6 - 15t.
\]
For a linear function such as this, the rate at which V is changing as t
changes is the same at all times t. (The volume of water is falling by the same
amount each minute.) This corresponds to the fact that a linear function
has a straight-line graph, whose gradient (or slope) is the same everywhere.
Now consider a non-linear function, such as
\[
s = \tfrac{1}{2}(-9.81)t^2 + 10t + 2, \tag{5.1}
\]
which was used on page 14 to model the height (s metres at time t seconds)
of a ball thrown vertically upwards. The rate of change of height with time,
written ds/dt, is equal to the velocity v of the ball, and this varies with time.
This function has a graph that is a parabola, and the gradient (or slope) of
a parabola varies from point to point. Using rules discussed below, we can
differentiate (5.1) to obtain
\[
v = \frac{ds}{dt} = -9.81t + 10.
\]
Thus differentiation of the function s = f(t) produces another function,
v = f′(t), called the derivative or derived function of f. At each value
of t, f′(t) = −9.81t + 10 gives the gradient of the graph of (5.1).
The gradient of a general graph y = f(x) at a particular point $x = x_0$ gives
the derivative f′ of the function f at that point. The gradient of f at $x_0$ may
be defined as the limiting value of the gradient of the chord AB in Figure 5.1,
as B approaches A. The gradient of this chord is $(f(x_0 + h) - f(x_0))/h$. The
process ‘B approaches A’ corresponds to h tending to 0, which we write as
h → 0. The definition applies only to suitable functions (called ‘smooth’),
where this limit exists and is the same whether h approaches 0 through
positive or negative values. Thus we may formally define the derivative of
f at $x_0$ by
\[
f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}.
\]

[Figure 5.1: graph of y = f(x) with the chord AB joining the points at $x_0$ and
$x_0 + h$; the chord has horizontal extent h and vertical extent $f(x_0 + h) - f(x_0)$]

Working from this definition, we can obtain the derivatives of the various
standard functions discussed in Sections 2 and 3. (The details of this need
not concern us here.) These derivatives are tabulated in the Handbook, and
we shall use them as required.
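The limiting process in the definition can be illustrated numerically by evaluating the gradient of the chord for smaller and smaller h. The following Python sketch (for illustration only) does this for the height function (5.1); the function name difference_quotient and the chosen time t0 are hypothetical.

```python
def s(t):
    """Height model (5.1): s = (1/2)(-9.81) t^2 + 10 t + 2."""
    return 0.5 * (-9.81) * t ** 2 + 10 * t + 2

def difference_quotient(f, x0, h):
    """Gradient of the chord joining the points at x0 and x0 + h."""
    return (f(x0 + h) - f(x0)) / h

t0 = 1.0
exact = -9.81 * t0 + 10    # value of ds/dt at t0, from the derivative quoted above

# The chord gradient approaches the exact value as h tends to 0,
# whether h is positive or negative.
for h in (0.1, 0.01, 0.001, -0.001):
    print(h, difference_quotient(s, t0, h), exact)
```

For this quadratic, the chord gradient differs from the exact derivative by a multiple of h, so the printed values close in on the exact value as h shrinks.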
Derivatives can also be found using the computer algebra package for the
course. However, it is not always convenient to use the computer (and indeed
it will not be available to you in an examination), so it is useful to be able
to perform differentiation by hand, at least in relatively simple cases. To do
this, we combine two elements:
• derivatives of standard functions;
• rules for differentiating combinations of functions of various types, in
sums, products, quotients and compositions.
The simplest rules concern constant multiples and sums. (The more complicated
rules are discussed in Subsection 5.2.) In general, the derivative of a
combination af(x) + bg(x), where a and b are constants, is af′(x) + bg′(x).
This rule enables us to differentiate (5.1) to obtain
\[
\begin{aligned}
\frac{ds}{dt} &= \tfrac{1}{2}(-9.81)\frac{d}{dt}(t^2) + 10\frac{d}{dt}(t) + \frac{d}{dt}(2) \\
&= \tfrac{1}{2}(-9.81)(2t) + 10(1) + 0 \\
&= -9.81t + 10
\end{aligned}
\]
(using the derivative of the standard function $t^n$ for n = 2 and n = 1, and the
fact that the derivative of a constant is 0). Similarly, if $V = 2 \times 10^6 - 15t$,
we find its derived function to be V′ = −15. This is constant (not dependent
on t), corresponding to the fact that this function has a straight-line graph
(with constant gradient).
There are various notations for derivatives, some of which we have used
above. We shall use whichever is convenient in a particular context.
Notation expressed purely in terms of variables, such as ds/dt, is referred to
as Leibniz notation (after its inventor, G. W. Leibniz (1646–1716)). (In text,
$\frac{ds}{dt}$ may be written as ds/dt, to save space.) This notation is extended
to write, for example,
\[
\frac{d}{dt}(3t + 5\sin 2t) \quad\text{or}\quad \frac{d}{dx}(ax + bx^2).
\]
Leibniz notation is sometimes a little clumsy, and may be inconvenient in
situations where the role of functions is prominent. There the more modern
function notation of adding a prime (′) to the function name is preferred. We
can then write, for example, f′(3) to mean the value of the derived function
of f at 3. Sometimes we find it convenient to mix function and variable
names, and write, for example, s′, rather than introducing a separate name
for the function relating the variables s and t. (We may sometimes write s′(t)
if we want to emphasize that s is a function of t.) When using Leibniz notation,
we may sometimes write $\frac{ds}{dt}(t)$ to emphasize that this derivative is a function
of t, or $\frac{ds}{dt}(4)$ to mean the value of the derivative when t = 4. Some of the
simpler forms of notation are open to ambiguity if used in inappropriate
contexts, and at times we need to be careful about how we express things,
for example by using the function notation in a precise manner.
Differentiation of a derivative produces the so-called ‘higher’ derivatives.
If, for example, we have $f(x) = x^3 + 5x$, then differentiation gives
$f'(x) = 3x^2 + 5$. (The derivative dy/dx is sometimes referred to as the first
derivative.) This derivative is itself a function of x, and can be differentiated
again. This gives the second derivative as f″(x). (In the above example,
f″(x) = 6x.) In Leibniz notation, we write the second derivative,
$\frac{d}{dx}\left(\frac{dy}{dx}\right)$, as $\frac{d^2y}{dx^2}$. Differentiating yet again leads to the third
derivative, written $\frac{d^3y}{dx^3}$, or f‴(x), which may also be written $f^{(3)}(x)$.
(f‴(x) = 6 in this case.) The process can be continued, and a general nth
derivative may be written as $\frac{d^ny}{dx^n}$ or $f^{(n)}(x)$, where n is referred to as
the order of the derivative.
There is one final piece of notation to mention. We so often find that time
(habitually denoted by t) is the independent variable that there is a separate
notational convention for differentiation with respect to time. We use a dot
over the variable to indicate a first derivative with respect to t, and two dots
to indicate a second derivative. (This notation is attributed to Isaac Newton
(1642–1727), and so is sometimes referred to as Newtonian notation.) So if x(t)
is the position of an object as a function of time t, then ẋ(t) means the same
as x′(t), and is the velocity of the object, while ẍ(t) means the same as x″(t),
and is its acceleration.


The next four exercises offer practice in differentiating standard functions,


and constant multiples and sums of these. The Handbook contains a num-
ber of standard derivatives, and you can refer to that in answering these
exercises. You will, however, find it helpful later in the course if you are
able to differentiate polynomials, exponentials and trigonometric functions
without reference to the Handbook.

Exercise 5.1
Suppose that an object is moving in a straight line so that its position x
(measured from a chosen origin) is related to time t by the equation
x = 5 + 7 cos(3t + 2).
(a) Find expressions in terms of t for the velocity ẋ(t) and acceleration ẍ(t)
of the object.
(b) Use the above equation to eliminate t from your expression for ẍ(t), and
hence find a relationship between the position and acceleration of the
object that holds at all times.

Exercise 5.2
The weekly wage bill of a company, t years in the future, is projected to be
£B, where
\[
B = 10^5 \exp(0.04t).
\]
Find an expression for the rate at which the wage bill will be rising in t years’
time. What will this rate of rise be as a percentage of the wage bill at the
time?

*Exercise 5.3
Calculate the following derivatives.
(a) $\frac{dy}{dx}$, where y = 1 − 0.9 exp(−0.5x).
(b) F  (2), where F (x) = 3x4 − 4x + 1.
(c) $\frac{d^2y}{dt^2}$, where y = ln t (t > 0).
(d) F  ( π6 ), where F (x) = 3 sec(2x) − 4 cos(−3x).
(e) g  (0), where g(t) = a cos(3t + φ) + b sin(3t + φ) (and a, b and φ are
constants).

Exercise 5.4
Calculate the following derivatives.
(a) v  (z), where v = 3 tan z + 2 cos z.
(b) $\frac{dy}{dt}$ at $t = \frac{\pi}{12}$, where y = A sin 3t + B cos 3t (and A and B are constants).
(c) $f^{(4)}(\frac{\pi}{2})$, where f(t) = 2 sin 3t.
(d) f  (y), where f (y) = arctan(3y).
(e) $\frac{dz}{dx}$ when x = 0, where z = ln(cx + d) (and c and d are constants, with
d > 0).


5.2 Differentiating combinations of functions


Diagnostic test 5.2
Do the starred parts of Exercises 5.5, 5.6 and 5.8 below, and check your
answers with the solutions starting on page 56. If you are happy with your
answers, you may proceed directly to Subsection 5.3.

As we have shown, derivatives of constant multiples and sums are calculated


in a natural way. For derivatives of other combinations — products, quo-
tients and composites — the rules are less obvious. These rules are given
below and in the Handbook; you will find it advantageous later in the course
if you are familiar with these rules and can use them without reference to
the Handbook.

Product Rule
\[
(fg)' = f'g + fg'.
\]
Or, in Leibniz notation,
\[
\frac{d}{dx}(uv) = \frac{du}{dx}v + u\frac{dv}{dx}.
\]
(A useful way of remembering this rule is: derivative of first times second,
plus first times derivative of second.)

Quotient Rule
\[
\left(\frac{f}{g}\right)' = \frac{f'g - fg'}{g^2}.
\]
Or, in Leibniz notation,
\[
\frac{d}{dx}\left(\frac{u}{v}\right) = \frac{\dfrac{du}{dx}v - u\dfrac{dv}{dx}}{v^2}.
\]
(A useful way of remembering this rule is: derivative of top times bottom,
minus top times derivative of bottom, all over bottom squared.)

Example 5.1
Find h′(x), where $h(x) = x^3 \cos 2x$.
Solution
The function h(x) is a product, f(x)g(x), with $f(x) = x^3$, g(x) = cos 2x.
We have
\[
f'(x) = 3x^2 \quad\text{and}\quad g'(x) = -2\sin 2x.
\]
So, using the Product Rule, we have
\[
h'(x) = 3x^2\cos 2x - 2x^3\sin 2x.
\]
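A derivative found by hand can be sanity-checked numerically. The sketch below, in Python (for illustration only), compares the Product Rule result of Example 5.1 with a central-difference estimate of the gradient; the helper central_difference, its step size and the sample points are hypothetical choices, and the comparison is a check rather than a proof.

```python
import math

def h(x):
    """The function of Example 5.1: h(x) = x^3 cos 2x."""
    return x ** 3 * math.cos(2 * x)

def h_prime(x):
    """The derivative obtained in Example 5.1 using the Product Rule."""
    return 3 * x ** 2 * math.cos(2 * x) - 2 * x ** 3 * math.sin(2 * x)

def central_difference(f, x, step=1e-6):
    """Estimate f'(x) from the chord between x - step and x + step."""
    return (f(x + step) - f(x - step)) / (2 * step)

for x in (0.5, 1.0, 2.0):
    print(x, h_prime(x), central_difference(h, x))
```

At each sample point the two values should agree to several decimal places.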

Exercise 5.5
*(a) Find $\frac{dy}{dx}$, where $y = \frac{\ln x}{x^2 + 1}$.
*(b) Find f  (t), where f (t) = t5 ln(3t + 4).
(c) Find g  (0) (in terms of the constants A, B and C), where
g(t) = (At + B) sin(At + C).
*(d) If the position of an object at time t is given by $e^{-3t}\sin 4t$, find its
velocity and acceleration.


The rule for composite functions is a little more complicated to use.

Composite Rule If h(x) = g(f(x)), then h′(x) = g′(f(x))f′(x).

Expressed in Leibniz notation, this rule looks rather different: if y is a
function of u, and u is a function of x (so y = g(u) and u = f(x)), then
\[
\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx}.
\]
(In this form, the Composite Rule is referred to as the Chain Rule.)
In Section 2 (page 20) we mentioned an example of a composite function.
There we had velocity v related to time t by the function
\[
v = g(t) = 10 - gt,
\]
and time related to height s by the function
\[
t = f(s) = \frac{10 - \sqrt{100 + 4g - 2gs}}{g}.
\]
If we wish to calculate the rate of change of v with respect to s, it is natural
to use the Composite Rule in its Leibniz form:
\[
\frac{dv}{ds} = \frac{dv}{dt}\frac{dt}{ds}.
\]
Now
\[
\frac{dv}{dt} = -g
\]
and
\[
\frac{dt}{ds} = \frac{d}{ds}\left(\frac{10 - \sqrt{100 + 4g - 2gs}}{g}\right) = \frac{1}{\sqrt{100 + 4g - 2gs}}.
\]
So, using the Composite Rule,
\[
\frac{dv}{ds} = -g \times \frac{1}{\sqrt{100 + 4g - 2gs}} = -\frac{g}{\sqrt{100 + 4g - 2gs}}.
\]
(You can check that this is correct by differentiating Equation (2.13)
directly.)
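The Composite Rule calculation above can also be checked numerically, by building the composite function and estimating its gradient. In this Python sketch (for illustration only) the value g = 9.81, the sample height s0 and all function names are assumptions made for the purpose of the check.

```python
import math

g = 9.81  # assumed value of the constant g, for illustration only

def t_of_s(s):
    """Time as a function of height: t = (10 - sqrt(100 + 4g - 2gs)) / g."""
    return (10 - math.sqrt(100 + 4 * g - 2 * g * s)) / g

def v_of_t(t):
    """Velocity as a function of time: v = 10 - g t."""
    return 10 - g * t

def v_of_s(s):
    """The composite function: velocity as a function of height."""
    return v_of_t(t_of_s(s))

def dv_ds_composite_rule(s):
    """The result derived above: dv/ds = -g / sqrt(100 + 4g - 2gs)."""
    return -g / math.sqrt(100 + 4 * g - 2 * g * s)

def central_difference(f, x, step=1e-6):
    """Estimate f'(x) from the chord between x - step and x + step."""
    return (f(x + step) - f(x - step)) / (2 * step)

s0 = 3.0
print(dv_ds_composite_rule(s0), central_difference(v_of_s, s0))
```

The two printed values should agree closely, provided s0 is chosen so that the expression under the square root is positive.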

Example 5.2
Find f′(x), where $f(x) = \sin^3 x$.

Solution
If we let u = sin x, then we have $f(x) = u^3$. (The recognition of $\sin^3 x$ as a
composite function, and of how to break it down into two parts, each consisting
of a standard function, is the key to differentiating it.)
We then have
\[
f'(x) = \frac{df}{du}\frac{du}{dx} = 3u^2\cos x = 3\sin^2 x\cos x,
\]
replacing the variable u by sin x.

Your proficiency with differentiation will depend on your experience and will
develop with practice. It will be helpful for your study of MST209 if you
are able to differentiate expressions such as that in Example 5.2 without
recourse to the computer algebra package for the course — and, ideally,
without even needing to refer to the Handbook. But such proficiency may
take some time to develop, and while it is developing feel free to check your
work using the computer.


Exercise 5.6
Use the Composite Rule to differentiate each of the following.
*(a) $y = \exp(t^2)$ (b) $f(x) = (3x^3 + 4)^6$
*(c) $z = \tan(3v + 4)$ (d) $g(z) = \sqrt{4 - z^2}$
1
*(e) f (x) = √ 3
1 + 2x2

Exercise 5.7
Differentiate the following functions. (These differentiations involve more
than one rule.)
(a) $y = \sec\left(\frac{x}{x^2 + 1}\right)$ (b) $z = t^2 \exp(t^3 + 1)$

Suppose that we want to find the gradient at the point (2, 1) of the tangent
to the ellipse with equation
\[
x^2 + 4y^2 = 8. \tag{5.2}
\]
We want dy/dx at x = 2. We could start by expressing y as a function
of x, but a more convenient approach is to differentiate the equation as it
stands. To differentiate $y^2$ with respect to x, we use the Composite Rule,
and obtain $\frac{d(y^2)}{dy}\frac{dy}{dx} = 2y\frac{dy}{dx}$. So, differentiating both sides of
Equation (5.2) with respect to x, we obtain
\[
2x + 4(2y)\frac{dy}{dx} = 0.
\]
When x = 2 and y = 1, this gives 4 + 8 dy/dx = 0, so dy/dx = −1/2.
Therefore the gradient of the tangent to this ellipse at (2, 1) is −1/2.
Differentiation with respect to x of an expression such as $x^2 + 4y^2$, where y
is a function of x, is known as implicit differentiation.
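The gradient found by implicit differentiation can be compared with the alternative mentioned above, namely expressing y as a function of x first. The following Python sketch (for illustration only) does this numerically for the upper half of the ellipse; the function names and the central-difference helper are hypothetical.

```python
import math

def y_upper(x):
    """Upper half of the ellipse x^2 + 4y^2 = 8, solved for y >= 0."""
    return math.sqrt((8 - x ** 2) / 4)

def implicit_gradient(x, y):
    """dy/dx obtained from 2x + 8y dy/dx = 0 (valid where y is non-zero)."""
    return -2 * x / (8 * y)

def central_difference(f, x, step=1e-6):
    """Estimate f'(x) from the chord between x - step and x + step."""
    return (f(x + step) - f(x - step)) / (2 * step)

print(y_upper(2.0))                      # the point (2, 1) lies on the upper half
print(implicit_gradient(2.0, 1.0))       # gradient from implicit differentiation
print(central_difference(y_upper, 2.0))  # numerical estimate of the same gradient
```

Both approaches should give the same gradient at (2, 1), up to rounding error in the numerical estimate.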

*Exercise 5.8
(a) Use the Product and Composite Rules to find the following in terms of
x, y and dy/dx.
(i) $\frac{d}{dx}(x^2 y)$ (ii) $\frac{d}{dx}(y^3)$
(b) Find the gradient at the point (−1, 1) of the tangent to the curve
\[
x^3 + x^2 y + y^3 = 1.
\]

Just occasionally, we need to consider differentiation of a complex-valued
function, of the form
\[
f(t) = g(t) + ih(t),
\]
where g and h are real functions. Differentiation of such a function is defined
in a natural way, as
\[
f'(t) = g'(t) + ih'(t).
\]
So, for example, if f(t) = cos 3t + i sin 3t, then f′(t) = −3 sin 3t + 3i cos 3t.


Exercise 5.9
Find the second derivative of the function f (t) = cos 2t + i sin 2t.

5.3 Investigating functions

Diagnostic test 5.3


Do Exercise 5.10 on page 39, and check your answers with the solutions
given on page 57. If you are happy with your answers, you may proceed
directly to Section 6.

Faced with an expression made up of some combination of standard functions,
how might you investigate its behaviour? As an example, consider
the function
\[
f = \frac{v}{4 + 1.5v + 0.008v^2},
\]
where we would like to see how f varies as v varies.
A sketch graph of f against v helps with this, and a computer algebra
package or graphics calculator will provide such a graph. However, it is not
always obvious for what range of values to plot the graph, so it is helpful
to be able to deduce some information about the general behaviour of a
function ‘by hand’, without recourse to a machine. Such information can
also be used to cross-check results obtained from a machine, and to flesh
out the picture more fully. This example will be continued in Exercise 5.11,
but first we shall make some general remarks about sketching graphs.
Questions that you might consider when sketching a graph include the fol-
lowing.
• Are there any points where the function is not defined?
(This often happens if the expression is a quotient in which the denom-
inator can be 0.)
• Where does the function cross the axes?
(To find the points where the function crosses the horizontal axis, solve
the equation f (x) = 0 for x. The function crosses the vertical axis at
the point y = f (0).)
• How does the function behave for large and small values of the indepen-
dent variable (or at the endpoints of the domain if this is an interval)?
(For a function with domain R, examine the values of the function at
large positive and negative values. For a function defined on an interval,
simply evaluate the function at the endpoints.)
• On which parts of its domain is the function increasing, and on which is
it decreasing?
(You can look at the sign of the gradient at various points. A function f(x)
is increasing on an interval if f(x) increases in value as x increases (or
equivalently if f′(x) > 0). Similarly, f(x) is decreasing on an interval if
f(x) decreases in value as x increases (or equivalently if f′(x) < 0).)
• Are there any local maximum or minimum values?

The last question can be answered using differentiation. A stationary
point of a function f(x) is a value of x where f′(x) = 0. Local maxima and
local minima occur at stationary points, although a stationary point need
not necessarily be either. Figure 5.2 illustrates such stationary points.

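[Figure 5.2: graph of a function against x, with labels marking a local maximum,
a local minimum, a point where the gradient is 0 but which is neither a maximum
nor a minimum, and the global minimum; the values A and B are marked on the
y-axis]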

In models, we often wish to find the overall maximum or overall minimum
of some function, usually referred to as the global maximum or global
minimum, respectively. The global minimum or maximum may well occur at
a stationary point; but caution is needed, for they need not necessarily do
so. For example, the global minimum of the function with domain [0, ∞)
illustrated in Figure 5.2 occurs at an endpoint of its domain (x = 0), which
for that function is not a stationary point. In fact, a function need not
have a global maximum or global minimum. For example, the function f(x)
in Figure 5.2 exceeds the local maximum value when x is large, but never
reaches a global maximum. However, the value of f(x) in Figure 5.2 does
not become arbitrarily large either: it is bounded above by the value A, which
it never actually attains but comes arbitrarily close to as x becomes large.
We refer to the line y = A as an asymptote of the graph of f. (A function f(x)
is bounded above by a number A if f(x) ≤ A for all x in the domain of f.
Similarly, f(x) is bounded below by B if f(x) ≥ B for all x in the domain of f.
The numbers A and B are referred to as an upper bound and a lower bound
for f, respectively.)
Having found a stationary point of a function, we can determine whether it
is a local maximum or a local minimum either by looking at the sign of the
second derivative at the stationary point, or by looking at the sign of the
first derivative to either side of the point. (For example, if b is a stationary
point, and f′(x) is positive for x less than b and negative for x greater than b,
or if f″(b) is negative, then b is a local maximum. These tests are given in
detail in the Handbook.)

*Exercise 5.10
Find any stationary points of the function
\[
y(x) = 5 - 2(x + 1)e^{-x/2} \quad (x \geq 0).
\]
Classify these as local minima or local maxima or neither, and evaluate
y(x) at these points.

Example 5.3
Suppose that
\[
(x^2 - 3)y = x - 2.
\]
Sketch a graph of y against x.

Solution
We have $y = \frac{x - 2}{x^2 - 3}$, but need to note that this expression for y is not
defined if $x^2 - 3 = 0$, i.e. if $x = \pm\sqrt{3}$. We can see that y = 0 if (and only if)
x = 2, so the graph crosses the x-axis at this one point. If x is large (positive
or negative), then y will be close to zero.


To look for stationary points, we use the Quotient Rule to calculate
\[
\frac{dy}{dx} = \frac{(1)(x^2 - 3) - (x - 2)(2x)}{(x^2 - 3)^2} = \frac{-x^2 + 4x - 3}{(x^2 - 3)^2}.
\]
This is zero if $x^2 - 4x + 3 = 0$; i.e. if x = 1 or 3. The second derivative
is a bit complicated to calculate, so it is easier here to look at the sign of
the first derivative near x = 1 and x = 3 to check whether these stationary
points are local maxima or minima. If x is just less than 1, then dy/dx
is negative, while if x is just greater than 1, dy/dx is positive, so x = 1 is
a local minimum. (Try x = 0.9 and x = 1.1.) For x just below 3, dy/dx is
positive, while for x just above 3, it is negative, so x = 3 is a local maximum.
(Try x = 2.9 and x = 3.1.) Note that at x = 1, y = 1/2, while at x = 3, y = 1/6.
We can incorporate all this information (plus other information, such as
the behaviour of y near $x = \pm\sqrt{3}$ and the value of y at certain values of x,
e.g. x = 0) in constructing a sketch graph, as in Figure 5.3. (Again we refer to
the lines $x = \sqrt{3}$ and $x = -\sqrt{3}$ as asymptotes of this graph.)

[Figure 5.3: sketch graph of y against x, with the values $-\sqrt{3}$, 1, $\sqrt{3}$, 2
and 3 marked on the x-axis]
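The first-derivative test used in Example 5.3 is easy to mechanize. The Python sketch below (for illustration only) evaluates the derivative a little to either side of each stationary point and reads off the classification from the signs. The function classify and its offset delta are hypothetical; delta must be small enough that no other stationary point or undefined point lies within it.

```python
def dy_dx(x):
    """The derivative from Example 5.3: (-x^2 + 4x - 3) / (x^2 - 3)^2."""
    return (-x ** 2 + 4 * x - 3) / (x ** 2 - 3) ** 2

def classify(derivative, b, delta=0.1):
    """Classify the stationary point b from the derivative's sign on either side."""
    left = derivative(b - delta)
    right = derivative(b + delta)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    if left * right > 0:
        return "neither"
    return "inconclusive"   # the derivative vanished at one of the test points

for b in (1, 3):
    print(b, classify(dy_dx, b))
```

With delta = 0.1 this uses exactly the test values suggested in the example (0.9 and 1.1, then 2.9 and 3.1).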

A continuous function is one whose graph can be drawn without lifting your
pen from the paper. The function $f(x) = (x - 2)/(x^2 - 3)$ is not continuous
on any domain containing either $\sqrt{3}$ or $-\sqrt{3}$.
A smooth function is continuous and has a continuous derivative. For
example, the function
\[
f(x) = \begin{cases} 3x, & x \geq 0, \\ -2x, & x < 0, \end{cases}
\]
is continuous, but is not smooth (since its derivative is not continuous at
x = 0).

Exercise 5.11
(a) Sketch a graph of the function
\[
f(v) = \frac{v}{4 + 1.5v + 0.008v^2} \quad (v \geq 0).
\]
In particular, find: any values of v for which f (v) is zero; any values
of v for which f (v) is not defined; and any local maxima or minima of
f (v). Also, indicate how f (v) behaves as v becomes large.
(b) Find the global maximum and minimum of f .


6 Integration
Subsection 6.1 provides a reminder of the basic idea of integration as ‘revers-
ing differentiation’. Subsection 6.2 discusses how we may calculate relatively
simple integrals by hand. In Subsection 6.3, we look at two techniques for
finding more complicated integrals by hand.
As well as ‘reversing differentiation’, integrals also arise as the limits of
certain sums. In Subsection 6.4, we see how this can lead to integrals arising
in models.

6.1 Reversing differentiation


Diagnostic test 6.1
Do Exercises 6.1 and 6.2 below, and check your answers with the solutions
starting on page 57. If you are happy with your answers, you may proceed
directly to Subsection 6.2.

Throughout this course you will meet a variety of differential equations.
These are equations involving the derivative of a function, for example
\[
\frac{ds}{dt} = 5t + 7. \tag{6.1}
\]
The objective is usually to ‘solve’ the equation by finding an expression for
the function itself (rather than its derivative). To do this involves ‘reversing’
the differentiation, and this process is referred to as integration. In the
above example we integrate both sides of the equation with respect to t,
obtaining
\[
s = \int (5t + 7)\,dt.
\]
To evaluate this integral, you can use the table of standard integrals in the
Handbook. This shows that $\int t\,dt = \tfrac{1}{2}t^2$ and $\int 1\,dt = t$, and on integration
we obtain
\[
s = \tfrac{5}{2}t^2 + 7t + c, \tag{6.2}
\]
where c may be any constant. (The constant c is often referred to as an
arbitrary constant or a constant of integration.) To confirm this, note that
with s given by (6.2), we have
\[
\frac{ds}{dt} = \frac{d}{dt}\left(\tfrac{5}{2}t^2 + 7t + c\right) = 5t + 7,
\]
as required by (6.1). Since c may be any constant, we see that the differential
equation (6.1) does not have a unique solution.
Generalizing, suppose that f is a known function, and
\[
F'(x) = f(x).
\]
We write the general solution of this differential equation as
\[
F(x) = \int f(x)\,dx,
\]
where the right-hand side, $\int f(x)\,dx$, is called the indefinite integral of
f(x), and the function to be integrated, f(x), is called the integrand.


Given a differential equation such as
\[
F'(x) = \frac{1}{1 + x^2}, \tag{6.3}
\]
finding a function F whose derivative is the function on the right-hand side
is a non-trivial task. For (6.3), we are lucky, since the table of standard
derivatives in the Handbook gives $\frac{d}{dx}(\arctan x) = \frac{1}{1 + x^2}$. Hence we have
\[
\int \frac{1}{1 + x^2}\,dx = \arctan x + c,
\]
where c is an arbitrary constant. (Any function that differentiates to
$1/(1 + x^2)$, such as F(x) = arctan x + 5, is referred to as an integral (or
antiderivative) of $1/(1 + x^2)$.)
In contrast, consider finding the integral
\[
\int \exp(-x^2)\,dx.
\]
At first sight, this might seem no harder a problem to solve than (6.3).
In fact, however, it is impossible! To be more precise, there is no simple
combination of the standard functions (polynomials, sin, cos, exp and ln)
that when differentiated gives $\exp(-x^2)$.
Finding explicit expressions for integrals is a much harder task than finding
derivatives. The rules of differentiation ensure that we can, in principle,
find an explicit expression for the derivative of any combination of standard
functions. The equivalent is not true for integrals. What is more, even
where integrals can be found, this may be a messy process. The methods
whereby particular functions can be integrated depend on recognizing what
happens to work in various particular cases.
Integrals are readily found if they appear in the table of standard integrals
in the Handbook (as do $\int e^{2x}\,dx$ and $\int x^7\,dx$, for example). So are the
integrals of constant multiples and sums of constant multiples of these, such
as $\int (4e^{2x} + 9x^7)\,dx$. There are also integration techniques that enable you
to find some more complicated integrals. Integration by substitution is based
on the rule for differentiating composite functions, while integration by parts
is based on the rule for differentiating a product. There are brief reminders
of these two techniques in Subsection 6.3.
The table of standard integrals in the Handbook contains quite a wide se-
lection of integrals. Some of these integrals are deduced from the table of
standard derivatives, others from using integration by parts or substitution.
You can regard them all as the fruit of others’ experience, and draw on
them as needed. The correctness of an integral, obtained either by using
the Handbook or from the computer algebra package for the course, can be
verified by differentiation.

*Exercise 6.1
Use differentiation to verify that the following integrals are correct (where
a ≠ 0 is a constant and c is an arbitrary constant).
(a) $\int x \sin ax\,dx = -\frac{x}{a}\cos ax + \frac{1}{a^2}\sin ax + c$
(b) $\int \tan ax\,dx = -\frac{1}{a}\ln(\cos ax) + c \quad \left(-\frac{\pi}{2} < ax < \frac{\pi}{2}\right)$


*Exercise 6.2
(a) Use implicit differentiation (and a trigonometric formula) to show that
if $x = \tan y$, then
\[ \frac{dy}{dx} = \frac{1}{1+x^2}. \]
Hence confirm that $\displaystyle \int \frac{1}{1+x^2}\,dx = \arctan x + c$.
(b) Use implicit differentiation of $x = \sin y$ (and a trigonometric formula)
to deduce an expression for the indefinite integral
\[ \int \frac{1}{\sqrt{1-x^2}}\,dx. \]
(Here $-1 < x < 1$, $-\tfrac{\pi}{2} < y < \tfrac{\pi}{2}$.)
Check that your result agrees with the one in the Handbook.

6.2 Evaluating integrals


Diagnostic test 6.2
Do Exercises 6.4 and 6.6 below, and check your answers with the solutions
starting on page 58. If you are happy with your answers, you may proceed
directly to Subsection 6.3.

Your first recourse for finding an integral by hand is the table of standard
integrals in the Handbook. (Or, preferably, use your memory!) If the integrals
of functions $f$ and $g$ are known, then the integral of $af + bg$, where $a$ and
$b$ are constants, is readily found, using the rule
\[ \int \bigl(a f(x) + b g(x)\bigr)\,dx = a \int f(x)\,dx + b \int g(x)\,dx. \]
So, for example (referring to the Handbook for $\int e^{2x}\,dx$ and $\int x^7\,dx$),
\begin{align*}
\int (4e^{2x} + 9x^7)\,dx &= 4 \int e^{2x}\,dx + 9 \int x^7\,dx \\
&= 4\left(\tfrac12 e^{2x}\right) + 9\left(\tfrac18 x^8\right) + c \\
&= 2e^{2x} + \tfrac98 x^8 + c.
\end{align*}
Sometimes algebraic manipulation can transform an expression to be integrated
into a more amenable form. For example, the manipulation
\[ \frac{3x^2 + 2x}{\sqrt{x}} = \frac{3x^2}{\sqrt{x}} + \frac{2x}{\sqrt{x}} = 3x^{3/2} + 2x^{1/2} \]
transforms the expression on the left into a sum of constant multiples of
functions in the Handbook table. Less obvious transformations can be achieved
using trigonometric formulae. For example, using $\cos 2x = \cos^2 x - \sin^2 x$
(see Exercise 3.3(a)(ii)) and $\sin^2 x + \cos^2 x = 1$ (Identity (3.2)), we obtain
$\cos 2x = 2\cos^2 x - 1$. Rearranging this gives
\[ \cos^2 x = \tfrac12 (1 + \cos 2x), \]
which enables us to integrate $\cos^2 x$.
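As an illustration of the first manipulation above, the rewritten integrand can be integrated term by term. This is a sketch only, assuming $x > 0$ and the standard integral $\int x^n\,dx = \frac{1}{n+1}x^{n+1} + c$ for $n \neq -1$:

```latex
\begin{align*}
\int \frac{3x^2 + 2x}{\sqrt{x}}\,dx
  &= \int \left(3x^{3/2} + 2x^{1/2}\right) dx \\
  &= 3\left(\tfrac{2}{5}x^{5/2}\right) + 2\left(\tfrac{2}{3}x^{3/2}\right) + c \\
  &= \tfrac{6}{5}x^{5/2} + \tfrac{4}{3}x^{3/2} + c \qquad (x > 0).
\end{align*}
```

Differentiating the final line term by term returns $3x^{3/2} + 2x^{1/2}$, as it should.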


Exercise 6.3
Use the identity
\[ \cos^2 ax = \tfrac12 (1 + \cos 2ax) \]
(where $a \neq 0$ is a constant) to obtain $\int \cos^2 ax\,dx$.

At times, attention needs to be paid to domains, to avoid giving, as integrals,
expressions that are not defined. For example, $\ln x$ is defined (as
a real number) only for $x > 0$, so $\int \frac{1}{x}\,dx = \ln x + c$ holds only for $x > 0$.
To integrate $1/x$ for $x < 0$, we use $\int \frac{1}{x}\,dx = \ln(-x) + c$. (For $x < 0$,
$\frac{d}{dx}\bigl(\ln(-x)\bigr) = \frac{1}{-x} \times (-1) = \frac{1}{x}$, using the Composite Rule.) The Handbook
gives domain restrictions where necessary. We sometimes write, for convenience,
$\int \frac{1}{x}\,dx = \ln|x| + c$ $(x \neq 0)$, but do be aware that this covers
the two separate cases of $x < 0$ and $x > 0$. Similarly, we sometimes write
\[ \int \frac{1}{ax+b}\,dx = \frac{1}{a}\ln|ax+b| + c \quad (ax + b \neq 0). \]
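The domain point above can be illustrated numerically. The sketch below (Python, not the course package) assumes an interval lying wholly in $x < 0$ and uses a simple midpoint rule, which is an illustrative choice rather than anything prescribed by the course. It compares the numerical value of the integral of $1/x$ with the difference of $\ln(-x)$ at the endpoints, and confirms that the real logarithm itself rejects a negative argument.

```python
import math

def midpoint_rule(f, a, b, n=10_000):
    # Approximate the integral of f over [a, b] using n midpoint strips
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def reciprocal(x):
    return 1.0 / x

# Interval lying wholly in x < 0, where the relevant integral is ln(-x) + c
a, b = -3.0, -1.0
numeric = midpoint_rule(reciprocal, a, b)
from_log_minus_x = math.log(-b) - math.log(-a)        # ln(-x) at b minus at a
from_log_abs_x = math.log(abs(b)) - math.log(abs(a))  # ln|x| at b minus at a

# ln x is not defined (as a real number) for x < 0
try:
    math.log(-2.0)
    log_defined_for_negative = True
except ValueError:
    log_defined_for_negative = False

print(numeric, from_log_minus_x, from_log_abs_x, log_defined_for_negative)
```

Note that the sketch never integrates across $x = 0$: the formula $\ln|x| + c$ covers the two cases separately and does not license an interval that contains 0.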

*Exercise 6.4
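Find the following integrals. (Use the standard integrals given in the Handbook as necessary.)
(a) $\int e^{5x}\,dx$   (b) $\int 6\sec^2(3t)\,dt$   (c) $\displaystyle \int \frac{1}{36 + 4v^2}\,dv$
(d) $\displaystyle \int \frac{1}{3-2y}\,dy \quad \left(y < \tfrac32\right)$   (e) $\displaystyle \int \frac{1}{3-2y}\,dy \quad \left(y > \tfrac32\right)$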

Exercise 6.5
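Find the following integrals. (Use the standard integrals given in the Handbook as necessary.)
(a) $\int (6\cos(-2t) + 8\sin 4t)\,dt$   (b) $\displaystyle \int \frac{1}{\sqrt{9-t^2}}\,dt \quad (-3 < t < 3)$
(c) $\displaystyle \int \frac{5t^3+7}{t}\,dt \quad (t < 0)$   (d) $\displaystyle \int \left(2\ln(4t) - \frac{2}{t}\right) dt \quad (t > 0)$
(e) $\displaystyle \int \frac{1}{(x-1)(x+1)}\,dx \quad (-1 < x < 1)$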

The next example again uses a standard integral from the Handbook, but
requires careful matching of parameters and attention to domains.

Example 6.1
For $A > 0$, $x > \frac{1}{A}$, find $\displaystyle \int \frac{1}{x(1-Ax)}\,dx$.
Solution
We can match the integrand with that of a standard integral by writing it
in the form
\[ \frac{1}{x(1-Ax)} = \frac{-1}{A(x-0)\left(x-\frac1A\right)}. \]


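The Handbook gives the integral $\displaystyle \int \frac{1}{(x-a)(x-b)}\,dx$ for $b < a$. To match
that, choose $b = 0$ and $a = \frac1A$. Since $x > \frac1A$, we must choose the standard
integral for the case $a < x$. So we obtain
\begin{align*}
\int \frac{1}{x(1-Ax)}\,dx &= -\frac1A \int \frac{1}{(x-0)\left(x-\frac1A\right)}\,dx \\
&= -\frac1A \left(\frac{1}{\frac1A - 0}\right) \ln\left(\frac{x-\frac1A}{x-0}\right) + c \\
&= -\ln\left(\frac{x-\frac1A}{x}\right) + c \\
&= \ln\left(\frac{x}{x-\frac1A}\right) + c = \ln\left(\frac{Ax}{Ax-1}\right) + c.
\end{align*}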
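The result of Example 6.1 can be checked by differentiation, in the spirit of Exercise 6.1. The working below is a sketch that assumes $A > 0$ and $x > \frac1A$, so that both $Ax$ and $Ax - 1$ are positive and the logarithm of the quotient can be split:

```latex
\begin{align*}
\frac{d}{dx}\left[\ln\left(\frac{Ax}{Ax-1}\right)\right]
  &= \frac{d}{dx}\bigl[\ln(Ax) - \ln(Ax-1)\bigr] \\
  &= \frac{1}{x} - \frac{A}{Ax-1} \\
  &= \frac{(Ax-1) - Ax}{x(Ax-1)} \\
  &= \frac{-1}{x(Ax-1)} = \frac{1}{x(1-Ax)}.
\end{align*}
```

This recovers the original integrand, which is consistent with the answer obtained from the Handbook.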

*Exercise 6.6
(a) For $k > 0$, $-k < v < k$, find $\displaystyle \int \frac{1}{v^2-k^2}\,dv$.
(Hint: Remember that $v^2 - k^2 = (v-k)(v+k)$.)
(b) For $a > 0$, $b > 0$, $-\sqrt{\frac{a}{b}} < v < \sqrt{\frac{a}{b}}$, find $\displaystyle \int \frac{1}{a-bv^2}\,dv$.

6.3 Integration by parts and by substitution

Diagnostic test 6.3


Do Exercises 6.7 and 6.8 below, and check your answers with the solutions
starting on page 58. If you are happy with your answers, you may proceed
directly to Subsection 6.4.

In this subsection, we look briefly at two methods for evaluating more com-
plicated integrals. You will use these methods later in the course. For
now, it is particularly useful to recognize the type of integral illustrated in
Equation (6.5).

Integration by substitution
The formula for integration by substitution is
\[ \int f(g(x))\,g'(x)\,dx = \int f(u)\,du. \tag{6.4} \]
Or, in Leibniz notation,
\[ \int f(u)\,\frac{du}{dx}\,dx = \int f(u)\,du. \]
We say that we have ‘substituted’ u = g(x). The following example shows
one way of executing integration by substitution.

45
Unit 1 Getting started

Example 6.2
Find $\int x \sin(2 + 3x^2)\,dx$.

Solution
Let $u = 2 + 3x^2$, so $du/dx = 6x$. (In (6.4), we are taking $g(x) = 2 + 3x^2$, so $g'(x) = 6x$.)
We have
\begin{align*}
\int x \sin(2+3x^2)\,dx &= \tfrac16 \int 6x \sin(2+3x^2)\,dx \\
&= \tfrac16 \int \sin u\,\frac{du}{dx}\,dx \\
&= \tfrac16 \int \sin u\,du \\
&= -\tfrac16 \cos u + c \\
&= -\tfrac16 \cos(2+3x^2) + c,
\end{align*}
substituting at the end for $u$ in terms of $x$. (Note how, in effect, $\frac{du}{dx}\,dx$ is
replaced by $du$.)
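The result of Example 6.2 can be sanity-checked numerically. The following Python sketch (again not the course package) assumes the interval $[0, 1]$ and composite Simpson's rule, both chosen here purely for illustration. It compares a numerical estimate of the definite integral of $x\sin(2+3x^2)$ with the difference in the values of $-\frac16\cos(2+3x^2)$ at the endpoints.

```python
import math

def simpson(f, a, b, n=1000):
    # Composite Simpson's rule on [a, b]; n must be even
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    return x * math.sin(2 + 3 * x**2)

def antiderivative(x):
    # Result of Example 6.2 with the arbitrary constant set to 0
    return -math.cos(2 + 3 * x**2) / 6

a, b = 0.0, 1.0
numeric = simpson(integrand, a, b)
from_antiderivative = antiderivative(b) - antiderivative(a)
print(numeric, from_antiderivative)
```

Agreement of the two values to many decimal places suggests that the substitution was carried out correctly, including the constant factor $\frac16$.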

Notice in this example that the integrand is a composite function,
$\sin(2 + 3x^2)$, multiplied by (in effect) the derivative of the ‘inner function’
$2 + 3x^2$. We are only ‘out’ by a constant factor (of $\frac16$), and this shows up in
the final expression. In such a situation it is relatively easy to see that the
substitution $u = 2 + 3x^2$ will help to simplify the integral. There are other
useful types of substitution that are much less apparent, but we need not
discuss those here.
One form of integral comes up sufficiently often to be worth separate mention.
We have, for $g(x) \neq 0$,
\[ \int \frac{g'(x)}{g(x)}\,dx = \ln|g(x)| + c. \tag{6.5} \]
This follows from treating the cases $g(x) > 0$ and $g(x) < 0$ separately, and
using the substitutions $u = g(x)$ and $u = -g(x)$, respectively. When
$g(x) > 0$, we have $u = g(x) > 0$, $du/dx = g'(x)$ and
\[ \int \frac{g'(x)}{g(x)}\,dx = \int \frac{1}{u}\,du = \ln u + c = \ln(g(x)) + c. \]
When $g(x) < 0$, we have $u = -g(x) > 0$, $du/dx = -g'(x)$ and
\[ \int \frac{g'(x)}{g(x)}\,dx = \int \frac{-g'(x)}{-g(x)}\,dx = \int \frac{1}{u}\,du = \ln u + c = \ln(-g(x)) + c. \]

*Exercise 6.7
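Find the following integrals.
(a) $\int y^2 \exp(2 + 4y^3)\,dy$   (b) $\int \cos y \sin^2 y\,dy$
(c) $\int t\sqrt{1-t^2}\,dt \quad (-1 < t < 1)$   (d) $\displaystyle \int \frac{x}{1+x^2}\,dx$
(e) $\displaystyle \int \frac{\sin 2t}{1+\sin^2 t}\,dt$   (f) $\displaystyle \int \frac{y}{1-y^2}\,dy \quad (y \neq \pm 1)$
(Part (e) requires use of the trigonometric formula for $\sin 2t$ derived in Exercise 3.3(a)(i).)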


Integration by parts
The formula for integration by parts is
\[ \int f(x)\,g'(x)\,dx = f(x)\,g(x) - \int f'(x)\,g(x)\,dx. \]
As with integration by substitution, this formula transforms an integral into
a different one, and the key to successful use is to ensure that the ‘new’
integral is easier to evaluate than the original.

Example 6.3
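Find $\int x e^{-2x}\,dx$.

Solution
In the formula, take $f(x) = x$ and $g'(x) = e^{-2x}$. Then $f'(x) = 1$ and
$g(x) = -\tfrac12 e^{-2x}$ (note that an arbitrary constant need not be included
in the expression for $g(x)$), so
\begin{align*}
\int x e^{-2x}\,dx &= x\left(-\tfrac12 e^{-2x}\right) - \int 1\left(-\tfrac12 e^{-2x}\right) dx \\
&= -\tfrac12 x e^{-2x} + \tfrac12 \int e^{-2x}\,dx \\
&= -\tfrac12 x e^{-2x} - \tfrac14 e^{-2x} + c \\
&= -\tfrac14 (2x+1) e^{-2x} + c.
\end{align*}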
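As with any integral, the answer to Example 6.3 can be verified by differentiation. The following working is a sketch using only the Product Rule and the Composite Rule:

```latex
\begin{align*}
\frac{d}{dx}\left[-\tfrac14 (2x+1)e^{-2x} + c\right]
  &= -\tfrac14 (2)\,e^{-2x} - \tfrac14 (2x+1)(-2)\,e^{-2x} \\
  &= \left(-\tfrac12 + x + \tfrac12\right) e^{-2x} \\
  &= x e^{-2x}.
\end{align*}
```

The derivative is the original integrand, so the choice $f(x) = x$, $g'(x) = e^{-2x}$ did lead to a correct integral.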

*Exercise 6.8
(a) Use integration by parts to find $\int x e^{-x}\,dx$.
(b) Find $\int x^2 e^{-x}\,dx$. (Use integration by parts, then the result of part (a).)

6.4 Definite integrals


Diagnostic test 6.4
Calculate each of the following integrals.
(a) $\int_0^1 (x^3-2)\,dx$   (b) $\int_1^2 (x^3-2)\,dx$   (c) $\int_0^2 (x^3-2)\,dx$
Now check your solutions against those given below. If you are happy with
the answers, you may proceed directly to Section 7.
Solution
(a) We have
\[ \int_0^1 (x^3-2)\,dx = \left[\tfrac14 x^4 - 2x\right]_0^1 = \left(\tfrac14 - 2\right) - (0-0) = -\tfrac74. \]
(b) Similarly,
\[ \int_1^2 (x^3-2)\,dx = \left[\tfrac14 x^4 - 2x\right]_1^2 = \left(\tfrac{16}{4} - 4\right) - \left(\tfrac14 - 2\right) = \tfrac74. \]


(c) We can either evaluate the integral directly,
\[ \int_0^2 (x^3-2)\,dx = \left[\tfrac14 x^4 - 2x\right]_0^2 = \left(\tfrac{16}{4} - 4\right) - (0-0) = 0, \]
or use parts (a) and (b):
\[ \int_0^2 (x^3-2)\,dx = \int_0^1 (x^3-2)\,dx + \int_1^2 (x^3-2)\,dx = -\tfrac74 + \tfrac74 = 0. \]

The indefinite integral is a function (or, to be exact, a family of functions
containing an arbitrary constant). A different, though closely related, form
of integral is the definite integral, whose value is a number. If we know
an integral, say $F$, of $f$, then a definite integral of $f$ is readily found, since
the value of a definite integral is given by
\[ \int_a^b f(x)\,dx = F(b) - F(a). \]
(Any choice of $F$, an integral of $f$, leads to the same value for $F(b) - F(a)$.)

The difference $F(b) - F(a)$ is commonly written as $[F(x)]_a^b$.

So, for example,
\[ \int_0^1 \frac{1}{\sqrt{4-\theta^2}}\,d\theta = \left[\arcsin \tfrac12\theta\right]_0^1 = \arcsin \tfrac12 - \arcsin 0 = \tfrac{\pi}{6} - 0 = \tfrac{\pi}{6}. \]
0 4 − θ2

Exercise 6.9
Evaluate $\displaystyle \int_0^{3/2} \frac{1}{9+4z^2}\,dz$.

A rough-and-ready way of thinking of a definite integral $\int_a^b f(x)\,dx$ is as ‘the
accumulation of the values taken by $f(x)$ as $x$ runs from $a$ to $b$’. This ties
up with a useful way of visualizing definite integrals, as areas. If $f(x) \geq 0$
for $a \leq x \leq b$, then the definite integral $\int_a^b f(x)\,dx$ is equal to the area under
the graph of $f(x)$ between $x = a$ and $x = b$ (see Figure 6.1(a)). There is one
point to be careful about here. If $f(x) < 0$, corresponding to a region below
the $x$-axis, then we have a negative contribution to the integral, whereas
area is always a positive quantity. Thus for a function $f$ as pictured in
Figure 6.1(b), $\int_a^b f(x)\,dx = \text{area } A_1 - \text{area } A_2$.

Figure 6.1 $\int_a^b f(x)\,dx$ as an area: (a) with $f(x) > 0$; (b) in general.


Whereas indefinite integrals typically arise in solving differential equations
(‘reversing differentiation’), definite integrals can arise in other circumstances
too, such as when an integral is seen as the limit of a sequence
of sums. As an example, consider the following model, to estimate the
number of seabird nests on an island.

Example 6.4
An island is modelled as a circle of radius 500 metres. The density of nests
is greatest on the edge of the island, and least at the centre. (The birds
prefer ready access to the sea.) The density is modelled as D, measured in
nests per square metre, where
D = 0.1 + r/500 = 0.1 + 0.002r,
where r is the distance from the centre of the island, measured in metres.
Estimate the number of nests on the island.
Solution

Imagine the island divided by concentric circles into narrow ‘annular’ strips,
each of width $\delta r$. Figure 6.2 shows a ‘typical’ strip, between circles of radius
$r$ and $r + \delta r$.

Figure 6.2 (the island, of radius 500 m, with a typical annular strip of radius $r$ and width $\delta r$)

The area of this typical strip is approximately $2\pi r\,\delta r$ (‘length × width’). The
number of nests within this strip is (the area) × (the density of nests), and
so is approximately
\[ 2\pi r\,\delta r\,D = 2\pi r(0.1 + 0.002r)\,\delta r. \]
The total number of nests on the island is the sum of the number of nests
in all the strips. If we take the limit of this sum as $\delta r \to 0$, then the sum
converges to the definite integral of $2\pi r(0.1 + 0.002r)$ between $r = 0$ (the
centre of the island) and $r = 500$ (at the edge). That is, an estimate of the
number of nests is
\[ \int_0^{500} 2\pi r(0.1 + 0.002r)\,dr = 6.021 \times 10^5, \]
so there are approximately 600 000 nests.
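The limiting process in Example 6.4 can be imitated directly. The sketch below (Python, not the course package) adds up the approximate number of nests in each annular strip, taking the inner radius of each strip as its representative radius (an assumption made here for simplicity), and compares the totals with the value obtained from an integral of the integrand.

```python
import math

ISLAND_RADIUS = 500.0  # metres, as in Example 6.4

def density(r):
    # Nests per square metre at distance r from the centre
    return 0.1 + 0.002 * r

def nests_from_strips(n_strips):
    # Sum of (approximate strip area) x (density) over all annular strips
    dr = ISLAND_RADIUS / n_strips
    total = 0.0
    for i in range(n_strips):
        r = i * dr  # inner radius of the i-th strip
        total += 2 * math.pi * r * density(r) * dr
    return total

def nests_from_integral():
    # An integral of 2*pi*r*(0.1 + 0.002r) is 2*pi*(0.05r^2 + 0.002r^3/3),
    # which is zero at r = 0, so only the upper limit contributes
    R = ISLAND_RADIUS
    return 2 * math.pi * (0.05 * R**2 + 0.002 * R**3 / 3)

for n in (10, 100, 1000, 10000):
    print(n, round(nests_from_strips(n)))
print("integral:", round(nests_from_integral()))
```

As the number of strips grows (so that the strip width shrinks), the sums approach the value of the definite integral, which is the behaviour the example relies on.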

The equivalence of the two views of integration, as ‘reversing differentiation’
and as the limit of a sequence of sums, is assured by the Fundamental
Theorem of Calculus, but that need not concern us here. Many texts use
the limit of a sequence of sums (rather than ‘reversing differentiation’) as
the basis for the definition of integration.

End-of-section Exercise
Exercise 6.10
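Find the following integrals (where $a$ is a constant).
(a) $\int_0^{500} 2\pi r(0.1 + 0.002r)\,dr$   (b) $\int \exp(a - 2y)\,dy$
(c) $\displaystyle \int_0^1 \frac{1}{3u+5}\,du$   (d) $\displaystyle \int_{-3}^{-2} \frac{1}{3u+5}\,du$   (e) $\int \ln(3t + a)\,dt$
(f) $\int t\cos(3t + a)\,dt$   (g) $\displaystyle \int_0^{\pi/4} \frac{\sin 2x}{3 + \cos 2x}\,dx$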


7 Computer activities
This section demonstrates how the computer algebra package for the course
can assist in performing various tasks that have been studied in the unit. In
particular, we can easily perform algebraic manipulations, such as simpli-
fying expressions and solving quadratic equations. We can also solve equa-
tions numerically when an algebraic solution cannot be found, and handle
the arithmetic of complex numbers. Turning to calculus, you will see how
to produce graphs, and how to differentiate and integrate.

Use your computer to complete the following activities.


Symbolic algebra on the computer

Activity 7.1
(The solutions to these questions, and further comments, are given in the computer worksheets.)
(a) Evaluate $\left(\frac49\right)^{1/2}$.
(b) Simplify $\left(\frac49\right)^{1/2}$.
(c) Simplify each of the following.
(i) $a^3 a^5$   (ii) $\exp(2\ln x + \ln(x+1))$   (iii) $\sqrt{x^2}$
(d) Try ‘simplifying’ $x^5 + x^7$.

Activity 7.2
Expand each of the following.
(a) $(x+2)^3$   (b) $(2x+3)^{10}$

Activity 7.3
Suppose that ln y = 2.83 ln x + 0.37. Express y as a function of x.

Activity 7.4
(a) Solve for x the following equations.
(i) $2x^2 + 7x - 4 = 0$   (ii) $x^2 + x - 6 = 0$   (iii) $x^2 - 2x + 2 = 0$
(b) Solve for $x$ the quadratic equation $abx^2 - (a+b)x + 1 = 0$.

Solving equations numerically on the computer


Activity 7.5
Find the values of x that satisfy the equation cos x = 0.2x.
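For readers without the course package to hand, the idea behind a numerical solution of an equation such as this one can be sketched in Python. The sketch assumes a simple scan for sign changes followed by bisection; this is an illustrative method and not necessarily what the course software does internally. The scan interval $[-5, 5]$ is chosen because $|\cos x| \leq 1$ forces $|0.2x| \leq 1$ at any solution.

```python
import math

def f(x):
    # cos x = 0.2x rewritten as f(x) = 0
    return math.cos(x) - 0.2 * x

def bisect(func, lo, hi, tol=1e-10):
    # Assumes func(lo) and func(hi) have opposite signs
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2

def find_roots(func, start, stop, steps=200):
    # Scan equal subintervals for a sign change, then refine each by bisection
    width = (stop - start) / steps
    roots = []
    for i in range(steps):
        lo = start + i * width
        hi = lo + width
        if func(lo) * func(hi) < 0:
            roots.append(bisect(func, lo, hi))
    return roots

roots = find_roots(f, -5.0, 5.0)
for root in roots:
    print(f"x = {root:.4f}, residual = {f(root):.1e}")
```

A graph of $\cos x$ and $0.2x$ on the same axes is a useful companion to any such calculation, since a scan of this kind can miss solutions that do not produce a sign change.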

Complex numbers on the computer

Activity 7.6
Evaluate the following.
(a) (1 + i)(1 − 2i) (b) |4 + i|


Activity 7.7
 
(a) Expand $(\cos x + i\sin x)^4$. Find $\mathrm{Re}\bigl((\cos x + i\sin x)^4\bigr)$, and hence use De
Moivre’s Theorem to obtain an expression for $\cos 4x$ in terms of $\sin x$
and $\cos x$.
(b) Expand $\cos 4x$. Compare the result with your answer to part (a).

Graphs and differentiation on the computer

Activity 7.8
Consider the function
\[ y(x) = 5 - 2(x+1)\exp\left(-\tfrac12 x\right) \quad (x \geq 0). \]
(a) Obtain a graph of $y(x)$ against $x$.
(b) How does $y(x)$ appear to behave as $x$ becomes large?
(c) Can you find the global maximum and global minimum values taken by
$y(x)$? (You may wish to refer back to Exercise 5.10 in answering this.)
Activity 7.9
(a) Find $\dfrac{dy}{dx}$, where $y = 1 - 0.9\exp(-0.5x)$.
(b) Find $\dfrac{dy}{dx}$, where $y = \dfrac{x}{x^2+1}$.
(c) Find $\dfrac{d^2y}{dt^2}$, where $y = \ln t$.
(The functions in parts (a) and (c) were also considered in Exercise 5.3.)

Integration on the computer


Activity 7.10
(a) Find the following integrals.
(i) $\int \exp(5x)\,dx$   (ii) $\displaystyle \int \frac{5t^3+7}{t}\,dt$
(b) Try to find the following integrals.
(i) $\displaystyle \int \frac{1}{v^2-k^2}\,dv$   (ii) $\displaystyle \int \frac{1}{a-bv^2}\,dv$

Activity 7.11
Evaluate the following definite integrals, first numerically and then symbolically.
(a) $\displaystyle \int_1^4 \frac{1}{x}\,dx$   (b) $\displaystyle \int_0^1 \exp(-x^2)\,dx$
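The distinction between numerical and symbolic evaluation in this activity can be illustrated with a short Python sketch (not the course package). It assumes the trapezium rule with a fixed number of strips, chosen here only for illustration. For part (a) the numerical value is compared with $\ln 4$, which comes from the standard integral of $1/x$. For part (b) there is no expression in terms of the standard functions, so the comparison uses the error function `math.erf` from the Python standard library, a special function defined through this very kind of integral.

```python
import math

def trapezium(f, a, b, n):
    # Trapezium rule with n equal strips on [a, b]
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))

def reciprocal(x):
    return 1 / x

def gaussian(x):
    return math.exp(-x * x)

n = 2000

# (a) Numerical estimate against the value from the standard integral
numeric_a = trapezium(reciprocal, 1.0, 4.0, n)
symbolic_a = math.log(4.0)

# (b) Numerical estimate against a special-function value:
#     integral from 0 to 1 of exp(-x^2) equals (sqrt(pi)/2) * erf(1)
numeric_b = trapezium(gaussian, 0.0, 1.0, n)
reference_b = math.sqrt(math.pi) / 2 * math.erf(1.0)

print(numeric_a, symbolic_a)
print(numeric_b, reference_b)
```

A computer algebra package asked for part (b) symbolically will typically answer in terms of such a special function, which is consistent with the remark in Subsection 6.1 that no simple combination of the standard functions differentiates to $\exp(-x^2)$.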

Activity 7.12
Evaluate the following.
(a) $\displaystyle \int_a^b \frac{1}{x}\,dx$, symbolically   (b) $\displaystyle \int_{-1}^{2} \frac{1}{x}\,dx$, numerically


Outcomes
After studying this unit you should be able to do the following.
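• Interpret the following notation: scientific notation, Z, R, C, $[a, b]$, $(a, b]$,
$[a, b)$, $(a, b)$, $|x|$, $f(x)$, $\exp(x)$, $e^x$, $\ln x$, $\sin x$, $\cos x$, $\tan x$, $\sec x$, $\operatorname{cosec} x$,
$\cot x$, $\arccos x$, $\arcsin x$, $\arctan x$, $i$ $(= \sqrt{-1})$, $\mathrm{Re}(z)$, $\mathrm{Im}(z)$, $\bar{z}$, $|z|$, $e^z$ (for
$z$ in C), $\langle r, \theta \rangle$, $\lim_{h \to 0} f(h)$, $f'(x)$, $f''(x)$, $f^{(n)}(x)$, $\frac{dy}{dx}$, $\frac{d^2y}{dx^2}$, $\frac{d^ny}{dx^n}$, $\dot{x}(t)$, $\ddot{x}(t)$,
$\int f(x)\,dx$, $\int_a^b f(x)\,dx$, $[F(x)]_a^b$.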
• Interpret the following terminology: integer, real number, interval, deci-
mal places, significant figures, rounding; variable, dependent variable, in-
dependent variable, parameter; function, domain, image set, codomain;
linear function, quadratic function, polynomial function (and root of a
polynomial equation); exponential function, logarithm function, power
function, trigonometric function; composite function; complex number,
complex conjugate, real and imaginary parts (of a complex number),
modulus and argument (of a complex number); polar coordinates, polar
form and exponential form (of a complex number), De Moivre’s Theo-
rem, Euler’s formula; differentiation, derivative, Leibniz notation, Chain
Rule (for differentiation), implicit differentiation, higher derivative; gra-
dient of a function, stationary point, local maximum, local minimum,
global maximum, global minimum, derivative of a complex-valued func-
tion; continuous function; integration, integral, integrand, indefinite in-
tegral, definite integral, arbitrary constant.
• Use the formulae for: the solution of a quadratic equation; the algebraic
properties of indices (and the exponential function); the algebraic
properties of logarithm functions; various trigonometric identities; multiplying
complex numbers in polar form; finding powers in polar form;
derivatives of standard functions; differentiating products, quotients and
composite functions; standard integrals; integration by parts and by substitution.
(The formulae are all given in the Handbook.)
• Solve two simultaneous linear equations by Gaussian elimination.
• Factorize quadratic functions.
• Sketch the graphs of linear, quadratic, exponential and trigonometric
functions.
• Sketch the graphs of more general functions, including identifying sta-
tionary points and asymptotes.
• Use log–linear plots to recognize relationships of the form $y = Ae^{kx}$.
• Use log–log plots to recognize relationships of the form $y = Ax^b$.
• Add, subtract, multiply and divide complex numbers, and move between
Cartesian, polar and exponential forms of a complex number.
• Find derived functions, using the table of standard derivatives and the
rules for differentiating various types of combination of functions.
• Find indefinite integrals, using the table of standard integrals and (in
simple cases) the rules for integration by substitution and for integration
by parts.
• Find definite integrals (of suitable functions).
• Use the computer to: simplify and expand algebraic expressions, graph
a function, solve an equation (numerically or symbolically), find deriva-
tives, find integrals.


Solutions to the exercises

Section 1

1.1 (a) (i) $6.482\,35 \times 10^4$ (Count the number of places the decimal point is moved to the left to find the power of 10.)
(ii) $7.3 \times 10^{-5}$
(b) (i) $y$ lies between $127.683 - 0.006 = 127.677$ and $127.683 + 0.006 = 127.689$; that is, in the interval $[127.677, 127.689]$.
(ii) We cannot be certain of the fifth significant figure (it could be either 7 or 8), so we can only give $y$ as 127.7 (to four significant figures).
(c) The condition $|x - 2.763| < 5 \times 10^{-4}$ is equivalent to $-5 \times 10^{-4} < x - 2.763 < 5 \times 10^{-4}$.
So $x < 2.763 + 5 \times 10^{-4} = 2.7635$ and $x > 2.763 - 5 \times 10^{-4} = 2.7625$.
So $2.7625 < x < 2.7635$; that is, $x$ lies in the interval $(2.7625, 2.7635)$.
Since 2.7635 does not lie in this interval, we can be certain that $x = 2.763$ to three decimal places.

Section 2

2.1 Because the reservoir is initially at 60% capacity, $V_0 = 0.6C$.
The volume $V$ will have reduced to 20% of $C$ when $0.2C = 0.6C - 10t$, i.e. when $0.4C = 10t$.
Thus crisis measures will be needed when $t = 0.04C$.
(Note that, because of the domain conditions, this solution is valid only if $0.04C \leq 72\,000$, i.e. for $C \leq 1\,800\,000$.)

2.2 (a) We need
\[ 2000 = 5(-3600) + c = -18\,000 + c. \]
Hence $c = 2000 + 18\,000 = 20\,000$.
(b) (i) The cutter catches the boat when $X = Y$, i.e. when
\[ 7t = 5t + 20\,000. \]
This gives $2t = 20\,000$, so $t = 10\,000$.
10 000 seconds is 2 hours, 46 minutes and 40 seconds. So the cutter catches the boat at around 2.50 am.
(ii) At $t = 10\,000$, both $X$ and $Y$ are equal to 70 000. So the cutter catches the boat 70 km from A, which is inside territorial waters.

2.3 Multiplying the first equation by $\frac32$ gives
\[ 3u - \tfrac{15}{2}v = \tfrac{57}{2}. \]
Subtracting this from the second equation gives
\[ \left(4 + \tfrac{15}{2}\right)v = -29 - \tfrac{57}{2}, \]
i.e. $\tfrac{23}{2}v = -\tfrac{115}{2}$, so $v = -\tfrac{115}{23} = -5$.
Substituting this into the first equation gives
\[ 2u - 5(-5) = 19, \]
so $u = (19 - 25)/2 = -3$.
Thus the solution is $u = -3$, $v = -5$.
(It is good practice to check solutions where you can, and this is easily done here. With $u = -3$ and $v = -5$, we have
\begin{align*}
2u - 5v &= 2(-3) - 5(-5) = -6 + 25 = 19, \\
3u + 4v &= 3(-3) + 4(-5) = -9 - 20 = -29,
\end{align*}
so these values of $u$ and $v$ do satisfy the given equations.)

2.4 (a) Using formula (2.8), we obtain
\begin{align*}
x &= \frac{-7 \pm \sqrt{7^2 - 4 \times 2 \times (-4)}}{2 \times 2} \\
&= \frac{-7 \pm \sqrt{49 + 32}}{4} \\
&= \frac{-7 \pm 9}{4} = \tfrac12 \text{ or } -4.
\end{align*}
(Again, these solutions can be checked by substitution into the equation. For example, with $x = -4$,
\[ 2x^2 + 7x - 4 = 32 - 28 - 4 = 0, \]
as required.)
(b) We obtain $x = -3$ or 2.

2.5
\begin{align*}
x &= \frac{-2k \pm \sqrt{4k^2 - 4m(mw^2)}}{2m} \\
&= \frac{-2k \pm 2\sqrt{k^2 - m^2w^2}}{2m} \\
&= -\frac{k}{m} \pm \frac{\sqrt{k^2 - m^2w^2}}{m} \\
&= -\frac{k}{m} \pm \frac{\sqrt{k^2 - m^2w^2}}{\sqrt{m^2}} \\
&= -\frac{k}{m} \pm \sqrt{\frac{k^2 - m^2w^2}{m^2}} \\
&= -\frac{k}{m} \pm \sqrt{\frac{k^2}{m^2} - w^2}.
\end{align*}
Putting $K = k/m$, this gives $x = -K \pm \sqrt{K^2 - w^2}$, as required.

2.6 (a) $a^3 a^5 = a^{3+5} = a^8$
(b) $a^3/a^5 = a^{3-5} = a^{-2}$ (or $1/a^2$)
(c) $(a^3)^5 = a^{3 \times 5} = a^{15}$
(d) $(2^{-1})^4 \times 4^3 = 2^{-4} \times (2^2)^3 = 2^{-4} \times 2^6 = 2^2 = 4$
(e) $8^{-1/3} = 1/8^{1/3} = 1/\sqrt[3]{8} = \tfrac12$
(f) $16^{3/4} = (16^{1/4})^3 = (\sqrt[4]{16})^3 = 2^3 = 8$
(g) $\left(\tfrac49\right)^{3/2} = \left(\sqrt{\tfrac49}\right)^3 = \left(\tfrac23\right)^3 = \tfrac{8}{27}$
(h) $(16x^4)^{1/2} = 16^{1/2}(x^4)^{1/2} = \sqrt{16}\,x^{4 \times 1/2} = 4x^2$


2.7 (a) $\ln 7 + \ln 4 - \ln 14 = \ln(7 \times 4 \div 14) = \ln 2$
(b) $\ln a + 2\ln b - \ln(a^2 b) = \ln a + \ln(b^2) - \ln(a^2 b) = \ln\bigl(a \times b^2 \div (a^2 b)\bigr) = \ln(b/a)$ (or $\ln b - \ln a$)
(c) $e^x \times (e^y)^2 \div e^{2x} = e^x \times e^{2y} \div e^{2x} = e^{x+2y-2x} = e^{2y-x}$
(d) $\ln(e^x \times e^y) = \ln(e^{x+y}) = x + y$
(Alternatively, $\ln(e^x \times e^y) = \ln(e^x) + \ln(e^y) = x + y$.)
To simplify the expression in part (d), we first rearranged it in the form $\ln(e^{\text{something}})$, which just equals something.
In parts (e)–(g), we first rearrange the expression as $e^{\ln(\text{something})}$, which also just equals something.
(e) $e^{2\ln x} = e^{\ln(x^2)} = x^2$
(f) $e^{-2\ln x} = e^{\ln(x^{-2})} = x^{-2}$ (or $1/x^2$)
(g) $\exp(2\ln x + \ln(x+1)) = \exp(\ln(x^2 \times (x+1))) = x^2(x+1)$
(Alternatively, $\exp(2\ln x + \ln(x+1)) = \exp(\ln(x^2)) \times \exp(\ln(x+1)) = x^2(x+1)$.)

2.8 Taking logs of $a^x = e^{kx}$ gives
\[ \ln(a^x) = \ln(e^{kx}), \quad \text{i.e.} \quad x\ln a = kx. \]
This holds for all $x$ so long as $k = \ln a$.

2.9 (a) Taking exponentials of each side of
\[ \ln y = 2.83\ln x + 0.37, \]
we obtain
\[ \exp(\ln y) = \exp(2.83\ln x + 0.37), \]
i.e. $y = \exp(\ln(x^{2.83})) \times \exp(0.37) = 1.4x^{2.83}$.
(b) If $\ln y = a\ln x + b$, then taking exponentials gives
\[ \exp(\ln y) = \exp(a\ln x + b), \]
i.e. $y = \exp(\ln(x^a)) \times e^b = e^b x^a = cx^a$, where $c = e^b$ is a positive constant.

2.10 (a) (i) $f(g(x)) = f(1 - x^3) = e^{-(1-x^3)} = e^{x^3-1}$
(ii) $g(f(x)) = g(e^{-x}) = 1 - (e^{-x})^3 = 1 - e^{-3x}$
(b) We can obtain $h(x)$ in two steps.
Step 1 Calculate $y = 4 + 9x^2$.
Step 2 Apply $y^{-4}$ to the result of Step 1.
Then $h(x) = g(f(x))$, where $g(x) = x^{-4}$ and $f(x) = 4 + 9x^2$. (We need to exclude $x = 0$ from the domain of $g$, but $f(x)$ is never equal to 0, so this is not a problem.)

Section 3

3.1 (a) The figure shows a circle of radius 1, and radii rotated through 0 (OA) and $\frac{\pi}{2}$ (OB). We see from the figure that A has coordinates $(1, 0) = (\cos 0, \sin 0)$, and B has coordinates $(0, 1) = (\cos\frac{\pi}{2}, \sin\frac{\pi}{2})$. Therefore
\[ \sin 0 = 0, \quad \cos 0 = 1, \quad \sin\tfrac{\pi}{2} = 1, \quad \cos\tfrac{\pi}{2} = 0. \]
[Figure: circle of radius 1 centred at O, with A = (1, 0) on the x-axis and B = (0, 1) on the y-axis; the angle between OA and OB is $\frac{\pi}{2}$]
(b) Since $\sin 0 = 0$, $\operatorname{cosec} 0$ and $\cot 0$ are not defined. We have
\[ \tan 0 = 0 \quad \text{and} \quad \sec 0 = 1. \]
Since $\cos\frac{\pi}{2} = 0$, $\tan\frac{\pi}{2}$ and $\sec\frac{\pi}{2}$ are not defined. We have
\[ \cot\tfrac{\pi}{2} = 0 \quad \text{and} \quad \operatorname{cosec}\tfrac{\pi}{2} = 1. \]
(c) We obtain:
$\sin\frac{\pi}{6} = \frac12$, $\cos\frac{\pi}{6} = \frac{\sqrt3}{2}$, $\tan\frac{\pi}{6} = \frac{1}{\sqrt3}$, $\operatorname{cosec}\frac{\pi}{6} = 2$, $\sec\frac{\pi}{6} = \frac{2}{\sqrt3}$, $\cot\frac{\pi}{6} = \sqrt3$;
$\sin\frac{\pi}{4} = \frac{1}{\sqrt2}$, $\cos\frac{\pi}{4} = \frac{1}{\sqrt2}$, $\tan\frac{\pi}{4} = 1$, $\operatorname{cosec}\frac{\pi}{4} = \sqrt2$, $\sec\frac{\pi}{4} = \sqrt2$, $\cot\frac{\pi}{4} = 1$;
$\sin\frac{\pi}{3} = \frac{\sqrt3}{2}$, $\cos\frac{\pi}{3} = \frac12$, $\tan\frac{\pi}{3} = \sqrt3$, $\operatorname{cosec}\frac{\pi}{3} = \frac{2}{\sqrt3}$, $\sec\frac{\pi}{3} = 2$, $\cot\frac{\pi}{3} = \frac{1}{\sqrt3}$.
(d) $\sin\theta = 0$ if $\theta = 0$, but also if $\theta$ differs from 0 by $\pm 2\pi$, $\pm 4\pi$, $\pm 6\pi$, and so on. Also, $\sin\theta = 0$ if $\theta = \pi$, or $\theta$ differs from $\pi$ by a multiple of $2\pi$. In general, $\sin\theta = 0$ if $\theta = n\pi$, where $n \in$ Z.

3.2 (a) One solution is $\arcsin 0.8 = 0.93$ (to two decimal places). Looking at the graph of $y = \sin\theta$ below, we see that it is symmetric about $\theta = \frac{\pi}{2}$. So there is also a solution of $\sin\theta = 0.8$ at $\theta = \pi - 0.93 = 2.21$ (to two decimal places). These are the only solutions with $\theta$ between 0 and $2\pi$. (If $\theta$ is between $\pi$ and $2\pi$, $\sin\theta$ is negative.) The other solutions are obtained by adding multiples of $2\pi$ to these two. The solutions in the required range are (to two decimal places):
0.93, 2.21, 7.21, 8.50, 13.49, 14.78.


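[Figure: graph of $y = \sin\theta$ for $-\pi \leq \theta \leq 2\pi$, with the line $y = 0.8$ meeting the curve at $\theta = 0.93$ and $\theta = \pi - 0.93$]
(b) We saw in Exercise 3.1(c) that $\tan\frac{\pi}{4} = 1$, so $\theta = \frac{\pi}{4}$ is one solution. We can see from the graph of tan (Figure 3.5) that there is one solution of this equation in the range $-\frac{\pi}{2}$ to $\frac{\pi}{2}$, and that other solutions are obtained by adding multiples of $\pi$ to this. We can express the full set of solutions as
\[ \tfrac{\pi}{4} + n\pi, \]
where $n \in$ Z.

3.3 (a) (i) $\sin 2\theta = \sin(\theta + \theta) = \sin\theta\cos\theta + \cos\theta\sin\theta = 2\sin\theta\cos\theta$
(ii) $\cos 2\theta = \cos(\theta + \theta) = \cos\theta\cos\theta - \sin\theta\sin\theta = \cos^2\theta - \sin^2\theta$
(b) (i) $\sin(2\pi - \theta) = \sin 2\pi\cos\theta - \cos 2\pi\sin\theta = 0 \times \cos\theta - 1 \times \sin\theta = -\sin\theta$
(ii) $\cos(2\pi - \theta) = \cos 2\pi\cos\theta + \sin 2\pi\sin\theta = 1 \times \cos\theta + 0 \times \sin\theta = \cos\theta$
(iii) $\sin(\pi - \theta) = \sin\pi\cos\theta - \cos\pi\sin\theta = 0 \times \cos\theta - (-1) \times \sin\theta = \sin\theta$
(iv) $\cos(\pi - \theta) = \cos\pi\cos\theta + \sin\pi\sin\theta = (-1) \times \cos\theta + 0 \times \sin\theta = -\cos\theta$
(v) $\sin(\frac{\pi}{2} - \theta) = \sin\frac{\pi}{2}\cos\theta - \cos\frac{\pi}{2}\sin\theta = 1 \times \cos\theta - 0 \times \sin\theta = \cos\theta$
(vi) $\cos(\frac{\pi}{2} - \theta) = \cos\frac{\pi}{2}\cos\theta + \sin\frac{\pi}{2}\sin\theta = 0 \times \cos\theta + 1 \times \sin\theta = \sin\theta$
For $0 < \theta < \frac{\pi}{2}$, the results of parts (v) and (vi) can be confirmed by examination of a right-angled triangle.
(vii) $\cos\left(\frac{3\pi}{2} + x\right) = \cos\frac{3\pi}{2}\cos x - \sin\frac{3\pi}{2}\sin x = 0 \times \cos x - (-1) \times \sin x = \sin x$

Section 4

4.1 (a) $\bar{v} = 3 + 4i$
(b) $|v| = \sqrt{3^2 + 4^2} = 5$
(c) $v - w = 1 - 3i$
(d) $vw = (3 - 4i)(2 - i) = 6 - 8i - 3i + 4i^2 = 2 - 11i$
(e) $\dfrac{w}{v} = \dfrac{w\bar{v}}{|v|^2} = \dfrac{(2-i)(3+4i)}{25} = \tfrac{10}{25} + \tfrac{5}{25}i = \tfrac25 + \tfrac15 i$
(f) $\dfrac{1}{w} = \dfrac{\bar{w}}{|w|^2} = \dfrac{2+i}{5} = \tfrac25 + \tfrac15 i$
(g) $w^2 = (2-i)(2-i) = 3 - 4i$
(h) $2w - 3v = 4 - 2i - (9 - 12i) = -5 + 10i$

4.2 We obtain
\[ x = \frac{-2 \pm \sqrt{4 - 8}}{4} = -\tfrac12 \pm \tfrac12 i. \]

4.3
[Figure: the points $(-2, 0)$, $(1, 1)$, $(-1, -1)$, $(4, 0)$, $(0, 4)$ and $(-\sqrt3, 1)$ plotted in the $(x, y)$-plane, with the angles $\frac{\pi}{4}$ and $\varphi$ marked]
(a) $\langle 2, \pi \rangle$
(b) $\langle \sqrt2, \frac{\pi}{4} \rangle$
(c) $\langle \sqrt2, -\frac{3\pi}{4} \rangle$
(d) $\langle 4, 0 \rangle$
(e) $\langle 4, \frac{\pi}{2} \rangle$
(f) This is $\langle 2, \pi - \varphi \rangle$, where $\varphi$ is as shown in the figure. Now $\tan\varphi = \frac{1}{\sqrt3}$, so $\varphi = \frac{\pi}{6}$ (see Solution 3.1(c)). So $(-\sqrt3, 1)$ has polar coordinates $\langle 2, \frac{5\pi}{6} \rangle$.

4.4 $z = 2\left(\cos\left(-\frac{\pi}{4}\right) + i\sin\left(-\frac{\pi}{4}\right)\right) = 2\left(\frac{1}{\sqrt2} - \frac{1}{\sqrt2}i\right) = \sqrt2(1 - i)$

4.5 This solution is identical to the solution to Exercise 4.3.

4.6 First express $1 - i$ in polar form, as $\langle \sqrt2, -\frac{\pi}{4} \rangle$. Then
\[ \langle \sqrt2, -\tfrac{\pi}{4} \rangle^{20} = \langle (\sqrt2)^{20}, -\tfrac{20\pi}{4} \rangle = \langle 2^{10}, -5\pi \rangle = \langle 1024, \pi \rangle, \]
adding $6\pi$ to the argument to obtain its principal value. Returning to Cartesian form, we obtain
\[ (1 - i)^{20} = 1024(\cos\pi + i\sin\pi) = -1024. \]

4.7 $z = re^{i\theta}$, so
\begin{align*}
\mathrm{Re}(ze^{i\omega t}) &= \mathrm{Re}(re^{i\theta}e^{i\omega t}) \\
&= \mathrm{Re}(re^{i(\omega t + \theta)}) \\
&= r\cos(\omega t + \theta).
\end{align*}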


Section 5

5.1 (a) $\dot{x}(t) = -21\sin(3t + 2)$, $\ddot{x}(t) = -63\cos(3t + 2)$.
(b) Note that $7\cos(3t + 2) = x(t) - 5$, so
\[ -63\cos(3t + 2) = -9(x(t) - 5). \]
Hence we have $\ddot{x}(t) = -9(x(t) - 5)$, or equivalently
\[ \ddot{x}(t) + 9x(t) = 45. \]

5.2 The rate at which the wage bill will be rising is given by
\[ \frac{dB}{dt} = 10^5(0.04)\exp(0.04t) = 4000e^{0.04t}. \]
As a fraction of the future wage bill, $B$, the rate of rise $dB/dt$ is
\[ \frac{1}{B}\frac{dB}{dt} = \frac{10^5(0.04)\exp(0.04t)}{10^5\exp(0.04t)} = 0.04. \]
So the rate of rise is 4% per year.

5.3 (a) $\dfrac{dy}{dx} = (-0.9)(-0.5)\exp(-0.5x) = 0.45\exp(-0.5x)$.
(b) $F'(x) = 12x^3 - 4$, so putting $x = 2$ gives $F'(2) = 12 \times 2^3 - 4 = 92$.
(c) $\dfrac{dy}{dt} = \dfrac{1}{t}$, and then $\dfrac{d^2y}{dt^2} = -\dfrac{1}{t^2}$.
(d) $F'(x) = 6\sec(2x)\tan(2x) - 12\sin(-3x)$. Hence
\[ F'\left(\tfrac{\pi}{6}\right) = 6\sec\tfrac{\pi}{3}\tan\tfrac{\pi}{3} - 12\sin\left(-\tfrac{\pi}{2}\right) = 12\sqrt3 + 12. \]
(e) $g'(t) = -3a\sin(3t + \varphi) + 3b\cos(3t + \varphi)$, and then $g''(t) = -9a\cos(3t + \varphi) - 9b\sin(3t + \varphi)$, so
\[ g''(0) = -9a\cos\varphi - 9b\sin\varphi. \]

5.4 (a) $v'(z) = 3\sec^2 z - 2\sin z$.
(b) $\dfrac{dy}{dt} = 3A\cos 3t - 3B\sin 3t$. The value of this at $t = \frac{\pi}{12}$ is
\[ 3A\cos\tfrac{\pi}{4} - 3B\sin\tfrac{\pi}{4} = 3A/\sqrt2 - 3B/\sqrt2 = \tfrac{3}{\sqrt2}(A - B). \]
(c) We need to find the fourth derivative:
\[ f'(t) = 6\cos 3t, \quad f''(t) = -18\sin 3t, \quad f^{(3)}(t) = -54\cos 3t, \quad f^{(4)}(t) = 162\sin 3t. \]
Since $\sin\frac{3\pi}{2} = -1$, we obtain $f^{(4)}(\frac{\pi}{2}) = -162$.
(d) $f'(y) = \dfrac{3}{1 + 9y^2}$.
(e) $\dfrac{dz}{dx} = \dfrac{c}{cx + d}$, so $\dfrac{dz}{dx} = \dfrac{c}{d}$ when $x = 0$.

5.5 (a) This is a quotient. We obtain
\[ \frac{dy}{dx} = \frac{\left(\frac1x\right)(x^2 + 1) - (\ln x)(2x)}{(x^2 + 1)^2} = \frac{x + x^{-1} - 2x\ln x}{(x^2 + 1)^2}. \]
(b) $f'(t) = 5t^4\ln(3t + 4) + \dfrac{3t^5}{3t + 4}$.
(c) $g'(t) = A\sin(At + C) + (At + B)A\cos(At + C)$, so $g'(0) = A\sin C + AB\cos C$.
(d) For $x(t) = e^{-3t}\sin 4t$, we want $\dot{x}(t)$ and $\ddot{x}(t)$. Using the Product Rule:
\begin{align*}
\dot{x}(t) &= -3e^{-3t}\sin 4t + e^{-3t}(4\cos 4t) \\
&= e^{-3t}(4\cos 4t - 3\sin 4t).
\end{align*}
Using the Product Rule again:
\begin{align*}
\ddot{x}(t) &= -3e^{-3t}(4\cos 4t - 3\sin 4t) + e^{-3t}(-16\sin 4t - 12\cos 4t) \\
&= e^{-3t}(-7\sin 4t - 24\cos 4t).
\end{align*}

5.6 In each case, we suggest an intermediate variable, $u$, but (except in part (a)) omit details.
(a) Setting $u = t^2$, we obtain $y = e^u$, $dy/du = e^u$ and $du/dt = 2t$, so
\[ \frac{dy}{dt} = \frac{dy}{du}\frac{du}{dt} = e^u\,2t = 2t\exp(t^2). \]
(b) Setting $u = 3x^3 + 4$, we obtain
\[ f'(x) = 6(3x^3 + 4)^5(9x^2) = 54x^2(3x^3 + 4)^5. \]
(c) Setting $u = 3v + 4$, we obtain
\[ \frac{dz}{dv} = 3\sec^2(3v + 4). \]
(d) Setting $u = 4 - z^2$, we obtain
\[ g'(z) = \tfrac12(4 - z^2)^{-1/2}(-2z) = \frac{-z}{\sqrt{4 - z^2}}. \]
(e) Setting $u = 1 + 2x^2$, we have
\[ f(x) = \left(1 + 2x^2\right)^{-3/2} = u^{-3/2} \]
(see Example 2.3), so
\[ f'(x) = -\tfrac32\left(1 + 2x^2\right)^{-5/2}(4x) = \frac{-6x}{\left(\sqrt{1 + 2x^2}\right)^5}. \]

5.7 (a) This is a composite function, with
\[ y = \sec u, \quad u = \frac{x}{x^2 + 1}. \]
Here $u$ is a quotient, and
\[ \frac{du}{dx} = \frac{1(x^2 + 1) - x(2x)}{(x^2 + 1)^2} = \frac{1 - x^2}{(x^2 + 1)^2}. \]
Then, using the Chain Rule,
\begin{align*}
\frac{dy}{dx} &= \frac{dy}{du}\frac{du}{dx} \\
&= \sec u\tan u\,\frac{1 - x^2}{(x^2 + 1)^2} \\
&= \frac{1 - x^2}{(x^2 + 1)^2}\sec\left(\frac{x}{x^2 + 1}\right)\tan\left(\frac{x}{x^2 + 1}\right).
\end{align*}


(b) This is a product, but the second part of the product is a composite function. If $v = \exp(t^3 + 1)$, then
\[ \frac{dv}{dt} = 3t^2\exp(t^3 + 1) \]
(using the intermediate function $u = t^3 + 1$). Then, using the Product Rule,
\begin{align*}
\frac{dz}{dt} &= 2t\exp(t^3 + 1) + t^2\left(3t^2\exp(t^3 + 1)\right) \\
&= (3t^4 + 2t)\exp(t^3 + 1).
\end{align*}

5.8 (a) (i) The Product Rule gives
\[ \frac{d}{dx}(x^2 y) = 2xy + x^2\frac{dy}{dx}. \]
(ii) The Composite Rule gives
\[ \frac{d}{dx}(y^3) = 3y^2\frac{dy}{dx}. \]
(b) Using implicit differentiation, we obtain
\[ 3x^2 + 2xy + x^2\frac{dy}{dx} + 3y^2\frac{dy}{dx} = 0. \]
When $x = -1$ and $y = 1$, this gives
\[ 3 - 2 + \frac{dy}{dx} + 3\frac{dy}{dx} = 0. \]
Hence $dy/dx = -\frac14$, and the required gradient is $-\frac14$.

5.9 We have $f'(t) = -2\sin 2t + 2i\cos 2t$.
Then $f''(t) = -4\cos 2t - 4i\sin 2t$.

5.10 To test for stationary points, use the Product Rule to find
\[ y'(x) = -2e^{-\frac12 x} + \tfrac12\bigl(2(x + 1)\bigr)e^{-\frac12 x} = (x - 1)e^{-\frac12 x}. \]
So $y'(x) = 0$ only at $x = 1$. The derivative is negative if $x < 1$ and positive if $x > 1$, so this is a local minimum. We have $y(1) = 2.574$ (to three decimal places).

5.11 (a) $f(v) = 0$ only when $v = 0$.
The denominator $4 + 1.5v + 0.008v^2$ is positive for all $v \geq 0$, so $f(v)$ is defined for all $v \geq 0$.
As $v \to \infty$, $f(v) \to 0$.
To find any stationary points, differentiate $f(v)$ using the Quotient Rule, to obtain
\begin{align*}
f'(v) &= \frac{(4 + 1.5v + 0.008v^2) - v(1.5 + 0.016v)}{(4 + 1.5v + 0.008v^2)^2} \\
&= \frac{4 - 0.008v^2}{(4 + 1.5v + 0.008v^2)^2}.
\end{align*}
This is 0 when $v = \pm\sqrt{500} = \pm 22.36$ (to two decimal places).
The negative stationary point is outside the domain ($v \geq 0$), so we need consider only $v = 22.36$. For $v < 22.36$, $f'(v) > 0$, while for $v > 22.36$, $f'(v) < 0$. Therefore $v = 22.36$ is a local maximum. We have $f(22.36) = 0.538$ (to three decimal places).
A sketch graph of $f$ is shown below.
[Figure: sketch graph of $f(v)$ against $v$ for $v \geq 0$, with the maximum value 0.538 marked at $v = 22.36$]
(b) From the graph we see that the global maximum of $f$ occurs at the local maximum, i.e. $v = \sqrt{500}$. The global minimum occurs at the endpoint of the domain, i.e. $v = 0$.

Section 6

In all solutions for this section, $c$ is an arbitrary constant.

6.1 (a) Using the Product Rule for derivatives,
\begin{align*}
\frac{d}{dx}\left(-\frac{x}{a}\cos ax + \frac{1}{a^2}\sin ax + c\right)
&= -\frac{1}{a}\cos ax + \left(-\frac{x}{a}\right)(-a\sin ax) + \frac{a}{a^2}\cos ax \\
&= x\sin ax.
\end{align*}
Therefore
\[ \int x\sin ax\,dx = -\frac{x}{a}\cos ax + \frac{1}{a^2}\sin ax + c, \]
so verifying the given integral.
(b) Using the Composite Rule for derivatives,
\begin{align*}
\frac{d}{dx}\left(-\frac{1}{a}\ln(\cos ax) + c\right)
&= -\frac{1}{a}\,\frac{1}{\cos ax}\,\frac{d}{dx}(\cos ax) \\
&= -\frac{1}{a}\,\frac{-a\sin ax}{\cos ax} \\
&= \tan ax,
\end{align*}
so verifying the given integral.

6.2 (a) If $x = \tan y$, then, differentiating with respect to $x$, we obtain
\[ 1 = \frac{d}{dx}(\tan y) = \frac{d}{dy}(\tan y)\frac{dy}{dx} = \sec^2 y\,\frac{dy}{dx}. \]
Then, using the formula $\sec^2 y = 1 + \tan^2 y$, and the fact that $x = \tan y$, we obtain
\[ 1 = (1 + \tan^2 y)\frac{dy}{dx} = (1 + x^2)\frac{dy}{dx}. \]
Therefore
\[ \frac{dy}{dx} = \frac{1}{1 + x^2}. \]
Now, if $x = \tan y$, then $y = \arctan x$, so
\[ \int \frac{1}{1 + x^2}\,dx = \arctan x + c. \]


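(b) If $x = \sin y$, then, differentiating with respect to $x$,
\[ 1 = \cos y\,\frac{dy}{dx}. \]
Using $\sin^2 y + \cos^2 y = 1$, and the fact that $\cos y > 0$ because $-\frac{\pi}{2} < y < \frac{\pi}{2}$, we have
\begin{align*}
\cos y\,\frac{dy}{dx} &= \sqrt{1 - \sin^2 y}\,\frac{dy}{dx} \\
&= \sqrt{1 - x^2}\,\frac{dy}{dx} \quad (\text{since } \sin y = x).
\end{align*}
Therefore
\[ \frac{dy}{dx} = \frac{1}{\sqrt{1 - x^2}}. \]
Now, if $x = \sin y$, then $y = \arcsin x$, so
\[ \int \frac{1}{\sqrt{1 - x^2}}\,dx = \arcsin x + c. \]
This corresponds with the Handbook entry for $\displaystyle \int \frac{1}{\sqrt{a^2 - x^2}}\,dx$, with $a = 1$.

6.3 We have
\begin{align*}
\int \cos^2 ax\,dx &= \int \tfrac12(1 + \cos 2ax)\,dx \\
&= \tfrac12\left(x + \frac{1}{2a}\sin 2ax\right) + c \\
&= \tfrac12 x + \frac{1}{4a}\sin 2ax + c.
\end{align*}

6.4 (a) $\tfrac15 e^{5x} + c$
(b) $2\tan 3t + c$
(c) This does not quite match a Handbook entry as it stands. We have $36 + 4v^2 = 4(9 + v^2)$, so
\begin{align*}
\int \frac{1}{36 + 4v^2}\,dv &= \int \frac{1}{4(9 + v^2)}\,dv \\
&= \tfrac14\int \frac{1}{9 + v^2}\,dv \\
&= \tfrac14\left(\tfrac13\arctan\frac{v}{3}\right) + c \\
&= \tfrac{1}{12}\arctan\frac{v}{3} + c.
\end{align*}
(d) For $y < \frac32$, the integrand is positive, and we have
\[ \int \frac{1}{3 - 2y}\,dy = -\tfrac12\ln(3 - 2y) + c. \]
(e) For $y > \frac32$, the integrand is negative, and we have
\begin{align*}
\int \frac{1}{3 - 2y}\,dy &= -\tfrac12\ln(-3 + 2y) + c \\
&= -\tfrac12\ln(2y - 3) + c.
\end{align*}

6.5 (a) $-3\sin(-2t) - 2\cos 4t + c$. Note that you can use $\cos\theta = \cos(-\theta)$ to simplify the integrand (or use $\sin\theta = -\sin(-\theta)$ to simplify the result) to obtain $3\sin 2t - 2\cos 4t + c$.
(b) $\arcsin(t/3) + c$
(c) $\displaystyle \int \frac{5t^3 + 7}{t}\,dt = \int \left(5t^2 + \frac{7}{t}\right) dt = \tfrac53 t^3 + 7\ln(-t) + c \quad (t < 0)$.
(d) For $t > 0$,
\[ \int \left(2\ln(4t) - \frac{2}{t}\right) dt = 2t(\ln(4t) - 1) - 2\ln t + c. \]
(e) Choose $a = 1$, $b = -1$ in the standard integral, and note that we are in the case $-1 < x < 1$:
\[ \int \frac{1}{(x - 1)(x + 1)}\,dx = \tfrac12\ln\left(\frac{1 - x}{x + 1}\right) + c. \]

6.6 (a) Using $v^2 - k^2 = (v - k)(v + k)$, and taking $a = k$ and $b = -k$ in the Handbook entry for $\displaystyle \int \frac{1}{(x - a)(x - b)}\,dx$, we obtain
\begin{align*}
\int \frac{1}{v^2 - k^2}\,dv &= \int \frac{1}{(v - k)(v + k)}\,dv \\
&= \frac{1}{2k}\ln\left(\frac{k - v}{v + k}\right) + c \quad (-k < v < k).
\end{align*}
(b) Since $a - bv^2 = -b(v^2 - a/b)$, we can use part (a) with $k = \sqrt{a/b}$ to obtain (for $-\sqrt{a/b} < v < \sqrt{a/b}$)
\begin{align*}
\int \frac{1}{a - bv^2}\,dv &= \frac{1}{-b}\int \frac{1}{v^2 - a/b}\,dv \\
&= \frac{1}{-b}\left(\frac{1}{2\sqrt{a/b}}\right)\ln\left(\frac{\sqrt{a/b} - v}{v + \sqrt{a/b}}\right) + c \\
&= -\frac{1}{2\sqrt{ab}}\ln\left(\frac{\sqrt{a} - v\sqrt{b}}{v\sqrt{b} + \sqrt{a}}\right) + c.
\end{align*}

6.7 (a) If $u = 2 + 4y^3$, then $du/dy = 12y^2$. So
\begin{align*}
\int y^2\exp(2 + 4y^3)\,dy &= \tfrac{1}{12}\int 12y^2\exp(2 + 4y^3)\,dy \\
&= \tfrac{1}{12}\int \exp u\,du \\
&= \tfrac{1}{12}\exp u + c \\
&= \tfrac{1}{12}\exp(2 + 4y^3) + c.
\end{align*}
(b) If $u = \sin y$, then $du/dy = \cos y$. So
\[ \int \cos y\sin^2 y\,dy = \int u^2\,du = \tfrac13 u^3 + c = \tfrac13\sin^3 y + c. \]
(c) If $u = 1 - t^2$, then $du/dt = -2t$. So
\begin{align*}
\int t\sqrt{1 - t^2}\,dt &= -\tfrac12\int -2t\sqrt{1 - t^2}\,dt \\
&= -\tfrac12\int u^{1/2}\,du \\
&= -\tfrac12\left(\tfrac23 u^{3/2}\right) + c \\
&= -\tfrac13\left(\sqrt{1 - t^2}\right)^3 + c.
\end{align*}
(d) If $u = 1 + x^2$, then $du/dx = 2x$. So
\[ \int \frac{x}{1 + x^2}\,dx = \tfrac12\int \frac{2x}{1 + x^2}\,dx = \tfrac12\ln(1 + x^2) + c, \]
since the integrand is of the form $g'(x)/g(x)$ with $g(x) > 0$.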

Solutions to the exercises

(e) If $u = 1 + \sin^2 t$, then $du/dt = 2\sin t\cos t = \sin 2t$, using the trigonometric formula for $\sin 2t$. So the integrand is of the form $g'(t)/g(t)$ with $g(t) > 0$, and hence
\[ \int \frac{\sin 2t}{1 + \sin^2 t}\,dt = \ln(1 + \sin^2 t) + c. \]
(f) Using Equation (6.5) with $g(y) = 1 - y^2$, so that $g'(y) = -2y$, we have (for $y \ne \pm 1$)
\[ \int \frac{y}{1 - y^2}\,dy = -\tfrac12\int \frac{-2y}{1 - y^2}\,dy = -\tfrac12\ln|1 - y^2| + c. \]

6.8 (a) Take $f(x) = x$ and $g'(x) = e^{-x}$, so $f'(x) = 1$ and $g(x) = -e^{-x}$. Then
\[ \int xe^{-x}\,dx = x(-e^{-x}) - \int (-e^{-x})\,dx = -xe^{-x} + \int e^{-x}\,dx = -xe^{-x} - e^{-x} + c = -(x + 1)e^{-x} + c. \]
(b) Take $f(x) = x^2$ and $g'(x) = e^{-x}$, so $f'(x) = 2x$ and $g(x) = -e^{-x}$. Then
\[ \int x^2 e^{-x}\,dx = x^2(-e^{-x}) - \int 2x(-e^{-x})\,dx = -x^2 e^{-x} + 2\int xe^{-x}\,dx \]
\[ = -x^2 e^{-x} + 2\left(-(x + 1)e^{-x} + b\right) \quad (\text{by part (a), where } b \text{ is an arbitrary constant}) \]
\[ = -(x^2 + 2x + 2)e^{-x} + c \quad (\text{where } c = 2b \text{ is an arbitrary constant}). \]

6.9
\[ \int_0^{3/2} \frac{1}{9 + 4z^2}\,dz = \tfrac14\int_0^{3/2} \frac{1}{\frac94 + z^2}\,dz = \tfrac14\left[\frac{1}{3/2}\arctan\frac{z}{3/2}\right]_0^{3/2} = \tfrac16(\arctan 1 - \arctan 0) = \tfrac16\left(\tfrac{\pi}{4} - 0\right) = \frac{\pi}{24}. \]

6.10 (a)
\[ \int_0^{500} 2\pi r(0.1 + 0.002r)\,dr = \pi\left[0.1r^2 + 0.004\,\frac{r^3}{3}\right]_0^{500} = \pi\left(0.1 \times 500^2 + 0.004 \times \frac{500^3}{3}\right) = 6.0 \times 10^5 \]
(to two significant figures).
(b) Since $\exp(a - 2y) = e^a e^{-2y}$, we have
\[ \int \exp(a - 2y)\,dy = e^a\int e^{-2y}\,dy = -\tfrac12 e^a e^{-2y} + c = -\tfrac12\exp(a - 2y) + c. \]
(The same result can be obtained using integration by substitution with $u = a - 2y$.)
(c) For $0 \le u \le 1$, we have $3u + 5 > 0$, so
\[ \int_0^1 \frac{1}{3u + 5}\,du = \left[\tfrac13\ln(3u + 5)\right]_0^1 = \tfrac13(\ln 8 - \ln 5) = \tfrac13\ln\tfrac85. \]
(d) For $-3 \le u \le -2$, we have $3u + 5 < 0$, so
\[ \int_{-3}^{-2} \frac{1}{3u + 5}\,du = \left[\tfrac13\ln(-3u - 5)\right]_{-3}^{-2} = \tfrac13(\ln 1 - \ln 4) = -\tfrac13\ln 4. \]
(e) Substitute $u = 3t + a$, so $du/dt = 3$. Then
\[ \int \ln(3t + a)\,dt = \tfrac13\int 3\ln(3t + a)\,dt = \tfrac13\int \ln u\,du = \tfrac13 u(\ln|u| - 1) + c = \tfrac13(3t + a)(\ln|3t + a| - 1) + c. \]
(f) Use integration by parts with $f(t) = t$ and $g'(t) = \cos(3t + a)$, so $f'(t) = 1$ and $g(t) = \tfrac13\sin(3t + a)$. Then
\[ \int t\cos(3t + a)\,dt = t\left(\tfrac13\sin(3t + a)\right) - \int \tfrac13\sin(3t + a)\,dt = \tfrac13 t\sin(3t + a) + \tfrac19\cos(3t + a) + c. \]
(g) We have
\[ \frac{d}{dx}(3 + \cos 2x) = -2\sin 2x \]
and $3 + \cos 2x > 0$ for all x, so
\[ \int_0^{\pi/4} \frac{\sin 2x}{3 + \cos 2x}\,dx = -\tfrac12\int_0^{\pi/4} \frac{-2\sin 2x}{3 + \cos 2x}\,dx = -\tfrac12\left[\ln(3 + \cos 2x)\right]_0^{\pi/4} = -\tfrac12(\ln 3 - \ln 4) = -\tfrac12\ln\tfrac34 = \tfrac12\ln\tfrac43. \]
UNIT 2 First-order differential equations

Study guide for Unit 2

This unit introduces the topic of differential equations. It is an important
field of study, and several subsequent units are also devoted to it. There are
many applications of differential equations throughout the course.
The subject is developed without assuming that you have come across it
before, but the unit assumes that you have previously had a basic grounding
in calculus. In particular, you will need to have a good grasp of the basic
rules for differentiation and integration. (These were revised in Unit 1 of
this course.)
From the point of view of later studies, Sections 3 and 4 contain the most
important material.
The recommended study pattern is to study one section per study session,
and to study the sections in the order in which they appear.
You will need the computer algebra package for the course for Subsection 2.3
and for all of Section 5. The computer work for Subsection 2.3 may be
postponed until later (for example until your study of Section 5) without
affecting your ability to study the subsequent sections.


Introduction
An important class of the equations that arise in mathematics consists of
those that feature the rates of change of one or more variables with respect to
one or more others. These rates of change are expressed mathematically by
derivatives, and the corresponding equations are called differential equations.
Equations of this type crop up in a wide variety of situations. They are
found, for example, in models of physical, electronic, economic, demographic
and biological phenomena.
First-order differential equations, which are the particular topic of this unit,
feature derivatives of order one only; that is, if the rate of change of variable
y with respect to variable x is involved, the equations feature dy/dx but not
d2 y/dx2 , d3 y/dx3 , etc.
When a differential equation arises, it is usually an important aim to solve
the equation. For an equation that features the derivative dy/dx, this entails
expressing the dependent variable y directly in terms of the independent
variable x. The solution process requires the effect of the derivative to be
‘undone’. The reversal of differentiation is achieved by integration, so it is
to be expected that integration will feature significantly in the methods of
solution for differential equations.
Integration can be attempted either symbolically, to obtain an exact formula
for the integral, or numerically, to give approximate numerical values from
which the integral can be tabulated or graphed. The same two approaches
can therefore be tried to obtain solutions of differential equations, and both
are introduced in this unit.
Section 1 considers in detail one example of how a differential equation
arises in a mathematical model. This is followed by some basic definitions
and terminology associated with differential equations and their solutions.
We also note how errors and accuracy are defined.
Section 2 starts by looking at the direction field associated with a first-order
differential equation. This is a device for visualizing the overall behaviour of
the differential equation and of its solutions, and leads to a basic numerical
method of solution known as Euler’s method. Both direction fields and
Euler’s method are implemented in a computer subsection.
Section 3 turns to analytic (that is, symbolic) methods of solution, consid-
ering first direct integration and then separation of variables.
Section 4 describes a further analytic approach to solving differential equa-
tions, called the integrating factor method. It applies only to equations that
are linear. Linear first-order differential equations are important in their
own right, but also give valuable clues on how to solve linear second-order
differential equations, which are the subject of Unit 3.
In Section 5 you will see how each of the analytic methods from Sections 3
and 4 can be implemented on your computer.


1 Some basics
Subsection 1.1 develops a mathematical model for a specific situation which
leads naturally to a first-order differential equation. A key step in deriving
this equation is to apply the input–output principle, which is a useful device
for building relations between variables.
Subsection 1.2 addresses what is meant by the term ‘solution’ in the context
of first-order differential equations, and brings out the distinction between
the general solution and the various possible particular solutions. The spec-
ification of a constraint, or initial condition, usually permits us to find a
unique function that is a particular solution of the differential equation and
also satisfies the constraint.
The short Subsection 1.3 provides the definition and description of numerical
errors, in anticipation of Euler’s method in Section 2.

1.1 Why differential equations?


In the course you will meet many examples of differential equations. Fre-
quently these arise from studying the motion of physical objects, but we
shall start with an example drawn from biology and show how this leads
naturally to a particular differential equation.
Suppose that we are interested in the size of a particular population, and in
how it varies over time. The first point to make is that any population size is
measured in integers (whole numbers), so it is not clear how differentiation
will be relevant. (Differentiable functions must be continuous, and therefore
defined on an interval of real numbers in R.) Nevertheless, if the population
is large, say in the hundreds of thousands, a change of one unit will be
relatively very small, and in these circumstances we may choose to model
the population size as a continuous function of time. We shall write this
function as P (t), and our task is to show how P (t) may be described by a
differential equation.
Let us assume a fixed starting time (which we shall label t = 0). If the
population is not constant, then there will be ‘leavers’ and ‘joiners’. For
example, in a population of humans in a particular country, the former
will be those who die or emigrate, whilst the latter represent births and
immigrants.
It is usual to express birth rates as a proportion of the current population
size. For example, the UK Office for National Statistics quotes birth rates in
various age groups as a number per 1000 women. Death rates are specified
in a similar way. To emphasize that these rates are expressed as a proportion
of the current population, we shall use the terms ‘proportionate birth rate’
and ‘proportionate death rate’. (Note that in our model the proportionate
birth rate is expressed as a proportion of the whole population, not just the
number of women.)
For our simple model we shall ignore immigration and emigration, and
concentrate solely on births and deaths. Denote the proportionate birth rate
by b and the proportionate death rate by c. Then, in a short interval of
time δt, we would expect
\[ \text{number of births} \simeq bP(t)\,\delta t, \tag{1.1} \]
\[ \text{number of deaths} \simeq cP(t)\,\delta t, \tag{1.2} \]
where P(t) is the population size at time t.


At this stage, we seek some relationship between the chosen variables. In
order to find this, we make use of the input–output principle, which can
be expressed as
\[ \text{accumulation} = \text{input} - \text{output}. \]

This principle applies to any quantity whose change, over a given time in-
terval, is due solely to the specified input and output.
The accumulation δP of population over the time interval δt is the popu-
lation at the end of the interval minus the population at the start of the
interval; that is,
δP = P (t + δt) − P (t).
The input is the number of births (Equation (1.1)), and the output is the
number of deaths (Equation (1.2)). The input–output principle now enables
us to express the accumulation δP of the population over the time interval
δt as
\[ \delta P \simeq bP(t)\,\delta t - cP(t)\,\delta t = (b - c)P(t)\,\delta t. \]
Dividing through by δt, we obtain
\[ \frac{\delta P}{\delta t} \simeq (b - c)P(t). \]
The approximations involved in deriving this equation become progressively
more accurate for shorter time intervals. So, finally, by letting δt tend to
zero, we obtain
\[ \frac{dP}{dt} = (b - c)P(t). \]
(This follows because
\[ \frac{dP}{dt} = \lim_{\delta t \to 0} \frac{P(t + \delta t) - P(t)}{\delta t} \]
is the definition of the derivative of P. This is the step that requires P to be
a continuous (rather than discrete) function of t.)
This is a differential equation because it describes dP /dt rather than the
eventual object of our interest (which is P itself). The purpose of this unit
is to enable you to solve a wide variety of such equations.
Of course, we can simplify the above equation slightly by using the
proportionate growth rate r, which is the difference between the proportionate
birth and death rates: r = b − c. Then our model becomes
\[ \frac{dP}{dt} = rP. \]
For very simple population models, r is taken to be a constant. As we shall
see, this leads to a prediction of exponential growth (or, if r < 0, decay)
in population size with time, as illustrated in Figure 1.1. This may be a
very good approximation for certain populations, but it cannot be sustained
indefinitely if r > 0.
[Figure 1.1: sketch of P against t for r > 0]
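
As a rough numerical illustration of the model above, the input–output principle can be applied repeatedly over short intervals δt. This is only a sketch: the function names and the values chosen for b, c, the starting population and the time span are hypothetical, not taken from the text. As δt shrinks, the stepped population approaches $P_0 e^{rt}$ with r = b − c, which is the exponential growth mentioned above.

```python
import math


def step_population(p0, b, c, t_end, n_steps):
    """Apply accumulation = input - output over n_steps short intervals."""
    dt = t_end / n_steps
    p = p0
    for _ in range(n_steps):
        births = b * p * dt  # input over the interval, as in (1.1)
        deaths = c * p * dt  # output over the interval, as in (1.2)
        p += births - deaths
    return p


def exponential(p0, r, t):
    """Exponential growth P0 * exp(r t), used here for comparison."""
    return p0 * math.exp(r * t)


# Hypothetical illustrative values (not from the text).
p0, b, c, t_end = 1000.0, 0.03, 0.01, 10.0
for n in (10, 100, 1000, 10000):
    print(n, step_population(p0, b, c, t_end, n), exponential(p0, b - c, t_end))
```

The printed pairs should move closer together as the number of steps increases, reflecting the remark that the approximations improve for shorter time intervals.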
In practice, both the proportionate birth rate and the proportionate death
rate will vary, and so therefore will the proportionate growth rate. It turns
out to be convenient to model these changes as being dependent on the
population size, so that the proportionate growth rate r becomes a function
of P . The justification for this is as follows. When the population is low, one
may assume that there is potential for it to grow (assuming a reasonable
environment). The proportionate growth rate should therefore be high.
However, as the population grows, there will be competition for resources.
Thus the proportionate growth rate will decline, and in this way unlimited
(exponential) growth does not occur.
A particularly useful model arises from taking r(P) to be a decreasing linear
function of P. We shall write this as
\[ r(P) = k\left(1 - \frac{P}{M}\right), \tag{1.3} \]
(You will see later why this particular form is chosen.)
where k and M are positive constants. Looking at this formula, you can
see that the proportionate growth rate r decreases linearly with P , from the
value k (when P = 0) to 0 (when P = M ).
Using this expression for r, the above differential equation satisfied by P
becomes
\[ \frac{dP}{dt} = kP\left(1 - \frac{P}{M}\right). \tag{1.4} \]
This is well known to biologists as the logistic equation — we shall consider
it further in Section 2, and see how to solve it in Section 3. For now, we have
achieved our objective of showing that differential equations arise naturally
in modelling the real world.
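
The following short sketch evaluates the proportionate growth rate (1.3) and the right-hand side of the logistic equation (1.4) at a few population sizes. The function names and the values of k and M are hypothetical and chosen only for illustration.

```python
def proportionate_growth_rate(p, k, m):
    """r(P) = k (1 - P/M), as in Equation (1.3)."""
    return k * (1.0 - p / m)


def logistic_rhs(p, k, m):
    """Right-hand side of Equation (1.4): dP/dt = k P (1 - P/M)."""
    return p * proportionate_growth_rate(p, k, m)


# Hypothetical illustrative constants (not from the text).
k, m = 0.5, 1000.0
for p in (0.0, 250.0, 500.0, 1000.0):
    print(p, proportionate_growth_rate(p, k, m), logistic_rhs(p, k, m))
```

With these values the proportionate growth rate falls linearly from k at P = 0 to 0 at P = M, while dP/dt itself vanishes at both P = 0 and P = M.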

Exercise 1.1
Suppose that a population obeys the logistic model (with the proportionate
growth rate given by Equation (1.3)), and that you are given the following
information. When P = 10 the proportionate growth rate is 1, and when
P = 10 000 the proportionate growth rate is 0. Find the corresponding
values of k and M .

1.2 Differential equations and solutions


This subsection introduces some of the fundamental concepts associated
with differential equations. First, however, you are asked to recall some
terminology and notation from your previous exposure to calculus. (Some of
this terminology and notation was discussed in Unit 1.)
The derivative of a variable y with respect to another variable x is denoted
in Leibniz notation by dy/dx. In this derivative expression we refer to y as
the dependent variable and to x as the independent variable.
There are other notations in use for derivatives. If the relation between
variables x and y is expressed in terms of a function f, so that y = f(x),
then the derivative may be written in function notation as $f'(x)$.
A further notation, attributed to Newton, is restricted to cases in which
the independent variable is time, denoted by t. The derivative of y = f(t)
could be written in this case as ẏ, in which the dot over the y stands for the
d/dt of Leibniz notation. Thus we may express this derivative in any of the
equivalent forms
\[ \frac{dy}{dt} = \dot{y} = f'(t). \]
Further derivatives are obtained on differentiating this first derivative. The
second derivative of y = f(t) could be represented by any of the forms
\[ \frac{d^2 y}{dt^2} = \ddot{y} = f''(t). \]
These possible notations have different strengths and weaknesses, and which
is most appropriate in any situation depends on the purpose at hand. You
will see all of these notations employed at various times during the course.


It is common practice in applied mathematics to reduce the proliferation of
symbols as far as possible. One aspect of this practice is that we often avoid
allocating separate symbols to variables and to associated functions. Thus,
in place of the equation y = f(t) (where y and t denote variables, and f
denotes the function that relates them), we could write y = y(t), which is
read as ‘y is a function of t’. (You have seen examples of this in the previous
subsection.) Strictly speaking, this is an abuse of notation, since there is
ambiguity as to exactly what the symbol y represents: it is a variable on the
left-hand side of y = y(t) but a function on the right-hand side. However, it
is a very convenient abuse.
The following definitions explain just what are meant by a differential
equation, by the order of such an equation, and by a solution of it.

Definitions
(a) A differential equation for y = y(x) is an equation that relates
the independent variable x, the dependent variable y, and one or
more derivatives of y.
(b) The order of such a differential equation is the order of the high-
est derivative that appears in the equation. Thus a first-order
differential equation for y = y(x) features only the first derivative,
dy/dx.
(c) A solution of such a differential equation is a function y = y(x)
that satisfies the equation.

These definitions have been framed in terms of an independent variable x
and a dependent variable y. You should be able to translate them to apply
to any other independent and dependent variables. Thus Equation (1.4) is
a differential equation in which t is the independent variable and P is the
dependent variable. It is a first-order equation, since dP/dt appears in it
but higher derivatives such as $d^2P/dt^2$ do not. By contrast, the differential
equation
\[ 3\frac{d^2 y}{dx^2} + y^2 \sin x = x^2 \]
is of second order, since the second derivative $d^2y/dx^2$ appears in it but
higher derivatives do not.
The topic of this unit is first-order differential equations. (Second-order
differential equations are the subject of Unit 3.) Moreover, it concentrates
upon first-order equations that can be expressed (possibly after some
algebraic manipulation) in the form
\[ \frac{dy}{dx} = f(x, y). \]
The right-hand side here stands for an expression involving both, either or
neither of the variables x and y, but no other variables and no derivatives.
(Equation (1.4) is of this form, with $f(t, P) = kP\left(1 - \frac{P}{M}\right)$.)
According to the definition above, a function has to satisfy a differential
equation in order to be regarded as a solution of it. The differential equation
is satisfied by the function provided that when the function is substituted
into the equation, the left- and right-hand sides of the equation give an
identical expression. (This substitution includes the requirement that the
function should be differentiable, i.e. that it should have a derivative, at all
points where it is claimed to be a solution.)
You are asked to verify in the next exercise that several functions are
solutions of corresponding first-order differential equations. Later in the unit,
you will see how all of these differential equations may be solved; but even
when a solution has been deduced, it is worth checking in the manner of this
exercise (i.e. by substitution) that the supposed solution is indeed correct.

Exercise 1.2
Verify that each of the functions given below is a solution of the
corresponding differential equation. Remember that an asterisk denotes an
exercise (or part of one) that is considered particularly important.
*(a) $y = 2e^x - (x^2 + 2x + 2)$; $\dfrac{dy}{dx} = y + x^2$.
(b) $y = \tfrac12 x^2 + \tfrac32$; $\dfrac{dy}{dx} = x$.
*(c) $u = 2e^{x^2/2}$; $u' = xu$.
(d) $y = \sqrt{\dfrac{27 - x^2}{3}}$ $(-3\sqrt{3} < x < 3\sqrt{3})$; $\dfrac{dy}{dx} = -\dfrac{x}{3y}$ $(y \ne 0)$.
*(e) $y = t + e^{-t}$; $\dot{y} = -y + t + 1$.
*(f) $y = t + Ce^{-t}$; $\dot{y} = -y + t + 1$. (Here C is an arbitrary constant.)
Note that the restriction $y \ne 0$ placed on the differential equation in
part (d) is necessary to ensure that $-x/3y$ is well defined.

In the last two parts of Exercise 1.2 you were asked to verify that
\[ y = t + e^{-t} \quad \text{and} \quad y = t + Ce^{-t} \]
are solutions of the differential equation $\dot{y} = -y + t + 1$, where in the second
case C is an arbitrary constant. In saying that C is arbitrary, we mean that
it can assume any real value. Whatever number is chosen for C, the
corresponding expression for y(t) is always a solution of the differential equation.
The particular function $y = t + e^{-t}$ is just one example of such a solution,
obtained by choosing C = 1.
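
A quick numerical check of this claim is sketched below. It is not a substitute for verification by substitution: it estimates ẏ by a central difference and confirms that ẏ − (−y + t + 1) is close to zero for several values of C. The function names, the step size h and the sample points are assumptions made for this illustration.

```python
import math


def y(t, c):
    """Candidate solution y = t + C exp(-t)."""
    return t + c * math.exp(-t)


def residual(t, c, h=1e-5):
    """Left-hand side minus right-hand side of y' = -y + t + 1,
    with y' estimated by a central difference of width 2h."""
    dydt = (y(t + h, c) - y(t - h, c)) / (2 * h)
    return dydt - (-y(t, c) + t + 1)


for c in (-2.0, 0.0, 1.0, 3.5):
    worst = max(abs(residual(t, c)) for t in (0.0, 0.5, 1.0, 2.0))
    print(c, worst)
```

Each printed residual should be tiny compared with the size of y, whichever value of C is used.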
This demonstrates that solutions of a differential equation can exist in pro-
fusion; as a result, we need terms to distinguish between the totality of all
these solutions for a given equation and the individual solutions that are
completely specified.

Definitions
(a) The general solution of a differential equation is the collection
of all possible solutions of that equation.
(b) A particular solution of a differential equation is a single solu-
tion of the equation, and consists of a solution function whose rule
contains no arbitrary constant.

In many cases it is possible to describe the general solution of a first-order
differential equation by a single formula involving an arbitrary constant. For
example, $y = t + Ce^{-t}$ is the general solution of the equation $\dot{y} = -y + t + 1$;
this means that not only is $y = t + Ce^{-t}$ a solution whatever the value of C,
but also every particular solution of the equation may be obtained by giving
C a suitable value.


*Exercise 1.3
(a) Verify that, for any value of the constant C, the function $y = C - \tfrac13 e^{-3x}$
is a solution of the differential equation
\[ \frac{dy}{dx} = e^{-3x}. \]
(b) Verify that, for any value of the constant C, the function $u = Ce^t - t - 1$
is a solution of the differential equation
\[ \dot{u} = t + u. \]
(c) Verify that, for any value of the constant C, the function
\[ P = \frac{CMe^{kt}}{1 + Ce^{kt}} \]
is a solution of Equation (1.4).

As you have seen, there are many solutions of a differential equation. How-
ever, a particular solution of the equation, representing a definite relation-
ship between the variables involved, is often what is needed. This is achieved
by using a further piece of information in addition to the differential equa-
tion. Often the extra information takes the form of a pair of values for the
independent and dependent variables.
For example, in the case of a population model, it would be natural to
specify the starting population, P0 say, and to start measuring time from
t = 0. We could then write
P = P0 when t = 0, or, equivalently, P (0) = P0 .
A requirement of this type is called an initial condition.

Definitions
(a) An initial condition associated with the differential equation
\[ \frac{dy}{dx} = f(x, y) \]
specifies that the dependent variable y takes some value $y_0$ when
the independent variable x takes some value $x_0$. This is written
either as
\[ y = y_0 \text{ when } x = x_0 \]
or as
\[ y(x_0) = y_0. \]
The numbers $x_0$ and $y_0$ are referred to as initial values.
(b) The combination of a first-order differential equation and an initial
condition is called an initial-value problem.

The word ‘initial’ in these definitions arises from those (frequent) cases in
which the independent variable represents time. In such cases, the
differential equation describes how the system being modelled behaves once started,
while the initial condition specifies the configuration in which the system is
started off. In fact, if the initial condition is $y(x_0) = y_0$, then we are often
interested in solving the corresponding initial-value problem for $x > x_0$. (If x
represents time, then $x > x_0$ is ‘the future’ after the system has been started
off.)
We usually require that an initial-value problem should have a unique
solution, since then the outcome is completely determined by how the system
behaves and its configuration at the start. Almost all the differential
equations in this course do have unique solutions.

Example 1.1
Using the result given in Exercise 1.3(b), solve the initial-value problem
\[ \frac{dy}{dx} = x + y, \quad y(0) = 1. \]
Solution
From Exercise 1.3(b), on replacing the variables t, u by x, y, respectively,
the general solution of the differential equation here is
\[ y = Ce^x - x - 1. \]
The initial condition says that y = 1 when x = 0, and on feeding these values
into the general solution we find that
\[ 1 = Ce^0 - 0 - 1 = C - 1. \]
Hence C = 2, and the particular solution of the differential equation that
solves the initial-value problem is
\[ y = 2e^x - x - 1. \]
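
The same calculation can be sketched in a few lines of code: rearrange the general solution to give C in terms of the initial values, then confirm that the resulting particular solution satisfies the initial condition. The function names are hypothetical and introduced only for this illustration.

```python
import math


def general_solution(x, c):
    """General solution y = C exp(x) - x - 1 of dy/dx = x + y."""
    return c * math.exp(x) - x - 1


def constant_from_initial_condition(x0, y0):
    """Solve y0 = C exp(x0) - x0 - 1 for the constant C."""
    return (y0 + x0 + 1) * math.exp(-x0)


c = constant_from_initial_condition(0.0, 1.0)
print(c)                         # C = 2 for the condition y(0) = 1
print(general_solution(0.0, c))  # recovers the initial value 1
```

Changing the arguments of `constant_from_initial_condition` gives the constant for any other initial condition on the same differential equation.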

Exercise 1.4
The size of a population (measured in hundreds of thousands) is modelled
by the logistic equation
\[ \frac{dP}{dt} = kP\left(1 - \frac{P}{M}\right), \quad P(0) = 1, \]
where k = 0.15 and M = 10.
*(a) Use your answer to Exercise 1.3(c) to find a solution to this initial-value
problem.
(b) Can you predict the long-term behaviour of the population size from
your answer?

Finally in this subsection, note that one needs to keep an eye on the domain
of the function defining the differential equation. ‘Gaps’ in the domain
usually show up as some form of restriction on the nature of a solution
curve. For example, consider the differential equation
\[ \frac{dy}{dx} = \frac{1}{x}. \tag{1.5} \]
It turns out that there are two distinct families of solutions of this equation,
given by y = ln x + C (if x > 0) and y = ln(−x) + C (if x < 0). These two
families of solutions are illustrated in Figure 1.2. Notice that the right-hand
side of Equation (1.5) is not defined at x = 0, and that there is no solution
that crosses the y-axis. (Since |x| = −x if x < 0, you can see that this agrees
with what we know from Unit 1, namely that $\int \frac{1}{x}\,dx = \ln|x|$.)
[Figure 1.2: the two families of solution curves, plotted against x and y]
This unit deals with numerical and analytic (symbolic) methods of solving
differential equations. However, before we can discuss numerical methods,
we need to know something about the way that errors and accuracy are
described: this is the topic of the next subsection.


1.3 Approximations in calculations


We often find that we need to make a calculation based on numerical values
that are not exact — for example, there may be limitations on the accuracy
to which a measurement can be taken. Finding a numerical solution to a
differential equation almost always involves inexact arithmetic: any calcula-
tor or computer will have some limit on the accuracy (number of significant
figures) to which a decimal can be stored, and rounding errors will occur.
In fact, using a numerical method usually involves repeated calculations, so
inaccuracies may build up and have a significant effect on the result. When
using a numerical method, one wants (if possible) to know how accurate
the result is. With this in mind, we recall a few basic ideas relating to
approximation and accuracy.
One simple form of calculation is the evaluation of a function. Suppose that
we want the value of f (π) for some function f , and use a value of π which
is rounded to three decimal places, that is, 3.142. Since this is the value of
π to three decimal places, we know that
\[ |3.142 - \pi| \le 0.0005 = 5 \times 10^{-4} \]
(since 3.1415 ≤ π < 3.1425). This gives us some idea of how accurate 3.142
is as an estimate of π. In general, we refer to the difference
\[ \text{approximate value} - \text{true value} \]
as the error in the approximate value, and to the modulus of this difference,
that is,
\[ |\text{approximate value} - \text{true value}|, \]
as the absolute error in the approximate value. (Notice that, by definition,
the absolute error is always greater than or equal to zero, whereas the error
can be positive or negative.) So if ε is the absolute error in using 3.142 as an
estimate of π, then we know that
\[ \varepsilon \le 0.0005. \]
In this context, the quantity 0.0005 is referred to as an absolute error
bound.
If we take π to be 3.142, what is the consequent error in f (π)? This will
depend on the function f .

Exercise 1.5
For each of the functions f given below, use your calculator to find f(3.142).
Then estimate f(π) more accurately, using π = 3.141 592 6 (to seven decimal
places). Hence estimate the error in using f(3.142) as an approximation
to f(π). (You are welcome to use your computer rather than a calculator if
you prefer.)
(a) $f(x) = x^3$ (b) $f(x) = e^{10x}$

In Exercise 1.5(b), you saw that an absolute error of less than 0.0005 in
the value of π results in an absolute error of the order of $1.8 \times 10^{11}$ in
the calculated value of f(π), for $f(x) = e^{10x}$. For this function f, errors
are severely magnified! However, the situation is not quite so bad as this
statement might suggest. The calculated value of f(π) is not completely
unreliable: it is accurate to two significant figures. The value of f(π) is
itself very large ($4.4 \times 10^{13}$), so an error of $1.8 \times 10^{11}$ is not quite so serious
as it sounds. We often want a measure of ‘error’ that takes into account the
size of the error relative to the size of the number being calculated.


We define the relative error as
\[ \left| \frac{\text{approximate value} - \text{true value}}{\text{true value}} \right|. \]
So the relative error in f(3.142) as an estimate of f(π) here is (roughly)
\[ \frac{1.8 \times 10^{11}}{4.4 \times 10^{13}} \simeq 0.4 \times 10^{-2}. \]
A relative error of this size corresponds to a value that is accurate to two
significant figures, as obtained in Exercise 1.5(b). Relative errors provide
a guide to the number of significant figures that can be relied on, while
absolute errors relate to the number of decimal places that are accurate.
(Usually, a relative error of $0.5 \times 10^{-n}$ corresponds to a value that is
accurate to n significant figures.)
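
The two measures of error can be compared directly in a short sketch. The helper names are hypothetical, and the library constant `math.pi` is used as a stand-in for the true value of π, which is itself an approximation to about sixteen significant figures.

```python
import math


def absolute_error(approx, true):
    """|approximate value - true value|."""
    return abs(approx - true)


def relative_error(approx, true):
    """|(approximate value - true value) / true value|."""
    return abs((approx - true) / true)


def f(x):
    """The function of Exercise 1.5(b)."""
    return math.exp(10 * x)


x_approx = 3.142  # pi rounded to three decimal places
print(absolute_error(x_approx, math.pi))        # within the bound 0.0005
print(absolute_error(f(x_approx), f(math.pi)))  # of the order of 1.8e11
print(relative_error(f(x_approx), f(math.pi)))  # roughly 0.4e-2
```

The absolute error in f is enormous, yet the relative error shows that about two significant figures of the computed value can still be trusted.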
End-of-section Exercises
Exercise 1.6
(a) Verify that, for any value of the constant C, the function
\[ y = \arcsin x + C \quad (-1 < x < 1) \]
is a solution of the differential equation
\[ \frac{dy}{dx} = \frac{1}{\sqrt{1 - x^2}}. \]
(b) Using the result of part (a), find the solution of the initial-value problem
\[ \frac{dy}{dx} = \frac{1}{\sqrt{1 - x^2}}, \quad y\left(\tfrac12\right) = \tfrac{\pi}{2}. \]

Exercise 1.7
(a) Verify that, for any value of the constant C, the function
\[ x = \tan(t + C) \quad \left(-\tfrac{\pi}{2} < t + C < \tfrac{\pi}{2}\right) \]
is a solution of the differential equation
\[ \dot{x} = 1 + x^2. \]
(b) Using the result of part (a), find the solution of the initial-value problem
\[ \dot{x} = 1 + x^2, \quad x\left(\tfrac{\pi}{4}\right) = 1. \]

2 Direction fields and Euler’s method


Subsection 2.1 shows that qualitative information about the solutions of a
first-order differential equation may be gleaned directly from the equation
itself, without undertaking any form of integration process. The main con-
cept here is the direction field, sketches of which usually give a good idea of
how the graphs of solutions behave.
Direction fields can also be regarded as the starting point for a numeri-
cal (that is, calculational rather than algebraic) method of solution called
Euler’s method, which is described and applied in Subsection 2.2.
In Subsection 2.3 you will see how both direction fields and Euler’s method
can be implemented on your computer.


2.1 Direction fields


We start this subsection by considering what can be deduced about solutions
of the differential equation
\[ \frac{dy}{dx} = f(x, y) \tag{2.1} \]
from direct observation of this equation.
In Section 1 we encountered the logistic equation
\[ \frac{dP}{dt} = kP\left(1 - \frac{P}{M}\right), \tag{2.2} \]
where k and M are positive constants. (Here we have
$f(t, P) = kP\left(1 - \frac{P}{M}\right)$.) In certain circumstances this is a
useful mathematical model of population changes, in which P(t) denotes
the size of the population at time t. The right-hand side of this equation
is equal to zero if either P = 0 or P = M. Hence, since dP/dt = 0 in both
cases, each of the constant functions P = 0 and P = M is a particular
solution of Equation (2.2). Within the model, these solutions correspond to
a complete absence of the population (P = 0), and an equilibrium population
level (P = M) for which the proportionate birth and death rates are equal.
Such spotting of constant functions that are particular solutions is useful
on occasion but of limited applicability. In general, more useful information
can be deduced from the observation that, for any given point (x, y) in the
plane, the equation
\[ \frac{dy}{dx} = f(x, y) \tag{2.1} \]
describes the direction in which the graph of the particular solution through
that point is heading (see Figure 2.1). This is because if y = y(x) is any
solution of the differential equation, then dy/dx is the gradient or slope of
the graph of that function. Equation (2.1) therefore tells us that f(x, y)
represents the slope at (x, y) of the graph of the particular solution that
passes through (x, y). If the slope f(x, y) at this point is positive, then the
corresponding solution graph is increasing (rising) from left to right through
the point (x, y); if the slope is negative, then the graph is decreasing (falling);
and if the slope is zero, then the graph is horizontal at the point. For example,
if f(x, y) = x + y, then the slope at the point (1, 2) is f(1, 2) = 1 + 2 = 3,
the slope at the point (2, −7) is f(2, −7) = 2 − 7 = −5, and the slope at the
point (3, −3) is f(3, −3) = 3 − 3 = 0.
[Figure 2.1: A graphical representation of the slope at the point $(x_0, y_0)$]
When looking at f(x, y) in this light, it is referred to as a direction field,
since it describes a direction (slope) for each point (x, y) where f(x, y) is
defined.

Definition
A direction field associates a unique direction to each point within
a specified region of the (x, y)-plane. The direction corresponding to
the point (x, y) may be thought of as the slope of a short line segment
through the point.
In particular, the direction field for the differential equation
$$\frac{dy}{dx} = f(x, y)$$
associates the direction f (x, y) with the point (x, y).

Direction fields can be visualized by constructing the short line segments


referred to above for a finite set of points in an appropriate region of the
plane, where typically the points are chosen to form a rectangular grid.


An example is shown in Figure 2.2, which corresponds to the differential
equation
$$\frac{dy}{dx} = x + y. \tag{2.3}$$
(Here f(x, y) = x + y.)
In this case the chosen region is the set of points (x, y) such that −2 ≤ x ≤ 2
and 0 ≤ y ≤ 2, and the rectangular grid consists of the points at intervals
of 0.2 in both the x- and y-directions within this region.
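
The grid just described can be tabulated directly. The following is a minimal sketch, not the course software: the function name `direction_field` and its parameters are illustrative assumptions. It evaluates the slope f(x, y) = x + y at every grid point of the region −2 ≤ x ≤ 2, 0 ≤ y ≤ 2 with spacing 0.2.

```python
def f(x, y):
    """Right-hand side of Equation (2.3): dy/dx = x + y."""
    return x + y


def direction_field(f, x_min, x_max, y_min, y_max, spacing):
    """Return a list of (x, y, slope) triples on a rectangular grid.

    Hypothetical helper for illustration only.
    """
    nx = round((x_max - x_min) / spacing)
    ny = round((y_max - y_min) / spacing)
    points = []
    for j in range(ny + 1):
        for i in range(nx + 1):
            # Build each coordinate from an integer index so that
            # floating-point error does not accumulate across the grid.
            x = x_min + i * spacing
            y = y_min + j * spacing
            points.append((x, y, f(x, y)))
    return points


field = direction_field(f, -2.0, 2.0, 0.0, 2.0, 0.2)
print(len(field))  # number of grid points: 21 columns by 11 rows
for x, y, slope in field[:3]:
    print(f"({x:.1f}, {y:.1f}) -> slope {slope:.1f}")
```

Each triple gives a point and the slope of the short line segment to be drawn through it; a plotting library could then draw the segments.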

Figure 2.2 [direction field for Equation (2.3); horizontal axis x, from −2 to 2]

From this diagram, we can gain a good qualitative impression of how the
graphs of particular solutions of Equation (2.3) behave. The aim is mentally
to sketch curves on the diagram in such a way that the tangents to the curves
are always parallel to the local slopes of the direction field. For example,
starting from the point (−1, 0.5) (that is, taking the initial condition to be
y(−1) = 0.5), we expect the solution graph initially to fall as we move to
the right. The magnitude of the negative slope decreases, however, and
eventually reaches zero, after which the slope becomes positive and then
increases. On this basis, we could sketch the graph of the corresponding
particular solution and obtain something like the curve shown in Figure 2.3.

Figure 2.3 [the direction field of Figure 2.2 with the solution curve through
the point (−1, 0.5); horizontal axis x, from −2 to 2]
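
The qualitative description above (the curve falls, levels out, then rises) can be checked numerically. This sketch assumes the general solution $y = Ce^x - x - 1$ of Equation (2.3), which is not derived in this subsection but can be verified by substitution; the names `y`, `slope` and `x_turn` are illustrative.

```python
import math

# Assumed general solution of dy/dx = x + y: y = C*exp(x) - x - 1.
# The initial condition y(-1) = 0.5 then gives C = 0.5*e.
C = 0.5 * math.e


def y(x):
    """Particular solution through the point (-1, 0.5)."""
    return C * math.exp(x) - x - 1


def slope(x):
    """Slope given by the direction field at the point (x, y(x))."""
    return x + y(x)


# The slope x + y(x) simplifies to C*exp(x) - 1, which vanishes
# when exp(x) = 1/C, that is, at x = ln 2 - 1.
x_turn = math.log(2) - 1

print(y(-1.0))        # starting value, close to 0.5
print(slope(-1.0))    # negative: the graph initially falls
print(slope(x_turn))  # close to zero: the graph is horizontal here
print(slope(1.0))     # positive: the graph is rising by x = 1
```

Under that assumption the turning point lies at x = ln 2 − 1, roughly −0.31, between the falling and rising parts of the curve.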


Exercise 2.1
(a) Part of the direction field for the logistic equation
$$\frac{dP}{dt} = P\left(1 - \frac{P}{1000}\right)$$
(this is Equation (2.2) with k = 1 and M = 1000)
is sketched in Figure 2.4. Using this diagram, sketch the solution curves
that pass through the following points:
(0, 1500), (0, 1000), (0, 100), (0, 0), (0, −100).
From your results, describe the graphs of particular solutions of the
differential equation.

Figure 2.4 [direction field for this logistic equation; horizontal axis t,
from 0 to 10; vertical axis P, from −500 to 1500]

(b) What does your answer to part (a) tell you about the predicted behaviour
of a population whose size P(t) at time t is modelled by this logistic
equation?

Drawing by hand precise grids of line segments to represent direction fields


is not a good use of your time. However, it is a task that your computer
can be programmed to perform, as you will see in Subsection 2.3. Before
investigating this, you will see in Subsection 2.2 how the concept of direction
fields helps in constructing approximate numerical solutions for first-order
differential equations.

2.2 Euler’s method


In the previous subsection it was suggested that the graphs of particular
solutions of a differential equation
$$\frac{dy}{dx} = f(x, y)$$
could be ‘mentally sketched’ on a diagram of the direction field given by
f (x, y). This was to be done in such a way that the tangent to the solution
curve is always ‘parallel to the local slope’ of the direction field. While this
gives a good visual image of the connection between the direction field and


the graph of a solution, it is somewhat short on precision. We could not


expect, by this approach, to predict with any accuracy the actual solution to
an initial-value problem. That task is the subject of the current subsection,
which shows how an initial-value problem
$$\frac{dy}{dx} = f(x, y), \quad y(x_0) = y_0, \tag{2.4}$$
may be ‘solved’ by calculational means. However, the direction field diagram
is still of use in explaining how this numerical method arises.
Suppose that instead of trying to sketch a solution curve to fit the direction
field, we move in a sequence of straight-line steps whose directions are
governed by the direction field. The aim is to produce a sequence of points
that provide approximate values of the solution function y(x) for the
initial-value problem at a sequence of x-values. The steps are constructed as
follows.
Corresponding to the given initial condition $y(x_0) = y_0$, there is a point
$P_0$ in the (x, y)-plane with coordinates $(x_0, y_0)$, and this is our
starting point. At $P_0$, the direction field f(x, y) defines a particular
slope, namely $f(x_0, y_0)$. We move off from $P_0$ along a straight line that
has this slope, and continue until we have travelled a horizontal distance h
to the right of $P_0$. The point that has now been reached is labelled $P_1$,
as in Figure 2.5.

Figure 2.5 [the step from $P_0 = (x_0, y_0)$ to $P_1 = (x_1, Y_1)$ along a line
of slope $f(x_0, y_0)$, with horizontal distance h and vertical distance
$Y_1 - y_0$]

The idea is that the point $P_1$, whose coordinates have been denoted by
$(x_1, Y_1)$, provides an approximate value $Y_1$ of the solution function y(x)
at $x = x_1$. (The reason for using $Y_1$ here, rather than $y_1$, will be
explained shortly.) Now, unless the solution function follows a straight line
as x moves from $x_0$ to $x_1$, $Y_1$ is unlikely to give the exact value of
$y(x_1)$. However, the hope is that, because we headed off from $x_0$ along the
correct slope, as given by the direction field, $Y_1$ will be reasonably close
to the exact value. Before worrying about accuracy, let us continue with the
construction of the points in our sequence.
The next thing that we need to do, before proceeding to the second step
in the construction process, is determine formulae for $x_1$ and $Y_1$ in terms
of $x_0$, $y_0$, h and $f(x_0, y_0)$. By the construction described, as the
point $P_1$ is reached from $P_0$ by taking a step to the right of horizontal
length h, we have
$$x_1 = x_0 + h. \tag{2.5}$$
We can express $Y_1$ in terms of other quantities by equating two expressions
for the slope of the line segment $P_0 P_1$,
$$\frac{Y_1 - y_0}{h} = f(x_0, y_0),$$
and then rearranging to give
$$Y_1 = y_0 + h f(x_0, y_0). \tag{2.6}$$
This completes the first step, and we now take a second step to the right.


The direction of the second step is along the line with slope defined by the
direction field at the point $P_1$, namely $f(x_1, Y_1)$. The second step moves
us from $P_1$ through a further horizontal distance h to the right, to the
point labelled $P_2$, as illustrated in Figure 2.6. This point provides an
approximate value $Y_2$ of the solution function y(x) at $x = x_2$.

Figure 2.6 [the step from $P_1 = (x_1, Y_1)$ to $P_2 = (x_2, Y_2)$ along a line
of slope $f(x_1, Y_1)$, with horizontal distance h and vertical distance
$Y_2 - Y_1$]

As in the first step, we need now to express the coordinates $(x_2, Y_2)$ of
$P_2$ in terms of $x_1$, $Y_1$, h and $f(x_1, Y_1)$. We have
$$x_2 = x_1 + h \tag{2.7}$$
and (equating two expressions for the slope of the line segment $P_1 P_2$)
$$\frac{Y_2 - Y_1}{h} = f(x_1, Y_1),$$
which can be rearranged to give
$$Y_2 = Y_1 + h f(x_1, Y_1). \tag{2.8}$$
Having carried out two steps of the process, it is possible to see that the
same procedure can be applied to construct any number of further steps,
and we next generalize to a description of what happens at the (i + 1)th
step, where i represents any non-negative integer.
Suppose that after i steps we have reached the point $P_i$, with coordinates
$(x_i, Y_i)$. For the (i + 1)th step, we move away from $P_i$ along the line
with slope $f(x_i, Y_i)$, as defined by the direction field at $P_i$. After
moving through a horizontal distance h to the right, we reach the point
$P_{i+1}$, whose coordinates are denoted by $(x_{i+1}, Y_{i+1})$, as illustrated
in Figure 2.7. The point $P_{i+1}$ provides an approximate value $Y_{i+1}$ of
the solution function y(x) at $x = x_{i+1}$.

Figure 2.7 [the step from $P_i = (x_i, Y_i)$ to $P_{i+1} = (x_{i+1}, Y_{i+1})$
along a line of slope $f(x_i, Y_i)$, with horizontal distance h and vertical
distance $Y_{i+1} - Y_i$]


Arguing as before, we have
$$x_{i+1} = x_i + h \tag{2.9}$$
and (equating two expressions for the slope of the line segment
$P_i P_{i+1}$)
$$\frac{Y_{i+1} - Y_i}{h} = f(x_i, Y_i),$$
which can be rearranged to give
$$Y_{i+1} = Y_i + h f(x_i, Y_i). \tag{2.10}$$
Note that Equations (2.5) and (2.7) are the special cases of Equation (2.9)
for i = 0 and i = 1, respectively, and that Equation (2.8) is the special case
of Equation (2.10) for i = 1. If we also define $Y_0$ to be equal to the initial
value $y_0$, then Equation (2.6) is the special case of Equation (2.10) for
i = 0.
To sum up, for the initial-value problem (2.4), we have a procedure for
constructing a sequence of points
$$P_i \text{ with coordinates } (x_i, Y_i) \quad (i = 1, 2, \ldots),$$
where the values of $x_i$ and $Y_i$ for each value of i are determined by the
respective formulae (2.9) and (2.10). The starting point for the sequence is
the point $P_0$ with coordinates $(x_0, Y_0)$, where $Y_0 = y_0$. Because the
procedure is based on the direction field, each $Y_i$ provides an approximation
at $x = x_i$ to the value of the solution function y(x) for the initial-value
problem. The horizontal distance h by which we move to the right at each stage
of the procedure is called the step size or step length.
Figure 2.8 shows the constructed sequence of points, and for comparison
includes a curve representing the graph of the exact solution of the
initial-value problem (2.4). This makes clear that the successive points
$P_1, P_2, P_3, \ldots$ are only approximations to points on the solution
curve. In fact, the situation shown in Figure 2.8 is typical of the behaviour
of the constructed approximations, in that they gradually move further and
further from the exact solution curve. This is because, at each step, the
direction of movement is along the slope of the direction field at
$P_i = (x_i, Y_i)$ and not along the slope of the direction field at
$(x_i, y_i)$, where $y_i = y(x_i)$ denotes the value of the exact solution
function at $x = x_i$; that is, for each $x_i$, the slope for the next step is
defined by a point close to the solution curve rather than by the point
exactly on that curve. (The common use of $y_i = y(x_i)$ to represent the exact
solution at $x = x_i$ explains why we use a different notation, namely $Y_i$,
for the numerical approximation to $y(x_i)$.)

Figure 2.8 [the points $P_0, P_1, \ldots, P_6$ at $x_0, x_1, \ldots, x_6$,
compared with the exact solution curve y = y(x)]


Nevertheless, the formulae (2.9) and (2.10) provide a method for finding
approximate solutions to the initial-value problem (2.4), in terms of numerical
estimates $Y_1, Y_2, Y_3, \ldots$ at the respective domain values
$x_1, x_2, x_3, \ldots$. This is called Euler’s method, and is summarized below.
(The accuracy of such approximate solutions, and ways of improving accuracy,
will be considered shortly.)

(Leonhard Euler (1707–1783) was one of the most prolific mathematicians of
all time. His surname is pronounced ‘oiler’. He first devised this method in
order to compute the orbit of the moon.)

Procedure 2.1 Euler’s method

To apply Euler’s method to the initial-value problem
$$\frac{dy}{dx} = f(x, y), \quad y(x_0) = y_0, \tag{2.4}$$
proceed as follows.
(a) Take $x_0$ and $Y_0 = y_0$ as starting values, choose a step size h, and
set i = 0.
(b) Calculate the x-coordinate $x_{i+1}$, using the formula
$$x_{i+1} = x_i + h. \tag{2.9}$$
(c) Calculate a corresponding approximation $Y_{i+1}$ to $y(x_{i+1})$, using
the formula
$$Y_{i+1} = Y_i + h f(x_i, Y_i). \tag{2.10}$$
(d) If further approximate values are required, increase i by 1 and
return to Step (b).
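
The steps of Procedure 2.1 translate almost line for line into a short program. The following is a minimal sketch rather than the course’s own software: the function name `euler` and its arguments are illustrative assumptions, and each $x_i$ is computed as $x_0 + ih$ (equivalent to repeated use of Equation (2.9) in exact arithmetic) to limit rounding drift.

```python
def euler(f, x0, y0, h, n_steps):
    """Apply n_steps of Euler's method to dy/dx = f(x, y), y(x0) = y0.

    Returns the list of points (x_i, Y_i) for i = 0, 1, ..., n_steps.
    """
    points = [(x0, y0)]                 # Step (a): starting values
    Y = y0
    for i in range(n_steps):
        x = x0 + i * h                  # x_i
        Y = Y + h * f(x, Y)             # Step (c): Equation (2.10)
        x_next = x0 + (i + 1) * h       # Step (b): Equation (2.9)
        points.append((x_next, Y))      # Step (d): continue with i + 1
    return points


# Usage: the direction field f(x, y) = x + y with y(0) = 1 and h = 0.2.
points = euler(lambda x, y: x + y, 0.0, 1.0, 0.2, 5)
x5, Y5 = points[-1]
print(round(x5, 10), round(Y5, 5))
```

With these inputs the final value is an approximation to y(1) of about 2.976 64.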

Example 2.1
Consider the initial-value problem
$$\frac{dy}{dx} = x + y, \quad y(0) = 1.$$
Use Euler’s method, with step size h = 0.2, to obtain an approximation to
y(1).

Solution
We have $x_0 = 0$, $Y_0 = y_0 = 1$, and $f(x_i, Y_i) = x_i + Y_i$. The step
size is given as h = 0.2. Equation (2.9) with i = 0 gives
$$x_1 = x_0 + h = 0 + 0.2 = 0.2,$$
and Equation (2.10) with i = 0 gives
$$Y_1 = Y_0 + h f(x_0, Y_0) = 1 + 0.2 \times (0 + 1) = 1.2.$$
For the second step, we have (from Equation (2.9) with i = 1)
$$x_2 = x_1 + h = 0.2 + 0.2 = 0.4,$$
and (from Equation (2.10) with i = 1)
$$Y_2 = Y_1 + h f(x_1, Y_1) = 1.2 + 0.2 \times (0.2 + 1.2) = 1.48.$$
If more than a couple of steps of such a calculation have to be computed by
hand, then it is a good idea to lay out the calculation as a table. In this
case, by continuing as above and putting i in turn equal to 2, 3 and 4, we
obtain Table 2.1.


Table 2.1
i    x_i    Y_i         f(x_i, Y_i) = x_i + Y_i    Y_{i+1} = Y_i + h f(x_i, Y_i)
0    0      1           1                          1.2
1    0.2    1.2         1.4                        1.48
2    0.4    1.48        1.88                       1.856
3    0.6    1.856       2.456                      2.347 2
4    0.8    2.347 2     3.147 2                    2.976 64
5    1.0    2.976 64

(After each value of Y_{i+1} has been calculated from the formula and entered
in the last column, it is transferred to the Y_i column in the next row.)

So, at x = 1, Euler’s method with step size h = 0.2 gives the approximation
$$y(1) \approx 2.976\,64.$$

*Exercise 2.2
Use Euler’s method, with step size h = 0.2, to obtain an approximation to
y(1) for the initial-value problem
$$\frac{dy}{dx} = y, \quad y(0) = 1.$$

The solution of the initial-value problem given in Example 2.1 is in fact
known exactly, and is $y = 2e^x - x - 1$. (This exact solution was found in
Example 1.1, page 69.) Putting x = 1, this gives
$$y(1) = 2e - 1 - 1 = 3.436\,56,$$
correct to five decimal places. This value may be compared with the ap-
proximation 2.976 64 for y(1) obtained by Euler’s method in Example 2.1,
and the comparison indicates that the approximation is not at all accurate.
Indeed, the absolute error in this case is
|2.976 64 − 3.436 56| = 0.459 92,
which is about 13% of the exact value, and indeed not even one decimal
place accuracy is achieved.
Similarly, the other values $Y_i$ (i = 1, 2, 3, 4) found in Example 2.1 contain
significant absolute errors when considered as approximations to the
corresponding exact values $y_i = y(x_i)$. This is illustrated in general terms
in Figure 2.8, where the absolute error in approximation $Y_i$ is the vertical
distance from the point $P_i$ to the point directly above it on the exact
solution curve. As shown there, and for reasons given earlier, the absolute
error tends to increase as more and more steps are taken.
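
The growth of the absolute error from step to step can be tabulated for Example 2.1. This is a small illustrative sketch (the variable names are assumptions) that compares each Euler value with the exact solution $y = 2e^x - x - 1$ quoted above.

```python
import math


def exact(x):
    """Exact solution of dy/dx = x + y, y(0) = 1, as quoted in the text."""
    return 2 * math.exp(x) - x - 1


h = 0.2
Y = 1.0          # Y_0 = y_0
errors = []
for i in range(5):
    x = i * h
    Y = Y + h * (x + Y)              # one Euler step, Equation (2.10)
    x_next = (i + 1) * h
    errors.append(abs(Y - exact(x_next)))
    print(f"x = {x_next:.1f}  Y = {Y:.5f}  error = {errors[-1]:.5f}")
```

The printed errors increase at every step, ending with the value of about 0.459 92 found above for x = 1.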
The realization that Euler’s method can produce values that are poor
approximations to the exact solution of an initial-value problem invites us to
ask whether the accuracy of the approximations can be improved, using this
method. In fact, it is not hard to see that improvements in accuracy ought
to be achieved by reducing the step size h. Our earlier prescription for
constructing the sequence of points $P_1, P_2, P_3, \ldots$ from the starting
point $P_0$ and the given direction field amounts loosely to saying ‘match the
direction of the solution curve at the current point, take a step, then adjust
direction so as to try not to move further away from the curve’. It seems
natural, therefore, that the approximations will improve if we reduce the size
of the steps taken and correspondingly ‘adjust direction’ more frequently. This
is illustrated for a hypothetical case in Figure 2.9.


Figure 2.9 [Euler estimates at x = 0.4 obtained with h = 0.4, h = 0.2 and
h = 0.1, compared with the exact solution y = y(x) at x = 0.4; horizontal
axis x, from 0 to 0.4]

In fact, it can be shown that the accuracy of Euler’s method does indeed
usually improve when we take a smaller step size.
To demonstrate this, consider the initial-value problem from Exercise 2.2.
This has the exact solution $y = e^x$ (as you can verify), and its value at
x = 1 is y(1) = e = 2.718 282, to six decimal places. In Exercise 2.2 you
showed that with a step size h = 0.2, Euler’s method gives the approximation
2.488 32 for y(1). Table 2.2 shows the corresponding results (to six decimal
places) obtained when we apply Euler’s method to this same initial-value
problem but with successively smaller step sizes h.

Table 2.2
h         Approximation to y(1)    Absolute error    Number of steps
0.1       2.593 742                0.124 539         10
0.01      2.704 814                0.013 468         100
0.001     2.716 924                0.001 358         1000
0.0001    2.718 146                0.000 136         10000

(In Exercise 2.2, where h = 0.2, the value of y(1) was approximated by Y_5.
From the column for ‘number of steps’ in Table 2.2, you can see that y(1) is
approximated by: Y_10 when h = 0.1; Y_100 when h = 0.01; Y_1000 when
h = 0.001; Y_10000 when h = 0.0001.)

As expected, the absolute errors in the third column of the table become
progressively smaller as h is reduced.
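
The entries of Table 2.2 can be regenerated with a few lines of code. This is a sketch under the stated problem (dy/dx = y, y(0) = 1); the helper name `euler_y1` is an illustrative assumption.

```python
import math


def euler_y1(h):
    """Euler approximation to y(1) for dy/dx = y, y(0) = 1, with step size h.

    Returns the approximation and the number of steps taken.
    """
    n_steps = round(1 / h)       # steps needed to move from x = 0 to x = 1
    Y = 1.0
    for _ in range(n_steps):
        Y = Y + h * Y            # Equation (2.10) with f(x, y) = y
    return Y, n_steps


rows = []
for h in (0.1, 0.01, 0.001, 0.0001):
    approx, n = euler_y1(h)
    error = abs(math.e - approx)
    rows.append((h, approx, error, n))
    print(f"{h:<8} {approx:.6f}  {error:.6f}  {n}")
```

Each printed row corresponds to one row of the table: step size, approximation to y(1), absolute error and number of steps.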
Looking more carefully at these absolute errors, we notice that they seem
to tend towards a sequence in which each number is a tenth of the previous
one. Since each value of h in the table is a tenth of the previous one, this
suggests that:
absolute error is approximately proportional to step size h.
This turns out to be a general property of Euler’s method, for sufficiently
small values of the step size. (You will see this property stated formally in
Unit 26, where you will also see how the property can be used to estimate the
size of absolute errors.) So, not only do we know that accuracy can be
improved by decreasing the step size h, but this general property also tells us
that, by making h small enough, the absolute error in an approximation can
be made as small as desired. In other words, the absolute error approaches
the limit zero as h approaches zero (as you might have expected from the
intuitive argument preceding Figure 2.9).

*Exercise 2.3
Suppose that when Euler’s method is applied to the problem in Exercise 2.2,
the absolute error in approximating y(1) is proportional to the step size h,
for sufficiently small h.
Use the last row of Table 2.2 to estimate the constant of proportionality, k
say, and hence to estimate the step size required to compute y(1) correct to
five decimal places (that is, so that the absolute error is less than
$5 \times 10^{-6}$).

A few words of caution are necessary at this point. Although the absolute
error can be made as small as we please by making the step size h sufficiently
small, this is valid only if the arithmetic is performed using sufficient decimal
places. Where a calculator or computer is involved, the number of decimal
places that can be used is limited, and as a result rounding errors may
be introduced into the calculations. After a certain point, any increase in
accuracy brought about by reducing the size of h may be swamped by these
rounding errors.
Moreover, rounding errors are not the only problem. Before concluding that
h should always be chosen to be very small, we must also consider the cost
of this additional accuracy. Now, by cost is meant the effort involved, which
can be measured in a variety of ways; commonly for iterative methods (such
as Euler’s method) it is measured by the number of steps taken. In general
for numerical methods, the greater the accuracy required, the greater the
cost. To illustrate this, look back at Table 2.2. The last column of the
table shows how the number of steps required for the calculation goes up
in inverse proportion to the step size: to move from x = 0 to x = 1, it
takes 10 steps with step size h = 0.1, 100 steps with step size h = 0.01,
and so on. (In general, to move from a to b (where b > a) with step size h
takes (b − a)/h steps.) Since, for sufficiently small h, the error in Euler’s
method is approximately proportional to the step size, it follows that for
this method a ten-fold improvement in accuracy is paid for by a ten-fold
increase in the number of steps required.
So, for Euler’s method and similar methods, the choice of step size has to be
based on a compromise between the two opposing requirements of accuracy
and cost. Methods for choosing the step size are discussed in Unit 26, which
also introduces other numerical methods for solving initial-value problems
that are considerably more efficient than Euler’s method. (Greater efficiency
means that the same or better numerical accuracy is achieved with fewer
numerical computations.) In fact, Euler’s method is not suitable for
high-accuracy work. Its virtue lies rather in its simplicity and its clear
illustration of the basic principles of how differential equations may be
solved numerically.
In any practical case, calculations of the type described in this subsection
are ideally suited to being performed on a computer, as you will see in the
next subsection.


2.3 Finding numerical solutions on the computer


In this subsection you will see how the computer can be used to construct
direction fields and to implement Euler’s method.

Use your computer to complete the following activities.


*Activity 2.1
Plot the direction field for the differential equation
$$\frac{dy}{dx} = x + y$$
(this is Equation (2.3), page 73) in the region
$$-2 \le x \le 2, \quad 0 \le y \le 2.$$
On the basis of the plot of the direction field, what can you say about the
graphs of solutions of the differential equation?

*Activity 2.2
Use Euler’s method to obtain approximations to y(1) for the initial-value
problem
$$\frac{dy}{dx} = x + y, \quad y(0) = 0,$$
using step sizes h = 1, 0.5, 0.2, 0.1, 0.01, 0.001, 0.0001, in turn. In each case,
plot the graph of the solution on an appropriate direction field diagram, and
observe how each graph compares with the previous one.

*Activity 2.3
Euler’s method is to be used to estimate the value of the function y(x) at
x = 0.1, 0.2, . . . , 1 for the initial-value problem
$$\frac{dy}{dx} = x + y, \quad y(0) = 0.$$
(a) Use the step sizes h = 0.1, 0.01, 0.001, 0.0001, in turn. Compare the
results in each case with the exact solution $y = e^x - x - 1$, and comment
on how the size of the absolute error varies with h.
(b) Compare your estimates for the step sizes h = 0.1 and h = 0.01. Then
compare your estimates for all four step sizes. What can you conclude?

End-of-section Exercise

Exercise 2.4
(a) Without plotting the direction field, say what you can about the slopes
defined by the differential equation
$$\frac{dy}{dx} = f(x, y) = y + x^2.$$
(b) Verify that your conclusions are consistent with the direction field
diagram in Figure 2.10.

Figure 2.10 [direction field for this differential equation; horizontal axis x,
from −2 to 2]
(c) On the basis of the direction field, what can be said about the graphs
of solutions of the differential equation?
(d) Write down the formulae required in order to apply Euler’s method to
the initial-value problem
$$\frac{dy}{dx} = y + x^2, \quad y(-1) = -0.2,$$
using a step size h = 0.1.

3 Finding analytic solutions


This section and the next look at methods for finding analytic solutions
of first-order differential equations — that is, solutions expressed in terms
of exact formulae. It is not always possible to find analytic solutions, and
in these cases numerical methods of approximate solution, such as Euler’s
method, are applied. Even when a formula for the solution is obtainable,
it may be so complicated that a numerical solution is preferred. However,
where a simple formula can be found, this is likely to be more informative
than use of a numerical method.
This section specializes from the form of differential equation
$$\frac{dy}{dx} = f(x, y)$$
considered earlier. Subsection 3.1 looks at cases in which f(x, y) is taken to
be a function of x alone, f(x). You will see that the differential equation
can then be solved by direct integration, assuming that the necessary
integration can be performed. Subsection 3.2 considers cases in which
f(x, y) = g(x)h(y) (the product of a function of x and a function of y).
These can be solved in principle by the method of separation of variables.


3.1 Direct integration


An example of a differential equation that can be solved is
$$\frac{dy}{dx} = x^2. \tag{3.1}$$
In order to do this, we need to find functions y(x) whose derivatives are
$x^2$; one such function is $y = \tfrac{1}{3}x^3$. There are other functions
with this same derivative, for example $y = \tfrac{1}{3}x^3 + 1$ and
$y = \tfrac{1}{3}x^3 - 2$. In fact, any function of the form
$$y = \tfrac{1}{3}x^3 + C,$$
where C is an arbitrary constant, satisfies the differential equation (3.1).
This is an expression for the general solution. (The values C = 0, C = 1 and
C = −2 give the three particular solutions mentioned above.)
The expression $\tfrac{1}{3}x^3 + C$ is also the indefinite integral of $x^2$:
that is,
$$\int x^2 \, dx = \tfrac{1}{3}x^3 + C.$$
(This is hardly surprising, since integration ‘undoes’ or reverses the effect
of differentiation.) In this case, therefore, the indefinite integral of $x^2$
is the general solution of Equation (3.1), and a similar connection applies
more generally.
Consider the differential equation
$$\frac{dy}{dx} = f(x), \tag{3.2}$$
where the right-hand side, f(x), is a function of x alone. (The function f(x)
is assumed to be continuous, i.e. its graph has no breaks.) Suppose that we
have a particular solution y = F(x) of this differential equation; in other
words, F(x) is an integral of f(x). In such circumstances, the general
solution of Equation (3.2) is given by y = F(x) + C, where C is an arbitrary
constant; and the indefinite integral of f(x) is given by the same expression,
$$\int f(x) \, dx = F(x) + C.$$
This means that the general solution of Equation (3.2) can be written down
directly as an indefinite integral; and, if the integration can be performed,
then the equation is solved.

Procedure 3.1 Direct integration


The general solution of the differential equation
$$\frac{dy}{dx} = f(x) \tag{3.2}$$
is
$$y = \int f(x) \, dx = F(x) + C,$$
where F(x) is an integral of f(x) and C is an arbitrary constant.

Once the general solution has been found, it is possible to single out a
particular solution by specifying a value for the constant C. As before, this
value may be found by applying an initial condition.

Example 3.1
(a) Find the general solution of the differential equation
$$\frac{dy}{dx} = e^{-3x}.$$
(b) Find the particular solution of this differential equation that satisfies
the initial condition $y(0) = \tfrac{5}{3}$.

Solution
(a) On applying direct integration, we obtain the general solution
$$y = \int e^{-3x} \, dx = -\tfrac{1}{3}e^{-3x} + C,$$
where C is an arbitrary constant.
(b) In order to satisfy the initial condition $y(0) = \tfrac{5}{3}$ (that is,
$y = \tfrac{5}{3}$ when x = 0), we must have
$$\tfrac{5}{3} = -\tfrac{1}{3}e^0 + C,$$
so C = 2. The required particular solution is therefore
$$y = -\tfrac{1}{3}e^{-3x} + 2.$$
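
A particular solution of this kind can be checked numerically as well as by differentiation. The sketch below is an illustrative check, not part of the original example: it tests the initial condition and compares a central-difference estimate of dy/dx with $e^{-3x}$ at a few sample points. The step `eps` and the sample points are arbitrary choices.

```python
import math


def y(x):
    """Particular solution found in Example 3.1(b)."""
    return -math.exp(-3 * x) / 3 + 2


def dy_dx(x, eps=1e-6):
    """Central-difference estimate of the derivative of y at x."""
    return (y(x + eps) - y(x - eps)) / (2 * eps)


# The initial condition y(0) = 5/3 should hold up to rounding.
initial_ok = abs(y(0.0) - 5 / 3) < 1e-12

# The derivative should match the right-hand side exp(-3x).
residuals = [abs(dy_dx(x) - math.exp(-3 * x)) for x in (-1.0, 0.0, 0.5, 2.0)]

print(initial_ok, max(residuals) < 1e-6)
```

Both checks should report True if the solution is correct; a failure would point to an error in the integration or in the value of C.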

Procedure 3.1 uses x for the independent variable and y for the dependent
variable. As usual, you should be prepared to translate this into situations
where other symbols are used for the variables. But remember that the
method of direct integration applies solely to first-order differential equations
for which the derivative is equal to a function of the independent variable
alone. Thus direct integration can be applied, for example, to the differential
equation
$$\frac{dx}{dt} = \cos t,$$
to give the general solution
$$x = \int \cos t \, dt = \sin t + C,$$
where C is an arbitrary constant. (Here t is the independent variable and x
is the dependent variable.) On the other hand, the differential equation
$$\frac{dx}{dt} = x^2$$
cannot be solved by direct integration, since the right-hand side here is a
function of the dependent variable, x.

Exercise 3.1
Solve each of the following initial-value problems.
(a) $\dfrac{dy}{dx} = 6x$, y(1) = 5.
(b) $\dfrac{dv}{du} = e^{4u}$, v(0) = 2.
*(c) $\dot{y} = 5 \sin 2t$, y(0) = 0. (Remember that $\dot{y}$ stands for
dy/dt, where t denotes time.)

The method of direct integration succeeds in solving a differential equation
of the specified type whenever it is possible to carry out the integration that
arises, and this task may require you to apply any of the standard techniques
of integration, such as integration by parts and integration by substitution.
(Integration by parts and integration by substitution are revised in Unit 1.
Many integrations can be performed by reference to the table of standard
integrals in the course Handbook.) For more difficult integrals, a computer
algebra package can be used.


Exercise 3.2
Find the general solution of each of the following differential equations.
(a) $\dfrac{dy}{dx} = x e^{-2x}$
*(b) $\dot{p} = \dfrac{t}{1 + t^2}$ (Hint: For the integral, try the
substitution $u = 1 + t^2$.)

The answer to Exercise 3.2(b) can be generalized to any differential equation
of the form
$$\frac{dy}{dx} = k \frac{f'(x)}{f(x)} \quad (f(x) \neq 0),$$
where k is a constant, to give the general solution
$$y = k \ln |f(x)| + C,$$
where C is an arbitrary constant. (This is a simple extension of the result
from Unit 1 that $\int \frac{f'(x)}{f(x)} \, dx = \ln |f(x)| + C$, for
$f(x) \neq 0$.)

3.2 Separation of variables


Direct integration applies, in an immediate sense, only to the very simplest
type of differential equation, as described by Equation (3.2). However, all
other analytic methods of solution for first-order equations eventually also
boil down to performing integrations. In this subsection, we consider how
to solve first-order differential equations of the form
$$\frac{dy}{dx} = f(x, y)$$
where the right-hand side f(x, y) is the product of a function of x and a
function of y; that is, equations of the form
$$\frac{dy}{dx} = g(x)h(y). \tag{3.3}$$
One example of this type of differential equation is
$$\frac{dy}{dx} = x(1 + y^2). \tag{3.4}$$
(Here we have g(x) = x and $h(y) = 1 + y^2$.) We divide both sides of this
equation by $1 + y^2$ (which is never zero, so it is safe to divide by it), to
obtain
$$\frac{1}{1 + y^2} \frac{dy}{dx} = x,$$
and then integrate both sides with respect to x, which gives
$$\int \frac{1}{1 + y^2} \frac{dy}{dx} \, dx = \int x \, dx. \tag{3.5}$$
Applying the rule for integration by substitution (in Leibniz notation; see
Section 6 of Unit 1) to the left-hand side, we obtain
$$\int \frac{1}{1 + y^2} \frac{dy}{dx} \, dx = \int \frac{1}{1 + y^2} \, dy,$$
so Equation (3.5) becomes
$$\int \frac{1}{1 + y^2} \, dy = \int x \, dx.$$
On performing the two integrations (see the table of standard integrals in the
Handbook), we obtain
$$\arctan y = \tfrac{1}{2}x^2 + C, \tag{3.6}$$


where C is an arbitrary constant. (Note that one arbitrary constant suffices.)
Making y the subject of the equation, we obtain the solution expression
$$y = \tan\left(\tfrac{1}{2}x^2 + C\right).$$
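
As a check on this result, the solution expression can be substituted back into Equation (3.4) numerically. The following sketch is illustrative only (the function names, the step `eps` and the sample values of x and C are assumptions): it estimates dy/dx by a central difference and compares it with $x(1 + y^2)$ at points where $\tfrac{1}{2}x^2 + C$ lies between $-\pi/2$ and $\pi/2$.

```python
import math


def y(x, C):
    """Solution expression y = tan(x^2/2 + C) obtained above."""
    return math.tan(0.5 * x * x + C)


def residual(x, C, eps=1e-6):
    """Difference between an estimate of dy/dx and x*(1 + y^2)."""
    dy_dx = (y(x + eps, C) - y(x - eps, C)) / (2 * eps)
    return dy_dx - x * (1 + y(x, C) ** 2)


# Sample (x, C) pairs chosen so that x^2/2 + C stays inside (-pi/2, pi/2).
samples = [(0.5, 0.0), (1.0, 0.0), (1.2, -0.5), (1.0, 0.3)]
worst = max(abs(residual(x, C)) for x, C in samples)
print(worst < 1e-6)
```

A residual close to zero at every sample point is consistent with the expression being a solution for any value of C, though it is not a proof; direct differentiation settles the matter.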
The approach just demonstrated applies more widely. In principle, it works
for any differential equation of the form
$$\frac{dy}{dx} = g(x)h(y). \tag{3.3}$$
On dividing this equation through by h(y) (for all values of y other than
those where h(y) = 0), we obtain
$$\frac{1}{h(y)} \frac{dy}{dx} = g(x).$$
Integration with respect to x on both sides gives
$$\int \frac{1}{h(y)} \frac{dy}{dx} \, dx = \int g(x) \, dx,$$
and, on applying the rule for integration by substitution to the left-hand
side, this becomes
$$\int \frac{1}{h(y)} \, dy = \int g(x) \, dx. \tag{3.7}$$
(This is the form that you need to remember! Note that you can obtain it
‘informally’ by dividing Equation (3.3) by h(y), ‘multiplying through by dx’,
and then adding the two integral signs.)
If the two integrals can be evaluated at this stage, then we reach an equation
that relates x and y and features an arbitrary constant. This equation is
the general solution of the differential equation (for values of y other than
those where h(y) = 0); but usually y will not be the subject of this equation.
It is a form of the general solution called an implicit (general) solution of
the differential equation. (An example of an implicit solution is provided
by Equation (3.6).) Usually, the final aim is to make y the subject of the
equation, if possible; that is, to manipulate the equation into the form
$$y = \text{function of } x.$$
This is called the explicit (general) solution of the differential equation.
In either case (implicit or explicit), a particular solution may be obtained
from the general solution as before, by applying an initial condition.
The method just described for solving differential equations of the form (3.3)
is called the method of separation of variables since, in Equation (3.7), we
have separated the variables to either side of the equation, with only the
dependent variable appearing on the left and only the independent variable
on the right. The method is summarized below.

Procedure 3.2 Separation of variables


This method applies to separable differential equations, which are of
the form
\[ \frac{dy}{dx} = g(x)\,h(y). \tag{3.3} \]
(a) Divide both sides by h(y) (where $h(y) \neq 0$), and integrate both
sides with respect to x, to obtain
\[ \int \frac{1}{h(y)}\,dy = \int g(x)\,dx. \tag{3.7} \]
(b) If possible, perform the integrations, to obtain an implicit form of
the general solution.
(c) If possible, rearrange the formula found in Step (b) to give y in
terms of x. This is the explicit (general) solution.
It is a good idea to check, by substitution into the original differential equation, that the
function obtained is indeed a solution.


The separation of variables method is useful, but there are some difficulties
with it. First, it may not be possible to perform the necessary integrations.
Second, the general solution obtained is restricted to those values of y such
that $h(y) \neq 0$. Third, it may not be possible to perform the necessary
manipulations to obtain an explicit solution.
Of these difficulties, the first can be overcome by use of a numerical method,
such as Euler’s method. The second will be discussed shortly. The third
will usually also need numerical techniques.
It is necessary to be careful about the domain or image set of the solution
obtained, as the following example illustrates.

Example 3.2
(a) Find the general solution of the differential equation
\[ \frac{dy}{dx} = -\frac{x}{3y} \quad (y > 0). \]
(b) Find the particular solution that satisfies the initial condition y(0) = 3.

Solution
(a) The equation is of the form
\[ \frac{dy}{dx} = g(x)\,h(y), \]
where the obvious choices for g and h are
\[ g(x) = -x \quad\text{and}\quad h(y) = \frac{1}{3y}. \]
(Notice that since y > 0, h(y) is never zero.)
We now apply Procedure 3.2. On dividing through by h(y) = 1/(3y)
(that is, multiplying through by 3y) and integrating with respect to x,
the differential equation becomes
\[ \int 3y\,dy = \int (-x)\,dx. \]
(With practice, you will be able to move directly to this stage, as shown in Procedure 3.2.)
Evaluating the integrals gives
\[ \tfrac32 y^2 = -\tfrac12 x^2 + B, \]
where B is an arbitrary constant. This is an implicit form of the general
solution.
On solving for y (and noting the condition y > 0 given above, which
determines the sign of the square root), we obtain the explicit general
solution
\[ y = \sqrt{\tfrac13 (2B - x^2)}. \]
This can be simplified slightly by writing C in place of 2B, where C
is another arbitrary constant. However, we need to recognize that the
formula for y represents a real quantity greater than zero only when the
argument of the square root is positive, so we must have $C - x^2 > 0$.
This in turn means that C cannot be completely arbitrary, since it must
at least be positive. (Since $x^2 \ge 0$ for all x, $C - x^2 > 0$ implies that $C > x^2 \ge 0$, so C must be positive.) The general solution in this case is therefore
\[ y = \sqrt{\tfrac13 (C - x^2)} \quad (-\sqrt{C} < x < \sqrt{C}), \]
where C is a positive but otherwise arbitrary constant.


(b) The initial condition is y(0) = 3, so we substitute x = 0 and y = 3 into
the general solution above. This gives $3 = \sqrt{\tfrac13 C}$, so C = 27, and the
required particular solution is
\[ y = \sqrt{\tfrac13 (27 - x^2)} \quad (-3\sqrt{3} < x < 3\sqrt{3}). \]
(You verified in Exercise 1.2(d) that this function is a solution of the differential equation.)
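The particular solution of Example 3.2(b) can also be cross-checked against a numerical solution of the same initial-value problem. The sketch below is an illustration only: it uses the classical fourth-order Runge-Kutta scheme, which is an assumption swapped in here for accuracy and is not the Euler method described in this unit. The function names and the choice of 200 steps are likewise illustrative.

```python
import math

def f(x, y):
    """Right-hand side of dy/dx = -x/(3y)."""
    return -x / (3.0 * y)

def rk4(f, x0, y0, x_end, n):
    """Advance y from x0 to x_end in n classical Runge-Kutta steps."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def exact(x):
    """Particular solution y = sqrt((27 - x^2)/3), valid for |x| < 3*sqrt(3)."""
    return math.sqrt((27.0 - x * x) / 3.0)

numeric = rk4(f, 0.0, 3.0, 2.0, 200)
print(numeric, exact(2.0))  # the two values should agree closely
```

The comparison is made at x = 2, well inside the interval on which the solution is defined; near the end-points $x = \pm 3\sqrt{3}$ the slope becomes unbounded and any numerical method needs more care.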

*Exercise 3.3
A mass m(t) of a uranium isotope, which is present in an object at time t,
declines over time due to radioactive decay. Its behaviour is modelled by
the differential equation
\[ \frac{dm}{dt} = -\lambda m \quad (m > 0), \]
where the decay constant λ is a positive constant characteristic of the uranium isotope. (This model can be applied to other radioactive substances by selecting the appropriate value of the parameter λ.)
(a) Find the general solution of this differential equation.
(b) Find the particular solution for which the initial amount of uranium
present (at time t = 0) is m0 .

The condition m > 0 in Exercise 3.3 arose from the modelling context. This
condition enabled us to find the general solution without needing to worry
about dividing by zero at Step (a) of the separation of variables method
(and hence without needing to restrict the image set further). Suppose we
were to forget the modelling context — that is, suppose we were to remove
the restriction m > 0. How does this affect the solution process? And how
do we cope with the case where m = 0? These questions are answered in the
following example where, to emphasize the absence of the previous modelling
context, the variables used are x and y.

Example 3.3
Find the general solution of the differential equation
\[ \frac{dy}{dx} = -\lambda y, \]
where λ is a non-zero constant.
Solution
To apply the separation of variables method, we need to exclude the cases
where y = 0. So, for $y \neq 0$, on dividing through by y, integrating with
respect to x, and using the rule for integration by substitution on the left-hand side, we obtain
\[ \int \frac{1}{y}\,dy = \int (-\lambda)\,dx. \tag{3.8} \]
Integrating, we obtain
\[ \ln|y| = -\lambda x + B, \]
where B is an arbitrary constant. (You saw in Unit 1 that $\int \frac{1}{y}\,dy = \ln|y|$ for $y \neq 0$.) Taking exponentials gives
\[ |y| = e^{-\lambda x + B} \]
or, removing the modulus sign,
\[ y = \pm e^{-\lambda x + B} = \pm e^{B} e^{-\lambda x} = C e^{-\lambda x}, \]
where $C = \pm e^{B}$ is a non-zero but otherwise arbitrary constant.


This is not quite the general solution, as we have to consider what happens
when y = 0. Now, looking at the above solution, it is natural to ask what
happens when C = 0. This gives the zero function, y = 0 for all x, and
inspection of the differential equation shows that this is a particular solution.
So we now have the general solution
\[ y = C e^{-\lambda x}, \]
where C is an arbitrary constant. (Positive C corresponds to y > 0, negative
C to y < 0, and C = 0 to the particular solution y = 0.)

The above example illustrates that:


• the separation of variables method requires that $h(y) \neq 0$ and gives a
family of solutions containing an arbitrary constant;
• the case when h(y) = 0 is exceptional and can give extra solutions that
may or may not have the same form as the family of general solutions.
The following exercises provide you with some practice at applying the separation of variables method and at completing the general solution for values
of y such that h(y) = 0.
Exercise 3.4
Find the general solution of each of the following differential equations.
*(a) $\dfrac{dy}{dx} = \dfrac{y - 1}{x}$ (x > 0)    (b) $\dfrac{dy}{dx} = \dfrac{2y}{x^2 + 1}$
*Exercise 3.5
Solve the initial-value problem
\[ \frac{dv}{du} = e^{u+v}, \quad v(0) = 0. \]

End-of-section Exercises
Exercise 3.6
Find the general solution of each of the following differential equations,
where a is a non-zero constant.
(a) $\dfrac{dy}{du} = \dfrac{1}{u - a}$ ($u \neq a$)
(b) $\dfrac{dy}{dx} = \dfrac{1}{x(1 - ax)}$ ($x \neq 0$, $x \neq 1/a$)
Exercise 3.7
Find the general solution of each of the following differential equations.
(a) $u' = xu$    (b) $\dot{x} = 1 + x^2$
Exercise 3.8
(a) Solve the initial-value problem
\[ \frac{dP}{dt} = kP\left(1 - \frac{P}{M}\right), \quad P(0) = P_0 \quad (\text{where } P_0 > 0), \]
where k and M are positive constants.
(Hint: For the integral involving P, the solution for Exercise 3.6(b)
should be of use.)
(b) Describe what happens to the solution P(t) as t becomes large.
(The differential equation here is the logistic equation (Equation (2.2)) which, as was pointed out earlier, may be used as a model for the size P(t) of a population at time t. The direction field of this equation in a specific case was examined in Exercise 2.1 (page 74).)


4 Solving linear differential equations


This section presents one final method of analytic solution for first-order
differential equations. The details of this integrating factor method appear
in Subsection 4.2. It applies only to a particular form of equation known
as a linear differential equation. The definition and some properties of this
type of equation are introduced in Subsection 4.1.

4.1 Linear differential equations


This subsection introduces the concept of linearity as applied to differential equations. Here the concept is introduced in the context of first-order
differential equations, but you should be aware that the idea generalizes to
higher-order differential equations and is important from a theoretical point
of view. (Linear second-order differential equations are considered in Unit 3.)

Definitions
(a) A first-order differential equation for y = y(x) is linear if it can be
expressed in the form
\[ \frac{dy}{dx} + g(x)\,y = h(x), \tag{4.1} \]
where g(x) and h(x) are given functions. (This differential equation can be written in the general form $\frac{dy}{dx} = f(x, y)$ that we have been using by putting $f(x, y) = -g(x)y + h(x)$.)
(b) A linear first-order differential equation is said to be homogeneous if h(x) = 0 for all x, and inhomogeneous or non-homogeneous otherwise.

For example, the differential equation


\[ \frac{dy}{dx} - x^2 y = x^3 \]
is linear, with $g(x) = -x^2$ and $h(x) = x^3$, whereas the equation
\[ \frac{dy}{dx} = x y^2 \]
is not, due to the presence of the non-linear term $y^2$.

*Exercise 4.1
Decide whether or not each of the following first-order differential equations
is linear.
(a) $\dfrac{dy}{dx} + x^3 y = x^5$    (b) $\dfrac{dy}{dx} = x \sin x$    (c) $\dfrac{dz}{dt} = -3z^{1/2}$
(d) $\dot{y} + y^2 = t$    (e) $x\dfrac{dy}{dx} + y = y^2$    (f) $(1 + x^2)\dfrac{dy}{dx} + 2xy = 3x^2$

An important theorem guarantees that an initial-value problem based on a


linear first-order differential equation has a unique solution.


Theorem 4.1
If the functions g(x) and h(x) are continuous throughout an interval
(a, b) and $x_0$ belongs to this interval, then the initial-value problem
\[ \frac{dy}{dx} + g(x)\,y = h(x), \quad y(x_0) = y_0, \]
has a unique solution throughout the interval. (This includes the possibility that either a = −∞ or b = ∞, so the interval might be all of the real line.)

This is a very powerful result, since it means that once you have found a
solution in a particular interval, that solution will be the only one.
There is a particularly useful technique for solving linear differential equa-
tions, to which we turn next.

4.2 The integrating factor method


As you have seen, the method of separation of variables relies upon an
application of the rule for integration by substitution, which is equivalent to
the Composite Rule (or Chain Rule) for derivatives. It is natural to enquire
whether there might similarly be a method for solving first-order differential
equations that derives from the rule for integration by parts or, equivalently,
from the Product Rule for derivatives. There is indeed such a method, and
it is the subject of this subsection.
To introduce the topic, consider the differential equation
\[ (1 + x^2)\frac{dy}{dx} + 2xy = 3x^2. \tag{4.2} \]
(As you saw in Exercise 4.1(f), this differential equation is linear; but it is not soluble by direct integration or by separation of variables.)
Note first that 2x (the coefficient of y) is the derivative of $1 + x^2$ (the coefficient of dy/dx). It follows from the Product Rule that
\[ \frac{d}{dx}\left((1 + x^2)y\right) = (1 + x^2)\frac{dy}{dx} + 2xy. \]
The right-hand side of this equation is the same as the left-hand side of
Equation (4.2), so we can rewrite the latter as
\[ \frac{d}{dx}\left((1 + x^2)y\right) = 3x^2. \tag{4.3} \]
Now the left-hand side here is just the derivative of $(1 + x^2)y$, so we can
apply direct integration to Equation (4.3) to obtain
\[ (1 + x^2)y = \int 3x^2\,dx = x^3 + C, \]
where C is an arbitrary constant. Division by $1 + x^2$ then gives the general
solution of Equation (4.2) explicitly, as
\[ y = \frac{x^3 + C}{1 + x^2}. \]
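The general solution of Equation (4.2) can be checked by substitution without any rounding error. The sketch below is illustrative only: it uses exact rational arithmetic from the Python standard library, with the derivative of y written out by hand from the Quotient Rule. The function names and the grid of sample values of x and C are assumptions made for the illustration.

```python
from fractions import Fraction

def y(x, C):
    """General solution y = (x^3 + C)/(1 + x^2) of Equation (4.2)."""
    return (x**3 + C) / (1 + x**2)

def dy_dx(x, C):
    """Derivative of y, obtained by hand from the Quotient Rule."""
    return (3 * x**2 * (1 + x**2) - 2 * x * (x**3 + C)) / (1 + x**2) ** 2

def lhs(x, C):
    """Left-hand side of Equation (4.2): (1 + x^2) dy/dx + 2xy."""
    return (1 + x**2) * dy_dx(x, C) + 2 * x * y(x, C)

# Exact rational sample points: x in {-2, ..., 2} in thirds, C in {-2, ..., 2} in halves.
samples = [(Fraction(a, 3), Fraction(c, 2)) for a in range(-6, 7) for c in range(-4, 5)]
all_match = all(lhs(x, C) == 3 * x**2 for x, C in samples)
print(all_match)
```

Because `Fraction` arithmetic is exact, the equality test here is a genuine identity check at each sample point, not an approximate one.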


This solution was arrived at by noting that the left-hand side of Equation (4.2) is of the form
\[ p\frac{dy}{dx} + \frac{dp}{dx}y, \tag{4.4} \]
where $p = 1 + x^2$, and that this form can be re-expressed, using the Product
Rule, as
\[ \frac{d}{dx}(py). \]
Linear differential equations need not come in this convenient form. For
example, the left-hand side of the equation
\[ \frac{dy}{dx} + \frac{2x}{1 + x^2}\,y = \frac{3x^2}{1 + x^2} \tag{4.5} \]
is not of the form (4.4). However, Equation (4.2) can be obtained from
Equation (4.5) on multiplying through by $p = 1 + x^2$. For this reason,
$p = 1 + x^2$ may be called an integrating factor for Equation (4.5): it is
the factor by which Equation (4.5) needs to be multiplied in order that the
resulting differential equation has a left-hand side of the form (4.4), enabling
direct integration to be performed.
This leaves the question of how such an integrating factor can be found,
starting from Equation (4.5). The answer comes from writing down the two
properties that such a function p = p(x) must satisfy, as follows.
• Multiplying Equation (4.5) by p gives, on the left-hand side,
\[ p\frac{dy}{dx} + p\,\frac{2x}{1 + x^2}\,y. \]
• The left-hand side must be of the form
\[ p\frac{dy}{dx} + \frac{dp}{dx}y. \tag{4.4} \]
Comparison of these two expressions shows that p must itself be a particular
solution of the differential equation
\[ \frac{dp}{dx} = \frac{2x}{1 + x^2}\,p. \tag{4.6} \]
This is a homogeneous linear first-order differential equation, and we can
solve it by separation of variables. Indeed, following Procedure 3.2, the
equation becomes (for $p \neq 0$)
\[ \int \frac{dp}{p} = \int \frac{2x}{1 + x^2}\,dx. \]
Performing the left-hand integral gives
\[ \ln|p| = \int \frac{2x}{1 + x^2}\,dx, \]
so
\[ |p| = \exp\left(\int \frac{2x}{1 + x^2}\,dx\right). \tag{4.7} \]


Now, performing the integral on the right,
\[ |p| = \exp\left(\ln(|1 + x^2|) + A\right) = \exp(A)\,|1 + x^2| = D(1 + x^2), \]
where D (= exp(A)) is a positive but otherwise arbitrary constant. (Note that $1 + x^2 > 0$, so $|1 + x^2| = 1 + x^2$.) Hence
\[ p = \pm D(1 + x^2), \]
which, by redefining D, can be written as
\[ p = D(1 + x^2), \]
where D is now a non-zero but otherwise arbitrary constant. (The case D = 0 corresponds to the solution p = 0 of Equation (4.6), but this solution is not of interest.)
Thus an integrating factor for Equation (4.5) is $p(x) = D(1 + x^2)$. Multiplying through the equation by this factor yields
\[ D(1 + x^2)\frac{dy}{dx} + 2Dxy = 3Dx^2, \]
and now you can see that (since $D \neq 0$) the arbitrary constant D can be
chosen without affecting the applicability of the form (4.4). Therefore we
choose the integrating factor to have the simplest possible form; in this
case we obtain $p(x) = 1 + x^2$.
As you have seen, this leads to the solution of Equation (4.5) by direct integration, and the formula for this integrating factor is given by Equation (4.7)
as
\[ p = \exp\left(\int \frac{2x}{1 + x^2}\,dx\right). \tag{4.8} \]
This approach generalizes to any linear first-order differential equation, provided that the integrals involved can be evaluated. For an equation written
in the form
\[ \frac{dy}{dx} + g(x)\,y = h(x), \tag{4.1} \]
the function g(x) takes the place of $2x/(1 + x^2)$ in Equation (4.5). To find
an integrating factor p = p(x) for Equation (4.1), the argument proceeds as
above, with $2x/(1 + x^2)$ replaced by g(x) at each step. This leads to the
generalized form of Equation (4.8), namely
\[ p = \exp\left(\int g(x)\,dx\right), \tag{4.9} \]
which defines the integrating factor for Equation (4.1). (Remember that calculation of the integrating factor does not require the inclusion of a constant of integration.)
When Equation (4.1) is multiplied through by the integrating factor, the
resulting differential equation is
\[ p(x)\frac{dy}{dx} + p(x)g(x)\,y = p(x)h(x), \tag{4.10} \]
the left-hand side of which, by the definition of p, is of the form (4.4); so
Equation (4.10) can be re-expressed, using the Product Rule, as
\[ \frac{d}{dx}\left(p(x)\,y\right) = p(x)h(x). \tag{4.11} \]
(The definition of p ensures that the left-hand side of Equation (4.10) is of the form (4.4) since
\[ \frac{dp}{dx} = \frac{d}{dx}\exp\left(\int g(x)\,dx\right) = \exp\left(\int g(x)\,dx\right) g(x) = p(x)g(x). \])
Direct integration can then be used on Equation (4.11) to try to find the
general solution.
This integrating factor method is summarized below.


Procedure 4.1 Integrating factor method


This method applies to differential equations of the form
\[ \frac{dy}{dx} + g(x)\,y = h(x). \tag{4.1} \]
(a) Determine the integrating factor
\[ p = \exp\left(\int g(x)\,dx\right). \tag{4.9} \]
(The constant of integration is not needed here.)
(b) Multiply Equation (4.1) by p(x) to recast the differential equation
as
\[ p(x)\frac{dy}{dx} + p(x)g(x)\,y = p(x)h(x). \]
(You can, if you wish, check that you have found p correctly by checking that $p(x)\frac{dy}{dx} + p(x)g(x)\,y = \frac{d}{dx}\left(p(x)\,y\right)$, i.e. by checking that $dp/dx = p(x)g(x)$.)
(c) Rewrite the differential equation as
\[ \frac{d}{dx}\left(p(x)\,y\right) = p(x)h(x). \]
(d) Integrate this last equation, to obtain
\[ p(x)\,y = \int p(x)h(x)\,dx. \]
(e) Divide through by p(x), to obtain the general solution in explicit
form.
It is a good idea to check, by substitution into the original equation, that the function
obtained is indeed a solution.

As with the separation of variables method, it may not be possible to perform


the necessary final integration. However, in the remainder of this subsection
we give examples and exercises for which this method can be used.
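When the integrals in Procedure 4.1 cannot be found in closed form, the same steps can still be followed with numerical integration. The sketch below is one possible illustration, not the course's computer algebra package. It assumes an initial condition $y(x_0) = y_0$, takes $x_0$ as the lower limit of both integrals (so that $p(x_0) = 1$), and uses composite Simpson's rule, which is not discussed in this unit. All function names are hypothetical. It is tested on Equation (4.5) with y(0) = 1, for which the general solution found above gives C = 1.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def solve_linear_ivp(g, h, x0, y0, x):
    """Integrating factor method with numerical integrals.

    Uses p(t) = exp(integral of g from x0 to t), so p(x0) = 1, and then
    p(x) y(x) = y0 + integral of p(t) h(t) from x0 to x.
    """
    def p(t):
        return math.exp(simpson(g, x0, t))
    return (y0 + simpson(lambda t: p(t) * h(t), x0, x)) / p(x)

# Equation (4.5): dy/dx + 2x/(1 + x^2) y = 3x^2/(1 + x^2), with y(0) = 1.
g = lambda x: 2 * x / (1 + x * x)
h = lambda x: 3 * x * x / (1 + x * x)
approx = solve_linear_ivp(g, h, 0.0, 1.0, 2.0)
exact = (2.0**3 + 1) / (1 + 2.0**2)  # (x^3 + C)/(1 + x^2) with C = 1, at x = 2
print(approx, exact)
```

Fixing the lower limit of integration at $x_0$ is just one convenient choice of the constant of integration in Step (a); as noted in the procedure, any choice gives a valid integrating factor.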

Example 4.1
Use the integrating factor method to find the general solution of each of the
following differential equations.
(a) $\dfrac{dy}{dx} = x - \dfrac{2xy}{x^2 + 1}$    (b) $\dfrac{dy}{dx} = \dfrac{y - 1}{x}$ (x > 0)    (c) $\dfrac{dy}{dx} = \dfrac{2y}{1 + x^2}$
(The first example cannot be solved by separation of variables. The latter two can, as you saw in Exercise 3.4. You can compare these answers with those obtained earlier.)

Solution
(a) On rearranging the differential equation as
\[ \frac{dy}{dx} + \frac{2xy}{x^2 + 1} = x, \]
we see that it is in the form of Equation (4.1) with
\[ g(x) = \frac{2x}{x^2 + 1} \quad\text{and}\quad h(x) = x. \]
The integrating factor (from Equation (4.9)) is therefore
\[ p = \exp\left(\int \frac{2x}{x^2 + 1}\,dx\right) = \exp(\ln|x^2 + 1|) = \exp(\ln(x^2 + 1)) = x^2 + 1, \]
since $1 + x^2 > 0$. (Checking, we see that $dp/dx = 2x = g(x)p(x)$.)


Multiplying both sides of the differential equation by this factor yields
\[ (x^2 + 1)\frac{dy}{dx} + 2xy = x(x^2 + 1), \]
and the differential equation thus becomes
\[ \frac{d}{dx}\left((x^2 + 1)y\right) = x(x^2 + 1). \]
Integrating both sides gives
\[ (x^2 + 1)y = \int x(x^2 + 1)\,dx = \int (x^3 + x)\,dx = \tfrac14 x^4 + \tfrac12 x^2 + C, \]
where C is an arbitrary constant. Finally, to obtain an explicit solution
we divide by $x^2 + 1$ to obtain
\[ y = \frac{x^4 + 2x^2 + 4C}{4(x^2 + 1)}. \]
(b) On rearranging the differential equation as
\[ \frac{dy}{dx} - \frac{1}{x}\,y = -\frac{1}{x}, \]
we see that it is in the form of Equation (4.1) with g(x) = h(x) = −1/x.
The integrating factor (from Equation (4.9)) is therefore
\[ p = \exp\left(-\int \frac{1}{x}\,dx\right) = \exp(-\ln x) = \exp\left(\ln\frac{1}{x}\right) = \frac{1}{x}, \]
since x > 0. (Recall that $a \ln x = \ln(x^a)$ and hence, in particular, $-\ln x = \ln(x^{-1}) = \ln(1/x)$. Checking, we see that $dp/dx = -1/x^2 = g(x)p(x)$.)
Multiplying through the equation by p(x) = 1/x gives
\[ \frac{1}{x}\frac{dy}{dx} - \frac{1}{x^2}\,y = -\frac{1}{x^2}, \]
and the differential equation becomes
\[ \frac{d}{dx}\left(\frac{1}{x}\,y\right) = -\frac{1}{x^2}. \]
Integration then gives
\[ \frac{y}{x} = -\int \frac{1}{x^2}\,dx = \frac{1}{x} + C, \]
where C is an arbitrary constant. The general solution is therefore
\[ y = 1 + Cx, \]
where C is an arbitrary constant.


(c) In order to put the given differential equation into the form (4.1), we
need to bring the term in y to the left-hand side to obtain
\[ \frac{dy}{dx} - \frac{2}{1 + x^2}\,y = 0. \tag{4.12} \]
Hence, in this case, we have $g(x) = -2/(1 + x^2)$ and h(x) = 0. (The equation is homogeneous.) The
integrating factor is
\[ p = \exp\left(-\int \frac{2}{1 + x^2}\,dx\right) = \exp(-2\arctan x) = e^{-2\arctan x}. \]
(Checking, we see that
\[ \frac{dp}{dx} = \frac{-2e^{-2\arctan x}}{1 + x^2} = e^{-2\arctan x}\left(-\frac{2}{1 + x^2}\right) = p(x)g(x). \])
Multiplying through by the integrating factor gives
\[ e^{-2\arctan x}\frac{dy}{dx} - \frac{2y}{1 + x^2}\,e^{-2\arctan x} = 0. \]
Thus the differential equation can be rewritten as
\[ \frac{d}{dx}\left(e^{-2\arctan x}\,y\right) = 0. \]
It follows, on integrating, that
\[ e^{-2\arctan x}\,y = C, \quad\text{or, equivalently,}\quad y = Ce^{2\arctan x}, \]
where C is an arbitrary constant. This is the general solution.

*Exercise 4.2
Find the general solution of each of the following differential equations.
(a) $\dfrac{dy}{dx} - y = e^x \sin x$    (b) $\dfrac{dy}{dx} = y + x$
(For part (b), see Examples 1.1 and 2.1. A direction field diagram is shown in Figure 2.2.)
Exercise 4.3
Use the integrating factor method to solve each of the following initial-value
problems.
(a) $u' = xu$, u(0) = 2.
(b) $t\dot{y} + 2y = t^2$, y(1) = 1.
(You saw the differential equation in part (a) in Exercise 3.7(a), where you solved it using separation of variables.)

End-of-section Exercises

Exercise 4.4
Which method would you use to try to solve each of the following linear
first-order differential equations?
(a) $\dfrac{dy}{dx} + x^3 y = x^5$    (b) $\dfrac{dy}{dx} = x \sin x$
(c) $\dfrac{dv}{du} + 5v = 0$    (d) $(1 + x^2)\dfrac{dy}{dx} + 2xy = 1 + x^2$
Exercise 4.5
Solve each of the following initial-value problems.
(a) $\dot{y} + y = t + 1$, y(1) = 0.
(b) $e^{3t}\dot{y} = 1 - e^{3t}y$, y(0) = 3.
(The differential equation in part (a) is equivalent to that considered in parts (e) and (f) of Exercise 1.2.)


Exercise 4.6
Find the general solution of each of the following differential equations.
(a) $x\dfrac{dy}{dx} - 3y = x$ (x > 0)
(b) $\dfrac{dv}{dt} + 4v = 3\cos 2t$
(Hint: If a and b are non-zero constants, then
\[ \int e^{at}\cos bt\,dt = \frac{e^{at}}{a^2 + b^2}\,(a\cos bt + b\sin bt) + C, \]
where C is an arbitrary constant.)

5 Finding analytic solutions on the computer

In this section you will see how direct integration, the method of separation
of variables and the integrating factor method can be used on the computer
to solve first-order differential equations.

Use your computer to complete the following activities.


*Activity 5.1
Use direct integration to solve the initial-value problem
\[ \frac{dy}{dx} = e^{-3x}, \quad y(0) = \tfrac53. \]
Compare your solution with that obtained in Example 3.1.

*Activity 5.2
Use separation of variables to find the general solution of each of the following differential equations.
(a) $\dfrac{dy}{dx} = -\lambda y$    (b) $\dfrac{dy}{dx} = \dfrac{2y}{x^2 + 1}$    (c) $\dfrac{dy}{dx} = 1 + y^2$
Compare your solutions with those obtained in Example 3.3 and Exercises 3.4(b) and 3.7(b), respectively.

*Activity 5.3
Use the integrating factor method to solve the following initial-value problems.
(a) $\dfrac{dy}{dx} = x + y$, y(0) = 0.
(b) $x\dfrac{dy}{dx} + 2y = x^2$, y(1) = 1.
Compare your solutions with those obtained in Exercises 4.2(b) and 4.3(b),
respectively.


*Activity 5.4
Use the integrating factor method to find the general solution of each of the
following differential equations.
(a) $x\dfrac{dy}{dx} - 3y = x$    (b) $\dfrac{dy}{dx} + 4y = 3\cos 2x$
Compare your solutions with those obtained in Exercises 4.6(a) and 4.6(b),
respectively.

Outcomes
After studying this unit you should be able to:
• understand and use the basic terminology relating to differential equations and their solutions;
• check by substitution whether a given function is a solution of a given
first-order differential equation or initial-value problem;
• find from the general solution of a first-order differential equation the
particular solution that satisfies a given initial condition;
• appreciate the difficulties with domains and image sets for the solution
of some differential equations;
• deduce the qualitative behaviour of solutions from consideration of a
first-order differential equation itself, as visualized from its direction
field;
• set up the formulae required by Euler's method for solving an initial-value problem, carry out a few steps of the method by hand, and use
the computer to deal with large numbers of steps;
• recognize when a first-order differential equation is soluble by direct
integration, and carry out that integration when appropriate, by hand
in simple cases and otherwise on the computer;
• recognize when a first-order differential equation is separable, and apply the method of separation of variables by hand in simple cases and
otherwise on the computer;
• recognize when a first-order differential equation is linear, and solve such
an equation by the integrating factor method, by hand in simple cases
and otherwise on the computer.


Solutions to the exercises

Section 1

1.1 We have $r(P) = k\left(1 - \frac{P}{M}\right)$, so we simply need
to solve the following pair of simultaneous equations:
\[ k\left(1 - \frac{10}{M}\right) = 1, \qquad k\left(1 - \frac{10\,000}{M}\right) = 0. \]
From the second equation, since k > 0, we see immediately that M = 10 000. Substituting in the first equation leads to
\[ \frac{999}{1000}\,k = 1, \quad\text{so}\quad k = \frac{1000}{999}. \]

1.2 In each case, differences in notation notwithstanding, the differential equation has the form
\[ \frac{dy}{dx} = f(x, y), \]
and we need to show that the given function y = y(x)
satisfies this equation, i.e. gives the same expression for
either side of the equation.
(a) If $y = 2e^x - (x^2 + 2x + 2)$, then differentiating y
gives
\[ \frac{dy}{dx} = 2e^x - 2x - 2, \]
and substituting the expression for y into the expression
for f gives
\[ f(x, y) = y + x^2 = 2e^x - (x^2 + 2x + 2) + x^2 = 2e^x - 2x - 2, \]
as required.
(b) If $y = \tfrac12 x^2 + \tfrac32$, then
\[ \frac{dy}{dx} = x \quad\text{and}\quad f(x, y) = x. \]
(c) If $u = 2e^{x^2/2}$, then
\[ u' = \frac{du}{dx} = 2xe^{x^2/2} \]
and
\[ f(x, u) = xu = 2xe^{x^2/2}. \]
(d) If $y = \sqrt{(27 - x^2)/3}$ $(-3\sqrt{3} < x < 3\sqrt{3})$, then
\[ \frac{dy}{dx} = -\frac{x}{3}\left(\frac{27 - x^2}{3}\right)^{-1/2} \]
and
\[ f(x, y) = -\frac{x}{3y} = -\frac{x}{3}\left(\frac{27 - x^2}{3}\right)^{-1/2}. \]
(e) If $y = t + e^{-t}$, then
\[ \dot{y} = \frac{dy}{dt} = 1 - e^{-t} \]
and
\[ f(t, y) = -y + t + 1 = -(t + e^{-t}) + t + 1 = 1 - e^{-t}. \]
(f) If $y = t + Ce^{-t}$, then
\[ \dot{y} = \frac{dy}{dt} = 1 - Ce^{-t} \]
and
\[ f(t, y) = -y + t + 1 = -(t + Ce^{-t}) + t + 1 = 1 - Ce^{-t}. \]

1.3 In each case, differences in notation notwithstanding, the differential equation has the form
\[ \frac{dy}{dx} = f(x, y), \]
and we need to show that the given function y = y(x)
satisfies this equation, i.e. gives the same expression for
either side of the equation.
(a) If $y = C - \tfrac13 e^{-3x}$, then
\[ \frac{dy}{dx} = e^{-3x} \quad\text{and}\quad f(x, y) = e^{-3x}. \]
(b) If $u = Ce^t - t - 1$, then
\[ \dot{u} = \frac{du}{dt} = Ce^t - 1 \]
and
\[ f(t, u) = t + u = Ce^t - 1. \]
(c) If
\[ P = \frac{CMe^{kt}}{1 + Ce^{kt}}, \]
then, using the Quotient Rule for differentiation,
\[ \frac{dP}{dt} = \frac{(1 + Ce^{kt})\left(CMke^{kt}\right) - CMe^{kt}\left(Cke^{kt}\right)}{(1 + Ce^{kt})^2} \]
\[ = k\,\frac{CMe^{kt}}{1 + Ce^{kt}}\left(\frac{1 + Ce^{kt} - Ce^{kt}}{1 + Ce^{kt}}\right) \]
\[ = k\,\frac{CMe^{kt}}{1 + Ce^{kt}}\left(1 - \frac{Ce^{kt}}{1 + Ce^{kt}}\right) \]
\[ = kP\left(1 - \frac{P}{M}\right). \]

1.4 (a) From Exercise 1.3(c) we know that
\[ P(t) = \frac{CMe^{kt}}{1 + Ce^{kt}} = \frac{10Ce^{0.15t}}{1 + Ce^{0.15t}} \]
is a solution of the differential equation. The initial
condition P(0) = 1 then implies (since $e^0 = 1$)
\[ 1 = \frac{10C}{1 + C}, \quad\text{so}\quad C = \tfrac19. \]
A particular solution is therefore
\[ P = \frac{\tfrac{10}{9}e^{0.15t}}{1 + \tfrac19 e^{0.15t}} = \frac{10e^{0.15t}}{9 + e^{0.15t}}. \]
(b) Dividing top and bottom by $e^{0.15t}$, we see that
\[ P = \frac{10}{9e^{-0.15t} + 1}. \]
For large values of t, the exponential term on the bottom will be very small. The result is that P will approach the value 10 in the long term.
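The conclusions of Solution 1.4 can be confirmed numerically. The sketch below is illustrative only: the values k = 0.15 and M = 10 are read off the formula in part (a), the derivative is estimated by a central difference, and the function names and sample times are assumptions.

```python
import math

K, M = 0.15, 10.0  # values read off the formula in Solution 1.4(a)

def P(t):
    """Particular solution P = 10 e^{0.15 t} / (9 + e^{0.15 t})."""
    e = math.exp(K * t)
    return M * e / (9.0 + e)

def residual(t, eps=1e-6):
    """Central-difference estimate of dP/dt minus k P (1 - P/M)."""
    dPdt = (P(t + eps) - P(t - eps)) / (2 * eps)
    return dPdt - K * P(t) * (1 - P(t) / M)

initial_value = P(0.0)   # should equal the initial condition P(0) = 1
late_value = P(200.0)    # should be very close to the limiting value 10
worst = max(abs(residual(t)) for t in (0.0, 5.0, 20.0, 50.0))
print(initial_value, late_value, worst)
```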


1.5 Calculator results are given to eight significant figures. (Different calculators may give slightly different
results.)
(a) f(3.142) = 31.018 339,
f(3.141 592 6) = 31.006 275.
So we have
\[ \text{error} = f(3.142) - f(\pi) \simeq f(3.142) - f(3.141\,592\,6) \simeq 0.012. \]
(b) $f(3.142) = 4.421\,123\,2 \times 10^{13}$,
$f(3.141\,592\,6) = 4.403\,148\,2 \times 10^{13}$.
So we have
\[ \text{error} = f(3.142) - f(\pi) \simeq f(3.142) - f(3.141\,592\,6) \simeq 1.8 \times 10^{11}. \]

1.6 (a) If $y = \arcsin x + C$ (−1 < x < 1), then differentiating gives
\[ \frac{dy}{dx} = \frac{1}{\sqrt{1 - x^2}}, \]
so y satisfies the given differential equation.
(b) The initial condition is $y(\tfrac12) = \tfrac{\pi}{2}$, i.e. $y = \tfrac{\pi}{2}$ when
$x = \tfrac12$. On substituting these values into the solution
from part (a), we have
\[ \tfrac{\pi}{2} = \arcsin\tfrac12 + C = \tfrac{\pi}{6} + C. \]
This gives $C = \tfrac{\pi}{3}$, so the solution of the initial-value
problem is
\[ y = \arcsin x + \tfrac{\pi}{3} \quad (-1 < x < 1). \]

1.7 (a) If $x = \tan(t + C)$ $(-\tfrac{\pi}{2} < t + C < \tfrac{\pi}{2})$, then differentiating gives
\[ \dot{x} = \frac{dx}{dt} = \sec^2(t + C) = 1 + \tan^2(t + C) = 1 + x^2, \]
so x satisfies the given differential equation.
(b) The initial condition is $x(\tfrac{\pi}{4}) = 1$, i.e. x = 1 when
$t = \tfrac{\pi}{4}$. On substituting these values into the solution
from part (a), we have
\[ 1 = \tan\left(\tfrac{\pi}{4} + C\right). \]
This gives $C = \arctan 1 - \tfrac{\pi}{4} = 0$, so the solution of the
initial-value problem is
\[ x = \tan t \quad \left(-\tfrac{\pi}{2} < t < \tfrac{\pi}{2}\right). \]

Section 2

2.1 (a) The slope is shown to be zero at all points on
the horizontal lines P = 0 and P = 1000, so these correspond to constant solutions of the differential equation.
(As pointed out earlier in the text, these two solutions
can also be spotted directly from the form of the differential equation.)
The graphs of solutions through a starting point above
the line P = 1000 appear to decrease, but at a slower
and slower rate, tending from above towards the limit
P = 1000 as t increases.
The graphs of solutions through starting points in the
region 0 < P < 1000 are increasing, with slope growing before the level P = 500 is reached and declining
thereafter. For large values of t, these graphs tend from
below towards the limit P = 1000.
For a starting point in the region P < 0, the graphs decrease without limit and with steeper and steeper slope.
These various cases are illustrated by typical graphs in
the figure below.
[Figure: typical solution graphs, with P on the vertical axis (from −500 to 1500) and the horizontal axis running from 0 to 8.]
(b) If the differential equation is considered as a model
of population behaviour, then the region P < 0 must be
excluded. The analysis above leads to the following predictions for the population.
• If the population is zero at the start, then it remains
zero.
• If the population size starts at 1000, then it remains
fixed at this level.
• If the population starts at a level higher than 1000,
then it declines (more and more gradually) towards
1000.
• If the population starts at a level below 1000 (but
above 0), then it increases and eventually tends
gradually towards 1000.

2.2 For the initial-value problem
\[ \frac{dy}{dx} = y, \quad y(0) = 1, \]
we have $x_0 = 0$, $Y_0 = y_0 = 1$ and $f(x_i, Y_i) = Y_i$. The
step size is given as h = 0.2. Equation (2.9) with i = 0
gives
\[ x_1 = x_0 + h = 0 + 0.2 = 0.2, \]
and Equation (2.10) with i = 0 gives
\[ Y_1 = Y_0 + hf(x_0, Y_0) = 1 + 0.2 \times 1 = 1.2. \]


Applying Equations (2.9) and (2.10) in turn for
i = 1, 2, 3, 4, we obtain the following table.

i    x_i    Y_i         f(x_i, Y_i) = Y_i    Y_{i+1} = Y_i + h f(x_i, Y_i)
0    0      1           1                    1.2
1    0.2    1.2         1.2                  1.44
2    0.4    1.44        1.44                 1.728
3    0.6    1.728       1.728                2.073 6
4    0.8    2.073 6     2.073 6              2.488 32
5    1.0    2.488 32

The approximation to y(1) is 2.488 32.
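The table can be reproduced with a few lines of code. The following is a minimal sketch of Euler's method as given by Equations (2.9) and (2.10); the function name `euler` and its argument list are illustrative choices, not part of the course software.

```python
def euler(f, x0, y0, h, steps):
    """Return the list of points (x_i, Y_i) produced by Euler's method."""
    points = [(x0, y0)]
    x, y = x0, y0
    for i in range(steps):
        y = y + h * f(x, y)       # Equation (2.10): Y_{i+1} = Y_i + h f(x_i, Y_i)
        x = x0 + (i + 1) * h      # Equation (2.9), written so rounding does not accumulate
        points.append((x, y))
    return points

# dy/dx = y, y(0) = 1, step size h = 0.2, five steps to reach x = 1.
points = euler(lambda x, y: y, 0.0, 1.0, 0.2, 5)
for x, y in points:
    print(round(x, 1), round(y, 5))
y_at_1 = points[-1][1]  # approximately 2.48832, the value in the table
```

For this particular equation each step multiplies Y by 1 + h, so the final value is $1.2^5$, which agrees with the last row of the table.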

–1
2.3 Since we are told that, for sufficiently small h, the
absolute error is proportional to the step size $h$, we can deduce from the last row of Table 2.2 that there exists a constant $k$ such that
$$0.000\,136 = 0.0001k,$$
so $k = 1.36$. In order to determine $y(1)$ correct to five decimal places, $h$ must be such that
$$1.36h < 5 \times 10^{-6},$$
or
$$h < \frac{5 \times 10^{-6}}{1.36} \simeq 3.7 \times 10^{-6}.$$
So a suitable choice of $h$ would be $10^{-6} = 0.000\,001$.
(In fact, using this value of $h$ gives an approximation to $y(1)$ of 2.718 280, which is correct to 5 decimal places.)

2.4 (a) The slope defined by the direction field $f(x, y) = y + x^2$ is zero when $y = -x^2$, which is a parabola in the lower half-plane with vertex at the origin. Below this parabola we have $y < -x^2$ and $f(x, y) < 0$, while above the parabola we have $y > -x^2$ and $f(x, y) > 0$. Thus all slopes for points of the plane below the parabola $y = -x^2$ are negative, and all slopes for points above it are positive.
Also, if $x$ is fixed, then $f(x, y) = y + x^2$ is an increasing function as $y$ increases. If instead $y$ is fixed, then for $x > 0$, $f(x, y)$ increases as $x$ increases, and for $x < 0$, $f(x, y)$ increases as $x$ becomes more negative. These observations indicate that the slope given by the direction field increases as we move from bottom to top along any vertical line, whereas on moving along any horizontal line, the slope increases with distance from the $y$-axis.
(b) The features described in the solution to part (a) are all apparent on the direction field diagram. This direction field diagram is repeated below, with the parabola $y = -x^2$ superimposed upon it. (Note that this parabola does not represent a solution of the differential equation.)
(c) It appears from the direction field that there are several types of solution. Any solution whose graph cuts the $y$-axis above the origin has positive slope at all points. The solution graph that passes through the origin has zero slope there, but positive slope everywhere else. Any solution graph that cuts the $y$-axis below the origin has a maximum (where it meets $y = -x^2$ for $x < 0$). Some of these graphs also have a minimum (where they meet $y = -x^2$ for $x > 0$). Others have no minimum (though this is not clear from the diagram given). A solution graph of each type is sketched below.
[Figure: sketch of one solution graph of each type, with $x$ and $y$ axes each running from $-2$ to $2$.]
(d) The initial-value problem is
$$\frac{dy}{dx} = y + x^2, \quad y(-1) = -0.2.$$
From Equations (2.9) and (2.10), the necessary formulae are
$$x_{i+1} = x_i + h,$$
$$Y_{i+1} = Y_i + h f(x_i, Y_i).$$
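As a rough illustration, the two formulae above can be sketched in Python. The function name `euler_steps` and the choice of two steps are our own; the values $x_0 = -1$, $Y_0 = -0.2$ and $h = 0.1$ are those of the initial-value problem in this exercise.

```python
def euler_steps(f, x0, y0, h, n):
    """Return the lists of x_i and Y_i (i = 0..n) produced by Euler's method."""
    xs, ys = [x0], [y0]
    for _ in range(n):
        x, y = xs[-1], ys[-1]
        ys.append(y + h * f(x, y))   # Y_{i+1} = Y_i + h f(x_i, Y_i)
        xs.append(x + h)             # x_{i+1} = x_i + h
    return xs, ys


def slope(x, y):
    # Right-hand side of dy/dx = y + x^2 from this exercise.
    return y + x**2


xs, ys = euler_steps(slope, x0=-1.0, y0=-0.2, h=0.1, n=2)
for x, y in zip(xs, ys):
    print(f"x = {x:.1f}, Y = {y:.4f}")
```

Each pass through the loop applies one Euler step, so increasing `n` simply carries the approximation further along the $x$-axis.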

Solutions to the exercises

For the current problem, $x_0 = -1$, $Y_0 = y_0 = -0.2$, $f(x_i, Y_i) = Y_i + x_i^2$ and $h = 0.1$. The particular formulae needed here are therefore:
$$x_{i+1} = x_i + 0.1, \quad \text{where } x_0 = -1;$$
$$Y_{i+1} = Y_i + 0.1(Y_i + x_i^2), \quad \text{where } Y_0 = -0.2.$$
The second of these formulae can also be written as
$$Y_{i+1} = 1.1Y_i + 0.1x_i^2, \quad \text{where } Y_0 = -0.2.$$

Section 3

3.1 We apply direct integration to find the general solution. In each case, $C$ is an arbitrary constant.
(a) The differential equation $dy/dx = 6x$ has general solution
$$y = \int 6x \, dx = 3x^2 + C.$$
From the initial condition $y(1) = 5$, we have $5 = 3 + C$, so $C = 2$. The solution of the initial-value problem is therefore
$$y = 3x^2 + 2.$$
(b) The differential equation $dv/du = e^{4u}$ has general solution
$$v = \int e^{4u} \, du = \tfrac14 e^{4u} + C.$$
From the initial condition $v(0) = 2$, we have $2 = \tfrac14 + C$, so $C = \tfrac74$. The solution of the initial-value problem is therefore
$$v = \tfrac14 e^{4u} + \tfrac74.$$
(c) The differential equation $\dot{y} = 5 \sin 2t$ has general solution
$$y = \int 5 \sin 2t \, dt = -\tfrac52 \cos 2t + C.$$
From the initial condition $y(0) = 0$, we have $0 = -\tfrac52 + C$, so $C = \tfrac52$. The solution of the initial-value problem is therefore
$$y = \tfrac52 (1 - \cos 2t).$$

3.2 (a) The differential equation $dy/dx = xe^{-2x}$ has general solution
$$y = \int xe^{-2x} \, dx.$$
The integral may be found using integration by parts. Taking $f(x) = x$ and $g'(x) = e^{-2x}$, and using the formula
$$\int f(x)g'(x) \, dx = f(x)g(x) - \int f'(x)g(x) \, dx,$$
we have
$$\int xe^{-2x} \, dx = -\tfrac12 xe^{-2x} + \tfrac12 \int e^{-2x} \, dx = -\tfrac12 xe^{-2x} - \tfrac14 e^{-2x} + C,$$
where $C$ is an arbitrary constant. The general solution of the differential equation is therefore
$$y = -\tfrac14 (2x + 1)e^{-2x} + C.$$
(b) The differential equation $\dot{p} = t/(1 + t^2)$ has general solution
$$p = \int \frac{t}{1 + t^2} \, dt.$$
Using the hint provided, we make the substitution $u = 1 + t^2$, for which $du/dt = 2t$. This gives
$$\int \frac{t}{1 + t^2} \, dt = \tfrac12 \int \frac{1}{1 + t^2} (2t) \, dt = \tfrac12 \int \frac{1}{u} \, du = \tfrac12 \ln u + C \quad (\text{since } u = 1 + t^2 > 0)$$
$$= \tfrac12 \ln(1 + t^2) + C,$$
where $C$ is an arbitrary constant. The general solution of the differential equation is therefore
$$p = \tfrac12 \ln(1 + t^2) + C.$$

3.3 (a) The differential equation is $dm/dt = -\lambda m$, where $m > 0$. Following Procedure 3.2, we obtain
$$\int \frac{1}{m} \, dm = \int (-\lambda) \, dt$$
and, since $m > 0$, integration produces
$$\ln m = -\lambda t + B,$$
where $B$ is an arbitrary constant. On solving this equation for $m$, by taking the exponential of both sides, we obtain
$$m = e^{-\lambda t + B} = e^B e^{-\lambda t} = Ce^{-\lambda t},$$
where $C = e^B$ is a positive (since $e^B > 0$ for all $B$), but otherwise arbitrary, constant. The general solution is therefore
$$m = Ce^{-\lambda t},$$
where $C$ is a positive but otherwise arbitrary constant.
(b) The initial condition is $m(0) = m_0$, from which we have $m_0 = Ce^0$, so $C = m_0$. The required particular solution is therefore
$$m = m_0 e^{-\lambda t}.$$

3.4 (a) The differential equation is
$$\frac{dy}{dx} = \frac{y - 1}{x}, \quad \text{where } x > 0.$$
In order to apply the separation of variables method, we need to exclude the cases where $y = 1$. So, for $y \neq 1$, on applying Procedure 3.2 we have
$$\int \frac{1}{y - 1} \, dy = \int \frac{1}{x} \, dx.$$
Since $x > 0$, for $y \neq 1$ (so that $y - 1 \neq 0$), integration produces
$$\ln |y - 1| = \ln x + B,$$
where $B$ is an arbitrary constant. On solving this equation for $y$, by first taking the exponential of both sides, we obtain
$$y = 1 \pm e^{\ln x + B} = 1 \pm e^B e^{\ln x} = 1 + Cx,$$
where $C = \pm e^B$ is a non-zero but otherwise arbitrary constant.

Unit 2 First-order differential equations

Examination of the differential equation shows that $C = 0$ also gives a solution (the constant function $y = 1$). The general solution is therefore
$$y = 1 + Cx,$$
where $C$ is an arbitrary constant.
(If you cannot convince yourself that this is the general solution for all $y$, including $y = 1$, then you will see it proved in Example 4.1.)
(b) The differential equation is $dy/dx = 2y/(x^2 + 1)$. In order to apply the separation of variables method, we need to exclude the cases where $y = 0$. So, for $y \neq 0$, on applying Procedure 3.2 we have
$$\int \frac{1}{y} \, dy = \int \frac{2}{x^2 + 1} \, dx.$$
Since $y \neq 0$, integration produces
$$\ln |y| = 2(\arctan x + B),$$
where $B$ is an arbitrary constant. On solving this equation for $y$, we obtain
$$y = \pm e^{2 \arctan x + 2B} = \pm e^{2B} e^{2 \arctan x} = Ce^{2 \arctan x},$$
where $C = \pm e^{2B}$ is a non-zero but otherwise arbitrary constant. Examination of the differential equation shows that $C = 0$ also gives a solution (the zero function $y = 0$). The general solution is therefore
$$y = Ce^{2 \arctan x},$$
where $C$ is an arbitrary constant.
(If you cannot convince yourself that this is the general solution for all $y$, including $y = 0$, then you will see it proved in Example 4.1.)

3.5 The differential equation is $dv/du = e^{u+v} = e^u e^v$. Dividing through by $e^v$ and integrating with respect to $u$, we obtain
$$\int e^{-v} \, dv = \int e^u \, du.$$
Integration produces
$$-e^{-v} = e^u + B,$$
where $B$ is an arbitrary constant. On solving this equation for $v$, we obtain
$$v = -\ln(-e^u - B) = -\ln(C - e^u),$$
where $C = -B$. We need $C > 0$ and $u < \ln C$ in order for the argument of $\ln$ here to be positive. Hence the general solution is
$$v = -\ln(C - e^u) \quad (u < \ln C),$$
where $C$ is a positive but otherwise arbitrary constant.
The initial condition $v(0) = 0$ gives $0 = -\ln(C - e^0)$, so $C - e^0 = 1$ and hence $C = 2$. The solution of the initial-value problem is therefore
$$v = -\ln(2 - e^u) \quad (u < \ln 2).$$

3.6 Each of the differential equations is soluble by direct integration.
(a) The general solution of $dy/du = 1/(u - a)$, where $u \neq a$, is given by
$$y = \int \frac{1}{u - a} \, du.$$
Integration produces the general solution
$$y = \ln |u - a| + C,$$
where $C$ is an arbitrary constant.
(b) The general solution of $dy/dx = 1/(x(1 - ax))$, where $x \neq 0$, $x \neq 1/a$, is given by
$$y = \int \frac{1}{x(1 - ax)} \, dx.$$
This can be solved by using the substitution $u = 1/x$. Alternatively, if the integral is rearranged as
$$y = -\frac{1}{a} \int \frac{1}{(x - 0)(x - 1/a)} \, dx,$$
then we can use the table of standard integrals in the Handbook. Either method of integration gives the general solution
$$y = C - \ln \left| \frac{1}{x} - a \right|,$$
where $C$ is an arbitrary constant.

3.7 Each of the differential equations can be solved by separation of variables.
(a) The differential equation is $u' = du/dx = xu$. For the cases where $u \neq 0$, we divide through by $u$ and integrate with respect to $x$. This gives
$$\int \frac{1}{u} \, du = \int x \, dx.$$
Integration produces
$$\ln |u| = \tfrac12 x^2 + B,$$
where $B$ is an arbitrary constant. On solving this equation for $u$, we obtain
$$u = \pm e^{x^2/2 + B} = \pm e^B e^{x^2/2} = Ce^{x^2/2},$$
where $C = \pm e^B$ is a non-zero but otherwise arbitrary constant. However, the case $C = 0$ can be added, since it can be seen by inspection of the differential equation that the zero function $u = 0$ is a solution. Hence the general solution is
$$u = Ce^{x^2/2},$$
where $C$ is an arbitrary constant.
(You verified that $u = 2e^{x^2/2}$ is a particular solution of this differential equation in Exercise 1.2(c).)
(b) The differential equation is $\dot{x} = dx/dt = 1 + x^2$. We divide through by $1 + x^2$ and integrate with respect to $t$. This gives
$$\int \frac{1}{1 + x^2} \, dx = \int 1 \, dt.$$
Integration produces
$$\arctan x = t + C,$$
where $C$ is an arbitrary constant. On solving for $x$, we have
$$x = \tan(t + C).$$
A suitable domain for the solution is $-\frac{\pi}{2} < t + C < \frac{\pi}{2}$, since the image set of $\arctan$ is the interval $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$.


Thus the general solution is
$$x = \tan(t + C) \quad \left(-\tfrac{\pi}{2} < t + C < \tfrac{\pi}{2}\right),$$
where $C$ is an arbitrary constant.
(You verified that this is a solution of the given differential equation in Exercise 1.7.)

3.8 (a) The given equation is $dP/dt = kP(1 - P/M)$. First, note that (as remarked upon in Section 2) the constant functions $P = 0$ and $P = M$ are both solutions. Assuming that we are considering neither of these possibilities (we are certainly not interested in $P = 0$ since we know that $P_0 > 0$), we can use the separation of variables method to obtain
$$\frac{1}{k} \int \frac{1}{P(1 - P/M)} \, dP = \int 1 \, dt.$$
The integral on the left-hand side is of the form evaluated in Exercise 3.6(b), with $1/M$ in place of $a$. Hence we have
$$-\frac{1}{k} \ln \left| \frac{1}{P} - \frac{1}{M} \right| = t + B,$$
where $B$ is an arbitrary constant. On solving for $P$, we find first that
$$\frac{1}{P} - \frac{1}{M} = \pm e^{-k(t+B)} = \pm e^{-kB} e^{-kt} = Ce^{-kt},$$
where $C = \pm e^{-kB}$ is a non-zero but otherwise arbitrary constant. However, note that $C = 0$ corresponds to the constant solution $P = M$ already noted, so the restriction $C \neq 0$ may be dropped. Hence we obtain
$$P = \left( \frac{1}{M} + Ce^{-kt} \right)^{-1} \quad (e^{kt} \neq -MC),$$
where $C$ is an arbitrary constant.
From the initial condition $P(0) = P_0$, we have
$$P_0 = \left( \frac{1}{M} + Ce^0 \right)^{-1}, \quad \text{so} \quad C = \frac{1}{P_0} - \frac{1}{M}.$$
The solution of the initial-value problem is therefore
$$P = \left( \frac{1}{M} + \left( \frac{1}{P_0} - \frac{1}{M} \right) e^{-kt} \right)^{-1},$$
which yields
$$P = \frac{M}{1 + (M/P_0 - 1)e^{-kt}}.$$
Finally, we rewrite this in the more familiar form
$$P = \frac{M e^{kt}}{e^{kt} + (M/P_0 - 1)}.$$
(b) As $t \to \infty$ we have $e^{-kt} \to 0$, and consequently the value of $P(t)$ approaches $M$.
(Note that this is true whether the starting value $P_0$ is greater than or less than $M$. This result is consistent with the specific direction field shown in Figure 2.4.)

Section 4

4.1 (a) The equation $dy/dx + x^3 y = x^5$ is linear, with $g(x) = x^3$ and $h(x) = x^5$.
(b) The equation $dy/dx = x \sin x$ is linear, with $g(x) = 0$ (for all $x$) and $h(x) = x \sin x$.
(c) The equation $dz/dt = -3z^{1/2}$ is not linear (because of the $z^{1/2}$ term).
(d) The equation $\dot{y} + y^2 = t$ is not linear (because of the $y^2$ term).
(e) The equation $x(dy/dx) + y = y^2$ is not linear (because of the $y^2$ term).
(f) The equation $(1 + x^2)(dy/dx) + 2xy = 3x^2$ is linear, since we can divide through by $1 + x^2$ to obtain $dy/dx + 2xy/(1 + x^2) = 3x^2/(1 + x^2)$, which is of the defined form with $g(x) = 2x/(1 + x^2)$ and $h(x) = 3x^2/(1 + x^2)$.

4.2 (a) The given equation is $dy/dx - y = e^x \sin x$. Comparison with Equations (4.1) and (4.9) shows that the integrating factor is
$$p = \exp\left( \int (-1) \, dx \right) = \exp(-x) = e^{-x}.$$
Multiplying through by $p(x)$ gives
$$e^{-x} \frac{dy}{dx} - e^{-x} y = \sin x.$$
Thus the differential equation can be rewritten as
$$\frac{d}{dx}(e^{-x} y) = \sin x.$$
On integrating, we find the general solution
$$e^{-x} y = -\cos x + C,$$
or, equivalently,
$$y = e^x (C - \cos x),$$
where $C$ is an arbitrary constant.
(b) The given equation, when rearranged into form (4.1), is $dy/dx - y = x$. This has the same left-hand side as the differential equation in part (a), and hence the same integrating factor, $p = e^{-x}$. Multiplying through by $p(x)$ gives
$$e^{-x} \frac{dy}{dx} - e^{-x} y = xe^{-x}.$$
Thus the differential equation can be rewritten as
$$\frac{d}{dx}(e^{-x} y) = xe^{-x}.$$
On integrating (by parts on the right-hand side), we find
$$e^{-x} y = \int xe^{-x} \, dx = -xe^{-x} + \int e^{-x} \, dx = -xe^{-x} - e^{-x} + C = C - (x + 1)e^{-x},$$
where $C$ is an arbitrary constant. After multiplying through by $e^x$, the general solution in explicit form is
$$y = Ce^x - (x + 1).$$
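A quick numerical sanity check of the result of Exercise 4.2(b) can be sketched as follows. This is only an illustration: the helper names, the sample points and the finite-difference step `dx` are our own choices, and the derivative is estimated rather than computed exactly.

```python
import math


def y(x, C):
    # General solution found in Exercise 4.2(b): y = C e^x - (x + 1)
    return C * math.exp(x) - (x + 1)


def residual(x, C, dx=1e-5):
    # Central-difference estimate of dy/dx, then dy/dx - y - x (should be near 0).
    dydx = (y(x + dx, C) - y(x - dx, C)) / (2 * dx)
    return dydx - y(x, C) - x


max_residual = max(abs(residual(x, C))
                   for C in (-2.0, 0.0, 1.5)
                   for x in (-1.0, 0.0, 0.5, 2.0))
print(max_residual)
```

A residual close to zero for several values of $C$ is consistent with $y = Ce^x - (x + 1)$ satisfying $dy/dx - y = x$; it is not a proof.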


4.3 (a) The given equation, when rearranged into form (4.1), is $du/dx - xu = 0$. The integrating factor is
$$p = \exp\left( \int (-x) \, dx \right) = \exp(-x^2/2) = e^{-x^2/2}.$$
Multiplying through by $p(x)$ gives
$$e^{-x^2/2} \frac{du}{dx} - xe^{-x^2/2} u = 0.$$
Thus the differential equation can be rewritten as
$$\frac{d}{dx}\left(e^{-x^2/2} u\right) = 0.$$
On integrating, we find the general solution
$$e^{-x^2/2} u = C, \quad \text{or, equivalently,} \quad u = Ce^{x^2/2},$$
where $C$ is an arbitrary constant.
From the initial condition $u(0) = 2$, we have $2 = Ce^0$, so $C = 2$. Hence the solution of the initial-value problem is
$$u = 2e^{x^2/2}.$$
(b) After division by $t$, the given equation can be written as $dy/dt + (2/t)y = t$. (To avoid division by zero, we take $t > 0$, say, which is consistent with the initial condition.) The integrating factor is
$$p = \exp\left( \int \frac{2}{t} \, dt \right) = \exp(2 \ln t) = \exp(\ln(t^2)) = t^2.$$
Multiplying through by $p(t)$ gives
$$t^2 \frac{dy}{dt} + 2ty = t^3.$$
Thus the differential equation can be rewritten as
$$\frac{d}{dt}(t^2 y) = t^3.$$
On integrating, we find the general solution
$$t^2 y = \tfrac14 t^4 + C, \quad \text{or, equivalently,} \quad y = \tfrac14 t^2 + Ct^{-2},$$
where $C$ is an arbitrary constant.
From the initial condition $y(1) = 1$, we have $1 = \tfrac14 + C$, so $C = \tfrac34$. Hence the solution of the initial-value problem is
$$y = \tfrac14 (t^2 + 3t^{-2}).$$

4.4 (a) and (d) require the integrating factor method. (b) is best solved by direct integration. (c) can be solved by separation of variables or the integrating factor method.

4.5 (a) The given equation is $dy/dt + y = t + 1$. Comparison with Equations (4.1) and (4.9) shows that the integrating factor is
$$p = \exp\left( \int 1 \, dt \right) = \exp(t) = e^t.$$
Multiplying through by $p(t)$ gives
$$e^t \frac{dy}{dt} + e^t y = (t + 1)e^t.$$
Thus the differential equation can be rewritten as
$$\frac{d}{dt}(e^t y) = (t + 1)e^t.$$
On integrating (by parts on the right-hand side), we find
$$e^t y = \int (t + 1)e^t \, dt = (t + 1)e^t - \int e^t \, dt = (t + 1)e^t - e^t + C = te^t + C,$$
where $C$ is an arbitrary constant. After multiplying through by $e^{-t}$, the general solution in explicit form is
$$y = Ce^{-t} + t.$$
From the initial condition $y(1) = 0$, we have $0 = Ce^{-1} + 1$, so $C = -e$. Hence the solution of the initial-value problem is
$$y = t - e^{1-t}.$$
(b) After division by $e^{3t}$ and rearrangement, the given equation becomes $dy/dt + y = e^{-3t}$. This has the same left-hand side as the differential equation in part (a), and hence the same integrating factor, $p = e^t$. Multiplying through by $p(t)$ gives
$$e^t \frac{dy}{dt} + e^t y = e^{-2t}.$$
Thus the differential equation can be rewritten as
$$\frac{d}{dt}(e^t y) = e^{-2t}.$$
On integrating, we find the general solution
$$e^t y = -\tfrac12 e^{-2t} + C,$$
or, equivalently,
$$y = Ce^{-t} - \tfrac12 e^{-3t},$$
where $C$ is an arbitrary constant.
From the initial condition $y(0) = 3$, we have $3 = Ce^0 - \tfrac12 e^0$, so $C = \tfrac72$. Hence the solution of the initial-value problem is
$$y = \tfrac12 (7e^{-t} - e^{-3t}).$$


4.6 (a) After division by $x$ (where $x > 0$), the given equation becomes $dy/dx - (3/x)y = 1$. The integrating factor is
$$p = \exp\left( \int -\frac{3}{x} \, dx \right) = \exp(-3 \ln x) = \exp(\ln(x^{-3})) = x^{-3}.$$
Multiplying through by $p(x)$ gives
$$x^{-3} \frac{dy}{dx} - 3x^{-4} y = x^{-3}.$$
Thus the differential equation can be rewritten as
$$\frac{d}{dx}(x^{-3} y) = x^{-3}.$$
On integrating, we find the general solution
$$x^{-3} y = -\tfrac12 x^{-2} + C,$$
or, equivalently,
$$y = Cx^3 - \tfrac12 x,$$
where $C$ is an arbitrary constant.
(b) The given equation is $dv/dt + 4v = 3 \cos 2t$. The integrating factor is
$$p = \exp\left( \int 4 \, dt \right) = \exp(4t) = e^{4t}.$$
Multiplying through by $p(t)$ gives
$$e^{4t} \frac{dv}{dt} + 4e^{4t} v = 3e^{4t} \cos 2t.$$
Thus the differential equation can be rewritten as
$$\frac{d}{dt}(e^{4t} v) = 3e^{4t} \cos 2t.$$
On integrating (using the hint for the right-hand side, with $a = 4$ and $b = 2$), we find
$$e^{4t} v = \tfrac{3}{20} e^{4t} (4 \cos 2t + 2 \sin 2t) + C,$$
where $C$ is an arbitrary constant. After multiplying through by $e^{-4t}$, the general solution in explicit form is
$$v = \tfrac{3}{20} (4 \cos 2t + 2 \sin 2t) + Ce^{-4t}.$$

UNIT 3 Second-order differential equations

Study guide for Unit 3

This unit extends the ideas of Unit 2 from first-order differential equations to a particular type of second-order differential equation. This type of second-order differential equation has a variety of applications, some of which are considered later in the course.
This unit requires no previous knowledge beyond that required for Unit 2, apart from some familiarity with complex numbers. The relevant material on complex numbers was revised in Unit 1 of this course.
Sections 1 and 2 contain the most important material.
The recommended study pattern is to study one section per study session and to study the sections in the order in which they appear. However, you may find that Sections 1 and 2 take you rather longer than Sections 3 and 4, and you may wish to spread your study of Sections 1 and 2 over three sessions.
Section 4 uses the computer algebra package for the course and is designed to
help you to understand the nature of the solutions obtained in Sections 1–3.


Introduction
Unit 2 introduced you to differential equations, and in particular to first-order differential equations that can be written in the form
$$\frac{dy}{dx} = f(x, y).$$
Such an equation is said to be of first order because it involves only the first derivative $dy/dx$ of the function $y = y(x)$.
This unit considers second-order differential equations, that is, differential equations that involve a second (but no higher) derivative. (The order of a differential equation was defined in Subsection 1.2 of Unit 2.) Examples of second-order differential equations are
$$\frac{d^2y}{dx^2} - 3\frac{dy}{dx} + 2y = 4e^x \quad \text{and} \quad 3\frac{d^2y}{dx^2} + y^2 \sin x = x^2.$$
(A second-order differential equation may or may not include a first derivative.)
As in the case of first-order differential equations, second-order differential equations, and in particular the derivatives in such equations, can be written in a variety of notations. For example, the second derivative of a dependent variable $y$ with respect to an independent variable $t$ (representing time) may be written as $d^2y/dt^2$, $\ddot{y}$, $y''$, and so on. (Of course, the dependent variable is not always $y$!) Also, as in the case of first-order equations, the dependent variable and sometimes the independent variable can be considered as functions, and the same symbol is frequently used for both the variable and the corresponding function.
One particularly simple example of a second-order differential equation, with dependent variable $s$ and independent variable $t$, is
$$\frac{d^2s}{dt^2} = a, \tag{0.1}$$
where $a$ is a given constant. This equation can be solved by applying direct integration twice (see Subsection 3.1 of Unit 2). One application gives
$$\frac{ds}{dt} = \int a \, dt = at + C,$$
where $C$ is an arbitrary constant. Integrating a second time gives
$$s = \int (at + C) \, dt = \tfrac12 at^2 + Ct + D, \tag{0.2}$$
where $D$ is another arbitrary constant. Equation (0.2) is the general solution of the second-order differential equation (0.1). (Recall, from Unit 2, that the general solution of a differential equation is the collection of all possible solutions of that equation.)
Now, in Unit 2 you saw that the general solution of a first-order differential equation usually involves just one arbitrary constant. But here, even for such a simple second-order differential equation, the general solution involves two arbitrary constants (namely $C$ and $D$). It is a property of second-order differential equations that the general solution usually involves two arbitrary constants.
The remainder of the unit proceeds as follows. Section 1 concentrates on
homogeneous linear constant-coefficient second-order differential equations,
leaving inhomogeneous equations to Section 2. Section 3 considers the types
of condition needed to move from a general to a particular solution. Finally,
Section 4 uses the computer to examine the nature of solutions.


1 Homogeneous differential equations


After a short introduction in Subsection 1.1, Subsection 1.2 shows how ho-
mogeneous second-order differential equations can be solved. Subsection 1.3
explains why the solutions thus obtained are indeed the required general
solutions.

1.1 First thoughts


You will recall that a particular solution of a first-order differential equation
is obtained by applying a single condition (known as an initial condition ) to
the general solution in order to find a particular value of the single arbitrary
constant. In the case of a second-order differential equation, a particular
solution is obtained by applying two conditions to the general solution in
order to find particular values of the two arbitrary constants. The following
example illustrates this.

Example 1.1
Suppose that a car is travelling with constant acceleration $a$ along a straight road. If, at time $t$, its distance from a fixed point is $s$, then its velocity is given by $ds/dt$, its acceleration is given by $d^2s/dt^2$, and its motion is modelled by
$$\frac{d^2s}{dt^2} = a. \tag{0.1}$$
If the car is initially stationary at position $s = 0$ and thereafter has a constant acceleration of $2\ \mathrm{m\,s^{-2}}$, how long does it take for the car to attain a velocity of $30\ \mathrm{m\,s^{-1}}$, and what distance has it travelled in that time?
Solution
You saw in the Introduction that integrating Equation (0.1) leads to
$$\frac{ds}{dt} = at + C \quad \text{and} \quad s = \tfrac12 at^2 + Ct + D,$$
where $C$ and $D$ are arbitrary constants. To find these constants (and hence answer the questions asked), we need to make use of the conditions given. These are that the car is initially stationary (i.e. $ds/dt = 0$ when $t = 0$) at position $s = 0$ (i.e. $s = 0$ when $t = 0$). The first of these conditions, together with the equation $ds/dt = at + C$, tells us that $C = 0$. With $C = 0$, the second equation becomes $s = \tfrac12 at^2 + D$, and this together with the second condition tells us that $D = 0$.
Therefore, when $a = 2$, we have
$$\frac{d^2s}{dt^2} = 2, \quad \frac{ds}{dt} = 2t, \quad s = t^2.$$
So the velocity is $ds/dt = 30$ when $2t = 30$, i.e. after 15 seconds, and in this time the car has travelled $s = 15^2 = 225$ metres.
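The arithmetic of Example 1.1 can be retraced with a few lines of Python. This is only a sketch: the function names `position` and `velocity` are our own, and the constants $C = D = 0$ are the values fixed by the two initial conditions above.

```python
def position(t, a, C=0.0, D=0.0):
    # General solution (0.2): s = a t^2 / 2 + C t + D
    return 0.5 * a * t**2 + C * t + D


def velocity(t, a, C=0.0):
    # First integral of (0.1): ds/dt = a t + C
    return a * t + C


a = 2.0               # acceleration in m s^-2
target_speed = 30.0   # m s^-1

# The conditions ds/dt = 0 and s = 0 at t = 0 give C = 0 and D = 0,
# so the target speed is reached when a * t = target_speed.
t_hit = target_speed / a
print(t_hit, position(t_hit, a))
```

Changing `a` or `target_speed` shows how the time and distance depend on the data, with the two constants still determined by the same pair of conditions.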

The solution of second-order differential equations is rarely as easy as the solution of Equation (0.1) above. In fact, the approach of repeated direct integration works for only some equations of the form
$$\frac{d^2y}{dx^2} = f(x).$$


Most second-order differential equations cannot be solved by analytic methods at all, and numerical methods have to be employed instead. (Such numerical methods are discussed in Unit 26.) However, there is one important class of second-order differential equations that can be solved by analytic means: this is the topic of this unit, and we introduce it next.

Linear constant-coefficient differential equations

This unit considers linear constant-coefficient second-order differential equations. (You met the idea of a linear first-order differential equation in Unit 2.) But what exactly do the terms ‘linear’ and ‘constant-coefficient’ mean in this context?

The answer lies in the following definitions.

Definitions
(a) A second-order differential equation for $y = y(x)$ is linear if it can be expressed in the form
$$a(x)\frac{d^2y}{dx^2} + b(x)\frac{dy}{dx} + c(x)y = f(x),$$
where $a(x)$, $b(x)$, $c(x)$ and $f(x)$ are given continuous functions. (Compare the definitions for first-order equations in Subsection 4.1 of Unit 2. The important feature is the linear combination of $y$ and its derivatives on the left-hand side.)
(b) A linear second-order differential equation is constant-coefficient if the functions $a(x)$, $b(x)$ and $c(x)$ are all constant, so that the equation is of the form
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x), \tag{1.1}$$
where $a \neq 0$. (If $a = 0$, then the equation is first-order.)
(c) A linear constant-coefficient second-order differential equation is said to be homogeneous if $f(x) = 0$ for all $x$, and inhomogeneous (or non-homogeneous) otherwise.

Linear constant-coefficient second-order differential equations can be written in other ways. For example, we can divide Equation (1.1) through by $a$ to obtain an equation of the form
$$\frac{d^2y}{dx^2} + \beta\frac{dy}{dx} + \gamma y = \phi(x),$$
and this more closely resembles the definition of linear first-order differential equations from Unit 2.

*Exercise 1.1
Consider the following second-order differential equations.
d2 y d2 y dy d2 y dy
(i) = x2 (ii) 3 + 4 + y = x2
(iii) 3 +4 +y =0
dx2 dx 2 dx dx 2 dx
d2 y dy d2 y dy
(iv) xy  + x2 y = 0 (v) 2y 2 + xy = 3 (vi) 2y 2 + 4y = 3
dx dx dx dx
d2 t dt
(vii) 2 2 + 3 + 4t = sin θ (viii) x ¨ = −4t (ix) x ¨ = −4x
dθ dθ
(a) Which of the equations are linear and constant-coefficient?
(b) Which of the linear constant-coefficient equations are homogeneous?
(c) For each equation, identify the dependent and independent variables.


One of the main reasons for concentrating on linear constant-coefficient dif-


ferential equations is that there is a large body of theory upon which we can
call in order to solve them. The next subsection illustrates this.

The principle of superposition


A key theoretical result will turn out to be extremely useful throughout this
unit. This is known as the principle of superposition, and is the fundamental
property of linear differential equations.
Suppose that we have a solution $y_1(x)$ of
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f_1(x),$$
and a solution $y_2(x)$ of
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f_2(x).$$
(Here $a$, $b$ and $c$ can be functions of $x$.)
Then we claim that the linear combination $k_1y_1 + k_2y_2$, where $k_1$ and $k_2$ are constants, is a solution of
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = k_1f_1(x) + k_2f_2(x). \tag{1.2}$$
In fact, this is easy to see, for if we substitute $k_1y_1 + k_2y_2$ directly into the left-hand side of Equation (1.2), we obtain
$$\begin{aligned}
& a\frac{d^2}{dx^2}(k_1y_1 + k_2y_2) + b\frac{d}{dx}(k_1y_1 + k_2y_2) + c(k_1y_1 + k_2y_2) \\
&\quad = a\left(k_1\frac{d^2y_1}{dx^2} + k_2\frac{d^2y_2}{dx^2}\right) + b\left(k_1\frac{dy_1}{dx} + k_2\frac{dy_2}{dx}\right) + c(k_1y_1 + k_2y_2) \\
&\quad = k_1\left(a\frac{d^2y_1}{dx^2} + b\frac{dy_1}{dx} + cy_1\right) + k_2\left(a\frac{d^2y_2}{dx^2} + b\frac{dy_2}{dx} + cy_2\right) \\
&\quad = k_1f_1(x) + k_2f_2(x),
\end{aligned}$$
as required.
We summarize this important result as a theorem.

Theorem 1.1 Principle of superposition

If $y_1(x)$ is a solution of the linear second-order differential equation
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f_1(x),$$
and $y_2(x)$ is a solution of the linear second-order differential equation
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f_2(x)$$
(with the same left-hand side), then the function
$$y(x) = k_1y_1(x) + k_2y_2(x),$$
where $k_1$ and $k_2$ are constants, is a solution of the differential equation
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = k_1f_1(x) + k_2f_2(x).$$
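The principle can be spot-checked numerically. The sketch below is illustrative only: the coefficients $a = 1$, $b = -3$, $c = 2$, the functions $y_1 = e^{3x}$ and $y_2 = x$, and the constants $k_1$, $k_2$ are arbitrary choices of ours, and the right-hand sides $f_1$, $f_2$ are simply defined as whatever the left-hand side produces for $y_1$ and $y_2$.

```python
import math

a, b, c = 1.0, -3.0, 2.0   # example constant coefficients (our choice)


def L(y, dy, d2y, x):
    # Left-hand side a y'' + b y' + c y evaluated at x.
    return a * d2y(x) + b * dy(x) + c * y(x)


# Each function is stored as the triple (y, y', y'') with exact derivatives.
y1 = (lambda x: math.exp(3 * x),
      lambda x: 3 * math.exp(3 * x),
      lambda x: 9 * math.exp(3 * x))
y2 = (lambda x: x, lambda x: 1.0, lambda x: 0.0)


def f1(x):
    return L(*y1, x)   # right-hand side satisfied by y1


def f2(x):
    return L(*y2, x)   # right-hand side satisfied by y2


k1, k2 = 2.5, -4.0


def combo(i):
    # i-th derivative (i = 0, 1, 2) of k1*y1 + k2*y2.
    return lambda x: k1 * y1[i](x) + k2 * y2[i](x)


def gap(x):
    lhs = L(combo(0), combo(1), combo(2), x)
    return abs(lhs - (k1 * f1(x) + k2 * f2(x)))


worst_gap = max(gap(x) for x in (-1.0, 0.0, 0.3, 1.0))
print(worst_gap)
```

The gap is zero up to rounding error, which mirrors the algebra in the proof: the left-hand side is linear in $y$ and its derivatives.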


1.2 Method of solution


This subsection develops a method for solving homogeneous linear constant-coefficient second-order differential equations, i.e. equations of the form
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0, \tag{1.3}$$
where $a$, $b$, $c$ are constants and $a \neq 0$.
To see how this method arises, consider the first-order differential equation
$$b\frac{dy}{dx} + cy = 0, \tag{1.4}$$
where $b$ and $c$ are constants and $b \neq 0$. This is a homogeneous linear equation; as can be shown using the integrating factor method from Unit 2, this has a general solution of the form $y = Ae^{\lambda x}$, where $A$ is an arbitrary constant and $\lambda$ is some fixed constant. To find $\lambda$, we could solve the equation as in Unit 2; alternatively, we can substitute $y = Ae^{\lambda x}$ into Equation (1.4). Then we have $dy/dx = \lambda Ae^{\lambda x}$, and so
$$b\frac{dy}{dx} + cy = b\lambda Ae^{\lambda x} + cAe^{\lambda x} = (b\lambda + c)Ae^{\lambda x}.$$
Therefore, for $y = Ae^{\lambda x}$ to be a solution, $(b\lambda + c)Ae^{\lambda x}$ must be zero, for all $x$. Since $A$ is arbitrary and $e^{\lambda x} > 0$, for all $x$, we must have $b\lambda + c = 0$, i.e. $\lambda = -c/b$.
This useful idea of substituting $y = Ae^{\lambda x}$ as a trial solution can be applied to Equation (1.3) as well. Let us suppose that Equation (1.3) has a solution of the form $y = Ae^{\lambda x}$, for some value of $\lambda$. If so, then $dy/dx = \lambda Ae^{\lambda x}$ and $d^2y/dx^2 = \lambda^2 Ae^{\lambda x}$, and substituting into the left-hand side of Equation (1.3) gives
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = a\lambda^2 Ae^{\lambda x} + b\lambda Ae^{\lambda x} + cAe^{\lambda x} = (a\lambda^2 + b\lambda + c)Ae^{\lambda x}.$$
Hence $y = Ae^{\lambda x}$ is indeed a solution of Equation (1.3), for any value of $A$, provided that $\lambda$ satisfies
$$a\lambda^2 + b\lambda + c = 0. \tag{1.5}$$
(Note that the discussion here applies irrespective of whether $\lambda$ is real or complex. The consequences of $\lambda$ being complex are explained later.)
Equation (1.5) plays such an important role in solving linear constant-coefficient second-order differential equations that it is given a special name.

Definition
The auxiliary equation of the homogeneous linear constant-coefficient second-order differential equation
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0$$
is the quadratic equation
$$a\lambda^2 + b\lambda + c = 0. \tag{1.5}$$
(The auxiliary equation is sometimes called the characteristic equation.)

The auxiliary equation is obtained from the differential equation by replacing $y$ by 1, $\dfrac{dy}{dx}$ by $\lambda$, and $\dfrac{d^2y}{dx^2}$ by $\lambda^2$.


Example 1.2
Write down the auxiliary equation of the differential equation
$$3\frac{d^2y}{dx^2} - 2\frac{dy}{dx} + 4y = 0.$$
Solution
The auxiliary equation is
$$3\lambda^2 - 2\lambda + 4 = 0.$$
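Since the method of solution relies on the roots of the auxiliary equation, it may help to see how they can be computed. The following is a minimal sketch using Python's standard `cmath` module; the function name `auxiliary_roots` is our own, and the quadratic formula is applied with a complex square root so that real and complex roots are handled in the same way.

```python
import cmath


def auxiliary_roots(a, b, c):
    """Roots of a*lam^2 + b*lam + c = 0, returned as complex numbers."""
    if a == 0:
        raise ValueError("a must be non-zero for a second-order equation")
    disc = cmath.sqrt(b * b - 4 * a * c)   # complex square root of the discriminant
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)


# Auxiliary equation of Example 1.2: 3*lam^2 - 2*lam + 4 = 0
r1, r2 = auxiliary_roots(3, -2, 4)
print(r1, r2)
```

For Example 1.2 the discriminant is negative, so the two roots form a complex conjugate pair.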

*Exercise 1.2
Write down the auxiliary equation of each of the following differential equations.
(a) $\dfrac{d^2y}{dx^2} - 5\dfrac{dy}{dx} + 6y = 0$ (b) $y'' - 9y = 0$ (c) $\ddot{x} + 2\dot{x} = 0$

Now, so far, we know that $y = Ae^{\lambda x}$ is a solution of Equation (1.3) provided that $\lambda$ satisfies its auxiliary equation. But the auxiliary equation is a quadratic equation with real coefficients, and so has two roots (which in general are distinct). (The roots of a quadratic equation were discussed in Unit 1.) These two roots, $\lambda_1$ and $\lambda_2$ say, give two solutions $y_1 = Ce^{\lambda_1 x}$ and $y_2 = De^{\lambda_2 x}$ of Equation (1.3), where $C$ and $D$ are arbitrary constants. (If $\lambda_1 = \lambda_2$, then we obtain only one solution. This case is dealt with separately below.)

Example 1.3
(a) Write down the auxiliary equation of the differential equation
$$\frac{d^2y}{dx^2} - 3\frac{dy}{dx} + 2y = 0,$$
and find its roots $\lambda_1$ and $\lambda_2$.
(b) Deduce that $y_1 = Ce^x$ and $y_2 = De^{2x}$ are both solutions of the differential equation, for any values of the two constants $C$ and $D$.
(c) Show that $y = Ce^x + De^{2x}$ is also a solution of the differential equation, for any values of the two constants $C$ and $D$.
Solution
(a) The auxiliary equation is
$$\lambda^2 - 3\lambda + 2 = 0.$$
This equation may be solved, for example, by factorizing in the form $(\lambda - 1)(\lambda - 2) = 0$, to give the two roots $\lambda_1 = 1$ and $\lambda_2 = 2$. (Using the formula
$$\lambda_1, \lambda_2 = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
produces the same answer. It does not matter which of the roots is called $\lambda_1$ and which is called $\lambda_2$.)
(b) Since $\lambda_1 = 1$ and $\lambda_2 = 2$ are the roots of the auxiliary equation, $y_1 = Ce^x$ and $y_2 = De^{2x}$ are solutions of the differential equation, for any values of $C$ and $D$.
(c) To show that $y = Ce^x + De^{2x}$ is a solution of the differential equation, we differentiate and substitute into the differential equation. Differentiating to obtain the first and second derivatives of $y$ gives
$$\frac{dy}{dx} = Ce^x + 2De^{2x}$$
and
$$\frac{d^2y}{dx^2} = Ce^x + 4De^{2x}.$$


Substituting these into the left-hand side of the differential equation gives
$$\begin{aligned}
\frac{d^2y}{dx^2} - 3\frac{dy}{dx} + 2y &= \left(Ce^x + 4De^{2x}\right) - 3\left(Ce^x + 2De^{2x}\right) + 2\left(Ce^x + De^{2x}\right) \\
&= C(1 - 3 + 2)e^x + D(4 - 6 + 2)e^{2x} \\
&= 0.
\end{aligned}$$
Hence $y = Ce^x + De^{2x}$ is a solution of the differential equation, for any values of $C$ and $D$.

In Subsection 1.3 we shall prove that if $\lambda_1$ and $\lambda_2$ are distinct roots of the auxiliary equation of a homogeneous linear constant-coefficient second-order differential equation, then any solution is of the form
$$y = Ce^{\lambda_1 x} + De^{\lambda_2 x}, \tag{1.6}$$
for some choice of constants $C$ and $D$.
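The form (1.6) can be checked numerically for a specific equation. The sketch below assumes real, distinct roots and uses the equation of Example 1.3 ($a = 1$, $b = -3$, $c = 2$, with roots 1 and 2); the function name `general_solution` and the particular values of $C$ and $D$ are our own illustrative choices.

```python
import math


def general_solution(l1, l2, C, D):
    """Return y, y', y'' for y = C e^{l1 x} + D e^{l2 x} (real, distinct roots)."""
    y = lambda x: C * math.exp(l1 * x) + D * math.exp(l2 * x)
    dy = lambda x: C * l1 * math.exp(l1 * x) + D * l2 * math.exp(l2 * x)
    d2y = lambda x: C * l1**2 * math.exp(l1 * x) + D * l2**2 * math.exp(l2 * x)
    return y, dy, d2y


a, b, c = 1.0, -3.0, 2.0   # coefficients from Example 1.3
y, dy, d2y = general_solution(1.0, 2.0, C=0.7, D=-1.3)

# a y'' + b y' + c y should vanish (up to rounding) at every sample point.
residuals = [a * d2y(x) + b * dy(x) + c * y(x) for x in (-1.0, 0.0, 0.5, 1.0)]
print(residuals)
```

Trying other values of `C` and `D` leaves the residuals at rounding level, in line with the claim that the form holds for any choice of the two constants.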

*Exercise 1.3
Use the auxiliary equation to find the general solution of each of the following differential equations.
(a) $\dfrac{d^2y}{dx^2} + 5\dfrac{dy}{dx} + 6y = 0$ (b) $2\dfrac{d^2y}{dx^2} + 3\dfrac{dy}{dx} = 0$ (c) $\dfrac{d^2z}{du^2} - 4z = 0$

We now consider an example where the two roots of the auxiliary equation are equal, in which case the above recipe does not work! Indeed, in light of the earlier discussion, you might expect the solution always to be of the form $y = Ae^{\lambda_1 x} + Be^{\lambda_2 x}$, where $A$ and $B$ are arbitrary constants. But if $\lambda_1 = \lambda_2$, this reduces to $y = (A + B)e^{\lambda_1 x} = Ce^{\lambda_1 x}$, where $C = A + B$ is a single arbitrary constant, so this cannot be the general solution of a second-order differential equation.

Example 1.4
(a) Write down the auxiliary equation of the differential equation
$$\frac{d^2y}{dx^2} + 6\frac{dy}{dx} + 9y = 0,$$
and find its roots $\lambda_1$ and $\lambda_2$.
(b) Deduce that $y_1 = Ce^{-3x}$ is a solution of the differential equation, for any value of the constant $C$.
(c) Show that $y_2 = Dxe^{-3x}$ is also a solution, for any value of the constant $D$.
(d) Deduce that $y = (C + Dx)e^{-3x}$ is also a solution of the differential equation, for any values of the two constants $C$ and $D$.
Solution
(a) The auxiliary equation is
$$\lambda^2 + 6\lambda + 9 = 0.$$
The left-hand side is the perfect square $(\lambda + 3)^2$, so the auxiliary equation has equal roots $\lambda_1 = \lambda_2 = -3$.
(b) Since $\lambda_1 = -3$ is a root of the auxiliary equation, $y_1 = Ce^{-3x}$ is a solution of the differential equation, for any value of $C$. (Note that the ‘other’ root $\lambda_2 = -3$ gives the same solution.)

(c) To show that $y_2 = Dxe^{-3x}$ is a solution of the differential equation, we differentiate and substitute into the differential equation. Differentiating to obtain the first and second derivatives of $y_2$ (using the Product Rule for differentiation) gives
$$\frac{dy_2}{dx} = De^{-3x} + Dx\left(-3e^{-3x}\right) = D(1 - 3x)e^{-3x},$$
$$\frac{d^2y_2}{dx^2} = -3De^{-3x} + D(1 - 3x)\left(-3e^{-3x}\right) = D(-6 + 9x)e^{-3x}.$$
Substituting these into the left-hand side of the differential equation gives
$$\begin{aligned}
\frac{d^2y_2}{dx^2} + 6\frac{dy_2}{dx} + 9y_2 &= D(-6 + 9x)e^{-3x} + 6D(1 - 3x)e^{-3x} + 9Dxe^{-3x} \\
&= D(-6 + 6)e^{-3x} + D(9 - 18 + 9)xe^{-3x} \\
&= 0.
\end{aligned}$$
Hence $y_2 = Dxe^{-3x}$ is a solution of the differential equation, for any value of $D$.
(d) Since $y_1 = Ce^{-3x}$ and $y_2 = Dxe^{-3x}$ are both solutions of the differential equation, the principle of superposition (Theorem 1.1) tells us that so is $y = Ce^{-3x} + Dxe^{-3x} = (C + Dx)e^{-3x}$, for any values of $C$ and $D$.

The solution in Example 1.4 is of the form $y = Ce^{\lambda_1 x} + Dxe^{\lambda_1 x}$. The extra $x$
in the second term, $Dxe^{\lambda_1 x}$, is needed, in this special case, to incorporate the
second arbitrary constant required by the general solution of a second-order
differential equation.
In general, when $\lambda_1 = \lambda_2$, $y = xe^{\lambda_1 x}$ is a solution of Equation (1.3) (page 114).
To see this, differentiate twice to obtain
$$\frac{dy}{dx} = e^{\lambda_1 x} + \lambda_1 xe^{\lambda_1 x} = (1 + \lambda_1 x)e^{\lambda_1 x},$$
$$\frac{d^2y}{dx^2} = \lambda_1 e^{\lambda_1 x} + \lambda_1(1 + \lambda_1 x)e^{\lambda_1 x} = (2\lambda_1 + \lambda_1^2 x)e^{\lambda_1 x},$$
and substitute into the left-hand side of Equation (1.3) to obtain
$$\begin{aligned}
a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy
&= a(2\lambda_1 + \lambda_1^2 x)e^{\lambda_1 x} + b(1 + \lambda_1 x)e^{\lambda_1 x} + cxe^{\lambda_1 x} \\
&= e^{\lambda_1 x}\left[a(2\lambda_1 + \lambda_1^2 x) + b(1 + \lambda_1 x) + cx\right] \\
&= e^{\lambda_1 x}\left[(2a\lambda_1 + b) + (a\lambda_1^2 + b\lambda_1 + c)x\right].
\end{aligned} \tag{1.7}$$
Since $\lambda_1$ is a root of the auxiliary equation, we have $a\lambda_1^2 + b\lambda_1 + c = 0$.
Also, the formula method for solving the auxiliary equation $a\lambda^2 + b\lambda + c = 0$
gives
$$\lambda = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a};$$
since in this case we have equal roots, we must have $b^2 - 4ac = 0$, so
$\lambda_1 = -b/2a$, and therefore $2a\lambda_1 + b = 0$. Thus the right-hand side of
Equation (1.7) is zero, and $y = xe^{\lambda_1 x}$ is indeed a solution of Equation (1.3).
Therefore, when $\lambda_1 = \lambda_2$, by the principle of superposition,
$$y = Ce^{\lambda_1 x} + Dxe^{\lambda_1 x} = (C + Dx)e^{\lambda_1 x}, \tag{1.8}$$
where $C$ and $D$ are arbitrary constants, is always a solution of
Equation (1.3). In fact, as you will see in Subsection 1.3, Equation (1.8) gives
the general solution.
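
The equal-roots result can be checked numerically. The following is a minimal sketch, not part of the course text: the function name `residual_equal_roots` and the sample values of C, D and x are illustrative assumptions. It evaluates the left-hand side of Equation (1.3) for y = (C + Dx)e^(λx), using the derivatives obtained above, and should return values that are zero up to rounding error.

```python
import math

def residual_equal_roots(a, b, c, C, D, x):
    """Return a*y'' + b*y' + c*y for y = (C + D*x)*exp(lam*x), lam = -b/(2a).

    Hypothetical helper: only valid when b^2 - 4ac = 0 (equal roots).
    """
    if b * b - 4 * a * c != 0:
        raise ValueError("auxiliary equation does not have equal roots")
    lam = -b / (2 * a)
    e = math.exp(lam * x)
    y = (C + D * x) * e
    dy = (D + lam * (C + D * x)) * e                      # Product Rule
    d2y = (2 * lam * D + lam * lam * (C + D * x)) * e     # Product Rule again
    return a * d2y + b * dy + c * y

# Example 1.4: y'' + 6y' + 9y = 0, for which lam = -3.
for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, residual_equal_roots(1, 6, 9, C=2.0, D=-5.0, x=x))
```

Any other choice of C and D should give the same outcome, in line with Equation (1.8).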


*Exercise 1.4
Use the auxiliary equation to find the general solution of the following
differential equations.
(a) $\dfrac{d^2y}{dx^2} + 2\dfrac{dy}{dx} + y = 0$    (b) $\ddot{s} - 4\dot{s} + 4s = 0$

Equations (1.6) and (1.8) give us the general solution of Equation (1.3) for
the cases where the roots $\lambda_1$ and $\lambda_2$ of the auxiliary equation are distinct
and equal, respectively. However, the distinct roots of a quadratic equation
may not be real: they could consist of a pair of complex conjugate roots
$\lambda_1 = \alpha + \beta i$ and $\lambda_2 = \alpha - \beta i$. (Recall that the complex conjugate of the
complex number $\alpha + \beta i$ is $\alpha - \beta i$.) If the auxiliary equation has such a pair of
roots, we can still write the general solution in the form
$$y = Ae^{\lambda_1 x} + Be^{\lambda_2 x} = Ae^{(\alpha+\beta i)x} + Be^{(\alpha-\beta i)x},$$
but we now have a complex-valued solution. (You will soon see why we use
$A$ and $B$ for the arbitrary constants, rather than our usual choice of $C$ and $D$.)
Since Equation (1.3) has real coefficients, we would like a real-valued
solution. In order to achieve this, we shall need to allow $A$ and $B$ to be complex.
Then we can use Euler’s formula (discussed in Unit 1), which tells us that
$$e^{i\beta x} = \cos\beta x + i\sin\beta x \quad\text{and}\quad e^{-i\beta x} = \cos\beta x - i\sin\beta x.$$
Now
$$\begin{aligned}
y &= Ae^{\lambda_1 x} + Be^{\lambda_2 x} \\
&= Ae^{(\alpha+\beta i)x} + Be^{(\alpha-\beta i)x} \\
&= Ae^{\alpha x}e^{i\beta x} + Be^{\alpha x}e^{-i\beta x} \\
&= Ae^{\alpha x}(\cos\beta x + i\sin\beta x) + Be^{\alpha x}(\cos\beta x - i\sin\beta x) \\
&= e^{\alpha x}\left((A + B)\cos\beta x + (Ai - Bi)\sin\beta x\right) \\
&= e^{\alpha x}(C\cos\beta x + D\sin\beta x),
\end{aligned}$$
where $C = A + B$ and $D = (A - B)i$. (The constants in the final expression are
now $C$ and $D$, in keeping with our previous solutions.) Provided that any initial
conditions are real-valued, $C$ and $D$ are real, and this is the required
real-valued solution containing two arbitrary constants.
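
The passage from the complex form to the real form can be illustrated numerically. This is a minimal sketch under stated assumptions: the names `complex_form` and `real_form` and the sample values of C, D, α, β and x are illustrative, not from the course text. Real C and D are chosen first, and A and B are then recovered from C = A + B and D = (A − B)i, which makes A and B complex conjugates.

```python
import cmath
import math

def complex_form(A, B, alpha, beta, x):
    """y = A*exp((alpha + beta*i)x) + B*exp((alpha - beta*i)x)."""
    return (A * cmath.exp(complex(alpha, beta) * x)
            + B * cmath.exp(complex(alpha, -beta) * x))

def real_form(C, D, alpha, beta, x):
    """y = exp(alpha*x) * (C*cos(beta*x) + D*sin(beta*x))."""
    return math.exp(alpha * x) * (C * math.cos(beta * x) + D * math.sin(beta * x))

# Illustrative real constants; A and B follow from C = A + B, D = (A - B)i.
C, D = 1.5, -0.75
A = complex(C, -D) / 2   # A = (C - iD)/2
B = complex(C, D) / 2    # B = (C + iD)/2, the complex conjugate of A

y = complex_form(A, B, alpha=3.0, beta=2.0, x=0.4)
print(y.real, y.imag)                       # imaginary part is zero up to rounding
print(real_form(C, D, alpha=3.0, beta=2.0, x=0.4))
```

The two printed real values should agree, which is the content of the final line of the derivation above.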

Example 1.5
(a) Write down the auxiliary equation of the differential equation
$$\frac{d^2y}{dx^2} - 6\frac{dy}{dx} + 13y = 0,$$
and show that its roots are $\lambda_1 = 3 + 2i$ and $\lambda_2 = 3 - 2i$.
(b) Confirm that $y_1 = e^{3x}\cos 2x$ and $y_2 = e^{3x}\sin 2x$ are both solutions of
the differential equation.
(c) Deduce that $y = e^{3x}(C\cos 2x + D\sin 2x)$ is also a solution of the
differential equation, for any values of the two constants $C$ and $D$.
Solution
(a) The auxiliary equation is
$$\lambda^2 - 6\lambda + 13 = 0.$$
The formula method gives
$$\lambda = \frac{6 \pm \sqrt{36 - 4 \times 1 \times 13}}{2} = \frac{6 \pm \sqrt{-16}}{2} = 3 \pm 2i,$$
so the two complex conjugate roots are $\lambda_1 = 3 + 2i$ and $\lambda_2 = 3 - 2i$. (With the
previous notation we have $\alpha = 3$ and $\beta = 2$.)
(b) To confirm that $y_1 = e^{3x}\cos 2x$ is a solution of the differential equation,
we differentiate and substitute into the differential equation. Differentiating
to obtain the first and second derivatives of $y_1$ gives
$$\frac{dy_1}{dx} = 3e^{3x}\cos 2x + e^{3x}(-2\sin 2x) = e^{3x}(3\cos 2x - 2\sin 2x),$$
$$\frac{d^2y_1}{dx^2} = 3e^{3x}(3\cos 2x - 2\sin 2x) + e^{3x}(-6\sin 2x - 4\cos 2x) = e^{3x}(5\cos 2x - 12\sin 2x).$$
Substituting these into the left-hand side of the differential equation
gives
$$\begin{aligned}
\frac{d^2y_1}{dx^2} - 6\frac{dy_1}{dx} + 13y_1 &= e^{3x}(5\cos 2x - 12\sin 2x) - 6e^{3x}(3\cos 2x - 2\sin 2x) + 13e^{3x}\cos 2x \\
&= e^{3x}\left[(5 - 18 + 13)\cos 2x + (-12 + 12)\sin 2x\right] \\
&= 0.
\end{aligned}$$
Hence $y_1 = e^{3x}\cos 2x$ is a solution.
Similarly, for $y_2 = e^{3x}\sin 2x$ we have
$$\frac{dy_2}{dx} = e^{3x}(2\cos 2x + 3\sin 2x), \qquad \frac{d^2y_2}{dx^2} = e^{3x}(12\cos 2x + 5\sin 2x),$$
and substituting into the left-hand side of the differential equation gives
$$\frac{d^2y_2}{dx^2} - 6\frac{dy_2}{dx} + 13y_2 = e^{3x}\left[(12 - 12)\cos 2x + (5 - 18 + 13)\sin 2x\right] = 0.$$
Hence $y_2 = e^{3x}\sin 2x$ is also a solution.
(c) Since $y_1 = e^{3x}\cos 2x$ and $y_2 = e^{3x}\sin 2x$ are both solutions of the
differential equation, the principle of superposition (Theorem 1.1) tells us
that so is
$$y = Ce^{3x}\cos 2x + De^{3x}\sin 2x = e^{3x}(C\cos 2x + D\sin 2x),$$
for any values of $C$ and $D$.

*Exercise 1.5
Use the auxiliary equation to find the general solution of each of the following
differential equations.
(a) $\dfrac{d^2y}{dx^2} + 4\dfrac{dy}{dx} + 8y = 0$
(b) $\dfrac{d^2\theta}{dt^2} + 9\theta = 0$

We now summarize the method of solving these differential equations as a
procedure.

Procedure 1.1 The solution of homogeneous linear constant-coefficient
second-order differential equations
The general solution of the homogeneous linear constant-coefficient
second-order differential equation
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0, \tag{1.3}$$
where $a$, $b$, $c$ are (real) constants and $a \neq 0$, may be found as follows.
(a) Write down the auxiliary equation
$$a\lambda^2 + b\lambda + c = 0, \tag{1.5}$$
and find its roots $\lambda_1$ and $\lambda_2$.
(b) (i) If the auxiliary equation has two distinct real roots $\lambda_1$ and $\lambda_2$,
the general solution of the differential equation is
$$y = Ce^{\lambda_1 x} + De^{\lambda_2 x}.$$
(ii) If the auxiliary equation has two equal real roots
$\lambda_1 = \lambda_2$, the general solution of the differential equation is
$$y = (C + Dx)e^{\lambda_1 x}.$$
(iii) If the auxiliary equation has a pair of complex conjugate roots
$\lambda_1 = \alpha + \beta i$ and $\lambda_2 = \alpha - \beta i$, the general solution of the differential
equation is
$$y = e^{\alpha x}(C\cos\beta x + D\sin\beta x).$$
In each case, $C$ and $D$ are arbitrary constants.

It is worth noting that the three cases in part (b) of Procedure 1.1
correspond to three different possibilities that arise when solving the auxiliary
equation $a\lambda^2 + b\lambda + c = 0$. These three different possibilities relate to the
value of the discriminant $b^2 - 4ac$: $b^2 - 4ac > 0$ corresponds to case (i),
$b^2 - 4ac = 0$ to case (ii), and $b^2 - 4ac < 0$ to case (iii).
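
Procedure 1.1 can be sketched as a short function that selects the case from the sign of the discriminant. This is an illustrative sketch only: the function name `general_solution` and the text format of its output are assumptions, and the exact test `disc == 0` is reliable only when the coefficients are integers or otherwise exactly representable.

```python
import math

def general_solution(a, b, c):
    """Return (case, formula) for a*y'' + b*y' + c*y = 0, following Procedure 1.1."""
    if a == 0:
        raise ValueError("a must be non-zero for a second-order equation")
    disc = b * b - 4 * a * c          # discriminant of the auxiliary equation
    if disc > 0:                      # case (i): two distinct real roots
        r = math.sqrt(disc)
        lam1, lam2 = (-b + r) / (2 * a), (-b - r) / (2 * a)
        return "i", f"y = C*exp({lam1:g}*x) + D*exp({lam2:g}*x)"
    if disc == 0:                     # case (ii): equal real roots
        lam = -b / (2 * a)
        return "ii", f"y = (C + D*x)*exp({lam:g}*x)"
    alpha = -b / (2 * a)              # case (iii): roots alpha +/- beta*i
    beta = math.sqrt(-disc) / (2 * abs(a))
    return "iii", f"y = exp({alpha:g}*x)*(C*cos({beta:g}*x) + D*sin({beta:g}*x))"

print(general_solution(1, 6, 9))      # Example 1.4
print(general_solution(1, -6, 13))    # Example 1.5
```

For Example 1.4 the sketch selects case (ii) with λ = −3, and for Example 1.5 it selects case (iii) with α = 3 and β = 2, in agreement with the worked solutions.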

*Exercise 1.6
Find the general solution of each of the following differential equations.
(a) $\dfrac{d^2y}{dx^2} + 4y = 0$
(b) $u''(x) - 6u'(x) + 8u(x) = 0$
(c) $\dfrac{d^2y}{dx^2} + 2\dfrac{dy}{dx} = 0$
(d) $\dfrac{d^2y}{dx^2} - 2\dfrac{dy}{dx} + y = 0$
(e) $\dfrac{d^2y}{dx^2} - \omega^2 y = 0$, where $\omega$ is a real constant
(f) $\dfrac{d^2y}{dx^2} + 4\dfrac{dy}{dx} + 29y = 0$


Exercise 1.7
Small oscillations of the pendulum of a clock can be modelled by the
differential equation
$$\ddot{\theta} = -\frac{g}{l}\theta,$$
where $g$ is the magnitude of the acceleration due to gravity, $l$ is the length
of the pendulum, and $\theta$ is the angle the pendulum makes with the vertical
(see Figure 1.1). Solve the differential equation to obtain an expression for
$\theta$ in terms of $g$ and $l$.

Figure 1.1 (a pendulum of length $l$ making an angle $\theta$ with the vertical)

1.3 The general solution

In this subsection we prove that the solutions we discovered in Subsection 1.2
are indeed the most general solutions of Equation (1.3). This proof is in-
cluded for completeness, and you will not be expected to reproduce it; how-
ever, it does provide some useful revision of the integrating factor method
from Unit 2.

Theorem 1.2
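Suppose that the roots of the equation
$$a\lambda^2 + b\lambda + c = 0$$
are $\lambda_1$ and $\lambda_2$. Then the general solution of the second-order linear
constant-coefficient differential equation
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0 \tag{1.3}$$
can always be written in the form
$$y(x) = \begin{cases} Ce^{\lambda_1 x} + De^{\lambda_2 x} & (\lambda_1 \neq \lambda_2), \\ (C + Dx)e^{\lambda x} & (\lambda_1 = \lambda_2 = \lambda). \end{cases}$$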

In order to prove this it will be convenient to divide through Equation (1.3)
by $a$, obtaining the equivalent form
$$\frac{d^2y}{dx^2} + \frac{b}{a}\frac{dy}{dx} + \frac{c}{a}y = 0. \tag{1.9}$$
Now we resort to a trick, in order to obtain a first-order differential equation
in $w = \dfrac{dy}{dx} - ky$, where $k$ is a constant yet to be determined.
Noting that $\dfrac{dw}{dx} = \dfrac{d^2y}{dx^2} - k\dfrac{dy}{dx}$, Equation (1.9) can be written first as
$$\frac{dw}{dx} + \left(\frac{b}{a} + k\right)\frac{dy}{dx} + \frac{c}{a}y = 0$$
(note that we have eliminated the $\dfrac{d^2y}{dx^2}$ term), and then, substituting
$\dfrac{dy}{dx} = w + ky$, as
$$\frac{dw}{dx} + \left(\frac{b}{a} + k\right)(w + ky) + \frac{c}{a}y = 0.$$
Simplifying, this yields
$$\frac{dw}{dx} + \left(\frac{b}{a} + k\right)w + \left(\frac{c}{a} + k\left(\frac{b}{a} + k\right)\right)y = 0.$$


Now, choosing $k$ to be a root of the quadratic equation
$$k^2 + \frac{b}{a}k + \frac{c}{a} = 0,$$
which, on multiplying through by $a$, becomes
$$ak^2 + bk + c = 0, \tag{1.10}$$
we indeed arrive at a first-order differential equation for $w$:
$$\frac{dw}{dx} + \left(\frac{b}{a} + k\right)w = 0. \tag{1.11}$$
Of course, Equation (1.10) is the auxiliary equation associated with
Equation (1.3), with roots
$$\lambda_1, \lambda_2 = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a},$$
so these are the values of $k$ that we may choose. Before proceeding, notice
that the sum of the two roots satisfies
$$\lambda_1 + \lambda_2 = -\frac{b}{a}.$$
For the sake of definiteness, choose $k = \lambda_1$. (It does not matter which root
we choose, since there is no particular significance in the labelling of $\lambda_1$
and $\lambda_2$.) The above differential equation (1.11) for $w$ may therefore be written as
$$\frac{dw}{dx} + \left(\frac{b}{a} + \lambda_1\right)w = 0.$$
But since $\lambda_1 + \lambda_2 = -b/a$, we have
$$\frac{b}{a} + \lambda_1 = -\lambda_2.$$
Thus Equation (1.11) becomes
$$\frac{dw}{dx} - \lambda_2 w = 0.$$
This is a first-order linear differential equation, and in principle we could
solve it using the integrating factor technique from Unit 2. However, the
equation is of a particularly simple form that we have seen before, so we can
write down the solution as
$$w = Ae^{\lambda_2 x},$$
where $A$ is an arbitrary constant. Now, remembering that
$$w = \frac{dy}{dx} - ky = \frac{dy}{dx} - \lambda_1 y,$$
we arrive at a first-order differential equation for $y$:
$$\frac{dy}{dx} - \lambda_1 y = Ae^{\lambda_2 x}. \tag{1.12}$$
This is a non-homogeneous linear first-order differential equation that we
can solve, once again using the integrating factor method of Unit 2. The
integrating factor is
$$\exp\left(\int -\lambda_1\,dx\right) = e^{-\lambda_1 x}.$$
Multiplying through Equation (1.12) by this factor produces the equation
$$e^{-\lambda_1 x}\frac{dy}{dx} - \lambda_1 e^{-\lambda_1 x}y = Ae^{(\lambda_2 - \lambda_1)x},$$


or
$$\frac{d}{dx}\left(ye^{-\lambda_1 x}\right) = Ae^{(\lambda_2 - \lambda_1)x}. \tag{1.13}$$
We now have two cases to consider. If $\lambda_2 - \lambda_1 \neq 0$ (this is the case when the
auxiliary equation has distinct roots), then integrating both sides yields
$$ye^{-\lambda_1 x} = \frac{A}{\lambda_2 - \lambda_1}e^{(\lambda_2 - \lambda_1)x} + C,$$
where $C$ is an arbitrary constant. Multiplying through by $e^{\lambda_1 x}$, we arrive at
the explicit form of the general solution
$$y = \frac{A}{\lambda_2 - \lambda_1}e^{\lambda_2 x} + Ce^{\lambda_1 x}.$$
Finally, replacing the arbitrary constant $A/(\lambda_2 - \lambda_1)$ by $D$ gives the required
result:
$$y = Ce^{\lambda_1 x} + De^{\lambda_2 x}.$$
If $\lambda_1 = \lambda_2 = \lambda$, say (this is the case when the auxiliary equation has equal
roots), then Equation (1.13) becomes
$$\frac{d}{dx}\left(ye^{-\lambda x}\right) = A,$$
and integrating gives
$$ye^{-\lambda x} = Ax + C,$$
where $C$ is an arbitrary constant. Re-labelling $A$ as $D$, and multiplying
through by $e^{\lambda x}$, we arrive at the explicit form of the general solution
$$y = (C + Dx)e^{\lambda x}.$$
This completes the proof of Theorem 1.2.
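
The reduction used in the proof can be spot-checked numerically for the distinct-real-roots case. The sketch below is illustrative only (the function name `reduction_check` and the sample values are assumptions): it takes y = Ce^(λ₁x) + De^(λ₂x), forms w = dy/dx − λ₁y, and returns the residual of dw/dx − λ₂w = 0 together with the value of (b/a + λ₁) + λ₂. Both should be zero up to rounding error.

```python
import math

def reduction_check(a, b, c, C, D, x):
    """Check w' - lam2*w = 0 for w = y' - lam1*y (distinct real roots only)."""
    disc = b * b - 4 * a * c
    if disc <= 0:
        raise ValueError("this sketch assumes two distinct real roots")
    r = math.sqrt(disc)
    lam1, lam2 = (-b + r) / (2 * a), (-b - r) / (2 * a)
    e1, e2 = math.exp(lam1 * x), math.exp(lam2 * x)
    y = C * e1 + D * e2
    dy = C * lam1 * e1 + D * lam2 * e2
    d2y = C * lam1 ** 2 * e1 + D * lam2 ** 2 * e2
    w = dy - lam1 * y            # the substitution used in the proof, with k = lam1
    dw = d2y - lam1 * dy
    return dw - lam2 * w, (b / a + lam1) + lam2

print(reduction_check(1, 5, 6, C=2.0, D=-1.0, x=0.3))
```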

End-of-section Exercises
Exercise 1.8
(a) Write down the auxiliary equation of the differential equation
$$3\frac{dy}{dx} - y - 2\frac{d^2y}{dx^2} = 0.$$
(b) Solve this auxiliary equation.
(c) Write down the general solution of the differential equation.

Exercise 1.9
Find the general solution of each of the following differential equations.
(a) $\dfrac{d^2y}{dx^2} + 2\dfrac{dy}{dx} + 2y = 0$    (b) $\dfrac{d^2y}{dx^2} - 16y = 0$
(c) $\dfrac{d^2y}{dx^2} + 4y = 4\dfrac{dy}{dx}$    (d) $\dfrac{d^2\theta}{dt^2} + 3\dfrac{d\theta}{dt} = 0$
Exercise 1.10
For which values of the constant $k$ does the differential equation
$$\frac{d^2y}{dx^2} + 4k\frac{dy}{dx} + 4y = 0$$
have a general solution with oscillating behaviour, that is, a general solution
which involves sines and cosines?


2 Inhomogeneous differential equations


Section 1 was concerned with finding the general solution of homogeneous
linear constant-coefficient second-order differential equations. This section
concerns inhomogeneous linear constant-coefficient second-order differential
equations, i.e. equations of the form
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x), \tag{2.1}$$
where $a$, $b$, $c$ are real constants, $a \neq 0$, and $f(x)$ is a given continuous
real-valued function of $x$.
Subsection 2.1 gives the general method for constructing solutions of
Equation (2.1). Subsection 2.2 shows how to find an appropriate particular
solution of the differential equation, for use in constructing the general
solution, in cases where the function $f(x)$ takes one of a few particular forms.
Subsection 2.3 deals with certain cases where complications can arise.
Subsection 2.4 shows how to deal with cases where $f(x)$ is a combination of the
functions discussed in Subsection 2.2.

2.1 General method of solution


The basic method used for finding the general solution of Equation (2.1)
depends on the principle of superposition (Theorem 1.1), and is illustrated
in the following example.

Example 2.1
Show that $y = Ce^{-2x} + De^{-3x} + 2$ is a solution of the inhomogeneous
differential equation
$$\frac{d^2y}{dx^2} + 5\frac{dy}{dx} + 6y = 12 \tag{2.2}$$
for any values of the constants $C$ and $D$.
Solution
We know from Exercise 1.3(a) that the homogeneous differential equation
$$\frac{d^2y}{dx^2} + 5\frac{dy}{dx} + 6y = 0 \tag{2.3}$$
has a general solution $y_c = Ce^{-2x} + De^{-3x}$, where $C$ and $D$ are arbitrary
constants.
Now consider the constant function $y_p = 2$. This is a particular solution of
Equation (2.2) since $d^2y_p/dx^2 = dy_p/dx = 0$ and $6y_p = 12$. (The notation $y_c$
and $y_p$ will be explained shortly.)
Therefore, by the principle of superposition (Theorem 1.1),
$$y = y_c + y_p = Ce^{-2x} + De^{-3x} + 2$$
is a solution of Equation (2.2), for any values of $C$ and $D$.

Equation (2.3) is an example of an associated homogeneous equation,
i.e. the homogeneous equation associated with the inhomogeneous
equation (2.2) by making its right-hand side zero. The solutions $y_c$ and $y_p$ also
have special names in this context: $y_c$, the general solution of the
associated homogeneous equation (2.3), is called the complementary function,
and $y_p$, a particular solution of the inhomogeneous equation (2.2), is called
a particular integral.

Definitions
Let
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x) \tag{2.1}$$
be an inhomogeneous linear constant-coefficient second-order
differential equation.
(a) Its associated homogeneous equation is
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0.$$
(b) The general solution $y_c$ of the associated homogeneous equation is
known as the complementary function for the original
inhomogeneous equation (2.1).
(c) Any particular solution $y_p$ of the original inhomogeneous
equation (2.1) is referred to as a particular integral for that equation.
(The term particular integral is used here, rather than the term particular
solution used in some other texts, to distinguish it from the particular
solution to Equation (2.1) that satisfies given initial or boundary
conditions; see Section 3.)

Later in this section we shall show how to find particular integrals for a wide
variety of equations. Before we do that, it is important to realize the full
significance of finding just one particular integral.

*Exercise 2.1
Suppose that we have found two different particular integrals $y_{p1}$, $y_{p2}$ for
Equation (2.1). Use the principle of superposition to show that the function
$y_{p1} - y_{p2}$ is then a solution of the associated homogeneous equation.

The result of Exercise 2.1 shows the true significance of finding a particular
integral. For if we do so then, since from Section 1 we know how to solve
the associated homogeneous equation (that is, we can find the complementary
function), we can find all particular integrals simply by adding the
complementary function. We have proved the following important result.

Theorem 2.1
If $y_c$ is the complementary function for an inhomogeneous linear
constant-coefficient second-order differential equation, and $y_p$ is a
particular integral for that equation, then $y_c + y_p$ is the general solution
of that equation.

Note that $y_c$, being the general solution of the associated homogeneous
equation, will contain two arbitrary constants, whereas $y_p$, being a particular
solution, will contain none.
Let us now see how the method based on Theorem 2.1 can be applied.


Example 2.2
Find the general solution of the differential equation
$$\frac{d^2y}{dx^2} + 9y = 9x + 9. \tag{2.4}$$
Solution
The associated homogeneous equation is
$$\frac{d^2y}{dx^2} + 9y = 0,$$
which has the general solution
$$y_c = C\cos 3x + D\sin 3x.$$
(See Exercise 1.5(b), although there different symbols were used for the
variables.) This is the complementary function for Equation (2.4).
A particular integral for Equation (2.4) is
$$y_p = x + 1.$$
(You will see in the next subsection how to find such a particular integral.)
This may be verified by differentiation and substitution: $y_p' = 1$ and $y_p'' = 0$,
and substituting into the left-hand side of Equation (2.4) gives
$$y_p'' + 9y_p = 0 + 9(x + 1) = 9x + 9,$$
which is the same as the right-hand side of Equation (2.4), as required.
The general solution of Equation (2.4) is, therefore, by Theorem 2.1,
$$y = y_c + y_p = C\cos 3x + D\sin 3x + x + 1,$$
where $C$ and $D$ are arbitrary constants.

The method of Example 2.2 may be summarized as follows.

Procedure 2.1 The solution of inhomogeneous linear constant-coefficient
second-order differential equations
To find the general solution of the inhomogeneous linear
constant-coefficient second-order differential equation
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x):$$
(a) find its complementary function $y_c$, i.e. the general solution of the
associated homogeneous differential equation
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0,$$
using Procedure 1.1 (the reason why $y_c$ is found first will become clear in
Subsection 2.3);
(b) find a particular integral $y_p$.
The general solution is $y = y_c + y_p$.

It is worth noting that, by Theorem 2.1, any choice of particular integral
in Procedure 2.1 gives the same general solution. The formula obtained
for the general solution may look different for different choices of particular
integral, but they are in fact always equivalent. For example, in Example 2.2
the particular integral $y_p = x + 1$ was chosen, and the form of the general
solution was obtained as $y = C\cos 3x + D\sin 3x + x + 1$. It would have been
equally valid to have chosen, as a particular integral, $y_p = x + 1 + \sin 3x$.
In that case, the form of the general solution would have been obtained as
$y = C\cos 3x + D\sin 3x + x + 1 + \sin 3x$. This looks a little different, but it
may be written in the form $y = C\cos 3x + (D + 1)\sin 3x + x + 1$; and, since
$C$ and $D$ are arbitrary constants, this form of the general solution represents
exactly the same family of solutions.
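
The equivalence of the two forms can be illustrated with a short numerical check. This is a sketch only; the helper names (`residual`, `solution`, `yp1`, `yp2` and their second derivatives) and the sample constants are illustrative assumptions. It confirms that both choices of particular integral satisfy Equation (2.4), and that using x + 1 + sin 3x with constants (C, D) gives the same function as using x + 1 with constants (C, D + 1).

```python
import math

def yp1(x):   return x + 1                        # particular integral from Example 2.2
def d2yp1(x): return 0.0
def yp2(x):   return x + 1 + math.sin(3 * x)      # the alternative particular integral
def d2yp2(x): return -9 * math.sin(3 * x)

def solution(C, D, yp, x):
    """y = C cos 3x + D sin 3x + yp(x)."""
    return C * math.cos(3 * x) + D * math.sin(3 * x) + yp(x)

def residual(C, D, yp, d2yp, x):
    """y'' + 9y - (9x + 9) for y = C cos 3x + D sin 3x + yp(x)."""
    yc = C * math.cos(3 * x) + D * math.sin(3 * x)
    d2yc = -9 * yc                                # second derivative of yc
    return (d2yc + d2yp(x)) + 9 * (yc + yp(x)) - (9 * x + 9)

C, D, x = 0.7, -1.3, 0.4                          # arbitrary illustrative values
print(residual(C, D, yp1, d2yp1, x), residual(C, D, yp2, d2yp2, x))
print(solution(C, D, yp2, x), solution(C, D + 1, yp1, x))
```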
*Exercise 2.2
Consider the following differential equations. (See Exercise 1.6(a) and
Example 1.3.)
(a) $\dfrac{d^2y}{dx^2} + 4y = 8$    (b) $\dfrac{d^2y}{dx^2} - 3\dfrac{dy}{dx} + 2y = 6$
For each equation:
• write down its associated homogeneous equation and its complementary
function $y_c$;
• find a particular integral of the form $y_p = p$, where $p$ is a constant;
• write down the general solution.

When using Procedure 2.1, the complementary function is found by using
Procedure 1.1. However, the procedures for finding a particular integral are
another matter. In Exercise 2.2, where the right-hand sides of the
equations are constants, it was possible to find a particular integral almost ‘by
inspection’; but this method is generally inadequate. Fortunately, there
exist procedures for finding a particular integral for equations involving wide
classes of right-hand-side functions $f(x)$. The remainder of this section
considers some of the simpler cases, where it is possible to determine the form of
a particular integral by inspection, although some manipulation is required
in order to determine the values of certain coefficients.

2.2 Finding a particular integral by the method of undetermined coefficients

In the previous subsection you saw that the inhomogeneous linear
constant-coefficient second-order differential equation
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x) \tag{2.1}$$
can be solved by first solving the associated homogeneous equation, using
the methods of Section 1, and then finding a particular integral of
Equation (2.1), which depends upon the function $f(x)$. This and the next two
subsections show you how to find a particular integral when $f(x)$ is a
polynomial, exponential or sinusoidal function, or a sum of such functions.
(There exist procedures for finding a particular integral for fairly general
types of continuous function $f(x)$, but these are not considered in this course.)
You saw an example of the approach in Exercise 2.2. There the functions
$f(x)$ were constants and you tried a constant function $y = p$ as a particular
integral, substituting into the differential equation to find a suitable value
for $p$. In general, we try a function of the same form as $f(x)$ as a
particular integral, and substitute into the differential equation to find suitable
values for its unknown coefficients. The function that we try is known as a
trial solution, and the method is known as the method of undetermined
coefficients.
The following examples illustrate the method. Bear in mind, though, that
the method (and hence these examples) finds only a particular integral for
the differential equation; to find the general solution you would need to find
the complementary function and combine it with the particular integral,
according to Procedure 2.1.


A polynomial function ($f(x) = m_n x^n + m_{n-1}x^{n-1} + \cdots + m_1 x + m_0$)


Let us first consider a case where f (x) is a linear function (i.e. a polynomial
of degree 1).

Example 2.3
Find a particular integral for
$$3\frac{d^2y}{dx^2} - 2\frac{dy}{dx} + y = 4x + 2.$$
Solution
We try a solution of the form
$$y = p_1 x + p_0,$$
where $p_1$ and $p_0$ are coefficients to be determined so that the differential
equation is satisfied. To try this solution, we need the first and second
derivatives of $y$:
$$\frac{dy}{dx} = p_1, \qquad \frac{d^2y}{dx^2} = 0.$$
Substituting these into the left-hand side of the differential equation gives
$$3\frac{d^2y}{dx^2} - 2\frac{dy}{dx} + y = 3 \times 0 - 2p_1 + (p_1 x + p_0) = p_1 x + (p_0 - 2p_1).$$
Therefore, for $y = p_1 x + p_0$ to be a solution of the differential equation, we
require that
$$p_1 x + (p_0 - 2p_1) = 4x + 2 \quad\text{for all } x. \tag{2.5}$$
To find the two unknown coefficients $p_1$ and $p_0$, we compare the coefficients
on both sides of Equation (2.5). (Comparing coefficients works because two
polynomials are equal if and only if all their corresponding coefficients are
the same.) Comparing the terms in $x$ gives $p_1 = 4$.
Comparing the constant terms gives $p_0 - 2p_1 = 2$, so that
$p_0 = 2 + 2p_1 = 2 + 2 \times 4 = 10$. Therefore we have the particular integral
$$y_p = 4x + 10.$$
Checking: if $y = 4x + 10$, then $dy/dx = 4$, $d^2y/dx^2 = 0$, and substituting
into the left-hand side of the differential equation gives
$$3\frac{d^2y}{dx^2} - 2\frac{dy}{dx} + y = 3 \times 0 - 2 \times 4 + (4x + 10) = 4x + 2,$$
as required.

You will have noticed in Example 2.3 that substituting a linear trial solution
$y = p_1 x + p_0$ into the left-hand side of the differential equation resulted
in a linear function, namely $p_1 x + (p_0 - 2p_1)$, whose coefficients could be
compared with those of the linear target function $4x + 2$ to obtain values
for $p_1$ and $p_0$. This is really the key to the method. If the target function
is linear, choosing a linear trial solution ensures that substituting into the
left-hand side of the differential equation results in a linear function whose
coefficients can be compared with those of the target function. Similarly,
as you will see below, if the target function belongs to certain other classes
of functions, choosing as a trial solution a general function from that class
ensures that substitution into the left-hand side of the differential equation
produces another function from the same class, whose coefficients can be
compared with those of the target function, thus enabling values to be given
to the coefficients of the trial solution. The method will work provided that
all the derivatives of functions in the class are also in the class.


*Exercise 2.3
Find particular integrals of the form $y = p_1 x + p_0$ for each of the following
differential equations.
(a) $\dfrac{d^2y}{dx^2} - 2\dfrac{dy}{dx} + 2y = 2x + 3$    (b) $\dfrac{d^2y}{dx^2} + 2\dfrac{dy}{dx} + y = 2x$

Note that, in Exercise 2.3(b), although $f(x)$ is just a multiple of $x$, it is not
possible to find a solution of the form $y(x) = p_1 x$. (Try it and see what goes
wrong.) It is necessary for the trial solution to contain terms like those in
$f(x)$ and all its derivatives, so that in this case the trial solution must be of
the form $y = p_1 x + p_0$. So, in general, even if $m_0 = 0$ in $f(x) = m_1 x + m_0$,
so that $f(x) = m_1 x$, you should use a trial solution of the form $y = p_1 x + p_0$.
However, if $f(x) = m_0$ is a constant function, then the trial solution need
only be a constant function $y = p_0$. (You saw examples of this in Exercise 2.2.)

In general, if $f(x) = m_n x^n + m_{n-1}x^{n-1} + \cdots + m_1 x + m_0$, where $m_n \neq 0$,
then a trial solution of the form $y = p_n x^n + p_{n-1}x^{n-1} + \cdots + p_1 x + p_0$ should
be used.
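
Comparing coefficients for a polynomial trial solution can be sketched in code. The following is an illustrative sketch, assuming c ≠ 0; the function name `polynomial_particular_integral` and the list convention `m = [m0, m1, ..., mn]` are assumptions, not notation from the text. Substituting the trial solution gives, for each power x^j, the condition c·p_j + b(j + 1)·p_(j+1) + a(j + 2)(j + 1)·p_(j+2) = m_j, which can be solved from the highest power downwards.

```python
from fractions import Fraction

def polynomial_particular_integral(a, b, c, m):
    """Coefficients [p0, ..., pn] of a polynomial particular integral.

    Solves a*y'' + b*y' + c*y = m[0] + m[1]*x + ... + m[n]*x**n by comparing
    coefficients, working down from the highest power. Assumes c != 0.
    """
    if c == 0:
        raise ValueError("c = 0 is an exceptional case not handled by this sketch")
    n = len(m) - 1
    p = [Fraction(0)] * (n + 3)       # two extra zeros stand for p[n+1], p[n+2]
    for j in range(n, -1, -1):
        rhs = Fraction(m[j]) - b * (j + 1) * p[j + 1] - a * (j + 2) * (j + 1) * p[j + 2]
        p[j] = rhs / c
    return p[: n + 1]

# Example 2.3: 3y'' - 2y' + y = 4x + 2, so m = [2, 4].
print([str(q) for q in polynomial_particular_integral(3, -2, 1, [2, 4])])
```

For Example 2.3 this reproduces p₀ = 10 and p₁ = 4, that is, y_p = 4x + 10.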

*Exercise 2.4
Find a particular integral for
$$y'' - y = t^2.$$

 
An exponential function ($f(x) = me^{kx}$)

Example 2.4
Find a particular integral for
$$\frac{d^2y}{dx^2} + 9y = 2e^{3x}.$$
Solution
We try a solution of the form
$$y = pe^{3x},$$
where $p$ is a coefficient to be determined so that the differential equation is
satisfied. (Since the derivative of $e^{3x}$ is $3e^{3x}$, the exponent $3x$ appearing in
$y(x)$ should be the same as that appearing in $f(x)$, and only the coefficient
$p$ is to be determined.) Differentiating $y = pe^{3x}$ gives
$$\frac{dy}{dx} = 3pe^{3x}, \qquad \frac{d^2y}{dx^2} = 9pe^{3x}.$$
Substituting these into the left-hand side of the differential equation gives
$$\frac{d^2y}{dx^2} + 9y = 9pe^{3x} + 9pe^{3x} = 18pe^{3x}.$$
Therefore, for $y = pe^{3x}$ to be a solution of the differential equation, we
require that $18pe^{3x} = 2e^{3x}$ for all $x$. Hence $p = \tfrac{1}{9}$, and
$$y_p = \tfrac{1}{9}e^{3x}$$
is a particular integral for the given differential equation.
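
The same calculation can be sketched for a general exponential right-hand side. Substituting y = pe^(kx) into a·y″ + b·y′ + c·y gives p(ak² + bk + c)e^(kx), so p = m/(ak² + bk + c) whenever the denominator is non-zero. The function name `exponential_coefficient` below is an illustrative assumption.

```python
from fractions import Fraction

def exponential_coefficient(a, b, c, m, k):
    """Coefficient p in the trial solution y = p*exp(k*x) for
    a*y'' + b*y' + c*y = m*exp(k*x)."""
    denom = a * k * k + b * k + c     # value of the auxiliary polynomial at k
    if denom == 0:
        # k is a root of the auxiliary equation, so p*exp(k*x) solves the
        # associated homogeneous equation and cannot match a non-zero right side.
        raise ValueError("trial solution p*exp(kx) fails: k is a root of the auxiliary equation")
    return Fraction(m) / Fraction(denom)

# Example 2.4: y'' + 9y = 2e^{3x}.
print(exponential_coefficient(1, 0, 9, m=2, k=3))
```

For Example 2.4 the denominator is 18, giving p = 1/9 as above.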


*Exercise 2.5
Find a particular integral for
$$2\frac{d^2y}{dx^2} - 2\frac{dy}{dx} + y = 2e^{-x}.$$

A sinusoidal function ($f(x) = m\cos\Omega x + n\sin\Omega x$)
This type of function $f(x)$ is particularly important in many practical
applications.
Following on from earlier ideas, the trial solution must contain terms like
those in $f(x)$ and all its derivatives; so, even if $f(x)$ contains only a sine or
only a cosine, the trial solution $y(x)$ must contain both a sine and a cosine.
However, the value of the parameter $\Omega$ appearing in $y(x)$ should be the same
as that appearing in $f(x)$.

Example 2.5
Find a particular integral for
$$\frac{d^2y}{dx^2} + 2\frac{dy}{dx} + 2y = 10\sin 2x.$$
Solution
We try a solution of the form
$$y = p\cos 2x + q\sin 2x,$$
where $p$ and $q$ are coefficients to be determined so that the differential
equation is satisfied. Differentiating $y$ gives
$$\frac{dy}{dx} = -2p\sin 2x + 2q\cos 2x, \qquad \frac{d^2y}{dx^2} = -4p\cos 2x - 4q\sin 2x.$$
Substituting these into the left-hand side of the differential equation gives
$$\begin{aligned}
\frac{d^2y}{dx^2} + 2\frac{dy}{dx} + 2y &= (-4p\cos 2x - 4q\sin 2x) + 2(-2p\sin 2x + 2q\cos 2x) + 2(p\cos 2x + q\sin 2x) \\
&= (-2p + 4q)\cos 2x + (-4p - 2q)\sin 2x.
\end{aligned}$$
Therefore, for $y = p\cos 2x + q\sin 2x$ to be a solution of the differential
equation, we require that
$$(-2p + 4q)\cos 2x + (-4p - 2q)\sin 2x = 10\sin 2x \quad\text{for all } x. \tag{2.6}$$
To find $p$ and $q$, we compare the coefficients of cos and sin on both sides
of Equation (2.6). (Comparing coefficients works because the cosine and sine
functions are linearly independent: if $a\sin rx + b\cos rx = 0$ for all $x$, then
$a = b = 0$.) Comparing cos terms gives $-2p + 4q = 0$. Comparing
sin terms gives $-4p - 2q = 10$. Solving these simultaneous equations gives
$p = -2$, $q = -1$. Hence
$$y_p = -2\cos 2x - \sin 2x$$
is a particular integral for the given differential equation.
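
For a general sinusoidal right-hand side, the comparison of coefficients leads to a pair of simultaneous equations that can be solved directly. The sketch below is illustrative (the function name `sinusoidal_coefficients` is an assumption). Substituting y = p cos Ωx + q sin Ωx into a·y″ + b·y′ + c·y gives ((c − aΩ²)p + bΩq) cos Ωx + ((c − aΩ²)q − bΩp) sin Ωx, and the two bracketed expressions are set equal to m and n respectively.

```python
from fractions import Fraction

def sinusoidal_coefficients(a, b, c, m, n, omega):
    """Return (p, q) for the trial solution y = p*cos(omega*x) + q*sin(omega*x)
    of a*y'' + b*y' + c*y = m*cos(omega*x) + n*sin(omega*x)."""
    k = c - a * omega * omega         # multiplies p in the cos equation, q in the sin equation
    s = b * omega                     # coupling between the cos and sin equations
    det = k * k + s * s               # determinant of the 2x2 system
    if det == 0:
        # The trial solution then solves the associated homogeneous equation.
        raise ValueError("exceptional case: trial solution fails")
    p = Fraction(k * m - s * n) / det
    q = Fraction(s * m + k * n) / det
    return p, q

# Example 2.5: y'' + 2y' + 2y = 10 sin 2x, so m = 0, n = 10, omega = 2.
print(sinusoidal_coefficients(1, 2, 2, m=0, n=10, omega=2))
```

For Example 2.5 this gives p = −2 and q = −1, matching the solution of the simultaneous equations above.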

*Exercise 2.6
Find a particular integral for
$$\frac{d^2y}{dt^2} - \frac{dy}{dt} = \cos 3t + \sin 3t.$$


The following procedure summarizes the results of this subsection.

Procedure 2.2 Method of undetermined coefficients
To find a particular integral for the inhomogeneous linear
constant-coefficient second-order differential equation
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x),$$
use a trial solution $y(x)$ that has a form similar to that of $f(x)$. For
simple forms of $f(x)$, the following table gives the appropriate form of
trial solution. (Note that there are exceptional cases where these trial
solutions do not work. See Subsection 2.3.)

f(x)                                                   Trial solution y(x)
$m_n x^n + m_{n-1}x^{n-1} + \cdots + m_1 x + m_0$      $p_n x^n + p_{n-1}x^{n-1} + \cdots + p_1 x + p_0$
$me^{kx}$                                              $pe^{kx}$
$m\cos\Omega x + n\sin\Omega x$                        $p\cos\Omega x + q\sin\Omega x$

To determine the coefficient(s) in $y(x)$, differentiate $y(x)$ twice,
substitute into the left-hand side of the differential equation, and equate
coefficients of corresponding terms.

Exercise 2.7
What form of trial solution $y$ should you use in order to find a particular
integral for each of the following differential equations?
(a) $\dfrac{d^2y}{dx^2} - y = e^{3x}$    (b) $\dfrac{d^2y}{dx^2} + 2\dfrac{dy}{dx} - 4y = \sin 3x$

*Exercise 2.8
Find the general solution of each of the following differential equations.
(The complementary functions were obtained in Exercise 1.9 parts (a) and (d).)
(a) $\dfrac{d^2y}{dx^2} + 2\dfrac{dy}{dx} + 2y = 4$    (b) $\dfrac{d^2\theta}{dt^2} + 3\dfrac{d\theta}{dt} = 9\cos 3t$

Exercise 2.9
A long horizontal rectangular beam of length $l$ rests on rigid supports at
each end. It is important in civil engineering to determine how much such a
beam ‘sags’. A simple model of this ‘sag’, or vertical displacement $y$, is the
differential equation
$$Ry'' - Sy + \tfrac{1}{2}Q(l - x)x = 0,$$
where $R$, $S$ and $Q$ are constants related to the structure of the beam, and
$x$ is the distance from one end of the beam (as illustrated in Figure 2.1).
Find the general solution of the differential equation in the case where $R$, $S$
and $Q$ are all equal to 1.

Figure 2.1 (a beam of length $l$, showing the distance $x$ from one end and the vertical displacement $y$)

In Subsection 2.4 you will see how the principle of superposition can be
used in combination with Procedure 2.2 to solve differential equations whose
right-hand-side functions f (x) are sums of polynomial, exponential and
sinusoidal functions. But first let us look at some exceptional cases for
which Procedure 2.2 does not work and needs to be adapted.


2.3 Exceptional cases


There are some exceptional cases for which Procedure 2.2 fails. The aim of
this subsection is to indicate when such difficulties arise, and how a partic-
ular integral may be found in those circumstances. Let us begin with an
example.

Example 2.6
Find a particular integral for
d2 y
− 4y = 2e2x .
dx2
Solution
Using Procedure 2.2, let us try y = pe2x . Differentiating this gives
dy d2 y
= 2pe2x , = 4pe2x .
dx dx2
Substituting these into the left-hand side of the differential equation gives
d2 y
− 4y = 4pe2x − 4pe2x = 0.
dx2
So there is no value of p that gives a particular integral of the form y = pe2x .
The trouble is that the complementary function, i.e. the general solution of
the associated homogeneous equation
d2 y
− 4y = 0,
dx2
is y = Ce−2x + De2x , thus the trial solution is a solution of the associated See Exercise 1.3(c).
homogeneous equation (with C = 0, D = p). Hence, on substituting the
trial solution y = pe2x into the inhomogeneous equation, the left-hand side
is zero for any value of p, so it cannot be equal to the non-zero right-hand
side.
In such circumstances, the difficulty can generally be overcome by multiplying
the trial solution suggested in Procedure 2.2 by $x$. Thus, in this case, the
trial solution should be modified to take the form $y = pxe^{2x}$. (There is an
analogy here with the case of the homogeneous differential equation when the
characteristic equation has equal roots; in that case, when $e^{\lambda x}$ is
one solution of the equation, another solution is given by $xe^{\lambda x}$.)
Differentiating the modified trial solution gives
\[ \frac{dy}{dx} = pe^{2x} + 2pxe^{2x} = p(1 + 2x)e^{2x}, \]
\[ \frac{d^2y}{dx^2} = 2pe^{2x} + 2p(1 + 2x)e^{2x} = 4p(1 + x)e^{2x}. \]
Substituting these into the left-hand side of the differential equation gives
\[ \frac{d^2y}{dx^2} - 4y = 4p(1 + x)e^{2x} - 4pxe^{2x} = 4pe^{2x}. \]
Therefore $y = pxe^{2x}$ is a solution of the differential equation provided that
$4pe^{2x} = 2e^{2x}$ for all $x$. Hence $p = \frac{1}{2}$, and
\[ y_p = \tfrac{1}{2}xe^{2x} \]
is a particular integral for the given differential equation.
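The conclusion of Example 2.6 can be spot-checked numerically. The following is a minimal Python sketch, not part of the course materials: the function names and the sample points are illustrative assumptions, and the derivatives are the ones computed by hand in the example.

```python
import math

def residual_modified(x, p=0.5):
    """Residual y'' - 4y - 2e^{2x} for the modified trial y = p*x*e^{2x}."""
    y = p * x * math.exp(2 * x)
    d2y = 4 * p * (1 + x) * math.exp(2 * x)  # second derivative from Example 2.6
    return d2y - 4 * y - 2 * math.exp(2 * x)

def lhs_original(x, p):
    """Left-hand side y'' - 4y for the original trial y = p*e^{2x}."""
    y = p * math.exp(2 * x)
    d2y = 4 * p * math.exp(2 * x)
    return d2y - 4 * y

for x in (-1.0, 0.0, 0.5, 2.0):
    # Both columns should be zero up to rounding error.
    print(x, residual_modified(x), lhs_original(x, p=3.0))
```

The second column stays at zero whatever value of p is chosen, which is the numerical counterpart of the observation that the unmodified trial solution can never match the non-zero right-hand side.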

The problem with the trial solution being a solution of the associated
homogeneous equation can occur irrespective of the form of the trial solution
(i.e. polynomial, exponential or sinusoidal), but in most cases it can be
overcome by multiplying the trial solution suggested in Procedure 2.2 by x.

Section 2 Inhomogeneous differential equations

When using Procedure 2.2, you should check whether the proposed trial
solution is a solution of the associated homogeneous equation, and if so try
multiplying it by x. (This is why it is important to find yc before yp in
Procedure 2.1.)
Exercise 2.10
Find a particular integral for each of the following differential equations.
(The complementary functions are given in the solutions to Example 1.3 and
Exercise 1.3(b).)
(a) $\dfrac{d^2y}{dx^2} - 3\dfrac{dy}{dx} + 2y = 4e^{x}$
(b) $2\dfrac{d^2y}{dx^2} + 3\dfrac{dy}{dx} = 1$
Exercise 2.11
The motion of a marble dropped from the Clifton Suspension Bridge into
the River Avon can be modelled by the differential equation
\[ m\ddot{x} + r\dot{x} - mg = 0, \]
where $m$ is the mass of the marble, $r$ is a constant related to air resistance,
$g$ is the magnitude of the acceleration due to gravity, and $x$ is the vertical
distance from the point of dropping (as shown in Figure 2.2). Find an
expression for $x$ in terms of the time $t$.

We have seen that Procedure 2.2 fails if the trial solution is a solution of
the associated homogeneous differential equation: in such cases we multiply
the suggested trial solution by x and use this as the trial solution. Another
situation in which it is necessary to multiply the suggested trial solution by
x is illustrated in the following example.
Example 2.7
Find a particular integral for
\[ \frac{d^2y}{dx^2} + 2\frac{dy}{dx} = 2x + 2. \]
Solution
Using Procedure 2.2, let us try $y = p_1 x + p_0$. Differentiating this gives
\[ \frac{dy}{dx} = p_1, \quad \frac{d^2y}{dx^2} = 0. \]
Substituting these into the left-hand side of the differential equation gives
\[ \frac{d^2y}{dx^2} + 2\frac{dy}{dx} = 2p_1. \]
But there is no value of $p_1$ that satisfies $2p_1 = 2x + 2$ for all $x$.
The problem this time is that the complementary function is $y = C + De^{-2x}$
(see Exercise 1.6(c)), so the $p_0$ part of the trial solution is a solution of
the associated homogeneous equation (with $C = p_0$, $D = 0$). Hence, on
substituting the trial solution $y = p_1 x + p_0$ into the inhomogeneous equation,
the $p_0$ part disappears, and the trial solution effectively reduces to
$y = p_1 x$. The result in this case is that, after substituting the trial
solution and its derivatives into the left-hand side of the equation, there are
not enough terms on the left-hand side to compare with the terms in the
right-hand-side function.
Again, the difficulty can be overcome by multiplying the trial solution
suggested by Procedure 2.2 by $x$, to give $y = p_1 x^2 + p_0 x$. Differentiating
this gives
\[ \frac{dy}{dx} = 2p_1 x + p_0, \quad \frac{d^2y}{dx^2} = 2p_1. \]
Substituting these into the left-hand side of the differential equation gives
\[ \frac{d^2y}{dx^2} + 2\frac{dy}{dx} = 2p_1 + 2(2p_1 x + p_0) = 4p_1 x + (2p_1 + 2p_0). \]
Therefore $y = p_1 x^2 + p_0 x$ is a solution of the differential equation provided
that $4p_1 x + (2p_1 + 2p_0) = 2x + 2$ for all $x$. This gives $p_1 = \frac{1}{2}$,
$p_0 = \frac{1}{2}$, so
\[ y_p = \tfrac{1}{2}(x^2 + x) \]
is a particular integral for the given differential equation.
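The coefficient comparison at the end of Example 2.7 amounts to solving two simple equations and then checking the result. A minimal Python sketch of that arithmetic follows; the function names are illustrative assumptions rather than anything defined in the course.

```python
def solve_coefficients():
    """Equate coefficients in 4*p1*x + (2*p1 + 2*p0) = 2*x + 2."""
    p1 = 2 / 4                 # coefficient of x: 4*p1 = 2
    p0 = (2 - 2 * p1) / 2      # constant term: 2*p1 + 2*p0 = 2
    return p1, p0

def residual(x, p1, p0):
    """y'' + 2y' - (2x + 2) for the trial solution y = p1*x**2 + p0*x."""
    dy = 2 * p1 * x + p0
    d2y = 2 * p1
    return d2y + 2 * dy - (2 * x + 2)

p1, p0 = solve_coefficients()
print(p1, p0)
print([residual(x, p1, p0) for x in (-2.0, 0.0, 1.0, 3.5)])
```

With these values the residual vanishes at every sample point, in agreement with the particular integral found in the example.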

To summarize, Procedure 2.2 will fail if all or part of the suggested trial
solution is a solution of the associated homogeneous equation. In such cases,
a particular integral can usually be found by multiplying the trial solution
by x.
However, it may sometimes be necessary to multiply the trial function by x
more than once, as explained in Procedure 2.3 and Exercise 2.12.

Procedure 2.3 Exceptional cases


Suppose that you try using the method of undetermined coefficients
(described in Procedure 2.2) for finding a particular integral for an
inhomogeneous linear constant-coefficient second-order differential
equation.
If this fails because all or part of the trial solution is a solution of
the associated homogeneous equation, then try multiplying the trial
solution by the independent variable x.
If all or part of the resulting trial solution is still a solution of the
associated homogeneous equation, then try multiplying by x again.
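For a purely exponential right-hand side of the form $ke^{\mu x}$, the test in Procedure 2.3 can be phrased in terms of the auxiliary equation: the trial solution $pe^{\mu x}$ solves the homogeneous equation exactly when $\mu$ is a root of $a\lambda^2 + b\lambda + c = 0$, and a second multiplication by $x$ is needed when $\mu$ is a repeated root. The Python sketch below encodes only this special case; the function name, the symbol $\mu$ and the tolerance are assumptions made for illustration.

```python
def extra_powers_of_x(a, b, c, mu, tol=1e-12):
    """Number of times the trial p*e^{mu*x} should be multiplied by x.

    Returns 0 if mu is not a root of a*l**2 + b*l + c, 1 if it is a
    simple root, and 2 if it is a repeated root.
    """
    value = a * mu * mu + b * mu + c   # auxiliary polynomial at mu
    slope = 2 * a * mu + b             # its derivative at mu
    if abs(value) > tol:
        return 0
    if abs(slope) > tol:
        return 1
    return 2

# y'' - 4y = 2e^{2x} (Example 2.6): mu = 2 is a simple root.
print(extra_powers_of_x(1, 0, -4, 2))
# y'' - 2y' + y = e^{x} (Exercise 2.12): mu = 1 is a repeated root.
print(extra_powers_of_x(1, -2, 1, 1))
```

Polynomial and sinusoidal right-hand sides need the same kind of comparison against the complementary function, but this sketch does not attempt to cover them.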

Exercise 2.12
Find a particular integral for
\[ \frac{d^2y}{dx^2} - 2\frac{dy}{dx} + y = e^{x}. \]
(You found the complementary function in Exercise 1.6(d).)

2.4 Combining cases


You have seen how to find a particular integral when the right-hand-side
function f (x) is polynomial, exponential or sinusoidal. In this subsection
you will see how to find a particular integral when f (x) is a combination of
such functions, by using the principle of superposition.

Example 2.8
Find a particular integral for
\[ \frac{d^2y}{dx^2} + 9y = 2e^{3x} + 18x + 18. \tag{2.7} \]
Solution
In Example 2.4 (page 129) you saw that $y_p = \frac{1}{9}e^{3x}$ is a particular
integral for
\[ \frac{d^2y}{dx^2} + 9y = 2e^{3x}, \]
and in Example 2.2 (page 126) you saw that $y_p = x + 1$ is a particular
integral for
\[ \frac{d^2y}{dx^2} + 9y = 9x + 9. \]
Therefore, by the principle of superposition (Theorem 1.1), a particular
integral for Equation (2.7) is
\[ y_p = \tfrac{1}{9}e^{3x} + 2 \times (x + 1) = \tfrac{1}{9}e^{3x} + 2x + 2. \]
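As a check on Example 2.8, the combined particular integral can be substituted directly into Equation (2.7). The short derivation below is added for illustration and uses only the functions that appear in the example.

```latex
\begin{align*}
y_p &= \tfrac{1}{9}e^{3x} + 2x + 2, \qquad
\frac{d^2 y_p}{dx^2} = e^{3x}, \\
\frac{d^2 y_p}{dx^2} + 9y_p
  &= e^{3x} + 9\left(\tfrac{1}{9}e^{3x} + 2x + 2\right)
   = 2e^{3x} + 18x + 18 .
\end{align*}
```

The factor 2 multiplying $(x + 1)$ arises because the polynomial part of the right-hand side of (2.7) is $18x + 18 = 2(9x + 9)$.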

The approach of Example 2.8 is to find particular integrals for differential
equations involving each part of f(x) separately, and then to use the
principle of superposition to combine the two.

Exercise 2.13
Find particular integrals for each of the following differential equations.
(a) $\dfrac{d^2y}{dx^2} - 2\dfrac{dy}{dx} + y = 4e^{x} - 3e^{2x}$ (See Exercise 2.12.)
(b) $2\dfrac{d^2x}{dt^2} + 3\dfrac{dx}{dt} + 2x = 12\cos 2t + 10$
dt dt

End-of-section Exercise
*Exercise 2.14
Find the general solution of each of the following differential equations.
(You will find some help for parts (a), (d), (e) and (f) in Exercises 1.3
and 1.6, and Example 1.3.)
(a) $\dfrac{d^2\theta}{dt^2} + 4\theta = 2t$
(b) $u''(t) + 4u'(t) + 5u(t) = 5$
(c) $3\dfrac{d^2Y}{dx^2} - 2\dfrac{dY}{dx} - Y = e^{2x} + 3$
(d) $\dfrac{d^2y}{dx^2} - 4y = e^{-2x}$
(e) $\dfrac{d^2y}{dx^2} + 4y = \sin 2x + 3x$
(f) $\dfrac{d^2y}{dx^2} - 3\dfrac{dy}{dx} + 2y = 2e^{x} - 5e^{2x}$
dx dx dx

3 Initial conditions and boundary conditions

In Section 2 you saw how to find the general solution of an inhomogeneous
linear constant-coefficient second-order differential equation as a
combination of a complementary function and a particular integral. In practice,
however, we usually want a particular solution that satisfies certain
additional conditions. Recall that a particular solution is one that does not
involve arbitrary constants. In Unit 2 you saw how one additional condition
(called an initial condition) was needed to find a value for the single
arbitrary constant in the general solution of a first-order differential
equation in order to obtain a particular solution. In the case of second-order
differential equations, in order to obtain a particular solution two additional
conditions are needed to obtain values for the two arbitrary constants in the
general solution.

There are two types of additional conditions for second-order differential
equations: initial conditions and boundary conditions. Problems involving
such conditions are called initial-value problems and boundary-value
problems, respectively, and are discussed in Subsections 3.1 and 3.2.

3.1 Initial-value problems


For a first-order differential equation, an initial condition consists of
specifying the value of the dependent variable ($y = y_0$, say) at a given value
of the independent variable ($x = x_0$), and is often written in the form
$y(x_0) = y_0$ (see Unit 2).
One fairly obvious way of specifying two additional conditions for a
second-order differential equation is to give the values of both the dependent
variable ($y = y_0$) and its derivative ($dy/dx = z_0$) for the same given value
of the independent variable ($x = x_0$).
There are many examples of such a pair of initial conditions occurring
naturally as part of a problem. One example is provided by the marble being
dropped from the Clifton Suspension Bridge, in Exercise 2.11. In that
example, $x$ is the vertical distance from the point of dropping. The obvious
choice of origin for the time $t$ is the time at which the marble is dropped.
Therefore a naturally occurring pair of initial conditions is that, at time
$t = 0$, we know both the position $x = 0$ and the speed $\dot{x} = 0$ (since the
marble is dropped, i.e. is released with zero initial velocity). Another example
is provided by the clock pendulum in Exercise 1.7. In this example, when
the pendulum changes direction, its rate of change of angle $\theta$ is
momentarily zero; also, when it changes direction, it makes its greatest angle
$\theta_0$ with the vertical (see Figure 3.1). Therefore, if we measure time $t$
from the moment the pendulum changes direction, we have the initial conditions
$\theta = \theta_0$ and $\dot{\theta} = 0$ when $t = 0$.

Definitions
(a) Initial conditions associated with a second-order differential equation
with dependent variable $y$ and independent variable $x$ specify
that $y$ and $dy/dx$ take values $y_0$ and $z_0$, respectively, when $x$ takes
the value $x_0$. These conditions can be written as
\[ y = y_0 \quad\text{and}\quad \frac{dy}{dx} = z_0 \quad\text{when } x = x_0 \]
or as
\[ y(x_0) = y_0, \quad y'(x_0) = z_0. \]
The numbers $x_0$, $y_0$ and $z_0$ are often referred to as initial values.
(b) The combination of a second-order differential equation and initial
conditions is called an initial-value problem.

Let us now see how initial conditions can be used to determine values for
the two arbitrary constants and hence find a particular solution.

Example 3.1
Find the particular solution of the differential equation
\[ \frac{d^2y}{dx^2} - 3\frac{dy}{dx} + 2y = 0 \]
that satisfies the initial conditions $y = 0$ and $dy/dx = 1$ when $x = 0$.


Solution
From Example 1.3 we know that the general solution is
\[ y = Ce^{x} + De^{2x}, \tag{3.1} \]
where $C$ and $D$ are two arbitrary constants. One of the initial conditions
involves the derivative of the solution, so we need to obtain the derivative
of the general solution:
\[ \frac{dy}{dx} = Ce^{x} + 2De^{2x}. \tag{3.2} \]
The initial conditions state that $y(0) = 0$, $y'(0) = 1$. Substituting $x = 0$,
$y = 0$ into Equation (3.1) gives
\[ 0 = Ce^{0} + De^{0} = C + D, \]
while substituting $x = 0$, $dy/dx = 1$ into Equation (3.2) gives
\[ 1 = Ce^{0} + 2De^{0} = C + 2D. \]
Solving these equations gives $C = -1$, $D = 1$, so the required particular
solution is
\[ y = -e^{x} + e^{2x}. \]
(Note that, when checking a particular solution, you should check that it
satisfies the initial or boundary conditions as well as the
differential equation.)
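The last step of Example 3.1 is the solution of two linear equations in the two unknowns C and D. A minimal Python sketch of that step, using Cramer's rule, is given here; the function name `solve_2x2` and its argument order are assumptions made for illustration.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*C + a12*D = b1, a21*C + a22*D = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the two conditions do not determine C and D uniquely")
    C = (b1 * a22 - a12 * b2) / det
    D = (a11 * b2 - b1 * a21) / det
    return C, D

# Example 3.1: C + D = 0 (from y(0) = 0) and C + 2D = 1 (from y'(0) = 1).
C, D = solve_2x2(1, 1, 1, 2, 0, 1)
print(C, D)
```

The same helper applies to any pair of conditions once the general solution and, where needed, its derivative have been evaluated at the given points.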
*Exercise 3.1
Find the solutions of the following initial-value problems.
(a) $u''(t) + 9u(t) = 0$, $u\left(\frac{\pi}{2}\right) = 0$, $u'\left(\frac{\pi}{2}\right) = 1$. (See Exercise 1.5(b).)
(b) $\dfrac{d^2y}{dx^2} - 3\dfrac{dy}{dx} + 2y = 4e^{x}$, $y(0) = 4$, $y'(0) = 2$. (See Example 1.3 and Exercise 2.10(a).)
(c) $\dfrac{d^2y}{dx^2} - 2\dfrac{dy}{dx} + y = 4e^{x} - 3e^{2x}$, $y(0) = 4$, $y'(0) = -1$. (See Exercises 1.6(d) and 2.13(a).)

You saw in Unit 2 that an initial-value problem involving a linear
first-order differential equation has a unique solution under certain
circumstances. (Such circumstances hold for nearly every such initial-value
problem that you are likely to come across in practice.) The same is true of
initial-value problems involving a linear constant-coefficient second-order
differential equation, as the following theorem makes clear.

Theorem 3.1
The initial-value problem
\[ a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x), \quad y(x_0) = y_0, \quad y'(x_0) = z_0, \]
where $a$, $b$, $c$ are real constants with $a \neq 0$, and $f(x)$ is a given
continuous real-valued function on an interval $(r, s)$, with $x_0 \in (r, s)$,
has a unique solution on that interval.

Note that one consequence of this theorem is that if the differential equation
is homogeneous and the initial conditions are of the form $y(x_0) = 0$ and
$y'(x_0) = 0$, then the unique solution must be the zero function $y = 0$, since
it satisfies the differential equation and the initial conditions.


3.2 Boundary-value problems


The two conditions in an initial-value problem (the value of the dependent
variable $y$ and its derivative $dy/dx$) both relate to the same value of $x$.
However, the two conditions needed to determine values for the arbitrary
constants need not relate to the same value of $x$. We could have one
condition for $x = x_0$ and another for $x = x_1$, say. For example, consider
again the ‘sagging’ beam from Exercise 2.9. Two known conditions on this beam are
its zero displacements at the ends of the beam, where it rests on the rigid
supports: that is, its boundary conditions are $y(0) = 0$ and $y(l) = 0$ (where
$l$ is the length of the beam). This pair of boundary conditions gives the value
of $y$ at two different points, but in general each boundary condition could
specify the value of either $y$ or $dy/dx$ (or even a relationship between them).

Definitions
(a) Boundary conditions associated with a second-order differential
equation with dependent variable $y$ and independent variable $x$
specify that $y$ or $dy/dx$ (or some combination of the two) takes
values $y_0$ and $y_1$ at two different values $x_0$ and $x_1$, respectively, of $x$.
The numbers $x_0$, $x_1$, $y_0$ and $y_1$ are often referred to as boundary
values.
(b) The combination of a second-order differential equation and
boundary conditions is called a boundary-value problem.

The conditions are referred to as ‘boundary’ conditions because, as in the
beam example, they often relate to conditions at the endpoints $x_0$ and $x_1$ of
an interval $[x_0, x_1]$ on which we are interested in exploring the differential
equation.
Let us now see how boundary conditions can be used to determine values
for the two arbitrary constants and hence find a particular solution.

Example 3.2
Find the particular solution of the differential equation
\[ \frac{d^2y}{dx^2} + 9y = 0 \]
that satisfies the boundary conditions $y = 0$ when $x = 0$ and $dy/dx = 1$
when $x = \frac{\pi}{3}$.
Solution
From Exercise 1.5(b), we know that the general solution is
\[ y = C\cos 3x + D\sin 3x, \tag{3.3} \]
where $C$ and $D$ are two arbitrary constants.
One of the boundary conditions involves the derivative of the solution, so
we need to obtain the derivative of the general solution:
\[ \frac{dy}{dx} = -3C\sin 3x + 3D\cos 3x. \tag{3.4} \]
The boundary conditions state that $y(0) = 0$, $y'\left(\frac{\pi}{3}\right) = 1$.
Substituting $x = 0$, $y = 0$ into Equation (3.3) gives
\[ 0 = C\cos 0 + D\sin 0 = C, \]
i.e. $C = 0$. Substituting $x = \frac{\pi}{3}$, $y' = 1$ and $C = 0$ into
Equation (3.4) gives
\[ 1 = 3D\cos\pi = -3D. \]
Therefore $C = 0$, $D = -\frac{1}{3}$, so the required particular solution is
\[ y = -\tfrac{1}{3}\sin(3x). \]
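Following the earlier advice to check a particular solution against both the differential equation and the conditions, the result of Example 3.2 can be verified by direct substitution. The lines below are an added illustration using only quantities from the example.

```latex
\begin{align*}
y &= -\tfrac{1}{3}\sin 3x, \qquad
\frac{dy}{dx} = -\cos 3x, \qquad
\frac{d^2y}{dx^2} = 3\sin 3x, \\
\frac{d^2y}{dx^2} + 9y &= 3\sin 3x - 3\sin 3x = 0, \\
y(0) &= 0, \qquad
y'\!\left(\tfrac{\pi}{3}\right) = -\cos\pi = 1 .
\end{align*}
```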

Exercise 3.2
Find the solution of the following boundary-value problem:
\[ \frac{d^2y}{dx^2} - 3\frac{dy}{dx} + 2y = 4e^{x}, \quad y'(0) = 2, \quad y(1) = 0. \]
(See Exercise 3.1(b).)

Exercise 3.3
Use the differential equation of Exercise 2.9, with $R = S = Q = 1$, namely
\[ y'' - y + \tfrac{1}{2}(l - x)x = 0, \tag{3.5} \]
to determine the vertical displacement at the centre of a beam of length
2 metres resting on rigid supports at its ends.

Unlike the case of initial-value problems, boundary-value problems may not
have solutions even when the differential equation is linear and
constant-coefficient with a continuous real-valued right-hand-side function, as
the following example illustrates.

Example 3.3
Try to find a solution of the boundary-value problem
\[ \frac{d^2y}{dx^2} + 4y = 0, \quad y(0) = 0, \quad y\left(\tfrac{\pi}{2}\right) = 1. \]
Solution
From Exercise 1.6(a), the general solution is
\[ y = C\cos 2x + D\sin 2x, \]
where $C$ and $D$ are two arbitrary constants.
The boundary conditions state that $y(0) = 0$, $y\left(\frac{\pi}{2}\right) = 1$.
Substituting each of these into the general solution in turn gives
\[ 0 = C\cos 0 + D\sin 0 = C, \]
\[ 1 = C\cos\pi + D\sin\pi = -C. \]
There is no solution for which $C = 0$ and $C = -1$, so there is no solution of
the differential equation that satisfies the given boundary conditions.

Fortunately it is rare for a boundary-value problem that models a real-life
situation to have no solution (and in such cases it is usually possible to
reformulate the model to overcome the difficulty).
Not only is it possible for boundary-value problems to have no solution, but
it is also possible for them to have solutions that are not unique, as the
following example illustrates.


Example 3.4
Find the solution of the boundary-value problem
\[ \frac{d^2y}{dx^2} + 4\frac{dy}{dx} + 5y = 5, \quad y(0) = 1, \quad y(\pi) = 1. \]
Solution
From Exercise 2.14(b), the general solution is
\[ y = e^{-2x}(C\cos x + D\sin x) + 1, \]
where $C$ and $D$ are two arbitrary constants.
The boundary conditions state that $y(0) = 1$, $y(\pi) = 1$. Substituting each
of these into the general solution in turn gives
\[ 1 = e^{0}(C\cos 0 + D\sin 0) + 1 = C + 1, \]
\[ 1 = e^{-2\pi}(C\cos\pi + D\sin\pi) + 1 = -Ce^{-2\pi} + 1. \]
Both of these equations reduce to $C = 0$, but $D$ can take any value, so any
solution of the form
\[ y = De^{-2x}\sin x + 1 \]
satisfies the differential equation and the boundary conditions.

In Example 3.4, there is not a unique solution of the differential equation
that satisfies the given boundary conditions, but instead there is an infinite
family of possible solutions.
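The three outcomes seen in Examples 3.2, 3.3 and 3.4 (a unique solution, no solution, infinitely many solutions) correspond to the three possible behaviours of the pair of linear equations for C and D. The Python sketch below classifies such a pair; the function name `classify`, the tolerance and the way the coefficients are passed are all illustrative assumptions, and floating-point values such as sin π are treated as zero when they fall below the tolerance.

```python
import math

def classify(a11, a12, a21, a22, b1, b2, tol=1e-12):
    """Classify a11*C + a12*D = b1, a21*C + a22*D = b2.

    Returns 'unique', 'none' or 'infinite'.
    """
    det = a11 * a22 - a12 * a21
    if abs(det) > tol:
        return "unique"
    # Singular case: a solution exists only if the right-hand sides are
    # compatible with the dependent left-hand sides.
    if max(abs(a11), abs(a12), abs(a21), abs(a22)) <= tol:
        consistent = abs(b1) <= tol and abs(b2) <= tol
    else:
        consistent = (abs(a11 * b2 - a21 * b1) <= tol
                      and abs(a12 * b2 - a22 * b1) <= tol)
    return "infinite" if consistent else "none"

# Example 3.2: y = C cos 3x + D sin 3x, y(0) = 0, y'(pi/3) = 1.
ex32 = classify(1.0, 0.0, -3 * math.sin(math.pi), 3 * math.cos(math.pi), 0.0, 1.0)
# Example 3.3: y = C cos 2x + D sin 2x, y(0) = 0, y(pi/2) = 1.
ex33 = classify(1.0, 0.0, math.cos(math.pi), math.sin(math.pi), 0.0, 1.0)
# Example 3.4: y - 1 = e^{-2x}(C cos x + D sin x), y(0) - 1 = 0, y(pi) - 1 = 0.
k = math.exp(-2 * math.pi)
ex34 = classify(1.0, 0.0, k * math.cos(math.pi), k * math.sin(math.pi), 0.0, 0.0)
print(ex32, ex33, ex34)
```

In each call the first row of coefficients comes from evaluating the complementary function (or its derivative) at the first boundary point, and the second row from the second boundary point, with any particular integral moved to the right-hand side.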
Finally, a word of reassurance: most of the boundary-value problems that
you will come across in this course will have a unique solution.

End-of-section Exercises

*Exercise 3.4
For each of the following problems, identify the conditions as either initial
conditions or boundary conditions, and find the solution of each problem.
(You found the general solution of the differential equation in Exercise 1.6(a).)
(a) $u''(x) + 4u(x) = 0$, $u(0) = 1$, $u'(0) = 0$.
 
(b) u (x) + 4u(x) = 0, u(0) = 0, u π2 = 0.
   
(c) $u''(x) + 4u(x) = 0$, $u\left(\frac{\pi}{2}\right) = 0$, $u'\left(\frac{\pi}{2}\right) = 0$.
 
(d) u (x) + 4u(x) = 0, u(−π) = 1, u π4 = 2.
 
(e) u (x) + 4u(x) = 0, u (0) = 0, u π4 = 1.

Exercise 3.5
Find the solution (if any) of each of the following problems.
(a) $u''(t) + 4u'(t) + 5u(t) = 0$, $u(0) = 0$, $u'(0) = 2$. (See Exercise 2.14(b).)
(b) $\dfrac{d^2y}{dx^2} + 2\dfrac{dy}{dx} + 2y = 0$, where $y = 0$ and $\dfrac{dy}{dx} = 0$ when $x = 0$. (See Exercise 1.9(a).)
(c) $\ddot{x} + 9x = 3(1 - \pi t)$, $x(0) = \frac{1}{3}$, $\dot{x}\left(\frac{\pi}{3}\right) = 0$. (See Exercise 1.5(b).)


4 The nature of solutions


This section is intended principally to assist in the understanding of the
nature of oscillatory solutions to problems involving linear constant-coefficient
second-order differential equations. For the whole of this section we shall
assume that the differential equation has the form
\[ a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x), \]
in which $a$, $b$ and $c$ are positive. (This is almost always the case for
equations arising in mechanics.)

4.1 Transients
Consider the equation
\[ a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0 \quad\text{with } a > 0,\ b > 0,\ c > 0. \]
The nature of the general solution depends on the nature of the roots of the
auxiliary equation $a\lambda^2 + b\lambda + c = 0$. More specifically, you saw in
Procedure 1.1 that
$\lambda_1, \lambda_2$ real and distinct $\Rightarrow$ solution $y(x) = Ce^{\lambda_1 x} + De^{\lambda_2 x}$,
$\lambda_1, \lambda_2$ real and equal $\Rightarrow$ solution $y(x) = (C + Dx)e^{\lambda_1 x}$,
$\lambda_1, \lambda_2$ complex ($= \alpha \pm i\beta$) $\Rightarrow$ solution $y(x) = e^{\alpha x}(C\cos\beta x + D\sin\beta x)$,
where in each case $C$ and $D$ are arbitrary real constants.
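Since the auxiliary equation has $\lambda_1$ and $\lambda_2$ as roots, it may be
written as
\[ (\lambda - \lambda_1)(\lambda - \lambda_2) = 0, \]
or
\[ \lambda^2 - (\lambda_1 + \lambda_2)\lambda + \lambda_1\lambda_2 = 0. \]
When we divide through the original auxiliary equation by $a$, we obtain
\[ \lambda^2 + \frac{b}{a}\lambda + \frac{c}{a} = 0. \]
Comparing the coefficients of $\lambda$ in these two versions of the same
quadratic equation, we find that
\[ \lambda_1 + \lambda_2 = -\frac{b}{a} \quad\text{and}\quad \lambda_1\lambda_2 = \frac{c}{a}. \]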
Now, using the fact that $a$, $b$ and $c$ are positive, we can make some
interesting deductions. First, $c/a$ must be positive so, if they are real,
$\lambda_1$ and $\lambda_2$ have the same sign. Also, since $-b/a$ is negative,
the sum $\lambda_1 + \lambda_2$ must be negative. There is only one conclusion to
draw from this: if $\lambda_1$ and $\lambda_2$ are real, then both are negative.
If on the other hand $\lambda_1$ and $\lambda_2$ form the complex conjugate pair
$\alpha \pm i\beta$, then their sum is
\[ \lambda_1 + \lambda_2 = 2\alpha. \]
Now $\lambda_1 + \lambda_2 = -b/a$ being negative implies that $\alpha$ must be
negative.
The upshot is that, in the above list of solutions, all the exponential terms
have a negative index. Thus for large values of $x$, the magnitude of all the
above solutions will become small. This phenomenon represents damping,
and you will meet it again in Unit 17.
The graphs of typical complementary functions in the three cases are shown
in Figure 4.1.
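The sign argument for the roots of the auxiliary equation can be spot-checked on a few sets of positive coefficients. The Python sketch below uses the standard library's `cmath` module so that real and complex roots are handled uniformly; the function name and the sample coefficients are assumptions chosen for illustration.

```python
import cmath

def auxiliary_roots(a, b, c):
    """Roots of a*l**2 + b*l + c = 0, returned as complex numbers."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

# One case each of distinct real, equal real and complex roots, plus one more.
for a, b, c in [(1, 3, 2), (4, 4, 1), (1, 2, 2), (5, 6, 5)]:
    r1, r2 = auxiliary_roots(a, b, c)
    both_negative = r1.real < 0 and r2.real < 0
    print((a, b, c), r1, r2, both_negative)
```

For every choice with a, b and c positive the real parts come out negative, so each exponential factor in the complementary function decays as x increases.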


Figure 4.1

In such cases, when the complementary function tends to zero, it is known as
the transient or the transient solution, in that it essentially disappears
for large enough $x$.
If we now turn to the inhomogeneous equation
\[ a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x), \]
the above discussion shows that when $a$, $b$ and $c$ are positive, the
complementary function is transient and will not affect the long-term behaviour
of the solution.


For this reason, a particular integral not involving part of the transient
solution is known as the steady-state solution. In most cases, for large x,
because the transient solution then has little effect, the solution settles down
to a ‘steady state’ given by the contribution from that particular integral.
(However, it is still possible that a particular integral may decay at an even
faster rate than the complementary function!)
Two typical solutions to initial-value problems of this type are sketched at
the top of Figure 4.2. Here you can see the two examples of a particular
solution, followed by the contributions made to each by the transient and
the steady-state solution.

Figure 4.2 (Two examples, side by side, each plotting y from −2 to 2 against x from 0 to 50: the particular solution at the top, the transient solution in the middle, and the steady-state solution at the bottom.)

Example 4.1
Consider the differential equation of Example 2.5 (page 130),
\[ \frac{d^2y}{dx^2} + 2\frac{dy}{dx} + 2y = 10\sin 2x, \]
together with the initial conditions $y(0) = -1$ and $y'(0) = -2$. Find the
particular solution and the steady-state behaviour of the solution.
Solution
The auxiliary equation is
\[ \lambda^2 + 2\lambda + 2 = 0, \]
with roots $-1 \pm i$. In Example 2.5 we found a particular integral
\[ y(x) = -2\cos 2x - \sin 2x. \]
Therefore the general solution of the differential equation is
\[ y(x) = e^{-x}(C\cos x + D\sin x) - 2\cos 2x - \sin 2x. \]
Substituting the initial conditions shows that $C = D = 1$, so the particular
solution to the initial-value problem is
\[ y(x) = e^{-x}(\cos x + \sin x) - 2\cos 2x - \sin 2x. \]
This is made up of the transient $e^{-x}(\cos x + \sin x)$, which quickly dies
away, and the steady-state solution $-2\cos 2x - \sin 2x$, which determines the
long-term behaviour.
We say that the terms $-2\cos 2x$ and $-\sin 2x$ dominate the solution for
large positive values of $x$.
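To see how quickly the transient in Example 4.1 dies away, the two parts of the solution can be tabulated separately. This is a minimal Python sketch; the function names and the sample values of x are illustrative assumptions.

```python
import math

def transient(x):
    """Complementary-function part of the solution in Example 4.1."""
    return math.exp(-x) * (math.cos(x) + math.sin(x))

def steady_state(x):
    """Particular-integral part of the solution in Example 4.1."""
    return -2 * math.cos(2 * x) - math.sin(2 * x)

def y(x):
    """Full particular solution of the initial-value problem."""
    return transient(x) + steady_state(x)

for x in (0, 1, 5, 10, 20):
    # Columns: x, transient, steady state, full solution.
    print(x, transient(x), steady_state(x), y(x))
```

Since the magnitude of cos x + sin x never exceeds √2, the transient is bounded by √2 e^{-x}, so by x = 20 it contributes less than about 3 × 10⁻⁹ to the solution.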

In the remainder of this section you are invited to investigate these ideas
using the computer algebra package.

4.2 Solving initial-value problems on the computer


The computer algebra package for the course allows you to solve initial-
value problems involving linear constant-coefficient second-order differential
equations in general, and not just when all the coefficients are positive. The
following activity asks you to solve such problems, and to examine the nature
and behaviour of the solutions by means of graphs.
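For readers without access to the course package, an initial-value problem of this kind can also be explored numerically. The following Python sketch is not the course's computer algebra package and does not produce symbolic solutions: it is a classical fourth-order Runge-Kutta integrator, with an assumed function name and step count, applied to the problem of Example 4.1 so that the result can be compared with the closed-form solution found there.

```python
import math

def rk4_second_order(a, b, c, f, x0, y0, z0, x_end, n=2000):
    """Approximate y(x_end) for a*y'' + b*y' + c*y = f(x), y(x0)=y0, y'(x0)=z0.

    The equation is rewritten as the first-order system
    y' = z, z' = (f(x) - b*z - c*y) / a and integrated with n RK4 steps.
    """
    h = (x_end - x0) / n
    x, y, z = x0, y0, z0

    def dz(x, y, z):
        return (f(x) - b * z - c * y) / a

    for _ in range(n):
        k1y, k1z = z, dz(x, y, z)
        k2y, k2z = z + h / 2 * k1z, dz(x + h / 2, y + h / 2 * k1y, z + h / 2 * k1z)
        k3y, k3z = z + h / 2 * k2z, dz(x + h / 2, y + h / 2 * k2y, z + h / 2 * k2z)
        k4y, k4z = z + h * k3z, dz(x + h, y + h * k3y, z + h * k3z)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        z += h / 6 * (k1z + 2 * k2z + 2 * k3z + k4z)
        x += h
    return y

# Example 4.1: y'' + 2y' + 2y = 10 sin 2x, y(0) = -1, y'(0) = -2.
numeric = rk4_second_order(1, 2, 2, lambda x: 10 * math.sin(2 * x), 0.0, -1.0, -2.0, 5.0)
exact = math.exp(-5) * (math.cos(5) + math.sin(5)) - 2 * math.cos(10) - math.sin(10)
print(numeric, exact)
```

A numerical approach of this sort shows the long-term behaviour of a solution but not its form, which is why a symbolic package remains the better tool for identifying which terms dominate.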

Use your computer to complete the following activity.


*Activity 4.1
Solve each of the following initial-value problems.
(a) $\dfrac{d^2y}{dx^2} + 3\dfrac{dy}{dx} + 2y = 4e^{x}$, $y(0) = 4$, $y'(0) = 2$.
(b) $2\dfrac{d^2y}{dx^2} + 3\dfrac{dy}{dx} = \sin x$, $y(0) = 0$, $y'(0) = -1$.
(c) $4\dfrac{d^2y}{dx^2} + 4\dfrac{dy}{dx} + y = 2\cos 2x$, $y(1) = 0$, $y'(1) = 1$.
(d) $5\dfrac{d^2y}{dx^2} + 6\dfrac{dy}{dx} + 5y = 4\sin x$, $y(0) = 1$, $y'(0) = -2$.
(e) $\dfrac{d^2y}{dx^2} - 4\dfrac{dy}{dx} + 4y = 3\cos 3x$, $y\left(\frac{\pi}{2}\right) = 0$, $y'\left(\frac{\pi}{2}\right) = \frac{1}{2}$.
In each case, consider the long-term behaviour of the solution, and try to
identify which terms will dominate the solution for large positive values of x.


Outcomes
After studying this unit you should be able to:
• understand and use the terminology relating to linear constant-coefficient
second-order differential equations;
• understand the key role of the principle of superposition in the solution
of linear constant-coefficient second-order differential equations;
• obtain the general solution of a homogeneous linear constant-coefficient
second-order differential equation using the solutions of its auxiliary
equation;
• use the method of undetermined coefficients to find a particular integral
for an inhomogeneous linear constant-coefficient second-order differential
equation with certain simple forms of right-hand-side function;
• obtain the general solution of an inhomogeneous linear constant-coefficient
second-order differential equation by combining its complementary
function and a particular integral;
• use the general solution together with a pair of initial or boundary
conditions to obtain, when possible, a particular solution of a linear
constant-coefficient second-order differential equation;
• understand the nature of the solutions of linear constant-coefficient
second-order differential equations with positive coefficients, particularly
those involving transient and steady-state parts;
• use the computer algebra package for the course to solve a second-order
differential equation.


Solutions to the exercises


Section 1

1.1 (a) Equations (i), (ii), (iii), (vii), (viii) and (ix) are linear and
constant-coefficient. (Equations (v) and (vi) are non-linear; (iv) is linear
but not constant-coefficient.)
(b) Of the linear constant-coefficient equations, only (iii) and (ix) are
homogeneous.
(c) In Equations (i)–(vi) the (dependent, independent) variable pairs are all
(y, x). In Equations (vii), (viii) and (ix) they are (t, θ), (x, t) and (x, t),
respectively.

1.2 (a) $\lambda^2 - 5\lambda + 6 = 0$
(b) $\lambda^2 - 9 = 0$
(c) $\lambda^2 + 2\lambda = 0$

1.3 (a) The auxiliary equation is $\lambda^2 + 5\lambda + 6 = 0$. Solving this by
factorization as $(\lambda + 2)(\lambda + 3) = 0$ gives the roots $\lambda_1 = -2$
and $\lambda_2 = -3$. The general solution is therefore
\[ y = Ce^{-2x} + De^{-3x}. \]
(b) The auxiliary equation is $2\lambda^2 + 3\lambda = 0$. This can be factorized
as $\lambda(2\lambda + 3) = 0$, so its roots are $\lambda_1 = 0$ and
$\lambda_2 = -\frac{3}{2}$. The general solution is therefore
\[ y = Ce^{0} + De^{-3x/2} = C + De^{-3x/2}. \]
(c) The auxiliary equation is $\lambda^2 - 4 = 0$, i.e. $\lambda^2 = 4$, so its
roots are $\lambda_1 = -2$ and $\lambda_2 = 2$. The general solution is therefore
\[ z = Ce^{-2u} + De^{2u}. \]

1.4 (a) The auxiliary equation is $\lambda^2 + 2\lambda + 1 = 0$, which can be
factorized as $(\lambda + 1)^2 = 0$, giving equal roots
$\lambda_1 = \lambda_2 = -1$. The general solution is therefore
\[ y = (C + Dx)e^{-x}. \]
(b) The auxiliary equation factorizes as $(\lambda - 2)^2 = 0$, which has equal
roots $\lambda_1 = \lambda_2 = 2$. The general solution is therefore
\[ s = (C + Dt)e^{2t}. \]

1.5 (a) The auxiliary equation is $\lambda^2 + 4\lambda + 8 = 0$, which has
solutions
\[ \lambda = \frac{-4 \pm \sqrt{16 - 32}}{2} = -2 \pm 2i. \]
The general solution is therefore
\[ y = e^{-2x}(C\cos 2x + D\sin 2x). \]
(b) The auxiliary equation is $\lambda^2 + 9 = 0$, which has solutions
\[ \lambda = \pm 3i. \]
The general solution is therefore
\[ \theta = e^{0}(C\cos 3t + D\sin 3t) = C\cos 3t + D\sin 3t. \]
(Note how simple the solution is when there is no first-derivative term in the
differential equation. In general, the solution of an equation of the form
$y'' + \omega^2 y = 0$, where $\omega$ is a constant and $x$ is the independent
variable, is $y = C\cos\omega x + D\sin\omega x$.)

1.6 (a) The auxiliary equation is $\lambda^2 + 4 = 0$, which has solutions
$\lambda = \pm 2i$. The general solution is therefore
\[ y = C\cos 2x + D\sin 2x. \]
(You could also have written down this general solution directly using the
remark in Solution 1.5(b).)
(b) The auxiliary equation is $\lambda^2 - 6\lambda + 8 = 0$, which has solutions
$\lambda_1 = 4$ and $\lambda_2 = 2$. The general solution is therefore
\[ u = Ce^{4x} + De^{2x}. \]
(c) The auxiliary equation is $\lambda^2 + 2\lambda = 0$, which has solutions
$\lambda_1 = 0$ and $\lambda_2 = -2$. The general solution is therefore
\[ y = C + De^{-2x}. \]
(d) The auxiliary equation is $\lambda^2 - 2\lambda + 1 = 0$, which has solutions
$\lambda_1 = \lambda_2 = 1$. The general solution is therefore
\[ y = (C + Dx)e^{x}. \]
(e) The auxiliary equation is $\lambda^2 - \omega^2 = 0$, which has solutions
$\lambda = \pm\omega$. The general solution is therefore
\[ y = Ce^{\omega x} + De^{-\omega x}. \]
(f) The auxiliary equation is $\lambda^2 + 4\lambda + 29 = 0$, which has solutions
$\lambda = -2 \pm 5i$. The general solution is therefore
\[ e^{-2x}(C\cos 5x + D\sin 5x). \]

1.7 The auxiliary equation is $\lambda^2 + g/l = 0$, which has solutions
$\lambda = \pm i\sqrt{g/l}$. The general solution is therefore
\[ \theta = C\cos\left(\sqrt{\frac{g}{l}}\,t\right) + D\sin\left(\sqrt{\frac{g}{l}}\,t\right). \]
(This is another example of an equation involving no first-derivative term. So
you could have written down the general solution directly using the remark in
Solution 1.5(b).)

1.8 (a) The required auxiliary equation is
\[ 3\lambda - 1 - 2\lambda^2 = 0, \]
or, equivalently,
\[ 2\lambda^2 - 3\lambda + 1 = 0. \]
(b) The two solutions of the auxiliary equation are $\lambda_1 = \frac{1}{2}$ and
$\lambda_2 = 1$.
(c) By Procedure 1.1, the general solution is
\[ y = Ce^{\frac{1}{2}x} + De^{x}, \]
where $C$ and $D$ are arbitrary constants.


1.9 (a) $\lambda^2 + 2\lambda + 2 = 0$ has solutions $-1 \pm i$, so the general
solution is $y = e^{-x}(C\cos x + D\sin x)$.
(b) $\lambda^2 - 16 = 0$ has solutions $\pm 4$, so the general solution is
$y = Ce^{4x} + De^{-4x}$.
(c) $\lambda^2 - 4\lambda + 4 = 0$ has solutions $\lambda_1 = \lambda_2 = 2$, so the
general solution is $y = (C + Dx)e^{2x}$.
(d) $\lambda^2 + 3\lambda = 0$ has solutions $\lambda_1 = 0$ and $\lambda_2 = -3$, so
the general solution is $\theta = C + De^{-3t}$.

1.10 The auxiliary equation $\lambda^2 + 4k\lambda + 4 = 0$ can be solved using the
formula method to give $\lambda = -2k \pm 2\sqrt{k^2 - 1}$. So there are complex
conjugate solutions, leading to a general solution involving sines and cosines,
when $k^2 < 1$, i.e. when $|k| < 1$.

Section 2

2.1 We could check this directly, by substituting $y = y_{p1} - y_{p2}$ into the
associated homogeneous equation. However, it is easier to appeal to the
principle of superposition. Since $y_{p1}$ and $y_{p2}$ both satisfy
\[ a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x), \]
Theorem 1.1 shows that the combination $y = y_{p1} - y_{p2}$ indeed satisfies
\[ a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x) - f(x) = 0, \]
as required.

2.2 (a) The associated homogeneous equation is $d^2y/dx^2 + 4y = 0$. The
complementary function (see Exercise 1.6(a)) is $y_c = C\cos 2x + D\sin 2x$.
Trying a solution of the form $y_p = p$, where $p$ is a constant, in the original
equation $d^2y/dx^2 + 4y = 8$ gives $0 + 4p = 8$, so that $p = 2$. Thus a
particular integral is $y_p = 2$.
By Procedure 2.1, the general solution is
\[ y = C\cos 2x + D\sin 2x + 2. \]
(b) The associated homogeneous equation is $d^2y/dx^2 - 3\,dy/dx + 2y = 0$. The
complementary function (see Example 1.3) is $y_c = Ce^{x} + De^{2x}$.
Trying a solution of the form $y_p = p$ in the original equation
$d^2y/dx^2 - 3\,dy/dx + 2y = 6$ gives $0 - 0 + 2p = 6$, so that $p = 3$. Thus a
particular integral is $y_p = 3$.
By Procedure 2.1, the general solution is
\[ y = Ce^{x} + De^{2x} + 3. \]

2.3 (a) Substituting $y = p_1 x + p_0$ and its derivatives into the differential
equation gives
\[ 0 - 2p_1 + 2(p_1 x + p_0) = 2p_1 x + (2p_0 - 2p_1) = 2x + 3. \]
Equating coefficients gives $p_1 = 1$, $p_0 = \frac{5}{2}$. Therefore a particular
integral is
\[ y_p = x + \tfrac{5}{2}. \]
(b) Substituting $y = p_1 x + p_0$ and its derivatives into the differential
equation gives
\[ 0 + 2p_1 + (p_1 x + p_0) = p_1 x + (2p_1 + p_0) = 2x. \]
Hence $p_1 = 2$, $p_0 = -4$, and a particular integral is
\[ y_p = 2x - 4. \]

2.4 We try $y = p_2 t^2 + p_1 t + p_0$, which has derivatives $y' = 2p_2 t + p_1$,
$y'' = 2p_2$. Substituting these into the differential equation gives
\[ 2p_2 - (p_2 t^2 + p_1 t + p_0) = -p_2 t^2 - p_1 t + (2p_2 - p_0) = t^2. \]
Hence $p_2 = -1$, $p_1 = 0$, $p_0 = -2$, and a particular integral is
\[ y_p = -t^2 - 2. \]

2.5 We try a solution of the form $y = pe^{-x}$, which has derivatives
$dy/dx = -pe^{-x}$, $d^2y/dx^2 = pe^{-x}$. Substituting these into the differential
equation gives
\[ 2pe^{-x} + 2pe^{-x} + pe^{-x} = 5pe^{-x} = 2e^{-x}. \]
Hence $p = \frac{2}{5}$, and a particular integral is
\[ y_p = \tfrac{2}{5}e^{-x}. \]

2.6 We try $y = p\cos 3t + q\sin 3t$, which has derivatives
\[ \frac{dy}{dt} = -3p\sin 3t + 3q\cos 3t, \]
\[ \frac{d^2y}{dt^2} = -9p\cos 3t - 9q\sin 3t. \]
Substituting into the differential equation gives
\[ (-9p\cos 3t - 9q\sin 3t) - (-3p\sin 3t + 3q\cos 3t) = -(9p + 3q)\cos 3t + (3p - 9q)\sin 3t = \cos 3t + \sin 3t. \]
Hence we have a pair of simultaneous equations
\[ -9p - 3q = 1, \]
\[ 3p - 9q = 1. \]
Adding three times the second equation to the first shows that
$q = -\frac{2}{15}$, whence $p = -\frac{1}{15}$. A particular integral is thus
\[ y_p = -\tfrac{1}{15}\cos 3t - \tfrac{2}{15}\sin 3t. \]

2.7 (a) Try $y = pe^{3x}$.
(b) Try $y = p\cos 3x + q\sin 3x$.

2.8 (a) From Exercise 1.9(a), the complementary function is
$y_c = e^{-x}(C\cos x + D\sin x)$. To find a particular integral, try $y = p_0$.
Substituting into the differential equation gives
\[ 0 + 0 + 2p_0 = 2p_0 = 4. \]
Hence $p_0 = 2$, and a particular integral is $y_p = 2$.
Therefore the general solution is
\[ y = e^{-x}(C\cos x + D\sin x) + 2. \]


(b) From Exercise 1.9(d), $\theta_c = C + De^{-3t}$. To find a particular
integral, try $\theta = p\cos 3t + q\sin 3t$. Differentiating gives
$$ \frac{d\theta}{dt} = -3p\sin 3t + 3q\cos 3t, \qquad \frac{d^2\theta}{dt^2} = -9p\cos 3t - 9q\sin 3t. $$
Substituting into the differential equation gives
$$ (-9p\cos 3t - 9q\sin 3t) + 3(-3p\sin 3t + 3q\cos 3t) = (9q - 9p)\cos 3t - (9q + 9p)\sin 3t = 9\cos 3t. $$
This gives a pair of simultaneous equations to solve:
$$ -9p + 9q = 9, \qquad -9p - 9q = 0. $$
Hence $p = -\tfrac{1}{2}$, $q = \tfrac{1}{2}$, and a particular integral is
$\theta_p = -\tfrac{1}{2}\cos 3t + \tfrac{1}{2}\sin 3t$. Therefore the general
solution is
$$ \theta = C + De^{-3t} - \tfrac{1}{2}\cos 3t + \tfrac{1}{2}\sin 3t. $$

2.9 Putting the equation into standard form and using $R = S = Q = 1$ gives
$$ y'' - y = -\tfrac{1}{2}(l - x)x = -\tfrac{1}{2}lx + \tfrac{1}{2}x^2. $$
The associated homogeneous equation is $y'' - y = 0$, which has auxiliary
equation $\lambda^2 - 1 = 0$. This has roots $\lambda = \pm 1$, so the
complementary function is
$$ y_c = Ce^x + De^{-x}. $$
To obtain a particular integral, we try a function of the form
$y = p_2 x^2 + p_1 x + p_0$. Its derivatives are $y' = 2p_2 x + p_1$,
$y'' = 2p_2$. Substituting into the differential equation gives
$$ 2p_2 - (p_2 x^2 + p_1 x + p_0) = -p_2 x^2 - p_1 x + (2p_2 - p_0) = \tfrac{1}{2}x^2 - \tfrac{1}{2}lx. $$
Hence $p_2 = -\tfrac{1}{2}$, $p_1 = \tfrac{1}{2}l$, $p_0 = -1$, and a particular
integral is
$$ y_p = -\tfrac{1}{2}x^2 + \tfrac{1}{2}lx - 1. $$
Therefore the general solution is
$$ y = Ce^x + De^{-x} - \tfrac{1}{2}x^2 + \tfrac{1}{2}lx - 1. $$

2.10 (a) From Example 1.3, the associated homogeneous equation has general
solution $y = Ce^x + De^{2x}$, and the trial solution $y = pe^x$ suggested by
Procedure 2.2 is a solution of this equation (with $C = p$, $D = 0$). So we
try $y = pxe^x$ instead. Differentiating twice gives
$$ \frac{dy}{dx} = pe^x + pxe^x = p(1 + x)e^x, \qquad \frac{d^2y}{dx^2} = pe^x + p(1 + x)e^x = p(2 + x)e^x. $$
Substituting into the left-hand side of the differential equation gives
$$ p(2 + x)e^x - 3p(1 + x)e^x + 2pxe^x = -pe^x = 4e^x. $$
Hence $p = -4$, and a particular integral is
$$ y_p = -4xe^x. $$
(b) From Exercise 1.3(b), the associated homogeneous equation has general
solution $y = C + De^{-3x/2}$, and the trial solution $y = p_0$ suggested by
Procedure 2.2 is a solution of this equation (with $C = p_0$, $D = 0$). So we
try $y = p_0 x$. Differentiating twice gives
$$ \frac{dy}{dx} = p_0, \qquad \frac{d^2y}{dx^2} = 0. $$
Substituting into the left-hand side of the differential equation gives
$3p_0 = 1$, so $p_0 = \tfrac{1}{3}$, and a particular integral is
$$ y_p = \tfrac{1}{3}x. $$

2.11 The associated homogeneous equation has auxiliary equation
$$ m\lambda^2 + r\lambda = 0, $$
with solutions $\lambda = 0$ and $\lambda = -r/m$. The complementary function
is therefore
$$ x_c = C + De^{-rt/m}. $$
The inhomogeneous term is $mg$, so Procedure 2.2 suggests a trial solution
$x = p_0$. However, this is a solution of the associated homogeneous equation
(with $C = p_0$, $D = 0$). Hence we try $x = p_0 t$ instead. Differentiating
and substituting gives
$$ rp_0 = mg, \quad\text{so}\quad p_0 = \frac{mg}{r}. $$
Hence a particular integral is
$$ x_p = \frac{mgt}{r}, $$
and the general solution is
$$ x = C + De^{-rt/m} + \frac{mgt}{r}. $$

2.12 From Exercise 1.6(d), the associated homogeneous equation has general
solution $y = (C + Dx)e^x$, so not only is the trial solution $y = pe^x$
suggested by Procedure 2.2 a solution of the associated homogeneous
differential equation (with $C = p$, $D = 0$), but so is $y = pxe^x$ (with
$C = 0$, $D = p$). So we try $y = px^2e^x$. Differentiating twice gives
$$ \frac{dy}{dx} = 2pxe^x + px^2e^x = p(2x + x^2)e^x, $$
$$ \frac{d^2y}{dx^2} = p(2 + 2x)e^x + p(2x + x^2)e^x = p(2 + 4x + x^2)e^x. $$
Substituting into the differential equation gives
$$ p(2 + 4x + x^2)e^x - 2p(2x + x^2)e^x + px^2e^x = 2pe^x = e^x. $$
Hence $p = \tfrac{1}{2}$, and a particular integral is
$$ y_p = \tfrac{1}{2}x^2e^x. $$


2.13 (a) From Exercise 2.12, $y_p = \tfrac{1}{2}x^2e^x$ is a particular
integral for
$$ \frac{d^2y}{dx^2} - 2\frac{dy}{dx} + y = e^x. $$
So, using the principle of superposition, we can find a particular integral
for the given differential equation if we can find one for
$$ \frac{d^2y}{dx^2} - 2\frac{dy}{dx} + y = -3e^{2x}. $$
We try $y = pe^{2x}$, which has derivatives
$$ \frac{dy}{dx} = 2pe^{2x}, \qquad \frac{d^2y}{dx^2} = 4pe^{2x}. $$
Substituting into the differential equation gives
$$ 4pe^{2x} - 4pe^{2x} + pe^{2x} = pe^{2x} = -3e^{2x}. $$
Hence $p = -3$, and $y_p = -3e^{2x}$ is a particular integral for the
differential equation with right-hand side $-3e^{2x}$.
Thus, using the principle of superposition, a particular integral for the
given differential equation is
$$ y_p = 4(\tfrac{1}{2}x^2e^x) - 3e^{2x} = 2x^2e^x - 3e^{2x}. $$
(b) This time we do not have a particular integral for any part of the
right-hand-side function, so we need to start from scratch.
First consider the $12\cos 2t$ term on the right-hand side, and try
$x = p\cos 2t + q\sin 2t$ as a trial solution. This has derivatives
$$ \frac{dx}{dt} = -2p\sin 2t + 2q\cos 2t, \qquad \frac{d^2x}{dt^2} = -4p\cos 2t - 4q\sin 2t. $$
Substituting into the differential equation gives
$$ 2(-4p\cos 2t - 4q\sin 2t) + 3(-2p\sin 2t + 2q\cos 2t) + 2(p\cos 2t + q\sin 2t) = 6(q - p)\cos 2t - 6(p + q)\sin 2t = 12\cos 2t. $$
So $p + q = 0$, $q - p = 2$, hence $p = -1$, $q = 1$, and a particular
integral is
$$ x_p = -\cos 2t + \sin 2t. $$
Now consider the 10 term, and try $x = p_0$. Substituting into the
differential equation gives $2p_0 = 10$, so $p_0 = 5$, and a particular
integral is
$$ x_p = 5. $$
Therefore, using the principle of superposition, a particular integral for
the differential equation with $f(t) = 12\cos 2t + 10$ is
$$ x_p = -\cos 2t + \sin 2t + 5. $$

2.14 (a) From Exercise 1.6(a), the complementary function is
$$ \theta_c = C\cos 2t + D\sin 2t. $$
To find a particular integral, try $\theta = p_1 t + p_0$. Substituting this
and its derivatives into the differential equation gives
$$ 4(p_1 t + p_0) = 2t. $$
Hence $p_1 = \tfrac{1}{2}$, $p_0 = 0$, and a particular integral is
$$ \theta_p = \tfrac{1}{2}t. $$
Therefore the general solution is
$$ \theta = C\cos 2t + D\sin 2t + \tfrac{1}{2}t. $$
(b) The auxiliary equation is $\lambda^2 + 4\lambda + 5 = 0$, which has
solutions $\lambda = -2 \pm i$. So the complementary function is
$$ u_c = e^{-2t}(C\cos t + D\sin t). $$
To find a particular integral, try $u = p_0$. Substituting gives $5p_0 = 5$.
Hence $p_0 = 1$, and a particular integral is
$$ u_p = 1. $$
Therefore the general solution is
$$ u = e^{-2t}(C\cos t + D\sin t) + 1. $$
(c) The auxiliary equation is $3\lambda^2 - 2\lambda - 1 = 0$, which has
solutions $\lambda_1 = 1$ and $\lambda_2 = -\tfrac{1}{3}$. So the complementary
function is
$$ Y_c = Ce^x + De^{-x/3}. $$
Consider first the $e^{2x}$ term on the right-hand side of the equation. To
find a particular integral, try $Y = pe^{2x}$. The derivatives are
$dY/dx = 2pe^{2x}$ and $d^2Y/dx^2 = 4pe^{2x}$. Substituting gives
$$ 3(4pe^{2x}) - 2(2pe^{2x}) - pe^{2x} = 7pe^{2x} = e^{2x}. $$
Hence $p = \tfrac{1}{7}$, and a particular integral is
$$ Y_p = \tfrac{1}{7}e^{2x}. $$
Now consider the 3 term on the right-hand side of the equation, and try
$Y = p_0$. Substituting gives $-p_0 = 3$, so $p_0 = -3$, and a particular
integral is
$$ Y_p = -3. $$
Therefore, using the principle of superposition, a particular integral for
the differential equation with $f(x) = e^{2x} + 3$ is
$$ Y_p = \tfrac{1}{7}e^{2x} - 3. $$
Therefore the general solution is
$$ Y = Ce^x + De^{-x/3} + \tfrac{1}{7}e^{2x} - 3. $$
(d) From Exercise 1.3(c), the complementary function is
$$ y_c = Ce^{-2x} + De^{2x}. $$
To find a particular integral, since $e^{-2x}$ is a solution of the
associated homogeneous equation, try $y = pxe^{-2x}$. The derivatives are
$dy/dx = p(1 - 2x)e^{-2x}$ and $d^2y/dx^2 = 4p(x - 1)e^{-2x}$. Substituting
gives
$$ 4p(x - 1)e^{-2x} - 4pxe^{-2x} = -4pe^{-2x} = e^{-2x}. $$
Hence $p = -\tfrac{1}{4}$, and a particular integral is
$$ y_p = -\tfrac{1}{4}xe^{-2x}. $$
Therefore the general solution is
$$ y = Ce^{-2x} + De^{2x} - \tfrac{1}{4}xe^{-2x}. $$


(e) From Exercise 1.6(a), the complementary function is
$$ y_c = C\cos 2x + D\sin 2x. $$
To find a particular integral, we note that, from part (a), a particular
integral for $d^2y/dx^2 + 4y = 2x$ is $y_p = \tfrac{1}{2}x$. So we need to
consider only the $\sin 2x$ term, and then use the principle of
superposition. For this term, noting the form of the complementary function,
try $y = x(p\cos 2x + q\sin 2x)$. The derivatives are
$$ \frac{dy}{dx} = (p + 2qx)\cos 2x + (q - 2px)\sin 2x, $$
$$ \frac{d^2y}{dx^2} = (4q - 4px)\cos 2x - (4p + 4qx)\sin 2x. $$
Substituting gives
$$ (4q - 4px)\cos 2x - (4p + 4qx)\sin 2x + 4x(p\cos 2x + q\sin 2x) = 4q\cos 2x - 4p\sin 2x = \sin 2x. $$
Hence $p = -\tfrac{1}{4}$, $q = 0$, and a particular integral is
$$ y_p = -\tfrac{1}{4}x\cos 2x. $$
So, using the principle of superposition, a particular integral for the
given differential equation is
$$ y_p = -\tfrac{1}{4}x\cos 2x + \tfrac{3}{2}(\tfrac{1}{2}x) = -\tfrac{1}{4}x\cos 2x + \tfrac{3}{4}x. $$
Therefore the general solution is
$$ y = C\cos 2x + D\sin 2x - \tfrac{1}{4}x\cos 2x + \tfrac{3}{4}x. $$
(f) From Example 1.3, the complementary function is
$$ y_c = Ce^x + De^{2x}. $$
Consider first the $2e^x$ term on the right-hand side of the equation. To
find a particular integral, since $e^x$ appears in the complementary
function, try $y = pxe^x$, which has derivatives
$$ \frac{dy}{dx} = p(1 + x)e^x, \qquad \frac{d^2y}{dx^2} = p(2 + x)e^x. $$
Substituting into the differential equation gives
$$ p(2 + x)e^x - 3p(1 + x)e^x + 2pxe^x = -pe^x = 2e^x. $$
Hence $p = -2$, and a particular integral is
$$ y_p = -2xe^x. $$
Now consider the $-5e^{2x}$ term on the right-hand side of the equation. To
find a particular integral, since $e^{2x}$ appears in the complementary
function, try $y = pxe^{2x}$, which has derivatives
$$ \frac{dy}{dx} = p(1 + 2x)e^{2x}, \qquad \frac{d^2y}{dx^2} = p(4 + 4x)e^{2x}. $$
Substituting into the differential equation gives
$$ p(4 + 4x)e^{2x} - 3p(1 + 2x)e^{2x} + 2pxe^{2x} = pe^{2x} = -5e^{2x}. $$
Hence $p = -5$, and a particular integral is
$$ y_p = -5xe^{2x}. $$
Therefore, using the principle of superposition, a particular integral for
the differential equation with $f(x) = 2e^x - 5e^{2x}$ is
$$ y_p = -2xe^x - 5xe^{2x}. $$
Therefore the general solution is
$$ y = Ce^x + De^{2x} - 2xe^x - 5xe^{2x}. $$

Section 3

3.1 (a) From Exercise 1.5(b), the general solution is
$$ u = C\cos 3t + D\sin 3t. $$
Its derivative is
$$ \frac{du}{dt} = -3C\sin 3t + 3D\cos 3t. $$
Substituting the initial condition $t = \tfrac{\pi}{2}$, $u = 0$ into the
general solution gives $D = 0$. Substituting the initial condition
$t = \tfrac{\pi}{2}$, $du/dt = 1$ into the derivative gives $C = \tfrac{1}{3}$.
Hence the required particular solution is
$$ u = \tfrac{1}{3}\cos 3t. $$
(b) From Example 1.3 and Exercise 2.10(a), the general solution is
$$ y = Ce^x + De^{2x} - 4xe^x. $$
Its derivative is
$$ y' = Ce^x + 2De^{2x} - 4(1 + x)e^x. $$
Substituting the initial condition $x = 0$, $y = 4$ into the general solution
gives $C + D = 4$. Substituting the initial condition $x = 0$, $y' = 2$ into
the derivative gives $C + 2D - 4 = 2$. Hence $C = 2$, $D = 2$, and the
required particular solution is
$$ y = 2e^x + 2e^{2x} - 4xe^x. $$
(c) From Exercises 1.6(d) and 2.13(a), the general solution is
$$ y = (C + Dx)e^x + 2x^2e^x - 3e^{2x}. $$
Its derivative is
$$ y' = (C + D + Dx)e^x + (4x + 2x^2)e^x - 6e^{2x}. $$
Substituting the initial condition $x = 0$, $y = 4$ into the general solution
gives $C - 3 = 4$. Substituting the initial condition $x = 0$, $y' = -1$ into
the derivative gives $C + D - 6 = -1$. Hence $C = 7$, $D = -2$, and the
required particular solution is
$$ y = (7 - 2x)e^x + 2x^2e^x - 3e^{2x} = (7 - 2x + 2x^2)e^x - 3e^{2x}. $$

3.2 From Exercise 3.1(b), the general solution is
$$ y = Ce^x + De^{2x} - 4xe^x, $$
and its derivative is
$$ y' = Ce^x + 2De^{2x} - 4(1 + x)e^x. $$
Substituting the boundary condition $x = 0$, $y' = 2$ into the derivative
gives $C + 2D = 6$. Substituting $x = 1$, $y = 0$ into the general solution
gives $Ce + De^2 - 4e = 0$, which can be rearranged to give $C + eD = 4$.
Hence $C = (8 - 6e)/(2 - e)$, $D = 2/(2 - e)$, and the required particular
solution is
$$ y = \frac{8 - 6e}{2 - e}e^x + \frac{2}{2 - e}e^{2x} - 4xe^x. $$

3.3 From Exercise 2.9, the general solution of Equation (3.5) is
$y = Ce^x + De^{-x} - \tfrac{1}{2}x^2 + \tfrac{1}{2}lx - 1$, which for $l = 2$
becomes
$$ y = Ce^x + De^{-x} - \tfrac{1}{2}x^2 + x - 1. $$
The boundary conditions, resulting from the beam resting on supports at its
two ends, are $y(0) = 0$, $y(2) = 0$.


Substitution of these into the general solution gives $C + D - 1 = 0$ and
$Ce^2 + De^{-2} - 1 = 0$. Multiplying the second equation by $e^2$ gives
$C + D = 1$ and $Ce^4 + D = e^2$ as the equations to solve. Subtracting the
equations gives $C(e^4 - 1) = e^2 - 1$. This gives
$$ C = \frac{e^2 - 1}{e^4 - 1} = \frac{e^2 - 1}{(e^2 + 1)(e^2 - 1)} = \frac{1}{e^2 + 1}, $$
$$ D = 1 - C = 1 - \frac{1}{e^2 + 1} = \frac{e^2 + 1 - 1}{e^2 + 1} = \frac{e^2}{e^2 + 1}. $$
Hence the required particular solution is
$$ y = \frac{1}{e^2 + 1}(e^x + e^{2 - x}) - \tfrac{1}{2}x^2 + x - 1. $$
At the centre of the beam, $x = 1$, so $y \simeq 0.148$. The displacement or
‘sag’ at the centre of the beam is approximately 0.148 m or about 14.8 cm.

3.4 Problems (a) and (c) are initial-value problems; problems (b), (d) and
(e) are boundary-value problems.
The differential equation is the same in each case, and from Exercise 1.6(a)
its general solution is
$$ u = C\cos 2x + D\sin 2x. $$
The derivative is
$$ u' = -2C\sin 2x + 2D\cos 2x. $$
(a) The condition $u(0) = 1$ gives $C = 1$. The condition $u'(0) = 0$ gives
$D = 0$. The required solution is therefore
$$ u = \cos 2x. $$
(b) The condition $u(0) = 0$ gives $C = 0$. The condition
$u(\tfrac{\pi}{2}) = 0$ gives $C = 0$ also. $D$ therefore remains arbitrary,
so there is an infinite number of solutions, of the form
$$ u = D\sin 2x. $$
(c) The condition $u(\tfrac{\pi}{2}) = 0$ gives $C = 0$. The condition
$u'(\tfrac{\pi}{2}) = 0$ gives $D = 0$. The required solution is therefore the
zero function
$$ u = 0. $$
(Alternatively, since the differential equation is homogeneous and the
initial conditions are both equal to zero, by the remarks after Theorem 3.1
the solution is the zero function $u = 0$.)
(d) The condition $u(-\pi) = 1$ gives $C = 1$. The condition
$u(\tfrac{\pi}{4}) = 2$ gives $D = 2$. The required solution is therefore
$$ u = \cos 2x + 2\sin 2x. $$
(e) The condition $u'(0) = 0$ gives $D = 0$. The condition
$u'(\tfrac{\pi}{4}) = 1$ gives $C = -\tfrac{1}{2}$. The required solution is
therefore
$$ u = -\tfrac{1}{2}\cos 2x. $$

3.5 Parts (a) and (b) are initial-value problems, and therefore by
Theorem 3.1 each has a unique solution. However, part (c) is a
boundary-value problem, which may have no solution, a unique solution, or an
infinite number of solutions.
(a) From Exercise 2.14(b), the general solution is
$$ u = e^{-2t}(C\cos t + D\sin t). $$
Its derivative is
$$ \frac{du}{dt} = e^{-2t}((-2C + D)\cos t - (C + 2D)\sin t). $$
The condition $u(0) = 0$ gives $C = 0$. The condition $du/dt = 2$ at $t = 0$
gives $D = 2$. The solution is therefore
$$ u = 2e^{-2t}\sin t. $$
(b) The differential equation is homogeneous and the initial conditions are
both equal to zero. Hence the solution is the zero function $y = 0$.
(c) From Exercise 1.5(b), the complementary function is
$$ x_c = C\cos 3t + D\sin 3t. $$
To find a particular integral, try $x = p_1 t + p_0$. Substituting into the
differential equation gives
$$ 9(p_1 t + p_0) = 3(1 - \pi t). $$
Hence $p_1 = -\tfrac{\pi}{3}$, $p_0 = \tfrac{1}{3}$, and a particular integral is
$$ x_p = -\tfrac{\pi}{3}t + \tfrac{1}{3}. $$
Therefore the general solution is
$$ x = C\cos 3t + D\sin 3t - \tfrac{\pi}{3}t + \tfrac{1}{3}, $$
and its derivative is
$$ \dot{x} = -3C\sin 3t + 3D\cos 3t - \tfrac{\pi}{3}. $$
The condition $x(0) = \tfrac{1}{3}$ gives $C = 0$. The condition
$\dot{x}(\tfrac{\pi}{3}) = 0$ gives $D = -\tfrac{\pi}{9}$. The solution is
therefore
$$ x = -\tfrac{\pi}{9}\sin 3t - \tfrac{\pi}{3}t + \tfrac{1}{3}. $$
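The particular solution found in Solution 3.3 can be checked numerically. The following is a minimal sketch, not part of the course text: the function names `y` and `residual` and the step size `h` are illustrative choices, and the second derivative is only estimated by a central difference. It checks the two boundary conditions, the value of the sag at the centre of the beam, and that $y'' - y = \tfrac{1}{2}x^2 - x$ (the equation of Solution 2.9 with $l = 2$) holds to within rounding error.

```python
import math

def y(x):
    """Particular solution from Solution 3.3 (beam of length l = 2)."""
    return (math.exp(x) + math.exp(2 - x)) / (math.exp(2) + 1) - 0.5 * x**2 + x - 1

def residual(x, h=1e-4):
    """y'' - y - (x^2/2 - x), with y'' estimated by a central difference."""
    second_derivative = (y(x + h) - 2 * y(x) + y(x - h)) / h**2
    return second_derivative - y(x) - (0.5 * x**2 - x)

print(y(0.0), y(2.0))      # boundary values, both zero up to rounding
print(round(y(1.0), 3))    # sag at the centre of the beam
print(max(abs(residual(x)) for x in (0.5, 1.0, 1.5)))  # should be tiny
```

A residual that is small at several interior points does not prove that the solution is correct, but it is a quick way to catch an algebraic slip.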

UNIT 4 Vector algebra

Study guide for Unit 4


This unit assumes no previous knowledge of vectors. You will need to know
only basic algebra and trigonometry, and how to use Cartesian coordinates
for specifying a point in a plane.
The recommended study pattern is to study one section per study session,
and to study the sections in the order in which they appear.

Introduction
We often need to represent physical quantities such as mass, force, velocity,
acceleration and time mathematically. Most of the physical quantities
that we need can be classified into two types: scalars and vectors. Scalar
quantities are quantities, like mass, temperature, energy, volume and time,
that can be represented by a single real number. Other quantities, like
force, velocity and acceleration, possess magnitude and direction in space,
and cannot be represented by a single real number; they are called vector
quantities.
The definitive vector quantity is displacement. The displacement of a point
specifies the position of the point in space relative to some reference point.
We use the concept of displacement whenever we want to describe spatial
relationships. Consider, for example, the instructions written in blood on a
pirate’s treasure map:
Take five paces due north from the big oak tree, then seven paces due
west, and then dig down for three metres.
This is a specification of a displacement vector — the displacement of the
treasure from a reference point (the big oak tree). In fact, this particular
way of specifying the displacement of the treasure is known as the Cartesian
description of a displacement, although the pirate probably didn’t know
that. Alternatively, there is the so-called polar description of the same
displacement (equating paces with metres):
Starting at the big oak tree, dig for 9.1 paces along a straight sloping
line inclined at 19° below the horizontal at a bearing of 54° west of north.
This ‘distance (or magnitude) plus direction’ specification will also get you
to the treasure, although less conveniently because it is more difficult to
dig along a sloping line. These two different specifications are shown in
Figure 0.1.

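[Figure 0.1: the two descriptions of the displacement of the treasure: 5 paces
north, 7 paces west and 3 metres down; or 9.1 paces along a line inclined at
19° below the horizontal, on a bearing of 54° west of north]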
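The agreement between the two descriptions of the treasure's displacement can be checked with a few lines of arithmetic. The sketch below is illustrative only: the axis convention (x east, y north, z up) and the variable names are assumptions made here, and paces are equated with metres as in the text.

```python
import math

# Cartesian description from the map: 5 north, 7 west, 3 down.
# Assumed axes: x points east, y points north, z points up.
east, north, up = -7.0, 5.0, -3.0

# Magnitude of the displacement (length of the sloping line).
distance = math.sqrt(east**2 + north**2 + up**2)

# Angle of the line below the horizontal, in degrees.
horizontal = math.hypot(east, north)
dip_deg = math.degrees(math.atan2(-up, horizontal))

# Bearing measured from north towards west, in degrees.
bearing_west_of_north_deg = math.degrees(math.atan2(-east, north))

print(round(distance, 1), round(dip_deg), round(bearing_west_of_north_deg))
# prints: 9.1 19 54
```

The rounded values reproduce the 9.1 paces, 19° and 54° quoted in the polar description.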

Section 1 defines a vector and discusses ways of representing vectors in two
dimensions. Section 3 discusses another way of representing vectors, one
that easily generalizes from two to three (or more) dimensions. Sections 2
and 4 consider ways of operating on and combining vectors — that is, they
provide the fundamentals of vector algebra.


1 Describing and representing vectors


Subsection 1.1 explains what scalars and vectors are. Subsections 1.2 and 1.3
then explain how to denote vectors symbolically (i.e. algebraically) and how
to show them in diagrams. Subsection 1.4 explains what is meant by saying
that two vectors are equal to one another, which is a necessary first step
in the development of an algebra for vectors. Subsection 1.5 introduces a
method for representing vectors in two dimensions that can be useful in a
variety of physical situations.

1.1 Scalars and vectors


A scalar is any quantity, such as mass, time, volume and temperature, that
can be represented mathematically by a single real number (and often a unit
of measurement). Real numbers themselves are examples of scalars, and you
can regard the terms scalar and real number as synonymous. Examples of
scalar quantities, quoted to some convenient degree of accuracy, are:
the mass of the Earth, $5.975 \times 10^{24}$ kilograms;
the temperature of melting ice, 0 degrees Celsius;
my current bank account balance, −153.12 pounds sterling;
pi (π), 3.141 59 . . . .
A real number x is defined by two properties: its modulus |x| and its sign.
(The modulus of a real number is also called its magnitude.) Thus the
magnitude of a scalar x is |x|. For example, the magnitude of my current
account balance is |−153.12| pounds = 153.12 pounds, which sounds a lot
better since it doesn’t remind me that I’m in debt. Note that magnitudes are
always non-negative (i.e. positive or zero).
A vector quantity is any quantity, such as force, velocity, displacement, etc.,
that has a magnitude and a direction in space (or, in two dimensions, a
direction in a plane). An example is the velocity of a motor car travelling
on the M4 motorway from London to Bristol with a speed of 95 km per hour
in a westerly direction. The magnitude of the velocity vector is 95 (dropping
units for convenience), and the direction of the velocity vector is due west.
(The familiar term speed is used to mean the magnitude of a velocity. Speed
is a non-negative scalar.)
Thus the specification of a vector consists of:
(a) a non-negative real number, called its modulus or magnitude;
(b) a direction in space.
This unit is mainly concerned with just two vector quantities: displacement
and velocity. Later in the course you will come across other vector quantities
such as force, torque and momentum. Fortunately, all vector quantities obey
exactly the same laws of algebra. Thus what you learn about displacements
and velocities in this unit can be carried over to all vector quantities.

1.2 Vector notation


Vectors are denoted in printed text by bold letters, e.g. $\mathbf{v}$,
$\mathbf{F}$. In your written work, you should denote vectors by drawing
either a straight line or a squiggly line under the letter, e.g.
$\underline{v}$, $\underline{F}$ or $\underset{\sim}{v}$, $\underset{\sim}{F}$.
Thus if a symbol is used to represent the velocity of an object, then it must
be handwritten by you as either $\underline{v}$ or $\underset{\sim}{v}$ (but
will be printed in the text as $\mathbf{v}$).

It is important that you learn to write vectors using the underlining: if
you do not do so, someone reading your work may not be able to tell
that you are referring to a vector. In particular, you may lose marks!


The modulus or magnitude of the vector $\mathbf{v}$ is denoted by
$|\mathbf{v}|$ or, sometimes, where there is no possibility of ambiguity, by
$v$; $|\mathbf{v}|$ is a non-negative scalar. (We read $|\mathbf{v}|$ as ‘the
modulus of $\mathbf{v}$’ or ‘the magnitude of $\mathbf{v}$’, or simply
‘mod $\mathbf{v}$’.)

1.3 Using arrows to represent vectors


A vector can be conveniently represented in a diagram by an arrow, i.e. a
straight line with an arrowhead on it. The tail of the arrow may be placed
at some fixed origin, the direction of the arrow is chosen to represent that
of the vector, and its length is chosen to be proportional to the magnitude
of the vector. In Figure 1.1, which uses the origin of the Cartesian
coordinate system as the fixed origin, the shorter arrow represents a vector
of magnitude 1 in the positive x-direction, and the longer arrow represents
a vector of magnitude $2\sqrt{2}$ in a direction at $\pi/4$ radians (45°) to
the positive x-direction. (Note that we use the convention that positive
angles are measured anticlockwise.) If we decide to denote these vectors by
letters $\mathbf{a}$ and $\mathbf{b}$, respectively, then we can also put this
information on the diagram, by writing $\mathbf{a}$ and $\mathbf{b}$ near the
arrowheads, as shown in Figure 1.2.

[Figure 1.1: two arrows drawn from the origin O of the (x, y)-plane, one along
the positive x-direction and one at π/4 to it]

[Figure 1.2: the same two arrows, labelled a and b]

Note that in this course, and commonly elsewhere, the arrows representing
vectors are drawn using thick lines. This helps to distinguish vector arrows
from other arrowed lines such as those representing the coordinate axes
(e.g. Figures 1.1 and 1.2) or those representing compass directions (e.g.
Figure 1.3).

*Exercise 1.1
Represent the following two vectors on a diagram by arrows:
• vector a has magnitude 3 units and points in the positive y-direction;
• vector b has magnitude 4 units and points in the direction at $\pi/3$ radians
(60°) to the positive x-direction.

Vector notation and the use of arrows in diagrams is now illustrated further
by specific reference to displacement vectors and velocity vectors.
Displacement is the position of a point in space relative to some reference
point or origin. For example, the city of Leeds is 296 km from the city of
Bristol in the direction of 15° east of north (N 15° E). The displacement of
Leeds from Bristol can be specified as the vector
$\mathbf{s} = 296$ km N 15° E.
(It would be wrong to write $\mathbf{s} = 296$ km, because the left-hand side
is a vector symbol and the right-hand side is a scalar.)
Here the bold symbol $\mathbf{s}$ has been used to denote the displacement.
Note that both magnitude and direction are specified: the magnitude of the
displacement is $|\mathbf{s}| = 296$ km, and the direction is specified by the
compass bearing N 15° E.

The displacement $\mathbf{s} = 296$ km N 15° E can be represented in a diagram
by an arrow, as shown in Figure 1.3. The length of the arrow represents 296 km,
which may be shown in the diagram by writing $|\mathbf{s}| = 296$.

[Figure 1.3: an arrow s drawn from O at 15° east of north, labelled
|s| = 296, with a scale in km]

For any two points P and Q, we can define the displacement vector from
P to Q: it is the vector whose magnitude is the distance from P to Q and
whose direction is the direction of the straight line from P to Q. A useful
notation for this vector is $\overrightarrow{PQ}$ (see Figure 1.4). In this
context the symbol $PQ$ (without an arrow) represents the length of the
straight line joining P and Q, i.e. $PQ = |\overrightarrow{PQ}|$. Note that
$PQ = QP$ but $\overrightarrow{PQ} \neq \overrightarrow{QP}$ (because
$\overrightarrow{PQ}$ and $\overrightarrow{QP}$ are in opposite directions).


Displacement vector PQ
The displacement vector $\overrightarrow{PQ}$ is the vector whose magnitude is
the distance from P to Q and whose direction is the direction of the straight
line from P to Q.

[Figure 1.4: an arrow drawn from P to Q]

One query may have occurred to you. What is the displacement vector of a
point from itself? In other words, what is the vector $\overrightarrow{PP}$?
Clearly its length is zero, but what is its direction? The answer is that it
does not have one! We define the zero vector to be the unique vector with
magnitude zero and no direction. It is denoted by $\mathbf{0}$. Thus we can
conclude that $\overrightarrow{PP} = \mathbf{0}$.

Zero vector
The zero vector is the unique vector with magnitude zero and no
direction. It is denoted by $\mathbf{0}$.
(Be particularly careful to underline the zero vector, $\underline{0}$ or
$\underset{\sim}{0}$, in your written work!)
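The statements about displacement vectors above can be illustrated numerically. The following is a rough sketch only: it anticipates the description of a vector by a pair of Cartesian components, and the points `P` and `Q` and the helper names `displacement` and `magnitude` are hypothetical, chosen for illustration.

```python
import math

def displacement(p, q):
    """Displacement vector from point p to point q, as (x, y) components."""
    return (q[0] - p[0], q[1] - p[1])

def magnitude(v):
    """Length of a vector given by its (x, y) components."""
    return math.hypot(v[0], v[1])

# Hypothetical points, chosen only for illustration.
P = (1.0, 2.0)
Q = (4.0, 6.0)

pq = displacement(P, Q)   # from P to Q
qp = displacement(Q, P)   # from Q to P
pp = displacement(P, P)   # from P to itself

print(magnitude(pq), magnitude(qp))  # equal lengths: PQ = QP
print(pq, qp)                        # different vectors: opposite directions
print(pp, magnitude(pp))             # the zero vector has magnitude zero
```

The two lengths agree, the two displacement vectors do not, and the displacement of a point from itself has zero magnitude.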

A constant velocity is also defined by a magnitude and a direction. For
instance, in a weather forecast, a typical wind velocity might be 35 knots
from the north-west. It is not sufficient to say that ‘the wind velocity is
35 knots’; the obvious question about such a statement would be ‘from
which direction?’. The vector $\mathbf{v}$ representing this velocity has
magnitude 35 and direction from the north-west and towards the south-east
(since the air is travelling in the south-easterly direction). It can be
represented on a diagram as shown in Figure 1.5. The length of the arrow
represents a wind speed of 35 knots.

[Figure 1.5: an arrow v drawn from O towards the south-east, at 45° to the
south direction, labelled |v| = 35, with a scale in knots]

Note that the direction of a vector consists of two attributes:
(a) an orientation, represented by the slope of the arrow in diagrams like
Figures 1.1 to 1.5;
(b) a sense, represented by the arrowhead.
For instance, the arrow representing the velocity 35 knots from the
north-west in Figure 1.5 is a line making an angle of 45° anticlockwise from
the south direction (the orientation) and an arrowhead pointing towards
south-east as opposed to north-west (the sense).
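The distinction between where the wind comes from and where the velocity vector points is easy to get wrong, so a small sketch may help. The table `COMPASS_BEARING_DEG`, the function `wind_velocity` and the choice of (east, north) components are assumptions made here for illustration; they are not notation used in the course.

```python
import math

# Bearings of the eight principal compass points, in degrees clockwise from north.
COMPASS_BEARING_DEG = {
    "N": 0, "NE": 45, "E": 90, "SE": 135,
    "S": 180, "SW": 225, "W": 270, "NW": 315,
}

def wind_velocity(speed, from_direction):
    """(east, north) components of a wind blowing FROM the given compass point."""
    # The air travels towards the opposite compass point.
    towards_deg = (COMPASS_BEARING_DEG[from_direction] + 180) % 360
    theta = math.radians(towards_deg)
    return (speed * math.sin(theta), speed * math.cos(theta))

east, north = wind_velocity(35, "NW")
print(east > 0, north < 0)        # the vector points towards the south-east
print(math.hypot(east, north))    # the magnitude is the speed, 35 knots (up to rounding)
```

Reversing the sense of the arrow changes neither its orientation nor its length, only the sign of both components.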

*Exercise 1.2
The displacement of Birmingham from Derby is 57 km in the direction
S 30° W. The displacement of Leicester from Derby is 32 km in the direction
S 45° E.
Draw a diagram, to a suitable scale, representing these two displacements
by arrows.

Exercise 1.3
A car travelling from London along the M1 with speed 70 mph heads in the
direction N 60° W near Junction 14.
Represent the velocity of the car by an arrow, drawn to a suitable scale.


1.4 Equality of vectors

Definition
Two vectors are said to be equal if they have the same magnitude and
the same direction.

You have seen how to represent a vector by an arrow. This definition of
equality of vectors tells us that the two features needed to define a vector
uniquely are its magnitude and direction. This means that any two arrows
drawn at different places on the page but which are equal in length, parallel
and have the same sense, can be used to represent the same vector. For
instance, the two arrows in Figure 1.6 are each of length 2 units and point
in the positive x-direction. They represent two equal vectors, and we write
$\mathbf{b} = \mathbf{d}$. In other words, the arrow representing a vector
does not have to be drawn so that its tail is at any particular point.

[Figure 1.6: two arrows b and d of length 2, drawn at different places in the
(x, y)-plane, both pointing in the positive x-direction]
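The definition of equality can be turned into a simple test on arrows. The sketch below is an illustration only: each arrow is stored as a (tail, tip) pair of points, the coordinates are hypothetical (they are not read from Figure 1.6), and the direction is compared through the unit vector along the arrow, which is one of several possible choices.

```python
import math

def magnitude_and_direction(tail, tip):
    """Return (length, unit direction) of an arrow; direction is None if the length is zero."""
    dx, dy = tip[0] - tail[0], tip[1] - tail[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0, None          # the zero vector has no direction
    return length, (dx / length, dy / length)

def same_vector(arrow1, arrow2, tol=1e-9):
    """True if the two arrows have the same magnitude and the same direction."""
    r1, d1 = magnitude_and_direction(*arrow1)
    r2, d2 = magnitude_and_direction(*arrow2)
    if not math.isclose(r1, r2, abs_tol=tol):
        return False
    if d1 is None or d2 is None:
        return d1 is None and d2 is None
    return (math.isclose(d1[0], d2[0], abs_tol=tol)
            and math.isclose(d1[1], d2[1], abs_tol=tol))

# Hypothetical arrows: same length and direction, drawn at different places.
b = ((1.0, 2.0), (3.0, 2.0))
d = ((0.0, 1.0), (2.0, 1.0))
reversed_b = ((3.0, 2.0), (1.0, 2.0))   # same length, opposite sense
longer = ((0.0, 0.0), (3.0, 0.0))       # same direction, different length

print(same_vector(b, d), same_vector(b, reversed_b), same_vector(b, longer))
```

Only the first pair represents equal vectors; the position of the tail plays no part in the comparison.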

Example 1.1
Figure 1.7 shows several vectors represented by arrows drawn to scale. Find
the vector equal to the vector a.

[Figure 1.7: the vectors a, b, c, d, e, f, g and h drawn to scale as arrows
in the (x, y)-plane]

Solution
We are looking for a vector that is equal in length to a (i.e. one unit), parallel
to a and points in the same direction (i.e. the positive x-direction). There
are two arrows (and thus vectors) other than a that point in the positive
x-direction; they are c and h. (The arrow representing d points in the
negative x-direction.) The magnitudes of c and h are 1 unit and 3 units,
respectively. Since the magnitude of a is 1 unit, $\mathbf{c} = \mathbf{a}$ but $\mathbf{h} \neq \mathbf{a}$.
Note that although a and c are drawn at different places in the (x, y)-plane,
they are equal in magnitude and have the same direction, so they are equal
vectors.

*Exercise 1.4
Which vector in Figure 1.7 is equal to vector b?


1.5 Polar representation of two-dimensional vectors


This subsection introduces a systematic way of specifying the magnitude
and direction of a vector in a coordinate system.
You should be familiar with using a two-dimensional Cartesian coordinate
system for specifying the position (x, y) of a point in a plane, and indeed the
same system is commonly used for displaying vectors (as in Figures 1.1, 1.2,
1.6 and 1.7). The plane polar coordinate system, however, is in some
sense a more natural one for specifying vectors since it effectively regards
magnitude and direction as two coordinates.
Let r be a vector on a plane surface. Introduce a Cartesian coordinate
system, and draw the vector as an arrow with its tail at the origin O, as
in Figure 1.8. The magnitude of r is |r| = r, the distance of the tip P of
the arrow from O. The direction of r is specified by the angle φ measured
(usually in radians) anticlockwise from the positive x-axis.

[Figure 1.8: the vector r drawn from O to the point P, with r = |r|, angle φ
measured from the positive x-axis, x = r cos φ and y = r sin φ]

We have not quite finished the description, because a vector now has many
representations (since rotating the line segment OP through 2nπ, where n is
any integer, leaves it unchanged). (In fact, a vector has an infinite number
of representations!) To avoid this ambiguity, we shall normally take φ to lie
in the range −π < φ ≤ π. (Note that under this convention a vector below the
x-axis has a negative value for φ; see Figure 1.9.)

[Figure 1.9: a vector above the x-axis has positive φ; a vector below the
x-axis has negative φ]

Thus the endpoint P of a vector $\mathbf{r}$ is specified by the two numbers
r (a distance) and φ (an angle). These two numbers r and φ are the plane polar
coordinates (or simply polar coordinates) of the endpoint P of the vector
$\mathbf{r}$, when the tail of its arrow is at O. We use the notation
$\langle r, \phi \rangle$ in order to distinguish polar coordinates from the
Cartesian variety, so the vector is now specified as
$$ \mathbf{r} = \langle r, \phi \rangle. $$
You can see from Figure 1.8 that the polar coordinates
$\langle r, \phi \rangle$ of P are related to the Cartesian coordinates (x, y)
of P by the following formulae:
$$ x = r\cos\phi, \qquad y = r\sin\phi; $$
$$ r = (x^2 + y^2)^{1/2}, \qquad \tan\phi = y/x. $$
However, the statement $\tan\phi = y/x$ does not define φ uniquely since, for
example, $\tan\phi = \tan(\pi + \phi)$. To pin down the value of φ in the range
−π < φ ≤ π, we can use the two equations
$$ \sin\phi = y/r \quad\text{and}\quad \cos\phi = x/r. $$
In practice, when finding the angle φ from the values of x and y, it usually
helps to sketch the Cartesian coordinates in the (x, y)-plane so that you can
see in which quadrant φ must lie. The signs of sin and cos for angles in
the four quadrants are shown in Figure 1.10: you will find it useful to know
these. (A simple acronym to aid the memory is ‘CAST’: starting from the
lower right, and working anticlockwise round the quadrants, the following
are positive: Cos, All (of sin, cos, tan), Sin and Tan.)

[Figure 1.10: signs of sin and cos in the four quadrants. Quadrant 1 (A):
sin > 0, cos > 0. Quadrant 2 (S): sin > 0, cos < 0. Quadrant 3 (T): sin < 0,
cos < 0. Quadrant 4 (C): sin < 0, cos > 0.]
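The ambiguity in tan φ = y/x, and the way the signs of sin φ and cos φ resolve it, can be seen in a short computation. The sketch below is illustrative: the function names `to_polar` and `from_polar` are assumptions made here, and the standard library function `math.atan2(y, x)` is used because it takes the signs of both arguments into account, which amounts to using sin φ = y/r and cos φ = x/r together.

```python
import math

def to_polar(x, y):
    """Polar coordinates (r, phi) of the point (x, y), with -pi < phi <= pi."""
    r = math.hypot(x, y)
    phi = math.atan2(y, x)        # quadrant-aware, unlike atan(y / x)
    if phi == -math.pi:           # can arise from a negative zero; keep phi in (-pi, pi]
        phi = math.pi
    return r, phi

def from_polar(r, phi):
    """Cartesian coordinates (x, y) of the point with polar coordinates (r, phi)."""
    return r * math.cos(phi), r * math.sin(phi)

# Two points with the same ratio y/x but lying in opposite quadrants.
p1 = (1.0, math.sqrt(3))
p2 = (-1.0, -math.sqrt(3))

print(math.atan(p1[1] / p1[0]), math.atan(p2[1] / p2[0]))  # identical: tan alone is ambiguous
print(to_polar(*p1))   # phi is pi/3 (first quadrant)
print(to_polar(*p2))   # phi is -2*pi/3 (third quadrant)
```

Converting back with `from_polar` recovers the original Cartesian coordinates up to rounding error, which is a useful check on any hand calculation.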

Example 1.2
Give the polar representation of the vectors a, b, e and g in Figure 1.11.

[Figure 1.11: the vectors a, b, e and g drawn as arrows from the origin O of
the (x, y)-plane]

Solution
In Figure 1.11 each vector is drawn as an arrow from the origin, so the
polar representation of each vector is given by the polar coordinates of its
endpoint. In some cases we can specify r and φ simply from inspection of
Figure 1.11, but we shall use the above formulae in order to illustrate the
general method.
The endpoint of a has Cartesian coordinates (1, −1), so we have
$r = (1^2 + (-1)^2)^{1/2} = \sqrt{2}$,
$\sin\phi = y/r = -1/\sqrt{2}$ and $\cos\phi = x/r = 1/\sqrt{2}$.
Thus φ = −π/4 radians, i.e. a = ⟨√2, −π/4⟩. (Since −π/2 < −π/4 < 0, the angle
coordinate −π/4 indicates that a should lie in the fourth quadrant, which is
confirmed by Figure 1.11.)
Vector b is of length 2 units and points in the positive y-direction. The
Cartesian coordinates of its endpoint are (0, 2), so we have
$r = (0^2 + 2^2)^{1/2} = 2$,
$\sin\phi = y/r = 2/2 = 1$ and $\cos\phi = x/r = 0/2 = 0$.
Hence φ = π/2 radians (which is obvious from the fact that b points in the
positive y-direction). Hence b = ⟨2, π/2⟩.
The endpoint of vector e has Cartesian coordinates (−1, 2), so
$r = ((-1)^2 + 2^2)^{1/2} = \sqrt{5}$, $\sin\phi = 2/\sqrt{5}$ and $\cos\phi = -1/\sqrt{5}$,
giving φ = 2.034 radians. Hence e = ⟨√5, 2.034⟩. (Since π/2 < 2.034 < π, the
angle coordinate 2.034 indicates that e should lie in the second quadrant,
which is confirmed by Figure 1.11.)
Finally, g is of unit length (i.e. of length 1 unit) and points in the positive
x-direction. The Cartesian coordinates of its endpoint are (1, 0), so
$r = (1^2 + 0^2)^{1/2} = 1$, $\sin\phi = 0$ and $\cos\phi = 1$.
Thus g = ⟨1, 0⟩. (This is an exceptional case where the numerical values of
the coordinates are the same in the two coordinate systems.)
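
The conversions in Example 1.2 can be checked numerically. The following Python
sketch is an illustration only: the function name to_polar is our own and is
not part of the course text. It relies on the standard library function
math.atan2, which uses the signs of both x and y to select the quadrant, so it
does the job of the sketch and the two equations for sin φ and cos φ.

```python
import math

def to_polar(x, y):
    """Return (r, phi) for the point (x, y); atan2 gives phi in [-pi, pi]."""
    r = math.hypot(x, y)        # r = (x^2 + y^2)^(1/2)
    phi = math.atan2(y, x)      # quadrant chosen from the signs of x and y
    return r, phi

# Endpoints of the vectors a, b, e and g in Example 1.2
endpoints = {"a": (1, -1), "b": (0, 2), "e": (-1, 2), "g": (1, 0)}
for name, (x, y) in endpoints.items():
    r, phi = to_polar(x, y)
    print(f"{name} = <{r:.3f}, {phi:.3f}>")
```

For the endpoint (−1, 2) of e this gives an angle of about 2.034 radians, in
agreement with the solution above.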

*Exercise 1.5
Complete Table 1.1. Each row should show the Cartesian and corresponding
polar coordinates of a particular point. If any entry is invalid, say so and
explain why.

Section 2 Scaling and adding vectors

Table 1.1
Cartesian coordinates (x, y) | Polar coordinates ⟨r, φ⟩
(0, −1)                      | ⟨1, −π/2⟩
(1, 1)                       |
                             | ⟨4, −π/4⟩
                             | ⟨6, π⟩
(−1, −1)                     |
                             | ⟨−1, π⟩
                             | ⟨10^8, exp(0.1π)⟩

*Exercise 1.6
As you saw earlier, the displacement of Leeds from Bristol can be expressed
as s = 296 km N 15° E (see Figure 1.3 on page 156). Express this vector in
polar form ⟨r, φ⟩ using a suitable coordinate system.

The polar representation of vectors can be a useful representation in a variety
of physical situations, as you will see later in the course. (It is generalized
to three dimensions in Unit 23.)

End-of-section Exercises
Exercise 1.7
The following is a list of some physical quantities: temperature, velocity,
volume, energy, force, displacement, time, acceleration. Decide which are
scalar and which are vector quantities.

Exercise 1.8
What are the polar coordinates of a point Q whose Cartesian coordinates
are (0, −3)? What is the magnitude of the vector $\overrightarrow{OQ}$, where O is the origin
of coordinates?

2 Scaling and adding vectors


This section defines two arithmetic or algebraic operations involving vectors.
The first and simpler of these is the multiplication of a vector by a scalar,
or scaling of a vector. The second is the addition of two vectors to give a
third vector called the resultant of the two vectors.

2.1 Scaling of a vector


Consider vectors c and h in Figure 1.7 (page 158). Both vectors point in
the same direction, but h has a length three times that of c. We say that h
is a scaling of c by the number 3, and we write h = 3c.


Generally, if v is a vector and m is a positive number, then the product mv
is a vector in the same direction as v but with magnitude m|v|, i.e. m times
the magnitude of v. This multiplication of a vector by a scalar is called
scaling or scalar multiplication, and mv is called a scalar multiple of v. For
example, if v has magnitude 4 and points in the positive x-direction, then
3v has magnitude 12 and points in the positive x-direction also. This is
illustrated in Figure 2.1.

(Note that there is no multiplication sign between the m and the v. In vector
algebra the dot and cross symbols are reserved for other products, to be
discussed in Section 4.)

[Figure 2.1: the vectors v and 3v, each drawn from O along the positive x-axis]

We can also scale a vector v by a negative number. When m is negative,
the vector mv has magnitude |m||v| and points in the opposite direction
to v. A special case is when m = −1. Then the vector (−1)v has the same
magnitude as v but points in the opposite direction; see Figure 2.2. We
normally write (−1)v simply as −v, i.e. (−1)v = −v.

[Figure 2.2: the vectors v and −v]

What happens when we multiply a vector by zero (m = 0)? The above
definitions imply that the result should be a vector with magnitude zero.
You will recall from Section 1 that there is a special vector with magnitude
zero, namely the zero vector, 0. Thus 0v = 0.

Definition
For any vector v and any real number m, the scalar multiple mv is
the vector with magnitude |m||v| which is:
• in the same direction as v if m > 0;
• in the opposite direction to v if m < 0;
• the zero vector (i.e. with unspecified direction) if m = 0.
The multiplication of v by m is called scaling or scalar multiplica-
tion.
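
The three cases of the definition can be mirrored for a vector held in polar
form ⟨r, φ⟩. The Python sketch below is an illustration under stated
assumptions: the function name scale_polar is our own, the angle is assumed to
lie in the range −π < φ ≤ π, and the angle 0.0 returned for the zero vector is
only a placeholder, since the zero vector has no specified direction.

```python
import math

def scale_polar(m, r, phi):
    """Scale the vector <r, phi> by the real number m (assumes -pi < phi <= pi)."""
    if m == 0 or r == 0:
        return 0.0, 0.0              # zero vector; the angle is a placeholder
    if m > 0:
        return m * r, phi            # same direction, magnitude m|v|
    # m < 0: magnitude |m||v|, opposite direction (turn through pi,
    # choosing the sign so that the angle stays in (-pi, pi])
    opposite = phi + math.pi if phi <= 0 else phi - math.pi
    return -m * r, opposite

print(scale_polar(3, 4, 0.0))    # magnitude 12, still along the positive x-axis
print(scale_polar(-1, 2, 0.0))   # magnitude 2, now along the negative x-axis
```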

Example 2.1
(a) Let u represent the velocity of my car travelling with a speed of 30 mph
along a straight road due north. Write down, in terms of u, the velocity
of a car overtaking me and travelling at 45 mph. If another car is travelling
in the opposite direction to me with speed 60 mph, write down
this car’s velocity in terms of u.
(b) If ABCDEF is a regular hexagon (Figure 2.3) and, for example, $\overrightarrow{AB}$
represents the displacement vector from A to B, write down algebraic
relations connecting:
(i) $\overrightarrow{AB}$ and $\overrightarrow{ED}$; (ii) $\overrightarrow{AF}$ and $\overrightarrow{DC}$.

[Figure 2.3: the regular hexagon ABCDEF]

Solution
(a) The velocity vector u has magnitude 30 mph and points due north. The
car overtaking me is travelling in the same direction but has a velocity
of magnitude 45 mph; suppose that its velocity vector is denoted by v
(see Figure 2.4). Then v is parallel to u and has the same sense as u,
and |v| = (45/30)|u|. Therefore v is just a scaling of u, i.e.
v = 1.5u.
(Two vectors are parallel if they have either the same, or opposite,
directions.)
Now suppose that the velocity of the car travelling in the opposite
direction is denoted by w. Then w is parallel to u but has the opposite
sense to u, and |w| = (60/30)|u|. So we can write
w = −2u,
where the negative sign indicates the opposite sense.

[Figure 2.4: the velocity vectors u and v drawn from O pointing north (N),
against a scale marked from 10 to 50]

(b) The opposite sides of a regular hexagon are parallel, and all the sides
have the same length.
(i) Thus the displacement vectors $\overrightarrow{AB}$ and $\overrightarrow{ED}$ have equal magnitudes
and the same direction. So we have
$\overrightarrow{AB} = \overrightarrow{ED}$.
(ii) The displacement vectors $\overrightarrow{AF}$ and $\overrightarrow{DC}$ have equal magnitudes but
opposite directions, thus
$\overrightarrow{AF} = -\overrightarrow{DC}$ (or, equivalently, $\overrightarrow{DC} = -\overrightarrow{AF}$).

*Exercise 2.1
(a) If d is the displacement vector from Bristol to Leeds, write down in
terms of d the displacement vectors from Leeds to Bristol and from
Leeds to Leeds.
(b) If v represents the velocity of a wind of 35 knots from the north-east,
what vectors represent the following?
(i) A wind of 70 knots from the north-east.
(ii) A wind of 35 knots from the south-west.
(iii) Still air.
(c) Relate the direction and magnitude of −1.5v to those of v, where v is
any given non-zero vector. Do the same for −kv, where k is an arbitrary
positive number.
(d) If ABCD is a parallelogram (Figure 2.5) and, for example, $\overrightarrow{AB}$ represents
the displacement vector from A to B, write down algebraic relations
connecting:
(i) $\overrightarrow{AB}$ and $\overrightarrow{DC}$; (ii) $\overrightarrow{BC}$ and $\overrightarrow{DA}$.
(e) If v is any non-zero vector, what are the magnitude and direction of the
vector $\frac{1}{|\mathbf{v}|}\mathbf{v}$?

[Figure 2.5: the parallelogram ABCD]

Unit vectors
The vector $\frac{1}{|\mathbf{v}|}\mathbf{v}$ in Exercise 2.1(e) is a vector that has magnitude 1 and
points in the direction of v. It is called the unit vector in the direction of v.
The unit vector in the direction of v is often denoted by the symbol $\hat{\mathbf{v}}$.

Definition
For any non-zero vector v, the unit vector in the direction of v is the
vector
$\hat{\mathbf{v}} = \frac{1}{|\mathbf{v}|}\mathbf{v}$.


Unit vectors are often used to denote directions in the plane, or in space.
A particular example is provided by the unit vectors in the positive directions
of the x- and y-axes in the plane Cartesian coordinate system. These unit
vectors are denoted by i and j, respectively, and are called Cartesian unit
vectors. (We shall develop the Cartesian representation of vectors in
Section 3.)

Figure 2.6 shows these Cartesian unit vectors and two other vectors, a and b.
The vector a has magnitude 2 and points in the positive x-direction; b has
magnitude 3.5 and points in the positive y-direction. The unit vector i has
magnitude 1 and points in the same direction as a. Thus we can write a in
terms of i by a scaling:
a = 2i.
Similarly, we can write b in terms of j:
b = 3.5j.
Any vector parallel to the x- or y-axis can be written as a scaling of i or j.

[Figure 2.6: the Cartesian unit vectors i and j and the vectors a = 2i and
b = 3.5j, drawn from the origin O]

Note that although i and j are shown in Figure 2.6 with their tails at the
origin, this is not necessary. They can be drawn at any convenient position,
provided only that they are of unit magnitude and point in the positive x- and
y-directions, respectively (see, for example, Figure 2.7).

Exercise 2.2
Four vectors, a, b, c and d, of magnitudes 2, 2.5, 3 and 1, respectively, are
shown in Figure 2.7. The directions of the four vectors are defined by the
arrows. Write down a, b, c and d as scalings of the Cartesian unit vectors
i and j.

[Figure 2.7: the vectors a, b, c and d, and the unit vectors i and j, in the
(x, y)-plane]

Exercise 2.3
Let the unit vectors i and j denote the directions of east and north, respec-
tively. Specify the following vectors as scalings of i and j.
(a) The wind velocity of 35 km per hour due south.
(b) The displacement of Bristol from London (112 miles due west).
(c) The displacement of London from Bristol.


2.2 Addition of vectors

What is meant by the addition of vectors? Suppose that we make a journey
from Bristol to Leeds, and then another journey from Leeds to Norwich. The
first journey produces a displacement of d1 and the second a displacement
of d2. The net result of the two journeys is a displacement of d3 from Bristol
to Norwich. This is illustrated by the triangle of displacements shown in
Figure 2.8. Displacements are said to add by the triangle rule, and we write
d3 = d1 + d2. The vector d3 is called the resultant of d1 and d2.

[Figure 2.8: triangle of displacements d1 (Bristol to Leeds), d2 (Leeds to
Norwich) and d3 (Bristol to Norwich)]

Velocities also add by the triangle rule, and so do forces, accelerations and
all other vector quantities. Thus the triangle rule is also called the vector
addition rule.

Triangle rule or vector addition rule
To add any two vectors a and b: choose an origin O; draw the line
OP in the direction of a and with length equal to the magnitude of a;
and draw the line PQ in the direction of b and with length equal to
the magnitude of b (as in Figure 2.9). Then a + b is the vector with
magnitude equal to the length of OQ and with the direction from O
to Q. The vector a + b is called the sum or resultant of a and b.

[Figure 2.9: the triangle OPQ, with a along OP, b along PQ and a + b along OQ]

Note that the sum of two displacement vectors can also be written using the
notation
$\overrightarrow{OP} + \overrightarrow{PQ} = \overrightarrow{OQ}$.
Now recall that when discussing displacements we mentioned the zero vector
0 (representing no displacement). Once addition of vectors is introduced, we
need the zero vector in order to answer questions such as ‘what is i + (−1)i?’.
Geometrically, no construction is needed when adding the zero vector, which
obeys the rather obvious rule
a + 0 = a.

Exercise 2.4
Three vectors a, b and c of magnitudes 3, 2 and 4 are shown in Figure 2.10.
(a) Draw a rough sketch to show the vectors a + b and a + c.
(b) Sketch the vector −b, and draw a rough sketch to show the addition of
a and −b.

[Figure 2.10: the vectors a, b and c drawn from the origin O in the
(x, y)-plane; the angles π/3 and π/4 are marked]

Exercise 2.4(b) suggests a definition of vector subtraction. To subtract
the vector b from the vector a, we add the vectors a and −b by the triangle
rule of vector addition; that is, in symbols,
a − b = a + (−b).

*Exercise 2.5
A vector a has magnitude 3 units and points in the positive x-direction. A
vector b has magnitude 4 units and points in the positive y-direction. Draw
a diagram showing the vectors a + b and a − b.


Vector addition is commutative, i.e. the order in which we add two vectors
does not matter. This can be illustrated by reference to vectors a and c of
Exercise 2.4 (see Figure 2.11). The triangle OP1Q illustrates the addition
a + c, while triangle OP2Q illustrates c + a. The same resultant $\overrightarrow{OQ}$ is
obtained in both cases. Thus
a + c = c + a.

[Figure 2.11: the parallelogram OP1QP2 with sides a and c; both a + c and
c + a lie along the diagonal OQ]

Exercise 2.6
For the particular cases of the vectors a, b, c defined in Exercise 2.4, and
for the scalar m = 2, draw sketches to illustrate the associative property of
vector addition,
(a + b) + c = a + (b + c),
and the distributive property of scaling over vector addition,
m(a + b) = ma + mb.

An alternative geometrical construction for adding two vectors can be seen
from Figure 2.11. It is called the parallelogram rule. Draw the two
vectors $\overrightarrow{OP_1}$ and $\overrightarrow{OP_2}$ with the same beginning point O. Complete the
parallelogram OP1QP2. Then the resultant vector is the vector $\overrightarrow{OQ}$ on
the diagonal of the parallelogram. The parallelogram rule gives the same
resultant as the triangle rule.

2.3 Algebraic rules for scaling and adding vectors


Subsections 2.1 and 2.2 showed how to multiply a vector by a scalar and
how to add vectors, i.e. what is meant by mv and a + b. We also saw
illustrations of the commutative, associative and distributive rules. These
are only some of the algebraic rules for manipulating vectors by addition
and scaling. A complete list of these rules, which apply whether or not the
vectors are confined to a plane, is given below.

Algebraic rules for scaling and adding vectors
Let a, b and c be vectors, and let m, m1 and m2 be scalars.
1 Addition is commutative: a + b = b + a.
2 Addition is associative: (a + b) + c = a + (b + c).
3 ma is a vector with magnitude |m||a|, in the same direction as a
when m > 0 and in the opposite direction when m < 0.
4 Scaling is associative: m1(m2 a) = (m1 m2)a.
5 Scaling is distributive: (m1 + m2)a = m1 a + m2 a.
6 Scaling is distributive over vector addition: m(a + b) = ma + mb.
7 Addition and scaling involving the zero vector are as expected:
0 + a = a and 0a = 0.
8 Subtraction is defined by a − b = a + (−1)b.

Notice that these rules say nothing about the multiplication of one vector
by another: vector multiplication is defined in Section 4. Nor has anything
been said about division by a vector: in fact, division by a vector is not
defined.

A more abstract approach would be to define a vector to be something that
obeys these rules, then explore the consequences. This is the approach taken
in the second-level pure mathematics course.

These rules allow us to manipulate algebraic expressions involving scalings
and vector addition in a familiar way.

Example 2.2
Simplify the expression
2(a + b) + 3(b + c) − 5(a + b − c).


Solution
Strict use of the rules requires us to write the expression solely in terms of
addition. So we have
2(a + b) + 3(b + c) − 5(a + b − c)
= 2(a + b) + 3(b + c) + (−5)(a + b + (−1)c) (using Rule 8)
= 2(a + b) + 3(b + c) + (−5)((a + b) + (−1)c) (using Rule 2)
= 2a + 2b + 3b + 3c + (−5)(a + b) + (−5)((−1)c) (using Rule 6 three times)
= 2a + 2b + 3b + 3c + (−5)(a + b) + 5c (using Rule 4)
= 2a + 2b + 3b + 3c + (−5)a + (−5)b + 5c (using Rule 6)
= 2a + (−5)a + 2b + 3b + (−5)b + 3c + 5c (using Rule 1 several times)
= (2 − 5)a + (2 + 3 − 5)b + (3 + 5)c (using Rule 5 four times)
= (−3)a + 0 + 8c (using Rule 7)
= (−3)a + 8c (using Rule 7)
= 8c + (−3)a (using Rule 1)
= 8c + (−1)(3a) (using Rule 4)
= 8c − 3a (using Rule 8).
However, because Rules 1, 2, 4, 5, 6, 7 and 8 are exactly the same as
the familiar rules for manipulating algebraic expressions involving scalar
quantities, we would usually write the solution more succinctly as
2(a + b) + 3(b + c) − 5(a + b − c)
= 2a + 2b + 3b + 3c − 5a − 5b + 5c
= (2 − 5)a + (2 + 3 − 5)b + (3 + 5)c
= 8c − 3a.
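
The simplification in Example 2.2 can be spot-checked numerically. The Python
sketch below is only a check on one randomly chosen set of vectors, not a
proof. It assumes that vectors are held as lists of integer components (an
idea developed in Section 3), and the helper names add, scale and sub are our
own.

```python
import random

def add(u, v):
    return [ui + vi for ui, vi in zip(u, v)]

def scale(m, u):
    return [m * ui for ui in u]

def sub(u, v):
    return add(u, scale(-1, v))      # Rule 8: a - b = a + (-1)b

random.seed(0)                       # fixed seed so the check is repeatable
a, b, c = ([random.randint(-9, 9) for _ in range(3)] for _ in range(3))

# Left-hand side: 2(a + b) + 3(b + c) - 5(a + b - c)
lhs = sub(add(scale(2, add(a, b)), scale(3, add(b, c))),
          scale(5, sub(add(a, b), c)))
# Right-hand side: 8c - 3a
rhs = sub(scale(8, c), scale(3, a))
print(lhs == rhs)                    # True
```

Integer components are used so that the comparison is exact and not affected
by rounding.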

In the following exercises, use the more succinct method.

Exercise 2.7
Simplify the expression 4(a − c) + 3(c − b) + 2(2a − b − 3c).

*Exercise 2.8
Find the vector x in terms of a and b in the following vector equations.
(a) 2b + 4x = 7a (b) n(b − a) + x = m(a − b)

In Subsection 1.1 a vector was defined as a quantity having a magnitude and
a direction. In fact, this definition is incomplete in that it does not include
a rule for combining two such quantities. Hence a complete definition of a
vector is as follows.

Vectors
• A vector has magnitude and direction.
• Any two vectors can be added by the triangle rule.
• A vector can be scaled by a real number in such a way that the
above rules apply.

It is a rather surprising fact that so many physical quantities (displacement,
velocity, acceleration, force, torque, momentum, to name but a few)
all qualify as vectors under the above simple definition. This is one reason
why the subject of vectors is so important.


End-of-section Exercises

Exercise 2.9
The vectors a and b are represented by the arrows shown in Figure 2.12.
The magnitudes of a and b are 4 and 6, respectively. Draw a sketch to show
the vectors a + b, a − b and 2a + (1/2)b.

[Figure 2.12: the vectors a and b drawn from the origin O in the (x, y)-plane;
an angle of π/4 is marked]

Exercise 2.10
If v = −4.7u, what can you say about the magnitude and direction of v in
terms of the magnitude and direction of the non-zero vector u?

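Exercise 2.11
If ABCD is a quadrilateral, with $\overrightarrow{AB}$ denoting the displacement vector from
A to B, and $\overrightarrow{BC}$, $\overrightarrow{CD}$, $\overrightarrow{DA}$ defined similarly, show that
$\overrightarrow{AB} + \overrightarrow{BC} + \overrightarrow{CD} + \overrightarrow{DA} = \mathbf{0}$.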

*Exercise 2.12
Two vectors p and q are defined in polar form: p = ⟨3, π/2⟩, q = ⟨4, π⟩. Sketch
p, q and p + q, and give the polar forms of 5p, −q and p + q.

Exercise 2.13
(a) Which of the following proposed general rules is true for the scalar
multiplication of a vector in polar form? (Assume m > 0.)
$m\langle r, \phi \rangle \stackrel{?}{=} \langle mr, \phi \rangle$,
$m\langle r, \phi \rangle \stackrel{?}{=} \langle mr, m\phi \rangle$.
(b) Does the following proposed general rule hold for the addition of vectors
in polar form?
$\langle r_1, \phi_1 \rangle + \langle r_2, \phi_2 \rangle \stackrel{?}{=} \langle r_1 + r_2, \phi_1 + \phi_2 \rangle$

3 Cartesian components of a vector


So far we have approached vectors, and the laws of vector addition and scal-
ing, geometrically. To add vectors geometrically requires drawing diagrams
representing the vectors by arrows. An alternative, and sometimes more
convenient, algebraic approach to representing vectors is developed in this
section, first in two dimensions and then in three.

3.1 Vectors in two dimensions


We have already seen in Subsection 2.1 how to write vectors that are parallel
to the x-axis or y-axis as scalar multiples of the Cartesian unit vectors i
and j, respectively. (Recall that i and j are the unit vectors in the directions
of the positive x- and y-axes, respectively.) Of course, in general, vectors
do not lie parallel to either the x-axis or the y-axis. However, you will see
in this subsection how use of the rules for vector addition and for scalar
multiplication allows us to write any vector in the (x, y)-plane in terms of
the Cartesian unit vectors i and j.

[Figure 3.1: the vector a from the origin O to the point A = (a1, a2), with B
on the x-axis and C on the y-axis]

Consider an arbitrary vector $\mathbf{a} = \overrightarrow{OA}$ in the (x, y)-plane, whose tail is at
the origin O, as shown in Figure 3.1. The vector $\overrightarrow{OA}$ is called the position
vector of the point A, and its endpoint is determined by the Cartesian
coordinates a1 and a2 of A, i.e. by the distances OB and OC, respectively.
(Note that if A were in one of the other three quadrants of the plane, then
one or both of a1, a2 would be negative.)
Furthermore, the vectors $\overrightarrow{OB}$ and $\overrightarrow{OC}$ can be written as scalings of the
Cartesian unit vectors i and j:
$\overrightarrow{OB} = a_1\mathbf{i}$ and $\overrightarrow{OC} = a_2\mathbf{j}$.
Hence the triangle rule (or parallelogram rule) for the addition of vectors
allows the vector a to be expressed as the sum of $\overrightarrow{OB}$ and $\overrightarrow{OC}$, i.e. as
$\mathbf{a} = \overrightarrow{OB} + \overrightarrow{OC}$ or $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j}$.
The latter is called the component form of a, and the numbers a1 and a2
are called the i- and j-components of a, respectively. (You may also see these
numbers referred to as the x- and y-components of a.)
When the tail of the vector a is not at the origin, its components are defined
in an obvious way.

[Figure 3.2: the vector a from P = (p1, p2) to Q = (q1, q2), with horizontal
component a1 and vertical component a2]

Referring to Figure 3.2, the components of a are
a1 = q1 − p1 and a2 = q2 − p2.
A shorter way of writing a vector in component form is as an ordered pair
of numbers, (a1, a2), where the unit vectors i and j are not shown explicitly.
This notation needs to be used with care because the coordinates of a point
in a plane are also denoted in this way, and vectors are conceptually different
from points. To avoid such confusion, in this course the column vector
notation
$\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$
will be used instead. (This is the way in which many computer algebra
packages display vectors.) In the text, to save space, the column vector will
often be written as $\mathbf{a} = [a_1 \; a_2]^T$, where the transpose symbol T here changes the
row into a column.

Definition
A vector $\mathbf{a} = \overrightarrow{PQ}$ in the (x, y)-plane, where P is the point (p1, p2) and
Q is the point (q1, q2), has component form
a = a1 i + a2 j,
where a1 = q1 − p1 and a2 = q2 − p2, and i and j are the Cartesian unit
vectors.
The component form may also be written as
$\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$ or $\mathbf{a} = [a_1 \; a_2]^T$.
The numbers a1 and a2 are the (Cartesian) components of a.


*Exercise 3.1
Write each of the vectors in Figure 1.7 (page 158) in the form a = a1 i + a2 j
and as a column vector.

The magnitude of a vector given in component form is found very easily.
For example, the magnitude of the vector a in Figure 3.2 is just the length
of the line PQ. This is found by Pythagoras’s Theorem to be $\sqrt{a_1^2 + a_2^2}$.

Magnitude of a two-dimensional vector in component form
If $\mathbf{a} = \overrightarrow{PQ} = a_1\mathbf{i} + a_2\mathbf{j}$, where P and Q have coordinates (p1, p2) and
(q1, q2), respectively, then
$|\mathbf{a}| = \sqrt{a_1^2 + a_2^2} = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2}$.
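
As a small illustration of the boxed formula, the following Python sketch
computes the magnitude of the vector from a point P to a point Q. The function
name magnitude_2d and the sample points are our own choices, made so that the
answer is a whole number.

```python
import math

def magnitude_2d(p, q):
    """Magnitude of the vector from P = (p1, p2) to Q = (q1, q2)."""
    a1 = q[0] - p[0]             # a1 = q1 - p1
    a2 = q[1] - p[1]             # a2 = q2 - p2
    return math.hypot(a1, a2)    # sqrt(a1**2 + a2**2)

print(magnitude_2d((1, 1), (4, 5)))   # components 3 and 4, so this prints 5.0
```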

Vectors in component form can also be added and scaled very easily, by
making use of the algebraic rules for the scaling and adding of vectors. For
example,
a + b = (a1 i + a2 j) + (b1 i + b2 j)
= a1 i + a2 j + b1 i + b2 j
= (a1 i + b1 i) + (a2 j + b2 j)
= (a1 + b1 )i + (a2 + b2 )j.
So, to add two vectors one adds their respective components. Similarly,
ma = m(a1 i + a2 j)
= (ma1 )i + (ma2 )j,
so scaling a vector is achieved by scaling its components.

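Adding and scaling two-dimensional vectors in component form
If a = a1 i + a2 j, b = b1 i + b2 j and m is a scalar, then
a + b = (a1 + b1)i + (a2 + b2)j
and
ma = (ma1)i + (ma2)j.
Equivalently, using column vector notation,
$\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \end{bmatrix}$
and
$m\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} ma_1 \\ ma_2 \end{bmatrix}$.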
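
The component rules translate directly into code. The sketch below is one
possible arrangement, not a prescribed one: the class name Vec2 and its field
names x and y are assumptions made for illustration. Defining __rmul__ lets us
write m * v, mirroring the notation mv, and subtraction follows the definition
a − b = a + (−b).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vec2:
    x: float   # i-component
    y: float   # j-component

    def __add__(self, other):
        # add respective components
        return Vec2(self.x + other.x, self.y + other.y)

    def __rmul__(self, m):
        # m * v: scale each component by the scalar m
        return Vec2(m * self.x, m * self.y)

    def __neg__(self):
        return (-1) * self

    def __sub__(self, other):
        # a - b = a + (-b)
        return self + (-other)

i, j = Vec2(1, 0), Vec2(0, 1)    # Cartesian unit vectors
a = 2 * i                        # the vector a = 2i of Figure 2.6
b = 3.5 * j                      # the vector b = 3.5j of Figure 2.6
print(a + b)                     # components 2 and 3.5
print(2 * a - b)                 # components 4 and -3.5
```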

Exercise 3.2
Figure 3.3 shows four vectors in the (x, y)-plane.

[Figure 3.3: the vectors a, b, c and d in the (x, y)-plane]

(a) Write down the vectors in component form.
(b) Draw a diagram to verify that the scaling 3.5a is the same when obtained
geometrically or algebraically using components.
(c) Use the triangle rule to obtain the vector a + b. Verify that this vector
is the same as that obtained by adding the component forms of a and b.
(d) Find, algebraically, the components of the vector 2a + b − c. Hence find
the magnitude of the vector 2a + b − c.

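*Exercise 3.3
(a) Find the numbers p and q if $\mathbf{r} = \begin{bmatrix} p + q \\ p - q \end{bmatrix}$, $\mathbf{s} = \begin{bmatrix} -3 \\ 7 \end{bmatrix}$ and r = s.
(b) Find the magnitude of the vector t if t = u + v, where
$\mathbf{u} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\mathbf{v} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix}$.

Exercise 3.4
The three vectors a, b and c in Figure 3.4 are specified in polar coordinates
by
a = ⟨2, π/3⟩, b = ⟨3, 3π/4⟩, c = ⟨1, π/6⟩.
(a) What are the magnitudes of the three vectors?
(b) Write down the vectors a, b and c in terms of i and j.
(c) Obtain the vector a + c in terms of i and j.

[Figure 3.4: the vectors a, b and c drawn from the origin O in the (x, y)-plane]
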
In Exercise 3.4 the vectors were given in polar coordinate form, which is
just a systematic way of specifying magnitude and direction. The process
of finding the Cartesian components of a vector given its magnitude and
direction is known as resolving a vector into its components. This is
essentially what you did in Exercise 3.4(b). Thus given the magnitude |a|
of the vector a in Figure 3.5, and its direction φ, we can resolve it into its
components:
a1 = |a| cos φ and a2 = |a| sin φ.

[Figure 3.5: the vector a at angle φ to the positive x-axis, with the unit
vectors i and j]

Conversely, given the components a1 and a2 of a vector, we can specify its
magnitude and direction:
$|\mathbf{a}| = (a_1^2 + a_2^2)^{1/2}$, $\cos\phi = a_1/|\mathbf{a}|$, $\sin\phi = a_2/|\mathbf{a}|$.
(You had practice at doing these calculations in Subsection 1.5.)


You will see this idea again in Section 4. For now, note that if you wish to
add two vectors in polar form it will be necessary first to resolve them into
their Cartesian components (since there is no convenient formula for vector
addition in polar coordinates).
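
The procedure just described (resolve, add components, convert back) can be
sketched in Python as follows. The function names resolve and add_polar, and
the sample vectors, are our own and are chosen only for illustration;
math.atan2 is used for the return trip because it selects the correct
quadrant from the signs of the components.

```python
import math

def resolve(r, phi):
    """Cartesian components (a1, a2) of the vector <r, phi>."""
    return r * math.cos(phi), r * math.sin(phi)

def add_polar(v, w):
    """Add two vectors given in polar form as (r, phi) pairs."""
    a1, a2 = resolve(*v)
    b1, b2 = resolve(*w)
    s1, s2 = a1 + b1, a2 + b2                      # add respective components
    return math.hypot(s1, s2), math.atan2(s2, s1)  # back to polar form

# Illustrative values: <1, 0> + <1, pi/2>
r, phi = add_polar((1, 0), (1, math.pi / 2))
print(r, phi)    # approximately sqrt(2) and pi/4
```

Note that the result is neither ⟨1 + 1, 0 + π/2⟩ in magnitude nor in angle,
which is why the detour through Cartesian components is needed.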

Exercise 3.5
(a) Resolve the vector v of magnitude 5 shown in Figure 3.6 into its Cartesian
components.

[Figure 3.6: the vector v of magnitude 5 in the (x, y)-plane; an angle of
5π/18 is marked]

(b) Find the magnitudes and directions of the vectors
a = 3i − j and b = −3i + 3j.

3.2 Vectors in three dimensions

Thus far we have discussed vectors in the plane, reaching the component
representation of such vectors in the previous subsection. However, the
world is three-dimensional, and few real problems are restricted to a plane
surface. For example, starting at point A at one corner of the cube shown in
Figure 3.7, you can reach the opposite corner S by three successive
displacements: $\overrightarrow{AQ} + \overrightarrow{QB} + \overrightarrow{BS}$. In order to work with such addition of
displacements in three dimensions, it is necessary to introduce a
three-dimensional coordinate system.

[Figure 3.7: a cube with the corners A, Q, B and S labelled]

A three-dimensional Cartesian coordinate system

Consider a two-dimensional Cartesian coordinate system Oxy. Draw a third
axis, the z-axis, through the origin O, perpendicular to both the x- and
y-axes of the two-dimensional system. This produces a coordinate system with
three mutually perpendicular axes, the x-, y- and z-axes (see Figure 3.8),
intersecting at O. Alternatively, the coordinate system can be characterized
by three planes:
• the (x, y)-plane, which contains the x- and y-axes and is perpendicular
to the z-axis;
• the (x, z)-plane, analogously defined;
• the (y, z)-plane, again analogously defined.
Any point P can be represented uniquely by its perpendicular distances from
the (x, y)-, (x, z)- and (y, z)-planes. These distances, called the (Cartesian)
coordinates of P, are shown in Figure 3.8. (The x-axes shown in
Figures 3.8 and 3.9 are meant to point out of the plane of the page.)

[Figure 3.8: the point P with coordinates p1, p2 and p3; the points A, B and C
lie on the x-, y- and z-axes, and Q, R and S lie in the coordinate planes]
[Figure 3.9: the point P = (2, 3, 4)]

(QP, RP and SP are perpendicular to the (x, y)-plane, (x, z)-plane and
(y, z)-plane, respectively.)
We denote the point P by the ordered triple of coordinates (p1, p2, p3), where
p1 = SP = OA,
p2 = RP = OB,
p3 = QP = OC.
For example, the point (2, 3, 4) is shown in Figure 3.9.
When drawing Figure 3.9 it was necessary to choose one of two possible ways
for the positive z-direction to be defined; these are shown in Figure 3.10,
where in both cases the y-axis is meant to point into the plane of the page,
away from you.

[Figure 3.10: the two possible choices of positive z-direction for given
x- and y-axes]

The usual convention for relating the positive directions of x, y and z is given Figure 3.10
by the following rule, called the right-hand rule. The right hand is held
with the middle finger, first finger and thumb placed (roughly) perpendicular
to each other, and the other two fingers closed (see Figure 3.11). If the
thumb and first finger are pointing in the directions of the positive x- and
y-axes, respectively, then the middle finger is pointing in the direction of
the positive z-axis.
Alternatively, you can think of Figure 3.9 as showing a corner of a room
(with the z-axis pointing upwards). If you are standing in the corner facing
outwards, then the left-hand edge of the floor is the y-axis, and the right-
hand edge is the x-axis. A coordinate system defined in this way is called
a right-handed system. Only right-handed systems will be used in this
course. The systems drawn in Figure 3.9 and the top of Figure 3.10 are
right-handed systems.

[Figure 3.11: a right hand with the thumb, first finger and middle finger
pointing in the positive x-, y- and z-directions, respectively]

An alternative definition of the same positive z-direction is given by the
screw rule, stated as follows. Suppose that we are turning a screw into
a piece of wood; then a clockwise rotation makes the screw move into the
wood (see Figure 3.12(a)). If we turn the screw in the sense from x to y
as shown in Figure 3.12(b), then the direction in which the screw moves is
along the positive z-direction.

[Figure 3.12: (a) a clockwise turn moves the screw in; (b) turning from x to y
moves the screw along the positive z-direction]

For the rest of this unit the screw rule will be used to characterize a
right-handed system, but you should use whichever rule you find easier to
apply.


Exercise 3.6
Decide which of the sets of perpendicular axes in Figure 3.13 define
right-handed coordinate systems.

[Figure 3.13: four sets of perpendicular x-, y- and z-axes through O,
labelled (a), (b), (c) and (d)]

(The x-axis points out of the plane of the paper in (a) and (b). The z-axis
points into and out of the plane of the paper in (c) and (d), respectively.)

The component form of three-dimensional vectors

The algebraic representation of vectors can be extended to vectors in three
dimensions, such as in Figure 3.14. The vector a, drawn from the origin O, is
the position vector of point A with three-dimensional Cartesian coordinates
(a1, a2, a3). A third Cartesian unit vector k is introduced to represent the
positive z-direction. We now have three Cartesian unit vectors, i, j and k,
which are perpendicular to each other. The vector a may thus be written
in component form as
$\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ or $\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}$ or $\mathbf{a} = [a_1 \; a_2 \; a_3]^T$.

[Figure 3.14: the position vector a of the point A, built from a1 i, a2 j and
a3 k along the x-, y- and z-axes, together with the unit vectors i, j and k]

Definition
The position vector of a point A relative to the origin O of three-dimensional
space is the displacement of A from O, i.e. the vector
$\mathbf{a} = \overrightarrow{OA}$.
The i-, j- and k-components of the position vector a are the coordinates
a1, a2 and a3 of the point A, respectively. (These may sometimes be
referred to as x-, y- and z-components.)

The components of vectors not based at the origin are defined similarly, as
follows.

Definition
A vector $\mathbf{a} = \overrightarrow{PQ}$ in three-dimensional space, where P is the point
(p1, p2, p3) and Q is the point (q1, q2, q3), has component form
a = a1 i + a2 j + a3 k,
where a1 = q1 − p1, a2 = q2 − p2, a3 = q3 − p3, and i, j, k are the
Cartesian unit vectors. The numbers a1, a2, a3 are the (Cartesian)
components of a.
Note that the component form may also be written as
$\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}$ or $\mathbf{a} = [a_1 \; a_2 \; a_3]^T$.


As in two dimensions, the operations of vector algebra can be expressed in
terms of components.

Adding and scaling three-dimensional vectors in component form
If a = a1 i + a2 j + a3 k, b = b1 i + b2 j + b3 k and m is a scalar, then
a + b = (a1 + b1)i + (a2 + b2)j + (a3 + b3)k
and
ma = (ma1)i + (ma2)j + (ma3)k.

The magnitude of a vector in terms of its components a1, a2, a3 can be found
using Pythagoras’s Theorem (see Figure 3.15). The length ON is $\sqrt{a_1^2 + a_2^2}$,
and $OA^2 = ON^2 + NA^2$. But OA = |a|, thus
$|\mathbf{a}| = \sqrt{a_1^2 + a_2^2 + a_3^2}$.

[Figure 3.15: the vector a from O to A, with N the foot of the perpendicular
from A to the (x, y)-plane; ON has length $\sqrt{a_1^2 + a_2^2}$ and NA has length a3]

This can be summarized as follows.

Magnitude of a three-dimensional vector in component form
If $\mathbf{a} = \overrightarrow{PQ} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$, where the points P and Q have
coordinates (p1, p2, p3) and (q1, q2, q3), respectively, then
$|\mathbf{a}| = \sqrt{a_1^2 + a_2^2 + a_3^2} = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + (q_3 - p_3)^2}$.
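
The magnitude formula, together with the unit vector of Subsection 2.1, can be
sketched in Python as below. The function names magnitude and unit are our
own, and the sample vector is chosen for illustration so that its magnitude is
a whole number.

```python
import math

def magnitude(a):
    """|a| = sqrt(a1^2 + a2^2 + a3^2) for a vector given by its components."""
    return math.sqrt(sum(ai * ai for ai in a))

def unit(a):
    """Unit vector (1/|a|) a; not defined for the zero vector."""
    m = magnitude(a)
    if m == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(ai / m for ai in a)

v = (2, 3, 6)                 # illustrative components: 4 + 9 + 36 = 49
print(magnitude(v))           # 7.0
print(unit(v))                # components 2/7, 3/7 and 6/7
```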

*Exercise 3.7
Given vectors a = i + j + k, b = 2i − 3j − k and c = 3i + k:
(a) express d = 2a − 3b and e = a − 2b + 4c in component form;
(b) find the magnitudes of the vectors d and e;
(c) evaluate |a|, and write down a unit vector in the direction of a;
(d) find the components of a vector x such that a + x = b.

Exercise 3.8
Find the magnitude of the vector $\mathbf{p} = 3\begin{bmatrix} 1 \\ 0 \\ 6 \end{bmatrix} - \begin{bmatrix} 2 \\ 3 \\ -1 \end{bmatrix}$.


Vector equation of a straight line


One useful application of position vectors (in two or three dimensions) is in
obtaining a vector equation of a straight line.
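Example 3.1
Find the position vector of a point T lying on the straight-line segment PQ
(see Figure 3.16) in terms of the position vectors of P and Q.

[Figure 3.16: the points P, T and Q on a line segment in the (x, y)-plane, with position vectors p, t and q relative to the origin O.]

Solution
Let T be any point on PQ (see Figure 3.16). The position vector $\overrightarrow{OT}$ of T
relative to the origin can also be written, using the triangle rule, as
$$\overrightarrow{OT} = \overrightarrow{OP} + \overrightarrow{PT}.$$
Now $\overrightarrow{PT} = s\,\overrightarrow{PQ}$, for some number s, and the point T traces out the line
segment PQ as s varies from 0 to 1. Thus the straight-line segment PQ is
described by the vector equation
$$\overrightarrow{OT} = \overrightarrow{OP} + s\,\overrightarrow{PQ} \quad (0 \le s \le 1).$$
Writing $\mathbf{p} = \overrightarrow{OP}$, $\mathbf{q} = \overrightarrow{OQ}$, $\mathbf{t} = \overrightarrow{OT}$, and noting (using the triangle rule) that
$\overrightarrow{PQ} = \overrightarrow{OQ} - \overrightarrow{OP} = \mathbf{q} - \mathbf{p}$, this equation can also be written as
$$\mathbf{t} = \mathbf{p} + s(\mathbf{q} - \mathbf{p}) = (1 - s)\mathbf{p} + s\mathbf{q} \quad (0 \le s \le 1).$$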

Note that if the parameter s in Example 3.1 is allowed to range over all the
real numbers (−∞ < s < ∞), then the point T traces out the entire straight
line of which P Q is a segment. Also note that the ideas in Example 3.1 are
easily extended to three dimensions.

Vector equation of a straight line


If P and Q are any two distinct points on a straight line in space,
with position vectors p and q, respectively, with respect to some given
origin, then the vector equation of the straight line is
t = (1 − s)p + sq (−∞ < s < ∞),
where t represents the position vector of any point on the line.
If 0 ≤ s ≤ 1, then the equation represents only the line segment PQ.
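The role of the parameter s in the vector equation of a line can be explored numerically. The sketch below assumes position vectors stored as tuples; the function name point_on_line and the two sample points are hypothetical and chosen only for illustration.

```python
def point_on_line(p, q, s):
    """Return t = (1 - s)p + sq for position vectors p and q."""
    return tuple((1 - s) * pi + s * qi for pi, qi in zip(p, q))

p = (0, 1, 2)   # position vector of an illustrative point P
q = (4, 3, 0)   # position vector of an illustrative point Q

print(point_on_line(p, q, 0))     # (0, 1, 2): the point P
print(point_on_line(p, q, 1))     # (4, 3, 0): the point Q
print(point_on_line(p, q, 0.5))   # (2.0, 2.0, 1.0): the midpoint of PQ
print(point_on_line(p, q, 2))     # (8, 5, -2): on the line, beyond Q
```

Values of s between 0 and 1 give points on the segment PQ, while values outside that range give points on the rest of the line.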

*Exercise 3.9
Write down, in component form, the vector equation of the straight line on
which lie the points with Cartesian coordinates (1, 1, 2) and (2, 3, 1).

End-of-section Exercises
Exercise 3.10
Let a = 2i − j, b = i + 3j + 5k and c = j − 2k.
(a) Find the magnitudes of a and b, and describe the direction of a.
(b) Find the vectors a + b, 2a − b and c + 2b − 3a in component form.
(c) What is the endpoint Q of the displacement represented by the vector
2a − b if (0, 2, 3) is its beginning point P ?
Exercise 3.11
Write the vectors 0, i, j and k as column vectors in three dimensions.


4 Products of vectors
So far in this unit we have defined two algebraic operations: vector addition
(by the triangle rule) and the scaling of a vector. The addition of vectors can
be usefully applied only to two vectors representing the same type of physical
quantity. For example, the addition of a displacement and a velocity has no
physical meaning. However, vectors representing the same or different types
of physical quantities can be combined in operations that are called the dot
product and the cross product. They are called products because in some
respects they behave like ‘multiplications’ in the algebra of real numbers.
Dot products and cross products of vectors have numerous applications in
geometry, mechanics and electromagnetism.
In this section the dot product and cross product are defined geometrically
and also in terms of components of vectors. The dot product of two vectors
is interpreted in terms of projecting a shadow of one vector onto another,
and is applied to the problem of finding the angle between two vectors or
lines. The cross product of two vectors is interpreted as a vector whose
magnitude is an area. Both dot and cross products can be used in problems
involving finding the areas of plane figures and the volumes of solid objects.

4.1 The dot product

Definition
The dot product of two vectors a and b is
a . b = |a| |b| cos θ,
where θ (0 ≤ θ ≤ π) is the angle between the directions of a and b (see
Figure 4.1).
The product a . b is read as ‘a dot b’.

The dot product of two vectors is a scalar quantity, i.e. it is a real number:
a . b is the product of the three scalars |a|, |b| and cos θ. So the operation of
the dot product combines two vectors to define a scalar, and for this reason
the dot product is also called the scalar product. The angle θ lies in the
range 0 ≤ θ ≤ π: the value of a . b is positive for 0 ≤ θ < π/2, i.e. when θ is
an acute angle; the value of a . b is negative for π/2 < θ ≤ π, i.e. when θ is
obtuse; the value of a . b is zero for θ = π/2, i.e. when θ is a right angle.

[Figure 4.1: two vectors a and b with the angle θ between them.]

It is important, when writing a dot product, to make sure that the dot
between the vectors is clear.

*Exercise 4.1
Three vectors a, b and c of magnitudes 2, 4 and 1 units, respectively, lying
in the same plane, are represented by arrows as shown in Figure 4.2. The
angle between the vectors a and b is π/3 radians, and that between the vectors
b and c is π/6 radians. Use the definition of dot product to find the values of
a . b, b . c, a . c and b . b.

[Figure 4.2: the vectors a (magnitude 2), b (magnitude 4) and c (magnitude 1), with angle π/3 between a and b and angle π/6 between b and c.]


This exercise demonstrates two important properties of the dot product.


(a) If two vectors a and b are perpendicular to each other (i.e. the angle
between them is π/2 radians), then since cos(π/2) = 0,
a . b = |a| |b| cos(π/2) = 0.
(b) The dot product of a vector with itself gives the square of the magnitude
of the vector, i.e.
a . a = |a| |a| cos 0 = |a|².
The converse of (a) also holds: if a and b are two non-zero vectors such
that a . b = 0, then the definition of the dot product tells us that cos θ = 0;
therefore θ = π/2 and the vectors are perpendicular.
In the product of real numbers, xy = 0 implies that either x or y (or both)
is zero. In contrast, for the dot product, a . b = 0 gives an extra possibility:
either a or b (or both) is the zero vector, or the angle between a and b is
π/2 radians.

Properties of the dot product


The following are some important properties of the dot product of two vec-
tors. They include the rules for manipulating dot products in algebraic
expressions.

Properties of the dot product


Let a, b and c be vectors, and let m be a scalar.
1 a . b is a scalar.
2 a . b = b . a, i.e. the dot product is commutative.
3 a . (b + c) = a . b + a . c and (a + b) . c = a . c + b . c, i.e. the dot
product is distributive over vector addition.
4 (ma) . b = m(a . b) = a . (mb), i.e. a scalar can be ‘moved through’
a dot product.
5 If neither a nor b is the zero vector, then a . b = 0 if and only if a
is perpendicular to b.
6 a . a = |a|².
These properties can all be derived from the definition of the dot product,
but the derivations are not given here.

The following example shows how these properties can be used to simplify
expressions.

Example 4.1
Expand the expression x . y, given that x = 2u + v and y = u − 5v. Cal-
culate its value when u and v are perpendicular unit vectors.
Solution
x . y = (2u + v) . (u − 5v)
= (2u) . (u − 5v) + v . (u − 5v) (Property 3)
= (2u) . u + (2u) . (−5v) + v . u + v . (−5v) (Property 3)
= 2(u . u) − 10(u . v) + v . u − 5(v . v) (Property 4)
= 2(u . u) − 9(u . v) − 5(v . v) (Property 2)
Now u . u = |u|² = 1 and v . v = |v|² = 1 when u and v are unit vectors.
Furthermore, u . v = 0 when u and v are perpendicular vectors. So when u
and v are perpendicular unit vectors, we have
x . y = 2 − 0 − 5 = −3.


*Exercise 4.2
(a) Expand the expression (a + b) . (a − b).
(b) Expand the expression |a + b|². Recall that |a|² = a . a.

Exercise 4.3
Given that a and b are perpendicular unit vectors:
(a) find the value of m such that the two vectors 2a + 3b and ma + b are
perpendicular;
(b) find the value of |c| if c = 3a + 5b.

Finally, a word of caution: (a . b)c is not in general the same as a(b . c).
The vector (a . b)c is a scaling of c by the number a . b, whereas a(b . c) is a
scaling of a by the number b . c. Clearly these two vectors are not generally
even parallel, let alone equal. For example, if a = b = i and c = j, then
(a . b)c = (i . i)j = j but a(b . c) = i(i . j) = 0.
(In general, if m is a scalar and a is a vector, we can write ma or am as
convenient, although ma is more usual; thus a(b . c) means the same as (b . c)a.)

The component form of the dot product


We saw in Section 3 that an arbitrary vector a in three dimensions may be
expressed in terms of the Cartesian unit vectors as
$$\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k} = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}.$$
With this representation, vector addition and scaling become simple algebraic
operations without any reference to diagrams. The definition of the
dot product was expressed in terms of the magnitudes of two vectors and
the angle between them. We shall now see how to express the dot product
in terms of components of vectors.
First observe that, by definition, i, j and k are unit vectors and are perpendicular
to one another (see Figure 4.3). Thus:
i . j = j . i = 0, i . k = k . i = 0, j . k = k . j = 0;
i . i = 1, j . j = 1, k . k = 1.

[Figure 4.3: the Cartesian unit vectors i, j and k. Note that for the right-handed system shown, the unit vector i points out of the plane of the page towards you.]

If two vectors a and b have component forms a = a1 i + a2 j + a3 k and
b = b1 i + b2 j + b3 k, then the dot product of a and b may be written as
(a1 i + a2 j + a3 k) . (b1 i + b2 j + b3 k).
We can now apply Properties 3 and 4 of the dot product and the above rules
for combining i, j and k to this expression to obtain a very simple formula
for the dot product of vectors in component form. Specifically, we have
(a1 i + a2 j + a3 k) . (b1 i + b2 j + b3 k)
= a1 i . (b1 i + b2 j + b3 k) + a2 j . (b1 i + b2 j + b3 k) + a3 k . (b1 i + b2 j + b3 k)
= a1 i . b1 i + a1 i . b2 j + a1 i . b3 k
+ a2 j . b1 i + a2 j . b2 j + a2 j . b3 k
+ a3 k . b1 i + a3 k . b2 j + a3 k . b3 k
= a1 b1 (i . i) + a1 b2 (i . j) + a1 b3 (i . k)
+ a2 b1 (j . i) + a2 b2 (j . j) + a2 b3 (j . k)
+ a3 b1 (k . i) + a3 b2 (k . j) + a3 b3 (k . k)
= a1 b1 + a2 b2 + a3 b3.


This extremely important formula is worth remembering.

Component form of the dot product


If a = a1 i + a2 j + a3 k and b = b1 i + b2 j + b3 k, then
a . b = a1 b1 + a2 b2 + a3 b3.
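The component formula for the dot product is short enough to check by machine. The following is a minimal Python sketch, assuming vectors stored as tuples; the function name dot is an illustrative choice. It checks the rules for i, j and k and uses the vectors of Exercise 4.4.

```python
def dot(a, b):
    """Return a . b = a1*b1 + a2*b2 + a3*b3."""
    return sum(x * y for x, y in zip(a, b))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(dot(i, j), dot(i, i))     # 0 1

a = (4, 1, -5)    # a = 4i + j - 5k
b = (1, -3, 1)    # b = i - 3j + k
print(dot(a, b))                # -4
print(dot(a, b) == dot(b, a))   # True (Property 2, commutativity)
```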

*Exercise 4.4
If a = 4i + j − 5k and b = i − 3j + k, show that a . b = −4. What does the
negative sign tell us?

The angle between two vectors


The component form of the dot product has an important application in
calculating the angle between two vectors. You have already seen that if
a . b = 0 and neither a nor b is zero, then a and b are perpendicular. For
instance, if a = 2i − j and b = 2i + 4j, then a . b = (2 × 2) + (−1 × 4) = 0,
so the angle between a and b is π/2 radians. In general, the equation defining
the dot product of a and b, i.e. a . b = |a| |b| cos θ, gives the following simple
expression for finding the angle between a and b.

Angle between two vectors


The angle θ between any two non-zero vectors a and b is given by
$$\cos\theta = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{a}|\,|\mathbf{b}|} = \frac{a_1 b_1 + a_2 b_2 + a_3 b_3}{\sqrt{a_1^2 + a_2^2 + a_3^2}\,\sqrt{b_1^2 + b_2^2 + b_3^2}},$$
where 0 ≤ θ ≤ π.

Example 4.2

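(a) Find the angle between the vector $\mathbf{a} = \mathbf{i} + \sqrt{3}\,\mathbf{k}$ and the x-axis.
(b) Find the angle between the vectors $\mathbf{a} = \mathbf{i} + \sqrt{3}\,\mathbf{k}$ and $\mathbf{b} = \sqrt{3}\,\mathbf{i} - 2\mathbf{j} + 3\mathbf{k}$.
(c) Show that $\mathbf{c} = -2\sqrt{3}\,\mathbf{i} + 2\mathbf{k}$ is perpendicular to $\mathbf{a} = \mathbf{i} + \sqrt{3}\,\mathbf{k}$.
Solution
(a) The direction of the x-axis is the same as the direction of i, and the
angle θ between a and i is given by
$$\cos\theta = \frac{\mathbf{a}\cdot\mathbf{i}}{|\mathbf{a}|\,|\mathbf{i}|} = \frac{a_1}{|\mathbf{a}|} = \frac{1}{\sqrt{1 + 3}} = \frac{1}{2}.$$
Thus the angle between a and the x-axis is π/3 radians.
(b) We have $|\mathbf{a}| = \sqrt{1 + 3} = 2$, $|\mathbf{b}| = \sqrt{3 + 4 + 9} = 4$ and
$$\mathbf{a}\cdot\mathbf{b} = (1 \times \sqrt{3}) + (0 \times -2) + (\sqrt{3} \times 3) = 4\sqrt{3}.$$
Therefore the angle θ between a and b is given by
$$\cos\theta = \frac{4\sqrt{3}}{2 \times 4} = \frac{\sqrt{3}}{2},$$
so θ = π/6 radians.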


(c) To test whether a and c are perpendicular, we calculate their dot product:
$$\mathbf{a}\cdot\mathbf{c} = (\mathbf{i} + \sqrt{3}\,\mathbf{k})\cdot(-2\sqrt{3}\,\mathbf{i} + 2\mathbf{k}) = (1 \times -2\sqrt{3}) + (0 \times 0) + (\sqrt{3} \times 2) = 0.$$
Since a . c = 0 and a and c are non-zero vectors, c is perpendicular
to a.
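The three parts of Example 4.2 can be checked numerically. The sketch below assumes vectors stored as tuples; the function names dot, magnitude and angle are illustrative, and the results are floating-point approximations rather than exact values.

```python
import math

def dot(a, b):
    """Return a . b in component form."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(a):
    """Return |a|."""
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Return the angle (radians, between 0 and pi) between non-zero a and b."""
    return math.acos(dot(a, b) / (magnitude(a) * magnitude(b)))

r3 = math.sqrt(3)
a = (1, 0, r3)         # a = i + sqrt(3) k
b = (r3, -2, 3)        # b = sqrt(3) i - 2j + 3k
c = (-2 * r3, 0, 2)    # c = -2 sqrt(3) i + 2k
i = (1, 0, 0)

print(angle(a, i), math.pi / 3)   # part (a): the two values agree closely
print(angle(a, b), math.pi / 6)   # part (b): the two values agree closely
print(dot(a, c))                  # part (c): zero, so a and c are perpendicular
```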

*Exercise 4.5
Consider the vectors
a = 2i − 3j + k and b = −i + 2j + 4k.
Find the magnitudes of a and b, and the angle between them.

Resolving a vector into components


The dot product has a useful geometric interpretation.

*Exercise 4.6
If a = a1 i + a2 j + a3 k, find the values of a . i, a . j and a . k.

The solution to Exercise 4.6 shows the important fact that the i-component
of any vector a may be found by taking the dot product a . i. The j- and
k-components can be found similarly (by taking dot products with j and k,
respectively).
We can also find the components of a vector in other directions. Suppose
that a vector a, represented by $\overrightarrow{OA}$, makes an angle θ with a unit vector u
(see Figure 4.4). Draw the line AP perpendicular to the direction of u.
Then the distance OP is seen from simple trigonometry to be |a| cos θ. Now
observe that the dot product of a and u is
a . u = |a| |u| cos θ = |OP| (since |u| = 1).
The distance OP represents the component of a in the direction of u. Note
that a . u will be negative if θ > π/2, i.e. if P and u lie on opposite sides of O.

[Figure 4.4: the vector a = OA making an angle θ with the unit vector u; P is the foot of the perpendicular from A to the line of u.]

Definition
The component of a vector a in the direction of an arbitrary unit
vector u is a . u.
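The definition of a component can be sketched in a few lines of Python. This is an illustration under stated assumptions: vectors are tuples, the function names unit and component are hypothetical, and the direction argument is normalised first so that it need not already be a unit vector. The sample vector is arbitrary.

```python
import math

def dot(a, b):
    """Return a . b in component form."""
    return sum(x * y for x, y in zip(a, b))

def unit(v):
    """Return the unit vector in the direction of the non-zero vector v."""
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def component(a, direction):
    """Return the component of a in the given direction, i.e. a . u."""
    return dot(a, unit(direction))

a = (3, 4, 0)
print(component(a, (1, 0, 0)))    # 3.0, the i-component of a
print(component(a, (0, 0, 2)))    # 0.0, since a is perpendicular to k
print(component(a, (-1, 0, 0)))   # -3.0, negative because the angle exceeds pi/2
```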

*Exercise 4.7
Consider the vectors
a = 2i − 3j + k and b = −i + 2j + 4k.
(a) Which of the following vectors is perpendicular to a?
c = −i + j + 3k, d = −2i + k, e = −i − j − k.
(b) Find the component of the vector a + 2b in the direction of the line
joining the origin to the point (1, 1, 1).


Resolving vectors will be a vital technique in subsequent units, and sometimes
you will need to be able to resolve a vector into components in directions
other than horizontal and vertical. For example, suppose that two
forces, N and W, are acting at a point on an inclined plane (see Figure 4.5).
These forces can be represented by vectors, and you will see that it may
be convenient to take axes as shown, with i pointing up the plane. It is
then necessary to be able to resolve N and W into components along and
perpendicular to the plane.

[Figure 4.5: the forces N and W acting at a point on an inclined plane, with i pointing up the plane and j perpendicular to it.]
The dot product method of obtaining components always works, but a ge-
ometric view is also useful. This follows because the component of a vector
a in the direction of a unit vector u is
a . u = |a| cos θ,
where θ is the angle between a and u. We summarize the method as a
procedure.

Procedure 4.1 Resolving a vector into components


Given a vector a and a unit vector u, to find the component of a in
the direction of u, do the following.
• Find (usually from a diagram) the angle θ between a and u (with
0 ≤ θ ≤ π).
• The component of the vector a in the direction of the unit vector
u is |a| cos θ.
• If necessary (for example, if θ > π/2), use the trigonometric formulae
from the Handbook to simplify the result.

The following example uses Cartesian unit vectors that are not horizontal
and vertical.

Example 4.3
Suppose that the unit vector i points up a plane which is inclined at an angle
α to the horizontal, and the unit vector j is perpendicular to the plane, as
shown in Figure 4.6. Find the i- and j-components of the vectors N and W.

[Figure 4.6: the vectors N and W on a plane inclined at angle α, with the angle π/2 + α between W and i and the angle π − α between W and j.]

Solution
It is easy to resolve the vector N into its component form:
N = 0i + |N|j = |N|j.
For the vector W, we note from the geometry of the diagram that the angles
between W and i, and W and j, are given by π/2 + α and π − α, respectively.
Applying Procedure 4.1 twice (with u = i and then u = j) allows us to
resolve W into its component form:
W = |W| cos(π/2 + α) i + |W| cos(π − α) j
= −|W| sin α i − |W| cos α j.
So the i- and j-components of N are 0 and |N| respectively, and those of W
are −|W| sin α and −|W| cos α.
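The simplification used in Example 4.3 can be checked numerically. The sketch below applies Procedure 4.1 directly and compares the result with the simplified forms; the function name resolve, the incline angle of 30 degrees and the magnitude |W| = 10 are all hypothetical values chosen for illustration.

```python
import math

def resolve(mag, theta):
    """Component |a| cos(theta) of a vector of magnitude mag along a unit
    vector that makes an angle theta with it (Procedure 4.1)."""
    return mag * math.cos(theta)

alpha = math.radians(30)   # hypothetical angle of the incline
W = 10.0                   # hypothetical magnitude |W|

w_i = resolve(W, math.pi / 2 + alpha)   # angle between W and i
w_j = resolve(W, math.pi - alpha)       # angle between W and j

print(w_i, -W * math.sin(alpha))   # both close to -5.0
print(w_j, -W * math.cos(alpha))   # both close to -8.66
```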

*Exercise 4.8
The two-dimensional vectors v and w in Figure 4.7 have magnitudes 1.5
and 2, respectively. Resolve v and w into their i- and j-components.

[Figure 4.7: the vectors v and w drawn relative to the unit vectors i and j, each with a marked angle of π/6.]


Exercise 4.9
In Figure 4.8 the point P lies on a line making an angle α with the x-axis.
The vectors a, b, c, d have magnitudes 1, 1.5, 1.5 and 2, respectively, and
point in the directions shown.

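[Figure 4.8: the point P on a line making an angle α with the x-axis, with the vectors a, b, c and d and the unit vectors i and j.]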

Resolve each of these vectors into their i- and j-components.

*Exercise 4.10
Figure 4.9 shows a configuration similar to Figure 4.8, but with the unit
vectors i and j aligned along and perpendicular to the line, respectively.

[Figure 4.9: the same configuration as Figure 4.8, with the unit vector i along the line and j perpendicular to it.]

Resolve each of the vectors a, b, c and d into their i- and j-components.

*Exercise 4.11
The vectors p, q and r in Figure 4.10 have magnitudes 2.5, 3 and 2.5,
respectively. Resolve p, q and r into their i- and j-components.

[Figure 4.10: the vectors p, q and r, with marked angles α, β and γ, and the unit vectors i and j.]


Exercise 4.12
The sum of the two-dimensional vectors a, b, c in Figure 4.11 is the zero vector,
and |c| = 2. By resolving the vectors into their components, determine
the magnitudes of a and b.

[Figure 4.11: the vectors a, b and c with the unit vectors i and j; the marked angle is π/4.]

4.2 The cross product

You have seen that the dot product of two vectors is a scalar (i.e. a real
number). In contrast, the cross product of two vectors is a vector, whose
direction is perpendicular to both. The cross product has numerous applications
in geometry and mechanics, as you will see later in the course.

Definition
The cross product of two vectors a and b is
a × b = (|a| |b| sin θ) ĉ,
where θ (0 ≤ θ ≤ π) is the angle between the directions of a and b, and
ĉ is a unit vector perpendicular to both a and b, whose sense is given
by the right-hand screw rule as shown in Figure 4.12.
The product a × b is read as ‘a cross b’.
The angle θ between two vectors a and b lies in the range 0 ≤ θ ≤ π, so
sin θ ≥ 0 and hence |a| |b| sin θ ≥ 0. So the cross product of a and b is a
vector with magnitude |a| |b| sin θ and direction defined by ĉ. The direction
of ĉ is the direction in which the screw in Figure 4.12 would advance when
turned from a towards b through the angle θ. Notice that ĉ is not defined
if a and b are parallel or if a or b is the zero vector; but in these cases
|a| |b| sin θ = 0 so we take a × b = 0. The cross product is also called the
vector product, which stresses the fact that a × b is a vector.

[Figure 4.12: a screw turned from a to b through the angle θ; the direction of the screw's motion is the direction of ĉ.]

The order of writing down a and b is very important. According to the
screw rule, b × a is a vector in the direction opposite to a × b. Figure 4.13
shows what would happen to the screw in Figure 4.12 if we turned from b
to a: it would ‘unscrew’. The unit vector d̂ in the direction of b × a is in
the opposite sense to ĉ, i.e. d̂ = −ĉ. Hence
b × a = (|b| |a| sin θ) d̂ = −(|b| |a| sin θ) ĉ = −(a × b).

[Figure 4.13: a screw turned from b to a through the angle θ; the direction of the screw's motion is the direction of d̂.]

*Exercise 4.13
Three vectors u, v and w lie in the (x, y)-plane. Their magnitudes are 2, 3
and 4 units, respectively, their directions make angles π/6, π/3 and π/6 radians,
respectively, with the positive x-axis, and they have positive j-components.
Use the definition of the cross product to find the vectors u × v, u × w and
v × w.

Exercise 4.13 illustrates an important property of the cross product. If two
vectors a and b are parallel, then the angle θ between their directions is zero
or π radians, so the cross product of a and b is the zero vector, because the
magnitude of the vector, i.e. |a| |b| sin θ, is zero. The converse also holds: if
a and b are two non-zero vectors such that a × b = 0, then the definition
of the cross product tells us that sin θ = 0; therefore θ = 0 or θ = π, and the
vectors are parallel. We can also deduce that
a × a = 0 for any vector a.


So we can test for perpendicular vectors by using the dot product and for
parallel vectors by using the cross product.

Properties of the cross product


The following are some important properties of the cross product of two
vectors. They include the rules for manipulating cross products in algebraic
expressions.

Properties of the cross product


Let a, b and c be vectors, and let m be a scalar.
1 a × b is a vector.
2 b × a = −(a × b).
3 a × (b+c) = (a × b)+(a × c) and (a+b) × c = (a × c)+(b × c),
i.e. the cross product is distributive over vector addition.
4 (ma) × b = m(a × b) = a × (mb), i.e. a scalar can be ‘moved
through’ a cross product.
5 If neither a nor b is the zero vector, then a × b = 0 if and only if
a and b are parallel.
6 a × a = 0.
7 In general, a × (b × c) ≠ (a × b) × c, i.e. the cross product is not associative.

These properties can all be derived from the definition of the cross product,
but the derivations are not given here. Note in particular Property 2: the
cross product is not commutative — the order does matter.

The component form of the cross product

*Exercise 4.14
(a) Show that i × j = k, j × k = i and k × i = j.
(b) Calculate j × i, k × j and i × k.
(c) Calculate i × i, j × j and k × k.
(d) Expand and simplify
(i + k) × (i + j + k) and (i × (i + k)) − ((i + j) × k).

The cyclic pattern of the products i × j, j × k, k × i and of the products
i × k, k × j, j × i, as demonstrated in Exercise 4.14, can be remembered
using Figure 4.14. For example, if we go round the circle clockwise starting
at i, we have
i × j = k, j × k = i, k × i = j.
However, if we go in an anticlockwise direction, the cross products are negative:
i × k = −j, k × j = −i, j × i = −k.

[Figure 4.14: i, j and k arranged clockwise around a circle.]

If two vectors a and b have component forms a = a1 i + a2 j + a3 k and
b = b1 i + b2 j + b3 k, then the cross product a × b may be written as


a × b = (a1 i + a2 j + a3 k) × (b1 i + b2 j + b3 k)
= a1 i × (b1 i + b2 j + b3 k)
+ a2 j × (b1 i + b2 j + b3 k)
+ a3 k × (b1 i + b2 j + b3 k) (using Property 3)
= a1 i × b1 i + a1 i × b2 j + a1 i × b3 k
+ a2 j × b1 i + a2 j × b2 j + a2 j × b3 k
+ a3 k × b1 i + a3 k × b2 j + a3 k × b3 k (using Property 3)
= a1 b2 (i × j) + a1 b3 (i × k)
+ a2 b1 (j × i) + a2 b3 (j × k)
+ a3 b1 (k × i) + a3 b2 (k × j) (using Properties 4 and 6)
= (a2 b3 − a3 b2 )i + (a3 b1 − a1 b3 )j + (a1 b2 − a2 b1 )k (using the results above).
We highlight this important formula.

Component form of the cross product


If a = a1 i + a2 j + a3 k and b = b1 i + b2 j + b3 k, then
a × b = (a2 b3 − a3 b2 )i + (a3 b1 − a1 b3 )j + (a1 b2 − a2 b1 )k
$$= \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix}.$$

This formula is not easy to remember or use. For this reason, simpler methods
have been devised, such as the following, which is known as Sarrus’s
Rule. Given two vectors a = [a1 a2 a3]^T and b = [b1 b2 b3]^T, draw a
tableau with i, j and k in the top row, then repeat i and j. In the second
row do the same with the components of a, and in the third row with those
of b. Then following the diagonal lines as shown, and multiplying the entries,
gives the corresponding components of the cross product a × b, which
are the elements on the fourth row of the tableau.
(Another quick way to evaluate cross products is to use determinants. This
method is introduced in Unit 9 when we discuss determinants. If you already
know this method, then we suggest that you continue to use it.)

i         j         k         i         j
a1        a2        a3        a1        a2
b1        b2        b3        b1        b2
−a2 b1 k  −a3 b2 i  −a1 b3 j  a2 b3 i   a3 b1 j   a1 b2 k
[Tableau for Sarrus’s Rule: diagonal lines join each entry of the top row to the entries one column along in each of the next two rows.]

(The diagonals pointing to the right yield positive terms, while those point-
ing to the left have a minus sign.)

Example 4.4
If a = 2i + j − k and b = i − 3j + 4k, find a × b.
Solution
Since a1 = 2, a2 = 1, a3 = −1 and b1 = 1, b2 = −3, b3 = 4, the formula
above gives
a × b = ((1 × 4) − (−1 × −3))i
+ ((−1 × 1) − (2 × 4))j
+ ((2 × −3) − (1 × 1))k
= i − 9j − 7k.


Alternatively, using the tableau, we have

i     j     k     i     j
2     1     −1    2     1
1     −3    4     1     −3
−k    −3i   −8j   4i    −j    −6k

so a × b = i − 9j − 7k, as before.
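The component formula for the cross product can be checked against Example 4.4. The following is a minimal Python sketch, assuming vectors stored as tuples; the function names cross and dot are illustrative choices.

```python
def cross(a, b):
    """Return a x b using the component formula."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)

def dot(a, b):
    """Return a . b in component form."""
    return sum(x * y for x, y in zip(a, b))

a = (2, 1, -1)    # a = 2i + j - k
b = (1, -3, 4)    # b = i - 3j + 4k

n = cross(a, b)
print(n)                      # (1, -9, -7), i.e. i - 9j - 7k
print(cross(b, a))            # (-1, 9, 7), i.e. -(a x b)
print(dot(n, a), dot(n, b))   # 0 0: a x b is perpendicular to both a and b
```

The last line uses the dot product test for perpendicularity described earlier in the section.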

*Exercise 4.15
If a = 2i − 3j + k, b = −i + 2j + 4k and c = −4i + 6j − 2k, find a × b, a × c
and b × c. From your results, what can you say about a and c?

*Exercise 4.16
If a = 2i + 2j + k and b = 4i + 4j − 7k, find a unit vector whose direction
is perpendicular to the directions of both a and b.

We close the section, and the unit, with some useful geometric applications
of the cross product. The following example is the first step.

Example 4.5
Any two non-zero and non-parallel vectors a and b define a parallelogram,
as shown in Figure 4.15. Express the area of the parallelogram in terms of
a × b.

[Figure 4.15: the parallelogram defined by the vectors a and b, with the angle θ between them.]

Solution
The area A of the parallelogram defined by the two vectors a and b is
the same as the area of the rectangle of height |b| sin θ and width |a| (see
Figure 4.16). Thus A = |a| |b| sin θ, and this is the magnitude of a × b. So
A = |a × b|.

[Figure 4.16: the parallelogram defined by a and b, with the height |b| sin θ marked.]

Area of a parallelogram
The area of a parallelogram with sides defined by vectors a and b is
|a × b|.

This idea is easily extended for the area of a triangle. Any two non-zero,
non-parallel vectors a and b define a triangle (see Figure 4.17). The area of
this triangle is half that of the corresponding parallelogram, so it is ½|a × b|.

[Figure 4.17: the triangle defined by the vectors a and b.]

Area of a triangle
The area of a triangle with sides defined by vectors a and b is ½|a × b|.
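The two area formulae can be sketched as follows. The function names and the sample vectors are hypothetical; the vectors were chosen so that the parallelogram has base 3 and height 2, which makes the expected areas easy to confirm by hand.

```python
import math

def cross(a, b):
    """Return a x b using the component formula."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def magnitude(a):
    """Return |a|."""
    return math.sqrt(sum(x * x for x in a))

def parallelogram_area(a, b):
    """Area |a x b| of the parallelogram with sides defined by a and b."""
    return magnitude(cross(a, b))

def triangle_area(a, b):
    """Area (1/2)|a x b| of the triangle with sides defined by a and b."""
    return 0.5 * parallelogram_area(a, b)

a = (3, 0, 0)   # base of length 3 along the x-axis
b = (1, 2, 0)   # second side, reaching height 2 above the base

print(parallelogram_area(a, b))   # 6.0
print(triangle_area(a, b))        # 3.0
```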


Using the formula for the area of a parallelogram, it is easy to find the
volume of a parallelepiped (see Figure 4.18). A parallelepiped is like a
distorted brick: all of its faces are parallelograms. Its volume is given by
volume of parallelepiped = base area × vertical height h
= |(a × b) . c|.

[Figure 4.18: a parallelepiped with base defined by the vectors a and b, third edge c and vertical height h.]

Here we have made use of the fact that the base is a parallelogram (assumed
to be in the (x, y)-plane) defined by the vectors a and b. The base therefore
has an area equal to the magnitude of a × b. Now the vertical height h
is the component of the vector c in the direction of the Cartesian unit
vector k pointing vertically upwards, i.e. it is the z-component of c, given
by c . k = k . c. So the volume of the parallelepiped is |a × b|(k . c). But
the vector product a × b points vertically upwards and can therefore be
expressed as |a × b| k. Hence the volume of the parallelepiped is
|a × b|(k . c) = (|a × b| k) . c = (a × b) . c.
Of course, the scalar (a × b) . c can be negative if one of the defining vectors
a or b is chosen to be in the opposite direction to the one chosen in
Figure 4.18, or if the order of the cross product is reversed. The modulus
signs in the formula |(a × b) . c| ensure that the volume comes out positive.
The scalar quantity (a × b) . c is an example of a scalar triple product.
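The volume formula and the effect of reversing the cross product can be illustrated with a short Python sketch. The function names and the sample vectors are hypothetical; the base is a 2 by 3 rectangle in the (x, y)-plane and the third edge has z-component 4, so the volume should be 24.

```python
def cross(a, b):
    """Return a x b using the component formula."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(a, b):
    """Return a . b in component form."""
    return sum(x * y for x, y in zip(a, b))

def parallelepiped_volume(a, b, c):
    """Volume |(a x b) . c| of the parallelepiped defined by a, b and c."""
    return abs(dot(cross(a, b), c))

a = (2, 0, 0)
b = (0, 3, 0)
c = (1, 1, 4)   # slanted edge with vertical height 4

print(dot(cross(a, b), c))              # 24
print(dot(cross(b, a), c))              # -24: reversing the order changes the sign
print(parallelepiped_volume(b, a, c))   # 24: the modulus removes the sign
```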

End-of-section Exercises
Exercise 4.17
(a) Is a . b a vector?
(b) Can a . b be negative?
(c) What is special about a and b if a . b = |a| |b|?
(d) If a . b = 0, what can you say about a and b?
(e) If a × b = 0, what can you say about a and b?

Exercise 4.18
Suppose that the vectors r and s are directed towards north and north-east,
respectively, and define r × s = t.
(a) What is the direction of t?
(b) In what direction is s × r?
(c) In what direction is t × r?
(d) If |r| = |s| = 1, what is |t|?
(e) Calculate the vector t × (r × s).
(f) If |r| = |s| = 1, what is the value of r . s?
(g) If |r| = |s| = 1, what is the value of s . (t × r)?

Exercise 4.19
Find the value of (a × b) . a for any non-zero vectors a and b.


Outcomes
After studying this unit you should be able to:
• understand the meaning of the terms scalar, vector, displacement vector,
unit vector and position vector, and know what it means to say that two
vectors are equal;
• use vector notation and represent vectors as arrows on diagrams;
• use the plane polar coordinate representation of the magnitude and di-
rection of a vector, and convert between the polar coordinates and the
Cartesian coordinates of the endpoint of a vector drawn from the origin;
• scale a vector by a number, and add two vectors geometrically using the
triangle rule (or the parallelogram rule);
• resolve a vector into its Cartesian components, and scale and add vectors
given in Cartesian component form;
• calculate the dot product (scalar product) and cross product (vector
product) of two given vectors;
• determine whether or not two given vectors are perpendicular or parallel
to one another;
• determine the magnitude of a vector and the angle between the directions
of two vectors;
• write down the vector equation of a given straight line;
• resolve a vector in a given direction;
• manipulate vector expressions and equations involving the scaling, ad-
dition, dot product and cross product of vectors;
• use the cross product to determine the area of a parallelogram or triangle.


Solutions to the exercises

Section 1

1.1 [Figure: the vectors a and b drawn from the origin O on axes marked from 1 to 3, with a marked angle of π/3.]

1.2 [Figure: scale drawing (scale bar from 0 to 30 km) involving Birmingham, Leicester and Derby, with the labels 57 km, 32 km, 30°, 45° and the direction N.]

1.3 [Figure: scale drawing (scale bar from 0 to 70 mph) of a velocity of 70 mph, with the label 60° and the direction N.]

1.4 f = b, as both are of length 2 units and both point
in the positive y-direction.

1.5 The completed table is as follows.

Cartesian coordinates (x, y)      Polar coordinates ⟨r, φ⟩
(0, −1)                           ⟨1, −π/2⟩
(1, 1)                            ⟨√2, π/4⟩
(2√2, −2√2)                       ⟨4, −π/4⟩
(−6, 0)                           ⟨6, π⟩
(−1, −1)                          ⟨√2, −3π/4⟩
(none)                            ⟨−1, π⟩
(2.003 × 10^7, 9.797 × 10^7)      ⟨10^8, exp(0.1π)⟩

The entry in row 6 is an invalid entry because r must
be non-negative.

1.6 The most obvious choice is the Cartesian coordinate
system with origin at Bristol (see Figure 1.3 on
page 156), the x-axis pointing east and the y-axis pointing
north. Then r = 296 and φ = π/2 − 15π/180 = 5π/12. Hence
s = ⟨296, 5π/12⟩. Another choice would have the origin at
Bristol but the x-axis pointing from Bristol to Leeds,
in which case you would have s = ⟨296, 0⟩. (Infinitely
many other choices are possible.)

1.7 Scalar quantities: temperature, volume, energy, time.
Vector quantities: velocity, force, displacement, acceleration.

1.8 Since (0, −3) lies on the negative part of the y-axis,
we can immediately write down the polar coordinates
of $\overrightarrow{OQ}$ as ⟨3, −π/2⟩, so $|\overrightarrow{OQ}| = 3$.
Alternatively, using the formulae
$r = (0^2 + (-3)^2)^{1/2} = 3$,
sin φ = −3/3 = −1, cos φ = 0
gives the same results.

Section 2

2.1 (a) Leeds to Bristol: −d.
Leeds to Leeds: 0.
(b) (i) 2v (ii) −v (iii) 0
(c) The vector −1.5v has magnitude 1.5|v| and direction
opposite to v.
The vector −kv (k positive) has magnitude k|v| and
direction opposite to v.
(d) (i) The vectors $\overrightarrow{AB}$ and $\overrightarrow{DC}$ are equal in length
and parallel, and point the same way (i.e. have the same
direction). Thus
$\overrightarrow{AB} = \overrightarrow{DC}$.
(ii) The vectors $\overrightarrow{BC}$ and $\overrightarrow{DA}$ are equal in length and
parallel, but point in opposite directions. Thus
$\overrightarrow{BC} = -\overrightarrow{DA}$ (or, equivalently, $\overrightarrow{DA} = -\overrightarrow{BC}$).


(e) $\frac{1}{|\mathbf{v}|}\mathbf{v}$ is a scaling of v by the positive scalar $m = \frac{1}{|\mathbf{v}|}$.
The direction of $\frac{1}{|\mathbf{v}|}\mathbf{v}$ is thus the same as that of v, and
the magnitude is $m|\mathbf{v}| = \frac{1}{|\mathbf{v}|}|\mathbf{v}| = 1$.

2.2 a = 2i, b = −2.5i, c = 3j, d = −j.

2.3 (a) −35j (where |j| represents 1 km per hour).
(b) −112i (where |i| represents 1 mile).
(c) 112i (where |i| represents 1 mile).

2.4 (a) [Figure: sketch showing the vectors a, b, c, a + b and a + c.]
(b) [Figure: sketch showing the vectors a, b, −b and a + (−b).]

2.5 [Figure: sketch showing the vectors a, b, −b, a + b and a − b.]

2.6 The following sketch illustrates the associative
property.

[Figure: sketch showing a, b, c, a + b, b + c and (a + b) + c = a + (b + c), with marked angles of π/3.]

(To evaluate (a + b) + c, we go first to a + b (in
the lower quadrant) and then add c. To evaluate
a + (b + c), we go first to a (along the x-axis) and then
add b + c.)
The following sketch illustrates the distributive property.

[Figure: sketch showing a, b, a + b, 2a, 2b and 2(a + b) = 2a + 2b, with marked angles of π/4.]

2.7 4(a − c) + 3(c − b) + 2(2a − b − 3c)
= 4a − 4c + 3c − 3b + 4a − 2b − 6c
= 8a − 5b − 7c

2.8 (a) 2b + 4x = 7a, therefore
4x = 7a − 2b,
so
x = (7/4)a − (1/2)b.
(b) n(b − a) + x = m(a − b), therefore
x = m(a − b) − n(b − a)
= m(a − b) + n(a − b)
= (m + n)(a − b).


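2.9 [Sketch: a, b, −b, a + b, a − b, 2a, ½b and 2a + ½b in the (x, y)-plane, with angles of π/4 marked.]

2.10 The magnitude of $\mathbf{v}$ is 4.7 times the magnitude of $\mathbf{u}$. $\mathbf{v}$ is parallel to $\mathbf{u}$, but the sense of $\mathbf{v}$ is opposite to the sense of $\mathbf{u}$, i.e. $\mathbf{v}$ and $\mathbf{u}$ have opposite directions.

2.11 By the triangle rule,
$\overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}$,
$\overrightarrow{CD} + \overrightarrow{DA} = \overrightarrow{CA}$.
But $\overrightarrow{CA} = -\overrightarrow{AC}$, so we have
$\overrightarrow{AB} + \overrightarrow{BC} + \overrightarrow{CD} + \overrightarrow{DA} = \overrightarrow{AC} - \overrightarrow{AC} = \mathbf{0}$.
[Sketch: the quadrilateral ABCD.]
Alternatively, $\overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}$. Hence
$\overrightarrow{AB} + \overrightarrow{BC} + \overrightarrow{CD} = \overrightarrow{AC} + \overrightarrow{CD} = \overrightarrow{AD}$,
so
$\overrightarrow{AB} + \overrightarrow{BC} + \overrightarrow{CD} + \overrightarrow{DA} = \overrightarrow{AD} + \overrightarrow{DA} = \mathbf{0}$,
since $\overrightarrow{AD} = -\overrightarrow{DA}$.
(There are various other possible arguments!)

2.12 The vectors $\mathbf{p}$, $\mathbf{q}$ and $\mathbf{p} + \mathbf{q}$ are sketched below.
[Sketch: p = ⟨3, π/2⟩, q = ⟨4, π⟩ and p + q, with the angle φ marked.]
$5\mathbf{p} = \langle 15, \frac{\pi}{2} \rangle$, $-\mathbf{q} = \langle 4, 0 \rangle$.
Since the directions of $\mathbf{p}$ and $\mathbf{q}$ are at right angles, $|\mathbf{p} + \mathbf{q}| = \sqrt{3^2 + 4^2} = 5$ and $\phi = \frac{\pi}{2} + \arctan\frac{4}{3} \approx 2.498$ radians, so
$\mathbf{p} + \mathbf{q} = \langle 5, 2.498 \rangle$.

2.13 (a) The first rule is true and the second is false (scalar multiplication does not change direction).
(b) The proposed rule does not hold. (Consider $\mathbf{r} + \mathbf{r}$, for example, where $\mathbf{r} = \langle r, \phi \rangle$. The proposed rule gives $\mathbf{r} + \mathbf{r} = \langle 2r, 2\phi \rangle$, whereas actually $\mathbf{r} + \mathbf{r} = 2\mathbf{r} = \langle 2r, \phi \rangle$.)
(There is an algebraic rule for adding vectors in polar form, but it is rather unwieldy. This is one reason why Section 3 introduces the Cartesian representation of a vector, for which there is a simple algebraic rule for the addition of vectors.)

Section 3

3.1 $\mathbf{a} = \mathbf{i} = [1 \;\; 0]^T$
$\mathbf{b} = 2\mathbf{j} = [0 \;\; 2]^T$
$\mathbf{c} = \mathbf{i} = [1 \;\; 0]^T$ $(= \mathbf{a})$
$\mathbf{d} = -2\mathbf{i} = [-2 \;\; 0]^T$
$\mathbf{e} = -\mathbf{i} + 2\mathbf{j} = [-1 \;\; 2]^T$
$\mathbf{f} = 2\mathbf{j} = [0 \;\; 2]^T$ $(= \mathbf{b})$
$\mathbf{g} = \mathbf{i} - \mathbf{j} = [1 \;\; -1]^T$
$\mathbf{h} = 3\mathbf{i} = [3 \;\; 0]^T$

3.2 (a) $\mathbf{a} = 3\mathbf{i} + \mathbf{j}$, $\mathbf{b} = -\mathbf{i} + 3\mathbf{j}$, $\mathbf{c} = -3\mathbf{i} - 2\mathbf{j}$, $\mathbf{d} = 3\mathbf{i} + \mathbf{j}$ $(= \mathbf{a})$.
(b) [Sketch: a and 3.5a in the (x, y)-plane, with 3.5a reaching the point (10.5, 3.5).]
$3.5\mathbf{a} = 3.5(3\mathbf{i} + \mathbf{j}) = 10.5\mathbf{i} + 3.5\mathbf{j}$, as in the diagram.
(c) [Sketch: a, b and a + b in the (x, y)-plane.]
$\mathbf{a} + \mathbf{b} = (3\mathbf{i} + \mathbf{j}) + (-\mathbf{i} + 3\mathbf{j}) = 2\mathbf{i} + 4\mathbf{j}$, as in the diagram.
(d) $2\mathbf{a} + \mathbf{b} - \mathbf{c} = 2(3\mathbf{i} + \mathbf{j}) + (-\mathbf{i} + 3\mathbf{j}) - (-3\mathbf{i} - 2\mathbf{j}) = 8\mathbf{i} + 7\mathbf{j}$.
Thus $|2\mathbf{a} + \mathbf{b} - \mathbf{c}| = \sqrt{8^2 + 7^2} = \sqrt{113}$.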
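Solutions 2.12 and 2.13 together suggest a practical way to add vectors given in polar form: convert each to Cartesian components, add, and convert back. The sketch below illustrates this under stated assumptions; the function names are hypothetical, and a pair `(r, phi)` stands for the polar form ⟨r, φ⟩ with φ in radians.

```python
import math

def polar_to_cartesian(r, phi):
    """Components (x, y) from x = r cos(phi), y = r sin(phi)."""
    return (r * math.cos(phi), r * math.sin(phi))

def cartesian_to_polar(x, y):
    """Magnitude and angle; atan2 places the angle in the correct quadrant."""
    return (math.hypot(x, y), math.atan2(y, x))

def add_polar(p, q):
    """Add two polar-form vectors by way of their Cartesian components."""
    px, py = polar_to_cartesian(*p)
    qx, qy = polar_to_cartesian(*q)
    return cartesian_to_polar(px + qx, py + qy)

# The vectors of Solution 2.12: p = <3, pi/2> and q = <4, pi>.
p = (3.0, math.pi / 2)
q = (4.0, math.pi)
r, phi = add_polar(p, q)
print(round(r, 3), round(phi, 3))
```

The same function confirms the point made in Solution 2.13(b): adding a vector to itself doubles the magnitude and leaves the angle unchanged.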


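3.3 (a) Given $\mathbf{r} = \mathbf{s}$, we can equate the corresponding components. Thus
$p + q = -3$ and $p - q = 7$,
which gives $p = 2$ and $q = -5$.
(b) $\mathbf{t} = \mathbf{u} + \mathbf{v} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix} + \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \frac{1}{\sqrt{2}}\begin{bmatrix} 2 \\ 0 \end{bmatrix} = \begin{bmatrix} \sqrt{2} \\ 0 \end{bmatrix}$.
Hence $|\mathbf{t}| = \sqrt{2}$.

3.4 (a) $|\mathbf{a}| = 2$, $|\mathbf{b}| = 3$, $|\mathbf{c}| = 1$.
(b) Use the formulae $x = r\cos\phi$ and $y = r\sin\phi$ (see Subsection 1.5).
First consider the vector $\mathbf{a}$. The Cartesian components of $\mathbf{a}$ are the numbers $a_1$ and $a_2$ given by
$a_1 = 2\cos\frac{\pi}{3} = 1$, $a_2 = 2\sin\frac{\pi}{3} = \sqrt{3}$.
Thus
$\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} = \mathbf{i} + \sqrt{3}\,\mathbf{j}$.
Similarly for $\mathbf{b}$ and $\mathbf{c}$:
$\mathbf{b} = \left(3\cos\frac{3\pi}{4}\right)\mathbf{i} + \left(3\sin\frac{3\pi}{4}\right)\mathbf{j} = -\frac{3}{\sqrt{2}}\mathbf{i} + \frac{3}{\sqrt{2}}\mathbf{j}$,
$\mathbf{c} = \left(\cos\frac{\pi}{6}\right)\mathbf{i} + \left(\sin\frac{\pi}{6}\right)\mathbf{j} = \frac{\sqrt{3}}{2}\mathbf{i} + \frac{1}{2}\mathbf{j}$.
(c) We now have
$\mathbf{a} + \mathbf{c} = \mathbf{i} + \sqrt{3}\,\mathbf{j} + \frac{\sqrt{3}}{2}\mathbf{i} + \frac{1}{2}\mathbf{j}$
$= \left(1 + \frac{\sqrt{3}}{2}\right)\mathbf{i} + \left(\sqrt{3} + \frac{1}{2}\right)\mathbf{j}$
$\approx 1.866\mathbf{i} + 2.232\mathbf{j}$.

3.5 (a) The components are
$v_1 = 5\cos\frac{5\pi}{18} \approx 3.214$,
$v_2 = 5\sin\frac{5\pi}{18} \approx 3.83$.
(b) The magnitudes are
$|\mathbf{a}| = ((\sqrt{3})^2 + (-1)^2)^{1/2} = 2$,
$|\mathbf{b}| = ((-3)^2 + 3^2)^{1/2} = 3\sqrt{2} \approx 4.243$.
To specify the directions, we need a reference direction. Using the plane polar coordinate convention, we can specify the angle $\phi$ with respect to the positive x-axis. Thus, for vector $\mathbf{a}$,
$\cos\phi = a_1/|\mathbf{a}| = \sqrt{3}/2$,
$\sin\phi = a_2/|\mathbf{a}| = -1/2$,
hence $\phi = -\frac{\pi}{6}$.
For vector $\mathbf{b}$,
$\cos\phi = -3/(3\sqrt{2}) = -1/\sqrt{2}$,
$\sin\phi = 3/(3\sqrt{2}) = 1/\sqrt{2}$,
hence $\phi = \frac{3\pi}{4}$.

3.6 Systems (b), (c) and (d) are right-handed.

3.7 (a) $\mathbf{d} = 2(\mathbf{i} + \mathbf{j} + \mathbf{k}) - 3(2\mathbf{i} - 3\mathbf{j} - \mathbf{k}) = -4\mathbf{i} + 11\mathbf{j} + 5\mathbf{k}$,
$\mathbf{e} = (\mathbf{i} + \mathbf{j} + \mathbf{k}) - 2(2\mathbf{i} - 3\mathbf{j} - \mathbf{k}) + 4(3\mathbf{i} + \mathbf{k}) = 9\mathbf{i} + 7\mathbf{j} + 7\mathbf{k}$.
(b) $|\mathbf{d}| = \sqrt{(-4)^2 + 11^2 + 5^2} = \sqrt{162}$ $(= 9\sqrt{2})$,
$|\mathbf{e}| = \sqrt{9^2 + 7^2 + 7^2} = \sqrt{179}$.
(c) $|\mathbf{a}| = \sqrt{1^2 + 1^2 + 1^2} = \sqrt{3}$.
A unit vector in the direction of $\mathbf{a}$ is
$\hat{\mathbf{a}} = \frac{1}{|\mathbf{a}|}\mathbf{a} = \frac{1}{\sqrt{3}}(\mathbf{i} + \mathbf{j} + \mathbf{k})$.
(d) If $\mathbf{a} + \mathbf{x} = \mathbf{b}$, then
$\mathbf{x} = \mathbf{b} - \mathbf{a} = (2\mathbf{i} - 3\mathbf{j} - \mathbf{k}) - (\mathbf{i} + \mathbf{j} + \mathbf{k}) = \mathbf{i} - 4\mathbf{j} - 2\mathbf{k}$.
Thus the components of $\mathbf{x}$ are 1, −4 and −2.

3.8 $\mathbf{p} = \begin{bmatrix} 3 \\ 0 \\ 18 \end{bmatrix} - \begin{bmatrix} 2 \\ 3 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ -3 \\ 19 \end{bmatrix}$, so
$|\mathbf{p}| = (1^2 + (-3)^2 + 19^2)^{1/2} = 371^{1/2}$ $(\approx 19.26)$.

3.9 Relative to the origin of the Cartesian coordinate system, the two points have position vectors $\mathbf{i} + \mathbf{j} + 2\mathbf{k}$ and $2\mathbf{i} + 3\mathbf{j} + \mathbf{k}$. Thus the vector equation of the line is
$\mathbf{t} = (1 - s)(\mathbf{i} + \mathbf{j} + 2\mathbf{k}) + s(2\mathbf{i} + 3\mathbf{j} + \mathbf{k})$
$= (1 + s)\mathbf{i} + (1 + 2s)\mathbf{j} + (2 - s)\mathbf{k}$,
where $-\infty < s < \infty$.

3.10 (a) $|\mathbf{a}| = \sqrt{2^2 + (-1)^2} = \sqrt{5}$,
$|\mathbf{b}| = \sqrt{1^2 + 3^2 + 5^2} = \sqrt{35}$.
The vector $\mathbf{a}$ lies in the (x, y)-plane, and the angle $\phi$ that it makes with the x-axis is given by $\cos\phi = 2/\sqrt{5}$ and $\sin\phi = -1/\sqrt{5}$. Hence $\phi \approx -0.4636$ radians.
(b) $\mathbf{a} + \mathbf{b} = 3\mathbf{i} + 2\mathbf{j} + 5\mathbf{k}$,
$2\mathbf{a} - \mathbf{b} = 3\mathbf{i} - 5\mathbf{j} - 5\mathbf{k}$,
$\mathbf{c} + 2\mathbf{b} - 3\mathbf{a} = -4\mathbf{i} + 10\mathbf{j} + 8\mathbf{k}$.
(c) The vector $\overrightarrow{PQ}$ is equal to $2\mathbf{a} - \mathbf{b}$. The point Q is the end of the vector $\overrightarrow{OQ}$, which is given by
$\overrightarrow{OQ} = \overrightarrow{OP} + \overrightarrow{PQ}$
$= (2\mathbf{j} + 3\mathbf{k}) + (3\mathbf{i} - 5\mathbf{j} - 5\mathbf{k})$
$= 3\mathbf{i} - 3\mathbf{j} - 2\mathbf{k}$,
so Q is the point (3, −3, −2).

3.11 $\mathbf{0} = [0 \;\; 0 \;\; 0]^T$,
$\mathbf{i} = [1 \;\; 0 \;\; 0]^T$,
$\mathbf{j} = [0 \;\; 1 \;\; 0]^T$,
$\mathbf{k} = [0 \;\; 0 \;\; 1]^T$.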
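The magnitude, unit-vector and straight-line calculations in Solutions 3.7 to 3.9 follow a common pattern that is easy to reproduce in code. The following is an illustrative sketch only; the function names `magnitude`, `unit` and `line_point` are assumptions introduced here, with vectors stored as tuples of Cartesian components.

```python
import math

def magnitude(v):
    """Square root of the sum of the squared components."""
    return math.sqrt(sum(c * c for c in v))

def unit(v):
    """Unit vector in the direction of v (v must be non-zero)."""
    m = magnitude(v)
    return tuple(c / m for c in v)

def line_point(p, q, s):
    """Point (1 - s)p + sq on the line through position vectors p and q."""
    return tuple((1 - s) * pc + s * qc for pc, qc in zip(p, q))

# Solution 3.8: |p| for p = [1, -3, 19]^T.
p_38 = (1, -3, 19)
print(round(magnitude(p_38), 2))

# Solution 3.7(c): unit vector in the direction of i + j + k.
a_hat = unit((1, 1, 1))

# Solution 3.9: the line through i + j + 2k and 2i + 3j + k at s = 0.5.
mid = line_point((1, 1, 2), (2, 3, 1), 0.5)
print(mid)
```

Setting s = 0 or s = 1 in `line_point` returns the two given points, which is a quick way to confirm that a vector equation of a line has been set up correctly.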


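Section 4

4.1 $\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}|\,|\mathbf{b}| \cos\theta = 2 \times 4 \times \cos\frac{\pi}{3} = 4$,
$\mathbf{b} \cdot \mathbf{c} = |\mathbf{b}|\,|\mathbf{c}| \cos\theta = 4 \times 1 \times \cos\frac{\pi}{6} = 2\sqrt{3}$,
$\mathbf{a} \cdot \mathbf{c} = |\mathbf{a}|\,|\mathbf{c}| \cos\theta = 2 \times 1 \times \cos\left(\frac{\pi}{3} + \frac{\pi}{6}\right) = 2\cos\frac{\pi}{2} = 0$,
$\mathbf{b} \cdot \mathbf{b} = |\mathbf{b}|\,|\mathbf{b}| \cos\theta = 4 \times 4 \times \cos 0 = 16$.

4.2 (a) $(\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} - \mathbf{b}) = \mathbf{a} \cdot (\mathbf{a} - \mathbf{b}) + \mathbf{b} \cdot (\mathbf{a} - \mathbf{b})$
$= \mathbf{a} \cdot \mathbf{a} - \mathbf{a} \cdot \mathbf{b} + \mathbf{b} \cdot \mathbf{a} - \mathbf{b} \cdot \mathbf{b}$
$= \mathbf{a} \cdot \mathbf{a} - \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{b} - \mathbf{b} \cdot \mathbf{b}$
$= \mathbf{a} \cdot \mathbf{a} - \mathbf{b} \cdot \mathbf{b}$
(b) $|\mathbf{a} + \mathbf{b}|^2 = (\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} + \mathbf{b})$
$= \mathbf{a} \cdot (\mathbf{a} + \mathbf{b}) + \mathbf{b} \cdot (\mathbf{a} + \mathbf{b})$
$= \mathbf{a} \cdot \mathbf{a} + \mathbf{a} \cdot \mathbf{b} + \mathbf{b} \cdot \mathbf{a} + \mathbf{b} \cdot \mathbf{b}$
$= \mathbf{a} \cdot \mathbf{a} + \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{b} + \mathbf{b} \cdot \mathbf{b}$
$= \mathbf{a} \cdot \mathbf{a} + 2\mathbf{a} \cdot \mathbf{b} + \mathbf{b} \cdot \mathbf{b}$

4.3 (a) If $2\mathbf{a} + 3\mathbf{b}$ and $m\mathbf{a} + \mathbf{b}$ are perpendicular, then
$(2\mathbf{a} + 3\mathbf{b}) \cdot (m\mathbf{a} + \mathbf{b}) = 0$.
Expanding this expression,
$2m\,\mathbf{a} \cdot \mathbf{a} + 2\mathbf{a} \cdot \mathbf{b} + 3m\,\mathbf{b} \cdot \mathbf{a} + 3\mathbf{b} \cdot \mathbf{b} = 0$.
Now $\mathbf{a}$ and $\mathbf{b}$ are perpendicular, so $\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a} = 0$, and they are unit vectors, so $\mathbf{a} \cdot \mathbf{a} = \mathbf{b} \cdot \mathbf{b} = 1$. Thus
$2m + 3 = 0$,
so $m = -1.5$.
(b) $|\mathbf{c}|^2 = \mathbf{c} \cdot \mathbf{c}$
$= (3\mathbf{a} + 5\mathbf{b}) \cdot (3\mathbf{a} + 5\mathbf{b})$
$= 9\mathbf{a} \cdot \mathbf{a} + 15\mathbf{a} \cdot \mathbf{b} + 15\mathbf{b} \cdot \mathbf{a} + 25\mathbf{b} \cdot \mathbf{b}$.
Thus, since $\mathbf{a}$ and $\mathbf{b}$ are perpendicular unit vectors,
$|\mathbf{c}|^2 = 9 + 25 = 34$,
so $|\mathbf{c}| = \sqrt{34}$ $(\approx 5.831)$.

4.4 $\mathbf{a} \cdot \mathbf{b} = (4 \times 1) + (1 \times -3) + (-5 \times 1) = -4$.
The negative sign tells us that the angle between $\mathbf{a}$ and $\mathbf{b}$ is between $\frac{\pi}{2}$ and $\pi$ radians, i.e. it is an obtuse angle.

4.5 $|\mathbf{a}| = \sqrt{2^2 + (-3)^2 + 1^2} = \sqrt{14}$,
$|\mathbf{b}| = \sqrt{(-1)^2 + 2^2 + 4^2} = \sqrt{21}$.
Also,
$\mathbf{a} \cdot \mathbf{b} = (2 \times -1) + (-3 \times 2) + (1 \times 4) = -4$,
so if $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$, then
$\cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}|\,|\mathbf{b}|} = \frac{-4}{\sqrt{14} \times \sqrt{21}} = -\frac{4}{7\sqrt{6}}$.
The negative sign means that $\theta$ is obtuse, so $\theta \approx 1.806$ radians.

4.6 $\mathbf{a} \cdot \mathbf{i} = (a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}) \cdot \mathbf{i}$
$= a_1\,\mathbf{i} \cdot \mathbf{i} + a_2\,\mathbf{j} \cdot \mathbf{i} + a_3\,\mathbf{k} \cdot \mathbf{i}$
$= a_1$.
Similarly,
$\mathbf{a} \cdot \mathbf{j} = a_2$ and $\mathbf{a} \cdot \mathbf{k} = a_3$.
(Notice that this means that the components of a vector are given by the dot products of the vector with the Cartesian unit vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$.)

4.7 (a) $\mathbf{a} \cdot \mathbf{c} = -2$, $\mathbf{a} \cdot \mathbf{d} = -3$, $\mathbf{a} \cdot \mathbf{e} = 0$.
Thus only $\mathbf{e}$ is perpendicular to $\mathbf{a}$.
(b) First,
$\mathbf{a} + 2\mathbf{b} = \mathbf{j} + 9\mathbf{k}$.
Now a suitable vector along the line joining the origin to the point (1, 1, 1) is $\mathbf{i} + \mathbf{j} + \mathbf{k}$. The corresponding unit vector is $\mathbf{u} = \frac{1}{\sqrt{3}}(\mathbf{i} + \mathbf{j} + \mathbf{k})$. The component of $\mathbf{a} + 2\mathbf{b}$ in the direction of this line is
$\mathbf{u} \cdot (\mathbf{a} + 2\mathbf{b}) = \frac{1}{\sqrt{3}}(\mathbf{i} + \mathbf{j} + \mathbf{k}) \cdot (\mathbf{j} + 9\mathbf{k}) = \frac{10}{\sqrt{3}}$.

4.8 The angle between $\mathbf{i}$ and $\mathbf{v}$ is $\frac{\pi}{6}$, and that between $\mathbf{j}$ and $\mathbf{v}$ is $\frac{\pi}{2} - \frac{\pi}{6} = \frac{\pi}{3}$. Also, $|\mathbf{v}| = 1.5$. So the i-component of $\mathbf{v}$ is
$|\mathbf{v}| \cos\frac{\pi}{6} = \frac{3}{2} \times \frac{\sqrt{3}}{2} = \frac{3\sqrt{3}}{4}$,
and the j-component of $\mathbf{v}$ is
$|\mathbf{v}| \cos\frac{\pi}{3} = \frac{3}{2} \times \frac{1}{2} = \frac{3}{4}$.
The angle between $\mathbf{i}$ and $\mathbf{w}$ is $\frac{\pi}{2} + \frac{\pi}{6} = \frac{2\pi}{3}$, and that between $\mathbf{j}$ and $\mathbf{w}$ is $\pi - \frac{\pi}{6} = \frac{5\pi}{6}$. Also, $|\mathbf{w}| = 2$. Moreover, using the formulae from the Handbook,
$\cos\frac{2\pi}{3} = \cos\left(\pi - \frac{\pi}{3}\right) = \cos\pi \cos\frac{\pi}{3} + \sin\pi \sin\frac{\pi}{3} = -\cos\frac{\pi}{3} = -\frac{1}{2}$
and
$\cos\frac{5\pi}{6} = \cos\left(\pi - \frac{\pi}{6}\right) = \cos\pi \cos\frac{\pi}{6} + \sin\pi \sin\frac{\pi}{6} = -\cos\frac{\pi}{6} = -\frac{\sqrt{3}}{2}$.
The i-component of $\mathbf{w}$ is therefore
$|\mathbf{w}| \cos\frac{2\pi}{3} = -2 \times \frac{1}{2} = -1$,
and the j-component of $\mathbf{w}$ is
$|\mathbf{w}| \cos\frac{5\pi}{6} = -2 \times \frac{\sqrt{3}}{2} = -\sqrt{3}$.
In summary,
$\mathbf{v} = \frac{3\sqrt{3}}{4}\mathbf{i} + \frac{3}{4}\mathbf{j}$ and $\mathbf{w} = -\mathbf{i} - \sqrt{3}\,\mathbf{j}$.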
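The dot-product calculations in Solutions 4.5 and 4.7(b) can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: the function names `dot`, `magnitude`, `angle_between` and `component_along` are hypothetical, and vectors are tuples of Cartesian components.

```python
import math

def dot(u, v):
    """Dot product in component form: the sum of products of components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def magnitude(v):
    """|v| = sqrt(v . v)."""
    return math.sqrt(dot(v, v))

def angle_between(u, v):
    """Angle theta in [0, pi] from cos(theta) = u.v / (|u| |v|)."""
    return math.acos(dot(u, v) / (magnitude(u) * magnitude(v)))

def component_along(v, direction):
    """Component of v in a given direction: v dotted with the unit vector."""
    return dot(v, direction) / magnitude(direction)

# Solution 4.5: a = 2i - 3j + k and b = -i + 2j + 4k.
a = (2, -3, 1)
b = (-1, 2, 4)
theta = angle_between(a, b)
print(dot(a, b), round(theta, 3))

# Solution 4.7(b): component of j + 9k along the direction i + j + k.
comp = component_along((0, 1, 9), (1, 1, 1))
print(round(comp, 4))
```

Because `math.acos` returns values between 0 and π, a negative dot product automatically yields an obtuse angle, in agreement with the remark in Solution 4.4.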


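4.9 The technique here is the same for all the vectors. One must find the angle between the vector in question and the unit vectors $\mathbf{i}$ and $\mathbf{j}$.
Vector $\mathbf{a}$ points vertically downwards, in the direction $-\mathbf{j}$. Hence $\mathbf{i}$ and $\mathbf{a}$ are perpendicular, and
$\mathbf{a} = 0\mathbf{i} - \mathbf{j} = -\mathbf{j}$.
The angle between $\mathbf{i}$ and $\mathbf{b}$ is $\alpha$, and the angle between $\mathbf{j}$ and $\mathbf{b}$ is $\frac{\pi}{2} - \alpha$. Hence the i-component of $\mathbf{b}$ is
$|\mathbf{b}| \cos\alpha = 1.5\cos\alpha$,
and the j-component of $\mathbf{b}$ is
$|\mathbf{b}| \cos\left(\frac{\pi}{2} - \alpha\right) = 1.5\sin\alpha$,
where we have used the formula
$\cos(\beta - \alpha) = \cos\beta\cos\alpha + \sin\beta\sin\alpha$
to evaluate $\cos\left(\frac{\pi}{2} - \alpha\right)$ (see the Handbook). So
$\mathbf{b} = 1.5\cos\alpha\,\mathbf{i} + 1.5\sin\alpha\,\mathbf{j}$.
The angle between $\mathbf{i}$ and $\mathbf{c}$ is $\frac{\pi}{2} + \alpha$, and the angle between $\mathbf{j}$ and $\mathbf{c}$ is $\alpha$.
[Sketch: the vector c at the point P, with the angle α and the unit vectors i and j marked.]
Therefore the i-component of $\mathbf{c}$ is
$|\mathbf{c}| \cos\left(\frac{\pi}{2} + \alpha\right) = -1.5\sin\alpha$,
where we have used the formula
$\cos(\beta + \alpha) = \cos\beta\cos\alpha - \sin\beta\sin\alpha$
to evaluate $\cos\left(\frac{\pi}{2} + \alpha\right)$ (see the Handbook).
The j-component of $\mathbf{c}$ is
$|\mathbf{c}| \cos\alpha = 1.5\cos\alpha$,
so
$\mathbf{c} = -1.5\sin\alpha\,\mathbf{i} + 1.5\cos\alpha\,\mathbf{j}$.
Finally, the angle between $\mathbf{i}$ and $\mathbf{d}$ is $\pi - \alpha$, and the angle between $\mathbf{j}$ and $\mathbf{d}$ is $\frac{\pi}{2} + \alpha$.
[Sketch: the vector d at the point P, with the angles α and π/2 and the unit vectors i and j marked.]
Thus the i-component of $\mathbf{d}$ is
$|\mathbf{d}| \cos(\pi - \alpha) = -2\cos\alpha$,
and the j-component of $\mathbf{d}$ is
$|\mathbf{d}| \cos\left(\frac{\pi}{2} + \alpha\right) = -2\sin\alpha$
(using the usual trigonometric formulae). So
$\mathbf{d} = -2\cos\alpha\,\mathbf{i} - 2\sin\alpha\,\mathbf{j}$.

4.10 As in the previous exercise, we must find the angles between $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$, $\mathbf{d}$ and the unit vectors $\mathbf{i}$ and $\mathbf{j}$.
First notice that $\mathbf{b}$ points in the direction $\mathbf{i}$, so $\mathbf{b} = 1.5\mathbf{i}$. Similarly, $\mathbf{d}$ points in the direction $-\mathbf{i}$, so $\mathbf{d} = -2\mathbf{i}$. Also, $\mathbf{c}$ points in the direction $\mathbf{j}$, so $\mathbf{c} = 1.5\mathbf{j}$.
[Sketch: the vector a at the point P, with the angles α and π/2 and the unit vectors i and j marked.]
The remaining vector, $\mathbf{a}$, makes an angle $\frac{\pi}{2} + \alpha$ with $\mathbf{i}$, and an angle $\pi - \alpha$ with $\mathbf{j}$. Hence the i-component of $\mathbf{a}$ is
$|\mathbf{a}| \cos\left(\frac{\pi}{2} + \alpha\right) = -\sin\alpha$,
and the j-component of $\mathbf{a}$ is
$|\mathbf{a}| \cos(\pi - \alpha) = -\cos\alpha$.
Therefore
$\mathbf{a} = -\sin\alpha\,\mathbf{i} - \cos\alpha\,\mathbf{j}$.

4.11 Here the angle between $\mathbf{i}$ and $\mathbf{p}$ is $\pi - \alpha$, and the angle between $\mathbf{j}$ and $\mathbf{p}$ is $\frac{\pi}{2} - \alpha$. Therefore the i-component of $\mathbf{p}$ is
$|\mathbf{p}| \cos(\pi - \alpha) = -2.5\cos\alpha$,
and the j-component of $\mathbf{p}$ is
$|\mathbf{p}| \cos\left(\frac{\pi}{2} - \alpha\right) = 2.5\sin\alpha$.
Thus
$\mathbf{p} = -2.5\cos\alpha\,\mathbf{i} + 2.5\sin\alpha\,\mathbf{j}$.
The angle between $\mathbf{i}$ and $\mathbf{q}$ is $\pi - \beta$, and the angle between $\mathbf{j}$ and $\mathbf{q}$ is $\frac{\pi}{2} + \beta$. Therefore the i-component of $\mathbf{q}$ is
$|\mathbf{q}| \cos(\pi - \beta) = -3\cos\beta$,
and the j-component of $\mathbf{q}$ is
$|\mathbf{q}| \cos\left(\frac{\pi}{2} + \beta\right) = -3\sin\beta$.
Hence
$\mathbf{q} = -3\cos\beta\,\mathbf{i} - 3\sin\beta\,\mathbf{j}$.
Finally, the angle between $\mathbf{i}$ and $\mathbf{r}$ is $\gamma$, and the angle between $\mathbf{j}$ and $\mathbf{r}$ is $\frac{\pi}{2} + \gamma$. Thus the i-component of $\mathbf{r}$ is
$|\mathbf{r}| \cos\gamma = 2.5\cos\gamma$,
and the j-component of $\mathbf{r}$ is
$|\mathbf{r}| \cos\left(\frac{\pi}{2} + \gamma\right) = -2.5\sin\gamma$.
So
$\mathbf{r} = 2.5\cos\gamma\,\mathbf{i} - 2.5\sin\gamma\,\mathbf{j}$.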
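The resolution technique used in Solutions 4.9 to 4.11 (each component is the magnitude times the cosine of the angle between the vector and the corresponding unit vector) can be checked numerically for a sample angle. The sketch below is illustrative only: the function name `resolve` and the value chosen for α are assumptions, not part of the exercises.

```python
import math

def resolve(magnitude, angle_with_i, angle_with_j):
    """Return (i-component, j-component) as |v| cos(angle) for each axis."""
    return (magnitude * math.cos(angle_with_i),
            magnitude * math.cos(angle_with_j))

alpha = 0.4  # arbitrary sample angle in radians (hypothetical value)

# Solution 4.9, vector b: |b| = 1.5, angles alpha and pi/2 - alpha.
b = resolve(1.5, alpha, math.pi / 2 - alpha)

# Solution 4.9, vector d: |d| = 2, angles pi - alpha and pi/2 + alpha.
d = resolve(2.0, math.pi - alpha, math.pi / 2 + alpha)

# Compare with the closed forms quoted in the solution.
print(b, (1.5 * math.cos(alpha), 1.5 * math.sin(alpha)))
print(d, (-2 * math.cos(alpha), -2 * math.sin(alpha)))
```

Passing both angles explicitly mirrors the solutions, where the sign of each component comes from the cosine of an angle that may exceed π/2.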


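4.12 The i-component of $\mathbf{a}$ is clearly zero, while the j-component is simply $|\mathbf{a}|$. Similarly, the i-component of $\mathbf{b}$ is $-|\mathbf{b}|$ while the j-component is zero. Hence
$\mathbf{a} = |\mathbf{a}|\,\mathbf{j}$ and $\mathbf{b} = -|\mathbf{b}|\,\mathbf{i}$.
The i- and j-components of $\mathbf{c}$ are, respectively,
$|\mathbf{c}| \cos\frac{\pi}{4} = 2 \times \frac{1}{\sqrt{2}} = \sqrt{2}$
and
$|\mathbf{c}| \cos\left(\frac{\pi}{2} + \frac{\pi}{4}\right) = -2\sin\frac{\pi}{4} = -\sqrt{2}$,
so
$\mathbf{c} = \sqrt{2}\,\mathbf{i} - \sqrt{2}\,\mathbf{j}$.
Given that $\mathbf{a} + \mathbf{b} + \mathbf{c} = \mathbf{0}$, the sum of all the i-components of $\mathbf{a} + \mathbf{b} + \mathbf{c}$ must be zero, and so must the sum of all the j-components. Therefore (i-components)
$0 - |\mathbf{b}| + \sqrt{2} = 0$
and (j-components)
$|\mathbf{a}| + 0 - \sqrt{2} = 0$.
Thus we see that $|\mathbf{a}| = |\mathbf{b}| = \sqrt{2}$.

4.13 For the sake of clarity, here is a diagram showing $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ (where all three vectors start at O) drawn in the (x, y)-plane. (The z-axis points out of the page.)
[Sketch: u, v and w drawn from O in the (x, y)-plane, with angles of π/6 marked.]
The cross products are all perpendicular to the (x, y)-plane.
A unit vector in the direction of $\mathbf{u} \times \mathbf{v}$ is $\mathbf{k}$, so
$\mathbf{u} \times \mathbf{v} = \left(|\mathbf{u}|\,|\mathbf{v}| \sin\frac{\pi}{6}\right)\mathbf{k} = \left(2 \times 3 \times \frac{1}{2}\right)\mathbf{k} = 3\mathbf{k}$.
The angle between $\mathbf{u}$ and $\mathbf{w}$ is zero, so
$\mathbf{u} \times \mathbf{w} = (|\mathbf{u}|\,|\mathbf{w}| \sin 0)\,\hat{\mathbf{c}} = (2 \times 4 \times 0)\,\hat{\mathbf{c}} = 0\,\hat{\mathbf{c}} = \mathbf{0}$.
A unit vector in the direction of $\mathbf{v} \times \mathbf{w}$ is $-\mathbf{k}$, so
$\mathbf{v} \times \mathbf{w} = \left(|\mathbf{v}|\,|\mathbf{w}| \sin\frac{\pi}{6}\right)(-\mathbf{k}) = \left(3 \times 4 \times \frac{1}{2}\right)(-\mathbf{k}) = -6\mathbf{k}$.

4.14 (a) $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ are unit vectors forming a right-handed system.
[Sketch: the unit vectors i, j and k.]
Thus, using the definition of the cross product,
$\mathbf{i} \times \mathbf{j} = \left(|\mathbf{i}|\,|\mathbf{j}| \sin\frac{\pi}{2}\right)\mathbf{k} = \mathbf{k}$.
Similarly,
$\mathbf{j} \times \mathbf{k} = \mathbf{i}$ and $\mathbf{k} \times \mathbf{i} = \mathbf{j}$.
(b) Since $(\mathbf{a} \times \mathbf{b}) = -(\mathbf{b} \times \mathbf{a})$ for any vectors $\mathbf{a}$ and $\mathbf{b}$, we have
$\mathbf{j} \times \mathbf{i} = -\mathbf{k}$, $\mathbf{k} \times \mathbf{j} = -\mathbf{i}$ and $\mathbf{i} \times \mathbf{k} = -\mathbf{j}$.
(c) Since $\mathbf{a} \times \mathbf{a} = \mathbf{0}$ for any vector $\mathbf{a}$, we have
$\mathbf{i} \times \mathbf{i} = \mathbf{j} \times \mathbf{j} = \mathbf{k} \times \mathbf{k} = \mathbf{0}$.
(d) $(\mathbf{i} + \mathbf{k}) \times (\mathbf{i} + \mathbf{j} + \mathbf{k})$
$= (\mathbf{i} \times (\mathbf{i} + \mathbf{j} + \mathbf{k})) + (\mathbf{k} \times (\mathbf{i} + \mathbf{j} + \mathbf{k}))$
$= (\mathbf{0} + \mathbf{k} + (-\mathbf{j})) + (\mathbf{j} + (-\mathbf{i}) + \mathbf{0})$
$= -\mathbf{i} + \mathbf{k}$,
$(\mathbf{i} \times (\mathbf{i} + \mathbf{k})) - ((\mathbf{i} + \mathbf{j}) \times \mathbf{k})$
$= (\mathbf{0} + (-\mathbf{j})) - (-\mathbf{j} + \mathbf{i})$
$= -\mathbf{i}$.

4.15 To compute $\mathbf{a} \times \mathbf{b}$, we use Sarrus's Rule:
$$\begin{array}{rrrrr} \mathbf{i} & \mathbf{j} & \mathbf{k} & \mathbf{i} & \mathbf{j} \\ 2 & -3 & 1 & 2 & -3 \\ -1 & 2 & 4 & -1 & 2 \end{array}$$
The signed diagonal products are $-3\mathbf{k}$, $-2\mathbf{i}$, $-8\mathbf{j}$, $-12\mathbf{i}$, $-\mathbf{j}$ and $4\mathbf{k}$,
so $\mathbf{a} \times \mathbf{b} = -14\mathbf{i} - 9\mathbf{j} + \mathbf{k}$.
Similarly for $\mathbf{a} \times \mathbf{c}$:
$$\begin{array}{rrrrr} \mathbf{i} & \mathbf{j} & \mathbf{k} & \mathbf{i} & \mathbf{j} \\ 2 & -3 & 1 & 2 & -3 \\ -4 & 6 & -2 & -4 & 6 \end{array}$$
The signed diagonal products are $-12\mathbf{k}$, $-6\mathbf{i}$, $4\mathbf{j}$, $6\mathbf{i}$, $-4\mathbf{j}$ and $12\mathbf{k}$,
so $\mathbf{a} \times \mathbf{c} = \mathbf{0}$.
Finally, for $\mathbf{b} \times \mathbf{c}$:
$$\begin{array}{rrrrr} \mathbf{i} & \mathbf{j} & \mathbf{k} & \mathbf{i} & \mathbf{j} \\ -1 & 2 & 4 & -1 & 2 \\ -4 & 6 & -2 & -4 & 6 \end{array}$$
The signed diagonal products are $8\mathbf{k}$, $-24\mathbf{i}$, $-2\mathbf{j}$, $-4\mathbf{i}$, $-16\mathbf{j}$ and $-6\mathbf{k}$,
so $\mathbf{b} \times \mathbf{c} = -28\mathbf{i} - 18\mathbf{j} + 2\mathbf{k}$.
Since $\mathbf{a} \times \mathbf{c} = \mathbf{0}$, and neither vector is zero, the vectors $\mathbf{a}$ and $\mathbf{c}$ are parallel. In fact, $\mathbf{c} = -2\mathbf{a}$.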
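The three cross products in Solution 4.15 can be verified with the component formula that Sarrus's Rule encodes. This is a minimal sketch; the function name `cross` is an assumption, and vectors are written as tuples of their i-, j- and k-components.

```python
def cross(a, b):
    """Cross product in component form.

    (a2*b3 - a3*b2) i + (a3*b1 - a1*b3) j + (a1*b2 - a2*b1) k
    """
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)

# The vectors of Solution 4.15.
a = (2, -3, 1)
b = (-1, 2, 4)
c = (-4, 6, -2)

print(cross(a, b))  # components of a x b
print(cross(a, c))  # the zero vector, since c = -2a is parallel to a
print(cross(b, c))  # components of b x c
```

Each component collects the two diagonal products from the Sarrus layout that carry the same unit vector, which is why the six signed products in the solution reduce to three components.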


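4.16 A vector perpendicular to $\mathbf{a}$ and $\mathbf{b}$ is $\mathbf{a} \times \mathbf{b}$, which we can compute using Sarrus's Rule:
$$\begin{array}{rrrrr} \mathbf{i} & \mathbf{j} & \mathbf{k} & \mathbf{i} & \mathbf{j} \\ 2 & 2 & 1 & 2 & 2 \\ 4 & 4 & -7 & 4 & 4 \end{array}$$
The signed diagonal products are $-8\mathbf{k}$, $-4\mathbf{i}$, $14\mathbf{j}$, $-14\mathbf{i}$, $4\mathbf{j}$ and $8\mathbf{k}$,
so $\mathbf{a} \times \mathbf{b} = -18\mathbf{i} + 18\mathbf{j}$.
We are asked for a unit vector, so the obvious choice is
$\frac{1}{|\mathbf{a} \times \mathbf{b}|}(\mathbf{a} \times \mathbf{b}) = \frac{-18\mathbf{i} + 18\mathbf{j}}{18\sqrt{2}} = \frac{1}{\sqrt{2}}(-\mathbf{i} + \mathbf{j})$.
(Note that $\frac{1}{\sqrt{2}}(\mathbf{i} - \mathbf{j})$ is also a unit vector perpendicular to $\mathbf{a}$ and $\mathbf{b}$. This can be obtained by considering $\mathbf{b} \times \mathbf{a}$ rather than $\mathbf{a} \times \mathbf{b}$.)

4.17 (a) No, it is a scalar.
(b) Yes, if the angle between $\mathbf{a}$ and $\mathbf{b}$ is between $\frac{\pi}{2}$ and $\pi$ radians.
(c) If either $\mathbf{a} = \mathbf{0}$ or $\mathbf{b} = \mathbf{0}$, then indeed $\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}|\,|\mathbf{b}|$ $(= 0)$. So assume that $\mathbf{a}$ and $\mathbf{b}$ are both non-zero. If $\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}|\,|\mathbf{b}| \cos\theta = |\mathbf{a}|\,|\mathbf{b}|$, then $\cos\theta = 1$, so $\theta = 0$, i.e. $\mathbf{a}$ and $\mathbf{b}$ are in the same direction.
(d) If $\mathbf{a} \cdot \mathbf{b} = 0$, then either $\mathbf{a} = \mathbf{0}$ or $\mathbf{b} = \mathbf{0}$ (or both), or $\mathbf{a}$ and $\mathbf{b}$ are perpendicular.
(e) If $\mathbf{a} \times \mathbf{b} = \mathbf{0}$, then either $\mathbf{a} = \mathbf{0}$ or $\mathbf{b} = \mathbf{0}$ (or both), or $\mathbf{a}$ and $\mathbf{b}$ are parallel (but may have opposite senses).

4.18 (a) $\mathbf{t}$ is perpendicular to both $\mathbf{r}$ and $\mathbf{s}$, and its sense is vertically down, i.e. into the ground.
(b) Conversely, the sense of $\mathbf{s} \times \mathbf{r}$ is vertically up.
(c) $\mathbf{t} \times \mathbf{r}$ is perpendicular to $\mathbf{t}$ (and thus in the horizontal plane) and perpendicular to $\mathbf{r}$, and by the screw rule its sense is due east.
(d) $|\mathbf{t}| = |\mathbf{r}|\,|\mathbf{s}| \sin\frac{\pi}{4} = \frac{1}{\sqrt{2}}$
(e) $\mathbf{t} \times (\mathbf{r} \times \mathbf{s}) = \mathbf{t} \times \mathbf{t} = \mathbf{0}$
(f) $\mathbf{r} \cdot \mathbf{s} = |\mathbf{r}|\,|\mathbf{s}| \cos\frac{\pi}{4} = \frac{1}{\sqrt{2}}$
(g) $\mathbf{s} \cdot (\mathbf{t} \times \mathbf{r}) = |\mathbf{s}|\,|\mathbf{t} \times \mathbf{r}| \cos\frac{\pi}{4}$ (by part (c))
$= |\mathbf{s}| \left(|\mathbf{t}|\,|\mathbf{r}| \sin\frac{\pi}{2}\right) \cos\frac{\pi}{4}$
$= 1 \times \left(\frac{1}{\sqrt{2}} \times 1 \times 1\right) \times \frac{1}{\sqrt{2}} = \frac{1}{2}$

4.19 $(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{a} = 0$ for any non-zero vectors $\mathbf{a}$ and $\mathbf{b}$, because $\mathbf{a} \times \mathbf{b}$ is perpendicular to $\mathbf{a}$, and the dot product of perpendicular vectors is zero.
(If $\mathbf{a}$ and/or $\mathbf{b}$ is the zero vector, then the answer is still zero.)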
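Solutions 4.16 and 4.19 can be checked together: normalising a × b gives a unit vector perpendicular to both a and b, and its dot product with either vector is zero. The following sketch is offered as an illustration only; the function names `dot`, `cross` and `unit_normal` are assumptions introduced here.

```python
import math

def dot(u, v):
    """Dot product in component form."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(a, b):
    """Cross product in component form."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def unit_normal(a, b):
    """Unit vector along a x b (a and b must be non-zero and not parallel)."""
    n = cross(a, b)
    length = math.sqrt(dot(n, n))
    return tuple(c / length for c in n)

# The vectors of Solution 4.16: a = 2i + 2j + k and b = 4i + 4j - 7k.
a = (2, 2, 1)
b = (4, 4, -7)
n = unit_normal(a, b)
print(cross(a, b), n)

# Solution 4.19: (a x b) . a = 0, and likewise for b.
print(dot(cross(a, b), a), dot(cross(a, b), b))
```

Swapping the arguments, `unit_normal(b, a)`, returns the opposite unit vector mentioned in the note at the end of Solution 4.16.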

Block 1

Index
absolute error 70, 80 cost 81
absolute error bound 70 cotangent 23
absolute value 9 cross product 184, 185
accumulation 64
addition of vectors 165, 170, 175 De Moivre’s Theorem 31
algebraic rules for scaling and adding vectors 166 death rate 63
analytic solution 83 decay constant 89
angle between vectors 180 decimal places 7
approximate solution 75 decreasing function 38
arbitrary constant 41, 67, 94 definite integral 48
arccos 25 dependent variable 10, 65, 85
arcsin 25 derivative 32, 65
arctan 25 derived function 32
area of a parallelogram 187 difference of two squares 16
area of a triangle 187 differentiable 66
Argand diagram 30 differential equation 66
argument of a complex number 30 explicit solution 87
associated homogeneous equation 125 first-order 66
associativity 166 general solution 67
asymptote 39 homogeneous 91
auxiliary equation 114 implicit solution 87
inhomogeneous 91
base of an exponential 17 linear 91
birth rate 63 non-homogeneous 91
boundary condition 138 order 66
boundary value 138 particular solution 67
boundary-value problem 138 solution 66
bounded above 39 differentiation 31
bounded below 39 of a complex-valued function 37
direct integration 84
Cartesian components of a vector 169, 174 direction field 72
Cartesian coordinates 159, 164, 172 direction of a vector 155, 157
Cartesian unit vectors 164, 174 discrete model 10
Chain Rule 36 discriminant 15, 120
characteristic equation 114 displacement 154, 156
closed form of a recurrence system 10 displacement vector 156
closed interval 8 distributivity 166
codomain of a function 25 division of complex numbers 28
coincident roots 28 domain 69
column vector 169 domain of a function 10, 44
commutativity 166 dominate 144
complementary function 125 dot product 177, 178
complex conjugate 28
complex exponential 31 efficiency 81
complex number 27 equal roots 28
complex-valued function 37 equal vectors 158
component form of a vector 169, 174 equation of a straight line 176
component form of cross product 186 error 70
component form of dot product 180 error bound 9
component of a vector 169, 174, 181 Euler’s formula 31
composite function 21 Euler’s method 78
Composite Rule 36 explicit solution 87
composition of functions 21 exponent of an exponential 17
constant of integration 41 exponential form of a complex number 31
constant-coefficient equation 112 exponential function 17, 18, 129
continuous function 40
continuous model 10 factorization 15
cosecant 23 first derivative 33
cosine 23 first-order differential equation 66


formula method for a quadratic equation 14 method of undetermined coefficients 127, 131
function 10 minimum 39
function notation for derivatives 33 modulus of a complex number 29
modulus of a number 9
Gaussian elimination 13 modulus of a real number 155
general solution 110, 120, 121, 125, 126 modulus of a vector 155
general solution of a differential equation 67 multiplication of a vector by a scalar 162
global maximum 39 multiplication of complex numbers 30
global minimum 39
gradient 12, 32, 72 natural logarithm function 18
Newtonian notation 33
half-open interval 8 non-homogeneous differential equation 91
homogeneous differential equation 91 non-homogeneous equation 112
homogeneous equation 112 nth derivative 33
image set of a function 10 nth root 17
imaginary part of a complex number 27 nth-order polynomial 28
implicit differentiation 37
implicit solution 87 open interval 8
increasing function 38 order of a derivative 33
indefinite integral 41, 84 order of a differential equation 66
independent variable 10, 65, 85 orientation of a vector 157
index of an exponential 17 output 64
inhomogeneous differential equation 91
inhomogeneous equation 112, 124 parallel vectors 162
initial condition 68, 136 parallelepiped 188
initial value 68, 136 parallelogram rule 166
initial-value problem 68, 136 parameter 11
input 64 particular integral 125, 131
input–output principle 64 particular solution 111, 135
integer 7 particular solution of a differential equation 67
integral 42, 84 perfect square 16
integrand 41 periodic function 23
integrating factor 94 perpendicular vectors 178
integrating factor method 95 plane polar coordinates 159
integration 41 polar coordinates 29, 159
by parts 47 polar form of a complex number 30
by substitution 45 polynomial function 128
interval 8 polynomial of degree n 28
inverse function 18 population model 63
inverse trigonometric functions 25 position vector 168, 174
irrational number 7 power function 20
power of an exponential 17
Leibniz notation 33 principal value of the argument 30
linear constant-coefficient second-order differential principle of superposition 113
equation 112 Product Rule 35
linear differential equation 91 proportionate birth rate 63
linear function 12 proportionate death rate 63
local maximum 38 proportionate growth rate 64
local minimum 38
log plot 19 quadratic equation 14
logarithm function 18 quadratic function 14
logistic equation 65, 72, 90 Quotient Rule 35
log–linear plot 19
log–log plot 20 rate of change 32
lower bound 39 rational number 7
real number 7, 155
magnitude of a number 9 real part of a complex number 27
magnitude of a real number 155 recurrence system 10
magnitude of a scalar 155 reducing the step size 79
magnitude of a vector 155, 170, 175 relative error 71
maximum 39 repeated roots 28


resolving a vector into components 171, 181 solving inhomogeneous equations 126
resultant 165 speed 155
right-hand rule 173 stationary point 38
right-handed system 173 steady-state solution 143
roots step length 77
of a polynomial equation 28 step size 77, 79
of a quadratic equation 14 subtraction of vectors 165
sum of 14 sum of vectors 165
rounding 8
rounding error 81 tangent function 23
third derivative 33
Sarrus’s Rule 186 transient 142
scalar 155 transient solution 142
scalar multiple 162 transpose symbol 169
scalar multiplication 162 trial solution 127
scalar product 177 triangle rule 165
scalar triple product 188 trigonometric functions 22
scaling 162, 166 trigonometric identities 26
scaling of a vector 162, 170, 175
scientific notation 7 undetermined coefficients, method of 127, 131
screw rule 173 unit vector 163, 174
secant 23 upper bound 39
second derivative 33
sense of a vector 157 variable 10
separable differential equation 87 vector 155, 167
separation of variables method 87 vector addition 165, 170, 175
sign of a real number 155 vector addition rule 165
significant figures 7 vector equation of a straight line 176
simultaneous linear equations 13 vector product 184
sine 23 vector subtraction 165
sinusoidal function 130 velocity 155, 157
slope 12, 32, 72, 75 volume of a parallelepiped 188
smooth function 40
solution of a differential equation 66 (x, y)-plane 172
solution of a quadratic equation 14
solving homogeneous equations 120 zero vector 157, 165, 166

